Website Scraper Pro

Run a local script to scrape a single web page into clean markdown or deterministic JSON with Crawl4AI. Use when: user needs direct page retrieval from a URL...

About This Skill

# Skill: Website Scraper Pro

When to use

- The user wants the content of a single web page from a specific URL.
- The user wants clean markdown extracted from an article, docs page, blog post, or landing page.
- The user wants a JS-aware scrape for a page that depends on client-side rendering.
- The user wants deterministic query-focused narrowing of one page without using an AI model inside the skill.
- The user wants structured JSON output with markdown, title, links, and metadata.

When NOT to use

- The user wants a broad web search across multiple sources.
- The user wants a site-wide crawl, recursive crawl, or multi-page extraction workflow.
- The user wants AI summarization, synthesis, or answer generation inside the scraper itself.
- The user wants authenticated browser automation or interactive form submission.

Commands

Scrape a page to markdown

```bash
uv run /root/.openclaw/workspace/skills/website-scraper-pro/src/main.py <URL>
```

Scrape a JS-heavy page

```bash
uv run /root/.openclaw/workspace/skills/website-scraper-pro/src/main.py <URL> --js
```

Scrape a page and narrow by query

```bash
uv run /root/.openclaw/workspace/skills/website-scraper-pro/src/main.py <URL> --query "<TEXT>"
```
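The source does not say how `--query` narrows a page, only that the result is deterministic and uses no LLM. One approach consistent with that description is plain term-overlap scoring over markdown blocks. The sketch below illustrates that idea under this assumption; it is not the skill's actual implementation, and `narrow_markdown` and `top_k` are hypothetical names.

```python
import re


def narrow_markdown(markdown: str, query: str, top_k: int = 3) -> str:
    """Keep the top_k blocks that share the most terms with the query.

    Hypothetical helper: the skill's real --query logic is not documented here.
    """
    terms = set(re.findall(r"[a-z0-9]+", query.lower()))
    # Split on blank lines so each paragraph, heading, or list is one block.
    blocks = [b.strip() for b in re.split(r"\n\s*\n", markdown) if b.strip()]

    scored = []
    for index, block in enumerate(blocks):
        words = re.findall(r"[a-z0-9]+", block.lower())
        score = sum(1 for word in words if word in terms)
        if score > 0:
            scored.append((score, index, block))

    # Rank by score (descending), then by position, so ties resolve the same way every run.
    best = sorted(scored, key=lambda item: (-item[0], item[1]))[:top_k]
    # Emit the surviving blocks in their original page order.
    return "\n\n".join(block for _, _, block in sorted(best, key=lambda item: item[1]))


page = (
    "# Guide\n\n"
    "Install the tool with pip.\n\n"
    "Documentation examples live in the docs folder.\n\n"
    "Contact us by email."
)
print(narrow_markdown(page, "documentation examples", top_k=1))
```

Because the ranking depends only on the page text and the query string, the same inputs always produce the same output, which is the property the skill advertises for `--query`.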

Return deterministic JSON

```bash
uv run /root/.openclaw/workspace/skills/website-scraper-pro/src/main.py <URL> --format json
```

Examples

```bash
# Default markdown scrape
uv run /root/.openclaw/workspace/skills/website-scraper-pro/src/main.py https://example.com

# JS-aware scrape
uv run /root/.openclaw/workspace/skills/website-scraper-pro/src/main.py https://example.com --js

# Query-focused retrieval
uv run /root/.openclaw/workspace/skills/website-scraper-pro/src/main.py https://example.com --query "documentation examples"

# JSON output
uv run /root/.openclaw/workspace/skills/website-scraper-pro/src/main.py https://example.com --format json
```
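When the scraper is called from another program rather than a shell, the same commands can be assembled as an argument list. The following is a minimal sketch: `build_command` and `scrape` are hypothetical helper names, and it assumes (the source does not confirm either point) that the flags can be combined and that the script writes its result to standard output.

```python
import subprocess
from typing import List, Optional

# Path copied from the commands above; adjust it if the skill lives elsewhere.
SCRIPT = "/root/.openclaw/workspace/skills/website-scraper-pro/src/main.py"


def build_command(url: str, js: bool = False, query: Optional[str] = None,
                  output_format: Optional[str] = None) -> List[str]:
    """Assemble the argv list for one scrape (hypothetical helper)."""
    command = ["uv", "run", SCRIPT, url]
    if js:
        command.append("--js")
    if query is not None:
        command += ["--query", query]
    if output_format is not None:
        command += ["--format", output_format]
    return command


def scrape(url: str, **options) -> str:
    """Run the scraper and return its stdout (assumed to carry the result).

    Raises subprocess.CalledProcessError if the script exits with a non-zero status.
    """
    result = subprocess.run(build_command(url, **options),
                            capture_output=True, text=True, check=True)
    return result.stdout


# Building the command needs neither uv nor network access, so it is safe to inspect.
print(build_command("https://example.com", query="documentation examples"))
```

Passing a list to `subprocess.run` instead of a shell string means the URL and query text are never interpreted by a shell, so quoting in the query cannot break the command.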

Output

- Default output is clean markdown for a single page.
- `--query` keeps the output deterministic and non-LLM.
- `--format json` returns deterministic JSON with fields such as `title`, `url`, `markdown`, `links`, and `metadata` when available.
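Since the JSON fields are described as present "when available", a consumer should treat each of them as optional. Here is a minimal sketch of defensive parsing. The `ScrapedPage` class and `parse_page` function are hypothetical, the sample payload is made up for illustration, and the shape of `links` is left untyped because the source does not specify it.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ScrapedPage:
    """Container for the documented fields; every one of them may be absent."""
    url: str = ""
    title: str = ""
    markdown: str = ""
    links: Any = None  # shape not documented in the source, so kept as-is
    metadata: Dict[str, Any] = field(default_factory=dict)


def parse_page(raw: str) -> ScrapedPage:
    """Parse the scraper's JSON text, falling back to defaults for missing keys."""
    data = json.loads(raw)
    return ScrapedPage(
        url=data.get("url") or "",
        title=data.get("title") or "",
        markdown=data.get("markdown") or "",
        links=data.get("links"),
        metadata=data.get("metadata") or {},
    )


# Made-up payload that omits `links` and `metadata` on purpose.
sample = '{"title": "Example Domain", "url": "https://example.com", "markdown": "# Example Domain"}'
page = parse_page(sample)
print(page.title, page.url, page.metadata)
```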

Notes

- This v1 does not use AI models internally. It is a deterministic retrieval tool only.
- The skill is single-page only. It does not do deep crawling, site maps, schema extraction, or RAG.
- `uv run` reads the inline `# /// script` dependency block in `main.py` and installs `crawl4ai` in an isolated environment.
- If browser setup is missing, run one-time setup commands such as:
  - `uv run --with crawl4ai crawl4ai-setup`
  - `uv run --with crawl4ai python -m playwright install chromium`
- Do NOT use web search for this workflow when a direct URL is available.
- Call `uv run` on `src/main.py` directly, using the full path shown in the commands above.
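For readers unfamiliar with the `# /// script` block, it is inline script metadata in the PEP 723 format: TOML wrapped in comments at the top of the file. The actual header of `main.py` is not shown in the source, so the example below is a guess at its general shape (the `requires-python` value in particular is an assumption), paired with a simplified reader. `read_script_block` is a hypothetical helper and not the parser `uv` uses.

```python
# Assumed header shape; only the `crawl4ai` dependency is confirmed by the notes above.
EXAMPLE_SOURCE = '''\
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "crawl4ai",
# ]
# ///

print("scraper entry point would go here")
'''


def read_script_block(source: str) -> str:
    """Return the TOML text inside the first `# /// script` block, or "" if absent."""
    lines = source.splitlines()
    try:
        start = lines.index("# /// script")
    except ValueError:
        return ""
    body = []
    for line in lines[start + 1:]:
        if line == "# ///":
            return "\n".join(body)
        if not line.startswith("#"):
            return ""  # a non-comment line means the block was never closed
        # Drop the leading "# " (or a bare "#") that wraps each TOML line.
        body.append(line[2:] if line.startswith("# ") else line[1:])
    return ""


print(read_script_block(EXAMPLE_SOURCE))
```

The extracted text is ordinary TOML, which is how a runner can discover the dependencies before executing the script itself.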

Use Cases

Scrape single web pages into clean markdown using Crawl4AI locally
Return deterministic JSON (title, URL, markdown, links, and metadata) for a single web page
Run scraping locally without cloud dependencies or API keys
Process scraped content for AI agent consumption and analysis
Build automated content extraction pipelines for research and monitoring

Pros & Cons

Pros

+ Local execution: no cloud dependencies, API keys, or rate limits
+ Dual output: clean markdown or deterministic JSON based on needs
+ Crawl4AI handles JavaScript rendering for dynamic content

Cons

- Single-page focused: no built-in crawling or link following
- Requires `uv` and a one-time local browser setup (Chromium via Playwright); `uv run` installs Crawl4AI itself

FAQ

What platforms support Website Scraper Pro?

Website Scraper Pro is available on Claude Code and OpenClaw.

