Scrapling - Stealth Web Scraper

Verified

Web scraping using Scrapling — a Python framework with anti-bot bypass (Cloudflare Turnstile, fingerprint spoofing), adaptive element tracking, stealth headl...

286 downloads

About This Skill

# Scrapling Skill

Source: https://github.com/D4Vinci/Scrapling (open source, BSD-3-Clause license)

PyPI: `scrapling` (install before first use; see below)

> ⚠️ Only scrape sites you have permission to access. Respect `robots.txt` and Terms of Service. Do not use stealth modes to bypass paywalls or access restricted content without authorization.

Installation (one-time, confirm with user before running)

```bash
pip install scrapling[all]
patchright install chromium  # required for stealth/dynamic modes
```

  • `scrapling[all]` installs `patchright` (a stealth fork of Playwright, bundled as a PyPI package — not a typo), `curl_cffi`, MCP server deps, and IPython shell.
  • `patchright install chromium` downloads Chromium (~100 MB) via patchright's own installer (same mechanism as `playwright install chromium`).
  • Confirm with user before running — installs ~200 MB of dependencies and browser binaries.

Script

`scripts/scrape.py` — CLI wrapper for all three fetcher modes.

```bash
# Basic fetch (text output)
python3 ~/skills/scrapling/scripts/scrape.py <url> -q

# CSS selector extraction
python3 ~/skills/scrapling/scripts/scrape.py <url> --selector ".class" -q

# Stealth mode (Cloudflare bypass); only on sites you're authorized to access
python3 ~/skills/scrapling/scripts/scrape.py <url> --mode stealth -q

# JSON output
python3 ~/skills/scrapling/scripts/scrape.py <url> --selector "h2" --json -q
```
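When the wrapper is called from another Python program rather than a shell, the flags shown above can be assembled into an argument list. The following is a minimal sketch, not part of the skill: `build_scrape_command` is a hypothetical helper, the script path is assumed to match the one in the examples, and only the flags listed above (`--selector`, `--mode`, `--json`, `-q`) are used.

```python
import sys
from pathlib import Path

# Path taken from the CLI examples; adjust if the skill is installed elsewhere.
SCRAPE_SCRIPT = Path.home() / "skills" / "scrapling" / "scripts" / "scrape.py"

VALID_MODES = ("http", "stealth", "dynamic")


def build_scrape_command(url, selector=None, mode="http", as_json=False):
    """Return the argv list for one scrape.py call (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {VALID_MODES}")
    # sys.executable stands in for `python3` so the same interpreter is reused.
    cmd = [sys.executable, str(SCRAPE_SCRIPT), url]
    if selector:
        cmd += ["--selector", selector]
    if mode != "http":  # http is the default, so the flag can be omitted
        cmd += ["--mode", mode]
    if as_json:
        cmd.append("--json")
    cmd.append("-q")
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, capture_output=True, text=True, check=True)`. Passing a list rather than a shell string avoids quoting problems with selectors such as `div > a[href]`.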

Fetcher Modes

  • http (default) — Fast HTTP with browser TLS fingerprint spoofing. Most sites.
  • stealth — Headless Chrome with anti-detect. For Cloudflare/anti-bot.
  • dynamic — Full Playwright browser. For heavy JS SPAs.

When to Use Each Mode

  • `web_fetch` returns 403/429/Cloudflare challenge → use `--mode stealth`
  • Page content requires JS execution → use `--mode dynamic`
  • Regular site, just need text/data → use `--mode http` (default)
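The rules above can be sketched as a small decision function. `choose_mode` and its parameters are hypothetical names used for illustration. Checking for a block before checking for JavaScript is an assumption about priority, since the rules do not say which wins when both apply.

```python
def choose_mode(status_code=None, cloudflare_challenge=False, needs_js=False):
    """Pick a --mode value following the guidance above (hypothetical helper)."""
    # Blocked or challenged responses call for the stealth fetcher.
    if cloudflare_challenge or status_code in (403, 429):
        return "stealth"
    # Content that only appears after JavaScript runs needs a full browser.
    if needs_js:
        return "dynamic"
    # Everything else: plain HTTP, which is the default mode.
    return "http"
```

A caller would typically try the default first and escalate only when the response indicates a block, because the browser-based modes are slower and heavier.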

Python Inline Usage

For custom logic beyond the CLI, write inline Python. See `references/patterns.md` for:

  • Adaptive scraping (`auto_save` / `adaptive`, which save element fingerprints locally)
  • Session/cookie handling
  • Async usage
  • XPath, `find_similar`, attribute extraction
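As a rough illustration of what inline usage might look like, the sketch below maps the three modes to Scrapling fetcher classes and extracts text by CSS selector. The class and method names (`Fetcher.get`, `StealthyFetcher.fetch`, `DynamicFetcher.fetch`) are assumed from recent Scrapling releases and may differ in the installed version. The mapping from skill modes to classes is also an assumption, and `extract_text` is a hypothetical helper, so treat `references/patterns.md` and the Scrapling docs as authoritative.

```python
# Assumption: class and method names as found in recent Scrapling releases.
FETCHER_BY_MODE = {
    "http": ("Fetcher", "get"),
    "stealth": ("StealthyFetcher", "fetch"),
    "dynamic": ("DynamicFetcher", "fetch"),
}


def extract_text(url, selector, mode="http"):
    """Fetch url and return the text of each element matching a CSS selector."""
    # Imported lazily so this module still loads where Scrapling is missing.
    from scrapling import fetchers

    class_name, method_name = FETCHER_BY_MODE[mode]
    fetch = getattr(getattr(fetchers, class_name), method_name)
    page = fetch(url)
    # .text holds an element's own text; nested markup is not flattened.
    return [str(element.text) for element in page.css(selector)]
```

For example, `extract_text("https://example.com", "h1")` would fetch the page over plain HTTP and return its `h1` text. Only request sites you are authorized to access.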

Notes

  • MCP server (`scrapling mcp`): starts a local network service for AI-native scraping. Only start if explicitly needed and trusted — it exposes a local HTTP server.
  • `auto_save=True`: persists element fingerprints to disk for adaptive re-scraping. Creates local state in working directory.
  • Stealth/dynamic modes use Chromium headless — no `xvfb-run` needed.
  • For large-scale crawls, use the Spider API (see Scrapling docs).

Use Cases

  • Automate browser interactions for web scraping and testing
  • Extract structured data from websites using headless browser automation
  • Control browsers via MCP protocol for AI-driven web automation
  • Navigate websites, fill forms, and capture screenshots programmatically
  • Scrape dynamic JavaScript-rendered content that simple HTTP requests cannot access

Pros & Cons

Pros

  • +Solid adoption with 572+ downloads
  • +Clean CLI interface integrates well with automation pipelines and AI agents
  • +Well-structured approach ensures consistent and reliable results
  • +Integrates smoothly into existing workflows

Cons

  • -Requires installing external dependencies before use
  • -Focused scope means it may not cover edge cases outside its primary use case
  • -May require adaptation for non-standard project configurations

FAQ

What does Scrapling - Stealth Web Scraper do?
Web scraping using Scrapling — a Python framework with anti-bot bypass (Cloudflare Turnstile, fingerprint spoofing), adaptive element tracking, stealth headl...
What platforms support Scrapling - Stealth Web Scraper?
Scrapling - Stealth Web Scraper is available on Claude Code, OpenClaw.
What are the use cases for Scrapling - Stealth Web Scraper?
Automate browser interactions for web scraping and testing. Extract structured data from websites using headless browser automation. Control browsers via MCP protocol for AI-driven web automation.
