# Scrapling Skill
Source: https://github.com/D4Vinci/Scrapling (open source, MIT-like license)
PyPI: `scrapling` (install before first use; see below)
> ⚠️ Only scrape sites you have permission to access. Respect `robots.txt` and Terms of Service. Do not use stealth modes to bypass paywalls or access restricted content without authorization.
## Installation (one-time; confirm with user before running)
```bash
pip install "scrapling[all]"   # quoted so zsh doesn't expand the brackets
patchright install chromium    # required for stealth/dynamic modes
```
- `scrapling[all]` installs `patchright` (a stealth fork of Playwright, bundled as a PyPI package — not a typo), `curl_cffi`, MCP server deps, and IPython shell.
- `patchright install chromium` downloads Chromium (~100 MB) via patchright's own installer (same mechanism as `playwright install chromium`).
- Confirm with user before running — installs ~200 MB of dependencies and browser binaries.
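After the install finishes, a quick stdlib-only sanity check confirms the key packages resolved (no assumptions about Scrapling's API; `check_install` is a throwaway helper, not part of the skill):

```python
import importlib.util

def check_install(modules=("scrapling", "patchright", "curl_cffi")):
    """Return {module_name: importable?} for Scrapling's core dependencies."""
    return {m: importlib.util.find_spec(m) is not None for m in modules}

for mod, ok in check_install().items():
    print(f"{mod}: {'ok' if ok else 'missing, rerun pip install scrapling[all]'}")
```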
## Script
`scripts/scrape.py` — CLI wrapper for all three fetcher modes.
```bash
# Basic fetch (text output)
python3 ~/skills/scrapling/scripts/scrape.py <url> -q

# CSS selector extraction
python3 ~/skills/scrapling/scripts/scrape.py <url> --selector ".class" -q

# Stealth mode (Cloudflare bypass); only on sites you're authorized to access
python3 ~/skills/scrapling/scripts/scrape.py <url> --mode stealth -q

# JSON output
python3 ~/skills/scrapling/scripts/scrape.py <url> --selector "h2" --json -q
```
## Fetcher Modes
- `http` (default): fast HTTP client with browser TLS fingerprint spoofing. Works for most sites.
- `stealth`: headless Chrome with anti-detect patches. For Cloudflare and other anti-bot protection.
- `dynamic`: full Playwright browser. For JS-heavy SPAs.
## When to Use Each Mode
- `web_fetch` returns 403/429/Cloudflare challenge → use `--mode stealth`
- Page content requires JS execution → use `--mode dynamic`
- Regular site, just need text/data → use `--mode http` (default)
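The decision rules above can be sketched as a tiny helper (illustrative only; `choose_mode` is a hypothetical name, and the return values mirror the CLI's `--mode` flag):

```python
def choose_mode(status_code=200, needs_js=False, cloudflare_challenge=False):
    """Map what a plain fetch revealed to a scrape.py --mode value."""
    if cloudflare_challenge or status_code in (403, 429):
        return "stealth"   # anti-bot wall: headless Chrome with anti-detect
    if needs_js:
        return "dynamic"   # client-side rendering: full Playwright browser
    return "http"          # default: fast HTTP with TLS fingerprint spoofing

print(choose_mode(status_code=403))  # prints: stealth
```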
## Python Inline Usage
- For custom logic beyond the CLI, write inline Python. See `references/patterns.md` for:
  - Adaptive scraping (`auto_save` / `adaptive`: saves element fingerprints locally)
  - Session/cookie handling
  - Async usage
  - XPath, `find_similar`, attribute extraction
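A minimal inline sketch, assuming the `Fetcher.get()` / `.css()` API shown in Scrapling's README (verify names against your installed version; `fetch_headings` is a made-up helper):

```python
def fetch_headings(url):
    # Deferred import: the sketch loads even before scrapling[all] is installed.
    from scrapling.fetchers import Fetcher
    page = Fetcher.get(url)                   # http mode: TLS-spoofed fetch
    return [h.text for h in page.css("h2")]   # CSS selection on the response

# Usage (uncomment once scrapling is installed and you have permission):
# print(fetch_headings("https://example.com"))
```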
## Notes
- MCP server (`scrapling mcp`): starts a local network service for AI-native scraping. Only start if explicitly needed and trusted — it exposes a local HTTP server.
- `auto_save=True`: persists element fingerprints to disk for adaptive re-scraping. Creates local state in working directory.
- Stealth/dynamic modes use Chromium headless — no `xvfb-run` needed.
- For large-scale crawls, use the Spider API (see Scrapling docs).
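The `auto_save` / `adaptive` pairing from the notes might look roughly like this (a hedged sketch: `.price` is a hypothetical selector, and the exact keyword names may differ between Scrapling versions; check `references/patterns.md`):

```python
def scrape_prices(url, first_run=False):
    # Deferred import so the sketch is loadable without scrapling installed.
    from scrapling.fetchers import Fetcher
    page = Fetcher.get(url)
    if first_run:
        # First run: persist element fingerprints as local state on disk.
        return page.css(".price", auto_save=True)
    # Later runs: re-match the saved fingerprints even after a site redesign.
    return page.css(".price", adaptive=True)

# Usage (uncomment after installing scrapling):
# print(scrape_prices("https://example.com", first_run=True))
```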
## Use Cases
- Automate browser interactions for web scraping and testing
- Extract structured data from websites using headless browser automation
- Control browsers via MCP protocol for AI-driven web automation
- Navigate websites, fill forms, and capture screenshots programmatically
- Scrape dynamic JavaScript-rendered content that simple HTTP requests cannot access
## Pros & Cons
### Pros
- Bypasses Cloudflare and similar anti-bot walls via patchright's stealth Chromium
- Three fetcher modes (`http`/`stealth`/`dynamic`) cover everything from static pages to JS-heavy SPAs
- Clean CLI interface integrates well with automation pipelines and AI agents
- Adaptive element tracking (`auto_save` / `adaptive`) can survive minor site redesigns
### Cons
- Requires ~200 MB of external dependencies and browser binaries before first use
- Stealth modes carry authorization and Terms-of-Service responsibilities (see the warning above)
- `auto_save=True` creates local state in the working directory