Browser Automation CLI
Automate web browser interactions using natural language via CLI commands. Use when the user asks to browse websites, navigate web pages, extract data from websites, take screenshots, fill forms, click buttons, or interact with web applications.
# Browser Automation
Automate browser interactions using Stagehand CLI with Claude.
First: Environment Selection (Local vs Remote)
The skill automatically selects between local and remote browser environments:
- If Browserbase API keys exist (`BROWSERBASE_API_KEY` and `BROWSERBASE_PROJECT_ID` in the `.env` file): uses the remote Browserbase environment
- If no Browserbase API keys exist: falls back to the local Chrome browser
- No user prompting: the selection happens automatically based on the available configuration
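The selection rule above can be sketched as a small shell check. This is only an illustration of the rule as described, not the skill's actual implementation: the function names `has_env_key` and `select_environment` are hypothetical, and the sketch assumes the `.env` file uses plain `KEY=value` lines.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; they mirror the documented rule, not the skill's code.

# has_env_key FILE KEY: succeeds if FILE contains a "KEY=value" line
# (optionally prefixed with "export") with something after the "=".
has_env_key() {
  local file="$1" key="$2"
  [ -f "$file" ] &&
    grep -Eq "^[[:space:]]*(export[[:space:]]+)?${key}=.+" "$file"
}

# select_environment [FILE]: prints "browserbase" when both keys are present,
# otherwise "local". FILE defaults to .env in the current directory.
select_environment() {
  local env_file="${1:-.env}"
  if has_env_key "$env_file" BROWSERBASE_API_KEY &&
     has_env_key "$env_file" BROWSERBASE_PROJECT_ID; then
    echo "browserbase"
  else
    echo "local"
  fi
}
```

Calling `select_environment` in the skill directory prints the mode that the rule would pick, which can help when it is unclear why a session ran locally.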
Setup (First Time Only)
Check `setup.json` in this directory. If `setupComplete` is `false`, run:
```bash
npm install  # Install dependencies
npm link     # Create global 'browser' command
```
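The first-time check could be scripted roughly as follows. This is a sketch under assumptions: the helper names `needs_setup` and `ensure_setup` are hypothetical, the pattern assumes `setup.json` stores the flag as `"setupComplete": false`, and treating a missing file as "setup needed" is a guess rather than documented behavior.

```bash
#!/usr/bin/env bash
# Sketch only: the grep pattern assumes setup.json stores the flag as
# "setupComplete": false (JSON, any spacing around the colon).

# needs_setup [FILE]: succeeds (exit 0) when first-time setup should run.
needs_setup() {
  local file="${1:-setup.json}"
  # Assumption: a missing setup.json is treated like setupComplete: false.
  [ -f "$file" ] || return 0
  grep -Eq '"setupComplete"[[:space:]]*:[[:space:]]*false' "$file"
}

# ensure_setup [FILE]: runs the two documented setup commands only when needed.
ensure_setup() {
  if needs_setup "${1:-setup.json}"; then
    npm install && npm link
  fi
}
```

Running `ensure_setup` from the skill directory is then a no-op once setup has been completed.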
Commands
All commands work identically in both modes:
```bash
browser navigate <url>                    # Go to URL
browser act "<action>"                    # Natural language action
browser extract "<instruction>" ['{}']    # Extract data (optional schema)
browser observe "<query>"                 # Discover elements
browser screenshot                        # Take screenshot
browser close                             # Close browser
```
Quick Example
```bash
browser navigate https://example.com
browser act "click the Sign In button"
browser extract "get the page title"
browser close
```
Mode Comparison
| Feature | Local | Browserbase |
|---------|-------|-------------|
| Speed | Faster | Slightly slower |
| Setup | Chrome required | API key required |
| Stealth mode | No | Yes |
| Proxy/CAPTCHA | No | Yes |
| Best for | Development | Production/scraping |
Best Practices
- Always navigate first before interacting
- View screenshots after each command to verify
- Be specific in action descriptions
- Close browser when done
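One way to apply these practices together is a small wrapper script. This is a minimal sketch, not part of the skill: `step` and `sign_in_flow` are hypothetical names, and it assumes the `browser` command exits with a non-zero status when a command fails, which the documentation above does not confirm.

```bash
#!/usr/bin/env bash
# Sketch only; assumes `browser` returns non-zero on failure.

# step ARGS...: run one browser command, then take a screenshot so the
# result can be checked before the next command runs.
step() {
  browser "$@" || return 1
  browser screenshot
}

# sign_in_flow URL: navigate first, act, extract, and always close.
sign_in_flow() {
  local url="$1" status=0
  step navigate "$url" &&
    step act "click the Sign In button" &&
    browser extract "get the page title" || status=$?
  # Close the browser whether or not the steps above succeeded.
  browser close
  return "$status"
}
```

For example, `sign_in_flow https://example.com` runs the Quick Example sequence with a screenshot after the navigation and the click, and still closes the browser if a step fails partway through.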
Troubleshooting
- Chrome not found: Install Chrome or use Browserbase mode
- Action fails: Use `browser observe` to discover available elements
- Browserbase fails: Verify API key and project ID are set
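The "Action fails" advice could be wrapped in a helper like the one below. It is a sketch under the assumption that `browser act` signals failure through a non-zero exit status; the function name `act_or_observe` is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only; assumes `browser act` returns non-zero when the action fails.

# act_or_observe ACTION QUERY: try the action; if it fails, run
# `browser observe` with QUERY to list candidate elements, then report failure.
act_or_observe() {
  local action="$1" query="$2"
  if ! browser act "$action"; then
    browser observe "$query"
    return 1
  fi
}
```

A call such as `act_or_observe "click the Sign In button" "find sign-in buttons or links"` shows what the page actually offers when the click does not go through, so the action description can be made more specific.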
For detailed examples, see EXAMPLES.md. For API reference, see REFERENCE.md.
Use Cases
- Navigate websites and fill forms using natural language action commands
- Extract structured data from dynamic web pages with AI-powered observation
- Take screenshots of rendered pages for visual QA or monitoring
- Scrape bot-protected sites in production using Browserbase stealth mode
- Automate multi-step web workflows like login, search, and data export
Pros & Cons
Pros
- Natural language actions (e.g. 'click the Sign In button') eliminate the need to write CSS selectors
- Auto-detects and switches between local Chrome and cloud Browserbase environments based on config
- Browserbase mode provides built-in stealth, proxy, and CAPTCHA bypass for production scraping
Cons
- Requires a Browserbase API key and project ID for cloud/stealth mode, with no free alternative built in
- Natural language action parsing can misidentify elements on complex pages with ambiguous UI labels
- First-time setup requires `npm install` and `npm link`, so it is not a zero-config experience