Browser Js

Verified

Lightweight CDP browser control for AI agents. Token-efficient alternative to the built-in browser tool — 3-10x fewer tokens per interaction. Use when browsi...

296 downloads

$ Add to .claude/skills/

$ openclaw install

About This Skill

# browser-js

Lightweight CLI that talks to Chrome via CDP (Chrome DevTools Protocol). Returns minimal, indexed output that agents can act on immediately — no accessibility tree parsing, no ref hunting.

Setup

```bash # Install dependency (one-time, in the skill scripts/ dir) cd scripts && npm install

# Ensure browser is running with CDP enabled. # With OpenClaw: # browser start profile=openclaw # Or manually: # google-chrome --remote-debugging-port=18800 --user-data-dir=~/.browser-data ```

The tool connects to `http://127.0.0.1:18800` by default. Override with `CDP_URL` env var.

Alias setup (optional)

```bash mkdir -p ~/.local/bin cat > ~/.local/bin/bjs << 'WRAPPER' #!/bin/bash exec node /path/to/scripts/browser.js "$@" WRAPPER chmod +x ~/.local/bin/bjs ```

Commands

``` bjs tabs List open tabs bjs open <url> Navigate to URL bjs tab <index> Switch to tab bjs newtab [url] Open new tab bjs close [index] Close tab bjs elements [selector] List interactive elements (indexed) bjs click <index> Click element by index bjs type <index> <text> Type into element bjs upload <path> [selector] Upload file to input (bypasses OS dialog) bjs text [selector] Extract visible page text bjs html <selector> Get element HTML bjs eval <js> Run JavaScript in page bjs screenshot [path] Save screenshot bjs scroll <up|down|top|bottom> [px] bjs url Current URL bjs back / forward / refresh bjs wait <ms>

Coordinate commands (cross-origin iframes, captchas, overlays): bjs click-xy <x> <y> Click at page coordinates via CDP Input bjs click-xy <x> <y> --double Double-click at coordinates bjs click-xy <x> <y> --right Right-click at coordinates bjs hover-xy <x> <y> Hover at page coordinates bjs drag-xy <x1> <y1> <x2> <y2> Drag between coordinates bjs iframe-rect <selector> Get iframe bounding box (for click-xy targeting) ```

How it works

`elements` scans the page for all interactive elements (links, buttons, inputs, selects, etc.) — including those inside shadow DOM (web components). This means sites like Reddit, GitHub, and other modern SPAs that use shadow DOM are fully supported. The scan recursively pierces all shadow roots.

Returns a compact numbered list:

``` [0] (link) Hacker News → https://news.ycombinator.com/news [1] (link) new → https://news.ycombinator.com/newest [2] (input:text) q [3] (button) Submit ```

Then `click 3` or `type 2 search query` — immediately actionable, no interpretation needed.

Auto-indexing: `click` and `type` auto-index elements if not already indexed. You can skip calling `elements` first and go straight to `click`/`type` after `open`. Call `elements` explicitly when you need to see what's on the page.

After navigation or AJAX changes: Elements get re-indexed automatically on next `click`/`type` if stamps are stale. For manual re-index, call `elements` again.

Real mouse events: `click` uses CDP `Input.dispatchMouseEvent` (mousePressed + mouseReleased) instead of JS `.click()`. This triggers React/Vue/Angular synthetic event handlers that ignore plain `.click()` calls. Works reliably on SPAs like Instagram, GitHub, LinkedIn.

File uploads

`upload` uses CDP's `DOM.setFileInputFiles` to inject files directly into hidden `<input type="file">` elements — no OS file picker dialog. Works with Instagram, Twitter, any site with file uploads.

```bash bjs upload ~/photos/image.jpg # auto-finds input[type=file] bjs upload ~/docs/resume.pdf "input.file-drop" # specific selector ```

Token efficiency

| Approach | Tokens per interaction | Notes | |----------|----------------------|-------| | bjs | ~50-200 | Indexed list, 1-line responses | | browser tool (snapshot) | ~2,000-5,000 | Full accessibility tree | | browser tool + thinking | ~3,000-8,000 | Plus reasoning to find refs |

Over a 10-step flow: ~1,500 tokens (bjs) vs ~30,000-80,000 (browser tool).

Typical flow

```bash bjs open https://example.com # Navigate bjs elements # See what's clickable bjs click 5 # Click element [5] bjs type 12 "hello world" # Type into element [12] bjs text # Read page content bjs screenshot /tmp/result.png # Verify visually ```

Shadow DOM support

bjs automatically pierces shadow DOM boundaries. Sites built with web components (Reddit, GitHub, etc.) work out of the box — `elements`, `click`, `type`, and `text` all recurse into shadow roots. No special flags needed.

Coordinate commands (iframes, captchas, overlays)

When you can't use `click` by index — e.g. the target is inside a cross-origin iframe (captcha checkbox, payment form, OAuth widget) — use coordinate-based commands that dispatch real CDP Input events at the OS level. These bypass all DOM boundaries.

Workflow for clicking inside an iframe: ```bash bjs iframe-rect 'iframe[title*="hCaptcha"]' # Get bounding box # Output: x=95 y=440 w=302 h=76 center=(246, 478)

bjs click-xy 125 458 # Click checkbox position ```

`iframe-rect` returns the iframe's position on the page. Add offsets to target specific elements inside it (e.g. a checkbox is typically near the left side).

Other uses:
`hover-xy` — trigger hover menus, tooltips that need mouse position
`drag-xy` — slider controls, drag-and-drop, canvas interactions
`click-xy --double` — double-click to select text, expand items
`click-xy --right` — context menus

When to use coordinate commands vs `click`:
`click <index>` — always preferred when the element shows up in `elements`
`click-xy` — only when the target is inside a cross-origin iframe or otherwise unreachable by DOM indexing

Tips

`elements` with a CSS selector narrows scope: `bjs elements ".modal"`
`eval` runs arbitrary JS and returns the result — use for custom extraction
`text` caps at 8KB — enough for most pages, won't blow up context
`html <selector>` caps at 10KB — for inspecting specific elements
Pipe through `grep` to filter: `bjs elements | grep -i "submit\|login"`

Use Cases

Navigate multi-step web flows (login, search, checkout) with 3-10x fewer tokens than built-in browser tools
Interact with shadow DOM elements on modern SPAs like Reddit, GitHub, and LinkedIn
Upload files to web forms programmatically without triggering OS file picker dialogs
Click inside cross-origin iframes (captcha checkboxes, payment forms) using coordinate-based CDP events
Extract visible page text capped at 8KB for token-efficient content scraping

Pros & Cons

Pros

+Uses ~50-200 tokens per interaction vs 2,000-8,000 for standard browser tools — 10x savings over multi-step flows
+Auto-pierces shadow DOM boundaries so modern web component sites work out of the box
+Real CDP mouse events (mousePressed/mouseReleased) trigger React/Vue/Angular handlers that plain .click() misses
+Compact indexed element list lets agents act immediately without parsing accessibility trees

Cons

-Requires Chrome running with CDP enabled (remote-debugging-port) — not a standalone browser
-Coordinate-based commands for cross-origin iframes need manual offset calculation
-No built-in session management or cookie persistence across runs

FAQ

What does Browser Js do?

Lightweight CDP browser control for AI agents. Token-efficient alternative to the built-in browser tool — 3-10x fewer tokens per interaction. Use when browsi...

What platforms support Browser Js?

Browser Js is available on Claude Code, OpenClaw.

What are the use cases for Browser Js?

Navigate multi-step web flows (login, search, checkout) with 3-10x fewer tokens than built-in browser tools. Interact with shadow DOM elements on modern SPAs like Reddit, GitHub, and LinkedIn. Upload files to web forms programmatically without triggering OS file picker dialogs.

100+ free AI tools

Writing, PDF, image, and developer tools — all in your browser.

AI Humanizer

Make AI text undetectable

AI Detector

Free, unlimited

PDF Tools

Merge, split, compress

Next Step

Use the skill detail page to evaluate fit and install steps. For a direct browser workflow, move into a focused tool route instead of staying in broader support surfaces.

Open Free Tools Try AI Detector