Browser automation has always been brittle — selectors break, flows change, and you spend more time maintaining the scraper than using the data. Browser Harness flips the model: instead of you writing every selector and handler upfront, the LLM writes what's missing as it goes. It landed 4,378 GitHub stars this week, and after spending a few days with it, I can see why. This is the first browser automation tool I've used that actually feels like delegating rather than babysitting.
What It Actually Is
Browser Harness is a thin CDP (Chrome DevTools Protocol) bridge — one WebSocket to Chrome, ~592 lines of Python, no Playwright, no Selenium, no framework overhead. The agent gets helpers.py preloaded with a starter set of browser functions, and when a task needs a function that isn't there yet, it edits helpers.py itself and carries on.
● agent: wants to upload a file
│
● helpers.py → upload_file() missing
│
● agent edits the harness: helpers.py 192 → 199 lines
│ + upload_file()
✓ file uploaded
That's not a metaphor. The agent literally opens the file, appends the function, and continues. The next time you run a task that needs file upload, it's already there. The harness gets smarter with use.
Setup is genuinely two steps: enable remote debugging in Chrome, paste the setup prompt into Claude Code or Codex, and you're running. The setup prompt itself is part of the repo — it tells the agent to read install.md, connect to your browser, and do a quick demo by starring the GitHub repo (only if you say yes).
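If you want to see what "one WebSocket to Chrome" means before handing the wheel to the agent, the mechanics are plain CDP. Here's a minimal sketch, not the harness's code, assuming Chrome was launched with --remote-debugging-port=9222 and that the websockets package is installed:
# cdp_check.py — illustrative only, not part of Browser Harness
import asyncio
import json
import urllib.request

import websockets  # pip install websockets

async def main():
    # Chrome lists its debuggable targets here when remote debugging is on
    targets = json.load(urllib.request.urlopen("http://localhost:9222/json"))
    ws_url = next(t["webSocketDebuggerUrl"] for t in targets if t["type"] == "page")

    async with websockets.connect(ws_url) as ws:
        # CDP is just JSON messages over one WebSocket
        await ws.send(json.dumps({
            "id": 1,
            "method": "Page.navigate",
            "params": {"url": "https://example.com"},
        }))
        print(await ws.recv())  # the command's result, echoed back with id 1

asyncio.run(main())
That's the entire transport the harness sits on; everything else is helpers the agent layers on top.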
Self-Healing Selectors in Practice
Traditional scrapers break because they're hardcoded to a DOM that changes. Browser Harness sidesteps this — the LLM reads the live DOM, figures out what to click, and if the selector it used last time doesn't exist, it finds another path. There's no selector registry to maintain.
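"Reads the live DOM" is concrete, not hand-wavy: with the evaluate-style access the starter helpers already have, the agent can ask the page what's clickable right now and pick a target by its text. A hypothetical helper in that spirit (the name and shape are mine, not the repo's):
async def list_clickables() -> list[dict]:
    # Hypothetical helper: surface visible, interactive elements so the model
    # can choose by text instead of relying on a remembered selector.
    return await page.evaluate('''() =>
        [...document.querySelectorAll('a, button, [role="button"], input[type="submit"]')]
            .filter(el => el.offsetParent !== null)
            .map(el => ({
                tag: el.tagName.toLowerCase(),
                text: (el.innerText || el.value || '').trim().slice(0, 80),
                id: el.id || null,
            }))''')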
The domain-skills/ directory is where task-specific knowledge accumulates. When the agent figures out something non-obvious about a site — say, that LinkedIn's message input only becomes editable after a specific click sequence — it writes that to a skill file. You don't write the skills; the agent does, from real execution.
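The skill files themselves are plain notes, not code. Something like this is the shape of what accumulates (a made-up example building on the LinkedIn case above, not a file from the repo):
# domain-skills/linkedin-messaging.md — hypothetical, agent-written
- The message input only becomes editable after a specific click sequence:
  open the thread, click the recipient row, then focus the composer.
- Don't trust cached selectors for the composer; re-read the DOM each run.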
# helpers.py — this is what the agent starts with and extends
# `page` is the harness's handle to the live Chrome tab over CDP

async def click(selector: str):
    el = await page.querySelector(selector)
    await el.click()

async def type_text(selector: str, text: str):
    el = await page.querySelector(selector)
    await el.click()
    await el.type(text)

async def get_text(selector: str) -> str:
    el = await page.querySelector(selector)
    return await page.evaluate('el => el.textContent', el)

# The agent adds upload_file(), handle_captcha(), scroll_to_bottom()
# as it needs them. You never write these manually.
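For a sense of what an agent-written addition looks like, here's roughly the shape upload_file() might take. This is my guess, assuming the element handles expose a puppeteer-style uploadFile(); the real version is whatever the agent writes against the harness that session:
# Hypothetical agent-written helper — assumes element handles expose a
# puppeteer-style uploadFile(); adjust to whatever the harness provides.
async def upload_file(selector: str, path: str):
    el = await page.querySelector(selector)  # the <input type="file"> element
    await el.uploadFile(path)                # attach the local file to the input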
The free cloud tier (3 concurrent browsers, proxies, captcha solving, no card required) means you can spin up sub-agents on isolated browsers for parallel tasks without touching your local Chrome session.
Wiring It Into a Next.js/Supabase App
The practical integration pattern is straightforward: run Browser Harness as a sidecar Python process, expose a simple HTTP endpoint, and call it from your Next.js API routes. Store results in Supabase.
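The repo doesn't ship an HTTP server, so the sidecar half is yours to define. Here is a minimal sketch of what that wrapper could look like, using FastAPI; the framework choice, the /run route, and the run_task() hook are all mine, and just need to line up with the Next.js route below:
# sidecar.py — illustrative wrapper, not part of Browser Harness
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Job(BaseModel):
    task: str
    url: str

async def run_task(task: str, url: str) -> dict:
    # Placeholder: invoke your agent (Claude Code, Codex, ...) with the task
    # here, let it drive the browser through helpers.py, and return its output.
    return {"task": task, "url": url, "status": "not implemented"}

@app.post("/run")
async def run(job: Job) -> dict:
    return {"data": await run_task(job.task, job.url)}

# Run with: uvicorn sidecar:app --port 8765
On the Next.js side, the route handler just POSTs to that endpoint and stores what comes back: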
// app/api/scrape/route.ts
import { createClient } from '@supabase/supabase-js'

export async function POST(req: Request) {
  const { task, url } = await req.json()

  // Browser Harness runs as a local sidecar on :8765
  const result = await fetch('http://localhost:8765/run', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ task, url })
  }).then(r => r.json())

  const supabase = createClient(
    process.env.NEXT_PUBLIC_SUPABASE_URL!,
    process.env.SUPABASE_SERVICE_KEY!
  )

  await supabase.from('scrape_results').insert({
    task,
    url,
    data: result.data,
    created_at: new Date().toISOString()
  })

  return Response.json(result)
}
For production, swap the local sidecar for the cloud browser API and you've got a scalable pipeline with no browser infra to manage.
What I'd Build With This
Competitive pricing monitor. Point it at 10 competitor product pages, have it extract current prices and stock status on a schedule, store in Supabase, surface changes in a Next.js dashboard. The self-healing selectors mean it survives redesigns without maintenance.
AI-assisted job application tracker. Give it a list of job board URLs, have it extract role details, apply basic filters (seniority, salary range where visible), and log results with a screenshot to Supabase Storage. The domain skill for each job board builds up automatically over multiple runs.
End-to-end test writer for your own app. Run it against your staging environment with a task like "complete the onboarding flow as a new user". It'll navigate, find the path through, write the helpers it needs, and you end up with a replayable skill file that doubles as a regression test — without writing a single Playwright test manually.
I've been waiting for browser automation that doesn't feel like constant maintenance. The self-writing helpers and agent-generated skill files are what actually make this different. Whether it holds up across genuinely complex, multi-session workflows at scale is still an open question for me, but for internal tools and agent pipelines, this is the first approach I'd reach for.