
Hermes Agent + Scrapeless: 1-Line CDP Integration for Anti-Detection Web Agents

James Thompson

Scraping and Proxy Management Expert

06-May-2026

TL;DR:

  • One config-line integration. Hermes Agent by Nous Research has a built-in browser tool that already speaks the Chrome DevTools Protocol. Pointing it at Scrapeless Scraping Browser is a single browser.cdp_url line in ~/.hermes/config.yaml. No SDK install, no CLI subprocess, no agent-side code change.
  • Every Hermes browser action runs in the Scrapeless cloud browser. browser_navigate, browser_snapshot, browser_click, browser_type, browser_scroll, browser_press, browser_get_images, and browser_vision execute against Chromium hosted in the Scrapeless cloud, behind residential proxies, with anti-detection fingerprinting on every session.
  • Multi-channel reach. Hermes' gateway exposes the agent over Telegram, Discord, Slack, WhatsApp, Signal, email, and a CLI. With Scrapeless wired in, the cloud browser becomes the back-end of any chat-driven research, lead-generation, or monitoring workflow without exposing a separate scraping endpoint.
  • Anti-detection cloud browser, residential proxies in 195+ countries. Scrapeless Scraping Browser handles JavaScript rendering, residential-proxy egress, fingerprint randomization (UA, timezone, WebGL, canvas), and session persistence at the platform level, so the agent can focus on the task instead of evasion plumbing.
  • Three integration paths, same underlying cloud browser. Direct CDP (this post), the Scrapeless MCP server, or the Scrapeless skill dropped into ~/.hermes/skills/. Pick the surface that matches how the rest of the agent is configured.
  • Free to start. New Scrapeless accounts include free Scraping Browser runtime; sign up at app.scrapeless.com.

Introduction: from local Chromium to a hardened cloud browser

Hermes Agent is an open-source autonomous agent with persistent memory, autonomous skill creation, and a multi-channel gateway. Out of the box it ships a browser tool that uses an accessibility-tree model: pages render to text snapshots with interactive elements labeled @e1, @e2, @e3, and the LLM drives navigation and form-fill against those refs. That works well on benign pages and for documentation lookups.

The commercial web is a different surface. Cloudflare Turnstile, reCAPTCHA, Akamai Bot Manager, IP-reputation lists, and JavaScript-only SPAs sit between every retailer, marketplace, and SERP and any automated client. Local Chromium, even in headed mode, falls into the bucket of traffic those layers reject. The work the agent could do (pull pricing from a category page, monitor a competitor's listings, fill an authenticated form, extract a typed dataset for downstream RAG) stalls at the first interstitial.

Scrapeless Scraping Browser is an anti-detection cloud browser powered by a self-developed Chromium build. It exposes a Chrome DevTools Protocol endpoint, ships residential proxies in 195+ countries, and randomizes fingerprints per session. Hermes' browser tool already speaks CDP. The integration is one config line. This post walks through the wiring, the prompts the agent will accept, and the discover → extract pattern that scales the combination across sites. For the same Scrapeless Scraping Browser primitive over the Model Context Protocol, see the MCP integration post; for the Python LangChain surface, see the LangChain integration post.


What You Can Do With It

  • Multi-channel research assistant. A Telegram or Slack message asking "compare the homepage and pricing pages of these three competitors" turns into a real cloud-browser session that renders each page, extracts structured records, and replies inside the chat thread.
  • Lead generation from public directories. Have the agent walk a directory listing, extract contact rows, and dedupe by domain; the cloud browser handles the JS rendering and the proxy rotation that the directory's anti-bot layer requires.
  • Pricing and stock monitoring. Schedule the agent to render product pages on a cadence, diff against the previous snapshot, and ping a Discord channel when a tracked SKU drops below a threshold.
  • Authenticated form-fill with human in the loop. Drive a job application or a vendor onboarding form to the final review screen, take a full-page screenshot, and stop before submit so a human can approve.
  • SERP monitoring across regions. Pin a residential proxy country (proxy_country=DE, proxy_country=JP) per session and pull the result list a local user would see.
  • Visual-regression QA on web apps. Use browser_vision and full-page screenshots to compare staging vs. production renders without standing up a separate Playwright pipeline.
  • Live web data for RAG. Render publisher pages to clean text, hand the result to the agent's extractor, and embed the typed records into Hermes' persistent memory for retrieval-augmented answers in future turns.

Why Scrapeless Scraping Browser

Scrapeless Scraping Browser is a customizable, anti-detection cloud browser designed for web crawlers and AI agents. For Hermes Agent specifically, it brings:

  • Chrome DevTools Protocol surface: Hermes' browser tool already speaks CDP. The cloud browser drops in behind the same tool calls without recompilation, configuration sprawl, or new code paths.
  • Residential proxies in 195+ countries: geo-bound queries return the listings a local user would see, with rotation per session and no per-request setup.
  • Cloud-side JavaScript rendering: full Chromium with the page hydrated before extraction, so SPAs, infinite-scroll feeds, and lazy-loaded panels are first-class targets for browser_snapshot and browser_vision.
  • Anti-detection fingerprinting on every session: UA, timezone, language, screen resolution, WebGL, and canvas randomized per session; consistent identities available via custom-fingerprint API when continuity matters.
  • Session persistence through session_ttl (60–900 seconds) and session_name query parameters on the WSS endpoint, so multi-step Hermes flows reuse the same warm browser, cookies, and scroll position across tool calls.
  • Single management surface: one API key, one cloud account, dashboard-side recording for replay, and free runtime credits on the new-account plan.

Get your API key on the free plan at app.scrapeless.com. The full Scraping Browser surface is documented at docs.scrapeless.com.


Prerequisites

  • Hermes Agent installed. The official installer covers Linux, macOS, WSL2, and Termux on Android: curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash. The setup wizard runs on first launch.
  • A Scrapeless account and API key: sign up at app.scrapeless.com and copy the key from Settings → API Key Management.
  • Python 3.11 or newer: Hermes' runtime requirement.
  • A chat-model API key: Hermes is provider-agnostic (Nous Portal, OpenRouter, NVIDIA NIM, Xiaomi MiMo, and any custom OpenAI-compatible endpoint). Configure whichever provider Hermes is already wired to.
  • Basic familiarity with editing ~/.hermes/config.yaml or running Hermes' CLI subcommands.

Install

The full setup is four sub-steps. Each one is independently verifiable, so you can pause and confirm before moving on.

1. Get the Scrapeless API key

Sign up at app.scrapeless.com, open the dashboard, and from Settings → API Key Management create a key. Copy the value; it goes into the Hermes config in step 2.

2. Point Hermes' browser tool at the Scrapeless WSS endpoint

Open ~/.hermes/config.yaml (create the file if it does not exist) and add the browser.cdp_url line. The Scrapeless CDP endpoint accepts the API key, the proxy country, and the session TTL as query parameters:

yaml
browser:
  cdp_url: "wss://browser.scrapeless.com/browser?token=YOUR_SCRAPELESS_API_KEY&proxy_country=US&session_ttl=600"

That single line redirects every Hermes browser tool call (browser_navigate, browser_snapshot, browser_click, browser_type, browser_scroll, browser_press, browser_get_images, browser_vision) through the Scrapeless cloud browser. The accessibility-tree representation Hermes uses to label @e1, @e2, @e3 is generated by the same Chromium build the cloud browser runs, so existing prompts and skills keep working.

If editing the YAML by hand is inconvenient, the CLI form /browser connect "wss://browser.scrapeless.com/browser?token=YOUR_SCRAPELESS_API_KEY&proxy_country=US" does the same thing for the current session without persisting it.

3. Set the API key out of the config file (recommended)

For shared repos or multi-user shells, keep the secret out of the YAML. Hermes config supports ${VAR} substitution; export the key once and reference it from the URL:

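macOS / Linux (bash or zsh): append the export line to ~/.zshrc or ~/.bashrc, then reload the file in the current shell:

bash
# Add this line to ~/.zshrc (or ~/.bashrc)
export SCRAPELESS_API_KEY="your_api_token_here"
# Then run this once in the current shell to load it; do not add it to the rc file itself
source ~/.zshrc          # or ~/.bashrc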

Windows (PowerShell), persistent and user-scoped (open a new terminal afterwards so the variable is visible):

powershell
[Environment]::SetEnvironmentVariable("SCRAPELESS_API_KEY", "your_api_token_here", "User")

Then update the config to interpolate the variable:

yaml
browser:
  cdp_url: "wss://browser.scrapeless.com/browser?token=${SCRAPELESS_API_KEY}&proxy_country=US&session_ttl=600"
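
To make the ${VAR} substitution concrete, here is a minimal Python sketch of the idea using the standard library's string.Template. It is an illustration only, not Hermes' actual config loader, and the helper name expand_config_value is hypothetical.

```python
import os
from string import Template
from typing import Mapping, Optional


def expand_config_value(raw: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace ${VAR} placeholders in a config value with environment values.

    Hypothetical helper for illustration; Hermes' own loader may differ.
    Raises KeyError when a referenced variable is unset, which surfaces a
    missing key early instead of sending a literal "${...}" as the token.
    """
    source = os.environ if env is None else env
    return Template(raw).substitute(source)
```

Failing loudly on an unset variable is a deliberate choice in this sketch: an unexpanded placeholder in the token position would otherwise look like an authentication problem on the Scrapeless side.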

4. Verify the connection

Restart Hermes so it picks up the new config, then ask the agent:

"Open https://example.com, take a full-page screenshot, and tell me the page title."

A successful run returns the page title (Example Domain) and the screenshot path within a few seconds. If the agent reports a 401 or a hang on browser_navigate, see the FAQ: almost every first-run failure is the API key, the proxy region, or the WSS URL pasted with a stray space.
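
Those three failure modes can be checked locally before restarting Hermes. The sketch below is a hypothetical preflight helper (preflight_cdp_url is not part of Hermes or Scrapeless); it only inspects the URL string and assumes proxy_country is a two-letter code, as in the examples in this post.

```python
from urllib.parse import parse_qs, urlsplit


def preflight_cdp_url(url: str) -> list[str]:
    """Return human-readable problems found in a CDP URL; empty means none found."""
    problems = []
    # A stray space or newline from copy-paste is the classic first-run failure.
    if any(ch.isspace() for ch in url):
        problems.append("URL contains whitespace")
    parts = urlsplit(url.strip())
    if parts.scheme != "wss":
        problems.append("scheme should be wss")
    query = parse_qs(parts.query)
    token = query.get("token", [""])[0].strip()
    # Catch an empty token, the tutorial placeholder, or an unexpanded ${VAR}.
    if not token or token.startswith("${") or token == "YOUR_SCRAPELESS_API_KEY":
        problems.append("token is missing or still a placeholder")
    country = query.get("proxy_country", [None])[0]
    if country is not None:
        if not (len(country) == 2 and country.isascii() and country.isalpha()):
            problems.append("proxy_country should be a two-letter country code")
    return problems
```

An empty list does not prove the key is valid; it only rules out the mistakes that can be seen without contacting the endpoint.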


How you actually use this: prompt your agent

After the config-line change, you drive Scrapeless from Hermes by talking to the agent, not by writing CDP glue. The agent owns the discover → extract loop and picks the browser tools turn by turn. Hermes' multi-channel gateway means the same prompts work from Telegram, Discord, Slack, WhatsApp, Signal, email, or the local CLI.

Prompts you can paste

| You type | What the agent does |
| --- | --- |
| "Open https://news.ycombinator.com and return the top five stories with title, URL, author, score, and comment count as JSON." | browser_navigate → browser_snapshot → typed extraction. |
| "Compare the pricing pages of these three SaaS competitors and summarize the differences." | Multi-tab navigation, browser_get_images for plan tiers, LLM summary. |
| "Pull the homepage and pricing page from https://example.com as if I were in Tokyo." | Restart the session with proxy_country=JP (see "Pin a region" below), then render. |
| "Watch this Greenhouse careers page and tell me which roles match staff engineer or infra." | Navigate, snapshot the listing block, filter rows by keyword, return structured rows. |
| "Take a full-page screenshot of https://example.com and save it to example.png." | browser_navigate → browser_screenshot (full=true). |
| "Fill the contact form at <URL> with my name, email, and a short message, but stop before submit so I can review." | Snapshot the form, map prompt fields to @e1/@e2/…, browser_type, screenshot, halt at the submit ref. |
| "The extraction came back empty yesterday; rerun with session recording enabled so I can replay it." | Reissue the same flow with recording=true on the WSS URL; replay link surfaces in the Scrapeless dashboard. |
| "Open the Amazon product page at <URL> from a US egress and return title, price, rating, review count." | Pinned-region session, snapshot, structured extract. |

Worked example

You type:

Open https://news.ycombinator.com, return the top five stories with rank, title, URL, author, score, age, and comment count as a JSON array.

The agent's plan (in plain English):

  1. browser_navigate "https://news.ycombinator.com/": open the page in the cloud browser.
  2. browser_snapshot: read the accessibility tree to find the row labels and text.
  3. Iterate the first five tr.athing rows; pull title, URL, author, score, age, and comment count from each row and its sibling.
  4. Return a typed JSON array; treat any field absent on the row as null rather than failing the whole extraction.
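
The "absent fields become null" rule in step 4 can be sketched in Python as follows. The Story dataclass and the helper names are hypothetical and mirror the fields requested in the prompt; the agent's own extractor is not shown in this post, so treat this as one possible shape rather than its implementation.

```python
import json
from dataclasses import asdict, dataclass, fields
from typing import Any, Optional


@dataclass
class Story:
    rank: Optional[int] = None
    title: Optional[str] = None
    url: Optional[str] = None
    author: Optional[str] = None
    score: Optional[int] = None
    age: Optional[str] = None
    comments: Optional[int] = None


INT_FIELDS = ("rank", "score", "comments")


def to_story(raw: dict[str, Any]) -> Story:
    """Coerce one loosely typed row; absent or unparsable fields become None."""
    values: dict[str, Any] = {}
    for field in fields(Story):
        value = raw.get(field.name)
        if field.name in INT_FIELDS and value is not None:
            try:
                value = int(value)
            except (TypeError, ValueError):
                value = None  # keep the row, drop only the bad field
        values[field.name] = value
    return Story(**values)


def to_json(rows: list[dict[str, Any]], limit: int = 5) -> str:
    """Serialize the first `limit` rows as a JSON array of typed records."""
    return json.dumps([asdict(to_story(row)) for row in rows[:limit]], indent=2)
```

Python's None serializes to JSON null, so a row with a missing comment count still produces a complete record instead of failing the whole extraction.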

What you get back:

json
[
  { "rank": 1, "title": "Claude Opus 4.7", "url": "https://www.anthropic.com/news/claude-opus-4-7", "author": "meetpateltech", "score": 889, "age": "3 hours ago", "comments": 693 },
  { "rank": 2, "title": "Codex for Almost Everything", "url": "https://openai.com/index/codex-for-almost-everything/", "author": "mikeevans", "score": 208, "age": "1 hour ago", "comments": 84 },
  { "rank": 3, "title": "Qwen3.6-35B-A3B: Agentic coding power, now open to all", "url": "https://qwen.ai/blog?id=qwen3.6-35b-a3b", "author": "cmitsakis", "score": 586, "age": "4 hours ago", "comments": 286 },
  { "rank": 4, "title": "Cloudflare's AI Platform: an inference layer designed for agents", "url": "https://blog.cloudflare.com/ai-platform/", "author": "nikitoci", "score": 145, "age": "5 hours ago", "comments": 29 },
  { "rank": 5, "title": "Launch HN: Kampala (YC W26) – Reverse-Engineer Apps into APIs", "url": "https://www.zatanna.ai/kampala", "author": "alexblackwell_", "score": 37, "age": "3 hours ago", "comments": 28 }
]
// Schema reflects the typed shape the agent returns. Field values are illustrative samples.

Shaping prompts

| Phrasing | Effect |
| --- | --- |
| "Use a German egress." | Restart the cloud-browser session with proxy_country=DE on the WSS URL. |
| "Keep the session warm for the next ten minutes." | Bump session_ttl=600 so multi-step flows reuse the same browser. |
| "Enable session recording." | Append recording=true; the dashboard exposes a replayable video for the run. |
| "Return markdown, not raw HTML." | The agent feeds the snapshot through its extractor and returns the markdown view. |
| "Stop before the final submit." | Hermes' built-in pattern: drive the form, screenshot, halt at the submit ref. |

Steps 1–5 below are the under-the-hood reference. Read them once to see how the discover → extract pattern composes; then trust the agent to apply it to whatever request the operator hands over from chat.


Step 1: Connect to Scrapeless Scraping Browser

The connection is the WSS URL from the install step. The Hermes browser tool dials it on first use and reuses the same socket for the lifetime of the session.

yaml
# ~/.hermes/config.yaml
browser:
  cdp_url: "wss://browser.scrapeless.com/browser?token=${SCRAPELESS_API_KEY}&proxy_country=US&session_ttl=600"

Three query parameters do most of the work:

  • token: the Scrapeless API key. Required.
  • proxy_country: the residential proxy country (ISO 3166-1 alpha-2, e.g. US, DE, JP, GB). Defaults to a global pool; pin it for geo-bound listings.
  • session_ttl: how long the cloud browser stays alive after the last command, in seconds. Range 60–900. Higher TTLs are right for multi-step flows; the default of 60 is right for one-shot extractions.

Scrapeless manages session spin-up and anti-bot handling at the platform layer, so the cloud browser returns a warm, ready session to Hermes without any connection-handling logic on the agent side.
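
For deployments that generate the config rather than hand-edit it, the three parameters can be assembled programmatically. The following is a minimal sketch, assuming only the endpoint and parameter names shown above; build_cdp_url is a hypothetical helper, and the 60–900 check simply encodes the range stated in this post.

```python
from typing import Optional
from urllib.parse import urlencode

# Endpoint as shown in the config above.
BASE_URL = "wss://browser.scrapeless.com/browser"


def build_cdp_url(
    token: str,
    proxy_country: Optional[str] = None,
    session_ttl: Optional[int] = None,
) -> str:
    """Assemble the WSS URL; optional parameters are omitted when not given."""
    if not token:
        raise ValueError("token is required")
    params = {"token": token}
    if proxy_country is not None:
        params["proxy_country"] = proxy_country
    if session_ttl is not None:
        # Range taken from the parameter description in this post.
        if not 60 <= session_ttl <= 900:
            raise ValueError("session_ttl must be between 60 and 900 seconds")
        params["session_ttl"] = str(session_ttl)
    return f"{BASE_URL}?{urlencode(params)}"
```

Using urlencode rather than string concatenation keeps the URL well-formed if a value ever contains a character that needs escaping.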


Step 2: Discover with browser_navigate + browser_snapshot

Open the page and read it as an accessibility tree before extracting. The snapshot returns text labels for every interactive ref (@e1, @e2, @e3, …) and the surrounding text content, which is enough for the agent to pick the right element without guessing CSS selectors.

text
You: Open https://example.com/products and snapshot the page.
Agent: browser_navigate "https://example.com/products"
       browser_snapshot
       [returns accessibility tree with @e1 = search input, @e2 = sort dropdown,
        @e3..@e22 = product cards with title + price + rating refs]

browser_snapshot is the load-bearing call. It is what turns a CDP-style live page into something an LLM can reason about turn by turn. Skip it and the agent ends up running browser_get_html and slicing strings, which is more brittle and uses more tokens. The snapshot is the discover step in the discover → extract pattern; every extraction below assumes it ran first.


Step 3: Extract with structured prompts

With the snapshot in hand, the extraction is a regular LLM tool-call: the agent reads the refs and surrounding text, picks the fields it needs, and returns a typed record. No CSS selectors, no JS evals; the snapshot already contains the data the model needs.

text
You: From the snapshot, return the top 10 products as JSON with title, price, rating, and product URL.
Agent: [returns JSON array with 10 rows; missing fields are null]

For non-trivial pages (multi-tab catalogs, infinite-scroll feeds, A/B-rendered variants), supplement the snapshot with browser_scroll to hydrate lazy-loaded panels, then re-snapshot. The cloud browser handles the JS rendering; Hermes handles the loop.
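
The scroll-then-re-snapshot loop can be sketched as below. The two callables stand in for whatever drives browser_snapshot plus extraction and browser_scroll in a given setup; they, the function name, and the default required fields are assumptions for illustration, not Hermes APIs.

```python
from typing import Callable, Optional

Record = dict[str, Optional[str]]


def extract_with_rescroll(
    snapshot_and_extract: Callable[[], list[Record]],
    scroll: Callable[[], None],
    required: tuple[str, ...] = ("title", "price"),
    max_scrolls: int = 3,
) -> list[Record]:
    """Re-extract after scrolling until required fields are filled or the budget is spent."""
    rows = snapshot_and_extract()
    for _ in range(max_scrolls):
        complete = bool(rows) and all(
            row.get(name) is not None for row in rows for name in required
        )
        if complete:
            break
        scroll()  # hydrate lazy-loaded panels, then look again
        rows = snapshot_and_extract()
    return rows
```

The scroll budget matters: a field that is genuinely absent on the page would otherwise keep the loop running, so the sketch returns the last result with nulls intact once max_scrolls is reached.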


Step 4: Drive a multi-step interaction

The same browser tools handle form-fill, navigation chains, and human-in-the-loop reviews. The pattern: snapshot → identify ref → act → snapshot → next ref.

text
You: Open https://app.example.com/contact, fill name, email, and message,
     screenshot the form, and stop before submit so I can review.
Agent: browser_navigate "https://app.example.com/contact"
       browser_snapshot
       # @e1 [input] "Full name", @e2 [input] "Email",
       # @e3 [textarea] "Message", @e4 [button] "Submit"
       browser_type @e1 "Jane Doe"
       browser_type @e2 "jane@example.com"
       browser_type @e3 "Hello, I'd like to talk about ..."
       browser_screenshot --full review.png
       # halt: @e4 is not pressed until the human approves review.png

Real input events fire on the cloud browser, so client-side form validation runs exactly as it would for a human visitor. The stop-before-submit pattern keeps a human in the loop at the last step, which is the recommended default for any action with real-world consequences (job applications, vendor onboarding, payment forms).
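
As a rough illustration of the snapshot → identify ref → act pattern with the submit button held back, the sketch below parses ref lines in the shape shown in the transcript above (@e1 [input] "Full name") and plans browser_type steps. The line format is inferred from that transcript, and parse_refs and plan_form_fill are hypothetical names; real snapshots may be structured differently.

```python
import re
from typing import NamedTuple


class Ref(NamedTuple):
    ref: str
    role: str
    label: str


# Matches entries such as: @e1 [input] "Full name"
REF_PATTERN = re.compile(r'(@e\d+)\s+\[(\w+)\]\s+"([^"]*)"')


def parse_refs(snapshot: str) -> list[Ref]:
    """Extract (ref, role, label) triples from snapshot text."""
    return [Ref(*match.groups()) for match in REF_PATTERN.finditer(snapshot)]


def plan_form_fill(snapshot: str, values: dict[str, str]) -> list[str]:
    """Plan browser_type steps for labeled fields; buttons are never acted on."""
    steps = []
    for ref in parse_refs(snapshot):
        if ref.role == "button":
            continue  # human in the loop: Submit stays untouched
        if ref.label in values:
            steps.append(f'browser_type {ref.ref} "{values[ref.label]}"')
    return steps
```

Filtering on role rather than on the label "Submit" means a renamed button ("Send", "Apply") is still left for the reviewer.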


Step 5: Pin a region and persist a session across turns

For any target where listings vary by egress region (Google SERPs, Amazon by-marketplace, hotel/flight booking, local-business directories), pin proxy_country to the user's intended region. For multi-step flows that need warm cookies and session state across multiple agent turns (paginated SERPs, authenticated dashboards, multi-page forms), set session_ttl higher and reuse the same session_name.

yaml
# Tokyo egress, 15-minute warm session, replayable video
browser:
  cdp_url: "wss://browser.scrapeless.com/browser?token=${SCRAPELESS_API_KEY}&proxy_country=JP&session_ttl=900&session_name=tokyo-research&recording=true"

Switching regions mid-conversation is a /browser connect away: Hermes drops the current socket, dials the new URL, and the next browser_navigate runs through the new exit. Recording is the highest-leverage flag for an unattended pipeline: every run shows up in the Scrapeless dashboard as a replayable video, so when the agent reports an empty extraction the operator can see what the cloud browser actually rendered.
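
Producing the new URL for a region switch is a matter of replacing one query parameter and leaving the rest alone. A minimal sketch with the standard library, where with_params is a hypothetical helper and the parameter names are the ones used in this post:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_params(url: str, **overrides: str) -> str:
    """Return url with the given query parameters replaced or appended."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    params.update(overrides)
    # safe="${}" leaves a ${SCRAPELESS_API_KEY} placeholder unescaped.
    return urlunsplit(parts._replace(query=urlencode(params, safe="${}")))
```

For example, with_params(url, proxy_country="DE") yields the string to hand to /browser connect for a German egress, with the token and session_ttl carried over unchanged.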


What You Get Back

json
[
  { "rank": 1, "title": "Claude Opus 4.7", "url": "https://www.anthropic.com/news/claude-opus-4-7", "author": "meetpateltech", "score": 889, "age": "3 hours ago", "comments": 693 },
  { "rank": 2, "title": "Codex for Almost Everything", "url": "https://openai.com/index/codex-for-almost-everything/", "author": "mikeevans", "score": 208, "age": "1 hour ago", "comments": 84 },
  { "rank": 3, "title": "Qwen3.6-35B-A3B: Agentic coding power, now open to all", "url": "https://qwen.ai/blog?id=qwen3.6-35b-a3b", "author": "cmitsakis", "score": 586, "age": "4 hours ago", "comments": 286 }
]
// Schema reflects the typed shape the agent returns from a Hacker News snapshot.
// Field values are illustrative samples.

A few honest observations on what to expect when this runs against the live web:

  • Hydration timing varies by site. The cloud browser waits on domcontentloaded by default. SPAs that hydrate prices through a second XHR may need a browser_scroll or a brief wait before the snapshot reflects the final DOM. Re-snapshot once if a field is consistently null.
  • Selector-free extraction is more resilient than CSS selectors but not immune to layout drift. The accessibility tree changes when sites add a new column or rename a button; re-prompt the agent to re-discover refs rather than encoding them in a saved skill.
  • Anti-bot interstitials show up as a redirect in the snapshot. When a site front-loads a Cloudflare or Akamai challenge that the cloud browser cannot transparently complete, the snapshot reports the challenge page rather than the target. Widen the fingerprint or pin a different proxy region.
  • browser_vision complements browser_snapshot. For visually complex pages (price tables embedded as images, charts, infographics), the vision tool is the right escape hatch: it sends a screenshot to the multimodal model rather than the accessibility text.
  • Session recording is cheap and high-leverage. recording=true on the WSS URL costs nothing on the free plan and turns "the agent did something weird" into a clickable video in the dashboard.

Conclusion: scale your Hermes-driven web data pipeline

Wiring Hermes Agent to Scrapeless Scraping Browser collapses to one config line: browser.cdp_url in ~/.hermes/config.yaml pointed at the Scrapeless WSS endpoint with token, proxy_country, and session_ttl set. Every existing Hermes browser tool (browser_navigate, browser_snapshot, browser_click, browser_type, browser_scroll, browser_press, browser_get_images, browser_vision) flows through the cloud browser with anti-detection, residential proxies, and JavaScript rendering handled at the platform layer. The agent keeps the prompts and skills it already had; the cloud browser keeps the agent on real pages.

For the same Scrapeless Scraping Browser primitive over different protocols, see the MCP integration post and the LangChain integration post. For site-specific worked examples that drill into the discover → extract pattern, see the Amazon scraper post, the Etsy scraper post, the Google search scraper post, or the Home Depot scraper post. The pattern that holds in production is consistent across all of them: pin a region, snapshot before extracting, persist the session for multi-step flows, and treat absent fields as nullable.


Ready to Build Your AI-Powered Data Pipeline?

Join our community to claim a free plan and connect with developers building Hermes-driven data pipelines on Scrapeless: Discord · Telegram.

Sign up at app.scrapeless.com for free Scraping Browser runtime and adapt the patterns above to the channels, regions, and pages your Hermes deployment needs. See the Scraping Browser product page and pricing to scale beyond the free tier.


FAQ

Q: Is web scraping with Hermes Agent and Scrapeless legal?

Scraping publicly visible data is broadly permitted in most jurisdictions, but rules vary by country and by site terms of service. Review the target site's ToS, respect the Robots Exclusion Protocol where applicable, do not collect personal data without a lawful basis, and consult counsel for commercial-scale pipelines.

Q: Do I need a residential proxy?

Yes for any site with meaningful anti-bot protection, which is most retailers, marketplaces, and SERP endpoints. The Scrapeless WSS endpoint routes through the residential pool by default; the proxy_country query parameter pins the egress country.

Q: The first connection hangs or returns a 401. What now?

A 401 means the token is wrong: recheck the API key in the WSS URL. A hang usually means a malformed proxy_country or a stray space in the URL. Scrapeless manages session spin-up and anti-bot handling server-side, so no connection-handling logic is needed on the agent side; once the URL is correct, the cloud browser returns a warm session.

Q: A site returns Access Denied. What now?

Change proxy_country to a different region and reissue /browser connect to spin a fresh fingerprint. If the page persistently blocks, contact Scrapeless support to confirm the block is at the platform level rather than at the account level.

Q: Selectors keep breaking. How do I survive DOM rotation?

Use browser_snapshot rather than browser_get_html + CSS selectors. The accessibility-tree representation is more stable across layout drift, and the agent re-discovers refs each turn instead of relying on hard-coded paths.

Q: How many concurrent workers per host?

Three concurrent renders per host is the documented ceiling for stable runs. For multi-host pipelines, run independent agent loops per host rather than one loop hammering a single domain.
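
One way to enforce a per-host ceiling inside a single process is a semaphore keyed by hostname. The sketch below uses asyncio; render is a placeholder for whatever coroutine performs the actual page render, and render_all is a hypothetical name rather than anything provided by Hermes or Scrapeless.

```python
import asyncio
from collections import defaultdict
from typing import Awaitable, Callable
from urllib.parse import urlsplit

MAX_PER_HOST = 3  # the ceiling quoted above


async def render_all(
    urls: list[str], render: Callable[[str], Awaitable[str]]
) -> list[str]:
    """Run render(url) for every URL with at most MAX_PER_HOST in flight per host."""
    limits: dict[str, asyncio.Semaphore] = defaultdict(
        lambda: asyncio.Semaphore(MAX_PER_HOST)
    )

    async def guarded(url: str) -> str:
        host = urlsplit(url).hostname or ""
        async with limits[host]:
            return await render(url)

    # gather preserves input order in its results.
    return list(await asyncio.gather(*(guarded(url) for url in urls)))
```

Because the limit is keyed by hostname, URLs for different hosts do not block each other, which matches the advice to run independent loops per host.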

Q: Can I use this without an AI agent?

Yes. The Scrapeless WSS endpoint is plain CDP: any Puppeteer or Playwright script connects with puppeteer.connect({ browserWSEndpoint: ... }) or chromium.connectOverCDP(...) and gets the same cloud browser. Hermes is the recommended path when chat-driven research or multi-channel reach matters; the CDP endpoint is the lower-level fallback.
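
For readers working in Python, the Playwright equivalent of chromium.connectOverCDP is connect_over_cdp. The sketch below assumes the playwright package is installed and that cdp_url is the WSS URL from the install step; fetch_title is a hypothetical helper, and nothing here has been verified against a live Scrapeless session.

```python
def fetch_title(cdp_url: str, page_url: str) -> str:
    """Connect to a remote CDP endpoint and return the title of page_url."""
    # Imported lazily so this module loads even where Playwright is absent.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.connect_over_cdp(cdp_url)
        try:
            # Reuse the remote browser's default context when one exists.
            context = browser.contexts[0] if browser.contexts else browser.new_context()
            page = context.new_page()
            page.goto(page_url)
            return page.title()
        finally:
            browser.close()
```

Reusing the existing context instead of always creating a new one is a choice made for this sketch, so that cookies already present in a named session remain visible to the script.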

Q: Can I swap Hermes for another agent?

Yes. Any agent that supports a custom CDP endpoint (Browserbase-style integrations, Browser Use, custom Playwright/Puppeteer pipelines, the Scrapeless MCP server, or the Scrapeless skill drop-in) connects to the same WSS endpoint. The integration surface is the protocol, not the client.

Q: How do I keep cookies and login state across multiple agent turns?

Set session_ttl to a longer value (300–900 seconds), give the session a stable session_name, and avoid restarting the connection between calls. The cloud browser keeps the same Chromium profile, cookies, and scroll position warm across the lifetime of the session.

Q: Where do I see what the cloud browser actually rendered?

Append recording=true to the WSS URL. Every run surfaces in the Scrapeless dashboard as a replayable video, so an empty extraction or a surprise interstitial is visible end to end without instrumenting the agent.

Q: Should I use the WSS endpoint, the MCP server, or the skill drop-in?

Pick the surface that matches the rest of the agent. WSS is the smallest change for an agent that already speaks CDP (this post). The MCP server is the right choice for multi-tool MCP-first setups that already manage tool surfaces through the protocol. The Scrapeless skill dropped into ~/.hermes/skills/ is the right choice when the rest of the deployment relies on agentskills.io for tool discovery. All three drive the same underlying cloud browser.

At Scrapeless, we only access publicly available data while strictly complying with applicable laws, regulations, and website privacy policies. The content in this blog is for demonstration purposes only and does not involve any illegal or infringing activities. We make no guarantees and disclaim all liability for the use of information from this blog or third-party links. Before engaging in any scraping activities, consult your legal advisor and review the target website's terms of service or obtain the necessary permissions.
