How to Integrate Scrapeless MCP Server into ZeroClaw: A Step-by-Step Guide
Specialist in Anti-Bot Strategies
Key Takeaways:
- One TOML block wires the cloud browser into a local Rust agent. ZeroClaw is a single-binary AI agent runtime that talks to LLM providers, listens on 30+ channels, and acts through tools. Adding the Scrapeless MCP Server is a four-line `[mcp]` block in `~/.zeroclaw/config.toml`: no SDK install, no daemon to manage, no agent-side code change.
- Twenty MCP tools, two surfaces. The Scrapeless MCP Server exposes `google_search`, `google_trends`, the full `browser_*` cloud-browser primitive set, and `scrape_html` / `scrape_markdown` / `scrape_screenshot`. Stdio transport runs locally via `npx -y scrapeless-mcp-server`; streamable HTTP points at `https://api.scrapeless.com/mcp`.
- MCP and Agent Skills are complementary, not alternatives. The MCP server gives ZeroClaw the tools; the Scrapeless OpenClaw skills (`webunlocker-skill` and `llm-chat-scraper-skill`) give it the how-to for the underlying Scrapeless APIs. ZeroClaw migrated from OpenClaw and reads the same `SKILL.md` format, so the skills drop into `~/.zeroclaw/workspace/skills/` and become callable through `zeroclaw skills list`.
- Anti-detection cloud browser, residential proxies in 195+ countries. Scrapeless handles JavaScript rendering, residential-proxy egress, fingerprint randomization (UA, timezone, WebGL, canvas), and session persistence at the platform level, so the ZeroClaw agent focuses on the task instead of the evasion plumbing.
- Discover → extract works across any site. Use `google_search` to locate the page, `scrape_markdown` to pull clean text from a JS-rendered SPA, the `browser_*` tools for paginated or interactive flows, and `google_trends` for time-series context. The agent composes them; nothing in the protocol is target-specific.
- Free to start. New Scrapeless accounts include free MCP runtime; sign up at app.scrapeless.com.
Introduction: from a local Rust agent to live web access
ZeroClaw is a Rust agent runtime that runs entirely on the operator's machine. One binary, one TOML config, the operator's keys, the operator's workspace. It speaks to ~20 LLM providers, reaches the world through Discord, Telegram, Matrix, email, voice, webhooks, and a CLI, and acts through shell, browser, HTTP, hardware, and MCP-server tools. The 31k-star repository ships a security model built around supervised autonomy, OS-level sandboxes (Landlock, Bubblewrap, Seatbelt, Docker), and cryptographic tool receipts on every action.
The fundamental limit of any local agent runtime is the same one every LLM hits: the model's knowledge is frozen at training cutoff. For research, monitoring, lead generation, competitive intelligence, and RAG against live publisher data, that limit shows up the moment the agent has to read a page that did not exist when the model was trained. ZeroClaw's built-in browser and HTTP tools cover benign pages and documentation lookups; commercial pages behind Cloudflare, Akamai, reCAPTCHA, or IP-reputation filtering are a different surface that those tools were not engineered for.
This post walks through wiring Scrapeless into ZeroClaw through both integration surfaces the runtime supports: the Scrapeless MCP Server (the canonical way to expose new tools to the agent) and the Scrapeless OpenClaw skills (canonical knowledge files the agent loads to drive those tools effectively). The two complement each other: the MCP server is what the agent calls; the skills are what tell it when and how to call the underlying Scrapeless APIs. For the same Scrapeless primitive surfaced through other clients, the MCP server tutorial walks through Claude Desktop / Cursor / Codex CLI, and the Hermes integration post covers the direct-CDP path for agents that already speak Chrome DevTools Protocol.
What Is ZeroClaw?
ZeroClaw is a single Rust binary that boots an agent runtime on the operator's own machine. The maintainers describe it as "you own the agent, you own the data, you own the machine it runs on." The runtime is structured around four moving pieces:
- Channels (30+ adapters). Inbound messages from Discord, Telegram, Matrix, email, voice, webhooks, the CLI, and the ACP IDE bridge, all routed to the same agent loop.
- Providers (~20 LLM backends). Anthropic, OpenAI, Ollama, any OpenAI-compatible endpoint. Fallback chains and routing keep the agent running when a provider flakes.
- Tools (shell, browser, HTTP, hardware, MCP). The action surface. MCP servers register as first-class tools alongside the built-ins.
- Security policy and SOP engine. Default autonomy is `supervised`: medium-risk operations require approval, high-risk are blocked. Standard Operating Procedures fire on MQTT, webhook, cron, or peripheral events with approval gates and resumable runs.
Configuration lives in one place: `~/.zeroclaw/config.toml`. The workspace (skills, memory, logs, MCP state) lives under `~/.zeroclaw/workspace/`. Operators migrating from OpenClaw can import the workspace directly; the skill format is the same.
Why Add Web Access to Your ZeroClaw Agent
LLMs powering ZeroClaw share the same constraint: training cutoff. In a fast-moving environment that produces three observable failure modes: outdated answers, hallucinated facts, and tool calls against URLs that have since rotated or 404'd.
ZeroClaw ships built-in http and browser tools, and they cover a broad surface. They are not optimized for the commercial web: JS-rendered SPAs, anti-bot interstitials, CAPTCHA challenges, and geo-restricted content sit between the agent and the data the operator actually wants. Wiring Scrapeless in turns those failure modes into normal tool calls:
- Real-time research through `google_search` (Google, with localized `gl` + `hl` parameters) and `google_trends` (time-series interest data).
- Cross-source validation by running `scrape_markdown` against multiple result URLs in a single agent turn.
- Live data collection from JS-heavy sites (pricing pages, marketplace listings, review pages, public directories) through the `browser_*` cloud-browser primitives.
- Geo-bound queries by allocating sessions in a specific country, so the agent sees what a local user would see.
How to Extend ZeroClaw With Scrapeless: Two Surfaces
Scrapeless supports ZeroClaw through two surfaces, used together:
- Scrapeless MCP Server: the official server exposing 20 cloud-browser, SERP, and scraping tools over the Model Context Protocol.
- Scrapeless OpenClaw skills: `SKILL.md`-formatted knowledge files that teach the agent how to drive the Scrapeless Universal Scraping API and the LLM Chat Scraper effectively. ZeroClaw imports OpenClaw skills directly.
The MCP server is what the agent invokes. The skills are what the agent reads to decide when and how to invoke. They are not alternatives; installed together, the agent has both the tools and the playbook.
Scrapeless MCP Server
The MCP server ships 20 tools out of the box. The core set:
| Tool | What it does |
|---|---|
| `google_search` | SERP retrieval with `gl` / `hl` localization parameters. |
| `google_trends` | Trending search and time-series interest data. |
| `scrape_markdown` | Render a URL through the cloud browser, return Markdown. |
| `scrape_html` | Same, returning full rendered HTML. |
| `scrape_screenshot` | Capture a high-quality screenshot of any page. |
| `browser_create` | Allocate (or reuse) a cloud browser session. |
| `browser_goto` | Navigate the session to a URL. |
| `browser_click` / `browser_type` / `browser_press_key` | Drive interactive page elements. |
| `browser_scroll` / `browser_scroll_to` | Trigger lazy-loaded content. |
| `browser_get_html` / `browser_get_text` | Extract from the current cloud-browser page. |
| `browser_screenshot` / `browser_snapshot` | Capture state for review or downstream processing. |
| `browser_wait_for` / `browser_wait` | Wait for selectors or fixed durations. |
| `browser_close` | Release the session. |
Two transports are supported. Stdio (npx -y scrapeless-mcp-server) is the right default for a workstation running ZeroClaw locally; streamable HTTP (https://api.scrapeless.com/mcp) is the right default when the agent runs on a remote host and the operator wants the MCP server hosted by Scrapeless rather than spawned per-invocation.
Scrapeless OpenClaw Skills
The skills are SKILL.md files with a small Python runtime that wraps a specific Scrapeless API. Both ship on the official Scrapeless GitHub org:
| Skill | What it teaches the agent |
|---|---|
| `webunlocker-skill` | Drive the Scrapeless Universal Scraping API: fetch HTML / Plaintext / Markdown / screenshots / structured content with automatic CAPTCHA solving (reCAPTCHA, Cloudflare Turnstile, Cloudflare Challenge), JS rendering, residential-proxy egress with `--country`, retry, and POST + custom-header support. |
| `llm-chat-scraper-skill` | Collect structured chat responses from ChatGPT, Gemini, Perplexity, and Grok; useful for AI-search monitoring and GEO measurement workflows. |
ZeroClaw inherits the OpenClaw skill format. Skills get cloned into ~/.zeroclaw/workspace/skills/, are listed by zeroclaw skills list, and become available to the agent on the next zeroclaw agent session.
What You Can Do With It
- Daily monitoring agent. Schedule a ZeroClaw SOP that runs each morning: `google_search` for tracked keywords, `scrape_markdown` the top three results, summarize, deliver via the Discord channel adapter.
- AI-search visibility tracking. With the LLM Chat Scraper skill, pull the responses ChatGPT, Gemini, Perplexity, and Grok produce for brand-relevant prompts on a cadence; track presence and sentiment over time.
- Lead generation from public directories. Drive the cloud browser through a paginated public directory, dedupe by domain, hand the records to the agent's memory store.
- Authenticated form-fill with human in the loop. Drive a vendor onboarding or job-application form to the final review screen, take a full-page screenshot, stop before submit so a human can approve.
- Geo-bound competitor pricing. Allocate the session in a specific country, render the localized pricing page, diff against the previous snapshot, ping a channel when a threshold trips.
- RAG against live publisher data. Render publisher pages to clean text through `scrape_markdown`, embed into ZeroClaw's SQLite + embeddings memory, retrieve for future turns.
- Bypass Cloudflare for benign research targets. The Web Unlocker skill handles Turnstile and Challenge pages automatically; the agent only sees a clean Markdown payload.
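The "dedupe by domain" step in the lead-generation item can be sketched in shell. This is a minimal sketch, assuming the agent has already written one URL per line to a file; `dedupe_by_domain` and the file names in the usage note are hypothetical and not part of ZeroClaw or Scrapeless.

```bash
# dedupe_by_domain: read URLs on stdin, keep the first URL seen for each host.
# Hypothetical helper for illustration. It only inspects the host part of
# scheme://host/path URLs and treats "www.example.com" and "example.com" alike.
dedupe_by_domain() {
  awk -F/ '
    {
      host = tolower($3)          # field 3 of "https://host/path" split on "/"
      sub(/^www\./, "", host)     # fold the www. prefix into the bare domain
    }
    host != "" && !seen[host]++   # print only the first line for each host
  '
}
```

A typical call would be `dedupe_by_domain < leads.txt > leads.unique.txt`; the surviving lines keep their original order.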
At Scrapeless, we only access publicly available data while strictly complying with applicable laws, regulations, and website privacy policies. The content in this post is for demonstration purposes only.
Why Scrapeless
Scrapeless is an anti-detection cloud browser plus a Universal Scraping API plus a SERP API plus an LLM Chat Scraper, all behind one API key. For ZeroClaw specifically, it brings:
- A native MCP server: no SDK install, no adapter code. The MCP block in `~/.zeroclaw/config.toml` is the entire integration.
- Cloud-side JavaScript rendering so SPAs, infinite-scroll feeds, and lazy-loaded panels are first-class targets for the `browser_*` tools and `scrape_markdown`.
- Residential proxies in 195+ countries so geo-bound queries return the listings a local user would see.
- Anti-detection fingerprinting on every session: UA, timezone, language, screen resolution, WebGL, canvas randomized per session.
- Automatic CAPTCHA solving for reCAPTCHA, Cloudflare Turnstile, and Cloudflare Challenge through the Web Unlocker surface.
- A single management surface: one API key, one dashboard, free runtime credits on the new-account plan.
Get the API key on the free plan at app.scrapeless.com. The full MCP tool surface is documented at github.com/scrapeless-ai/scrapeless-mcp-server; the API surface at docs.scrapeless.com.
Prerequisites
- A UNIX-like host. Linux, macOS, or WSL2 on Windows. ZeroClaw publishes Windows builds, but the install script and skill scripts assume a POSIX shell, so the smoothest path is Linux / macOS / WSL2.
- Node.js 18 or newer for the MCP stdio transport (`npx -y scrapeless-mcp-server`).
- Python 3.10 or newer for the OpenClaw skills (they ship as Python scripts in `scripts/`).
- Rust toolchain if installing from source; the prebuilt binary path needs nothing extra.
- A Scrapeless account and API key: sign up at app.scrapeless.com and copy the key from Settings → API Key Management.
- An LLM provider key: Anthropic, OpenAI, Ollama, or any OpenAI-compatible endpoint. ZeroClaw's onboarding wizard wires it in.
- `git` for cloning the skills repos. `jq` is optional: handy when piping CLI output, not required for the MCP path.
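The version floors above can be checked before installing anything. The following is a minimal preflight sketch; `version_at_least` and `preflight` are hypothetical helper names, and the comparison is numeric per field so that Python 3.10 is correctly treated as newer than 3.9.

```bash
# version_at_least HAVE WANT: succeeds when HAVE >= WANT, comparing up to three
# dot-separated numeric fields. A leading "v" (as in `node --version`) is ignored.
version_at_least() {
  local have="${1#v}" want="${2#v}" a b i
  for i in 1 2 3; do
    a="${have%%.*}"; b="${want%%.*}"
    a="${a%%[!0-9]*}"; b="${b%%[!0-9]*}"      # drop suffixes such as "-rc1"
    [ "${a:-0}" -gt "${b:-0}" ] && return 0
    [ "${a:-0}" -lt "${b:-0}" ] && return 1
    # Drop the field just compared; an exhausted string becomes empty.
    case "$have" in *.*) have="${have#*.}" ;; *) have="" ;; esac
    case "$want" in *.*) want="${want#*.}" ;; *) want="" ;; esac
  done
  return 0
}

# preflight: report missing or too-old tools on stderr, return non-zero if any.
preflight() {
  local rc=0 v
  if command -v node >/dev/null 2>&1; then
    v="$(node --version)"
    version_at_least "$v" 18 || { echo "node $v found, need 18+" >&2; rc=1; }
  else
    echo "node not found (needed for the stdio transport)" >&2; rc=1
  fi
  if command -v python3 >/dev/null 2>&1; then
    v="$(python3 -c 'import platform; print(platform.python_version())')"
    version_at_least "$v" 3.10 || { echo "python $v found, need 3.10+" >&2; rc=1; }
  else
    echo "python3 not found (needed for the skills)" >&2; rc=1
  fi
  command -v git >/dev/null 2>&1 || { echo "git not found" >&2; rc=1; }
  return "$rc"
}
```

Running `preflight && echo "prerequisites look fine"` gives a quick yes/no before the install steps that follow.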
Install ZeroClaw
The full setup is two sub-steps.
1. Run the installer
```bash
curl -fsSL https://raw.githubusercontent.com/zeroclaw-labs/zeroclaw/master/install.sh | bash
```
The installer asks whether to fetch a prebuilt binary (takes seconds) or build from source (slower, customizable). Both end the same way: `zeroclaw onboard` kicks off automatically. To skip the wizard at the end, pass `--skip-onboard` and run `zeroclaw onboard` later.
Verify the binary is on the path:
```bash
zeroclaw --version
```
The output should look like zeroclaw 0.7.5 or newer.
2. Complete the onboarding wizard
```bash
zeroclaw onboard
```
The wizard walks through provider selection, channel wiring, autonomy mode, and personalization. For this integration, two settings matter:
- Provider: pick whichever LLM provider is already configured (OpenAI, Anthropic, Ollama, an OpenAI-compatible gateway). Paste the API key when prompted.
- Autonomy: `supervised` is the safe default; the agent will prompt before invoking medium-risk tools. The MCP tools count as medium-risk by default. For a development box where prompting is friction, the wizard also exposes `yolo` mode, which the operator should turn on only on a trusted machine.
Confirm the runtime is up by starting a chat:
```bash
zeroclaw agent
```
A "Hey!" should return a normal completion. If it does, the runtime is healthy and the next step is wiring in the MCP server.
Connect ZeroClaw to the Scrapeless MCP Server
1. Smoke-test the MCP server outside ZeroClaw
Before adding the MCP block to `config.toml`, confirm the server starts standalone. ZeroClaw lazy-loads MCP servers on agent start, so a broken config surfaces only the first time the agent runs; better to catch it now:
```bash
SCRAPELESS_KEY="<YOUR_SCRAPELESS_KEY>" npx -y scrapeless-mcp-server
```
On the first run, npx downloads scrapeless-mcp-server from the registry and the server starts over stdio. The process stays attached; press Ctrl-C to release it. If it printed a startup banner and is waiting for MCP requests, the credentials and the package both work.
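For a slightly stronger smoke test than watching for a banner, the server can be asked to list its tools. The sketch below only builds the three newline-delimited JSON-RPC 2.0 messages that an MCP client sends over stdio (`initialize`, the `notifications/initialized` notification, then `tools/list`). The helper name `mcp_probe_messages`, the client name, and the protocol version string are assumptions made for illustration; the server may negotiate a different version.

```bash
# mcp_probe_messages: print a minimal MCP handshake followed by a tools/list
# request, one JSON-RPC message per line, ready to pipe into a stdio server.
# Hypothetical helper; clientInfo values and protocolVersion are assumptions.
mcp_probe_messages() {
  printf '%s\n' \
    '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"smoke-test","version":"0.0.1"}}}' \
    '{"jsonrpc":"2.0","method":"notifications/initialized"}' \
    '{"jsonrpc":"2.0","id":2,"method":"tools/list"}'
}

# Example (not run here): keep stdin open briefly so the server has time to
# answer before it sees end-of-file.
#   { mcp_probe_messages; sleep 5; } | SCRAPELESS_KEY="<YOUR_SCRAPELESS_KEY>" npx -y scrapeless-mcp-server
```

If the handshake works, the response with `"id":2` should carry the tool list. Whether this particular server replies before stdin closes is an assumption, which is why the example holds the pipe open for a few seconds.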
Get your API key on the free plan: app.scrapeless.com
2. Add the MCP block to ~/.zeroclaw/config.toml
ZeroClaw reads MCP server configuration from a [mcp] block in the global config. Add (or merge) the following:
```toml
# ~/.zeroclaw/config.toml
[mcp]
enabled = true
deferred_loading = true
servers = [
  { name = "scrapeless", command = "npx", transport = "stdio", args = ["-y", "scrapeless-mcp-server"], env = { SCRAPELESS_KEY = "<YOUR_SCRAPELESS_KEY>" }, headers = {} }
]
```
Notes:
- `enabled = true` activates the MCP subsystem. Recent ZeroClaw builds default it off.
- `deferred_loading = true` keeps the daemon startup fast; ZeroClaw spawns `npx` only when the agent actually starts a session.
- `env.SCRAPELESS_KEY` is the auth surface: the same key the smoke test in step 1 used.
- For the hosted streamable-HTTP transport instead of stdio, swap the entry for:

  ```toml
  { name = "scrapeless", transport = "http", url = "https://api.scrapeless.com/mcp", headers = { "x-api-token" = "<YOUR_SCRAPELESS_KEY>" } }
  ```

  ZeroClaw's MCP client stack supports three transport values (`stdio`, `http`, and `sse`) with validation enforcing `command`/`args` for stdio and `url`/`headers` for remote transports (per ZeroClaw issue #1380). The HTTP transport is the right default when ZeroClaw runs on a remote host (a VPS or a container) and the operator does not want `npx` running there.
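Two of the failure causes this guide names (the MCP subsystem left disabled, and a key that was never filled in) can be caught with a quick text check before starting the agent. This is a rough sketch, not a TOML parser: `check_mcp_config` is a hypothetical helper that greps line by line, so an `enabled = true` in some other table would also satisfy it.

```bash
# check_mcp_config [FILE]: crude line-based sanity check of the MCP settings.
# Returns 0 when the file looks plausible, 1 on a likely problem, 2 if missing.
check_mcp_config() {
  local cfg="${1:-$HOME/.zeroclaw/config.toml}" rc=0
  [ -f "$cfg" ] || { echo "missing: $cfg" >&2; return 2; }
  grep -Eq '^[[:space:]]*\[mcp\]' "$cfg" \
    || { echo "no [mcp] table in $cfg" >&2; rc=1; }
  grep -Eq '^[[:space:]]*enabled[[:space:]]*=[[:space:]]*true' "$cfg" \
    || { echo "enabled = true not found" >&2; rc=1; }
  # The literal placeholder from this guide means the key was never filled in.
  if grep -Fq '<YOUR_SCRAPELESS_KEY>' "$cfg"; then
    echo "placeholder API key still present" >&2; rc=1
  fi
  return "$rc"
}
```

Because the config now holds an API key, restricting it to the owner with `chmod 600 ~/.zeroclaw/config.toml` is a reasonable extra step.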
3. Verify the connection from inside ZeroClaw
Restart the agent session so it picks up the new config and lazy-loads the MCP server:
```bash
zeroclaw agent
```
In a fresh chat, ask:
Which Scrapeless MCP tools do you have access to?
The agent should enumerate the 20 tools listed earlier: `google_search`, `google_trends`, the `browser_*` set, `scrape_html`, `scrape_markdown`, `scrape_screenshot`. If the answer says zero tools, the most common cause is `enabled = false` in `[mcp]`; the second most common is a typo in `SCRAPELESS_KEY`.
Install the Scrapeless OpenClaw Skills
The MCP server is the tools. The skills are the playbook. Both Scrapeless skills work with ZeroClaw because the runtime supports the OpenClaw skill format directly.
1. Allow skill scripts in ~/.zeroclaw/config.toml
Both Scrapeless skills ship scripts/ directories that the agent executes. Set allow_scripts = true in the [skills] section:
```toml
# ~/.zeroclaw/config.toml
[skills]
allow_scripts = true
```
allow_scripts is off by default for safety. Turning it on grants ZeroClaw permission to run skill-bundled scripts under the autonomy policy already in force; medium-risk script invocations still prompt for approval under supervised mode.
2. Clone the skill repositories
```bash
mkdir -p ~/.zeroclaw/workspace/skills
git clone https://github.com/scrapeless-ai/webunlocker-skill ~/.zeroclaw/workspace/skills/webunlocker-skill
git clone https://github.com/scrapeless-ai/llm-chat-scraper-skill ~/.zeroclaw/workspace/skills/llm-chat-scraper-skill
```
3. Install the Python dependencies and the API token
The Web Unlocker skill ships a requirements.txt:
```bash
cd ~/.zeroclaw/workspace/skills/webunlocker-skill
pip install -r requirements.txt
cp .env.example .env
# Then edit .env and set X_API_TOKEN=<YOUR_SCRAPELESS_KEY>
```
Repeat for the LLM Chat Scraper skill if it is in scope for the agent.
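The copy-and-edit step can also be scripted, which helps when the same key has to be written into more than one skill directory. A minimal sketch follows, assuming each skill directory carries a `.env.example` as described above; `set_skill_token` is a hypothetical helper, not something the skills ship.

```bash
# set_skill_token SKILL_DIR TOKEN: create .env from .env.example if needed,
# then set (or replace) the X_API_TOKEN line and leave other lines untouched.
set_skill_token() {
  local dir="$1" token="$2" env_file
  env_file="$dir/.env"
  [ -f "$env_file" ] || cp "$dir/.env.example" "$env_file" || return 1
  # Keep every line except an existing X_API_TOKEN entry. grep exits non-zero
  # when nothing is left to print, which is fine here.
  grep -v '^X_API_TOKEN=' "$env_file" > "$env_file.tmp" || true
  printf 'X_API_TOKEN=%s\n' "$token" >> "$env_file.tmp"
  mv "$env_file.tmp" "$env_file"
  chmod 600 "$env_file"   # the file now holds a secret
}
```

For example, `set_skill_token ~/.zeroclaw/workspace/skills/webunlocker-skill "$SCRAPELESS_KEY"` writes the token from an environment variable (the variable name is just one option) rather than typing it into an editor.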
4. Verify the skills are visible to ZeroClaw
```bash
zeroclaw skills list
```
The output should include `webunlocker-skill` and `llm-chat-scraper-skill`. If they are missing, the most common cause is that the clone landed under `~/.zeroclaw/skills/` instead of `~/.zeroclaw/workspace/skills/`; the latter is the path the runtime watches.
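The misplaced-clone case is easy to detect mechanically. The sketch below lists any directory under `~/.zeroclaw/skills/` that contains a `SKILL.md`; `misplaced_skills` is a hypothetical helper name, and the check relies only on the two paths mentioned above.

```bash
# misplaced_skills [ROOT]: print skill directories that sit under ROOT/skills/
# (the wrong place) instead of ROOT/workspace/skills/. Prints nothing if clean.
misplaced_skills() {
  local root="${1:-$HOME/.zeroclaw}" d
  for d in "$root"/skills/*/; do
    # An unmatched glob stays literal, so this test also covers "no such dir".
    [ -f "${d}SKILL.md" ] && echo "${d%/}"
  done
  return 0
}
```

Each path it prints can then be moved with `mv <path> ~/.zeroclaw/workspace/skills/` before re-running `zeroclaw skills list`.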
ZeroClaw + Scrapeless in Action
A realistic worked example: a daily competitive-intelligence brief on a topic the operator tracks. The agent locates fresh sources, extracts the content, and produces a structured summary, delivered to whichever channel the agent is bound to.
In zeroclaw agent, paste:
Build me a competitive-intelligence brief on "AI agent frameworks" for the last 7 days.
1. Use the Scrapeless MCP `google_search` tool to find the 5 most relevant news / blog
posts published this week. Use gl=us, hl=en.
2. For each result URL, use `scrape_markdown` to pull the article body. Discard
navigation chrome and ads.
3. Use `google_trends` to fetch the 7-day interest curve for the query
"AI agent frameworks" so I have the demand signal alongside the supply signal.
4. Produce a structured Markdown report with:
- Top 3 themes across the 5 articles, each with a one-sentence summary and the
source URL.
- The 7-day trend direction (up / flat / down) and the peak day.
- A "what changed this week" callout: anything new vs. last week's brief.
If a target page blocks the cloud browser, fall back to `browser_create` +
`browser_goto` + `browser_get_text` for that URL only. Don't substitute synthetic
content; if a source can't be retrieved, list it under "unretrieved sources".
The agent's plan, in plain English:
- Call `google_search(q="AI agent frameworks", gl="us", hl="en")` and pick the five freshest results that look like primary sources (skip aggregator pages).
- Iterate the URLs through `scrape_markdown` and keep the cleaned body text in working memory.
- Call `google_trends(q="AI agent frameworks", date="now 7-d")` for the interest curve.
- Summarize into a Markdown brief.
- For any URL that returns an anti-bot interstitial through `scrape_markdown`, retry through the `browser_create` → `browser_goto` → `browser_get_text` chain, which warms a cloud browser session and waits for hydration before extracting.
Before each tool call, ZeroClaw's supervised autonomy mode prompts for approval: Y for one-shot approval, A to remember the permission for future tool calls in the same session.
To send the prompt without entering the interactive chat:
```bash
zeroclaw agent --message "Build me a competitive-intelligence brief on AI agent frameworks for the last 7 days..."
```
To turn this into a scheduled run instead of an ad-hoc prompt, register an SOP on a cron schedule and bind it to whichever channel adapter the agent should deliver the brief through (Discord, Telegram, email). The MCP tools and the skill stay the same; only the trigger changes.
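For the ad-hoc form, a long prompt is easier to maintain in a file than inline on the command line. The wrapper below is a minimal sketch built only on the `zeroclaw agent --message` form shown above; `run_brief` and the file layout are hypothetical, and it assumes the agent prints its final reply on stdout and then exits. Under supervised autonomy the tool calls still prompt for approval, so this is not a substitute for a properly configured SOP.

```bash
# run_brief PROMPT_FILE OUT_DIR: send a saved prompt through the non-interactive
# `zeroclaw agent --message` form and keep the reply as a dated Markdown file.
# Prints the path of the file it wrote.
run_brief() {
  local prompt_file="$1" out_dir="$2" out
  [ -r "$prompt_file" ] || { echo "cannot read $prompt_file" >&2; return 2; }
  mkdir -p "$out_dir" || return 1
  out="$out_dir/brief-$(date +%Y-%m-%d).md"
  # Assumption: the reply arrives on stdout once the agent turn completes.
  zeroclaw agent --message "$(cat "$prompt_file")" > "$out" || return 1
  echo "$out"
}
```

A call such as `run_brief ~/prompts/ci-brief.txt ~/briefs` leaves one file per day, which also gives the "what changed this week" step something to compare against.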
What You Get Back
The brief comes back as a Markdown payload along the lines of the following, captured from an actual run of the prompt above against five live SERP results for "AI agent frameworks 2026":
```markdown
# AI Agent Frameworks: Weekly Brief (week of 12-May-2026)
## Themes (last 7 days)
1. **LangGraph is the consensus production standard.** All three deep
comparisons published this week (Towards AI, GuruSup, Alice Labs) rank
LangGraph #1 for production workloads. The cited reasons converge:
deterministic graph execution, native human-in-the-loop checkpoints,
and first-class observability through LangSmith.
Source: https://pub.towardsai.net/top-ai-agent-frameworks-in-2026-a-production-ready-comparison-7ba5e39ad56d
2. **MCP is emerging as the cross-framework tool-integration standard.**
Anthropic's Model Context Protocol, now governed by the Linux Foundation
with OpenAI, Google, Microsoft, AWS, and Salesforce on the supporter list,
is referenced as the agent-to-tool standard in two of the three comparisons.
Source: https://gurusup.com/blog/best-multi-agent-frameworks-2026
3. **The AutoGen / AG2 split is the major 2025–2026 development.** Microsoft
rewrote AutoGen as v0.4+ with a new API; the community continued the v0.2
lineage as AG2 (ag2.ai). Both Alice Labs and GuruSup flag this as a "pick
deliberately" moment for teams evaluating multi-agent debate frameworks.
Source: https://alicelabs.ai/en/insights/best-ai-agent-frameworks-2026
## Demand signal
- 7-day trend: unavailable (google_trends returned a transient upstream error
on this run; retry on next schedule)
## What changed this week
- Alice Labs added Claude Agent SDK as a new entrant at #2, displacing CrewAI
to #3; first ranking we've seen elevate Anthropic's official SDK above
the multi-agent generalists.
- AutoGen / AG2 fork status referenced in 2 of 3 articles, up from 0 last week.
## Unretrieved sources
- (none: alicelabs.ai SPA required the browser_* fallback path; recovered)
```
The structure follows the prompt; the values are what the verified tool chain actually returned on the day the brief ran. A few honest observations grounded in the live run:
- `scrape_markdown` cleans most publisher pages well. Towards AI and GuruSup returned clean Markdown bodies on the first attempt. Heavily JS-rendered SPAs (alicelabs.ai is a Webflow / Vite SPA in this run) returned the rendered HTML shell instead; the agent recovered through the `browser_create` → `browser_goto` → `browser_get_text` chain, which returned a fully structured page snapshot including the ranked list, key takeaways, FAQ, and the May-2026 update timestamp.
- `google_trends` is interest, not volume, and is sometimes transient. On the verification run the upstream Trends call returned a `load failed` error; the prompt handles this by reporting the gap rather than substituting synthetic data. The right retry posture is the next scheduled run, not a hot retry inside the same agent turn.
- Per-source freshness varies. Some publishers backfill timestamps when they update articles; if "freshness" matters absolutely, cross-check the published date in the article body, not the SERP snippet. (The Alice Labs page in this run shows both an April-2026 publish date and a May-2026 update date in the body.)
- Anti-bot interstitials and SPA shells are normal, not exceptions. Budget for the `browser_*` fallback in any prompt that touches commercial sites at scale; the verification run hit one in three URLs and the recovery was uneventful.
Conclusion: an agent that reads the live web
The ZeroClaw + Scrapeless integration reduces to four moves the operator runs once: install ZeroClaw, register the Scrapeless MCP server in `~/.zeroclaw/config.toml`, drop the OpenClaw skills into `~/.zeroclaw/workspace/skills/`, and verify with `zeroclaw skills list` and a tool-listing prompt in `zeroclaw agent`. After that, every agent turn that touches the web (research, monitoring, lead generation, RAG ingestion, AI-search visibility tracking) goes through the cloud browser, the residential proxies, and the SERP API behind one API key.
For the same Scrapeless primitive in other clients, the MCP server tutorial covers Claude Desktop / Cursor / Codex CLI, the Hermes integration post covers direct-CDP, and the LangChain integration post covers Python agents. The pattern across all of them is the same: pin a residential region, keep the session warm across multi-step flows, treat anti-bot interstitials as a retry case rather than an exception, and let the agent compose `google_search` → `scrape_markdown` → `browser_*` into whatever the prompt actually asks for.
Ready to Build Your AI-Powered Data Pipeline?
Join our community to claim a free plan and connect with developers building local-agent pipelines on top of Scrapeless: Discord · Telegram.
Sign up at app.scrapeless.com for free MCP runtime and adapt the patterns above to whichever workflows the ZeroClaw agent already runs.
FAQ
Q1. Does the Scrapeless MCP server work on Windows, or only Linux / macOS?
The MCP server is a Node.js package; it runs anywhere Node 18+ runs, including Windows. ZeroClaw's installer assumes a POSIX shell, so the smoothest path on Windows is WSL2. The HTTP-transport variant (pointing ZeroClaw at `https://api.scrapeless.com/mcp`) removes the local `npx` dependency entirely and is the easiest fit for hosted ZeroClaw deployments.
Q2. Stdio or streamable HTTP: which transport is the right default?
For a workstation running ZeroClaw locally, stdio. The lifecycle is simple: ZeroClaw spawns npx -y scrapeless-mcp-server on agent start, kills it on agent stop. For ZeroClaw on a VPS or in a container, HTTP. The Scrapeless-hosted endpoint removes the need to package npx and Node into the runtime image.
Q3. Is scraping public web data legal?
Generally yes, when the data is publicly visible and the workflow respects each site's terms of service and applicable jurisdictions. The legal posture varies by country, by site, and by use case (research, commercial resale, training data). Review the target site's ToS before scaling a workflow against it, and consult counsel for high-volume or regulated use cases.
Q4. Do the MCP server and the OpenClaw skills overlap?
They are complementary. The MCP server gives the agent tools: concrete, callable surfaces (`google_search`, `scrape_markdown`, `browser_*`). The skills give the agent knowledge: how the Scrapeless Universal Scraping API behaves, when to fall back to JS rendering, which response type to request, how to chain CAPTCHA solving with country selection. Installed together, the agent has both.
Q5. What happens when a target page returns an anti-bot interstitial?
For `scrape_markdown` against most pages, the cloud browser solves the challenge transparently. For pages that still return an interstitial, the standard fallback is `browser_create` → `browser_goto` → `browser_wait_for` (a known post-challenge selector) → `browser_get_text`. Budget for this fallback in any prompt that touches commercial sites; the prompt example above shows the shape.
Q6. How does ZeroClaw's autonomy mode interact with MCP tool calls?
Under supervised (the default), the agent prompts before invoking each MCP tool the first time. The operator can grant one-shot approval (Y) or remember-this-tool approval (A). Under yolo, the agent invokes tools without prompting; that mode is appropriate only on a trusted dev box.
Q7. Can the agent compose Scrapeless calls into multi-step flows in a single turn?
Yes, that is the design point. A single agent turn typically chains `google_search` (locate), `scrape_markdown` (extract from the canonical URL), and `browser_*` (fall back for interactive or anti-bot-protected pages). ZeroClaw streams the intermediate tool calls into the same conversation context.
Q8. Where does the Scrapeless API key live?
For the MCP path, in env.SCRAPELESS_KEY inside ~/.zeroclaw/config.toml (or in the streamable-HTTP x-api-token header). For the skill path, in the .env file inside each skill directory as X_API_TOKEN. The two paths are independent; rotating the key means updating both locations.
Q9. Can a ZeroClaw SOP fire the same prompt on a schedule?
Yes. Register an SOP with a cron trigger that runs the same prompt the operator would paste into zeroclaw agent --message "...". Bind the SOP to a channel adapter (Discord, Telegram, email) and the brief is delivered automatically. SOPs in supervised mode still gate medium-risk tool calls behind approval; for unattended scheduled runs, the SOP needs to be configured under a more permissive autonomy mode or with pre-granted tool permissions.
Q10. What about Scrapeless's other products: Scraping Browser, Universal Scraping API, SERP API?
The MCP server bundles the most common cloud-browser, SERP, and scrape primitives into one MCP surface. For workflows that need the full Scraping Browser primitive set directly (CDP, custom fingerprints, session persistence at session_ttl granularity), wire the Scraping Browser CDP endpoint into ZeroClaw's built-in browser tool instead. The two approaches compose; they do not conflict.
At Scrapeless, we only access publicly available data while strictly complying with applicable laws, regulations, and website privacy policies. The content in this blog is for demonstration purposes only and does not involve any illegal or infringing activities. We make no guarantees and disclaim all liability for the use of information from this blog or third-party links. Before engaging in any scraping activities, consult your legal advisor and review the target website's terms of service or obtain the necessary permissions.



