How to Scrape eBay Search Results: Session Warm-Up and Anti-Detection Guide
Advanced Data Extraction Specialist
Key Takeaways:
- A cold request to eBay's search endpoint is blocked. A direct automated navigation to https://www.ebay.com/sch/i.html?_nkw=laptop lands on an eBay error page, even from a fresh US residential cloud-browser session; eBay gates the search path more tightly than its item and browse pages.
- Warming the session is the unlock. Open one persistent cloud-browser session, land on the eBay homepage first so cookies and navigation state settle, then navigate to the search URL in that same session. The search page then loads with the full result grid rendered.
- Anchor extraction on .su-card-container. eBay rotated its result-card markup; the old li.s-item selector now matches nothing. The current organic result cards carry the .su-card-container class, so select on that and read the child fields off each card.
- The integration is a single CDP endpoint. Build one Scrapeless Scraping Browser WebSocket URL with your API key, and Playwright's connect_over_cdp drives it exactly as it would a local browser, so the rendering, residential egress, and fingerprinting all move cloud-side.
- Residential egress carries the run. The Scrapeless Scraping Browser routes through residential proxies in 195+ countries and randomizes the browser fingerprint per session, so the cloud browser renders eBay pages that a local automated browser gets filtered on.
- Free to start. New Scrapeless accounts include free Scraping Browser runtime — sign up at app.scrapeless.com.
Introduction: Why eBay's search page blocks the obvious approach
eBay is one of the largest public marketplaces on the web, and its search results are a dense source of pricing, listing, and competitive data. Pricing teams track what comparable items sell for, brand-protection teams watch for unauthorized listings, and AI agents pull listing context to answer product questions. All of that lives behind the search endpoint at /sch/i.html.
The obvious approach — point an HTTP client or a local headless browser at that URL — fails fast. A cold automated request to https://www.ebay.com/sch/i.html?_nkw=laptop lands on an eBay error page rather than results. This happens even from a clean US residential IP: eBay evaluates IP reputation, device fingerprinting, request rate, and behavioral signals, and it gates the search path more aggressively than its item and browse pages. The page renders for a human and gets blocked for a script.
This tutorial builds a Python pipeline on top of the Scrapeless Scraping Browser that clears that gate the way a real visitor does — by arriving at the homepage first, letting the session warm, then moving to search inside the same session. You connect to the cloud browser over CDP with Playwright, the officially supported client, so the rendering, residential egress, and anti-detection fingerprinting all happen cloud-side. For a different large-marketplace build using the same primitive, see the best Amazon scrapers roundup; for a localized-pricing comparison across tools, see Best Zillow Scrapers in 2026.
What You Can Do With It
The warm-session pattern — homepage first, then search and detail pages in one held cloud-browser session — covers most of the jobs that an eBay data pipeline needs:
- Track competitor pricing. Pull the price and listing title from each result card on a search query, then compare against your own catalogue on a schedule.
- Monitor a product category. Walk a category or keyword query across pages and collect listings into typed records for trend analysis.
- Watch for unauthorized listings. Search for your brand or SKU and flag sellers who should not be listing it.
- Capture geo-specific results. Pin US residential egress to see the listings, currency, and availability a US shopper would see, rather than whatever an office IP resolves to.
- Feed listing context to an AI agent. Render search and item pages to clean structured fields so a retrieval layer or agent can answer product questions with current data.
- Build a price-history dataset. Snapshot the same queries over time and store the rendered results to study how prices move.
Why Scrapeless Scraping Browser
The Scrapeless Scraping Browser is a customizable, anti-detection cloud browser designed for web crawlers and AI agents. For eBay specifically, it brings:
- Anti-detection cloud browser. It runs a self-developed Chromium with full cloud-side JavaScript rendering, so the search grid, lazy-loaded images, and item details hydrate before the parser reads them.
- Residential proxies in 195+ countries. Set proxyCountry on the connection URL and the cloud browser egresses from real residential IPs in the region you target, so eBay returns what a local shopper sees.
- Per-session fingerprint randomization. Each session gets a randomized fingerprint (user agent, timezone, WebGL, and canvas), so the automated browser does not collapse into a single detectable identity.
- Session persistence via sessionTTL. Hold one session open across the homepage warm-up and the search navigation by setting sessionTTL on the connection URL, so cookies and navigation state carry between requests in a single run.
- A single CDP endpoint. Build one WebSocket URL with your API key; Playwright's connect_over_cdp drives it as if it were a local browser, so your parsing code does not change.
Runtime is free to start and scales with usage — see Scrapeless pricing for the tiers, and get your API key on the free plan at app.scrapeless.com.
Prerequisites
Before you start, make sure you have:
- Python 3.10+, to run the Playwright client installed below.
- pip — to install the packages.
- A Scrapeless account and API key — sign up for the free plan at app.scrapeless.com, then grab your key from Settings → API Key Management.
- Basic familiarity with CSS selectors and the terminal — you will use both to fetch pages and pull values out of them.
Install
You need one package: Playwright for Python, the officially supported client for the Scrapeless Scraping Browser.
1. Install Playwright
bash
pip install playwright
Playwright's connect_over_cdp connects to the remote Scrapeless cloud browser, so you do not need to run playwright install or download any local browser binaries — the rendering happens cloud-side. Playwright holds one connection open across multiple page loads, which is what lets the homepage warm-up and the search share one session identity.
2. Set your Scrapeless API key
Export your key so it can ride the connection URL:
bash
export SCRAPELESS_API_KEY=your_api_token_here
On Windows, use setx SCRAPELESS_API_KEY "your_api_token_here" (persistent, new shell) or $env:SCRAPELESS_API_KEY="your_api_token_here" (current PowerShell session). The connection helper below reads this variable and embeds it in the URL as token.
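Because the connection helper reads the key straight from the environment, a missing variable surfaces as a bare KeyError at connect time. The sketch below is one way to fail earlier with a readable message; require_api_key is a hypothetical helper name, not part of Scrapeless or Playwright.

```python
import os


def require_api_key(env=None, name="SCRAPELESS_API_KEY"):
    """Return the API key from the environment, or raise a readable error.

    `require_api_key` is a hypothetical helper for this tutorial.
    """
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before connecting.")
    return key
```

Calling require_api_key() once at the top of a script makes a forgotten export obvious before any browser session is opened.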
Step 1 — Build the connection and confirm the cold request is blocked
Start by reproducing the block, so the rest of the pipeline has a clear baseline. Build the Scrapeless Scraping Browser URL, connect with Playwright, and navigate straight to the search endpoint without visiting the homepage first.
python
import os
from urllib.parse import urlencode

from playwright.sync_api import sync_playwright


def scraping_browser_url(proxy_country="US", session_ttl=240):
    # The API key rides the URL as `token`; egress and lifetime are query params.
    params = urlencode({
        "token": os.environ["SCRAPELESS_API_KEY"],
        "sessionTTL": session_ttl,
        "proxyCountry": proxy_country,
    })
    return f"wss://browser.scrapeless.com/api/v2/browser?{params}"


with sync_playwright() as p:
    browser = p.chromium.connect_over_cdp(scraping_browser_url("US"))
    page = browser.new_page()
    # Cold navigation straight to the search endpoint, no homepage warm-up.
    page.goto("https://www.ebay.com/sch/i.html?_nkw=laptop",
              wait_until="domcontentloaded")
    print(page.title())  # -> "Error Page | eBay" / "Access Denied"
    browser.close()
A cold automated navigation to /sch/i.html lands on an eBay error page, even though the session egresses from a clean US residential IP. eBay treats the search endpoint as a sensitive path and challenges requests that arrive without an established browsing context. The fix is not a different header or a different IP — it is arriving the way a person does, which is the next step.
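To make the baseline checkable in code rather than by eye, the page title can be compared against the two block titles quoted above. This is a minimal sketch: looks_blocked and BLOCK_MARKERS are hypothetical names, and the marker strings are only the ones seen in this tutorial, so eBay may serve other block pages that this check misses.

```python
# Titles observed on the blocked page in Step 1 (lower-cased for matching).
BLOCK_MARKERS = ("error page", "access denied")


def looks_blocked(title):
    """Return True when a page title contains one of the known block markers."""
    lowered = (title or "").lower()
    return any(marker in lowered for marker in BLOCK_MARKERS)
```

With a live session, looks_blocked(page.title()) gives a simple signal for deciding whether a navigation needs the warm-up described next.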
Step 2 — Warm the session on the homepage, then search
The unlock is a held session. Open one cloud-browser connection, load the eBay homepage first so cookies and navigation state settle, then navigate to the search URL inside that same session. Playwright holds the single CDP connection to the Scrapeless cloud browser open and drives every page through it, so the warm-up and the search share one identity.
python
# Reuses scraping_browser_url() and the imports from Step 1.
with sync_playwright() as p:
    browser = p.chromium.connect_over_cdp(scraping_browser_url("US"))
    page = browser.new_page()

    # 1. Warm the session: land on the homepage so cookies/navigation state settle.
    page.goto("https://www.ebay.com/", wait_until="domcontentloaded")
    page.wait_for_timeout(2500)

    # 2. Now navigate to search in the SAME session.
    page.goto("https://www.ebay.com/sch/i.html?_nkw=laptop",
              wait_until="domcontentloaded")
    page.wait_for_timeout(3000)  # let the grid hydrate
    print(page.title())  # -> "Laptop for sale | eBay"

    cards = page.query_selector_all(".su-card-container")  # the rendered result cards
    print(len(cards), "cards")  # the grid is now populated
    # Keep any extraction inside this block: `page` is only usable while the session is open.
With the homepage visited first, the same search URL that was blocked in Step 1 now returns the title Laptop for sale | eBay and a populated result grid. The short wait_for_timeout after navigation lets the cards hydrate before extraction. The whole difference is order of arrival inside one held session — the homepage establishes the browsing context, and the search request inherits it.
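The fixed post-navigation wait can also be expressed as an explicit wait on the result-card selector, which returns as soon as the first card attaches. The sketch below assumes a Playwright page object (or anything exposing the same goto, wait_for_timeout, wait_for_selector, and query_selector_all methods); warm_then_search and the 15-second timeout are hypothetical choices, not part of the original pipeline.

```python
RESULT_CARD = ".su-card-container"


def warm_then_search(page, search_url, timeout_ms=15000):
    """Warm the session on the homepage, then load search and wait for cards.

    `page` is expected to behave like a Playwright sync Page. If no card
    appears within `timeout_ms`, Playwright's wait_for_selector raises.
    """
    # 1. Homepage first, so cookies and navigation state settle.
    page.goto("https://www.ebay.com/", wait_until="domcontentloaded")
    page.wait_for_timeout(2500)

    # 2. Search in the same session, then wait for the grid instead of sleeping.
    page.goto(search_url, wait_until="domcontentloaded")
    page.wait_for_selector(RESULT_CARD, timeout=timeout_ms)
    return page.query_selector_all(RESULT_CARD)
```

A timeout here is a useful signal in its own right: it usually means the page that loaded was not the result grid.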
Step 3 — Extract the result cards
eBay rotated its result-card markup, so anchor on the current class. The organic result cards carry .su-card-container; the older li.s-item selector now matches nothing. Select the cards, then read the child fields off each one.
python
# Runs while the warm session from Step 2 is still open, so `page` is live.
def text_of(el, selector):
    node = el.query_selector(selector)
    return node.inner_text().strip() if node else None


records = []
for card in page.query_selector_all(".su-card-container"):
    # Child selectors are illustrative: confirm the title/price/link nodes
    # against the current eBay DOM, since the markup rotates.
    title = text_of(card, ".su-styled-text")          # listing title (illustrative path)
    price = text_of(card, ".su-styled-text.s-price")  # price text (illustrative path)
    link_el = card.query_selector("a")                # listing URL
    link = link_el.get_attribute("href") if link_el else None
    records.append({
        "title": title,
        "price": price,
        "link": link,
    })

print(len(records), "listings")
Anchor every extraction on .su-card-container; treat the child selectors (title, price, link) as starting points to confirm against the live DOM, because eBay reshuffles the inner markup independently of the card wrapper. Default each missing field to None so a sparse card does not crash the run — eBay omits a price on some listing formats (auctions mid-bid, "see details" placements), and a few cards are sponsored slots with a different shape.
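Since the price arrives as display text and may be missing, a small normalizer keeps downstream comparisons numeric while preserving the nullable behaviour. This sketch assumes US-style formatting (comma thousands separator, dot decimal) and takes the first amount when the text holds a range; parse_price is a hypothetical helper, not something eBay or Scrapeless provides.

```python
import re
from decimal import Decimal

# First run of digits, allowing thousands commas and an optional decimal part.
_AMOUNT = re.compile(r"\d[\d,]*(?:\.\d+)?")


def parse_price(text):
    """Return the first numeric amount in a price string as Decimal, else None."""
    if not text:
        return None
    match = _AMOUNT.search(text)
    if match is None:
        return None
    return Decimal(match.group().replace(",", ""))
```

For example, parse_price("$329.99") yields Decimal("329.99"), while a card with no price, or with text that contains no digits, yields None.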
Step 4 — Page through results and follow listings to detail pages
Most real jobs span more than one URL. Because the session is already warm and held open, paging through results and following individual listings to their item pages costs nothing extra — the same cookies, residential identity, and fingerprint carry across the whole walk. eBay paginates the search endpoint with the _pgn query parameter.
python
# Reuses scraping_browser_url() and the imports from Step 1.
rows = []
with sync_playwright() as p:
    browser = p.chromium.connect_over_cdp(scraping_browser_url("US", session_ttl=300))
    page = browser.new_page()
    page.goto("https://www.ebay.com/", wait_until="domcontentloaded")  # warm once
    page.wait_for_timeout(2500)

    for n in range(1, 4):  # pages 1..3
        url = f"https://www.ebay.com/sch/i.html?_nkw=laptop&_pgn={n}"
        page.goto(url, wait_until="domcontentloaded")
        page.wait_for_timeout(3000)
        for card in page.query_selector_all(".su-card-container"):
            link_el = card.query_selector("a")
            if link_el and link_el.get_attribute("href"):
                rows.append({"link": link_el.get_attribute("href")})

    # Follow a listing to its item page in the same warm session.
    if rows:
        page.goto(rows[0]["link"], wait_until="domcontentloaded")
        h1 = page.query_selector("h1")  # item-page heading (confirm against the DOM)
        item_title = h1.inner_text() if h1 else None

    browser.close()

print(len(rows), "listings collected")
The homepage warm-up happens once, at the top of the session; every search page and item page after it reuses the established context. Item pages and browse or deals pages render directly once the session is warm — the heavier gating sits on the search endpoint specifically. For a list-to-detail crawl, collect the listing URLs from the search grid first, then fetch each through the same browser so the residential session and fingerprint stay constant across the entire walk.
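For the list-to-detail crawl, it can help to build the page URLs with proper encoding and to de-duplicate listing links before fetching them. The sketch below uses only the standard library. It assumes that the path of an item link identifies the listing and that its query string can be dropped, which should be confirmed against real links. The names search_url, canonical_item_url, and unique_links are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

SEARCH_ENDPOINT = "https://www.ebay.com/sch/i.html"


def search_url(query, page_number=1):
    """Build a search URL; _pgn is added only for pages after the first."""
    params = {"_nkw": query}
    if page_number > 1:
        params["_pgn"] = page_number
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


def canonical_item_url(href):
    """Drop the query string and fragment (assumed not to identify the item)."""
    parts = urlsplit(href)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def unique_links(hrefs):
    """Canonicalize links and keep the first occurrence of each, in order."""
    seen, ordered = set(), []
    for href in hrefs:
        if not href:
            continue
        url = canonical_item_url(href)
        if url not in seen:
            seen.add(url)
            ordered.append(url)
    return ordered
```

De-duplicating before the detail walk avoids fetching the same item twice when it appears on more than one result page.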
Step 5 — Production hardening
Moving from a working script to a dependable job is mostly about staying within the platform's tolerances. A few rules carry most of the weight:
- Cap concurrency. Hold at ≤3 cloud-browser sessions per host. Pushing past that invites rate-limits and connection resets, and the marginal throughput rarely justifies the extra friction.
- Warm once and reuse the session. Set sessionTTL (e.g. 240 seconds) on the connection URL, visit the homepage one time at the top of the session, then run every search and item navigation through the same Playwright connection. Re-warming per page wastes the held context and the connection handshake.
- Pin proxyCountry=US. eBay's listings, currency, and availability vary by region; pinning US residential egress keeps the results consistent with the locale you target.
- Treat absent fields as nullable. Real cards omit prices, ratings, or shipping lines on some listing formats. Default missing selectors to None rather than asserting they exist, so one sparse record does not break the batch.
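The concurrency cap above can be enforced with a bounded worker pool. This is a sketch under assumptions: run_bounded is a hypothetical helper, scrape_query stands in for your own function, and each call is assumed to open its own sync_playwright() context and cloud-browser session rather than sharing one across threads.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_SESSIONS = 3  # the per-host cap recommended above


def run_bounded(queries, scrape_query, max_sessions=MAX_SESSIONS):
    """Run scrape_query(query) for each query, at most max_sessions at a time.

    Returns {query: result}; a failed query maps to the exception it raised,
    so one bad run does not abort the batch. Queries are assumed unique.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_sessions) as pool:
        futures = {query: pool.submit(scrape_query, query) for query in queries}
        for query, future in futures.items():
            try:
                results[query] = future.result()
            except Exception as exc:  # keep the batch going, record the failure
                results[query] = exc
    return results
```

The pool queues everything beyond the cap, so adding more queries lengthens the run instead of raising the number of sessions in flight.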
What You Get Back
json
[
  {
    "title": "Dell Latitude 7420 14\" Laptop i7 16GB 512GB SSD Windows 11 Pro",
    "price": "$329.99",
    "link": "https://www.ebay.com/itm/1234567890"
  },
  {
    "title": "Apple MacBook Air 13.3\" M1 8GB 256GB - Space Gray",
    "price": "$489.00",
    "link": "https://www.ebay.com/itm/9876543210"
  },
  {
    "title": "Lenovo ThinkPad X1 Carbon Gen 9 i5 16GB 256GB",
    "price": "$415.50",
    "link": "https://www.ebay.com/itm/5556667778"
  }
]
The shape reflects the Step 3 extraction; the field values are illustrative samples.
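To store records of this shape, for instance as periodic snapshots for a price-history dataset, one option is a small typed record serialized to JSON Lines. This is a minimal sketch: Listing and to_jsonl are hypothetical names, and the three fields mirror the Step 3 dictionaries, with every field nullable.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Listing:
    """One result card; every field may be absent on a sparse card."""
    title: Optional[str] = None
    price: Optional[str] = None
    link: Optional[str] = None


def to_jsonl(records):
    """Serialize Step 3-style dicts to JSON Lines, one listing per line."""
    lines = []
    for record in records:
        listing = Listing(
            title=record.get("title"),
            price=record.get("price"),
            link=record.get("link"),
        )
        lines.append(json.dumps(asdict(listing), ensure_ascii=False))
    return "\n".join(lines)
```

Missing keys become explicit nulls, so every line carries the same three fields and snapshots taken at different times stay comparable.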
A few honest observations from running this pipeline:
- The cold search hit is denied; the warm one is not. A direct navigation to /sch/i.html lands on an eBay error page; visiting the homepage first inside the same held session clears it and the search returns the Laptop for sale | eBay title with a populated grid.
- A short post-navigation wait covers hydration. The result cards load after the first paint, so a brief wait_for_timeout after goto is what makes them available to the selector.
- .su-card-container is the stable anchor. eBay rotated its card markup; the older li.s-item returns nothing. Anchor on .su-card-container and re-confirm the child field selectors after any redesign.
- Pin proxyCountry for consistent results. Listings, currency, and availability vary by region; pinning US residential egress keeps the output consistent with the locale you target.
- Item and browse pages render directly. The heavier gating sits on the search endpoint; once the session is warm, item, browse, and deals pages load without the homepage detour.
Conclusion: Scale your eBay listing pipeline
The pipeline reduces to four moves. Connect to one Scrapeless Scraping Browser session and warm it on the homepage. Navigate to the search endpoint inside that same held session so the request inherits an established browsing context. Extract the result grid by anchoring on .su-card-container. Then page through results and follow listings to detail pages over the one warm session. You only pay for the cloud browser when you actually need it — see Scrapeless pricing for what the free tier covers — and the rest stays plain Python.
From here, the same warm-session pattern plugs into larger marketplace builds. See the best Amazon scrapers roundup for a large-marketplace comparison, and Best Zillow Scrapers in 2026 for a localized-pricing tool comparison. Before you ship: export SCRAPELESS_API_KEY, pin proxyCountry=US, warm the session on the homepage before touching /sch/, keep concurrency at ≤3 sessions per host, anchor on .su-card-container, and treat absent fields as nullable. Connection and library guides at docs.scrapeless.com.
Ready to Build Your AI-Powered Data Pipeline?
Join our community to claim a free plan and connect with developers building eBay and marketplace data pipelines: Discord · Telegram.
Sign up at app.scrapeless.com for free Scraping Browser runtime and adapt the patterns above to the eBay queries and regions the pipeline needs.
FAQ
Do you need a proxy?
Yes — residential egress carries an eBay run. Pin US residential proxies with proxyCountry=US on the connection URL. The Scrapeless Scraping Browser supplies residential proxies in 195+ countries, so you do not have to source and rotate IPs yourself, and the egress address looks like an ordinary home connection rather than a flagged datacenter IP.
Why does the search endpoint return "Access Denied"?
A cold automated navigation to https://www.ebay.com/sch/i.html lands on an eBay error page because the request arrives without an established browsing context — eBay gates the search path more tightly than its item and browse pages. The fix is to warm the session first: open one held cloud-browser session, load the eBay homepage so cookies and navigation state settle, then navigate to the search URL in that same session. The search then loads with the Laptop for sale | eBay title and a populated grid.
My selectors stopped matching after an eBay redesign. How do you fix it?
eBay rotates its DOM. Anchor your extraction on the result-card wrapper .su-card-container rather than on a deep child path, and re-confirm the title, price, and link selectors against the current markup after a redesign. The older li.s-item selector matches nothing on the current layout, which is why the card wrapper is the stable anchor.
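One way to soften the impact of a redesign is to try several candidate child selectors in order and take the first that yields text. The sketch below assumes card is a Playwright element handle (or anything with a compatible query_selector method); first_text is a hypothetical helper, and the candidate lists you pass in still have to be confirmed against the live DOM.

```python
def first_text(card, selectors):
    """Return stripped text from the first selector that matches inside card.

    Returns None when no candidate matches or every match is empty, which
    keeps the field nullable as in Step 3.
    """
    for selector in selectors:
        node = card.query_selector(selector)
        if node is None:
            continue
        text = node.inner_text().strip()
        if text:
            return text
    return None
```

When a redesign lands, the fix is then a matter of prepending the new selector to the candidate list while the older ones remain as fallbacks.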
Are there concurrency limits you should respect?
Keep it to ≤3 cloud-browser sessions per host. Beyond that you gain a little throughput at the cost of far more rate-limiting and connection resets. Use bounded concurrency and a queue rather than firing every request at once.
Can this run without an AI agent?
Yes. The Python pattern above is end-to-end on its own — Playwright connects to the Scrapeless Scraping Browser over CDP, and your code warms the session, navigates, and extracts. An AI agent is an optional layer on top, not a requirement.
At Scrapeless, we only access publicly available data while strictly complying with applicable laws, regulations, and website privacy policies. The content in this blog is for demonstration purposes only and does not involve any illegal or infringing activities. We make no guarantees and disclaim all liability for the use of information from this blog or third-party links. Before engaging in any scraping activities, consult your legal advisor and review the target website's terms of service or obtain the necessary permissions.



