Avoid Bot Detection With Playwright Stealth: 9 Solutions for 2025

Expert Network Defense Engineer
Key Takeaways
- Playwright is a powerful browser automation tool, but its default settings can trigger bot detection mechanisms on websites.
- Implementing stealth techniques is crucial to make Playwright scripts appear more human-like and avoid being blocked during web scraping or automation tasks.
- Playwright Stealth involves a combination of strategies, including modifying browser properties, managing headers, handling cookies, and simulating realistic user behavior.
- This guide outlines 9 detailed solutions, complete with code examples, to effectively bypass bot detection with Playwright.
- For advanced anti-bot systems and large-scale data extraction, integrating Playwright with specialized services like Scrapeless provides a robust and reliable solution.
Introduction
In the rapidly evolving landscape of web automation and data extraction, tools like Playwright have become indispensable for developers and data scientists. Playwright offers robust capabilities for controlling headless browsers, enabling tasks from automated testing to web scraping. However, as websites become more sophisticated in their defense mechanisms, detecting and blocking automated traffic has become a significant challenge. Many websites employ advanced anti-bot systems designed to identify and thwart non-human interactions, often leading to frustrating 403 errors or CAPTCHA challenges. This comprehensive guide, "Avoid Bot Detection With Playwright Stealth," delves into the critical techniques required to make your Playwright scripts undetectable. We will explore 9 practical solutions, providing detailed explanations and code examples to help you navigate the complexities of bot detection. For those facing persistent challenges with highly protected websites, Scrapeless offers an advanced, managed solution that complements Playwright's capabilities, ensuring seamless data acquisition.
Understanding Bot Detection: Why Websites Block You
Websites implement bot detection for various reasons, including protecting intellectual property, preventing data abuse, maintaining fair competition, and ensuring service quality. These systems analyze numerous factors to distinguish between legitimate human users and automated scripts [1]. Common detection vectors include:
- Browser Fingerprinting: Websites examine browser properties (e.g., `User-Agent`, the `navigator.webdriver` flag, installed plugins, screen resolution) to identify inconsistencies that suggest automation.
- Behavioral Analysis: Unusual navigation patterns, rapid request rates, lack of mouse movements or keyboard inputs, or repetitive actions can signal bot activity.
- IP Address and Request Headers: Repeated requests from the same IP, suspicious `User-Agent` strings, or missing/inconsistent headers are red flags.
- CAPTCHAs and JavaScript Challenges: These are often deployed as a last line of defense to verify human interaction.
Successfully avoiding detection requires a multi-faceted approach that addresses these various vectors, making your Playwright scripts mimic human behavior as closely as possible.
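To see what a detector sees, you can point Playwright at these same properties yourself before worrying about evasion. The sketch below audits a page's view of your browser via `page.evaluate()`; the list of checks is illustrative rather than exhaustive, and the audited URL is a placeholder.

```python
import json

# JavaScript expressions a detector commonly inspects, keyed by vector name.
# This set is illustrative, not exhaustive.
FINGERPRINT_CHECKS = {
    "webdriver": "navigator.webdriver",
    "userAgent": "navigator.userAgent",
    "plugins": "navigator.plugins.length",
    "languages": "JSON.stringify(navigator.languages)",
    "screen": "screen.width + 'x' + screen.height",
}

def audit_fingerprint(page):
    """Evaluate each check in the live page and return results as a dict."""
    return {name: page.evaluate(expr) for name, expr in FINGERPRINT_CHECKS.items()}

def run_audit():
    # Playwright is imported lazily so the helpers above stay usable
    # even where no browser is installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto("https://example.com")
        print(json.dumps(audit_fingerprint(page), indent=2, default=str))
        browser.close()

# run_audit()  # uncomment to audit your own fingerprint
```

Running this against a default headless launch typically reveals the `webdriver` flag set to `True`, which is exactly what the solutions below address.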
9 Solutions to Avoid Bot Detection with Playwright Stealth
1. Disable the `navigator.webdriver` Flag
One of the most common indicators of automation is the `navigator.webdriver` property, which is set to `true` in automated browser environments. Websites can easily check this flag to identify bots. Disabling it is a fundamental stealth technique [2].
Code Operation Steps:
- Use `context.add_init_script()` to inject JavaScript that modifies the `navigator.webdriver` property before any page script runs:

```python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    context = browser.new_context()
    # Inject JavaScript to disable the webdriver flag
    context.add_init_script("""
        Object.defineProperty(navigator, 'webdriver', {
            get: () => undefined
        })
    """)
    page = context.new_page()
    page.goto("https://bot.sannysoft.com/")  # A website to test bot detection
    page.screenshot(path="webdriver_disabled.png")
    browser.close()
```

Because the script is injected before any page content loads, the `webdriver` flag is undefined from the start, making it harder for websites to detect automation.
2. Randomize User-Agent Strings
The `User-Agent` header identifies the browser and operating system to the web server. Using a consistent or outdated `User-Agent` can be a strong indicator of a bot. Randomizing `User-Agent` strings helps in appearing as different legitimate users [3].
Code Operation Steps:
- Maintain a list of common `User-Agent` strings and select one randomly for each request or session:

```python
from playwright.sync_api import sync_playwright
import random

user_agents = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/117.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.6 Safari/605.1.15"
]

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    context = browser.new_context(user_agent=random.choice(user_agents))
    page = context.new_page()
    page.goto("https://www.whatismybrowser.com/detect/what-is-my-user-agent")
    page.screenshot(path="random_user_agent.png")
    browser.close()
```

By rotating `User-Agent` strings, you simulate traffic from a diverse set of browsers and devices, making it harder for websites to profile your requests as automated.
3. Use Proxies and Rotate IP Addresses
Repeated requests from the same IP address are a primary indicator of bot activity. Using a pool of proxies and rotating IP addresses for each request or session is a highly effective way to distribute your traffic and avoid IP-based blocks [4].
Code Operation Steps:
- Configure Playwright to use a proxy. Note that Playwright expects proxy credentials as separate `username` and `password` fields rather than embedded in the server URL:

```python
from playwright.sync_api import sync_playwright
import random

proxies = [
    {"server": "http://proxy1.example.com:8080", "username": "user1", "password": "pass1"},
    {"server": "http://proxy2.example.com:8080", "username": "user2", "password": "pass2"},
    # Add more proxies to your pool
]

with sync_playwright() as p:
    # Launch browser with a randomly selected proxy
    browser = p.chromium.launch(headless=True, proxy=random.choice(proxies))
    page = browser.new_page()
    page.goto("https://www.whatismyip.com/")
    page.screenshot(path="proxy_ip.png")
    browser.close()
```
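Launching a fresh browser per proxy is costly. Playwright also accepts a `proxy` option on `browser.new_context()`, letting one browser process rotate exits across sessions. Below is a minimal sketch; the proxy endpoints are placeholders, and note that Chromium reportedly requires a browser-level placeholder proxy before per-context proxies take effect (see Playwright's proxy documentation).

```python
import itertools

# Placeholder proxy endpoints -- substitute your real pool.
PROXY_POOL = [
    {"server": "http://proxy1.example.com:8080", "username": "user1", "password": "pass1"},
    {"server": "http://proxy2.example.com:8080", "username": "user2", "password": "pass2"},
]

# Round-robin iterator so consecutive sessions use different exits.
_proxy_cycle = itertools.cycle(PROXY_POOL)

def next_proxy():
    """Return the next proxy settings dict from the pool, round-robin."""
    return next(_proxy_cycle)

def rotate_sessions(n=3):
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        # Chromium needs some browser-level proxy before per-context
        # proxies apply; "per-context" is the documented placeholder.
        browser = p.chromium.launch(headless=True,
                                    proxy={"server": "http://per-context"})
        for _ in range(n):
            context = browser.new_context(proxy=next_proxy())
            page = context.new_page()
            page.goto("https://httpbin.org/ip")
            print(page.text_content("body"))  # Shows the current exit IP
            context.close()
        browser.close()

# rotate_sessions()  # uncomment to run against a live proxy pool
```

Round-robin rotation guarantees even distribution across the pool; swap `itertools.cycle` for `random.choice` if you prefer unpredictable ordering.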
4. Simulate Realistic Mouse Movements and Keyboard Inputs
Bots often interact with web elements directly and instantaneously, which is unnatural. Simulating human-like mouse movements, clicks, and keyboard inputs can significantly reduce the chances of detection [5].
Code Operation Steps:
- Use `page.mouse` and `page.keyboard` methods to introduce delays and realistic paths:

```python
from playwright.sync_api import sync_playwright
import time
import random

with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)  # Use headless=False to observe movements
    page = browser.new_page()
    page.goto("https://www.example.com")

    # Simulate human-like mouse movement to an element and click
    element = page.locator("a[href='/some-link']")
    box = element.bounding_box()
    if box:
        # Move mouse to a random point within the element
        x = box["x"] + box["width"] * random.random()
        y = box["y"] + box["height"] * random.random()
        page.mouse.move(x, y, steps=random.randint(5, 15))  # Smooth movement
        page.mouse.click(x, y)
        time.sleep(random.uniform(0.5, 1.5))  # Random delay after click

    # Simulate human-like typing
    search_input = page.locator("#search-box")
    search_input.click()
    text_to_type = "Playwright Stealth"
    for char in text_to_type:
        page.keyboard.type(char)
        time.sleep(random.uniform(0.05, 0.2))  # Random delay between key presses
    page.keyboard.press("Enter")
    time.sleep(3)
    page.screenshot(path="human_like_interaction.png")
    browser.close()
```
5. Manage Cookies and Session Data
Websites use cookies to track user sessions and preferences. Bots that don't handle cookies properly can be easily identified. Maintaining a consistent session by accepting and sending cookies is vital for stealth [6].
Code Operation Steps:
- Use
browser.new_context()
to create isolated sessions andcontext.add_cookies()
orcontext.storage_state()
to manage cookies:pythonfrom playwright.sync_api import sync_playwright import json with sync_playwright() as p: browser = p.chromium.launch(headless=True) # Scenario 1: Load cookies from a previous session try: with open("cookies.json", "r") as f: saved_cookies = json.load(f) context = browser.new_context() context.add_cookies(saved_cookies) except FileNotFoundError: context = browser.new_context() page = context.new_page() page.goto("https://www.example.com/login") # Navigate to a page that sets cookies # Perform login or other actions that generate cookies # ... page.goto("https://www.example.com/dashboard") # Access a page requiring session page.screenshot(path="session_managed.png") # Scenario 2: Save current session cookies for future use current_cookies = context.cookies() with open("cookies.json", "w") as f: json.dump(current_cookies, f) browser.close()
6. Adjust Viewport Size and Device Emulation
Websites often check the viewport size, screen resolution, and device type to detect anomalies. Using a default or inconsistent viewport can be a red flag. Emulating common device configurations helps in blending in [7].
Code Operation Steps:
- Set `viewport` and `user_agent` when creating a new context:

```python
from playwright.sync_api import sync_playwright
import random

viewports = [
    {"width": 1920, "height": 1080},  # Desktop
    {"width": 1366, "height": 768},   # Laptop
    {"width": 375, "height": 667},    # iPhone 8
    {"width": 412, "height": 846}     # Pixel 4
]

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    # Randomly select a viewport
    selected_viewport = random.choice(viewports)
    # A corresponding user agent should be chosen if emulating a specific device.
    # For simplicity, a generic mobile UA is used here, but ideally match it.
    user_agent_for_device = "Mozilla/5.0 (iPhone; CPU iPhone OS 13_5 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Mobile/15E148 Safari/604.1"
    context = browser.new_context(
        viewport=selected_viewport,
        user_agent=user_agent_for_device if selected_viewport["width"] < 1000 else None  # Apply mobile UA for small viewports
    )
    page = context.new_page()
    page.goto("https://www.deviceinfo.me/")  # Website to check device info
    page.screenshot(path="device_emulation.png")
    browser.close()
```
7. Avoid Headless Mode Detection
While headless mode is efficient, some anti-bot systems can detect it. Running Playwright in headful mode (with a visible browser UI) can sometimes bypass detection, especially for more aggressive systems. However, this consumes more resources [8].
Code Operation Steps:
- Set `headless=False` when launching the browser:

```python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)  # Launch in headful mode
    page = browser.new_page()
    page.goto("https://www.example.com")
    page.screenshot(path="headful_mode.png")
    browser.close()
```
8. Use a Stealth Plugin (`playwright-stealth`)
For a more comprehensive and easier way to apply multiple stealth techniques, a dedicated stealth plugin can be invaluable. In the Node.js ecosystem this role is filled by `playwright-extra` combined with `puppeteer-extra-plugin-stealth`; for the Python scripts in this guide, the `playwright-stealth` package applies a comparable suite of evasions automatically [9].
Code Operation Steps:
- Install the library:

```bash
pip install playwright-stealth
```

- Apply the stealth patches to a page:

```python
from playwright.sync_api import sync_playwright
from playwright_stealth import stealth_sync

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()
    stealth_sync(page)  # Apply the stealth evasions to this page
    page.goto("https://bot.sannysoft.com/")
    page.screenshot(path="playwright_stealth.png")
    browser.close()
```

These evasions patch telltale JavaScript properties such as `navigator.webdriver` and `navigator.plugins`, among others, without requiring you to write each override by hand.
9. Implement Delays and Randomization in Actions
Bots often execute actions with perfect timing and speed. Introducing random delays between actions and varying the speed of interactions can make your script appear more human. This is a simple yet effective behavioral stealth technique [10].
Code Operation Steps:
- Use `time.sleep()` with random intervals:

```python
from playwright.sync_api import sync_playwright
import time
import random

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()
    page.goto("https://www.example.com")

    # Simulate browsing with random delays
    time.sleep(random.uniform(2, 5))  # Initial page load delay
    page.click("a[href='/products']")
    time.sleep(random.uniform(1, 3))  # Delay after click
    page.locator("input[name='q']").fill("search item")
    time.sleep(random.uniform(0.5, 1.5))  # Delay after typing
    page.keyboard.press("Enter")
    time.sleep(random.uniform(3, 7))  # Delay after search
    page.screenshot(path="random_delays.png")
    browser.close()
```
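The repeated `random.uniform` calls above can be centralized in a small helper. The sketch below is one way to keep the jitter policy consistent across a script; the default base and jitter values are arbitrary starting points.

```python
import random
import time

def human_pause(base=1.0, jitter=0.5):
    """Sleep for roughly `base` seconds, varied by up to +/- `jitter`.

    Returns the actual delay used, which is clamped to be non-negative.
    """
    delay = max(0.0, base + random.uniform(-jitter, jitter))
    time.sleep(delay)
    return delay

# Example usage between actions:
#   page.click("a[href='/products']")
#   human_pause(2.0, 1.0)   # wait ~1-3 seconds, like a reading human
#   page.keyboard.press("Enter")
#   human_pause(0.8, 0.4)
```

Centralizing the delay logic also makes it trivial to tune the whole script's pacing, or to disable sleeping entirely in test runs, by changing one function.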
At Scrapeless, we only access publicly available data while strictly complying with applicable laws, regulations, and website privacy policies. The content in this blog is for demonstration purposes only and does not involve any illegal or infringing activities. We make no guarantees and disclaim all liability for the use of information from this blog or third-party links. Before engaging in any scraping activities, consult your legal advisor and review the target website's terms of service or obtain the necessary permissions.