
Avoid Bot Detection With Playwright Stealth: 9 Solutions for 2025

Michael Lee

Expert Network Defense Engineer

22-Sep-2025

Key Takeaways

  • Playwright is a powerful browser automation tool, but its default settings can trigger bot detection mechanisms on websites.
  • Implementing stealth techniques is crucial to make Playwright scripts appear more human-like and avoid being blocked during web scraping or automation tasks.
  • Playwright Stealth involves a combination of strategies, including modifying browser properties, managing headers, handling cookies, and simulating realistic user behavior.
  • This guide outlines 9 detailed solutions, complete with code examples, to effectively bypass bot detection with Playwright.
  • For advanced anti-bot systems and large-scale data extraction, integrating Playwright with specialized services like Scrapeless provides a robust and reliable solution.

Introduction

In the rapidly evolving landscape of web automation and data extraction, tools like Playwright have become indispensable for developers and data scientists. Playwright offers robust capabilities for controlling headless browsers, enabling tasks from automated testing to web scraping. However, as websites become more sophisticated in their defense mechanisms, detecting and blocking automated traffic has become a significant challenge. Many websites employ advanced anti-bot systems designed to identify and thwart non-human interactions, often leading to frustrating 403 errors or CAPTCHA challenges. This comprehensive guide, "Avoid Bot Detection With Playwright Stealth," delves into the critical techniques required to make your Playwright scripts far harder to detect. We will explore 9 practical solutions, providing detailed explanations and code examples to help you navigate the complexities of bot detection. For those facing persistent challenges with highly protected websites, Scrapeless offers an advanced, managed solution that complements Playwright's capabilities, ensuring seamless data acquisition.

Understanding Bot Detection: Why Websites Block You

Websites implement bot detection for various reasons, including protecting intellectual property, preventing data abuse, maintaining fair competition, and ensuring service quality. These systems analyze numerous factors to distinguish between legitimate human users and automated scripts [1]. Common detection vectors include:

  • Browser Fingerprinting: Websites examine browser properties (e.g., User-Agent, navigator.webdriver flag, installed plugins, screen resolution) to identify inconsistencies that suggest automation.
  • Behavioral Analysis: Unusual navigation patterns, rapid request rates, lack of mouse movements or keyboard inputs, or repetitive actions can signal bot activity.
  • IP Address and Request Headers: Repeated requests from the same IP, suspicious User-Agent strings, or missing/inconsistent headers are red flags.
  • CAPTCHAs and JavaScript Challenges: These are often deployed as a last line of defense to verify human interaction.

Successfully avoiding detection requires a multi-faceted approach that addresses these various vectors, making your Playwright scripts mimic human behavior as closely as possible.
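
To see some of these signals from the browser's side, you can read them directly with page.evaluate(). The snippet below is a minimal sketch; the properties it checks are illustrative, not an exhaustive fingerprint:

    python
    from playwright.sync_api import sync_playwright
    
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto("https://example.com")
    
        # Read a few properties that anti-bot scripts commonly inspect
        signals = page.evaluate("""() => ({
            webdriver: navigator.webdriver,
            userAgent: navigator.userAgent,
            languages: navigator.languages,
            pluginCount: navigator.plugins.length,
            screen: { width: screen.width, height: screen.height }
        })""")
        print(signals)
        browser.close()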

9 Solutions to Avoid Bot Detection with Playwright Stealth

1. Disable the navigator.webdriver Flag

One of the most common indicators of automation is the navigator.webdriver property, which is set to true in automated browser environments. Websites can easily check this flag to identify bots. Disabling it is a fundamental stealth technique [2].

Code Operation Steps:

  1. Use context.add_init_script() to inject JavaScript that overrides the navigator.webdriver property:
    python
    from playwright.sync_api import sync_playwright
    
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        context = browser.new_context()
        
        # Inject JavaScript to disable the webdriver flag
        context.add_init_script("""
            Object.defineProperty(navigator, 'webdriver', {
              get: () => undefined
            })
        """)
        
        page = context.new_page()
        page.goto("https://bot.sannysoft.com/") # A website to test bot detection
        page.screenshot(path="webdriver_disabled.png")
        browser.close()
    This script runs before any other scripts on the page, ensuring the webdriver flag is undefined from the start, making it harder for websites to detect automation.
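
To confirm the override took effect, you can read the flag back before navigating further. A minimal check, reusing the same init script as above:

    python
    from playwright.sync_api import sync_playwright
    
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        context = browser.new_context()
        context.add_init_script("Object.defineProperty(navigator, 'webdriver', { get: () => undefined })")
        page = context.new_page()
        page.goto("https://example.com")
        # Prints None (Python's view of undefined) instead of True
        print(page.evaluate("() => navigator.webdriver"))
        browser.close()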

2. Randomize User-Agent Strings

The User-Agent header identifies the browser and operating system to the web server. Using a consistent or outdated User-Agent can be a strong indicator of a bot. Randomizing User-Agent strings helps your traffic appear to come from a variety of legitimate users [3].

Code Operation Steps:

  1. Maintain a list of common User-Agent strings and select one randomly for each request or session:
    python
    from playwright.sync_api import sync_playwright
    import random
    
    user_agents = [
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/117.0",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.6 Safari/605.1.15"
    ]
    
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        context = browser.new_context(user_agent=random.choice(user_agents))
        page = context.new_page()
        page.goto("https://www.whatismybrowser.com/detect/what-is-my-user-agent")
        page.screenshot(path="random_user_agent.png")
        browser.close()
    By rotating User-Agent strings, you simulate traffic from a diverse set of browsers and devices, making it harder for websites to profile your requests as automated.
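
The User-Agent is rarely checked in isolation; headers such as Accept-Language and the browser's reported locale should tell the same story. Below is a minimal sketch that pairs each rotated User-Agent with a matching locale (the profile values are illustrative):

    python
    from playwright.sync_api import sync_playwright
    import random
    
    profiles = [
        {"user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36",
         "locale": "en-US"},
        {"user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.6 Safari/605.1.15",
         "locale": "en-GB"},
    ]
    
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        profile = random.choice(profiles)
        # Keep the User-Agent, locale, and Accept-Language header consistent with each other
        context = browser.new_context(
            user_agent=profile["user_agent"],
            locale=profile["locale"],
            extra_http_headers={"Accept-Language": profile["locale"] + ",en;q=0.9"},
        )
        page = context.new_page()
        page.goto("https://httpbin.org/headers")  # Echoes the headers the server received
        print(page.text_content("body"))
        browser.close()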

3. Use Proxies and Rotate IP Addresses

Repeated requests from the same IP address are a primary indicator of bot activity. Using a pool of proxies and rotating IP addresses for each request or session is a highly effective way to distribute your traffic and avoid IP-based blocks [4].

Code Operation Steps:

  1. Configure Playwright to use a proxy:
    python
    from playwright.sync_api import sync_playwright
    import random
    
    # Playwright expects proxy credentials as separate fields rather than embedded in the URL
    proxies = [
        {"server": "http://proxy1.example.com:8080", "username": "user1", "password": "pass1"},
        {"server": "http://proxy2.example.com:8080", "username": "user2", "password": "pass2"},
        # Add more proxies to your pool
    ]
    
    with sync_playwright() as p:
        # Launch the browser with a randomly selected proxy
        browser = p.chromium.launch(headless=True, proxy=random.choice(proxies))
        page = browser.new_page()
        page.goto("https://www.whatismyip.com/")
        page.screenshot(path="proxy_ip.png")
        browser.close()
    For large-scale operations, consider using a proxy service that handles rotation and management automatically, such as those offered by Scrapeless.
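
If you need to rotate the proxy per session without relaunching the browser, Playwright also accepts a proxy per context. The following is a minimal sketch; the proxy endpoints and credentials are placeholders:

    python
    from playwright.sync_api import sync_playwright
    import random
    
    proxy_pool = [
        {"server": "http://proxy1.example.com:8080", "username": "user1", "password": "pass1"},
        {"server": "http://proxy2.example.com:8080", "username": "user2", "password": "pass2"},
    ]
    
    with sync_playwright() as p:
        # Some Chromium builds require a launch-level proxy before per-context proxies take effect
        browser = p.chromium.launch(headless=True, proxy={"server": "http://per-context"})
        for _ in range(3):
            # Each context gets its own proxy, so successive sessions appear to come from different IPs
            context = browser.new_context(proxy=random.choice(proxy_pool))
            page = context.new_page()
            page.goto("https://httpbin.org/ip")  # Echoes the IP the server sees
            print(page.text_content("body"))
            context.close()
        browser.close()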

4. Simulate Realistic Mouse Movements and Keyboard Inputs

Bots often interact with web elements directly and instantaneously, which is unnatural. Simulating human-like mouse movements, clicks, and keyboard inputs can significantly reduce the chances of detection [5].

Code Operation Steps:

  1. Use page.mouse and page.keyboard methods to introduce delays and realistic paths:
    python
    from playwright.sync_api import sync_playwright
    import time
    import random
    
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False) # Use headless=False to observe movements
        page = browser.new_page()
        page.goto("https://www.example.com")
    
        # Simulate human-like mouse movement to an element and click
        element = page.locator("a[href='/some-link']")
        box = element.bounding_box()
        if box:
            # Move mouse to a random point within the element
            x = box["x"] + box["width"] * random.random()
            y = box["y"] + box["height"] * random.random()
            page.mouse.move(x, y, steps=random.randint(5, 15)) # Smooth movement
            page.mouse.click(x, y)
            time.sleep(random.uniform(0.5, 1.5)) # Random delay after click
    
        # Simulate human-like typing
        search_input = page.locator("#search-box")
        search_input.click()
        text_to_type = "Playwright Stealth"
        for char in text_to_type:
            page.keyboard.type(char)
            time.sleep(random.uniform(0.05, 0.2)) # Random delay between key presses
        page.keyboard.press("Enter")
        
        time.sleep(3)
        page.screenshot(path="human_like_interaction.png")
        browser.close()
    Introducing randomness in delays and movement paths makes the automation appear less robotic.
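
If you need this in several places, it helps to wrap the movement logic in a small helper. The sketch below wanders through intermediate points with jitter before clicking; the function name and selector are just examples:

    python
    import random
    import time
    from playwright.sync_api import Page
    
    def human_click(page: Page, selector: str) -> None:
        """Move to a random point inside the element via jittered steps, then click."""
        box = page.locator(selector).bounding_box()
        if not box:
            return
        target_x = box["x"] + box["width"] * random.uniform(0.2, 0.8)
        target_y = box["y"] + box["height"] * random.uniform(0.2, 0.8)
        # Pass through one or two nearby points instead of jumping straight to the target
        for _ in range(random.randint(1, 2)):
            page.mouse.move(target_x + random.uniform(-80, 80),
                            target_y + random.uniform(-80, 80),
                            steps=random.randint(5, 15))
            time.sleep(random.uniform(0.05, 0.2))
        page.mouse.move(target_x, target_y, steps=random.randint(10, 25))
        page.mouse.click(target_x, target_y)
        time.sleep(random.uniform(0.3, 1.0))
    
    # Usage: human_click(page, "a[href='/some-link']")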

5. Manage Cookies and Session Data

Websites use cookies to track user sessions and preferences. Bots that don't handle cookies properly can be easily identified. Maintaining a consistent session by accepting and sending cookies is vital for stealth [6].

Code Operation Steps:

  1. Use browser.new_context() to create isolated sessions and context.add_cookies() or context.storage_state() to manage cookies:
    python
    from playwright.sync_api import sync_playwright
    import json
    
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        
        # Scenario 1: Load cookies from a previous session
        try:
            with open("cookies.json", "r") as f:
                saved_cookies = json.load(f)
            context = browser.new_context()
            context.add_cookies(saved_cookies)
        except FileNotFoundError:
            context = browser.new_context()
    
        page = context.new_page()
        page.goto("https://www.example.com/login") # Navigate to a page that sets cookies
        # Perform login or other actions that generate cookies
        # ...
        page.goto("https://www.example.com/dashboard") # Access a page requiring session
        page.screenshot(path="session_managed.png")
    
        # Scenario 2: Save current session cookies for future use
        current_cookies = context.cookies()
        with open("cookies.json", "w") as f:
            json.dump(current_cookies, f)
    
        browser.close()
    Proper cookie management ensures that your bot maintains a consistent identity across requests, mimicking a returning user.
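
Playwright can also persist the entire storage state (cookies plus localStorage) in one call, which is often simpler than handling cookies.json by hand. A minimal sketch; the login flow and file name are placeholders:

    python
    import os
    from playwright.sync_api import sync_playwright
    
    STATE_FILE = "state.json"
    
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        # Reuse a saved session if one exists, otherwise start fresh
        context = browser.new_context(
            storage_state=STATE_FILE if os.path.exists(STATE_FILE) else None
        )
        page = context.new_page()
        page.goto("https://www.example.com/dashboard")
        # ... perform login here on the first run ...
    
        # Persist cookies and localStorage for the next run
        context.storage_state(path=STATE_FILE)
        browser.close()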

6. Adjust Viewport Size and Device Emulation

Websites often check the viewport size, screen resolution, and device type to detect anomalies. Using a default or inconsistent viewport can be a red flag. Emulating common device configurations helps your sessions blend in with real traffic [7].

Code Operation Steps:

  1. Set viewport and user_agent when creating a new context:
    python
    from playwright.sync_api import sync_playwright
    import random
    
    viewports = [
        {"width": 1920, "height": 1080}, # Desktop
        {"width": 1366, "height": 768},  # Laptop
        {"width": 375, "height": 667},   # iPhone 8
        {"width": 412, "height": 846}    # Pixel 4
    ]
    
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        
        # Randomly select a viewport and user agent
        selected_viewport = random.choice(viewports)
        # A corresponding user agent should be chosen if emulating a specific device
        # For simplicity, we'll use a generic one here, but ideally match it.
        user_agent_for_device = "Mozilla/5.0 (iPhone; CPU iPhone OS 13_5 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Mobile/15E148 Safari/604.1"
    
        context = browser.new_context(
            viewport=selected_viewport,
            user_agent=user_agent_for_device if selected_viewport["width"] < 1000 else None # Apply mobile UA for small viewports
        )
        page = context.new_page()
        page.goto("https://www.deviceinfo.me/") # Website to check device info
        page.screenshot(path="device_emulation.png")
        browser.close()
    This ensures that the browser environment appears consistent with a real user's device.
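
Playwright also ships built-in device descriptors that keep the viewport, User-Agent, device scale factor, and touch support consistent with each other, which is usually safer than mixing values by hand. A short sketch:

    python
    from playwright.sync_api import sync_playwright
    import random
    
    with sync_playwright() as p:
        # Built-in descriptors bundle viewport, user_agent, is_mobile, and related settings
        device = p.devices[random.choice(["iPhone 13", "Pixel 5", "Desktop Chrome"])]
    
        browser = p.chromium.launch(headless=True)
        context = browser.new_context(**device)
        page = context.new_page()
        page.goto("https://www.deviceinfo.me/")
        page.screenshot(path="builtin_device_emulation.png")
        browser.close()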

7. Avoid Headless Mode Detection

While headless mode is efficient, some anti-bot systems can detect it. Running Playwright in headful mode (with a visible browser UI) can sometimes bypass detection, especially for more aggressive systems. However, this consumes more resources [8].

Code Operation Steps:

  1. Set headless=False when launching the browser:
    python
    from playwright.sync_api import sync_playwright
    
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False) # Launch in headful mode
        page = browser.new_page()
        page.goto("https://www.example.com")
        page.screenshot(path="headful_mode.png")
        browser.close()
    For production scraping, this might not be scalable, but it's a useful technique for debugging and bypassing particularly stubborn detection systems.
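
On a server without a display, you can still run a headful browser under a virtual display. The sketch below uses the pyvirtualdisplay package and assumes Xvfb is installed on the machine:

    python
    from pyvirtualdisplay import Display
    from playwright.sync_api import sync_playwright
    
    # Requires: pip install pyvirtualdisplay  (plus the Xvfb system package)
    with Display(visible=0, size=(1920, 1080)):
        with sync_playwright() as p:
            browser = p.chromium.launch(headless=False)  # Headful, but rendered to the virtual display
            page = browser.new_page()
            page.goto("https://www.example.com")
            page.screenshot(path="headful_on_server.png")
            browser.close()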

8. Use a Stealth Plugin (playwright-extra / playwright-stealth)

For a more comprehensive and easier way to apply multiple stealth techniques at once, use a dedicated stealth library. In the Node.js ecosystem, playwright-extra combined with puppeteer-extra-plugin-stealth automatically applies a suite of evasions. For Python, the community-maintained playwright-stealth package provides a comparable set of patches [9].

Code Operation Steps:

  1. Install the library:
    bash
    pip install playwright-stealth
  2. Apply the stealth patches to each page:
    python
    from playwright.sync_api import sync_playwright
    from playwright_stealth import stealth_sync
    
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        stealth_sync(page)  # Patch common fingerprinting surfaces before navigating
        page.goto("https://bot.sannysoft.com/")
        page.screenshot(path="playwright_stealth.png")
        browser.close()
    These patches cover many of the common fingerprinting checks, such as navigator.webdriver, navigator.plugins, navigator.languages, and related JavaScript properties.

9. Implement Delays and Randomization in Actions

Bots often execute actions with perfect timing and speed. Introducing random delays between actions and varying the speed of interactions can make your script appear more human. This is a simple yet effective behavioral stealth technique [10].

Code Operation Steps:

  1. Use time.sleep() with random intervals:
    python
    from playwright.sync_api import sync_playwright
    import time
    import random
    
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto("https://www.example.com")
    
        # Simulate browsing with random delays
        time.sleep(random.uniform(2, 5)) # Initial page load delay
        page.click("a[href='/products']")
        time.sleep(random.uniform(1, 3)) # Delay after click
        page.locator("input[name='q']").fill("search item")
        time.sleep(random.uniform(0.5, 1.5)) # Delay after typing
        page.keyboard.press("Enter")
        time.sleep(random.uniform(3, 7)) # Delay after search
    
        page.screenshot(path="random_delays.png")
        browser.close()
    Consistent, predictable delays are still detectable. The key is to introduce variability.

At Scrapeless, we only access publicly available data while strictly complying with applicable laws, regulations, and website privacy policies. The content in this blog is for demonstration purposes only and does not involve any illegal or infringing activities. We make no guarantees and disclaim all liability for the use of information from this blog or third-party links. Before engaging in any scraping activities, consult your legal advisor and review the target website's terms of service or obtain the necessary permissions.
