Engineering

Why Your Stealth Plugin Isn't Working (And What Actually Does)

TinyFishie·TinyFish Observer·Apr 21, 2026·9 min read

You added puppeteer-extra-plugin-stealth. You're rotating proxies. You're randomizing user agents and adding human-like delays. You're still getting a Cloudflare challenge page or a silent empty response on every third request.

The problem isn't your implementation. The problem is that the advice everyone gives—"add stealth plugins, rotate proxies"—was written for the detection systems of circa 2020. Modern anti-bot infrastructure operates at layers that JavaScript patching can't reach.

Here's what's actually happening, why the standard advice fails against serious protection, and what the realistic options are.

Where each approach breaks down:

  1. Basic Python requests → blocked by TLS fingerprinting in milliseconds
  2. Headless Playwright + stealth plugin → blocked by behavioral and canvas analysis
  3. Residential proxies alone → blocked when fingerprint says "bot" regardless of IP
  4. All three combined, DIY → works until the site updates detection, then debugging starts

The Problem With JavaScript Stealth

puppeteer-extra-plugin-stealth works by patching JavaScript properties that detection systems look for: navigator.webdriver, fake plugins, consistent screen dimensions, and a dozen other tells that headless browsers expose by default. Against basic detection—Distil, old PerimeterX configurations, simple honeypot checks—it's effective and often enough.

The fundamental limitation is that it's JavaScript patching JavaScript. Any modification you make through JS can be detected in JS, by the same engine, at the same level. The more sophisticated detection systems don't even bother checking navigator.webdriver first. They go underneath it.

Specifically: your browser establishes a TLS connection before any JavaScript runs. The TLS handshake contains a fingerprint—cipher suites, extension ordering, JA3 hash—that identifies the library used to make the connection. Python's requests library sends a TLS fingerprint that looks nothing like Chrome. Playwright running in Node.js sends a fingerprint that differs from browser-launched Chrome in detectable ways. DataDome and Cloudflare bot management check this before serving a single byte of page content. Stealth plugins don't touch TLS.
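The JA3 scheme these systems use can be sketched in a few lines: concatenate the handshake's fields in a fixed order and hash the result. This is a sketch of the mechanism only; the numeric values below are illustrative, not a real Chrome handshake.

```python
import hashlib

def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """JA3-style fingerprint: join each field's values with '-',
    join the five fields with ',', then MD5 the result."""
    fields = [
        str(version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    return hashlib.md5(",".join(fields).encode()).hexdigest()

# Two clients offering the same cipher suites in a different order
# produce different fingerprints -- ordering is part of the signature.
chrome_like = ja3_hash(771, [4865, 4866, 4867], [0, 23, 65281], [29, 23, 24], [0])
reordered   = ja3_hash(771, [4866, 4865, 4867], [0, 23, 65281], [29, 23, 24], [0])
print(chrome_like != reordered)  # True
```

Because ordering feeds the hash, a client can offer exactly the right cipher suites in the wrong order and still be flagged, which is why this can't be fixed from JavaScript after the connection exists.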

HTTP/2 fingerprinting works the same way. Browsers negotiate HTTP/2 frames in a specific order with specific window sizes. Automation frameworks deviate from this pattern in measurable ways that exist at the protocol level, not the JavaScript level.
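The shape of an HTTP/2 fingerprint, in the Akamai-style notation of SETTINGS values, connection window update, priority frames, and pseudo-header order, can be sketched as simple string assembly. The values here are illustrative, not an exact Chrome fingerprint.

```python
def h2_fingerprint(settings, window_update, priorities, pseudo_header_order):
    """Assemble an Akamai-style HTTP/2 fingerprint string:
    SETTINGS pairs | WINDOW_UPDATE increment | PRIORITY frames | pseudo-header order."""
    settings_part = ";".join(f"{k}:{v}" for k, v in settings)
    priority_part = ",".join(priorities) if priorities else "0"
    headers_part = ",".join(pseudo_header_order)
    return f"{settings_part}|{window_update}|{priority_part}|{headers_part}"

# Illustrative values only -- real fingerprints vary by browser and version.
browser_like = h2_fingerprint(
    settings=[(1, 65536), (4, 6291456), (6, 262144)],
    window_update=15663105,
    priorities=[],
    pseudo_header_order=["m", "a", "s", "p"],  # :method, :authority, :scheme, :path
)
print(browser_like)
```

An automation framework that sends different SETTINGS values, or pseudo-headers in a different order, produces a different string at the protocol level, before any page JavaScript runs.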

How Modern Detection Actually Works

Enterprise anti-bot systems—Cloudflare Bot Management, DataDome, Kasada, Akamai Bot Manager—use three layers simultaneously:

Layer 1: Network signatures. TLS fingerprint, HTTP/2 frame ordering, TCP window sizes, and IP reputation. A residential IP with a non-browser TLS fingerprint fails here. A perfect browser fingerprint from a datacenter IP also fails here. Both signals are checked, and a mismatch on either is a flag.

Layer 2: Browser fingerprinting. Canvas rendering (GPUs render slightly differently across hardware configurations), WebGL signatures, audio context outputs, font enumeration, screen metrics. These produce a fingerprint of the actual hardware running the browser. Headless Chrome running on a cloud VM produces a different fingerprint than Chrome on a physical laptop—even when all the detectable JavaScript properties are patched.
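The canvas part of this layer boils down to hashing rendered pixel output: identical drawing commands produce slightly different pixels on different GPU and driver stacks, and the hash exposes that. A minimal sketch of the principle, hashing raw bytes rather than a real canvas buffer:

```python
import hashlib

def canvas_fingerprint(pixels: bytes) -> str:
    # Real systems hash the pixel buffer from toDataURL()/getImageData();
    # hashing raw bytes here just demonstrates the principle.
    return hashlib.sha256(pixels).hexdigest()[:16]

# Two "renders" of the same scene differing by one subpixel value --
# e.g. different anti-aliasing on different hardware -- give unrelated hashes.
render_a = bytes([120, 64, 200, 255] * 64)
render_b = bytearray(render_a)
render_b[2] = 201
print(canvas_fingerprint(render_a) != canvas_fingerprint(bytes(render_b)))  # True
```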

Layer 3: Behavioral analysis. Session patterns, mouse movement characteristics, scroll behavior, click timing distributions, navigation flow. Human users exhibit statistical regularities in these patterns that automation doesn't reproduce even when you add randomized delays. This layer also checks session consistency over time—the same fingerprint appearing from different geolocations within an impossible timeframe is a hard signal.
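The "add randomized delays" advice fails against this layer for a measurable reason: uniform jitter has the wrong distribution shape. Human inter-action times are heavy-tailed and right-skewed; uniform noise is symmetric. A sketch comparing the sample skewness of the two, using a log-normal model of human timing (a common modeling assumption, not a claim about any specific vendor's detector):

```python
import random
import statistics

def skewness(xs):
    """Sample skewness: right-skewed, heavy-tailed human-like timings
    score high; symmetric uniform jitter scores near zero."""
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum(((x - mean) / sd) ** 3 for x in xs) / len(xs)

rng = random.Random(42)

# Bot: uniform jitter around a fixed delay (the usual "add random sleeps" advice).
bot_delays = [rng.uniform(0.5, 1.5) for _ in range(5000)]

# Human-like: log-normal inter-action times.
human_delays = [rng.lognormvariate(0, 0.9) for _ in range(5000)]

print(round(skewness(bot_delays), 2), round(skewness(human_delays), 2))
```

A detector doesn't need to know your exact delay range; it only needs enough events to see that the distribution's shape is wrong.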

Plugins that operate at the JavaScript layer address some signals in Layer 2. They don't address Layer 1 at all. Against sophisticated implementations of Layer 3, JavaScript patches help but don't solve the fundamental behavioral mismatch.

The Arms Race You're Losing

Here's what the developer forums don't say clearly: puppeteer-extra-plugin-stealth is maintained by the community and updated reactively, after detection systems have already added new checks. When DataDome shipped improved canvas fingerprinting analysis in late 2025, the plugin didn't have a patch for weeks. Every site using DataDome that you'd previously scraped cleanly started blocking.

This is the actual cost of the DIY approach: it's not just the initial setup time. It's ongoing maintenance of a system that breaks whenever a detection provider ships an update, on a schedule you don't control.

Some sites deploy multiple detection layers from different vendors—Cloudflare at the edge plus DataDome at the application level. Bypassing one doesn't bypass the other. The combination creates a significantly higher bar than either alone.

[Diagram: the three layers of bot detection, and which layers stealth plugins vs. infrastructure-level solutions cover]

What "Infrastructure-Level" Stealth Means

Stealth that works against DataDome and Kasada consistently operates at the browser binary level, not the JavaScript level. This means the TLS fingerprint looks exactly like a real browser because it is produced by the same underlying network stack. Canvas and WebGL fingerprints match real hardware because the rendering is happening on real GPU hardware or realistic virtualization. Behavioral patterns reflect real browser execution rather than programmatic simulation.

This is expensive and complex to build correctly as a DIY stack. The components required:

  • Chromium compiled with specific modifications to the network stack for TLS fingerprint matching
  • GPU passthrough or realistic GPU emulation for canvas/WebGL consistency
  • Residential IP infrastructure with carrier-grade IP allocation (not proxy services reselling residential IP pools)
  • Behavioral replay based on real human session data, not randomized timing algorithms

TinyFish implements stealth at the C++ layer rather than through JavaScript injection—the browser binary produces network-level fingerprints that match real Chrome without any post-launch patching. Combined with residential proxy routing and behavioral execution patterns, this reaches the ~85% success rate across protected targets that TinyFish reports in production—a level that JavaScript-patched automation can't achieve against modern detection. The sites that resist this (Kasada on high-value financial targets, specific Akamai configurations) require additional measures.

For teams that need this level of protection without building it: TinyFish, ScrapingBee, Zyte, and Bright Data Web Unlocker all operate at the infrastructure level. The trade-off against DIY is cost per request and control over execution logic.

When DIY Is Enough (And When It Isn't)

Most websites don't use enterprise anti-bot systems. DataDome and Kasada are expensive and require ongoing tuning—small e-commerce sites, blogs, SaaS documentation, and most long-tail targets use Cloudflare's free tier or no bot protection at all. puppeteer-extra-plugin-stealth plus residential proxies handles these reliably.

The inflection point is when targets use:

  • Cloudflare Enterprise Bot Management (not the free tier)
  • DataDome (used by Reddit, Figma, Leboncoin, and many fintech targets)
  • Kasada (used by major ticketing platforms and e-commerce)
  • Akamai Bot Manager

You'll know you've hit one of these when you see your rate of empty responses increase as you scale, when stealth plugins that worked at 1 req/sec fail at 10 req/sec, or when your requests succeed on the first run and fail on subsequent runs from the same session.
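One way to catch that inflection early is to track the blocked-response rate over a sliding window instead of eyeballing logs. A minimal sketch; the window size and alert threshold are illustrative, not recommendations:

```python
from collections import deque

class BlockRateMonitor:
    """Track the fraction of blocked/empty responses over the last N requests.
    A rate that rises as concurrency grows is the classic sign you've hit
    behavioral or fingerprint detection rather than a flaky target."""
    def __init__(self, window=200, alert_threshold=0.15):
        self.window = deque(maxlen=window)
        self.alert_threshold = alert_threshold

    def record(self, blocked: bool) -> bool:
        """Record one outcome; return True once the block rate crosses the threshold."""
        self.window.append(blocked)
        return len(self.window) >= 50 and self.rate() >= self.alert_threshold

    def rate(self) -> float:
        return sum(self.window) / len(self.window) if self.window else 0.0

monitor = BlockRateMonitor()
# Simulate: clean at low volume, then one-in-three blocks after scaling up.
outcomes = [False] * 100 + [i % 3 == 0 for i in range(100)]
alerts = [monitor.record(b) for b in outcomes]
print(alerts.index(True))  # index of the first request that trips the alert
```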

For targets in these categories: the honest answer is that DIY stealth requires significant ongoing engineering investment to stay ahead of detection updates. The math changes when you calculate engineering time against the cost of a managed solution.

---

TinyFish gives you 500 free steps to test against your actual protected targets. No configuration required—describe the goal, point it at the site, see if it gets through.

**Start your free trial →**

---

FAQ

Does puppeteer-extra-plugin-stealth still work in 2026?

Against basic detection systems—simple headless browser checks, honeypot traps, basic user-agent filtering—yes, it still works. Against enterprise anti-bot systems like DataDome, Kasada, or Cloudflare Bot Management, it's insufficient on its own. These systems check TLS fingerprints and network-level signals that JavaScript plugins can't modify. The plugin is best understood as a floor, not a ceiling: it removes obvious tells but doesn't address the deeper signals that sophisticated detection uses.

What's the difference between residential proxies and datacenter proxies for bot detection?

Residential proxies route traffic through IP addresses assigned by ISPs to real home users—anti-bot systems trust these significantly more than datacenter IPs because they're associated with real humans. Datacenter IPs are recognizable as cloud infrastructure and are immediately suspicious on most protected targets. That said, a residential IP doesn't override a bad browser fingerprint: DataDome checks both, and a residential IP paired with a non-browser TLS signature still fails. Both matter, and they're checked independently.

How does TLS fingerprinting work and why can't stealth plugins fix it?

TLS fingerprinting analyzes the cipher suites, extensions, and ordering in the TLS handshake your client sends before any page content is served. Every HTTP library has a characteristic fingerprint—Python requests, curl, and Playwright each produce distinct fingerprints that differ from browser-native Chrome. Anti-bot systems match this fingerprint against a library of known patterns. Stealth plugins run JavaScript inside the browser and can't modify the underlying network stack that generates the TLS handshake—they operate at a layer that loads after the TLS connection is already established.

Is it legal to bypass bot protection when scraping?

This depends heavily on jurisdiction, the target site, and what you're doing with the data. In the US, the Computer Fraud and Abuse Act (CFAA) has been interpreted inconsistently by courts regarding scraping. The hiQ v. LinkedIn decision established that scraping publicly accessible data is generally lawful, but terms of service violations and bypassing technical access controls add complexity. Scraping for research, competition analysis, or price comparison on publicly available data is generally lower risk than scraping authenticated pages or building competing products with the data. This isn't legal advice—the specifics matter significantly.

At what scale do stealth plugins start failing?

There's no universal threshold, but behavioral analysis becomes more reliable as you scale. A single session at 1 request per minute looks more human than 50 concurrent sessions each making 10 requests per minute from the same IP pool. Rate is one signal; concurrency pattern is another; session consistency over time is a third. You can often scrape at low volume with DIY stealth against moderately protected sites, but the detection surface area grows with scale. The practical answer: test at your target volume against your target sites—failure patterns at 10x scale often aren't predictable from 1x testing.

Related Reading

  • The Best Web Scraping Tools in 2026
  • Anti-Bot Protection for Web Agents: How TinyFish Gets Past the Front Door
  • What Is a Web Agent? The Complete Guide to AI Browser Agents