
Anti-bot protection is a multi-layer detection system that analyzes IP reputation, TLS fingerprints, HTTP headers, browser fingerprints, behavioral patterns, and CAPTCHA challenges to distinguish bots from humans.
Your scraper works on localhost. You deploy it, run 200 requests, and every page throws a Cloudflare challenge. You add residential proxies — still blocked. You rotate user agents — still blocked. You patch navigator.webdriver — blocked again, but now for a different reason.
This is the current state of anti-bot protection in 2026. Sites don't check one signal. They check six or more, simultaneously. Fixing them one at a time is a losing game — beating it requires passing all layers at once.
Before talking about solutions, it helps to understand what web scraping bot detection actually looks like in 2026. Modern anti-bot systems — Cloudflare, PerimeterX (now HUMAN), DataDome, Akamai — don't rely on a single check. They layer detections so that passing one while failing another still gets you flagged. Understanding these layers is essential whether you're trying to bypass Cloudflare for web scraping or handle any other protection system.
Layer 1: IP reputation. Every request comes from an IP address, and anti-bot systems maintain reputation databases. Datacenter IPs (AWS, GCP, Azure) are flagged immediately — they're the obvious choice for automated traffic. Residential IPs have higher trust because they're associated with real users. Mobile IPs are harder still to block, because carrier-grade NAT puts thousands of real users behind a single address.
Layer 2: TLS fingerprint. Before your HTTP request even reaches the server, your TLS handshake reveals what client you're using. Python's requests library, Go's net/http, Node's axios — each has a distinct TLS signature that looks nothing like a real browser. Cloudflare checks this in milliseconds.
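To make this concrete, here's a sketch of how a JA3-style TLS fingerprint is computed: the handshake's version, cipher suites, extensions, curves, and point formats are joined into a string and hashed. The parameter values below are illustrative, not real Chrome or Python handshakes.

```python
import hashlib

def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """JA3-style fingerprint: MD5 of the TLS version, cipher suites,
    extensions, elliptic curves, and point formats, with list values
    joined by '-' and fields joined by ','."""
    fields = [
        str(version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    return hashlib.md5(",".join(fields).encode()).hexdigest()

# Illustrative (made-up) handshake parameters for two clients: even one
# differing cipher list yields a completely different hash, which is why
# a scripting library can't pass for Chrome at this layer.
chrome_like = ja3_hash(771, [4865, 4866, 4867], [0, 23, 65281], [29, 23, 24], [0])
script_like = ja3_hash(771, [4865, 49195], [0, 23], [29, 23], [0])
print(chrome_like == script_like)  # False — the server tells them apart instantly
```

Because the hash covers the entire handshake shape, there's nothing to "patch" after the fact: the fingerprint is fixed the moment your TLS library opens the connection.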
Layer 3: HTTP headers and protocol. Real browsers send HTTP/2 with specific frame ordering and header patterns. Automation tools often default to HTTP/1.1 or send headers in the wrong order. Sites like Cloudflare flag these mismatches before you've even loaded a page.
Layer 4: Browser fingerprint. Headless Chrome has detectable properties: navigator.webdriver=true, missing plugins, inconsistent screen dimensions, no GPU renderer. Anti-bot systems check hundreds of these attributes and compare them against known browser signatures. A mismatch in any one of them raises your bot score.
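A toy version of that scoring logic looks like this. Real systems check hundreds of attributes against known browser signatures; the five below are illustrative stand-ins.

```python
# Expected attributes for a real Chrome install (illustrative subset).
EXPECTED_CHROME = {
    "webdriver": False,            # navigator.webdriver must be false/absent
    "plugins_length_gt_zero": True,
    "gpu_renderer_present": True,
    "languages_nonempty": True,
    "screen_consistent": True,     # window vs. screen dimensions agree
}

def bot_score(fingerprint: dict) -> float:
    """Fraction of checked attributes that deviate from a real browser."""
    mismatches = sum(
        1 for key, expected in EXPECTED_CHROME.items()
        if fingerprint.get(key) != expected
    )
    return mismatches / len(EXPECTED_CHROME)

# Unpatched headless Chrome trips several checks at once:
headless = {"webdriver": True, "plugins_length_gt_zero": False,
            "gpu_renderer_present": False, "languages_nonempty": True,
            "screen_consistent": True}
print(bot_score(headless))  # 0.6 — three of five attributes mismatch
```

This is why patching navigator.webdriver alone doesn't help: it moves one attribute out of hundreds, and the score stays high.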
Layer 5: Behavioral analysis. Real users move their mouse in non-linear paths, scroll at varying speeds, hesitate before clicking. Automated tools produce perfectly straight mouse paths or instant clicks. Modern systems use behavioral biometrics to detect this.
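One simple signal on the detection side: how far does the cursor ever stray from the straight line between where it started and where it clicked? A sketch, far cruder than real behavioral biometrics:

```python
def max_deviation(path):
    """Maximum perpendicular distance of any point from the straight
    line joining the path's start and end. Scripted cursors that glide
    in a straight line score ~0; human paths wobble."""
    (x0, y0), (x1, y1) = path[0], path[-1]
    dx, dy = x1 - x0, y1 - y0
    length = (dx * dx + dy * dy) ** 0.5 or 1.0
    return max(abs(dy * (x - x0) - dx * (y - y0)) / length for x, y in path)

robot = [(0, 0), (50, 50), (100, 100)]            # perfectly linear
human = [(0, 0), (40, 58), (71, 83), (100, 100)]  # curved, hesitant

print(max_deviation(robot))      # 0.0 — a strong bot signal
print(max_deviation(human) > 5)  # True
```

Production systems combine many such features — velocity profiles, hesitation before clicks, scroll cadence — but the principle is the same: synthetic movement is geometrically too clean.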
Layer 6: CAPTCHAs. When detection signals are ambiguous, the system throws a challenge. reCAPTCHA v3 doesn't even show a visible challenge — it scores your behavior silently and blocks you if the score is too low. Cloudflare Turnstile uses device fingerprinting and cryptographic challenges behind the scenes.

The critical insight: these layers compound. Fixing your IP while leaving your TLS fingerprint exposed still gets you blocked. Building a complete anti-bot stack means handling all six layers simultaneously.
If you're building anti-bot handling yourself, here's what a production-grade stack looks like:
Residential proxy provider. Datacenter IPs are dead for serious scraping. Residential proxies cost $3–15/GB depending on provider and geography. You need rotation logic, failover handling, and geographic targeting. Budget: $200–2,000+/month for production volume.
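The rotation and failover logic is more code than most teams expect. A minimal sketch (the proxy URLs are placeholders, not real endpoints):

```python
import itertools

class ProxyRotator:
    """Minimal rotation/failover: cycle through a residential pool and
    skip endpoints that have failed too often. Real implementations add
    geographic targeting, cooldown timers, and health checks."""
    def __init__(self, proxies, max_failures=3):
        self.proxies = proxies
        self.failures = {p: 0 for p in proxies}
        self.max_failures = max_failures
        self._cycle = itertools.cycle(proxies)

    def next_proxy(self):
        for _ in range(len(self.proxies)):
            proxy = next(self._cycle)
            if self.failures[proxy] < self.max_failures:
                return proxy
        raise RuntimeError("entire pool is burned; time to buy more IPs")

    def report_failure(self, proxy):
        self.failures[proxy] += 1

pool = ProxyRotator([f"http://user:pass@res-us-{i}.example:8000" for i in range(3)])
p = pool.next_proxy()
pool.report_failure(p)    # e.g. a Cloudflare challenge came back
print(pool.next_proxy())  # rotation moves on to the next endpoint
```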
Fortified browser. Standard headless Chrome leaks automation signals everywhere. You need patches for WebDriver detection, plugin simulation, canvas fingerprint randomization, WebGL rendering consistency, and HTTP/2 frame ordering. Libraries like playwright-extra with stealth plugins help, but they're in a constant arms race with detection systems.
TLS fingerprint matching. Tools like curl-impersonate replicate Chrome's exact TLS handshake. But Cloudflare has adapted to detect its specific patterns. You need to stay current with each browser version's TLS signature.
Behavioral simulation. For heavily protected sites, you need mouse movement simulation with natural acceleration/deceleration curves, variable scroll speeds, and realistic timing between actions. Basic randomization doesn't work — modern systems use Fitts' Law models to detect synthetic movement.
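On the generation side, "natural" movement usually means a curved trajectory with eased timing rather than uniform steps. A sketch of the idea — a quadratic Bezier with an ease-in/ease-out profile, nowhere near a production-grade Fitts' Law model:

```python
import math
import random

def human_path(start, end, steps=30):
    """Cursor points along a quadratic Bezier curve with ease-in/ease-out
    timing: curved trajectory, slow start, fast middle, slow settle near
    the target."""
    (x0, y0), (x1, y1) = start, end
    # A random control point off the straight line bends the trajectory.
    cx = (x0 + x1) / 2 + random.uniform(-80, 80)
    cy = (y0 + y1) / 2 + random.uniform(-80, 80)
    points = []
    for i in range(steps + 1):
        t = i / steps
        t = (1 - math.cos(t * math.pi)) / 2  # ease-in-out remap of time
        x = (1 - t) ** 2 * x0 + 2 * (1 - t) * t * cx + t ** 2 * x1
        y = (1 - t) ** 2 * y0 + 2 * (1 - t) * t * cy + t ** 2 * y1
        points.append((x, y))
    return points

path = human_path((0, 0), (400, 300))
print(path[0], path[-1])  # endpoints are exact; interior points curve
```

Detection models look at exactly the properties this varies — curvature and non-uniform velocity — which is why simple jitter on a straight line isn't enough.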
CAPTCHA solving service. When prevention fails, you need a fallback. Services like 2Captcha ($1–3/1K solves) and CapSolver ($0.40–0.90/1K) either use human workers or AI models. Token-based solvers add 15–30 seconds per solve. Browser-integrated solvers are faster but cost more.
Retry and adaptation logic. When a request fails, your system needs to diagnose why (IP burned? fingerprint detected? behavioral flag?) and adapt — switch proxy, rotate fingerprint, change timing pattern. This is the orchestration layer most teams underestimate.
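The shape of that orchestration layer is a diagnose-and-adapt loop. In this sketch, fetch() and diagnose() are placeholders for your HTTP client and block classifier; the three remediations mirror the failure modes above.

```python
def run_with_adaptation(fetch, diagnose, session, max_attempts=4):
    """Diagnose why a request was blocked and adapt before retrying.
    fetch() and diagnose() are caller-supplied placeholders; diagnose()
    returns None on success or a block category on failure."""
    for _ in range(max_attempts):
        response = fetch(session)
        verdict = diagnose(response)
        if verdict is None:
            return response
        if verdict == "ip_burned":
            session["proxy_index"] += 1                  # switch proxy
        elif verdict == "fingerprint":
            session["fingerprint_seed"] += 1             # rotate fingerprint
        elif verdict == "behavioral":
            session["delay_s"] = session["delay_s"] * 2  # change timing pattern
    raise RuntimeError("exhausted adaptation strategies")

# Simulated target: serves challenges until the proxy has rotated twice.
session = {"proxy_index": 0, "fingerprint_seed": 0, "delay_s": 1}
fetch = lambda s: "challenge" if s["proxy_index"] < 2 else "html"
diagnose = lambda r: "ip_burned" if r == "challenge" else None
print(run_with_adaptation(fetch, diagnose, session))  # html
```

The hard part in practice isn't the loop — it's writing a diagnose() that reliably tells an IP ban from a fingerprint flag from a behavioral score.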
Total cost for a production DIY stack: $500–5,000/month in services alone, plus ongoing engineering time to keep it working as detection systems evolve. The real cost isn't the proxies — it's the engineer who maintains the stack.
When building your own makes sense: If your team needs full control over every component — for compliance auditing, custom fingerprint logic, or integration with existing infrastructure — the DIY path is the right one. It's also more cost-effective at very high volume (100K+ requests/month) where per-step pricing exceeds the fixed cost of maintaining your own stack. The key question is whether you have the dedicated engineering bandwidth to keep it running as detection systems evolve.
TinyFish takes a different approach. Instead of exposing anti-bot components for you to assemble, it handles all six detection layers at the infrastructure level.
From your side, the entire configuration is one parameter:
```shell
curl -N -X POST https://agent.tinyfish.ai/v1/automation/run-sse \
  -H "X-API-Key: $TINYFISH_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://protected-site.com",
    "goal": "Extract all product names and prices as JSON",
    "browser_profile": "stealth",
    "proxy_config": {
      "enabled": true,
      "country_code": "US"
    }
  }'
```

Setting browser_profile: "stealth" activates the full anti-bot stack. proxy_config routes traffic through residential proxies in a specific country. That's it.
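For reference, the same request in Python using only the standard library. The per-event payload format isn't specified here, so this sketch prints SSE lines raw rather than parsing them.

```python
import json
import os
import urllib.request

payload = {
    "url": "https://protected-site.com",
    "goal": "Extract all product names and prices as JSON",
    "browser_profile": "stealth",
    "proxy_config": {"enabled": True, "country_code": "US"},
}

def stream_run(payload, api_key):
    """POST the automation request and print server-sent-event lines
    as they arrive."""
    req = urllib.request.Request(
        "https://agent.tinyfish.ai/v1/automation/run-sse",
        data=json.dumps(payload).encode(),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        for raw in resp:
            line = raw.decode().strip()
            if line:
                print(line)

# stream_run(payload, os.environ["TINYFISH_API_KEY"])  # uncomment to run
```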
What happens behind the scenes: TinyFish runs a real Chromium-based browser session with a native stealth layer that handles fingerprint consistency, proxy rotation, and detection evasion automatically. If a request gets blocked, the system detects the block and auto-reconfigures — switching proxy, adjusting session parameters — without any input from you.
This auto-reconfiguration matters. In the Mind2Web benchmark, Task #197 on kaggle.com initially failed on an anti-bot block. On a subsequent run, TinyFish automatically reconfigured and passed Cloudflare on its own. You can watch the full execution trace — every step is public.
The difference isn't just having anti-bot tooling. It's having a system that detects blocks and adapts in real time without human input.

Honesty about limitations builds more trust than claiming universal coverage.
The following success rates are based on internal production workload analysis across enterprise customer deployments (Q4 2025 – Q1 2026). Ranges reflect variation across different site configurations within each category:
| Site Category | Examples | Success Rate |
|---|---|---|
| Global e-commerce | Amazon, eBay | 95–100% |
| European electronics retail | MediaMarkt (DE/AT/PL/ES/IT) | 91–95% |
| Professional platforms | LinkedIn, GitHub | 90–100% |
| Regional e-commerce | Otto, Alternate, Coolblue | 88–100% |
| Specialty markets | Chrono24, MercadoLibre | 90–100% |
| Video platforms | YouTube, Bilibili | 87–99% |
| Regional real estate | 99.co, EdgeProp (Singapore) | 85–100% |
For independent verification of TinyFish's web automation accuracy, see the Mind2Web benchmark results — 90% across 136 live websites, all 300 execution traces published publicly.
Cloudflare-protected sites, PerimeterX (HUMAN) systems, and most standard anti-bot configurations are handled automatically in stealth mode.
DataDome and hCaptcha. These are among the most aggressive protection systems. TinyFish can get through in some configurations, but success rates are lower and less consistent than with Cloudflare or PerimeterX. If your target site uses DataDome, test with TinyFish's free tier first. If success rates don't meet your threshold, consider pairing TinyFish with a dedicated CAPTCHA solving service like CapSolver or 2Captcha for those specific sites, or evaluate Bright Data's proxy network which has the widest IP pool for DataDome-heavy targets.
Full hard blocks. Some sites implement IP-level blocking that cannot be bypassed by any browser automation tool. If a site has decided to block all automated access, no amount of fingerprint sophistication will help.
CAPTCHAs requiring human solving. TinyFish's stealth layer is designed to prevent CAPTCHAs from being triggered in the first place. When prevention works, CAPTCHAs never appear. When a CAPTCHA does appear on a heavily protected site, the current system has limited ability to solve it automatically. This is the layer TinyFish is investing the most in right now.
Rate-sensitive sites. Sites that track request frequency over time (not just per-session) may flag even legitimate-looking traffic if volume is too high. For these sites, adding pacing to your goal description helps: "Wait 3 seconds between each action".
TinyFish offers two browser profiles: stealth, which runs the full anti-bot stack described above, and lite, which skips anti-bot handling for faster execution.
Default to stealth for any production workflow against external sites. Switch to lite only when you've confirmed the target site has no bot protection.
The scraping industry has two schools of thought on CAPTCHAs: solve them or prevent them. The economics strongly favor prevention.
Solving CAPTCHAs: $1–3 per 1,000 solves via token-based services like 2Captcha. Each solve adds 15–30 seconds of latency. At 10,000 requests/day with a 20% CAPTCHA trigger rate, that's $2–6/day in solving costs plus 8–16 hours of cumulative latency.
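The arithmetic behind those numbers, worked out explicitly:

```python
def solving_overhead(requests_per_day, trigger_rate,
                     cost_per_1k_low, cost_per_1k_high,
                     latency_low_s, latency_high_s):
    """Daily dollar cost and cumulative latency of a solve-based approach."""
    solves = requests_per_day * trigger_rate
    cost = (solves / 1000 * cost_per_1k_low, solves / 1000 * cost_per_1k_high)
    hours = (solves * latency_low_s / 3600, solves * latency_high_s / 3600)
    return solves, cost, hours

# The scenario from the text: 10,000 req/day, 20% trigger rate,
# $1-3 per 1,000 solves, 15-30 s added per solve.
solves, cost, hours = solving_overhead(10_000, 0.20, 1, 3, 15, 30)
print(solves)  # 2000.0 solves/day
print(cost)    # (2.0, 6.0)  -> $2-6/day
print(hours)   # ~(8.3, 16.7) -> roughly 8-16 hours of cumulative latency
```

The dollar cost is modest; the latency is what kills throughput.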
Preventing CAPTCHAs: Make your traffic look human enough that CAPTCHAs never appear. Zero cost per prevented CAPTCHA. Zero latency added. The investment is in the stealth infrastructure, not per-solve fees.
TinyFish's approach is prevention-first. The stealth layer is designed to keep your bot score low enough that challenges are never triggered. When that works — and across most sites in the success rate table above, it does — you get faster execution and lower costs than any solve-based approach.
For sites where prevention isn't enough, dedicated CAPTCHA solving services can work alongside TinyFish. You'd handle the solving logic in your application layer and feed the results back into your workflow.
Here's a side-by-side of what each approach costs for a team running 10,000 requests/month against Cloudflare-protected sites:
| Component | DIY Stack | TinyFish |
|---|---|---|
| Residential proxies | $200–500/mo | Included |
| Browser infrastructure | $50–200/mo (cloud VMs) | Included |
| Stealth browser library | Free (OSS) + maintenance time | Included |
| CAPTCHA solver (fallback) | $10–30/mo | Prevention-first approach |
| LLM for agent reasoning | $50–200/mo | Included |
| Engineering maintenance | 10–20 hrs/mo ongoing | Zero |
| Full control & auditability | ✅ Every component inspectable | Managed — limited visibility into infra |
| Cost ceiling at very high volume | Predictable — no per-step billing | Per-step cost scales linearly |
| Total estimated cost | $310–930/mo + eng time | $150/mo (Pro plan, 16,500 steps) |
Cost estimates based on market rates for residential proxy providers (Bright Data, Oxylabs, IPRoyal), cloud compute (AWS/GCP on-demand), and CAPTCHA solving services (2Captcha, CapSolver) as of Q1 2026. Ranges reflect variation by provider and volume tier.
The Pro plan includes 16,500 steps/month — browser execution, residential proxy, LLM inference, and anti-bot handling all bundled at $0.012/step on overage. No separate line items.
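To see where your own volume lands, the Pro-plan pricing from the text reduces to a one-line formula — flat fee up to the included quota, then per-step overage:

```python
def monthly_cost(steps_used, base_fee=150.0, included_steps=16_500,
                 overage_per_step=0.012):
    """Pro-plan pricing as described above: $150 covers 16,500 steps,
    then $0.012 per additional step."""
    overage = max(0, steps_used - included_steps) * overage_per_step
    return base_fee + overage

print(monthly_cost(10_000))  # 150.0 -- inside the included quota
print(monthly_cost(30_000))  # ~312  -- 13,500 overage steps x $0.012
```

This is also where the DIY comparison flips: past some volume, linear per-step billing overtakes the roughly fixed cost of running your own stack.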
A fair assessment: if your team already has a working DIY stack and the engineering resources to maintain it, the cost math may favor staying on it — especially at very high volume where per-step pricing adds up. The DIY path gives you full control over every component, which matters for compliance, auditing, and edge-case customization.
Where TinyFish wins is for teams that don't have (or don't want to maintain) that stack. A DIY anti-bot system is a living system — detection methods evolve, browser patches need updates, proxy pools need rotation. Someone on your team is maintaining this. TinyFish moves that burden to infrastructure you don't manage.
The best way to evaluate is to test against the site that's actually giving you trouble.
500 free steps. No credit card. Set browser_profile: "stealth", point it at your target, and see what comes back.
TinyFish's stealth mode handles Cloudflare (including Turnstile), PerimeterX (HUMAN), and most standard anti-bot configurations automatically. Success rates range from 85–100% depending on site category and protection aggressiveness. DataDome and hCaptcha are handled in some configurations but with lower consistency. Full hard blocks at the IP level cannot be bypassed by any tool.
No. Residential proxy rotation is included in every TinyFish plan at no extra cost ($0/GB). Add proxy_config: { enabled: true, country_code: "US" } to your request to route through a specific country. Supported countries: US, GB, CA, DE, FR, JP, AU.
The system detects blocks automatically and attempts to reconfigure — switching proxy, adjusting session parameters — without your input. If reconfiguration succeeds, the task continues. If it fails, the run completes with a failure status that you can inspect via the streaming URL, which includes screenshots and execution logs for every step.
Stealth mode is slightly slower than lite mode because of the additional anti-bot handling. Simple extractions in stealth typically take 10–30 seconds. Multi-step workflows take 30–90 seconds depending on complexity. For sites without bot protection, use browser_profile: "lite" for faster execution.
Yes. TinyFish's approach is prevention-first — the stealth layer keeps CAPTCHAs from being triggered. For the minority of cases where a CAPTCHA still appears, you can integrate a third-party solving service (2Captcha, CapSolver) in your application layer and feed the token back into your workflow.
Browserbase provides cloud browsers and relies on JavaScript injection for stealth — you build and maintain the anti-bot logic via Stagehand or your own code. Firecrawl handles basic anti-bot through its rendering engine, but independent testing shows lower success rates on heavily protected sites. TinyFish handles anti-bot at the infrastructure level — native stealth layer, residential proxy rotation, and auto-reconfiguration — all activated with a single parameter. See our TinyFish vs Browserbase and TinyFish vs Firecrawl comparisons for detailed breakdowns.
No credit card. No setup. Run your first operation in under a minute.
