
Your agent fetches a news article for you. HTTP 200, markdown back, looks fine. Then you realize 80% of what your agent brought back and your LLM processed was nav bars, trending headlines, and weather widgets.
That’s the actual state of web fetching for agents right now. Most fetchers return a successful response and still give the agent a bad version of the page. TinyFish Fetch does not. Last week we pulled fifteen real articles from five different publishers and ran each one through TinyFish Fetch and two well-known competitors to see how much junk each one returned.
Don't take our word for it.
Take these URLs and run them through our Fetch Playground and whatever fetcher you already use.
Or just keep scrolling - we'll show you exactly what we found!
Fifteen articles, five publishers, same five-minute window, live retrieval, markdown output. The table below shows the median total text returned per publisher, not the article length.
Bigger numbers here generally mean more irrelevant content.
| Publisher | TinyFish Fetch | Service A | Service B |
|---|---|---|---|
| Daily Mail | 8,736 chars | 37,136 chars | 85,054 chars |
| Hindustan Times | 2,990 chars | 543 (headline only) | 30,571 chars |
| SCMP | 1,863 chars | 1,990 chars | 14,710 chars |
| The Guardian | 3,755 chars | empty (SOURCE_NOT_AVAILABLE) | 8,006 chars |
| New York Times | 2,136 chars (2 of 3 URLs fetched) | empty (SOURCE_NOT_AVAILABLE) | empty (HTTP 403) |
Service B returned roughly 8–10x as many characters as TinyFish Fetch for three of the four publishers where both returned content. Those extra characters were not deeper coverage. They were junk.
Service A was shorter on average, but it returned only a headline and a sign-in widget for Hindustan Times and nothing at all for The Guardian or the New York Times. Nothing relevant. Nothing useful.
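The ratios quoted above can be re-derived from the table. Here is a minimal sketch that uses the median character counts exactly as listed; the dictionary layout and function name are illustrative only, not part of any TinyFish tooling.

```python
# Median characters returned per publisher, copied from the table above.
# Only publishers where both TinyFish Fetch and Service B returned content.
medians = {
    "Daily Mail": {"tinyfish": 8_736, "service_b": 85_054},
    "Hindustan Times": {"tinyfish": 2_990, "service_b": 30_571},
    "SCMP": {"tinyfish": 1_863, "service_b": 14_710},
    "The Guardian": {"tinyfish": 3_755, "service_b": 8_006},
}


def size_ratio(row: dict) -> float:
    """How many times larger Service B's response was than TinyFish Fetch's."""
    return row["service_b"] / row["tinyfish"]


# Ratio per publisher, rounded to one decimal place.
ratios = {name: round(size_ratio(row), 1) for name, row in medians.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}x")
```

Daily Mail, Hindustan Times, and SCMP come out at about 9.7x, 10.2x, and 7.9x. The Guardian is the outlier at about 2.1x.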
Take one of the Daily Mail articles, for instance.
The article body is about 4,300 characters. Here’s what each service fetched and fed to the model:
| Service | Total chars | % of Total that is Article Content | % of Total that is NOT Article Content |
|---|---|---|---|
| TinyFish Fetch | 4,673 | ~92% | ~8% (a small DC Insider newsletter promo line) |
| Service A | 63,400 | ~7% | ~93% (200 lines of unrelated story headlines stacked at the top) |
| Service B | 164,986 | ~3% | ~97% (full site nav, weather widget, 60+ trending links, ad slots, runtime error text) |

Rough math on that Daily Mail article: at ~4 characters per token, a 4,673-character TinyFish Fetch result is ~1,170 input tokens. Service B's 164,986-character version of the same article is ~41,000 tokens.
35× the cost for the same article, plus slower inference, plus irrelevant facts competing for attention in the context window. For one page this is a small waste. At fifty pages, it compounds into real degradation that affects response times, overall accuracy, and bottom lines.
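The arithmetic is easy to reproduce. The sketch below uses the same 4-characters-per-token rule of thumb as the text, which is an approximation and not a real tokenizer. The fifty-page figure additionally assumes every page is as noisy as this one Daily Mail article, which is an illustrative assumption and not something we measured.

```python
CHARS_PER_TOKEN = 4  # rough rule of thumb from the text, not a real tokenizer


def estimate_tokens(char_count: int) -> int:
    """Approximate input tokens for a response of the given length."""
    return char_count // CHARS_PER_TOKEN


# Character counts for the Daily Mail article, from the table above.
tinyfish_tokens = estimate_tokens(4_673)
service_b_tokens = estimate_tokens(164_986)

# How many times more input tokens the noisy version costs.
cost_multiple = service_b_tokens / tinyfish_tokens

# Extra tokens if fifty pages were all this noisy (illustrative assumption).
extra_tokens_50_pages = (service_b_tokens - tinyfish_tokens) * 50

print(tinyfish_tokens, service_b_tokens, round(cost_multiple), extra_tokens_50_pages)
```

Under those assumptions, fifty such pages add about two million input tokens that carry no article content.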
The same pattern persists across other articles.
Hindustan Times: Service B returned 28,470 characters where the article body itself was ~3,000. The rest was the full top-nav rendered as a markdown bullet list (every Indian city page included).
SCMP: Service B returned 14,710 characters for an article whose body is ~1,863 characters. Roughly 87% of the response was section nav, edition pickers, related rails, and footer chrome.
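If you want to score your own fetcher the same way, the noise share is just one minus the article's share of the response. A minimal sketch, assuming you already know the approximate length of the article body; the function name is hypothetical, and the inputs are the approximate figures quoted above.

```python
def noise_share(total_chars: int, article_chars: int) -> float:
    """Fraction of a fetched response that is not article body."""
    if total_chars <= 0:
        raise ValueError("total_chars must be positive")
    return 1 - article_chars / total_chars


# Service B figures quoted above (article body lengths are approximate).
scmp_noise = noise_share(total_chars=14_710, article_chars=1_863)
hindustan_times_noise = noise_share(total_chars=28_470, article_chars=3_000)

print(f"SCMP: {scmp_noise:.0%} noise")
print(f"Hindustan Times: {hindustan_times_noise:.0%} noise")
```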
Fetch is a browser-backed extraction service. The exact heuristics are proprietary, but the shape of the work is straightforward. We do a lot of small, site-specific things so the caller does not have to.
It also benefits from the same proprietary browser infrastructure behind our Browser API. Fetch does not need the full control surface of Browser, but using a browser we control gives us a better place to handle anti-bot systems: browser fingerprints, request behavior, proxy routing, and challenge pages.
Those details matter even when the only thing you want back is clean article text.
None of this is magic. It is just the unglamorous part of making web content usable for agents.
TinyFish Fetch is not perfect. It missed 1 of 3 NYT URLs in this test, and we’re actively working to improve this. (Check back soon!)
Our competitors, however, missed entire publishers.


Use Fetch by itself when your agent already has URLs. Point it at a page and get clean content back.
Use Search first when your agent does not know where to look yet. Search finds candidate sources. Fetch turns those sources into usable evidence.
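In Python, the Search-then-Fetch flow might look roughly like the sketch below. The endpoints, the `X-API-Key` header, and the JSON body are taken from the curl examples in this post. The function names are hypothetical, and the sketch deliberately does not parse responses, because the response format is not described here.

```python
import json
import urllib.parse
import urllib.request

# Endpoints as shown in the curl examples in this post.
SEARCH_URL = "https://api.search.tinyfish.ai"
FETCH_URL = "https://api.fetch.tinyfish.ai"


def build_search_request(query: str, api_key: str) -> urllib.request.Request:
    """Build a GET request that asks Search for candidate sources."""
    query_string = urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        f"{SEARCH_URL}?{query_string}",
        headers={"X-API-Key": api_key},
    )


def build_fetch_request(url: str, api_key: str) -> urllib.request.Request:
    """Build a POST request that asks Fetch for one page as markdown."""
    body = json.dumps({"url": url, "format": "markdown"}).encode("utf-8")
    return urllib.request.Request(
        FETCH_URL,
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send a request and return the raw response text (format not assumed)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")


# Example, requires network access and a real key:
# raw = send(build_fetch_request("https://www.theguardian.com/any-article", "YOUR_KEY"))
```

When the agent already has URLs, only `build_fetch_request` is needed. When it does not, send the Search request first and pass each resulting URL to Fetch.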
Any use case that reads a lot of public web pages, like news monitoring, financial research, brand intelligence, or regulatory tracking, lives and dies on fetch quality. Run one URL through TinyFish Fetch and you'll see it in the output: less noise, sharper answers, fewer tokens, fewer wasted dollars.
Search and Fetch are free. No credits, no credit card.

    # Search
    curl "https://api.search.tinyfish.ai?query=nvidia+earnings+2026" \
      -H "X-API-Key: $TINYFISH_API_KEY"

    # Fetch
    curl -X POST https://api.fetch.tinyfish.ai \
      -H "X-API-Key: $TINYFISH_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{"url": "https://www.theguardian.com/any-article", "format": "markdown"}'

Grab your API Key: agent.tinyfish.ai/api-keys
Or try it out in the Playground first: agent.tinyfish.ai/playground/fetch
No credit card. No setup. Run your first operation in under a minute.