
How to Monitor 1,000 Websites in Parallel with the TinyFish API

TinyFishie·TinyFish Observer·Apr 14, 2026·10 min read

Your monitoring setup catches downtime. It doesn't catch the competitor who quietly dropped their price by 15%, the supplier whose inventory hit zero three days ago, or the regulatory page that updated its requirements last Tuesday.

Uptime monitoring and content monitoring are different problems. The first is solved. The second — monitoring what's on pages at scale, across authenticated portals, with structured output — is still largely manual or brittle.

At 10 sites, a cron job and a headless browser work fine. At 100, the setup starts to creak. At 1,000, you're managing a fleet of browser processes, rotating proxies, handling session state, and debugging anti-bot blocks, all before you've written a single line of business logic.

TinyFish's agent API is built around a different model: describe what to check, run it in parallel, get structured JSON back. Here's how to build a monitoring pipeline that scales to 1,000 sites without managing the infrastructure.

What Traditional Monitoring Misses

Standard monitoring tools — Uptime Robot, Pingdom, Better Uptime — check whether a URL returns a 200. That's useful for availability. It tells you nothing about:

- Price changes across market product pages
- Inventory status on supplier portals (which require login)
- Regulatory updates on government or compliance sites
- Content drift, when a competitor's positioning or messaging changes
- Feature launches tracked through changelog pages

For these use cases, you need a monitor that reads and understands the page, not just pings it.

The TinyFish Monitoring Architecture

TinyFish's Web Agent API accepts a natural language goal and returns structured JSON. For monitoring, the pattern is:

1. Define what to check per site (the goal)
2. Batch sites into parallel groups
3. Collect results and compare against previous state
4. Alert on diffs

With 50 concurrent agents on the Pro plan, you can check 50 sites simultaneously. For 1,000 sites, that's 20 sequential batches. Total runtime depends on per-site task complexity: a batch typically takes 30 seconds to 3 minutes, so a full pass over 1,000 sites runs roughly 10 to 60 minutes.
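
As a rough sanity check on those numbers, the batch count and wall-clock range can be worked out directly. This is a back-of-the-envelope sketch: `estimate_runtime` is a hypothetical helper, and the default timings are simply the 30-second and 3-minute per-batch figures quoted above, not measured values.

```python
import math

def estimate_runtime(num_sites: int, concurrency: int,
                     min_batch_s: float = 30, max_batch_s: float = 180) -> dict:
    """Estimate batch count and total runtime for sequential batches.

    Defaults use the 30 s to 3 min per-batch range quoted in the text.
    """
    batches = math.ceil(num_sites / concurrency)
    return {
        "batches": batches,
        "min_minutes": batches * min_batch_s / 60,
        "max_minutes": batches * max_batch_s / 60,
    }

estimate = estimate_runtime(num_sites=1000, concurrency=50)
print(estimate)  # {'batches': 20, 'min_minutes': 10.0, 'max_minutes': 60.0}
```

Plugging in a lower concurrency limit shows how quickly a full pass stretches out on smaller plans.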

Setting Up Your First Monitoring Job

Authentication

All API calls use an API key from your TinyFish dashboard, passed via the X-API-Key header:

export TINYFISH_API_KEY="your_api_key_here"
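
A small helper can fail fast when the key is missing instead of sending unauthenticated requests. This is only a sketch: `build_headers` is a hypothetical name, and the only details taken from the text above are the `X-API-Key` header and the `TINYFISH_API_KEY` variable.

```python
import os
from typing import Mapping

def build_headers(env: Mapping[str, str] = os.environ) -> dict:
    """Return request headers, raising early if the API key is not set."""
    api_key = env.get("TINYFISH_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("TINYFISH_API_KEY is not set; export it before running.")
    return {"X-API-Key": api_key}

# Example with an explicit mapping, so the snippet runs without a real key.
print(build_headers({"TINYFISH_API_KEY": "demo-key"}))  # {'X-API-Key': 'demo-key'}
```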

Single Site Check

Before batching, test your goal description against one site:

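import os
import requests

def check_site(url: str, goal: str) -> dict:
    # Send one goal to the agent endpoint and return the parsed JSON response.
    response = requests.post(
        "https://agent.tinyfish.ai/v1/automation/run",
        headers={"X-API-Key": os.environ["TINYFISH_API_KEY"]},
        json={"url": url, "goal": goal},
        timeout=120,  # agent runs can take a while; don't hang forever
    )
    response.raise_for_status()  # surface 4xx/5xx instead of parsing an error body
    return response.json()

result = check_site(
    url="https://example-supplier.com/product/widget-pro",
    goal="Find the current price and whether the item is in stock. Return price (string), in_stock (boolean)."
)
print(result)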
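
Since the goal asks for specific fields and types, it can help to check the extracted data before trusting it. The sketch below assumes the extracted fields arrive as a dict (the batching code in this article reads them from a `result` key); `validate_result` and `EXPECTED_FIELDS` are illustrative names, not part of the TinyFish API.

```python
EXPECTED_FIELDS = {"price": str, "in_stock": bool}

def validate_result(result, expected=EXPECTED_FIELDS) -> list:
    """Return a list of problems found in an extracted result (empty if it looks fine)."""
    if not isinstance(result, dict):
        return [f"expected a dict, got {type(result).__name__}"]
    problems = []
    for field, expected_type in expected.items():
        if field not in result:
            problems.append(f"missing field: {field}")
        # None is allowed, since a goal may ask for null when a value is not visible.
        elif result[field] is not None and not isinstance(result[field], expected_type):
            problems.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(result[field]).__name__}"
            )
    return problems

# Hand-written sample payloads, not real API output.
print(validate_result({"price": "$49.00", "in_stock": True}))  # []
print(validate_result({"price": 49, "in_stock": "yes"}))
```

Running this against a handful of real responses before batching is a cheap way to find out whether the goal wording needs tightening.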

Batching 1,000 Sites

With asyncio and aiohttp, you can run up to 50 requests at once, matching the Pro plan's concurrency limit:

import asyncio, aiohttp, json, os
from datetime import datetime

async def check_site_async(session, url: str, goal: str) -> dict:
    # Run one check. Errors are returned, not raised, so one bad site
    # cannot sink the whole batch.
    try:
        async with session.post(
            "https://agent.tinyfish.ai/v1/automation/run",
            headers={"X-API-Key": os.environ["TINYFISH_API_KEY"]},
            json={"url": url, "goal": goal},
            timeout=aiohttp.ClientTimeout(total=120)
        ) as resp:
            resp.raise_for_status()  # treat 4xx/5xx as errors, not empty results
            data = await resp.json()
            return {"url": url, "status": "ok", "result": data.get("result")}
    except Exception as e:
        # repr() keeps the exception type; timeouts have an empty str().
        return {"url": url, "status": "error", "error": repr(e)}

async def monitor_batch(urls: list, goal: str, concurrency: int = 50) -> list:
    # Process the URLs in fixed-size groups; each group runs concurrently.
    results = []
    async with aiohttp.ClientSession() as session:
        for i in range(0, len(urls), concurrency):
            batch = urls[i:i + concurrency]
            tasks = [check_site_async(session, url, goal) for url in batch]
            batch_results = await asyncio.gather(*tasks)
            results.extend(batch_results)
            print(f"Batch {i//concurrency + 1}/{(len(urls)-1)//concurrency + 1} complete")
    return results

# sites.json is expected to hold a JSON array of URL strings.
with open("sites.json") as f:
    sites = json.load(f)

goal = """Check the current price of the main product on this page.
Return: price (string), currency (string), in_stock (boolean).
If price is not visible, return null for price."""

results = asyncio.run(monitor_batch(sites, goal))
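
The fixed-size batches above wait for the slowest site in each group before starting the next one. A common alternative is a semaphore, which keeps up to N requests in flight at all times. The sketch below shows the pattern with a stand-in coroutine (`fake_check`) instead of a real API call, so it runs offline; `run_bounded` is a hypothetical helper, not something the TinyFish API provides.

```python
import asyncio

async def run_bounded(coro_factories, limit: int) -> list:
    """Run zero-argument coroutine factories with at most `limit` in flight.

    Results come back in the same order as the input.
    """
    semaphore = asyncio.Semaphore(limit)

    async def guarded(factory):
        async with semaphore:
            return await factory()

    return await asyncio.gather(*(guarded(f) for f in coro_factories))

# Offline demonstration: track how many stand-in checks overlap.
stats = {"in_flight": 0, "peak": 0}

async def fake_check(url: str) -> dict:
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(0.01)  # stands in for the network round trip
    stats["in_flight"] -= 1
    return {"url": url, "status": "ok"}

urls = [f"https://example.com/page/{i}" for i in range(20)]
results = asyncio.run(
    run_bounded([lambda u=u: fake_check(u) for u in urls], limit=5)
)
print(len(results), "checks finished, peak concurrency:", stats["peak"])
```

With the real client, each factory would be something like `lambda u=u: check_site_async(session, u, goal)`, with `limit` set to your plan's concurrency.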

Handling Results and Diffs

import json
from datetime import datetime, timezone
from pathlib import Path

def save_and_diff(results: list, state_file: str = "monitor_state.json") -> list:
    # Load the state saved by the previous run (empty on the first run).
    state_path = Path(state_file)
    previous = json.loads(state_path.read_text()) if state_path.exists() else {}
    # Only successful checks count as current state.
    current = {r["url"]: r["result"] for r in results if r["status"] == "ok"}

    changes = [
        {"url": url, "before": previous[url], "after": result,
         "detected_at": datetime.now(timezone.utc).isoformat()}
        for url, result in current.items()
        if url in previous and previous[url] != result
    ]

    # Merge rather than overwrite, so a site that errored this run is still
    # compared against its last good result next time.
    state_path.write_text(json.dumps({**previous, **current}, indent=2))
    return changes

changes = save_and_diff(results)
if changes:
    print(f"{len(changes)} sites changed since last run")
    for c in changes:
        print(f"  {c['url']}: {c['before']} → {c['after']}")
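
The comparison above flags a site when any part of its result changes. For alerting, it is often more useful to know which fields moved. Here is a minimal sketch, assuming each result is a flat dict like the ones requested in the goal; `field_changes` is an illustrative helper and the sample values are made up.

```python
def field_changes(before: dict, after: dict) -> dict:
    """Map each changed field to a (before, after) tuple; missing fields count as None."""
    before = before or {}
    after = after or {}
    changed = {}
    for field in sorted(set(before) | set(after)):
        old, new = before.get(field), after.get(field)
        if old != new:
            changed[field] = (old, new)
    return changed

# Made-up sample values for illustration.
before = {"price": "$49.00", "currency": "USD", "in_stock": True}
after = {"price": "$41.65", "currency": "USD", "in_stock": True}
print(field_changes(before, after))  # {'price': ('$49.00', '$41.65')}
```

Calling it on the `before` and `after` values of each entry returned by `save_and_diff` gives a per-field summary to put in an alert.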

Monitoring Authenticated Portals

For sites that require login, include credentials in your goal description. TinyFish agents handle authentication as part of the task:

goal = """Log into the supplier portal using:
- Username: {username}
- Password: {password}

Navigate to the pricing section and find the unit price for SKU-4892.
Return: sku (string), unit_price (number), currency (string).

Do not proceed to any checkout or payment flow.""".format(
    username=os.environ["SUPPLIER_USERNAME"],
    password=os.environ["SUPPLIER_PASSWORD"]
)

The safety instruction ("Do not proceed to any checkout or payment flow") matters for any commercial workflow: it tells the agent to stay out of flows that could accidentally trigger a transaction.
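
Because the credentials end up inside the goal string, it is easy to leak them through logs or error messages. One way to reduce that risk is to mask known secrets before printing or storing a goal. This is a sketch: `redact` is a hypothetical helper and the credentials shown are placeholders.

```python
def redact(text: str, secrets: list, mask: str = "***") -> str:
    """Replace every occurrence of each non-empty secret in `text` with `mask`."""
    # Longest first, so a secret that contains another is masked as a whole.
    for secret in sorted(filter(None, secrets), key=len, reverse=True):
        text = text.replace(secret, mask)
    return text

# Placeholder credentials; in practice these come from environment variables.
username, password = "buyer@example.com", "s3cret-pass"
goal = (
    "Log into the supplier portal using:\n"
    f"- Username: {username}\n"
    f"- Password: {password}"
)

print(redact(goal, [username, password]))
```

Only the logged copy is masked; the goal sent to the API still needs the real values.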

Monitoring at Different Frequencies

Not all sites need the same cadence. Group by volatility:

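# competitor_pricing_urls, supplier_inventory_urls, and regulatory_sites are
# lists of URL strings that you define elsewhere.
monitor_configs = [
    {
        "sites": competitor_pricing_urls,
        "goal": "Find the current price of [product] on this page.",
        "interval_hours": 1      # hourly
    },
    {
        "sites": supplier_inventory_urls,
        "goal": "Check current inventory levels for our key SKUs.",
        "interval_hours": 24     # daily
    },
    {
        "sites": regulatory_sites,
        # Extract the current content; changes are detected by diffing
        # against the previous run, not by asking the agent to remember.
        "goal": "Extract the requirements listed on this page and any 'last updated' date.",
        "interval_hours": 168    # weekly
    }
]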
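
A config list like this still needs something to decide which groups are due on a given run. One simple approach is to trigger a script on a fixed schedule (hourly, say) and let it pick the groups whose interval has elapsed. The sketch below assumes each config carries a `name` key for bookkeeping, which is an addition not present in the configs above; `due_configs` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

def due_configs(configs: list, last_run: dict, now: datetime) -> list:
    """Return the configs whose interval has elapsed since their last run.

    `last_run` maps a config's "name" to the datetime it last ran;
    a config that has never run is always due.
    """
    due = []
    for config in configs:
        previous = last_run.get(config["name"])
        interval = timedelta(hours=config["interval_hours"])
        if previous is None or now - previous >= interval:
            due.append(config)
    return due

# Illustrative configs; "name" is an added key used only for bookkeeping.
configs = [
    {"name": "pricing", "interval_hours": 1},
    {"name": "inventory", "interval_hours": 24},
    {"name": "regulatory", "interval_hours": 168},
]
now = datetime(2026, 4, 14, 12, 0, tzinfo=timezone.utc)
last_run = {
    "pricing": now - timedelta(hours=2),    # overdue
    "inventory": now - timedelta(hours=3),  # not due yet
}                                           # regulatory has never run
print([c["name"] for c in due_configs(configs, last_run, now)])  # ['pricing', 'regulatory']
```

Persisting `last_run` alongside the monitoring state file keeps the schedule intact across restarts.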

Cost Estimation

Cost depends on how many steps each monitoring task takes:

| Task type | Avg steps/check | Cost per check @ $0.015/step | 1,000 sites/day (one check each) |
| --- | --- | --- | --- |
| Simple price check | 3–5 | $0.045–0.075 | ~$45–75/day |
| Authenticated portal | 8–12 | $0.12–0.18 | ~$120–180/day |
| Multi-page navigation | 12–20 | $0.18–0.30 | ~$180–300/day |

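The Pro plan (50 concurrent agents, $150/month, 16,500 steps included) covers about 3,300 simple site checks at 5 steps each, or about 5,500 at 3 steps each, before overage. At 1,000 sites per day, that is roughly three to five days of simple checks. Beyond that, steps bill at $0.012 each, which is below the $0.015 per step used in the table.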

To reduce cost: keep goal descriptions narrow. "Find the price of the main product" uses fewer steps than "Find all pricing including discounts, bulk tiers, and promotions."
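
To see how these figures combine over a month, here is a rough estimator. The plan numbers ($150/month, 16,500 included steps, $0.012 overage) come from the text above; the 30-day month and the 4 steps per check in the example are assumptions, and `estimate_monthly_cost` is a hypothetical helper rather than an official pricing calculator.

```python
def estimate_monthly_cost(checks_per_day: int, steps_per_check: float, days: int = 30,
                          base_fee: float = 150.0, included_steps: int = 16_500,
                          overage_rate: float = 0.012) -> dict:
    """Estimate monthly steps and cost: base fee plus overage beyond included steps."""
    total_steps = checks_per_day * steps_per_check * days
    overage_steps = max(0, total_steps - included_steps)
    return {
        "total_steps": total_steps,
        "overage_steps": overage_steps,
        "total_cost": round(base_fee + overage_steps * overage_rate, 2),
    }

# 1,000 simple checks per day at an assumed 4 steps each.
monthly = estimate_monthly_cost(checks_per_day=1000, steps_per_check=4)
print(monthly)
```

Varying `steps_per_check` makes the effect of narrower goals concrete: at this volume, each step saved per check removes 30,000 steps a month.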

What This Handles That Cron + Playwright Doesn't

Anti-bot detection. TinyFish's stealth layer handles many sites that block headless browsers, so you are not managing puppeteer-extra plugins or proxy rotation per site.

Layout changes. When a competitor redesigns their pricing page and your CSS selectors break, the agent reads the page and finds the price regardless of class names. No code update needed.

Authenticated portals at scale. Running 50 simultaneous authenticated sessions across different supplier portals would require significant session management infrastructure. With TinyFish, it's 50 API calls with different credentials in the goal.

Structured output. Instead of HTML to parse, you get JSON with the fields you requested. No BeautifulSoup, no regex, no post-processing pipeline.

---

TinyFish gives you 500 free steps to test against your actual monitoring targets — no credit card required.


---

FAQ

How many sites can I monitor in parallel?

Up to your plan's concurrent agent limit: 2 on PAYG, 10 on Starter, 50 on Pro. For 1,000 sites on Pro, you run 20 sequential batches of 50. Runtime per batch depends on task complexity: simple checks take 10–30 seconds each, and authenticated multi-step tasks take 1–3 minutes.

How does TinyFish handle sites that block bots?

TinyFish handles anti-detection at the infrastructure level — browser fingerprint and network signatures match real browser execution. Residential proxy routing is included on all plans at no extra cost. Success rates vary by protection level; enterprise anti-bot systems like Kasada may need additional configuration.

Can I monitor sites that require login?

Yes. Include credentials in your goal description; the agent handles the authentication flow. Store credentials as environment variables, not in the goal string directly. Always add a safety instruction ("Do not proceed to checkout") for any commercial workflow.

What happens if a site changes its layout?

Nothing breaks on your end. The agent reads the page and extracts the data based on your goal description, not CSS selectors. Layout changes are invisible to your monitoring pipeline.

How do I send alerts when something changes?

Pipe the changes list from the diff function to your alerting stack — Slack via webhook, PagerDuty, email, or any webhook endpoint. TinyFish returns JSON; routing it to alerts is standard integration work.
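
As one example of that integration work, the sketch below turns a list of change records into a Slack incoming-webhook message, which is a JSON object with a `text` field. The record shape (`url`, `before`, `after`) matches what `save_and_diff` returns; `format_alert` and `send_alert` are hypothetical helper names, and the webhook URL would be your own.

```python
import json
import urllib.request

def format_alert(changes: list) -> dict:
    """Build a Slack incoming-webhook payload (a JSON object with a "text" field)."""
    lines = [f"{len(changes)} monitored site(s) changed:"]
    for change in changes:
        lines.append(f"- {change['url']}: {change['before']} -> {change['after']}")
    return {"text": "\n".join(lines)}

def send_alert(webhook_url: str, changes: list) -> int:
    """POST the payload to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(format_alert(changes)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status

# Made-up change record in the shape produced by save_and_diff.
sample = [{"url": "https://example.com/widget",
           "before": {"price": "$49.00"}, "after": {"price": "$41.65"}}]
print(format_alert(sample)["text"])
```

Only `format_alert` runs here; `send_alert` is defined but not called, so the snippet makes no network requests.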

Is this cheaper than a self-managed Playwright cluster?

At small scale (under 100 sites/day), self-managed Playwright on cheap compute is cheaper. At 100–1,000 sites/day, TinyFish's all-in pricing — browsers, proxies, LLM inference included — often undercuts the real cost of managing the infrastructure, especially accounting for engineering time on proxy rotation, anti-detection, and session management.

