
Aggregation platforms run on coverage. The more hotels, studios, and venues you index, the more useful you are to the end user. The more useful you are, the more bookings you drive. Coverage is the product.
The coverage problem is that most independent hotels and fitness studios don't have APIs. They have booking interfaces — live, dynamic, JavaScript-rendered pages that show real availability in real time. The data is there. It's just not in a format your system can consume without something in between.
That something used to be partnership agreements and IT integrations — a process that takes months per property and doesn't scale past a few hundred partners. The aggregators who removed that integration requirement grew faster than the ones who didn't.
Web agents are the technical layer that eliminates the IT integration bottleneck at scale. This article covers the pattern, the tradeoffs, and where it works — with working code and two real-world cases.
If your platform only needs static directory data (addresses, phone numbers, descriptions), a one-time extraction or a manual data entry workflow may be sufficient. The case for web agents is strongest when you need live availability that changes throughout the day.
This isn't a niche problem. It's the default state of the hospitality and fitness industries.
Large hotel chains — Marriott, Hilton, IHG — have APIs and distribution partnerships. The hotels that don't appear in your index aren't hiding: they're the boutique ryokan in Kyoto that uses a property management system from 2015, the independent fitness studio that takes bookings through Mindbody but hasn't applied to any aggregator program, the climbing gym that manages availability through a spreadsheet embedded in their website.
These properties represent the majority of the total addressable market in most verticals. In Japan alone, thousands of independent hotels operate outside the major OTA distribution networks — not because they've opted out, but because the onboarding process for traditional API integration is a barrier they never cleared.
The consequence is a coverage ceiling that's set by your BD and IT capacity, not by the number of properties that actually exist. Aggregators that can index properties without requiring those properties to do anything have a structural advantage over aggregators that can't.
Web agents remove the IT requirement on the property side. The hotel doesn't need to expose an API. The studio doesn't need to apply to your partner program. The agent navigates their existing booking interface and returns structured availability data — the same data a customer would see if they visited the page directly.

A global technology company indexes hotel availability and pricing across thousands of properties. In Japan, a significant portion of independent hotels operated outside existing API distribution networks — available for booking, but invisible to aggregators that required API access.
Using TinyFish agents to navigate live hotel booking interfaces, the platform expanded coverage in the Japanese market by approximately 4× compared to what was accessible through traditional API integrations. The hotels didn't change anything on their end. No IT engagement required. No partner onboarding process.
The result: hotels that were previously invisible became discoverable — and bookable — through the platform's interface.
This is the structural point: the limiting factor in coverage wasn't the supply of hotels. It was the integration requirement. Remove the integration requirement, and the coverage ceiling rises dramatically.
A global fitness marketplace aggregates fitness class availability across thousands of studios. Studios without API access were previously updated manually — staff checking studio booking pages, updating availability records, and maintaining the data on a delay that could stretch to weeks.
At 2,000 studios with manual updates, the coverage was manageable but the data was stale. The backlog grew faster than staff could clear it.
Using web agents to navigate studio booking interfaces directly, the platform expanded from 2,000 to over 8,000 studios. The manual update backlog was eliminated. Availability data reflects what's actually on the studio's booking page — not what someone updated last week.
The 4× coverage expansion came without a 4× increase in operational overhead. The cost structure changed because the marginal cost of adding one more studio dropped to near zero.
The core pattern is the same across hotel availability, fitness class spots, and any other live booking interface: navigate to the page as a user would, extract the structured data you need, return it in a schema your system can ingest.
pip install tinyfish
export TINYFISH_API_KEY=sk-tinyfish-*****

import asyncio
import json
from datetime import datetime, timezone

from tinyfish import AsyncTinyFish, BrowserProfile

client = AsyncTinyFish()

HOTELS = [
    # URLs below are illustrative; replace with verified booking page URLs before running
    {
        "hotel_id": "ryokan-kyoto-001",
        "name": "Nishiyama Ryokan",
        "booking_url": "https://nishiyama-ryokan.com/en/availability",
        "check_in": "2026-04-15",
        "check_out": "2026-04-17",
    },
    {
        "hotel_id": "hotel-tokyo-042",
        "name": "Shinjuku Granbell Hotel",
        "booking_url": "https://granbellhotel.jp/en/shinjuku/rooms/",
        "check_in": "2026-04-15",
        "check_out": "2026-04-17",
    },
    # scale to thousands of properties
]


async def check_hotel_availability(hotel: dict) -> dict:
    goal = f"""
    Check room availability for check-in {hotel['check_in']} and check-out {hotel['check_out']}.
    Navigate to the availability or booking section.
    If a date selector is shown, enter the check-in and check-out dates provided.
    Wait for availability results to load.
    For each available room type, extract:
    {{
        "room_type": "string",
        "price_per_night": number or null,
        "currency": "string",
        "rooms_available": number or null,
        "cancellation_policy": "free / non-refundable / partial, as shown"
    }}
    Return JSON:
    {{
        "hotel_id": "{hotel['hotel_id']}",
        "check_in": "{hotel['check_in']}",
        "check_out": "{hotel['check_out']}",
        "available_rooms": [ ...array of room objects above... ],
        "fully_booked": true or false
    }}
    If no availability is shown, set fully_booked to true and available_rooms to [].
    Do not proceed to checkout or enter any payment details.
    """
    response = await client.agent.run(
        url=hotel["booking_url"],
        goal=goal,
        browser_profile=BrowserProfile.STEALTH,
    )
    # For debugging: response.streaming_url contains a live browser replay (valid 24h)
    # response.result is shaped by the goal prompt above, or None on infrastructure failure
    result = response.result or {}
    if result.get("status") == "failure":
        return {
            "hotel_id": hotel["hotel_id"],
            "error": result.get("reason", "goal_failed"),
            "checked_at": datetime.now(timezone.utc).isoformat(),
        }
    return {
        "hotel_id": hotel["hotel_id"],
        "check_in": hotel["check_in"],
        "check_out": hotel["check_out"],
        "available_rooms": result.get("available_rooms", []),
        "fully_booked": result.get("fully_booked", False),
        "checked_at": datetime.now(timezone.utc).isoformat(),
    }


async def main():
    tasks = [check_hotel_availability(h) for h in HOTELS]
    results = await asyncio.gather(*tasks)
    print(json.dumps(results, indent=2))


asyncio.run(main())

Note on result handling: status: "COMPLETED" means the browser ran, not that availability was found. A hotel showing a "fully booked" page returns COMPLETED with fully_booked: true. A hotel whose booking interface failed to load returns COMPLETED with a goal failure in result. Always check result, not just the run status.
{
  "hotel_id": "ryokan-kyoto-001",
  "check_in": "2026-04-15",
  "check_out": "2026-04-17",
  "available_rooms": [
    {
      "room_type": "Standard Japanese Room",
      "price_per_night": 18500,
      "currency": "JPY",
      "rooms_available": 3,
      "cancellation_policy": "free"
    },
    {
      "room_type": "Deluxe Room with Garden View",
      "price_per_night": 26000,
      "currency": "JPY",
      "rooms_available": 1,
      "cancellation_policy": "non-refundable"
    }
  ],
  "fully_booked": false,
  "checked_at": "2026-03-27T10:00:02Z"
}

The same pattern applies to fitness class monitoring. The schema is denser: classes have specific times, instructors, and spot counts that change throughout the day.
async def check_studio_availability(studio_id: str, url: str, target_date: str) -> dict:
    goal = f"""
    Find all available fitness classes on {target_date}.
    Navigate to the schedule or class booking section.
    Select or filter for {target_date} if a date picker is shown.
    Wait for the class schedule to load.
    For each class listed, extract:
    {{
        "class_name": "string",
        "instructor": "string or null",
        "start_time": "HH:MM in 24h format",
        "duration_minutes": number or null,
        "spots_available": number or null,
        "spots_total": number or null,
        "booking_url": "direct URL to book this class or null",
        "waitlist_available": true or false
    }}
    Return JSON:
    {{
        "studio_id": "{studio_id}",
        "date": "{target_date}",
        "classes": [ ...array of class objects above... ]
    }}
    If no classes are scheduled, return an empty classes array.
    Do not book or reserve any classes.
    """
    response = await client.agent.run(
        url=url,
        goal=goal,
        browser_profile=BrowserProfile.STEALTH,
    )
    result = response.result or {}
    if result.get("status") == "failure":
        return {
            "studio_id": studio_id,
            "date": target_date,
            "classes": [],
            "error": result.get("reason", "goal_failed"),
            "checked_at": datetime.now(timezone.utc).isoformat(),
        }
    return {
        "studio_id": studio_id,
        "date": target_date,
        "classes": result.get("classes", []),
        "checked_at": datetime.now(timezone.utc).isoformat(),
    }

{
  "studio_id": "studio-sf-0291",
  "date": "2026-04-15",
  "classes": [
    {
      "class_name": "Vinyasa Flow",
      "instructor": "Sarah Chen",
      "start_time": "07:00",
      "duration_minutes": 60,
      "spots_available": 4,
      "spots_total": 20,
      "booking_url": "https://studiobooking.com/class/84721",
      "waitlist_available": false
    },
    {
      "class_name": "HIIT Power",
      "instructor": "Marcus Rivera",
      "start_time": "09:30",
      "duration_minutes": 45,
      "spots_available": 0,
      "spots_total": 15,
      "booking_url": "https://studiobooking.com/class/84722",
      "waitlist_available": true
    }
  ],
  "checked_at": "2026-03-27T10:00:04Z"
}

Availability data has a shelf life that varies by property type and booking velocity.
A boutique hotel in a low-demand period might have room availability that's stable for days. A popular yoga studio on a Saturday morning might sell out its 9am class in 20 minutes. The right refresh frequency depends on which scenario you're in.
| Property type | Booking velocity | Recommended refresh |
|---|---|---|
| Independent hotels, off-peak | Low | 4–12 hours |
| Independent hotels, peak season | Medium | 1–4 hours |
| Fitness studios, popular classes | High | 15–60 minutes |
| Fitness studios, off-peak slots | Low | 2–4 hours |
| Event venues with limited capacity | Very high | Real-time or near-real-time |
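The table above can be turned into a simple due-check that a scheduler runs on each tick. The following is a minimal sketch, not a prescribed design: the segment keys, `REFRESH_INTERVALS`, `is_due`, and `due_properties` are hypothetical names, and each interval is simply the lower bound of the matching range in the table. Event venues are left out because a fixed polling interval does not fit the real-time case.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical segment keys; intervals use the lower bound of each range in the table.
REFRESH_INTERVALS = {
    "hotel_off_peak": timedelta(hours=4),
    "hotel_peak": timedelta(hours=1),
    "studio_popular": timedelta(minutes=15),
    "studio_off_peak": timedelta(hours=2),
}


def is_due(segment: str, last_checked: datetime, now: datetime) -> bool:
    """Return True when the data is at least as old as the segment's interval."""
    return now - last_checked >= REFRESH_INTERVALS[segment]


def due_properties(properties: list[dict], now: datetime) -> list[dict]:
    """Filter to the properties whose availability should be re-checked now."""
    return [p for p in properties if is_due(p["segment"], p["last_checked"], now)]
```

Each property record is assumed to carry a `segment` key and a timezone-aware `last_checked` timestamp; only the properties returned by `due_properties` would be sent to an agent run on that tick.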
Search and Fetch are free on all plans, rate-limited by plan tier (Free: 5 searches/min, 25 fetches/min). Failed fetches are never charged. The cost math below covers agent steps only (PAYG at $0.015/step). One studio availability check is 1 agent run (~6 steps), so a sweep of 1,000 studios costs 1,000 × 6 × $0.015 = $90. At hourly intervals that is 24 sweeps, or $2,160/day. At 4-hour intervals (recommended for most studios) it is 6 sweeps, or $540/day. For a platform driving bookings at scale, the data freshness premium is usually worth it for high-velocity inventory and not worth it for low-velocity inventory. Segment your property list accordingly.
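The cost arithmetic above is easy to parameterize for your own property counts and intervals. A small sketch using the figures quoted in this article ($0.015 per step, about 6 steps per check); the function names are illustrative, and real step counts will vary by page.

```python
from decimal import Decimal

PRICE_PER_STEP = Decimal("0.015")  # PAYG rate quoted above
STEPS_PER_CHECK = 6                # approximate; actual step counts vary per page


def cost_per_sweep(properties: int) -> Decimal:
    """Cost of checking every property once."""
    return properties * STEPS_PER_CHECK * PRICE_PER_STEP


def daily_cost(properties: int, interval_hours: int) -> Decimal:
    """Cost per day when every property is re-checked every `interval_hours`."""
    sweeps_per_day = Decimal(24) / Decimal(interval_hours)
    return cost_per_sweep(properties) * sweeps_per_day
```

With 1,000 studios this reproduces the numbers in the text: $90 per sweep, $2,160 per day hourly, and $540 per day at 4-hour intervals. `Decimal` is used so the currency math is exact.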
What stale data actually costs: a user searches for a class, sees 4 spots available, clicks through to book, and finds it full. That's a failed conversion and a trust signal that the platform's data isn't reliable. In fitness and travel, a few bad experiences materially affect retention. Freshness isn't just a technical metric.
Works well:
Works with additional configuration:
Doesn't work:
The booking confirmation limitation is worth emphasizing. Reading availability from a page is fundamentally different from submitting a booking on a user's behalf. Availability extraction is a read operation with no consequences if something goes wrong. Booking initiation is a write operation with financial and legal consequences. Use different reliability standards for each.
The free tier gives you 500 steps. At roughly 6 steps per availability check, that covers about 80 hotel or studio pages, enough to see what structured output looks like against your real target properties.
For platforms indexing thousands of properties or requiring SLA guarantees on data freshness, contact our enterprise team.
Can agents handle booking interfaces that require login?
Yes, for accounts you are authorized to access. Include login steps in the goal prompt using your organization's authorized credentials — the agent navigates the authentication flow the same way a user would. For platforms with complex enterprise SSO, the agent handles standard TOTP and OAuth flows natively; hardware token flows require a human-in-the-loop handoff for the auth step.
How does this compare to working directly with the booking platforms (Mindbody, FareHarbor)?
Where those platforms have APIs or data partnerships, use them — they're more reliable and purpose-built for aggregation. Agents are the right tool for properties that aren't on those platforms, or where the platform doesn't offer the data feed you need.
What happens when a studio's booking interface changes?
Goal-based navigation is more resilient than selector-based extraction. When the interface changes, the goal — "find available classes on this date" — usually remains valid. Minor goal prompt tuning may be needed after major redesigns, but it's substantially less maintenance than fixing CSS selectors.
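One way to notice when a redesign has degraded extraction is to check each result against the shape the goal prompt asks for. The sketch below is an assumption about how you might do that, not a feature of the SDK: `REQUIRED_CLASS_FIELDS`, `missing_fields`, and `looks_degraded` are hypothetical names, and the required fields are a subset of the class schema shown earlier.

```python
# Fields every class object is expected to carry (a subset of the schema above).
REQUIRED_CLASS_FIELDS = {"class_name", "start_time", "spots_available"}


def missing_fields(result: dict) -> list[str]:
    """List required fields that are absent from any class object in the result."""
    missing = set()
    for cls in result.get("classes", []):
        missing |= REQUIRED_CLASS_FIELDS - set(cls)
    return sorted(missing)


def looks_degraded(result: dict) -> bool:
    """Flag results with no class list at all, or with classes lacking required keys."""
    return "classes" not in result or bool(missing_fields(result))
```

An empty `classes` array is treated as valid, since a studio may simply have nothing scheduled. A studio that starts tripping `looks_degraded` after weeks of clean results is a reasonable candidate for goal prompt tuning.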
Can this handle availability across multiple date ranges in one run?
Yes — include the date range logic in the goal prompt, or run separate concurrent agents per date. The concurrent approach is often faster and produces cleaner output per date.
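The concurrent approach can be sketched as a per-date fan-out with a cap on how many runs are in flight at once. This is a minimal sketch under assumptions: `check_dates` and `max_concurrent` are illustrative names, the default limit of 5 is arbitrary, and `check` stands for any coroutine with the same `(studio_id, url, target_date)` signature as `check_studio_availability` above.

```python
import asyncio
from typing import Awaitable, Callable

# Any coroutine taking (studio_id, url, target_date) and returning a dict.
CheckFn = Callable[[str, str, str], Awaitable[dict]]


async def check_dates(
    check: CheckFn,
    studio_id: str,
    url: str,
    dates: list[str],
    max_concurrent: int = 5,
) -> list[dict]:
    """Run one check per date, with at most `max_concurrent` in flight at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(date: str) -> dict:
        async with semaphore:
            return await check(studio_id, url, date)

    # gather preserves input order, so results line up with `dates`
    return await asyncio.gather(*(bounded(d) for d in dates))
```

Inside an async function, usage would look like `await check_dates(check_studio_availability, "studio-sf-0291", url, ["2026-04-15", "2026-04-16"])`. The same bounded pattern also applies when fanning out across thousands of properties rather than dates.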
What does `COMPLETED` status mean?
Infrastructure success — the browser ran. Not goal success. A hotel showing "no availability" returns COMPLETED with fully_booked: true. A page that failed to load returns COMPLETED with a goal failure in result. Always check result, not just the run status.
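The distinction above can be made explicit with a small classifier that separates run status from goal outcome. This is a sketch under assumptions: `classify_run` is a hypothetical helper, it takes the run status and result as plain values (how you read the status off the response object is left to the SDK documentation), and it relies only on the result conventions used in the code earlier in this article.

```python
from typing import Optional


def classify_run(run_status: str, result: Optional[dict]) -> str:
    """Map a run status plus result payload to one of three outcomes."""
    if run_status != "COMPLETED" or result is None:
        return "infrastructure_failure"  # the browser run itself did not finish
    if result.get("status") == "failure":
        return "goal_failure"            # the run finished but the goal was not met
    return "ok"                          # includes legitimate "fully booked" results
```

Keeping the three outcomes separate lets you retry infrastructure failures, review goal failures for prompt tuning, and store everything else as real availability data.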
No credit card. No setup. Run your first operation in under a minute.