TinyFish x VideoDB: We made the World Cup searchable

The 2026 World Cup is on. Right now, somewhere, a fan is asking the internet:
Show me every yellow card from Jordan vs Argentina.
And right now, the internet is failing them.
Google returns a match report. YouTube returns a 12-minute highlight reel where the cards are somewhere in the middle. ChatGPT returns a summary. None of them return the thing the person actually asked for: the moment, on a play button.
This is the next RAG problem. Almost nobody is treating it like one.
So we built the smallest, most fun version of the fix we could think of, with TinyFish doing the open-web work and VideoDB doing the video work, and pointed it at the World Cup.
Try the live app for free.
And check out the open-source code.
RAG grew up in a text-shaped world
The first wave of retrieval was built for documents. Chunks, embeddings, top-k, rerank, done. It worked because the answers lived in paragraphs: in docs, PDFs, wikis, support pages, and changelogs.
That entire stack assumes the world is text.
The world is not text. The world is increasingly video.
- A 45-minute product demo where pricing comes up at minute 31.
- A 2-hour lecture where the worked example starts at 1:14:08.
- A football match where the referee pulls a card at 67:42.
- A sales call where the objection lands at 18:20.
- A livestream where the incident happens right now.
A "good" RAG response over any of these is a link. But the best response is a clip.
The industry is still shipping links.
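To make "a clip, not a link" concrete, here is a minimal sketch of what a clip-shaped answer might carry: a source plus a time window. The `Moment` type, `to_seconds`, and `moment_around` are hypothetical names for illustration, not part of TinyFish or VideoDB, and the sketch treats each timestamp as an offset into the video, which a real match clock would not be once halftime is included.

```python
from dataclasses import dataclass


def to_seconds(stamp: str) -> int:
    """Convert 'mm:ss' or 'h:mm:ss' into seconds. Minutes may exceed 59."""
    seconds = 0
    for part in stamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


@dataclass(frozen=True)
class Moment:
    """Hypothetical shape of a clip-style answer: a source plus a time window."""
    source: str
    start: int  # seconds from the start of the video
    end: int

    @property
    def duration(self) -> int:
        return self.end - self.start


def moment_around(source: str, stamp: str, before: int = 5, after: int = 10) -> Moment:
    """Build a window around a timestamp, clamped so it never starts below zero."""
    t = to_seconds(stamp)
    return Moment(source=source, start=max(0, t - before), end=t + after)
```

A link says "the answer is somewhere in this file." A `Moment` says where to press play and when to stop.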
What we built
A World Cup briefing app. You type a request in plain English:
Show me the yellow cards from Brazil vs Morocco.
Thirty seconds later, you get a playable reel of exactly those moments. Not the match or the highlights. The cards.
Type something else:
Every Vinícius dribble that ended in a foul.
Every save Bono made in the second half.
The goal and the three passes before it.
Same flow. You ask for the moment. You get the moment.
That's the whole product. The interesting part is what has to be true for it to work.
Two layers that need each other
A real "ask for the moment" workflow doesn't start with a clean video file. It starts with a sentence, from a user who doesn't even know which video to look at yet.
So the app needs two layers working together:
- A web layer. Figure out which video, with enough public context to back the choice.
- A media layer. Go inside that video, find the requested moments, and return them as something playable.
TinyFish is the web layer. VideoDB is the media layer. The fun part is what happens when you wire them up.
The web layer: TinyFish Search + Fetch
When the user asks for "yellow cards from Brazil vs Morocco," the app doesn't know:
- which Brazil vs Morocco match (group stage? knockout? a friendly from two years ago?)
- which broadcast or highlight upload is public and watchable
- which player names, timestamps, and context already exist in match reports
Before we touch a single frame of video, TinyFish Search pulls candidate sources: match pages, recaps, highlight uploads, and post-match reports. TinyFish Fetch reads the relevant ones and gives the agent grounded context: date, lineup, scoreline, who was booked, and when.
from tinyfish import TinyFish

client = TinyFish()

# 1. Find candidate sources for the match.
search_response = client.search.query(
    query="Brazil vs Morocco 2026 World Cup yellow cards match report",
)
urls = [r.url for r in search_response.results]

# 2. Pull clean, rendered page content for the top hits.
fetch_response = client.fetch.get_contents(urls=urls[:5])
context = [(p.title, p.text) for p in fetch_response.results]

Free to use, one key for both APIs. Grab one at agent.tinyfish.ai/api-keys.
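Once the pages are fetched, something has to decide which source is the right match. A rough sketch of one way to do that is a plain term-overlap score over the fetched title and text. The `Page` type, `score`, and `pick_source` are hypothetical helpers for illustration, not TinyFish APIs, and a real app would likely let the agent reason over the context rather than count keywords.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Page:
    """Hypothetical stand-in for one fetched result (url, title, text)."""
    url: str
    title: str
    text: str


def score(page: Page, terms: List[str]) -> int:
    """Count how many query terms appear in the title or body (case-insensitive)."""
    haystack = f"{page.title} {page.text}".lower()
    return sum(1 for term in terms if term.lower() in haystack)


def pick_source(pages: List[Page], terms: List[str]) -> Optional[Page]:
    """Return the page matching the most terms, or None if nothing matches."""
    best = max(pages, key=lambda p: score(p, terms), default=None)
    if best is None or score(best, terms) == 0:
        return None
    return best
```

With terms like `["Brazil", "Morocco", "2026", "yellow card"]`, a World Cup recap that mentions bookings outranks a friendly from two years ago that only shares the team names.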
The media layer: VideoDB
Once we have a video URL, VideoDB takes over. It ingests the video and indexes the scenes inside it (visual events plus spoken word), so the match itself becomes searchable.
Then the part that makes this feel different from "search returns a link": VideoDB doesn't just hand back timestamps. It compiles every matching shot into a single, freshly stitched video you can play.
import videodb
from videodb import SearchType, IndexType

conn = videodb.connect(api_key="YOUR_VIDEODB_KEY")

# match_video_url is the source video chosen by the web layer.
video = conn.upload(url=match_video_url)

# Index visual scenes so the match becomes searchable.
video.index_scenes()

# Ask for the moment.
results = video.search(
    query="referee shows yellow card to player",
    search_type=SearchType.scene,
    index_type=IndexType.scene,
)

# Get the moment, as a single playable compilation.
results.play()

The output isn't a list of timestamps you have to stitch together yourself, and it isn't the original 90-minute match. It's a brand new curated reel of only the yellow-card moments, in order, streamable over HLS.
That's what makes VideoDB feel like a database for video instead of a wrapper around a file. You don't query it and get back "here is a video, good luck." You query it and get back the answer, already edited.
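For a feel of what "already edited" implies, here is a small sketch of the bookkeeping a stitched reel needs: put the matching shots in order and merge any that overlap, so the same card isn't played twice. This is an illustration in plain Python, not VideoDB's implementation; `merge_shots`, `reel_length`, and the `gap` parameter are hypothetical.

```python
from typing import List, Tuple

Shot = Tuple[float, float]  # (start_seconds, end_seconds)


def merge_shots(shots: List[Shot], gap: float = 1.0) -> List[Shot]:
    """Sort shots and merge any that overlap or sit within `gap` seconds of each other."""
    merged: List[Shot] = []
    for start, end in sorted(shots):
        if merged and start <= merged[-1][1] + gap:
            # Extend the previous shot instead of starting a new one.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def reel_length(shots: List[Shot]) -> float:
    """Total running time of the merged reel, in seconds."""
    return sum(end - start for start, end in merge_shots(shots))
```

Three raw hits, two of them overlapping around the same booking, collapse into a two-shot reel in chronological order.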
The full flow
User: "yellow cards from Brazil vs Morocco"
│
▼
TinyFish Search ──► candidate match / highlight URLs
│
▼
TinyFish Fetch ──► grounded context (date, players, scoreline)
│
▼
App picks the right source video
│
▼
VideoDB ingests + indexes scenes
│
▼
VideoDB.search("yellow card") ──► matching shots
│
▼
results.play() ──► playable reel
Five moving parts. One natural-language request. One play button.
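The flow above can be sketched as one function that takes each stage as a plain callable. This is an assumption about how such an app could be wired, not the code in the repo: `build_reel` and its parameter names are hypothetical, and in practice the `search` and `fetch` callables would wrap TinyFish while `find_moments` would wrap VideoDB.

```python
from typing import Callable, List, Optional, Tuple

Shot = Tuple[float, float]  # (start_seconds, end_seconds)


def build_reel(
    request: str,
    search: Callable[[str], List[str]],
    fetch: Callable[[List[str]], List[dict]],
    pick: Callable[[List[dict]], Optional[str]],
    find_moments: Callable[[str, str], List[Shot]],
) -> List[Shot]:
    """Run web layer then media layer; return matching shots in playback order."""
    urls = search(request)            # web layer: candidate URLs
    if not urls:
        return []
    pages = fetch(urls[:5])           # web layer: grounded context for the top hits
    video_url = pick(pages)           # app: choose the source video
    if video_url is None:
        return []
    return sorted(find_moments(video_url, request))  # media layer: the moments
```

Passing the stages in as arguments keeps the two layers swappable, and it means the whole flow can be exercised with stubs before any API key is involved.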
The opinion
We'll say the part most posts in this space tiptoe around.
Agents don't need more chatbots. They need better senses.
A retrieval system that can only return text can only answer the small slice of human knowledge that somebody bothered to write down. Everything else (demos, lectures, matches, calls, streams) is locked inside video, and today's agents are functionally blind to it.
The next interesting agent products won't summarize a video for you. They'll jump you to the part you cared about, and then keep going.
- A support agent that opens your install tutorial at the exact step you're stuck on.
- A sales agent that pulls the clip where the prospect raised the objection, not a transcript paragraph about it.
- A learning agent that drops you into the worked example, not the chapter intro.
- A product research agent that inspects the launch demo, not the launch tweet.
- A monitoring agent that returns the 4 seconds in which the incident happened, not the hour of footage around it.
All of those collapse into the same primitive: ask for a moment, get the moment.
That primitive needs two things working together: a web layer that knows where to look, and a media layer that knows what's inside and can turn the matching moments into a clip. That's why we built this with VideoDB. Their video stack is what makes the answer in "ask for the moment, get the moment" actually feel like an answer.
The World Cup is just the most fun place to prove it.
Try it
The repo is small on purpose. Clone it, plug in your TinyFish and VideoDB keys, point it at a match, and ask for whatever you want. Then go build the version of this for whatever video your users actually care about.
Grab your free TinyFish API keys here: agent.tinyfish.ai/api-keys
The matches are still going. Ask for the moment.



