Building BookDeal with TinyFish: From Twelve Tabs to One Command

TL;DR
BookDeal is an open-source Python CLI that uses TinyFish Search and Fetch to compare live book listings across trusted retailers.
It filters out noisy or suspicious results, ranks listings by price, shipping, condition, format, and merchant trust, and returns the best option with backups.
In a benchmark of 100 titles, BookDeal returned at least one valid purchase option for 91 titles and identified 19 observed savings opportunities, averaging $8.11 per book when a cheaper option was found.
“The agent was good at exploring, but the rules were better at deciding.”
The problem
Buying a book online should take thirty seconds.
Somehow, it still turns into fifteen minutes and twelve open tabs.
The problem is not a lack of listings. The problem is noise.
Search results mix together:
- new and used books
- print and ebook formats
- rental copies
- summaries
- study guides
- audiobooks
- generic category pages
- listings with deceptively low sticker prices that hide expensive shipping
A raw lowest price is often a trap.
The goal is not just to find the lowest visible number. The goal is to find the cheapest reasonable option from a source a buyer can actually trust.
I built BookDeal to turn that messy comparison process into a single command.
What I built
BookDeal is a Python CLI that searches live book-marketplace listings, fetches promising product pages, extracts the details that matter, filters suspicious or irrelevant results, and ranks the best trustworthy deal.
A user runs a command such as:
./bookdeal "The Hobbit" --author "J.R.R. Tolkien" --year 1937 --stats
BookDeal then handles the workflow in five stages:
- Search trusted retailer domains for relevant listings.
- Fetch the most promising product pages.
- Extract pricing, shipping, condition, format, and listing signals.
- Filter out suspicious, irrelevant, or low-quality results.
- Rank the remaining candidates and return the best deal with backup options.

I chose a CLI because the terminal is an efficient experimentation environment. It made it easy to inspect raw results, test search strategies, tweak ranking heuristics, and iterate without building a user interface first.
BookDeal also supports structured JSON output, which makes it easier to plug into larger automation workflows later.

Where TinyFish fit
Before writing ranking logic, I had to solve a retrieval problem:
How do you reliably discover and fetch live marketplace listings without building and maintaining a custom scraper for every retailer?
TinyFish solved two problems cleanly:
- Search relevance across retailer domains
- Structured page retrieval without custom scrapers
TinyFish’s job is to:
- search across trusted retailer domains
- discover candidate listings
- fetch the most promising product pages
- return page content that BookDeal can inspect
BookDeal’s job is to:
- extract pricing, shipping, condition, format, and merchant signals
- filter noisy or suspicious results
- rank the remaining listings
- return the cheapest reasonable option with backups
That separation let me focus on the part that made BookDeal useful: filtering and ranking.
Instead of maintaining separate scraping pipelines for Barnes & Noble, AbeBooks, ThriftBooks, Bookshop, Amazon, Books-A-Million, and other marketplaces, I could constrain searches to trusted sources and use TinyFish Fetch to inspect actual listing pages.
The distinction matters because search snippets are useful for discovery, but they are not reliable enough for final ranking.
Shipping costs may be missing. Condition details may be inconsistent. Snippets may omit the signals that determine whether a listing is actually worth recommending.
What worked
1. Ranking the cheapest reasonable deal, not the lowest sticker price
BookDeal scores each candidate across several dimensions:
- item price
- detected shipping cost
- book condition
- format, including print and ebook
- merchant trust
- suspicious terms and surrounding page context
Ebook listings do not receive a missing-shipping penalty.
Print listings with unknown shipping are penalized because the final cost may be higher than the visible sticker price.
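The shipping rule can be sketched in a few lines. This is a minimal illustration, not BookDeal's actual scoring code: the `Listing` fields, the `effective_cost` helper, and the 5.00 penalty value are all assumptions made for the example, and it covers only price and shipping, not condition or merchant trust.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical penalty added when a print listing does not reveal shipping.
UNKNOWN_SHIPPING_PENALTY = 5.00


@dataclass
class Listing:
    merchant: str
    price: float
    shipping: Optional[float]  # None means no shipping cost was detected
    book_format: str           # "print" or "ebook"


def effective_cost(listing: Listing) -> float:
    """Estimate what the buyer pays, penalizing unknown print shipping."""
    if listing.book_format == "ebook":
        return listing.price  # nothing ships, so no penalty applies
    if listing.shipping is None:
        return listing.price + UNKNOWN_SHIPPING_PENALTY
    return listing.price + listing.shipping


def rank(listings):
    """Order listings from lowest to highest effective cost."""
    return sorted(listings, key=effective_cost)
```

With a rule like this, a 6.00 print listing with unknown shipping ranks below an 8.00 print listing with 1.50 shipping, which is the behavior the sticker price alone would hide.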
The app also excludes listings that appear to be:
- audiobooks
- summaries
- study guides
- PDFs
- rentals
- generic search pages
- category pages
The final recommendation should point to a specific book listing that a buyer could reasonably trust.
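As a rough sketch of how exclusions like these could be expressed, the snippet below checks a listing title against a term list and a URL against a few generic path prefixes. The term list, the path prefixes, and both function names are hypothetical and chosen for illustration; real retailer URLs vary and would need their own rules.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Hypothetical terms that mark a listing as something other than the book.
EXCLUDED_TERMS = ("audiobook", "summary", "study guide", "pdf", "rental")

# Hypothetical path prefixes for search and category pages.
GENERIC_PATH_PREFIXES = ("/search", "/category", "/browse")

_TERM_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(t) for t in EXCLUDED_TERMS) + r")\b",
    re.IGNORECASE,
)


def rejection_reason(title: str) -> Optional[str]:
    """Return why a title is excluded, or None if no term matches."""
    match = _TERM_PATTERN.search(title)
    return f"excluded term: {match.group(1).lower()}" if match else None


def looks_generic(url: str) -> bool:
    """True when the URL path starts with a search or category prefix."""
    path = urlparse(url).path.lower()
    return path.startswith(GENERIC_PATH_PREFIXES)
```

Returning a reason string rather than a bare boolean keeps the rejection explainable later on.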
2. Separating search, fetch, extraction, filtering, and ranking
The most important architectural decision was to keep the pipeline modular.
Each stage has a clear responsibility.
That made it possible to improve ranking heuristics without rewriting retrieval logic every time something changed.
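One way to picture that separation is a pipeline that takes each stage as a plain callable. This is an assumed shape, not the actual BookDeal code: `run_pipeline` and its parameter names are hypothetical, and the `search` and `fetch` callables stand in for whatever wraps TinyFish Search and Fetch.

```python
def run_pipeline(query, search, fetch, extract, keep, score, top_n=3):
    """Run the five stages in order; each stage is an injected callable."""
    urls = search(query)                              # 1. discover candidate URLs
    pages = [(url, fetch(url)) for url in urls]       # 2. retrieve page content
    listings = [extract(url, page) for url, page in pages]  # 3. extract fields
    valid = [item for item in listings if keep(item)]       # 4. filter
    return sorted(valid, key=score)[:top_n]           # 5. rank, lowest score first
```

Because the stages are passed in, a ranking change only touches `score`, and the whole flow can be exercised with stub functions instead of live requests.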
3. Fetching full pages instead of trusting snippets
My first implementation leaned too heavily on search snippets.
It quickly became obvious that snippets were not enough.
Shipping costs appeared only on the listing page. Condition details were inconsistent. Important metadata was often missing.
Adding TinyFish Fetch into the pipeline made the rankings much more reliable because BookDeal could inspect the actual listing content before making a recommendation.
4. Restricting searches to trusted domains
Without domain filtering, searches sometimes wandered toward irrelevant pages.
In one memorable case, a query surfaced Amazon Music, likely because the search interpreted the title as audiobook-adjacent.
The fix was simple but important: start with an allowlist of trusted book retailers and validate URLs before fetching.
That kept noisy results from breaking downstream extraction and ranking.
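A minimal sketch of that validation step, using only the standard library, is shown below. The domain names in `TRUSTED_DOMAINS` are assumed from the retailers named earlier, and `is_trusted` is a hypothetical helper. It matches hosts exactly (after dropping a leading `www.`), so an unrelated subdomain such as a music storefront is rejected even when its parent domain is on the list.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; domains assumed from the retailers mentioned above.
TRUSTED_DOMAINS = {
    "barnesandnoble.com",
    "abebooks.com",
    "thriftbooks.com",
    "bookshop.org",
    "amazon.com",
    "booksamillion.com",
}


def is_trusted(url: str) -> bool:
    """Accept only http(s) URLs whose host is exactly an allowlisted domain."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = parsed.hostname or ""  # hostname is already lowercased
    if host.startswith("www."):
        host = host[4:]
    return host in TRUSTED_DOMAINS
```

Exact matching is deliberately strict. A suffix check such as `host.endswith("amazon.com")` would also accept lookalike hosts, so it is avoided here.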
Experimenting with agent mode
BookDeal also includes an optional agent mode.
It uses Pydantic AI with a Gemini model to plan and execute TinyFish-powered search workflows while keeping final ranking deterministic.
The agent can:
- broaden retailer coverage
- retry searches
- decide which pages are worth fetching
- adapt when the initial results are weak or ambiguous
That exploration was genuinely useful.
But the experiment also revealed a clear boundary: final ranking still worked best with explicit scoring rules.
The agent was better at exploring possibilities.
Deterministic rules were better at making consistent, repeatable decisions.

Measurable outcome
To evaluate BookDeal, I tested it against a dataset of 100 popular books spanning fiction, nonfiction, business, technology, and classic literature.
I used title, author, and publication year where available to improve search precision. BookDeal supports ISBN input for edition-specific searches, but I avoided ISBNs here because the goal was to find a valid purchase option for each book rather than restrict results to one exact edition.
| Metric | Benchmark result |
|---|---|
| Titles tested | 100 |
| Titles with at least one valid purchase option | 91 |
| Success rate | 91% |
| Average candidate listings discovered per title | 23.7 |
| Marketplace coverage | Up to 12 marketplaces |
| Observed savings opportunities identified | 19 |
| Average observed savings when a cheaper option was found | $8.11 per book |
| Maximum observed savings | $23.71 |
One concrete example came from Zero to One by Peter Thiel with Blake Masters.
In the benchmark run, BookDeal found a $7.29 listing on ThriftBooks compared with a $31.00 listing on Books-A-Million, producing the largest observed savings in the benchmark: $23.71.
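The arithmetic behind that figure is a simple price difference. The helper below is a sketch under the assumption that observed savings means the gap between the most and least expensive valid listing for a title; `observed_savings` is a hypothetical name, and `Decimal` is used so currency values do not pick up floating-point error.

```python
from decimal import Decimal


def observed_savings(prices):
    """Gap between the highest and lowest price among valid listings."""
    values = [Decimal(p) for p in prices]
    return max(values) - min(values)


print(observed_savings(["7.29", "31.00"]))  # 23.71
```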
The nine no-deal cases were concentrated among technical or specialized titles where coverage across the approved retailer set was thinner, and some listings were unavailable or difficult to retrieve reliably.
What I’d do differently / Lessons learned
1. Treat search results as candidates, not answers
A search result can look relevant and still point to the wrong thing:
- a category page
- a rental
- a summary
- an audiobook
- a low-quality marketplace listing
Search should produce candidates.
Validation and ranking should decide what survives.
2. Preserve evidence from fetched pages
Keep the text that explains why a listing was accepted or rejected.
That evidence makes debugging dramatically easier, especially when marketplace pages are inconsistent or change over time.
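One lightweight way to do this is to store a short excerpt of the fetched text next to each decision. The sketch below is illustrative only: the `Decision` record, its field names, and the `excerpt` helper are assumptions, not BookDeal's actual data model.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Decision:
    url: str
    accepted: bool
    reason: str
    evidence: str  # excerpt of fetched text that triggered the decision


def excerpt(text: str, needle: str, radius: int = 40) -> str:
    """Return the text around the first case-insensitive match of needle."""
    index = text.lower().find(needle.lower())
    if index < 0:
        return ""
    start = max(0, index - radius)
    end = index + len(needle) + radius
    return text[start:end].strip()


def to_json(decision: Decision) -> str:
    """Serialize a decision so it can be logged and inspected later."""
    return json.dumps(asdict(decision))
```

When a marketplace page changes its layout, a log of these records shows which phrase caused a rejection without re-fetching the page.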
3. Use agents for exploration and rules for repeatable decisions
Agent behavior adds value when search needs to adapt.
But if the final outcome needs to be consistent and explainable, deterministic scoring is still extremely useful.
That was the main lesson from agent mode: the agent was useful for exploring, but explicit ranking rules were better for deciding.
Recommendation for other builders
If you are building a marketplace-comparison tool or another commerce workflow with live web data, start with three decisions:
- Define a narrow allowlist of trusted sources and validate URLs before fetching.
- Fetch full pages when the final decision depends on details that snippets may omit.
- Keep search, fetch, extraction, filtering, and ranking as separate, observable stages.
The broader lesson is simple: lightweight tooling earns its place when it genuinely compresses a workflow.
Collapsing a dozen marketplace tabs into one command sounds small on paper.
In practice, it is the kind of improvement that changes how people work.
Try it / Links
BookDeal is open source.
GitHub repo → https://github.com/Mimran0715/bookdeal.git
You can review the implementation, test the CLI locally, or adapt the workflow for another marketplace-comparison use case.
Want to build your own live web comparison workflow?
📌 Sign up free → agent.tinyfish.ai
Docs → docs.tinyfish.ai
Open source Cookbook → github.com/tinyfish-io/tinyfish-cookbook



