
Building BookDeal with TinyFish: From Twelve Tabs to One Command

Maleeha Imran · Jun 20, 2026

TL;DR

BookDeal is an open-source Python CLI that uses TinyFish Search and Fetch to compare live book listings across trusted retailers.

It filters out noisy or suspicious results, ranks listings by price, shipping, condition, format, and merchant trust, and returns the best option with backups.

In a benchmark of 100 titles, BookDeal returned at least one valid purchase option for 91 titles and identified 19 observed savings opportunities, averaging $8.11 per book when a cheaper option was found.

“The agent was good at exploring, but the rules were better at deciding.”

The problem

Buying a book online should take thirty seconds.

Somehow, it still turns into fifteen minutes and twelve open tabs.

The problem is not a lack of listings. The problem is noise.

Search results mix together:

new and used books
print and ebook formats
rental copies
summaries
study guides
audiobooks
generic category pages
listings with deceptively low sticker prices that hide expensive shipping

A raw lowest price is often a trap.

The goal is not just to find the lowest visible number. The goal is to find the cheapest reasonable option from a source a buyer can actually trust.

I built BookDeal to turn that messy comparison process into a single command.

What I built

BookDeal is a Python CLI that searches live book-marketplace listings, fetches promising product pages, extracts the details that matter, filters suspicious or irrelevant results, and ranks the best trustworthy deal.

A user runs a command such as:

./bookdeal "The Hobbit" --author "J.R.R. Tolkien" --year 1937 --stats

BookDeal then handles the workflow in five stages:

Search trusted retailer domains for relevant listings.
Fetch the most promising product pages.
Extract pricing, shipping, condition, format, and listing signals.
Filter out suspicious, irrelevant, or low-quality results.
Rank the remaining candidates and return the best deal with backup options.

Figure 1. BookDeal workflow: Search → Fetch → Extract → Filter → Rank. BookDeal uses TinyFish Search to discover candidate listings and TinyFish Fetch to inspect the most promising pages before applying deterministic filtering and ranking logic.
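
The five stages can be sketched as one function that wires together swappable callables. This is a minimal illustration, not BookDeal's actual code: `run_pipeline`, `Listing`, and every parameter name are hypothetical, and `search` and `fetch` are plain stand-ins for TinyFish Search and Fetch rather than their real client API.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Listing:
    url: str
    price: float
    shipping: Optional[float]  # None means shipping was not detected


def run_pipeline(
    query: str,
    search: Callable[[str], Iterable[str]],            # stand-in for TinyFish Search
    fetch: Callable[[str], str],                       # stand-in for TinyFish Fetch
    extract: Callable[[str, str], Optional[Listing]],  # (url, page text) -> listing or None
    keep: Callable[[Listing], bool],                   # filter predicate
    score: Callable[[Listing], float],                 # lower score is better
    backups: int = 2,
) -> list[Listing]:
    """Search -> Fetch -> Extract -> Filter -> Rank."""
    urls = list(search(query))                                  # 1. search
    pages = [(url, fetch(url)) for url in urls]                 # 2. fetch
    listings = [extract(url, text) for url, text in pages]      # 3. extract
    valid = [l for l in listings if l is not None and keep(l)]  # 4. filter
    return sorted(valid, key=score)[: 1 + backups]              # 5. rank: best plus backups
```

Because each stage is passed in, any one of them can be replaced with a fake in tests or swapped for a better heuristic without touching the others.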

I chose a CLI because the terminal is an efficient experimentation environment. It made it easy to inspect raw results, test search strategies, tweak ranking heuristics, and iterate without building a user interface first.

BookDeal also supports structured JSON output, which makes it easier to plug into larger automation workflows later.

Figure 2. A standard BookDeal CLI run for The Hobbit. The workflow queried eight marketplaces, fetched eight pages, ranked valid listings, and returned a best option with a backup in approximately three seconds.

Where TinyFish fit

Before writing ranking logic, I had to solve a retrieval problem:

How do you reliably discover and fetch live marketplace listings without building and maintaining a custom scraper for every retailer?

TinyFish solved two problems cleanly:

Search relevance across retailer domains
Structured page retrieval without custom scrapers

TinyFish’s job is to:

search across trusted retailer domains
discover candidate listings
fetch the most promising product pages
return page content that BookDeal can inspect

BookDeal’s job is to:

extract pricing, shipping, condition, format, and merchant signals
filter noisy or suspicious results
rank the remaining listings
return the cheapest reasonable option with backups

That separation let me focus on the part that made BookDeal useful: filtering and ranking.

Instead of maintaining separate scraping pipelines for Barnes & Noble, AbeBooks, ThriftBooks, Bookshop, Amazon, Books-A-Million, and other marketplaces, I could constrain searches to trusted sources and use TinyFish Fetch to inspect actual listing pages.

The distinction matters because search snippets are useful for discovery, but they are not reliable enough for final ranking.

Shipping costs may be missing. Condition details may be inconsistent. Snippets may omit the signals that determine whether a listing is actually worth recommending.

What worked

1. Ranking the cheapest reasonable deal, not the lowest sticker price

BookDeal scores each candidate across several dimensions:

item price
detected shipping cost
book condition
format, including print and ebook
merchant trust
suspicious terms and surrounding page context

Ebook listings do not receive a missing-shipping penalty.

Print listings with unknown shipping are penalized because the final cost may be higher than the visible sticker price.
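
One way to express these scoring rules is as an effective cost where lower is better. The sketch below is an assumption about the shape of such a function, not BookDeal's real scoring: the `Candidate` fields, the penalty constants, and the trust scale are all invented for illustration, and suspicious-term checks are left out.

```python
from dataclasses import dataclass
from typing import Optional

# All weights are illustrative assumptions, not BookDeal's real values.
UNKNOWN_SHIPPING_PENALTY = 5.00
CONDITION_PENALTY = {
    "new": 0.0,
    "like new": 0.5,
    "very good": 1.0,
    "good": 2.0,
    "acceptable": 4.0,
}
UNKNOWN_CONDITION_PENALTY = 3.0
TRUST_WEIGHT = 5.0


@dataclass
class Candidate:
    price: float
    shipping: Optional[float]  # None = shipping not detected on the page
    condition: str
    fmt: str                   # "print" or "ebook"
    merchant_trust: float      # 0.0 (unknown) to 1.0 (well trusted)


def effective_cost(c: Candidate) -> float:
    """Lower is better: price plus shipping plus penalty terms."""
    if c.shipping is not None:
        shipping = c.shipping
    elif c.fmt == "ebook":
        shipping = 0.0                       # ebooks get no missing-shipping penalty
    else:
        shipping = UNKNOWN_SHIPPING_PENALTY  # print copy with unknown shipping
    condition = CONDITION_PENALTY.get(c.condition.lower(), UNKNOWN_CONDITION_PENALTY)
    trust = TRUST_WEIGHT * (1.0 - c.merchant_trust)
    return round(c.price + shipping + condition + trust, 2)
```

With these made-up weights, a print copy whose shipping is unknown ranks behind an otherwise identical copy with modest detected shipping, which is the behavior described above.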

The app also excludes listings that appear to be:

audiobooks
summaries
study guides
PDFs
rentals
generic search pages
category pages

The final recommendation should point to a specific book listing that a buyer could reasonably trust.
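
A rough sketch of how such exclusions could be checked, assuming the listing title and URL are already available. The patterns and path segments are illustrative guesses that would need tuning per retailer, and `exclusion_reason` is a hypothetical helper name.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Illustrative patterns only; a real list would need tuning per retailer.
EXCLUDED_PATTERNS = {
    "audiobook": r"\baudio\s?books?\b|\baudible\b",
    "summary": r"\bsummary of\b|\bsummaries\b",
    "study guide": r"\bstudy guides?\b|\bsparknotes\b|\bcliffsnotes\b",
    "pdf": r"\bpdf\b",
    "rental": r"\brent(al|als)?\b",
}
# First path segments that usually indicate a search or category page (assumed).
GENERIC_SEGMENTS = {"search", "s", "category", "categories", "browse"}


def exclusion_reason(title: str, url: str) -> Optional[str]:
    """Return why a listing should be dropped, or None if nothing matched."""
    segments = [s for s in urlparse(url).path.lower().split("/") if s]
    if not segments or segments[0] in GENERIC_SEGMENTS:
        return "generic page"  # site root, search page, or category page
    lowered = title.lower()
    for reason, pattern in EXCLUDED_PATTERNS.items():
        if re.search(pattern, lowered):
            return reason
    return None
```

Returning the reason rather than a bare boolean makes it easier to see later why a given listing was dropped.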

2. Separating search, fetch, extraction, filtering, and ranking

The most important architectural decision was to keep the pipeline modular.

Each stage has a clear responsibility.

That made it possible to improve ranking heuristics without rewriting retrieval logic every time something changed.

3. Fetching full pages instead of trusting snippets

My first implementation leaned too heavily on search snippets.

It quickly became obvious that snippets were not enough.

Shipping costs appeared only on the listing page. Condition details were inconsistent. Important metadata was often missing.

Adding TinyFish Fetch into the pipeline made the rankings much more reliable because BookDeal could inspect the actual listing content before making a recommendation.

4. Restricting searches to trusted domains

Without domain filtering, searches sometimes wandered toward irrelevant pages.

In one memorable case, a query surfaced Amazon Music, likely because the search interpreted the title as audiobook-adjacent.

The fix was simple but important: start with an allowlist of trusted book retailers and validate URLs before fetching.

That kept noisy results from breaking downstream extraction and ranking.
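
A minimal sketch of allowlist validation before fetching. The domain set is an assumption inferred from the retailers named earlier, and `is_trusted_url` is a hypothetical helper, not BookDeal's actual function.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; domains are inferred from the retailers named above.
TRUSTED_DOMAINS = {
    "barnesandnoble.com",
    "abebooks.com",
    "thriftbooks.com",
    "bookshop.org",
    "amazon.com",
    "booksamillion.com",
}


def is_trusted_url(url: str) -> bool:
    """Accept only http(s) URLs whose host is an allowlisted domain, with or without 'www.'."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host in TRUSTED_DOMAINS
```

Matching the host exactly, rather than checking whether the URL merely contains a trusted name, means a subdomain such as music.amazon.com is rejected even though amazon.com is allowed, and lookalike hosts such as amazon.com.evil.example fail as well.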

Experimenting with agent mode

BookDeal also includes an optional agent mode.

It uses Pydantic AI with a Gemini model to plan and execute TinyFish-powered search workflows while keeping final ranking deterministic.

The agent can:

broaden retailer coverage
retry searches
decide which pages are worth fetching
adapt when the initial results are weak or ambiguous

That exploration was genuinely useful.

But the experiment also revealed a clear boundary: final ranking still worked best with explicit scoring rules.

The agent was better at exploring possibilities.

Deterministic rules were better at making consistent, repeatable decisions.

Figure 3. BookDeal agent mode applied to Atomic Habits. The agent explores the search workflow, while deterministic ranking returns the recommended option and backups.
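
The boundary between exploring and deciding can be illustrated without any agent framework. In this sketch, `explore` is a hypothetical callable that could be backed by an agent or by a fixed search; it only proposes candidates. The candidate keys (`url`, `total_cost`) are assumptions, and this is not how BookDeal's agent mode is actually wired.

```python
from typing import Callable, Iterable

Candidate = dict  # assumed keys: "url" (str) and "total_cost" (float)


def rank(candidates: Iterable[Candidate]) -> list[Candidate]:
    """Deterministic ranking: sort by total cost, break ties by URL."""
    return sorted(candidates, key=lambda c: (c["total_cost"], c["url"]))


def recommend(explore: Callable[[str], Iterable[Candidate]], query: str) -> list[Candidate]:
    """The explorer proposes candidates; the fixed rules decide the order."""
    return rank(explore(query))
```

Because the sort key is explicit, the same set of candidates produces the same ranking no matter what order the explorer returned them in.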

Measurable outcome

To evaluate BookDeal, I tested it against a dataset of 100 popular books spanning fiction, nonfiction, business, technology, and classic literature.

I used title, author, and publication year where available to improve search precision. BookDeal supports ISBN input for edition-specific searches, but I avoided ISBNs here because the goal was to find a valid purchase option for each book rather than restrict results to one exact edition.

Metric	Benchmark result
Titles tested	100
Titles with at least one valid purchase option	91
Success rate	91%
Average candidate listings discovered per title	23.7
Marketplace coverage	Up to 12 marketplaces
Observed savings opportunities identified	19
Average observed savings when a cheaper option was found	$8.11 per book
Maximum observed savings	$23.71

One concrete example came from Zero to One by Peter Thiel with Blake Masters.

In the benchmark run, BookDeal found a $7.29 listing on ThriftBooks compared with a $31.00 listing on Books-A-Million, producing the largest observed savings in the benchmark: $23.71.

The nine no-deal cases were concentrated among technical or specialized titles where coverage across the approved retailer set was thinner, and some listings were unavailable or difficult to retrieve reliably.
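
The headline figures follow directly from the numbers reported above. A short worked check, using only values from the table and the Zero to One example (the variable names are just for illustration):

```python
from decimal import Decimal

titles_tested = 100
titles_with_option = 91

# Success rate and the number of titles with no valid option.
success_pct = titles_with_option * 100 / titles_tested
no_deal_cases = titles_tested - titles_with_option

# Largest observed savings: Books-A-Million price minus ThriftBooks price.
thriftbooks_price = Decimal("7.29")
books_a_million_price = Decimal("31.00")
max_savings = books_a_million_price - thriftbooks_price

print(f"{success_pct:.0f}%")  # 91%
print(no_deal_cases)          # 9
print(max_savings)            # 23.71
```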

What I’d do differently / Lessons learned

1. Treat search results as candidates, not answers

A search result can look relevant and still point to the wrong thing:

a category page
a rental
a summary
an audiobook
a low-quality marketplace listing

Search should produce candidates.

Validation and ranking should decide what survives.

2. Preserve evidence from fetched pages

Keep the text that explains why a listing was accepted or rejected.

That evidence makes debugging dramatically easier, especially when marketplace pages are inconsistent or change over time.
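
One possible shape for that evidence is a small record stored alongside each decision. Everything here is hypothetical: the `Decision` fields, the `snippet` helper, and the single rental rule are illustrative, not BookDeal's actual data model.

```python
import json
import re
from dataclasses import asdict, dataclass


@dataclass
class Decision:
    url: str
    accepted: bool
    reason: str
    evidence: str  # the page text that triggered the decision


def snippet(page_text: str, match: re.Match, radius: int = 40) -> str:
    """Return the match with up to `radius` characters of context on each side."""
    start = max(match.start() - radius, 0)
    end = min(match.end() + radius, len(page_text))
    return " ".join(page_text[start:end].split())  # collapse whitespace


def check_rental(url: str, page_text: str) -> Decision:
    """Hypothetical single rule: reject pages that advertise a rental."""
    match = re.search(r"\brent(al)?\b", page_text, flags=re.IGNORECASE)
    if match:
        return Decision(url, False, "rental", snippet(page_text, match))
    return Decision(url, True, "no rental terms found", "")
```

Serializing each record with `json.dumps(asdict(decision))` gives one log line per listing, so a rejected result can be traced back to the exact text that caused it.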

3. Use agents for exploration and rules for repeatable decisions

Agent behavior adds value when search needs to adapt.

But when the final outcome needs to be consistent and explainable, deterministic scoring is the more dependable choice.

That was the main lesson from agent mode: the agent was useful for exploring, but explicit ranking rules were better for deciding.

Recommendation for other builders

If you are building a marketplace-comparison tool or another commerce workflow with live web data, start with three decisions:

Define a narrow allowlist of trusted sources and validate URLs before fetching.
Fetch full pages when the final decision depends on details that snippets may omit.
Keep search, fetch, extraction, filtering, and ranking as separate, observable stages.

The broader lesson is simple: lightweight tooling can feel powerful when the workflow compression is real.

Collapsing a dozen marketplace tabs into one command sounds small on paper.

In practice, it is the kind of improvement that changes how people work.