
I Built an AI Investigation System That Treats Contradictions as the Product, Not the Problem

LaSalle Browne

TL;DR

I built the Coherence Intelligence Terminal, or CIT, an AI-powered investigation platform that coordinates multiple agents to collect live public data across government databases, regulatory filings, structured public records, and the open web.

Instead of summarizing everything into one confident answer, CIT scores how much the sources agree with each other. When sources conflict, the system flags the contradiction and keeps the analyst in control.

In one synthetic demo investigation, CIT dispatched 7 parallel TinyFish tasks across public source categories including SAM.gov, PACER / CourtListener, SEC EDGAR, state corporate registries, FARA, USASpending.gov, and an Inspector General report repository. Three tasks returned structured findings within approximately 60 seconds, and one contradiction — a registered-agent address collision between two entities that appeared unrelated — moved the investigation from circumstantial to actionable.

Total TinyFish cost for the full sweep: $0.28 at $0.04 per operation.

“The most valuable output this system produces isn’t an answer. It’s a contradiction — because that’s the moment a human analyst actually needs to be in the loop.”

The problem

Most AI research tools behave like faster search engines.

They retrieve documents, summarize them, and present confident output.

That works when the task is simple. But in serious investigative research, the summary is often where the risk begins.

Two sources may say different things. One filing may say an entity is active, while another public record suggests a conflict. One database may show a clean status, while another indicates a relevant relationship, address overlap, or anomaly.

A summarization-first tool may smooth over those differences.

It may pick one source, collapse the conflict, and hand the user a clean answer that looks more certain than it really is.

That is dangerous because investigative work is not only about finding facts.

It is about finding where facts do not agree.

For due diligence, compliance, regulatory research, competitive intelligence, and investigative workflows, disagreement is not noise. It is often the signal.

That was the gap AlphaSage, the team building CIT, set out to address.

What I built

I built the Coherence Intelligence Terminal, or CIT, as an AI-powered investigation platform for messy, distributed, and often contradictory data.

CIT coordinates multiple agents to pull live public data from sources such as:

  • SEC EDGAR
  • SAM.gov
  • procurement records
  • public court and docket sources
  • state corporate registries
  • public government repositories
  • news and open-web sources

The user starts with a plain-language objective, such as:

“Analyze this company’s regulatory disclosures, financial anomalies, and corporate relationships.”

CIT then dispatches agent workers to collect relevant information, normalize incoming findings, and score how consistently those findings agree with one another.
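
The dispatch step can be pictured roughly as follows. This is a minimal sketch, not CIT's actual code: `collect` and `dispatch` are hypothetical names, and `collect` is a stub standing in for whatever real collection call a worker would make. The 60-second budget is also an assumption chosen for illustration.

```python
import asyncio

# Hypothetical stub: a real worker would call a collection service here.
async def collect(source: str, objective: str) -> dict:
    await asyncio.sleep(0)  # placeholder for network I/O
    return {"source": source, "objective": objective, "findings": []}

async def dispatch(objective: str, sources: list[str], timeout_s: float = 60.0) -> list[dict]:
    """Run one worker per source in parallel and keep whatever finishes in time."""
    tasks = [asyncio.create_task(collect(s, objective)) for s in sources]
    done, pending = await asyncio.wait(tasks, timeout=timeout_s)
    for task in pending:
        task.cancel()  # workers that miss the time budget are dropped
    # Preserve dispatch order and skip workers that raised an error.
    return [t.result() for t in tasks if t in done and t.exception() is None]

results = asyncio.run(
    dispatch("Analyze this company's regulatory disclosures", ["SEC EDGAR", "SAM.gov"])
)
```

Each returned item keeps its source name, so later stages can attribute every finding to the worker that produced it.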

The output is not a single generated answer.

It is a live investigation workspace that includes:

  • a global coherence score showing how consistently active sources agree
  • an entity relationship graph mapping corporate connections as they are discovered
  • a conflict log where flagged discrepancies are attributed to their sources
  • a source-backed claim layer that keeps the analyst in control

The core product idea is simple:

CIT does not hide contradictions. It surfaces them.

Figure 1. CIT Mission Control. Live intelligence feed, entity graph, and active investigation agents.
Signal | Sample investigation run
Global Coherence | 90%
Active Conflicts Flagged | 249
Contradiction Index | 42 (Critical)
Agent Workers | OSINT Sweep ×2
TinyFish Extraction | Active (Public Web Collection)

Note: This is a sample / synthetic investigation snapshot used to illustrate the CIT interface. It is not a production investigation, legal determination, or compliance determination.

Figure 2. Mission Queue. Initializing a new investigation with a plain-language objective.

Where TinyFish fit

TinyFish is the live public web collection layer inside CIT.

CIT uses TinyFish to reach and retrieve information from public web sources that are difficult to handle with basic fetch requests alone, including dynamic public portals, public search forms, structured government databases, and open-web pages.

TinyFish’s job is to:

  • search for relevant public sources when no specific URL is known
  • fetch structured pages when URLs are known and stable
  • navigate dynamic public web pages and search interfaces when browser automation is needed
  • return structured findings with source references

CIT’s job is to:

  • normalize TinyFish findings into typed claim objects
  • attach source IDs and provenance metadata
  • score claims for consistency against existing findings
  • surface contradictions to the analyst
  • preserve the final judgment for human review

Technically, CIT routes collection tasks into three buckets:

  1. Browser automation: Used for dynamic public portals or source interfaces that require interaction, such as public search forms or dynamically rendered pages.
  2. Fetch: Used for stable public URLs and structured sources such as SEC EDGAR, USASpending.gov, or CourtListener pages when direct retrieval is sufficient.
  3. Search: Used for open-web discovery tasks, such as news sweeps, executive profile lookups, subsidiary enumeration, or cases where no specific URL is known in advance.
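
The three buckets above could be expressed as a small routing function. This is an illustrative sketch only: `Route`, `choose_route`, and the two task properties (`url`, `needs_interaction`) are hypothetical and are not taken from CIT or from the TinyFish API.

```python
from enum import Enum
from typing import Optional

class Route(Enum):
    BROWSER = "browser_automation"
    FETCH = "fetch"
    SEARCH = "search"

def choose_route(url: Optional[str], needs_interaction: bool) -> Route:
    """Pick a collection bucket from two hypothetical task properties."""
    if url is None:
        return Route.SEARCH   # no known URL: discover sources first
    if needs_interaction:
        return Route.BROWSER  # search forms or dynamically rendered pages
    return Route.FETCH        # stable URL, direct retrieval is enough

# Placeholder URL, used only to exercise the function.
route = choose_route("https://example.gov/record/123", needs_interaction=False)
```

Checking for a missing URL first keeps discovery tasks out of the other two buckets, whatever the interaction flag says.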

In all three cases, TinyFish returns structured findings with source references.

CIT then converts those findings into typed claim objects, each tagged with a source ID, a confidence score, and a provenance token before passing them into coherence scoring.
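
A typed claim object along these lines might look like the sketch below. The field names, the shape of the raw finding, and the way the provenance token is derived (a truncated SHA-256 of source, URL, and retrieval time) are all assumptions for illustration. The article does not specify CIT's actual schema.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class Claim:
    entity: str             # who the claim is about
    attribute: str          # what is asserted, e.g. "registration_status"
    value: str              # the asserted value
    source_id: str          # which source produced the claim
    confidence: float       # 0.0 to 1.0, assigned upstream
    retrieved_at: datetime  # when the finding was collected
    provenance_token: str   # fingerprint of source + URL + retrieval time

def to_claim(finding: dict, source_id: str, confidence: float) -> Claim:
    """Normalize one raw finding (hypothetical dict shape) into a Claim."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    fingerprint = f"{source_id}|{finding['url']}|{finding['retrieved_at'].isoformat()}"
    token = hashlib.sha256(fingerprint.encode("utf-8")).hexdigest()[:16]
    return Claim(
        entity=finding["entity"],
        attribute=finding["attribute"],
        value=finding["value"],
        source_id=source_id,
        confidence=confidence,
        retrieved_at=finding["retrieved_at"],
        provenance_token=token,
    )

finding = {
    "entity": "Example Holdings LLC",         # synthetic entity
    "attribute": "registration_status",
    "value": "active",
    "url": "https://example.gov/record/123",  # placeholder URL
    "retrieved_at": datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc),
}
claim = to_claim(finding, source_id="src-001", confidence=0.9)
```

Freezing the dataclass means a claim cannot be silently edited after normalization, which fits the goal of keeping an inspectable claim trail.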

All tasks in this Build Log are scoped to public web sources only. No private credentials, non-public systems, or credentialed sessions are used.

Figure 3. TinyFish public web collection layer. Real-time public web collection with per-source verification status.
Figure 4. Case overview. Active OSINT worker pool with live agent status tracking per investigation.

What worked

1. Treating contradictions as first-class outputs

The clearest validation came during a synthetic test investigation on a complex corporate entity.

The system flagged a company showing two different regulatory statuses at the same time. One public source indicated a negative contracting status. Another listed the entity as active and in good standing.

Both sources were authoritative.

Both were live.

And they contradicted each other.

In a conventional research workflow, that contradiction might disappear. The higher-ranked or more recent source may win, the summary may reflect one status, and the analyst may never see the conflict.

In CIT, the contradiction surfaced immediately as a critical alert, with both source attributions visible side by side.

The analyst — not the model — decides what that means.
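
To make the idea concrete, here is a minimal sketch of how two conflicting status claims could be kept side by side instead of merged. The function name, the claim fields, and the sample data are hypothetical and synthetic. CIT's real conflict logic is not described in this post.

```python
from collections import defaultdict

def find_conflicts(claims: list[dict]) -> list[dict]:
    """Group claims by (entity, attribute) and report groups whose values disagree."""
    by_key = defaultdict(list)
    for claim in claims:
        by_key[(claim["entity"], claim["attribute"])].append(claim)

    conflicts = []
    for (entity, attribute), group in by_key.items():
        if len({c["value"] for c in group}) > 1:
            # Keep every claim in the group so both sources stay visible.
            conflicts.append({"entity": entity, "attribute": attribute, "claims": group})
    return conflicts

# Synthetic claims about a made-up entity.
claims = [
    {"entity": "Example Holdings LLC", "attribute": "status",
     "value": "negative contracting status", "source_id": "src-001"},
    {"entity": "Example Holdings LLC", "attribute": "status",
     "value": "active, good standing", "source_id": "src-002"},
    {"entity": "Example Holdings LLC", "attribute": "state",
     "value": "DE", "source_id": "src-002"},
]
conflicts = find_conflicts(claims)
```

The function never picks a winner. It only reports that the sources disagree and leaves the interpretation to the analyst.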

Figure 5. Case board. Live entity relationship graph with active conflicts panel showing flagged discrepancies in real time.
Figure 6. Investigator mode. Entity deep dive with relationship matrix, identity profile, and temporal evolution of a target entity.

2. Separating live collection from coherence scoring

TinyFish handles the live web collection layer.

CIT handles the reasoning layer around claims, contradictions, confidence, and analyst review.

That separation made the system easier to debug and easier to trust.

If an issue appears in the investigation workspace, the analyst can inspect:

  • which source produced the claim
  • which agent collected it
  • when it was retrieved
  • what other claims it conflicts with
  • how the conflict affected the coherence score

The system does not ask the user to trust an opaque answer.

It shows the claim trail.

3. Using structured findings instead of raw text

The first version pulled aggressively from every available source.

Output volume was high, but signal quality was low.

The problem was not access to data.

The problem was deciding what to do with it.

The shift came from scoring every incoming claim against existing findings before surfacing it. That turned data volume into investigation intelligence.

For CIT, a useful result is not just “more information.”

A useful result is a source-backed claim that can be compared, challenged, and scored against the rest of the investigation.
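
One simple way to score an incoming claim against existing findings is sketched below. The formula is an assumption made for illustration (the share of comparable existing claims that agree), as are the function names and the 0.5 review threshold. The post does not publish CIT's actual scoring method.

```python
from typing import Optional

def same_question(a: dict, b: dict) -> bool:
    """Two claims are comparable if they concern the same entity and attribute."""
    return (a["entity"], a["attribute"]) == (b["entity"], b["attribute"])

def score_incoming(claim: dict, existing: list[dict]) -> Optional[float]:
    """Share of comparable existing claims that agree; None if nothing to compare."""
    comparable = [c for c in existing if same_question(c, claim)]
    if not comparable:
        return None
    agreeing = sum(1 for c in comparable if c["value"] == claim["value"])
    return agreeing / len(comparable)

# Synthetic data about a made-up entity.
existing = [
    {"entity": "Example Holdings LLC", "attribute": "status", "value": "active", "source_id": "src-001"},
    {"entity": "Example Holdings LLC", "attribute": "status", "value": "active", "source_id": "src-002"},
]
incoming = {"entity": "Example Holdings LLC", "attribute": "status", "value": "inactive", "source_id": "src-003"}

score = score_incoming(incoming, existing)
REVIEW_BELOW = 0.5  # hypothetical threshold for flagging a claim
needs_review = score is not None and score < REVIEW_BELOW
```

Returning `None` when there is nothing to compare keeps "no evidence yet" distinct from "full disagreement", which a plain 0.0 would blur.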

4. Starting from current public sources

For contradiction detection to matter, sources need to be current.

If two sources disagree, the analyst needs confidence that both were retrieved from live or recently accessed public records, not stale cached summaries.
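
A freshness gate for that check could be as small as the sketch below. The 24-hour window and the function names are assumptions. The post does not state what retrieval age CIT treats as current.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(hours=24)  # assumption: no threshold is given in the post

def is_fresh(retrieved_at: datetime, now: datetime, max_age: timedelta = MAX_AGE) -> bool:
    """True if the record was retrieved within max_age of now (and not in the future)."""
    age = now - retrieved_at
    return timedelta(0) <= age <= max_age

def both_current(a: datetime, b: datetime, now: datetime) -> bool:
    """A disagreement is only trusted as live when both sides are fresh."""
    return is_fresh(a, now) and is_fresh(b, now)

now = datetime(2025, 1, 2, 12, 0, tzinfo=timezone.utc)
recent = datetime(2025, 1, 2, 9, 0, tzinfo=timezone.utc)
stale = datetime(2024, 12, 1, 9, 0, tzinfo=timezone.utc)

live_conflict = both_current(recent, recent, now)
stale_conflict = both_current(recent, stale, now)
```

Passing `now` in explicitly, instead of reading the clock inside the function, makes the check reproducible when an investigation is replayed later.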

TinyFish helped CIT treat live public collection as part of the investigation workflow rather than a separate scraping problem.

That was important because the value of the product depends on detecting disagreement across sources as they exist now.

Measurable outcome

The following numbers come from a synthetic demo investigation.

They are not a production investigation, legal determination, compliance determination, or claim about a real entity.

Metric | Synthetic demo result
TinyFish tasks dispatched | 7
Public source categories queried | SAM.gov, PACER / CourtListener, SEC EDGAR, state corporate registries, FARA, USASpending.gov, IG report repository
Tasks returning structured findings within ~60 seconds | 3
Example findings returned | Active SAM.gov registration, active federal civil case, registered-agent address collision
Key contradiction surfaced | Shared registered-agent address between two entities that appeared unrelated
Total TinyFish cost | $0.28
Cost per operation | $0.04
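
The cost figures above are consistent with each dispatched task counting as one billed operation. That is an inference from the numbers, not something the post states. A quick check using exact decimal arithmetic:

```python
from decimal import Decimal

tasks_dispatched = 7                    # from the synthetic demo
cost_per_operation = Decimal("0.04")    # from the synthetic demo, in USD

# Assumption: one billed operation per dispatched task.
total_cost = tasks_dispatched * cost_per_operation
print(f"${total_cost}")  # $0.28
```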

The most important output was not the number of sources collected.

It was the contradiction that changed the investigation’s confidence level.

In the synthetic demo, CIT surfaced a registered-agent address collision between two LLCs with overlapping identifiers that initially appeared unrelated. That finding moved the investigation from circumstantial to actionable because it gave the analyst a concrete relationship to examine further.
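
An address collision of this kind can be found by normalizing registered-agent addresses and grouping entities that share one. The sketch below is illustrative only: the normalization rules are deliberately crude, and the record shape, function names, and sample entities are hypothetical and synthetic.

```python
import re
from collections import defaultdict

def normalize_address(address: str) -> str:
    """Lowercase, drop punctuation, and unify a few common abbreviations."""
    text = address.lower()
    text = re.sub(r"[^\w\s]", " ", text)     # punctuation becomes whitespace
    text = re.sub(r"\bsuite\b", "ste", text)
    text = re.sub(r"\bstreet\b", "st", text)
    return " ".join(text.split())            # collapse repeated whitespace

def find_address_collisions(records: list[dict]) -> dict[str, list[str]]:
    """Map each normalized address to the entities sharing it (two or more only)."""
    by_address = defaultdict(set)
    for record in records:
        by_address[normalize_address(record["registered_agent_address"])].add(record["entity"])
    return {addr: sorted(names) for addr, names in by_address.items() if len(names) > 1}

# Synthetic records for made-up entities.
records = [
    {"entity": "Alpha Example LLC", "registered_agent_address": "100 Main Street, Suite 200, Springfield"},
    {"entity": "Beta Example LLC", "registered_agent_address": "100 main st., ste 200, springfield"},
    {"entity": "Gamma Example LLC", "registered_agent_address": "5 Oak Ave, Springfield"},
]
collisions = find_address_collisions(records)
```

A shared address is only a lead. Registered-agent services often serve many unrelated companies, so the collision is surfaced for the analyst to examine, not treated as proof of a relationship.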

The point of the benchmark is not to claim automated truth.

The point is to show that live public web collection can feed a coherence-first investigation workflow where contradictions are preserved, attributed, and made reviewable.

What I’d do differently / Lessons learned

1. More data does not automatically mean better intelligence

The first version of CIT pulled aggressively from every available source.

That sounded useful, but it created a signal problem.

When every source produces raw material, analysts still need to know which claims matter, which claims conflict, and which claims deserve attention.

The fix was to score incoming claims for consistency before surfacing them.

That made the system less like a document collector and more like an investigation workspace.

2. Confidence is not the same as accuracy

Most AI tools in this space default to confident output.

Enterprise buyers often want clear answers, and that instinct is understandable.

But in due diligence, compliance, and regulatory intelligence, a confidently wrong answer is more dangerous than an uncertain one.

The contradiction that CIT surfaced would have been easy to flatten inside a summarization-first workflow.

That is exactly why uncertainty should not be treated as a failure state.

It is information.

3. Keep the analyst in control

CIT does not make the final judgment call.

It surfaces the conflict, shows the source trail, and gives the analyst enough context to decide what the contradiction means.

That is the right division of labor for high-stakes research.

The system should find what a human might miss.

The human should decide what it means.

4. Solve live data access before building the analysis layer

If your system depends on current public sources, solve the collection layer first.

Confirm that you can reliably reach the sources that matter, retrieve structured findings, preserve references, and pass clean outputs into the rest of your pipeline.

For AlphaSage, TinyFish helped provide that foundation.

Everything else was built on top of it.

Recommendation for other builders

If you are building AI tools for high-stakes decisions, the question your buyers are really asking is not:

“Is the AI smart?”

It is:

“What happens when the AI is wrong?”

Answer that question in your architecture before you answer anything else.

For us, that meant building a system that scores its own confidence, flags its own contradictions, attributes claims to sources, and keeps the analyst in control of every final call.

TinyFish gave us the live public web collection layer that makes those catches real and reviewable.

Everything else follows from that.

Try it / Links

AlphaSage is building the Coherence Intelligence Terminal for investigation workflows where source disagreement matters.

If your team works with due diligence, compliance, regulatory research, competitive intelligence, or multi-source public data analysis, the core lesson is simple:

Do not only ask AI to summarize what it finds.

Build a system that can show you when the sources disagree.

📌 Sign up free → agent.tinyfish.ai

Docs → docs.tinyfish.ai

Open source Cookbook → github.com/tinyfish-io/tinyfish-cookbook

