Level 2: Evidence-Driven Research
The research example upgrades the loop from "produce an answer" to "produce an evidence-backed answer." It carries questions, claims, sources, and gaps forward in state until coverage is strong enough to stop.
Pattern
- Research questions first: the topic is decomposed into explicit questions.
- Parallel by design: the tracker supports fan-out across multiple questions, even though the fake-model demo walks through one open question per iteration.
- Claim-evidence matrix: each claim stores source IDs and an evidence quality score.
- Gap-driven iteration: unanswered questions and claims without sources keep the loop open.
- Citation verification: the deterministic coverage evaluator fails when evidence gaps remain.
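The claim-evidence matrix and the gap rule above can be sketched as follows. This is a minimal illustration, not the example's actual implementation: the `Question` and `Claim` dataclasses, the `answered` and `evidence_quality` field names, the 0.0 to 1.0 quality scale, and the `open_gaps` helper are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    """Hypothetical research question; field names are assumptions."""
    text: str
    answered: bool = False


@dataclass
class Claim:
    """Hypothetical row in the claim-evidence matrix."""
    text: str
    source_doc_ids: list[str] = field(default_factory=list)
    evidence_quality: float = 0.0  # assumed scale: 0.0 (weak) to 1.0 (strong)


def open_gaps(questions: list[Question], claims: list[Claim]) -> list[str]:
    """Collect unanswered questions and unsourced claims; any entry keeps the loop open."""
    gaps = [f"unanswered: {q.text}" for q in questions if not q.answered]
    gaps += [f"unsourced: {c.text}" for c in claims if not c.source_doc_ids]
    return gaps


questions = [
    Question("What consistency model does the store use?", answered=True),
    Question("How does it handle partitions?"),
]
claims = [
    Claim("Writes are linearizable.", ["doc-1"], evidence_quality=0.8),
    Claim("Reads may be stale during failover."),
]

gaps = open_gaps(questions, claims)
print(gaps)
```

In this sketch one question is still unanswered and one claim has no source, so the gap list is non-empty and another iteration would be warranted.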
Flow
[Diagram: research loop flow]
Example configuration
The real run setup keeps the budgets and topic explicit:
```python
run = LoopRun(
    example_id="level_2_research",
    goal="Produce an evidence-backed technical report on: <topic>",
    budgets=BudgetConfig(
        max_iterations=max_iterations or 4,
        max_model_calls=20,
        stagnation_threshold=4,
    ),
)
```
The tracker is the heart of the example:
```python
class ResearchTracker:
    def __init__(self) -> None:
        self.questions: list[ResearchQuestion] = []
        self.claims: list[ResearchClaim] = []
        self.searched_queries: set[str] = set()

    def get_open_gaps(self) -> list[str]:
        return [claim.text for claim in self.claims if not claim.source_doc_ids]
```
And the stop signal comes from deterministic coverage:
```python
passed = all_answered and has_claims and not gaps
return EvaluationResult(
    evaluator_name="coverage",
    status=EvaluatorStatus.PASS if passed else EvaluatorStatus.FAIL,
    score=score,
    summary="questions_answered=<n>/<total> claims=<claims> open_gaps=<gaps>",
)
```
Why it matters
This example makes a common agent failure mode visible: an answer can sound finished while the evidence model is still incomplete. By storing the live tracker in state and requiring source-backed claims, the loop has a concrete reason to continue or stop.
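The continue-or-stop decision can be sketched in a few lines. The snippet below is a simplified stand-in, not the example's evaluator: `evaluate_coverage` is a hypothetical function, the result is a plain dictionary rather than the example's `EvaluationResult`, and the score formula (fraction of questions answered) is an assumption.

```python
def evaluate_coverage(answered: int, total: int, claims: int, open_gaps: int) -> dict:
    """Deterministic coverage check: pass only when nothing is left open.

    The score formula is an assumption made for illustration.
    """
    all_answered = total > 0 and answered == total
    has_claims = claims > 0
    passed = all_answered and has_claims and open_gaps == 0
    score = answered / total if total else 0.0
    return {
        "status": "pass" if passed else "fail",
        "score": score,
        "summary": (
            f"questions_answered={answered}/{total} "
            f"claims={claims} open_gaps={open_gaps}"
        ),
    }


# Every question is answered, but one claim still lacks a source: the check fails.
incomplete = evaluate_coverage(answered=3, total=3, claims=5, open_gaps=1)
# Once every claim has a source, the same check passes and the loop can stop.
complete = evaluate_coverage(answered=3, total=3, claims=5, open_gaps=0)

print(incomplete["status"], complete["status"])
```

Because the check depends only on counts held in state, the same inputs always give the same verdict, which is what lets the loop treat it as a stop signal.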
Continue to Level 3 for sandboxed coding work and resumable execution.