The paper audit workflow bridges the gap between what a paper says and what the code actually does. It systematically compares claimed methods, defaults, metrics, and data handling against the real implementation and produces a single audit artifact documenting every discrepancy.
## Invocation

Start an audit from either entry point:

- CLI
- REPL, via the `/audit` command
## Workflow stages
### Plan
Before starting, the lead agent outlines the audit plan: which paper, which repository, and which claims to check. The plan is written to `outputs/.plans/<slug>.md` and presented to you for confirmation before any investigation begins.
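A minimal sketch of how the plan artifact might be produced, assuming hypothetical helper and field names; only the `outputs/.plans/<slug>.md` path comes from this page:

```python
from pathlib import Path
import re


def slugify(title: str) -> str:
    """Reduce a paper title to a filesystem-safe slug (hypothetical helper)."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def write_audit_plan(paper_title: str, repo: str, claims_to_check: list[str]) -> Path:
    """Write the audit plan to outputs/.plans/<slug>.md for user confirmation."""
    plan_path = Path("outputs/.plans") / f"{slugify(paper_title)}.md"
    plan_path.parent.mkdir(parents=True, exist_ok=True)
    lines = [f"# Audit plan: {paper_title}", "", f"Repository: {repo}", "", "## Claims to check"]
    lines += [f"- {claim}" for claim in claims_to_check]
    plan_path.write_text("\n".join(lines) + "\n")
    return plan_path
```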
### Gather evidence

The `researcher` subagent gathers evidence from both the paper and the codebase. It reads the paper and extracts concrete claims — hyperparameters, architecture details, training procedures, dataset splits, evaluation metrics, and reported results — tagging each with its location in the paper for traceability.

It then examines the codebase to find the corresponding implementation: configuration files, training scripts, model definitions, and evaluation code.
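A sketch of one way to represent an extracted claim so it stays traceable to its paper location; the dataclass and all field names here are assumptions:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Claim:
    """One concrete claim extracted from the paper (all field names are assumed)."""
    text: str                            # e.g. "trained with AdamW at lr=3e-4"
    category: str                        # hyperparameter, architecture, split, metric, result
    paper_location: str                  # e.g. "Section 4.2, Table 3", kept for traceability
    code_location: Optional[str] = None  # filled in once the implementation is found


lr_claim = Claim(
    text="learning rate 3e-4",
    category="hyperparameter",
    paper_location="Section 4.2",
)
```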
### Compare

Claims from the paper are systematically compared against the code; a minimal sketch of this step follows the list below. The audit calls out:
- Mismatches — hyperparameters that differ, training steps described but not implemented, evaluation procedures that deviate from the paper
- Missing code — claims in the paper with no corresponding implementation
- Ambiguous defaults — implementation choices that the paper does not specify
- Reproducibility risks — missing random seeds, non-deterministic operations without pinned versions, hardcoded paths, absent environment specifications
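A minimal sketch of the comparison logic under a simplifying assumption: both the paper's claims and the repository's configuration have already been flattened to key-value pairs (in practice the repo side would be parsed from config files and scripts):

```python
# Paper-side claims and repo-side configuration, flattened to key-value pairs.
paper_claims = {"optimizer": "adamw", "learning_rate": 3e-4, "warmup_steps": 1000}
repo_config = {"optimizer": "adamw", "learning_rate": 1e-4, "dropout": 0.1}

mismatches = {k: (v, repo_config[k]) for k, v in paper_claims.items()
              if k in repo_config and repo_config[k] != v}
missing_code = [k for k in paper_claims if k not in repo_config]  # claimed, never implemented
ambiguous = [k for k in repo_config if k not in paper_claims]     # defaults the paper never specifies

print(mismatches)    # {'learning_rate': (0.0003, 0.0001)}
print(missing_code)  # ['warmup_steps']
print(ambiguous)     # ['dropout']
```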
### Cite and verify
For non-trivial audits, the `verifier` subagent verifies sources and adds inline citations to the audit report, with exact file paths and line numbers for every documented mismatch.
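As a sketch, an inline citation might pair a file path with a line range; the exact `path:Lstart-Lend` format shown here is an assumption, not a documented convention:

```python
def cite(path: str, start: int, end: int = 0) -> str:
    """Format a file-and-line citation for the audit report (format is assumed)."""
    return f"{path}:L{start}" + (f"-L{end}" if end else "")


print(cite("src/train.py", 87))           # src/train.py:L87
print(cite("configs/base.yaml", 12, 18))  # configs/base.yaml:L12-L18
```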
## Outputs

| Artifact | Path |
|---|---|
| Audit plan | `outputs/.plans/<slug>.md` |
| Audit report | `outputs/<slug>-audit.md` |
### Audit report structure
The audit report covers the following sections (a minimal skeleton is sketched after the list):

- Match summary — proportion of paper claims that match the code
- Confirmed claims — claims accurately reflected in the codebase, with code references
- Mismatches — discrepancies between paper and code with evidence from both, citing paper section and code file/line
- Missing implementations — claims in the paper with no corresponding code
- Reproducibility risks — missing seeds, unpinned dependencies, hardcoded paths, absent environment specs
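A minimal sketch of assembling these sections into the report file at `outputs/<slug>-audit.md`; the section names come from the list above, while the function name and rendering details are assumptions:

```python
from pathlib import Path

SECTIONS = [
    "Match summary",
    "Confirmed claims",
    "Mismatches",
    "Missing implementations",
    "Reproducibility risks",
]


def write_audit_report(slug: str, findings: dict) -> Path:
    """Render findings (section name -> list of bullet strings) to outputs/<slug>-audit.md."""
    lines = [f"# Audit report: {slug}", ""]
    for section in SECTIONS:
        lines += [f"## {section}", ""]
        lines += [f"- {item}" for item in findings.get(section, ["(none found)"])]
        lines.append("")
    report_path = Path("outputs") / f"{slug}-audit.md"
    report_path.parent.mkdir(parents=True, exist_ok=True)
    report_path.write_text("\n".join(lines))
    return report_path
```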
## Subagents used

| Subagent | Role |
|---|---|
| `researcher` | Reads the paper and codebase to extract and compare claims |
| `verifier` | Verifies sources and adds inline citations for non-trivial audits |
## When to use /audit

Use `/audit` when:
- Deciding whether to build on a paper’s results and want to know if the code matches the claims
- Replicating an experiment and need to identify where the paper is underspecified
- Reviewing a paper for a venue and want to verify claims against the code
- Auditing your own paper before submission to catch inconsistencies between your writeup and implementation
## Related
- Experiment Replication — execute the replication steps identified during an audit
- Peer Review — simulate academic peer review with severity-graded feedback