name: paper-reviser description: Content-first revision for academic papers written in LaTeX (read-through -> line edit -> clean + diff + tracked delivery contract -> audit + verification requests). metadata: short-description: Content-first paper revision for LaTeX drafts (clean + diff + tracked)
Paper Reviser (LaTeX)
Use this skill when you want an “advisor-like” pass over a LaTeX draft:
- prioritize correctness/precision of statements (content first),
- improve structure/flow and English while preserving the author’s voice,
- output both a clean draft and auditable revision artifacts (diff plus tracked-delivery contract),
- and generate verification requests for simple, high-impact statements.
This skill is intentionally not part of research-writer:
research-writeris for producing an arXiv-ready RevTeX paper scaffold and provenance wiring.paper-reviseris for content-first revision of a paper draft (any class/template), usually earlier in the writing loop.
What It Produces
Given draft.tex, the tool writes a run directory containing:
original.tex(normalized baseline copy)clean.tex(edited clean draft; preserves preamble for full documents)changes.diff(unified diff: original → clean)tracked.tex(full-document only, and only when a reallatexdiffrun succeeds)tracked_fragment_audit.tex(fragment-only audit view; never a valid tracked delivery)changes.md(detailed list of changes + rationale)open_questions.md(items needing author confirmation / external verification)readthrough.md(global understanding: what the draft claims + structure/notation inventory)risk_flags.md(high-risk statements: overclaims/ambiguity/missing citations)global_style_notes.md(style/notation consistency notes)audit.md(independent auditor verdict + actionable feedback)response_revision_audit.md(brief audit artifact: response-localization mapping, tracked-delivery status, clean/latexdiff PDF verification status, correction-convergence note)verification_requests.md(concrete “please verify” items + suggested search queries)verification_requests.json(machine-readable form for orchestration; schema_version=1)deep_verification.md(step-by-step derivation/maths verification driven byverification_requests.md; uses localcodexCLI)deep_verification_secondary.md(optional: second independent derivation/maths verifier via--secondary-deep-verify-*)run.json+trace.jsonl(auditable run metadata + command trace)
Workflow (Human-Like)
- Read-through (no rewriting): understand the draft globally and identify risk points.
- Writer line edit: produce an evidence-calibrated rewrite with global coherence.
- Auditor pass (independent): critique correctness/evidence/LaTeX safety; run claim-strength audit, literature/novelty gate, and response-localization checks; emit verification requests.
- Deep verification (Codex): step-by-step derivation/maths checks based on
verification_requests.md.- Optional: run a secondary deep verifier (Gemini/Claude) for redundancy via
--secondary-deep-verify-*.
- Optional: run a secondary deep verifier (Gemini/Claude) for redundancy via
- Optional repair loop: apply reviewer feedback (audit + deep verification), re-audit and re-verify (bounded by
--max-rounds).
Input Modes: Full Document vs Fragment
The tool auto-detects whether the input is a full LaTeX document:
- Full document: first uncommented
\\begin{document}exists.- The tool treats the preamble as read-only and only edits the body.
clean.tex = (original preamble) + (edited body).tracked.texis valid only iflatexdiffsucceeds. Missing/failed/emptylatexdiffis fail-closed / NOT_READY; comment-only fallback is forbidden.
- Fragment: no
\\begin{document}.- The tool edits the entire fragment as-is.
- The tool may emit
tracked_fragment_audit.tex, but that artifact is audit-only and must not be treated astracked.texor as a real latexdiff delivery.
Quick Start
Set paths once:
SKILLS_DIR="${SKILLS_DIR:-$(for r in "${CLAUDE_CONFIG_DIR:-$HOME/.claude}" "${CODEX_HOME:-$HOME/.codex}" "$HOME/.config/opencode"; do [ -d "$r/skills" ] && echo "$r/skills" && break; done || true)}"
PAPER_REVISER="$SKILLS_DIR/paper-reviser"
RESEARCH_TEAM="$SKILLS_DIR/research-team"
1) Smoke test (no model calls)
python3 "$PAPER_REVISER/scripts/bin/paper_reviser_edit.py" \
--in /path/to/draft.tex \
--out-dir /tmp/paper_reviser_out \
--stub-models
2) Real run (opus writer + gemini-3.1-pro-preview auditor)
python3 "$PAPER_REVISER/scripts/bin/paper_reviser_edit.py" \
--in /path/to/draft.tex \
--out-dir /tmp/paper_reviser_out \
--run-models \
--writer-backend claude --writer-model opus \
--auditor-backend gemini --auditor-model gemini-3.1-pro-preview
Common optional flags:
--max-rounds 2(allow one repair cycle after the first audit)--context-file evidence.md(append extra notes/citations for the writer to use; see Verification Loop)--context-dir evidence/(append multiple evidence files (.md/.txt), name-sorted; good for orchestration)--mode fast(quick revision mode: skip deep derivation/maths verification; still runs writer + auditor; also disables secondary deep verification)--no-codex-verify(skip deep derivation verification; not recommended if you care about physics/maths correctness)--fallback-auditor claude --fallback-auditor-model <MODEL>(fallback when Gemini auditor output is empty/malformed)--secondary-deep-verify-backend gemini --secondary-deep-verify-model <MODEL>(optional redundancy: a second step-by-step checker)--codex-timeout-seconds 900(hard timeout forcodex execdeep verification)--codex-timeout-policy stub|allow-secondary|fail(timeout handling policy)--force(overwrite an existing--out-dir)
Robustness notes:
- Auditor END-only marker recovery: if
AUDIT_MDhas END but misses BEGIN, the tool can recover it as an implicit BEGIN case. - Gemini auditor malformed/empty output: the tool retries once with a stricter marker reminder; if still bad and
--fallback-auditor=claudeis enabled, it falls back automatically. - Clean-size guard is adaptive:
--min-clean-size-rationow considers both raw bytes and non-comment bytes (best-effort comment stripping), reducing false positives on comment-heavy drafts. - Deep verifier timeout is auditable: on timeout (policy
stub/allow-secondary),deep_verification.mdis written asVERDICT: NOT_READYwith timeout cause. - Latexdiff delivery is fail-closed for full documents: if
latexdiffis missing, fails, or returns empty output, the run recordstracked_delivery.status = not_ready, forcesaudit.mdtoNOT_READY, and does not write a faketracked.tex. - Latexdiff repair/verification contract: this repo tool only records
tracked_delivery,repair_loop, and compile-verification audit state inrun.json; it does not provide a generic TeX compile/fix runtime. Clean/latexdiff PDF compilation, real log reading, and bounded repair attempts belong to the use-time agent inside the concrete paper project. - If the use-time agent did not run clean/latexdiff PDF verification,
run.jsonmust stay explicitly unverified (not_run/not_ready) rather than pretending success.
Verification Loop (Optional, Recommended)
After a run:
- Read
verification_requests.md.- For orchestration, prefer
verification_requests.json(machine-readable).
- For orchestration, prefer
- Do quick literature checks (or ask another tool/skill/MCP workflow to search).
- Write the results into a context file (e.g.
evidence.md) with:- the verified reference(s),
- a 1-2 sentence justification,
- and any exact wording constraints you want enforced.
- Re-run with
--context-file evidence.md(and optionally--max-rounds 1):
python3 "$PAPER_REVISER/scripts/bin/paper_reviser_edit.py" \
--in /path/to/draft.tex \
--out-dir /tmp/paper_reviser_out_r2 \
--run-models \
--writer-backend claude --writer-model opus \
--auditor-backend gemini --auditor-model gemini-3.1-pro-preview \
--context-file /path/to/evidence.md \
--max-rounds 1
Build a research-team verification plan (JSON)
To help a research workflow orchestrate literature verification, you can convert
verification_requests.json into a deterministic plan of research-team literature_fetch.py commands:
python3 "$PAPER_REVISER/scripts/bin/build_verification_plan.py" \
--in /tmp/paper_reviser_out/verification_requests.json \
--out /tmp/paper_reviser_out/verification_plan.json \
--kb-dir verification/knowledge_base/literature \
--trace-path verification/knowledge_base/methodology_traces/literature_queries.md \
--arxiv-src-dir verification/references/arxiv_src
Then execute the plan tasks (typically under an approval gate) and write per-item evidence notes.
Finally re-run paper-reviser with either --context-file evidence.md or --context-dir evidence/.
Contract Highlights
- Evidence-calibrated revision contract: do not default to hedging. Keep strong supported statements, strengthen underclaimed text when evidence warrants it, and weaken only for genuine evidence/logic/literature gaps.
- Referee-response mode contract: detected from context/structure rather than file naming; referee comments are read-only, only author responses/manuscript revisions may change, and every
we revised/clarified/added/correcteddeclaration must be localized to the shortest sufficient manuscript location. - Claim-strength audit + literature/novelty gate: novelty claims require full-text support, not title/abstract/metadata-only checks.
- Author color versus latexdiff color contract: author color remains an independent semantic layer; diff colors must stay distinct so colored insertions/deletions remain visible.
- Correction-convergence contract: bounded repair rounds must resolve blockers with the smallest sufficient edit and no silent fallback success.
HEP literature checks (INSPIRE/arXiv)
For high-energy physics (and nearby fields), a practical workflow is:
- Use
research-team’sliterature_fetch.pyto search INSPIRE / arXiv and download arXiv LaTeX sources:
# INSPIRE search (returns candidate records)
python3 "$RESEARCH_TEAM/scripts/bin/literature_fetch.py" \
inspire-search --query "t:your topic AND date:2020->2026" -n 5
# Fetch one INSPIRE record (optional: write a KB note under knowledge_base/)
python3 "$RESEARCH_TEAM/scripts/bin/literature_fetch.py" \
inspire-get --recid 1234567 --write-note
# Download arXiv LaTeX sources (stored under references/arxiv_src/<arxiv_id>/)
python3 "$RESEARCH_TEAM/scripts/bin/literature_fetch.py" \
arxiv-source --arxiv-id 2101.01234 --out-dir references/arxiv_src
- Skim the downloaded source (TeX/PDF) and write a short
evidence.mdexplaining what is supported/unsupported. - Re-run
paper-reviserwith--context-file evidence.mdso the writer can tighten or correct the statements.
If you are already running inside an agent environment with hep-mcp / @autoresearch/hep-mcp, you can also use its INSPIRE/arXiv retrieval tools (search + source download) and then feed the results into --context-file. (Keep the editing step separate from the retrieval step for auditability.)
Safety/Scope Notes
- The tool may strengthen/add claims when evidence supports them; this is an evidence-calibrated workflow, not a default-conservative or default-hedging workflow.
- Complex computation verification is out of scope; the tool instead produces
open_questions.md/verification_requests.md. - LaTeX safety is best-effort. The tool attempts to prevent edits inside verbatim-like environments (verbatim/lstlisting/minted/comment), but you should still compile-check after edits.
- This repo is not a general TeX compile/repair engine. When using the skill in a real paper project, the agent should run LaTeX compilation there, read the actual log, and apply the smallest auditable repair needed (latexdiff options, preamble/macros, or minimal post-processing).
- If you need preamble changes (packages/macros), the tool will usually propose them in
changes.md; apply them manually (preamble is preserved in full-document mode). - The tool operates on a single
.texfile at a time. For multi-file projects using\\input{}/\\include{}, run it per file (or on a pre-concatenated version). Orphan-ref warnings may be false positives when labels live in other files. - Defaults (override as needed):
--encoding utf-8,--min-clean-size-ratio 0.85,--max-rounds 1,--codex-timeout-seconds 900,--codex-timeout-policy stub. - For full documents,
tracked.texis valid only when produced by reallatexdiff. If that delivery is unavailable, the run must stayNOT_READYinstead of substituting a comment-only fallback. - For full documents, if
clean.texcompiles, the use-time agent should make a real attempt to compile the latexdiff PDF as well. If that diff build fails, read the log and try the smallest auditable fix; do not substitute the clean PDF for the diff PDF.
Dev Smoke Tests (Local)
bash "$PAPER_REVISER/scripts/dev/run_smoke_tests.sh"