rrwrite-workflow

name: rrwrite-workflow description: Use for complete end-to-end manuscript generation. Orchestrates all steps from repository analysis to final manuscript with validation gates, two-stage review, citation verification, and automated revision. arguments:

name: repo_path description: Path to research repository (local or GitHub URL) required: true
name: target_dir description: Output directory for manuscript (default: manuscript/reponame_v1) default: manuscript
name: max_revisions description: After critique, automatically revise up to N times (0=disabled, default 0) default: 0 allowed-tools: context: fork

RRWrite: Complete Manuscript Generation Workflow

Comprehensive, guided workflow incorporating all reliability improvements:

✅ Defense-in-depth citation validation (4 layers)
✅ Verification gates (no incomplete sections)
✅ Two-stage review (content + format)
✅ Rationalization counters (evidence-based error messages)
✅ Task decomposition (2-5 minute checkpoints)
✅ Root cause tracing (automated debugging)

Quick Reference

Phase	Time	Output	Validation
1. Analyze Repository	2-5 min	repository_analysis.md, data tables	Auto
2. Plan Manuscript	3-5 min	outline.md	Word count, structure
3. Assess Journal	2-3 min	journal_assessment.md	User selection
4. Research Literature	5-10 min	literature.md, evidence.csv, citations.bib	DOI verification
5. Extract Figures/Tables	2-4 min	figure_manifest.json, table_manifest.json	Manifest validation
6. Draft Sections	20-40 min	abstract.md, intro.md, etc.	Verification gates
7. Assemble Manuscript	1-2 min	manuscript_full.md, .docx	Citation sync, completeness
8. Review & Revision	10-15 min (+2-6 min per revision)	critique_v{N}.md	Content → Format → Auto-revision

Total: 45-90 minutes (without revision) | 50-100 minutes (with 2 revisions) Validation: Continuous at every phase

Phase 1: Repository Analysis

Step 1.1: Analyze Repository Structure

python scripts/rrwrite-analyze-repository.py \
  --repo {repo_path} \
  --output {target_dir}

Generates:

repository_analysis.md - Structure, files, dependencies
data_tables/ - File inventory, statistics, research indicators
repo_evidence.md - Evidence database for claims

Validation (automatic):

File count > 0
README or documentation found
At least one code file detected

If validation fails: Cannot proceed. Check repo path is correct.

Phase 2: Manuscript Planning

Step 2.1: Generate Outline

python scripts/rrwrite-plan-manuscript.py \
  --repo-analysis {target_dir}/repository_analysis.md \
  --output {target_dir}/outline.md

Generates:

outline.md - Section structure, word targets, evidence mapping

Validation:

python scripts/rrwrite-validate-manuscript.py \
  --file {target_dir}/outline.md \
  --type outline

Requirements:

All required sections present (Abstract, Intro, Methods, Results, Discussion)
Word count ≥ 500 (detailed outline)
Evidence files mapped

If validation fails: Review outline, add missing sections, ensure detail sufficient.

Phase 3: Journal Assessment

Step 3.1: Assess Target Journals

python scripts/rrwrite-assess-journal.py \
  --outline {target_dir}/outline.md \
  --output {target_dir}/journal_assessment.md

Generates:

journal_assessment.md - 3-5 recommended journals with fit scores

Step 3.2: Select Journal

User decision required: Review recommendations, select target journal.

Options:

Nature Methods (high impact, strict requirements)
PLOS Computational Biology (open access, broader scope)
Bioinformatics (specialized, faster review)

Step 3.3: Fetch Guidelines

python scripts/rrwrite-fetch-guidelines.py \
  --journal {selected_journal} \
  --output {target_dir}/journal_guidelines.md

Generates:

journal_guidelines.md - Author instructions, word limits, format requirements

Phase 4: Literature Research

Step 4.1: Detect Previous Version (Optional)

python scripts/rrwrite_import_evidence_tool.py \
  --detect-only \
  --target-dir {target_dir}

If previous version found: Prompt to reuse literature as base (DOIs validated, broken links excluded).

Decision: [Y]es to import + expand | [N]o to start fresh

Step 4.2: Research Literature

python scripts/rrwrite-research-literature.py \
  --outline {target_dir}/outline.md \
  --target-dir {target_dir}

Search Strategy: Cascading year search (prioritize recent, expand back if needed)

Tier 1: Recent work (2024-2026) - target 15-20 papers
Tier 2: Medium recent (2020-2023) - if <15 papers found
Tier 3: Foundational (2016-2019) - if still <15 papers
Stopping: 15-20 papers ideal, 10+ acceptable for niche topics

See docs/cascading-literature-search.md for detailed strategy.

Generates:

literature.md - 1-page summary (background, related work, gaps)
literature_evidence.csv - Citation database (DOI, title, year, quote)
literature_citations.bib - Bibliography (BibTeX format)

Defense-in-Depth Layer 1 activated: All citations validated on entry.

Validation:

python scripts/rrwrite-validate-manuscript.py \
  --file {target_dir}/literature.md \
  --type literature

Requirements:

≥ 10 citations (adequate coverage)
All DOIs verified (resolve correctly)
All required sections present (Background, Related Work, Gaps)
Word count 800-1500

If validation fails: Run tracer to identify broken DOIs:

python scripts/rrwrite_citation_tracer.py {citation_key} literature {target_dir}

Phase 5: Figure and Table Extraction (2-4 min)

Extract existing figures and tables from the analyzed repository and generate supplementary visualizations.

Step 5.1: Run Extraction

python scripts/rrwrite-extract-figures-tables.py \
  --repo-path {repo_path} \
  --manuscript-dir {target_dir} \
  --extract figures,tables \
  --generate figures \
  --verbose

Generates:

figures/from_repo/ - Original repository figures (Priority 1)
figures/generated/ - Analysis visualizations (Priority 2)
tables/from_repo/ - Original data tables
tables/generated/ - Analysis tables
figure_manifest.json - Metadata with priority and section recommendations
table_manifest.json - Table metadata

Step 5.2: Review Extraction Results

Check the extraction summary:

# View figure manifest
cat {target_dir}/figures/figure_manifest.json | grep -A 3 '"id"'

# View repository figures index
cat {target_dir}/figures/from_repo/README.md

Validation:

Repository figures copied to from_repo/ directory
Generated figures created in generated/ directory
Manifests created with valid JSON
Priority correctly assigned (1=repo, 2=generated)

Decision Points:

If no repository figures found: Continue (generated figures still available)
If extraction errors: Check repository path and file permissions

If manifests invalid: Run validation:

python scripts/rrwrite_manifest_generator.py \
  --manuscript-dir {target_dir} \
  --validate \
  --schemas-dir schemas

Output Summary:

Figures from repository: N
Figures generated: M
Tables from repository: X
Tables generated: Y

Phase 6: Draft Sections (WITH VERIFICATION GATES)

CRITICAL: Iron Law of Academic Drafting

NO SECTION MARKED COMPLETE WITHOUT PASSING VERIFICATION GATE

For each section:

Step 5.1: Draft Section

python scripts/rrwrite-draft-section.py \
  --section {section_name} \
  --target-dir {target_dir}

Task decomposition (2-5 minute chunks):

Task 1: First paragraph (2 min, verify word count + citations)
Task 2: Second paragraph (3 min, verify appropriateness)
Task 3: Third paragraph (2 min, verify no orphaned refs)
Task 4: Final assembly (1 min, verify total word count ±20%)

Defense-in-Depth Layers 1-2 activated:

Layer 1: Entry validation (citations in evidence file)
Layer 2: Business logic (section appropriateness)

See docs/citation-rules-by-section.md for appropriate citation types.

Step 5.2: VERIFICATION GATE (MANDATORY)

Before proceeding to state update:

python scripts/rrwrite-validate-manuscript.py \
  --file {target_dir}/{section_name}.md \
  --type section

Checklist (ALL must pass):

Word count within ±20% of target
All citations in literature_evidence.csv
No orphaned figure/table references
Required subsections present (if applicable)
Exit code = 0

If exit code ≠ 0: DO NOT PROCEED. Fix errors, re-validate.

Common rationalizations countered:

Thought	Reality
"I'll fix citations later"	40% of citations forgotten → plagiarism risk
"Word count is close enough"	Journals auto-reject at limit violations
"This is just a draft"	First drafts with errors become habits

See docs/rationalization-table.md for full list.

If validation fails: Use root cause tracer:

python scripts/rrwrite_citation_tracer.py {failing_citation} {section_name} {target_dir}

Step 5.3: Update State (ONLY AFTER GATE PASSES)

from rrwrite_state_manager import StateManager
manager = StateManager(output_dir="{target_dir}")
manager.add_section_completed("{section_name}")

Step 5.4: Check Progress

python scripts/rrwrite-status.py --output-dir {target_dir}

Repeat Steps 5.1-5.4 for all sections:

abstract
introduction
methods
results
discussion
availability

Section-specific citation rules enforced:

Abstract: 0-2 citations max
Introduction: All types appropriate
Methods: Tools/datasets only (NO abstract principles)
Results: Observations only (NO explanations)
Discussion: All types appropriate
Availability: Platform-specific only (NO methodology papers)

Phase 7: Assemble Manuscript

Step 6.1: Assemble Sections

python scripts/rrwrite-assemble-manuscript.py \
  --output-dir {target_dir} \
  --max-revisions {max_revisions}

Generates:

manuscript_full.md - Complete assembled manuscript
manuscript_full.docx - Word document (if pandoc available)
EVIDENCE_REPORT.md - Citation evidence report

Defense-in-Depth Layer 3 activated: Assembly validation

All citations synchronized between text and bibliography
No orphaned references
Section order correct for journal
Metadata complete

If max_revisions > 0:

Automatically runs critique after assembly
Iteratively revises sections to address issues
Stops when major issues resolved or max iterations reached

Step 6.2: Validate Assembly

python scripts/rrwrite-validate-manuscript.py \
  --file {target_dir}/manuscript.md \
  --type manuscript

Requirements:

All required sections present
Total word count 1000-15000
Citations synchronized (text ↔ bibliography)
No orphaned tables/figures
Journal limits respected (tables, figures)

If validation fails: Most commonly citation sync issues. Tracer identifies:

python scripts/rrwrite_citation_tracer.py {failing_citation} manuscript {target_dir}

Phase 8: Two-Stage Review and Revision

Note: If max_revisions > 0 was set in Phase 6, critique and automated revision have already been completed. This phase is for manual review of results or additional critique if needed.

Stage 1: Content Review (PRIORITY ISSUES)

Mindset: Skeptical scientist - assume claims wrong until proven

python scripts/rrwrite-critique-manuscript.py \
  --manuscript-dir {target_dir}

Generates:

critique_content_v1.md - Content validation report
critique_format_v1.md - Format compliance report

Checks (6 validations):

Research question clearly stated?
All claims supported by evidence?
Logical flow coherent?
Methods reproducible?
Results interpretations valid?
Narrative coherent?

Output: Major issues (content validity), minor issues (improvements)

Action:

If max_revisions = 0: Manually fix issues and re-assemble
If max_revisions > 0: Review automated revision results in critique_content_v{N}.md

Stage 2: Automated Revision (OPTIONAL)

Only if max_revisions > 0 (already completed in Phase 6)

The automated revision system iteratively addresses issues:

Parse critique reports → Extract structured issues
Map to sections → Determine which sections need revision
Apply revisions → Use specialized revisers (rule-based + LLM)
Re-assemble → Combine revised sections
Re-critique → Generate new critique report
Check convergence → Stop if major issues resolved or max iterations reached

Convergence criteria (stops when ANY met):

✓ Major issues == 0 (primary goal)
Max iterations reached
Stalled (< 5% improvement)

Review results:

# Check revision summary
cat {target_dir}/.rrwrite/state.json | grep -A 20 "revision"

# View final critique
cat {target_dir}/critique_content_v{N}.md

Stage 3: Format Review (SECONDARY ISSUES)

Mindset: Copy editor - trust content, verify presentation

Note: Format critique already generated in Stage 1 as critique_format_v1.md

# If running critique manually:
python scripts/rrwrite-critique-format.py \
  --file {target_dir}/manuscript_full.md \
  --journal {selected_journal} \
  --output {target_dir}/critique_format_v1.md

Checks (6 validations):

Citations formatted correctly?
Tables numbered and captioned?
Figures referenced sequentially?
No orphaned references?
Word counts within journal limits?
Required sections present per journal?

Output: Minor issues (formatting)

Action: Fix formatting issues for polish.

Why Two Stages?

Different mindsets: Content = domain expertise + skepticism | Format = attention to detail
Clear priorities: Major (content) before Minor (format)
Can parallelize: Run both simultaneously, prioritize content fixes
Reduced cognitive load: Focus on one concern type

Final Output Structure

{target_dir}/
├── .git/                               # Version control
├── .rrwrite/
│   ├── state.json                      # Workflow state
│   └── citation_audit.jsonl            # Citation usage audit trail
├── repository_analysis.md              # Phase 1 output
├── outline.md                          # Phase 2 output (VALIDATED)
├── journal_assessment.md               # Phase 3 output
├── journal_guidelines.md               # Phase 3 output
├── literature.md                       # Phase 4 output (VALIDATED)
├── literature_evidence.csv             # Citation database (DOI verified)
├── literature_citations.bib            # Bibliography
├── repo_evidence.md                    # Evidence database
├── data_tables/                        # Repository tables
│   ├── file_inventory.tsv
│   ├── repository_statistics.tsv
│   └── research_indicators.tsv
├── sections/                           # Phase 6 outputs (ALL VALIDATED)
│   ├── abstract.md                     # ✅ Passed verification gate
│   ├── introduction.md                 # ✅ Passed verification gate
│   ├── methods.md                      # ✅ Passed verification gate
│   ├── results.md                      # ✅ Passed verification gate
│   ├── discussion.md                   # ✅ Passed verification gate
│   └── availability.md                 # ✅ Passed verification gate
├── manuscript.md                       # Phase 6 output (VALIDATED)
├── critique_content_v1.md              # Phase 7 Stage 1 output
└── critique_format_v1.md               # Phase 7 Stage 2 output

Validation Summary

Continuous Validation Throughout Workflow

Phase	Validation Type	Tool	Enforcement
1. Analysis	Auto	Built-in checks	Soft (warnings)
2. Outline	Manual	`rrwrite-validate-manuscript.py --type outline`	Required before proceeding
4. Literature	Manual + Layer 1	`rrwrite-validate-manuscript.py --type literature`	Required + DOI verification
5. Each Section	VERIFICATION GATE	`rrwrite-validate-manuscript.py --type section`	MANDATORY, exit code 0
6. Assembly	Layer 3	`rrwrite-validate-manuscript.py --type manuscript`	Required + citation sync
7. Content Review	Stage 1	`rrwrite-critique-content.py`	Major issues must be fixed
7. Format Review	Stage 2	`rrwrite-critique-format.py`	Minor issues recommended fixes

Defense-in-Depth Citation Validation (Active Throughout)

Layer 1: Entry Validation (Phase 4-6)

Fast-fail: Citation must be in evidence file before use

Layer 2: Business Logic (Phase 6)

Section-specific appropriateness (Methods = tools, Results = observations)

Layer 3: Assembly Validation (Phase 7)

Manuscript-wide completeness, bibliography synchronization

Layer 4: Audit Trail (Continuous)

Citation usage logged for forensics and debugging

Error Handling

Common Errors and Solutions

Error	Phase	Solution	Tool
Citation not found	5 (Draft)	Add to literature_evidence.csv	Root cause tracer
Word count violation	5 (Draft)	Expand/condense section	Task decomposition
Broken DOI	4 (Literature)	Find alternative source or remove	DOI validator
Citation sync failure	6 (Assembly)	Fix bib entry or text citation	Citation tracer
Orphaned reference	5-6	Add figure/table or remove reference	Format validator
Incomplete section	5 (Draft)	Cannot proceed, fix errors first	Verification gate

Root Cause Tracing (When Validation Fails)

python scripts/rrwrite_citation_tracer.py {citation_key} {section} {target_dir}

Provides:

Level 1: Symptom observation
Level 2: Immediate cause (where failing)
Level 3: Usage trace (how got there)
Level 4: Data origin (where added)
Level 5: Trigger (when/why started)
Suggested fix: Actionable commands
Prevention: How to avoid recurrence

Workflow State Management

Check Current Status

python scripts/rrwrite-status.py --output-dir {target_dir}

Shows:

Phase completion (which steps done)
Section status (pending, in_progress, completed)
Validation results (passed/failed per section)
Next recommended action

Resume Interrupted Workflow

State persists in {target_dir}/.rrwrite/state.json

To resume:

Check status: python scripts/rrwrite-status.py --output-dir {target_dir}
Identify incomplete phase
Re-run that phase's command
Continue from there

Comparison: Comprehensive vs. Power User Workflow

This Workflow (Comprehensive, Guided)

Pros:

Fully automated orchestration
Built-in validation at every step
Verification gates prevent incomplete work
Guided decisions (journal selection, etc.)
Error messages explain why and what to do
Task decomposition provides checkpoints

Cons:

More verbose output
Less granular control
Must follow phase order

Best for: First-time users, complex manuscripts, wanting validation safety

Power User Workflow (Individual Commands)

Pros:

Granular control over each step
Can skip/reorder phases as needed
Parallel execution possible
Expert mode (minimal output)
JSON output for automation
Shell aliases for speed

Cons:

Must manage phase dependencies manually
Must remember to validate
More responsibility for correctness

Best for: Experienced users, iterative refinement, automation integration

See: docs/power-user-workflow.md for individual commands

Success Metrics

Expected Outcomes

After completing this workflow, you should have:

Quality:

✅ Zero incomplete sections (verification gates)
✅ Zero citation errors (defense-in-depth)
✅ All claims evidence-backed (audit trail)
✅ Reproducible methods (validation checks)
✅ Journal-compliant format (two-stage review)

Files:

✅ 15-20 markdown files (analysis, outline, sections, manuscript, reviews)
✅ 1 CSV (citation database)
✅ 1 BIB (bibliography)
✅ Data tables (repository statistics)
✅ Complete git history (version control)

Validation:

✅ All sections passed verification gates
✅ Manuscript passed assembly validation
✅ Content review completed (major issues addressed)
✅ Format review completed (minor issues noted)

Time:

40-80 minutes total (varies by repository size)
~2 hours if including revisions from critique

Next Steps After Workflow Completion

1. Address Critique Issues

Review both critique reports:

cat {target_dir}/critique_content_v1.md
cat {target_dir}/critique_format_v1.md

Fix major issues first (content), then minor (format).

2. Iterate if Needed

For major revisions:

Edit affected sections directly: {target_dir}/sections/{section}.md
Re-validate each edited section: python scripts/rrwrite-validate-manuscript.py --file ... --type section
Re-assemble: python scripts/rrwrite-assemble-manuscript.py --target-dir {target_dir}
Re-critique: python scripts/rrwrite-critique-content.py ...

3. Export for Submission

# Convert to journal-specific format (if needed)
pandoc {target_dir}/manuscript.md -o {target_dir}/manuscript.docx
pandoc {target_dir}/manuscript.md -o {target_dir}/manuscript.pdf

4. Version Control

cd {target_dir}
git add .
git commit -m "Final manuscript draft v1

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>"
git tag v1.0

Troubleshooting

Workflow Stuck?

Check state: python scripts/rrwrite-status.py --output-dir {target_dir}
Identify failing phase
Run validation for that phase
If validation fails, use root cause tracer
Fix identified issue
Re-run phase command
Continue

Need Help?

Validation errors: See docs/rationalization-table.md
Citation issues: Run python scripts/rrwrite_citation_tracer.py ...
Word count issues: See docs/2-5-minute-rule.md for task decomposition
Section rules: See docs/citation-rules-by-section.md

RRWrite: Complete Manuscript Generation Workflow

Quick Reference

Phase 1: Repository Analysis

Step 1.1: Analyze Repository Structure

Phase 2: Manuscript Planning

Step 2.1: Generate Outline

Phase 3: Journal Assessment

Step 3.1: Assess Target Journals

Step 3.2: Select Journal

Step 3.3: Fetch Guidelines

Phase 4: Literature Research

Step 4.1: Detect Previous Version (Optional)

Step 4.2: Research Literature

Phase 5: Figure and Table Extraction (2-4 min)

Step 5.1: Run Extraction

Step 5.2: Review Extraction Results

Phase 6: Draft Sections (WITH VERIFICATION GATES)

CRITICAL: Iron Law of Academic Drafting

Step 5.1: Draft Section

Step 5.2: VERIFICATION GATE (MANDATORY)

Step 5.3: Update State (ONLY AFTER GATE PASSES)

Step 5.4: Check Progress

Phase 7: Assemble Manuscript

Step 6.1: Assemble Sections

Step 6.2: Validate Assembly

Phase 8: Two-Stage Review and Revision

Stage 1: Content Review (PRIORITY ISSUES)

Stage 2: Automated Revision (OPTIONAL)

Stage 3: Format Review (SECONDARY ISSUES)

Why Two Stages?

Final Output Structure

Validation Summary

Continuous Validation Throughout Workflow

Defense-in-Depth Citation Validation (Active Throughout)

Error Handling

Common Errors and Solutions

Root Cause Tracing (When Validation Fails)

Workflow State Management

Check Current Status

Resume Interrupted Workflow

Comparison: Comprehensive vs. Power User Workflow

This Workflow (Comprehensive, Guided)

Power User Workflow (Individual Commands)

Success Metrics

Expected Outcomes

Next Steps After Workflow Completion

1. Address Critique Issues

2. Iterate if Needed

3. Export for Submission

4. Version Control

Troubleshooting

Workflow Stuck?

Need Help?

Related Documentation