name: reporting description: Create publication-ready Quarto reports with standard academic structure incorporating all outbreak analysis results
Reporting Skill: Publication-Ready Outbreak Analysis Reports
[!IMPORTANT] PRIMARY WORKFLOW: When the user requests an analysis, create ONE Quarto document (
.qmdfile) that contains:
- All code chunks for data cleaning, parameter retrieval, analysis, and visualization
- Narrative text with inline R code explaining each step and presenting results
- Standard academic structure (Background, Methods, Results, Discussion)
Do NOT create separate R scripts. The Quarto document is both the analysis script AND the publication-ready report. When rendered, it executes all code and generates the final document.
[!IMPORTANT] Use the Introspection Protocol: See epiverse-overview skill for the protocol when working with R packages.
Overview
This skill provides guidance for creating a single Quarto document that contains the complete outbreak analysis workflow embedded within a standard academic report structure.
The Complete Pipeline → Report
Data Intake → Parameters → Simulation/Analysis → Visualization → REPORT (end goal)
Critical Style Guidelines
1. Inline R Code for Numbers (MANDATORY)
ALWAYS use inline R code for any numbers in text. Never hardcode numbers.
Good:
The outbreak included `r nrow(linelist)` cases with CFR of `r sprintf("%.1f%%", cfr * 100)`
(95% CI: `r sprintf("%.1f%%", cfr_lower * 100)`–`r sprintf("%.1f%%", cfr_upper * 100)`).
Bad:
The outbreak included 5,130 cases with CFR of 9.97% (95% CI: 9.2%–10.8%).
Why: Ensures numbers update automatically, prevents errors, maintains full reproducibility.
2. UK English Spelling (MANDATORY)
ALWAYS use UK English spelling:
- Analyse (not Analyze)
- Characterise (not Characterize)
- Centre (not Center)
- Colour (not Color)
- Programme (not Program, except "computer program")
- Standardise (not Standardize)
- Visualisation (not Visualization)
3. Collapsed Code for Detailed Methods
Use code-fold: true for detailed methods code:
#| label: data-cleaning
#| code-fold: true
#| code-summary: "Show data cleaning code"
cleaned_data <- raw_data |>
standardize_dates() |>
remove_duplicates()
When to use:
- Data cleaning pipelines
- Complex statistical procedures
- Sensitivity analyses
- Diagnostic plots
When NOT to use:
- Setup chunks (use
echo: false) - Simple results display
- Final visualizations
Standard Report Structure
1. Executive Summary
2-3 sentences summarizing key findings for stakeholders (~100-150 words)
2. Background
Disease context, outbreak setting, why analysis is needed (~300-500 words)
3. Objectives
Bulleted list of specific analysis questions
4. Data
Data sources, quality, sample size, completeness
5. Methods
- One subsection per major analysis component
- Use
code-fold: truefor detailed code - Document packages and parameters
6. Results
- Present findings objectively
- Use inline R code for ALL numbers
- Cross-reference figures and tables
7. Discussion
- Interpret results in context
- Compare to literature
- Acknowledge limitations
- State implications
8. Conclusions
2-4 sentences with main takeaways
9. Computational Environment
- Session information
- Package versions
- Reproduction instructions
10. References
BibTeX citations for all sources
Essential YAML Header
---
title: "Analysis Title"
author: "Name / Organization"
date: today
format:
html:
toc: true
toc-depth: 3
code-fold: true
code-tools: true
embed-resources: true
theme: cosmo
fig-width: 8
fig-height: 5
pdf:
toc: true
number-sections: true
execute:
echo: false
warning: false
message: false
bibliography: references.bib
---
Chunk Options
#| label: descriptive-name
#| echo: false # Hide code, show output (default)
#| code-fold: true # Show collapsed code (methods)
#| code-summary: "Show code"
#| fig-cap: "Figure caption"
#| fig-width: 8
#| fig-height: 5
#| tbl-cap: "Table caption"
Setup Chunk Template
#| label: setup
#| message: false
#| warning: false
# Configure repositories
options(repos = c(
epiverse = "https://epiverse-trace.r-universe.dev",
CRAN = "https://cloud.r-project.org"
))
# Core Epiverse-TRACE packages
library(simulist)
library(cleanepi)
library(linelist)
library(cfr)
library(epiparameter)
# Data manipulation and visualization
library(tidyverse)
library(here)
library(gt)
library(patchwork)
# Set reproducibility
set.seed(42)
# Create output directories
dir.create(here::here("outputs", "plots"), recursive = TRUE, showWarnings = FALSE)
dir.create(here::here("outputs", "tables"), recursive = TRUE, showWarnings = FALSE)
Integrating Analysis Results
Pattern: Analyze → Store → Display → Save
1. Perform Analysis
#| label: estimate-cfr
#| code-fold: true
#| code-summary: "Show CFR estimation code"
cfr_results <- cfr_rolling(
data = cleaned_data,
delay_density = delay_func
)
# Store for inline reporting
cfr_estimate <- cfr_results$severity_estimate
cfr_lower <- cfr_results$severity_low
cfr_upper <- cfr_results$severity_high
In narrative (use inline R code):
The estimated CFR was `r sprintf("%.1f%%", cfr_estimate * 100)`
(95% CI: `r sprintf("%.1f%%", cfr_lower * 100)`–`r sprintf("%.1f%%", cfr_upper * 100)`).
2. Create Visualization
#| label: plot-cfr
#| echo: false
#| fig-cap: "Rolling CFR estimates with 95% CI"
#| fig-width: 8
#| fig-height: 5
cfr_plot <- cfr_results |>
ggplot(aes(x = date)) +
geom_ribbon(aes(ymin = severity_low, ymax = severity_high),
fill = "darkred", alpha = 0.2) +
geom_line(aes(y = severity_estimate), color = "darkred") +
scale_y_continuous(labels = scales::percent) +
labs(title = "Rolling CFR Estimates", x = "Date", y = "CFR") +
theme_minimal()
cfr_plot
# Save for external use
ggsave(here::here("outputs", "plots", "cfr_rolling.png"),
cfr_plot, width = 8, height = 5, dpi = 300, bg = "white")
3. Create Summary Table
#| label: table-cfr
#| echo: false
#| tbl-cap: "CFR estimates with 95% CI"
tibble(
Metric = c("Naive CFR", "Delay-corrected CFR", "Observation period"),
Value = c(
sprintf("%.1f%%", naive_cfr * 100),
sprintf("%.1f%% (%.1f%% - %.1f%%)",
cfr_estimate * 100, cfr_lower * 100, cfr_upper * 100),
sprintf("%d days", outbreak_duration)
)
) |>
gt() |>
tab_header(title = "CFR Estimation Results")
Publication-Quality Figures
Multi-Panel with Patchwork
#| label: fig-combined
#| fig-cap: "Overview. (A) Epidemic curve. (B) Rolling CFR."
#| fig-width: 10
#| fig-height: 8
plot_a <- epicurve_plot + labs(tag = "A")
plot_b <- cfr_plot + labs(tag = "B")
combined <- plot_a / plot_b
combined
ggsave(here::here("outputs", "plots", "figure_combined.png"),
combined, width = 10, height = 8, dpi = 300, bg = "white")
Cross-Referencing
As shown in @fig-combined, the outbreak peaked around day 45.
Publication-Quality Tables
#| label: tbl-summary
#| tbl-cap: "Outbreak summary statistics"
summary_stats <- tibble(
Metric = c("Total Cases", "Total Deaths", "Attack Rate"),
Value = c(
format(total_cases, big.mark = ","),
format(total_deaths, big.mark = ","),
sprintf("%.1f%%", attack_rate * 100)
)
)
summary_stats |>
gt() |>
tab_header(
title = "Outbreak Summary",
subtitle = sprintf("Period: %s to %s",
format(min_date, "%Y-%m-%d"),
format(max_date, "%Y-%m-%d"))
)
Citations
BibTeX File (references.bib)
@article{barry2018outbreak,
title={Outbreak of Ebola virus disease},
author={Barry, Ahmadou and others},
journal={The Lancet},
year={2018}
}
Citing in Text
The delay-corrected CFR method [@barry2018outbreak] accounts for
right-censoring bias. All analyses used Epiverse-TRACE [@epiverse2024].
Package Citations
#| label: package-citations
#| echo: false
citation("cfr")
citation("simulist")
Reproducibility Section
Session Information
#| label: session-info
#| echo: false
sessionInfo()
Package Versions
#| label: pkg-versions
#| tbl-cap: "Package versions"
key_packages <- c("simulist", "cleanepi", "cfr", "epiparameter")
tibble(
Package = key_packages,
Version = sapply(key_packages, function(pkg) {
as.character(packageVersion(pkg))
})
) |> gt()
Reproduction Instructions
## Reproducibility
**Requirements**: R ≥ 4.3.0, Quarto ≥ 1.3.0
**Steps**:
```bash
git clone https://github.com/your/repo.git
cd repo
quarto render analysis.qmd
Random Seeds: All operations use set.seed(42)
Analysis Date: r format(Sys.time(), "%Y-%m-%d %H:%M:%S %Z")
## Report Templates
### Quick Start
Use [ebola_analysis.qmd](../../../examples/ebola_outbreak/ebola_analysis.qmd) as template:
```bash
cp examples/ebola_outbreak/ebola_analysis.qmd my_analysis.qmd
cp examples/ebola_outbreak/references.bib my_references.bib
Then customize for your analysis.
Rendering and Publishing
# HTML (self-contained)
quarto render analysis.qmd
# PDF (requires LaTeX)
quarto render analysis.qmd --to pdf
# Word
quarto render analysis.qmd --to docx
# Publish to GitHub Pages
quarto publish gh-pages analysis.qmd
Common Patterns
Pattern 1: Simulated Outbreak Report
- Background → Define disease
- Methods → Describe simulation parameters
- Results → Show dynamics
- Discussion → Compare to historical outbreaks
Pattern 2: Real Data Analysis
- Background → Outbreak context
- Data → Sources and quality
- Methods → Cleaning and analysis
- Results → Estimates with uncertainty
- Discussion → Implications for response
Pattern 3: Scenario Comparison
- Background → Decision context
- Methods → Scenarios tested
- Results → Comparative analysis
- Discussion → Recommendations
Troubleshooting
"Quarto not found"
brew install quarto # macOS
# Or download from https://quarto.org/
"Package not found"
install.packages(c("gt", "knitr", "patchwork"))
"Path errors"
# Use here::here() for portable paths
source(here::here("R", "functions.R"))
"Figures not showing"
- Check chunk produces output
- Verify fig-width and fig-height
- Use
echo: trueto debug
Best Practices Checklist
- Title and metadata complete
- Executive summary clear
- All sections follow structure
- Code chunks have labels
- Figures have captions
- Tables formatted with gt
- ALL numbers use inline R code
- UK English spelling throughout
- Detailed methods in collapsed chunks
- Citations complete
- Session info included
- Renders without errors
Integration with Other Skills
All skills are integrated within the Quarto document, not as separate scripts:
Analysis in Quarto
#| label: estimate-cfr
#| code-fold: true
# Use cfr package directly in code chunk
cfr_results <- cfr_rolling(
data = cleaned_data,
delay_density = delay_func
)
# Store values for inline reporting
cfr_estimate <- cfr_results$severity_estimate
Then use inline R code in narrative: The CFR was `r sprintf("%.1f%%", cfr_estimate * 100)`.
Visualization in Quarto
#| label: plot-epicurve
#| echo: false
#| fig-cap: "Epidemic curve"
# Create plot directly in code chunk
epicurve <- ggplot(cleaned_data, aes(x = date_onset)) +
geom_histogram(binwidth = 7) +
theme_trace()
epicurve # Display plot inline
Note: Optionally save outputs with ggsave() or write_csv() for reuse, but the Quarto document should be self-contained and executable from start to finish.
Summary
When the user requests an analysis, create ONE Quarto document that:
- Contains the complete workflow: All data cleaning, parameter retrieval, analysis, and visualization code in code chunks
- Is executable: Running
quarto render analysis.qmdexecutes all code and generates the report - Follows standard structure: Executive summary → Background → Methods → Results → Discussion → Conclusion
- Is reproducible: Seeds, versions, clear instructions embedded in the document
- Is publication-ready: Professional figures, tables, inline R code for all numbers
- Is self-contained: Single HTML/PDF file with everything needed
Critical Requirements:
- ONE
.qmdfile containing all code and narrative (not separate R scripts) - Inline R code for ALL numbers (
`r variable`) - UK English spelling (analyse, characterise, visualisation)
- Collapsed code for detailed methods (
code-fold: true)
Remember: The Quarto document is both the analysis script AND the final report. Review ebola_analysis.qmd for a complete example.