name: synthesize
description: Read analysis outputs, compare against literature, and draft findings for a project REPORT.md. Use when notebooks have been run and the user wants to interpret results and write up findings.
allowed-tools: Bash, Read, Write, Edit, WebSearch, AskUserQuestion
user-invocable: true
Synthesis Skill
After notebooks have been run, read the outputs, compare against literature, and draft the findings/interpretation sections in the project's REPORT.md. Also update ## Status in README.md to reflect completion.
Usage
/synthesize <project_id>
If no <project_id> argument is provided, detect from the current working directory (if inside projects/{id}/).
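The working-directory fallback can be sketched as below. This is a minimal illustration, assuming POSIX-style paths; `detect_project_id` is a hypothetical helper name, not part of the skill.

```python
from pathlib import PurePosixPath
from typing import Optional


def detect_project_id(cwd: str) -> Optional[str]:
    """Return the {id} segment if cwd is inside projects/{id}/, else None."""
    parts = PurePosixPath(cwd).parts
    # Scan from the right so the innermost "projects" directory wins.
    for i in range(len(parts) - 2, -1, -1):
        if parts[i] == "projects":
            return parts[i + 1]
    return None
```

When this returns `None`, the skill has no project to act on and should ask the user for a `<project_id>`.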
Workflow (Two-Pass Approach)
Pass 1: Read Data and Draft Findings
Step 0: Precondition check
Read projects/{project_id}/beril.yaml and validate status. If beril.yaml is missing (pre-manifest project), skip this check and rely on the file-existence checks below.
- `status: exploration` — stop. Tell the user: "This project is still in exploration — there's no `RESEARCH_PLAN.md` to synthesize against yet. Write the plan first (use `/berdl_start` to resume the workflow), then re-run `/synthesize`."
- `status: proposed` — stop. Tell the user: "This project has a research plan but no analysis yet. Run the analysis notebooks first (Phase C of the workflow) so `/synthesize` has results to interpret. Resume via `/berdl_start`."
- `status: active` — proceed; this is the normal forward path (`active` → `analysis`).
- `status: analysis` — proceed (re-synthesis on a project still pre-review; no status change).
- `status: reviewed` — silent demote. Existing `REVIEW_N.md` files were against the previous `REPORT.md` and will go stale via hash mismatch. No prompt; just inform the user in the agent's reply: "Demoted to `analysis`; existing reviews are now stale. Run `/berdl-review` again before submitting." Then proceed.
- `status: complete` — explicit confirmation prompt before proceeding: "This project is currently complete (approved {approval.at} by {approval.by}). Running /synthesize will overwrite `REPORT.md` and demote the project to `analysis`. The previous approval will be archived under `previous_approvals`. Continue? (y/n)"
  - Yes: move the current `approval` block in `beril.yaml` to `previous_approvals: []` (append) with an added `archived_at: "<now>"` field, set `status: analysis`, delete `projects/{project_id}/REVIEW.md` (the canonical copy of the now-archived review), and delete both `SUBMITTED.md` and `SUBMISSION_FAILED.md` if present (audit lives in `beril.yaml.previous_approvals` and `beril.yaml.submissions[]`). The README Status update happens in Step 7 below as part of the normal `/synthesize` flow. Then proceed.
  - No: abort with no changes.
Synthesizing without analysis outputs (the exploration and proposed cases) produces empty or fabricated findings, which is exactly what the new lifecycle is designed to prevent. The reviewed/complete demotes are how the iteration loop and reopen-and-resubmit flow stay honest.
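The status gate and the approval archive step can be sketched as follows. This is an illustration only: it assumes `beril.yaml` has already been parsed into a plain dict (YAML parsing is left out), the action labels (`stop`, `proceed`, `demote`, `confirm`) and both function names are hypothetical, and raising on an unrecognized status is an assumption the skill does not specify.

```python
from copy import deepcopy
from typing import Optional

# Outcome of the precondition check for each lifecycle status (labels are illustrative).
ACTIONS = {
    "exploration": "stop",
    "proposed": "stop",
    "active": "proceed",
    "analysis": "proceed",
    "reviewed": "demote",
    "complete": "confirm",
}


def precondition_action(manifest: Optional[dict]) -> str:
    """Map a parsed beril.yaml (None when the file is missing) to an action."""
    if manifest is None:
        return "proceed"  # pre-manifest project: rely on file-existence checks
    status = manifest.get("status")
    if status not in ACTIONS:
        # Assumption: unknown statuses are treated as errors.
        raise ValueError(f"unrecognized status: {status!r}")
    return ACTIONS[status]


def archive_approval(manifest: dict, now: str) -> dict:
    """Return a copy with the approval archived and status set to analysis."""
    updated = deepcopy(manifest)
    approval = updated.pop("approval", None)
    previous = list(updated.get("previous_approvals") or [])
    if approval is not None:
        previous.append({**approval, "archived_at": now})
    updated["previous_approvals"] = previous
    updated["status"] = "analysis"
    return updated
```

The sketch covers only the manifest changes. Deleting `REVIEW.md`, `SUBMITTED.md`, and `SUBMISSION_FAILED.md` after a confirmed demote is a separate filesystem step.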
Step 1: Gather Project Context
Read these project files:
- `projects/{project_id}/RESEARCH_PLAN.md` — the hypothesis, expected outcomes, analysis plan (or `research_plan.md` for legacy projects)
- `projects/{project_id}/README.md` — current state of the project (preserve Research Question, Authors)
- `projects/{project_id}/references.md` — existing literature references
If RESEARCH_PLAN.md doesn't exist, check for research_plan.md (legacy). If neither exists, read the README for research question and hypothesis context.
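The fallback order described above might look like this in code. It is a sketch under the assumption that the first existing file wins; `resolve_plan_source` is a hypothetical name.

```python
from pathlib import Path


def resolve_plan_source(project_dir: Path) -> Path:
    """Pick the file that supplies the research question and hypothesis."""
    # Preferred plan first, then the legacy name, then the README as a last resort.
    for name in ("RESEARCH_PLAN.md", "research_plan.md", "README.md"):
        candidate = project_dir / name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"no plan or README found in {project_dir}")
```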
Step 2: Read Analysis Outputs
Scan the project for results:
- CSV files in `projects/{project_id}/data/`:
  - Read each CSV and interpret: column names, row counts, distributions, key statistics
  - Identify the main result variables (correlations, counts, p-values, effect sizes)
- Figures in `projects/{project_id}/figures/`:
  - List available figures and their filenames (infer content from names)
- Notebook outputs in `projects/{project_id}/notebooks/`:
  - If executed `.ipynb` files are present, read output cells for results
  - Look for printed summaries, DataFrames, and statistical test outputs
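A first pass over each CSV could collect the basics listed above (column names, row counts, simple statistics). The sketch below uses only the standard library; `summarize_csv` is a hypothetical helper, and treating a column as numeric only when every non-empty cell parses as a float is an assumption made for illustration.

```python
import csv
import statistics
from typing import TextIO


def summarize_csv(handle: TextIO) -> dict:
    """Summarize one CSV: columns, row count, and basic stats for numeric columns."""
    reader = csv.DictReader(handle)
    rows = list(reader)
    columns = list(reader.fieldnames or [])
    numeric = {}
    for col in columns:
        try:
            # Skip blank or missing cells; any other non-numeric cell disqualifies the column.
            values = [float(r[col]) for r in rows if r[col] not in ("", None)]
        except ValueError:
            continue
        if values:
            numeric[col] = {
                "n": len(values),
                "min": min(values),
                "max": max(values),
                "mean": statistics.fmean(values),
            }
    return {"columns": columns, "rows": len(rows), "numeric": numeric}
```

Open each file in `data/` with `newline=""` (as the `csv` module recommends) and pass the handle in. The summary is a starting point for interpretation, not a substitute for reading the notebook outputs.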
Step 3: Draft Initial Findings
Based on the data, draft findings that address:
- Key results: What did the data show? (specific numbers, correlations, counts)
- Hypothesis outcome: Was H1 supported or H0 not rejected?
- Statistical significance: Report p-values, effect sizes, confidence intervals if available
- Unexpected patterns: Note any surprising results or anomalies
Step 4: Present Draft to User
Show the initial findings interpretation and ask:
- "Does this interpretation look correct?"
- "Are there results I missed or misinterpreted?"
- "Any additional context to include?"
Wait for user feedback and revise if needed.
Pass 2: Literature Cross-Reference and Synthesis
Step 5: Search Literature for Context
Invoke /literature-review to search for papers that:
- Tested similar hypotheses in related organisms
- Used comparable methods or data
- Reported results that align or conflict with the BERDL findings
Focus searches on:
- The specific organisms/taxa analyzed in the project
- The specific biological question (e.g., "pangenome openness environmental adaptation")
- Key methods used (e.g., "partial correlation phylogenetic signal")
Step 6: Compare Findings Against Literature
For each key finding, assess:
| Question | Assessment |
|---|---|
| Does this agree with published work? | Cite supporting papers |
| Does this contradict published work? | Note methodology differences that could explain discrepancies |
| Is this novel? | Identify what BERDL data adds that wasn't known before |
| Are there caveats? | Data coverage, confounders, methodological limitations |
Step 7: Produce Synthesis
Create or update projects/{project_id}/REPORT.md with the following sections:
# Report: {Title}
## Key Findings
### {Finding 1 Title}

{Statistical result with specific numbers}
*(Notebook: {notebook_name}.ipynb)*
### {Finding 2 Title} (if applicable)
{Statistical result}
*(Notebook: {notebook_name}.ipynb)*
## Discoveries
(Optional section — include only if the analysis surfaced non-trivial findings worth elevating across projects. Skip the section entirely if there's nothing material to capture; an absent section is the natural representation of "no claims of this kind." Each entry is a self-contained insight a reader from another project could learn from.)
- {One-line discovery 1, e.g., "Pangenome openness correlates with environmental breadth in soil-associated genera (rho=0.38, p<0.01)."}
- {One-line discovery 2 (if applicable)}
## Performance Notes
(Optional section — include only if this project hit non-obvious query timings, optimizations, or anti-patterns that future projects on similar data should know. Skip the section entirely if there's nothing material to capture.)
- {One-line performance observation, e.g., "Joining `species_pangenome_genes` to `species_function_genes` via `species_id` is 3x faster than via `cluster_id` for queries spanning >100 species."}
## Results
{Detailed results with embedded figures and markdown tables}
## Interpretation
{What the results mean biologically}
### Literature Context
- {Finding} aligns with Author et al. (Year) who found {similar result} in {organism}
- {Finding} contradicts Author et al. (Year) — possible explanation: {methodology difference}
### Novel Contribution
{What BERDL data adds that wasn't known before}
### Limitations
- {Data coverage limitations}
- {Potential confounders}
- {Methodological caveats}
## Data
### Sources
| Collection | Tables Used | Purpose |
|------------|-------------|---------|
| `{collection_id}` | `{table1}`, `{table2}` | {what this data provides} |
### Generated Data
| File | Rows | Description |
|------|------|-------------|
| `data/{filename}.csv` | {row_count} | {what the data contains} |
## Supporting Evidence
### Notebooks
| Notebook | Purpose |
|----------|---------|
| `{filename}.ipynb` | {what the notebook does} |
### Figures
| Figure | Description |
|--------|-------------|
| `{filename}.png` | {what the figure shows} |
## Future Directions
1. {Suggested next step based on findings}
2. {Follow-up analysis addressing limitations}
3. {New questions raised by the results}
## References
- Author et al. (Year). "Title." *Journal*. PMID: {pmid}
Important guidelines for the template:
- Inline figures: Place each figure's image reference near the finding that figure supports. The UI rewrites these paths automatically for web rendering. Every figure in the project's `figures/` directory should appear inline at least once.
- Notebook provenance: End each finding subsection with `*(Notebook: filename.ipynb)*` to trace results back to the analysis code.
- Data section: The `## Data` section documents data lineage. `### Sources` lists BERDL collections and tables queried. `### Generated Data` lists output files with row counts.
- Collection IDs: Use the exact BERDL collection identifier (e.g., `kescience_fitnessbrowser`, `kbase_ke_pangenome`) in the Sources table. These IDs link to the collection detail pages on the Research Observatory, which include citation and attribution information for data providers.
- README update: Ensure the collection IDs appear somewhere in the README.md text so the UI can auto-detect and display Data Collections links on the project page.
- References: Always include references, even for well-known data sources. At minimum cite the primary data sources (e.g., Price et al. 2018 for Fitness Browser, Arkin et al. 2018 for KBase).
- Discoveries / Performance Notes sections: optional. Populate when there's something material to capture. Do not write to per-project memory files directly. These sections flow through `/berdl-review` (the reviewer evaluates them as part of the report), then `/submit` extracts the approved-and-reviewed content into `projects/{project_id}/memories/{discoveries,performance}.md` at approval time. Writing memories at synthesize time would propagate unvetted claims; the approval-gated path keeps OV-ingestible memories tied to content that survived review. If a section has no entries, omit it from REPORT.md entirely (don't write an empty `## Discoveries` heading); `/submit` treats absent and empty sections identically and won't write a memory file.
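Two of the guidelines above lend themselves to a mechanical check before the report is saved: every figure appears inline at least once, and optional sections are omitted when empty. The sketch below assumes figures are embedded with standard markdown image syntax whose target ends in the figure's filename; both function names are hypothetical.

```python
import re
from typing import Iterable, List

# Captures the target of a markdown image reference: ![alt](target ...)
IMAGE_REF = re.compile(r"!\[[^\]]*\]\(([^)\s]+)")


def figures_missing_from_report(report_md: str, figure_files: Iterable[str]) -> List[str]:
    """Return figure filenames that never appear inline in the report text."""
    referenced = {target.rsplit("/", 1)[-1] for target in IMAGE_REF.findall(report_md)}
    return sorted(name for name in figure_files if name not in referenced)


def render_optional_section(heading: str, entries: Iterable[str]) -> str:
    """Render a bullet section, or an empty string when there are no entries."""
    items = [e.strip() for e in entries if e.strip()]
    if not items:
        return ""  # omit the heading entirely instead of writing an empty section
    return f"## {heading}\n" + "\n".join(f"- {item}" for item in items) + "\n"
```

A non-empty result from `figures_missing_from_report` is a prompt to place those figures next to the findings they support, not a reason to drop them.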
Also update projects/{project_id}/README.md:
- After Step 0, status is always either `active` (forward path) or `analysis` (re-synthesis path, possibly after a `reviewed`/`complete` demote). In both cases, set `## Status` to "Analysis — report drafted, awaiting `/berdl-review` and `/submit`." This is honest about the project's state: any prior approval was archived in Step 0, and any prior reviews are stale via hash mismatch.
- Preserve existing `## Research Question` and `## Authors` sections.
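Updating only the Status section while leaving the rest of the README untouched can be sketched with a section-scoped replacement. This assumes `## Status` is a level-2 heading that runs until the next level-2 heading or the end of the file; `set_status_section` is a hypothetical helper, and appending a new section when none exists is an assumption.

```python
import re

# A "## Status" heading and its body, up to the next level-2 heading or end of file.
STATUS_SECTION = re.compile(r"^## Status[ \t]*\n.*?(?=^## |\Z)", re.MULTILINE | re.DOTALL)


def set_status_section(readme_md: str, status_text: str) -> str:
    """Replace the body of ## Status, leaving every other section untouched."""
    replacement = f"## Status\n{status_text}\n\n"
    if STATUS_SECTION.search(readme_md):
        # A function replacement keeps backslashes in status_text literal.
        return STATUS_SECTION.sub(lambda _match: replacement, readme_md, count=1)
    # Assumption: when no Status section exists, append one at the end.
    return readme_md.rstrip("\n") + "\n\n" + replacement
```

Because the pattern stops at the next `## ` heading, `## Research Question` and `## Authors` pass through unchanged.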
Step 7b: Update Manifest
Update projects/{project_id}/beril.yaml (skip silently if the file is missing — pre-manifest project):
- `status`: set to `analysis`. (Step 0 already handled the demote cases for `reviewed`/`complete` and rejected `exploration`/`proposed`, so the only statuses that reach Step 7b are `active` — flipping forward — and `analysis` — already there, idempotent.)
- `artifacts.report`: `true`
- `last_session_at`: current ISO 8601 timestamp
This makes /synthesize self-contained: when invoked directly (outside the /berdl_start orchestration), the manifest reflects the correct state without bypassing the plan-review checkpoint or the proposed → active transition that Phase C owns.
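The three manifest fields can be applied as below. This is a sketch that works on an already-parsed dict, since reading and writing YAML needs a third-party parser that is not shown here; `update_manifest` is a hypothetical name, and raising on an unexpected status is an assumption rather than documented behavior.

```python
from copy import deepcopy
from datetime import datetime, timezone
from typing import Optional


def update_manifest(manifest: Optional[dict], now: Optional[datetime] = None) -> Optional[dict]:
    """Apply the Step 7b fields to a parsed beril.yaml; None means the file is missing."""
    if manifest is None:
        return None  # pre-manifest project: skip silently
    if manifest.get("status") not in ("active", "analysis"):
        # Assumption: any other status means Step 0 was skipped.
        raise ValueError("Step 0 should have resolved this status before Step 7b")
    updated = deepcopy(manifest)
    updated["status"] = "analysis"  # idempotent when already in analysis
    artifacts = dict(updated.get("artifacts") or {})
    artifacts["report"] = True
    updated["artifacts"] = artifacts
    stamp = now or datetime.now(timezone.utc)
    updated["last_session_at"] = stamp.isoformat()  # ISO 8601 with UTC offset
    return updated
```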
Step 8: Update References
Add any new papers found during synthesis to projects/{project_id}/references.md.
If the file doesn't exist, create it following the format from /literature-review.
Step 9: Trigger Pitfall Capture (if needed)
If unexpected data patterns were found during interpretation (missing data, anomalous distributions, coverage gaps), follow the pitfall-capture protocol.
Step 10: Suggest Next Steps
After completing the synthesis, tell the user:
"Findings drafted in
`projects/{project_id}/REPORT.md`. Next steps:
- Review the Key Findings and Interpretation sections.
- Run `/berdl-review` to produce a numbered review of the current report (`REVIEW_N.md`). Iterate as much as you want — each review embeds the report's hash so `/submit` knows which one is current.
- When you're ready to stand behind the project, run `/submit` to approve and archive it to the lakehouse."
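The hash-based staleness check mentioned in that message can be illustrated as follows. The skill does not say which hash function is used or how it is embedded in a review, so SHA-256 over the UTF-8 report text and the `review_hashes` mapping (review filename to embedded hash) are assumptions made for this sketch.

```python
import hashlib
from typing import Dict, List, Tuple


def report_hash(report_text: str) -> str:
    """Hex digest of the report text (SHA-256 is an assumption, not from the skill)."""
    return hashlib.sha256(report_text.encode("utf-8")).hexdigest()


def split_reviews(report_text: str, review_hashes: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Partition reviews into (current, stale) by comparing embedded hashes."""
    digest = report_hash(report_text)
    current = sorted(name for name, h in review_hashes.items() if h == digest)
    stale = sorted(name for name, h in review_hashes.items() if h != digest)
    return current, stale
```

Any edit to `REPORT.md` changes the digest, which is why re-running `/synthesize` makes earlier reviews stale without touching the review files themselves.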
Integration
- Reads from: `data/*.csv`, `figures/`, `notebooks/*.ipynb`, `RESEARCH_PLAN.md`, `references.md`
- Calls: `/literature-review` (for literature comparison)
- Produces: `REPORT.md` (Key Findings, Results, Interpretation, Data, Supporting Evidence, Future Directions, References); updated `README.md` (Status)
- Consumed by: `/berdl-review` (reviewer assesses the findings in REPORT.md), then `/submit` (approval and archiving)
Pitfall Detection
When you encounter errors, unexpected results, retry cycles, performance issues, or data surprises during this task, follow the pitfall-capture protocol. Read .claude/skills/pitfall-capture/SKILL.md and follow its instructions to determine whether the issue should be added to the active project's projects/<id>/memories/pitfalls.md.