---
name: workflow_skill_creator
description: Distills a completed user workflow or interaction into a reusable VenusFactory agent skill. Use when the user says "make this a skill", "create a skill from what we just did", "package this workflow" or similar. Adapts the workflow into the VenusFactory tools wiring + SKILL.md pattern. Do not use for creating skills from scratch without an existing workflow.
license: Apache-2.0 (adapted from google-deepmind/science-skills)
metadata:
  skill-author: VenusFactory2 (adapted from Google DeepMind)
---
Workflow-to-Skill Distiller
Turns a completed VenusFactory workflow into a reusable agent skill. Extracts patterns from an interaction that already happened and packages them following the project's conventions.
[!CAUTION] You MUST complete Phase 1 (Brainstorming) before writing any code or SKILL.md content. Skipping brainstorming produces skills that are either too rigid or too vague.
Phase 1: Brainstorming (MANDATORY)
Have an iterative back-and-forth conversation with the user. Do NOT ask all questions at once. Pick 2-3 relevant questions per round, refine your understanding, and ask follow-ups.
Round 1: Understand the Workflow
Start by summarizing what you observed, then ask:
- "Here's my understanding of the workflow: [summary]. Is this accurate?"
- "What are the expected inputs and outputs?"
- "How often will this run? One-off, recurring, or part of a larger pipeline?"
Round 2: Flexibility and Error Handling
For each step:
- "If [step X] fails (API down, no results), should the agent (a) ask for guidance, (b) try alternatives automatically, or (c) fail loudly?"
- "Are there steps where the exact method matters (must use a specific database / model), vs. steps where any reasonable approach is fine?"
Round 3: Reuse Existing VenusFactory Tools
Before asking these questions, check src/tools/database/, src/tools/visualize/, src/tools/predict/, src/tools/mutation/, src/tools/search/, src/tools/train/, src/tools/file/, src/tools/bioinfo/ for overlap. If an existing tool covers a step, the new skill MUST reference it — do not reimplement.
- "I noticed the workflow uses [tool X, tool Y] that already exist. The new skill will reference these. Anything else to incorporate?"
- "Are there rate limits for new external APIs that aren't covered?"
- "Any reference docs (API specs, papers, datasets) I should include under `references/`?"
Round 4: Scope, Code, and Naming
- "Should the skill cover [X, Y, Z] from the workflow, or include/exclude anything?"
- Determine whether the skill needs new code:
  - Needs new code if any step calls an external API not yet wrapped, processes files, or computes results not already in `src/tools/`. → Follow the 6-file wiring recipe below.
  - Instruction-only if every step orchestrates existing VenusFactory tools. → Write SKILL.md only, no Python.
- Propose a name: `{verb}_{noun}_database` for new DB wrappers, `{noun}` for analysis/visualization, matching existing folders (`alphafold_database`, `pymol`, `clustalo_msa`, etc.).
Round 5: Validation (Optional)
- "Do you have a sample query and expected answer I can use to verify the skill? Optional but helpful."
Brainstorming Completion Criteria
Move to Phase 2 only when you can answer:
- Purpose and scope
- Inputs and outputs
- Strict vs flexible steps
- Which existing VenusFactory tools are reused
- What new scripts (if any) are needed
- Rate limits
- Error handling strategy
- Code needed (→ 6-file wiring) or instruction-only (→ SKILL.md only)
- Sample query/answer (optional; see Round 5)
Phase 2: Skill Design
Produce a design document (markdown plan) and present it for approval:
- Skill name and frontmatter (see Rule 4).
- Directory structure showing all planned files.
- Existing VenusFactory tools referenced with rationale.
- New code files with proposed function signatures.
- Rate limiting strategy for any new external API.
- Error handling strategy per step.
Wait for explicit user approval before Phase 3.
Phase 3: Implementation
Guiding Principles
- Match the project's existing Python style: use the conda env at `~/miniconda3/envs/venus/bin/python` (see `environment.yaml`); do NOT introduce `uv` or per-script `# /// script` headers.
- Prefer `requests` + `urllib3.util.retry.Retry` for HTTP (this is the established pattern, e.g. `src/tools/database/alphafold/alphafold_structure.py`).
- All public download/query functions return a JSON string with the rich envelope:
  - Success: `{"status": "success", "file_info": {...} | "content": "...", "content_preview": str, "biological_metadata": {...}, "execution_context": {...}}`
  - Error: `{"status": "error", "error": {"type", "message", "suggestion"}, "file_info": null}`
- Use `to_client_file_path()` from `src/tools/path_sanitizer.py` for any `file_path` in responses.
- Document rate limits in comments and respect them with `time.sleep` between polls.
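A minimal sketch of the two envelope shapes listed above. The helper names `success_response` and `error_response` are hypothetical stand-ins, not the project's actual `_error_response()` / `_download_success_response()` builders; only the field names are taken from the envelope description.

```python
import json
from typing import Optional


def success_response(
    file_info: Optional[dict] = None,
    content: Optional[str] = None,
    content_preview: str = "",
    biological_metadata: Optional[dict] = None,
    execution_context: Optional[dict] = None,
) -> str:
    """Build the success envelope as a JSON string (hypothetical helper)."""
    envelope = {"status": "success"}
    # The envelope carries either file_info (file-first tools) or inline content.
    if file_info is not None:
        envelope["file_info"] = file_info
    else:
        envelope["content"] = content or ""
    envelope["content_preview"] = content_preview
    envelope["biological_metadata"] = biological_metadata or {}
    envelope["execution_context"] = execution_context or {}
    return json.dumps(envelope)


def error_response(error_type: str, message: str, suggestion: str) -> str:
    """Build the error envelope as a JSON string (hypothetical helper)."""
    return json.dumps(
        {
            "status": "error",
            "error": {
                "type": error_type,
                "message": message,
                "suggestion": suggestion,
            },
            # Serialized as JSON null, matching the error shape above.
            "file_info": None,
        }
    )
```

Returning a string rather than a dict keeps the agent, API, and MCP wrappers interchangeable, since each can pass the same value through unchanged.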
Rule 1: Reuse Existing Tools
When the workflow uses functionality already in src/tools/, the new SKILL.md MUST reference the existing tool by name in its ## Project Tools table — do not duplicate the implementation.
Rule 2: Rate Limiting for New APIs
For any new external API, the implementation MUST handle rate limits:
- Look up the API's documented rate limit; if undocumented, default to 1 req/s.
- For submit-poll-download APIs (EBI, ColabFold, Foldseek webserver), poll interval ≥ 5 s with a sensible upper-bound timeout (5–10 min).
- Use `time.monotonic()` for timing.
- Retry transient 5xx + 429 with exponential backoff (existing pattern: `Retry(total=5, backoff_factor=1, status_forcelist=[429, 500, 502, 503, 504])`).
- On non-retriable 4xx, include the response body in the error message: bodies contain actionable details (e.g. "invalid accession") that let the agent self-correct.
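The polling rules above can be sketched with the standard library alone. This is an illustration under stated assumptions: `poll_until_done` and its `check_status` callable are hypothetical names, and the defaults (5 s interval, 10 min timeout) are simply the bounds quoted in the list. The clock and sleep functions are injectable so the loop can be exercised without real waiting.

```python
import time
from typing import Callable, Optional


def poll_until_done(
    check_status: Callable[[], Optional[str]],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call check_status() until it returns a result or timeout_s elapses.

    check_status is a hypothetical callable: it returns None while the
    remote job is still running and a result string once it has finished.
    """
    # time.monotonic() is unaffected by wall-clock adjustments, so the
    # deadline cannot jump if the system time changes mid-poll.
    deadline = clock() + timeout_s
    while True:
        result = check_status()
        if result is not None:
            return result
        # Give up if the next poll would land after the deadline.
        if clock() + interval_s > deadline:
            raise TimeoutError(f"job not finished after {timeout_s:.0f} s")
        sleep(interval_s)
```

In a real wrapper, `check_status` would issue the status request through a session configured with the `Retry` settings shown above, so transient 5xx and 429 responses are retried before the poll loop ever sees them.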
Rule 3: The 6-File Wiring Recipe (When Code Is Needed)
To add a new database, visualization, or other tool to VenusFactory, touch exactly these files in this order:
1. `src/tools/<category>/<name>/__init__.py`: export the public function(s).
2. `src/tools/<category>/<name>/<name>_operations.py`: copy the `_error_response()` / `_download_success_response()` envelope builders from `src/tools/database/alphafold/alphafold_operations.py`, then implement the entry function returning the JSON string.
3. `src/tools/database/tools_agent.py` (or category equivalent): add `from .<name> import ...`, define `class <Tool>Input(BaseModel)`, write the `@tool("…", args_schema=<Tool>Input)` wrapper with a try/except, and append to `DATABASE_TOOLS`.
4. `src/tools/database/tools_api.py`: add a `@router.get` / `@router.post` route calling the same core function.
5. `src/tools/database/tools_mcp.py`: add a `@mcp.tool(name="…")` wrapper (no Pydantic schema; positional + typed args).
6. `src/tools/tools_agent_hub.py`: add the new `_tool` name to (a) the top `from tools.database.tools_agent import (...)` block, (b) the `DATABASE_TOOLS` list, and (c) the `__all__` re-export list.
After wiring, verify:
~/miniconda3/envs/venus/bin/python -c "from src.tools.tools_agent_hub import get_tools; print(len(get_tools()))"
The count must have increased by the number of new tools.
Rule 4: SKILL.md Goes in src/agent/skills/<name>/
Every skill needs a SKILL.md with the VenusFactory frontmatter:
---
name: <snake_case_name>
description: <≤500 chars; lead with capability, then "Use when...", then "Don't use for...">
license: <e.g. Apache-2.0 / Unknown>
metadata:
  skill-author: VenusFactory2
---
# <Skill Title>
## Overview
<1 paragraph>
## Project Tools (VenusFactory2)
| Tool | Args | Returns | Description |
|------|------|---------|--------------|
| **<tool_name_in_agent>** | <required + defaults> | <JSON envelope shape> | <one-line purpose> |
## When to Use This Skill
<bulleted scenarios>
## Common Mistakes
<2-3 pitfalls>
Optional: a references/ subdirectory for API specs, recipe books, or sample payloads.
Rule 5: Instruction-Only Pattern (No New Code)
If the workflow purely orchestrates existing VenusFactory tools, skip steps 1–6 of Rule 3 and write only src/agent/skills/<name>/SKILL.md. Use a clear ## Workflow section:
## Workflow
### 1. Step Name
- Description
- Which existing tool to call (e.g. `download_alphafold_structure_by_uniprot_id`)
- How to chain its output to the next step
### 2. Next Step
...
Rule 6: File-First Output
All download tools MUST write large payloads to disk and return only {file_path}. Never return raw PDB / mmCIF / large JSON in the response — it explodes the agent's context window. The corresponding analyze_* / read_* tools then read from those files.
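As a rough sketch of the file-first rule, the function below writes a payload to disk and returns only a small envelope pointing at the file. The name `save_payload_file_first` and the `size_bytes` field are hypothetical; a real VenusFactory tool would also pass the path through `to_client_file_path()`, which is omitted here to keep the example self-contained.

```python
import json
import tempfile
from pathlib import Path


def save_payload_file_first(payload: str, filename: str, out_dir: str) -> str:
    """Write payload to out_dir/filename and return a small JSON envelope.

    Hypothetical helper: the full payload never appears in the response,
    only its location and a short preview.
    """
    target = Path(out_dir) / filename
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(payload, encoding="utf-8")
    return json.dumps(
        {
            "status": "success",
            "file_info": {
                "file_path": str(target),
                # size_bytes is an assumed convenience field, not a documented one.
                "size_bytes": target.stat().st_size,
            },
            # A bounded preview keeps the response small regardless of payload size.
            "content_preview": payload[:200],
        }
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(save_payload_file_first("ATOM " * 1000, "example.pdb", tmp))
```

The response size stays roughly constant however large the structure file is, which is the property the rule is protecting.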
Phase 4: Validation
After implementation:
- Import smoke test: `python -c "from src.tools.<category>.<name> import <fn>"`
- Hub count check: confirm `get_tools()` count increased.
- Manual invocation through `chat_agent.py` with a prompt that should trigger the new skill.
- Sample query from Round 5 (if provided) end-to-end.
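The import smoke test can also be scripted so it reports a result instead of raising. This is a sketch only: `import_smoke_test` is a hypothetical helper, and the module path and function name are whatever the new tool actually exposes.

```python
import importlib


def import_smoke_test(module_path: str, function_name: str) -> bool:
    """Return True if module_path imports and exposes a callable function_name."""
    try:
        module = importlib.import_module(module_path)
    except ImportError:
        # Covers ModuleNotFoundError as well as failures in the module's own imports.
        return False
    return callable(getattr(module, function_name, None))
```

For a new tool, the call would take the form `import_smoke_test("src.tools.<category>.<name>", "<fn>")` with the placeholders filled in, run from the repository root with the project's interpreter.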
References
- `references/cli_script_template.py`: original google-deepmind template, kept for reference (NOT used directly in VenusFactory; we don't use `uv` / inline scripts).
- Canonical VenusFactory wiring example: `src/tools/database/alphafold/` end-to-end (see also `alphafold_analyze.py` for a pure-local-analysis pattern, no external API).
- Canonical submit-poll-download example: `src/tools/database/foldseek/`.