name: method-finder
description: "Biological experiment method (protocol/SOP) search, comparison, and structured retrieval for synthetic biology and drug discovery labs. Use when the user asks to: find an experimental method or SOP (LC-MS, fermentation, protein expression/purification, enzyme assay, molecular cloning, etc.), compare protocols, look up troubleshooting for a lab method, retrieve step-by-step parameters for a known SOP, search PubMed/Europe PMC/protocols.io for method literature, generate a grounded SOP draft, or check protocol completeness. Supports Chinese and English queries. Enforces compliance: internal full-text is returned; external sources return metadata + abstract + link only."
license: MIT
metadata:
  author: Ficere
  version: "1.0.0"
  repo: https://github.com/Ficere/bioprotocol-finder
Method Finder Skill
When to Use This Skill
Load this skill when the user needs to:
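- Find a protocol: "有没有LC-MS定量的SOP?" ("Is there an SOP for LC-MS quantification?") / "Find me a protocol for enzyme kinetics assay"
- Compare methods: "比较发酵培养基优化和温度分步策略的区别" ("Compare fermentation medium optimization with a stepwise temperature strategy")
- Look up parameters: "INT-FERM-002 的溶氧控制参数范围是多少?" ("What is the dissolved-oxygen control range for INT-FERM-002?")
- Troubleshoot: "蛋白纯化收率低,有哪些可能原因?" ("Protein purification yield is low; what are the possible causes?")
- External literature: "PubMed上有没有关于包涵体复性的近期方法?" ("Are there recent methods on PubMed for inclusion body refolding?")
- Generate SOP draft: "帮我起草一个大肠杆菌重组蛋白表达纯化的SOP" ("Draft an SOP for recombinant protein expression and purification in E. coli")
- Completeness check: "这份实验方案有哪些QC检查点缺失?" ("Which QC checkpoints are missing from this protocol?")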
Architecture Overview
Tool Interface (9 functions)
            ↓
Retrieval Orchestrator (BM25 + Dense + Metadata Filter + RRF)
      ↓                      ↓
Internal Library      External Index Cache
(SQLite FTS5 +        (SQLite + PubMed/
 Chroma vectors)       EuropePMC/protocols.io)
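The orchestrator is described as fusing BM25 and dense rankings with RRF (Reciprocal Rank Fusion). The following is a minimal sketch of that technique, not the repository's actual implementation. The function name, the constant `k=60` (a conventional default for RRF, not stated in this document), and every method ID other than INT-FERM-002 are assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked ID lists (each ordered best-first) into one list.

    Each ID receives 1 / (k + rank) from every list it appears in;
    IDs ranked highly by several retrievers accumulate the largest score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, method_id in enumerate(ranking, start=1):
            scores[method_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so output is deterministic.
    return sorted(scores, key=lambda mid: (-scores[mid], mid))


# Hypothetical rankings from the two retrievers.
bm25_hits = ["INT-FERM-002", "INT-LCMS-001", "INT-PUR-003"]
dense_hits = ["INT-LCMS-001", "INT-PUR-003", "INT-FERM-002"]
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
```

RRF only needs rank positions, so BM25 scores and embedding similarities never have to be normalized onto a common scale.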
Setup (First-Time)
pip install -r requirements.txt
cp .env.example .env # fill in keys (all optional)
python scripts/ingest_internal.py examples/sample_method_card.yaml
python scripts/build_vector_index.py
python examples/quickstart.py
Tool Functions
Load and call from core/tool_interface.py:
from core.tool_interface import (
search_methods, get_method, get_method_steps,
get_method_parameters, compare_methods,
find_troubleshooting, search_external,
generate_sop_draft, check_protocol_completeness,
)
1. search_methods(query, filters, top_k, include_external)
Hybrid BM25 + semantic search across internal library and external cache.
Returns list[MethodSearchResult] with score, source, and license status.
Internal cards appear first.
from core.schema import SearchFilters
results = search_methods(
query="LC-MS 肽段定量",
filters=SearchFilters(method_type=["proteomics_quantification"]),
top_k=10,
)
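The "internal cards appear first" behavior can be pictured as a two-level sort: by source, then by score. The sketch below uses a hypothetical `Hit` dataclass as a stand-in for MethodSearchResult; its field names and the `"internal"` source label are assumptions, not the library's confirmed schema.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical stand-in for MethodSearchResult."""
    method_id: str
    score: float
    source: str  # assumed: "internal" or an external source name


def internal_first(hits):
    """Order hits so internal cards precede external ones, best score first."""
    # False sorts before True, so internal hits (source != "internal" is False) lead.
    return sorted(hits, key=lambda h: (h.source != "internal", -h.score))


hits = [
    Hit("EXT-1", 0.9, "pubmed"),
    Hit("INT-A", 0.4, "internal"),
    Hit("INT-B", 0.7, "internal"),
]
ordered = internal_first(hits)
```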
2. get_method(method_id) -> MethodCard | None
Returns the full internal method card. External entries are not available through this function.
3. get_method_steps(method_id) -> list[Step] | None
Structured step list. step.critical=True marks critical steps.
4. get_method_parameters(method_id) -> list[Parameter] | None
Parameter table with unit, range, adjustable flag, instrument constraints.
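One use of the parameter table is checking planned settings against the documented ranges before a run. The sketch below assumes a simplified `Param` record with `low`/`high` bounds; the real Parameter schema may differ, and the example values are invented for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Param:
    """Hypothetical, simplified stand-in for Parameter."""
    name: str
    unit: str
    low: float
    high: float
    adjustable: bool


def out_of_range(params, planned):
    """Return the names of planned values that fall outside the documented range."""
    problems = []
    for p in params:
        if p.name in planned and not (p.low <= planned[p.name] <= p.high):
            problems.append(p.name)
    return problems


# Invented example values, not taken from any real method card.
params = [
    Param("dissolved_oxygen", "%", 30.0, 60.0, True),
    Param("temperature", "°C", 30.0, 37.0, True),
]
planned = {"dissolved_oxygen": 20.0, "temperature": 37.0}
violations = out_of_range(params, planned)
```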
5. compare_methods(method_ids) -> ComparisonResult
Side-by-side diff across throughput, time, equipment, reagents, steps, etc. Minimum 2 IDs. Internal only — external entries lack full-text for comparison.
6. find_troubleshooting(method_id, symptom) -> list[TroubleshootingEntry]
Keyword-match troubleshooting entries. Falls back to all entries if no match.
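The match-then-fall-back behavior described above can be sketched as follows. This is an illustration under assumptions: entries are modeled as plain dicts with hypothetical `symptom` and `fix` keys, and keywords come from whitespace splitting, which would not segment Chinese text.

```python
def match_troubleshooting(entries, symptom):
    """Return entries whose symptom text contains any query keyword.

    If nothing matches, return every entry so the caller still sees
    the full troubleshooting list.
    """
    keywords = symptom.lower().split()
    matched = [
        e for e in entries
        if any(k in e["symptom"].lower() for k in keywords)
    ]
    return matched or list(entries)


# Hypothetical troubleshooting entries.
entries = [
    {"symptom": "Low yield after elution", "fix": "Check binding buffer pH"},
    {"symptom": "Column clogging", "fix": "Clarify lysate before loading"},
]
yield_hits = match_troubleshooting(entries, "yield")
fallback_hits = match_troubleshooting(entries, "foaming")
```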
7. search_external(query, filters, top_k, use_live_api) -> list[ExternalIndexEntry]
Searches PubMed, Europe PMC, and protocols.io (if token configured). Returns metadata + abstract snippet (≤300 chars) + URL only. Full text must be accessed at the original source URL.
entries = search_external("chaperone refolding inclusion body", top_k=10)
for e in entries:
print(e.title, e.url, e.license_status)
# Do NOT attempt to return full protocol text — redirect to e.url
8. generate_sop_draft(goal, sample_type, equipment, throughput, context_method_ids, language)
LLM-backed grounded SOP generation. Requires LLM_API_KEY.
Sections with is_gap=True lack internal evidence — must be flagged to user.
sop = generate_sop_draft(
goal="重组酶在大肠杆菌中表达和纯化",
sample_type="BL21(DE3) 细胞裂解液",
equipment=["AKTA Pure", "超声破碎仪", "离心机"],
throughput="medium",
language="zh",
)
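Because sections with is_gap=True must be flagged to the user, a caller would typically scan the draft before presenting it. The sketch below assumes the draft exposes a list of section objects with `title`, `is_gap`, and `grounded_from` attributes; only `is_gap` and `grounded_from` are named in this document, and the `Section` class here is a hypothetical stand-in.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """Hypothetical stand-in for one section of a generated SOP draft."""
    title: str
    is_gap: bool
    grounded_from: list = field(default_factory=list)


def gap_warnings(sections):
    """Build one user-facing warning per section that lacks internal evidence."""
    return [
        f"[GAP] '{s.title}' has no internal evidence; verify before use."
        for s in sections
        if s.is_gap
    ]


# Invented sections for illustration.
sections = [
    Section("Cell lysis", is_gap=False, grounded_from=["INT-XXX-001"]),
    Section("Refolding", is_gap=True),
]
warnings_for_user = gap_warnings(sections)
```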
9. check_protocol_completeness(protocol_draft) -> CompletenessReport
QA completeness check. Requires LLM_API_KEY. Checks: controls, QC checkpoints, parameter ranges, method validation, safety, documentation, reagent specs.
Grounding Rules (MUST FOLLOW)
- Internal cards: Return any field freely — this is proprietary internal content.
- External entries: Return title, authors, year, doi, url, abstract_snippet (≤300 chars), and license_status only. Never return or reconstruct full experimental steps from external sources.
- LLM generation: Prohibit hallucinating steps. Every section must cite grounded_from. Mark gaps explicitly.
- License transparency: Always include license_status in every result shown to the user.
- Redirect: For external content, always direct the user to the original url for full protocol access.
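The external-entry rule amounts to a field whitelist plus a length cap on the abstract snippet. A minimal sketch, assuming an entry can be treated as a plain dict (the real ExternalIndexEntry type may differ); the function name is hypothetical, while the field names and the 300-character limit come from the rules above.

```python
ALLOWED_EXTERNAL_FIELDS = (
    "title", "authors", "year", "doi", "url",
    "abstract_snippet", "license_status",
)
MAX_SNIPPET_CHARS = 300


def sanitize_external(entry):
    """Keep only whitelisted fields and cap the abstract snippet length."""
    safe = {k: entry[k] for k in ALLOWED_EXTERNAL_FIELDS if k in entry}
    snippet = safe.get("abstract_snippet")
    if snippet is not None:
        safe["abstract_snippet"] = snippet[:MAX_SNIPPET_CHARS]
    return safe


# Invented record; the "steps" field must never reach the user.
raw = {
    "title": "Example refolding method",
    "url": "https://example.org/protocol",
    "license_status": "unknown",
    "abstract_snippet": "x" * 500,
    "steps": ["step 1", "step 2"],
}
safe = sanitize_external(raw)
```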
Key Configuration
See references/connector_specs.md for full connector details.
| Key | Purpose | Without Key |
|---|---|---|
| OPENAI_API_KEY / EMBEDDING_API_BASE | Embedding API | Falls back to local bge-m3 |
| LLM_API_KEY | SOP generation + completeness check | Tools disabled |
| NCBI_API_KEY | PubMed rate limit boost | 3 req/s (still functional) |
| PROTOCOLSIO_ACCESS_TOKEN | protocols.io search | Source skipped |
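Since every key is optional, it can help to report which features are active before calling the tools. The helper below is a hypothetical sketch that mirrors the table; it is not part of the skill's API, and it assumes that setting either OPENAI_API_KEY or EMBEDDING_API_BASE is enough to select the embedding API.

```python
import os


def feature_status(env=None):
    """Summarize which optional features are active for a given environment."""
    env = os.environ if env is None else env
    has_embedding_api = bool(env.get("OPENAI_API_KEY") or env.get("EMBEDDING_API_BASE"))
    return {
        "embedding": "api" if has_embedding_api else "local bge-m3",
        "llm_tools": "enabled" if env.get("LLM_API_KEY") else "disabled",
        "pubmed": "boosted rate limit" if env.get("NCBI_API_KEY") else "3 req/s",
        "protocols_io": "searched" if env.get("PROTOCOLSIO_ACCESS_TOKEN") else "skipped",
    }


# Example with only the LLM key configured (the value is a placeholder).
status = feature_status({"LLM_API_KEY": "placeholder"})
```

Passing the environment as an argument keeps the check easy to exercise without touching real credentials.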
Adding Internal Method Cards
# data/cards/my_method.yaml
method_id: "INT-XXX-001"
title: "My Protocol"
method_type: "enzyme_assay" # see schema for full enum
# ... (full schema: references/method_card_schema.yaml)
python scripts/ingest_internal.py data/cards/my_method.yaml
python scripts/build_vector_index.py --rebuild
python scripts/validate_cards.py data/cards/ --strict
Compliance Policy
Full policy: references/compliance_policy.md
- Internal cards: full content allowed
- PubMed/Europe PMC: metadata + abstract (API content) + DOI link only
- protocols.io: CC-BY entries may include abstract; others metadata only
- Commercial sites (Bio-protocol, JoVE): DOI redirect only, no programmatic access
- The local_fulltext_allowed field is always set and must be respected
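The per-source rules above can be read as a small decision function. The sketch below is only an illustration of that reading: the source labels, the returned strings, and the choice to treat unknown sources as redirect-only are assumptions, and references/compliance_policy.md remains the authority.

```python
def allowed_content(source, license_status=None):
    """Return the content level permitted for a source under the policy summary."""
    if source == "internal":
        return "full_content"
    if source in ("pubmed", "europe_pmc"):
        return "metadata+abstract+doi_link"
    if source == "protocols_io":
        # Only CC-BY entries may include the abstract.
        return "metadata+abstract" if license_status == "CC-BY" else "metadata_only"
    # Commercial sites (Bio-protocol, JoVE) are redirect-only; unknown sources
    # get the same most restrictive level (an assumption of this sketch).
    return "doi_redirect_only"


levels = {
    "internal": allowed_content("internal"),
    "protocols_io_ccby": allowed_content("protocols_io", "CC-BY"),
    "protocols_io_other": allowed_content("protocols_io", "unknown"),
    "jove": allowed_content("jove"),
}
```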