name: gi-expression
description: Predict tissue / cell-type expression (log TPM + TPM) from a 9,198 bp TSS-centered DNA sequence using the Genomic Intelligence G0 Expression model, via the hosted /v1/tasks/expression/predict API. The model is conditioned on a free-text cell-type / assay description.
license: MIT
metadata:
  openclaw:
    requires:
      bins:
        - python3
      env: null
      config: null
    always: false
    emoji: 🧪
    homepage: https://docs.genomicintelligence.ai
    os:
      - darwin
      - linux
    install:
      - kind: pip
        package: requests
        bins: null
    trigger_keywords:
      - expression prediction
      - predict expression
      - sequence to expression
      - TPM prediction
      - cell type expression
      - tissue expression
      - RNA-seq prediction
      - gi expression
      - G0 expression
      - genomic intelligence expression
author: ClawBio + Genomic Intelligence
demo_data:
  - path: example_data/expression_hbb_k562.fa
    description: HBB (β-globin) TSS-centered 9,198 bp window, reverse-complemented to gene-sense. K562 is the demo cell context; HBB is highly expressed in K562 erythroleukemia.
dependencies:
  python: '>=3.10'
  packages:
    - requests>=2.31
domain: genomics
endpoints:
  cli: python skills/gi-expression/gi_expression.py --input {input_file} --output {output_dir}
inputs:
  - name: input_file
    type: file
    format:
      - fa
      - fasta
      - fna
    description: Single-record FASTA. The expression model expects exactly 9,198 bp centered on the TSS, gene-sense (RC minus-strand genes).
    required: false
outputs:
  - name: report
    type: file
    format: md
    description: Markdown report with predicted log(TPM+1), TPM, model + timing.
  - name: result
    type: file
    format: json
    description: Full {data, meta} response.
  - name: reproducibility
    type: directory
    description: command.sh + environment.json.
tags:
  - genomics
  - expression
  - RNA-seq
  - TPM
  - sequence-to-expression
  - dna-lm
  - gi-api
version: 0.1.0
🧪 gi-expression
You are gi-expression, a ClawBio agent that calls the Genomic Intelligence sequence-to-expression model. Given a TSS-centered 9,198 bp window and a cell-type description, you return predicted expression as log(TPM+1) and TPM.
⚠️ Remote inference: opt-in required. Unlike most ClawBio skills, this skill uploads your FASTA sequence to the hosted Genomic Intelligence API at https://api.genomicintelligence.ai. Prefer a browser? The same models run interactively at https://genomicintelligence.ai. Do not submit identifiable patient data without an appropriate data-use agreement. Key setup: see Authentication below.
Trigger
Fire this skill when the user says any of:
- "predict expression for this gene / sequence"
- "what's the expression of this region in [cell type]?"
- "sequence-to-expression prediction"
- "TPM prediction", "log TPM prediction"
- "gi-expression", "G0 expression"
Do NOT fire when:
- The user has counts / RNA-seq output and wants differential expression → rnaseq-de
- The user wants tissue annotation / GTEx lookup → use external resources
Why This Exists
- Without it: Sequence-to-expression models (Enformer / Borzoi / G0 Expression) need GPU + private weights + careful 9-kbp windowing.
- With it: One CLI call → expression prediction conditioned on a free-text cell-type description, in <1 s.
- Why ClawBio: The weights are private and hosted. ClawBio adds a reproducibility bundle and chaining (gi-promoter → gi-expression → rnaseq-de interpretation).
API Backed
POST https://api.genomicintelligence.ai/v1/tasks/expression/predict (default model: g0-expression).
Workflow
- Parse: single-record FASTA (must be 9,198 bp, TSS-centered, gene-sense).
- Build options: {"description": "assay term name is polyA plus RNA-seq. biosample summary is Homo sapiens K562."} by default; override via --description "...".
- POST to /v1/tasks/expression/predict.
- Render: report.md (headline log TPM) + result.json + reproducibility/.
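The parse and build steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the skill's actual implementation: the helper names `read_single_fasta` and `build_request`, and the request keys `sequence` and `options`, are hypothetical. Only the endpoint path, the 9,198 bp length, and the default description string come from this document.

```python
EXPECTED_LENGTH = 9_198  # window size stated in this document
ENDPOINT = "https://api.genomicintelligence.ai/v1/tasks/expression/predict"
DEFAULT_DESCRIPTION = (
    "assay term name is polyA plus RNA-seq. "
    "biosample summary is Homo sapiens K562."
)


def read_single_fasta(text: str) -> tuple[str, str]:
    """Return (header, sequence) from FASTA text holding exactly one record."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    headers = [ln for ln in lines if ln.startswith(">")]
    if not lines or not lines[0].startswith(">") or len(headers) != 1:
        raise ValueError("expected a single-record FASTA")
    # Join wrapped sequence lines and normalise case.
    sequence = "".join(lines[1:]).upper()
    return lines[0][1:], sequence


def build_request(sequence: str, description: str = DEFAULT_DESCRIPTION) -> dict:
    """Check the window length locally, then assemble a request body.

    The keys "sequence" and "options" are assumptions for illustration;
    consult the API documentation for the real schema.
    """
    if len(sequence) != EXPECTED_LENGTH:
        raise ValueError(
            f"expected {EXPECTED_LENGTH} bp, got {len(sequence)} bp"
        )
    return {"sequence": sequence, "options": {"description": description}}
```

Checking the length before sending avoids a round trip that would end in a validation error. The POST itself is left out because the authentication header format is not specified here.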
CLI Reference
# Demo: HBB in K562
python skills/gi-expression/gi_expression.py --demo --output /tmp/gi-expression-demo
# Custom cell-type description
python skills/gi-expression/gi_expression.py \
--input my_tss_window.fa \
--description "assay term name is polyA plus RNA-seq. biosample summary is Homo sapiens liver." \
--output report_dir
# Via ClawBio runner
python clawbio.py run gi-expression --demo
Authentication
The skill requires a Genomic Intelligence partner key in GI_API_KEY. Resolution order:
- --api-key <value> CLI flag (explicit override).
- GI_API_KEY environment variable.
- Otherwise: the skill raises a RuntimeError pointing here.
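The resolution order above amounts to a short fallback chain. The sketch below illustrates that logic only; the function name `resolve_api_key` and the error wording are hypothetical and may differ from what the skill does internally.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    cli_value: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the API key using the documented precedence.

    1. An explicit --api-key value wins.
    2. Otherwise fall back to the GI_API_KEY environment variable.
    3. Otherwise raise RuntimeError.
    """
    env = os.environ if env is None else env
    if cli_value:
        return cli_value
    env_value = env.get("GI_API_KEY")
    if env_value:
        return env_value
    raise RuntimeError(
        "No API key found: pass --api-key or set GI_API_KEY "
        "(see the Authentication section)."
    )
```

Passing `env` explicitly keeps the function easy to test without touching the real process environment.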
Quick start: ClawBio hackathon key
A shared hackathon-tier key ships in .env.example at the repo root (50 concurrent / 120 rpm, opt-in only). From wherever the ClawBio files live on your machine:
# Repo root (git clone), or ~/.claude/plugins/cache/clawbio/clawbio/<version>/ for plugin installs
cp .env.example .env
set -a && source .env && set +a
Production / heavier use
Request an individual key at contact@genomicintelligence.ai, then:
export GI_API_KEY=gi_yourkeyhere
Demo
python clawbio.py run gi-expression --demo
The bundled fixture is HBB centered on its canonical TSS, reverse-complemented to gene-sense. With the K562 description, expect ~2.86 log(TPM+1) ≈ 16 TPM (HBB is highly expressed in K562 erythroleukemia).
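The two numbers quoted for the demo are related by a simple transform. The sketch below assumes the log is natural (base e); the document does not state the base, but e^2.86 - 1 is about 16.5, which is consistent with the quoted pair. The function names are hypothetical.

```python
import math


def log1p_tpm_to_tpm(value: float) -> float:
    """Invert log(TPM + 1), assuming a natural logarithm."""
    return math.expm1(value)  # exp(value) - 1, accurate for small values


def tpm_to_log1p_tpm(tpm: float) -> float:
    """Forward transform: log(TPM + 1), assuming a natural logarithm."""
    return math.log1p(tpm)
```

For example, `log1p_tpm_to_tpm(2.86)` gives roughly 16.5 TPM, and `tpm_to_log1p_tpm(0.0)` is exactly 0.0, so unexpressed genes map to zero on the log scale.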
Gotchas
- Sequence length is rigid: 9,198 bp. Any other length fails validation with HTTP 422. Center the window on the TSS.
- Gene-sense is mandatory. Minus-strand genes need reverse-complementing, the same convention as the GI testing fixtures. Without RC, HBB returns ~0.4 log(TPM+1) instead of ~2.86.
- description is required. The model is conditioned on it; "assay term name is polyA plus RNA-seq. biosample summary is Homo sapiens [tissue]." is the canonical format.
- TPM scale is not absolute across tissues. It is useful as a relative ranking within a cell type, not as a precise count prediction.
- Hackathon key is shared; request an individual GI_API_KEY for heavier use.
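The first two gotchas (fixed length, gene-sense orientation) can be handled before a FASTA is ever written. The sketch below cuts a 9,198 bp window around a TSS and reverse-complements it for minus-strand genes. The function names are hypothetical, and the exact centering convention is an assumption: the TSS base is placed at 0-based index 4,599 of the returned window on either strand, which may not match the convention used by the GI fixtures.

```python
WINDOW = 9_198  # window size stated in this document
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T/N)."""
    return seq.translate(_COMPLEMENT)[::-1]


def tss_window(chrom_seq: str, tss_index: int, strand: str) -> str:
    """Cut a WINDOW-bp slice around a 0-based TSS index, in gene-sense.

    Assumed convention: the TSS base ends up at index WINDOW // 2 of the
    result for both strands.
    """
    half = WINDOW // 2
    if strand == "+":
        start = tss_index - half
    elif strand == "-":
        # Shift by one so the TSS lands on the same index after RC.
        start = tss_index - half + 1
    else:
        raise ValueError("strand must be '+' or '-'")
    end = start + WINDOW
    if start < 0 or end > len(chrom_seq):
        raise ValueError("window runs off the end of the sequence")
    window = chrom_seq[start:end]
    return reverse_complement(window) if strand == "-" else window
```

Raising on a truncated window, rather than padding, keeps the failure local instead of surfacing later as a length validation error from the API.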
Output Structure
output_dir/
├── report.md
├── result.json
└── reproducibility/
    ├── command.sh
    └── environment.json
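When chaining this skill into a larger pipeline, it can help to confirm that a run produced the layout shown above before reading any files. A minimal check, assuming only the file names listed in the tree (the function name `missing_outputs` is hypothetical):

```python
from pathlib import Path

# Relative paths taken from the output tree in this document.
EXPECTED_OUTPUTS = (
    "report.md",
    "result.json",
    "reproducibility/command.sh",
    "reproducibility/environment.json",
)


def missing_outputs(output_dir: str) -> list[str]:
    """Return the expected files that are absent from output_dir."""
    root = Path(output_dir)
    return [rel for rel in EXPECTED_OUTPUTS if not (root / rel).is_file()]
```

An empty list means the bundle is complete; anything else names the files that a downstream step should not assume exist.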
Integration with Bio Orchestrator
Routes here on: "predict expression", "sequence to expression", "TPM prediction", "cell-type expression".
Chains with: gi-promoter → gi-expression (validate predicted promoters by predicting downstream expression), rnaseq-de (compare predicted expression to measured DE results), variant-annotation (compare ref/alt sequence expression for promoter / 5'UTR variants).
Safety
Research tool. Not a clinical assay. Predictions are model outputs, not measurements.