gsmm-builder


Build or load a genome-scale metabolic model (GSMM) using COBRApy. Covers loading models from BiGG, constructing minimal models from scratch, setting medium constraints, and exporting validated .json model files.

By aiming-lab. Updated 5/20/2026

name: gsmm-builder
description: >
  Build or load a genome-scale metabolic model (GSMM) using COBRApy.
  Covers loading models from BiGG, constructing minimal models from scratch,
  setting medium constraints, and exporting validated .json model files.
metadata:
  category: domain
  trigger-keywords: "metabolic,metabolism,GSMM,COBRApy,COBRA,BIGG,genome-scale model,stoichiometric model,model loading,medium constraints"
  applicable-stages: "9,10,11,12,13"
  priority: "2"

Overview

The gsmm-builder skill constructs or loads genome-scale metabolic models (GSMMs) in the COBRApy framework. It is the entry point for every metabolic flux analysis pipeline. Its output is a validated COBRApy Model object, serialized to a JSON file that is ready for downstream FBA and flux analysis.

A GSMM encodes the known metabolic reactions of an organism as a stoichiometric matrix, with one row per metabolite and one column per reaction. Under the steady-state assumption (no metabolite accumulates or is depleted), constraints (reaction bounds, medium composition, objective function) turn the model into a solvable linear program.
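The linear program referred to above is usually written in the following form. This is the textbook flux balance analysis formulation rather than anything specific to this skill, and the symbols S, v, c, l, and u are notation introduced here for illustration.

```latex
\[
\begin{aligned}
\max_{v} \quad & c^{\top} v \\
\text{subject to} \quad & S v = 0, \\
& l \le v \le u
\end{aligned}
\]
```

Here S is the stoichiometric matrix (metabolites by reactions), v is the vector of reaction fluxes, c selects the objective (for example, 1 for the biomass reaction and 0 elsewhere), and l and u hold the lower and upper bounds. The growth medium enters through the lower bounds of the exchange reactions.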


Workflow

Step 1 — Decide: Load Existing or Build from Scratch

Option A: Load a curated BiGG model

import cobra
import cobra.io

# Load E. coli iJO1366 from a local SBML file
model = cobra.io.read_sbml_model("iJO1366.xml")

# Or load from a pre-downloaded JSON file
model = cobra.io.load_json_model("iJO1366.json")

print(f"Loaded {model.id}: {len(model.reactions)} reactions, "
      f"{len(model.metabolites)} metabolites, {len(model.genes)} genes")
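When the file format is not known in advance, the loader can be chosen from the file suffix. The following is a minimal sketch: the helper names loader_name_for and load_model and the suffix table are hypothetical, while the cobra.io function names it maps to are the library's loaders. COBRApy is imported lazily so that the suffix lookup can be tried without it installed.

```python
from pathlib import Path

# Map file suffixes to the name of the matching cobra.io loader function.
LOADERS = {
    ".xml": "read_sbml_model",
    ".sbml": "read_sbml_model",
    ".json": "load_json_model",
    ".mat": "load_matlab_model",
    ".yml": "load_yaml_model",
    ".yaml": "load_yaml_model",
}


def loader_name_for(path):
    """Return the cobra.io loader name for a model file, based on its suffix."""
    suffix = Path(path).suffix.lower()
    if suffix not in LOADERS:
        raise ValueError(f"Unsupported model file type: {suffix!r}")
    return LOADERS[suffix]


def load_model(path):
    """Load a model with the loader that matches the file suffix."""
    import cobra.io  # lazy import: only needed when a model is actually loaded

    loader = getattr(cobra.io, loader_name_for(path))
    return loader(str(path))


sbml_loader = loader_name_for("iJO1366.xml")
json_loader = loader_name_for("iJO1366.json")
print(sbml_loader)  # read_sbml_model
print(json_loader)  # load_json_model
```

Compressed files such as iJO1366.xml.gz are not handled by this sketch, because only the last suffix is inspected.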

Key BiGG model IDs:

  • iJO1366: E. coli K-12 MG1655 (2583 reactions)
  • Recon3D: Homo sapiens (13543 reactions)
  • iMM904: S. cerevisiae (1577 reactions)
  • iNJ661: M. tuberculosis (1049 reactions)

Option B: Build a minimal model from scratch

from cobra import Model, Metabolite, Reaction

model = Model("toy_glycolysis")

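# Define metabolites with formula, name, and compartment
glc_e = Metabolite("glc__D_e", formula="C6H12O6", name="D-Glucose",
                   compartment="e")
glc_c = Metabolite("glc__D_c", formula="C6H12O6", name="D-Glucose",
                   compartment="c")
atp_c = Metabolite("atp_c", formula="C10H12N5O13P3", name="ATP",
                   compartment="c")
# Pseudo-metabolite, so it carries no formula
biomass = Metabolite("biomass", formula="", name="Biomass", compartment="c")

# Build reactions
# Exchange: glc__D_e <=> (nothing). The metabolite sits on the left-hand side
# (coefficient -1), so a negative flux brings glucose into the system.
ex_glc = Reaction("EX_glc__D_e")
ex_glc.lower_bound = -10.0  # uptake (negative = import)
ex_glc.upper_bound = 0.0
ex_glc.add_metabolites({glc_e: -1.0})

# Reversible transport between extracellular space and cytosol
transport = Reaction("GLCt")
transport.lower_bound = -1000.0
transport.upper_bound = 1000.0
transport.add_metabolites({glc_e: -1.0, glc_c: 1.0})

# Stoichiometry: 1 glucose -> 2 ATP (simplified glycolysis; ADP and other
# cofactors are omitted, so this toy reaction is not mass-balanced)
glycolysis = Reaction("GLYCOLYSIS")
glycolysis.lower_bound = 0.0
glycolysis.upper_bound = 1000.0
glycolysis.add_metabolites({glc_c: -1.0, atp_c: 2.0})

biomass_rxn = Reaction("BIOMASS")
biomass_rxn.lower_bound = 0.0
biomass_rxn.upper_bound = 1000.0
biomass_rxn.add_metabolites({atp_c: -10.0, biomass: 1.0})

# Demand reaction that drains the biomass pseudo-metabolite. Without it,
# the steady-state constraint would force the BIOMASS flux to zero.
dm_biomass = Reaction("DM_biomass")
dm_biomass.lower_bound = 0.0
dm_biomass.upper_bound = 1000.0
dm_biomass.add_metabolites({biomass: -1.0})

model.add_reactions([ex_glc, transport, glycolysis, biomass_rxn, dm_biomass])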
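Before handing a hand-built network to a solver, it can help to check by hand that a candidate flux distribution balances every metabolite. The sketch below does this with plain dictionaries instead of COBRApy objects. The names STOICH and net_production are hypothetical, the exchange is written BiGG-style (coefficient -1, so a negative flux is uptake), and the biomass pseudo-metabolite is omitted for brevity.

```python
# Stoichiometry as {reaction_id: {metabolite_id: coefficient}}.
# Negative coefficients are consumed, positive ones are produced.
STOICH = {
    "EX_glc__D_e": {"glc__D_e": -1.0},
    "GLCt": {"glc__D_e": -1.0, "glc__D_c": 1.0},
    "GLYCOLYSIS": {"glc__D_c": -1.0, "atp_c": 2.0},
    "BIOMASS": {"atp_c": -10.0},
}


def net_production(stoich, fluxes):
    """Return the net production rate of each metabolite (one entry of S v)."""
    net = {}
    for rxn_id, coefficients in stoich.items():
        for met_id, coefficient in coefficients.items():
            net[met_id] = net.get(met_id, 0.0) + coefficient * fluxes[rxn_id]
    return net


# Candidate fluxes: import 10 glucose, convert to 20 ATP, spend it on biomass.
fluxes = {
    "EX_glc__D_e": -10.0,
    "GLCt": 10.0,
    "GLYCOLYSIS": 10.0,
    "BIOMASS": 2.0,
}

balance = net_production(STOICH, fluxes)
is_steady_state = all(abs(rate) < 1e-9 for rate in balance.values())
print(balance)
print(is_steady_state)  # True
```

With a glucose uptake limit of 10 and 10 ATP per unit of biomass, a BIOMASS flux of 2 is the most this toy network can balance.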

Step 2 — Set the Objective Function

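# Set biomass as the optimization target.
# This ID is the core biomass reaction of iJO1366; for the toy model
# from Option B, use "BIOMASS" instead.
model.objective = "BIOMASS_Ec_iJO1366_core_53p95M"  # reaction ID string

# Verify that the objective is set
print(model.objective.expression)
print(model.objective.direction)  # "max" means the flux is maximized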
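If the biomass reaction ID of a loaded model is not known, candidates can be found by searching reaction IDs for the substrings this document treats as the biomass convention (BIOMASS or Growth). This is a minimal sketch; find_biomass_candidates is a hypothetical helper, and the ID list is a hand-written stand-in for a real model.

```python
def find_biomass_candidates(reaction_ids):
    """Return reaction IDs that contain 'biomass' or 'growth' (case-insensitive)."""
    keywords = ("biomass", "growth")
    return [
        rxn_id
        for rxn_id in reaction_ids
        if any(keyword in rxn_id.lower() for keyword in keywords)
    ]


# Stand-in list of IDs; with COBRApy this would be
# (rxn.id for rxn in model.reactions).
example_ids = [
    "PGI",
    "BIOMASS_Ec_iJO1366_core_53p95M",
    "BIOMASS_Ec_iJO1366_WT_53p95M",
    "EX_glc__D_e",
    "Growth",
]

candidates = find_biomass_candidates(example_ids)
print(candidates)
```

A model can contain more than one match, as in the stand-in list, so the result is a shortlist to inspect rather than a final choice.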

Step 3 — Define the Growth Medium

# M9 minimal medium with glucose (aerobic)
M9_MEDIUM = {
    "EX_glc__D_e": -10.0,   # glucose uptake, mmol/gDW/h
    "EX_o2_e":    -20.0,    # oxygen (aerobic)
    "EX_nh4_e":  -1000.0,   # ammonium (unlimited)
    "EX_pi_e":   -1000.0,   # phosphate (unlimited)
    "EX_so4_e":  -1000.0,   # sulfate (unlimited)
    "EX_h2o_e":  -1000.0,   # water (unlimited)
    "EX_h_e":    -1000.0,   # protons (unlimited)
}

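# Apply medium: override the lower bound of each listed exchange reaction.
# Exchanges that are not listed keep their existing bounds; nothing is closed.
for rxn_id, lb in M9_MEDIUM.items():
    if rxn_id in model.reactions:
        model.reactions.get_by_id(rxn_id).lower_bound = lb

# model.medium reports the open uptakes as {exchange_id: uptake limit},
# with the limits given as positive magnitudes
print(model.medium)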

# Anaerobic: set O2 uptake to zero
# model.reactions.get_by_id("EX_o2_e").lower_bound = 0.0
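To enforce a strictly defined medium, every exchange uptake has to be closed before the selected ones are opened. The sketch below shows one way to do that. apply_strict_medium is a hypothetical helper that only assumes the model exposes an exchanges collection whose items have id and lower_bound attributes, as COBRApy models do; the demo uses simple stand-in objects rather than a real model. Note that a strict medium may leave a curated model unable to grow if its biomass reaction needs nutrients (for example trace ions) that the dictionary does not list.

```python
from types import SimpleNamespace


def apply_strict_medium(model, uptake_bounds):
    """Close all exchange uptakes, then open only the listed ones.

    uptake_bounds maps exchange reaction IDs to lower bounds (<= 0).
    Returns the requested IDs that the model does not contain.
    """
    exchanges = {rxn.id: rxn for rxn in model.exchanges}

    # Block uptake everywhere; upper bounds (secretion) are left untouched.
    for rxn in exchanges.values():
        rxn.lower_bound = 0.0

    missing = []
    for rxn_id, lower_bound in uptake_bounds.items():
        if lower_bound > 0:
            raise ValueError(f"{rxn_id}: uptake bound must be <= 0")
        if rxn_id in exchanges:
            exchanges[rxn_id].lower_bound = lower_bound
        else:
            missing.append(rxn_id)
    return missing


# Stand-in objects for demonstration only, not a COBRApy model.
demo_exchanges = [
    SimpleNamespace(id="EX_glc__D_e", lower_bound=-1000.0, upper_bound=1000.0),
    SimpleNamespace(id="EX_fru_e", lower_bound=-1000.0, upper_bound=1000.0),
]
demo_model = SimpleNamespace(exchanges=demo_exchanges)

missing_ids = apply_strict_medium(
    demo_model, {"EX_glc__D_e": -10.0, "EX_o2_e": -20.0}
)
print(missing_ids)
print([(rxn.id, rxn.lower_bound) for rxn in demo_exchanges])
```

COBRApy also accepts a dictionary assigned to model.medium, in which case the uptake limits are given as positive magnitudes.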

Step 4 — Export Model

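import os
import cobra.io

# Create the output directory if it does not exist yet
os.makedirs("output", exist_ok=True)

cobra.io.save_json_model(model, "output/my_model.json")
cobra.io.write_sbml_model(model, "output/my_model.xml")
print("Model saved.")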
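A quick way to confirm that an export succeeded is to read the JSON file back with the standard library and count its entries. This is a sketch under an assumption: it expects the top-level keys id, reactions, metabolites, and genes in the exported file. summarize_model_json is a hypothetical helper, and the demo file is a hand-written stand-in, not a real exported model.

```python
import json
import tempfile
from pathlib import Path


def summarize_model_json(path):
    """Count reactions, metabolites, and genes in a model JSON file."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    return {
        "id": data.get("id"),
        "reactions": len(data.get("reactions", [])),
        "metabolites": len(data.get("metabolites", [])),
        "genes": len(data.get("genes", [])),
    }


# Stand-in content for demonstration; key names are an assumption.
demo_content = {
    "id": "toy_glycolysis",
    "reactions": [{"id": "GLCt"}, {"id": "BIOMASS"}],
    "metabolites": [{"id": "glc__D_c"}, {"id": "atp_c"}, {"id": "biomass"}],
    "genes": [],
}

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "toy_model.json"
    demo_path.write_text(json.dumps(demo_content), encoding="utf-8")
    summary = summarize_model_json(demo_path)

print(summary)
```

Comparing these counts with len(model.reactions) and len(model.metabolites) before export gives a cheap consistency check.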

Key Conventions

| Convention | Detail |
|---|---|
| Metabolite ID format | <bigg_id>_<compartment>, e.g. glc__D_c, atp_m |
| Compartment codes | c cytosol, e extracellular, p periplasm, m mitochondria, n nucleus |
| Exchange reaction prefix | EX_, e.g. EX_glc__D_e |
| Transport reaction suffix | t, often followed by a mechanism tag, e.g. GLCt, GLCpts |
| Uptake bound sign | Negative lower bound = import (e.g. lb = -10) |
| Secretion bound sign | Positive upper bound = export (e.g. ub = 1000) |
| Biomass objective | Reaction with ID containing BIOMASS or Growth |
| Aerobic medium | EX_o2_e lb = -20 (mmol/gDW/h) |
| Anaerobic medium | EX_o2_e lb = 0 |
| Default irreversible rxn | lb = 0, ub = 1000 |
| Default reversible rxn | lb = -1000, ub = 1000 |
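The naming and sign conventions listed above can be expressed as a few small helpers. This is an illustrative sketch: split_metabolite_id, is_exchange, and allows_uptake are hypothetical names, and the parsing assumes the compartment code is whatever follows the last underscore.

```python
def split_metabolite_id(met_id):
    """Split '<bigg_id>_<compartment>' into (bigg_id, compartment)."""
    base, separator, compartment = met_id.rpartition("_")
    if not separator or not base or not compartment:
        raise ValueError(f"No compartment suffix in {met_id!r}")
    return base, compartment


def is_exchange(rxn_id):
    """Exchange reactions carry the EX_ prefix."""
    return rxn_id.startswith("EX_")


def allows_uptake(lower_bound):
    """A negative lower bound on an exchange reaction permits import."""
    return lower_bound < 0


glucose = split_metabolite_id("glc__D_c")
atp = split_metabolite_id("atp_m")
print(glucose)  # ('glc__D', 'c')
print(atp)      # ('atp', 'm')
print(is_exchange("EX_glc__D_e"), is_exchange("GLCt"))  # True False
print(allows_uptake(-10.0), allows_uptake(0.0))         # True False
```

Splitting on the last underscore matters because BiGG IDs such as glc__D_c contain underscores of their own.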

BiGG Database Downloads

# Download model files directly from the BiGG static file server
curl -O "http://bigg.ucsd.edu/static/models/iJO1366.json"
curl -O "http://bigg.ucsd.edu/static/models/Recon3D.json"
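The same download can be scripted in Python with the standard library. The sketch below builds the URL from the pattern used in the curl commands above; bigg_model_url and download_bigg_model are hypothetical helper names, and only the .json form of the URL is taken from this document. The download function is defined but not called here, since it needs network access.

```python
from pathlib import Path
from urllib.request import urlretrieve

BIGG_STATIC_URL = "http://bigg.ucsd.edu/static/models"


def bigg_model_url(model_id):
    """Build the static JSON download URL for a BiGG model ID."""
    if not model_id or "/" in model_id:
        raise ValueError(f"Invalid model ID: {model_id!r}")
    return f"{BIGG_STATIC_URL}/{model_id}.json"


def download_bigg_model(model_id, dest_dir="."):
    """Download a model JSON file into dest_dir and return its path."""
    target = Path(dest_dir) / f"{model_id}.json"
    urlretrieve(bigg_model_url(model_id), str(target))
    return target


url = bigg_model_url("iJO1366")
print(url)  # http://bigg.ucsd.edu/static/models/iJO1366.json
```

Calling download_bigg_model("iJO1366") would save iJO1366.json in the current directory, ready for cobra.io.load_json_model.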

Common Failure Modes

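  • Infeasible model: an exchange reaction is missing or the medium is closed. Run solution = model.optimize() and check whether solution.status is "infeasible".
  • Negative growth: the objective reaction direction is inverted. Ensure biomass_rxn.lower_bound = 0 so the reaction cannot run in reverse.
  • Dead-end metabolites: a metabolite is produced but never consumed (or consumed but never produced), which blocks every reaction that depends on it. Run gsmm-validator before FBA.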
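The dead-end failure mode can be illustrated with a purely structural check that needs no solver. The sketch below is not the gsmm-validator skill; find_dead_ends is a hypothetical helper working on plain dictionaries, and the toy network deliberately gives the biomass pseudo-metabolite no consuming reaction.

```python
def find_dead_ends(stoich, bounds):
    """Return metabolites that can only be produced or only be consumed.

    stoich: {reaction_id: {metabolite_id: coefficient}}
    bounds: {reaction_id: (lower_bound, upper_bound)}
    """
    can_produce, can_consume = set(), set()
    for rxn_id, coefficients in stoich.items():
        lower_bound, upper_bound = bounds[rxn_id]
        for met_id, coefficient in coefficients.items():
            if upper_bound > 0:  # forward flux is allowed
                (can_produce if coefficient > 0 else can_consume).add(met_id)
            if lower_bound < 0:  # reverse flux flips the roles
                (can_consume if coefficient > 0 else can_produce).add(met_id)
    # Metabolites found in exactly one of the two sets are dead ends.
    return sorted(can_produce ^ can_consume)


stoich = {
    "EX_glc__D_e": {"glc__D_e": -1.0},
    "GLCt": {"glc__D_e": -1.0, "glc__D_c": 1.0},
    "GLYCOLYSIS": {"glc__D_c": -1.0, "atp_c": 2.0},
    "BIOMASS": {"atp_c": -10.0, "biomass": 1.0},  # nothing consumes biomass
}
bounds = {
    "EX_glc__D_e": (-10.0, 0.0),
    "GLCt": (-1000.0, 1000.0),
    "GLYCOLYSIS": (0.0, 1000.0),
    "BIOMASS": (0.0, 1000.0),
}

dead_ends = find_dead_ends(stoich, bounds)
print(dead_ends)  # ['biomass']
```

Adding a demand reaction that consumes biomass, or dropping the pseudo-metabolite from the BIOMASS reaction, removes it from the result.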

Install via CLI

npx skills add https://github.com/aiming-lab/AutoResearchClaw --skill gsmm-builder

Repository Details

Stars: 13,443
Forks: 1,577
Branch: main
Path: SKILL.md