name: bio-metabolomics-lipidomics
description: Assigns honest lipid annotation levels, designs class-based internal-standard quantification, and runs lipid-aware differential and enrichment analysis with lipidr, guarding against in-source-fragment phantoms, sn-position over-claims, and invalid cross-class quantification. Use when naming or canonicalizing lipid species (shorthand separators, Goslin), deciding shotgun vs RP vs HILIC LC-MS, picking internal standards (SPLASH/EquiSPLASH), interpreting MS-DIAL/LipidSearch output, or comparing lipid classes. For general feature detection see metabolomics/xcms-preprocessing and metabolomics/msdial-preprocessing; for non-lipid annotation confidence see metabolomics/metabolite-annotation; for normalization/QC see metabolomics/normalization-qc; for multivariate stats see metabolomics/statistical-analysis.
tool_type: r
primary_tool: lipidr
Version Compatibility
Reference examples tested with: lipidr 2.16+, pygoslin 2.0+, MS-DIAL 5+
The achievable annotation level is fixed by the acquired evidence, not the software: sn-position and double-bond localization require EAD/OzID/PB/UVPD data that routine CID never produces, and class-resolved quantification requires one isotope-labeled internal standard per class. Verify both before trusting a name or a number.
Before using code patterns, verify installed versions match. If versions differ:
- R: packageVersion('lipidr') then ?function_name to verify parameters
- Python: pip show pygoslin then help(module.function) to check signatures
If code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.
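The Python half of the version check above can be scripted. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `meets_minimum` are hypothetical, and the comparison looks only at the leading numeric components, so a pre-release such as `2.0rc1` is treated like `2.0`.

```python
import re
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def leading_numbers(version_string: str) -> Tuple[int, ...]:
    """Extract the leading numeric components, e.g. '2.0.4' -> (2, 0, 4)."""
    numbers = []
    for token in version_string.split("."):
        match = re.match(r"\d+", token)
        if match is None:
            break  # stop at the first non-numeric component
        numbers.append(int(match.group()))
    return tuple(numbers)


def meets_minimum(found: Optional[str], minimum: Tuple[int, ...]) -> bool:
    """True if the found version is at least the minimum; False if not installed."""
    if found is None:
        return False
    numbers = leading_numbers(found)
    # Pad with zeros so that '2' compares equal to a (2, 0) minimum.
    padded = numbers + (0,) * max(0, len(minimum) - len(numbers))
    return padded >= minimum


if __name__ == "__main__":
    found = installed_version("pygoslin")
    print("pygoslin:", found, "| meets 2.0+:", meets_minimum(found, (2, 0)))
```

If the check fails, the advice above still applies: introspect the installed package and adapt the example rather than retrying it unchanged.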
Lipidomics Analysis
"Analyze my lipidomics data" -> Canonicalize names to the resolution level the evidence supports, quantify each class against its own standard, then run class/chain-aware differential and enrichment analysis.
- R: lipidr::read_skyline() / as_lipidomics_experiment(), de_analysis(), lsea()
- Nomenclature: pygoslin (Python) or rgoslin (R) for parsing/canonicalization
- Identification: MS-DIAL 5 (open) or LipidSearch (commercial) upstream
The Single Most Important Insight -- A Lipid Name Is a Structural-Resolution Claim the Software Usually Overstates
The Liebisch/LIPID MAPS shorthand encodes, in its punctuation, exactly how much structure was measured: PC 34:1 (space, sum composition) < PC 16:0_18:1 (underscore, chains known) < PC 16:0/18:1 (slash, sn-resolved) < PC 16:0/18:1(9Z) (double-bond position+geometry). The resolution level is a property of the evidence, not of the string. Tools manufacture overstatement three ways: a formatter that only knows /, an in-silico library entry authored at sn-level that a species-level match inherits, and "annotate to the nearest database structure" silently promoting a sum composition to a full structure. sn-position is almost never genuinely measured under CID, so treat every / as an unproven _ until EAD/UVPD/derivatization evidence is in hand. The default rule is: when in doubt, drop a level.
Structural-Resolution Hierarchy (Separator Semantics)
| Notation | Separator | What was measured | What may NOT be claimed |
|---|---|---|---|
| PC 34:1 | space | class + total carbons:double-bonds (accurate mass + isotope + class diagnostic) | the two chains; sn; C=C position |
| PC 16:0_18:1 | underscore _ | the two acyl chains (MS/MS acyl losses, RT/ECN-consistent, not an in-source fragment) | which chain is sn-1 vs sn-2 |
| PC 16:0/18:1 | slash / | sn-1/sn-2 assignment (EAD/UVPD/enzymatic - not a CID acyl-loss intensity guess) | C=C position/geometry |
| PC 16:0/18:1(9Z) | parentheses | exact double-bond position + cis/trans (OzID/PB/EAD/UVPD) | (full structure) |
| PC O-34:1 / PC P-34:1 | O- ether / P- plasmalogen | ether vs vinyl-ether linkage (diagnostic ion or acid-lability) | a sum composition alone cannot distinguish P-34:1 from O-34:2 (vinyl ether = ether + one C=C) |
| Cer 18:1;O2/16:0 | ;O2 | sphingoid hydroxyl count (old d18:1) - measured, not assumed | backbone unsaturation if d18:1 was a default rather than fragment-confirmed |
Canonicalize every name through Goslin before merging tables or querying LIPID MAPS; never string-match lipid names by hand. Goslin preserves a false / faithfully - it is necessary but not sufficient.
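The "when in doubt, drop a level" rule can be expressed as a cap on the claimed level. The sketch below is an illustration under stated assumptions: the `Level` enum is a simplified four-step ladder mirroring the table above (it is not pygoslin's `LipidLevel`), and the evidence flag names are hypothetical labels for whatever evidence records a pipeline keeps. In practice the claimed level would come from Goslin, not from hand parsing.

```python
from enum import IntEnum
from typing import Set


class Level(IntEnum):
    SPECIES = 1            # PC 34:1
    MOLECULAR_SPECIES = 2  # PC 16:0_18:1
    SN_POSITION = 3        # PC 16:0/18:1
    FULL_STRUCTURE = 4     # PC 16:0/18:1(9Z)


# Evidence needed to climb onto each rung (hypothetical flag names).
REQUIRED_EVIDENCE = {
    Level.MOLECULAR_SPECIES: "chain_fragments",      # MS/MS acyl losses
    Level.SN_POSITION: "sn_evidence",                # EAD/UVPD/enzymatic
    Level.FULL_STRUCTURE: "double_bond_evidence",    # OzID/PB/EAD/UVPD
}


def supported_level(claimed: Level, evidence: Set[str]) -> Level:
    """Return the highest level that is both claimed and backed by evidence.

    Climbs the ladder one rung at a time and stops at the first rung
    whose required evidence is missing, so a gap lower down caps
    everything above it.
    """
    level = Level.SPECIES
    for rung in (Level.MOLECULAR_SPECIES, Level.SN_POSITION, Level.FULL_STRUCTURE):
        if rung > claimed or REQUIRED_EVIDENCE[rung] not in evidence:
            break
        level = rung
    return level


if __name__ == "__main__":
    # A slash-formatted export backed only by CID acyl losses:
    print(supported_level(Level.SN_POSITION, {"chain_fragments"}).name)
```

The strict ladder is a deliberate simplification: the shorthand also permits double-bond positions on an underscore-level name, which this sketch conservatively reports at the molecular-species level.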
Decision Tree by Question
| Question / situation | Approach | Why |
|---|---|---|
| Accurate class-level quantification, high throughput | Shotgun (direct infusion) or HILIC-LC-MS | constant concentration / class bands -> clean ratio to a co-eluting class IS |
| Resolve isobars/isomers, deep low-abundance coverage | RP-LC-MS (± ion mobility) | RT axis adds an identity coordinate; co-elution flags in-source fragments |
| Double-bond position, sn-position, ether/plasmalogen | LC-MS + EAD/OzID/PB/UVPD (± IM) | only these break C=C / glycerol backbone; CID is blind to them |
| Spatial localization | MS-imaging (MS-DIAL 5 spatial mode) | tissue context with predicted-CCS database |
| Need PC acyl chains | negative-mode formate/acetate adduct -> [M-CH3]- | [M+H]+ gives only the m/z 184 head-group ion (class, no chains) |
| Neutral lipids (TG/DG) chains | [M+NH4]+ adduct | drives neutral-loss-of-fatty-acid fragmentation |
| Suspicious elevated LPC / DG / FA pool | RT co-elution test vs the parent class | an LPC eluting at a PC's RT is an in-source fragment, not biology |
| An apparent odd-chain species (PC 33:1) | require MS/MS chain confirmation | usually an in-source fragment or 13C-isotope artifact of an even neighbor |
| Merge names across tools / before a DB lookup | Goslin canonicalization first | abbreviations and separators are tool-specific; hand string-matching corrupts merges |
| Untargeted oxidized-lipid claim | escalate to a targeted, standard-anchored oxylipin panel | untargeted oxidized-lipid IDs are hypotheses; auto-oxidation in the tube fabricates them |
Load, Normalize, and Run Differential Analysis (lipidr)
Goal: Import a quantified lipid table, normalize within class, and find lipids that differ between groups with class/chain-aware output.
Approach: Read a Skyline/matrix export into a LipidomicsExperiment, attach sample groups, normalize (PQN or class internal standard), then de_analysis with an explicit contrast; visualize as a class-faceted volcano.
library(lipidr)
# data_normalized ships with lipidr (PQN-normalized, log2); substitute a real import:
# d <- read_skyline(list.files(datadir, 'data.csv', full.names = TRUE))
# d <- add_sample_annotation(d, 'clinical.csv')
# d <- normalize_pqn(d, measure = 'Area', exclude = 'blank', log = TRUE)
data(data_normalized)
# Contrast references sample-group labels directly; group_col defaults to the first annotation
de_results <- de_analysis(data_normalized, HighFat_water - NormalDiet_water, measure = 'Area')
# logFC.cutoff is on the log2 scale used by limma's topTable inside de_analysis
sig <- significant_molecules(de_results, p.cutoff = 0.05, logFC.cutoff = 1)
plot_results_volcano(de_results, show.labels = FALSE)
Class-Based Internal-Standard Quantification (the non-negotiable)
Goal: Convert per-class signal to comparable abundances without baking in class-dependent ionization error.
Approach: Ratio each species to a stable-isotope-labeled standard of its OWN class, spiked before extraction so it shares the class's recovery loss; never quantify one class with another class's standard.
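# normalize_istd divides each lipid by the internal standard of its matched class.
# Requires one labeled IS per class present in the data (e.g. SPLASH/EquiSPLASH covers ~13 classes).
# Per the lipidr documentation, a class with no matching IS falls back to the average of all
# internal standards, so check IS coverage per class before trusting the output.
# Input should be un-normalized: d is the raw import (read_skyline + add_sample_annotation) from above;
# the bundled data_normalized is already PQN-normalized and is expected to be rejected here.
d_istd <- normalize_istd(d, measure = 'Area', exclude = 'blank', log = TRUE)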
# Class-level summary is only valid WITHIN a class unless per-class response factors were calibrated:
# cross-class molar ratios (e.g. 'PE is 3x PC') carry head-group response bias and are not licensed here.
plot_lipidclass(d_istd, 'sd')
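The arithmetic behind class-matched quantification is a single-point ratio: amount = (species area / class IS area) x spiked IS amount. The sketch below illustrates that rule outside lipidr; the function name, the input layout, and the example numbers are all hypothetical, and it assumes each species responds like its own class standard. It refuses to quantify any class that lacks a standard instead of borrowing one from another class.

```python
from typing import Dict, Tuple


def quantify_by_class_standard(
    areas: Dict[str, Tuple[str, float]],
    standards: Dict[str, Tuple[float, float]],
) -> Dict[str, float]:
    """Single-point quantification against the internal standard of the same class.

    areas:     {lipid_name: (lipid_class, peak_area)}
    standards: {lipid_class: (is_peak_area, is_spiked_amount)}
    Returns    {lipid_name: amount in the units of is_spiked_amount}
    """
    missing = sorted({cls for cls, _ in areas.values() if cls not in standards})
    if missing:
        # Never fall back to another class's standard.
        raise ValueError("no class-matched internal standard for: " + ", ".join(missing))

    amounts = {}
    for name, (cls, area) in areas.items():
        is_area, is_amount = standards[cls]
        if is_area <= 0:
            raise ValueError("internal standard for " + cls + " has no usable signal")
        amounts[name] = area / is_area * is_amount
    return amounts


if __name__ == "__main__":
    # Made-up areas and spiked amounts, for illustration only.
    areas = {"PC 34:1": ("PC", 2.0e6), "PE 36:2": ("PE", 5.0e5)}
    standards = {"PC": (1.0e6, 10.0), "PE": (2.5e5, 4.0)}
    print(quantify_by_class_standard(areas, standards))
```

Even with a standard per class, the resulting amounts are comparable within a class; comparing them across classes still needs calibrated response factors.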
Honest Annotation-Level Assignment (Goslin)
Goal: Downgrade any name to the level the evidence supports and verify the claimed level is internally consistent.
Approach: Parse with Goslin, read the perceived level, and re-emit at SPECIES (or MOLECULAR_SPECIES) unless sn/C=C evidence exists.
from pygoslin.parser.Parser import LipidParser
from pygoslin.domain.LipidLevel import LipidLevel
parser = LipidParser()
lipid = parser.parse('PC 16:0/18:1') # a slash-claimed name from a tool export
claimed_level = lipid.lipid.info.level # LipidLevel enum the string asserts
# Without EAD/UVPD evidence, re-emit at the honest molecular-species level (drops the unproven sn):
honest_name = lipid.get_lipid_string(LipidLevel.MOLECULAR_SPECIES) # 'PC 16:0_18:1'
sum_name = lipid.get_lipid_string(LipidLevel.SPECIES) # 'PC 34:1'
Per-Method Failure Modes
In-source-fragment phantom lyso-/DG-lipidome
- Trigger: A labile lipid (PC, TG, plasmalogen) clips an acyl chain in the ESI source before MS1.
- Mechanism: The fragment is recorded as an intact precursor; PC->LPC, PE->LPE, TG->DG->MG. The fragment can also be isobaric with a free fatty acid or another class, fabricating phantom signal in several bins; extent is instrument- and tune-dependent.
- Symptom: Inflated LPC:PC, DG:TG, or FA pools; an "LPC" eluting at a PC's retention time.
- Fix: RT co-elution test (a real LPC elutes at its own ECN position); soften the source (lower in-source CID/transfer energy); treat any large lyso/DG/FA pool as suspect until RT-cleared. Shotgun has no RT axis to run this test - never report elevated lyso-lipids from direct infusion without the in-source-fragment caveat.
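The RT co-elution test in the fix above amounts to a simple screen: a lyso feature is suspect when it elutes at the retention time of a parent-class lipid that carries the same acyl chain. The sketch below is one way to express it; the function name, the input layout, and the 0.05 min tolerance are assumptions to be tuned per method, not recommended values.

```python
from typing import Dict, List, Tuple


def flag_coeluting_lyso(
    lyso: Dict[str, Tuple[float, str]],
    parents: Dict[str, Tuple[float, Tuple[str, ...]]],
    rt_tolerance_min: float = 0.05,
) -> Dict[str, List[str]]:
    """Flag lyso features that co-elute with a parent lipid sharing their chain.

    lyso:    {name: (retention_time_min, acyl_chain)}
    parents: {name: (retention_time_min, (acyl_chain, ...))}
    Returns  {lyso_name: [parent names that could be the in-source precursor]}
    """
    suspects: Dict[str, List[str]] = {}
    for lyso_name, (lyso_rt, chain) in lyso.items():
        matches = [
            parent_name
            for parent_name, (parent_rt, chains) in parents.items()
            if chain in chains and abs(lyso_rt - parent_rt) <= rt_tolerance_min
        ]
        if matches:
            suspects[lyso_name] = sorted(matches)
    return suspects


if __name__ == "__main__":
    # Made-up retention times, for illustration only.
    lyso = {"LPC 16:0": (6.42, "16:0"), "LPC 18:1": (2.10, "18:1")}
    parents = {"PC 16:0_18:1": (6.40, ("16:0", "18:1"))}
    print(flag_coeluting_lyso(lyso, parents))
```

A flagged feature is a candidate in-source fragment, not a confirmed one; an unflagged feature has only passed this one screen. The test cannot be run at all on direct-infusion data, which has no RT axis.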
sn-position over-claim
- Trigger: A tool exports / from CID-only data, or a library back-fills its authored sn arrangement onto a species-level match.
- Mechanism: CID acyl-loss intensity bias toward sn-2 is real but small and condition-dependent, and biological samples contain both regioisomers, so the ratio is a blend, not a structure readout.
- Symptom: /-formatted names with no EAD/UVPD/derivatization evidence file attached.
- Fix: Canonicalize through Goslin and re-emit at MOLECULAR_SPECIES (_); at most state "dominant sn-2 likely X" while reporting _.
Invalid cross-class quantification
- Trigger: One global internal standard, or comparing molar abundances across classes after only within-class normalization.
- Mechanism: ESI response is head-group-dominated; a PC and a PE at equal moles give signal differing by factors that can exceed an order of magnitude.
- Symptom: "Class A is N-fold class B" statements; a single IS used for the whole lipidome.
- Fix: One isotope-labeled IS per class; report semi-quantitative within-class unless per-class (and per-adduct) response factors were independently calibrated.
Ether vs plasmalogen (O-/P-) mis-call
- Trigger: Reporting P- (plasmalogen) from a sum composition.
- Mechanism: P-34:1 and O-34:2 share elemental composition (vinyl ether = ether + one C=C); mass cannot distinguish them.
- Symptom: Plasmalogen calls with no vinyl-ether diagnostic ion or acid-lability evidence.
- Fix: Require a diagnostic fragment or acid-lability test; otherwise report at the level that cannot distinguish them.
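The isobaric collision can be checked by counting atoms. The sketch below derives the neutral elemental composition of a PC from its sum composition, assuming the usual convention that the vinyl-ether double bond of a P- lipid is not counted in the shorthand double-bond number. The function name `pc_formula` and its `linkage` argument are hypothetical.

```python
from typing import Dict


def pc_formula(carbons: int, double_bonds: int, linkage: str = "diacyl") -> Dict[str, int]:
    """Neutral elemental composition of a PC given its sum composition.

    linkage: 'diacyl', 'O' (alkyl ether) or 'P' (vinyl ether / plasmalogen).
    """
    if linkage not in ("diacyl", "O", "P"):
        raise ValueError("linkage must be 'diacyl', 'O' or 'P'")

    c = carbons + 8                          # chains + glycerol (3) + choline (5)
    h = 2 * carbons + 16 - 2 * double_bonds  # each C=C removes two hydrogens
    o = 8
    if linkage in ("O", "P"):
        h += 2  # one ester C(=O) becomes CH2
        o -= 1  # and loses its carbonyl oxygen
    if linkage == "P":
        h -= 2  # the vinyl-ether C=C, not counted in the shorthand number
    return {"C": c, "H": h, "N": 1, "O": o, "P": 1}


if __name__ == "__main__":
    print("PC 34:1   ", pc_formula(34, 1))
    print("PC O-34:2 ", pc_formula(34, 2, "O"))
    print("PC P-34:1 ", pc_formula(34, 1, "P"))
    print("isobaric: ", pc_formula(34, 2, "O") == pc_formula(34, 1, "P"))
```

Because the two compositions are identical, no mass accuracy or isotope pattern can separate them; only a vinyl-ether diagnostic fragment or an acid-lability test can.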
Quantitative Thresholds
| Threshold | Source | Rationale |
|---|---|---|
| One isotope-labeled IS per lipid class | Köfeler 2021 (good practice); SPLASH/EquiSPLASH | ESI response is head-group-dominated; one global IS miscalibrates every other class |
| EquiSPLASH = 13 deuterated IS at equal 100 µg/mL | Avanti product spec | equal mass concentration per standard for comparative use (not strictly equimolar, since molecular weights differ); SPLASH LIPIDOMIX uses unequal physiological concentrations |
| Spike IS before extraction | Köfeler 2021 | only a co-extracted IS corrects class-biased recovery (Folch/Bligh-Dyer/MTBE differ for polar minor classes) |
| MS-DIAL 5 EAD ~14 eV; 96.4% standards delineated, 78.0% sn/OH/C=C correct >1 µM | Takeda 2024 | structural lipidomics yield even with the modern method is incomplete and concentration-dependent |
| ~half of single-software species-level IDs need orthogonal evidence | Köfeler 2021 (Nat Commun) | 510/1108 features, 130/301 PCs & 55/171 TGs violated the ECN/RT model in an audited published set |
| LipidSearch grades: keep A/B/C, drop D | LipidSearch grade definitions | D = mass-only; A = class + all chains = molecular-species level, NOT sn/C=C resolved |
| Shotgun infusion below the aggregation regime | Han/Gross protocol literature | above it lipids aggregate, ESI response goes nonlinear, the IS-ratio assumption collapses |
Common Errors
| Error / symptom | Cause | Solution |
|---|---|---|
| could not find function "read_lipidomes" | non-existent function name | use read_skyline() or as_lipidomics_experiment() |
| plot_enrichment rejects an enrich.results argument | wrong signature | plot_enrichment(de.results, significant.sets, annotation = 'class', measure = 'logFC'); get sets from significant_lipidsets() |
| lsea(type = 'chain') errors | no type argument | lsea tests class/length/unsat sets automatically; rank with rank.by = c('logFC','P.Value','adj.P.Val') |
| de_results$FDR is NULL | wrong column name | de_analysis returns limma columns: adj.P.Val, P.Value, logFC |
| pygoslin LipidLevel.MOLECULAR_SUBSPECIES AttributeError | pre-2.0 enum name | current enum is SPECIES / MOLECULAR_SPECIES / SN_POSITION / STRUCTURE_DEFINED / FULL_STRUCTURE / COMPLETE_STRUCTURE |
| Elevated LPC reported from shotgun data | in-source fragmentation with no RT to flag it | add the in-source-fragment caveat; confirm with LC-MS RT co-elution before claiming lyso biology |
References
- Liebisch G, Vizcaíno JA, Köfeler H, et al. 2013. Shorthand notation for lipid structures derived from mass spectrometry. J Lipid Res 54:1523-1530.
- Liebisch G, Fahy E, Aoki J, et al. 2020. Update on LIPID MAPS classification, nomenclature, and shorthand notation for MS-derived lipid structures. J Lipid Res 61:1539-1555.
- Fahy E, Subramaniam S, Brown HA, et al. 2005. A comprehensive classification system for lipids. J Lipid Res 46:839-861.
- Kopczynski D, Hoffmann N, Peng B, Ahrends R. 2020. Goslin: A Grammar of Succinct Lipid Nomenclature. Anal Chem 92:10957-10960.
- Kind T, Liu KH, Lee DY, et al. 2013. LipidBlast in silico tandem mass spectrometry database for lipid identification. Nat Methods 10:755-758.
- Takeda H, Takahashi M, Ikeda K, et al. 2024. MS-DIAL 5 multimodal mass spectrometry data mining unveils lipidome complexities. Nat Commun 15:9903.
- Mohamed A, Molendijk J, Hill MM. 2020. lipidr: A Software Tool for Data Mining and Analysis of Lipidomics Datasets. J Proteome Res 19:2890-2897.
- Köfeler HC, Eichmann TO, Ahrends R, et al. 2021. Quality control requirements for the correct annotation of lipidomics data. Nat Commun 12:4771.
- Köfeler HC, Ahrends R, Baker ES, et al. 2021. Recommendations for good practice in MS-based lipidomics. J Lipid Res 62:100138.
- McDonald JG, Ejsing CS, Kopczynski D, et al. 2022. Introducing the Lipidomics Minimal Reporting Checklist. Nat Metab 4:1086-1088.
- Matyash V, Liebisch G, Kurzchalia TV, et al. 2008. Lipid extraction by methyl-tert-butyl ether for high-throughput lipidomics. J Lipid Res 49:1137-1146.
- Bowden JA, Heckert A, Ulmer CZ, et al. 2017. Harmonizing lipidomics: NIST interlaboratory comparison exercise for lipidomics using SRM 1950-Metabolites in Frozen Human Plasma. J Lipid Res 58:2275-2288.
Related Skills
- metabolomics/xcms-preprocessing - Upstream peak detection and feature extraction
- metabolomics/msdial-preprocessing - MS-DIAL alignment and deconvolution upstream of lipid annotation
- metabolomics/metabolite-annotation - General (non-lipid) annotation and confidence levels
- metabolomics/normalization-qc - Sample normalization and QC framing
- metabolomics/statistical-analysis - Multivariate stats on the lipid abundance matrix