integrations-lifecycle

name: integrations-lifecycle
description: Authoritative reference for Netdata's integrations pipeline -- how `metadata.yaml` drives per-integration pages, collector `taxonomy.yaml` drives dashboard TOC placement, the `COLLECTORS.md`/`SECRETS.md`/`SERVICE-DISCOVERY.md` umbrellas, the `integrations.js` and `integrations/taxonomy.json` artifacts consumed by downstream systems, and per-integration `.md` files committed to the repo. Use when adding/modifying any integration (collector, exporter, agent or cloud notification, authentication, secretstore, service-discovery, log type, deploy method); editing `metadata.yaml` or `taxonomy.yaml`; checking whether `integrations/*.md` should be hand-edited; reading generator scripts under `integrations/`, schemas under `integrations/schemas/`, taxonomy registries under `integrations/taxonomy/`, templates under `integrations/templates/`, the workflows `generate-integrations.yml` or `check-markdown.yml`; ibm.d modules where `metadata.yaml` is generated from `contexts.yaml`; the collector-consistency rule (metadata.yaml + taxonomy.yaml + config_schema.json + stock conf + alerts + README move together).

This skill is the single place to learn how Netdata's integrations pipeline works end to end. It documents:

the generator pipeline rooted in integrations/gen_integrations.py;
the collector taxonomy pipeline rooted in integrations/gen_taxonomy.py;
the JSON-Schema contracts every metadata.yaml and taxonomy.yaml is validated against;
every artifact the pipeline produces (gitignored runtime files AND committed .md documentation);
the <!--startmeta banner conventions and DO-NOT-EDIT rules;
the two CI workflows that auto-PR or gate the regenerated docs (generate-integrations.yml, check-markdown.yml);
the secondary ibm.d generation chain (contexts.yaml -> metadata.yaml);
the contract by which downstream dashboard code consumes integrations.js and, when opted in, integrations/taxonomy.json;
the collector-consistency rule (taxonomy.yaml moves with metadata and docs) and what is and is NOT enforced by tooling;
every surprising/dead/edge-case behavior an assistant or maintainer is likely to hit.

After reading SKILL.md plus the per-domain guides linked below, an assistant should never need to ask "how does metadata.yaml work?", "are these integrations/*.md files generated?", "what fields does the schema support?", "what runs in CI?", or "where does the in-app integrations page get its data?".

Key concepts (read first)

metadata.yaml is the single source of truth. Every per-integration page on every surface (Learn site, in-app dashboard, the umbrella COLLECTORS.md / SECRETS.md / SERVICE-DISCOVERY.md pages) is rendered from metadata.yaml by the pipeline. Edit metadata.yaml, run the pipeline, commit the regenerated artifacts.

src/collectors/COLLECTORS.md is the source page for Learn's "Monitor anything with Netdata" page. It is generated from integrations/integrations.js by integrations/gen_doc_collector_page.py; never hand-edit its integration tables. To update that page, change the source metadata/categories or the generator, then run gen_integrations.py and gen_doc_collector_page.py.

Treat short descriptions as public product copy. Catalog descriptions must say what the integration is and what it monitors, enriches, exports, authenticates, or discovers. For collector-like metadata, the first sentence of overview.data_collection.metrics_description is the catalog sentence used by generated pages such as COLLECTORS.md. Start that sentence with a user-facing action phrase such as Monitor..., Collect..., Enrich network flows with..., or Annotate network flows with.... Do not use the catalog description for variables, defaults, option names, setup instructions, limits, or troubleshooting. Put those details in setup, default-behavior, examples, or troubleshooting fields. See description-authoring.md.
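
The catalog-sentence guidance above can be spot-checked before a commit. The following is a minimal sketch, not part of the pipeline: the helper names are hypothetical, the sentence splitter is a simple heuristic, and the real generator may extract the first sentence differently.

```python
import re

# Action phrases taken from the guidance above; extend the tuple as needed.
ACTION_PREFIXES = (
    "Monitor",
    "Collect",
    "Enrich network flows with",
    "Annotate network flows with",
)


def catalog_sentence(metrics_description: str) -> str:
    """Return the first sentence of a metrics_description string (heuristic)."""
    text = " ".join(metrics_description.split())  # collapse newlines and runs of spaces
    # A sentence ends at the first '.', '!' or '?' followed by whitespace or end of text.
    match = re.search(r"[.!?](?=\s|$)", text)
    return text[: match.end()] if match else text


def starts_with_action_phrase(metrics_description: str) -> bool:
    """True when the catalog sentence opens with one of the action phrases."""
    return catalog_sentence(metrics_description).startswith(ACTION_PREFIXES)


# Hypothetical description text, for illustration only.
good = "Monitor Redis performance and health.\nSet `url` to change the endpoint."
bad = "This collector uses the url option to find the server."
```

Here `starts_with_action_phrase(good)` is true and `starts_with_action_phrase(bad)` is false; a failing description is a prompt to reread description-authoring.md, not an automatic rejection.
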
integrations/*.md files are GENERATED. DO NOT EDIT. Every per-integration .md opens with a <!--startmeta block that ends with message: "DO NOT EDIT THIS FILE DIRECTLY, IT IS GENERATED BY THE COLLECTOR'S/EXPORTER'S/...'S metadata.yaml FILE". See artifacts-and-banners.md for the full banner spec. Edit the source metadata.yaml, regenerate, commit.
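
To decide quickly whether a given .md file is one of the generated ones, a check along these lines may help. It is a sketch under assumptions: the function name is hypothetical, the banner is assumed to be an HTML comment that closes at the first `-->`, and the `endmeta` closing marker in the sample is an assumption. artifacts-and-banners.md remains the authority on the banner format.

```python
def is_generated_integration_doc(markdown_text: str) -> bool:
    """Heuristic: does this text open with a generated-file banner?"""
    head = markdown_text.lstrip()
    if not head.startswith("<!--startmeta"):
        return False
    # Assumption: the banner is an HTML comment, so it ends at the first "-->".
    end = head.find("-->")
    banner = head if end == -1 else head[:end]
    return "DO NOT EDIT THIS FILE DIRECTLY" in banner


# Hypothetical file contents, for illustration only.
generated = """<!--startmeta
message: "DO NOT EDIT THIS FILE DIRECTLY, IT IS GENERATED BY THE COLLECTOR'S metadata.yaml FILE"
endmeta-->

# Example integration
"""
hand_written = "# Notes\n\nThis page is maintained by hand.\n"
```

`is_generated_integration_doc(generated)` is true and `is_generated_integration_doc(hand_written)` is false. A true result means the edit belongs in the source metadata.yaml instead.
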
The CI workflow auto-opens a "Regenerate integrations docs" PR. After a metadata.yaml change merges to master, .github/workflows/generate-integrations.yml regenerates every per-integration .md and the umbrella pages and opens a PR for a maintainer to merge. You CAN regenerate locally and include the changes in the same PR; that is preferred to avoid two PRs per change.
The collector consistency rule. Anything that touches a collector's runtime behavior MUST land in one PR with matching changes to:
- metadata.yaml (the integration page driver),
- taxonomy.yaml (dashboard TOC placement for chart contexts),
- config_schema.json (the dashboard's DYNCFG editor),
- the stock .conf (what /etc/netdata/... ships),
- health.d/*.conf (the alert definitions),
- README.md (which is a symlink to the generated integrations/<slug>.md for single-integration plugins). See consistency.md for what is and is NOT automatically enforced.
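
As a review aid, the list above can be turned into a rough check over the paths changed in a PR. This is an illustrative sketch only: the matching rules, the helper name, and the example paths are assumptions, it is not how any Netdata tool enforces the rule, and a legitimately smaller change (for example, a collector with no alerts) will still show entries as missing.

```python
from pathlib import PurePosixPath

# Companion artifacts named in the rule above. The matching logic is an
# illustrative assumption, not an existing Netdata check.
COMPANIONS = {
    "metadata.yaml": lambda p: p.name == "metadata.yaml",
    "taxonomy.yaml": lambda p: p.name == "taxonomy.yaml",
    "config_schema.json": lambda p: p.name == "config_schema.json",
    "stock .conf": lambda p: p.suffix == ".conf" and "health.d" not in p.parts,
    "health.d/*.conf": lambda p: p.suffix == ".conf" and "health.d" in p.parts,
    "README.md / generated .md": lambda p: p.suffix == ".md",
}


def missing_companions(changed_paths):
    """Return the companion kinds that none of the changed paths match."""
    paths = [PurePosixPath(p) for p in changed_paths]
    return [
        kind
        for kind, matches in COMPANIONS.items()
        if not any(matches(p) for p in paths)
    ]


# Hypothetical changed-file list, for illustration only.
changed = [
    "collectors/example/metadata.yaml",
    "collectors/example/config_schema.json",
    "health/health.d/example.conf",
]
missing = missing_companions(changed)
```

For this input, `missing` lists taxonomy.yaml, the stock .conf, and the README or generated .md, which is the cue to confirm with consistency.md whether each omission is intentional.
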
ibm.d is different. ibm.d module metadata.yaml, README.md, and config_schema.json are GENERATED from contexts.yaml + config.go + module.yaml via go generate. NEVER hand-edit them. See ibm-d.md.
The dashboard consumes generated integration artifacts. The cloud-frontend at ${NETDATA_REPOS_DIR}/dashboard/cloud-frontend/ runs gen_integrations.py in its own CI to copy integrations.js into its source tree. The historical contract is that .js file's exact shape: export const categories = [...]; export const integrations = [...]. Collector taxonomy is emitted separately as integrations/taxonomy.json by gen_taxonomy.py; downstream consumers opt in to that JSON contract. See in-app-contract.md.
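
For ad hoc inspection of the two exports named above, something like the following may be enough. It is a sketch that assumes each exported array is serialized as valid JSON, which this document does not state; the helper name and the sample keys are hypothetical, and in-app-contract.md describes the actual contract.

```python
import json


def read_exported_array(js_source: str, name: str) -> list:
    """Decode the array literal assigned to `export const <name>`."""
    declaration = js_source.index(f"export const {name}")
    start = js_source.index("[", declaration)
    # raw_decode parses exactly one JSON value starting at `start` and
    # ignores whatever follows it (the `;` and the next export).
    value, _end = json.JSONDecoder().raw_decode(js_source, start)
    return value


# Hypothetical miniature of the file; real entries carry many more keys.
sample = (
    'export const categories = [{"id": "example-category"}];\n'
    'export const integrations = [{"id": "example-integration"}];\n'
)
categories = read_exported_array(sample, "categories")
integrations = read_exported_array(sample, "integrations")
```

If the arrays ever contain JavaScript that is not valid JSON, `raw_decode` raises `json.JSONDecodeError`, which is a useful signal that the assumption does not hold for that build.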

| Guide | Purpose |
|---|---|
| `pipeline.md` | The 4-stage pipeline graph, every script, every artifact, the CI workflows. |
| `schema-reference.md` | Per-field reference for JSON Schemas under `integrations/schemas/`, including collector taxonomy schemas. |
| `description-authoring.md` | Product-copy rules for `metadata.yaml` descriptions and the Monitor Anything table text. |
| `per-type-matrix.md` | One-row-per-integration-type quick lookup: source paths, validator, render keys, output location. |
| `artifacts-and-banners.md` | Every committed and gitignored artifact; banner conventions; symlink rules. |
| `ibm-d.md` | The `contexts.yaml` -> `metadata.yaml` chain for ibm.d modules. |
| `consistency.md` | The collector consistency rule and what tooling enforces. |
| `in-app-contract.md` | How the cloud-frontend dashboard consumes `integrations.js`. |
| `gotchas.md` | Every surprise, dead-code reference, hardcoded marketing anchor, custom Jinja delimiter. |
| `recipes/INDEX.md` | Step-by-step recipes for adding/updating each integration type. |
| `how-tos/INDEX.md` | Live catalog: every analysis question gets a how-to entry. |

Live how-to rule (mandatory)

If an assistant is asked a concrete question about the integrations pipeline that is NOT already documented under how-tos/ or one of the per-domain guides above, AND answering it requires non-trivial analysis (reading multiple scripts, running the pipeline, cross-referencing schemas), the assistant MUST author a new how-to under how-tos/<slug>.md and add a one-line entry to how-tos/INDEX.md BEFORE completing the task. This rule is durable. Skipping it means the next assistant repeats the analysis from scratch.

Path discipline

This skill follows <repo>/.agents/sow/specs/sensitive-data-discipline.md:

Files in this repo: repo-relative (integrations/gen_integrations.py, <repo>/integrations/..., src/...).
Files in sibling Netdata-org repos: ${NETDATA_REPOS_DIR}/<repo-name>/... (env-key from .env).
No literal home-directory or workstation-root paths anywhere (use the env-keyed placeholder above instead).
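
The third rule lends itself to a quick self-check before committing a guide or how-to. The sketch below is hypothetical: the patterns are one guess at what counts as a literal home-directory path, and the function is not part of the referenced spec.

```python
import re

# Assumed patterns for literal home-directory paths; adjust as needed.
HOME_PATH = re.compile(r"(/home/|/Users/|~/|[A-Za-z]:\\Users\\)")


def find_home_paths(text: str) -> list:
    """Return (line_number, line) pairs that contain a home-directory path."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if HOME_PATH.search(line)
    ]


# Hypothetical document text, for illustration only.
doc = (
    "See ${NETDATA_REPOS_DIR}/dashboard/cloud-frontend/\n"
    "See /home/alice/src/netdata\n"
)
violations = find_home_paths(doc)
```

Only the second line is reported; the env-keyed placeholder on the first line passes.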

Sources of truth referenced by this skill

<repo>/integrations/ -- generators, schemas, templates, shared metadata files (categories.yaml, deploy.yaml).
<repo>/integrations/schemas/*.json -- all 12 schemas.
<repo>/integrations/templates/ -- all Jinja templates.
<repo>/.github/workflows/generate-integrations.yml and <repo>/.github/workflows/check-markdown.yml -- the CI.
<repo>/.github/data/distros.yml -- platform table fed into deploy rendering.
<repo>/src/go/plugin/ibm.d/ -- the secondary generator chain (docgen/main.go, metricgen/main.go).
<repo>/AGENTS.md -- the "Collector Consistency Requirements" policy text.

Related skills

project-writing-collectors -- the broader collector authoring context (NIDL contexts, dashboard shaping, plugin landscape). Read FIRST when authoring a brand-new collector; read THIS skill when working with the integration metadata side.
learn-site-structure -- how the per-integration .md files ultimately get published on learn.netdata.cloud. The Learn-side mapping is driven by <repo>/docs/.map/map.yaml; for integration pages, the relevant <!--startmeta block inside each generated .md is what Learn's ingest reads.
