name: nkda-observability-contract description: Inspects, validates, and amends a feature specification to ensure complete, decision-driven observability coverage across metrics, traces, and logs. Fails if minimum standards cannot be met.
Skill: Observability Contract Enforcement
Enforce a complete, decision-driven observability contract on specifications, design documents, or implemented code. This skill is deterministic and idempotent — running it twice on the same input produces the same output.
Invocation modes:
| Mode | How to invoke | Input | Output |
|---|---|---|---|
| Document | Run against a markdown file (spec, plan, design doc) | File path or currently open file | Injects/rewrites ## Observability section in the document |
| Codebase | Run against a project, folder, or feature area | Project path, folder path, namespace, or feature name | Produces an Observability Audit Report as a standalone markdown section or file |
| SpecKit hook | Automatic via after_specify in .specify/extensions.yml |
The spec being generated | Same as Document mode |
When invoked manually, pass the target file or folder path. If no path is given, use the currently open file or its containing project. The skill auto-detects the mode:
- If the target is a
.mdfile → Document mode. - If the target is a
.csfile,.csproj, folder, or namespace → Codebase mode. - If invoked by SpecKit hook → Document mode against the spec.
Argument Parsing
The command MUST parse --stage before any other step.
Valid values: spec | plan | tasks | implement
If
--stageis present and valid: execute Stage Precedence Rule below. Do NOT execute Step 0.If
--stageis absent: fall through to Step 0 (auto-detection).If
--stageis present but invalid: STOP and emit:OBSERVABILITY CONTRACT FAILURE: Invalid --stage value "<value>". Expected one of: spec, plan, tasks, implement
Stage Precedence Rule
When --stage is provided it is the single source of truth. It overrides all auto-detection.
- DO NOT execute Step 0.
- DO NOT infer the target from file type or current editor context.
- Resolve target and enforcement behaviour exclusively from the Stage Resolution table below.
--stage implementis a mutating execution path. It overrides the generic audit-only behavior of Codebase mode.
Stage Resolution (Authoritative)
| Stage | Target (resolved from current feature dir) | Enforcement Focus | Required Action |
|---|---|---|---|
spec |
<feature-dir>/spec.md |
## Observability section: operations, decisions, metrics, traces, logging, correlation, validation queries |
Amend spec.md directly — inject/rewrite the section. Do not stop at listing gaps. |
plan |
<feature-dir>/plan.md |
Observability Contract section: O-1 spans, O-2 metrics, O-3 logs, O-4 progress events, CLI visibility per operation | Amend plan.md directly — inject/rewrite the section. Do not stop at listing gaps. |
tasks |
<feature-dir>/tasks.md |
Every user story phase contains explicit tasks for O-1 ActivitySource.StartActivity, O-2 IMigrationMetrics, O-3 structured ILogger, O-4 IProgressSink.EmitAsync |
Amend tasks.md directly — insert missing observability tasks into each phase. Block until gap-free. |
implement |
All .cs files created or modified in the current execution (see below) |
O-1 span on every operation method; O-2 metrics at every boundary; O-3 structured logging (Info start/end, Warn skips/errors, Debug per-item); O-4 EmitAsync at start, per ≤50-item batch, and completion | Write missing instrumentation directly into source files. Produce gap table, fix every row to ✅. |
If the resolved target file does not exist, STOP and emit:
OBSERVABILITY CONTRACT FAILURE: Target file not found: <path>.
Ensure the target artefact exists before running this stage.
Stage implement — target resolution
The target is, in priority order:
- All
.csfiles explicitly passed as arguments. - All
.csfiles created or modified in the current SpecKit implementation session (from the task execution log). - If neither is available: all
.csfiles in the current feature namespace or project folder that contain a class implementingIModule,IJob,ICommandHandler, or a service/tool class.
Do not scan the entire solution. Scope is the feature under implementation.
Resolve, do not just report. For every stage:
- Document stages (
spec,plan,tasks): write the missing content into the target file. Produce the gap table, then immediately fix every gap. - Codebase stage (
implement): write the missing instrumentation code into the correct source files. Produce the gap table, then immediately implement every fix.
Execution Guard
Before performing any stage action:
- Confirm the resolved target file or file set exists (see Stage Resolution above).
- Attempt to extract at least one operation from the target (Step 2 below).
If either check fails: STOP execution and emit the appropriate failure message. Do not attempt inference or fallback.
Role
When this skill is active, inspect the target for observability completeness and fail explicitly if minimum standards cannot be satisfied.
- Document mode: Inject or rewrite the
## Observabilitysection to meet the mandatory contract. Modifies the document directly. - Codebase mode without
--stage implement: Scan the actual source code for existing instrumentation, compare it against the mandatory contract, and produce an audit report listing gaps, violations, and required changes. This path is audit-only and does not modify source code. - Stage
implement(--stage implement): Scan the scoped source files, produce the same gap table, then write the missing instrumentation directly into those files until each gap is resolved or escalated as a blocker.
Preconditions
Before executing, read the following context files:
.agents/30-context/domains/telemetry-model.md— Three-layer model, metric naming, span inventorysrc/DevOpsMigrationPlatform.Abstractions/Telemetry/WellKnownMetricNames.cs— Existing metric namessrc/DevOpsMigrationPlatform.Abstractions/Telemetry/WellKnownActivitySourceNames.cs— Existing ActivitySource namessrc/DevOpsMigrationPlatform.Abstractions/Telemetry/WellKnownMeterNames.cs— Existing meter names
These files establish the naming conventions and existing instrumentation. New metrics and spans defined by this skill MUST be consistent with them.
Execution Steps
Execute the following steps in order. Do not skip steps. Do not proceed past a FAIL gate.
If --stage was provided, begin at Step 1 using the target resolved by the Stage Resolution table. Skip Step 0 entirely.
Step 0 — Detect Mode (only when --stage is absent)
- Examine the target path or context.
- If the target is a
.mdfile, or the skill was invoked by a SpecKit hook → enter Document mode. - If the target is a
.csfile,.csprojfile, folder, or the user named a namespace/feature → enter Codebase mode. - Record the mode. All subsequent steps are executed in both modes unless marked otherwise.
Step 1 — Locate or Create Observability Section
Document mode:
- Search the document for a heading
## Observability(case-insensitive). - If it does not exist, insert an empty
## Observabilitysection immediately before## Requirements(or at the end if## Requirementsis absent). - If it exists, read its full content for evaluation in subsequent steps.
Codebase mode:
- Prepare an empty audit report structure. No document is modified in this step.
- Identify the scope: if a single file, scope to that file's class/namespace; if a folder or project, scope to all
.csfiles within.
Step 2 — Extract Operations
Document mode: Scan the entire document and extract every operation that the feature introduces or modifies.
Codebase mode: Scan the source code within scope and extract operations by identifying:
- Classes implementing
IModule,IJob,ICommandHandler, or similar platform interfaces - Controller actions or endpoint mappings
- Methods decorated with
[Command],[HttpGet],[HttpPost], etc. - Public async methods that represent workflow steps (e.g.
ExecuteAsync,RunAsync,ProcessAsync) - Message/event handlers (classes implementing
IHandler<T>,IConsumer<T>, etc.) - Background services (
BackgroundService,IHostedService)
An operation is any of:
- API endpoint (REST, gRPC, SignalR)
- CLI command or subcommand
- Background job or scheduled task
- Module entry point (export, import, inventory, discovery)
- Workflow or orchestration step
- Event handler or message consumer
For each operation, record:
| Field | Value |
|---|---|
| Name | A short, unambiguous identifier (e.g. workitems.export, job.enqueue) |
| Type | One of: api, command, job, module, workflow, handler |
| Entry point | The class or method that initiates the operation (from spec or actual source file + line) |
| Dependencies | External calls, store operations, SDK calls the operation makes |
Codebase mode — additional extraction:
For each operation found in code, also record:
| Field | Value |
|---|---|
| File | Source file path |
| Has ActivitySource? | Whether the class creates or uses an ActivitySource / Activity |
| Has Metrics? | Whether the class injects or uses IMigrationMetrics, IDiscoveryMetrics, Counter, Histogram, etc. |
| Has Structured Logging? | Whether the class uses ILogger with structured templates (not string concatenation) |
| Has Correlation? | Whether Activity.Current, traceId, or operationId is propagated |
FAIL GATE: If zero operations can be extracted or inferred, emit the following and stop:
OBSERVABILITY CONTRACT FAILURE: No operations could be determined from the input.
The target must contain at least one concrete operation (API, command, job, module, workflow, or handler)
before observability can be assessed. Amend the spec or verify the correct code scope.
Step 3 — Define Operator Decisions per Operation
For each operation, define the decisions an operator must be able to make using telemetry. Every signal (metric, span, log) must map to at least one decision.
Standard decision categories:
| Decision | Question it answers |
|---|---|
| Is it working? | Are requests succeeding at an acceptable rate? |
| Is it fast enough? | Is latency within SLO bounds? |
| Is it overloaded? | Is concurrency or queue depth exceeding capacity? |
| What failed? | Which specific operation failed and why? |
| Where is it slow? | Which dependency or step is the bottleneck? |
| Is it correct? | Do output counts match input counts? Are invariants maintained? |
For each operation, assign at minimum: Is it working?, Is it fast enough?, Is it overloaded?, and What failed?.
Step 4 — Define Metrics (OpenTelemetry aligned)
For each operation, define the mandatory metrics. Use the naming convention:
<domain>.<capability>.<operation>.<measure>
Where:
<domain>=migrationordiscovery(matchingWellKnownMeterNames)<capability>= module or subsystem name (e.g.export,import,inventory)<operation>= specific action (e.g.workitem,attachment,revision)<measure>= what is measured (e.g.count,duration_ms,errors,in_flight)
Mandatory metrics per operation:
| Metric | Instrument | Unit | Decision served |
|---|---|---|---|
| Throughput | Counter<long> |
{operation} |
Is it working? |
| Latency | Histogram<double> |
ms |
Is it fast enough? |
| Outcome (success) | Counter<long> |
{operation} |
Is it working? |
| Outcome (failure) | Counter<long> |
{operation} |
What failed? |
| In-flight / queue depth | UpDownCounter<long> or ObservableGauge<int> |
{operation} |
Is it overloaded? |
Validation rules:
- Every metric name MUST follow the four-segment naming convention.
- Every metric MUST map to at least one decision from Step 3.
- Metrics that do not support any operator decision MUST be rejected and removed.
- Metrics MUST represent business activity. Infrastructure-level metrics (CPU, memory, GC) are out of scope for the spec — they are provided by the runtime.
- If the operation is a batch, add a batch-size
Histogrammetric. - Check existing
WellKnownMetricNames— reuse existing names where the semantics match; do not invent duplicates.
Document mode: Write the metrics table into the Observability section under ### Metrics.
Codebase mode: For each required metric, check whether it already exists in the code:
- Search the scoped files for
Counter<,Histogram<,UpDownCounter<,ObservableGauge<instrument creation. - Search
WellKnownMetricNamesandWellKnownDiscoveryMetricNamesfor matching constant names. - For each mandatory metric: mark as Present (with file + line), Partial (instrument exists but wrong type/name), or Missing.
- Record findings in the audit report.
Step 5 — Define Traces (end-to-end)
For each operation, define the span hierarchy. Every operation MUST have a root span. Every dependency call MUST be a child span.
Span table format:
| Component | Span Name | Tags | Parent | Decision served |
|---|---|---|---|---|
<class> |
<operation.name> |
<tag list> |
Root or <parent span> |
<decision> |
Validation rules:
- Every operation MUST have exactly one root span.
- Every dependency identified in Step 2 MUST appear as a child span.
- Context propagation method MUST be stated (automatic via
Activityhierarchy, or explicitW3C TraceContextheader). - Span names MUST use lowercase dot-separated segments matching the metric domain.
- Check the existing span inventory in
telemetry-model.md— reuse existing spans where semantics match. - Tags MUST include:
job.id,operation,moduleat minimum for root spans. - Child spans MUST include the entity identifier (e.g.
wi.id,attachment.name).
FAIL GATE: If any operation has no root span defined, or any dependency has no child span, emit:
OBSERVABILITY CONTRACT FAILURE: Trace coverage is incomplete.
Operation '<name>' is missing a root span, or dependency '<dep>' has no child span.
Every operation must have end-to-end trace coverage.
Document mode: Write the span table into the Observability section under ### Traces.
Codebase mode: For each required span, check whether it already exists:
- Search for
ActivitySourcefield declarations andStartActivitycalls in the scoped files. - Match span names to the required span table.
- Verify child spans exist for each identified dependency.
- Verify tags include required fields (
job.id,operation,module, entity identifiers). - Mark each span as Present, Partial (exists but missing tags or children), or Missing.
- Record findings in the audit report.
Step 6 — Define Structured Logging
For each operation, define the mandatory log events. Logs are structured (key-value pairs), not free-text.
Mandatory log events per operation:
| Event | Level | Fields | Decision served |
|---|---|---|---|
| Operation started | Information |
operationId, operation, input summary | Is it working? |
| Operation completed | Information |
operationId, operation, outcome, durationMs, output summary | Is it working? / Is it fast enough? |
| Operation failed | Error |
operationId, operation, errorType, errorMessage, durationMs | What failed? |
| Dependency call slow | Warning |
operationId, dependency, durationMs, threshold | Where is it slow? |
| Retry attempt | Warning |
operationId, operation, attempt, maxAttempts, delay | Is it overloaded? |
| Step detail | Debug |
operationId, step, detail | (diagnostic only) |
| Wire-level detail | Trace |
operationId, payload summary | (diagnostic only) |
Validation rules:
DebugandTracelevel logs MUST be declared as optional and disabled by default.- Log fields MUST NOT contain raw customer data (field values, project names, org URLs, attachment paths) without
DataClassification.Customerscope — per project guardrails. - Work item IDs are integer identifiers and are NOT customer data.
- Every log event MUST include
operationIdfor correlation. - No unstructured string concatenation in log templates.
Document mode: Write the log events into the Observability section under ### Logging.
Codebase mode: For each required log event, check whether it already exists:
- Search for
ILoggerusage,LogInformation,LogWarning,LogError,LogDebug,LogTracecalls. - Verify structured templates (message templates with
{Parameter}placeholders, not string concatenation or interpolation). - Verify each operation has start, completion, and error log events.
- Check that log fields do not contain raw customer data without
DataClassification.Customerscope. - Mark each log event as Present, Partial (exists but unstructured or missing fields), or Missing.
- Record findings in the audit report.
Step 7 — Define Correlation Model
Define the correlation identifiers that MUST be present on all telemetry (metrics tags, span tags, log fields) for this feature.
Mandatory correlation fields:
| Field | Source | Scope |
|---|---|---|
operationId / traceId |
Activity.Current.TraceId or generated GUID |
All telemetry |
parentId |
Activity.Current.ParentSpanId |
Spans and logs within a parent context |
job.id |
Job context | All telemetry within a job |
| Domain identifiers | Feature-specific (e.g. wi.id, project.name, org.url) |
Where applicable |
FAIL GATE: If correlation cannot be established (e.g. no job context, no activity source), emit:
OBSERVABILITY CONTRACT FAILURE: Correlation model is missing.
Feature '<name>' does not define how telemetry will be correlated across metrics, traces, and logs.
Define the correlation identifiers and their sources.
Document mode: Write the correlation model into the Observability section under ### Correlation.
Codebase mode: Verify correlation in the actual code:
- Check that
Activity.Currentis used orActivitySource.StartActivityestablishes trace context. - Check that
ILoggerscopes or log templates includeoperationId/traceId. - Check that metric tag lists include
job.idand domain identifiers. - Mark correlation as Present, Partial, or Missing per operation.
- Record findings in the audit report.
Step 8 — Generate Validation Queries
For each operator decision defined in Step 3, generate a KQL-style validation query that proves the decision can be answered using the defined signals.
Required query categories:
| Category | Proves | Source |
|---|---|---|
| Failure identification | Failures can be identified by operation and cause | Metrics (outcome.failure) + Logs (Error) |
| Latency analysis | P50/P95/P99 latency can be computed per operation | Metrics (latency histogram) |
| Load observation | In-flight concurrency or queue depth is visible | Metrics (in_flight / queue_depth) |
| End-to-end trace | A single request can be traced from entry to all dependencies | Traces (root + child spans) |
| Error diagnosis | Root cause can be determined from logs + traces | Logs (Error) joined with Traces |
Query format:
// <Category>: <What it proves>
<table>
| where <filter>
| summarize <aggregation> by <dimensions>
FAIL GATE: If any required query category cannot be expressed using the defined metrics, traces, and logs, the observability contract is incomplete. Emit:
OBSERVABILITY CONTRACT FAILURE: Validation query '<category>' cannot be derived.
The defined signals do not support answering: '<decision question>'.
Add the missing metric, span, or log event to close the gap.
Document mode: Write the queries into the Observability section under ### Validation Queries.
Codebase mode: Generate the same queries, but annotate each with whether the required signals are Present or Missing in the actual code. If a query cannot be satisfied because the underlying instrument or span does not exist in code, flag it as a gap.
Step 9 — Idempotency Check
Before writing the final Observability section:
- Compare the generated content with any existing
## Observabilitysection. - If the existing section already satisfies all validation rules, do not modify it.
- If the existing section is partially correct, preserve correct structures and amend only the gaps.
- Do not duplicate metrics, spans, or log events that already exist and are correct.
- Do not reorder existing content unless required for correctness.
Step 10 — Write Output
Document mode: Replace or insert the ## Observability section with the validated content.
Codebase mode without --stage implement: Produce the Observability Audit Report (see Codebase Output Format below). Do not modify source files automatically. The report is the deliverable.
Stage implement (--stage implement): Produce the same audit report structure, but the deliverable is both:
- the report, and
- the corresponding instrumentation edits applied to the scoped source files.
For --stage implement, do not stop at listing gaps when the fix is mechanical and local to the scoped files. Apply the fix, rerun the relevant checks, and leave only true blockers unresolved.
Document Output Structure
The ## Observability section MUST follow this structure:
## Observability
### Operations
| Name | Type | Entry Point | Dependencies |
|---|---|---|---|
| ... | ... | ... | ... |
### Operator Decisions
| Operation | Decision | Question |
|---|---|---|
| ... | ... | ... |
### Metrics
| Metric Name | Instrument | Unit | Operation | Decision |
|---|---|---|---|---|
| ... | ... | ... | ... | ... |
### Traces
| Component | Span Name | Tags | Parent | Decision |
|---|---|---|---|---|
| ... | ... | ... | ... | ... |
**Context propagation:** <method>
### Logging
| Event | Level | Fields | Operation | Decision |
|---|---|---|---|---|
| ... | ... | ... | ... | ... |
> Debug and Trace levels are disabled by default.
### Correlation
| Field | Source | Scope |
|---|---|---|
| ... | ... | ... |
### Validation Queries
#### Failure Identification
\```kql
<query>
\```
#### Latency Analysis
\```kql
<query>
\```
#### Load Observation
\```kql
<query>
\```
#### End-to-End Trace
\```kql
<query>
\```
#### Error Diagnosis
\```kql
<query>
\```
Codebase Output Structure
When in Codebase mode, produce the audit report in the following format:
## Observability Audit Report
**Scope:** `<project, folder, or file path>`
**Date:** `<date>`
**Operations found:** `<count>`
### Operations Discovered
| Name | Type | Entry Point | File | Has Spans | Has Metrics | Has Logging | Has Correlation |
|---|---|---|---|---|---|---|---|
| ... | ... | ... | ... | ✅/❌/⚠️ | ✅/❌/⚠️ | ✅/❌/⚠️ | ✅/❌/⚠️ |
Legend: ✅ = Present and correct, ⚠️ = Partial (exists but incomplete), ❌ = Missing
### Gaps — Metrics
| Operation | Required Metric | Status | Detail |
|---|---|---|---|
| ... | ... | Missing/Partial | <what is wrong or missing, file + line if partial> |
### Gaps — Traces
| Operation | Required Span | Status | Detail |
|---|---|---|---|
| ... | ... | Missing/Partial | <what is wrong or missing> |
### Gaps — Logging
| Operation | Required Event | Status | Detail |
|---|---|---|---|
| ... | ... | Missing/Partial | <what is wrong or missing> |
### Gaps — Correlation
| Operation | Required Field | Status | Detail |
|---|---|---|---|
| ... | ... | Missing/Partial | <what is wrong or missing> |
### Required Changes
Prioritised list of changes needed to reach full observability coverage:
1. **[CRITICAL]** `<file>` — `<what to add/fix>` (closes gap: `<which metric/span/log>`)
2. **[HIGH]** ...
3. **[MEDIUM]** ...
### Validation Queries
<same query format as Document mode, annotated with signal availability>
### Verdict
**PASS** — All operations have complete observability coverage.
— or —
**FAIL** — `<N>` operations have incomplete coverage. See Required Changes above.
If all operations pass, the verdict is PASS. If any operation has a Missing metric, span, or correlation field, the verdict is FAIL.
Enforcement Summary
| Condition | Action |
|---|---|
No ## Observability section |
Create it |
| Section exists but incomplete | Amend missing parts |
| Section exists and complete | No modification (idempotent) |
| Zero operations extractable | FAIL — spec must define operations |
| Missing root span for any operation | FAIL — trace coverage incomplete |
| Missing correlation model | FAIL — correlation is mandatory |
| Validation query cannot be derived | FAIL — signals are insufficient |
| Metric without a mapped decision | REJECT — remove the metric |
| Log with raw customer data and no classification scope | REJECT — redact or classify |
| Duplicate of existing correct structure | SKIP — do not duplicate |
Completion Criteria
Both modes
- All operations extracted or inferred from input.
- Every operation has at least four operator decisions assigned.
- Every operation has all five mandatory metric types defined.
- Every metric follows
<domain>.<capability>.<operation>.<measure>naming. - Every metric maps to at least one operator decision.
- Every operation has a root span defined.
- Every dependency has a child span defined.
- Context propagation is stated.
- Every operation has start, completion, and error log events.
- Debug/Trace logs are marked optional and disabled by default.
- No raw customer data in log fields without
DataClassification.Customer. - Correlation model defines operationId, parentId, job.id, and domain identifiers.
- All five validation query categories are present and derivable.
Document mode only
- Observability section follows the mandatory structure.
- Existing correct content is preserved (idempotent).
- No TODO placeholders remain in the Observability section.
Codebase mode only
- Every operation has been checked for existing instrumentation.
- Every gap is marked with status (Missing/Partial) and file location.
- Required Changes list is prioritised (Critical > High > Medium).
- Verdict is explicitly stated (PASS or FAIL).
- Audit-only Codebase runs did not modify source files without explicit user instruction.
-
--stage implementruns modified only the scoped source files needed to close observability gaps. - Observability section follows the mandatory structure.
- Existing correct content is preserved (idempotent).
- No TODO placeholders remain in the Observability section.
The skill is not complete until all criteria are checked. Any unchecked criterion is a failure.