name: legacy-ibmi-program-analyzer
description: Analyze individual IBM i programs (RPGLE, CLLE, COBOL) to extract control flow, file I/O, external calls, and error handling with evidence backing. Use when diving deep into one program's behavior from an approved inventory, when standalone exploratory review is needed, or when checking whether an existing central delivery artifact can be reused before scanning source again. Layer 1 (platform-specific) skill of the Legacy Spec Factory reverse chain.
IBM i Legacy Program Analyzer
Skill Card
| Field | Notes |
|---|---|
| Problem solved | Creates an evidence-backed technical analysis of one RPGLE, CLLE, or COBOL program. |
| Input | One approved source program; for chain-ready output, its OBJ-* inventory entry; referenced DDS/copybooks; optional runtime, SME notes, and project reference packs for message catalogs, control files, or data dictionaries. |
| Output | program-analysis-<OBJ-ID>.md for chain-ready runs, or standalone program-analysis.md for exploratory inspection, covering call map, control flow, file I/O, external calls, and error handling. |
| Core prompt strategy | Check reusable central artifacts before scanning, extract concrete code behavior only when needed, enrich field/message meanings from approved reference packs, tag every inference, avoid business-rule invention, and separate exploratory analysis warnings from downstream blocking gates. |
| Upstream skill | legacy-ibmi-inventory for chain-ready output; none required for standalone exploratory output. |
| Downstream consumer | legacy-ibmi-flow-analyzer, legacy-ibmi-module-analyzer, data-model analysis, batch digest, and spec synthesis. |
| Validation standard | Chain-ready output resolves Program ID in approved inventory; standalone exploratory output marks inventory linkage missing and uses source ranges/local evidence without claiming downstream readiness. |
| Known risk | Inferring business intent from field or routine names without SME or runtime confirmation. |
| Practical example | Analyze ORDENTR RPGLE to document subroutines, PF/LF I/O, display-file indicators, and calls to credit-check programs. |
Purpose
Create a detailed analysis of one IBM i program (RPGLE, CLLE, or COBOL) documenting its RDi-style program call map, control flow, file I/O operations, external calls, and error handling. This skill does not infer business rules and does not generate modernization code. It produces evidence-backed analysis ready for SME validation and downstream spec generation.
This skill supports two intent modes:
- standalone_exploratory: user wants to inspect one program or see the skill's output shape. Missing inventory/OBJ/EV linkage is a warning and a downstream-readiness blocker, not a current-analysis block. Produce the analysis from the supplied source, mark status: draft_exploratory, and state clearly that the artifact is not eligible for flow/module/BRD/spec handoff until inventory linkage is added.
- chain_ready: user wants an artifact that can feed legacy-ibmi-flow-analyzer, module analysis, BRD, or spec work. Require an approved inventory entry, canonical OBJ-*, evidence IDs, and normal Step Contract validation.
Inputs
Accept:
- Optional: central delivery documents repo identity and delivery_artifact_lookup_profile. For the current lending-card setup, the default repo is CH-WPS-LENDING-CARDS/legacy-modernization-delivery. The delivery repo name is configurable. The accepted artifact source of truth is GitHub remote main; a local checkout is only a cache after it has been fetched and verified against origin/main. Prior accepted program-analysis artifacts may be stored directly under program folders such as modules/*/{PROGRAM}, where {PROGRAM} is the exact program identity. For this department, @CU118 and CU118 are distinct programs.
- One RPGLE, CLLE, or COBOL source file (the program to analyze)
- Optional: DDS copybook definitions (DSPF, PRTF, PF, LF) referenced by the program
- Optional: SME notes on entry points, quirks, or runtime behavior
- Optional: project/company reference packs containing message catalogs, control files, code tables, data dictionaries, or dictionary-center mappings. Original files may be Markdown, CSV, Excel, Word, PDF, or normalized document-intake outputs. Preferred paths: 00_reference_packs/<PACK-SLUG>/reference-pack-index.yaml or 00_context_packages/<MODULE-SLUG>/field-dictionary-context.md.
- Required for meaningful message review when message/status IDs are observed: message file, message catalog, approved reference pack, source literal/comment, runtime evidence, or SME-approved description source for every observed message/status/code. Static indexing may proceed without it, but final delivery and chain-ready promotion are blocked until message descriptions are resolved.
- Required for chain_ready: Program must be referenced in an approved 01_inventory/inventory.yaml via program ID (OBJ-*)
- Allowed for standalone_exploratory: inventory may be absent when the user is only evaluating a skill output or analyzing a program locally. Do not fabricate OBJ-* or EV-*; use source ranges/local source references and mark inventory_linkage: missing.
Stop and require clarification if:
- Program source is missing or incomplete (create a blocked_pending_source artifact or TBD routing note instead of guessing)
- Program is marked blocked in the inventory
- Program ID (OBJ-*) cannot be located in inventory and the user asked for chain-ready/downstream output
- Source contains raw, unredacted production data (require redaction review per ../../docs/data-collection-and-redaction.md)
Output Contract
Produce:
- program-analysis-<OBJ-ID>.md per program for chain-ready runs (one file per analysis session)
- program-analysis.md for standalone exploratory inspection when no inventory-backed OBJ-* exists yet
- central_lookup_result when a central delivery artifact is checked before source analysis. Values:
  - found_on_remote_main: program folder exists on the delivery repo remote main; return the path and do not scan source again.
  - not_found_on_remote_main: no exact program folder match exists on remote main; proceed to normal source scan.
  - remote_unavailable: remote main could not be checked; do not pretend the artifact is missing.
- Never report the program-analysis step complete unless the main wrapper file exists at the target output path and follows the required section order. When deterministic indexing runs first, it must create a draft program-analysis.md wrapper seed with all required sections; semantic deep-read then fills that stable layout instead of inventing a new layout.
Use:
- templates/program-analysis.md as starting point
- references/output-contract.md for field definitions and evidence tagging
- references/large-program-analysis.md for large-program, segmented, and context-window-safe analysis
- scripts/index_rpg_source.py as the deterministic source-index helper when local file access is available:
  - Windows: try py -3 scripts\index-rpg-source.py <source> --program <NAME> --out-dir <DIR> --delivery-root <remote-main-snapshot> --delivery-profile <delivery-profile.yaml>, fall back to python if py -3 is unavailable
  - macOS/Linux: python3 scripts/index-rpg-source.py <source> --program <NAME> --out-dir <DIR> --delivery-root <remote-main-snapshot> --delivery-profile <delivery-profile.yaml>

  If all launchers fail, stop and report: "Python runtime unavailable". Do not configure PATH, install Python, or create a virtual environment. Apply the same launcher order to all temporary consistency checks, YAML readability checks, Markdown sanity checks, and one-off helper scripts run during this skill. When --delivery-root is provided, the helper checks central delivery artifacts first. If it prints central_lookup_result: found_on_remote_main, stop and reuse the approved artifact path instead of writing new source-index artifacts. If the SME explicitly wants to refresh an existing approved artifact, rerun with --force-rescan --rescan-reason "<why>"; the helper will scan source and record the prior central artifact path in the generated metadata. The helper writes a draft program-analysis.md wrapper seed, core sidecars, and large-program batch checkpoints when triggered. The seed is not approved analysis; it exists to keep the final layout stable and prevent missing main artifacts.
- scripts/validate_program_analysis_contract.py (or the root scripts/validate-program-analysis-contract.py wrapper) as the mechanical finalization gate before delivery:
  - Windows: try py -3 scripts\validate-program-analysis-contract.py --analysis-dir <DIR>, fall back to python if py -3 is unavailable
  - macOS/Linux: python3 scripts/validate-program-analysis-contract.py --analysis-dir <DIR>

  If all launchers fail, stop and report: "Python runtime unavailable". Do not configure PATH, install Python, or create a virtual environment.
- references/control-flow-patterns.md for language-specific pattern recognition
- references/error-handling-taxonomy.md for error detection
- references/evidence-tagging.md for evidence strength levels and tagging methodology
- references/reference-pack-lookup.md for using project/company Markdown / CSV / Excel / Word / PDF control files, message catalogs, and data dictionaries
- templates/evidence-tags.md as a quick reference card for inline evidence annotation
- templates/reference-pack-index.yaml as the recommended index for project/company Markdown lookup packs
Follow:
- ../../docs/id-conventions.md for stable IDs (OBJ-*, EV-*, TBD-*)
- ../../docs/evidence-and-knowledge-taxonomy.md for evidence strength labels
- ../../docs/input-readiness-rubric.md for input readiness scoring
Examples:
- examples/simple-crud-rpgle/: straightforward CRUD program, high-confidence analysis
- examples/complex-batch-job/: multi-subroutine batch job, moderate complexity
- examples/incomplete-source-negative/: negative case with missing source and TBD handling
Step Contract
This skill is one step in the Legacy Spec Factory reverse chain. It conforms
to the canonical Step Contract shape — see
../legacy-step-contract/SKILL.md and
../legacy-step-contract/references/step-contract.md for the full
field-level rules. The summary below is normative for this skill.
Input
- Required: one RPGLE / CLLE / COBOL source program.
- Required for chain-ready output: the program's OBJ-* ID located in an approved (or approved_with_non_blocking_tbd) 01_inventory/inventory.yaml.
- Optional: DDS copybook source (DSPF, PRTF, PF, LF) for files the program touches; SME notes on entry points, quirks, or runtime behavior; approved project reference packs for message descriptions, control-file values, field meanings, and standard_field_id mappings. For raw Excel/Word/PDF/image inputs, prefer legacy-document-evidence-intake normalized Markdown/CSV/text outputs with evidence coordinates.
- Input readiness scoring:
  - 0-5 blocked: program source missing/incomplete, inventory status blocked for a chain-ready request, or evidence authorization unresolved.
  - 6 minimum_pass: one current program source is present. If approved inventory linkage is absent, proceed only as standalone_exploratory.
  - 7-8 usable: referenced DDS/copybooks and object metadata are available for most file and display/report interactions.
  - 9-10 strong: runtime logs, screen/report samples, SME notes, known edge cases, and parameter/interface notes are also supplied.
  - Missing runtime samples or SME notes does not block static program analysis; it limits confidence for business meaning and exception realism.
- Readiness checks: source is current production (tier 1) rather than archival when known; evidence authorization is resolved. For chain_ready, Inventory Completeness Gate passes and the program is not marked blocked in inventory.
- Stop conditions: source missing or incomplete; program marked blocked in inventory for a chain-ready request; OBJ-* not found in inventory for a chain-ready request; raw unredacted production data present. Missing inventory alone does not stop standalone_exploratory analysis.
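The readiness bands above can be sketched as a small lookup. This is a minimal illustration, not part of the skill's scripts; the function name `readiness_band` is hypothetical, and only the score ranges and band labels come from the list above.

```python
def readiness_band(score: int) -> str:
    """Map a 0-10 input readiness score to the band named in the Input list."""
    if not 0 <= score <= 10:
        raise ValueError(f"score must be between 0 and 10, got {score}")
    if score <= 5:
        return "blocked"
    if score == 6:
        return "minimum_pass"
    if score <= 8:
        return "usable"
    return "strong"


# Example: band for every score a reviewer could assign.
for score in range(11):
    print(score, readiness_band(score))
```

A band of minimum_pass or better lets static analysis proceed; whether the run can be chain_ready still depends on approved inventory linkage, which this sketch does not model.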
Execution
- Procedure: see the Workflow section below (11 ordered steps).
- Program-size tiers: default to the normal-program path. Most field programs are under 10,000 lines and should not inherit large-program ceremony unless density requires it. Use three SME-facing tiers:
  - normal_program: fewer than 10,000 lines and no density trigger. Produce lightweight SME review plus core artifacts: program-analysis.md, program-analysis-summary.yaml, source-index.yaml, routine-index.md, and message-inventory.yaml. Do not create routine-logic-details.md, routine-logic-details.yaml, deep-read-plan.md, or retained batch files unless density changes the tier or the user explicitly continues a deep-read pass.
  - complex_normal_program: under large thresholds but dense enough to need extra evidence surfaces, for example more than 10 routines, more than 5 recommended deep-read windows, dense file I/O, dense messages, dense SQL, many external calls, or field mutation chains. Produce the lightweight review plus only triggered sidecars such as deep-read-plan.md, all-routine-coverage-ledger.md, message-inventory.md, file-io-inventory.md / file-io-inventory.yaml, field-mutation-matrix.md / field-mutation-matrix.yaml, or sql-inventory.md / sql-inventory.yaml.
  - large_extreme_program: source greater than 10,000 lines, more than 25 routines, more than 20 external calls, more than 25 object dependencies, or cannot safely fit in context with evidence windows. Use references/large-program-analysis.md, build the full sidecar set, and deep-read in batches of at most five routines/windows. The artifact set must include retained batch checkpoint files under routine-logic-details/, starting with routine-logic-details/deep-read-batch-001.md.
- When the source file is accessible on disk, first run scripts/index_rpg_source.py (or the root scripts/index-rpg-source.py wrapper) to classify the program tier and produce the appropriate artifact set. Windows: try py -3, fall back to python; macOS/Linux: use python3. If all launchers fail, stop and report: "Python runtime unavailable". Do not configure PATH, install Python, or create a virtual environment. Apply the same launcher order to temporary consistency checks, YAML readability checks, Markdown sanity checks, and one-off helper scripts run during this skill. These are pre-analysis structure artifacts, not the final program analysis. Do not produce whole-program business narrative until the source index, SME-first sections, and any needed coverage evidence exist. Pass --delivery-root <remote-main-snapshot> and --delivery-profile <delivery-profile.yaml> when available. If the helper reports central_lookup_result: found_on_remote_main, stop the scan and return the approved central artifact path to the user unless the SME has explicitly requested --force-rescan --rescan-reason "<why>".
- Allowed inference: control flow extracted from EXSR/CALL/PERFORM; calculations and branch logic from source statements; file I/O from F-spec and I/O statements; field lineage from visible assignments, DDS/copybooks, and parameter lists; error paths from MONITOR/MONMSG/ON-ERROR or explicit return/status checks; pattern-based labeling tagged strongly_inferred or medium_confidence with explicit notes.
- Forbidden assumptions: inventing subroutines, file access beyond what I/O statements show, business meaning from field names, external call parameters absent from source or copybooks, error codes not explicitly caught or returned; reading non-redacted evidence. Reference-pack lookup may explain an observed identifier, but it must not create a behavior claim that is absent from source, runtime evidence, or SME notes.
- TBD handling: missing DDS → TBD: pending_source; undefined subroutine reference → TBD: pending_source; unclear error path → TBD: pending_sme_judgment; non-blocking gaps tagged non_blocking.
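The tier thresholds in the Execution list can be sketched as a classifier. This is an illustration under stated assumptions, not the logic of scripts/index_rpg_source.py: the `ProgramCounts` fields and the `dense_evidence` flag (standing in for dense file I/O, messages, SQL, many external calls, or field mutation chains) are hypothetical names, and a source of exactly 10,000 lines, which the text leaves unassigned, is treated here as not large.

```python
from dataclasses import dataclass


@dataclass
class ProgramCounts:
    source_lines: int
    routines: int
    external_calls: int
    object_dependencies: int
    deep_read_windows: int = 0
    fits_in_context: bool = True
    dense_evidence: bool = False  # hypothetical flag for the qualitative density triggers


def classify_tier(c: ProgramCounts) -> str:
    """Return the SME-facing program_size_tier for the given counts."""
    # Large/extreme thresholds are checked first because any one of them wins.
    if (
        c.source_lines > 10_000
        or c.routines > 25
        or c.external_calls > 20
        or c.object_dependencies > 25
        or not c.fits_in_context
    ):
        return "large_extreme_program"
    # Density triggers promote an otherwise normal program.
    if c.routines > 10 or c.deep_read_windows > 5 or c.dense_evidence:
        return "complex_normal_program"
    return "normal_program"


print(classify_tier(ProgramCounts(4_200, 8, 3, 6)))
print(classify_tier(ProgramCounts(4_200, 14, 3, 6)))
print(classify_tier(ProgramCounts(12_500, 8, 3, 6)))
```

Checking the large thresholds before the density triggers keeps the default path cheap: a program only leaves normal_program when a count or flag says so.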
Output
- Canonical artifact: program-analysis-<OBJ-ID>.md for chain-ready runs or program-analysis.md for standalone exploratory inspection (one per program).
- Required sections: Calculation Logic, Validation Logic, Exception Handling, Message Inventory, Metadata, Analysis Coverage & Scope, Program Call Map, Routine Cards, Routine Logic Details, Deep Read Windows, Entry Points & Parameters, Object Dependencies, Logic Decomposition Ledger, Data Touch Map, Key File & Field Logic, Control Flow, File I/O, External Calls, Error Handling, Redundancy Candidate Notes, TBDs & Blocking Status, Review Checklist.
- Required IDs: chain_ready output reuses OBJ-* and EV-* from inventory/evidence manifest. standalone_exploratory output must not fabricate OBJ-* or EV-*; it uses source ranges/local source references and records inventory_linkage: missing. Both modes may mint program-local BEH-*, EX-*, and TBD-*. Neither mode mints BR-*, CAP-*, or DEC-*.
- Reference pack metadata: when message catalogs, control files, code tables, or data dictionaries are used, record pack path, pack ID/version, authorization status, and lookup coverage in Metadata.
- Message description gate: every observed message ID, status value, response literal, return code, SQLSTATE, CPF/MCH/RNX/CPD message, operator message, or shop-local token must have a specific description and description source before final delivery. The placeholder "unresolved - message description not available" is allowed only as a pre-analysis gap; it is blocking for final review / chain-ready output.
- Handoff status: status: draft until SME review; approved or approved_with_non_blocking_tbd is required before legacy-ibmi-flow-analyzer runs against the chain that includes this program. status: draft_exploratory is never eligible for downstream flow/module/BRD/spec use without rerun or linkage update.
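The message description gate can be sketched as a check over message-inventory rows. This is a minimal sketch: the row keys `id`, `description`, and `description_source` are assumed for illustration and may not match the real message-inventory.yaml schema; only the placeholder text is taken from the gate above.

```python
UNRESOLVED = "unresolved - message description not available"


def message_gate_blockers(messages: list) -> list:
    """Return IDs of observed messages that still block final delivery."""
    blockers = []
    for row in messages:
        description = (row.get("description") or "").strip()
        source = (row.get("description_source") or "").strip()
        # A row passes only with a specific description AND a named source.
        if not description or description == UNRESOLVED or not source:
            blockers.append(row["id"])
    return blockers


# Hypothetical inventory rows for illustration only.
inventory = [
    {"id": "ORD0012", "description": "Credit limit exceeded", "description_source": "reference pack row 14"},
    {"id": "ORD0031", "description": UNRESOLVED, "description_source": ""},
    {"id": "ORD0044", "description": "Order header locked", "description_source": None},
]
print(message_gate_blockers(inventory))
```

An empty result means the gate passes; any returned ID keeps the artifact out of final review and chain-ready promotion.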
Validation
- Mechanical: in chain_ready, every non-trivial behavior has ≥1 EV-* link and every call/object reference resolves against inventory. In standalone_exploratory, every non-trivial behavior has source-range/local evidence and unresolved inventory/object/evidence mappings are listed as downstream-readiness gaps, not current-analysis blockers. In both modes, every TBD has a blocking-status tag and required sections are present. Any observed message/status/code with an unresolved description source is blocking for final delivery; provide message file/catalog/reference pack, source literal/comment, runtime evidence, or SME-approved text before declaring the analysis review-ready. program-analysis.md / program-analysis-<OBJ-ID>.md must pass the Program Artifact Finalization Gate before delivery, and chain-ready output must not be promoted when the gate fails. Reference-pack facts must cite the pack/file/row or anchor used.
- AI semantic: behaviors are consistent with the linked source lines; no invented subroutines, fields, files, field mutations, message IDs, or error codes; evidence strength not overstated (no weakly_inferred posing as confirmed_from_code); front-loaded Calculation Logic, Validation Logic, Exception Handling, and Message Inventory reconcile against routine-level evidence and ledgers; flow header (if present) reconciled against the code-derived program call map. Dictionary or control-file meanings enrich observed identifiers; they do not override code behavior. Contradictions become TBDs.
- Report quality: Program Call Map uses compact Visual Overview plus auditable Call Evidence; key fields and variables preserve FIELD_NAME(business meaning) and VARIABLE_NAME(business meaning) [direction]; external/dynamic calls and error codes carry resolution status; unresolved items are centralized in TBDs / Open Items.
- SME / human approval: SME signs entry points, parameter contracts, file I/O semantics, external interface contracts, and error handling realism. Required when the program affects money, inventory, compliance, or customer status; recommended otherwise.
- Blocking conditions: any BEH-* without source evidence; any invented IBM i fact; any unresolved pending_source TBD on a section that is load-bearing for the next flow analysis; SME absence when SME is required by the program's risk class. Missing inventory blocks only chain_ready / downstream promotion, not standalone exploration.
Emit a Step Validation Report (see
../legacy-step-contract/templates/step-validation-report.md) with
status pass, pass_with_warnings, or blocked when reporting upward
to the orchestrator.
Workflow
Central Artifact Reuse Preflight
- Before reading or scanning source, resolve the latest accepted central artifact snapshot using a delivery_artifact_lookup_profile. The delivery repo name is configurable; do not hardcode legacy-modernization-delivery except as the current department default. Use Git method 2 only: check remote availability with git ls-remote, then inspect main with a temporary shallow clone / sparse checkout or an already-fresh local cache. Do not use GitHub API tooling for this lookup.
- Do not use a stale local checkout, unsynced branch, feature branch, or private working copy as proof that no central artifact exists. If remote main cannot be checked, set central_lookup_result: remote_unavailable and ask for access/context before deciding whether to scan.
- Recommended read-only Git sequence:

      git ls-remote <delivery-repo-url> main
      git clone --depth 1 --branch main --filter=blob:none \
        --sparse <delivery-repo-url> <tmp-delivery-dir>
      git -C <tmp-delivery-dir> sparse-checkout set "modules"

  Then inspect the configured program_folder_patterns under the temporary checkout. Delete the temporary checkout after the lookup if it is not a managed cache.
- Build the lookup from this profile shape, supplied by user/project config when available and otherwise defaulted conservatively:

      delivery_artifact_lookup_profile:
        repo: CH-WPS-LENDING-CARDS/legacy-modernization-delivery
        branch: main
        module_roots:
          - modules/*
        program_folder_patterns:
          - modules/*/{PROGRAM}
        artifact_file_patterns:
          - program-analysis.md
          - program-analysis-*.md
          - program-analysis-summary.yaml
        program_name_normalization:
          case: upper
          preserve_prefixes: ["@"]
        exact_folder_name_match: true

  Other departments can override repo, branch, module_roots, program_folder_patterns, and artifact filenames. A profile may point to capabilities/*/{PROGRAM}, apps/*/programs/{PROGRAM}, or any equivalent reviewed-delivery layout as long as it identifies one program folder at a time.
- Search the remote-current central artifact snapshot for an exact program folder match before using project-local output. Preferred lookup keys: exact program/member name first, then OBJ-*, module/CAP-ID, source library/member, and the source ref or collection date recorded in program-analysis-summary.yaml or module metadata.
- Match program folders by exact normalized program identity. For the current screenshot layout, @CU118 and CU118 are distinct programs; do not strip leading @ and do not cross-match the two names. Do not treat a substring match such as CU11 inside CU118 as a hit unless the profile explicitly allows fuzzy matching.
- If an exact program folder exists on remote main, set central_lookup_result: found_on_remote_main, return the remote path, and tell the user the program has already been scanned in the central delivery repo. Do not run deterministic source indexing, deep-read, or source scanning.
- If no exact program folder exists on remote main, set central_lookup_result: not_found_on_remote_main and proceed to normal source scan.
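The exact-match rule in this preflight can be sketched as follows. This is an illustration, not the helper's actual lookup: the function names are hypothetical, `remote_paths` stands for folder paths already listed from a verified remote main snapshot, and the sketch assumes folder names are stored in their normalized (upper-case) form so that comparison is a plain string equality.

```python
def normalize_program(name: str) -> str:
    """Apply case: upper; a leading "@" is preserved, never stripped."""
    return name.strip().upper()


def _segment_matches(pattern_segment: str, path_segment: str, target: str) -> bool:
    if pattern_segment == "*":
        return True  # any single folder level, e.g. the module name
    if pattern_segment == "{PROGRAM}":
        return path_segment == target  # exact identity, no substring or fuzzy match
    return pattern_segment == path_segment


def find_program_folders(program, remote_paths, patterns=("modules/*/{PROGRAM}",)):
    """Return remote paths whose segments match a folder pattern exactly."""
    target = normalize_program(program)
    hits = []
    for path in remote_paths:
        segments = path.strip("/").split("/")
        for pattern in patterns:
            wanted = pattern.split("/")
            if len(wanted) == len(segments) and all(
                _segment_matches(w, s, target) for w, s in zip(wanted, segments)
            ):
                hits.append(path)
                break
    return hits


def central_lookup_result(program, remote_paths, remote_available=True):
    """Return (central_lookup_result value, matched path or None)."""
    if not remote_available:
        return "remote_unavailable", None
    hits = find_program_folders(program, remote_paths)
    if hits:
        return "found_on_remote_main", hits[0]
    return "not_found_on_remote_main", None


paths = ["modules/cards/CU118", "modules/billing/@CU200"]
print(central_lookup_result("cu118", paths))
print(central_lookup_result("@CU118", paths))
print(central_lookup_result("CU11", paths))
```

Comparing whole path segments, rather than searching for the name inside the path, is what keeps CU11 from matching CU118 and keeps @CU118 separate from CU118.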
Size & Structure Preflight
- Count approximate source lines, routine definitions, external calls, and object dependencies before writing business summary prose
- If local source file access is available, run:
  - Windows: try py -3 scripts\index-rpg-source.py <source-file> --program <PROGRAM> --out-dir <analysis-dir> --delivery-root <remote-main-snapshot> --delivery-profile <delivery-profile.yaml>, fall back to python if py -3 is unavailable
  - macOS/Linux: python3 scripts/index-rpg-source.py <source-file> --program <PROGRAM> --out-dir <analysis-dir> --delivery-root <remote-main-snapshot> --delivery-profile <delivery-profile.yaml>

  When --delivery-root is available and the helper reports central_lookup_result: found_on_remote_main, stop and return the central artifact path instead of generating a new local analysis seed. To refresh a reviewed central artifact intentionally, add --force-rescan --rescan-reason "<SME/business reason>"; never force a rescan silently. If all launchers fail, stop and report: "Python runtime unavailable". Do not configure PATH, install Python, or create a virtual environment. Use the same launcher order for all temporary consistency checks, YAML readability checks, Markdown sanity checks, and one-off helper scripts.
- Use source-index.yaml, program-analysis-summary.yaml, program-analysis.md, routine-index.md, and message-inventory.yaml as the normal deterministic pre-analysis index. Add routine-logic-details.md, routine-logic-details.yaml, all-routine-coverage-ledger.md, deep-read-plan.md, message-inventory.md, file-io-inventory.md / file-io-inventory.yaml, field-mutation-matrix.md / field-mutation-matrix.yaml, and sql-inventory.md / sql-inventory.yaml only when the generated optional_sidecar_triggers or downstream evidence needs require them.
- For SQLRPGLE and free-format RPGLE, use statement-level indexing for DCL-PI, DCL-PR, DCL-DS, DCL-S, DCL-PROC, procedure calls, assignments, EXEC SQL blocks, host variables, SQLCODE, and SQLSTATE. Do not split multi-line SQL statements or free-format calculation chains into fixed line chunks.
- Select both compatibility analysis_mode (standard, segmented, or large_program) and SME-facing program_size_tier (normal_program, complex_normal_program, or large_extreme_program).
- For normal_program, keep program-analysis.md concise and SME-first. For complex_normal_program or large_extreme_program, build the needed structure/sidecar evidence before business summary prose.
- Prevent claims of complete understanding until coverage supports them with indexed routines, deep-read windows, resolved call/data edges, and explicit gaps
Program Artifact Finalization Gate
Run this gate before delivering program-analysis.md /
program-analysis-<OBJ-ID>.md.
- program-analysis.md is the SME review wrapper, not a free-form compressed summary. It must follow references/output-contract.md File Structure with all required ## sections in order. Small programs may keep sections concise or mark them not applicable / pending, but sections must not be omitted.
- program-analysis-summary.yaml is the compact machine-readable program summary and sidecar index. Core sidecars declared there must exist: source-index.yaml, routine-index.md, and message-inventory.yaml. routine-logic-details.md and routine-logic-details.yaml are conditional: normal programs leave them not_written_by_default; complex/large or explicit deep-read runs declare them present. Optional sidecars are required only when declared present or optional_triggered.
- routine-logic-details.yaml is the RLOG coverage source of truth.
- routine-logic-details/part-*.md or routine-logic-details/deep-read-batch-*.md files are retained deep-read checkpoint and audit surfaces for large/extreme runs. They do not replace the consolidated routine-logic-details.md, but they must be kept when generated. Each batch file must use the exact top-level ## section order: Calculation Logic, Validation Logic, Exception Handling, Scope, Batch Coverage Summary, Message Inventory, Routine Details. The first three core sections must appear before per-routine detail and must not contain pasted source code, fenced code blocks, or verbatim RPG/CL/COBOL/SQL statements. Use source identifiers, normalized logic summaries, source ranges, evidence IDs, and RLOG-* links instead. The final routine-logic-details.md must contain every routine_logic_inventory.details[].detail_id from YAML and whole-program Calculation Logic, Validation Logic, Exception Handling, Message Inventory, Routine Detail Index, and Routine Details; do not deliver only the latest batch, a delta, or a summary of recent batches.
- For large_extreme_program, routine-logic-details/ and at least routine-logic-details/deep-read-batch-001.md are required checkpoint artifacts. A missing batch directory or first batch file is blocking even if the consolidated routine-logic-details.md exists.
- For normal_program / quick exploratory runs with no part files, this gate is a lightweight structural check. It does not require large-program sidecars or batched deep-read. For complex_normal_program, large_extreme_program, segmented, large_program, existing routine-logic-details/part-*.md or routine-logic-details/deep-read-batch-*.md, or chain_ready output, a failed gate is blocking.
- Run the validator with the repository Python launcher convention: Windows py -3 scripts\validate-program-analysis-contract.py --analysis-dir <DIR> (fall back to python), macOS/Linux python3 scripts/validate-program-analysis-contract.py --analysis-dir <DIR>.
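The batch-file rules in this gate (exact top-level section order, no fenced code in the first three sections) can be sketched as a structural check. This is not the code of scripts/validate_program_analysis_contract.py, whose internals are not shown here; `check_batch_file` is a hypothetical name, and the sketch does not detect unfenced verbatim source statements or headings that appear inside fenced blocks.

```python
import re

FENCE = "`" * 3  # Markdown code-fence marker
REQUIRED_SECTIONS = [
    "Calculation Logic",
    "Validation Logic",
    "Exception Handling",
    "Scope",
    "Batch Coverage Summary",
    "Message Inventory",
    "Routine Details",
]
CORE_SECTIONS = REQUIRED_SECTIONS[:3]


def check_batch_file(markdown: str) -> list:
    """Return a list of structural problems; an empty list means the check passed."""
    sections = []  # (title, body lines) for each top-level "## " heading
    for line in markdown.splitlines():
        heading = re.match(r"^## (.+?)\s*$", line)
        if heading:
            sections.append((heading.group(1), []))
        elif sections:
            sections[-1][1].append(line)

    problems = []
    titles = [title for title, _ in sections]
    if titles != REQUIRED_SECTIONS:
        problems.append("section order mismatch")
    for title, body in sections:
        if title in CORE_SECTIONS and any(l.lstrip().startswith(FENCE) for l in body):
            problems.append(f"fenced code block in {title}")
    return problems


sample = "\n".join(f"## {title}\nSummary with source ranges only." for title in REQUIRED_SECTIONS)
print(check_batch_file(sample))
```

Returning a list of problems instead of raising lets a caller report every structural issue in one pass, which suits a gate that must explain why delivery is blocked.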
Select Program & Resolve Analysis Intent
- Determine whether the user wants standalone_exploratory or chain_ready output. If the user says they only want to inspect/analyze a program, see skill output, run a test, or are not producing BRD/spec, default to standalone_exploratory.
- Accept program ID (OBJ-*) from approved 01_inventory/inventory.yaml when available
- Confirm program name, type (RPGLE / CLLE / COBOL), and library
- Stop if program is marked blocked or inventory is not approved for a chain_ready request
- If inventory is missing in standalone_exploratory, continue analysis, mark inventory_linkage: missing, and create a downstream-readiness gap instead of blocked_pending_source
- Document source location and collection date
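The intent rules in this step can be sketched as one decision function. This is a minimal sketch under assumptions: `resolve_run` and its parameters are hypothetical, the `reason` strings are illustrative, and `inventory_linkage: present` is an invented counterpart to the documented `missing` value.

```python
APPROVED_INVENTORY = {"approved", "approved_with_non_blocking_tbd"}


def resolve_run(intent, inventory_status=None, obj_id=None, program_blocked=False):
    """Decide whether analysis may proceed and which status the artifact starts with."""
    if intent not in {"standalone_exploratory", "chain_ready"}:
        raise ValueError(f"unknown intent: {intent!r}")

    if intent == "chain_ready":
        # Chain-ready output stops on any inventory problem.
        if inventory_status not in APPROVED_INVENTORY:
            return {"proceed": False, "reason": "inventory is not approved"}
        if obj_id is None:
            return {"proceed": False, "reason": "OBJ-* not found in inventory"}
        if program_blocked:
            return {"proceed": False, "reason": "program is marked blocked"}
        return {"proceed": True, "status": "draft", "inventory_linkage": "present"}

    # Exploratory runs continue without inventory and record the gap instead.
    linkage = "missing" if obj_id is None else "present"
    return {"proceed": True, "status": "draft_exploratory", "inventory_linkage": linkage}


print(resolve_run("standalone_exploratory"))
print(resolve_run("chain_ready", inventory_status="draft", obj_id="OBJ-0001"))
```

The exploratory branch never invents an OBJ-* value: when none is supplied it only records that linkage is missing.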
2a. Load Approved Reference Packs (Optional)
- If the user, inventory, or context package provides company control files, message catalogs, code tables, data dictionaries, or dictionary-center mappings, resolve them through references/reference-pack-lookup.md.
- Prefer 00_reference_packs/<PACK-SLUG>/reference-pack-index.yaml for enterprise/shared packs and 00_context_packages/<MODULE-SLUG>/field-dictionary-context.md for module-specific dictionary context.
- Confirm authorization status is approved_for_analysis, internal_reference, or synthetic before using content. If status is unknown, do not silently consume the files; ask for approval or create a reference-pack TBD.
- For Excel, Word, PDF, image, or scanned inputs, prefer normalized Markdown/CSV/text outputs and coordinates from legacy-document-evidence-intake. If only raw files are supplied and they are not directly readable in the current runtime, route to legacy-document-evidence-intake or create a reference-pack readability TBD. Do not treat conversion/tooling failure as evidence that the lookup entry does not exist.
- Build a lookup index from exact identifiers only: message ID, status code, return code, SQLSTATE, CPF/MCH/RNX/CPD ID, file name, record format, field name, alias, standard_field_id, control-file field, and control-file value.
- Record the pack path, pack ID/version, owner, source format, normalized output path, document-intake manifest when available, and lookup coverage in Metadata. Reference packs are explanation evidence for observed identifiers; they do not prove that a branch, call, file mutation, or exception path exists.
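The authorization check and exact-identifier lookup described in this step can be sketched as follows. The pack structure used here (`authorization_status`, `path`, `entries` with `identifier`, `description`, `anchor`) is a hypothetical shape for illustration and is not the schema of templates/reference-pack-index.yaml.

```python
ALLOWED_STATUSES = {"approved_for_analysis", "internal_reference", "synthetic"}


def build_lookup_index(pack: dict) -> dict:
    """Index pack entries by exact identifier after checking authorization."""
    status = pack.get("authorization_status")
    if status not in ALLOWED_STATUSES:
        # Unknown or missing status: do not silently consume the pack.
        raise PermissionError(f"reference pack not authorized for analysis: {status!r}")
    index = {}
    for entry in pack["entries"]:
        index[entry["identifier"]] = {
            "description": entry["description"],
            # Keep a citation so every reference-pack fact can name its source.
            "citation": f'{pack["path"]}#{entry["anchor"]}',
        }
    return index


def explain(identifier: str, index: dict):
    """Exact-key lookup only; a miss returns None rather than a near match."""
    return index.get(identifier)


# Synthetic pack for illustration only.
pack = {
    "authorization_status": "synthetic",
    "path": "00_reference_packs/demo/messages.md",
    "entries": [{"identifier": "ORD0012", "description": "Credit limit exceeded", "anchor": "row-14"}],
}
index = build_lookup_index(pack)
print(explain("ORD0012", index))
print(explain("ORD001", index))
```

A miss is reported as None so the caller can raise a TBD; the lookup explains identifiers already observed in source and does not assert that any branch or call exists.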
Extract Entry Points & Parameters
- Identify main entry point (program parameter list, return value or status code)
- Identify callable sub-procedures:
  - RPGLE procedures: Fixed-form P ... B / P ... E or free-form dcl-proc / end-proc; both are invoked through a prototyped call (CALLP or use in an expression) or CALLB, not through the program-level CALL operation
  - RPGLE subroutines: BEGSR / ENDSR blocks (internal subroutines, callable only from within the same program)
  - CLLE: SUBR label blocks and CALLSUBR invocations (internal subroutines); external procedures via CALL
  - COBOL: ENTRY statements, PERFORM … UNTIL / VARYING
- Document parameter types, expected ranges, and direction (input/output/both)
- Tag with evidence: confirmed_from_code (from source headers or RPGLE specifications)
- Create TBD if parameter contract is undocumented or unclear
Trace Main Control Flow & Logic Ledger
- Document procedure call sequence (what calls what, in what order)
- For every load-bearing mainline segment, subroutine, procedure, or paragraph, add Routine Logic Details. Load-bearing means the routine performs field calculation, validation, downstream-affecting branching, file mutation, external handoff, error/status assignment, display/report output, or queue/message interaction.
- For routine-dense programs, keep `program-analysis.md` compact:
  - `routine_count <= 25`: full Routine Logic Details may live in the main analysis.
  - `routine_count > 25`: the main analysis must contain a Routine Logic Details summary table with `RLOG-<PROGRAM>-NNN` detail IDs, while full details live in `routine-logic-details.md` and `routine-logic-details.yaml`.
  - `routine_count > 80` or source lines > 10,000: split human-authored semantic detail into `routine-logic-details/part-*.md` or `routine-logic-details/deep-read-batch-*.md` retained batch checkpoint files by mainline/dispatch, state-changing routines, validation/message routines, external boundaries, and indexed utilities.
    - Each batch file must be SME-first and use this exact top-level `##` layout: `Calculation Logic`, `Validation Logic`, `Exception Handling`, `Scope`, `Batch Coverage Summary`, `Message Inventory`, `Routine Details`.
    - The core logic sections must summarize material calculations, assignments, validations, exception paths, message/status outcomes, and links to the routines in that batch.
    - `Message Inventory` must include one row per exact message, status value, return code, response literal, or operator text observed in that batch; do not group messages into family summaries. Then place the detailed routine sections below those summaries.
    - Core sections must not contain real source-code snippets or fenced code blocks; preserve evidence through source ranges and identifiers, not pasted code.
    - After batch deep-read is complete, merge every part file back into one final `routine-logic-details.md` SME review document. The final document must contain whole-program `## Calculation Logic`, `## Validation Logic`, `## Exception Handling`, `## Message Inventory`, a routine detail index, and all routine detail sections. Do not leave `part-*.md` or `deep-read-batch-*.md` files as the only SME review surface.
- In Routine Logic Details, explain each routine's execution trigger, step-by-step logic, field calculations/assignments, branch outcomes, exits, and evidence. Do not summarize a routine as "validation logic" or "amount calculation" without target fields, operands, expressions, branch guards, precision/conversion notes, and business effect.
- Within each Routine Logic Details subsection, add Conditioned calculation blocks for every guard-scoped calculation chain that affects money, percentage, quantity, status, return value, message/error code, persisted field, display/report field, queue payload, or downstream branch. In fixed-format RPG this includes conditioning indicators, named/numbered condition groups, result indicators, `IFxx`/`ELSE`/`ENDIF`, `CASxx`, `DO` scopes, and operation-level indicators. A block such as "Condition 5" must be analyzed as its own source-backed unit: list the guard, all guarded statements in order, each target assignment, intermediate variables, final output/error effect, and source line range. Do not hide these chains in a generic branch outcome or only in the Logic Decomposition Ledger.
- Within each Routine Logic Details subsection, add Outcome reverse traces for every material message ID, status code, return code, response value, indicator-driven outcome, or error field produced by the routine. Start from the visible outcome (for example `UCC1852` or `W0BUR`), walk backwards to the exact branch/guard that sets or emits it, then continue backwards through the conditioned calculation block, comparison threshold, intermediate variables, and source operands/carriers that make the outcome true. A result such as `UCC1852` must not be left as "warning/reject condition" if the source shows a trigger chain such as `CAACOS/CAACCM/CATTHD/CATCAM/CAACLT -> WOOVAM -> WOACLT -> WOOVPE -> BBLSOP -> UCC1852`.
- Within each Routine Logic Details subsection, connect calculated or assigned fields back to data carriers: source file/parameter/queue/screen field, intermediate work variables, output or persisted carrier, and the related Field Lineage / Field Mutation Matrix row when one exists.
- Within each Routine Logic Details subsection, close routine-local exceptions: trigger, error/status/message fields, handling action, downstream skip/rollback/output, Validation Logic row, and evidence. If no exception path is observed for a routine, state `none observed` rather than leaving the closure implicit.
- Identify conditional branching:
  - RPGLE: IF, SELECT, indicator-driven branching
  - CLLE: IF, ELSE, GOTO
  - COBOL: IF, EVALUATE, PERFORM … UNTIL
- Identify loops and exit conditions (DO, DOWHILE, PERFORM, VARYING)
- Build the Logic Decomposition Ledger for rules that must not be compressed into generic prose:
  - arithmetic operations (`ADD`, `SUB`, `MULT`, `DIV`, `EVAL` expressions) with source operands, result fields, and rounding or precision behavior
  - string construction (`CAT`, `MOVE`, substring operations) where it creates business identifiers, keys, or external payload fields
  - constants and literals used in limits, status values, rates, return codes, message IDs, flags, or branch decisions
  - single-condition, compound-condition, nested-condition, `SELECT`/`CASE`, and loop rules with explicit branch priority
- Preserve condition order and nesting when it changes behavior. Do not flatten mutually exclusive tiers or fallback branches into unrelated bullet points.
- Build a front-loaded Calculation Logic section immediately after the title and before Metadata. It must summarize the whole program's material calculation and assignment chains for IT SME first-read review, including arithmetic, derived amounts, status/result assignments, key construction, message/status carriers, outbound payload fields, and persisted field updates. Every row must link back to Routine Logic Details, a conditioned calculation block, the Logic Decomposition Ledger, Key File & Field Logic, or a Field Mutation row. Do not leave a critical calculation only in a later routine-local subsection or ledger.
- Add Routine / Window Data Flow for every load-bearing routine or deep-read window: purpose, input variables, transformation logic, output variables, side effects, source lines, and evidence. Use `VARIABLE_NAME(business meaning) [direction]`, and mark inferred or unresolved meanings explicitly.
- Document handled vs. unhandled paths
- Tag each non-trivial control structure: `confirmed_from_code`, or `medium_confidence` if inferred
- Create TBD for unclear program flow (missing subroutines, undefined labels)
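The outcome reverse trace described in this step can be pictured as a backward walk over "derived from" edges. The sketch below reuses the example chain quoted in the text; the edge map, the function names, and the breadth-first order are illustrative assumptions, and a real trace would attach source line ranges to every hop.

```python
from collections import deque

# Edges mirror the example chain in the text: each key is produced from the
# listed fields. A real map would be extracted from source, with line ranges.
DERIVED_FROM = {
    "UCC1852": ["BBLSOP"],
    "BBLSOP": ["WOOVPE"],
    "WOOVPE": ["WOACLT"],
    "WOACLT": ["WOOVAM"],
    "WOOVAM": ["CAACOS", "CAACCM", "CATTHD", "CATCAM", "CAACLT"],
}


def reverse_trace(outcome, derived_from):
    """Walk backwards (breadth-first) from an outcome to everything feeding it."""
    order, seen, queue = [], {outcome}, deque([outcome])
    while queue:
        node = queue.popleft()
        order.append(node)
        for parent in derived_from.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return order


def source_operands(outcome, derived_from):
    """Fields in the trace that have no recorded upstream: the source carriers."""
    return [n for n in reverse_trace(outcome, derived_from) if n not in derived_from]


print(" <- ".join(reverse_trace("UCC1852", DERIVED_FROM)))
print(source_operands("UCC1852", DERIVED_FROM))
```

A field that ends the walk only because its upstream is missing from the map is a tracing gap, which belongs in a TBD rather than being reported as a proven source operand.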
Build Program Call Map
- Treat the call map as the default entry view for every program, regardless of size. It should resemble an IBM RDi call-hierarchy view: routines and external boundaries connected by call edges, not a statement-level flowchart.
- First, look for a source-level flow header at the top of the program. Many IBM i shops embed an ASCII tree comment like:

      Main line  Main flow control
      |-- SR990  First time initialization
      |-- SR100  Card validation
      |   |-- SR110  Currency conversion
      |   |   |-- SR111  Convert transaction amount

  If present, use it as documented intent and navigation evidence (tier 3 evidence per `../../docs/code-as-ground-truth.md`) and render the `Visual Overview` as a normalized fenced `text` hierarchy using `|--` branches. Preserve useful author labels, but reconcile the tree against actual EXSR/CALL statements before presenting it as the program call map. The header is not authoritative when it disagrees with actual EXSR/CALL statements.
- Independently derive a program call map from code by scanning for EXSR / CALLP / CALL / PERFORM / CALLPRC statements and BEGSR-ENDSR / `dcl-proc`-`end-proc` (or fixed-form `P ... B` / `P ... E`) / paragraph definitions.
- Compare header vs. code-derived graph:
  - If they match → tag `confirmed_from_code` (with source-level flow header as evidence)
  - If header exists but code differs → create TBD (comment drift, dead code, or missing subroutine)
  - If no header exists → use code-derived graph, tag `confirmed_from_code` (from EXSR/CALL/PERFORM statements)
- Produce the required views (see `references/output-contract.md`):
  - Visual overview (RDi-style fenced `text` hierarchy using `|--` branches; compact by design but detailed enough to orient SMEs; not Mermaid). For routine-dense programs, this view is not expected to enumerate every routine, but it must state its display coverage before the tree, for example: `Visual coverage: main dispatch and high-impact branches only (shows 22 of 124 routines); complete routine inventory is in routine-index.md, Node Inventory, and Call Evidence.`
  - Node inventory (mainline, internal routines, procedures, external programs, APIs, queues, services)
  - Call Evidence (caller, callee, call type, condition, source lines, evidence source, and resolution status)
  - Reverse caller index (for each node, who calls it)
- Use the wording `Evidence basis: source-level flow header + derived call analysis` when header and code-derived evidence are both used.
- Treat dynamic calls as unresolved until a concrete target is proven from source, runtime evidence, inventory, or SME notes. Capture the target variable and assignment lines when available.
- Identify hub/common candidates and orphaned routines. A hub/common label is structural only; do not promote it into a business service or modernization boundary without flow/module evidence and SME confirmation.
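The header-versus-code reconciliation can be sketched as a set comparison of call edges. This is an assumption-laden illustration: it handles only free-form `begsr`/`endsr`/`exsr` lines, the `MAINLINE` node name is invented for the example, and the result shape is hypothetical.

```python
import re


def derive_call_edges(source_lines):
    """Collect (caller, callee) edges from free-form BEGSR/ENDSR and EXSR lines."""
    edges, current = set(), "MAINLINE"
    for line in source_lines:
        text = line.strip().rstrip(";").upper()
        begin = re.match(r"BEGSR\s+(\w+)", text)
        call = re.match(r"EXSR\s+(\w+)", text)
        if begin:
            current = begin.group(1)
        elif text.startswith("ENDSR"):
            current = "MAINLINE"
        elif call:
            edges.add((current, call.group(1)))
    return edges


def reconcile(header_edges, code_edges):
    """Tag the call map, or list TBDs when a flow header disagrees with code."""
    if not header_edges or header_edges == code_edges:
        return {"tag": "confirmed_from_code", "tbds": []}
    drift = sorted(header_edges ^ code_edges)
    return {"tag": None,
            "tbds": [f"header/code mismatch on {a} -> {b}" for a, b in drift]}


source = ["exsr SR990;", "exsr SR100;",
          "begsr SR100;", "  exsr SR110;", "endsr;",
          "begsr SR110;", "  exsr SR111;", "endsr;"]
header = {("MAINLINE", "SR990"), ("MAINLINE", "SR100"),
          ("SR100", "SR110"), ("SR110", "SR111")}
print(reconcile(header, derive_call_edges(source)))
```

Whatever the header says, the code-derived edges remain the behavioral claim. The mismatch list only feeds TBDs about comment drift, dead code, or a missing subroutine.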
List Object Dependencies (flat reference inventory)
- Match the shop's `F5-OBJREF TREE` output format: enumerate every external object referenced by the program, regardless of how it is used.
- Object types to cover:
  - Files: PF (physical), LF (logical), DSPF (display), PRTF (printer)
  - Data: `*DTAARA` (data area), `*DTAQ` (data queue), `*MSGF` (message file)
  - Programs: `*PGM` (called program), `*SRVPGM` (bound service program)
  - Source-level: Copybooks / `/COPY` directives, data-structure includes
- Capture columns: object name, type, version (if shop tracks one), description, inventory ID (OBJ-*), evidence
- Cross-reference each entry against `01_inventory/inventory.yaml`:
  - If a matching OBJ-* exists → link it
  - If not → create TBD: inventory gap (the inventory missed this object)
- This section is the flat parent list; deeper per-object analysis happens in steps 7 (Data Touch Map + File I/O) and 8 (External Calls).
- Tag: `confirmed_from_code` (visible in F-spec, D-spec, /COPY, or CALL statement)
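The inventory cross-reference can be sketched as a keyed lookup. The inventory entry shape (`id`, `name`, `type`), the object names, and the TBD wording are assumptions for illustration; the actual layout of `01_inventory/inventory.yaml` is not shown in this document.

```python
def cross_reference(referenced_objects, inventory):
    """Link each referenced object to its OBJ-* ID, or record an inventory-gap TBD."""
    by_key = {(entry["name"], entry["type"]): entry["id"] for entry in inventory}
    rows = []
    for obj in referenced_objects:
        obj_id = by_key.get((obj["name"], obj["type"]))
        gap = None if obj_id else f"inventory gap: {obj['name']} ({obj['type']})"
        rows.append({**obj, "inventory_id": obj_id, "tbd": gap})
    return rows


# Hypothetical inventory entries and program references.
inventory = [{"id": "OBJ-0007", "name": "CUSTMAST", "type": "PF"}]
referenced = [
    {"name": "CUSTMAST", "type": "PF", "evidence": "F-spec"},
    {"name": "ORDQ", "type": "*DTAQ", "evidence": "CALL statement"},
]
for row in cross_reference(referenced, inventory):
    print(row["name"], row["inventory_id"], row["tbd"])
```

Matching on both name and type keeps a file and a program that share a name from being linked to the same inventory entry.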
Document Data Touch Map, Key Fields & File I/O
- Build a program-local Data Touch Map before the detailed File I/O table. This is the data companion to the Program Call Map: it shows which routines touch which data carriers and where state enters, changes, leaves, or persists.
- Track object / record / critical-field level movement. Do not enumerate every temporary RPG work variable.
- Critical fields include money/amount fields, account/customer IDs, inventory quantities, approval/decline decisions, posting flags, status fields, return codes, error codes, audit IDs, and external message payload fields.
- For every data carrier, capture:
  - carrier/object (PF, LF, DSPF, PRTF, `*DTAARA`, `*DTAQ`, `*MSGQ`, CALL parameter list, copybook/data structure, IFS file)
  - routine/procedure touching it
  - operation (read, write, update, delete, send, receive, parameter in/out/inout)
  - key/payload fields
  - state impact (`read-only`, `creates`, `updates`, `deletes`, `async send`, `async receive`, `external handoff`)
  - evidence and TBDs for unclear payload structure or direction
- Build the Key File & Field Logic section:
  - classify each key file as driver, lookup/reference, state update, transaction/detail insert, audit/log, screen/report, queue/message, or parameter/data-structure carrier
  - classify each key field by role: access key, input, derived value, calculation operand/result, branch condition, status/flag, return/error code, message ID, external parameter, persisted field, or audit/output field
  - show field lineage for critical fields as `physical/source field -> alias/data structure -> work variable -> calculation/condition -> write-back alias -> persisted field`
  - preserve source identifiers and business meaning together: `FIELD_NAME(business meaning)`. If one side is unresolved, write `field unresolved(business meaning)` or `FIELD_NAME(meaning unresolved)`; mark inferred meanings inline.
  - preserve variable flow as `VARIABLE_NAME(business meaning) [input/output/input-output/local/control]`
  - when an approved reference pack maps a field, include `standard_field_id`, reference-pack file/row, dictionary version, and owner/steward when available
  - if a control-file/code-table lookup gives a value meaning, cite it as reference-pack evidence while keeping the condition and assignment source-backed
  - if dictionary/control-file meaning conflicts with code behavior, keep both facts visible and create a contradiction TBD instead of selecting one silently
  - create TBDs when DDS/copybook metadata is missing, physical-field mapping is unclear, or a variable participates in a critical path but its source cannot be proven
- For each file from step 6 (PF / LF / DSPF / PRTF), classify operations:
  - Sequential read: READ (next record), READP / READPE (read previous)
  - Random read: SETLL (set lower limit), READE (read equal), CHAIN (random access)
  - Write operations: WRITE (add new record), UPDATE (modify record), DELETE
  - Display file: EXFMT (format + read interaction with DSPF for menus, input screens)
  - Embedded SQL / SQLRPGLE: EXEC SQL blocks, cursor operations, SQLCODE / SQLSTATE error handling
- Note key fields used in each operation (e.g., CHAIN on CUSTID)
- For every WRITE / UPDATE / DELETE / SQL mutation, produce a field-level mutation row that names:
  - record format and access key
  - branch condition that permits the mutation
  - fields assigned immediately before the mutation
  - source value, literal, expression, or copied field used for each persisted field
  - indicators, `%FOUND`, `%ERROR`, `SQLCODE`, `SQLSTATE`, or return codes checked before and after the mutation
  - a File Access Summary purpose using an action verb that explains why the file is accessed; do not use Purpose as a substitute for field descriptions
- For file-I/O-dense or SQLRPGLE programs, keep the main `File I/O` section as an SME-readable summary. Store complete observed I/O evidence in `file-io-inventory.md` / `file-io-inventory.yaml`, persisted native and SQL mutations in `field-mutation-matrix.md` / `field-mutation-matrix.yaml`, and embedded SQL details in `sql-inventory.md` / `sql-inventory.yaml`. Link main-table rows to stable `FIO-*`, `MUT-*`, and `SQL-*` detail IDs instead of expanding every operation inline.
- Reference file definitions from inventory via evidence ID (EV-*)
- Tag evidence: `confirmed_from_code` (from file specifications or I/O statements)
- Create TBD if DDS is missing, key field unclear, or SQL schema is not documented
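The operation classes listed in this step map naturally onto a small lookup table. The sketch below assumes the I/O statements have already been extracted as `(line, opcode, file)` tuples; that input shape, the row keys, and the `unclassified` fallback are illustrative choices, not part of the skill contract.

```python
# Operation classes as grouped in the step above.
OPERATION_CLASSES = {
    "READ": "sequential read", "READP": "sequential read", "READPE": "sequential read",
    "SETLL": "random read", "READE": "random read", "CHAIN": "random read",
    "WRITE": "write", "UPDATE": "write", "DELETE": "write",
    "EXFMT": "display file",
}
# Opcodes that require a field-level mutation row.
MUTATING = {"WRITE", "UPDATE", "DELETE"}


def classify_io(statements):
    """Turn (line, opcode, file) tuples into File I/O rows."""
    rows = []
    for line_no, opcode, file_name in statements:
        op = opcode.upper()
        rows.append({
            "file": file_name,
            "opcode": op,
            "class": OPERATION_CLASSES.get(op, "unclassified"),
            "needs_mutation_row": op in MUTATING,
            "line": line_no,
        })
    return rows


# Hypothetical extracted statements.
statements = [(120, "chain", "CUSTMAST"), (145, "update", "CUSTMAST"),
              (200, "exfmt", "ORDSCRN"), (210, "open", "CUSTMAST")]
for row in classify_io(statements):
    print(row)
```

Keeping an explicit `unclassified` bucket makes unfamiliar opcodes visible for review instead of forcing them into a read or write class.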
Identify External Calls
- List all external program calls:
  - RPGLE: CALL, CALLP (procedure call)
  - CLLE: CALL, CALLPRC
  - COBOL: CALL, CICS, MQ, REST
- List external service program calls (binding)
- List external interfaces: IFS, HTTP, message queues, data queues
- Document parameter contract if visible in source (call statement, copybook)
- For dynamic calls, document the target variable, source lines where the target is assigned, parameters, resolution status (`resolved`, `partially_resolved`, `dynamic_unresolved`, `inferred`, `confirmed`), and any evidence gap.
- Tag: `confirmed_from_code` (source statement visible) or `needs_sme_review` (undocumented)
- Create TBD if external interface is unknown or network-dependent
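As a rough sketch of the dynamic-call handling, the snippet below separates literal call targets from variable targets and collects candidate assignment lines. The language rules are simplified assumptions (a quoted literal in RPG, a name without `&` in CL), only two of the resolution statuses are produced, and the assignment search also matches comparisons, so its hits need human confirmation.

```python
import re


def classify_call_target(token, language):
    """Return 'resolved' for a literal target, 'dynamic_unresolved' for a variable."""
    if language == "rpgle":
        is_literal = token.startswith("'") and token.endswith("'")
    elif language == "clle":
        is_literal = not token.startswith("&")
    else:
        raise ValueError(f"unsupported language: {language}")
    return "resolved" if is_literal else "dynamic_unresolved"


def assignment_lines(variable, source_lines):
    """1-based lines where an RPG variable appears on the left of '='."""
    pattern = re.compile(rf"\b{re.escape(variable)}\s*=", re.IGNORECASE)
    return [n for n, line in enumerate(source_lines, start=1) if pattern.search(line)]


# Hypothetical free-form RPG fragment.
source = [
    "pgmName = 'ORD100';",
    "callp(e) dynCall(pgmName: orderId);",
    "pgmName = altPgm;",
]
print(classify_call_target("'ORD100'", "rpgle"))  # resolved
print(classify_call_target("pgmName", "rpgle"))   # dynamic_unresolved
print(assignment_lines("pgmName", source))        # [1, 3]
```

Finding a literal assigned to the target variable does not by itself resolve the call. The status should change only once the assignment is shown to reach the call site on the path being documented.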
Document Error Handling
- List monitored errors:
  - RPGLE: MONITOR / ON-ERROR block, escape messages
  - CLLE: MONMSG (monitor message)
  - COBOL: ON ERROR, error flag checking
- Document error codes and recovery paths (retry, log, return error)
- Document logged errors or message writes (DSPLY, message queue, spool)
- Document unhandled exceptions (crash, abort)
- Build an Exception Closure Ledger that covers every observed business, parameter, and system I/O exception path. Each row must include trigger condition, message ID / error code / return code, detection mechanism, fields set, handling action (`RETURN`, `GOTO`, rollback, skip write, log, send message, continue), and downstream impact.
- Build a front-loaded Validation Logic section immediately after Calculation Logic and before Exception Handling. It must list every validation, status, response, return-code, message, indicator-driven outcome, and generic handler outcome observed in source or message-file references, including `CPF*`, `CPD*`, `MCH*`, `RNX*`, `SQL*`, shop-local `UCC*`/`LCC*`, and literal business error codes. Do not limit the inventory to shop-local message prefixes or bury it inside Error Handling.
- Populate Validation Logic with Message / Status Code, Message Description, Validation / Error Type, Set By / Source Lines, Trigger Condition, Reverse Trigger Chain / Routine Logic Link, Output Carrier, Downstream Effect, and Evidence Status. Include status codes, response codes, indicator-driven error branches, exception/log output codes, data queue response status values, and message/status fields assigned during validation or file I/O failures.
- Create one Validation Logic row per explicit message ID, status code, return code, response value, SQLSTATE, CPF/MCH/RNX/CPD message, user-defined code, or generic catch-all token. Do not group multiple message IDs into one row and do not replace individual descriptions with summary labels such as "validation messages" or "call-specific message IDs".
- Populate Message Description from message files, approved reference-pack message catalogs/control files/code tables, source literals, comments, runtime evidence, vendor references, or SME notes. If the description is not available, write `unresolved - message description not available`, mark the row unresolved, and create a TBD / Open Item.
- If literal code assignments cannot be fully traced, state `Validation logic unresolved:` with the concrete tracing gap.
- For every material Validation Logic row, populate Reverse Trigger Chain / Routine Logic Link. This must point to the Routine Logic Details subsection, conditioned calculation block, outcome reverse trace, or source-backed TBD that explains why the code is set. Generic triggers such as "validation failed", "warning/reject condition", or "product/group control check" are not sufficient when source operands, comparisons, or intermediate calculations are visible.
- When a catch-all handler is present (`MONMSG MSGID(*ANY)`, bare `ON-ERROR`, generic exception paragraph), mark it as generic coverage and still list the specific observed messages handled elsewhere. Do not infer specific message IDs from a generic handler.
- Build a front-loaded Exception Handling section immediately after Validation Logic and before Message Inventory. It must summarize every observed business, parameter, file I/O, SQL, external-call, system, and generic-handler exception path with trigger, detection mechanism, fields/messages set, handling action, downstream effect, supporting detail link, and evidence. Every path must show whether it closes through return, rollback, skip, continue, abort, log/message output, or downstream suppression.
- Build a front-loaded Message Inventory section immediately after Exception Handling and before Metadata. It must create one summary row per exact message ID, status value, return code, response literal, SQLSTATE, CPF/MCH/RNX/CPD message, operator message, or shop-local message token observed in the program. Preserve exact codes/literals, include the best available short description, occurrence count, primary routines, first seen/set-by location, trigger/handler summary, and detail ID. For message-dense, segmented, or large programs, keep the main section compact and store full per-occurrence detail in `message-inventory.md` and `message-inventory.yaml`. When a message/control value is found in an approved reference pack, set Message Source to `reference pack: <pack_id>/<file>#<row-or-anchor>` and still trace Emitted / Set By and Trigger / Handler from source in the sidecar detail. If no description is available, write `unresolved - message description not available`, create an Open Item / TBD, and mark the artifact not final-review-ready until the message file, catalog, reference pack, source literal/comment, runtime evidence, or SME-approved description is supplied.
- Tag: `confirmed_from_code` (explicit error block) or `strongly_inferred` (pattern-based)
- Create TBD if error handling is unclear or context-dependent
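The one-row-per-message rule can be sketched as a scan that keys rows by the exact message ID. The regular expression (listed prefixes plus four hex digits), the row keys, and the sample lines are assumptions for illustration; literal status values and return codes would need their own extraction rules.

```python
import re

# Assumed shape: a listed prefix followed by four hex digits, e.g. UCC1852.
MESSAGE_ID = re.compile(r"\b(?:CPF|CPD|MCH|RNX|SQL|UCC|LCC)[0-9A-F]{4}\b")
UNRESOLVED = "unresolved - message description not available"


def build_validation_rows(source_lines, descriptions):
    """One row per exact message ID, with every source line where it appears."""
    rows = {}
    for line_no, line in enumerate(source_lines, start=1):
        for msg_id in MESSAGE_ID.findall(line):
            row = rows.setdefault(msg_id, {
                "message": msg_id,
                "description": descriptions.get(msg_id, UNRESOLVED),
                "source_lines": [],
            })
            row["source_lines"].append(line_no)
    return [rows[key] for key in sorted(rows)]


# Hypothetical source fragment and description catalog.
source = [
    "MONMSG MSGID(CPF9801) EXEC(GOTO CMDLBL(ERR))",
    "wkMsgId = 'UCC1852';",
    "MONMSG MSGID(CPF9801)",
]
catalog = {"UCC1852": "placeholder description from an approved pack"}
for row in build_validation_rows(source, catalog):
    print(row)
```

Rows whose description stays unresolved should each raise an Open Item, and no row is ever synthesized for an ID that does not literally appear in source.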
Mark Redundancy Candidates Conservatively
- Do not delete or suppress code during program analysis.
- Mark a move, assignment, temporary variable, branch, or routine as `candidate_redundancy: yes` only when it is not observed in any calculation, condition, file mutation, log/message, exception path, external output, parameter handoff, or persisted field lineage.
- Use `candidate_redundancy: unknown` when downstream source, copybooks, DDS, or called-program behavior is missing.
- Preserve the trace that proves the decision.
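The decision rule can be sketched as a three-way function. The `yes` and `unknown` outcomes come from the rules above; the `no` value, the usage-kind labels, and the precedence (an observed use wins over missing evidence) are assumptions made for the example.

```python
# Usage kinds named in the rule above; the label spellings are illustrative.
USAGE_KINDS = {
    "calculation", "condition", "file_mutation", "log_message",
    "exception_path", "external_output", "parameter_handoff", "persisted_lineage",
}


def candidate_redundancy(observed_uses, missing_evidence):
    """Return 'no', 'unknown', or 'yes' for one statement, variable, or routine.

    observed_uses: usage kinds proven from source for this element.
    missing_evidence: names of source, copybooks, DDS, or called programs
    that could not be inspected.
    """
    unknown_kinds = set(observed_uses) - USAGE_KINDS
    if unknown_kinds:
        raise ValueError(f"unrecognized usage kinds: {sorted(unknown_kinds)}")
    if observed_uses:
        return "no"        # assumed value: a proven use rules out redundancy
    if missing_evidence:
        return "unknown"   # cannot prove absence of use
    return "yes"


print(candidate_redundancy(["condition"], []))        # no
print(candidate_redundancy([], ["CUSTMAST DDS"]))     # unknown
print(candidate_redundancy([], []))                   # yes
```

The inputs passed to the function double as the trace that proves the decision, so they should be kept alongside the result.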
Prepare for SME Review
- Consolidate all TBDs created in steps 3–10 with clear blocking status:
  - `pending_source`: missing DDS, incomplete source
  - `pending_sme_judgment`: behavior unclear from source alone
  - `non_blocking`: known gaps that don't affect downstream analysis
- Add or preserve a centralized Open Items / Limitations table with Open Item, Impact, Evidence Gap, and Suggested Follow-up.
- Generate review checklist for SME validation
- Mark analysis as `blocked_pending_source` when missing or incomplete source prevents safe analysis; otherwise mark as `draft` (ready for review)
- Gate: Analysis artifact is ready when every non-TBD behavior has an evidence_strength of `confirmed_from_code`, `strongly_inferred`, or `medium_confidence` (the latter two only when an SME review note is attached)
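The readiness gate can be sketched as a check over behavior rows. The row keys (`id`, `tbd`, `evidence_strength`, `sme_note`) are hypothetical; only the three allowed strengths and the SME-note condition come from the gate described above.

```python
ALLOWED = {"confirmed_from_code", "strongly_inferred", "medium_confidence"}
NEEDS_SME_NOTE = {"strongly_inferred", "medium_confidence"}


def gate_failures(behaviors):
    """Return (behavior id, reason) pairs; an empty list means the gate passes."""
    failures = []
    for behavior in behaviors:
        if behavior.get("tbd"):
            continue  # TBD rows are excluded from the gate
        strength = behavior.get("evidence_strength")
        if strength not in ALLOWED:
            failures.append((behavior["id"], "evidence_strength missing or not allowed"))
        elif strength in NEEDS_SME_NOTE and not behavior.get("sme_note"):
            failures.append((behavior["id"], "SME review note required"))
    return failures


# Hypothetical behavior rows.
behaviors = [
    {"id": "B1", "evidence_strength": "confirmed_from_code"},
    {"id": "B2", "evidence_strength": "medium_confidence"},
    {"id": "B3", "evidence_strength": "strongly_inferred", "sme_note": "reviewed"},
    {"id": "B4", "tbd": "TBD-BILLING-001"},
    {"id": "B5", "evidence_strength": "needs_sme_review"},
]
print(gate_failures(behaviors))
```

Returning reasons rather than a bare boolean lets the failures feed directly into the Open Items / Limitations table.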
Workflow State Write-Back
At the end of a program-analysis run, update `<project-root>/workflow-state.yaml` per `docs/workflow-state-contract.md`.
Template: `skills/legacy-modernization-orchestrator/references/state-writeback-snippet.md`.
Stage this skill produces:
- `3b Program Analysis Done` when every in-scope program in `inventory.yaml` has an approved `program-analysis.md`
- `3a Program Analysis In Progress` when one or more in-scope programs still lack an analysis
Last artifact path pattern: `02_programs/<MODULE-SLUG>/<OBJ-PROGRAM>/program-analysis.md`
Writes per run:
- Overwrite `capabilities[<CAP-* from current_focus>]` with stage id, the path of the analysis you just saved, `last_skill: legacy-ibmi-program-analyzer`, and blocking IDs (`tbds`, `sme_pending` for any money / inventory / compliance branch awaiting SME).
- Append one `history[]` entry with `note` naming the program analyzed (e.g. `"analyzed ORDENTR"`).
- Overwrite `project.last_updated_at` / `project.last_updated_by`.
Never touch `current_focus`, other capabilities' entries, or past `history[]` rows. A re-run on the same program is allowed; a re-run that would lower `stage_id` requires the orchestrator's Rollback Protocol.
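The three writes can be sketched on an in-memory dictionary. The authoritative shape lives in `docs/workflow-state-contract.md`, which is not reproduced here, so several details are assumptions: that `current_focus` holds a CAP-* ID directly, the `last_artifact` key name, and the `at`/`by` fields on the history entry.

```python
import copy


def write_back(state, stage_id, artifact_path, program, tbds, sme_pending, actor, now):
    """Return an updated copy of the workflow state; the input is not mutated."""
    new_state = copy.deepcopy(state)
    cap_id = new_state["current_focus"]  # assumed to hold a CAP-* ID; read only
    new_state["capabilities"][cap_id] = {
        "stage_id": stage_id,
        "last_artifact": artifact_path,  # hypothetical key name
        "last_skill": "legacy-ibmi-program-analyzer",
        "tbds": list(tbds),
        "sme_pending": list(sme_pending),
    }
    # Append only; earlier history rows are left as they were.
    new_state["history"].append({"at": now, "by": actor, "note": f"analyzed {program}"})
    new_state["project"]["last_updated_at"] = now
    new_state["project"]["last_updated_by"] = actor
    return new_state


# Hypothetical starting state.
state = {
    "current_focus": "CAP-001",
    "capabilities": {"CAP-001": {"stage_id": "3a"}, "CAP-002": {"stage_id": "2"}},
    "history": [{"note": "inventory approved"}],
    "project": {},
}
updated = write_back(state, "3a", "02_programs/billing/OBJ-0001/program-analysis.md",
                     "ORDENTR", ["TBD-BILLING-001"], [], "analyst", "2026-06-16T00:00:00Z")
print(updated["history"][-1]["note"])  # analyzed ORDENTR
```

The sketch deliberately does not implement the stage-lowering check, since that path belongs to the orchestrator's Rollback Protocol.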
Anti-Hallucination Rules
Code is ground truth. See ../../docs/code-as-ground-truth.md for
the full principle. When source code disagrees with comments,
source-level flow headers, shop tool outputs, or SME recollection,
the code wins for behavioral claims. The disagreement itself
becomes a TBD asking the SME to confirm whether the secondary source
is stale or the code drifted from intent — but until then, the
analysis describes what the code actually does.
Do NOT invent:
- Subroutine entry points not directly visible in source (no inventing procedure blocks or subroutine labels)
- File access beyond what I/O statements (SETLL, READE, CHAIN, WRITE, UPDATE, DELETE) directly show
- Business logic from field names alone (e.g., a field named CREDLIMIT does not explain why the limit exists or how it's used)
- External call parameters if not visible in source headers, copybooks, or CALL statements
- Error codes if not explicitly caught or returned
- Key-file roles, field lineage, or field mutations when the source only proves a generic file reference
- Message IDs from generic handlers or message prefix conventions alone
Instead:
- If DDS is missing, create `TBD-<SLUG>-NNN: Confirm field meaning from [file-name] DDS`
- If subroutine is referenced but undefined, create `TBD-<SLUG>-NNN: Locate subroutine definition [name]`
- If error handling is unclear, tag `needs_sme_review` instead of inventing error recovery
- If external interface parameters are unknown, create `TBD-<SLUG>-NNN: Confirm parameter contract for [program-name]`
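A small helper can keep `TBD-<SLUG>-NNN` identifiers sequential. The three-digit zero padding and the next-free-number policy are assumptions read off the `NNN` placeholder, and the function name and slug are hypothetical.

```python
def next_tbd_id(slug, existing_ids):
    """Return the next TBD-<SLUG>-NNN identifier after the highest one in use."""
    prefix = f"TBD-{slug}-"
    used = [
        int(tbd_id[len(prefix):])
        for tbd_id in existing_ids
        if tbd_id.startswith(prefix) and tbd_id[len(prefix):].isdigit()
    ]
    return f"{prefix}{max(used, default=0) + 1:03d}"


existing = ["TBD-BILLING-001", "TBD-BILLING-002", "TBD-ORDERS-007"]
new_id = next_tbd_id("BILLING", existing)
print(f"{new_id}: Confirm field meaning from [file-name] DDS")
```

Numbering from the highest existing ID, rather than filling gaps, avoids reusing the number of a TBD that was closed or removed.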
Evidence minimum:
- Every non-trivial behavior must have ≥1 evidence link, even if evidence_strength is low
- Do not document "likely" behavior without explicit evidence tag
- TBD questions count as evidence of a gap, not coverage
SME Review Questions
The generated program-analysis-<OBJ-ID>.md must include a checklist. Before approval, SME must validate:
- Entry points are correct and complete (no missing callable subroutines)
- Program Call Map keeps a compact ASCII hierarchy Visual Overview and a traceable Call Evidence table
- Parameter contracts match actual usage (no invented parameters)
- Routine Logic Details summarize every load-bearing routine/subroutine/procedure in the main analysis and route routine-dense detail to `routine-logic-details.md` / `routine-logic-details.yaml` with stable `RLOG-*` IDs
- Routine Logic Details or sidecar detail explain field calculations, carrier/lineage ties, routine-local exception closure, branch outcomes, exits, and evidence for each deep-read load-bearing routine
- Routine Logic Details or sidecar detail break out every material conditioned calculation block, including RPG conditioning indicators / condition groups such as `Condition 5`, with guarded statements, calculation order, target fields, intermediate variables, final output/error effect, and source evidence
- Routine Logic Details or sidecar detail include outcome reverse traces from every material message/status/error/return outcome back to the branch guard, conditioned calculation block, comparison threshold, intermediate variables, and source operands/carriers that make the outcome true
- Calculation Logic is front-loaded immediately after the title, summarizes the whole program's material calculations/assignments, and links every row to Routine Logic Details, Logic Decomposition, Key File & Field Logic, or Field Mutation evidence
- Logic Decomposition Ledger preserves calculations, constants, branch priority, loops, and CASE/SELECT behavior
- Routine / Window Data Flow shows variable-level input, transformation, output, side effects, source lines, and evidence
- Data Touch Map captures critical carriers, keys, payloads, and state impacts
- Key File & Field Logic preserves source identifiers with business meanings for key fields, aliases, work variables, calculations/conditions, and persisted fields
- File I/O Key Fields preserve source identifiers plus business meanings, and Purpose describes file access behavior
- File I/O field mutation matrix names which files and fields are written, updated, deleted, or skipped, and dense I/O/SQL detail is routed to `file-io-inventory.md` / `file-io-inventory.yaml`, `field-mutation-matrix.md` / `field-mutation-matrix.yaml`, and `sql-inventory.md` / `sql-inventory.yaml`
- External and dynamic calls include caller routine, source lines, parameters, resolution status, purpose, and evidence
- Validation Logic is front-loaded immediately after Calculation Logic, has one row per message/status/return/response/generic outcome with message descriptions and reverse trigger chains, and Error Handling closes each exception path through return, rollback, skip, log, or downstream impact
- Exception Handling is front-loaded immediately after Validation Logic, covers every observed business/parameter/I/O/external/system/generic exception path, and links each row to closure evidence
- Message Inventory is front-loaded immediately after Exception Handling, has one summary row per explicit message/code/literal, links message-dense details to `message-inventory.md` / `message-inventory.yaml`, and preserves description source, carrier/destination, trigger/handler, related Validation/Exception row, and evidence status in the summary or sidecar
- Inferred and unresolved calls, fields, variable meanings, and error codes are explicitly marked
- Code identifiers remain intact and readable in rendered tables/lists
- Redundancy candidates are conservative and do not remove hidden rules
- TBDs are non-blocking or properly flagged for follow-up
- Analysis contains no invented subroutines or undocumented file access
Runtime Portability
Canonical source: skills/legacy-ibmi-program-analyzer/SKILL.md
Runtime adapters are synced via scripts/sync-skills.sh:
- Codex: `.codex/skills/legacy-ibmi-program-analyzer/SKILL.md`
- Claude Code: `.claude/skills/legacy-ibmi-program-analyzer/SKILL.md`
- OpenCode: `.opencode/skills/legacy-ibmi-program-analyzer/SKILL.md`
No runtime-specific assumptions are embedded in the canonical version.
Version History
v0.1.0 (2026-05-14): Initial release
- 10-step workflow for RPGLE, CLLE, COBOL (with Program Call Map extraction and flat object-dependency listing)
- Entry point extraction, control flow tracing, file I/O documentation
- External call and error handling detection
- Evidence tagging and TBD handling
- SME review checklist
- Positive and negative examples
v0.2.0 (2026-06-01): Program-chain readiness tightening
- Added Logic Decomposition Ledger, Key File & Field Logic, field-level File I/O mutation matrix, Exception Closure Ledger, and conservative redundancy candidate notes
- Required every observed message ID / error code and every critical field lineage or mutation to carry evidence or a TBD
v0.2.1 (2026-06-02): Evidence-first report format tightening
- Renamed the Program Call Map tree-style subsection to auditable `Call Evidence`
- Required source identifier + business meaning for key fields and variables
- Added File I/O Purpose, external/dynamic call resolution status, Error Code Inventory, Routine / Window Data Flow, and centralized Open Items / Limitations
v0.2.2 (2026-06-02): Error Code Inventory precision tightening
- Required one Error Code Inventory row per explicit message ID, status code, return code, response value, SQLSTATE, or generic catch-all token
- Renamed the inventory description column to `Message Description` and required description evidence or an unresolved TBD
- Explicitly forbids grouped message-family summaries such as "validation messages" in place of per-message rows
v0.2.3 (2026-06-02): Per-routine logic detail tightening
- Added required Routine Logic Details for each load-bearing routine, subroutine, procedure, paragraph, or mainline segment
- Required field calculation / assignment rows with target fields, expressions, operands, branch guards, precision/conversion notes, business effect, and evidence
- Forbids compressing subroutine logic into generic summaries such as "validation logic" or "amount calculation"
v0.2.4 (2026-06-02): Routine-local lineage and exception closure tightening
- Added routine-local field lineage / carrier rows so calculations connect source carrier, intermediate variables, output/persisted carrier, and lineage/mutation references
- Added routine-local exception closure rows for trigger, error/status/message fields, handling action, downstream skip/rollback/output, and Error Code Inventory link
- Tightened subroutine output to match the program-single to program-chain principles for data-source preservation and exception closure
v0.2.5 (2026-06-02): Conditioned calculation and outcome reverse-trace tightening
- Required Routine Logic Details to break out material guard-scoped calculation chains, including fixed-format RPG conditioning indicators / named condition groups such as `Condition 5`
- Requires guarded statement order, target assignments, intermediate variables, final output/error effect, source line range, and evidence for each material conditioned calculation block
- Requires outcome reverse traces from material message/status/error outcomes back to branch guards, conditioned calculation blocks, comparison thresholds, intermediate variables, and source operands/carriers
- Renamed the current output section from `Error Code Inventory` to front-loaded `Validation Logic`, placed immediately after Routine Logic Details
- Forbids hiding condition-scoped calculation chains only in generic branch outcomes, Validation Logic summaries, or the Logic Decomposition Ledger
v0.2.6 (2026-06-04): IT SME first-read core logic front-loading
- Added top-of-document `Calculation Logic` immediately after the title
- Moved `Validation Logic` to immediately follow `Calculation Logic`
- Requires both top sections to summarize whole-program core behavior while linking back to routine-level evidence and ledgers
v0.2.7 (2026-06-04): IT SME first-read exception and message front-loading
- Added top-of-document `Exception Handling` immediately after Validation Logic
- Added top-of-document `Message Inventory` immediately after Exception Handling
- Requires exception paths and every observed message/code/literal to be reviewable before Metadata while still linking back to detailed closure evidence
v0.2.8 (2026-06-05): Program Call Map visual format tightening
- Required `Visual Overview` to remain a compact fenced `text` ASCII hierarchy
- Standardized the tree shape around `PROGRAM mainline` plus `|--` branch connectors
- Requires source flow headers to be reconciled against code-derived Call Evidence before use as the visual map
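
As a rough illustration of the v0.2.8 tree shape, the sketch below renders a root line followed by `|--` branch connectors. The function name, the indentation used for nested branches, and the program and routine names are assumptions made for this example; the skill itself only fixes the `PROGRAM mainline` root and the `|--` connector.

```python
def render_call_tree(program, branches):
    """Render a compact ASCII hierarchy.

    `branches` is a list of (name, children) pairs; `children` is a list of
    names shown one level deeper. Both levels use the `|--` connector.
    """
    lines = [f"{program} mainline"]
    for name, children in branches:
        lines.append(f"|-- {name}")
        for child in children:
            # Nested indentation is an assumption; only `|--` comes from the skill.
            lines.append(f"|   |-- {child}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical program and routine names, for illustration only.
    print(render_call_tree("ORDPGM", [("SRVALID", ["CALL CUSTCHK"]), ("SRCALC", [])]))
```

The output would be placed inside a fenced `text` block in `Visual Overview` only after the branches have been reconciled against code-derived Call Evidence.
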
v0.2.9 (2026-06-16): Large-program artifact completion tightening
- Deterministic indexer now emits a draft `program-analysis.md` wrapper seed with the required section order
- Large/extreme runs retain `routine-logic-details/deep-read-batch-001.md` style batch checkpoint files under `routine-logic-details/`
- Finalization gate blocks large/extreme delivery when the main wrapper or required batch checkpoint artifacts are missing
v0.2.10 (2026-06-16): Batch layout and no-source-snippet gate
- Requires every `routine-logic-details/deep-read-batch-*.md` checkpoint to use the same top-level `##` layout
- Blocks batch core logic sections that include fenced code blocks or source-code-like RPG/CL/COBOL/SQL snippets
- Keeps evidence via identifiers, source ranges, and `RLOG-*` links rather than pasted source statements
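
A minimal sketch of how a no-source-snippet check could work, assuming the batch checkpoint is plain markdown text. The function name and the keyword list are hypothetical heuristics chosen for this example; the actual gate may detect source-like statements differently.

```python
import re

# Opening or closing markdown fence (backticks or tildes).
FENCE = re.compile(r"^\s*(```|~~~)")

# Hypothetical, deliberately small list of upper-case statement keywords.
# Matching is case-sensitive to avoid flagging ordinary prose.
SOURCE_LIKE = re.compile(r"^\s*(EXSR|CHAIN|MONMSG|CALLP|EVAL|PERFORM|EXEC\s+SQL)\b")


def find_snippet_violations(markdown):
    """Return (line_number, reason) pairs for lines that would block delivery."""
    violations = []
    for number, line in enumerate(markdown.splitlines(), start=1):
        if FENCE.match(line):
            violations.append((number, "fenced code block"))
        elif SOURCE_LIKE.match(line):
            violations.append((number, "source-like statement"))
    return violations
```

A checkpoint that cites `RLOG-*` identifiers and source ranges in bullet form passes this check, while a pasted statement or any fence is reported with its line number.
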
v0.2.11 (2026-06-16): Message description gate
- Blocks final delivery when observed message/status/code values have unresolved descriptions
- Requires a message file/catalog/reference pack, source literal/comment, runtime evidence, or SME-approved description source
- Marks missing message catalog/reference data as a blocking TBD instead of silently producing low-value message ID lists
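
The message description gate can be pictured as a filter over inventory rows. This is a sketch under assumed field names (`id`, `description`, `description_source`) and hypothetical message IDs; the real inventory layout is defined by the skill's output template, not by this example.

```python
def blocking_message_tbds(rows):
    """Return the IDs of rows that lack a description or a description source."""
    blocked = []
    for row in rows:
        description = (row.get("description") or "").strip()
        source = (row.get("description_source") or "").strip()
        # An empty or literal "TBD" description, or a missing source, blocks delivery.
        if not description or description.upper() == "TBD" or not source:
            blocked.append(row["id"])
    return blocked


if __name__ == "__main__":
    inventory = [
        {"id": "MSG0001", "description": "Customer not found", "description_source": "message catalog"},
        {"id": "MSG0002", "description": "TBD", "description_source": ""},
    ]
    print(blocking_message_tbds(inventory))
```

A non-empty result would correspond to blocking TBDs in the final delivery check.
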
v0.2.12 (2026-06-17): Central artifact reuse preflight
- Requires checking a configurable `delivery_artifact_lookup_profile` and the delivery repo remote `main` before scanning a requested program again
- Defaults the current lending-card profile to `CH-WPS-LENDING-CARDS/legacy-modernization-delivery` remote `main`, with exact program folder pattern `modules/*/{PROGRAM}` and preserved leading `@` in program identities
- Uses lightweight `central_lookup_result` routing: `found_on_remote_main`, `not_found_on_remote_main`, or `remote_unavailable`
- Standardizes GitHub lookup on read-only Git method 2: `git ls-remote` plus temporary shallow clone / sparse checkout
- Forbids rerunning deterministic source indexing when remote `main` already contains an exact central program artifact folder for the requested program
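
The routing described above can be sketched as a pure function over a list of repository paths. It assumes the paths were already gathered from the remote `main` snapshot, and that `None` stands for a failed lookup; the function signature and the program names are hypothetical.

```python
def central_lookup_result(program, remote_paths):
    """Map a path listing from remote main to one of the three routing values.

    `remote_paths` is a list of repository-relative paths, or None when the
    remote could not be reached.
    """
    if remote_paths is None:
        return "remote_unavailable"
    for path in remote_paths:
        parts = path.strip("/").split("/")
        # Matches modules/*/{PROGRAM}: exactly one folder level between
        # "modules" and the program folder. The program identity is compared
        # verbatim, so a leading "@" is preserved.
        if len(parts) >= 3 and parts[0] == "modules" and parts[2] == program:
            return "found_on_remote_main"
    return "not_found_on_remote_main"


if __name__ == "__main__":
    # Hypothetical program names and module folder, for illustration only.
    listing = ["modules/cards/@CRD010/program-analysis.md"]
    print(central_lookup_result("@CRD010", listing))
    print(central_lookup_result("CRD010", listing))
    print(central_lookup_result("@CRD010", None))
```

Comparing path segments directly, rather than using a glob library, keeps `*` from spanning more than one folder level and avoids special handling of characters such as `@`.
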
v0.2.13 (2026-06-18): Direct indexer central reuse gate
- Added `--delivery-root` and `--delivery-profile` to the deterministic source indexer so direct single-program analyzer runs can check the remote-main snapshot/cache before writing new artifacts
- When the exact program folder is found, the indexer prints `central_lookup_result: found_on_remote_main`, reports the artifact path, and exits without generating a new `source-index.yaml` or `program-analysis.md` seed
- Added explicit `--force-rescan --rescan-reason "<why>"` override for SME refresh scenarios; forced scans record `central_artifact_reuse` metadata so reviewers can see which approved artifact was intentionally bypassed