name: deal-data
description: Use when reconstructing one or more M&A deals from SEC filings into source-backed extraction bundles with structured judgments.
Deal-Data
Overview
Reconstruct one or more deals from online primary filings such as SEC EDGAR
materials. This skill produces one extraction bundle per deal under
Data/deals/<deal_slug>/.
Per-deal bundles contain:
- source artifacts,
- factual artifacts,
- judgment artifacts.
The agent writes only to Data/deals/<deal_slug>/. It never writes directly to
Data/canonical/ or Data/views/; downstream Python scripts handle merge and
derivation.
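The write boundary described above can be checked mechanically before any file is created. The following is a minimal sketch, assuming bundles live under a relative Data/deals directory; the helper name bundle_path is hypothetical and not part of the skill.

```python
from pathlib import Path

# Assumption: the bundle root is the relative directory named in this document.
DEALS_ROOT = Path("Data/deals")


def bundle_path(deal_slug: str, relative: str, root: Path = DEALS_ROOT) -> Path:
    """Return a path inside <root>/<deal_slug>/, or raise ValueError.

    Hypothetical helper: rejects any target that would land outside the
    deal's own bundle directory (for example under Data/canonical/).
    """
    root_dir = root.resolve()
    bundle_dir = (root_dir / deal_slug).resolve()
    # The slug must name a direct child of the deals root.
    if bundle_dir.parent != root_dir:
        raise ValueError(f"invalid deal slug: {deal_slug!r}")
    target = (bundle_dir / relative).resolve()
    # The target must be the bundle directory itself or something beneath it.
    if target != bundle_dir and bundle_dir not in target.parents:
        raise ValueError(f"refusing to write outside {bundle_dir}: {relative!r}")
    return target
```

Routing every write through one resolver like this keeps the rule in a single place, so a stray ../ in a slug or file name fails loudly instead of touching Data/canonical/ or Data/views/.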
This is an archival-first protocol. Facts come before judgments. The skill must preserve ambiguity rather than forcing clean labels.
When To Use
- Reconstructing one or more deals from SEC filings.
- Testing taxonomy changes on calibration cases.
- Building extraction bundles before anything enters the canonical database.
Do Not Use This For
- Direct writes into Data/canonical/ or Data/views/.
- Batch processing all deals without explicit user request.
- BlackJAX estimation or Monte Carlo runs.
- Generic spreadsheet editing.
- Using Chicago RA entries as a learning source.
Source Hierarchy
- SEC filing. Every fact must trace to a source_text quote.
- Alex's 9 corrected cases. These are methodology references only. The agent may study the judgment style, but must independently extract from the filing.
- The agent's output. It must stand on the filing, not on workbook reshaping.
Chicago RA entries are not a valid learning source.
Anti-Laundering Rule
The agent cannot reshuffle Alex's workbook entries into the new schema and
present them as independent extraction. Every event requires source_text
verifiable against the filing.
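One way to make "verifiable against the filing" concrete is a whitespace-tolerant substring check. This is only a sketch: the source_text field comes from this document, while the event_id key, the function names, and the choice to normalize whitespace are assumptions made for illustration.

```python
import re


def normalize_ws(text: str) -> str:
    """Collapse runs of whitespace so line wrapping does not break matching."""
    return re.sub(r"\s+", " ", text).strip()


def unverified_events(events: list[dict], filing_text: str) -> list[dict]:
    """Return events whose source_text is missing or not found in the filing.

    Assumption: each event is a dict with a 'source_text' key holding a
    verbatim quote from the filing.
    """
    haystack = normalize_ws(filing_text)
    failures = []
    for event in events:
        quote = normalize_ws(event.get("source_text") or "")
        if not quote or quote not in haystack:
            failures.append(event)
    return failures
```

A check like this catches quotes that were paraphrased or carried over from a workbook, but it cannot confirm that the quote supports the event it is attached to; that still requires reading the filing.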
Output Contract
Write one extraction bundle per deal under:
Data/deals/<deal_slug>/
Each bundle must contain:
- source/source_selection.json
- source/source_manifest.json
- source/background_section.txt
- extraction/deal.json
- extraction/actors.jsonl
- extraction/events.jsonl
- extraction/process_cycles.jsonl
- extraction/event_actor_links.jsonl
- extraction/judgments.jsonl
- reviewed/judgment_overrides.csv
- reviewed/fact_corrections.csv
If a case fails early, write the source artifacts plus a failure memo. Do not write partial extraction JSONL files.
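A completeness check against this contract could look like the sketch below. The file names are taken from the required bundle contents listed in this section; the function name missing_artifacts is hypothetical.

```python
from pathlib import Path

# Required bundle contents, as listed in the Output Contract.
REQUIRED_ARTIFACTS = (
    "source/source_selection.json",
    "source/source_manifest.json",
    "source/background_section.txt",
    "extraction/deal.json",
    "extraction/actors.jsonl",
    "extraction/events.jsonl",
    "extraction/process_cycles.jsonl",
    "extraction/event_actor_links.jsonl",
    "extraction/judgments.jsonl",
    "reviewed/judgment_overrides.csv",
    "reviewed/fact_corrections.csv",
)


def missing_artifacts(bundle_dir: Path) -> list[str]:
    """Return the required relative paths that are not present as files."""
    return [rel for rel in REQUIRED_ARTIFACTS if not (bundle_dir / rel).is_file()]
```

An empty result means the bundle is structurally complete. For a case that failed early, the expected result is that only the source/ entries are present and no extraction/ files exist.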
Core Procedure
- Select the filing and log the source-selection process.
- Isolate the chronology section and save it.
- Extract facts only: deal metadata, actors, cycles, events, event-actor links.
- Segment cycles after reviewing the whole chronology.
- Create judgments after the factual layer is complete.
Required Reading Order
Read these in order:
- references/workflow.md
- references/schemas.md
- references/judgment-taxonomy.md
- references/quality-gates.md
- references/cycle-segmentation.md
Design Principle
The pipeline should produce a self-contained database that carries more
information than the previous workbooks. If the filing contains relevant detail
not captured by a defined field, record it in raw_note (events), notes
(actors), or deal_notes (deal.json) rather than discarding it.
Common Mistakes
- Treating a banker as a bidder.
- Using workbook rows or local notes to fill gaps in the filing.
- Treating a target-side communication as if it were itself a dropout fact.
- Collapsing multiple cycles into one process.
- Forcing a formal boundary where the filing only supports a transition zone.
- Converting total value to per-share without explicit share count support.
- Counting partial-asset bidders as valid whole-company bidders.
- Classifying dropout mechanism from surface wording alone when context points the other way.
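The per-share item in the list above lends itself to a small guard. The sketch below assumes that a share count only counts as "explicit support" when it comes with its own filing quote; the function and parameter names are hypothetical and do not come from the schemas.

```python
from typing import Optional


def per_share_value(
    total_value: float,
    share_count: Optional[float],
    share_count_source_text: Optional[str],
) -> Optional[float]:
    """Convert a total deal value to per-share only with a sourced share count.

    Returns None when the share count, or the quote backing it, is missing,
    so the caller records the total value as stated instead of deriving one.
    """
    if not share_count or not share_count_source_text:
        return None
    return total_value / share_count
```

For example, per_share_value(500_000_000, 25_000_000, "approximately 25 million shares outstanding") yields 20.0, while the same call with no quote returns None and the bid stays recorded as a total value.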
Practical Rule
If you are not sure whether something is a fact or a judgment, put the raw event
into the factual layer and put the interpretation into judgments.jsonl.
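As an illustration of this split, the sketch below writes the quoted text to one record and the reading of it to another. The keys event_id and interpretation are placeholders chosen for this example; the actual field names are defined in references/schemas.md and may differ.

```python
import json


def split_observation(event_id: str, source_text: str, interpretation: str) -> tuple[str, str]:
    """Return (events.jsonl line, judgments.jsonl line) for one observation.

    Hypothetical layout: the event line carries only the quoted fact, and the
    judgment line carries the interpretation plus a pointer back to the event.
    """
    event_row = {"event_id": event_id, "source_text": source_text}
    judgment_row = {"event_id": event_id, "interpretation": interpretation}
    return json.dumps(event_row), json.dumps(judgment_row)


event_line, judgment_line = split_observation(
    "e17",
    "Party C informed the financial advisor that it would not submit a bid.",
    "Likely voluntary dropout; mechanism uncertain from this sentence alone.",
)
```

Keeping the interpretation out of the event record means a later reviewer can overturn the judgment without touching the factual layer.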