name: ocas-styx
description: Transaction data store with merchant enrichment. Provides a clean, queryable interface over raw bank transaction data. Enriches garbled/obfuscated transaction names into real business entities using SearXNG search plus LLM resolution. Includes financial sync (Plaid API) for pulling transactions and balances daily. Other skills (Taste, Rally, Vesper, Corvus, Sands) read from Styx for consumption signals, spending analysis, and pattern detection. NOT for creating transactions (use bank), budgeting strategy (use Rally), or email-based consumption scanning (use Taste).
license: MIT
source: https://github.com/indigokarasu/styx
includes:
  - references/**
  - scripts/**
metadata:
  author: Indigo Karasu (indigokarasu)
  version: 1.4.0
  tags:
    - transactions
    - finance
    - merchant-enrichment
    - banking
    - data-store
  triggers:
    - transaction data
    - bank transactions
    - merchant enrichment
    - financial data store
    - query transactions
Styx — Transaction Data Store
Styx is the system's transaction intelligence layer. It sits between raw bank data (from Plaid via financial-sync) and consumer skills that need clean merchant information (Taste, Rally, Vesper, Corvus, Sands).
When to Use
- Enriching garbled/obfuscated transaction names into real business entities
- Merchant lookup and business matching from transaction data
- Answering "what did I spend" or "where did I spend" questions
- Pulling/syncing bank transactions via Plaid API
- Spending analysis, pattern detection, or calendar-based spending context
- Providing clean merchant data to consumer skills (Taste, Rally, Vesper, Corvus, Sands)
- Parsing email receipts (e.g., Rainbow Grocery eReceipts) and storing line items in the receipt_line_items table
When NOT to Use
- Budgeting strategy or financial planning (use Rally)
- Email-based consumption scanning (use Taste)
- Creating or modifying transactions (use your bank directly)
- General web research or non-transaction search (use Sift)
- Account management (adding/removing bank links) — use Plaid Link flow directly
Core principles
- Raw data is sacred — transaction records from Plaid are never modified. Enrichment data lives in separate tables, linked by transaction_id.
- Append-only — Styx only adds new records. It never deletes or updates raw transactions. Enrichment records can be superseded (marked stale) but not deleted.
- Read-only for consumers — other skills query Styx via the query API or read the SQLite DB directly. They do NOT write to Styx tables.
- Enrichment is idempotent — running enrichment on already-enriched transactions produces the same result. Safe to re-run.
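The read-only rule for consumers is enforced by contract, not by the filesystem, so a consumer that reads the SQLite file directly may want to guard against accidental writes itself. A minimal sketch, assuming the consumer uses Python's standard sqlite3 module; the helper name open_readonly is hypothetical and not part of Styx:

```python
import sqlite3
from pathlib import Path


def open_readonly(db_path: str) -> sqlite3.Connection:
    """Open a SQLite database so that any write attempt raises an error.

    Hypothetical helper, not part of the Styx scripts. It relies on
    SQLite's URI filename syntax with mode=ro.
    """
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    # mode=ro also refuses to create the file if it does not exist,
    # which avoids leaving behind an empty stub database.
    return sqlite3.connect(uri, uri=True)
```

A consumer would call open_readonly("/root/.hermes/data/styx.db") and run SELECT statements only; an INSERT or UPDATE on that connection fails with sqlite3.OperationalError.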
Data flow
See references/data-flow.md for the data flow diagram.
Database
Styx maintains its own SQLite database at /root/.hermes/data/styx.db.
IMPORTANT: Hardcode this path. Do NOT use {agent_root} — it resolves to the indigo profile home, not the shared data directory.
The active DBs are:
- /root/.hermes/data/transactions.db: raw Plaid transaction data (1,055 transactions, last: 2026-06-13)
- /root/.hermes/data/styx.db: enriched merchant data (1,056 transaction_merchants, 460 merchants, all enriched as of 2026-06-15)
Note: Plaid sync cron runs daily at 7 AM but no new transactions have appeared since 2026-06-13. The bank link may need re-auth or simply has no new activity.
A second copy exists at /root/.hermes/commons/data/ocas-styx/styx.db but it is a stale 0-byte stub — ignore it.
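Because a database file can exist as a 0-byte stub, it may help to confirm that the core tables are present before reading from a given path. A minimal sketch using only the standard library; the function name missing_tables is hypothetical, and the table names are the three core tables listed under Schema:

```python
import os
import sqlite3

# Core tables named in the Schema section of this document.
CORE_TABLES = frozenset({"merchants", "transaction_merchants", "enrichment_runs"})


def missing_tables(db_path: str, required=CORE_TABLES) -> set:
    """Return the required tables that are absent from the database.

    A missing or 0-byte file is reported as missing every table, without
    opening it (sqlite3.connect would otherwise create an empty file).
    """
    if not os.path.exists(db_path) or os.path.getsize(db_path) == 0:
        return set(required)
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
    finally:
        conn.close()
    return set(required) - {name for (name,) in rows}
```

An empty return value means the schema has been applied; anything else suggests the path points at a stub or an uninitialized database.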
Schema
Three core tables: merchants, transaction_merchants, enrichment_runs.
Receipt parsing table: receipt_line_items (23 columns — see below).
Full DDL: references/schema.md
receipt_line_items Table (23 columns)
Used for storing parsed email receipt line items (e.g., Rainbow Grocery eReceipts).
See references/receipt-line-items-insert.md for the correct INSERT pattern and gotchas.
Enrichment pipeline
Google Places Enrichment (All Categories)
The enrichment pipeline resolves garbled/obfuscated transaction names into real businesses. The default script only enriches food merchants. For full coverage, use the universal enrichment script:
Reference: references/styx_universal_enrichment.md (read this first)
# Universal enrichment — all non-financial categories
# Created 2026-06-20. Script exists at:
# /root/.hermes/profiles/indigo/skills/ocas-styx/scripts/styx_universal_enrich.py
# See references/styx_universal_enrichment.md if needed.
# Last known path (may not exist): /root/.hermes/commons/data/ocas-styx/styx_universal_enrich.py
# Food-only (original script) — confirmed working
python3 /root/.hermes/profiles/indigo/skills/ocas-styx/scripts/styx_places_enrich.py --all
Categories covered by universal script: retail, service, entertainment, transport, personal_care, medical, home, government, housing, travel, food/restaurant (all 10 food subcategories).
Categories skipped (no physical location): transfer, income, bank_fees, loan_payments, loan_disbursements. These get source: 'internal'.
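The category split above can be expressed as a small filter. This is an illustrative sketch only: the function names are hypothetical, and the case-insensitive comparison is an assumption rather than confirmed behavior of styx_universal_enrich.py.

```python
# Categories this document lists as skipped (no physical location).
SKIPPED_CATEGORIES = frozenset(
    {"transfer", "income", "bank_fees", "loan_payments", "loan_disbursements"}
)


def needs_places_lookup(category: str) -> bool:
    """True when a merchant category should go through Places enrichment."""
    # Assumption: category labels are compared case-insensitively.
    return category.strip().lower() not in SKIPPED_CATEGORIES


def partition_categories(categories):
    """Split categories into (to_enrich, internal) lists, preserving order."""
    to_enrich = [c for c in categories if needs_places_lookup(c)]
    internal = [c for c in categories if not needs_places_lookup(c)]
    return to_enrich, internal
```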
Legacy LLM Enrichment Pipeline
For garbled names that Google Places can't resolve: exact match → fuzzy match → SearXNG search → LLM resolution → manual review queue. Full details: references/enrichment-pipeline.md
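The first two stages of this cascade (exact match, then fuzzy match) can be sketched locally with the standard library. This is not the code in references/enrichment-pipeline.md: the function names, the 0.85 cutoff, and the use of difflib are assumptions for illustration, and the prefix list covers only the prefixes named in this document.

```python
import difflib
import re
from typing import Dict, Optional, Tuple

# Prefixes mentioned in this document; the real scripts may strip more.
_PREFIX_RE = re.compile(r"^(?:FSP\*|SP\s+|ABM-|DD\s*\*)\s*", re.IGNORECASE)


def clean_name(raw: str) -> str:
    """Strip a known processor prefix, collapse whitespace, and uppercase."""
    name = _PREFIX_RE.sub("", raw.strip())
    return re.sub(r"\s+", " ", name).upper()


def resolve_local(
    raw: str, known: Dict[str, str], cutoff: float = 0.85
) -> Tuple[Optional[str], str]:
    """Try exact, then fuzzy matching against known cleaned names.

    `known` maps cleaned, uppercased names to merchant names. Returns
    (merchant, stage); a stage of "unresolved" means the name would move
    on to the search and LLM stages.
    """
    name = clean_name(raw)
    if not name.strip("* "):
        return None, "redacted"  # fully redacted names are skipped
    if name in known:
        return known[name], "exact"
    close = difflib.get_close_matches(name, list(known), n=1, cutoff=cutoff)
    if close:
        return known[close[0]], "fuzzy"
    return None, "unresolved"
```

Keeping the cheap stages first means the search and LLM stages only see names that local matching could not settle.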
Query API
Other skills read from Styx using these patterns:
- Category transactions: enriched transactions filtered by merchant category
- Spending by merchant: aggregated totals and visit counts
- Unresolved transactions: candidates needing enrichment
DB path: /root/.hermes/data/styx.db (hardcoded; {agent_root} resolves to the indigo profile home, as noted under Database)
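As an illustration of the "spending by merchant" pattern, a consumer query might look like the sketch below. The column names (merchants.id, merchants.name, transaction_merchants.merchant_id, transaction_merchants.amount) are assumptions made for the example; check references/schema.md and references/query-api.md for the real columns before adapting it.

```python
import sqlite3

# Hypothetical column names; the real schema is in references/schema.md.
SPENDING_BY_MERCHANT = """
SELECT m.name, COUNT(*) AS visits, ROUND(SUM(tm.amount), 2) AS total
FROM transaction_merchants AS tm
JOIN merchants AS m ON m.id = tm.merchant_id
WHERE m.category = ?
GROUP BY m.id
ORDER BY total DESC
"""


def spending_by_merchant(conn: sqlite3.Connection, category: str):
    """Return (name, visits, total) rows for one merchant category."""
    return conn.execute(SPENDING_BY_MERCHANT, (category,)).fetchall()
```

The category is passed as a bound parameter rather than formatted into the SQL string, which keeps the query safe to reuse across consumer skills.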
Receipt Parsing Pipeline
When parsing email receipts (e.g., Rainbow Grocery):
- Fetch emails via get_gmail_messages_content_batch. Large results are persisted to /tmp/hermes-results/<uuid>.txt.
- Parse persisted files: the XML wrapper around the JSON requires brace-depth counting to extract the first complete JSON object.
- Extract bodies: split by \n\nMessage ID:, then extract the text between --- BODY --- and ---\n\n.
- Parse line items: handle department headers, PLU/UPC codes, prices, and weight/quantity info.
- Write to Styx: use the receipt_line_items INSERT pattern in references/receipt-line-items-insert.md (22 values; id auto-increments).
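The brace-depth and body-splitting steps can be sketched as follows. The function names are hypothetical, and the message layout is inferred only from the delimiters named in the steps above, so treat this as a starting point rather than the parser the skill actually uses.

```python
import json


def first_json_object(text: str) -> dict:
    """Return the first complete top-level JSON object embedded in text.

    Counts brace depth while ignoring braces inside JSON strings, so the
    surrounding wrapper text does not need to be parsed.
    """
    start = text.find("{")
    while start != -1:
        depth, in_string, escaped = 0, False, False
        for i in range(start, len(text)):
            ch = text[i]
            if in_string:
                if escaped:
                    escaped = False
                elif ch == "\\":
                    escaped = True
                elif ch == '"':
                    in_string = False
            elif ch == '"':
                in_string = True
            elif ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
                if depth == 0:
                    try:
                        return json.loads(text[start : i + 1])
                    except json.JSONDecodeError:
                        break  # balanced but not JSON; try the next "{"
        start = text.find("{", start + 1)
    raise ValueError("no complete JSON object found")


def extract_bodies(content: str) -> list:
    """Split on the message delimiter and pull out each body section."""
    bodies = []
    for chunk in content.split("\n\nMessage ID:"):
        _, marker, rest = chunk.partition("--- BODY ---")
        if marker:
            body, _, _ = rest.partition("---\n\n")
            bodies.append(body.strip())
    return bodies
```

Tracking string state matters here because receipt text can contain literal braces, which a naive depth counter would miscount.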
Consumer skill contracts
Taste
Taste reads from Styx to discover restaurants and food businesses that Jared has transacted with but that didn't appear in email/calendar.
Taste queries:
- m.category IN ('restaurant', 'cafe', 'bar', 'food') for dining
- m.category IN ('grocery', 'supermarket', 'food_store') for food shopping
- Transactions with personal_finance_category = 'FOOD_AND_DRINK' as fallback
Taste does NOT write to Styx. It writes to its own signals.jsonl and items.jsonl.
Rally
Rally reads from Styx for spending analysis and budget tracking.
Vesper
Vesper reads from Styx for daily/weekly spending summaries in briefings.
Corvus
Corvus reads from Styx for pattern detection in spending behavior.
Sands
Sands reads from Styx for calendar-based spending context.
Security
- Styx DB is read-only for consumer skills (enforced by skill contract, not filesystem)
- Raw transaction data in transactions.db is never modified by Styx
- Enrichment data is additive only
Financial Sync
- Sync script: {skill_root}/scripts/plaid_sync.py (incremental, daily 7 AM cron)
- History script: {skill_root}/scripts/plaid_history.py (full 24-month pull)
- DB: /root/.hermes/data/transactions.db (raw, read-only)
- Cron job a418e00ee21e: daily 7 AM, no_agent: true
Gotchas
- Self-update: untracked files block git pull. git stash only stashes tracked files, so new (untracked) files in the skill directory will block the merge. Move them aside before pulling, then compare/restore afterward.
- Self-update: stash pop may conflict. After pulling, git stash pop can produce merge conflicts if both the pulled changes and the stashed changes touch the same lines.
- query.py --health-check does not exist. Use inline Python to verify DB integrity instead.
- Raw transaction data is sacred. Styx never modifies or deletes records in transactions.db.
- Name cleaning is essential. Plaid transaction names are heavily obfuscated (e.g., DD *DOORDASH ROYALINDI, ABM-350 MISSION GARAGE). Strip prefixes before matching.
- Redacted names can't be enriched. Transactions with fully redacted names (***************) are skipped entirely.
- Consumer skills are read-only. Taste, Rally, Vesper, Corvus, and Sands query Styx but must never write to Styx tables.
- receipt_line_items INSERT requires 22 values. The table has 23 columns but id auto-increments.
- google_auth_mcp import path is profile-dependent. When running under the indigo Hermes profile, Path.home() returns /root/.hermes/profiles/indigo/home instead of /root. Scripts that do sys.path.insert(0, str(Path.home() / '.hermes' / 'scripts')) or sys.path.insert(0, str(AGENT_ROOT / 'scripts')) will fail to find google_auth_mcp.py. Fix: hardcode sys.path.insert(0, str(Path('/root/.hermes/scripts'))) in any script that imports google_auth_mcp. Affected scripts (all fixed as of 2026-06-04): dispatch: triage.py, check_unread.py, gmail_search.py, gmail_scan.py; taste: email_scan.py, run_historical_scans.py; scripts: email_check.py, dream_journal_pipeline.py.
- Indigo's OAuth token file may lack client_secret. The token file at /root/.google_workspace_mcp/credentials/mx.indigo.karasu@gmail.com.json may only have access_token, refresh_token, client_id, but google_auth_mcp.py needs client_secret for token refresh and a token key alias. Fix: add client_secret from the cached client secret file. Also add token as an alias for access_token and token_uri: 'https://oauth2.googleapis.com/token'.
- Database and secrets path mismatch (migration artifact). After a profile/data migration, the active databases live at /root/.hermes.old/data/ (styx.db, transactions.db) and secrets at /root/.hermes.old/secrets/plaid.env, but all scripts hardcode /root/.hermes/data/ and /root/.hermes/secrets/. Workaround: create symlinks before running scripts:

      mkdir -p /root/.hermes/data
      ln -sf /root/.hermes.old/data/styx.db /root/.hermes/data/styx.db
      ln -sf /root/.hermes.old/data/transactions.db /root/.hermes/data/transactions.db
      ln -sf /root/.hermes.old/secrets /root/.hermes/secrets

  The universal enrichment script at /root/.hermes.old/commons/data/ocas-styx/styx_universal_enrich.py (not in the skill's scripts/ or commons/data/) must be run from that location.
- Jared's token refresh adds access_token key. When refreshing Jared's token, the Google OAuth response includes access_token (not token). The original file used token as the key. After refresh, both keys exist. google_auth_mcp.py reads token_data.get("token"), so ensure the token key is present.
- styx.db may exist with no tables. The DB file can be created empty (0 bytes) by the skill initialization script without the schema being applied. Before any receipt parsing or enrichment, verify tables exist.
- llm_resolve.py does NOT work in cron/background context. The script calls hermes ask --no-stream via subprocess, which returns no output when there is no interactive session.
- styx_places_enrich.py is food-only. The original enrichment script only covers food/restaurant categories. Use styx_universal_enrich.py for all categories. See references/styx_universal_enrichment.md.
- styx_universal_enrich.py created 2026-06-20. Now exists at /root/.hermes/profiles/indigo/skills/ocas-styx/scripts/styx_universal_enrich.py. Covers retail, service, entertainment, transport, personal_care, medical, home, government, housing, travel. Skips financial categories (transfer, income, bank_fees, loan_payments, loan_disbursements). Run: python3 styx_universal_enrich.py --limit 0 to enrich all pending non-food merchants. Includes name cleaning (strips FSP*, SP , ABM-, etc.) and international address parsing (UK postcodes, city-only addresses).
- Correct script path for food-only enrichment. The food-only script lives at /root/.hermes/profiles/indigo/skills/ocas-styx/scripts/styx_places_enrich.py, NOT at /root/.hermes/skills/ocas-styx/scripts/styx_places_enrich.py (that path doesn't exist).
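The two OAuth token gotchas above amount to making sure a token file carries the keys google_auth_mcp.py reads. A hedged sketch of that normalization on an already-loaded dict; normalize_token is a hypothetical helper, and reading or writing the credentials file is deliberately left out:

```python
from typing import Optional


def normalize_token(token_data: dict, client_secret: Optional[str] = None) -> dict:
    """Return a copy of token_data with the keys described in the gotchas.

    Hypothetical helper: adds `token` as an alias for `access_token` when
    missing, fills in `token_uri`, and sets `client_secret` if one is
    supplied and the file lacks it. Existing values are never overwritten.
    """
    fixed = dict(token_data)
    if "token" not in fixed and "access_token" in fixed:
        fixed["token"] = fixed["access_token"]
    fixed.setdefault("token_uri", "https://oauth2.googleapis.com/token")
    if client_secret and not fixed.get("client_secret"):
        fixed["client_secret"] = client_secret
    return fixed
```

Working on a copy keeps the original dict available for comparison before anything is written back to disk.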
Post-enrichment verification
After every enrichment run, verify the results before marking the run as complete:
- Spot-check 5–10 enriched transaction_merchants records at random.
- Confirm the enrichment_runs table row for this run shows status completed.
- Verify review_queue.jsonl has been updated with any new low-confidence matches.
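The first two checks can be scripted against the database. A minimal sketch, assuming enrichment_runs has an id column alongside the status value mentioned above; both function names and the id column are hypothetical, so confirm them against references/schema.md:

```python
import sqlite3


def spot_check(conn: sqlite3.Connection, sample_size: int = 5) -> list:
    """Return a random sample of transaction_merchants rows for review."""
    return conn.execute(
        "SELECT * FROM transaction_merchants ORDER BY RANDOM() LIMIT ?",
        (sample_size,),
    ).fetchall()


def run_completed(conn: sqlite3.Connection, run_id: int) -> bool:
    """True when the given enrichment run is recorded as completed."""
    # Assumption: runs are keyed by an `id` column.
    row = conn.execute(
        "SELECT status FROM enrichment_runs WHERE id = ?", (run_id,)
    ).fetchone()
    return row is not None and row[0] == "completed"
```

The sampled rows still need a human or agent to judge whether the merchant match is plausible; the script only selects them.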
Automation
Self-update
Pull the latest Styx package from GitHub source. Full procedure: references/self_update.md.
Support File Map
| File | When to read |
|---|---|
| references/styx_universal_enrichment.md | Before running Google Places enrichment (read FIRST); use this instead of the food-only default |
| references/financial-sync.md | Before configuring Plaid sync |
| references/scripts.md | Before running enrichment or query scripts |
| references/schema.md | Before querying or modifying the database |
| references/query-api.md | Before writing consumer queries |
| references/enrichment-pipeline.md | Before running or debugging LLM enrichment |
| references/self_update.md | Before running self-update |
| references/cron-gotchas.md | Before debugging cron enrichment failures |
Files
See references/storage-layout.md for the full file table.
OKRs
schedule_adherence
- Target: On-demand enrichment runs complete within 5 minutes of invocation.
data_integrity
- Target: Zero raw transaction records modified or deleted by enrichment pipeline.
Visibility
public