name: sync
description: Sync all financial institutions - balances, transactions, MFA handling, import, classify
trigger: manual
Sync
When the user says "sync", "update my accounts", "refresh", or "sync my finances" — run the full sync pipeline.
Critical Execution Model
NEVER run sync-all.js as a foreground Bash command. A foreground command blocks the agent from reading messages. MFA codes from the user will sit unprocessed, bank sessions will time out, and the entire sync will fail. This has happened before — do not repeat it.
Procedure
1. Notify user immediately: "Starting sync for all accounts..."
2. Start sync in background with run_in_background: true:
   node readers/sync-all.js --import --classify
   Flags:
   - --balances: balances only (faster)
   - --bank <name>: single institution only
   - --import: import JSON → SQLite after sync
   - --classify: classify transactions after import
   - No flags = full sync (balances + transactions, all institutions)
3. Immediately begin the MFA monitoring loop (do NOT wait for sync to finish):
   Loop every 3-5 seconds until the background task completes:
   a. Run: ls data/mfa-pending/*.request.json (check for MFA requests)
   b. If request files exist → read each one, build list of banks needing codes
   c. Notify user: "MFA codes needed: [bank1], [bank2], ..."
   d. Wait for user's reply
   e. Parse codes and submit ALL in parallel (see parsing rules below)
   f. Also check: ls data/adaptive-pending/*.json (visual help requests)
   g. Check if background task completed (TaskOutput)
4. When sync completes, run the extraction step (see below), then report summary to user.
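One pass of the MFA check in the loop above could be sketched in Node as follows. This is a minimal sketch under stated assumptions, not the project's implementation: the helper name listPendingMfa is hypothetical, and it assumes each request file is named <institution>.request.json, which the source does not confirm (the institution may instead be recorded inside the file).

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: list institutions that have a pending MFA request.
// Assumption: request files are named <institution>.request.json.
function listPendingMfa(dir = 'data/mfa-pending') {
  if (!fs.existsSync(dir)) return []; // no directory means nothing is pending
  return fs
    .readdirSync(dir)
    .filter((name) => name.endsWith('.request.json'))
    .map((name) => path.basename(name, '.request.json'))
    .sort();
}
```

An empty result means there is nothing to ask the user for on this pass; a non-empty result feeds the "MFA codes needed" notification.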
Post-Sync Extraction
After the sync background task finishes, the agent handles LLM extraction. The sync scripts capture raw text — the agent extracts structured data from it.
Read each data/sync-output/*.json file. For any file with a pendingExtraction field:
Balance extraction (pendingExtraction.balanceText):
- Read the raw dashboard page text
- Read config/accounts.json for known accounts (last-4 digits, aliases, types)
- Extract structured balances: account name, account type, balance amount (signed: credit/mortgage negative)
- Match each balance to a known account by last-4 digits or name
- Write extracted balances to the balances array in the output JSON
- Remove the pendingExtraction.balanceText field
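The matching step (last-4 digits first, then name) might look like the sketch below. The account shape { id, last4, aliases } is an assumption made for illustration; the real schema of config/accounts.json is not shown in this document and may differ.

```javascript
// Hypothetical account shape: { id, last4, aliases: [...] }.
// Assumption: the real config/accounts.json schema may use other field names.
function matchAccount(extractedName, accounts) {
  const lower = extractedName.toLowerCase();
  // Take the final run of exactly four trailing digits, e.g. "...1234".
  const last4 = (extractedName.match(/\d{4}(?!\d)/g) || []).pop();
  if (last4) {
    const byDigits = accounts.find((a) => a.last4 === last4);
    if (byDigits) return byDigits; // digits are less ambiguous than names
  }
  // Fall back to a case-insensitive alias match on the extracted name.
  const byAlias = accounts.find((a) =>
    (a.aliases || []).some((alias) => lower.includes(alias.toLowerCase()))
  );
  return byAlias || null;
}
```

Returning null for an unmatched balance leaves the decision about unknown accounts to the caller rather than guessing.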
PDF transaction extraction (pendingExtraction.pdfTexts):
- Each entry has: institution, accountId, accountType, fileName, text (LiteParse output)
- Extract structured transactions: date (YYYY-MM-DD), description, amount, currency
- Preserve amounts as they appear in the source document. Do NOT interpret or flip signs. If the PDF shows a purchase as $99.00 (positive), write 99.00. If it shows a payment as -$753.56, write -753.56. Sign normalization is handled by import.js using config/data-semantics.json; the LLM's job is extraction only.
- If amounts include sign indicators (minus sign, parentheses, CR/DR suffix), include the sign in the extracted number. ($8.38) → -8.38. $753.56 CR → -753.56.
- Append extracted transactions to the transactions array in the output JSON
- Also extract statement balances from each PDF:
  - Statement period: start date and end date (YYYY-MM-DD)
  - Opening balance (beginning of period), if available
  - Closing balance (end of period), required
  - For mortgages: "Principal Balance" is the closing balance (store as negative, since it is a liability)
  - Write to statementBalances array in the output JSON: { accountId, periodStart, periodEnd, openingBalance, closingBalance, source: 'pdf' }
  - The import pipeline upserts into the statement_balances table (dedup on institution + account_id + period_end)
- Remove the pendingExtraction.pdfTexts field
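The sign-indicator rules for PDF amounts can be illustrated with a small parser. This is only a sketch of the rule as stated above: parseAmount is a hypothetical name, and because the source gives no example for a DR suffix, the sketch leaves DR amounts positive as an explicit assumption.

```javascript
// Convert an amount string as printed on a statement into a signed number.
// Only explicit sign indicators change the sign; nothing is inferred from context.
function parseAmount(raw) {
  const text = raw.trim();
  const negative =
    /^\(.*\)$/.test(text) || // accounting parentheses: ($8.38)
    /[-\u2212]/.test(text) || // ASCII or Unicode minus sign: -$753.56
    /CR$/i.test(text); // CR suffix, per the example above
  // Assumption: a DR suffix is left positive (the source shows no DR example).
  const digits = text.replace(/[^0-9.]/g, ''); // drops $, commas, suffixes
  const value = Number.parseFloat(digits);
  if (Number.isNaN(value)) throw new Error(`Unparseable amount: ${raw}`);
  return negative ? -value : value;
}
```

Any further normalization, such as flipping purchases to negative for a given institution, stays with import.js as described above.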
Statement balance extraction (pendingExtraction.statementPdfs):
- Each entry has: accountId, accountType, fileName, text (LiteParse output), source ('pdf')
- For EACH PDF, extract:
  - Statement period start date (YYYY-MM-DD): look for "through", "Statement Period", date ranges
  - Statement period end date (YYYY-MM-DD): the closing date
  - Opening/beginning balance: "Beginning Balance", "Previous Balance"
  - Closing/ending balance: "Ending Balance", "New Balance", "Principal Balance"
- For credit cards: "New Balance" is the closing balance, "Previous Balance" is the opening
- For mortgages: "Principal Balance (Not a Payoff Amount)" is the closing balance; store as NEGATIVE
- Write to statementBalances array in the output JSON: { accountId, periodStart, periodEnd, openingBalance, closingBalance, source: 'pdf' }
- Remove the pendingExtraction.statementPdfs field
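A statementBalances entry in the shape given above could be assembled roughly as follows. The function name, the 'mortgage' accountType value, and the use of null for a missing opening balance or start date are assumptions for illustration; the source only fixes the field names and the negative-mortgage rule.

```javascript
const ISO_DATE = /^\d{4}-\d{2}-\d{2}$/;

// Hypothetical builder for one statementBalances entry.
function makeStatementBalance({
  accountId, accountType, periodStart, periodEnd, openingBalance, closingBalance,
}) {
  if (!ISO_DATE.test(periodEnd)) throw new Error('periodEnd must be YYYY-MM-DD');
  if (periodStart != null && !ISO_DATE.test(periodStart)) {
    throw new Error('periodStart must be YYYY-MM-DD');
  }
  if (typeof closingBalance !== 'number') throw new Error('closingBalance is required');
  // Mortgages are liabilities: store the principal balance as a negative number.
  // Assumption: mortgage accounts carry accountType "mortgage".
  const closing = accountType === 'mortgage' ? -Math.abs(closingBalance) : closingBalance;
  return {
    accountId,
    periodStart: periodStart ?? null,
    periodEnd,
    openingBalance: openingBalance ?? null, // opening balance is optional
    closingBalance: closing,
    source: 'pdf',
  };
}
```

Rejecting an entry without a closing balance or end date early keeps incomplete rows out of the output JSON.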
Real estate extraction (pendingExtraction.pageTexts + pendingExtraction.address):
- Each entry has: source (Google/Zillow/Redfin), text (page text)
- Extract the estimated property value from each source
- Average the values
- Write to balances array as: accountId "home-residence", accountType "real_estate", balance = average value
- Remove the pendingExtraction field
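The averaging step is simple arithmetic; a sketch is shown below. The input shape { source, value } and the choice to skip sources that yielded no numeric estimate are assumptions, since the source only says to average the values.

```javascript
// Hypothetical input: [{ source: 'Zillow', value: 500000 }, ...].
function realEstateBalance(estimates) {
  // Assumption: sources without a numeric estimate are skipped, not counted as 0.
  const values = estimates.map((e) => e.value).filter((v) => Number.isFinite(v));
  if (values.length === 0) throw new Error('No usable property estimates');
  const average = values.reduce((sum, v) => sum + v, 0) / values.length;
  return { accountId: 'home-residence', accountType: 'real_estate', balance: average };
}
```

For example, estimates of 500000, 520000, and 510000 average to 510000.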
After all extractions, remove the pendingExtraction field from each output file. Then run import + classify:
node sync-engine/import.js
node sync-engine/classify.js
Import automatically validates transaction signs against known anchors in config/data-semantics.json. If validation warnings appear (e.g., "TARGET is a debit but has positive amount"), the data semantics for that institution may need updating — check config/data-semantics.json and verify the platform hasn't changed its sign convention.
MFA Code Parsing
Users provide codes in flexible formats:
- BankA 123456, BankB 7654321
- banka: 123456 bankb: 7654321
- 123456 7654321 (in order of request)
Map institution names flexibly — see config/institutions-status.md for the name mapping table.
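The three reply formats above could be handled by a parser along these lines. This is a sketch under assumptions: parseMfaCodes is a hypothetical name, institution names are matched case-insensitively against the pending list only (the real alias table lives in config/institutions-status.md), and single-word institution names are assumed.

```javascript
// pending: institution names in the order the codes were requested.
// Codes stay strings so leading zeros survive.
function parseMfaCodes(reply, pending) {
  // Named formats: "BankA 123456, BankB 7654321" or "banka: 123456 bankb: 7654321".
  const named = [...reply.matchAll(/([A-Za-z][\w-]*)\s*:?\s*(\d+)/g)];
  if (named.length > 0) {
    return named.map(([, name, code]) => {
      const known = pending.find((p) => p.toLowerCase() === name.toLowerCase());
      return { institution: known || name, code };
    });
  }
  // Bare format: "123456 7654321", assigned in order of request.
  const codes = reply.match(/\d+/g) || [];
  return codes
    .slice(0, pending.length)
    .map((code, i) => ({ institution: pending[i], code }));
}
```

A reply that mixes free text with bare codes (for instance "codes are 123456") would be misread by this sketch, so a real parser would need stricter name validation.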
Submit via:
node -e "require('./readers/mfa-bridge').submitCode('<institution>', '<code>')"
Submit ALL codes in parallel — do NOT wait between submissions. Bank sessions time out.
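When submitting from a script rather than separate node -e calls, parallel submission might be sketched as below. The submit function is passed in because this document does not say whether submitCode is synchronous or returns a promise; submitAllCodes itself is a hypothetical helper.

```javascript
// submitCode is injected; in the real pipeline it would presumably be
// require('./readers/mfa-bridge').submitCode (its return type is an assumption).
function submitAllCodes(pairs, submitCode) {
  // Each async callback starts its submission immediately, so no code
  // waits for another bank's submission to finish.
  const jobs = pairs.map(async ({ institution, code }) => submitCode(institution, code));
  // allSettled reports every outcome, so one failure does not hide the rest.
  return Promise.allSettled(jobs);
}
```

The settled results can then be used to tell the user which institutions accepted their code and which need a retry.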
Adaptive Help
When the browser primitive encounters an unknown page state, it writes a help request to data/adaptive-pending/. The request includes an annotated screenshot and page URL.
- Read the screenshot and describe what you see to the user
- Ask the user what action to take
- Send instructions back to the adaptive bridge
Task-Error Requests
When a task (balances or transactions) fails during execution, the graduated recovery system may write a type: 'task-error' request to data/adaptive-pending/. This is distinct from type: 'unknown-state' requests (login-phase).
The request includes:
- task: which task failed (balances or transactions)
- step: where in the task it failed (e.g., extract-dashboard-text, download-transactions)
- failedSelector: the Playwright selector that timed out (if applicable)
- error.category: classification (timeout, selector-not-found, navigation, maintenance, session-expired, unknown)
- page.url: current page URL
- page.textSnippet: first 2000 chars of visible page text
- screenshot: path to annotated screenshot
- elements: interactive elements on the page
To respond, submit instructions the same way as unknown-state requests:
node -e "require('./readers/adaptive-bridge').submitInstruction('<institution>', { actions: [...] })"
Actions can include: click (with selector), type (with selector + text), evaluate (with code), navigate (with url), wait (with ms), key (with key).
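Before submitting, an actions array can be checked against the field requirements listed above. This is a sketch only: the property that names the action kind is assumed to be `action`, which this document does not confirm, and validateActions is a hypothetical helper rather than part of the adaptive bridge.

```javascript
// Required fields per action kind, taken from the list above.
const REQUIRED_FIELDS = new Map([
  ['click', ['selector']],
  ['type', ['selector', 'text']],
  ['evaluate', ['code']],
  ['navigate', ['url']],
  ['wait', ['ms']],
  ['key', ['key']],
]);

// Assumption: each step names its kind in an `action` property.
function validateActions(actions) {
  const problems = [];
  actions.forEach((step, i) => {
    const required = REQUIRED_FIELDS.get(step.action);
    if (!required) {
      problems.push(`#${i}: unknown action "${step.action}"`);
      return;
    }
    for (const field of required) {
      if (step[field] === undefined) problems.push(`#${i}: ${step.action} needs ${field}`);
    }
  });
  return problems; // empty array means the instruction looks well-formed
}
```

Catching a malformed instruction locally matters most for task-error requests, where the response window is short.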
The recovery system has a 60s timeout for Level 3 adaptive requests (vs 300s for login adaptive). If no instruction arrives, it skips the task, preserves partial data, and sends a Telegram notification.
Sync Script Telegram Backup
The sync script sends its own notification via the bot API when MFA is detected. This is a backup — the agent MUST still poll the bridge directory, because the script notification may fail silently.
Data Integrity Safeguards
- Failed syncs never overwrite good data — previous balances are preserved
- Balance sanity check: warns if any account balance changed >50% from last sync
- Zero-balance protection: if a new sync returns no balances (an empty result) but the previous sync had data, the previous data is kept
- API token expiry: warns when refresh tokens expire within 2 days
- Each output file has syncedAt and previousSyncedAt for staleness tracking
- Real estate refreshes monthly (25-day staleness threshold)
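Two of the safeguards above, the 50% sanity check and zero-balance protection, can be sketched together. The input shape (objects mapping accountId to balance) and the function name are assumptions, and "0 balances" is read here as an empty result rather than a balance of 0.

```javascript
// previous / next: objects mapping accountId -> balance (hypothetical shape).
function checkBalances(previous, next) {
  const prevIds = Object.keys(previous);
  // Zero-balance protection: an empty result never replaces existing data.
  if (Object.keys(next).length === 0 && prevIds.length > 0) {
    return { balances: previous, warnings: ['sync returned no balances; kept previous data'] };
  }
  const warnings = [];
  for (const id of prevIds) {
    const before = previous[id];
    const after = next[id];
    if (after === undefined || before === 0) continue; // no baseline to compare against
    const change = Math.abs(after - before) / Math.abs(before);
    if (change > 0.5) warnings.push(`${id} changed ${Math.round(change * 100)}%`);
  }
  return { balances: next, warnings };
}
```

The sanity check only warns; the new balances are still returned so a genuine large change, such as a big deposit, is not discarded.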
Summary Format
After sync completes, report:
SYNC COMPLETE
Institution   Status          Balances   Transactions   Time
─────────────────────────────────────────────────────────
BankA         ✓               4          12             45s
BankB         ✓               2          8              38s
BankC         ✓               1          5              52s
BankD         ✓ (API)         2          15             2s
BankE         ✗ MFA timeout   -
...
Imported: 24 balances, 40 transactions → SQLite
Classified: 35 new transactions categorized
⚠ BankE: MFA code not received within timeout
✓ All other institutions synced successfully
Key Commands
node readers/sync-all.js --import --classify # full pipeline
node readers/sync-all.js # sync only (JSON staging)
node readers/sync-all.js --balances # balances only (faster)
node sync-engine/import.js # import JSON → SQLite (independent)
node sync-engine/classify.js # classify transactions (independent)
node sync-engine/classify.js --stats # classification breakdown
node readers/run.js <bank> --balances # single bank balances
node readers/run.js <bank> --transactions # single bank transactions