name: sentence-mining
description: Build and maintain Japanese sentence-mining cards for Anki, fully self-contained (no AnkiMorphs install required) and configurable per user via a one-time /sentence-mining setup. (1) Video mode — paste any Instagram reel, YouTube video/Short, TikTok, Twitter video, or local file → yt-dlp + AssemblyAI + SudachiPy + a built-in i+1 known-word diff produces draft cards. (2) Bank mode — give a list of target words → search across your locally-indexed subs2srs .apkg banks for natural example sentences, reusing the bank's original audio + screenshot when available. (3) Replace mode — fix existing cards whose example sentence is bad (too short, a fragment, incomprehensible): pull a better, more comprehensible sentence (Immersion Kit → Nadeshiko → local bank, re-ranked by your own i+1), edit the card in place (archiving the old sentence to a previous_versions field), and rehabilitate the card (de-leech, unsuspend, reset-to-due) so you re-learn it fresh. All modes push via AnkiConnect onto a note type and decks you choose at setup. Use proactively whenever input is (a) a Japanese-language video URL, (b) a list of Japanese words, or (c) a request to improve/replace sentences on existing cards. Trigger phrases include "mine this video", "make sentence cards from ", "turn this reel into cards", "mine these words", "find sentences for [w1, w2, …]", "i keep forgetting ", "pull cards from my bank", "leech these", "search the banks for X", "replace the sentence for ", "find a better sentence for X", "fix my flag:1 cards", "these sentences are too short/confusing", "set up sentence mining", /sentence-mining, /sentence-mining setup, or any video URL paired with a mention of Anki / cards / morphs / i+1.
Sentence Mining
Three modes (plus a one-time setup). Two of them create new cards and share one
post-processing pipeline; the third fixes existing cards in place.
            ── CREATE NEW CARDS ──               ── FIX EXISTING CARDS ──
┌────────────────────┐ ┌─────────────────────┐ ┌──────────────────────────┐
│  Video URL / file  │ │    List of words    │ │ flag:1 / "fix" / a word  │
│  "mine this reel"  │ │   "card for 同期"   │ │ "better sentence for X"  │
└─────────┬──────────┘ └──────────┬──────────┘ └────────────┬─────────────┘
          ▼                       ▼                         ▼
   ┌─────────────┐         ┌──────────────┐     ┌───────────────────────┐
   │ VIDEO MODE  │         │ BANK MODE    │     │ REPLACE MODE          │
   │ yt-dlp →    │         │ search local │     │ IK → Nadeshiko → bank │
   │ AssemblyAI →│         │ .apkg banks  │     │ re-ranked by your i+1 │
   │ i+1 diff    │         │              │     │                       │
   └──────┬──────┘         └──────┬───────┘     └───────────┬───────────┘
          └───────────┬───────────┘                         ▼
                      ▼                       ┌───────────────────────────┐
         ┌─────────────────────────┐          │ edit card IN PLACE:       │
         │   Shared post-process   │          │ archive old → new + media │
         │   curate → explain →    │          │ → rehab → reflag / retire │
         │   media → draft → push  │          └───────────────────────────┘
         └─────────────────────────┘
Video and bank mode share Steps 1–8 below: each mode's reference covers the mode-specific Steps 1–3 (and the Step 5 specifics), then you return here for the shared post-processing. Replace mode runs its own pipeline end to end; see references/replace-mode.md.
Route the request. Look at what Ray gave you:
| Input | Mode | Reference |
|---|---|---|
| setup / "set up sentence mining" / no config.json yet | setup | references/setup.md |
| URL (instagram, youtube, tiktok, twitter) or local video | video | references/video-mode.md |
| Plain list of Japanese words | bank | references/bank-mode.md |
| "replace"/"swap"/"better sentence for" existing cards, or "fix my flag:1 cards" | replace | references/replace-mode.md |
| Both (URL + words) | ask Ray | — |
Replace vs bank/video: if the words/cards already exist in Anki and the ask is to swap in a better example sentence (the current one is too short, a fragment, or incomprehensible) — that's replace mode, not bank mode. Bank/video mode create new cards; replace mode edits existing ones in place and archives the old sentence.
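The routing table and the replace-vs-bank rule can be read as a small decision function. The following is a minimal sketch, not part of the skill's scripts: `route` is a hypothetical helper, and the URL regex, the list of video extensions, the keyword hints, and the Japanese-character check are all assumptions chosen for illustration.

```python
import re
from pathlib import Path

URL_RE = re.compile(r"https?://\S+")
VIDEO_EXTS = {".mp4", ".mkv", ".mov", ".webm"}  # assumed set of local video types
REPLACE_HINTS = ("replace", "swap", "better sentence for", "flag:1")  # assumed keywords
JAPANESE_RE = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff]")  # kana + common kanji


def route(request: str, config_exists: bool) -> str:
    """Return 'setup', 'video', 'bank', 'replace', or 'ask' for a raw request."""
    text = request.lower()
    if not config_exists or "setup" in text or "set up sentence mining" in text:
        return "setup"
    if any(hint in text for hint in REPLACE_HINTS):
        return "replace"
    has_video = bool(URL_RE.search(request)) or any(
        Path(tok).suffix.lower() in VIDEO_EXTS for tok in request.split()
    )
    # Strip URLs first so a URL on its own never counts as a word list.
    has_words = bool(JAPANESE_RE.search(URL_RE.sub("", request)))
    if has_video and has_words:
        return "ask"  # both a URL and words: ask which mode was meant
    if has_video:
        return "video"
    if has_words:
        return "bank"
    return "ask"
```

In practice the routing is a judgment made from the whole message, so keyword matching like this is only a first approximation of the table.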
Canonical sentence-source order (policy): when finding an example sentence for a word, try Immersion Kit → Nadeshiko → local sentence bank in that order, keeping the first hit usable at Ray's i+1. This is fully implemented in replace mode. The new-card modes don't use it yet — video mode takes sentences from the video, bank mode from local banks; moving them onto this same order is the intended next step (not done yet). See references/replace-mode.md.
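The "first usable hit in a fixed order" policy is a short fallback loop. A minimal sketch, assuming each source is wrapped as a function from a word to candidate sentences; `first_usable_sentence`, the stub sources, and the `usable` predicate are hypothetical stand-ins for the real Immersion Kit, Nadeshiko, and local-bank searches and the i+1 check:

```python
from typing import Callable, Iterable, Optional

Source = Callable[[str], Iterable[str]]  # word -> candidate sentences


def first_usable_sentence(
    word: str,
    sources: list[tuple[str, Source]],
    usable: Callable[[str], bool],
) -> Optional[tuple[str, str]]:
    """Try each source in order; return (source_name, sentence) for the first usable hit."""
    for name, search in sources:
        for sentence in search(word):
            if usable(sentence):
                return name, sentence
    return None  # a miss across every source


# Stub sources in the canonical order (real ones would call the services).
sources = [
    ("immersion_kit", lambda w: []),
    ("nadeshiko", lambda w: ["短い", w + "の例文です"]),
    ("local_bank", lambda w: [w + "はここ"]),
]
hit = first_usable_sentence("同期", sources, usable=lambda s: len(s) >= 5)
```

Because the loop returns on the first usable sentence, later sources are only queried when earlier ones miss.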
Before anything else, ensure Anki is up (skip for setup mode, which has no collection to hit yet). The create/fix modes all need AnkiConnect, so run bash <skill-dir>/scripts/ensure_anki.sh first — it launches Anki for you if it's closed and waits for it to load, instead of dying mid-pipeline on "Connection refused". Only stop and tell the user if it exits non-zero.
Then check setup. If <skill-dir>/config.json does not exist, the skill is unconfigured — route to setup mode (references/setup.md) first, then continue with the user's actual request. Setup is also how a friend imports this skill into their own Anki: it interviews them for their note type, fields, decks, known-word sources, and sentence banks, then writes their own config.json (git-ignored, never shared).
Inputs and required env
The skill is designed to be shareable: nothing about a specific person's Anki is hardcoded. Two git-ignored files hold all the per-user state:
- <skill-dir>/config.json — note type, field mapping, deck names, known-word sources, sentence-bank locations. Written by /sentence-mining setup. See config.example.json for the shape. Read by every script through _config.py.
- <skill-dir>/.env — API keys only:
  cp <skill-dir>/.env.example <skill-dir>/.env  # then paste the keys
  Required keys: ASSEMBLYAI_API_KEY (video mode only) and GEMINI_API_KEY (both modes — explanation TTS + sentence TTS fallback). Real env vars override .env. If a key is missing, the script exits pointing at .env — don't fall back to alternatives without asking.
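The precedence rule (real environment variables win over .env, and a missing key stops the run) can be sketched as below. This is an illustration under assumptions, not the contents of _env.py: `load_env` and `require_key` are hypothetical names, and the parser only handles plain KEY=VALUE lines.

```python
import os
from pathlib import Path


def load_env(path: Path) -> None:
    """Load KEY=VALUE lines from a .env file without overriding real env vars."""
    if not path.exists():
        return
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # setdefault keeps a variable that is already set in the real environment
        os.environ.setdefault(key.strip(), value.strip().strip('"').strip("'"))


def require_key(name: str) -> str:
    """Return the key, or exit with a message pointing at .env."""
    value = os.environ.get(name, "")
    if not value:
        raise SystemExit(f"{name} is missing; add it to <skill-dir>/.env")
    return value
```

Using `setdefault` rather than plain assignment is what gives the real environment priority over the file.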
The only hard dependencies besides those two files:
- Anki running with AnkiConnect (default port 8765) — don't verify by hand; run bash <skill-dir>/scripts/ensure_anki.sh once at the start of any mode. It pings AnkiConnect, launches Anki (open -a Anki) if it's down, waits up to ~3 min for it to load, and confirms it stays up (3 pings — Anki sometimes answers once then crashes during a big-collection load on this machine). Exit 0 = good to proceed; exit 1 = surface its stderr to the user (likely a sync/database modal blocking the addon, or the addon disabled).
- yt-dlp, ffmpeg on PATH
- Python: pip3 install --break-system-packages google-genai sudachipy sudachidict_core (SudachiPy is the Japanese tokenizer — pure pip, no brew install mecab needed)
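The stability check in the first dependency (several pings rather than one) can be sketched in Python. This is an illustration only, not the contents of ensure_anki.sh: `ping` and `is_stable` are hypothetical names, the one-second gap between pings is an assumption, and the request uses AnkiConnect's `version` action on the default port.

```python
import json
import time
import urllib.request
from typing import Callable

ANKI_URL = "http://127.0.0.1:8765"  # AnkiConnect's default address


def ping(url: str = ANKI_URL, timeout: float = 2.0) -> bool:
    """Return True if AnkiConnect answers a 'version' request without an error."""
    body = json.dumps({"action": "version", "version": 6}).encode("utf-8")
    req = urllib.request.Request(url, data=body, headers={"Content-Type": "application/json"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            reply = json.load(resp)
    except (OSError, ValueError):  # connection refused, timeout, or bad JSON
        return False
    return isinstance(reply, dict) and reply.get("error") is None


def is_stable(check: Callable[[], bool] = ping, pings: int = 3, gap: float = 1.0) -> bool:
    """Require several consecutive successful pings, not just one."""
    for i in range(pings):
        if not check():
            return False
        if i < pings - 1:
            time.sleep(gap)  # assumed spacing between pings
    return True
```

A single successful ping would miss the case described above, where Anki answers once and then dies while loading a large collection.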
AnkiMorphs is NOT required. The i+1 known-word diff is re-implemented inside the skill: it reads the cards in the decks/note-types you name at setup, SudachiPy-tokenizes the configured field, and treats a lemma as "known" once its highest card interval ≥ threshold (default 21 days) — the same idea AnkiMorphs uses, but computed live through AnkiConnect with the same tokenizer the miner uses. See references/known-words.md.
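The "known once the highest interval reaches the threshold" rule reduces to a max-per-lemma pass. A minimal sketch, assuming the cards have already been fetched and tokenized into (lemmas, interval-in-days) pairs; `known_lemmas` and `unknown_lemmas` are hypothetical helpers, and the real diff described in references/known-words.md may differ in detail:

```python
from typing import Iterable

KNOWN_INTERVAL_DAYS = 21  # default threshold named in the text


def known_lemmas(
    cards: Iterable[tuple[list[str], int]],
    threshold: int = KNOWN_INTERVAL_DAYS,
) -> set[str]:
    """cards: (lemmas found in the configured field, card interval in days)."""
    best: dict[str, int] = {}
    for lemmas, interval in cards:
        for lemma in lemmas:
            best[lemma] = max(best.get(lemma, 0), interval)
    return {lemma for lemma, days in best.items() if days >= threshold}


def unknown_lemmas(sentence_lemmas: list[str], known: set[str]) -> list[str]:
    """Lemmas in a sentence that are not known yet, in order, without repeats."""
    seen: set[str] = set()
    out: list[str] = []
    for lemma in sentence_lemmas:
        if lemma not in known and lemma not in seen:
            seen.add(lemma)
            out.append(lemma)
    return out


cards = [(["食べる", "猫"], 30), (["猫", "気迫"], 5), (["走る"], 21)]
known = known_lemmas(cards)
missing = unknown_lemmas(["猫", "気迫", "食べる"], known)  # one unknown lemma: an i+1 sentence
```

A sentence with exactly one entry in `missing` is an i+1 candidate; more entries correspond to the deferred i+2/i+3 cases.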
For bank mode, the banks must be indexed first — setup offers to do this, or see references/bank-mode.md §"One-time setup".
If config.json is missing when a script runs, it exits telling the user to run
/sentence-mining setup.
Steps 1–3 (mode-specific)
Follow the reference for the mode you routed into. By the end of those steps you have a candidates.json (or banksearch.json) shaped as:
{
  "source": "video" | "bank-search",
  "source_id": "...",
  "source_url": "...",   // optional, video only
  "candidates": [
    { "lemma": "...", "sentence": "...", "deck": "<config.decks.main>", "i_level": "i1" | "i?", ... },
    ...
  ]
}
Read it. Zero candidates? Tell Ray why (all words known, all dupes, no hits across banks, etc.) and stop.
Step 3.5 — Curate (shared, inline)
Walk the candidate list and drop entries that aren't worth a card. Filter aggressively — a Ray-quality card teaches a generalizable word he'll hit again, not a one-off label from this specific source. Drop:
- Pop-culture proper nouns — anime/manga/game titles, character names, song titles, group/idol names. Real-world brands or places (スターバックスコーヒー, 富士山, 東京) are fine; pop-culture-specific titles are not.
- Tokenizer fragments — lemmas that are clearly mid-word cuts (ざいって from "うざいって", けんぽ from "じゃんけんぽい"). Tell: starts with a particle, ends mid-syllable, no dictionary entry. Rarer now that SudachiPy SplitMode C keeps compounds whole, but still spot-check. (Video mode only — bank mode doesn't tokenize.)
- Transcription garbage — nonsense given the sentence's clear topic, especially when JPDB rank is 1000000000 (no entry). Don't try to rescue a contaminated sentence; drop the candidate. (Video mode only.)
- Trail-off / partial sentences — sentence ends mid-clause or starts with a connecting particle; the audio clip will sound broken.
- Subs2srs concatenated frames — bank-mode sentences like (line1) (line2): keep only the chunk containing the target word.
- Compound katakana redundant with components — if both アイスアメリカーノ and アメリカーノ are candidates, drop the compound.
When dropping is a judgment call, lean toward dropping. Ray would rather mine 3 great cards than 15 mediocre ones.
Apply by deleting entries from data["candidates"] and saving back. Print a short "kept N / dropped M because …" summary.
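The apply-and-summarize step can be sketched as follows. It assumes the candidates file has the shape shown under Steps 1–3; `curate` and the `drop_reason` callback are hypothetical, and the `jpdb_rank` field name in the example rule is an assumption. Real curation is a judgment pass, so the callback only stands in for those decisions.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Callable, Optional


def curate(path: Path, drop_reason: Callable[[dict], Optional[str]]) -> str:
    """Drop candidates for which drop_reason returns a reason; save and summarize."""
    data = json.loads(path.read_text(encoding="utf-8"))
    kept: list[dict] = []
    reasons: Counter = Counter()
    for cand in data["candidates"]:
        reason = drop_reason(cand)
        if reason is None:
            kept.append(cand)
        else:
            reasons[reason] += 1
    data["candidates"] = kept
    # ensure_ascii=False keeps the Japanese text readable in the saved file
    path.write_text(json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8")
    dropped = sum(reasons.values())
    why = ", ".join(f"{n} {r}" for r, n in reasons.most_common()) or "nothing dropped"
    return f"kept {len(kept)} / dropped {dropped} ({why})"


def no_jpdb_entry(cand: dict) -> Optional[str]:
    """Example rule only: treat the sentinel rank as 'no dictionary entry'."""
    return "no JPDB entry" if cand.get("jpdb_rank") == 1000000000 else None
```

Calling `curate(Path("candidates.json"), no_jpdb_entry)` rewrites the file in place and returns the one-line summary to print.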
Step 4 — Generate explanations (shared, inline)
For each candidate, generate the Japanese explanation inline — don't shell out. Use this prompt verbatim, swapping {word} and {sentence} (it's the prompt from Ray's ai-language-explainer addon, so cards match the style of his 9000+ existing ones):
Please write a short explanation of the word '{word}' using the context of the original sentence: '{sentence}'.
Write an explanation that helps a Japanese beginner understand the word and how it is used with this context as an example.
Explain it in the same way a native would explain it to a 13-year-old. Don't use any English, only use simpler Japanese.
1. Don't write the furigana for any of the words in brackets after the word.
2. Don't start with stuff like という言葉を簡単に説明するね, just dive straight into explaining after starting with the word.
Write each explanation into the candidate's explanation field. Keep each under ~250 Japanese characters — it gets read aloud by TTS.
Why Claude (you) writes this and not a script: the prompt depends on contextual Japanese fluency. You produce more natural output than a separate API call would, and you can react to tone (formal vs casual vs anime-speak). See references/explanation-prompt.md for the canonical addon prompt if you ever need to verify.
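Before media generation it can help to check every candidate's explanation field against the two constraints stated here: not empty, and short enough for TTS. A minimal sketch, assuming the "~250 characters" guideline is treated as a hard cap; `split_by_explanation` is a hypothetical helper, not one of the skill's scripts:

```python
MAX_EXPLANATION_CHARS = 250  # "~250" in the text; treated here as a hard cap


def split_by_explanation(candidates: list[dict]) -> tuple[list[dict], list[dict]]:
    """Return (ready, needs_attention) based on each candidate's explanation field."""
    ready: list[dict] = []
    needs_attention: list[dict] = []
    for cand in candidates:
        text = (cand.get("explanation") or "").strip()
        if text and len(text) <= MAX_EXPLANATION_CHARS:
            ready.append(cand)
        else:
            needs_attention.append(cand)  # empty, missing, or too long for TTS
    return ready, needs_attention
```

Candidates in the second list are the ones to rewrite, shorten, or drop before moving on to Step 5.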
Step 5 — Generate media (mode-specific)
Different script per mode — both write the same draft.json shape so Step 7 (push) is shared.
| Mode | Script | What it does |
|---|---|---|
| video | scripts/generate_media.py | ffmpeg clip + screenshot from the video, Gemini TTS on explanation |
| bank | scripts/generate_media_bank.py | copy bank's audio/image (or Gemini TTS sentence if absent), TTS explanation |
Both write sentenceAudio_file, picture_file, explanationAudio_file (the latter two may be empty strings for bank cards where the bank shipped no image). Media lands in /Users/ray/Library/Application Support/Anki2/User 1/collection.media/.
Cards are processed by a pool of 3 concurrent workers (ffmpeg clip + screenshot + Gemini TTS per card, staying under Gemini's 10 RPM free-tier cap). Video mode pushes inline by default: generate_media.py inserts each card into Anki the moment its own media finishes — three generations in flight, cards streaming in one by one as they complete (out of order is normal). This folds Steps 5–7 into one command. Pass --no-push to only write the draft for a separate push.py run (the legacy two-step / draft-only flow, still used by bank mode).
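The "three in flight, push each card as it finishes" behaviour maps onto a thread pool consumed in completion order. A minimal sketch using the standard library; `generate_and_push`, `make_media`, and `push_card` are hypothetical names standing in for the per-card ffmpeg/TTS work and the AnkiConnect insert, and `push=False` mirrors the idea behind --no-push:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable

MAX_WORKERS = 3  # three generations in flight, per the text


def generate_and_push(
    cards: list[dict],
    make_media: Callable[[dict], dict],
    push_card: Callable[[dict], None],
    push: bool = True,
) -> list[dict]:
    """Generate media concurrently; push each card as soon as its own media is done."""
    finished: list[dict] = []
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        futures = [pool.submit(make_media, card) for card in cards]
        for future in as_completed(futures):  # completion order, not input order
            card = future.result()
            if push:
                push_card(card)
            finished.append(card)
    return finished
```

Because `as_completed` yields futures as they finish, the returned list (and the order cards appear in Anki) can differ from the input order, which matches the "out of order is normal" note.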
See references/video-mode.md §"Step 5" or references/bank-mode.md §"Step 5" for the script invocation.
Step 6 — Push & summarize
Default behavior: push immediately after Step 5 succeeds. No approval gate. Ray confirmed in June 2026 that the curation + explanation pass in Steps 3.5 + 4 has been reliable enough that asking "say push to commit" was just adding friction. Anki's own review queue is the real gate — bad cards get suspended or deleted there. Push first; show the result. For video mode this is automatic — Step 5 generates and inserts in one pass (pass --no-push only when Ray says "draft only"); bank mode still calls push.py afterward.
Skip auto-push only if Ray explicitly said "draft only" / "don't push" / "let me review first" in the originating message. In that case, fall through to the legacy approval flow at the bottom of this section.
Go to Step 7, then print the summary in this shape (video mode):
Pushed 17 cards from <SOURCE_ID> to Anki ✓
→ "Ray's Sentence Cards" (i+1): 12
1. 気迫 (きはく) — "彼は気迫のこもった目で..." [JPDB rank 4823]
...
→ "Ray's Sentence Mining Deferred" (i+2/i+3): 5
13. 揶揄う (からかう) — "..." [JPDB rank 12044]
...
Skipped during curation: <N> (ads / tokenizer fragments / transcription errors).
Draft: ~/Downloads/sentence-mining/<source>.draft.json
Example (bank mode):
Pushed 2 bank cards (word list: 同期, 西暦, 和暦) ✓
1. 同期 [tokyo_ghoul_season_1] 🔊🖼
"同期では二人 二人共 聡明で強い意思を持った女性でした"
2. 西暦 [legend_of_the_galactic_heroes_eng_jp]
"西暦2166年には 木星の衛星 イオに" (sentence TTS synthesized; no image)
Misses: 和暦 — no hit across N indexed banks.
If push.py reports any failures, list them with the reason from the response so Ray knows what didn't make it in.
Legacy approval flow (only when Ray says "draft only" / "don't push")
Print the same summary but with "Mined N candidate cards" and a "Say 'push' to commit, or tell me which to drop" line. Then wait. He may:
- Say "push" — run Step 7
- Say "drop 3, 7, 11" — remove those, ask again
- Say "regenerate explanation for 5" — redo, regen TTS for that one
- Say "try a different sentence for X" — look at runner-up bank hits and re-stage
- Say "no" — leave draft.json on disk; he can come back to it
Step 7 — What push.py does (tags, formatting, dedup)
python3 <skill-dir>/scripts/push.py --draft ~/Downloads/sentence-mining/<source>.draft.json
This:
1. Calls AnkiConnect addNotes with the full card list
2. Tags every card with two tags:
   - claude-sentence-mining (video) OR claude-sentence-bank (bank) — the permanent kind tag
   - i1 / i2 / i3 / i? — the current i-level (count of unknown content words in the sentence) so Ray can filter by complexity in Anki
   Other context — per-run source:*, speaker:*, bank:*, auto-mined:* — is intentionally NOT promoted to tags (Ray asked these be dropped in June 2026 because they cluttered the tag tree without adding study value). The full data still lives in the draft JSON for debugging.
3. Sentence field is prefixed with <b>A:</b> for video diarized cards so it's clear who's talking
4. Empty picture → 。 filler, never blank. The note type's Back template has a {{^picture}} branch that re-renders sentence_audio and forces .audio { display: block }. So a card with a blank picture field replays the sentence audio on the back — on AnkiMobile/AnkiDroid the audio autoplays on both front and back. push.py (and replace_apply.py) therefore write 。 into the picture field whenever there's no image, which flips {{#picture}} truthy and silences the back replay. Don't "fix" this back to an empty string.
5. Nothing is suspended — Ray studies them all and decides per-card
If any note fails to add (usually a late-detected duplicate), push.py reports which one and skips it without aborting the batch.
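The tagging, duplicate, and picture-filler rules above can be seen together in the request body sent to AnkiConnect. This is a sketch under assumptions, not push.py itself: `build_add_notes` and `failed_cards` are hypothetical, the field names are placeholders for whatever config.field_map defines, and the `<img>` wrapping of the picture is an assumption about how the image is stored. The null-for-failure result shape follows the AnkiConnect documentation for addNotes and may vary between versions.

```python
PICTURE_FILLER = "。"  # never leave the picture field blank


def build_add_notes(cards: list[dict], deck: str, note_type: str, kind_tag: str) -> dict:
    """Build an AnkiConnect addNotes request body for a list of draft cards."""
    notes = []
    for card in cards:
        picture = card.get("picture_file") or ""
        notes.append({
            "deckName": deck,
            "modelName": note_type,
            "fields": {
                # placeholder field names; the skill maps them via config.field_map
                "word": card["lemma"],
                "sentence": card["sentence"],
                "explanation": card["explanation"],
                "picture": f'<img src="{picture}">' if picture else PICTURE_FILLER,
            },
            "options": {"allowDuplicate": False},
            "tags": [kind_tag, card["i_level"]],  # kind tag + current i-level only
        })
    return {"action": "addNotes", "version": 6, "params": {"notes": notes}}


def failed_cards(cards: list[dict], result: list) -> list[dict]:
    """Pair each card with its result entry; None marks a note that was not added."""
    return [card for card, note_id in zip(cards, result) if note_id is None]
```

Keeping the payload builder separate from the HTTP call makes it easy to inspect exactly what would be pushed before anything touches the collection.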
Step 8 — Cleanup
Leave the video / draft / intermediate JSONs in ~/Downloads/sentence-mining/. Ray asked for this — it lets him re-run, re-watch, or scrub for context. Don't auto-delete.
Reference files
- references/setup.md — the /sentence-mining setup interview that writes config.json
- references/known-words.md — the built-in i+1 known-word diff (replaces AnkiMorphs); how "known" is computed and configured
- references/video-mode.md — Steps 1–3 and Step 5 for video-URL input
- references/bank-mode.md — Steps 1–3 and Step 5 for word-list input + one-time bank indexing setup
- references/replace-mode.md — fix existing cards' sentences in place via Immersion Kit, re-ranked by your i+1, archiving to previous_versions
- references/apkg-schema.md — .apkg ZIP/SQLite layout and field separators
- references/bank-formats.md — field-role detection heuristics + known notetypes
- references/note-type.md — note-type fields and how config.field_map maps onto them
- references/transcript-schema.md — shape of AssemblyAI's response
- references/explanation-prompt.md — verbatim prompt from Ray's addon
Scripts inventory
| script | mode | purpose |
|---|---|---|
| setup.py | setup | probe Anki (note types/fields/decks), tools, keys; validate config.json |
| _config.py | all | load config.json (merged over defaults) — single source of truth |
| transcribe.py | video | AssemblyAI Universal-3 Pro JP transcription with diarization |
| analyze.py | video | SudachiPy tokenize + built-in known-word diff (cached) + JPDB rank |
| generate_media.py | video | ffmpeg clip + screenshot + Gemini TTS (3 parallel); pushes each card inline as it finishes (--no-push = stage draft only) |
| extract_bank.py | bank | parse .apkg → local index JSON + media dir |
| search_banks.py | bank | word-list → top-N sentence candidates across indexed banks |
| generate_media_bank.py | bank | copy bank media (or TTS fallback) + Gemini TTS explanation |
| replace_search.py | replace | resolve target cards (flag / note-ids / words) → search Immersion Kit → Nadeshiko → local bank → filter + re-rank by your i+1 → replace-draft JSON |
| replace_apply.py | replace | stage media (URL or local) + TTS explanation (best-effort), archive old sentence to previous_versions, overwrite fields, retag i-level, rehabilitate (de-leech/unsuspend/reset-to-due), clear flag:1 so the redone card just rejoins the study queue (--done-flag N to flag instead, -1 to leave). Retires unfixable misses (not-worth-learning + suspend + clear flag; --keep-misses to skip). --rehab-flag N rehabilitates a batch with no field changes |
| push.py | both | AnkiConnect addNotes onto config.note_type via config.field_map |
| ensure_anki.sh | all | ping AnkiConnect; open -a Anki if down; wait for load + verify stable (run first) |
| _env.py | both | loads .env into os.environ |
| _anki.py | both | AnkiConnect helper + storeMediaFile (URL from config) |
Gotchas (universal)
- AnkiConnect must be running — but don't make Ray launch it. Run bash <skill-dir>/scripts/ensure_anki.sh at the start of every mode; it launches Anki if closed, waits for the collection to load, and verifies stability (Anki has crashed once mid-load right after launch on this machine, answering a single version ping before dying — the 3-ping stability check catches that). Only if it exits 1 after launching is something actually wrong (sync/database modal blocking the addon, or addon disabled) — surface that to Ray rather than retrying.
- Don't push cards with empty explanation. If Step 4 failed for a card (you got confused, refused, etc.), drop it from the draft rather than pushing a hollow one.
- Gemini TTS preview model rate-limits. Free tier is 10 RPM. generate_media.py (video) caps TTS concurrency at 3, generate_media_bank.py (bank) at 2 (override with SM_TTS_CONCURRENCY); both back off exponentially on 429. If you still hit limits, lower the cap or serialize.
- allowDuplicate: False in push.py means re-pushing the same word is silently rejected. To check ahead of time: query <word-field>:<lemma> deck:"<main-deck>" (from config.field_map.word / config.decks.main) against AnkiConnect during curation. analyze.py already pre-dedupes against the configured mining decks.
- Known-word scan is cached. The first mine of the day scans every configured known-source deck (~100s for a large collection); subsequent runs reuse the cache for config.known_words.cache_hours (default 6). After a big review session, pass analyze.py --refresh-known (or just wait out the TTL) so freshly-matured words drop out of mining.
- Never leave the picture field blank — write 。. A blank picture makes the Back template replay the sentence audio (double audio on mobile, front + back). All create/fix scripts already default an imageless card's picture to 。. To bulk-fix legacy blank-picture cards: updateNoteFields setting picture to 。 on every note matching note:"<note_type>" picture:. Done once in June 2026 across 1626 cards.
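The exponential backoff mentioned in the rate-limit gotcha follows a common retry pattern. A minimal sketch, not the scripts' actual code: `with_backoff` is a hypothetical helper, `RateLimited` stands in for whatever error the TTS client raises on HTTP 429, and the base delay, retry count, and jitter are assumptions.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Stand-in for whatever error the TTS client raises on HTTP 429."""


def with_backoff(
    call: Callable[[], T],
    retries: int = 5,
    base: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry call() on RateLimited, doubling the wait each attempt (plus jitter)."""
    for attempt in range(retries):
        try:
            return call()
        except RateLimited:
            if attempt == retries - 1:
                raise  # out of retries: let the caller see the rate-limit error
            sleep(base * (2 ** attempt) + random.uniform(0, 1))
    raise RuntimeError("retries must be >= 1")
```

Wrapping each TTS request as `with_backoff(lambda: synthesize(text))` (where `synthesize` is again hypothetical) keeps the retry policy in one place, independent of the concurrency cap.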
。. A blank picture makes the Back template replay the sentence audio (double audio on mobile, front + back). All create/fix scripts already default an imageless card's picture to。. To bulk-fix legacy blank-picture cards:updateNoteFieldssettingpictureto。on every note matchingnote:"<note_type>" picture:. Done once in June 2026 across 1626 cards.