---
name: rajio
description: Use only when explicitly asked to use rajio for Japanese audio/video subtitle translation.
metadata:
  author: OneKuma
  version: '0.2.0-beta.2'
---
Rajio
Use this skill to translate Japanese audio/video into polished, carefully proofread
subtitles with rajio: prepare context, extract audio, transcribe Japanese, proofread
the transcript, run multi-round Simplified Chinese translation and review, polish the
final subtitle text, and export SRT/ASS files.
Do not use this skill unless the user explicitly asks for the rajio skill or asks you to create polished Chinese subtitles from Japanese audio/video with rajio.
Non-Negotiable Rules
Quality comes first:
- The goal is accurate, natural, comfortable subtitles. Correctness, readability, tone, subtitle flow, and viewing comfort outrank mechanical formatting cleanup.
- rajio check is a quality floor, not the definition of finished quality. fatal means invalid data or session state, error means a problem that seriously hurts readability in ordinary cases, and warning means a recommended improvement.
- Clearing, skipping, or preserving rajio check issues only establishes the technical baseline. Manual review and text refinement are still required for ASR mistakes, proper nouns, context, terminology, fixed phrases, translation consistency, concision, tone, and Chinese subtitle polish.
- Do not satisfy QA heuristics by making subtitles less correct, less natural, or harder to watch. Also do not ignore warnings mechanically: inspect remaining warnings and decide case by case whether to fix, preserve, merge, split, retime, compress, or document the exception.
Respect privacy and provider boundaries:
- Make the privacy boundary explicit before transcription. Rajio uploads audio to the configured transcription provider; start transcription only after the user authorizes that upload.
- During translation_work, do not use the OpenAI-compatible provider configured in .env as a machine-translation service. Sub-agents produce the first-draft batch translations from the provided context; the main agent reviews, merges, validates, and performs the required full-file Chinese refinement.
Respect manual-stage ownership:
- transcript_work and translation_work are manual stages. Always process first-draft proofreading and translation through sub-agent batches. If sub-agent tooling is unavailable, stop and report that the manual stage cannot be completed under this skill.
- The main agent owns batch planning, patch review, patch application, glossary decisions, consistency QA, description.md, validation, commits, exports, and final reporting.
- The main agent must not proofread or translate the full first-draft manual stage by itself. The explicit exception is final Chinese refinement: after sub-agent translation batches have produced the first draft, the main agent must perform full-file Chinese subtitle refinement as described in Refine Chinese Subtitles.
Define manual review strictly:
- Manual review means reading subtitle segment text in timeline order with enough neighboring context to judge meaning, flow, timing, tone, terminology, and subtitle comfort. It does not mean only running rajio check, scanning issue summaries, applying generated patches, or spot-checking validation examples.
- During transcript_work, translation_work, and final Chinese refinement, agents must use rajio segments list with explicit ranges, offsets, IDs, issue filters, and neighboring context to inspect actual subtitle text in batches. Every segment in the assigned or owned scope must be reviewed as text, not only as QA metadata.
- Batch workers must read every segment in their assigned range plus enough surrounding segments to catch cross-boundary continuity problems. The main agent must review worker patches, batch boundaries, and glossary decisions, and must complete at least one continuous full-file Chinese pass after applying first-draft translation patches.
- rajio check is only validation support. Passing checks, clearing warnings, adding skip_checks, or applying suggested patches does not count as manual review unless the affected subtitle text has been read in context.
- Do not use ad hoc automation scripts to edit segments.toml, generate subtitle text, generate translation/proofread patch operations, or add skip_checks. Write and review subtitle edits through the manual process and rajio segments tooling.
- Automation scripts are allowed only for non-editing support such as collecting counts, slicing check JSON for inspection, comparing statistics, or validating data shape. They must support review, not replace reading and judgment.
- If a range was not read segment by segment in context, report it as unreviewed. Do not call the stage polished or final.
Respect file boundaries:
- Never edit transcript/raw/segments.toml, transcript/raw/checkpoints/*.toml, or transcript/raw/chunks/*.toml. Raw transcript files are references.
- Edit only the active manual work file, description.md, and session-local patch or review artifacts: transcript/work/segments.toml, translation/work/segments.toml, description.md, and files under session-local patches/ or clip review directories.
- description.md is the source of truth for media metadata, user notes, context, glossary, fixed terms, style requirements, and unresolved uncertainty. Keep it current throughout the session.
Use rajio tools deliberately:
- Use rajio check as documented in the CLI section before commits and final reporting, while remembering it is not a substitute for manual QA.
- Use rajio segments commands for stable targeted edits to work-stage segments.toml: list/filter segments, edit fields, split/merge subtitle units, and delete semantically empty filler segments. Shape: rajio segments <command> <target>.
- Use rajio clips commands for difficult source-video ranges that need independent retranscription for comparison. Clip outputs are sidecar review artifacts only; do not treat them as automatic replacements for transcript/work/segments.toml.
- Record intentional subtitle QA error exceptions with per-segment skip_checks in the work-stage segments.toml. Every skip must name the exact issue code and include a reason. Never skip fatal data/file/schema/timeline issues, unfinished translation, or unreviewed batches. Matched skips are omitted from check output; stale skips still report fatal unused_skip_check.
Sub-Agent Batch Contract
- Spawn sub-agents for every transcript_work proofread batch and every translation_work translation batch. If sub-agent tooling is unavailable, stop and report that manual stages cannot be completed under this skill.
- Run sub-agent batches within the active concurrency/thread limit, and close or release completed workers before spawning more.
- Read SUB_AGENTS.md before spawning sub-agents. Keep this file focused on workflow rules; use that document for mandatory batch-worker templates and instructions.
- The main agent owns batch planning, patch application, glossary decisions, consistency QA, final full-file Chinese refinement, description.md, rajio check, commits, exports, and final reporting.
Required Input
- Local audio/video path. Refuse to start without this.
- Optional but preferred: title, original URL, publish date, uploader/channel, synopsis, cast, program/corner names, user notes, fixed terminology, and translation style requirements.
If optional metadata is missing, proceed with filename-based defaults, record the
uncertainty in description.md, and revisit it when transcript context reveals more.
CLI Quick Reference
For complete command syntax, examples, output formats, segment patch shape, clip artifact details, and environment variables, read CLI.md.
Check whether rajio is available:
command -v rajio
If it is not installed, run commands through npx rajio ....
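A support script that wraps the CLI can make the same choice programmatically. The following is a minimal sketch; the helper name rajio_command is hypothetical, and only the command -v rajio check and the npx rajio fallback come from the text above.

```python
import shutil
from typing import Callable, Optional


def rajio_command(which: Callable[[str], Optional[str]] = shutil.which) -> list[str]:
    """Return the argv prefix for invoking rajio (hypothetical helper).

    Mirrors `command -v rajio`: use the installed binary when it is on PATH,
    otherwise fall back to `npx rajio`.
    """
    return ["rajio"] if which("rajio") else ["npx", "rajio"]


# Build a doctor invocation without running it.
argv = rajio_command() + ["doctor", "/path/to/session"]
print(argv)
```

The lookup function is injectable only so the choice can be exercised without changing PATH.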
Command Overview
Use the installed CLI:
rajio <target> [options]
rajio segments <command> <target> --stage transcript
rajio clips <command> <target>
rajio check <target>
rajio doctor <target>
Default Command
The default command drives the whole session workflow.
Default command media option:
- --media <path>: invocation-only media override.
Default command workflow controls:
- --continue=until-manual: run automatic stages until the next manual stage.
- --continue=step: run one automatic stage.
- --commit: commit the current manual stage after validating its work file.
- --reset <stage>: regenerate from audio, transcript_raw, transcript_work, translation_work, or export.
- --full: runs automatic stages only; manual stages still require sub-agent batch work and --commit.
Audio chunk options:
- --chunk-target <seconds>: local audio chunk target. Default 600, minimum 60.
- --chunk-boundary-search <seconds>: silence search window around the target cut point. Default 90, range 0..300.
- --chunk-silence-noise <db>: ffmpeg silencedetect threshold. Default -35.
- --chunk-silence-duration <seconds>: minimum silence duration. Default 0.4.
rajio keeps both single_file and chunking audio strategies. With the current
ElevenLabs/scribe_v2/integrated flow, rajio validates these options for CLI compatibility but
selects strategy = "single_file" and does not write stages.audio.chunking or
stages.audio.chunks[]. Future transcription models may select the chunking strategy again.
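The documented defaults and limits for the chunk options can be written down as a small data-shape check. This is a sketch under assumptions: the ChunkOptions class and its field names are hypothetical, and only the limits stated above (minimum 60 for the target, 0..300 for the boundary search) are enforced, since no bounds are documented for the two silence options.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkOptions:
    """Hypothetical container for the --chunk-* values; defaults from the list above."""

    target: float = 600.0            # --chunk-target, seconds
    boundary_search: float = 90.0    # --chunk-boundary-search, seconds
    silence_noise: float = -35.0     # --chunk-silence-noise, dB
    silence_duration: float = 0.4    # --chunk-silence-duration, seconds

    def problems(self) -> list[str]:
        """Return human-readable violations of the documented limits."""
        found = []
        if self.target < 60:
            found.append("chunk-target must be at least 60 seconds")
        if not 0 <= self.boundary_search <= 300:
            found.append("chunk-boundary-search must be within 0..300 seconds")
        return found


print(ChunkOptions().problems())
print(ChunkOptions(target=30, boundary_search=301).problems())
```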
Segments
Most rajio segments commands print affected segment rows. segments apply is the
exception: by default it prints operation counts plus patch-scoped check feedback. Agents
should default to --json for parseable output. When using verbose JSON, pipe the output
through jq to select only the fields you need instead of reading the full raw payload.
See CLI.md for JSON structures.
Segment command examples:
rajio segments list /path/to/session --json --stage transcript
rajio segments list /path/to/session --json --stage transcript --id 12
rajio segments list /path/to/session --json --stage transcript --id 12,15,19
rajio segments list /path/to/session --json --stage transcript --id 12,15,19 --around 3
rajio segments list /path/to/session --json --stage transcript --offset 100 --limit 50
rajio segments list /path/to/session --json --stage transcript --start 600 --end 660
rajio segments list /path/to/session --json --stage translation --issues empty_zh,zh_line_hard_limit
rajio segments list /path/to/session --json --stage translation --issues duration_too_long --level error
rajio segments list /path/to/session --json --stage translation --issues empty_zh --offset 100 --limit 50
rajio segments apply /path/to/session patch.toml --json --stage translation
rajio segments apply /path/to/session --json --stage translation <<'EOF'
[[operations]]
op = "edit"
segment_id = "12"
zh = "修正后的中文字幕"
EOF
rajio segments edit /path/to/session 12 --json --stage transcript --start 10.2 --end 13.4 --speaker A --ja "修正した日本語"
rajio segments edit /path/to/session 12 --json --stage transcript --ja "修正した日本語" --dry-run
rajio segments split /path/to/session 12 --json --stage transcript --at 11.8 --gap 0.05 --id1 12.1 --id2 12.2 --ja1 "前半の日本語" --ja2 "後半の日本語" --speaker1 A --speaker2 B
rajio segments merge /path/to/session 12.1 12.2 --json --stage transcript --id 12 --ja "結合した日本語" --speaker A,B
rajio segments insert /path/to/session 12.5 --json --stage transcript --start 42.0 --end 43.2 --speaker A --ja "追加された字幕"
rajio segments delete /path/to/session 13 --json --stage transcript
In segments commands, pass /path/to/session after the segment subcommand. Replace
--stage transcript with --stage translation for translation/work/segments.toml.
Segment ids must be non-empty, trimmed strings without commas.
segments list selects rows by id, time range, validation issue, or plain pagination:
- --id <ids>: show a comma-separated id list in requested order. Segment ids themselves must not contain commas.
- --id <ids> --around <count>: show surrounding context for each requested id, deduplicated in timeline order.
- --start <time> --end <time>: show segments whose start time is in [start, end).
- --issues <codes>: show segments matching validation codes such as invalid_time, ja_line_hard_limit, or empty_zh; add --level error to exclude warning-level matches for soft-or-hard codes like duration and reading speed. Add --offset and --limit to page through issue matches.
- --offset <count> --limit <count>: show a zero-based window after any issue filtering; omit --limit to read from offset to the end. Do not combine with --id, --around, or --start/--end.
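The selection rules above can be modeled in a few lines to make the boundaries concrete. This is an illustrative sketch, not rajio's implementation: the Segment class and function names are hypothetical, the input is assumed to be in timeline order already, and --around is assumed to mean the same number of rows before and after each requested id.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    id: str
    start: float


def by_time(segments, start, end):
    """--start/--end: keep segments whose start time lies in [start, end)."""
    return [s for s in segments if start <= s.start < end]


def by_id_around(segments, ids, around):
    """--id with --around: context rows per id, deduplicated in timeline order."""
    index = {s.id: i for i, s in enumerate(segments)}
    keep = set()
    for seg_id in ids:
        i = index[seg_id]
        keep.update(range(max(0, i - around), min(len(segments), i + around + 1)))
    return [segments[i] for i in sorted(keep)]


def by_window(segments, offset, limit=None):
    """--offset/--limit: zero-based window; without a limit, read to the end."""
    return segments[offset:] if limit is None else segments[offset:offset + limit]


segments = [Segment(str(i), i * 10.0) for i in range(10)]
print([s.id for s in by_time(segments, 20, 40)])
print([s.id for s in by_id_around(segments, ["2", "4"], 1)])
print([s.id for s in by_window(segments, 8)])
```

Because the time filter is half-open, consecutive ranges such as 600..660 and 660..720 never return the same segment twice.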
segments apply <target> [file] applies an ordered TOML patch as the batch form of edit,
split, merge, insert, and delete. Pass a file path, or omit [file] only when providing
stdin in the same shell command, such as <<'EOF' ... EOF. For batch work, prefer a patch file
under a session-local patches/ directory. Normal apply writes the patched segments, then
runs patch-scoped check feedback. --dry-run validates the patch, previews affected output,
and runs the same checks without writing changes. Use --verbose --json with jq when you
need affected segment rows and their remaining issues.
created_by = "worker-a"
start = 120.0
end = 180.0
[[operations]]
op = "edit"
segment_id = "12"
zh = "修正后的中文字幕"
[[operations]]
op = "edit"
segment_id = "title"
skip_checks = [
{ code = "zh_repeated_punctuation", reason = "Official title spelling." },
{ code = "zh_line_hard_limit", reason = "Official title should stay on one line." }
]
[[operations]]
op = "split"
source_id = "long"
gap = 0.05
[[operations.replacements]]
segment_id = "long.1"
start = 10.0
end = 13.2
speaker = "A"
ja = "前半の日本語"
zh = "前半中文字幕"
[[operations.replacements]]
segment_id = "long.2"
start = 13.2
end = 16.0
speaker = "A"
ja = "後半の日本語"
zh = "后半中文字幕"
[[operations]]
op = "merge"
source_ids = ["13.1", "13.2"]
merged_id = "13"
speaker = "A,B"
ja = "結合した日本語"
zh = "合并后的中文字幕"
[[operations]]
op = "insert"
segment_id = "13.5"
start = 16.2
end = 17.0
speaker = "A"
ja = "追加された字幕"
zh = "新增字幕"
[[operations]]
op = "delete"
segment_id = "14"
Clips
Clip command examples:
rajio clips transcribe /path/to/session --start 120 --end 180 --label noisy-overlap
rajio clips list /path/to/session --json
rajio clips show /path/to/session clip-120000-180000 --json
Use clips when an initial transcription has a complex, noisy, overlapped, or error-prone
time range that should be independently recognized for comparison. clips list prints
only clip rows; clips show prints only that clip's segments.toml. Agents should
default to --json for clips list and clips show; otherwise output is a
human-readable table. See CLI.md for JSON structures.
Check
Use rajio check before committing manual stages and before final reporting. It validates
session shape, timeline integrity, required text, and subtitle QA heuristics, but it does
not replace semantic review for ASR mistakes, names, terms, context, translation quality,
or editorial polish. Treat the levels as follows:
- fatal: invalid data or session state; fix before proceeding.
- error: a problem that seriously hurts subtitle readability in ordinary cases; fix it unless a specific reviewed exception is better for accuracy or viewing comfort.
- warning: a recommendation; inspect it and make a local editorial decision instead of mechanically fixing or mechanically ignoring it.
Passing rajio check, including with zero fatal/error issues, does not mean the
subtitles are polished. It only means the work has passed the baseline data and subtitle
heuristic checks.
Use --json for machine-readable output; pipe it to jq when you need to extract fields
or slice down the output. See CLI.md for JSON structures.
- rajio check /path/to/session --json --level error: show blocking fatal and error issues.
- rajio check /path/to/session --json --stage transcript --language ja: check transcript work Japanese QA. Transcript checks only support ja.
- rajio check /path/to/session --json --stage translation: check translation work Chinese QA; zh is the default language for translation.
- rajio check /path/to/session --json --stage translation --language ja: inspect Japanese subtitle QA inherited into translation/work/segments.toml.
- Add --verbose only when you need full sorted issues, such as locating exact segment IDs and issue codes before adding skip_checks. When using verbose JSON, pipe it through jq to inspect only the fields you need instead of reading the full raw payload.
Subtitle QA Rules
These are the subtitle QA thresholds enforced by rajio check; severity, stage, and
language filtering follow the Check section above.
| Rule | Warning | Error |
|---|---|---|
| Japanese line length | ja line exceeds 20 visible non-space characters | ja line exceeds 28 visible non-space characters |
| Chinese line length | zh line exceeds 16 visible non-space characters | zh line exceeds 24 visible non-space characters |
| Line breaks | 2 lines that still need review after merge-length check | More than 2 lines, or 2 lines that merge within the soft length and should be one line |
| Subtitle duration | shorter than 0.5 seconds or longer than 7 seconds | shorter than 0.3 seconds or longer than 10 seconds |
| Reading speed | Japanese exceeds 15 chars/s; Chinese exceeds 11 chars/s | Japanese exceeds 20 chars/s; Chinese exceeds 15 chars/s |
| Adjacent gap | gap is 50-150 ms | gap is under 50 ms |
| Punctuation | none | ordinary comma/period punctuation, ordinary sentence-ending punctuation, 2+ repeated question/exclamation marks, or punctuation-only line |
Do not satisfy numeric limits by creating unreadable single-character, single-syllable,
or isolated filler subtitles. Prefer natural compression, merging with an adjacent segment,
retiming, or splitting at a semantic pause. Single ? or ! is allowed when needed for
intent, but use it sparingly.
Warnings are still work items. Review them in context and either improve the subtitle or keep the current form because it is more accurate, natural, or comfortable than the mechanical alternative.
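The numeric thresholds in the table can be worked through with a short calculation. The sketch below is an approximation for reasoning about a single subtitle, not a reimplementation of rajio check: the function names are hypothetical, "visible non-space characters" is assumed to mean every non-whitespace character, reading speed is assumed to use the same count, and the 50 ms and 150 ms gap endpoints are assumed to fall in the warning band.

```python
# Thresholds copied from the table above as (warning, error) pairs.
LINE_LIMIT = {"ja": (20, 28), "zh": (16, 24)}            # characters per line
SPEED_LIMIT = {"ja": (15.0, 20.0), "zh": (11.0, 15.0)}   # characters per second


def visible_length(text):
    """Count non-whitespace characters (assumed meaning of 'visible non-space')."""
    return sum(1 for ch in text if not ch.isspace())


def grade(value, warning, error):
    """Return 'error', 'warning', or 'ok' for a value that should not exceed limits."""
    if value > error:
        return "error"
    if value > warning:
        return "warning"
    return "ok"


def line_length_level(line, lang):
    return grade(visible_length(line), *LINE_LIMIT[lang])


def duration_level(start, end):
    duration = end - start
    if duration < 0.3 or duration > 10:
        return "error"
    if duration < 0.5 or duration > 7:
        return "warning"
    return "ok"


def reading_speed_level(text, start, end, lang):
    """Characters per second; assumes end > start (a valid timeline)."""
    return grade(visible_length(text) / (end - start), *SPEED_LIMIT[lang])


def gap_level(prev_end_ms, next_start_ms):
    """Adjacent gap in integer milliseconds, which avoids float rounding noise."""
    gap = next_start_ms - prev_end_ms
    if gap < 50:
        return "error"
    if gap <= 150:
        return "warning"
    return "ok"


print(line_length_level("中" * 17, "zh"), reading_speed_level("中" * 12, 0.0, 1.0, "zh"))
```

A result of "warning" here is a prompt to read the subtitle in context, in line with the guidance above; it is not an instruction to shorten the text mechanically.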
Workflow
0. Prepare The Session
- Resolve the media path to an absolute path and confirm it exists.
- Choose the session directory. If the user provides one, use it. Otherwise create one near the media file or in the current workspace using a filesystem-safe title or media stem. Do not copy large media files unless the user asks.
- Create or update description.md.
- Gather confirmed context before transcription when practical. Use the original URL, official pages, video title, filenames, on-screen text, user notes, and later transcript discoveries. Record uncertainty explicitly instead of guessing.
Use this description.md shape:
---
media: ./video.mp4
title: Video title or filename stem
url: https://example.com/original
published_at: 2026-06-06
---
## Context
- Source/uploader:
- User notes:
- Video synopsis:
- Cast/speakers:
- Program/corner structure:
- Known fixed greetings or sign-offs:
- Related events/products/works mentioned:
## Glossary And Fixed Terms
- Japanese term/person/place -> Chinese translation or note
- Common ASR confusion -> Correct Japanese term / Chinese translation
## Style Requirements
- Translate into natural Simplified Chinese subtitles.
- Preserve important names and terminology consistently.
1. Run To Transcript Work
Before automatic stages, run:
rajio doctor /path/to/session
rajio doctor reports the rajio CLI version and update-check result first. Confirm the CLI version matches this SKILL frontmatter metadata.version; if they differ, or doctor reports a newer version is available or update checking failed, report that before starting automatic stages. It also validates runtime configuration and provider access using the target directory for .env loading. Do not start transcription until rajio doctor passes or the environment issue is resolved.
Run:
rajio /path/to/session --continue=until-manual
Expected result: rajio creates or reads session.toml, extracts audio, transcribes
Japanese, writes raw transcript artifacts, creates transcript/work/segments.toml, and
stops at transcript_work.
Wait for transcript/raw/checkpoints/input-000.toml or
transcript/raw/checkpoints/input-000.error.log. Do not restart while requests may still be in
flight unless there is a clear CLI/provider failure.
Treat automatically created work segments as a draft.
Rajio may write suggested patches under
transcript/work/suggested-patches/, but never applies them automatically. Review them during
the proofread stage below.
Suggested patch review:
- Review suggested patches in numeric filename order before spawning transcript workers.
- Before spawning transcript workers, either apply, manually fold in, or explicitly reject each suggested patch, including *-low.* files.
- Treat confidence = "medium" patches with extra care and *-low.* files as manual decision candidates, not automatic cleanup or post-worker TODOs.
- Patch reason and confidence fields do not change apply behavior.
Example generated files:
transcript/work/suggested-patches/
01-punctuation-cleanup-chunk-000-000000s-000600s-high.toml
02-fragment-merge-chunk-000-000000s-000600s-high.toml
02-fragment-merge-chunk-000-000000s-000600s-medium.toml
03-boundary-retime-chunk-000-000000s-000600s-high.toml
04-long-segment-candidates-chunk-000-000000s-000600s-low.md
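The numeric review order and the confidence suffix can both be read from the filenames. The following sketch builds a review queue from names like the examples above; the regular expression and the review_queue helper are assumptions based only on those example names, so an unrecognized name raises an error instead of being guessed at.

```python
import re

# Pattern inferred from the example names: <order>-<rule...>-<confidence>.<ext>
NAME = re.compile(r"^(?P<order>\d+)-(?P<rule>.+)-(?P<confidence>high|medium|low)\.(?P<ext>\w+)$")


def review_queue(filenames):
    """Sort names by numeric prefix and pair each with its confidence suffix."""
    rows = []
    for name in filenames:
        match = NAME.match(name)
        if match is None:
            raise ValueError(f"unrecognized suggested-patch name: {name}")
        rows.append((int(match["order"]), name, match["confidence"]))
    rows.sort(key=lambda row: (row[0], row[1]))
    return [(name, confidence) for _, name, confidence in rows]


names = [
    "04-long-segment-candidates-chunk-000-000000s-000600s-low.md",
    "02-fragment-merge-chunk-000-000000s-000600s-medium.toml",
    "03-boundary-retime-chunk-000-000000s-000600s-high.toml",
    "01-punctuation-cleanup-chunk-000-000000s-000600s-high.toml",
    "02-fragment-merge-chunk-000-000000s-000600s-high.toml",
]
for name, confidence in review_queue(names):
    print(confidence, name)
```

The queue only orders the reading; each patch still has to be applied, folded in manually, or rejected after it has been read against the segments it touches.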
2. Proofread And Polish Japanese
Proofread flow:
- Review automatically generated suggested patches under transcript/work/suggested-patches/, adjust them if needed, dry-run them with rajio segments apply <session> <patch> --stage transcript --dry-run, then apply, manually incorporate, or reject them before assigning worker ranges.
- Spawn transcript proofread sub-agents following SUB_AGENTS.md. Each worker gets a label and assigned source-media start/end range.
- Review each worker patch file, dry-run summary, and final report confirmations. Apply accepted patches to transcript/work/segments.toml. A blocker report is not an accepted patch; resolve it or reassign that range before committing.
- Manually perform a whole-transcript review and polish pass for context, terminology, fixed phrases, structure, and readability across batch boundaries.
- Commit transcript_work only after semantic review and validation are clean, or only intentional subtitle QA exceptions remain with exact skip_checks.
Transcript review requirements:
- Use the segment commands documented in the CLI section with --stage transcript for transcript inspection, patch review, patch application, and validation.
- Do not translate in this stage; only correct and polish the Japanese transcript.
- The main agent's proofread and polish pass must be the manual review defined in Non-Negotiable Rules. Use rajio segments list to read the actual transcript text in timeline-order batches with neighboring context, review suggested and worker patches against those segments, and inspect batch boundaries.
- rajio check, suggested patches, issue filters, dry-run summaries, or statistics are only supporting evidence. They do not count as transcript proofreading unless the affected subtitle text is read in context.
For complex, noisy, overlapped, or suspicious ASR ranges, the main agent or a sub-agent may
use rajio clips transcribe to retranscribe the original media time range as sidecar
evidence. Then use
rajio clips list --json and rajio clips show <id> --json to compare the alternate
transcript against transcript/work/segments.toml. Clip output is reference material; do
not treat it as an automatic replacement.
Validate often with rajio check as documented in the CLI section. This only checks data
shape, timing, required fields, and subtitle limits; before committing, still polish the
content semantically against the acceptance criteria below.
Acceptance criteria:
- Every segment has stable id, numeric start/end, non-empty speaker, and non-empty Japanese ja.
- Timestamps increase and do not overlap.
- Japanese text is coherent, natural, and corrected against description.md, glossary, proper nouns, and raw transcript references.
- Known names, program titles, corner names, event names, hashtags, greetings, mail reads, and sign-offs are corrected consistently.
- Search the whole transcript for likely ASR variants of fixed terms, not only exact glossary terms.
- Check high-risk positions explicitly: opening title call, self-introductions, listener greetings, corner starts, event announcements, mail-address reads, and ending sign-off.
- Follow the Subtitle QA Rules for line length, line count, duration, reading speed, gaps, and punctuation.
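The structural criteria in this list (ids, numeric times, non-empty fields, increasing non-overlapping timestamps) are the kind of data-shape validation a read-only support script may perform. The sketch below assumes segments have already been loaded as dictionaries with the field names used above; the shape_problems helper is hypothetical, and rajio check remains the authoritative validator.

```python
def is_number(value):
    """True for int or float values, excluding booleans."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def shape_problems(segments):
    """Return (segment id, message) pairs for basic shape and timeline issues."""
    problems = []
    previous_end = None
    for seg in segments:
        seg_id = seg.get("id")
        if not isinstance(seg_id, str) or not seg_id or seg_id != seg_id.strip() or "," in seg_id:
            problems.append((seg_id, "id must be a non-empty, trimmed string without commas"))
        for field in ("speaker", "ja"):
            if not str(seg.get(field, "")).strip():
                problems.append((seg_id, f"empty {field}"))
        start, end = seg.get("start"), seg.get("end")
        if not (is_number(start) and is_number(end)):
            problems.append((seg_id, "start/end must be numeric"))
            continue
        if end <= start:
            problems.append((seg_id, "end must be after start"))
        if previous_end is not None and start < previous_end:
            problems.append((seg_id, "overlaps the previous segment"))
        previous_end = end
    return problems


sample = [
    {"id": "1", "start": 0.0, "end": 2.0, "speaker": "A", "ja": "こんにちは"},
    {"id": "2,3", "start": 1.5, "end": 4.0, "speaker": "", "ja": "どうも"},
]
for seg_id, message in shape_problems(sample):
    print(seg_id, message)
```

None of these checks says anything about whether the Japanese is correct; that still requires reading the text in context.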
Speaker and segment structure:
- A normal segment should represent one readable subtitle unit.
- Do not preserve unreadable fragments such as single characters or syllables when adjacent fragments form one jointly spoken phrase.
- If multiple speakers complete the same short phrase together, merge it into one segment with complete ja; combine speakers with comma-separated values such as speaker = "A,B" when attribution matters.
- Preserve segment IDs unless a structural correction truly requires a change.
Before committing:
- Update description.md with newly confirmed context and terminology.
- Search for known ASR confusions and wrong proper nouns.
- Spot-check opening, middle, and ending subtitles for proper nouns and fixed phrases.
- Confirm no remaining segment is an unreadable fragment that should be merged.
When clean:
rajio /path/to/session --commit --continue=until-manual
If only intentional subtitle QA exceptions remain, inspect them first:
rajio check /path/to/session --json --stage transcript --language ja --verbose
If preserving an exception improves accuracy, naturalness, or readability, add
skip_checks to the affected segment with the exact issue code and a reason, then commit
normally:
rajio /path/to/session --commit --continue=until-manual
Expected result: rajio commits transcript_work, creates
translation/work/segments.toml, and stops at translation_work.
3. Translate And Polish Chinese
Initial translation flow:
- Plan explicit translation batches by non-overlapping source-media time ranges instead of attempting the whole file in one pass. Choose range sizes by dialogue density and the active sub-agent concurrency limit.
- Spawn translation sub-agents following SUB_AGENTS.md. Each worker gets a label and assigned source-media start/end range.
- Review each worker patch file, dry-run summary, and final report confirmations. Apply accepted patches to translation/work/segments.toml so every segment has filled or refined zh. A blocker report is not an accepted patch or a completed batch; resolve it or reassign that range before committing.
- Manually perform a whole-file first-draft review for terminology, subtitle continuity, missing translations, Japanese corrections made during translation, and cross-batch style consistency.
- Commit translation_work and export only after every batch has been translated, terminology has been cross-checked, and validation has no blocking fatal or Chinese error issues except reviewed intentional subtitle QA exceptions.
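One way to size batches by dialogue density is to cap the number of segments per batch and derive the time ranges from segment start times. The sketch below is a planning aid under assumptions: the plan_batches helper and the batch-NN labels are hypothetical, and ranges are treated as half-open [start, end) to match how segments list filters by start time.

```python
def plan_batches(starts, media_end, max_segments):
    """Group sorted segment start times into non-overlapping [start, end) ranges.

    Each range holds at most max_segments segments, so dense dialogue yields
    shorter time ranges than sparse dialogue.
    """
    if max_segments < 1:
        raise ValueError("max_segments must be at least 1")
    batches = []
    for i in range(0, len(starts), max_segments):
        j = i + max_segments
        end = starts[j] if j < len(starts) else media_end
        batches.append({
            "label": f"batch-{len(batches) + 1:02d}",
            "start": starts[i],
            "end": end,
            "segments": min(j, len(starts)) - i,
        })
    return batches


starts = [0, 5, 10, 100, 200, 300, 310]
for batch in plan_batches(starts, media_end=400, max_segments=3):
    print(batch)
```

Because each range ends exactly where the next begins, every segment belongs to one batch; workers still read neighboring segments beyond their range for continuity.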
Translation review requirements:
- Use the segment commands documented in the CLI section with --stage translation for translation inspection, patch review, patch application, and validation.
- Fill or refine translated subtitle text in zh; keep Japanese corrections limited to transcript issues found while translating.
- The main agent's first-draft translation review must be the manual review defined in Non-Negotiable Rules. Use rajio segments list to read translated segments in timeline-order batches with neighboring context, compare ja and zh, review worker patches against those segments, and inspect batch boundaries.
- rajio check, issue lists, dry-run summaries, or statistics are only supporting evidence. They do not count as translation review unless the subtitle text is read in context.
During batch work, keep glossary updates and unresolved uncertainty in description.md,
and search earlier completed batches when a new name, phrase, or style decision appears.
Before committing, confirm this command has no blocking fatal or Chinese error issues:
rajio check /path/to/session --json --stage translation
To inspect Japanese QA in the current translation work file, run the same command with
--language ja.
During translation_work, translation/work/segments.toml is the active subtitle work
file. If translation reveals a Japanese typo, wrong name, wrong fixed phrase, missing
context, or bad segment structure, correct the relevant ja and zh in
translation/work/segments.toml and update description.md when the decision affects
terminology or future batches.
If a translation problem points back to an uncertain or messy source-audio range, use
rajio clips transcribe for that original media time range and inspect it with
rajio clips show <id> --json. Use the sidecar transcript as a second reference before
editing the committed transcript and reconciling the translation.
Validate often with rajio check as documented in the CLI section. This only checks data
shape, timing, required fields, and subtitle limits; before committing, still polish the
content semantically against the acceptance criteria below.
Acceptance criteria:
- Keep id, start, end, and speaker stable unless a structural correction or intentionally removed semantically empty filler genuinely requires a change. ja may be corrected in translation/work/segments.toml when translation review finds a Japanese typo, name, or fixed-phrase issue.
- Every segment has non-empty zh.
- Chinese is natural Simplified Chinese subtitle language, not word-by-word literal output.
- Preserve meaning, tone, speaker intent, jokes, references, and discourse flow.
- Very short segments that are only meaningless fillers, breaths, interjections, or pure hesitation sounds may be deleted from the subtitle if removing them does not change meaning, speaker intent, or timing comprehension.
- Smooth spoken hesitation, false starts, and harmless repetition in Chinese unless they are semantically important, characterize the speaker, or affect the scene's rhythm.
- Remove unnecessary Chinese filler and transcript-shaped clutter, including but not limited to redundant 嗯, 啊, 哦, 呃, 欸, 那个, 就是, repeated 对对对, duplicated verbs, repeated subjects, stalled false starts, and trailing particles such as 嘛 when they do not carry tone or timing value. Treat these as review candidates, not a fixed deletion list. Keep interjections when they express a real reaction, joke beat, surprise, embarrassment, or speaker personality.
- Keep Chinese renderings globally consistent for people, programs, corners, events, hashtags, works, products, honorific decisions, and recurring phrases.
- Use description.md as the glossary and style source. Update it if new confirmed terms are discovered.
- Translate merged multi-speaker phrases as one complete subtitle. Do not preserve syllable-by-syllable fragments in Chinese.
- Follow the Subtitle QA Rules for line length, line count, duration, reading speed, gaps, and punctuation.
- Do not create an awkward short trailing subtitle only to satisfy a warning threshold. Preserve subtitle continuity and readability first.
Before committing:
- Compare description.md glossary against translation/work/segments.toml.
- Search for inconsistent Chinese names, untranslated Japanese names, wrong titles, and stale translations from earlier draft assumptions.
- Spot-check opening, middle, ending, fixed greetings, mail reads, event announcements, and sign-off for Japanese correctness and Chinese readability.
- Check subtitle continuity across adjacent segments: the Chinese should read as connected dialogue, not isolated literal fragments.
- Review Japanese error and warning issues still present in translation/work/segments.toml:
rajio check /path/to/session --json --stage translation --language ja --level warning
- Record unresolved uncertainty in description.md or mention it in the final report.
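The glossary comparison in the first item can be supported by a read-only search for segments where a glossary term appears in ja but its agreed rendering is missing from zh. This is a sketch under assumptions: it expects glossary bullets in the "- Japanese -> Chinese" form from the description.md template, the helper names are hypothetical, and segments are assumed to be loaded as dictionaries. Entries whose right-hand side is a note instead of a translation will produce false hits, so results are review candidates only.

```python
def parse_glossary(lines):
    """Read '- Japanese -> Chinese' bullets into a dict; other lines are ignored."""
    terms = {}
    for line in lines:
        text = line.strip()
        if text.startswith("- ") and "->" in text:
            ja, zh = text[2:].split("->", 1)
            if ja.strip() and zh.strip():
                terms[ja.strip()] = zh.strip()
    return terms


def glossary_candidates(terms, segments):
    """List (segment id, ja term, expected zh) where ja has the term but zh lacks it."""
    return [
        (seg["id"], ja, zh)
        for seg in segments
        for ja, zh in terms.items()
        if ja in seg["ja"] and zh not in seg["zh"]
    ]


glossary = ["## Glossary And Fixed Terms", "- 東京 -> 东京", "- ラジオ -> 广播"]
segments = [
    {"id": "1", "ja": "東京のラジオ", "zh": "东京的电台"},
    {"id": "2", "ja": "東京へ", "zh": "去东京"},
]
print(glossary_candidates(parse_glossary(glossary), segments))
```

Each hit should be opened with rajio segments list and judged in context; the script never decides whether the subtitle or the glossary is the one that needs to change.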
When clean, commit and export the first translation draft:
rajio /path/to/session --commit --continue=until-manual
If intentional Chinese QA exceptions remain, inspect them first:
rajio check /path/to/session --json --stage translation --verbose
Add skip_checks to each affected segment only after manual review confirms every
remaining error issue is an intentional subtitle QA exception and no fatal issues
remain. Then commit normally.
rajio /path/to/session --commit --continue=until-manual
Expected result: rajio commits translation_work, runs export, and reaches the
terminal done state. This ends the CLI workflow, but the exported subtitles are still
only a first-pass translation and proofread draft. The current main agent must continue
with the refinement pass below before treating the subtitles as final polished output.
Expected draft output:
- output/*.ja.srt
- output/*.zh.srt
- output/*.ja-zh.ass
4. Refine Chinese Subtitles
After the first draft export, the main agent must perform at least three full-file Chinese
subtitle refinement passes over translation/work/segments.toml, and should keep
iterating while meaningful improvements remain. Refinement is not a
substitute for the sub-agent batch translation stage: do not use these passes to fill
large missing sections or redo the whole translation from scratch. Use them to raise the
already translated draft to final subtitle quality.
Preserve the committed draft's structure unless a change clearly improves accuracy,
readability, or subtitle continuity. Do not break the Subtitle QA Rules, timeline
integrity, required fields, segment IDs, or transcript alignment. If refinement changes
the work file after export, recommit translation_work and regenerate export output.
Required refinement pass loop:
For each Chinese refinement pass, follow these steps exactly:
1. Read translation/work/segments.toml from start to end in timeline order as actual ja/zh subtitle text, using rajio segments list windows or ranges with neighboring context until the full file is covered. This full read is mandatory; do not replace it with rg, rajio check, issue summaries, issue lists, issue IDs, examples, filters, or statistics.
2. Edit the Chinese subtitles for style, flow, tone, concision, consistency, and viewing comfort. Update matching ja, glossary, or description.md entries only when the Chinese review exposes a confirmed source or term problem.
3. Run the translation checks for zh and ja only as regression checks. Final refinement starts from a committed draft that should already have no blocking fatal or error issues; if this pass introduces any new fatal or error, fix it immediately. Warnings may guide inspection, but this pass is not mechanical warning cleanup.
4. Reread every segment changed in this pass one by one in its final subtitle form, with neighboring context for boundary flow.
5. If the reread finds a style, meaning, flow, or consistency problem, fix it. If checks show a new fatal or error, fix it immediately. Then repeat steps 3-4.
6. Count the pass complete only after the full-file read, editorial pass, regression checks, and changed-segment reread are all done.
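The full-file read in the first step can be planned as a series of windows that tile the file, each padded with a few neighboring segments. The sketch below is only a planning aid: the window size and context width are arbitrary example values, the function name is hypothetical, and how a range is passed to rajio segments list is deliberately not assumed here.

```python
def review_windows(total, size=40, context=3):
    """Yield (context_start, start, end, context_end) index ranges.

    [start, end) is the part under review; the context bounds add
    neighboring segments on each side for boundary flow. The core ranges
    tile [0, total) with no gaps, so finishing every window means the
    whole file was read in timeline order.
    """
    if size <= 0 or context < 0:
        raise ValueError("size must be positive and context non-negative")
    for start in range(0, total, size):
        end = min(start + size, total)
        yield max(0, start - context), start, end, min(total, end + context)
```

For a file of 100 segments with the defaults, this yields three windows whose core ranges are 0-40, 40-80, and 80-100. The reading and editing inside each window remains manual.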
Refinement requirements:
- Perform at least three full-file Chinese refinement passes before final verification. A pass must make a deliberate full-file check for text quality, not only run rajio check.
- Each refinement pass must satisfy the manual review definition in Non-Negotiable Rules: read the actual ja/zh segment text in timeline order with neighboring context, using rajio segments list windows or ranges to cover the full file. Warning lists, statistics, scripted transformations, and validation output may guide where to look, but they do not count as a refinement pass.
- Do not substitute rg searches, rajio check summaries, rajio check --verbose issue lists, issue IDs, examples, or filtered issue views for full-file reading. They are only navigation aids; the pass counts only when the subtitle text itself was read and edited in order.
- Read the Chinese subtitles continuously across adjacent segments, not only segment by segment. Repair places where the text reads like isolated translated fragments.
- Enforce global term consistency for names, programs, corners, events, works, products, hashtags, recurring jokes, honorific choices, and fixed phrases.
- Match register, tone, and speaker intent to the local context: casual speech should not become stiff, jokes should not become flat, and emotional emphasis should not disappear.
- Prefer natural Simplified Chinese subtitle language over literal completeness. Compress harmless repetition and spoken clutter when the source meaning, rhythm, and speaker personality are preserved.
- Clean up filler-heavy Chinese, wordiness, and direct translation artifacts, including but not limited to redundant interjections, duplicated words or clauses, repeated subjects, over-explicit pronouns, stalled false starts, stiff connectives, explanatory padding, and literal translations of Japanese hesitation. Treat these as review candidates rather than a fixed deletion list; delete or rewrite them when they only mirror source disfluency or make the subtitle read like prose instead of subtitles.
- Check pronouns, ellipses, omitted subjects, callbacks, and topic shifts against nearby Japanese context so Chinese lines do not become ambiguous or misleading.
- Smooth sentence flow across subtitle boundaries while keeping each subtitle readable on its own timing. Avoid awkward trailing fragments created only to satisfy line limits.
- Revisit glossary decisions in description.md; update it when a better confirmed term or style rule is chosen, then apply that choice consistently through the full file.
- Search for stale draft assumptions, mixed translations of the same term, untranslated Japanese, accidental simplified/traditional mismatches, and Chinese punctuation noise.
- Preserve meaningful speaker style differences where the source supports them, but do not over-characterize beyond the audio/video evidence.
- If a Chinese issue exposes a likely Japanese subtitle mistake, fix the ja/zh pair in translation/work/segments.toml, update description.md if needed, and rerun the translation checks.
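One of the searches listed above, untranslated Japanese left in the Chinese text, can be narrowed with a kana scan. This is a rough sketch with a hypothetical segment shape (dicts with id and zh keys). It only catches hiragana and katakana; Japanese written purely in kanji will pass unnoticed, so it does not replace reading the text.

```python
import re

# Hiragana letters, katakana letters, and the prolonged sound mark (ー).
KANA_RE = re.compile(r"[\u3041-\u3096\u30A1-\u30FA\u30FC]+")


def kana_residue(segments):
    """Return (segment id, kana runs) for Chinese lines that contain kana.

    `segments` is a hypothetical list of dicts with `id` and `zh` keys.
    """
    hits = []
    for seg in segments:
        found = KANA_RE.findall(seg["zh"])
        if found:
            hits.append((seg["id"], found))
    return hits
```

Some kana may be intentional, for example a title that is deliberately kept in Japanese, so treat each hit as a question rather than an error.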
During and after refinement, validate with:
rajio check /path/to/session --json --stage translation --language zh --level warning
rajio check /path/to/session --json --stage translation --language ja --level warning
After the final refinement pass, run export reset with commit. This commits dirty
translation_work when refinement changed it, and otherwise just regenerates export from
the existing committed translation:
rajio /path/to/session --reset export --commit --continue=until-manual
If the final refined translation still has intentional Chinese QA exceptions, inspect them
first as documented above, add exact per-segment skip_checks, then rerun the same commit
and export reset command:
rajio /path/to/session --reset export --commit --continue=until-manual
Expected final output:
- output/*.ja.srt
- output/*.zh.srt
- output/*.ja-zh.ass
5. Final Verification
Before reporting completion:
- Run rajio check as documented in the CLI section.
- Treat the result as data validation only.
- Confirm session.toml is not stuck in failed, dirty, or an unexpected manual stage.
- Confirm expected output files exist under output/.
- Confirm at least three full-file Chinese refinement passes were performed after the sub-agent translation draft. If refinement changed translation/work/segments.toml, recommit and regenerate exports before reporting completion.
- Perform manual content QA:
  - proper nouns and fixed terms
  - opening title call and speaker introductions
  - middle section timing and speaker continuity
  - event/work/corner names
  - ending sign-off
  - Chinese readability, subtitle continuity, and terminology consistency
  - unnecessary filler words, repeated expressions, false starts, and direct-translation artifacts in Chinese
- Search final work files for known ASR-confusion variants and glossary terms one last time.
- Perform at least two final spot-check rounds before reporting. Each round should sample opening, middle, ending, at least one dense dialogue range, and at least one known uncertain range from description.md; do not count rajio check as a spot-check.
- Report output files, remaining warnings, assumptions, content-QA limits, and any spots needing user judgment.
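The output-file confirmation in the checklist above can be done with a short script. This is a minimal sketch, assuming the session directory layout implied by the expected output patterns; the function name is hypothetical, and treating zero-byte files as missing is an added assumption, not a documented rajio rule.

```python
from pathlib import Path

# Patterns taken from the expected final output listed in this document.
EXPECTED_PATTERNS = ("*.ja.srt", "*.zh.srt", "*.ja-zh.ass")


def missing_outputs(session_dir):
    """Return the expected patterns with no non-empty file under output/."""
    output_dir = Path(session_dir) / "output"
    missing = []
    for pattern in EXPECTED_PATTERNS:
        matches = [
            path
            for path in output_dir.glob(pattern)
            if path.is_file() and path.stat().st_size > 0
        ]
        if not matches:
            missing.append(pattern)
    return missing
```

An empty result only shows that the files exist. It says nothing about subtitle quality, which still depends on the manual content QA and spot-check rounds.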
Failure Handling
- If rajio check reports schema, duplicate ID, empty text, invalid time, or overlap errors, fix the relevant work file before committing.
- If a committed manual stage becomes dirty, inspect the changed work and rerun --commit only after it passes manual review and validation.
- If transcription fails, inspect transcript/raw/checkpoints/input-000.error.log, check credentials, provider access, media path, ffmpeg, and ffprobe, report the likely cause and recommended next step to the user, then pause work. Do not retry transcription, reset transcription artifacts, or continue downstream stages unless the user explicitly asks for it. A matching completed checkpoint is reused on retry; use --reset transcript_raw only when the user asks to start a full new transcription round.
- If the user asks to retry an earlier workflow step, run the default command with --reset: --reset audio retries audio extraction, --reset transcript_raw reruns transcription generation, --reset transcript_work regenerates the transcript work file, --reset translation_work regenerates the translation draft, and --reset export reruns subtitle export.