rajio

name: rajio
description: Use only when explicitly asked to use rajio for Japanese audio/video subtitle translation.
metadata:
  author: OneKuma
  version: '0.2.0-beta.2'

Use this skill to translate Japanese audio/video into polished, carefully proofread subtitles with rajio: prepare context, extract audio, transcribe Japanese, proofread the transcript, run multi-round Simplified Chinese translation and review, polish the final subtitle text, and export SRT/ASS files.

Do not use this skill unless the user explicitly asks for the rajio skill or asks you to create polished Chinese subtitles from Japanese audio/video with rajio.

Non-Negotiable Rules

Quality comes first:

The goal is accurate, natural, comfortable subtitles. Correctness, readability, tone, subtitle flow, and viewing comfort outrank mechanical formatting cleanup.
rajio check is a quality floor, not the definition of finished quality. fatal means invalid data or session state, error means a problem that seriously hurts readability in ordinary cases, and warning means a recommended improvement.
Clearing, skipping, or preserving rajio check issues only establishes the technical baseline. Manual review and text refinement are still required for ASR mistakes, proper nouns, context, terminology, fixed phrases, translation consistency, concision, tone, and Chinese subtitle polish.
Do not satisfy QA heuristics by making subtitles less correct, less natural, or harder to watch. Also do not ignore warnings mechanically: inspect remaining warnings and decide case by case whether to fix, preserve, merge, split, retime, compress, or document the exception.

Respect privacy and provider boundaries:

Make the privacy boundary explicit before transcription. Rajio uploads audio to the configured transcription provider; start transcription only after the user authorizes that upload.
During translation_work, do not use the OpenAI-compatible provider configured in .env as a machine-translation service. Sub-agents produce the first-draft batch translations from the provided context; the main agent reviews, merges, validates, and performs the required full-file Chinese refinement.

Respect manual-stage ownership:

transcript_work and translation_work are manual stages. Always process first-draft proofreading and translation through sub-agent batches. If sub-agent tooling is unavailable, stop and report that the manual stage cannot be completed under this skill.
The main agent owns batch planning, patch review, patch application, glossary decisions, consistency QA, description.md, validation, commits, exports, and final reporting.
The main agent must not proofread or translate the full first-draft manual stage by itself. The explicit exception is final Chinese refinement: after sub-agent translation batches have produced the first draft, the main agent must perform full-file Chinese subtitle refinement as described in Refine Chinese Subtitles.

Define manual review strictly:

Manual review means reading subtitle segment text in timeline order with enough neighboring context to judge meaning, flow, timing, tone, terminology, and subtitle comfort. It does not mean only running rajio check, scanning issue summaries, applying generated patches, or spot-checking validation examples.
During transcript_work, translation_work, and final Chinese refinement, agents must use rajio segments list with explicit ranges, offsets, IDs, issue filters, and neighboring context to inspect actual subtitle text in batches. Every segment in the assigned or owned scope must be reviewed as text, not only as QA metadata.
Batch workers must read every segment in their assigned range plus enough surrounding segments to catch cross-boundary continuity problems. The main agent must review worker patches, batch boundaries, glossary decisions, and at least one continuous full-file Chinese pass after applying first-draft translation patches.
rajio check is only validation support. Passing checks, clearing warnings, adding skip_checks, or applying suggested patches does not count as manual review unless the affected subtitle text has been read in context.
Do not use ad hoc automation scripts to edit segments.toml, generate subtitle text, generate translation/proofread patch operations, or add skip_checks. Write and review subtitle edits through the manual process and rajio segments tooling.
Automation scripts are allowed only for non-editing support such as collecting counts, slicing check JSON for inspection, comparing statistics, or validating data shape. They must support review, not replace reading and judgment.
If a range was not read segment by segment in context, report it as unreviewed. Do not call the stage polished or final.
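Coverage bookkeeping is one of the non-editing support tasks that scripts may help with. The following is a minimal sketch, not part of rajio: `unreviewed_ranges` is a hypothetical helper that takes the total segment count and the zero-based `(offset, limit)` windows an agent has actually read, and returns the index ranges that must still be reported as unreviewed. It only tracks what was read; it does not replace reading the text.

```python
def unreviewed_ranges(total: int, windows: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Return half-open index ranges [start, end) that no window covered.

    `windows` holds zero-based (offset, limit) pairs, mirroring the
    --offset/--limit paging described for `rajio segments list`.
    """
    covered = [False] * total
    for offset, limit in windows:
        for index in range(max(0, offset), min(total, offset + limit)):
            covered[index] = True

    gaps: list[tuple[int, int]] = []
    gap_start = None
    for index, seen in enumerate(covered):
        if not seen and gap_start is None:
            gap_start = index  # a new unread run begins here
        elif seen and gap_start is not None:
            gaps.append((gap_start, index))
            gap_start = None
    if gap_start is not None:
        gaps.append((gap_start, total))
    return gaps
```

An empty result means every index was inside at least one window; any remaining range should be listed as unreviewed in the final report.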

Respect file boundaries:

Never edit transcript/raw/segments.toml, transcript/raw/checkpoints/*.toml, or transcript/raw/chunks/*.toml. Raw transcript files are references.
Edit only the active manual work file, description.md, and session-local patch or review artifacts: transcript/work/segments.toml, translation/work/segments.toml, description.md, and files under session-local patches/ or clip review directories.
description.md is the source of truth for media metadata, user notes, context, glossary, fixed terms, style requirements, and unresolved uncertainty. Keep it current throughout the session.

Use rajio tools deliberately:

Use rajio check as documented in the CLI section before commits and final reporting, while remembering it is not a substitute for manual QA.
Use rajio segments commands for stable targeted edits to work-stage segments.toml: list/filter segments, edit fields, split/merge subtitle units, and delete semantically empty filler segments. Shape: rajio segments <command> <target>.
Use rajio clips commands for difficult source-video ranges that need independent retranscription for comparison. Clip outputs are sidecar review artifacts only; do not treat them as automatic replacements for transcript/work/segments.toml.
Record intentional subtitle QA error exceptions with per-segment skip_checks in the work-stage segments.toml. Every skip must name the exact issue code and include a reason. Never skip fatal data/file/schema/timeline issues, unfinished translation, or unreviewed batches. Matched skips are omitted from check output; stale skips still report fatal unused_skip_check.
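The skip_checks rule above (every skip names an exact issue code and includes a reason) is a data-shape property that a read-only script can verify. This is a minimal sketch under assumptions: segments are already loaded as dictionaries with an `id` key and an optional `skip_checks` list of `{code, reason}` tables, and `incomplete_skips` is a hypothetical helper name. It reports problems only and never edits segments.toml.

```python
def incomplete_skips(segments: list[dict]) -> list[tuple[str, str]]:
    """Return (segment id, code) pairs whose skip entry lacks a code or a reason."""
    problems: list[tuple[str, str]] = []
    for segment in segments:
        for skip in segment.get("skip_checks", []):
            code = str(skip.get("code", "")).strip()
            reason = str(skip.get("reason", "")).strip()
            if not code or not reason:
                problems.append((segment["id"], code))
    return problems
```

Whether a skipped code is actually allowed to be skipped is still decided by rajio check and by manual review, not by this helper.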

Sub-Agent Batch Contract

Spawn sub-agents for every transcript_work proofread batch and every translation_work translation batch. If sub-agent tooling is unavailable, stop and report that manual stages cannot be completed under this skill.
Run sub-agent batches within the active concurrency/thread limit, and close or release completed workers before spawning more.
Read SUB_AGENTS.md before spawning sub-agents. Keep this file focused on workflow rules; use that document for mandatory batch-worker templates and instructions.
The main agent owns batch planning, patch application, glossary decisions, consistency QA, final full-file Chinese refinement, description.md, rajio check, commits, exports, and final reporting.

Required Input

Local audio/video path. Refuse to start without this.
Optional but preferred: title, original URL, publish date, uploader/channel, synopsis, cast, program/corner names, user notes, fixed terminology, and translation style requirements.

If optional metadata is missing, proceed with filename-based defaults, record the uncertainty in description.md, and revisit it when transcript context reveals more.

CLI Quick Reference

For complete command syntax, examples, output formats, segment patch shape, clip artifact details, and environment variables, read CLI.md.

Check whether rajio is available:

command -v rajio

If it is not installed, run commands through npx rajio ....

Command Overview

Use the installed CLI:

rajio <target> [options]
rajio segments <command> <target> --stage transcript
rajio clips <command> <target>
rajio check <target>
rajio doctor <target>

Default Command

The default command drives the whole session workflow.

Default command media option:

--media <path>: invocation-only media override.

Default command workflow controls:

--continue=until-manual: run automatic stages until the next manual stage.
--continue=step: run one automatic stage.
--commit: commit the current manual stage after validating its work file.
--reset <stage>: regenerate from audio, transcript_raw, transcript_work, translation_work, or export.
--full: run automatic stages only; manual stages still require sub-agent batch work and --commit.

Audio chunk options:

--chunk-target <seconds>: local audio chunk target. Default 600, minimum 60.
--chunk-boundary-search <seconds>: silence search window around the target cut point. Default 90, range 0..300.
--chunk-silence-noise <db>: ffmpeg silencedetect threshold. Default -35.
--chunk-silence-duration <seconds>: minimum silence duration. Default 0.4.

rajio keeps both single_file and chunking audio strategies. With the current ElevenLabs/scribe_v2/integrated flow, rajio validates these options for CLI compatibility but selects strategy = "single_file" and does not write stages.audio.chunking or stages.audio.chunks[]. Future transcription models may select the chunking strategy again.

Segments

Most rajio segments commands print affected segment rows. segments apply is the exception: by default it prints operation counts plus patch-scoped check feedback. Agents should default to --json for parseable output. When using verbose JSON, pipe the output through jq to select only the fields you need instead of reading the full raw payload. See CLI.md for JSON structures.

Segment command examples:

rajio segments list /path/to/session --json --stage transcript
rajio segments list /path/to/session --json --stage transcript --id 12
rajio segments list /path/to/session --json --stage transcript --id 12,15,19
rajio segments list /path/to/session --json --stage transcript --id 12,15,19 --around 3
rajio segments list /path/to/session --json --stage transcript --offset 100 --limit 50
rajio segments list /path/to/session --json --stage transcript --start 600 --end 660
rajio segments list /path/to/session --json --stage translation --issues empty_zh,zh_line_hard_limit
rajio segments list /path/to/session --json --stage translation --issues duration_too_long --level error
rajio segments list /path/to/session --json --stage translation --issues empty_zh --offset 100 --limit 50
rajio segments apply /path/to/session patch.toml --json --stage translation
rajio segments apply /path/to/session --json --stage translation <<'EOF'
[[operations]]
op = "edit"
segment_id = "12"
zh = "修正后的中文字幕"
EOF
rajio segments edit /path/to/session 12 --json --stage transcript --start 10.2 --end 13.4 --speaker A --ja "修正した日本語"
rajio segments edit /path/to/session 12 --json --stage transcript --ja "修正した日本語" --dry-run
rajio segments split /path/to/session 12 --json --stage transcript --at 11.8 --gap 0.05 --id1 12.1 --id2 12.2 --ja1 "前半の日本語" --ja2 "後半の日本語" --speaker1 A --speaker2 B
rajio segments merge /path/to/session 12.1 12.2 --json --stage transcript --id 12 --ja "結合した日本語" --speaker A,B
rajio segments insert /path/to/session 12.5 --json --stage transcript --start 42.0 --end 43.2 --speaker A --ja "追加された字幕"
rajio segments delete /path/to/session 13 --json --stage transcript

In segments commands, pass /path/to/session after the segment subcommand. Replace --stage transcript with --stage translation for translation/work/segments.toml. Segment ids must be non-empty, trimmed strings without commas.
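The id rule above can be expressed as a short predicate. This is an illustrative sketch only; `is_valid_segment_id` is a hypothetical name, and rajio's own validation may apply further constraints that this document does not list.

```python
def is_valid_segment_id(value: object) -> bool:
    """True for a non-empty string with no surrounding whitespace and no commas."""
    return (
        isinstance(value, str)
        and value != ""
        and value == value.strip()
        and "," not in value
    )
```

Ids such as "12", "12.1", and "title" pass; "12,15" fails because the comma would be read as a list separator by --id.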

segments list selects rows by id, time range, validation issue, or plain pagination:

--id <ids>: show a comma-separated id list in requested order. Segment ids themselves must not contain commas.
--id <ids> --around <count>: show surrounding context for each requested id, deduplicated in timeline order.
--start <time> --end <time>: show segments whose start time is in [start, end).
--issues <codes>: show segments matching validation codes such as invalid_time, ja_line_hard_limit, or empty_zh; add --level error to exclude warning-level matches for soft-or-hard codes like duration and reading speed. Add --offset and --limit to page through issue matches.
--offset <count> --limit <count>: show a zero-based window after any issue filtering; omit --limit to read from offset to the end. Do not combine with --id, --around, or --start/--end.
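The selection rules above can be modeled in a few lines, which may help when reasoning about which rows a command will return. This sketch is not rajio's implementation. It assumes that --around N means N segments before and N after each requested id, and that segments are dictionaries with a numeric `start`; both function names are hypothetical.

```python
def select_with_context(ordered_ids: list[str], requested: list[str], around: int) -> list[str]:
    """Model --id with --around: context for each id, deduplicated in timeline order."""
    position = {segment_id: index for index, segment_id in enumerate(ordered_ids)}
    keep: set[int] = set()
    for segment_id in requested:
        index = position[segment_id]  # raises KeyError for an unknown id
        first = max(0, index - around)
        last = min(len(ordered_ids) - 1, index + around)
        keep.update(range(first, last + 1))
    return [ordered_ids[index] for index in sorted(keep)]


def in_time_range(segments: list[dict], start: float, end: float) -> list[dict]:
    """Model --start/--end: keep segments whose start time is in [start, end)."""
    return [segment for segment in segments if start <= segment["start"] < end]
```

Note the half-open range: a segment that starts exactly at the end value belongs to the next window, so adjacent windows such as 600..660 and 660..720 do not return the same segment twice.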

segments apply <target> [file] applies an ordered TOML patch as the batch form of edit, split, merge, insert, and delete. Pass a file path, or omit [file] only when providing stdin in the same shell command, such as <<'EOF' ... EOF. For batch work, prefer a patch file under a session-local patches/ directory. Normal apply writes the patched segments, then runs patch-scoped check feedback. --dry-run validates the patch, previews affected output, and runs the same checks without writing changes. Use --verbose --json with jq when you need affected segment rows and their remaining issues.

created_by = "worker-a"
start = 120.0
end = 180.0

[[operations]]
op = "edit"
segment_id = "12"
zh = "修正后的中文字幕"

[[operations]]
op = "edit"
segment_id = "title"
skip_checks = [
  { code = "zh_repeated_punctuation", reason = "Official title spelling." },
  { code = "zh_line_hard_limit", reason = "Official title should stay on one line." }
]

[[operations]]
op = "split"
source_id = "long"
gap = 0.05

[[operations.replacements]]
segment_id = "long.1"
start = 10.0
end = 13.2
speaker = "A"
ja = "前半の日本語"
zh = "前半中文字幕"

[[operations.replacements]]
segment_id = "long.2"
start = 13.2
end = 16.0
speaker = "A"
ja = "後半の日本語"
zh = "后半中文字幕"

[[operations]]
op = "merge"
source_ids = ["13.1", "13.2"]
merged_id = "13"
speaker = "A,B"
ja = "結合した日本語"
zh = "合并后的中文字幕"

[[operations]]
op = "insert"
segment_id = "13.5"
start = 16.2
end = 17.0
speaker = "A"
ja = "追加された字幕"
zh = "新增字幕"

[[operations]]
op = "delete"
segment_id = "14"

Clips

Clip command examples:

rajio clips transcribe /path/to/session --start 120 --end 180 --label noisy-overlap
rajio clips list /path/to/session --json
rajio clips show /path/to/session clip-120000-180000 --json

Use clips when an initial transcription has a complex, noisy, overlapped, or error-prone time range that should be independently recognized for comparison. clips list prints only clip rows; clips show prints only that clip's segments.toml. Agents should default to --json for clips list and clips show; otherwise output is a human-readable table. See CLI.md for JSON structures.

Check

Use rajio check before committing manual stages and before final reporting. It validates session shape, timeline integrity, required text, and subtitle QA heuristics, but it does not replace semantic review for ASR mistakes, names, terms, context, translation quality, or editorial polish. Treat the levels as follows:

fatal: invalid data or session state; fix before proceeding.
error: a problem that seriously hurts subtitle readability in ordinary cases; fix it unless a specific reviewed exception is better for accuracy or viewing comfort.
warning: a recommendation; inspect it and make a local editorial decision instead of mechanically fixing or mechanically ignoring it.

Passing rajio check, including with zero fatal/error issues, does not mean the subtitles are polished. It only means the work has passed the baseline data and subtitle heuristic checks.

Use --json for machine-readable output; pipe it to jq when you need to extract fields or slice down the output. See CLI.md for JSON structures.

rajio check /path/to/session --json --level error: show blocking fatal and error issues.
rajio check /path/to/session --json --stage transcript --language ja: check transcript work Japanese QA. Transcript checks only support ja.
rajio check /path/to/session --json --stage translation: check translation work Chinese QA; zh is the default language for translation.
rajio check /path/to/session --json --stage translation --language ja: inspect Japanese subtitle QA inherited into translation/work/segments.toml.
Add --verbose only when you need full sorted issues, such as locating exact segment IDs and issue codes before adding skip_checks. When using verbose JSON, pipe it through jq to inspect only the fields you need instead of reading the full raw payload.

Subtitle QA Rules

These are the subtitle QA thresholds enforced by rajio check; severity, stage, and language filtering follow the Check section above.

| Rule | Warning | Error |
| --- | --- | --- |
| Japanese line length | `ja` line exceeds 20 visible non-space characters | `ja` line exceeds 28 visible non-space characters |
| Chinese line length | `zh` line exceeds 16 visible non-space characters | `zh` line exceeds 24 visible non-space characters |
| Line breaks | 2 lines that still need review after merge-length check | More than 2 lines, or 2 lines that merge within the soft length and should be one line |
| Subtitle duration | shorter than 0.5 seconds or longer than 7 seconds | shorter than 0.3 seconds or longer than 10 seconds |
| Reading speed | Japanese exceeds 15 chars/s; Chinese exceeds 11 chars/s | Japanese exceeds 20 chars/s; Chinese exceeds 15 chars/s |
| Adjacent gap | gap is 50-150 ms | gap is under 50 ms |
| Punctuation | none | ordinary comma/period punctuation, ordinary sentence-ending punctuation, 2+ repeated question/exclamation marks, or punctuation-only line |

Do not satisfy numeric limits by creating unreadable single-character, single-syllable, or isolated filler subtitles. Prefer natural compression, merging with an adjacent segment, retiming, or splitting at a semantic pause. Single ？ or ！ is allowed when needed for intent, but use it sparingly.

Warnings are still work items. Review them in context and either improve the subtitle or keep the current form because it is more accurate, natural, or comfortable than the mechanical alternative.
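To make the numeric thresholds concrete, here is a small sketch of the line-length, duration, and reading-speed rules. It is an approximation for estimating a draft line while editing, not rajio's checker. It assumes that "visible non-space characters" means every non-whitespace character, and that reading speed divides the non-whitespace character count of the whole subtitle by its duration; rajio check may count differently and remains authoritative. All function names are hypothetical.

```python
# (warning, error) thresholds per language, copied from the QA rules above.
LINE_LIMITS = {"ja": (20, 28), "zh": (16, 24)}
SPEED_LIMITS = {"ja": (15.0, 20.0), "zh": (11.0, 15.0)}


def visible_length(text: str) -> int:
    """Count non-whitespace characters (assumed reading of 'visible non-space')."""
    return sum(1 for char in text if not char.isspace())


def level(value: float, soft: float, hard: float) -> str:
    """Classify a value that must not exceed the soft or hard limit."""
    if value > hard:
        return "error"
    return "warning" if value > soft else "ok"


def line_length_level(text: str, lang: str) -> str:
    longest = max((visible_length(line) for line in text.splitlines()), default=0)
    return level(longest, *LINE_LIMITS[lang])


def duration_level(start: float, end: float) -> str:
    duration = end - start
    if duration < 0.3 or duration > 10:
        return "error"
    return "warning" if duration < 0.5 or duration > 7 else "ok"


def reading_speed_level(text: str, lang: str, start: float, end: float) -> str:
    duration = end - start
    if duration <= 0:
        raise ValueError("end must be after start")
    return level(visible_length(text) / duration, *SPEED_LIMITS[lang])
```

A result of "warning" or "error" is a prompt to reread the subtitle in context, in line with the guidance above, not an instruction to shorten it mechanically.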

Workflow

0. Prepare The Session

Resolve the media path to an absolute path and confirm it exists.
Choose the session directory. If the user provides one, use it. Otherwise create one near the media file or in the current workspace using a filesystem-safe title or media stem. Do not copy large media files unless the user asks.
Create or update description.md.
Gather confirmed context before transcription when practical. Use the original URL, official pages, video title, filenames, on-screen text, user notes, and later transcript discoveries. Record uncertainty explicitly instead of guessing.

Use this description.md shape:

---
media: ./video.mp4
title: Video title or filename stem
url: https://example.com/original
published_at: 2026-06-06
---

## Context

- Source/uploader:
- User notes:
- Video synopsis:
- Cast/speakers:
- Program/corner structure:
- Known fixed greetings or sign-offs:
- Related events/products/works mentioned:

## Glossary And Fixed Terms

- Japanese term/person/place -> Chinese translation or note
- Common ASR confusion -> Correct Japanese term / Chinese translation

## Style Requirements

- Translate into natural Simplified Chinese subtitles.
- Preserve important names and terminology consistently.

1. Run To Transcript Work

Before automatic stages, run:

rajio doctor /path/to/session

rajio doctor reports the rajio CLI version and update-check result first. Confirm the CLI version matches this SKILL frontmatter metadata.version; if they differ, or doctor reports a newer version is available or update checking failed, report that before starting automatic stages. It also validates runtime configuration and provider access using the target directory for .env loading. Do not start transcription until rajio doctor passes or the environment issue is resolved.

Run:

rajio /path/to/session --continue=until-manual

Expected result: rajio creates or reads session.toml, extracts audio, transcribes Japanese, writes raw transcript artifacts, creates transcript/work/segments.toml, and stops at transcript_work.

Wait for transcript/raw/checkpoints/input-000.toml or transcript/raw/checkpoints/input-000.error.log. Do not restart while requests may still be in flight unless there is a clear CLI/provider failure.

Treat automatically created work segments as a draft.

Rajio may write suggested patches under transcript/work/suggested-patches/, but never applies them automatically. Review them during the proofread stage below.

Suggested patch review:

Review suggested patches in numeric filename order before spawning transcript workers.
Apply, manually fold in, or explicitly reject each suggested patch, including *-low.* files, before spawning transcript workers.
Treat confidence = "medium" patches with extra care and *-low.* files as manual decision candidates, not automatic cleanup or post-worker TODOs.
Patch reason and confidence fields do not change apply behavior.

Example generated files:

transcript/work/suggested-patches/
  01-punctuation-cleanup-chunk-000-000000s-000600s-high.toml
  02-fragment-merge-chunk-000-000000s-000600s-high.toml
  02-fragment-merge-chunk-000-000000s-000600s-medium.toml
  03-boundary-retime-chunk-000-000000s-000600s-high.toml
  04-long-segment-candidates-chunk-000-000000s-000600s-low.md

2. Proofread And Polish Japanese

Proofread flow:

Review automatically generated suggested patches under transcript/work/suggested-patches/, adjust them if needed, dry-run them with rajio segments apply <session> <patch> --stage transcript --dry-run, then apply, manually incorporate, or reject them before assigning worker ranges.
Spawn transcript proofread sub-agents following SUB_AGENTS.md. Each worker gets a label and assigned source-media start/end range.
Review each worker patch file, dry-run summary, and final report confirmations. Apply accepted patches to transcript/work/segments.toml. A blocker report is not an accepted patch; resolve it or reassign that range before committing.
Manually perform a whole-transcript review and polish pass for context, terminology, fixed phrases, structure, and readability across batch boundaries.
Commit transcript_work only after semantic review and validation are clean, or only intentional subtitle QA exceptions remain with exact skip_checks.

Transcript review requirements:

Use the segment commands documented in the CLI section with --stage transcript for transcript inspection, patch review, patch application, and validation.
Do not translate in this stage; only correct and polish the Japanese transcript.
The main agent's proofread and polish pass must be the manual review defined in Non-Negotiable Rules. Use rajio segments list to read the actual transcript text in timeline-order batches with neighboring context, review suggested and worker patches against those segments, and inspect batch boundaries.
rajio check, suggested patches, issue filters, dry-run summaries, or statistics are only supporting evidence. They do not count as transcript proofreading unless the affected subtitle text is read in context.

For complex, noisy, overlapped, or suspicious ASR ranges, the main agent or a sub-agent may use rajio clips transcribe to retranscribe the original media time range as sidecar evidence. Then use rajio clips list --json and rajio clips show <id> --json to compare the alternate transcript against transcript/work/segments.toml. Clip output is reference material; do not treat it as an automatic replacement.

Validate often with rajio check as documented in the CLI section. This only checks data shape, timing, required fields, and subtitle limits; before committing, still polish the content semantically against the acceptance criteria below.

Acceptance criteria:

Every segment has stable id, numeric start/end, non-empty speaker, and non-empty Japanese ja.
Timestamps increase and do not overlap.
Japanese text is coherent, natural, and corrected against description.md, glossary, proper nouns, and raw transcript references.
Known names, program titles, corner names, event names, hashtags, greetings, mail reads, and sign-offs are corrected consistently.
Search the whole transcript for likely ASR variants of fixed terms, not only exact glossary terms.
Check high-risk positions explicitly: opening title call, self-introductions, listener greetings, corner starts, event announcements, mail-address reads, and ending sign-off.
Follow the Subtitle QA Rules for line length, line count, duration, reading speed, gaps, and punctuation.
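The structural items in these acceptance criteria (required fields, increasing and non-overlapping timestamps) are data-shape checks that a read-only script can illustrate. The sketch below assumes segments are already loaded, in timeline order, as dictionaries with `id`, `start`, `end`, `speaker`, and `ja` keys; `structural_problems` is a hypothetical helper. The semantic criteria cannot be checked this way and still require reading the text.

```python
def structural_problems(segments: list[dict]) -> list[str]:
    """List missing required fields and timeline violations, in segment order."""
    problems: list[str] = []
    previous_end = None
    for segment in segments:
        segment_id = segment.get("id")
        for field in ("id", "speaker", "ja"):
            value = segment.get(field)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"{segment_id}: missing {field}")

        start, end = segment.get("start"), segment.get("end")
        numeric = all(
            isinstance(value, (int, float)) and not isinstance(value, bool)
            for value in (start, end)
        )
        if not numeric:
            problems.append(f"{segment_id}: non-numeric time")
            continue
        if end <= start:
            problems.append(f"{segment_id}: end not after start")
        if previous_end is not None and start < previous_end:
            problems.append(f"{segment_id}: overlaps previous segment")
        previous_end = end
    return problems
```

An empty list only means the shape is plausible; rajio check remains the validation to run before committing.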

Speaker and segment structure:

A normal segment should represent one readable subtitle unit.
Do not preserve unreadable fragments such as single characters or syllables when adjacent fragments form one jointly spoken phrase.
If multiple speakers complete the same short phrase together, merge it into one segment with complete ja; combine speakers with comma-separated values such as speaker = "A,B" when attribution matters.
Preserve segment IDs unless a structural correction truly requires a change.

Before committing:

Update description.md with newly confirmed context and terminology.
Search for known ASR confusions and wrong proper nouns.
Spot-check opening, middle, and ending subtitles for proper nouns and fixed phrases.
Confirm no remaining segment is an unreadable fragment that should be merged.

When clean:

rajio /path/to/session --commit --continue=until-manual

If only intentional subtitle QA exceptions remain, inspect them first:

rajio check /path/to/session --json --stage transcript --language ja --verbose

If preserving an exception improves accuracy, naturalness, or readability, add skip_checks to the affected segment with the exact issue code and a reason, then commit normally:

rajio /path/to/session --commit --continue=until-manual

Expected result: rajio commits transcript_work, creates translation/work/segments.toml, and stops at translation_work.

3. Translate And Polish Chinese

Initial translation flow:

Plan explicit translation batches by non-overlapping source-media time ranges instead of attempting the whole file in one pass. Choose range sizes by dialogue density and the active sub-agent concurrency limit.
Spawn translation sub-agents following SUB_AGENTS.md. Each worker gets a label and assigned source-media start/end range.
Review each worker patch file, dry-run summary, and final report confirmations. Apply accepted patches to translation/work/segments.toml so every segment has filled or refined zh. A blocker report is not an accepted patch or a completed batch; resolve it or reassign that range before committing.
Manually perform a whole-file first-draft review for terminology, subtitle continuity, missing translations, Japanese corrections made during translation, and cross-batch style consistency.
Commit translation_work and export only after every batch has been translated, terminology has been cross-checked, and validation has no blocking fatal or Chinese error issues except reviewed intentional subtitle QA exceptions.
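Batch planning by non-overlapping time ranges can be sketched as follows. This is one possible approach, not a rajio feature: it uses a maximum segment count per batch as a rough proxy for dialogue density, and it ends each range at the next batch's first start time so that ranges line up with the [start, end) selection used by segments list. The `plan_batches` name, the label format, and the dictionary shape are all hypothetical.

```python
def plan_batches(segments: list[dict], max_segments: int) -> list[dict]:
    """Split timeline-ordered segments into contiguous, non-overlapping time ranges."""
    if max_segments < 1:
        raise ValueError("max_segments must be at least 1")

    batches: list[dict] = []
    for first in range(0, len(segments), max_segments):
        chunk = segments[first:first + max_segments]
        next_index = first + max_segments
        if next_index < len(segments):
            end = segments[next_index]["start"]  # next batch begins exactly here
        else:
            end = chunk[-1]["end"]  # last batch runs to the final segment's end
        batches.append({
            "label": f"batch-{len(batches) + 1:02d}",
            "start": chunk[0]["start"],
            "end": end,
            "count": len(chunk),
        })
    return batches
```

The planned ranges are a starting point; the main agent may still move a boundary by hand so that it falls at a topic change or pause rather than mid-exchange.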

Translation review requirements:

Use the segment commands documented in the CLI section with --stage translation for translation inspection, patch review, patch application, and validation.
Fill or refine translated subtitle text in zh; keep Japanese corrections limited to transcript issues found while translating.
The main agent's first-draft translation review must be the manual review defined in Non-Negotiable Rules. Use rajio segments list to read translated segments in timeline-order batches with neighboring context, compare ja and zh, review worker patches against those segments, and inspect batch boundaries.
rajio check, issue lists, dry-run summaries, or statistics are only supporting evidence. They do not count as translation review unless the subtitle text is read in context.

During batch work, keep glossary updates and unresolved uncertainty in description.md, and search earlier completed batches when a new name, phrase, or style decision appears. Before committing, confirm this command has no blocking fatal or Chinese error issues:

rajio check /path/to/session --json --stage translation

To inspect Japanese QA in the current translation work file, run the same command with --language ja.

During translation_work, translation/work/segments.toml is the active subtitle work file. If translation reveals a Japanese typo, wrong name, wrong fixed phrase, missing context, or bad segment structure, correct the relevant ja and zh in translation/work/segments.toml and update description.md when the decision affects terminology or future batches.

If a translation problem points back to an uncertain or messy source-audio range, use rajio clips transcribe for that original media time range and inspect it with rajio clips show <id> --json. Use the sidecar transcript as a second reference before correcting ja in translation/work/segments.toml and reconciling the translation.

Acceptance criteria:

Keep id, start, end, and speaker stable unless a structural correction or intentionally removed semantically empty filler genuinely requires a change. ja may be corrected in translation/work/segments.toml when translation review finds a Japanese typo, name, or fixed-phrase issue.
Every segment has non-empty zh.
Chinese is natural Simplified Chinese subtitle language, not word-by-word literal output.
Preserve meaning, tone, speaker intent, jokes, references, and discourse flow.
Very short segments that are only meaningless fillers, breaths, interjections, or pure hesitation sounds may be deleted from the subtitle if removing them does not change meaning, speaker intent, or timing comprehension.
Smooth spoken hesitation, false starts, and harmless repetition in Chinese unless they are semantically important, characterize the speaker, or affect the scene's rhythm.
Remove unnecessary Chinese filler and transcript-shaped clutter, including but not limited to redundant 嗯, 啊, 哦, 呃, 欸, 那个, 就是, repeated 对对对, duplicated verbs, repeated subjects, stalled false starts, and trailing particles such as 嘛 when they do not carry tone or timing value. Treat these as review candidates, not a fixed deletion list. Keep interjections when they express a real reaction, joke beat, surprise, embarrassment, or speaker personality.
Keep Chinese renderings globally consistent for people, programs, corners, events, hashtags, works, products, honorific decisions, and recurring phrases.
Use description.md as the glossary and style source. Update it if new confirmed terms are discovered.
Translate merged multi-speaker phrases as one complete subtitle. Do not preserve syllable-by-syllable fragments in Chinese.
Follow the Subtitle QA Rules for line length, line count, duration, reading speed, gaps, and punctuation.
Do not create an awkward short trailing subtitle only to satisfy a warning threshold. Preserve subtitle continuity and readability first.

Before committing:

Compare description.md glossary against translation/work/segments.toml.
Search for inconsistent Chinese names, untranslated Japanese names, wrong titles, and stale translations from earlier draft assumptions.
Spot-check opening, middle, ending, fixed greetings, mail reads, event announcements, and sign-off for Japanese correctness and Chinese readability.
Check subtitle continuity across adjacent segments: the Chinese should read as connected dialogue, not isolated literal fragments.
Review Japanese error and warning issues still present in translation/work/segments.toml:

rajio check /path/to/session --json --stage translation --language ja --level warning

Record unresolved uncertainty in description.md or mention it in the final report.

When clean, commit and export the first translation draft:

rajio /path/to/session --commit --continue=until-manual

If intentional Chinese QA exceptions remain, inspect them first:

rajio check /path/to/session --json --stage translation --verbose

Add skip_checks to each affected segment only after manual review confirms every remaining error issue is an intentional subtitle QA exception and no fatal issues remain. Then commit normally.

rajio /path/to/session --commit --continue=until-manual

Expected result: rajio commits translation_work, runs export, and reaches the terminal done state. This ends the CLI workflow, but the exported subtitles are still only a first-pass translation and proofread draft. The current main agent must continue with the refinement pass below before treating the subtitles as final polished output.

Expected draft output:

output/*.ja.srt
output/*.zh.srt
output/*.ja-zh.ass
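
A quick way to confirm the expected files exist is to glob for the three patterns listed above. This is a minimal sketch under one assumption: that output/ sits directly inside the session directory. `missing_outputs` is a hypothetical helper, not a rajio command.

```python
from pathlib import Path

# Patterns copied from the expected output list above.
EXPECTED_PATTERNS = ("*.ja.srt", "*.zh.srt", "*.ja-zh.ass")


def missing_outputs(session_dir, patterns=EXPECTED_PATTERNS):
    """Return the patterns that match no non-empty file under <session_dir>/output."""
    output_dir = Path(session_dir) / "output"  # assumed location of the export folder
    missing = []
    for pattern in patterns:
        matches = [
            path
            for path in output_dir.glob(pattern)
            if path.is_file() and path.stat().st_size > 0
        ]
        if not matches:
            missing.append(pattern)
    return missing
```

An empty return value only shows that the files exist and are not empty; it says nothing about subtitle quality.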

4. Refine Chinese Subtitles

After the first draft export, the main agent must perform at least three full-file Chinese subtitle refinement passes over translation/work/segments.toml, and should keep iterating while meaningful improvements remain. This is not a substitute for the sub-agent batch translation stage: do not use these passes to fill large missing sections or redo the whole translation from scratch. Use them to raise the already translated draft to final subtitle quality.

Preserve the committed draft's structure unless a change clearly improves accuracy, readability, or subtitle continuity. Do not break the Subtitle QA Rules, timeline integrity, required fields, segment IDs, or transcript alignment. If refinement changes the work file after export, recommit translation_work and regenerate export output.

Required refinement pass loop:

For each Chinese refinement pass, follow these steps exactly:

1. Read translation/work/segments.toml from start to end in timeline order as actual ja/zh subtitle text, using rajio segments list windows or ranges with neighboring context until the full file is covered. This full read is mandatory; do not replace it with rg, rajio check, issue summaries, issue lists, issue IDs, examples, filters, or statistics.
2. Edit the Chinese subtitles for style, flow, tone, concision, consistency, and viewing comfort. Update matching ja, glossary, or description.md entries only when the Chinese review exposes a confirmed source or term problem.
3. Run the translation checks for zh and ja only as regression checks. Final refinement starts from a committed draft that should already have no blocking fatal or error issues; if this pass introduces any new fatal or error, fix it immediately. Warnings may guide inspection, but this pass is not mechanical warning cleanup.
4. Reread every segment changed in this pass one by one in its final subtitle form, with neighboring context for boundary flow.
5. If the reread finds a style, meaning, flow, or consistency problem, fix it. If checks show a new fatal or error, fix it immediately. Then repeat steps 3-4.
6. Count the pass complete only after the full-file read, editorial pass, regression checks, and changed-segment reread are all done.

Refinement requirements:

Perform at least three full-file Chinese refinement passes before final verification. A pass must make a deliberate full-file check for text quality, not only run rajio check.
Each refinement pass must satisfy the manual review definition in Non-Negotiable Rules: read the actual ja/zh segment text in timeline order with neighboring context, using rajio segments list windows or ranges to cover the full file. Warning lists, statistics, scripted transformations, and validation output may guide where to look, but they do not count as a refinement pass.
Do not substitute rg searches, rajio check summaries, rajio check --verbose issue lists, issue IDs, examples, or filtered issue views for full-file reading. They are only navigation aids; the pass counts only when the subtitle text itself was read and edited in order.
Read the Chinese subtitles continuously across adjacent segments, not only segment by segment. Repair places where the text reads like isolated translated fragments.
Enforce global term consistency for names, programs, corners, events, works, products, hashtags, recurring jokes, honorific choices, and fixed phrases.
Match register, tone, and speaker intent to the local context: casual speech should not become stiff, jokes should not become flat, and emotional emphasis should not disappear.
Prefer natural Simplified Chinese subtitle language over literal completeness. Compress harmless repetition and spoken clutter when the source meaning, rhythm, and speaker personality are preserved.
Clean up filler-heavy Chinese, wordiness, and direct translation artifacts, including but not limited to redundant interjections, duplicated words or clauses, repeated subjects, over-explicit pronouns, stalled false starts, stiff connectives, explanatory padding, and literal translations of Japanese hesitation. Treat these as review candidates rather than a fixed deletion list; delete or rewrite them when they only mirror source disfluency or make the subtitle read like prose instead of subtitles.
Check pronouns, ellipses, omitted subjects, callbacks, and topic shifts against nearby Japanese context so Chinese lines do not become ambiguous or misleading.
Smooth sentence flow across subtitle boundaries while keeping each subtitle readable on its own timing. Avoid awkward trailing fragments created only to satisfy line limits.
Revisit glossary decisions in description.md; update it when a better confirmed term or style rule is chosen, then apply that choice consistently through the full file.
Search for stale draft assumptions, mixed translations of the same term, untranslated Japanese, accidental simplified/traditional mismatches, and Chinese punctuation noise.
Preserve meaningful speaker style differences where the source supports them, but do not over-characterize beyond the audio/video evidence.
If a Chinese issue exposes a likely Japanese subtitle mistake, fix the ja/zh pair in translation/work/segments.toml, update description.md if needed, and rerun the translation checks.
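
One of the searches above, leftover untranslated Japanese in the Chinese text, lends itself to a small helper. The sketch below is an illustration, not a rajio feature: `kana_runs` is a hypothetical name, and it only detects hiragana and katakana letters. Japanese written purely in kanji cannot be told apart from Chinese this way, so it still needs reading.

```python
import re

# Hiragana letters (U+3041-U+3096) and katakana letters (U+30A1-U+30FA).
# Kana in a zh line usually means a word was left untranslated.
KANA = re.compile(r"[\u3041-\u3096\u30A1-\u30FA]+")


def kana_runs(zh_text):
    """Return every run of kana letters found in a Chinese subtitle line."""
    return KANA.findall(zh_text)
```

A hit is a review candidate rather than an automatic error: a title or name may be kept in Japanese on purpose, in which case the choice belongs in description.md.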

During and after refinement, validate with:

rajio check /path/to/session --json --stage translation --language zh --level warning
rajio check /path/to/session --json --stage translation --language ja --level warning
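
Because both checks are rerun after every pass, it can help to wrap them. The following is a minimal sketch that only assembles the two commands exactly as written above and captures their raw output. `check_command` and `run_checks` are hypothetical helpers; the JSON schema and the meaning of the exit code are not described here, so the sketch does not interpret either.

```python
import subprocess


def check_command(session_dir, language):
    """Build the translation-stage check command shown above for one language."""
    return [
        "rajio", "check", str(session_dir),
        "--json", "--stage", "translation",
        "--language", language, "--level", "warning",
    ]


def run_checks(session_dir, languages=("zh", "ja")):
    """Run each check and return {language: (exit_code, raw_stdout)}."""
    results = {}
    for language in languages:
        completed = subprocess.run(
            check_command(session_dir, language),
            capture_output=True,
            text=True,
            check=False,  # leave exit-code interpretation to the caller
        )
        results[language] = (completed.returncode, completed.stdout)
    return results
```

The captured output is still only a regression signal; it does not count toward a refinement pass.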

After the final refinement pass, reset the export stage with --commit. This commits translation_work if refinement left it dirty; otherwise it only regenerates the export from the already committed translation:

rajio /path/to/session --reset export --commit --continue=until-manual

If the final refined translation still has intentional Chinese QA exceptions, inspect them first as documented above, add exact per-segment skip_checks, then rerun the same commit and export reset command:

rajio /path/to/session --reset export --commit --continue=until-manual

Expected final output:

output/*.ja.srt
output/*.zh.srt
output/*.ja-zh.ass

5. Final Verification

Before reporting completion:

Run rajio check as documented in the CLI section.
Treat the result as data validation only.
Confirm session.toml is not stuck in failed, dirty, or an unexpected manual stage.
Confirm expected output files exist under output/.
Confirm at least three full-file Chinese refinement passes were performed after the sub-agent translation draft. If refinement changed translation/work/segments.toml, recommit and regenerate exports before reporting completion.
Perform manual content QA:
- proper nouns and fixed terms
- opening title call and speaker introductions
- middle section timing and speaker continuity
- event/work/corner names
- ending sign-off
- Chinese readability, subtitle continuity, and terminology consistency
- unnecessary filler words, repeated expressions, false starts, and direct-translation artifacts in Chinese
Search final work files for known ASR-confusion variants and glossary terms one last time.
Perform at least two final spot-check rounds before reporting. Each round should sample opening, middle, ending, at least one dense dialogue range, and at least one known uncertain range from description.md; do not count rajio check as a spot-check.
Report output files, remaining warnings, assumptions, content-QA limits, and any spots needing user judgment.
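
The fixed part of each spot-check round (opening, middle, ending) can be derived from the segment count. This is a rough sketch with assumptions: `spot_check_windows` is a hypothetical helper, the window size of 10 is arbitrary, and the ranges are zero-based positions in timeline order that still have to be mapped to real segment IDs. Dense dialogue ranges and known uncertain ranges from description.md must be chosen by hand.

```python
def spot_check_windows(total_segments, window=10):
    """Return half-open (start, end) position ranges for opening, middle, and ending."""
    if total_segments <= 0:
        return {}
    window = min(window, total_segments)  # short files: one window covers everything
    middle_start = (total_segments - window) // 2
    return {
        "opening": (0, window),
        "middle": (middle_start, middle_start + window),
        "ending": (total_segments - window, total_segments),
    }
```

Shifting the middle window between rounds avoids rereading the same lines in both rounds.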

Failure Handling

If rajio check reports schema, duplicate ID, empty text, invalid time, or overlap errors, fix the relevant work file before committing.
If a committed manual stage becomes dirty, inspect the changed work and rerun --commit only after it passes manual review and validation.
If transcription fails, inspect transcript/raw/checkpoints/input-000.error.log and check credentials, provider access, the media path, ffmpeg, and ffprobe. Report the likely cause and a recommended next step to the user, then pause work. Do not retry transcription, reset transcription artifacts, or continue downstream stages unless the user explicitly asks for it. A matching completed checkpoint is reused on retry; use --reset transcript_raw only when the user asks to start a full new transcription round.
If the user asks to retry an earlier workflow step, run the default command with --reset: --reset audio retries audio extraction, --reset transcript_raw reruns transcription generation, --reset transcript_work regenerates the transcript work file, --reset translation_work regenerates the translation draft, and --reset export reruns subtitle export.