dify-docs-format-check-cjk

star 170

Check formatting compliance in changed Chinese and Japanese documentation against writing-guides/formatting-guide.md, tools/translate/formatting-zh.md, and tools/translate/formatting-ja.md. Use after finalizing a translation batch, or when the user says "check formatting (zh/ja)", "check CJK formatting", or "format audit" on translated content.

langgenius By langgenius schedule Updated 4/30/2026

name: dify-docs-format-check-cjk description: > Check formatting compliance in changed Chinese and Japanese documentation against writing-guides/formatting-guide.md, tools/translate/formatting-zh.md, and tools/translate/formatting-ja.md. Use after finalizing a translation batch, or when the user says "check formatting (zh/ja)", "check CJK formatting", or "format audit" on translated content.

CJK Formatting Check (Chinese + Japanese)

Purpose

Verify Chinese (zh/) and Japanese (ja/) documentation against every rule in:

  • writing-guides/formatting-guide.md — general rules
  • tools/translate/formatting-zh.md — Chinese-specific rules
  • tools/translate/formatting-ja.md — Japanese-specific rules

Mechanical rules are enforced by the linter script (check-format-cjk.py); judgment-call rules are checked by reading the file. The linter automatically selects the right rule set based on whether the path starts with zh/ or ja/.

Before Starting

Audit the entire file, not just the diff. Default to files currently under review, detected via:

  • git diff --name-only
  • git diff --cached --name-only
  • Untracked files from git status --porcelain (lines starting with ??)

Filter for .mdx and .md files under zh/ or ja/. If no files are detected, ask the user which files to check.

Because translations are typically produced as a zh/ja pair from the same English source, it is natural to audit both languages in a single session.

Checks

Part 1 — Deterministic (run the linter)

Run the linter:

python3 .claude/skills/dify-docs-format-check-cjk/check-format-cjk.py <file> [<file> ...]

The script selects zh or ja rules based on the file path. Rules fall into three groups: shared (both zh and ja), zh-only, and ja-only.

Shared structural rules (zh + ja)

  • F-title-missing — frontmatter missing title.
  • F-desc-trailing-perioddescription ends with a period.
  • F-quote-needed / F-quote-unnecessary / F-single-quote — frontmatter quoting rules.
  • F-blank-after-fm — no blank line between frontmatter close and body.
  • H-trailing-hash, H-blank-before, H-blank-after, H-skip-level — heading structure rules from the general guide.
  • B-trailing-colon-inside — colon inside **...**.
  • L-asterisk-bullet, L-nested-indent, L-blank-before, L-blank-after — list structure.
  • C-no-language, C-blank-before, C-blank-after — code block rules.
  • Li-click-here, Li-http-external — link rules.
  • I-raw-img-tag, I-alt-too-long, I-caption-alt-mismatch, I-filename-* — image rules (same set as the EN skill).
  • M-tab-no-title — Mintlify component rules.
  • S-double-blank — spacing.
  • P-em-dash-spaces, P-en-dash-spaces — general punctuation.

Shared CJK rules (zh + ja)

  • CJK-latin-spacing — CJK character directly adjacent to Latin letter, digit, or backtick without a space. Exceptions: punctuation boundary, start/end of line, inside code/URLs.
  • CJK-halfwidth-punct — half-width punctuation , . : ; ? ! ( ) directly adjacent to a CJK character. (Slash and other exceptions are handled by language-specific rules below.)
  • CJK-bold-no-space — bold span **...** adjacent to CJK character without a space on each side.
  • CJK-link-no-space — markdown link text adjacent to CJK character without a space on each side.
  • CJK-italic*text* italic used on a CJK span. Chinese and Japanese should use bold, never italic.
  • CJK-em-dash — em dash or double em dash —— appears in CJK text. Restructure the sentence.
  • CJK-disclaimer-missing — translation disclaimer (<Note> ⚠️ ...) missing directly below the frontmatter.
  • CJK-cross-lang-link — internal link begins with the wrong language prefix (e.g., /en/... inside a zh/ file).

Heading rules

  • H-heading-end-punct — CJK heading ends with sentence-ending punctuation (。,、;:).

Chinese-only rules

  • ZH-ascii-ellipsis... used where Chinese ellipsis …… is expected.
  • ZH-fullwidth-slash — full-width slash used; must be /.
  • ZH-quotes — mainland-style double or single quotation marks "", '' used; must be corner brackets 「」 (single) or 『』 (nested).
  • ZH-range-hyphen — numeric range uses - or ; must use .
  • ZH-percent-space — space between a digit and % or °.

Japanese-only rules

  • JA-fullwidth-digit — full-width digit used (, , ...).
  • JA-fullwidth-latin — full-width Latin letter used (, , ...).
  • JA-fullwidth-space — full-width space ( ) used.
  • JA-sentence-too-long — sentence longer than 80 Japanese characters.
  • JA-go-prefix prefix used on a verb in the "avoid" list (ご確認ください, ご参照ください, ご入力ください). ご利用 is allowed.
  • JA-heading-sentence-ending — heading ends with します / します。 / します?, indicating a full-sentence rather than noun-phrase form.
  • JA-style-mix — the file contains both です/ます and だ/である forms in body text. Only one register should be used.

Part 2 — Judgment-call review (LLM reads each file)

For each changed file, read it and look for:

Translatable elements

  • Tab titles (<Tab title="...">) translated with the glossary value.
  • Frame captions and image alt text translated.
  • Bold UI labels translated, matching the codebase i18n file (web/i18n/zh-Hans/ for zh, web/i18n/ja-JP/ for ja). Cross-check with the terminology-check skill.
  • Natural-language prompt examples inside code blocks translated. Variable placeholders ({{variable_name}}) stay unchanged.

Anchor translation (zh/ja)

  • Cross-references [text](#anchor) or [text](/path#anchor) use the translated heading slug, not the English original. Example: the heading ## 响应 produces #响应, not #response.

Chinese style

  • Enumeration comma used for parallel items within a sentence (where Chinese separates items that would use commas in English).
  • Chinese ellipsis …… used correctly; not combined with in the same phrase.
  • Arabic numerals used for technical content (3 种, not 三种) — judgment call since some idiomatic expressions use Chinese numerals.
  • Translationese patterns from tools/translate/formatting-zh.md: redundant 你的, unnecessary , 当...时 wrappers, 能够 instead of , 可以 where would suffice. Flag sentences that read as machine-translated.

Japanese style

  • です/ます maintained in body text.
  • Headings are noun phrases, not full sentences.
  • Katakana long-vowel mark rules: short loanwords (3 morae or fewer) keep the trailing ; longer ones drop it. Established compound katakana terms (ワークフロー, ナレッジベース) match the glossary.
  • Middle dot only where needed for readability; established compound terms have no middle dot.
  • Translationese patterns from tools/translate/formatting-ja.md: redundant あなたの, 〜することができます instead of 〜できます, 〜とき/〜場合 wrappers, stacked . Flag sentences that read as machine-translated.

General

  • Terminology matches the glossary (writing-guides/glossary.md). Prefer invoking the terminology-check skill for rigorous checking rather than duplicating its logic here.

Output Format

## CJK Formatting Check Results

### File: {path} ({lang})

**Deterministic violations** ({n})
- Line {n} [{rule-id}]: {message}
- ...

**Judgment-call findings** ({n})
- Line {n}: {description of issue} — {rule area}
- ...

**Clean checks**
- ✅ Disclaimer present
- ✅ No bold/CJK spacing issues
- ...

Group by file. If a file has no issues at all, report a single ✅ line.

Important

  • Do NOT modify any files. This is a read-only audit.
  • When a deterministic rule flags something that is clearly intentional on inspection (e.g., a half-width comma inside a Latin acronym, a mainland quotation mark quoted from an external source), surface it but note the ambiguity so the user can decide.
  • The terminology-check skill is the authoritative source for glossary and UI-label verification. If a finding overlaps, defer to that skill and just note the overlap.
  • Japanese sentence-length and style-mix checks are heuristic. They flag candidates for human review; they are not definitive rules violations.
Install via CLI
npx skills add https://github.com/langgenius/dify-docs --skill dify-docs-format-check-cjk
Repository Details
star Stars 170
call_split Forks 386
navigation Branch main
article Path SKILL.md
More from Creator