name: humanize-en
description: Strip AI writing tells from English prose — em-dash overuse, rule of three, negative parallelisms, AI vocabulary (delve, tapestry, crucial, pivotal, underscore, showcase), vague attributions, promotional tone, conjunctive padding (moreover, furthermore, indeed), hedging, signposting, chatbot artifacts. Preserves meaning, structure, code blocks, links, anchors, and frontmatter — rewrites only the flagged phrasing. Operates on inline text or a prose file path; optionally loads a BRAND-VOICE.md via -f for brand-specific rules. Use whenever English prose needs to sound less machine-generated — "humanize this", "remove AI tells", "clean up the AI slop", "sounds like ChatGPT", "polish the English" — on READMEs, docs, release notes, blog drafts, PR bodies, or commit messages.
when_to_use: Also invoked as a subroutine by other writing skills (e.g., /write-clear-readme) to scrub drafts before shipping. Skip for structural restructuring of a README (use /write-clear-readme instead), non-English text, or content where AI-authored tone is intentional (transcripts, dataset labels).
argument-hint: "[-f
Humanize EN
Strip AI writing tells from English prose. Preserves meaning, structure, code blocks, links, anchors, and frontmatter — rewrites only the flagged phrasing.
Additional context from the user: $ARGUMENTS
Scope
This skill removes AI slop. Default mode (no -f): the goal is a clean, direct, human-edited register that preserves the source voice — universal AI tells stripped, no opinion injected. If the source is an opinion piece and the user explicitly asks for voice, references/voice.md covers the optional voice-calibration pass.
Under -f <voice-doc>: the skill raises the bar. The brand voice becomes the primary contract — every rule in the loaded BRAND-VOICE.md (forbidden lexicon, pronouns, rewrite rules, forbidden patterns, lexical exceptions) is enforced deterministically by prescan.py --brand, then by the LLM pass, then re-validated by validate.py. The 32 universal patterns are the floor; the brand voice is the ceiling.
Brand voice integration (optional)
When $ARGUMENTS starts with -f <voice-doc>, load a BRAND-VOICE.md (typically produced by /brand-voice) and treat its rules as the primary contract on top of the universal 32 patterns.
$SKILL_DIR = this skill's folder — ${CLAUDE_SKILL_DIR} in Claude Code, the directory containing this SKILL.md elsewhere.
Workflow:
- Strip -f <voice-doc> from the head of $ARGUMENTS. The remainder follows the Input modes table below as usual.
- Verify <voice-doc> exists with Glob. Missing or unreadable → degrade to default behavior with an explicit warning ("<path> not found — applying universal patterns only"). Never crash.
- Resolve the rules with python3 <extract_rules.py> --full <voice-doc>, which flattens YAML, resolves any voice.extends chain, applies _replace/_remove, and emits the merged rule block. Try "$SKILL_DIR"/../brand-voice/scripts/extract_rules.py, then ~/.claude/skills/brand-voice/scripts/extract_rules.py, then ~/.agents/skills/brand-voice/scripts/extract_rules.py. Non-zero exit surfaces chain-resolution errors (extends-cycle, extends-depth-exceeded, extends-parent-not-found) and aborts. Fallback when no candidate resolves: warn "brand-voice scripts unavailable; chain resolution skipped" and Read the YAML directly; the fallback skips voice.extends.
- Run the brand-aware prescan alongside the universal one: python3 "$SKILL_DIR"/scripts/prescan.py --brand <voice-doc> <file>. It emits hits for every mechanically detectable YAML rule (forbidden lexicon, rewrite-rule rejects, all-caps emphasis, pronoun violations, signposting, negative parallelism, rule-of-three headings, rhetorical questions, emoji) plus the 8 universal patterns. Brand hits carry source: "brand" and a rule_id.
- Cite per source. Name each hit by source: pattern numbers for universal (#14), brand rule_ids for brand ([no-hedging-imperative], [forbidden_lexicon:game-changing], [all_caps_emphasis]). Brand rules win on direct conflict: a voice that requires em-dashes overrides pattern #14.
- Validate after Edit with python3 "$SKILL_DIR"/scripts/validate.py --brand <voice-doc> [--baseline <pre-rewrite-hits.json>] <file>. Outcomes: clean → emit Coverage report and exit; residuals → surface them and auto-iterate up to 3 passes (see Iteration model); regression → revert the offending edit and re-rewrite.
- Pseudo-tables (```text or unspecified-language fences) are scanned the same as prose under -f; see *Preservation rules*. Real code (```python, ```bash, etc.) stays verbatim.
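The three-location lookup for extract_rules.py described in the workflow can be sketched as below. This is a minimal illustration, not the skill's own code: the function names candidate_paths and resolve_extract_rules are hypothetical, and the home directory is passed in explicitly so the lookup is easy to test.

```python
from pathlib import Path
from typing import Iterable, List, Optional


def candidate_paths(skill_dir: Path, home: Path) -> List[Path]:
    """Return the three lookup locations in the order the workflow lists them."""
    rel = Path("brand-voice") / "scripts" / "extract_rules.py"
    return [
        skill_dir.parent / rel,           # "$SKILL_DIR"/../brand-voice/...
        home / ".claude" / "skills" / rel,
        home / ".agents" / "skills" / rel,
    ]


def resolve_extract_rules(candidates: Iterable[Path]) -> Optional[Path]:
    """Return the first candidate that exists as a file, else None.

    A None result corresponds to the documented fallback: warn and read
    the YAML directly, without resolving the voice.extends chain.
    """
    for path in candidates:
        if path.is_file():
            return path
    return None
```

A caller would typically pass Path.home() as the home argument and treat a None result as the signal to emit the "brand-voice scripts unavailable; chain resolution skipped" warning.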
If the user wants brand-aware rewriting and no voice doc exists, defer: "No BRAND-VOICE.md at <path>. Run /brand-voice extract first."
Iteration model
Default: one pass. Under -f, auto-iterate detect → rewrite → validate up to 3 rounds (stop on clean or regression). --iterate <N> overrides — 1 disables the loop, 5 raises the cap. Each pass emits the residual coverage table.
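The bounded loop can be pictured with a short sketch. It assumes the rewrite and validation steps are supplied as plain callables and that validation returns one of the three status strings named in this document; run_passes is a hypothetical name, and the sketch only stops on regression where the real workflow also reverts the offending edit.

```python
from typing import Callable, List, Tuple


def run_passes(
    rewrite: Callable[[str], str],
    validate: Callable[[str], str],
    text: str,
    max_passes: int = 3,
) -> Tuple[str, List[str]]:
    """Rewrite then validate, for at most max_passes rounds.

    validate is assumed to return "clean", "residuals", or "regression".
    The loop stops early on "clean" or "regression".
    """
    statuses: List[str] = []
    for _ in range(max_passes):
        text = rewrite(text)
        status = validate(text)
        statuses.append(status)
        if status in ("clean", "regression"):
            break
    return text, statuses
```

Passing max_passes=1 mirrors --iterate 1 (no loop), and a larger value mirrors raising the cap.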
Input modes
Resolve $ARGUMENTS (after stripping any leading -f <voice-doc>) as follows:
| Input shape | Behavior |
|---|---|
| Empty | Propose the most recent prose target from session context (recent read/edit/draft) and confirm. Ask only when none is detectable. |
| Prose file path | Read the file. Audit, propose a diff, apply only on explicit approval via Edit. |
| Non-prose file path | Refuse: "Non-prose file — this skill targets prose documents, not structured data or source code." Tell the user to fix docstring or comment grammar manually. |
| Inline text (anything else) | Humanize in place and return the rewritten text in the chat. |
Prose extensions (treat as file): .md, .mdx, .txt, .rst, .tex, .html, .adoc.
Non-prose extensions (refuse as file): .json, .yaml, .yml, .toml, .csv, .tsv, .xml, and any source-code file (.py, .ts, .js, .rs, .go, .java, …). Rewriting data or code files would break parsing or semantics even when the rewrite looks harmless.
Classify the first token via Glob (stay in allowed-tools, no shell-out): existing file + prose extension → process; + non-prose extension → refuse per the table; + unknown extension (e.g., CHANGELOG, notes.log) → ask before guessing; nonexistent path → treat as inline text. The middle two branches prevent silent humanizing of data and source-code files.
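The classification rules above fit in a few lines. The sketch below is an illustration under stated assumptions: classify_input and the returned labels are hypothetical names, the source-code extension set contains only the examples this document lists (the real list is open-ended), and splitting on whitespace means paths containing spaces are not handled.

```python
from pathlib import Path

PROSE_EXTENSIONS = {".md", ".mdx", ".txt", ".rst", ".tex", ".html", ".adoc"}
# Data formats plus the source-code examples named in this document.
NON_PROSE_EXTENSIONS = {
    ".json", ".yaml", ".yml", ".toml", ".csv", ".tsv", ".xml",
    ".py", ".ts", ".js", ".rs", ".go", ".java",
}


def classify_input(arguments: str) -> str:
    """Return "empty", "prose", "non-prose", "unknown", or "inline"."""
    stripped = arguments.strip()
    if not stripped:
        return "empty"
    first_token = stripped.split()[0]
    path = Path(first_token)
    if not path.is_file():
        # Nonexistent path: treat the whole argument as inline text.
        return "inline"
    suffix = path.suffix.lower()
    if suffix in PROSE_EXTENSIONS:
        return "prose"
    if suffix in NON_PROSE_EXTENSIONS:
        return "non-prose"
    # Existing file with an unlisted or missing extension: ask first.
    return "unknown"
```

Only the "prose" branch proceeds to the audit; "non-prose" maps to the refusal row of the table and "unknown" to the ask-before-guessing branch.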
Process
1. Read fully: the whole text, not one paragraph at a time. Patterns compound across sentences (rule-of-three + synonym cycling + promotional tone often ride together).
2. Prescan mechanically: for file inputs, run "$SKILL_DIR"/scripts/prescan.py <file> (or pipe inline text via -). It emits a JSON hit-list for the 8 highest-signal universal patterns (#1, #4, #7, #8, #9, #14, #23, #28). Under -f, add --brand <voice-doc> so the same script also scans for every mechanically detectable brand rule (forbidden lexicon, rewrite-rule rejects, all-caps emphasis, pronoun violations, signposting, rhetorical questions, rule-of-three headings, emoji, negative parallelism). Subjective patterns (tone, rule-of-three in body prose, vague attributions) stay LLM-only.
3. Capture the baseline: when -f is set and the input is a file, save the prescan output to a temp path before any rewrite. The validation gate consumes it via --baseline to detect regressions.
4. Full detect pass: walk the 32 patterns in references/patterns.md AND every YAML rule from the brand doc. Do not anchor on the prescan output: the catalogue walk catches what regex cannot, and under -f the catalogue is both the universal list and the brand rules.
5. Draft rewrite: replace flagged phrasing with direct, specific alternatives. Keep sentence-level meaning intact. See Preservation rules below for what stays verbatim and what may still be adjusted.
6. Self-audit:
   - Default mode: ask "What still reads as obviously AI-generated?" List remaining tells in 2–4 bullets. Revise.
   - Under -f: walk every forbidden_pattern, every forbidden_lexicon entry, every pronouns.forbid rule, and every rewrite_rules[*].reject from the loaded voice doc. Emit the Coverage report (see Output format). Missing rows are a hard failure, not a stylistic choice.
7. Validate: for file inputs under -f, run "$SKILL_DIR"/scripts/validate.py --brand <voice-doc> --baseline <prescan.json> <file> after Edit applies. On residuals, iterate (up to the cap from Iteration model); on regression, revert and re-rewrite the affected passage; on clean, exit.
8. Report: present the final rewrite plus the Coverage report (count-only by default, rule-by-rule under -f). For file inputs, propose the diff and wait for approval before Edit.
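This document does not spell out how validate.py separates its three outcomes, so the sketch below is only one plausible reading: no hits left is clean, any rule firing more often than in the baseline is a regression, and everything else is residuals. The function name classify_outcome and the assumption that each hit is a dict with a rule_id key are both hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def classify_outcome(baseline: Iterable[Dict], current: Iterable[Dict]) -> str:
    """Compare hit lists captured before and after a rewrite.

    Assumption: every hit is a dict carrying a "rule_id" key.
    """
    before = Counter(hit["rule_id"] for hit in baseline)
    after = Counter(hit["rule_id"] for hit in current)
    if not after:
        return "clean"
    # A rule that fires more often than before means the rewrite added tells.
    if any(count > before[rule] for rule, count in after.items()):
        return "regression"
    return "residuals"
```

Under this reading, capturing the baseline before any rewrite is what makes regressions detectable at all: without it, a newly introduced hit is indistinguishable from a leftover one.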
Quick reference — the 10 highest-signal tells
Roughly 90% of real AI slop comes from this subset. The 8 mechanical patterns (#1, #4, #7, #8, #9, #14, #23, #28) are pre-flagged by prescan.py; #3 and #10 stay LLM-only — too context-dependent for regex. Full catalog with before/after examples is in references/patterns.md — consult it when a hit needs context or you are unsure whether to flag.
| # | Pattern | Instead |
|---|---|---|
| 1 | Significance inflation — "pivotal moment", "testament to", "evolving landscape" | State the fact directly. |
| 3 | Superficial -ing — "…reflecting broader trends", "…underscoring the importance" | End the sentence; drop the participial coda. |
| 4 | Promotional — "nestled", "breathtaking", "vibrant", "stunning" | Neutral description with a concrete detail. |
| 7 | AI vocabulary — delve, tapestry, intricate, pivotal, testament, underscore, crucial, garner, showcase, vibrant, interplay, align with, additionally, moreover, furthermore, indeed | Plain-English equivalent or delete. |
| 8 | Copula avoidance — "serves as", "stands as", "features", "boasts" | Use is/are/has. |
| 9 | Negative parallelism — "It's not just X, it's Y" | Direct affirmative sentence. |
| 10 | Rule of three — three-item lists where two or four would be honest | Use the real count. |
| 14 | Em-dash overuse | Prefer commas or periods unless the dash does real work. |
| 23 | Filler phrases — "in order to", "it is important to note that", "at this point in time" | Delete or contract. |
| 28 | Signposting — "Let's dive in", "Here's what you need to know", "Without further ado" | Just say the thing. |
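To make "mechanical" concrete, here is a toy regex scan over three of the rows above. It is a sketch, not the real prescan.py: the word lists are small subsets taken from the table, and the hit fields (pattern, line, match, source) are a hypothetical shape rather than the schema in references/schemas.md.

```python
import re
from typing import Dict, List

# Illustrative subsets of rows 7, 23, and 28; not the full catalogue.
PATTERNS = {
    7: re.compile(r"\b(delve|tapestry|moreover|furthermore)\b", re.IGNORECASE),
    23: re.compile(r"\b(in order to|it is important to note that)\b", re.IGNORECASE),
    28: re.compile(r"(let's dive in|without further ado)", re.IGNORECASE),
}


def prescan(text: str) -> List[Dict]:
    """Return one hit per regex match, tagged with pattern number and line."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for pattern_id, regex in PATTERNS.items():
            for match in regex.finditer(line):
                hits.append({
                    "pattern": pattern_id,
                    "line": line_no,
                    "match": match.group(0),
                    "source": "universal",  # assumed counterpart to "brand"
                })
    return hits
```

Patterns #3 and #10 have no place in a table like PATTERNS: deciding whether a participial coda is superficial, or whether a list really has three members, needs the surrounding context.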
Preservation rules
The rewrite must NOT change:
- Real code: anything inside backticks, or fenced blocks with a language hint (e.g. ```python, ```bash, ```typescript, ```json, ```yaml, ```sql). The info-string after the opening fence is the discriminator.
- URLs and anchors: the (url) portion of [text](url), #anchor refs, image paths.
- Frontmatter: YAML/TOML blocks at file top.
- Quoted material: text inside "…" attributed to a person or source.
- Technical terms, proper nouns, product names: even when they match an "AI vocabulary" flag in other contexts (e.g., a product literally named "Tapestry" is not a pattern-7 hit).
- Structural markers: heading levels, list depth, table columns, HTML tag syntax (tag names and attribute names). Rewrite the prose inside the structure; do not restructure.
- Factual claims: if a sentence states a number, date, or attribution, preserve it verbatim even when the surrounding clause is rewritten.
Pseudo-tables: fenced blocks with no info-string, or with the info-string text (e.g. ``` alone, ```text), are treated as prose, not code. They typically hold label-prefixed columns (INSTRUMENTS Log drum…), terminal transcripts, or ASCII layouts. Their content is scanned and rewritten, but column alignment is preserved (auto-pad after a token-length change). To suppress pseudo-block scanning entirely (treat every fenced block as code), pass --strict-code-only to prescan.py and validate.py.
May be adjusted — link text inside […] is prose and can be rewritten when it carries AI tells (e.g., [delve into the transformative landscape] → [read more]). HTML attributes that contain prose (alt, title, aria-label) follow the same principle.
When in doubt, keep the original token and only adjust the connective tissue around it.
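The fence rule (a language hint means code, no hint or text means prose) can be sketched as a masking pass that blanks real code before scanning. This is an illustration only: mask_real_code is a hypothetical name, not the mask_protected_regions helper in scripts/utils.py, and inline backtick spans, URLs, and frontmatter are left out for brevity.

```python
FENCE = "`" * 3  # built at runtime so the example holds no literal fence


def mask_real_code(text: str, strict_code_only: bool = False) -> str:
    """Blank the lines inside fenced blocks that count as real code.

    A block counts as real code when its info-string is non-empty and is
    not "text"; with strict_code_only=True every fenced block is blanked.
    The line count is preserved so hit line numbers still map to the source.
    """
    out = []
    in_block = False
    mask_block = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith(FENCE):
            if not in_block:
                info = stripped[len(FENCE):].strip().lower()
                mask_block = strict_code_only or info not in ("", "text")
                in_block = True
            else:
                in_block = False
                mask_block = False
            out.append(line)  # fence lines themselves are kept
        elif in_block and mask_block:
            out.append("")
        else:
            out.append(line)
    return "\n".join(out)
```

Blanking rather than deleting lines is a deliberate choice in this sketch: a scanner run on the masked text reports line numbers that still match the original file.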
Output format
Default mode emits ## Rewrite + ## Patterns removed. Under -f, the same shell gains a ## Coverage report table in which every YAML rule appears as a row, even with 0 hits, since missing rows are a hard failure. For file paths, add a ## Diff preview block and (under -f) a ## Validation report with status: clean | residuals | regression, then Apply? (yes/no). Apply only on explicit yes from the end user, even when another skill invokes /humanize-en on their behalf.
Full templates with the canonical column order and the approval contract: references/output-formats.md.
Rules
Everything not listed below is already enforced by Process and Preservation rules above.
- Never inject first-person voice, opinions, or colloquial hedges into neutral registers (docs, specs, formal READMEs, release notes). The source voice wins (default mode); under -f, the brand voice wins; only the AI tells and brand violations go.
- Never drop a sentence entirely unless it is a pure chatbot artifact (e.g., "I hope this helps!", "Let me know if you'd like me to expand on any section"). Every other sentence gets rewritten, not deleted.
- Iteration is bounded: default 1 pass; under -f, auto-iterate up to 3 (detect → rewrite → validate). Use --iterate <N> to override.
- Match the source register: a commit message stays terse, a release note stays bulleted, a README paragraph stays prose.
- Coverage report is the contract under -f: every YAML rule has a row, even with 0 hits. Skipping a rule from the report is a hard failure: the audit either ran or it didn't.
When to defer to another skill
- Structural problems (wrong headings, missing TOC, collapse patterns) → /write-clear-readme.
- Define, update, or inspect a brand voice doc → /brand-voice extract|update|diff|show. This skill consumes the voice doc via -f; /brand-voice produces it.
- The text is in a non-English language → stop and tell the user; this skill is English-only by design.
Reference
- references/patterns.md: full 32-pattern catalogue with before/after examples. Load when a hit needs context or a reviewer asks why a phrase was flagged.
- references/voice.md: optional voice calibration for opinion pieces or personal writing. Load only when the user explicitly asks for voice, personality, or a sample-matching pass.
- references/output-formats.md: canonical templates for the Rewrite, Coverage report, Diff preview, and Validation report blocks. Load before emitting the final response when the shape needs to be reproduced exactly.
- references/schemas.md: JSON shapes for prescan hits (universal + brand), eval samples, eval results, and validate.py output. Consult when editing any script that produces structured output.
- scripts/prescan.py: regex-based pre-scan emitting a JSON hit-list. Without flags: 8 universal patterns. With --brand <voice-doc>: also emits brand hits with source: "brand" and per-rule rule_ids. With --strict-code-only: blanks every fenced block (legacy behaviour). Python 3.7+, no third-party deps.
- scripts/brand_prescan.py: brand-aware detectors loaded by prescan.py --brand. Self-contained YAML parser so humanize-en works when brand-voice is not installed. Reads lexical_exceptions.{acronyms,compound_idioms} (top-level frontmatter key, see brand-voice's canonical-format reference) to extend the hardcoded whitelists.
- scripts/validate.py: post-rewrite gate. Re-runs prescan + brand checks on the rewritten file, classifies the outcome (clean / residuals / regression), and writes a JSON report. Exit 0 on clean + residuals, 1 on regression, 2 on argument or I/O errors. Required step under -f (Process step 7).
- scripts/utils.py: shared I/O helpers (read_text, read_json, write_json, mask_protected_regions, seeded_rng). mask_protected_regions delegates to prescan.py so the two stay byte-identical.
- scripts/eval_patterns.py: runs prescan over the eval corpus (eval-corpus/samples/*.json for universal, eval-corpus/brand-voice/*.json with --brand for brand). Scores per-sample pass/fail, emits a JSON report per references/schemas.md § eval result. Exit 0 on full pass, 1 on any failure. Run before editing detection patterns to baseline current coverage, then re-run to confirm no regression.
Gotchas
- **-f with missing extract_rules.py silently degrades to universal patterns.** The 3-tier path fallback ("$SKILL_DIR"/../brand-voice/scripts/extract_rules.py, ~/.claude/skills/brand-voice/scripts/extract_rules.py, ~/.agents/skills/brand-voice/scripts/extract_rules.py) succeeds on standard installs, but a non-standard layout (e.g., custom plugin dir) drops brand rules with a warning the model may overlook. Fix: pre-check the path before relying on brand-rule injection; the warning text is "brand-voice scripts unavailable; chain resolution skipped".
- **voice.extends chain not resolved before prescan when scripts degrade.** When the fallback fires (above), brand_prescan.py reads the YAML directly and applies child rules only; parent constraints in the chain are skipped silently. Fix: when --brand is critical and scripts are unavailable, flatten the chain in the source file before invoking, or install brand-voice scripts at one of the canonical paths.
- **Pseudo-tables (```text or no language tag) are rewritten as prose.** Fenced blocks tagged text, or with no language hint at all, are treated as prose-eligible content and may be rewritten; users expecting verbatim preservation get a surprise. Fix: tag every code/data block with a real language hint (```python, ```json), since ```text does not protect a block; or pass --strict-code-only to force preservation of all fenced blocks.
- **--brand chain-resolution errors don't always surface in the user-facing output.** extract_rules.py exits non-zero on extends-cycle, extends-depth-exceeded, extends-parent-not-found; the orchestrator sometimes drops the stderr in favor of running with degraded rules. Fix: when --brand is set, always check scripts/extract_rules.py's exit code before continuing; non-zero means brand rules will be incomplete.
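One way to make the exit-code check hard to skip is to wrap the resolver call so a non-zero exit raises instead of being ignored. The sketch below assumes the caller has already built the argument list; run_extract_rules is a hypothetical helper name, and only the standard-library subprocess.run call is relied on.

```python
import subprocess
from typing import List


def run_extract_rules(argv: List[str]) -> str:
    """Run the resolver command and return its stdout.

    Raises RuntimeError on a non-zero exit so chain-resolution errors
    (reported on stderr) cannot be dropped silently.
    """
    result = subprocess.run(argv, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"extract_rules failed ({result.returncode}): {result.stderr.strip()}"
        )
    return result.stdout
```

A call would pass something like ["python3", "<extract_rules.py>", "--full", "<voice-doc>"] with the placeholders filled in; capture_output needs Python 3.7 or later, which matches the version the scripts already require.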