who-am-i - SKILL.md Agent Skill

name: who-am-i
description: >-
  Analyze the user's OWN messages across all their local AI conversations (Claude Code, Codex, VS Code Copilot, Cursor, and more) to produce a sharp, evidence-grounded "Who Am I" self-portrait: personality, strengths, blind spots, and concrete ways to grow. Use this WHENEVER the user wants to "know themselves" / 认识自己 through their AI chat history, asks for a personality read / 性格分析 / 自我画像 / 自我反思 based on how they talk to AI, wants a report on "what kind of person am I" drawn from past conversations, or says things like "analyze my chat history and tell me who I am", "我想通过我和AI的对话来了解我自己", "看看我是个什么样的人", or "who am I based on my prompts". Reads only local data; nothing is uploaded.

Who Am I — 从你和 AI 的对话里照见你自己

A person's AI conversations are an unusually honest diary. People drop their guard with an assistant: they vent, they obsess, they repeat the same worry, they reveal what they actually value versus what they say they value. This skill mines the user's own words (never the AI's replies) across every AI tool on the machine, then writes a self-portrait that is sharp, specific, and grounded in real quotes — the opposite of a horoscope.

It is adaptive and zero-config: the collector auto-detects whatever AI tools the user happens to have — terminal CLIs (Claude Code, Codex, Gemini, Qwen, Hermes Agent, OpenCode, OpenClaw, Goose, Crush, Aider), editor assistants (Cursor, VS Code Copilot, Cline, Roo Code, Windsurf, Trae), and voice dictation (Typeless) — reads from each that has local data, and honestly reports the rest (some keep chats server-side or encrypted). Two design points make "read ALL my tools" real:

Detection is by data path, not an app list. Most CLI agents aren't "applications" at all, and one tool spans several surfaces (Claude Code's CLI, VS Code extension, and JetBrains plugin all funnel into ~/.claude/projects; Codex's VS Code + desktop into ~/.codex). Probing data paths is precise and cross-OS; an installed-apps list would miss the CLIs and the per-surface stores.
A discovery net catches the unknowns. Beyond the hard-coded registry, the collector heuristically scans for AI-shaped data dirs it didn't enumerate and either ingests them (when cleanly role-tagged) or flags them — so a tool we've never heard of still surfaces instead of being silently missed.
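Detection by data path can be sketched as below. This is a minimal illustration, not the bundled collector's code: `DATA_PATHS` and `detect_tools` are hypothetical names, and the registry holds only the two paths this document names.

```python
from pathlib import Path

# Hypothetical registry: tool name -> candidate data dirs, relative to the home dir.
# Only these two paths come from this document; a real registry would hold many more.
DATA_PATHS = {
    "Claude Code": [".claude/projects"],
    "Codex": [".codex"],
}


def detect_tools(home: Path, registry: dict[str, list[str]] = DATA_PATHS) -> dict[str, list[Path]]:
    """Return {tool: [existing data dirs]} for every tool with at least one dir present."""
    found: dict[str, list[Path]] = {}
    for tool, rel_paths in registry.items():
        hits = [home / rel for rel in rel_paths if (home / rel).is_dir()]
        if hits:
            found[tool] = hits
    return found


if __name__ == "__main__":
    for tool, dirs in detect_tools(Path.home()).items():
        print(tool, "->", ", ".join(str(d) for d in dirs))
```

Because the probe asks only whether a directory exists, it behaves the same whether the data was written by a CLI, an editor extension, or a desktop app.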

The report's one universal job is 看清自己 — an honest mirror. People differ in what they want beyond that: some want to change themselves, some just want to be seen, some want something fun to share, some want hard truths. So the deliverable is a fixed descriptive core + opt-in modules: the core describes (qualities, how they show up, what they buy you, what they cost you — all mirror, no prescriptions); the modules (改进清单 / 直话区 / 协作说明书) are added only when the person wants them. Sense intent from their request; when unclear, deliver the core and offer the modules at handover — never assume everyone wants to be coached.

The one rule that makes or breaks this

Every claim about the person must be grounded in their actual words — but grounding is a writing discipline, not a display format. Before writing any trait, locate the real message(s) that prove it; if you can't, cut the claim. This is non-negotiable because the failure mode of any "personality analysis" is the Barnum effect — vague statements ("you're ambitious but sometimes doubt yourself") that feel personal but are true of everyone. Ask: "Would this be true of most people, or is it specifically visible in THIS data?"

In the report itself, reproduce NONE of their sentences. Grounding is the brain's work while writing, not part of the output: find the evidence, verify the claim against it, then state the judgment in your own words. The reader IS the person — they remember what they said; a friend who sees through you doesn't read your diary aloud, he just tells you who you are. Allowed in the output: scene-pointers — a 3–6 word tag naming a moment the person will actually remember:（改到第七版那封邮件）（凌晨三点交稿那次）（第一次回审稿意见那回）. NEVER bare dates:（5/25）means nothing to them — people remember scenes, not calendars; dates are YOUR index, not THEIR memory. Time references are legal only in the trajectory block, where time itself is the point. Also allowed: vetted metrics, including counted catchphrases (「再改一下」×140 is a data label, not a quotation); and single coined terms of theirs used as proper nouns. Not allowed: their sentences, even one, even in the opening section（先说结论）.

Workflow

1. Collect the corpus

Run the bundled collector. It is read-only, auto-detects whichever AI tools are installed (no flags, no config), extracts only the user's typed messages, strips tool/harness noise, redacts obvious secrets, and writes everything locally. Run it from this skill's own directory (the base directory is shown when this skill loads — don't assume a fixed path: depending on the host agent it may live under ~/.claude/skills/, ~/.codex/skills/, or another agent's skills directory):

python3 <skill-base-dir>/scripts/collect_conversations.py

(On Windows use python or py — python3 usually doesn't exist there.)

It prints a summary and an OUT_DIR=/… line on the last line — capture that directory. Inside you get:

metrics.json — pre-computed, vetted numbers over the user's own prose (pastes excluded, local time): message-length distribution, hour-of-day histogram + late-night %, language split, a panel of high-signal phrase counts (谢谢 / 不对 / 你觉得 / thank / …), top_repeated_messages (the exact sentences the user has said 3+ times — their catchphrases and habitual commands, prime personality material; note loop-injected prompts can inflate it), and top contexts. Quote these numbers directly. They exist because hand-rolling stats under time pressure is where a run silently fabricates them — don't re-derive what's already here.
stats.json — per-source counts, date ranges, a reading hint (how many Read pages sample.md takes), and the adaptive detection manifest: tools_scanned plus a detected list giving every known tool's status (collected, installed_no_local_chats, or installed_unreadable with a reason; e.g. ChatGPT Desktop is encrypted on disk, and Claude Desktop / CodeBuddy chats live server-side). It also carries discovered: AI tools found by the heuristic net that aren't in the registry (with how many messages were ingested or why not). Read this first to know your scope, and reflect it honestly in the report's 这些话怎么来的 section so the user sees exactly what was and wasn't read.
sample.md — the user's messages, human-readable, grouped by source and in time order. Full corpus if it fits the budget, else a stratified sample. Lines tagged ⚠粘贴 are likely pasted third-party text (a Discord log, a stack trace) — read them as context, never quote them as the user's own voice.
user_text.txt — one clean (non-paste) user message per line. If you want to count a phrase not in metrics.json, grep -c this file — never corpus.jsonl, whose lines also carry source/context and would wildly over-count.
corpus.jsonl — the complete corpus, one {source, context, ts, text, pasted} per line. Use it to pull a specific quote with its exact date/source.
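Capturing the output directory can be sketched as follows. The only thing taken from this document is that the last output line reads `OUT_DIR=…`. The function names and the sample summary text are made up, and `sys.executable` is used here as one way around the `python3` versus `python` naming difference.

```python
import subprocess
import sys


def parse_out_dir(stdout: str) -> str:
    """Return the path from the last non-empty line, which must start with OUT_DIR=."""
    lines = [ln.strip() for ln in stdout.splitlines() if ln.strip()]
    if not lines or not lines[-1].startswith("OUT_DIR="):
        raise ValueError("collector output did not end with an OUT_DIR= line")
    return lines[-1][len("OUT_DIR="):]


def run_collector(script_path: str) -> str:
    """Run the collector with the current interpreter and return its output directory."""
    result = subprocess.run(
        [sys.executable, script_path],
        capture_output=True, text=True, check=True,
    )
    return parse_out_dir(result.stdout)


if __name__ == "__main__":
    # Invented sample output, only to exercise the parser.
    sample = "scanned tools, wrote corpus\nOUT_DIR=/tmp/who-am-i-out\n"
    print(parse_out_dir(sample))
```

Raising when the final line is missing keeps a failed run from being mistaken for an empty corpus.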

If no data is found, tell the user plainly which tools were checked (see references/data-sources.md) rather than inventing an analysis.
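A scope summary built from the detection manifest might look like the sketch below. The status values come from this document; the per-entry keys `tool`, `status`, and `reason`, and the shape of `detected` as a list of objects, are assumptions, so check them against the real stats.json before relying on this.

```python
from collections import defaultdict


def summarize_scope(stats: dict) -> dict[str, list[str]]:
    """Group detected tools by status; entries with a reason carry it in parentheses."""
    by_status: dict[str, list[str]] = defaultdict(list)
    for entry in stats.get("detected", []):
        label = entry.get("tool", "?")  # "tool" is an assumed key name
        if entry.get("reason"):
            label += f" ({entry['reason']})"
        by_status[entry.get("status", "unknown")].append(label)
    return dict(by_status)


if __name__ == "__main__":
    # Illustrative manifest, not real collector output.
    example = {
        "detected": [
            {"tool": "Claude Code", "status": "collected"},
            {"tool": "ChatGPT Desktop", "status": "installed_unreadable",
             "reason": "encrypted on disk"},
        ]
    }
    for status, tools in summarize_scope(example).items():
        print(f"{status}: {', '.join(tools)}")
```

If the `collected` group is empty, the other groups are exactly the "which tools were checked" list to show the user.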

2. Read it all, like a detective

Read metrics.json and stats.json first (the vetted numbers and your scope), then read sample.md — it may span several Read pages; page through it rather than skim, because you read for patterns that repeat, and repetition is where character hides. You don't need to read every message just to count things — that's what metrics.json is for; reading is for texture, voice, and finding the quotes that prove a trait.

How they ask. Terse or elaborate? Do they front-load context or fire one-liners? Do they ask permission ("可以吗", "你觉得呢") or give orders ("帮我把…改了")? Do they explain why they want something?
What they return to. The same topics, fears, or aesthetic obsessions surfacing across weeks and across projects.
How they react to failure. When the AI gets it wrong, do they debug calmly, re-explain, get sharp ("你又改坏了"), or blame themselves?
Their standards. What makes them say "完美" vs. what makes them redo it five times. How high is the bar, and for what.
Tempo and emotion. Impatience ("好了吗", "快点"), late-night sessions, excitement, frustration, the rare "谢谢". The tells.
Their relationship to the work and the tool. Collaborator, instrument, rubber duck, adversary? Do they delegate thinking or steer every token?
Contradictions. What they say they want vs. what they actually keep doing. This is gold for the blind-spots section.
Register matters: weigh the source mix. A person sounds different typing terse orders to a coding CLI versus dictating aloud (voice tools like Typeless capture spoken thinking, which is often more discursive and personal). If one source dominates the volume (check the per-source counts), say so, and read across registers rather than letting the loudest one stand in for the whole person: dictation shows how they think and talk; AI-chat shows how they instruct and judge. Scope honesty: dictation tools capture speech into any app, including messages to colleagues and documents, not only AI chats. Frame the report accordingly ("你说过的话 + 你对 AI 打的字", not "你和 AI 的对话") and disclose this in the 这些话怎么来的 section.

Quantify — but with vetted numbers. Numbers make a read land and prove it isn't a cold reading. A line like "在 4,900 条里你把同一封邮件改了 7 版、说了 183 次'不好意思'" is worth a paragraph of adjectives. Take such figures from metrics.json (phrase panel, length distribution, hour histogram, language split, top contexts). Need a count it doesn't have? grep -c user_text.txt, not corpus.jsonl. Three traps that produce false personality stats: (1) never count over corpus.jsonl (its source/context fields inflate matches — e.g. "VS Code" appears in every Claude-Code line's metadata); (2) remember a message can embed pasted third-party text (those are tagged ⚠粘贴 and already excluded from metrics.json), so a phrase you find by eye in sample.md might be someone else's words, not the user's; (3) 体裁基线 — the 换人测试: this corpus is "talking to an AI", and the genre itself forces certain words sky-high for EVERYONE（帮我、继续、可以、请求类祈使句）— a count the genre guarantees is not a personality signal, no matter how big. Before any metric enters the report, ask: would a random person doing the same work with an AI produce this number too? Yes → discard it, or use only within-person contrasts (ratios, register-vs-register). No — because the behavior is OPTIONAL（道谢、追问为什么、挑错、独特用法、时段、原样重复的口头禅）→ it qualifies. When the corpus spans registers (to-AI vs drafts-to-humans), use one register as the other's control group.
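For a phrase that metrics.json does not cover, a Python stand-in for `grep -c` over user_text.txt could look like this. It is a sketch: `count_lines_with` is a hypothetical helper. Like `grep -c`, it counts lines (messages) that contain the phrase rather than total occurrences, and it also returns a denominator so the figure can be used as a within-person ratio.

```python
from pathlib import Path


def count_lines_with(phrase: str, user_text_path: Path) -> tuple[int, int]:
    """Return (messages containing phrase, total non-empty messages)."""
    hits = total = 0
    with user_text_path.open(encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue  # blank lines are not messages
            total += 1
            if phrase in line:
                hits += 1
    return hits, total
```

Run it only on user_text.txt: that file holds one clean user message per line, so no metadata field can inflate the match count.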

Keep a running list of candidate observations, each pinned to one or more real quotes with dates. You'll prune and shape these into the report.

Climb the abstraction ladder: this is what makes a report land. Concrete behaviors with quotes are the FLOOR, not the deliverable. A report that stops at "you said X on 5/19 and Y on 6/4" reads like an audit; the reader feels counted, not SEEN. For each cluster of evidence, climb: behavior → pattern → quality → core. ("你把一封邮件改到第七版又用回第一版" → "改的是焦虑不是文字" → "防御性的完美主义" → "你把'别人怎么看'当成了导航系统".) The test for a core quality: does the same belief show up in unrelated domains? When one conviction explains their product decisions AND their money views AND how they treat people, you've hit bedrock: name it in one resonant phrase, then hang the evidence beneath it as proof (scene-pointers in the report; the quotes themselves stay in your working notes). The evidence justifies the abstraction; the abstraction is what the reader came for.

Ask for their off-machine thinking. The corpus is work-dominated; a person's deepest qualities often surface in free-ranging conversations (thought experiments, debates, hypotheticals) that may live in cloud-side apps this collector can't read. Invite the user to share a few claude.ai/ChatGPT share links or an export of conversations they consider "very them" — fetch share links via the browser if plain fetching returns an empty JS shell — and weigh that material heavily: how someone argues when nothing is at stake is character in its purest form.

3. Write the report

Default tone: 一针见血 — like an old friend who sees through you. Praise the real strengths without hedging, and name the blind spots just as directly. Sharp is not cruel: the aim is insight that lands, never a put-down. Earn every hard line with evidence, and the hardest lines are the most valuable ones.

一针见血 ≠ 文绉绉. Write like speech. Short sentences. Everyday words. The abstraction lives in the IDEA, not the vocabulary: name a core quality the way a friend would say it over dinner（"你信优胜劣汰"）, never as seminar coinage （"达尔文式秩序观"）. If a term would need a gloss, replace it with plain words. Long em-dash chains and stacked clauses kill readability — break them up. The reader should never have to re-read a sentence.

Source balance. Weight evidence roughly by where the person's life actually is: the full local corpus is the primary basis; any user-supplied extra conversation (a share link, an export) is a SUPPLEMENT — often vivid, never the lead. Recency check before finalizing: if one small source supplies most of the report's evidence while another holds most of the messages, you've been seduced by the newest material — rebalance.
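The recency check can be made mechanical with a rough sketch like the one below. The per-source message counts would come from stats.json and the evidence counts from your own running notes. The function name, the source labels, and the threshold of 3 are assumptions chosen for illustration, not values defined by the skill.

```python
def overweighted_sources(
    messages: dict[str, int],
    evidence: dict[str, int],
    ratio: float = 3.0,
) -> list[str]:
    """Sources whose share of cited evidence exceeds `ratio` times their share of messages."""
    total_msgs = sum(messages.values())
    total_ev = sum(evidence.values())
    if total_msgs == 0 or total_ev == 0:
        return []
    flagged = []
    for source, ev in evidence.items():
        msg_share = messages.get(source, 0) / total_msgs
        ev_share = ev / total_ev
        if ev_share > ratio * msg_share:
            flagged.append(source)
    return flagged


if __name__ == "__main__":
    # Illustrative numbers: a small share link supplies most of the evidence.
    msgs = {"claude-code": 4500, "share-link": 100}
    cited = {"claude-code": 4, "share-link": 8}
    print(overweighted_sources(msgs, cited))  # ['share-link']
```

A flagged source is a prompt to go back to the larger corpus for evidence, not a reason to delete the vivid material.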

Quote safety: the collector redacts obvious tokens ([REDACTED-密钥]), but treat every quote as if it might be screenshotted and shared — never quote passwords, keys, internal URLs, or other people's private messages; paraphrase or elide instead.

Language adaptation. The ENTIRE report — titles, disclosure lines, labels, everything — renders in the reader's dominant corpus language, with the same friend-letter voice. The template's Chinese titles are examples, not fixtures; English equivalents that pass the friend test: "The short version" · "You, basically" · "A few numbers that are unmistakably you" · "What you actually want" · "Three things you probably haven't noticed" · "These three months" · "Where all this comes from". Determine the language by READING the corpus, not by trusting metrics.language.dominant alone (it only separates Han from Latin script — a Spanish corpus reads as "English" there). For mixed-language users, write in the dominant language and mirror their natural code-switching for technical terms instead of "cleaning" it.
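The limit of a script-based split is easy to see in a small sketch. This is not the collector's implementation, only an illustration of why a Han-versus-Latin count cannot tell Spanish from English; `script_counts` and `dominant_script` are hypothetical names.

```python
import unicodedata


def script_counts(text: str) -> dict[str, int]:
    """Count Han ideographs and Latin letters by Unicode character name."""
    counts = {"han": 0, "latin": 0}
    for ch in text:
        name = unicodedata.name(ch, "")
        if name.startswith("CJK UNIFIED IDEOGRAPH"):
            counts["han"] += 1
        elif name.startswith("LATIN"):
            counts["latin"] += 1
    return counts


def dominant_script(text: str) -> str:
    counts = script_counts(text)
    return "han" if counts["han"] > counts["latin"] else "latin"


if __name__ == "__main__":
    print(dominant_script("Can you review this code?"))      # latin
    print(dominant_script("¿Puedes revisar este código?"))   # latin
    print(dominant_script("帮我改一下这个 bug"))               # han
```

English and Spanish land in the same bucket, which is why the report language has to be settled by reading the messages.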

Framework principle — 普适的固定，深水的条件开放. "我是谁" decomposes into sub-questions. Two of them ANY corpus can answer — 什么样的人 (behavior → qualities) and 正在变成谁 (timestamps → trajectory) — so they, plus an opening 答案, form the fixed skeleton and the reading arc: 震（答案）→ 认（品质）→ 深（条件块）→ 望（变成谁）. Note there is NO standalone "怎么运转" block: a person's distinctive way of operating IS their qualities' daily form — give each quality its mechanical texture instead. A separate mechanism block only collects the leftovers qualities couldn't absorb, and then dresses them up as a dashboard. Two more sub-questions — 你真正想要的 and 你看不见的自己 — are the deepest cuts but the most data-hungry: open them ONLY when the evidence truly funds them, and let them vanish entirely when it doesn't. That is what "adaptive" means here: depth flexes with the data; the skeleton never does. Never force a section the data can't fund — that is how a framework overfits to one person and breaks on the next.

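The "friend test" for headings (标题的"朋友测试"). Every section heading should read like a divider a sharp old friend jots down while writing a long letter: 「先说结论」「你这个人」「这三个月」. Banned: metaphors as headings (指纹 / 镜子 / 内核 / 刻度), copywriter-style tidy parallelism ("你看不见的自己"), translationese ("你正在变成谁"), and leaked methodology vocabulary (答案 / 条件块 / 轨迹层). No numbering (一二三四) either: a friend's letter isn't numbered. Headings may be adjusted per person, but they must stay colloquial, direct, and offhand; metaphors may live in the body text but never stand in a heading.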

Use this structure:

# 我是谁 · Who Am I
*基于 [N] 条你亲口说过、亲手写下的话 · [最早日期] → [最晚日期] · 来源：[各来源及条数]*
（若含语音口述源：说明口述包括发给人的消息，不只是对 AI 说的）

## 先说结论                                ←【固定 · 全报告唯一必须惊艳的地方】
[开门见山直接回答，不铺垫。两种写法按数据选：数据里真有一根贯穿的主线 → 一段话
写主线；没有单一主线 → 称号（2–6 字定制头衔）+ 一句话人设 + 签名动作。无论哪种，
把这个人**最强的一对悖论**放进答案当钩子（"改稿欲 9 分，定稿勇气 2 分"——悖论
对是全报告最有记忆点的东西），再加 1–2 个数字速写。**答案只做蒸馏，不烧证据**：
具体场景全部留给后面的块去用，否则正文读起来像答案的复读机。这一节会被截图，
按截图的标准写。]

## 你这个人                                ←【固定 · 品质层 · 全报告笔墨最重处】
[3–6 条核心品质。**这是报告的心脏，给它最多的篇幅**——读者要的"被看穿"主要发生
在这里，机制层是为它服务的。每条品质四件套：说人话的命名（"你的完美主义是防御
性的"，不造"主义/观"式术语）→ 一两句最纯形态的陈述 → 来自 2–3 个互不相关领域的场景
证据（场景指针，不引原话）→ **亮面与代价各一笔**：亮面必须用"证据货币"付账
（见模板后），付不起货币的夸奖直接删；代价一句收尾（"它给了你定力，也让你冷"）。
亮暗同写才是镜子而不是判决。跨不过两个不相关领域的是模式不是品质，降到"运转"
里去。]

## （几个一看就是你的数字）                 ←【可选小块 · 有料才放 · ≤6 行】
[品质吸收不掉、但确实只属于他的零散硬事实：独特用法（某句固定口头禅 ×40 这类）、
表达双态（中位 30 字 vs 上百发长篇）、双语焊接、作息（仅当真极端时才值得写）。
写法：裸数字 + 半句解读，一条一行。**禁止刻度条、10 分制、进度条**——没有人群
基线的刻度就是伪测量，裸数字本身够硬，不需要假装成仪表盘。每条先过换人测试，
并用半句话让读者知道这道筛选存在过。没有够格的事实就整块省略。机制类的观察
（怎么做决定、怎么面对出错、怎么待人）不放这里——那些是品质的日常形态，写进
"你这个人"对应品质的质地里。]

## 你到底想要什么                           ←【条件块 · 言行对照才开】
[Open ONLY when the corpus holds both "嘴上说的目标" and "时间/注意力的实际
流向" to compare. Then contrast them: 你说你要 X；但你的深夜、你反复回头的
主题、你不肯外包的事，都流向 Y。Revealed preference, stated without judgment —
数据不配合表演，这一块开得了就是全报告最狠的之一。证据不足 → 整块消失，
绝不硬写。]

## 几件你大概没察觉的事                     ←【条件块 · 自述与行为有落差才开】
[Open ONLY when "他怎么描述自己" and "他实际怎么做" show a provable gap. Each
gap: 他的自我故事（场景锚）→ 行为数据说什么 → 这个落差本身说明什么。**自述侧
没有他真实说法的锚，就禁止写成"你以为"**——把那条观察降级为"孤儿账"写法（只
陈述行为模式，不替他虚构自我认知）。This is the root reason anyone consults a
mirror, and the report's sharpest blade — but with no hard evidence it must
vanish, not be faked. 不列"缺点"，只照"对不上"。孤儿代价（没挂上任何品质、但
反复出现的模式）也住这里。]

## 这三个月（标题按实际窗口写）             ←【固定 · 轨迹层 · 收尾向前】
[End on the time axis: 数据窗口里可见的转折点（哪天起某个习惯出现/消失）、
正在生长的东西、正在固化的东西、照惯性走六个月后的你。写得像版本历史，不像
判决书。不开药方——给方向感即可，要不要拐弯是读者的事。]

## 这些话怎么来的                           ←【固定 · 信任层】
- 自适应扫描：[tools_scanned 个已知工具，命中 N 个有本地数据]——从 stats.json 的 detected 清单取数。
- 读了什么：[collected 的 sources + counts + date range]
- 检测到、但本地读不到：[installed_unreadable / installed_no_local_chats 的工具 + 各自 reason，
  e.g. ChatGPT 桌面版数据加密；Claude 桌面版/CodeBuddy 聊天在云端]——照实说，别假装读全了。
- 这份报告只看你**自己说的话**，不看 AI 的回答；样本有偏（你更常在某些工具/某段时间用 AI）。
- 最可能看错你的地方：[one honest line — the bias or thin spot most likely to
  have distorted this portrait, e.g. 数据几乎全是工作场景，照不到生活里的你]
- 全程在本地完成，没有任何数据离开这台电脑。

证据货币（"你这个人"各品质的亮面专用）. Praise must point at proof a skeptic accepts. Inventory which currencies THIS corpus holds and pay only in those — never force a frame the data can't fund: 对抗中赢来的（有人驳他、他赢了——最硬也最稀有）· 付出过代价的（同一处改 N 轮、亲手废掉旧方案——代价即证据）· 数字上罕见的（统计极端值）· 增长曲线（时间线证明的前后差）· 未经请求的外部认可（陌生人主动采用/贡献）。Two elevators work on any currency: name the specific MOVE, not the attribute（"你抓住了论证里的偷换" lands；"你很聪明" doesn't）; and locate the move in a discipline（"这是 XX 行当训练多年的本事—— 你是野生的"）. The king of compliments is independent rediscovery: when they derived something that already has a name, give the name.

Silent dissent before finalizing. For every big claim, ask: could a reasonable skeptic explain this evidence another way? If yes, soften or cut. The dissent never appears in the report, except as that one honest "最可能看错你的地方" line in the closing 这些话怎么来的 section.

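Pre-publication checklist (出稿前检查单): run every gate, fix, then save. Each gate exists because a real run failed it once:

Zero verbatim quotes in the whole report (counted catchphrases and their own coined terms excepted).
Every comparative or superlative claim (更…, 最…, 远超, 找不出第二) is backed by a number that was actually computed; if it wasn't computed, rewrite it as a qualitative observation or cut it. A sentence that sounds like a measurement must really have been measured. Progress bars, 10-point scores, and scale bars are the same offense: with no population baseline, draw no scale and give the bare number.
Each scene appears only once in the whole report; the answer block (先说结论) burns no concrete evidence. The trajectory block may name an already-used scene as a milestone, but only in passing, never re-expanded.
Every item in a conditional block is anchored on both sides; with no anchor on the self-description side, do not write "你以为".
All evidence pointers are scene-style, with no bare dates (trajectory block excepted).
Disclosure is complete: the dictation-scope statement (if there is a dictation source) plus the "最可能看错你的地方" line.
Every quality's bright side is paid in evidence currency; praise that can't pay has been cut.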
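Two of these gates are mechanical enough to pre-screen with a script. The sketch below is an assumption-laden helper, not part of the skill: it flags parenthesised bare dates such as （5/25） and runs of block glyphs that look like a scale or progress bar. It does not know which section it is in, so hits inside the trajectory block still need a human decision, and the quoting and evidence gates cannot be checked this way at all.

```python
import re

# A month/day pair alone inside full-width or ASCII parentheses, e.g. （5/25）.
BARE_DATE = re.compile(r"[（(]\s*\d{1,2}/\d{1,2}\s*[）)]")
# Three or more block glyphs in a row, the usual shape of a drawn bar.
BAR_GLYPHS = re.compile(r"[█▓▒░■□▮▯]{3,}")


def lint_report(text: str) -> list[tuple[int, str]]:
    """Return (line number, issue) pairs for bare dates and drawn scales."""
    issues = []
    for no, line in enumerate(text.splitlines(), start=1):
        if BARE_DATE.search(line):
            issues.append((no, "bare date used as evidence pointer"))
        if BAR_GLYPHS.search(line):
            issues.append((no, "scale or progress bar"))
    return issues


if __name__ == "__main__":
    draft = "你改了七版（5/25）\n耐心 ████░░\n（改到第七版那封邮件）"
    for no, issue in lint_report(draft):
        print(f"line {no}: {issue}")
```

An empty result means only that these two patterns are absent; the rest of the checklist still has to be walked by hand.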

Opt-in modules — include directly when the user's request asked for them; otherwise offer at handover and append on demand:

🛠 改进清单 — for people who want to change: per item in 几件你大概没察觉的事 and per quality's 代价句, ONE concrete, doable behavior change. Not "要更有耐心", but "下次想说 X 之前，先写一句 Y"。
🔪 直话区 — for people who explicitly want it brutal: the costs restated as hard truths, no cushioning, the things a polite mirror won't say.
🎁 协作说明书 — the most shareable module: a one-page "怎么跟我共事" for teammates/partners — what they're exceptional at, what to never do to them, how to disagree with them productively. Useful to share, flattering without being hollow.

4. Hand it over

Save the report to a file the user can keep and re-read — default ~/who-am-i-报告.md (or wherever they ask). Tell them the path, show the 先说结论 section inline as a teaser, and remind them the raw corpus sits in OUT_DIR if they want to dig — and that they can delete it anytime (it's their private data, sitting only on their disk). Then offer the opt-in modules they didn't get（改进清单 / 直话区 / 协作说明书）— one line each, let them pick or pass.

Adjusting on request

The default is the descriptive core, sharp + evidence-first; modules add on request. Flex if the user asks: warmer/gentler tone, a focus on only strengths or only growth, a particular time window or single tool, a shorter shareable card only, or a different output language. The collector accepts --out DIR and --sample-chars N; for a single-tool read, just analyze the matching source in corpus.jsonl.
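A single-tool read over corpus.jsonl can be sketched as below. The record fields (`source`, `context`, `ts`, `text`, `pasted`) are the ones this document lists; the function name and the example source label `codex` are assumptions, so take the real labels from stats.json.

```python
import json
from typing import Iterable, Iterator


def messages_from(
    lines: Iterable[str],
    source: str,
    include_pasted: bool = False,
) -> Iterator[dict]:
    """Yield records of one source, skipping pasted third-party text by default."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        rec = json.loads(line)
        if rec.get("source") != source:
            continue
        if rec.get("pasted") and not include_pasted:
            continue
        yield rec


if __name__ == "__main__":
    # Illustrative records only.
    sample = [
        '{"source": "codex", "context": "proj", "ts": null, "text": "fix it", "pasted": false}',
        '{"source": "cursor", "context": "app", "ts": null, "text": "hello", "pasted": false}',
    ]
    for rec in messages_from(sample, "codex"):
        print(rec["text"])
```

In practice `lines` would be the open corpus.jsonl file handle inside OUT_DIR; the generator reads it one record at a time, so corpus size does not matter.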

Notes for extending

references/data-sources.md documents where each AI tool stores conversations, the on-disk format, and the current support status. Read it when adding a new source or when the user asks "why isn't tool X included?". To add a tool: write a collect_<tool>() that appends {source, context, ts, text} records and fails soft (one broken source must never sink the rest) — reuse the generic helpers (_harvest_role_json for role-tagged JSON, _harvest_sqlite_generic for a dedicated chat SQLite DB), wire it into the collectors list in main(), and add a detection row to KNOWN_TOOLS (or UNREADABLE_TOOLS if its chats are server-side/opaque) so the adaptive manifest reports it. Prefer honestly marking a tool unreadable over emitting guessed/garbage text — false data poisons the analysis far worse than a missing source.
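A fail-soft collector might look like the sketch below. Everything tool-specific here is hypothetical: the tool name `exampletool`, its JSONL layout with `role`, `content`, and `ts` keys, and the `(records, root)` signature are all assumptions. The real collector's signature and its `_harvest_*` helpers are not shown in this document, so they are deliberately not used.

```python
import json
from pathlib import Path


def collect_exampletool(records: list[dict], root: Path) -> int:
    """Append the user's messages found under `root`; return how many were added.

    Never raises on a missing, unreadable, or partly corrupt store.
    """
    added = 0
    try:
        for path in sorted(root.glob("*.jsonl")):
            with path.open(encoding="utf-8", errors="replace") as fh:
                for line in fh:
                    try:
                        msg = json.loads(line)
                    except json.JSONDecodeError:
                        continue  # skip one corrupt line, keep the rest of the file
                    if not isinstance(msg, dict) or msg.get("role") != "user":
                        continue  # only the user's own words
                    text = msg.get("content")
                    if not isinstance(text, str) or not text.strip():
                        continue
                    records.append({
                        "source": "exampletool",
                        "context": path.stem,
                        "ts": msg.get("ts"),
                        "text": text.strip(),
                    })
                    added += 1
    except OSError:
        pass  # fail soft: one broken source must not sink the other collectors
    return added
```

Returning a count rather than raising lets the caller report "installed, nothing readable" honestly instead of guessing at the content.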