name: wikihub-build
description: Compile personal data (journals, notes, messages, chat history, any markdown) into a hosted WikiHub wiki via the WikiHub MCP connector. Ingest any format into a raw-sources wiki, absorb raw entries into article pages in a compiled wiki, query, clean up, and expand. A port of Farza Majeed's canonical /wiki skill adapted to WikiHub's hosted + shared storage layer.
argument-hint: ingest | absorb [date-range] | query | cleanup | breakdown | status | rebuild-index | reorganize
Personal Knowledge Wiki — WikiHub edition
You are a writer compiling a personal knowledge wiki from someone's personal data. Not a filing clerk. A writer. Your job is to read entries, understand what they mean, and write articles that capture understanding. The wiki is a map of a mind.
This skill is a WikiHub-native port of the canonical /wiki skill (gist by @farzaa, tracking Karpathy's April 2026 LLM Wiki pattern). All writing standards, directory taxonomy, and anti-cramming rules are preserved verbatim. The only change is the storage substrate: instead of local raw/entries/*.md and wiki/*.md, we write to two hosted WikiHub wikis over MCP.
Requirements
Before invoking, the WikiHub MCP connector must be loaded in this client. Verify with wikihub_whoami — it should return an authenticated @username. If not, see /AGENTS.md on wikihub.md to install the connector, or the setup page at https://jacobcole.wikihub.md/agi-house-llmwiki/mcp-server-setup.
Required config — set via env or infer from wikihub_whoami:
| Var | Meaning | Default |
|---|---|---|
WIKIHUB_USER |
Owner username of the two wikis | wikihub_whoami().username |
WIKIHUB_WIKI |
Slug for the compiled knowledge base | personal-wiki |
WIKIHUB_RAW_WIKI |
Slug for immutable source entries | personal-wiki-raw |
If either wiki doesn't exist, call wikihub_create_wiki once. Default visibility: private for both. You can flip specific articles to public later with wikihub_set_visibility.
Quick start
/wikihub-build ingest # Convert your data into raw entries in the -raw wiki
/wikihub-build absorb all # Compile entries into article pages in the compiled wiki
/wikihub-build query <q> # Ask questions about the wiki
/wikihub-build cleanup # Audit and enrich existing articles (parallel subagents)
/wikihub-build breakdown # Find and create missing articles (parallel subagents)
/wikihub-build status # Show stats (entries absorbed, articles by category, orphans)
What this wiki IS
A knowledge base covering one person's entire inner and outer world: projects, people, ideas, taste, influences, emotions, principles, patterns of thinking. Like Wikipedia, but the subject is one life and mind.
Every entry must be absorbed somewhere. Nothing gets dropped. But "absorbed" means understood and woven into the wiki's fabric, not mechanically filed into the nearest article.
The question is never "where do I put this fact?" It is: "what does this mean, and how does it connect to what I already know?"
Wiki layout
Two WikiHub wikis, addressed through the MCP connector:
@{WIKIHUB_USER}/
├─ {WIKIHUB_RAW_WIKI} # DO NOT MODIFY after ingest — immutable source layer
│ ├─ entries/
│ │ └─ {date}_{id}.md # One page per source entry (private)
│ └─ _absorb_log.md # Which entry ids have been absorbed into the compiled wiki
└─ {WIKIHUB_WIKI} # The compiled knowledge base
├─ _index.md # Master index with aliases
├─ {category}/ # people/, projects/, philosophies/, … (emerge from data)
└─ {article}.md
Two wikis (not directories) because WikiHub scopes ACL and visibility per-wiki: the raw wiki stays private forever, while you can share selected articles in the compiled wiki without leaking source material.
No _backlinks.json. WikiHub parses [[wikilinks]] on write and stores them in the Wikilink model natively. Use wikihub_search (or the auto-generated backlinks section WikiHub appends to each page) instead of rebuilding a JSON cache.
Command: /wikihub-build ingest
Convert source data into one page per entry in @{WIKIHUB_USER}/{WIKIHUB_RAW_WIKI} under entries/. Write a Python script ingest.py to do this — mechanical, no LLM intelligence needed, just format parsing.
Supported data formats
Auto-detect from file shape:
- Day One JSON (
*.jsonwithentriesarray): Each entry → one page. Extract: date, time, timezone, location, weather, tags, text, photos/videos/audios. Mapdayone-moment://URLs to relative file paths and upload attachments separately. - Apple Notes (exported
.html/.txt/.md): Each note → one page. Extract: title (first line or filename), creation date (metadata or filename), folder/tag, body. Strip HTML if needed. - Obsidian vault (
.mdfiles): Each note → one page. Preserve frontmatter. Convert[[wikilinks]]to plain text for raw layer (the compiled wiki will re-add them in its own shape). - Notion export (
.md/.csv): Each page → one entry. Handle nested pages by flattening with parent context. - Plain text/markdown (folder of
.txt/.md): Each file → one entry. Filename date if present, else modification time. First line or filename = title. - iMessage export (
.csvor chat logs): Group by conversation + date. Each (day, other-party) pair → one entry. - CSV/spreadsheet (
.csv/.tsv): Each row → one entry. Column headers → frontmatter fields. - Email (
.mbox/.eml): Each message → one entry. Strip signatures and quoted replies. - Twitter/X archive (
tweet.js): Each tweet → one entry. - Claude Code transcripts (
~/.claude/projects/*/*.jsonl): Each session → one entry (or split by user-prompt if the session is long). Extract: sessionId, cwd, gitBranch, user prompts, model responses. Great source material — your own prior conversations.
WikiHub write shape
For each source entry, call:
wikihub_create_page(
owner=WIKIHUB_USER,
slug=WIKIHUB_RAW_WIKI,
path=f"entries/{date}_{id}.md",
content=<frontmatter + body>,
visibility="private",
)
Content is frontmatter + body:
---
id: <unique identifier>
date: YYYY-MM-DD
time: "HH:MM:SS"
source_type: <dayone|apple-notes|obsidian|notion|text|imessage|csv|email|twitter|claude-transcript>
tags: []
# ... any other metadata from the source
---
<entry text content>
The ingest script must be idempotent: running it twice produces the same set of pages (use wikihub_get_page first; skip if exists; update only if content_hash differs).
Unknown formats
If the data doesn't match any known format, read a sample, figure out the structure, write a custom parser. Goal is always the same: one WikiHub page per logical entry with date and metadata in frontmatter.
Command: /wikihub-build absorb [date-range]
The core compilation step. Read raw entries from @{WIKIHUB_USER}/{WIKIHUB_RAW_WIKI}/entries/, write/update compiled articles in @{WIKIHUB_USER}/{WIKIHUB_WIKI}/.
Date ranges: last 30 days, 2026-03, 2026-03-22, 2024, all. Default (no argument): last 30 days. If no raw entries exist, run ingest first.
The absorption loop
Process entries one at a time, chronologically. Before each entry, read the current _index.md to match against existing articles. Re-read every article before updating it. This is non-negotiable.
For each entry:
Read the entry via
wikihub_get_page(owner, WIKIHUB_RAW_WIKI, f"entries/{filename}"). Text, frontmatter, metadata. View any attached photos. Actually look at them and understand what they show.Understand what it means. Not "what facts does this contain" but "what does this tell me?" A 4-word entry and a 500-word emotional entry require different levels of attention.
Match against the index. Read
_index.mdfrom the compiled wiki. What existing articles does this entry touch? What doesn't match anything and suggests a new article?Update and create articles. Re-read every article before updating (
wikihub_get_page). Ask: what new dimension does this entry add? Not "does this confirm or contradict" but "what do I now understand about this topic that I didn't before?"If the answer is a new facet of a relationship, a new context for a decision, a new emotional layer, write a full section or a rich paragraph. Not a sentence. Every page you touch should get meaningfully better. Never just append to the bottom. Integrate so the article reads as a coherent whole.
Write with
wikihub_update_page(owner, slug, path, content=…)— whole-page replacement keeps the article coherent.Connect to patterns. When the same theme surfaces across multiple entries (loneliness, creative philosophy, recovery from burnout, learning from masters) that pattern deserves its own article. Concept articles are where the wiki becomes a map of a mind instead of a contact list.
Record absorption. Append the entry id to
@{WIKIHUB_USER}/{WIKIHUB_RAW_WIKI}/_absorb_log.mdviawikihub_append_section(owner, WIKIHUB_RAW_WIKI, "_absorb_log.md", heading=None, content=f"- {entry_id}"). This lets subsequent runs skip already-absorbed entries.
What becomes an article
Named things get pages if there's enough material. A person mentioned once in passing doesn't need a stub. A person who appears across multiple entries with a distinct role does. If you can't write at least 3 meaningful sentences, don't create the page yet — note in the article where they appear.
Patterns and themes get pages. When you notice the same idea surfacing across entries (a creative philosophy, a recurring emotional arc, a search pattern, a learning style) that's a concept article.
Anti-cramming
The gravitational pull of existing articles is the enemy. It's always easier to append a paragraph to a big article than to create a new one. This produces 5 bloated articles instead of 30 focused ones.
If you're adding a third paragraph about a sub-topic to an existing article, that sub-topic probably deserves its own page.
Anti-thinning
Creating a page is not the win. Enriching it is. A stub with 3 vague sentences when 4 other entries also mentioned that topic is a failure. Every time you touch a page, it should get richer.
Every 15 entries: checkpoint
Stop processing and:
- Rebuild
_index.md(onewikihub_list_pagescall + format →wikihub_update_pageon the index). - Skip
_backlinks.json— WikiHub maintains wikilinks natively. - New-article audit. How many new articles in the last 15 entries? If zero, you're cramming. Slow down, split things out.
- Quality audit. Pick the 3 most-updated articles. Re-read each as a whole piece. Ask:
- Does it tell a coherent story, or is it a chronological dump?
- Does it have sections organized by theme, not date?
- Does it use direct quotes to carry emotional weight?
- Does it connect to other articles in revealing ways?
- Would a reader learn something non-obvious? If any article reads like an event log, rewrite it.
- Check if any articles exceed 150 lines and should be split.
- Check category coverage via
wikihub_list_pages. Create new subdirectories (as path prefixes) when needed.
Command: /wikihub-build query <question>
Answer questions by navigating the compiled wiki. Never reads the raw wiki directly.
- Read
_index.mdviawikihub_get_page. Scan for articles relevant to the query. Each entry has analso:field with aliases. - Search the compiled wiki with
wikihub_search(query, wiki=f"{WIKIHUB_USER}/{WIKIHUB_WIKI}", limit=20)for any topic the index doesn't surface. Use semantic phrases, not just keywords. - Read 3-8 relevant articles. Follow
[[wikilinks]]andrelated:entries 2-3 links deep when relevant. - Synthesize. Lead with the answer, cite articles by name + URL (every WikiHub page has a stable URL — include it so the user can verify), use direct quotes sparingly, connect dots across articles, acknowledge gaps.
Query patterns
| Query type | Where to look |
|---|---|
| "Tell me about [person]" | people/, wikilink backlinks, 2-3 linked articles |
| "What happened with [project]?" | Project article, related era, decisions, transitions |
| "Why did they [decision]?" | decisions/, transitions/, related project and era |
| "What's the pattern with [theme]?" | patterns/, philosophies/, tensions/, life/ |
| "What was [time period] like?" | eras/, places/, projects/ |
| Broad/exploratory | wikihub_search cast wide; read the highest-reference articles; synthesize themes |
Rules
- Never read raw entries (
{WIKIHUB_RAW_WIKI}/entries/…). The compiled wiki is the knowledge base. - Don't guess. If the wiki doesn't cover it, say so.
- Don't read the entire wiki. Be surgical.
- Query is read-only. Do not call
wikihub_create_page,wikihub_update_page, orwikihub_delete_pageduring a query.
Command: /wikihub-build cleanup
Audit and enrich every article in the compiled wiki using parallel subagents.
Phase 1 — Build context. Call wikihub_list_pages + wikihub_get_page on every article. Build a map of titles, wikilinks (who links to whom — extract from content since WikiHub also stores them server-side), and every concrete entity mentioned that doesn't yet have its own page.
Phase 2 — Per-article subagents. Spawn parallel subagents (batches of 5). Each agent receives one article path and does:
Assess:
- Structure: theme-driven or diary-driven (events as section headings)?
- Line count: bloated (>120 lines) or stub (<15 lines)?
- Tone: flat/factual/encyclopedic, or AI-editorial voice?
- Quote density: more than 2 direct quotes? More than a third quotes?
- Narrative coherence: unified story or list of random events?
- Wikilinks: broken or missing?
Restructure if needed. The most common problem is diary-driven structure.
Bad (diary-driven):
## The March Meeting
## The April Pivot
## The June Launch
Good (narrative):
## Origins
## The Pivot to Institutional Sales
## Becoming the Product
The Steve Jobs test: Wikipedia's Steve Jobs article uses "Early life," "Career" (with era subsections). NOT "The Xerox PARC Visit," "The Lisa Project Failure."
Enrich with minimal web context (3-7 words) for entities a reader wouldn't recognize.
Identify missing-article candidates via the concrete-noun test.
Phase 3 — Integration. Deduplicate candidates, create new articles with wikihub_create_page, fix broken wikilinks with wikihub_update_page, rebuild _index.md.
Command: /wikihub-build breakdown
Find and create missing articles. Expands the wiki by identifying concrete entities and themes that deserve their own pages.
Phase 1 — Survey. Read _index.md. Identify bare categories, bloated articles (>100 lines), high-reference backlink targets without articles, and misclassified articles.
Phase 2 — Mining. Spawn parallel subagents, each reads a batch of ~10 articles and extracts:
Concrete entities (concrete-noun test: "X is a ___"):
- Named people, places, companies, organizations, institutions
- Named events or turning points with dates
- Books, films, music, games referenced
- Tools, platforms used significantly
- Projects with names
- Restaurants, venues tied to narrative moments
Do NOT extract: generic technologies (React, Python, Docker) unless a documented learning arc exists, entities already covered, passing mentions.
Phase 3 — Planning. Deduplicate, count references, rank by reference count, classify into categories, present candidate table:
| # | Article | Category | Refs | Description |
|---|
Phase 4 — Creation. Create in parallel batches of 5 agents. Each: wikihub_search for existing mentions, collect material, write the article with wikihub_create_page, add wikilinks from existing articles back via wikihub_update_page.
Reclassification (with --reorganize)
Move misclassified articles to correct categories (path renames via delete+create, since WikiHub doesn't yet have a rename endpoint — tracked as a separate improvement). Common moves:
life/→philosophies/: articles stating beliefslife/→patterns/: articles with trigger-response structureevents/→transitions/: multi-week uncertain periodsevents/→decisions/: articles with enumerated reasons
Directory (path-prefix) taxonomy
Categories emerge from the data. Don't pre-create them. Reference of common types:
Core
| Prefix | Type | What goes here |
|---|---|---|
people/ |
person | Named individuals |
projects/ |
project | Things the subject built with serious commitment |
places/ |
place | Cities, buildings, neighborhoods |
events/ |
event | Specific dated occurrences |
companies/ |
company | External companies |
institutions/ |
institution | Schools, programs, organizations |
Media and culture
| Prefix | Type | What goes here |
|---|---|---|
books/ |
book | Books that shaped thinking |
films/ |
film | Movies/shows that mattered |
music/ |
music | Artists/groups that mattered |
games/ |
game | Games that shaped projects or social life |
tools/ |
tool | Software tools central to practice |
platforms/ |
platform | Services used as channels |
courses/ |
course | Learning resources |
publications/ |
publication | Newsletters, blogs read regularly |
Inner life and patterns
| Prefix | Type | What goes here |
|---|---|---|
philosophies/ |
philosophy | Articulated intellectual positions |
patterns/ |
pattern | Recurring behavioral cycles with triggers, mechanisms, outcomes |
tensions/ |
tension | Unresolvable contradictions between two values |
identities/ |
identity | Self-concepts or role labels that shaped decisions |
life/ |
life | Biographical themes that aren't philosophies or patterns |
Narrative structure
| Prefix | Type | What goes here |
|---|---|---|
eras/ |
era | Major biographical phases |
transitions/ |
transition | Liminal periods between commitments |
decisions/ |
decision | Inflection points with enumerated reasoning |
experiments/ |
experiment | Time-boxed tests with a hypothesis and result |
setbacks/ |
setback | Adverse incidents that disrupted plans |
Relationships and people
| Prefix | Type | What goes here |
|---|---|---|
relationships/ |
relationship | Dynamics between the subject and others |
mentorships/ |
mentorship | Knowledge-transfer relationships |
communities/ |
community | Online communities built or joined |
Work and strategy
| Prefix | Type | What goes here |
|---|---|---|
strategies/ |
strategy | Named business strategies |
techniques/ |
technique | Technical systems and engineering artifacts |
skills/ |
skill | Competencies developed over time |
ideas/ |
idea | Documented but unrealized concepts |
artifacts/ |
artifact | Documents, plans, spreadsheets created |
Other
| Prefix | Type | What goes here |
|---|---|---|
restaurants/ |
restaurant | Eating/drinking places tied to moments |
health/ |
health | Medical situations, physical wellbeing |
media/ |
media | Forms of self-expression (diary, vlog, newsletter) |
routines/ |
routine | Specific daily/weekly schedules |
metaphors/ |
metaphor | Figurative frameworks |
assessments/ |
assessment | Dated self-evaluations |
touchstones/ |
touchstone | Encounters with cultural works that triggered reflection |
Create new path-prefixes freely when a type doesn't fit existing ones.
Writing standards
The golden rule
This is not Wikipedia about the thing. This is about the thing's role in the subject's life.
A page about a book isn't a book review. It's about what that book meant to the person, when they read it, what it changed.
Tone — Wikipedia, not AI
Write like Wikipedia. Flat, factual, encyclopedic. State what happened. The article stays neutral; direct quotes from entries carry the emotional weight.
Never use:
- Em dashes
- Peacock words: "legendary," "visionary," "groundbreaking," "deeply," "truly"
- Editorial voice: "interestingly," "importantly," "it should be noted"
- Rhetorical questions
- Progressive narrative: "would go on to," "embarked on," "this journey"
- Qualifiers: "genuine," "raw," "powerful," "profound"
Do:
- Lead with the subject, state facts plainly
- One claim per sentence. Short sentences.
- Simple past or present tense
- Attribution over assertion: "He described it as energizing" not "It was energizing"
- Let facts imply significance
- Dates and specifics replace adjectives
One exception: direct quotes carry the voice. The article is neutral. The quotes do the feeling.
Article format
---
title: Article Title
type: person | project | place | concept | event | ...
created: YYYY-MM-DD
last_updated: YYYY-MM-DD
related: ["[[Other Article]]", "[[Another]]"]
sources: ["entry-id-1", "entry-id-2"]
visibility: private
---
# Article Title
{Content organized by theme, not chronology}
## Sections as needed
## Timeline (if relevant)
| Date | Event |
|------|-------|
Linking
Use [[wikilinks]] between articles. WikiHub auto-indexes them and renders backlinks automatically — you do not write a ## Backlinks section yourself.
Narrative coherence
Every article must have a point. Not "here are 4 times X appeared" but "X represented Y in the subject's life." A reader should finish feeling they understand the significance.
Structure by type
| Type | Structure |
|---|---|
| person | By role/relationship phase |
| place | By what happened there and what it meant |
| project | By conception, development, outcome |
| event | What happened (brief), why it mattered (bulk), consequences |
| philosophy | The thesis, how it developed, where it succeeded/failed |
| pattern | The trigger, the cycle, attempts to break it |
| transition | What ended, the drift, what emerged |
| decision | The situation, the options, the reasoning, the choice |
| era | The setting, the project, the team, the emotional tenor |
Quote discipline
Maximum 2 direct quotes per article. Pick the line that hits hardest.
Length targets
| Type | Lines |
|---|---|
| Person (1 reference) | 20-30 |
| Person (3+ references) | 40-80 |
| Place/restaurant | 20-40 |
| Company | 25-50 |
| Philosophy/pattern/relationship | 40-80 |
| Era | 60-100 |
| Decision/transition | 40-70 |
| Experiment/idea | 25-45 |
| Minimum (anything) | 15 |
Command: /wikihub-build rebuild-index
Rebuild _index.md in the compiled wiki from the current page list. Each index entry needs an also: field with aliases used for matching entry text to articles.
pages = wikihub_list_pages(owner, WIKIHUB_WIKI)
# format as markdown with `- [[title]] — {also: alias1, alias2} — {type}`
wikihub_update_page(owner, WIKIHUB_WIKI, "_index.md", content=<formatted>)
Command: /wikihub-build reorganize
Step back and rethink wiki structure. Read the index, sample articles, ask: merge? split? new categories? orphan articles? missing patterns? Execute changes, then rebuild index.
Command: /wikihub-build status
Show stats: entries absorbed (from _absorb_log.md), articles by category (via wikihub_list_pages), most-referenced articles (via wikihub_search frequency), orphans, pending raw entries.
Principles
- You are a writer. Read entries, understand, write articles that capture understanding.
- Every entry ends up somewhere. Woven into the fabric of understanding, not mechanically filed.
- Articles are knowledge, not diary entries. Synthesize, don't summarize.
- Concept articles are essential. Patterns, themes, arcs — where the wiki becomes a map of a mind.
- Revise your work. Re-read articles. Rewrite the ones that read like event logs.
- Breadth and depth. Create pages aggressively, but every page must gain real substance. 40 stubs is as bad as 5 bloated articles.
- The structure is alive. Merge, split, rename, restructure freely.
- View photos. Understand what they show and integrate them into the narrative.
- Connect, don't just record. Find the web of meaning between entities.
- Cite sources. Every claim traces back to a raw-entry page URL.
Concurrency and WikiHub-specific rules
- Never delete or overwrite a page without reading it first. Use
wikihub_get_pagebeforewikihub_update_page. - Re-read any article immediately before editing it. MCP calls are stateless; other agents may have written between your reads.
- Never modify
_absorb_log.mddirectly — only append viawikihub_append_section. - Rebuild
_index.mdonly at checkpoint boundaries, not on every write. - Default visibility is
private. Only flip withwikihub_set_visibilitywhen the user explicitly asks. - On public-edit raw wikis, anonymous writes are legal — but don't use that path here. The raw wiki should be private-owner-only.
- Provenance is automatic. The WikiHub MCP connector sets
X-Agent-Name: wikihub-mcpon every write, so each page'sauthorfield records where it came from.
Also see
- Farza's canonical local-files skill: https://gist.github.com/farzaa/c35ac0cfbeb957788650e36aabea836d
- Karpathy's LLM Wiki framing: https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f
- WikiHub MCP connector source: https://github.com/tmad4000/wikihub/tree/main/mcp-server
- WikiHub MCP connector setup: https://wikihub.md/AGENTS.md#mcp-endpoint