name: search-full-text
model: claude-sonnet-4-6
description: Invoke for FamilySearch full-text search (FTS), immediately when the user says "full-text search", "FTS", "search document transcripts", or "construct a full-text query". Use this skill to find a person as a witness, executor, executrix, administrator, appraiser, heir, neighbor, surety, or other non-principal in deeds, probate, wills, court minutes, or notarial protocolos; to run Lucene-style queries with +required terms, wildcards, or phrase matching; and to cover spelling and transcription variants across FamilySearch's AI-transcribed historical documents. FamilySearch document images only. Exclude external sites like Ancestry or Newspapers.com (use search-external-sites), structured indexed search by name/date/place (use search-records), and planning what to search (use research-plan).
allowed-tools:
  - fulltext_search
  - source_attachments
  - research_log_append
  - research_append
Search Full-Text
Narration: Read researcher_profile.narration_guidance from research.json and apply it as your narration style for this invocation. If absent, default to a one-line preamble per action.
Executes full-text searches against FamilySearch's AI-transcribed historical document images. FTS searches the raw transcript text of ~1.95 billion document images — a fundamentally different search surface than indexed Records search. FTS finds people mentioned anywhere in a document (witnesses, neighbors, heirs, appraisers), not just indexed principals.
This skill is the FTS counterpart to search-records (indexed search) and search-external-sites (non-FamilySearch repositories).
MCP tool
This skill uses one search tool:
| MCP tool | Purpose |
|---|---|
| fulltext_search | Full-text search of AI-transcribed document images using Lucene-style operators |
Key differences from indexed Records search
FTS and indexed search are completely different systems:
| | Indexed (record_search) | Full-text (fulltext_search) |
|---|---|---|
| What's searched | Structured fields (name, date, place) | Raw transcript text of document images |
| Fuzzy matching | Auto-applies nicknames, phonetic variants, Soundex | None. Exact text matching only. |
| Abbreviation expansion | Wm→William, Jno→John automatic | Not expanded. Must search Wm and William separately. |
| Operators | q.* parameters with .exact=on modifier | + (require), - (exclude), "…" (phrase), ?/* (wildcards) |
| Default behavior | Fuzzy matching on all terms | OR — at least one term must appear |
| Unique strength | Finding indexed principals | Finding non-principal mentions (witnesses, neighbors, heirs) |
| Source type | Structured index (derivative) | AI transcript of document images (also derivative) |
Critical: FTS results are derivative sources
Chain: original → image → AI transcript → textDocument. Each step can introduce errors (~10% observed). Always verify against the original image (linked from each result).
Steps
1. Identify the plan item to execute
Read research.json plans[] and find the next plan item with
status: "planned" that targets full-text search. If the user
specifies a particular search, match it to a plan item or create
an ad-hoc search (with plan_item_id: null in the log).
2. Evaluate the target database
Before constructing any query, verify FTS covers the target. Read
references/online-search-literacy.md for the evaluation checklist.
- Coverage: ~6,665 searchable collections as of mid-2026. Not all FamilySearch collections are FTS-searchable.
- Collection scope: Read the description — titles mislead about geographic/temporal coverage.
- Error rate: ~10% observed. Plan for transcription variants.
3. Choose a search philosophy
Default to "less is more" for FTS. No fuzzy matching means every extra required term risks missing transcription variants.
- Uncommon surname → +Surname only, filter after
- Common surname → +Surname +Associate or +Surname +Keyword
- Very common surname (Smith, Jones) → multiple required terms or phrase search ("kitchen sink")
See references/online-search-literacy.md for the full framework.
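The surname-frequency rule above can be sketched as a small helper. This is an illustration only: the function name `initialQueries` and the frequency labels are hypothetical and are not defined by the skill or by the `fulltext_search` tool.

```javascript
// Hypothetical helper: pick starting FTS queries from how common the surname is.
// The frequency labels ("uncommon" | "common" | "very-common") are assumptions
// made for this sketch.
function initialQueries({ surname, givenName, associate, keyword, frequency }) {
  const req = (term) => `+${term}`;
  switch (frequency) {
    case "uncommon":
      // Surname only; filter the results afterwards.
      return [req(surname)];
    case "common": {
      // Surname plus one associate or one keyword, as separate queries.
      const queries = [];
      if (associate) queries.push(`${req(surname)} ${req(associate)}`);
      if (keyword) queries.push(`${req(surname)} ${req(keyword)}`);
      return queries.length > 0 ? queries : [req(surname)];
    }
    case "very-common": {
      // "Kitchen sink": a phrase plus every extra required term available.
      const parts = [givenName ? `+"${givenName} ${surname}"` : req(surname)];
      if (associate) parts.push(req(associate));
      if (keyword) parts.push(req(keyword));
      return [parts.join(" ")];
    }
    default:
      throw new Error(`Unknown frequency: ${frequency}`);
  }
}
```

Each returned string is meant to be passed as the `keywords` value of a separate search.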
4. Determine the search strategy
Read references/search-strategies.md for the full strategy
catalog. Key decision: what kind of FTS search is this?
| Research goal | Query approach |
|---|---|
| Find person as witness/appraiser/heir | +Surname in Name field, place filter after |
| Find person in narrative records (deeds, probate, court) | +GivenName +Surname in Keywords, place filter after |
| FAN cluster search | +TargetSurname +AssociateSurname in Keywords |
| Kinship determination | +Surname +"daughter of" or +Surname +"my beloved wife" |
| Migration tracing | +Surname with successive place filters |
| Enslaved persons | Enslaver surname + slavery keywords (see strategies reference) |
5. Construct the search query
Read references/query-syntax.md for operator rules.
Critical rules:
- Always use + to require terms. Default is OR, which returns millions of irrelevant results.
- Search by name only first. Do NOT include place in the initial query: place matches collection metadata and causes false positives. Apply place as a post-search filter.
- Abbreviations must be searched explicitly. FTS does not auto-expand. If searching for William, also search Wm. If searching for Thomas, also search Thos.
- Mine prior records for known surname variants before querying. Scan existing research.json assertions and log entries for the target surname. If prior records show a transcription variant (e.g., a "Flinn" assertion or log entry when searching "Flynn"), include the variant in your initial query set. FTS does not auto-expand spelling variants; the work has already been done upstream, and ignoring it wastes queries on the wrong spelling.
- Phrases tolerate one intervening word. "Ezekiel Pearce" also matches "Ezekiel John Pearce."
- Wildcards: ? (one char), * (zero or more). Cannot appear inside quotes or as first character. Minimum 3 literal characters.
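The operator rules above can be checked before a query is sent. The sketch below is a minimal pre-flight builder, not part of the tool: `checkTerm` and `buildKeywords` are hypothetical names, and it assumes the 3-literal-character minimum is counted per term.

```javascript
// Sketch: check one search term against the wildcard rules listed above.
// Returns an array of problems (empty when the term is acceptable).
function checkTerm(term) {
  const problems = [];
  if (!/[?*]/.test(term)) return problems; // no wildcard, nothing to check
  if (term.includes(" ")) {
    // Multi-word terms become quoted phrases, and wildcards are not allowed there.
    problems.push("wildcards cannot appear inside a quoted phrase");
  }
  if (/^[?*]/.test(term)) {
    problems.push("wildcard cannot be the first character");
  }
  // Assumption: the minimum of 3 literal characters applies to each term.
  const literalCount = term.replace(/[?*\s]/g, "").length;
  if (literalCount < 3) {
    problems.push("need at least 3 literal characters");
  }
  return problems;
}

// Build a keywords string in which every term is required (+).
// Multi-word terms are emitted as quoted phrases.
function buildKeywords(terms) {
  return terms
    .map((term) => {
      const problems = checkTerm(term);
      if (problems.length > 0) {
        throw new Error(`Invalid term "${term}": ${problems.join("; ")}`);
      }
      return term.includes(" ") ? `+"${term}"` : `+${term}`;
    })
    .join(" ");
}
```

For instance, `buildKeywords(["Fl?nn", "Patrick"])` yields `+Fl?nn +Patrick`, while `buildKeywords(["*lynn"])` throws because the wildcard leads the term.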
Example queries:
# Basic person search (require both terms). Pass projectPath so the
# host stages results and returns a staged.resultsRef for step 8.
fulltext_search({ keywords: "+Patrick +Flynn", projectPath })
# Phrase search
fulltext_search({ keywords: '+"Patrick Flynn"' })
# Person + boilerplate phrase (search for wills)
fulltext_search({ keywords: '+"Thomas Flynn" +"Last Will and Testament"' })
# FAN cluster (target + associate)
fulltext_search({ keywords: "+Flynn +Brennan" })
# Wildcard for HTR errors
fulltext_search({ keywords: "+Fl?nn +Patrick" })
# Abbreviation variant (separate query)
fulltext_search({ keywords: "+Wm +Flynn" })
# Natural language search
fulltext_search({ nlQuery: "Search for John Doe born in Austria" })
# Search by tree person ID
fulltext_search({ nlQuery: "KD96-TV2" })
# Search within a specific volume
fulltext_search({ imageGroupNumber: "4057677" })
When searching a specific volume: Use the Image Group Number field to restrict to one digitized volume, then add keywords.
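Because FTS expands neither abbreviations nor spelling variants, the abbreviation and prior-variant rules imply a small set of separate queries. The sketch below generates that set. The helper name `variantQueries` is hypothetical, the abbreviation table holds only the pairs mentioned in this document, and the default cap of 5 mirrors the per-plan-item query limit in step 10.

```javascript
// Abbreviations named in this document; an illustrative list, not exhaustive.
const ABBREVIATIONS = new Map([
  ["William", ["Wm"]],
  ["Thomas", ["Thos"]],
  ["John", ["Jno"]],
]);

// Hypothetical helper: one required-term query per given-name/surname form.
// surnameVariants would come from prior research.json assertions and log entries.
function variantQueries(givenName, surname, surnameVariants = [], maxQueries = 5) {
  const givenForms = [givenName, ...(ABBREVIATIONS.get(givenName) ?? [])];
  const surnameForms = [surname, ...surnameVariants];
  const queries = [];
  for (const s of surnameForms) {
    for (const g of givenForms) {
      queries.push(`+${g} +${s}`);
    }
  }
  // De-duplicate, then cap at the query limit.
  return [...new Set(queries)].slice(0, maxQueries);
}
```

The primary spelling comes first, so truncating at the cap drops the least likely variants rather than the main query.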
6. Execute and iterate
Call fulltext_search with the constructed query. Always pass
projectPath (the absolute path to the project folder) so the host
stages the raw results: the response gains a staged.resultsRef handle
you hand to research_log_append in step 8 to retain them — you never
serialize the payload yourself.
Decision rules by hit count:
- 0 results → See step 10 (handle nil results)
- 1–50 results → Review all
- 51–500 results → Add Year/RecordType filter
- >500 results → Add a second required term (+associate, +occupation, +landmark) or add place filter
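The hit-count rules can be expressed as a single function. This is a sketch: the action labels are hypothetical names chosen for the example, and a count of exactly 50 is treated as "review all".

```javascript
// Sketch of the hit-count decision rules above.
// The returned labels are hypothetical; they are not tool outputs.
function nextAction(totalResults) {
  if (!Number.isInteger(totalResults) || totalResults < 0) {
    throw new RangeError("totalResults must be a non-negative integer");
  }
  if (totalResults === 0) return "handle-nil-result"; // see step 10
  if (totalResults <= 50) return "review-all";
  if (totalResults <= 500) return "add-year-or-record-type-filter";
  return "add-required-term-or-place-filter";
}
```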
If the target collection has known quirks: Read
references/transcription-quirks.md for HTR error patterns,
era-specific handwriting issues, and coverage gaps.
7. Triage results
For each result, evaluate match quality:
Quick triage:
- Does the target name appear in the textDocument?
- Is the name in the right context (witness signature, will clause, deed party) or a false positive (cross-column alignment, place name matching)?
- Is the place and approximate date consistent?
Attachment check: After narrowing to promising results, call
source_attachments({ uris: [ark1, ark2, ...] }) to check whether
each record is already attached to a tree person.
- Attached to the target person → deprioritize for extraction unless the user wants to re-examine it.
- Attached to a different person → flag as potentially relevant (could be a family member or duplicate).
- Unattached → prioritize for extraction — this is new evidence.
Present triage to the user: List the top results with match quality, context (what role the person plays in the document), and attachment status. Let the user confirm which records to examine in detail.
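The attachment rules above reduce to a three-way classification. The sketch below assumes each record has already been reduced to `{ uri, attachedTo: [personId, ...] }`. That shape, the status labels, and the function name are assumptions for illustration and do not describe the documented `source_attachments` response.

```javascript
// Hypothetical triage helper. The input shape ({ uri, attachedTo }) is an
// assumption for this sketch, not the source_attachments response format.
function triageAttachment(record, targetPersonId) {
  const attachedTo = record.attachedTo ?? [];
  if (attachedTo.length === 0) {
    // Unattached: new evidence, so extract first.
    return { uri: record.uri, status: "unattached", priority: "high" };
  }
  if (attachedTo.includes(targetPersonId)) {
    // Already on the target person: deprioritize unless the user asks.
    return { uri: record.uri, status: "attached-to-target", priority: "low" };
  }
  // Attached to someone else: possible family member or duplicate.
  return { uri: record.uri, status: "attached-to-other", priority: "flag" };
}
```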
8. Retain results and write the log entry
Every search gets a log entry and retains its results — no
exceptions. Call research_log_append once per search. The tool
assigns the log id and performed timestamp, finalizes the staged
results into the results/<log_id>.json sidecar (recomputing the
count), and validates-before-persist — you supply only the judgment
(outcome, counts, notes) and the staged handle:
research_log_append({
projectPath,
planItemId: "pli_010", // null for an ad-hoc search
tool: "fulltext_search",
query: {
keywords: "+Flynn +\"Last Will and Testament\"",
recordPlace1: "Pennsylvania",
recordPlace2: "Schuylkill",
yearFrom: 1870,
yearTo: 1890
},
outcome: "positive", // your judgment: positive/negative/partial/error
resultsExamined: 5,
resultsAvailable: 47, // upstream totalResults, or null
notes: "47 Schuylkill will hits 1870–1890; 5 examined. Thomas Flynn's will (1881) names wife Mary and children Patrick, John, Margaret; Flynn also appears as a witness on two unrelated wills.",
stagedResultsRef: staged.resultsRef // omit for a nil search
})
notes is a one-line human summary of what the search returned. For a
nil search, omit stagedResultsRef entirely (no sidecar is
written) and set resultsExamined: 0.
Recovery. If the search response is stale or the
staged.resultsRef handle has expired (the host's staging TTL lapsed),
re-run the fulltext_search (with projectPath) to re-stage — it is
cheap. If research_log_append returns { ok: false, errors }, surface
the errors to the user rather than retrying blindly.
9. Update plan item status
Route the plan-item status mutation through research_append
(it validates-before-persist and writes nothing on { ok: false }):
research_append({
projectPath,
section: "plan_items",
op: "update",
planId: "pl_003", // the parent plan's pl_ id
entryId: "pli_010", // the plan item's pli_ id
fields: { status: "completed" }
})
Set status to:
- completed: Search executed, regardless of outcome
- skipped: The search was determined to be unnecessary (e.g., the question was already answered by a prior search)
10. Handle nil results
When a search returns no results:
- Log the nil result via research_log_append with outcome: "negative", resultsExamined: 0, and no stagedResultsRef (a nil search retains no sidecar). The notes field on a negative log entry must explicitly state the collection class searched, the place filters and date range applied, the spelling/variant forms queried, and the count of variants tried before declaring negative (for example: "Searched FamilySearch FTS, FamilySearch Probate collections, Schuylkill County, Pennsylvania, 1870–1890; 5 variants tried (+'Patrick Flynn', +Patrick +Flynn, +Patrick +Flinn, +Patrick +Flunn, +Flynn surname-only); 0 results. FTS coverage gap probable; recommend indexed search-records or volume browse"). A bare "no results" note is insufficient for the GPS exhaustive-search audit trail; the future reader must be able to see the search scope without re-deriving it from the query payload.
- Iterate through variants before declaring negative, but cap total queries (initial + retries) at 5 per plan item. Pick the most promising 4 variants from references/search-strategies.md (decision tree) and references/online-search-literacy.md (nil-result checklist) for the specific record class and locality; do not exhaustively walk the full variant catalogue. Log each retry separately. Once you have logged 5 nil queries for the same subject, stop and declare a coverage gap: additional retries produce diminishing returns and inflate the tool-call budget without changing the answer.
- Verify coverage exists. A nil result may mean the record was never transcribed, not that it doesn't exist.
- Assess whether absence is meaningful (negative evidence) — only when coverage is known to be good for that locality/period.
- Check for fallback plan items or suggest search-records/re-plan.
- Do NOT execute unrelated diagnostic queries to "test" the FTS index. When a long string of variant queries returns zeros, the right next step is to declare a coverage gap and log the negative finding, not to query a common surname (+Smith, +Jones) to see whether the tool is "broken." The tool's response is authoritative. Diagnostic probes both inflate Tool Arguments cost on unrelated subjects and rationalize away genuine negative findings as "must be a test environment issue."
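The notes requirements and the 5-query cap for nil results can be enforced when the note is composed. The sketch below is one possible shape: `negativeNotes` and its parameter names are hypothetical, and the output string only mirrors the example wording used in this step.

```javascript
const MAX_QUERIES_PER_PLAN_ITEM = 5; // cap stated in this step

// Hypothetical helper: compose the notes string for a negative log entry so it
// names the collection class, place, date range, variants tried, and their count.
function negativeNotes({ collectionClass, place, yearFrom, yearTo, variants }) {
  if (variants.length === 0) {
    throw new Error("list at least one queried variant");
  }
  if (variants.length > MAX_QUERIES_PER_PLAN_ITEM) {
    throw new Error(`more than ${MAX_QUERIES_PER_PLAN_ITEM} queries for one plan item`);
  }
  const n = variants.length;
  return (
    `Searched FamilySearch FTS, ${collectionClass}, ${place}, ` +
    `${yearFrom}\u2013${yearTo}; ${n} variant${n === 1 ? "" : "s"} tried ` +
    `(${variants.join(", ")}); 0 results`
  );
}
```

The returned string would be passed as `notes` to `research_log_append`, with any coverage-gap judgment or recommendation appended by hand.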
11. Queue cross-reference searches
When reading FTS results, automatically queue sub-searches for:
- Every named non-target person (witnesses, executors, appraisers, heirs, neighbors)
- Distinctive landmarks or property descriptions
- Slaveholder ↔ enslaved name pairs
- Powers of attorney → search the named agent in the other county
- Marginal annotations referencing later transactions
Present these as suggestions: "This deed names John Brennan as a witness. Would you like me to search for other documents mentioning Brennan in this county?"
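Queuing follow-up searches for named persons can be sketched as below. The input shape (`{ name, surname, role }`), the function name, and the returned object are assumptions for illustration. Extracting the people from the transcript is outside the scope of this sketch.

```javascript
// Hypothetical helper: turn people named in a document into suggested
// follow-up searches, skipping the research target.
function crossReferenceSuggestions(people, targetName, documentType = "document") {
  return people
    .filter((p) => p.name !== targetName)
    .map((p) => ({
      // Surname-only required term, following the "less is more" default.
      keywords: `+${p.surname}`,
      prompt:
        `This ${documentType} names ${p.name} (${p.role}). ` +
        `Would you like me to search for other documents mentioning ${p.surname} in this county?`,
    }));
}
```

The suggestions are only presented to the user; nothing is searched until the user confirms.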
12. Pass records to extraction
For each promising record, invoke record-extraction to process it. FTS results include transcript text — pass this context along.
13. Present results
After completing a search (or a batch of searches from the plan):
- Summarize what was searched and what was found
- Highlight non-principal mentions (witnesses, neighbors) — these are FTS's unique value
- Show the log entries created
- Show plan progress
- Suggest next steps:
- More plan items → "Shall I continue with the next search?"
- Cross-reference opportunities → "I found 3 witnesses. Search for them?"
- All plan items done → "All planned searches are complete."
- No results → "No matches in FTS. Would you like to try indexed search (search-records) or re-plan?"
Important rules
- Always use + to require terms. Default is OR (millions of irrelevant results).
- Search name first, filter place after. Place in the query causes metadata false positives.
- FTS results are derivative sources. Always verify against the original image.
- A nil result does not prove absence. Try variants before declaring negative; log exact parameters for reproducibility.
- Log every search. Including nil results. The log is the GPS audit trail.
- Let the user confirm before extraction. Never fabricate results.
- Do NOT write to sources or assertions. This skill only writes to log and plans (status updates). Creating source entries and extracting assertions is record-extraction's job; pass promising records there instead.
- Do NOT add extra fields to plan items. Plan items have a fixed schema (id, sequence, record_type, jurisdiction, date_range, repository, rationale, fallback_for, status). Do not add completion_note, notes, or any other fields; the schema enforces additionalProperties: false.
- Always use keywords for queries. Do not fall back to nlQuery when keywords queries return few or no results. Use nlQuery only when the user explicitly asks for a natural language search or provides a tree person ID.
Re-invocation behavior
Writes: via research_log_append, a new entry in the log section
of research.json (append-only) plus its results/log_NNN.json
sidecar (the tool finalizes the staged payload); and, via
research_append, the status field on the corresponding plan item.
On repeat invocation: always appends a new log_ entry — re-running
the search is itself a logged event. Updates the plan item's status
if applicable.
Do not duplicate: the log is append-only and research_log_append
only appends (no update or delete), so prior log_ entries and their
sidecars are never touched. Two consecutive runs of the same query
produce two log entries and two sidecars; that's correct.