name: ts-convert-from-snowflake-sv description: Convert or import a Snowflake Semantic View into ThoughtSpot as a Model. Use when Snowflake is the source and the goal is a ThoughtSpot Model — whether migrating Snowflake metrics and semantic definitions into ThoughtSpot or making a Semantic View available for Spotter and search-based analytics. Direction is always Snowflake → ThoughtSpot. Not for ThoughtSpot → Snowflake, standalone DDL generation, or adding AI context to existing ThoughtSpot models.
Snowflake Semantic View → ThoughtSpot Model
Converts a Snowflake Semantic View into a ThoughtSpot Model. Reads the semantic
view DDL via GET_DDL, maps tables, relationships, dimensions, and metrics to
ThoughtSpot TML, translates SQL expressions to ThoughtSpot formulas, and imports
the result via ts tml import.
Two scenarios are supported:
- Scenario A (existing tables): ThoughtSpot Table objects already exist for the Snowflake objects the semantic view references. Reuses those existing Table objects.
- Scenario B (new tables): No ThoughtSpot Table objects exist yet for the Snowflake objects the semantic view references. Creates new Table objects pointing to those objects.
References
| File | Purpose |
|---|---|
| ../../shared/mappings/ts-snowflake/ts-from-snowflake-rules.md | Snowflake Semantic View DDL parsing, type mapping, formula translation, column classification |
| ../../shared/mappings/ts-snowflake/ts-snowflake-formula-translation.md | SQL → ThoughtSpot formula translation rules (bidirectional reference) |
| ../../shared/schemas/thoughtspot-table-tml.md | Table TML structure, connection reference, data types, import patterns, common errors |
| ../../shared/schemas/thoughtspot-model-tml.md | Model TML structure, join scenarios, formula visibility, self-validation checklist |
| ../../shared/schemas/thoughtspot-formula-patterns.md | ThoughtSpot formula syntax, all function categories, LOD/window/semi-additive patterns, YAML encoding rules |
| ../../shared/worked-examples/snowflake/ts-from-snowflake.md | End-to-end example: BIRD_SUPERHEROS_SV → ThoughtSpot Model (se-thoughtspot, inline joins, verified against live DDL) |
| ../../shared/worked-examples/snowflake/ts-from-snowflake-dunder.md | End-to-end example: DUNDER_MIFFLIN_SALES_INVENTORY → TS Model. Exercises multi-value synonyms, per-column descriptions, table comments, semi-additive metrics (closing/opening), unique count formula, and concat() for strings. |
| ../ts-profile-thoughtspot/SKILL.md | ThoughtSpot auth methods, profile config, CLI usage |
| ../../claude/references/direct-api-auth.md | Direct API authentication fallback when stored procedures are unavailable |
Cortex Code connection (configured via cortex connections set) |
Snowflake connection code, SQL execution patterns |
| references/open-items.md | Known gaps and deferred capabilities for this skill |
Concept Mapping
| Snowflake Semantic View (real DDL format) | ThoughtSpot Model |
|---|---|
tables ( DB.SCHEMA.TABLE [primary key (col)] ) |
model_tables[] — one entry per physical ThoughtSpot table |
primary key (col) on a table |
Identifies join target — not written into model TML directly |
tables ( DB.SCHEMA.TABLE ... comment='...' ) |
TS Table TML table.description — applied as a separate Table-TML update |
dimensions ( TABLE.COL as view.NAME [comment='...'] ) |
columns[] with column_type: ATTRIBUTE |
| Dimension with date/timestamp physical column | columns[] with column_type: ATTRIBUTE (ThoughtSpot infers date type) |
metrics ( TABLE.COL as SUM(view.NAME) ) |
columns[] with column_type: MEASURE + aggregation |
metrics ( TABLE.COL as complex_sql_expr ) |
formulas[] with translated ThoughtSpot formula |
metrics ( TABLE.COL non additive by (D.col asc nulls last) as SUM(...) ) |
formulas[] with last_value(sum(...), query_groups(), {date}) |
metrics ( TABLE.COL non additive by (D.col desc nulls last) as SUM(...) ) |
formulas[] with first_value(sum(...), query_groups(), {date}) |
relationships ( REL as FROM(FK) references TO(PK) ) |
referencing_join in model_tables (Scenario A, pre-defined joins) OR joins[] inline (Scenario B) |
with synonyms=('Display Name','Alt 1','Alt 2',...) on a dimension/metric |
First → column name. Rest → properties.synonyms (with properties.synonym_type: USER_DEFINED). |
comment='...' on a dimension/metric |
column description |
Top-level comment='...' (after metrics block) |
Model TML model.description |
with extension (CA='...') |
Not mapped to ThoughtSpot — logged in report |
Key structural rules:
column_idmust use the column name from the ThoughtSpot Table TML. Export Table TMLs to confirm — do not assume they match the semantic view left-hand side.- Simple metrics (
AGG(view.col)— one column, one aggregate) →MEASUREcolumn. Complex expressions →formulas[]entry. - In Scenario A,
referencing_joinpoints to a join pre-defined at the ThoughtSpot Table object level (found by exporting the FROM table's TML). - In Scenario B / hybrid, inline
joins[]on the FROM table entry (requireswithfield).
Prerequisites
ThoughtSpot
- ThoughtSpot Cloud instance, REST API v2 enabled
- User account with
DATAMANAGEMENTorDEVELOPERprivilege — only required for import - Authentication configured — run
/ts-profile-thoughtspotif you haven't already - The
tsCLI installed (pip install -e /path/to/tools/ts-cli)
No ThoughtSpot import access? You can still run this skill in file-only mode — it generates the Table and Model TML files for you to import manually. Select FILE at the Step 10 checkpoint or say "file only" at any point before Step 11.
Snowflake
- Role with
USAGEon the database and schema containing the semantic view - Connection configured — run
/ts-profile-snowflakeif you haven't already - For Scenario B: role with
CREATE TABLEor connection modification rights
Step 0 — Overview
On skill invocation, display this plan before doing any work:
ts-convert-from-snowflake-sv — convert a Snowflake Semantic View into a ThoughtSpot Model, translating tables, joins, and SQL expressions.
Steps:
- Authenticate (ThoughtSpot + Snowflake) ............. auto 1.5. Choose session mode (A: single / B: merge / C: update) . you choose
- Identify the semantic view ......................... you choose
- Get the semantic view DDL .......................... auto
- Parse the DDL (synonyms, descriptions, range joins, filter labels, verified queries) .................... auto
- Table registration question (reuse or create) ...... you choose
- Discover / create ThoughtSpot Table objects ........ auto (may ask for clarification) 6D. Apply SV table descriptions to TS Table TMLs ....... auto (when SV has table comments)
- Find join names (Scenario A) ...................... auto
- Build the model TML (incl. column synonyms/desc) ... auto
- Translate SQL expressions → ThoughtSpot formulas ... auto 9.5. Confirm Spotter enablement (default: enabled) ...... you choose
- Review checkpoint — inspect TML before import ...... you confirm
- Import the model into ThoughtSpot .................. auto
- Verify import and produce summary report ........... auto 12.5. Import verified queries as NLS Feedback ............ auto (when SV has verified queries)
File-only mode: at Step 10, choose FILE to write TML files for manual import.
Confirmation required: Steps 1.5, 5, 9.5, 10 (Modes A/B); Steps 1.5, C4 (Mode C) Auto-executed: all others
Ready to start? [Y / N]
Do not begin Step 1 until the user confirms.
Workflow
Step 1: Authenticate
Session continuity: If profiles were already confirmed earlier in this conversation (e.g. for a previous view in a batch), skip this step and reuse them.
ThoughtSpot profile:
- Run
ts profiles listto show configured profiles. - If multiple profiles: display a numbered list and ask the user to select one.
- If exactly one profile: display it and confirm before proceeding.
- Verify:
ts auth whoami --profile {name}— print display_name and base URL.
Snowflake connection:
Uses the active Cortex Code connection (configured via cortex connections set).
Verify with a SELECT CURRENT_USER(), CURRENT_ROLE() query.
Step 1.5: Session Mode
Choose a conversion mode:
A — Convert ONE Semantic View → new ThoughtSpot Model (default)
B — Merge MULTIPLE Semantic Views → new ThoughtSpot Model
C — Update an EXISTING ThoughtSpot Model from a changed Semantic View
If the user selects A (or presses Enter): set session_mode = "single". Continue
with the workflow unchanged — Steps 2 through 13 run exactly as documented.
If the user selects B: set session_mode = "merge". The modified Steps 2, 3, and
new Step 3.5 below apply; Steps 4–13 then run on the merged result exactly once.
If the user selects C: set session_mode = "update". Skip Steps 2–13 entirely.
Run the Mode C workflow documented in the section below, then stop.
Mode C: Update an Existing ThoughtSpot Model
Run these steps when session_mode = "update" (Mode C selected at Step 1.5).
Skip Steps 2–13 entirely. When Step C6 completes, the session ends.
Step C1: Identify both objects
Semantic View (source — the updated version):
Enter database.schema.view_name or press Enter to browse: _______
ThoughtSpot Model (target — the existing model to update):
G — I have a GUID
S — Search by name
Enter G / S:
Store {sv_name} and {model_guid}. Always require both to be explicitly selected —
do not attempt to auto-match by name.
Step C2: Fetch both in parallel
Run simultaneously:
SELECT GET_DDL('SEMANTIC_VIEW', '{database}.{schema}.{sv_name}');
source ~/.zshenv && ts tml export {model_guid} --profile {profile} --fqn --associated --parse
Parse the SV DDL using the existing Step 4 logic. Extract from the Model bundle:
model_tml = next(i["tml"]["model"] for i in bundle if i["type"] == "model")
existing = {}
for col in model_tml.get("columns", []):
existing[col["name"]] = {
"description": col.get("description", ""),
"synonyms": col.get("properties", {}).get("synonyms", []),
"ai_context": col.get("properties", {}).get("ai_context"), # read-only
"formula_id": col.get("formula_id"),
"column_id": col.get("column_id"),
}
existing_formulas = {
f["id"]: f.get("expr", "")
for f in model_tml.get("formulas", [])
}
Step C3: Compute the change set
import re
def _normalise_expr(expr: str) -> str:
"""Normalise for comparison only — never use the output as actual SQL."""
refs, i = {}, 0
def _stash(m):
nonlocal i
key = f"__REF{i}__"; refs[key] = m.group(0); i += 1; return key
# Stash bracket/brace refs so they survive lowercasing
out = re.sub(r'\[[^\]]+\]|\{[^}]+\}', _stash, expr)
out = re.sub(r'\s+', ' ', out.strip()).lower()
for key, val in refs.items():
out = out.replace(key, val)
return out
def _exprs_differ(a: str, b: str) -> bool:
return _normalise_expr(a) != _normalise_expr(b)
sv_cols = set(sv_parse["columns"].keys()) # keyed by column display name
model_cols = set(existing.keys())
change_set = {
"new_columns": list(sv_cols - model_cols),
"removed_columns": list(model_cols - sv_cols), # flag only
"modified_descriptions": [],
"modified_synonyms": [],
"modified_expressions": [],
"join_changes": [],
}
for col_name in sv_cols & model_cols:
sv_col = sv_parse["columns"][col_name]
ts_col = existing[col_name]
if sv_col.get("description") and sv_col["description"] != ts_col["description"]:
change_set["modified_descriptions"].append({
"column": col_name,
"current": ts_col["description"],
"new": sv_col["description"],
})
sv_syns = set(sv_col.get("synonyms", []))
ts_syns = set(ts_col["synonyms"])
if sv_syns != ts_syns:
change_set["modified_synonyms"].append({
"column": col_name,
"current": sorted(ts_syns),
"new": sorted(sv_syns),
"added": sorted(sv_syns - ts_syns),
"removed": sorted(ts_syns - sv_syns),
})
if col_name in sv_formulas and ts_col["formula_id"]:
# IMPORTANT: translate the SV expression through the formula translation
# reference FIRST (Step 9 resolution), THEN compare TS-formula-to-TS-formula.
# Comparing raw SQL to TS formula text flags every column as modified.
sv_expr_translated = translate_sv_to_ts(sv_formulas[col_name]) # Step 9
ts_expr = existing_formulas.get(ts_col["formula_id"], "")
if _exprs_differ(sv_expr_translated, ts_expr):
change_set["modified_expressions"].append({
"column": col_name,
"current": ts_expr,
"new": sv_expr,
})
# Join changes: compare sv_parse["relationships"] vs model join graph
# Flag any relationship not present in the existing model (name or endpoint differs)
Step C4: Present the diff and collect decisions
Display the summary, then per-section review tables. Wait for the user to edit and
type done before proceeding.
Summary
=== Change set for "{model_name}" ===
✚ New columns: {N} (will be added with generated synonyms + descriptions)
✖ Removed columns: {M} (flagged only — see note below)
✏ Modified descriptions: {P} (UPDATE / KEEP per column — default: KEEP)
✏ Modified synonyms: {Q} (MERGE / UPDATE / KEEP per column — default: MERGE)
~ Modified expressions: {R} (YES / SKIP per column — confirm before re-translating)
~ Join changes: {S} (flagged for review)
= Unchanged columns: {T} (no action)
Modified descriptions — per-column table, default KEEP:
| Column | Current (TS Model) | New (from SV) | Action |
|---|---|---|---|
| Amount | Total sales amount in USD | Total revenue in local currency | KEEP |
Modified synonyms — per-column table, default MERGE:
| Column | Current synonyms | Added by SV | Removed by SV | Action |
|---|---|---|---|---|
| Product Category | category, product group | dept | product group | MERGE |
Options:
MERGE(default) — add new SV synonyms, keep existing; never remove coached synonymsUPDATE— replace existing synonyms entirely with the SV setKEEP— ignore the SV change; leave existing synonyms untouched
Modified expressions — show old and new formula side-by-side. Require YES / SKIP
per column — never bulk-apply expression changes.
Removed columns — informational list only, no action column:
⚠ The following columns exist in the ThoughtSpot Model but are no longer in the SV.
They are NOT removed automatically — removal may break dependent Answers and Liveboards.
To remove them safely: run /ts-dependency-manager first, then edit the Model TML manually.
Require the user to type done after reviewing before proceeding.
Step C5: Build the updated Model TML and import
Deep-copy the existing Model TML. Apply only the confirmed changes:
| Change type | Action |
|---|---|
| New column | Generate using Step 8 + Step 9 logic — same as create mode |
Modified description, UPDATE |
Write to column.description |
Modified description, KEEP |
Leave untouched |
Modified synonyms, MERGE |
Union: add new SV synonyms, keep all existing ones |
Modified synonyms, UPDATE |
Replace properties.synonyms[] with SV set |
Modified synonyms, KEEP |
Leave untouched |
Modified expression, YES |
Re-translate using Step 9 logic; update formulas[].expr |
Modified expression, SKIP |
Leave untouched |
ai_context on any column |
Never touch |
| Data Model Instructions | Never touch |
| Removed columns | Never touch |
Place guid at the document root (not nested under model:) and import with
--no-create-new to update the existing model in place. The import will fail if
the GUID is not found — surface the error clearly and stop.
top_level = {"guid": model_guid, "model": model_dict}
model_tml_str = yaml.dump(top_level, default_flow_style=False, allow_unicode=True)
result = subprocess.run(
["bash", "-c",
f"source ~/.zshenv && ts tml import --policy ALL_OR_NONE "
f"--no-create-new --profile '{profile_name}'"],
input=json.dumps([model_tml_str]),
capture_output=True, text=True,
)
Step C6: Post-import coaching handoff
After a successful import, always surface:
✓ Model "{model_name}" updated.
⚠ Coaching surfaces that may need review:
Column AI Context
{N_new} new columns added — no ai_context yet
{M_updated} existing columns had descriptions or synonyms changed
→ Run /ts-object-model-coach → surface 1 to review and update ai_context
Data Model Instructions
Schema changes (new columns, expression changes, join changes) may affect
Spotter's default behaviours — particularly time_defaults and aggregation_defaults.
→ Run /ts-object-model-coach → surface 5 to review Instructions
Removed columns flagged above
If you intend to remove any of the flagged columns, run /ts-dependency-manager
first to assess downstream impact before editing the Model TML manually.
Step 2: Identify the semantic view
Single mode (merge_mode = False): proceed as documented below.
Merge mode (merge_mode = True):
- Also ask for the output ThoughtSpot Model name now:
Output ThoughtSpot Model name: _______ - Ask the user to list the Semantic Views to merge. Accept either:
- A comma-separated list of names:
SALES_SV, INVENTORY_SV - A wildcard/prefix — Claude will run:
and display matches for user confirmation before proceedingSHOW SEMANTIC VIEWS LIKE '{prefix}%' IN SCHEMA {database}.{schema};
- A comma-separated list of names:
- Confirm the final list before proceeding to Step 3.
Single mode: If the user has named the semantic view, proceed directly to Step 3.
Otherwise, list available semantic views so the user can choose:
SHOW SEMANTIC VIEWS IN SCHEMA {database}.{schema};
If the database and schema are unknown, ask the user or run SHOW DATABASES /
SHOW SCHEMAS IN DATABASE {db} first.
Display results as a numbered list. Ask the user to select one (or enter a full
database.schema.view_name directly).
Step 3: Get the semantic view DDL
Single mode: run as documented below.
Merge mode: execute GET_DDL for each SV in the confirmed list. Parse each DDL
independently using the Step 4 logic and store as a separate parse result object before
proceeding to Step 3.5.
SELECT GET_DDL('SEMANTIC_VIEW', '{database}.{schema}.{view_name}');
Store the returned DDL string in full — it will be parsed in the next step.
If the call fails with "object does not exist", verify the fully-qualified name and
the user's role has USAGE on the schema.
Converting multiple views from the same schema? List then fetch each DDL:
SHOW SEMANTIC VIEWS IN SCHEMA {database}.{schema};
SELECT "name" FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()));
-- then per name:
SELECT GET_DDL('SEMANTIC_VIEW', '{database}.{schema}."' || name || '"') AS ddl;
Parse each DDL in Step 4 before switching Snowflake queries.
Step 3.5: Merge and Deduplication (merge mode only)
Skip this step if merge_mode = False.
Combine all parse results from Step 3 into a single merged result that Steps 4–13 will treat as if it came from one Semantic View.
1. Tables — union of all tables[] entries across all SVs.
- Deduplicate by physical identity: two entries with the same
base_table.database + schema + tablerepresent the same Snowflake table. Keep one. - If their column definitions differ (different dimensions, different data types for the same column name), flag as a column conflict — list each conflicting column and ask the user which definition wins before continuing.
2. Relationships — union of all relationships[].
- Deduplicate by (left_table, right_table, left_column, right_column) — exact match on all four fields. Keep one entry.
- If the same table pair has conflicting relationship definitions (different column pairs), flag as a relationship conflict for user resolution.
3. Metrics — union of all metrics[].
- Deduplicate by (name, expr) — exact match on both. Keep one entry.
- If same name but different expr: flag as a metric conflict. User must choose which definition wins or rename one before the merge can proceed. Do not silently prefer either definition.
4. Dimensions / time_dimensions / metrics / facts (if present) — union across all
views, deduplicated by (table_name, column_name). DDL facts () entries (row-level named
expressions) are also merged and available for identifier resolution in Step 9.
5. Fact table identification in merged context — re-run the fact-table detection algorithm (tables with no incoming relationships in the merged relationship set = fact tables). If a table was a fact in one SV but gains an incoming relationship from another SV in the merged graph, present it to the user:
{TABLE} had no incoming joins in {SV1} but gains one from {SV2} in the merged model.
Treat as: F — Fact table D — Dimension table
6. Present merge summary and require confirmation before continuing:
Merging {M} Semantic Views:
{SV1}: {n} tables, {n} relationships, {n} metrics
{SV2}: {n} tables, {n} relationships, {n} metrics
...
Merged result: {n} tables ({x} deduplicated), {n} relationships, {n} metrics
Conflicts: {None / list of conflicts to resolve}
Output model name: {name from Step 2}
Proceed? YES / NO
If there are unresolved conflicts, require all to be resolved before accepting YES. After confirmation, continue with Step 4 using the merged result.
Step 4: Parse the DDL
Read and parse the DDL returned in Step 3. The DDL is a SQL CREATE OR REPLACE SEMANTIC VIEW statement. See ../../shared/mappings/ts-snowflake/ts-from-snowflake-rules.md
for the full format — it is NOT the hypothetical nested format; the real format has flat
dimensions and metrics sections at the view level.
Extract the following:
- View identity: database, schema, view name.
- Top-level
comment='...'(after the metrics block, beforewith extension) → Model description.
- Top-level
- Tables block: for each table entry, record:
- Fully-qualified table reference (
DB.SCHEMA.TABLE) — this is the Snowflake view/table - Table alias (explicit
ALIAS as DB.SCHEMA.TABLE, or defaults to last segment of the name) - Primary key column(s) (if present — marks this as a join target)
- Range constraint (if present):
constraint <NAME> distinct range between <START> and <END> exclusive— extract constraint name, start column, end column. Stored inrange_constraintsmap keyed by table alias. Used in Step 8 to generate range joinonexpressions. - Table-level
comment='...'if present → maps to TS Table TMLtable.description.
- Fully-qualified table reference (
- Relationships block: for each relationship, record name, from table alias, from
column(s), to table alias, to column(s), and join style:
- Equi-join (standard):
REL_NAME as FROM(COL) references TO(COL)— record asjoin_style: "equi". - Composite equi-join:
REL_NAME as FROM(COL1, COL2) references TO(COL1, COL2)— multiple column pairs. Record asjoin_style: "equi"with parallel column lists. - Range join (BETWEEN):
REL_NAME as FROM(COL) references TO(between START and END exclusive)— record asjoin_style: "range", withto_startandto_endcolumns from the BETWEEN clause. Theexclusivekeyword means half-open interval (>=start,<end). - ASOF join:
REL_NAME as FROM(COL1, COL2) references TO(COL1, ASOF COL2)— record asjoin_style: "asof". The equi-join columns pair normally; the ASOF column generates a>=predicate.
- Equi-join (standard):
- Dimensions block (flat, all tables): for each entry (
TABLE.COL as view_alias.NAME [with synonyms=(...)] [comment='...']), record:- Source: TABLE alias + VIEW column name (column in the Snowflake view layer)
- Semantic alias:
view_alias.NAME - Synonyms list from
with synonyms=(...)— first → display name, rest →properties.synonyms - Description from
comment='...'→ columndescription - Filter label: if the entry contains
labels = (filter)before theaskeyword, setis_filter: true. The expression afterasis a BOOLEAN expression. Seets-from-snowflake-rules.md"Filter Labels → ThoughtSpot" for the full mapping. - If no synonyms: title-cased NAME → display name
- Metrics block (flat): for each entry, record:
- Simple:
TABLE.COL as AGG(view_alias.NAME)— extract source column + aggregation - Semi-additive:
TABLE.COL non additive by (DATE.col asc|desc nulls last) as SUM(view_alias.col)— translates to alast_value(asc) orfirst_value(desc) formula. See the formula reference's Semi-additive section for the full DDL → TS mapping. - Window function:
... OVER (PARTITION BY ...)— translates togroup_sum,safe_divide(..., group_sum(...))for contribution ratios, etc. - Synonyms + description mapping: same rule as dimensions.
- Simple:
- Facts block (if present): for each entry (
TABLE.FACT_NAME as EXPR [comment='...'] [with synonyms=(...)]), record:- Source: TABLE alias + fact name
- Expression (SQL): the right-hand side
- Synonyms + description: same mapping as dimensions
- Filter label:
labels = (filter)beforeas→ setis_filter: true(same rule as dimensions) - Visibility:
PRIVATEmodifier if present
- Extension JSON (
with extension (CA='...')): parse for column type confirmation (dimensions / time_dimensions / metrics per table). Do not map to ThoughtSpot. - Verified queries (
ai_verified_queries (...)): if present after thecomment=clause, parse each query entry. Format:
Extract: name, question text, SQL string, verified_at timestamp, onboarding flag. Store inQUERY_NAME AS (QUESTION 'text' [VERIFIED_AT epoch] [ONBOARDING_QUESTION TRUE|FALSE] SQL 'select ...')verified_querieslist. These are emitted as NLS Feedback TML after Model import (Step 12). Seets-from-snowflake-rules.md"Verified Queries → NLS Feedback TML".
Build an internal map:
tables: alias → fully-qualified ref, primary key, range_constraint (if any), table descriptionrelationships: list of (name, from_alias, from_cols[], to_alias, to_cols[], join_style — one ofequi,range,asof)columns(flat): all dimensions and metrics, keyed by (table_alias, view_col), with display name, synonyms[], description, and is_filter fields populated.facts: keyed by (table_alias, fact_name) → {expression, comment, synonyms[], visibility, is_filter}verified_queries: list of {name, question, sql, verified_at, onboarding}model_description: from the top-levelcomment='...'clause
4x. Unrecognized-construct scan (MANDATORY — do not skip). After extracting the known blocks, scan the remaining DDL text for these tokens (case-insensitive). Each hit is a construct this skill cannot yet convert. NEVER silently drop one:
| Token | Construct | Action |
|---|---|---|
facts ( |
FACTS block (row-level expressions metrics may reference) | Extract into the facts map (see item 6 above). Each fact becomes a formulas[] entry in Step 8 (see ts-from-snowflake-rules.md "Facts Block → ThoughtSpot"). Step 9's identifier resolution uses this map to resolve metric references to facts. If a metric references a fact name that was not successfully parsed → FAIL that column loudly with the fact name. |
ai_sql_generation / ai_question_categorization |
CA custom instructions | Add Unmapped Report row: "Custom instructions present — review for ThoughtSpot data_model_instructions equivalent (GAP-06)" |
ai_verified_queries |
CA verified queries | Parse into verified_queries list (see item 8 above). Emitted as NLS Feedback TML after Model import in Step 12 |
with cortex search service |
dimension search service | Unmapped Report row naming the dimension |
private (as visibility modifier) |
private dims/metrics | Convert but set index_type: DONT_INDEX + report |
unique ( |
uniqueness constraints | Record for join cardinality inference (see Task 1.4) |
range between (NOT inside a constraint clause) |
stray range token | STOP — likely an unsupported DDL variant; show user the unconsumed text |
| anything else unparsed (non-whitespace remains after extraction) | unknown grammar | STOP and show the user the unconsumed text — the SV spec evolves; do not guess |
Top-level COMMENT extraction fix: the comment '...' clause is no longer guaranteed to
be the last clause — AI_* clauses may follow it. Anchor on the comment '...' token
pattern, not on position relative to the end of the DDL.
Step 5: Table registration question
After parsing, display the tables found and ask a single question:
The semantic view references {n} tables:
{database}.{schema}.{TABLE_1}
{database}.{schema}.{TABLE_2}
...
Are these tables already registered in ThoughtSpot?
Y Yes — use existing ThoughtSpot Table objects
N No — create new Table objects from scratch
? Not sure — search ThoughtSpot first
Enter Y / N / ?:
- Y → skip search, go to Step 6A (column verification only)
- N → skip search, go to Step 6B (create)
- ? → go to Step 6A (search + verify)
Step 6A: Discover and verify existing ThoughtSpot Table objects (Y and ? paths)
Skip this step if the user answered N in Step 5 — go directly to Step 6B.
Choose the search scope first. A whole-instance scan is the slow path — on a
large instance --all pulls every table. Offer the narrower option and search by
table-name pattern (--name), never --all-then-filter:
How should I search for these tables?
C Within a specific connection — fastest; search that one connection's tables
I Entire ThoughtSpot instance — broader, slower
Enter C / I :
Search by name (both scopes start here):
source ~/.zshenv && ts metadata search --subtype ONE_TO_ONE_LOGICAL --name "%{table_name}%" --profile {profile}
- C (within a connection) → first identify the connection using the
N (name it) / F (filter by substring) / L (list all) prompt in Step 6B — present that
prompt and let the user choose; do NOT run
ts connections listand dump every connection by default. Then keep only results whosemetadata_header.dataSourceNameequals the chosen connection name (each result carries its connection there, e.g."APJ_SNOW"). Fastest, and unambiguous when the same table name exists on several connections. - I (entire instance) → run the name search above with no connection filter.
Filter the JSON to match each semantic view base table by table name (metadata_name)
and, for the connection scope, metadata_header.dataSourceName; use
metadata_header.database_stripes / metadata_header.schema_stripes to disambiguate
same-named tables. Build a map: physical_table_name → {metadata_id, metadata_name}.
Only fall back to
--all(fetch every table) when no usable name pattern can be formed (e.g. the name is too generic). Tell the user that cost before running it.
Export TMLs for all found tables in one call to verify columns:
source ~/.zshenv && ts tml export {guid1} {guid2} ... --profile {profile} --parse
--parse returns structured JSON — access columns via item["tml"]["table"]["columns"]
directly. Parse table.columns[].name from each returned item. Build a column map per table:
table_name → [col_name, ...]. Compare against the columns referenced in
the semantic view dimensions and metrics to identify any column gaps.
The
column_idin the model TML must use the column names from the ThoughtSpot Table TML — export the TMLs to confirm them.
Confirm the plan before making any changes:
Show the user a full status table and wait for confirmation:
Table Plan:
✓ {TABLE_1} — found (GUID: {guid}) — all {n} columns present → use as-is
⚠ {TABLE_2} — found (GUID: {guid}) — missing {n} columns: {COL_A}, {COL_B} → update
✗ {TABLE_3} — not found in ThoughtSpot → create new
Actions to be taken:
• Update {TABLE_2}: add {n} missing columns
• Create {TABLE_3}: {n} columns from Snowflake schema
No changes have been made yet. Proceed? (yes/no):
Do not proceed until the user confirms. If any table is not found, follow Step 6B for those tables. If any table has missing columns, follow Step 6C before building the model.
Step 6D: Apply SV table-level metadata to ThoughtSpot Table TMLs
If the SV tables (...) block has comment='...' on any base table, push those
descriptions onto the corresponding ThoughtSpot Table objects before building the
model. This is a separate Table TML import, run with --no-create-new so existing
tables are updated in place.
Per table that has an SV table-comment:
- Take the parsed Table TML from Step 6A.
- Set
table.descriptionto the SV table comment. - Verify
table.schemamatches the actual Snowflake schema — older Table objects sometimes claim a different schema than the live object, which breaks import validation. If there's a mismatch, also fixtable.schemahere. - Wrap with
{guid: ..., table: ...}at top level so--no-create-newupdates the existing object.
Batch all updates into one ts tml import --policy ALL_OR_NONE --no-create-new call.
If the SV does not put comment='...' on any table, skip this step.
Step 6C: Update existing tables with missing columns
For each table from Step 6A with a column gap, introspect the Snowflake schema for the missing columns only:
SELECT table_name, column_name, data_type
FROM {database}.information_schema.columns
WHERE table_schema = '{SCHEMA}'
AND table_name IN ({comma_quoted_table_names})
AND column_name IN ({comma_quoted_missing_col_names})
ORDER BY table_name, ordinal_position;
Map Snowflake types to ThoughtSpot types using ../../shared/mappings/ts-snowflake/ts-from-snowflake-rules.md.
Find the ThoughtSpot connection for those tables:
source ~/.zshenv && ts connections list --profile {profile}
Note: ts connections list auto-paginates and returns all connections.
Add the missing columns to the connection, then re-import the updated Table TML for each affected table (batch all imports in one call):
source ~/.zshenv && ts tml import --policy ALL_OR_NONE --profile {profile}
After import, re-export the updated TMLs to refresh the column map before Step 8.
Step 6B: Create ThoughtSpot Table objects for views (Scenario B) — also the connection picker for the Step 6A connection-scoped search
Do all Snowflake introspection in a batch query — not per-table calls.
Batch: get all column names and types for the entire schema in one query:
SELECT table_name, column_name, data_type FROM {database}.information_schema.columns WHERE table_schema = '{SCHEMA}' ORDER BY table_name, ordinal_position;This returns every column for every table/view in the schema in one round-trip.
Choose which ThoughtSpot connection to use — use an existing one or create a new one. Use the connection name directly in table TML — no GUID lookup is needed or possible from available procedures.
Ask first:
The new Table objects need a ThoughtSpot connection that can reach {database}. E Use an existing connection C Create a new connection (Snowflake, key-pair auth) Enter E / C:When to create: a ThoughtSpot connection only sees databases its Snowflake role is granted. If no existing connection's role can see
{database}, table creation fails with "Database {db} does not exist in connection" — that is the signal to create one (do not trial-and-error existing connections to find out).E — use an existing connection. Don't dump the full list by default — a long connection list is noise when the user already knows the one they want. Ask:
How would you like to identify the connection? N Name it — type the exact connection name; I'll use it directly F Filter — give a partial string; I'll list only connections that match L List all — show every connection and pick by number Enter N / F / L:Then fetch the connections once (auto-paginated, returns all):
source ~/.zshenv && ts connections list --profile {profile}Resolve the user's choice against that result:
- N (name it) — match the typed name against the returned
namevalues (case-sensitive). Exactly one match → use it. No match → show the closest names and re-ask. Don't fabricate a name the list doesn't contain — the table TML needs the exact, case-sensitive connection name. - F (filter) — keep connections whose
namecontains the string (case-insensitive), show them as a short numbered list (name, type, database), and pick from that. One match → auto-select and confirm; none → widen the string or switch to L. - L (list all) — show the full numbered list and pick by number.
If only one connection exists in total (or only one matches the semantic view's database), auto-select it and confirm regardless of the choice. Use the exact
namevalue from the API response.C — create a new connection (Snowflake, key-pair auth). Collect the connection name, Snowflake account identifier, user, role, warehouse, and the path to the unencrypted PKCS#8 private key (
.p8), then run:source ~/.zshenv && ts connections create \ --name "{connection_name}" \ --account "{account}" --user "{user}" --role "{role}" --warehouse "{warehouse}" \ --database "{database}" \ --private-key-path "{key_path}" \ --profile {profile}The role must have
USAGEon{database}and its schema (andSELECTon the tables) — otherwise the tables won't resolve. The matching public key must already be registered on the Snowflake user (DESC USER {user}showsRSA_PUBLIC_KEY).Credential handling (required): never ask the user to paste a private key, password, or secret into the conversation. The key is passed by file path only —
ts connections createreads it and never echoes it. Key-pair is the only auth this path supports; for password/OAuth, direct the user to create the connection in the ThoughtSpot UI and return on the E path. The command prints{id, name, data_warehouse_type}— use the returnednamefor the table spec.- N (name it) — match the typed name against the returned
Create ThoughtSpot Table objects for all tables in one command:
cat tables-spec.json | ts tables create --profile {profile}Where
tables-spec.jsonis a JSON array built from the column data above. Seets tables create --helpfor the spec format. This command handles JDBC retry and GUID resolution automatically, and outputs{name: guid}.Inline joins will be defined directly in the model TML (no
referencing_join).
Step 7: Find join names (Scenario A only)
If there is only ONE table in the semantic view, there are no joins by definition.
Skip this step and proceed to Step 8 with a single model_tables entry.
Joinless semantic views (GAP-03) — multi-table SVs with no relationships:
If the SV has multiple tables but no relationships(...) block (or the block is empty),
ThoughtSpot still requires joins for cross-table queries. Present the user with join
discovery options:
No relationships defined in the Semantic View ({n} tables found).
ThoughtSpot requires joins for cross-table queries.
How should we discover joins?
1 — Auto-discover from database constraints (PK/FK)
2 — Analyse column overlap and suggest joins (deeper dive)
3 — I'll specify the joins manually
4 — Skip — create model with no joins (single-table queries only)
Option 1 — Database constraint discovery:
Query Snowflake for foreign key relationships between the SV's tables:
-- For each table in the SV:
SHOW IMPORTED KEYS IN TABLE {db}.{schema}.{table};
The result contains pk_table_name, pk_column_name, fk_table_name, fk_column_name,
and key_sequence (for composite FKs with the same constraint name). Build relationships
from these — each FK→PK pair becomes a join. Composite FKs (multiple rows with the same
constraint name) become composite equi-joins.
If FK constraints are found, present them for confirmation:
Found {n} foreign key relationships:
1. {FK_TABLE}.{FK_COL} → {PK_TABLE}.{PK_COL} (MANY_TO_ONE)
2. {FK_TABLE}.({COL1},{COL2}) → {PK_TABLE}.({COL1},{COL2}) (composite, MANY_TO_ONE)
Accept these joins? [Y / edit / skip]
If no FK constraints are found, offer to fall back to Option 2 (column overlap analysis).
Option 2 — Column overlap analysis (deeper dive):
For each pair of tables in the SV:
Scan column name overlap — find columns with identical names (case-insensitive) across the two tables:
SELECT a.COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS a JOIN INFORMATION_SCHEMA.COLUMNS b ON UPPER(a.COLUMN_NAME) = UPPER(b.COLUMN_NAME) WHERE a.TABLE_SCHEMA = '{schema}' AND a.TABLE_NAME = '{table_a}' AND b.TABLE_SCHEMA = '{schema}' AND b.TABLE_NAME = '{table_b}' AND a.TABLE_CATALOG = '{db}' AND b.TABLE_CATALOG = '{db}';Check composite key uniqueness — for each candidate set of join columns, verify uniqueness on the target table:
SELECT COUNT(*) AS total_rows, COUNT(DISTINCT ({col1}, {col2})) AS distinct_keys FROM {db}.{schema}.{table};If
total_rows == distinct_keys, the column set is a valid unique key.Validate cardinality — confirm the join direction:
SELECT MAX(cnt) FROM ( SELECT {join_cols}, COUNT(*) AS cnt FROM {db}.{schema}.{from_table} GROUP BY {join_cols} );max(cnt) == 1→ ONE_TO_ONE;max(cnt) > 1→ MANY_TO_ONE from the source table.Present suggestions with evidence:
Suggested joins (based on column overlap analysis): 1. EMPLOYEES.(COMPANY_ID, DEPARTMENT) → EMPLOYEE_SUMMARY_VW.(COMPANY_ID, DEPARTMENT) Uniqueness: 15 rows, 15 distinct keys ✓ Cardinality: MANY_TO_ONE (max 12 employees per group) Type: LEFT_OUTER Accept / Modify / Skip each:
Option 3 — User-specified joins:
Prompt the user to define each join:
Specify joins between the {n} tables.
For each join, provide:
From table: ______
From column(s): ______ (comma-separated for composite)
To table: ______
To column(s): ______
Cardinality: MANY_TO_ONE / ONE_TO_ONE / MANY_TO_MANY
Type: LEFT_OUTER (default) / INNER / RIGHT_OUTER / FULL_OUTER
Add another join? [Y / done]
Option 4 — Skip (separate model per table):
Since ThoughtSpot cannot query across unjoined tables in a single model, create a separate model for each table:
⚠ No joins defined. Creating {n} separate models — one per table.
Cross-table queries will not be possible.
Model 1: {TABLE_A} ({m} columns)
Model 2: {TABLE_B} ({p} columns)
You can combine them later by editing Model TML and adding joins.
Proceed? [Y / n]
Each model gets its own model_tables entry (single table), its own columns
(only those from that table), and its own formulas (only those referencing that
table's columns). Import each model separately.
All discovered/specified joins (Options 1–3) are added to the relationships map
and treated identically to SV-declared relationships in Step 8 (inline joins on the
FROM table).
For each relationship in the semantic view, find the name of the pre-defined join in the ThoughtSpot Table objects.
Re-use the TMLs already exported in Step 6A — do not make another export call.
The --parse output gives item["tml"]["table"] directly for each FROM table.
For a relationship FROM {from_table} KEY {from_col} TO {to_table} KEY {to_col}:
- In the FROM table's parsed TML (
item["tml"]["table"]), find thejoins_withsection. - Match the entry where
destination.name(ordestination) equals the TO table name. - Record the join
name— this is thereferencing_joinvalue for theto_tableentry in the model TML.
If no matching join is found:
- Warn the user: "No pre-defined join from
{from_table}to{to_table}." - Options: (1) use an inline join instead (Scenario B for this relationship), (2) abort and define the join at the ThoughtSpot Table level first.
Step 8: Build the model TML
Construct the model TML as a YAML string. Use the templates in ../../shared/mappings/ts-snowflake/ts-from-snowflake-rules.md.
Model name: {view_name_title_case} — derived from the Snowflake Semantic View name.
Ask the user if they want a different name. Do not add a TEST_SV_ or other prefix —
see ../../shared/schemas/ts-model-conversion-invariants.md (N1).
CRITICAL — Never normalise names from API responses. Names that came from
ts tml export (join names, column names, table names) or from import response GUIDs
must be used exactly as returned — no .lower(), no .upper(), no title-casing,
no whitespace trimming. Any silent transformation will cause a lookup failure in the
model TML (wrong referencing_join, wrong table.name, wrong column_id). When in
doubt, copy the string character-for-character from the API response.
Identify the fact table (the table that is never on the "TO" side of any relationship)
— it gets no referencing_join and no joins[].
Joinless models (user chose Option 4 in Step 7): create a separate model per
table. Each model contains only columns and formulas that reference that table.
Name each model {view_name} — {TABLE_NAME} (or let the user choose). Import each
independently. Report all created models in Step 12.
Discovered joins (Options 1–3 in Step 7): joins discovered via PK/FK constraints,
column overlap analysis, or user specification are treated identically to SV-declared
relationships — use inline joins[] on the FROM table entry (Scenario B pattern).
Critical id rules (applies to all scenarios):
idmust equalnameexactly (same case, same characters). ThoughtSpot resolveswithandonjoin references against the table's actualname— ifiddiffers in case (e.g.id: dm_orderwithname: DM_ORDER), joins fail with "{table_name} does not exist in schema". Use the exact ThoughtSpot table object name for bothidandname(often uppercase for newly-created tables).idvalues must be unique across allmodel_tablesentriesnamevalues must also be unique — ThoughtSpot rejects models where two tables share the samenamevalue ("Multiple tables have same alias")- If two semantic view tables map to the same ThoughtSpot table (same GUID), include
it only ONCE and use ONE
id/name
Model TML skeleton (Scenario A — pre-defined joins exist in table TML):
model:
name: "{view_name}"
model_tables:
- id: FACT_TABLE # MUST equal name exactly (copy verbatim — often uppercase)
name: FACT_TABLE # exact ThoughtSpot table object name — FK side, joins go here
fqn: "{fact_guid}" # GUID from Step 6A
joins:
- with: DIM_TABLE # must equal the target entry's name exactly
referencing_join: "{join_name}" # from Step 7
- id: DIM_TABLE # MUST equal name exactly — PK side, no joins
name: DIM_TABLE # exact ThoughtSpot table object name
fqn: "{dim_guid}" # GUID from Step 6A
columns:
- name: "{display_name}"
column_id: fact_table::{col_name} # col_name from ThoughtSpot Table TML
properties:
column_type: ATTRIBUTE
- name: "{display_name}"
column_id: fact_table::{col_name}
properties:
column_type: MEASURE
aggregation: SUM
formulas:
- name: "{display_name}"
expr: "{thoughtspot_formula}"
properties:
column_type: MEASURE
Join type and cardinality defaults:
SV relationships carry no join type — they define foreign key paths only. Use these defaults:
type: LEFT_OUTER— preserves fact rows with NULL FKs, matching SV query semantics where unmatched facts still aggregate. State the assumption in the conversion report.cardinality: MANY_TO_ONE— default for FK→PK relationships. If the target table's key carries aUNIQUEconstraint (detected in Step 4x scan), useONE_TO_ONEinstead.
Model TML skeleton (Scenario B / Hybrid — inline joins, or no pre-defined table joins):
Use this when ThoughtSpot Table objects have no joins_with entries, or when creating
new Table objects for views. Inline joins live on the source (FROM) table entry.
model:
name: "{view_name}"
model_tables:
- id: FROM_TABLE # MUST equal name exactly (copy verbatim from import response)
name: FROM_TABLE # exact ThoughtSpot table object name — never lowercase or transform
fqn: "{from_guid}"
joins:
- name: "{join_name}"
with: TO_TABLE # REQUIRED — must equal `id` (= `name`) of the target entry exactly
on: "[FROM_TABLE::{fk_col}] = [TO_TABLE::{pk_col}]" # uses id values (= name values)
type: LEFT_OUTER
cardinality: MANY_TO_ONE # or ONE_TO_ONE if target key has UNIQUE constraint
- id: TO_TABLE # matches `with` value above — same case
name: TO_TABLE
fqn: "{to_guid}"
columns:
# ... same pattern as Scenario A ...
Range joins (Scenario B / Hybrid — join_style: "range"):
When a relationship has join_style: "range", the on expression uses >= and <
instead of =. The exclusive keyword in the DDL means half-open interval:
joins:
- name: "{rel_name}"
with: PERIOD_TABLE
on: "[FROM_TABLE::{col}] >= [PERIOD_TABLE::{start_col}] and [FROM_TABLE::{col}] < [PERIOD_TABLE::{end_col}]"
type: LEFT_OUTER
cardinality: MANY_TO_ONE
ASOF joins (Scenario B / Hybrid — join_style: "asof"):
Equi-join columns pair with =; the ASOF column generates >=:
joins:
- name: "{rel_name}"
with: TO_TABLE
on: "[FROM_TABLE::{equi_col}] = [TO_TABLE::{equi_col}] and [FROM_TABLE::{asof_col}] >= [TO_TABLE::{asof_col}]"
type: LEFT_OUTER
cardinality: MANY_TO_ONE
Composite equi-joins (multiple column pairs):
joins:
- name: "{rel_name}"
with: TO_TABLE
on: "[FROM_TABLE::{col1}] = [TO_TABLE::{col1}] and [FROM_TABLE::{col2}] = [TO_TABLE::{col2}]"
type: LEFT_OUTER
cardinality: MANY_TO_ONE
Filter labels → boolean formula columns:
For any dimension or fact with is_filter: true, create a boolean formula column
(ATTRIBUTE, not MEASURE) regardless of whether the expression is numeric:
formulas:
- id: "formula_{display_name}"
name: "{display_name}"
expr: "if ( [TABLE::{col}] >= 90000 ) then true else false" # translated from SV BOOLEAN_EXPR
properties:
column_type: ATTRIBUTE
columns:
- name: "{display_name}"
formula_id: "formula_{display_name}"
properties:
column_type: ATTRIBUTE
At the Step 10 review checkpoint, note which columns are filter-derived and offer the user the option to add them as model filters (default: column only).
Duplicate column_id detection (I8):
After assembling all columns[] entries, scan for duplicate column_id values.
When two metrics reference the same physical column with different aggregations
(e.g. SUM(SALARY) and AVG(SALARY)), keep only the first as a column_id-based
entry (prefer SUM). Express all others as formulas[] entries:
# First metric keeps column_id
columns:
- name: "Total Salary"
column_id: EMPLOYEES::SALARY
properties:
column_type: MEASURE
aggregation: SUM
# Second metric becomes a formula
formulas:
- id: "formula_Avg Salary"
name: "Avg Salary"
expr: "average ( [EMPLOYEES::SALARY] )"
properties:
column_type: MEASURE
See ../../shared/schemas/ts-model-conversion-invariants.md (I8).
Column entries — display name, synonyms, description:
For each dimension or metric in the semantic view, populate metadata as follows:
| SV DDL field | TS column field |
|---|---|
with synonyms=('Display Name','Alt 1','Alt 2',...) (1st value) |
name |
with synonyms=(...) (remaining values) |
properties.synonyms (with properties.synonym_type: USER_DEFINED) |
comment='...' |
description (at column root) |
| (no synonyms clause) | name = title-cased SV alias (LHS) |
Critical placement: synonyms live under properties.synonyms, NOT at column root.
A top-level synonyms: field is silently dropped on import. Always pair with
properties.synonym_type: USER_DEFINED.
For each dimension:
column_id:{id}::{col_name}— whereidis the model_tablesidfor that table, andcol_nameis from the ThoughtSpot Table TMLproperties.column_type: ATTRIBUTE
For each simple metric (AGG(view_alias.metric_name)):
column_id:{id}::{col_name}properties.column_type: MEASUREaggregation: mapped from the SQL aggregate function (see ts-from-snowflake-rules.md)
COUNT(DISTINCT col) metrics — use a formula, not aggregation: COUNT_DISTINCT (I5):
COUNT(DISTINCT col) must be expressed as a formulas[] entry with unique count ( [TABLE::col] ).
Never use aggregation: COUNT_DISTINCT on a column_id entry — ThoughtSpot silently overrides
column_type: MEASURE → ATTRIBUTE when COUNT_DISTINCT is used this way.
See ../../shared/schemas/ts-model-conversion-invariants.md (I5).
For each complex metric (formula expression):
- See Step 9 for translation. Results go into
formulas[].
For each public fact in the facts map:
- Create a
formulas[]entry with the translated expression (apply the same SQL → ThoughtSpot formula rules as metrics). Usecolumn_type: MEASUREfor numeric expressions andcolumn_type: ATTRIBUTEfor string/date expressions. - Create a paired
columns[]entry withformula_idmatching the formula'sid. - For private facts referenced by at least one metric: create the formula with
index_type: DONT_INDEXon thecolumns[]entry. For private facts not referenced by any metric: skip entirely. - Fact formulas are emitted before metric formulas in the
formulas[]array so that[formula_<id>]references resolve correctly. Metric formulas reference facts by their formulaid(e.g.[formula_Tenure Months]), NOT display name.
See ../../shared/mappings/ts-snowflake/ts-from-snowflake-rules.md "Facts Block →
ThoughtSpot" for the full mapping pattern and examples.
- Never add
aggregation:to aformulas[]entry — formulas are self-contained via theirexpr. ThoughtSpot rejects TML withFORMULA is not a valid aggregation type.
Every formula must have a columns[] entry. Add a columns[] entry with
formula_id: for every entry in formulas[]:
formulas:
- id: formula_Inventory Balance # id: "formula_" + name (spaces preserved)
name: "Inventory Balance"
expr: >-
last_value ( sum ( [DM_INVENTORY::FILLED_INVENTORY] ) , query_groups ( ) , { [DM_DATE_DIM::DATE_VALUE] } )
properties:
column_type: MEASURE
columns:
# ... physical columns ...
- name: "Inventory Balance"
formula_id: formula_Inventory Balance # must match the formula's `id` exactly
properties:
column_type: MEASURE
aggregation: SUM
index_type: DONT_INDEX # recommended for computed numeric measures
aggregation: on a columns[] formula entry is allowed (unlike in formulas[] entries
where it causes an import error).
Step 9: Translate SQL expressions → ThoughtSpot formulas
MANDATORY — read the reference before assessing any expression: Open ../../shared/mappings/ts-snowflake/ts-snowflake-formula-translation.md and use its Reverse translation sections for each SQL pattern. Do not classify an expression as untranslatable based on SQL syntax recognition alone. Patterns that appear Snowflake-specific have documented ThoughtSpot equivalents — for example:
Looks untranslatable Actually translatable as SUM(col)+NON ADDITIVE BY (date ASC NULLS LAST)last_value ( sum ( [col] ) , query_groups ( ) , { [date_col] } )SUM(m) OVER (PARTITION BY dim1, dim2)group_sum ( measure, dim1, dim2 )SUM(m) OVER (PARTITION BY EXCLUDING dim1)group_aggregate ( sum(m), query_groups()-{dim1}, query_filters() )DIV0(tbl.metric, SUM(tbl.metric) OVER (PARTITION BY dim.COL))safe_divide ( sum(m), group_sum(m, dim) )— contribution ratioSUM(m) OVER (PARTITION BY EXCLUDING dim ORDER BY dim ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)cumulative_sum ( measure, dim )Consult the reference. Never reason from first principles about SQL window functions.
9a. Identifier resolution (MANDATORY pre-pass).
Before translating any metric expression, resolve every table_alias.name reference
in the expression. Use the Identifier Resolution Algorithm in
../../shared/mappings/ts-snowflake/ts-from-snowflake-rules.md:
Physical column? Check the ThoughtSpot Table TML columns for
table_alias. Ifnamematches a column → use[TABLE::col]reference. No further resolution needed.Fact? Check the
factsmap for(table_alias, name). If found → use formula reference[formula_<id>]where<id>is the fact'sidvalue from itsformulas[]entry (e.g.formula_Tenure Months). The reference must use the formulaid, NOT the display name —[Tenure Months]fails during TML import;[formula_Tenure Months]succeeds. NoTABLE::prefix.Metric? Check the
metricsmap for(table_alias, name). If found → this is double aggregation. Apply the Double Aggregation rules from ../../shared/mappings/ts-snowflake/ts-from-snowflake-rules.md:a. Find the relationship connecting the inner metric's table to the outer metric's table. If the DDL uses
USING REL_NAME, use that relationship. Otherwise, find the relationship where one endpoint is the inner metric's table alias and the other is the outer metric's table alias.b. Identify the inner metric's aggregation function and column:
INNER_AGG(inner_col).c. Build the ThoughtSpot formula:
outer_agg ( group_inner_agg ( [CHILD_TABLE::inner_col] , [PARENT_TABLE::pk_col] ) )Use
group_*shorthand when one exists for the inner aggregation (group_count,group_sum,group_average,group_unique_count,group_min,group_max). Fall back to fullgroup_aggregate(inner_agg(...), {[PARENT::pk]}, query_filters())for other aggregation types.d. If the inner metric itself references another metric (triple aggregation), FAIL with: "Triple aggregation detected —
{outer}→{middle}→{inner}. This skill supports one level of metric-on-metric nesting."None of the above? FAIL the column loudly: "Metric references
{table_alias}.{name}which is not a physical column, fact, or metric."
Window metrics referencing metrics (GAP-13): when a window function metric
(e.g. SUM(...) OVER (ORDER BY ... ROWS BETWEEN ...)) references another metric
in its base expression, resolve the inner metric first:
- If the inner metric is a simple
AGG(col): inline the aggregation directly:cumulative_sum(count([TABLE::col]), [TABLE::order_col]) - Do NOT wrap in
group_aggregate— cumulative/moving functions already handle the aggregation grain internally.
For each metric whose EXPR is not a simple AGG(table.col) (after applying identifier resolution above — references have been resolved or the metric has been translated via double aggregation):
- Apply the SQL → ThoughtSpot formula translation rules in ../../shared/mappings/ts-snowflake/ts-snowflake-formula-translation.md (bidirectional reference — use the Snowflake → ThoughtSpot direction).
- Replace column references:
table.COLUMN→[TABLE_ALIAS::COLUMN] - If the expression translates successfully → add a
formulas[]entry. - If the expression is confirmed untranslatable after consulting the reference → omit the column and log it in the Formula Translation Log (for the summary report in Step 12).
Column references in translated formulas:
Use the name: from the corresponding model_tables[] entry (which matches the semantic
view table alias). Column name is the column name from the ThoughtSpot Table TML.
Example:
- Semantic view EXPR:
SUM(DM_ORDERDETAILS.UNIT_PRICE * DM_ORDERDETAILS.QUANTITY) - ThoughtSpot formula:
sum ( [DM_ORDERDETAILS::UNIT_PRICE] * [DM_ORDERDETAILS::QUANTITY] ) - Add as
formulas[]entry withcolumn_type: MEASURE
last_value / curly brace formulas — YAML block scalar required:
When the translated formula contains { [col] } (curly braces), use a >- block scalar
for the expr field. Inline YAML string assignment fails because { is a flow mapping
start character:
formulas:
- name: "Inventory Balance"
expr: >-
last_value ( sum ( [DM_INVENTORY::FILLED_INVENTORY] ) , query_groups ( ) , { [DM_DATE_DIM::DATE_VALUE] } )
properties:
column_type: MEASURE
In Python, set the formula string in the dict as a plain string — yaml.dump will emit
it as a block scalar automatically when the string contains {. If it doesn't, force it:
from yaml.representer import SafeRepresenter
def literal_representer(dumper, data):
if '{' in data or '\n' in data:
return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='>')
return dumper.represent_scalar('tag:yaml.org,2002:str', data)
yaml.add_representer(str, literal_representer)
Step 9.5: Spotter enablement
Before assembling the final TML, ask whether Spotter (AI search) should be enabled for this model. Default is yes — Spotter is the primary natural-language interface for Models, and a converted SV usually exists to be queried this way.
Enable Spotter (AI search) for this model? [Y / n] (default: Y)
Apply the answer to the model TML's properties block:
model:
name: {view_name}
# ... model_tables, columns, formulas, etc.
properties:
spotter_config:
is_spotter_enabled: true # or false based on answer
If the user answers n or no, set is_spotter_enabled: false. Pre-existing
models being updated in place (Step 11): if the user does not explicitly answer,
preserve the existing setting from the previously-exported model TML rather than
overwriting it with a default.
Step 10: Review checkpoint
Before importing, show the user a summary:
Model to import: {view_name}
Tables:
✓ {FACT_TABLE} (GUID: {guid}) — fact table
✓ {DIM_TABLE} (GUID: {guid}) — referencing_join: {join_name}
...
Columns ({n} total):
ATTRIBUTE: {list of display names}
MEASURE: {list of display names}
Formulas: {list of display names}
Formula translations:
✓ {name}: {sql_expr} → {ts_formula}
🔄 {name}: DOUBLE AGGREGATION — {outer_agg}(group_{inner_agg}(...))
📐 {name}: FACT REFERENCE — inlines fact expression (from {fact_name})
⚠ {name}: OMITTED — {reason}
Filter labels ({n}):
{name}: boolean formula (column only / also add as model filter?)
Verified queries ({n}):
{name}: "{question}" → will import as NLS Feedback after Model import
Spotter (AI search): enabled / disabled
Proceed with import?
yes — import to ThoughtSpot
no — cancel
file — write TML files without importing (for environments where you lack
DATAMANAGEMENT access, or to review the TML before committing)
Wait for user confirmation before proceeding.
If the user selects file, skip to Step 10-FILE.
Step 10-FILE: Output TML files (file-only mode)
This path is used when the user selected file at the Step 10 checkpoint, explicitly
said "file only", or has no ThoughtSpot DATAMANAGEMENT access.
1. Determine output filenames:
- Model TML:
{model_name}.model.tml - Any new Table TMLs created in Step 6B (Scenario B):
{table_name}.table.tml
2. Write the files:
from pathlib import Path
import yaml
# Model TML
model_tml_str = yaml.dump(
{"model": model_dict}, default_flow_style=False, allow_unicode=True
)
Path(f"{model_name}.model.tml").write_text(model_tml_str, encoding="utf-8")
# Table TMLs (Scenario B only)
for tbl_name, tbl_dict in new_table_tmls.items():
tbl_str = yaml.dump(
{"table": tbl_dict}, default_flow_style=False, allow_unicode=True
)
Path(f"{tbl_name}.table.tml").write_text(tbl_str, encoding="utf-8")
3. Report:
TML files written:
{model_name}.model.tml — ThoughtSpot Model TML
{table_name}.table.tml — ThoughtSpot Table TML (if new tables were needed)
To import to ThoughtSpot when you have access:
1. Package all .tml files into a zip:
zip {model_name}_tml.zip *.tml
2. In ThoughtSpot: Data → TML Import → upload the zip
(table TMLs will import first, then the model)
3. Or import via CLI:
ts tml import --file {model_name}.model.tml --policy ALL_OR_NONE --profile {profile}
Note: On first import, omit `guid` from the TML (already omitted here). ThoughtSpot
will assign a GUID — save it from the import response if you need to update the model later.
4. Proceed to Step 12 (Produce summary report) — include the formula translation log and column summary so the user has the full picture before importing.
Pre-import validation gate (ts tml lint — I1 / I2 / I4 / I5 / I8)
Before running ts tml import, lint the generated Model TML with ts tml lint — a
parser-based check of the hard invariants in
../../shared/schemas/ts-model-conversion-invariants.md
that --policy VALIDATE_ONLY does not catch (ThoughtSpot accepts the TML and then
behaves wrong, or rejects it on import):
- I1 — every
formulas[]entry has a pairedcolumns[]entry (formula_id:==id:). (Unpaired formula silently dropped.) - I2 — no
aggregation:inside anyformulas[]entry. (Raises "FORMULA is not a valid aggregation type".) - I4 — every
model_tables[]id:(when present) equals itsname:. (Mismatch makes joins silently fail.) - I5 — no physical-column
aggregation: COUNT_DISTINCT; use aunique count ( [TABLE::col] )formula. (Silently flips MEASURE → ATTRIBUTE.) - I8 — no duplicate
column_idacrosscolumns[]. (Hard import rejection: "columns should have unique column_id values".)
ts tml lint reads the same stdin shape as ts tml import and exits non-zero on any
finding, so it gates the import (replace <file>):
python3 -c "import json,pathlib; print(json.dumps([pathlib.Path('<file>').read_text()]))" | ts tml lint
Do not import until it reports "clean": true. Fix any finding and re-lint.
Step 11: Import the model
IMPORTANT — Updating vs creating: Without a guid field in the TML, ThoughtSpot
always creates a new object, even if a model with the same name already exists.
To update an existing model in-place, add guid at the document root — as a
top-level key alongside model:, NOT nested inside model::
# CORRECT — guid at document root
top_level = {"guid": "{existing_model_guid}", "model": model_dict}
# WRONG — guid nested under model (silently ignored by ThoughtSpot)
# model_dict["guid"] = "..." ← do NOT do this
On the first import (new model), omit guid. After import, record the GUID from the
response — you will need it if you reimport to fix any errors.
Serialize the top-level dict to a YAML string, then import:
import yaml, json, subprocess
# First import (new model):
top_level = {"model": model_dict}
# Update existing model:
top_level = {"guid": existing_guid, "model": model_dict}
model_tml = yaml.dump(top_level, default_flow_style=False, allow_unicode=True)
payload = json.dumps([model_tml])
result = subprocess.run(
["bash", "-c",
f"source ~/.zshenv && ts tml import --policy PARTIAL --profile '{profile_name}'"],
input=payload,
capture_output=True, text=True,
)
print(result.stdout)
if result.returncode != 0:
print(result.stderr)
Import policy: Use --policy PARTIAL when importing multiple models in a batch.
ALL_OR_NONE rolls back the entire batch if any single TML fails — including
models that parsed and imported successfully. The response still returns success GUIDs
for the rolled-back models, making the failure silent. Use ALL_OR_NONE only for
atomic pairs (one table + one model that references it).
On success, parse the response JSON to extract the created model's GUID. Save it — required for any future reimports to update the model without creating a duplicate.
Common import errors:
| Error | Likely cause | Fix |
|---|---|---|
referencing_join not found |
Join name is wrong or join doesn't exist at table level | Export table TML again and verify join name |
column_id not found |
Column name is wrong — left-hand side of semantic view dimension used instead of ThoughtSpot Table TML column name | Check Table TML for the correct column name |
Compulsory Field … joins(N)->with is not populated |
Missing with field on an inline join |
Add with: {target_id} to every inline join entry |
{table_name} does not exist in schema (on with field) |
with value is wrong case or doesn't match any id |
Ensure with matches the target's id exactly — same case as name |
Invalid srcTable or destTable in join expression |
on clause references a table name that doesn't match any id in model_tables |
Check that both [table1::col] refs in on use id values, not Snowflake table names |
Multiple tables have same alias {name} |
Two model_tables entries have the same name value |
Deduplicate — if two aliases map to the same Snowflake object, keep only one entry |
fqn resolution failed |
GUID is stale or from a different ThoughtSpot instance | Re-run Step 6A to get fresh GUIDs |
formula syntax error |
ThoughtSpot formula has invalid syntax | Fix the formula expression |
YAML mapping error on formula with { |
last_value or similar formula with { [col] } emitted as inline YAML string |
Use >- block scalar for expr — see Step 9 for pattern |
| YAML parse error | Non-printable characters in strings | Strip non-printable chars from all string values before serialising |
Step 11b: Verify Import
After a successful import response, confirm the model was indexed and has the expected shape — not just that the API returned 200.
1. Search for the model by GUID:
source ~/.zshenv && ts metadata search --subtype WORKSHEET --name "%{view_name}%" --profile {profile}
The GUID returned by the import response must appear in the results. If it is absent, the import succeeded at the API level but indexing is delayed — wait 5 seconds and retry once.
2. Export the imported model and count columns:
source ~/.zshenv && ts tml export {created_guid} --fqn --profile {profile}
Parse the returned TML and count model.columns[] entries. This count must be ≥ the
number of translatable fields from the semantic view (i.e. total dimensions + metrics,
minus any omitted from the untranslatable list in Step 9).
If the column count is lower than expected: compare the exported TML against the TML sent in Step 11 to identify which columns ThoughtSpot silently dropped, and investigate.
3. Report the model URL:
Model imported successfully.
Name: {view_name}
GUID: {created_guid}
URL: {base_url}/#/model/{created_guid}
Open the URL in a browser to verify the model appears in the ThoughtSpot Data panel.
Step 12: Produce summary report
After a successful import, output:
## Model Import Complete
**Model:** {view_name}
**GUID:** {created_guid}
**ThoughtSpot URL:** {base_url}/#/model/{created_guid}
### Columns Imported ({n})
| Display Name | Type | Source |
|---|---|---|
| {name} | ATTRIBUTE | {TABLE}::{COL} |
| {name} | MEASURE ({agg}) | {TABLE}::{COL} |
| {name} | MEASURE (formula) | translated from SQL |
| ... | ... | ... |
### Formula Translation Log
| Column | Original SQL | Status | ThoughtSpot Formula |
|---|---|---|---|
| {name} | `{sql}` | ✓ Translated | `{ts_formula}` |
| {name} | `{sql}` | 🔄 Double aggregation | `{ts_formula}` |
| {name} | `{sql}` | 📐 Fact formula | `{ts_formula}` |
| {name} | `{sql}` | ⚠ Omitted | {reason} |
### Not Mapped
- Extension JSON (Cortex Analyst context): not translated to ThoughtSpot
### Facts Mapped ({n})
| Fact Name | Source Table | Expression | ThoughtSpot Formula |
|---|---|---|---|
| {name} | {table} | `{sql_expr}` | `{ts_formula}` |
### Identifier Resolution Summary
- Physical columns resolved: {n}
- Fact references resolved: {n}
- Double aggregation patterns: {n}
- Unresolvable references: {n} (see OMITTED above)
### Filter Labels ({n})
| Column | Source Expression | Type |
|---|---|---|
| {name} | `{boolean_expr}` | Boolean formula (ATTRIBUTE) |
### Verified Queries ({n})
| Query Name | Question | Status |
|---|---|---|
| {name} | {question} | ✓ Imported as NLS Feedback / ⚠ Manual review needed |
Step 12.5: Import verified queries as NLS Feedback TML
Skip this step if verified_queries is empty.
After a successful Model import (Step 11), translate each verified query from the SV into NLS Feedback TML and import it against the newly-created Model.
SQL-to-search-token translation:
- Map SV column names to TS Model display names (from the column mapping in Steps 8/9)
COUNT(col)→count [Col Display Name];SUM(col)→sum [Col];AVG(col)→avg [Col]- Non-aggregate SELECT columns → dimension tokens:
[Col Display Name] WHERE col = 'val'→[Col] = 'val'
For each verified query with translatable SQL:
guid: "{model_guid}"
nls_feedback:
feedback:
- id: "{index}"
type: REFERENCE_QUESTION
access: GLOBAL
feedback_phrase: "{question_text}"
search_tokens: "{translated_search_tokens}"
rating: UPVOTE
display_mode: UNDEFINED
chart_type: KPI
Import with: source ~/.zshenv && ts tml import --policy ALL_OR_NONE --profile {profile}
Complex SQL (subqueries, CTEs, CASE, window functions) cannot be faithfully converted to search tokens. Log these in the report as "manual review needed" — do not attempt a partial translation.
Step 13: Cleanup
Remove any temporary files written during the workflow:
rm -f /tmp/ts_model_build_*.yaml /tmp/ts_model_build_*.json
The ts CLI manages its own token cache — do not remove /tmp/ts_token_*.txt
unless the user explicitly requests a logout.
Multiple semantic view conversion
Sequential (separate models): After completing Step 12 for one view, ask: "Convert another semantic view?" If yes: return to Step 2. Reuse the already-confirmed ThoughtSpot and Snowflake profiles. Do not re-authenticate between views.
Merge into one model: Use merge_mode = True (Step 1.5 → B). All Semantic Views
are ingested in Step 3, merged in Step 3.5, and converted into a single ThoughtSpot
Model in one pass through Steps 4–13.
Changelog
| Version | Date | Summary |
|---|---|---|
| 1.12.0 | 2026-06-17 | Step 6B connection step now offers E — use existing / C — create a new connection (Snowflake, key-pair auth via ts connections create) instead of only selecting an existing one. Adds the "Database does not exist in connection → role can't see it → create one" guidance and a credential-handling guardrail (private key by file path only; never pasted into chat; password/OAuth → UI + E path). Mirrors the connection-step change in ts-convert-from-tableau; ts-convert-from-databricks-mv gets the explicit stop-and-instruct fallback. |
| 1.11.2 | 2026-06-17 | Replace the hand-written pre-import grep gate with ts tml lint (parser-based; now also catches I8 duplicate column_id). From the full audit sweep (codification, angle 11). |
| 1.11.1 | 2026-06-16 | Extend the N/F/L connection prompt into the Step 6A connection-scoped search path. The 6A "C — within a connection" path now explicitly presents the Step 6B N (name it) / F (filter by substring) / L (list all) prompt to identify the connection — it must NOT run ts connections list and dump every connection by default. Mirrors the same fix in ts-convert-from-tableau and ts-convert-from-databricks-mv. |
| 1.11.0 | 2026-06-16 | Connection selection (Step 6B): add a how-to-identify-the-connection prompt (N name it / F filter by partial string / L list all) before dumping the full connection list. Fetch once via ts connections list, then use the typed name directly, show a filtered subset, or show the full numbered list. Single/database-matched connection still auto-selects. Mirrors the same prompt added to ts-convert-from-tableau and ts-convert-from-databricks-mv. |
| 1.10.0 | 2026-06-16 | Step 6A table discovery: add a connection-scoped vs instance-wide search choice and search by --name "%table%" pattern instead of --all-then-filter. Connection scope filters results on metadata_header.dataSourceName (verified field). Avoids slow whole-instance scans on large instances. Mirrors the ts-convert-from-tableau Step 4c change. |
| 1.9.0 | 2026-06-13 | Identifier resolution engine: facts parsing (BL-003b), metric→fact resolution (BL-003c), double aggregation via group_aggregate (BL-003), window metrics referencing metrics (GAP-13), joinless SV handling (GAP-03/BL-004). |
| 1.8.0 | 2026-06-13 | Fail-loud parsing (C5): Step 4x scans for facts, AI clauses, cortex search, private, unknown grammar. LEFT_OUTER join default (F5). Fix SV discovery SQL (F8). Fix Mode C comparison to translate before diff (F7). |
| 1.7.1 | 2026-06-13 | Add "never normalise API response names" rule (reverse-port from CoCo). |
| 1.7.0 | 2026-06-12 | Adopt PT1 pass-through policy (scalar reliable; flag aggregate pass-through for review). |
| 1.6.0 | 2026-06-12 | Add pre-import validation gate (I1/I2/I4/I5) before model TML import (BL-001). |
| 1.5.0 | 2026-06-11 | Drop TEST_SV_ prefix — model name now uses the bare SV name (N1); cite canonical conversion invariants doc. Add I5 explicit note: COUNT(DISTINCT) → unique count(...) formula, never aggregation: COUNT_DISTINCT. Add references/open-items.md tracking sql_view generation gap. |
| 1.4.1 | 2026-05-11 | Add source ~/.zshenv && prefix to all bash blocks and convert subprocess.run calls from ["ts", ...] to ["bash", "-c", "source ~/.zshenv && ts ..."] for consistent env var loading |
| 1.4.0 | 2026-05-05 | Add Mode C (update existing): Steps C1–C6. Identifies a changed SV and an existing TS Model, diffs columns/descriptions/synonyms/expressions, applies per-column reviewed changes with --no-create-new, and surfaces /ts-object-model-coach handoff. ai_context and Instructions are never touched. Step 1.5 menu updated to A/B/C. |
| 1.3.0 | 2026-04-28 | Add Step 9.5 — confirm Spotter (AI search) enablement before import. Default Y; preserves existing setting on in-place updates. |
| 1.2.0 | 2026-04-28 | Map SV synonyms/descriptions to TS Model + Table TMLs. Add Step 6D for table-description updates. Document non additive by ... desc → first_value. Fix synonyms placement (properties.synonyms not column root). |
| 1.1.0 | 2026-04-24 | Add Step 0 session plan with confirmation gate |
| 1.0.0 | 2026-04-24 | Initial versioned release |