ts-convert-from-tableau

name: ts-convert-from-tableau description: Convert or import a Tableau workbook (.twb or .twbx) into ThoughtSpot — parses TWB XML, generates table + model TMLs, validates and imports. Optionally migrates dashboards to liveboards with layout approximation. Direction is always Tableau → ThoughtSpot. Not for ThoughtSpot → Tableau or standalone TML exports.

Tableau Workbook → ThoughtSpot

Converts a Tableau workbook into ThoughtSpot objects. Parses the TWB XML to extract tables, columns, joins, and calculated fields, then generates Table TMLs and a Model TML per datasource. Optionally converts Tableau dashboards into ThoughtSpot Liveboards with approximate layout mapping.

Ask one question at a time. Wait for each answer before proceeding.

References

File	Purpose
../../shared/mappings/tableau/tableau-formula-translation.md	Tableau → ThoughtSpot formula and function mapping
../../shared/mappings/tableau/tableau-tml-rules.md	TML generation rules — critical invariants for valid import
../../shared/schemas/thoughtspot-table-tml.md	Table TML structure reference
../../shared/schemas/thoughtspot-model-tml.md	Model TML structure reference
../../shared/schemas/thoughtspot-sql-view-tml.md	SQL View TML structure — for custom SQL datasources
../../shared/schemas/thoughtspot-liveboard-tml.md	Liveboard TML structure reference
../../shared/schemas/thoughtspot-answer-tml.md	Answer/visualization TML structure
../ts-profile-thoughtspot/SKILL.md	ThoughtSpot auth setup
references/open-items.md	Known validation quirks and workarounds
references/liveboard-style-themes.md	Step 10.5 curated themes — brand tokens + per-chart `viz_style` color palettes

Prerequisites

ThoughtSpot profile configured — run /ts-profile-thoughtspot if not
ts CLI installed: pip install -e tools/ts-cli
Tableau workbook file (.twb or .twbx) accessible on disk
The source tables and their data already exist in a warehouse, and a ThoughtSpot connection exposes them. This skill creates ThoughtSpot logical objects (Table, Model, cohorts, Liveboard) over existing physical tables — it does not create warehouse tables or load/populate data. A ThoughtSpot table binds to a live connection that already surfaces the physical table and its columns (see Step 4 / thoughtspot-table-tml.md); if no such connection/table exists, set that up first (the data pipeline is out of scope). The skill may read the warehouse for confirmation (value formats, ranges, membership) — with your authorization — but never loads or modifies data.

Working principle — surface, recommend, resolve

Whenever the parse or generation hits a situation that has no clean 1:1 automatic translation or needs a judgement call — e.g. a cross-datasource blend, a join key that doesn't exist / spans two tables, a date stored as VARCHAR, bins (formula vs cohort), an ambiguous count column, a manual group (cohort vs if/then), an untranslatable formula, or a value-vs-data mismatch — do not silently drop it, guess, or merely flag it. Instead:

Surface it — tell the user plainly what was found and why it's not a straight translation.
Recommend — if there's a sound solution (or a small set of options), say which and why, with the trade-offs.
Resolve — with the user's go-ahead, do it (build the SQL view, prompt for the value, retype the column, create the cohort, etc.). Only fall back to omit-and-flag when no solution exists or the user declines.

Default to enabling the migration, not abandoning the hard parts. The per-step prompts and checkpoints below are how this principle is applied in practice.

Read the actual calculation — never infer from the name. A worksheet called "Highest Growth in past 5 years" tells you the intent, not the logic. Always inspect the real Tableau definition — the table-calc type (pcdf, pctd, running_*), the filters (Top-N, recent-N-years), the compute-using/partition, and the sort — and translate that. (Example: that title is really "top 5 sectors by FDI % change over a 6-year window" — a period comparison, not a raw growth of line.)

Placeholder charts when a full translation isn't possible. If a viz can't be fully reproduced, don't silently omit it — build a placeholder: a TABLE with the columns you can produce, and write a note in both the viz's answer.description and the Migration Summary tab that the chart is partial and needs review. A visible, labelled stub the user can finish beats a missing tile.

Step 0 — Overview

On skill invocation, display this plan before doing any work:

ts-convert-from-tableau — convert a Tableau workbook into ThoughtSpot TML objects, with optional dashboard-to-liveboard migration.

Modes

A Audit — analyse a TWB file (or multiple files) and report migration coverage. No ThoughtSpot auth required. No TMLs generated. Use this to assess feasibility before committing to a migration.

M Migrate — full conversion: parse, generate TMLs, validate, and import.

Enter A / M:

Steps (Migrate mode)

Authenticate to ThoughtSpot .......................... auto
Locate and extract the TWB file ...................... you provide path
Parse TWB XML — extract tables, columns, joins, calculated fields, blend relationships ............ auto
Select ThoughtSpot connection (required) ............ you choose 4.5 Confirm source tables (reuse vs. create; search) .... you choose
Generate TML files (table + sql_view + model) ...... auto 5.5 Confirm Spotter (AI search) enablement (default Y) .. you choose
Validate against ThoughtSpot (up to 10 fix cycles) .. auto
Review checkpoint (formula map + omissions) + import you confirm 7.5 Confirm the model is correct (test in Search/Spotter) you confirm
Migrate dashboards? + separate vs single-tabbed (2+) . you choose (skip → Step 12)
Parse dashboard layout and map to grid ............... auto 9d. Orphan worksheets (not on a dashboard) — add as tiles? you choose
Generate liveboard TML (export model for params first) auto 10f. Add referenced parameters to the header? (default Y) . you choose 10g. Add a "Migration Summary" tab (migrated/decisions/omitted) auto 10.5 Pick a liveboard style (curated theme; default) ..... you choose
Import liveboard ..................................... you confirm 11.5 Formula coverage answers (every formula testable) ... auto
Migration report (outcomes + links + formula map) ... auto

Confirmation required: Steps 4.5, 5.5, 7, 7.5, 8, 9d, 11 Auto-executed: Steps 1, 3, 5, 6, 9, 10, 12

Steps (Audit mode)

A1. Locate and extract TWB file(s) ...................... you provide path(s) A2. Parse TWB XML — same extraction as Step 3 .......... auto A3. Classify formulas into translation tiers ............ auto A4. Migration coverage report ........................... auto

No auth, no TML generation, no import. Supports multiple files in one run.

If Audit mode, proceed to Step A1. If Migrate mode, proceed to Step 1.

Step A1 — Locate TWB File(s) (Audit Mode)

Ask: "Provide the path to a .twb or .twbx file, or a directory containing multiple workbooks."

If a directory is provided, find all .twb and .twbx files recursively. For each .twbx, extract to a temp directory to access the inner .twb.

Save the list of TWB paths. Process each file through Steps A2–A4 independently.

Step A2 — Parse TWB XML (Audit Mode)

Run the same extraction as Step 3 (3a through 3e) on each TWB file. Do NOT skip any datasource type. For extracts, resolve the underlying source (Step 3b) and report it as migratable via that source; mark as "Extract — no underlying source" only when none resolves.

Step A3 — Classify Formulas (Audit Mode)

MANDATORY (I7) — before classifying any calculated field as untranslatable, open ../../shared/mappings/tableau/tableau-formula-translation.md and check its full function table and pass-through section. Do not decide from syntax alone.

For each calculated field extracted in Step A2, classify it into one of these tiers based on the patterns in tableau-formula-translation.md:

Tier	Description	Examples
Native / Set	Direct ThoughtSpot mapping exists	IF/THEN, IFNULL, DATEDIFF, LEFT, ABS, ROUND, IIF; bins (`class='bin'`) → `floor([x]/size)size` or BIN_BASED cohort; manual groups* (`class='categorical-bin'`, incl. fields named "… clusters") → `GROUP_BASED` cohort; `Number of Records`/row counts → `count([column])` (prompt for the column; default the primary key); static sets (`<group>` with `union`/`member`) → `GROUP_BASED` column-set cohort — incl. ones anchored on a formula column, with a `%null%` member (via `EQ {Null}`), or an `except` member-list (via `NE`) (Phase 2a); Top-N/Bottom-N sets (`function='end'`) → query set (`cohort_type: ADVANCED`, `COLUMN_BASED`) via a rank formula + parameter-filter formula (Phase 2b); condition-based sets (`function='filter'`) → query set with aggregate condition formula (Phase 2c); member-list intersect → `GROUP_BASED` cohort of common members (Phase 2c); all-except-Top-N → query set with inverted rank filter (Phase 2c); computed set operations (intersect/except of mixed types) → multi-formula query set (Phase 2c)
LOD	LOD expression → `group_aggregate()`	`{FIXED dim : SUM(col)}`; `TOTAL(SUM(x))` / percent-of-total → `group_aggregate(..., {}, query_filters())`
Cumulative	Running calculation → `cumulative_*()`	RUNNING_SUM, RUNNING_AVG
Moving	Window table calc → `moving_*()`	WINDOW_SUM, WINDOW_AVG (when sort attr determinable)
Pass-through	Valid SQL but no native function → `sql_*_aggregate_op()`	Partitioned RANK, DENSE_RANK, WINDOW_* without sort context
Partial / Unmapped (sets)	Tableau set construct with no current ThoughtSpot equivalent — logged as deferred, never mis-translated	set controls (`level-members` only, no fixed members) → no set object, surface as a liveboard filter; set actions (`<action>`) → no equivalent
Untranslatable	No ThoughtSpot equivalent — will be omitted	LOOKUP, INDEX, SIZE(), FIRST(), LAST(), PREVIOUS_VALUE (standalone partition-position table calcs — e.g. the comma-separated-list-of-set-members technique; not FIRST()/LAST() as `WINDOW_`/`RUNNING_` offset args, which map to moving/cumulative); true k-means clustering (the analytics-engine "Clusters" calc — not `categorical-bin`); geospatial (`MAKEPOINT`, `MAKELINE`, `DISTANCE`, `BUFFER`, `AREA`) — decompose `MAKEPOINT` lat/lon args to individual attribute columns, omit the spatial formula (see `tableau-formula-translation.md` Geospatial Policy)
Parameter ref (auto)	References a Tableau parameter with static list/range — parameter auto-created in model	`[Parameters].[Currency]` where Currency has `<member>` values
Parameter ref (query)	References a Tableau parameter with SQL-lookup list — queryable at migration time	SQL-populated parameter lists (needs connection)

Classifier implementation notes

Function detection — require parentheses. Match FUNCTION_NAME( (with optional whitespace before the paren), not bare word boundaries. Bare \bSIZE\b false-positives on dimension values like 'Size' or column names like [Size]. Correct patterns:

LOOKUP\s*\(   INDEX\s*\(   SIZE\s*\(   FIRST\s*\(   LAST\s*\(   PREVIOUS_VALUE\s*\(   RAWSQL_
RUNNING_(SUM|AVG|MAX|MIN|COUNT)\s*\(
WINDOW_(SUM|AVG|MAX|MIN|COUNT|STDEV|VAR|MEDIAN|PERCENTILE)\s*\(
RANK(_UNIQUE|_MODIFIED|_DENSE|_PERCENTILE)?\s*\(
TOTAL\s*\(
MAKEPOINT\s*\(   MAKELINE\s*\(   DISTANCE\s*\(   BUFFER\s*\(   AREA\s*\(

FIRST()/LAST() precedence. These are untranslatable standalone (partition-position calcs like LAST()=0, or the comma-separated-list-of-members technique using FIRST()/LAST()/LOOKUP()/PREVIOUS_VALUE()). BUT FIRST()/0 also appear as the offset arguments of a WINDOW_*/RUNNING_* window (e.g. WINDOW_SUM(SUM([x]), FIRST(), 0)) — there they belong to the moving/cumulative mapping, not the untranslatable tier. So match RUNNING_*/WINDOW_* first; only classify FIRST()/LAST() as untranslatable when they are not inside a window function's argument list.

Parameter references. Detect [Parameters].[...] pattern — this is Tableau's cross-datasource parameter reference syntax. These formulas use translatable syntax (IF/CASE/WHEN). Cross-reference the parameter name against the parameter definitions extracted in Step A2/3:

If the parameter has static <member> list values or a <range> → Parameter ref (auto) — the parameter will be auto-created in the model TML, formula translates with a simple [Parameters].[Name] → [Name] prefix strip
If the parameter has no static values (SQL-lookup populated) → Parameter ref (query) — auto-migratable at migration time (requires warehouse connection to populate list values), but flagged separately in audit mode since no connection is available

LOD first. Check {FIXED|INCLUDE|EXCLUDE} before other tiers — LOD expressions may also contain functions like SUM that would match Native.

For each formula, also check:

Does it reference other calculated fields? (cross-reference depth)
Does it use functions from the untranslatable list?
Does it mix translatable and untranslatable patterns?

Step A4 — Migration Coverage Report (Audit Mode)

For each TWB file, produce a coverage report. If multiple files were audited, also produce a combined summary at the end.

Per-file report:

Audit: {workbook_name}
══════════════════════════════════════════════════════

  Datasources:          {N} total
    Live:               {N}
    Extract:            {N} (skipped in migration)
    Published (sqlproxy): {N}

  Physical tables:      {N}
  Custom SQL relations: {N} → will generate sql_view TMLs
  Joins:                {N}

  Calculated fields:    {N} total
  ┌──────────────────────────────────────────────────────┐
  │ Tier                    Count    %     Examples      │
  ├──────────────────────────────────────────────────────┤
  │ Native                  {N}     {%}   IF, DATEDIFF  │
  │ LOD → group_agg         {N}     {%}   {FIXED ...}   │
  │ Cumulative              {N}     {%}   RUNNING_SUM   │
  │ Moving                  {N}     {%}   WINDOW_SUM    │
  │ Pass-through            {N}     {%}   DENSE_RANK    │
  │ Parameter ref (auto)    {N}     {%}   static list   │
  │ Parameter ref (query)   {N}     {%}   SQL lookup    │
  │ Geospatial (omit+log)   {N}     {%}   MAKEPOINT     │
  │ Untranslatable          {N}     {%}   LOOKUP        │
  └──────────────────────────────────────────────────────┘

  Tableau Sets (top-level <group> elements — separate from calculated fields):
  ┌──────────────────────────────────────────────────────┐
  │ Set tier                Count    Notes               │
  ├──────────────────────────────────────────────────────┤
  │ Native / column set     {N}      static + member-intersect → GROUP_BASED cohort (2a/2c) │
  │ Query set               {N}      Top-N, condition, all-except-Top-N, mixed ops → ADVANCED (2b/2c) │
  │ Partial / deferred      {N}      set controls + set actions (no equivalent)     │
  └──────────────────────────────────────────────────────┘

  Parameters:           {N} total ({N} static, {N} SQL-lookup — query at migration)
  Dashboards:           {N} (optional liveboard migration)

  ──────────────────────────────────────────────────
  Migration coverage:   {(all except untranslatable) / total}%
                         (all parameters auto-created — static or queried)
  Untranslatable:       {N} formula(s) — will be omitted
  Geospatial:           {N} formula(s) — spatial funcs omitted; lat/lon cols migrated as attributes
  Deferred sets:        {N} (set controls/actions — flagged for manual creation)
  SQL-lookup params:    {N} — need warehouse connection at migration time
  Pass-through formulas require SQL Passthrough Functions enabled.

  Data Blending (resolve federated IDs to datasource captions for display):
  ┌──────────────────────────────────────────────────────┐
  │ Primary Datasource   Secondary DS(s)   Link Columns  │
  ├──────────────────────────────────────────────────────┤
  │ {ds_caption}         {ds_caption}      {col1}, {col2}│
  └──────────────────────────────────────────────────────┘
  Blended datasources:  {N} of {total} — will merge into single model(s)
  Blend relationships:  {M} total
  Star topologies:      {S} (1 primary → 2+ secondaries)
  ──────────────────────────────────────────────────

Migration coverage includes everything except Untranslatable. All parameter types are auto-migratable: static params are created directly in the model TML; SQL-lookup params are populated by querying the warehouse at migration time. The formula reference [Parameters].[Name] is rewritten to [Name] in both cases.

If any formulas are classified as Untranslatable, list them:

  Untranslatable formulas (will be omitted):
    - {formula_name}: {reason} — {expression excerpt}
    - ...

If any SQL-lookup parameters exist, note them:

  SQL-lookup parameters ({count} — populated from warehouse at migration time):
    - {param_name}: query/column reference from TWB
    - ...
  Values are a point-in-time snapshot. Consider /ts-recipe-parameter-sync for
  ongoing refresh.

If any formulas are classified as Pass-through, list them with the generated expression:

  Pass-through formulas (require SQL Passthrough Functions enabled):
    - {formula_name}: sql_{type}_aggregate_op("...", ...)
    - ...

Write the report to /tmp/ts_tableau_mig/audit/{workbook_name}_audit.md and display it inline.

Combined summary (multiple files):

Audit Summary: {N} workbook(s)
══════════════════════════════════════════════════════

  Workbook                          Tables  Calcs  Coverage
  ─────────────────────────────────────────────────────────
  {workbook_1}                      {N}     {N}    {%}%
  {workbook_2}                      {N}     {N}    {%}%
  ...
  ─────────────────────────────────────────────────────────
  Total                             {N}     {N}    {%}%

After the audit, exit cleanly. Do NOT proceed to Migrate mode steps.

Step 1 — Authenticate

Read ~/.claude/thoughtspot-profiles.json. If multiple profiles exist, display a numbered menu and ask the user to choose. If only one profile, use it automatically.

source ~/.zshenv && ts auth whoami --profile "{profile_name}"

Save {base_url} and {profile_name} for all subsequent steps.

Step 2 — Extract TWB File

Ask for the file path if not yet provided.

If the file ends in .twbx (a ZIP archive), extract it:

mkdir -p /tmp/ts_tableau_mig && unzip -o "{twbx_path}" -d /tmp/ts_tableau_mig/

Then find the .twb inside:

find /tmp/ts_tableau_mig -name "*.twb" | head -1

Save the resolved .twb path as {twb_path}.

Step 3 — Parse TWB XML

Read {twb_path} in full. The TWB is XML. Extract the following elements:

3a. Workbook name

Take from the filename (strip .twb). Save as {workbook_name}.

3b. For each `<datasource>` element (skip those named `Parameters`)

Each datasource is processed independently for extraction (Steps 3b–3d). Datasources are merged into a single model only when they are connected by blend relationships (detected in Step 3e). Even when datasources share tables or point at the same database, they are NOT merged unless a <datasource-relationship> explicitly links them. See Step 5b "Blend-aware model grouping" for the merge procedure.

Datasource type detection:

If the datasource contains <connection class="sqlproxy">, it is a Published Datasource (hosted on Tableau Server). The table name resolves to connection.get('dbname'), not the literal [sqlproxy].
If the datasource contains <extract>, do not blindly skip it. An extract is a local snapshot, but it almost always wraps an underlying connection that names a real table — a file source (textscan/CSV, excel-direct), a database, etc. What matters for migration is that underlying source, because that's what gets queried in the warehouse. Look past the <extract>/hyper connection to the real one:
- The relation has two parents in <metadata-records> — the live source (e.g. [Amazon Sales data.csv]) and [Extract]. Use the live-source relation; ignore the [Extract] relation. The physical table name comes from the live source (mapped to its warehouse table per Step 4).
- Only treat a datasource as truly skippable when there is no resolvable underlying connection (a pure Tableau-authored extract with no source) — and say so in the report.
- File-based sources (CSV/Excel) imply the data was loaded into the warehouse out of band; bind the table to the connection that now exposes it (Step 4/4.5).
Otherwise, it is a Live datasource — proceed with extraction.

Non-warehouse sources — explicit unsupported policy: The following Tableau connection classes are NOT warehouse-bound and cannot be mapped to a ThoughtSpot connection: cloudfile:googledrive-excel-direct, google-sheets, ogrdirect (spatial/OGR), webdata-direct (web data connector), CustomMapbox. When any of these appear as a datasource's connection class, do NOT assume a warehouse table exists. Instead:

Log: "Datasource '<name>' uses a non-warehouse source (<class>) — cannot map to a ThoughtSpot connection. Skipped; data must be loaded into a warehouse first."
Skip the datasource entirely (do not generate table or model TML for it).
Surface in the audit report under a "Skipped sources" section.

Redshift and Postgres dialect notes: When <connection class="redshift"> or <connection class="postgres"> is detected, pass-through SQL (sql_*_op) formulas should use the corresponding dialect syntax. Key differences from Snowflake:

String concatenation: || (same as Snowflake)
Date truncation: date_trunc('month', col) (same syntax, both dialects)
LISTAGG → Redshift: LISTAGG(col, ',') WITHIN GROUP (ORDER BY col); Postgres: string_agg(col, ',' ORDER BY col)
Type casting: Redshift uses ::type; Postgres uses CAST(x AS type) or ::type

No other mapping changes are needed — the Tableau-to-ThoughtSpot formula translation is warehouse-agnostic (ThoughtSpot formulas are the target, not SQL). The dialect only matters for sql_*_op pass-through functions.

Relation wrapper handling: TWB XML wraps <relation> elements in one of three structures. Check in order:

_.fcp.ObjectModelEncapsulateLegacy.false...relation tag
_.fcp.ObjectModelEncapsulateLegacy.true...relation tag
<relation> directly under <connection class='federated'> (fallback)

All three contain the same child elements — the wrapper determines where to look.

For each datasource, extract:

Physical tables — <relation> elements of type="table":

name attribute = table alias used in joins
table attribute = fully-qualified physical table name — may be [DB].[SCHEMA].[TABLE] format; strip brackets and split on . to extract db, schema, and table components
For Published Datasources (sqlproxy): if table name is [sqlproxy], use connection.get('dbname') instead

Custom SQL relations — <relation> elements of type="text":

These contain raw SQL in the element text content — do NOT try to extract a table name
Flag the relation as source_type: "custom-sql" and save the full SQL text
Refactor the SQL: replace << with <, >> with >, == with = (XML encoding artifacts from the TWB)
These will generate a sql_view: TML instead of a table: TML (see Step 5c)
Extract column names from the SQL SELECT clause aliases for column mapping

Joins — <relation> elements of type="join":

join attribute = join type (inner | left | right | full)
<clause> child = join condition (decode HTML entities: "→", &→&, <→<, >→>)
Extract left and right table references from the clause

Physical columns — from <metadata-records> → <metadata-record class="column">:

local-name = column identifier
remote-name = physical column name in the database (use for db_column_name)
local-type = Tableau data type
parent-name = which table this column belongs to
Also extract from <column> elements WITHOUT a <calculation> child: name (strip brackets), datatype, role (dimension/measure), caption (display name)

Calculated fields — <column> elements WITH a <calculation class="tableau"> child:

Skip columns where param-domain-type is list or range — these are Tableau parameters, not calculated fields
caption or name = display name
calculation formula attribute = Tableau expression (decode HTML entities)
datatype attribute
Build a cross-reference map: Tableau internal names ([Calculation_1234567890]) → display names. Calculated fields reference each other by internal ID in the TWB XML, not by display name — resolve these references before translating formulas.

Parameters — <datasource name="Parameters"> children:

For each <column> with param-domain-type attribute:
- caption = display name (used as ThoughtSpot parameter name)
- datatype = string | integer | real | date | boolean
- param-domain-type = list | range | any
- value attribute or calculation.formula = default value
- <member value="..."> children = list values (when param-domain-type="list")
- <range min="..." max="..."> child = range bounds (when param-domain-type="range")
Save parameter definitions — these generate model.parameters[] in Step 5b
SQL-lookup parameters (where the list values come from a database query rather than static <member> elements): save the query/column reference — at migration time (Step 5b), query the warehouse to populate list_config.list_choice[] with current values. In audit mode (no connection), flag as "requires connection"

Save the parsed structure internally. Announce a summary:

Parsed {workbook_name}: {N} datasource(s), {N} physical table(s), {N} calculated field(s), {N} join(s), {N} dashboard(s)

3c. Topological sort of calculated fields

Some calculated fields reference other calculated fields. Sort them so that fields with no formula-dependencies come first (Level 0), then Level 1, etc. This determines the order they must appear in the model TML formulas section.

Resolve all internal Tableau cross-references ([Calculation_\d+] → display name) before sorting. The topological sort must use display names, not internal IDs.

3d. Dashboard metadata (for Step 8 decision)

Count <dashboard> elements in the TWB. Save the count and names — this is shown in Step 8 when asking whether to migrate dashboards.

3e. Extract blend relationships (data blending)

Parse the <datasource-relationships> element at the workbook root (child of <workbook>). If absent, no blending is used — skip this step.

Build the blend graph:

blend_graph = {}  # {source_ds_name: [{target_ds_name, column_mappings}]}

ds_rels = root.find('.//datasource-relationships')
if ds_rels is not None:
    # Build a map of datasource-dependencies: ds_id → {column_instance_name → base_column_name}
    dep_map = {}
    for dep in ds_rels.findall('datasource-dependencies'):
        ds_id = dep.get('datasource')
        instance_to_col = {}
        for col_inst in dep.findall('column-instance'):
            instance_to_col[col_inst.get('name')] = col_inst.get('column')
        dep_map[ds_id] = instance_to_col

    # Parse each datasource-relationship (pairwise blend link)
    for rel in ds_rels.findall('datasource-relationship'):
        source_ds = rel.get('source')   # primary datasource
        target_ds = rel.get('target')   # secondary datasource
        col_maps = []
        for m in rel.findall('column-mapping/map'):
            # key format: [federated.xxx].[instance_name]
            # Extract the instance_name portion after the datasource prefix
            src_key = m.get('key')
            tgt_key = m.get('value')

            # Parse instance name from fully-qualified reference
            src_inst = src_key.split('].[')[1].rstrip(']') if '].[' in src_key else src_key
            tgt_inst = tgt_key.split('].[')[1].rstrip(']') if '].[' in tgt_key else tgt_key

            # Resolve to base column names via dep_map
            src_col = dep_map.get(source_ds, {}).get(src_inst, src_inst)
            tgt_col = dep_map.get(target_ds, {}).get(tgt_inst, tgt_inst)

            # Strip brackets from column names: [Category] → Category
            src_col = src_col.strip('[]')
            tgt_col = tgt_col.strip('[]')

            col_maps.append({'source_col': src_col, 'target_col': tgt_col})

        blend_graph.setdefault(source_ds, []).append({
            'target_ds': target_ds,
            'column_mappings': col_maps,
        })

Store blend_graph alongside the per-datasource extraction results from Step 3b. The graph keys are datasource name attributes (the federated.xxx IDs from the XML).

What to log:

Number of blend relationships found
For each: primary datasource → secondary datasource, with linking columns listed

Date-grain linking columns:

When a <column-instance> has a derivation other than "None" (e.g. "Month", "Month-Trunc", "Year", "Year-Trunc"), the blend links at a specific time grain. For the ThoughtSpot model join, the physical date column is used directly — ThoughtSpot's date bucketing at query time handles the grain alignment.

However, if the source and target columns are physically different date columns with different native grains (e.g. source has daily Order Date, target has monthly Month of Order Date that is already pre-truncated), the join requires a date-truncation formula or SQL View to materialize the matching grain.

Resolution strategy:

If both columns are date/datetime type and the derivation indicates a truncation (Month-Trunc, Year-Trunc), emit a model formula: date_trunc ( 'month' , [TABLE::Order Date] ) and use that formula as the join key via a SQL View (the formula can't be a direct join key in model TML).
Surface the grain mismatch to the user in the review checkpoint with a recommendation:
- "Blend links Order Date (daily) to Month of Order Date (monthly) at month grain. Recommend: create a SQL View with DATE_TRUNC('MONTH', ORDER_DATE) AS ORDER_MONTH and join on ORDER_MONTH = MONTH_OF_ORDER_DATE."
If both columns are the same physical type and grain, use them directly in the join on clause — no materialization needed.

No model merging happens here — this step only extracts the relationships. Model merging happens in Step 5b.

Step 4 — Select ThoughtSpot Connection

List available connections and let the user select:

source ~/.zshenv && ts connections list --profile {profile_name}

ts connections list auto-paginates and returns all connections. Display the results as a numbered list showing connection name, type, and database. If only one connection exists, auto-select it and confirm with the user.

Available ThoughtSpot connections:

  1. SNOWFLAKE_PROD    (RDBMS_SNOWFLAKE)   — PROD_DB
  2. ANALYTICS_DW      (RDBMS_SNOWFLAKE)   — ANALYTICS_DB

Which connection should the generated tables use? (Enter number):

Save the selected connection's exact name value as {connection_name}. This name is used in SQL View TMLs (where connection.name is required) and for schema resolution.

If the user selects a connection, fetch the schema to resolve db/schema/table names:

source ~/.zshenv && ts connections get {connection_id} --profile {profile_name}

Parse the response to extract available databases, schemas, and table names. For each physical table from Step 3, find the best match (case-insensitive) in the connection schema. Save the resolved {db}, {schema}, and {db_table} for each table.

If the connection response has no tables (empty externalDatabases), ask the user for the database and schema names directly.

A connection is required — there is no skip path. ThoughtSpot tables are logical objects over a live connection: the physical table must already exist in the database and the connection must already exist for the table to be created at all. You cannot generate a usable table without one, so do not offer placeholders or a dry-run mode — they only produce objects that can never bind to data. If the user has no suitable connection, stop and tell them a connection exposing the source tables must be created first (the data pipeline / connection setup is out of this skill's scope).

Use the selected connection's exact name in every table TML and SQL View TML — never a GUID. The v2 API cannot search connections by name, so the name string is both necessary and sufficient; do not try to resolve it to an ID. See ../../shared/schemas/thoughtspot-table-tml.md "Connection Reference".

Step 4.5 — Confirm Source Tables & Search Decision

Step 4 resolved a {db}/{schema}/{db_table} for each physical table — but resolving a name is not the same as confirming the table is actually there. A model TML that points at a table the connection can't see will still import cleanly, yet every search and liveboard built on it comes back empty. That failure is silent and easy to miss, so confirm the source situation before generating anything. This step only searches and confirms — it never loads or modifies warehouse data (that is the data pipeline's job, not this skill's). It mirrors ts-convert-from-databricks-mv Step 8.

4.5a — Present the table list and ask (do NOT search yet)

Show the user the full inventory of physical tables from Step 3, then ask whether those tables already exist as ThoughtSpot Table objects. Ask before searching. Searching ThoughtSpot for every table on every run is wasteful when the user already knows the answer — so the search is gated behind the user's response, not run up front.

Source tables referenced by {workbook_name} ({N} total):

  1. AGENT_SKILLS.AMAZON_SALES_DATA.AMAZON_SALES_DATA
  2. AGENT_SKILLS.DUAL_AXIS_EXAMPLE.LISTOFORDERS
  …

Do these already exist as ThoughtSpot Table objects?
  E  Exist      — reuse them (I'll look up their GUIDs)
  N  Don't exist — create new Table TMLs            (default)
  ?  Unsure     — search ThoughtSpot to find out

Enter E / N / ? :

If the tables differ in status (some exist, some don't), the user can say so — accept a per-table answer or let them point out the exceptions.

4.5b — Act on the answer

N (don't exist) → create a new Table TML for each in Step 5a (the default path). No search.
E (exist) or ? (unsure) → now search ThoughtSpot to locate/confirm:
```
source ~/.zshenv && ts metadata search --subtype ONE_TO_ONE_LOGICAL --all --profile {profile_name}
```
Match on database + schema + table name (metadata_header.database_stripes, metadata_header.schema_stripes, metadata_name). For each table found, reuse its name/GUID in the model's model_tables[] and skip generating a Table TML for it in Step 5a. For ?, report what was/wasn't found and treat the not-found ones as create. For E, if a table the user expected is not found, say so and confirm before falling back to create.

4.5c — Confirm any missing sources before proceeding

If any table the plan intends to create is not found in the connection, surface it and require confirmation — this is the silent-failure case:

⚠ The following table(s) are not visible to connection "{connection_name}":
    - {db}.{schema}.{db_table}
  Their models will import, but searches return no data until the data is loaded
  and visible to the connection. This skill does not load data.

  Proceed anyway (generate the TMLs as-is)?   (yes / no):

Do not proceed past this warning without the user's confirmation.

Step 5 — Generate TML Files

Create output directory:

mkdir -p /tmp/ts_tableau_mig/output/{workbook_name}

5a. Table TML — one per physical table (skip custom SQL relations)

For each physical table identified in Step 3 with type="table", generate a .table.tml file. Skip custom SQL relations — those are handled in Step 5c.

Follow all rules in tableau-tml-rules.md.

Template:

table:
  name: TABLE_NAME
  db: RESOLVED_DATABASE
  schema: RESOLVED_SCHEMA
  db_table: physical_table_name
  connection:
    name: "{connection_name}"       # exact ThoughtSpot connection name, case-sensitive — NOT a GUID
  columns:
  - name: COLUMN_NAME
    db_column_name: COLUMN_NAME
    data_type: VARCHAR              # VARCHAR | INT64 | DOUBLE | FLOAT | BOOL | DATE | DATETIME
    properties:
      column_type: ATTRIBUTE        # or MEASURE
      aggregation: SUM              # only if MEASURE — SUM | AVERAGE | COUNT
    db_column_properties:
      data_type: VARCHAR            # must match data_type above

Key rules:

connection.name is required — a ThoughtSpot logical table must sit on a connection that already exposes the physical table and its columns. Use the connection name directly (case-sensitive); never look up a GUID — the v2 API cannot search connections by name, and the name is what the TML needs. See ../../shared/schemas/thoughtspot-table-tml.md "Connection Reference".
Use the db, schema values resolved from Step 4 (the connection is required, so these are always real).
db_column_name must match the physical column the connection exposes — not the Tableau name. When a file source (CSV/Excel) was loaded into the warehouse, the loader usually normalizes names (Item Type → ITEM_TYPE: spaces→_, upper-cased). Use the warehouse column name for db_column_name (and the friendly Tableau caption for the model column's display name). If unsure, the connection schema from Step 4 (externalDatabases) lists the real column names; validation reports column not found in connection when they don't match.
Date stored as VARCHAR — flag it. If the Tableau column is typed date/datetime but the warehouse column is VARCHAR (common when a CSV date loaded as text), binding it as VARCHAR loses all date capability (no buckets/trends/relative-date filters; Spotter won't read it as time). The TS column data_type must match the physical column, so you can't just declare DATE. Surface it and offer: (a) retype at the source (warehouse ALTER/reload to a real DATE — outside this skill; needs the user) then bind as DATE, or (b) keep VARCHAR and add a to_date([col]) derived formula column for date analytics. Don't silently bind a date as a string.
Partial date strings must produce a full YYYY-MM-DD date. When a source column contains a year-only value (e.g. _2016_17, FY2016, 2016) and needs to become a DATE, always append -01-01 (or -01 for year-month) to produce a complete date. A bare-year conversion like to_date('2016', 'yyyy') produces an ambiguous value that ThoughtSpot cannot bucket (.yearly, .monthly), use for KPI sparklines, or filter as a date range. If the datasource already uses a SQL View, apply the conversion in the SQL query (TO_DATE(SUBSTRING(col, 2, 4) || '-01-01', 'YYYY-MM-DD')). If it uses a regular table, apply it as a model formula (to_date ( concat ( substr ( [col] , 1 , 4 ) , '-01-01' ) , 'yyyy-MM-dd' )). See tableau-tml-rules.md "Date Column Rules" for the full pattern table.
Use INT64 for Tableau integer — never INT
db_column_properties is required on every column
No guid or fqn sections
If validation (Step 6) returns connection not found, the name/case is wrong; if it returns column not found in connection, the physical table/column the connection sees doesn't match db_table/db_column_name — both are surfaced there, so a wrong binding fails loudly rather than silently.

Write each file to /tmp/ts_tableau_mig/output/{workbook_name}/{TABLE_NAME}.table.tml.

5b. Model TML — one per datasource (strict separation)

Generate one .model.tml per datasource the workbook actually uses — don't blindly merge independent datasources, but also don't materialize an unused model for every datasource. Blend-aware model grouping (requires blend_graph from Step 3e):

When blend_graph is non-empty, datasources connected by blend relationships produce a single merged model instead of separate models. The merge procedure:

Build connected components from blend_graph. Each connected component (a primary datasource and all its direct or transitive secondaries) becomes one model. Use the primary datasource's display name as the model name.

# Build undirected adjacency from blend_graph for connected-component discovery
adjacency = {}
for src, targets in blend_graph.items():
    for t in targets:
        adjacency.setdefault(src, set()).add(t['target_ds'])
        adjacency.setdefault(t['target_ds'], set()).add(src)

visited = set()
model_groups = []  # each group: {'primary': ds_id, 'members': [ds_id, ...]}

for ds_id in adjacency:
    if ds_id in visited:
        continue
    # BFS to find connected component
    component = []
    queue = [ds_id]
    while queue:
        node = queue.pop(0)
        if node in visited:
            continue
        visited.add(node)
        component.append(node)
        queue.extend(adjacency.get(node, set()) - visited)

    # Primary = a datasource that appears as `source` but NEVER as any `target`
    all_targets = {t['target_ds'] for edges in blend_graph.values() for t in edges}
    roots = [d for d in component if d in blend_graph and d not in all_targets]
    primary = roots[0] if roots else component[0]
    model_groups.append({'primary': primary, 'members': component})

Build the datasource → table mapping. Each Tableau datasource has a primary physical table (the first <relation> element or the relation named in the <datasource> caption). Map each datasource's federated ID to its ThoughtSpot Table TML name from Step 5a. Also map each federated ID to its Tableau caption attribute (the display name shown in Tableau).
```
ds_id_to_table = {}    # federated.xxx → ThoughtSpot table name (from Step 5a)
ds_id_to_caption = {}  # federated.xxx → Tableau datasource display name
for ds_id, ds_info in datasources.items():
    ds_id_to_table[ds_id] = ds_info['primary_table_name']   # TS table name
    ds_id_to_caption[ds_id] = ds_info['caption']             # Tableau caption
```
For multi-table datasources (internal joins within one datasource), the blend link column determines which table is the join anchor. Resolve the link column from Step 3e to its owning table via the column-to-table mapping already built in Step 3b.
For each model group, generate a single model TML that contains:
- All model_tables[] entries from every member datasource (tables + SQL views)
- All columns[] from every member datasource (with column_id prefixed by the correct table name: TABLE_NAME::col_name)
- All formulas[] from every member datasource
- Inline joins derived from blend_graph column mappings (see below)

Generate blend joins — iterate ALL edges in the connected component, not just the primary's. This handles star topologies (A→B, A→C) and transitive blends (A→B, B→C):

for member_ds in model_group['members']:
    for target_info in blend_graph.get(member_ds, []):
        target_ds = target_info['target_ds']
        col_maps = target_info['column_mappings']

        src_table = ds_id_to_table[member_ds]
        tgt_table = ds_id_to_table[target_ds]

        on_parts = []
        for cm in col_maps:
            on_parts.append(f"[{src_table}::{cm['source_col']}] = [{tgt_table}::{cm['target_col']}]")
        on_clause = " and ".join(on_parts)

        # Append join to the TARGET table's model_tables entry (secondary joins to source)
        target_model_table = model_tables_by_name[tgt_table]
        target_model_table.setdefault('joins', []).append({
            'with': src_table,
            'on': on_clause,
            'type': 'LEFT_OUTER',
            'cardinality': 'MANY_TO_ONE',
        })

Cardinality heuristic: if the secondary datasource has no dimension-only columns (all columns are measures or aggregated), it is likely a fact table → use MANY_TO_MANY. Otherwise default to MANY_TO_ONE. Surface the choice in the review checkpoint (Step 7) so the user can override.

Datasources not in any blend continue to produce one model per datasource as before.
Column name conflicts: when merging, if two datasources define columns with the same display name but different semantics, disambiguate by prefixing with the datasource display name (e.g. Orders Revenue vs Targets Revenue). Log every rename.

The model_tables[] section references both regular tables (from Step 5a) and SQL Views (from Step 5c) — both are referenced by name in the same way.

Model name: use the Tableau datasource display name — no prefix (no TEST_ or environment markers). Ask the user if they want a different name before importing. See ../../shared/schemas/ts-model-conversion-invariants.md (N1).

Model TML hard rules — these apply to every model this step generates. Violations cause silent data loss or import rejections with no clear error. See ../../shared/schemas/ts-model-conversion-invariants.md for full detail.

I1 — Every formulas[] entry must have a paired columns[] entry with formula_id: matching the formula's id. An unpaired formula is silently dropped on import.

I2 — Never add aggregation: to a formulas[] entry. It belongs only on columns[] entries. Adding it to formulas[] causes FORMULA is not a valid aggregation type.

I3 — Add index_type: DONT_INDEX on every columns[] entry that has a formula_id and column_type: MEASURE.

I4 — with: must exactly match the target table's name:. (In ThoughtSpot, with: resolves against name, not an id. If you add an id: field to a model_tables entry, it must equal name: exactly — same case, same characters — or joins break with "{table} does not exist in schema" at query time.)

I5 — COUNTD(x) → unique count ( [T::x] ) formula entry, never aggregation: COUNT_DISTINCT. Using aggregation: COUNT_DISTINCT silently flips column_type from MEASURE to ATTRIBUTE.

I6 — Connection referenced by name, never GUID. In every table and sql_view TML block, use connection: name: "{name}" — the display name from Step 4. GUIDs are environment-specific and will fail on any ThoughtSpot instance other than the one they were exported from. See ../../shared/schemas/ts-model-conversion-invariants.md (I1–I6).

Template:

model:
  name: "Datasource Display Name"
  properties:
    spotter_config:
      is_spotter_enabled: true  # set by Step 5.5 — Spotter is on by default
  model_tables:
  - name: TABLE_NAME
    joins:                      # only if this table has joins to others
    - with: OTHER_TABLE         # must match OTHER_TABLE's name exactly (same case)
      on: "[TABLE_NAME::JOIN_COL] = [OTHER_TABLE::JOIN_COL]"
      type: LEFT_OUTER          # INNER | LEFT_OUTER | RIGHT_OUTER | OUTER
      cardinality: ONE_TO_MANY
  - name: OTHER_TABLE
  parameters:                   # omit if no Tableau parameters to migrate
  - name: Currency
    data_type: VARCHAR
    default_value: "USD"
    list_config:
      list_choice:
      - value: USD
      - value: CAD
      - value: GBP
  formulas:                     # omit section entirely if no translatable calculated fields
  - id: formula_Formula Name    # id: "formula_" + display name
    name: Formula Name
    expr: "ThoughtSpot expression"
    properties:
      column_type: MEASURE      # or ATTRIBUTE — NO aggregation: here (I2)
  - id: formula_Unique Customers   # COUNTD(x) → unique count formula, NOT aggregation: COUNT_DISTINCT (I5)
    name: Unique Customers
    expr: "unique count ( [TABLE_NAME::customer_id] )"
    properties:
      column_type: MEASURE
  columns:
  - name: display_name
    column_id: TABLE_NAME::COLUMN_NAME
    properties:
      column_type: ATTRIBUTE    # or MEASURE
  - name: Formula Name          # paired columns[] entry for every formulas[] entry (I1)
    formula_id: formula_Formula Name   # must match the formula's id exactly
    properties:
      column_type: MEASURE
      aggregation: SUM
      index_type: DONT_INDEX    # always on computed MEASURE formula columns (I3)
  - name: Unique Customers      # paired entry for the COUNTD formula (I1 + I5)
    formula_id: formula_Unique Customers
    properties:
      column_type: MEASURE
      aggregation: SUM
      index_type: DONT_INDEX

Parameter migration (Tableau → ThoughtSpot `parameters[]`)

When the TWB has a Parameters datasource (Step 3), generate parameters[] entries in the model TML. Omit id — ThoughtSpot assigns it on import.

Type mapping:

Tableau `param-domain-type`	Tableau `datatype`	ThoughtSpot `data_type`	Config
`list`	`string`	`VARCHAR`	`list_config` with `list_choice[]` from `<member>` values
`list`	`date`	`DATE`	`list_config` with date values (strip `#` delimiters)
`list`	`integer`	`INT64`	`list_config`
`list`	`real`	`DOUBLE`	`list_config`
`range`	`integer`	`INT64`	`range_config` with `range_min`, `range_max` — unless the `<range>` has a `granularity` attribute (step size); then use `list_config` (see note below)
`range`	`real`	`DOUBLE`	`range_config` — same granularity rule applies
`range`	`date`	`DATE`	Free-form (no `range_config` — ThoughtSpot range is numeric only)
`any`	any	mapped type	Free-form (no config)
`list`	`boolean`	`BOOL`	`list_config` with `'true'`/`'false'` values

Value cleanup:

Tableau wraps string member values in double quotes: '"USD"' → strip to USD
Tableau date defaults use # delimiters: #2026-05-10# → strip to 2026-05-10 then format as MM/DD/YYYY (ThoughtSpot's date parameter format)

Stepped range → list_config (not range_config): A Tableau <range> parameter that has a granularity attribute (step size) enumerates to a small discrete choice list → use list_config (enumerate min→max by step), NOT range_config (which cannot express the step). Plain ranges (no granularity) keep range_config.

Note: A parameter that drives a Top-N/Bottom-N set's count should be list_config (discrete choices — live-verified ground truth used list_config; range_config loses the step). Example: <range granularity='5' min='5' max='25'/> → list_choice: [5, 10, 15, 20, 25], data_type: INT64.

SQL-lookup parameters: If a parameter's list values come from a database query (no static <member> elements in the TWB), query the warehouse at migration time to populate list_config.list_choice[]:

Extract the SQL query or column reference from the Tableau parameter definition
Execute against the warehouse connection from Step 4
Use the distinct result values as list_choice[] entries
Log in MIGRATION_LIMITATIONS.md that these values are a point-in-time snapshot

If the selected connection cannot be queried for the values, omit the parameter and log the omission with the original SQL query for manual recreation.

Critical parameter invariants (from live-instance testing):

range_config values (range_min, range_max, default_value) must be strings in the TML — range_min: "1", not range_min: 1. Bare integers cause "Invalid YAML/JSON syntax in file" on import. This applies even when the parameter's data_type is INT64 or DOUBLE.
When a formula references another formula inside sum() — e.g. sum([Attrition Count]) where Attrition Count is if(x='Yes') then 1 else 0 — ThoughtSpot rejects it with "Function sum expects 1st argument to be Numeric". The fix is to inline the referenced formula's expression: write sum ( if ( [x] = 'Yes' ) then 1 else 0 ) directly, not sum ( [Attrition Count] ). Apply this when any MEASURE formula references another formula column inside an aggregation function.
After importing a model with parameters, export the model and read the parameters[].id field — ThoughtSpot assigns the UUID on import. You need this UUID for Step 10f (liveboard parameter chips).

Formula reference translation

In Tableau, calculated fields reference parameters as [Parameters].[Parameter Name]. In ThoughtSpot, parameters are referenced as [Parameter Name] (no prefix, no table qualifier). Apply this transformation:

Tableau:     [Parameters].[Currency]
ThoughtSpot: [Currency]

This is a simple prefix strip: [Parameters].[X] → [X]. Apply AFTER resolving Tableau internal cross-references ([Calculation_\d+]) and BEFORE translating function syntax.

MANDATORY (I7) — before classifying any calculated field as untranslatable, open ../../shared/mappings/tableau/tableau-formula-translation.md and check its full function table and pass-through section. Do not decide from syntax alone. See ../../shared/schemas/ts-model-conversion-invariants.md (I7).

Formula translation rules: use tableau-formula-translation.md.

Convert Tableau join types: full → OUTER, left → LEFT_OUTER, right → RIGHT_OUTER, inner → INNER
Write formulas in topological dependency order (Level 0 first)
Resolve Tableau internal IDs ([Calculation_\d+]) to display names before translating
LOD expressions ({FIXED}, {INCLUDE}, {EXCLUDE}) → group_aggregate() — see the LOD section in tableau-formula-translation.md
TOTAL(SUM(x)) / percent-of-total → group_aggregate(..., {}, query_filters())
Tableau bins (class='bin'): prompt the user for how to create each one — there are two valid representations and the choice is theirs:
```
This workbook has {N} bin field(s):
  - Age (bin):     binned on [Age],     size = parameter "Age Groups" (dynamic)
  - Balance (bin): binned on [Balance], size = parameter "Balance (bin) Parameter" (dynamic)

How should each bin be created?
  F  floor() formula        — keeps it dynamic when the size is parameter-driven
  C  cohort / column set     — native BIN_BASED set, fixed bin size
  B  both
(default: F for parameter-driven bins, C for fixed-size bins)
```
- F — floor() formula: floor([x]/size)*size referencing the migrated parameter (resolve its internal name to the parameter caption) or a literal for fixed size. Stays dynamic if parameter-driven.
- C — cohort: a separate cohort: TML object (cohort_grouping_type: BIN_BASED, anchor_column_id, bins.{minimum_value,maximum_value,bin_size}) bound to the model by obj_id. A cohort needs a fixed range — prompt the user for minimum_value, maximum_value, and bin_size, offering the Tableau parameter's default as the suggested bin_size. If the user can't supply the range, fall back to a warehouse lookup (SELECT MIN/MAX, with their authorization) — prompt first, DB lookup second. See the Bins section in tableau-formula-translation.md and ../../shared/schemas/thoughtspot-sets-tml.md. Generate as *.cohort.tml and import after the model.
- B — both: emit the formula and the cohort.
Offer the smart default per bin (F for dynamic, C for fixed) so the user can just accept.
Manual groups (class='categorical-bin') → a GROUP_BASED cohort (*.cohort.tml): one groups[] entry per <bin>, its <value> list → the condition value[], the calc's default → null_output_value. Classify by the calculation class, not the field name — a field called "… (clusters)" is usually a categorical-bin (translatable), not k-means. Only true statistical clustering is untranslatable. Bind by the model obj_id; import after the model. Watch the value-format caveat (stored values must match the group's values).
- Cohort vs. if/then formula: if each group is a contiguous, non-overlapping range, an if … then … else if … then … else … formula is cleaner (ThoughtSpot has no CASE — use the if/then/else-if chain); if groups are arbitrary/interleaved value sets, use the cohort (a range formula would misclassify). Check membership before choosing — see the categorical-bin section in tableau-formula-translation.md.
Number of Records / row-count fields → count([column]). Prompt the user for which column to count (default the table's primary key); carry the same choice into dependent formulas (e.g. percent-of-total). Don't emit sum(1).
Referencing one formula from another: use the formula id [formula_<id>], not its display name [<Name>] — the name form errors "Search did not find …". E.g. [formula_Attrition Count] / sum([T::EMPLOYEECOUNT]). (Column refs still use [T::COL].)
Model-level vs answer-level formulas. A calculated field used across many worksheets belongs in the model formulas[] (reusable). One used by only a single worksheet can instead be an answer-level formula on that liveboard viz (answer.formulas[], with a matching answer_columns[] entry) — keeping the model lean. Decide by reuse: shared → model; viz-specific → answer-level.
Growth / decline (Tableau pcdf / percent-difference / running-percent table calcs). Prefer the growth of search keyword when the breakdown is over a date: growth of [Measure] by [Date] (this is a viz search_query, not a model formula). If there is no date (e.g. growth across a sector attribute), build explicit this-period vs last-period formulas and a percentage — but when a date exists, growth of is the right tool.
Running calculations (RUNNING_SUM, etc.) → cumulative_sum(), etc.
Rank functions → rank()
Window functions (WINDOW_SUM, WINDOW_AVG, etc.) → moving_sum(), moving_average(), etc. — requires identifying the sort dimension from the worksheet shelf. See "Window / Moving Functions" in tableau-formula-translation.md.
Pass-through fallback for formulas with valid Snowflake SQL but no native ThoughtSpot function (partitioned RANK, DENSE_RANK, WINDOW_* when sort dimension is unknown): use sql_*_aggregate_op() pass-through functions — see "Pass-Through Fallback" in tableau-formula-translation.md. Always prefer native functions first.
Comma-separated-list / string-concatenation technique (FIRST/LAST/LOOKUP/PREVIOUS_VALUE used together to build one delimited string of a column's values — e.g. Jonathan Drummey's CSV-list / set-member-list dashboards): do NOT omit — translate the intent to LISTAGG string aggregation (sql_string_aggregate_op ( "LISTAGG({0}, ', ') WITHIN GROUP (ORDER BY {0})" , [col] ), answer-level, ⚑ flag for review per PT1) or a plain table of the values. The feeder/Last scaffolding calcs collapse into the one LISTAGG formula. See tableau-formula-translation.md "String aggregation".
Geospatial formulas (MAKEPOINT, MAKELINE, DISTANCE, BUFFER, AREA): omit the spatial formula entirely. For MAKEPOINT(lat, lon), ensure the underlying latitude and longitude columns are migrated as individual ATTRIBUTE columns — they are useful for filtering and display even without a map visualization. For DISTANCE/BUFFER/AREA, flag more prominently (the spatial computation is lost, not just the wrapper). See tableau-formula-translation.md "Geospatial Policy". Log each omission.
INDEX() prevalence note: INDEX() is correctly untranslatable, but it appears in ~43 of the 127 audited workbooks, usually implementing Top-N row numbering or ranking. When you encounter INDEX() used for ranking/filtering intent (e.g. INDEX() <= 10), recommend the ThoughtSpot substitute: rank() model formula or an answer-level top N keyword search — not literal positional addressing. Surface this in the log: "INDEX() used for ranking/Top-N intent — consider rank() or answer-level 'top N' instead."
Truly untranslatable formulas (LOOKUP, INDEX, SIZE, FIRST, LAST, PREVIOUS_VALUE — standalone partition-position table calcs that are NOT part of the string-aggregation technique above): omit from formulas[] entirely, omit the corresponding columns[] entry, and log the omission for the Step 12 limitations report. Never generate a placeholder — incorrect syntax fails the entire model import.
Every join MUST have a non-empty on field. Multi-column joins are fine — on: "[A::k1] = [B::k1] AND [A::k2] = [B::k2]".
Join keys must be physical columns — you cannot join on a model formula. And a ThoughtSpot relationship is binary: a join's on cannot span more than two tables, so multi-table join keys must be co-located into ONE relation first (e.g. targets keyed by (month, category) where month derives from one table and category lives on another → build a single SQL view spanning both so both keys sit on one relation). If a needed key simply doesn't exist (e.g. month-of-order-date when orders only have a full ORDER_DATE), stop and advise the user; don't skip it or fake a formula key. Present the two ways to make the column(s) physically exist, and let the user choose:
1. ThoughtSpot SQL View (a sql_view TML — Step 5c): write the derived/pre-aggregated columns into a SELECT over the connection (DATE_TRUNC('month', ORDER_DATE) AS …, GROUP BY …). Its sql_output_columns are physical → valid multi-column join keys. Fast, stays entirely in TML, no warehouse change. Use this as the foundation table for the model.
2. Database table/view the user creates in the warehouse, then adds to the connection so ThoughtSpot can see it — then bind a normal Table TML to it. More setup (DB work + connection refresh) but governed/reusable outside this model. State exactly what the object needs to expose (which derived/aggregated columns, at what grain) so the user can act. A ThoughtSpot join can be multi-column; the keys just have to be real columns the relation exposes.
Cross-datasource formulas (Tableau data blends). When datasources are merged into a single model via blend-aware grouping (Step 5b), cross-datasource references resolve naturally — all columns from all blended datasources exist in the same model. A formula like SUM([Sales]) - SUM([OtherDS].[Target]) becomes sum ( [ORDERS::Sales] ) - sum ( [TARGETS::Target] ) because both ORDERS and TARGETS are model_tables[] entries in the same model.

Reference resolution: Tableau formulas reference other datasources in two formats:
- By federated ID: [federated.xxx].[column_name] (the internal XML format)
- By caption: [Datasource Caption].[column_name] (the display format)
During formula translation:
1. Detect the datasource prefix ([federated.xxx] or [Caption]) using the ds_id_to_caption mapping from Step 5b — match against both IDs and captions
2. Strip the prefix, leaving just [column_name]
3. Resolve the column name against the merged model's columns[] (it will exist because the secondary datasource's columns were included in the merge)
4. Prefix with the correct TABLE_NAME:: for the ThoughtSpot model reference
If a cross-datasource formula references a datasource NOT in the blend group (shouldn't happen in well-formed workbooks, but possible in hand-edited TWBs): log a warning and omit the formula with a flag in the audit report.
No fqn in model_tables
obj_id is optional on fresh import — omit it unless repointing an existing model

Tableau Sets → ThoughtSpot column sets (Phase 2a)

Construct distinction: A Tableau set is a top-level <group ...> element (a named in/out partition on a dimension column). It is entirely different from a manual group (<column><calculation class='categorical-bin'>) — which is already handled above as a GROUP_BASED cohort. Do NOT confuse the two. Sets are identified by the <group> XML element; manual groups by the calculation class.

Detection — scan for top-level <group> elements in the datasource XML.

For each <group> element, inspect its <groupfilter> tree and classify:

Static set (Phase 2a — translate): the groupfilter tree contains only function='union' and function='member' nodes (optionally function='level-members'). There is no function='end' and no function='except'/'intersect'.

Extract:
- caption attribute → set name
- The level='[Dimension]' attribute on the groupfilter → anchor column → its ThoughtSpot column display name (map via the model's column mapping). If level is a calculated field ([Calculation_NNN], i.e. a set anchored on a derived dimension like YEAR([Order Date])): resolve the internal ID to the calc's display name via the calc cross-reference map (Step 3), and ensure that calc is emitted as a model formula column (an ATTRIBUTE formula, e.g. year ( [Order Date] )) so the set has a column to anchor on. Column sets can anchor on a formula column by its display name (live-verified 2026-06-12 — a set anchored on the Sales Rep formula column imported cleanly). Never emit the raw Calculation_NNN id as anchor_column_id.
- Each child <groupfilter function='member' member='...'/> → a member value:
  - HTML-decode the value (" → ", & → &, < → <, > → >)
  - Strip Tableau's surrounding double-quotes from string values (e.g. '"Aaron Bergman"' → Aaron Bergman)
  - Match filter_value_type to the anchor's type — text → STRING; a numeric calc anchor (e.g. year() → integer, member 2018) → DOUBLE; a date anchor → DATE_FILTER (per 1.5.9).
  - %null% member → use the literal {Null} grouping value. NULL is selectable in a column set (live-verified 2026-06-12 — the UI emits the token {Null} for a null selection). Emit a condition operator: EQ, value: ["{Null}"], filter_value_type: STRING.
    - %null% included (a union/member set putting NULL in the set) → add the EQ {Null} condition alongside the member-list condition with combine_type: ANY (in the list or null).
    - %null% excluded (an except removing NULL) → no condition needed: nulls already fall to the catch-all "out" bucket via combine_non_group_values. (Or be explicit with NE/no-{Null}.) No formula alternative is required for null — column sets handle it directly via {Null}.
Top-N / Bottom-N set (Phase 2b — TRANSLATE to a query set): groupfilter tree contains function='end' (with count and/or order child/attributes). Translate to a cohort_type: ADVANCED / COLUMN_BASED query set in one of two forms, chosen by count:
- Literal count='N' (static N) → the simplest form: the embedded answer's search_query is a plain top N [dimension] [measure] (or bottom N …) keyword search (anchor dimension first, then measure) — no formulas, no parameter. (The top N keyword search_query IS correct for a fixed N.)
- count='[Parameters].[X]' (dynamic, parameter-driven N) → a rank formula + parameter-filter formula, with N read from the migrated model parameter. This is the only form that stays in sync with the parameter as the user changes it. (B2VBWeek11 uses this.)
Detection (applies to both forms):
- end='top' → top N keyword / rank(..., 'desc'); end='bottom' → bottom N keyword / rank(..., 'asc').
- The order child's expression (e.g. SUM([measure])) → the ranking measure (and, in the dynamic form, the rank's aggregation). If the ordering measure is a derived/conditional field (null-pad, IF-exclude), use the plain underlying measure and flag the dropped nuance.
- count type selects the form: [Parameters].[X] → dynamic (filter references the migrated model param [<alias>::<param>]); a literal N → static (top N/bottom N keyword).
- The innermost level='[Dim]' → anchor/return column display name.
Extract:
- Set caption → cohort name.
- Ordering measure column display name (via the model's column mapping).
- Parameter name (if count is a parameter reference) — must already exist on the model (migrated via the Parameters datasource → model.parameters[]).
Emit one *.cohort.tml per Top-N/Bottom-N set — see Query-set TML emission below. Log: "Set '<name>' is a Top-N/Bottom-N set → translated to a ThoughtSpot query set (rank formula + parameter-filter, Phase 2b) — flag for review."

Flag dropped nuances: if the ordering measure is conditional/null-padded, note the simplification: "Dropped null-padding / conditional ranking — using plain <measure>; verify ranking matches the Tableau set."
except of a member-list (TRANSLATABLE) — column set with NE: an except whose excluded side is a union/member list (e.g. all categories except {Furniture, %null%}) maps to a column set: one group with an operator: NE condition per excluded member, combine_type: ALL ("not A AND not B"). operator: NE is a valid cohort operator (live-verified 2026-06-12). Any %null% in the excluded side needs no condition — it's already excluded by combine_non_group_values (catch-all). Anchor + member rules are the same as a static set.
intersect of two member lists (Phase 2c — TRANSLATABLE): groupfilter tree has function='intersect' and both children are member/union sub-trees (no function='end', 'filter', or nested set-op). Compute the set intersection at conversion time — the members common to both lists. Emit a GROUP_BASED column set with operator: EQ conditions for the shared members (same emission as a static set). If the intersection is empty, log and skip: "Set '<name>' intersect yields zero common members — omitted." Otherwise log: "Set '<name>' is an intersect of two member lists → column set (GROUP_BASED, {N} common members, Phase 2c) — flag for review."
except where the excluded side is a Top-N/Bottom-N (Phase 2c — TRANSLATABLE): groupfilter tree has function='except' and the excluded child contains function='end'. This means "all dimension values EXCEPT the Top/Bottom N" — the complement of the Top-N set. Translate to a query set using an inverted rank filter: [formula_rank] > N (or > [param]) instead of <= N. All other emission rules are identical to Phase 2b (same rank formula, same anchor/measure, same static-vs-dynamic form selection). Log: "Set '<name>' is 'all except Top/Bottom-N' → query set with inverted rank filter (Phase 2c) — flag for review."
Condition-based set (Phase 2c — TRANSLATE to a query set): groupfilter tree contains function='filter' (with a quantitative or expression child specifying an aggregate condition like SUM([Sales]) > 10000). This is a Tableau set created via the Condition tab — membership is determined by an aggregate condition evaluated per dimension member at query time.

Detection:
- function='filter' in the groupfilter tree (distinct from 'end' which is Top-N).
- The condition expression is in the expression attribute or a <groupfilter function='quantitative'> child with <groupfilter function='range' from='...' to='...'/> bounds.
- The level='[Dim]' attribute → anchor column display name (same resolution as static/Top-N sets).
Extract:
- Set caption → cohort name.
- The aggregate expression (e.g. SUM([Sales])) → translate through the formula translation reference to a ThoughtSpot formula.
- The comparison operator and threshold(s) from the range element or the expression itself.
Emit as a query set (cohort_type: ADVANCED, cohort_grouping_type: COLUMN_BASED) with:
- One formula: the translated condition as a boolean expression (e.g. sum ( [Model_1::Sales] ) > 10000). Set properties.column_type: ATTRIBUTE.
- search_query: "[<measure>] [<dimension>] [formula_condition] [formula_condition] = true"
- Same answer structure as the Top-N query set (tables, table_paths, answer_columns, display_mode).
Log: "Set '<name>' is a condition-based set (condition: <expr>) → query set with condition formula (Phase 2c) — flag for review."

Computed set operations — intersect / except of mixed types (Phase 2c — TRANSLATE to a multi-formula query set): a set operation (intersect or except) where at least one side is a computed set (Top-N, condition-based) and the other is a member list, a computed set, or level-members (all). The query set's embedded answer can hold multiple formulas — compose each side's filter logic into the same answer and combine via the search_query.

Composition rules — build one formula per side, then combine:

Side type	Formula to generate
Member list (`union`/`member`)	`formula_members`: `[Model_1::Dim] = 'val1' or [Model_1::Dim] = 'val2' or ...` (one `or` per member). Set `properties.column_type: ATTRIBUTE`.
Top-N (`function='end'`)	`formula_rank`: `rank ( sum ( [Model_1::measure] ) , 'desc' )` + `formula_topn`: `[formula_rank] <= N` (or `<= [Model_1::param]`). Same as Phase 2b.
Condition (`function='filter'`)	`formula_cond`: translated aggregate condition (e.g. `sum ( [Model_1::Sales] ) > 10000`). Set `properties.column_type: ATTRIBUTE`.

Combining in search_query:

Operation	search_query pattern
Intersect (A ∩ B)	`"[measure] [dimension] ... [formula_a] = true [formula_b] = true"` — both filters must pass (AND).
Except (A EXCEPT B)	`"[measure] [dimension] ... [formula_a] = true [formula_b] = false"` — A passes, B fails. For Top-N exclusion, invert the rank filter: `[formula_rank] > N` instead of `<= N`, then use `= true`.

The answer_columns, table_columns, and ordered_column_ids include the dimension, the aggregated measure, and every formula column. The display_mode is TABLE_MODE.

Example — "East States ∩ Top 10 by Revenue":

cohort:
  name: East Top Revenue
  answer:
    formulas:
    - id: formula_members
      name: member_filter
      expr: "[Model_1::State] = 'NY' or [Model_1::State] = 'CA' or [Model_1::State] = 'TX'"
      properties:
        column_type: ATTRIBUTE
    - id: formula_rank
      name: rank
      expr: "rank ( sum ( [Model_1::Revenue] ) , 'desc' )"
      properties:
        column_type: ATTRIBUTE
    - id: formula_topn
      name: topn_filter
      expr: "[formula_rank] <= 10"
    search_query: "[Revenue] [State] [formula_rank] [formula_members] = true [formula_topn] = true"
    # ... tables, table_paths, answer_columns, display_mode as per Phase 2b
  config:
    cohort_type: ADVANCED
    cohort_grouping_type: COLUMN_BASED
    anchor_column_id: State
    return_column_id: State

Log: "Set '<name>' is a computed set operation (<op> of <type-A> and <type-B>) → query set with {N} formulas (Phase 2c) — flag for review."

Deeply nested set-ops: if a side is itself a set operation (e.g. (A ∩ B) EXCEPT C), recursively decompose — flatten all member lists into one or formula, and each computed side into its own formula pair. The search_query combines all filters. Flag deeply nested cases prominently: "Nested set operation — {depth} levels deep; verify the combined filter logic."

Set control / dynamic set (no static members) → an interactive filter; drop the scaffolding. A set whose groupfilter tree is level-members only (ui-enumeration="all", ui-builder="filter-group") has no fixed membership — it's a Tableau Set Control the user toggles live, usually feeding IF [Set] THEN measure ELSE NULL calcs. That set + IF-calc machinery is Tableau scaffolding to fake interactive filtering — ThoughtSpot does it natively. Translate the intent, not the scaffolding:
1. Migrate the anchor as a model formula column if it's a calc (e.g. 01. Month = DATE(DATETRUNC('month',[Order Date])) → start_of_month ( [Order Date] )) — a useful filterable dimension. (Same calc-anchor rule as a static set.)
2. Map the control to an interactive filter on that column (Step 10). The filter is the selection.
3. Drop the IF [Set] THEN measure ELSE NULL referencing calcs — do NOT migrate them as formulas. The measure + filter replaces them (sum(sales) filtered to the chosen months). Treat them like the "redundant pass-through formula" case: recognize the intent and collapse to the native pattern.
4. Do not emit a cohort. The only case needing more than a filter is a genuine side-by-side in-set vs out-set comparison viz — handle that with a grouping attribute (a real static column set) or two answers; flag it specifically rather than generalising a "capability gap" onto every control. Log: "Set '<name>' is a dynamic Set Control → mapped to a filter on <anchor> (anchor calc migrated as a column); its IF-[Set] scaffolding calcs were collapsed into measure+filter, not migrated."
Worksheet set action (no equivalent — defer): a <action> element that adds/removes members from a set based on viz selection. No ThoughtSpot equivalent. Log: "Set action on '<set name>' has no ThoughtSpot equivalent — omitted."

Emit one *.cohort.tml per static set — see "Column-set TML emission" below. Emit one *.cohort.tml per Top-N/Bottom-N set — see "Query-set TML emission" below. Import cohorts after the model (the payload order in Step 5.5 already includes *.cohort.tml). Import order for query sets: model (with parameter) → cohort — the set's formula references the parameter, which must exist on the model first.

⚠ MANDATORY — flag every set conversion for the user to review. Set conversions are semantic reinterpretations, not literal 1:1 translations — a column set, a filter, dropped scaffolding, or a deferral may not behave exactly like the Tableau set. For each set, surface its outcome and ask the user to confirm it matches intent, in both the Step 7 review checkpoint and the Migration Summary (Step 10g) / Step 12 report. Show a per-set line with its kind and how it was handled, e.g.:
Sets ({N}) — review each result matches intent:
  ✓ State Set            → column set (GROUP_BASED, 3 members)         [verify membership]
  ✓ Category Set         → column set via NE (except {Furniture})      [verify exclusion + nulls]
  ✓ Year Set             → column set on formula column "Order Year"   [verify the calc + values]
  ⚠ Customer Group 1     → column set (231 members)                    [large list — spot-check]
  ⚙ 01. Month Set        → interactive filter on "Order Month"; IF-[Set] calcs collapsed to
                           measure+filter, NOT migrated                [confirm filter ≈ the control]
  ✓ State_TopN           → query set (rank desc by SUM gallons, N=topN param)   [verify ranking + N]
  ✓ State_BottomN        → query set (rank asc by SUM gallons, N=topN param)    [verify ranking + N]
  ✓ Region_Intersect     → column set (GROUP_BASED, 4 common members from intersect)   [verify membership]
  ✓ State_NotTopN        → query set (inverted rank desc, all except top N)          [verify ranking + N]
  ✓ HighRevCustomers     → query set (condition: SUM(Revenue) > 10000)               [verify condition]
  ✓ East_TopRevenue     → query set (member-list ∩ Top-N, 3 formulas)              [verify combined filter]
The reinterpreted ones (except→NE, %null%→{Null}, formula-anchor, set-control→filter, collapsed IF [Set] calcs, Top-N/Bottom-N → query set, condition-based → query set, member-list intersect → computed common members, all-except-Top-N → inverted rank) especially need a human eye — call them out explicitly, don't bury them. For Top-N/Bottom-N and condition-based sets, explicitly call out any dropped ranking nuances (null-padding, conditional measure) or simplified conditions so the user can verify the result matches intent.

Set IN/OUT semantics — the column set IS the In/Out classification

A Tableau set returns a boolean per row — every dimension value is either a member (IN) or not (OUT). The migrated GROUP_BASED column set already encodes exactly that: its group label is the In value and the combine_non_group_values catch-all (null_output_value) is the Out value. So the three ways Tableau uses In/Out all map cleanly — translate the intent, don't migrate the IF [Set] scaffolding calcs:

Compare In vs Out (e.g. "Compare In vs Out" / "Part to Whole" dashboards) → group a measure by the cohort column ([measure] [Set] → two groups, In vs Out). Native — this is the comparison; it is not a capability gap for a static set.
In/Out measure (IF [Set] THEN [Sales] END / Set Sales / Group 1 Sales) → a conditional aggregate. Three equivalent forms (all live-verified 2026-06-12) — a column set is formula-referenceable as [<cohort name>] = '<in/out label>':
- Literal translation (mirrors Tableau's IF [Set] THEN x END exactly): sum ( if ( [Product Category set] = 'in' ) then [Sales] else null ).
- sum_if shorthand (preferred, esp. for large member lists — no inlining): sum_if ( [Product Category set] = 'in' , [Sales] ) (and … = 'out' for OUT). Family: sum_if/average_if/count_if/unique_count_if/max_if/min_if.
- Dimension + member list (no cohort dependency; fine for small lists): sum_if ( [Category] in { 'Furniture','Technology' } , [Sales] ) / sum_if ( not ( [Category] in { 'Furniture','Technology' } ) , [Sales] ).
- ⚠️ Pitfall (cohort-ref forms): the cohort name must differ from its group labels — a name==label collision (e.g. cohort Focus Categories with group also Focus Categories) makes the formula fail with "Search did not find …". Emit distinct labels (group in, out out); see the emission template.
Filter to In / Out → filter on the cohort column = the In label (or the Out label).
IF [Set] THEN [dimension] label calcs (In, Out, Set Label) → the cohort column itself (its two labels), or the dimension filtered to In/Out.

Pick sum_if(...) when In and Out are wanted as separate measure columns (KPIs, side-by-side, an In/Out ratio) — reference the cohort for large lists, the dimension for small; pick grouping by the cohort for an in-vs-out breakdown viz. Either way the pile of IF [Set] THEN … calcs collapses onto the one column set / a couple of sum_ifs — don't emit them as per-row formulas.

See ../../shared/schemas/thoughtspot-sets-tml.md (column set + query set) and the live-verified worked examples ../../shared/worked-examples/tableau/static-set-to-column-set.md (column set) and ../../shared/worked-examples/tableau/topn-set-to-query-set.md (Top-N/Bottom-N query set).

Column-set TML emission (static set → `GROUP_BASED` cohort)

For each static set detected above, generate a .cohort.tml file with the following shape:

# guid omitted on first import
cohort:
  name: "<set caption>"              # from the Tableau set's group caption attribute
  config:
    cohort_type: SIMPLE
    cohort_grouping_type: GROUP_BASED
    anchor_column_id: "<dimension display name>"  # ThoughtSpot column DISPLAY name (live-verified), from groupfilter level=
    combine_non_group_values: true          # DEFAULT CATCH-ALL: every value not matched by a group — incl. NULL — combined into one group
    null_output_value: "out"                # OUT label for the catch-all — keep DISTINCT from the cohort name (see below)
    groups:
    - name: "in"                     # IN label — MUST differ from the cohort `name` above, or formula refs
                                     # (`sum_if([<cohort>] = 'in', …)`) fail with "Search did not find" (live-verified).
                                     # Formula refs must match this label EXACTLY (case-sensitive).
      combine_type: ANY              # ANY = membership in the value list ("in set")
      conditions:
      - operator: EQ                 # PROVEN pattern (changelog 1.5.6, from a working column set):
        column_name: "<dim name>"    #   operator: EQ with a MULTI-VALUE list = "in set".
        value: ["Aaron Bergman", "Aaron Hawkins", ...]  # NOT operator: IN.
        filter_value_type: STRING    # STRING for text anchors; for a DATE anchor use DATE_FILTER
                                     # + date_filter_values instead (changelog 1.5.9).
  worksheet:                         # BINDING FIELD IS `worksheet:` NOT `model:` (live-verified — `model:` → "Table cant be empty")
    id: "<model display name>"
    name: "<model display name>"
    obj_id: "<model obj_id>"         # stable object id, e.g. TEST_SV_..._AI_CONTEXT-889a704f (from the model's exported TML header)

Key rules:

anchor_column_id and column_name = the dimension's ThoughtSpot display name (live-verified — works even for a multi-table model). Map from level='[Dimension]' via the same column mapping as Step 5b.
combine_non_group_values: true is the default catch-all: every value not matched by a group condition — including NULL — is combined into one group, labelled by null_output_value. This mirrors Tableau's in/out semantics: unmatched + NULL rows land in the catch-all ("out") bucket.
Member values must be HTML-decoded and have Tableau's surrounding double-quotes stripped, AND converted to the column's stored format, not Tableau's display format (changelog 1.5.6: e.g. 01.Apr.15 → 2015-04-01) — display-format values match nothing.
Membership uses operator: EQ with the full value list + combine_type: ANY (proven in 1.5.6) — do not use operator: IN. For a DATE anchor, switch each condition to filter_value_type: DATE_FILTER + date_filter_values (changelog 1.5.9), not STRING/value[].
%null% is selectable as a grouping value — column sets DO support NULL membership (live-verified 2026-06-12). To include null in the set, add a condition operator: EQ, value: ["{Null}"], filter_value_type: STRING to the group (with combine_type: ANY so it's "in the list OR null"). To exclude null, omit it (the catch-all already excludes it). The literal token is {Null}. No IF/THEN/ELSE formula alternative is needed for null.
except / not-in → operator: NE (live-verified 2026-06-12): one NE condition per excluded value, combine_type: ALL. (except {Furniture, %null%} → NE Furniture; null auto-excluded.)
Bind the set to its model via the worksheet: block (id/name = the model display name; obj_id = the model's stable object id, from the model's exported TML header) — not model:. Using model: fails import with "Invalid save request, Table cant be empty" (live-verified 2026-06-12: set "Focus Categories" created on model TEST_SV_DMSI_AI_CONTEXT only after switching model: → worksheet:).
No top-level guid on first import.
File extension: <SetName>.cohort.tml; write to /tmp/ts_tableau_mig/output/{workbook_name}/

Write each file to /tmp/ts_tableau_mig/output/{workbook_name}/{DatasourceName}.model.tml.

Query-set TML emission (Top-N/Bottom-N → ADVANCED cohort)

For each Top-N/Bottom-N set detected above, generate a .cohort.tml file. There are two forms (see classification above): the dynamic form (parameter-driven N — a rank formula + parameter-filter formula, live-verified 2026-06-12 against se-thoughtspot, model TEST_SV_DMSI_AI_CONTEXT), and the simpler static form (fixed N — a top N/bottom N keyword search, no formulas) shown after it. Cross-refs: ../../shared/schemas/thoughtspot-sets-tml.md (query set section) + ../../shared/worked-examples/tableau/topn-set-to-query-set.md.

Dynamic form (parameter-driven N — count='[Parameters].[X]'):

# guid omitted on first import
cohort:
  name: "<set caption>"
  answer:
    tables:
    - id: "<model display name>"
      name: "<model display name>"
      obj_id: "<model obj_id>"
    table_paths:
    - id: "<model display name>_1"          # self-path alias used by the formulas
      table: "<model display name>"
    formulas:
    - id: formula_filter
      name: filter
      expr: "[formula_rank] <= [<model display name>_1::<paramName>] "
      was_auto_generated: false
    - id: formula_rank
      name: rank
      expr: "rank ( sum ( [<model display name>_1::<measure col>] ) , 'desc' )"   # 'asc' for Bottom-N
      properties:
        column_type: ATTRIBUTE
      was_auto_generated: false
    search_query: "[<measure>] [<dimension>] [formula_rank] [formula_filter] = true"
    answer_columns:
    - name: <dimension display name>
    - name: "<aggregated measure display name>"   # e.g. "Total gallons" for a SUM measure
    - name: rank
    table:
      table_columns:
      - column_id: <dimension display name>
        show_headline: false
      - column_id: "<aggregated measure display name>"
        show_headline: false
      - column_id: rank
        show_headline: false
      ordered_column_ids:
      - <dimension display name>
      - rank
      - "<aggregated measure display name>"
      client_state: ""
    display_mode: TABLE_MODE
  worksheet:
    id: "<model display name>"
    name: "<model display name>"
    obj_id: "<model obj_id>"
  config:
    cohort_type: ADVANCED
    anchor_column_id: <dimension display name>
    return_column_id: <dimension display name>
    cohort_grouping_type: COLUMN_BASED
    hide_excluded_query_values: true
    group_excluded_query_values: "Excluded values"
    pass_thru_filter:
      accept_all: false

Key rules:

Parameter prerequisite (dynamic form) — the count parameter MUST be on the model first (already migrated via the Parameters datasource → model.parameters[]). The set's formula_filter references it as [<model display name>_1::<paramName>]. Import order: model (with param) → cohort. (The static form below has no parameter dependency.)
Top vs Bottom — end='top' → rank(sum(measure), 'desc'); end='bottom' → rank(sum(measure), 'asc') (user-confirmed 2026-06-12).
Rank aggregation = the set's order expression aggregation (SUM here). Translate the ordering measure to its TS column; if it's a derived/conditional field, use the plain measure + flag the dropped nuance for review.
table_paths alias = <model display name>_1; all formulas[].expr column refs use [<alias>::<col>]. answer_columns, config, and table.* use display names (no alias).
answer_columns measure entry uses the aggregated display name ThoughtSpot generates (Total <measure> for a SUM measure, e.g. Total gallons).
A stepped range parameter (Tableau <range granularity='5' min='5' max='25'/>) maps to list_config (enumerate min→max by step: [5,10,15,20,25]), NOT range_config. See the Parameter migration section for this rule.
Bind via worksheet: (id/name/obj_id) — NOT model: (same rule as column sets).
No top-level guid on first import.
File: <SetName>.cohort.tml → /tmp/ts_tableau_mig/output/{workbook_name}/.

Static form (fixed N — literal count): no formulas, no parameter; the top N/bottom N keyword search_query defines membership. Use this when the Tableau set's count is a literal.

# guid omitted on first import
cohort:
  name: "<set caption>"
  answer:
    tables:
    - id: "<model display name>"
      name: "<model display name>"
      obj_id: "<model obj_id>"
    search_query: "top 10 [<dimension>] [<measure>]"   # anchor dimension FIRST, then measure; "bottom 10 …" for Bottom-N; N is the literal count
    answer_columns:
    - name: <dimension display name>
    - name: "<aggregated measure display name>"         # e.g. "Total gallons"
    table:
      table_columns:
      - column_id: <dimension display name>
        show_headline: false
      - column_id: "<aggregated measure display name>"
        show_headline: false
      ordered_column_ids:
      - <dimension display name>
      - "<aggregated measure display name>"
      client_state: ""
    display_mode: TABLE_MODE
  worksheet:
    id: "<model display name>"
    name: "<model display name>"
    obj_id: "<model obj_id>"
  config:
    cohort_type: ADVANCED
    anchor_column_id: <dimension display name>
    return_column_id: <dimension display name>
    cohort_grouping_type: COLUMN_BASED
    hide_excluded_query_values: false       # false = show a remainder bucket (label below); true = hide non-members
    group_excluded_query_values: "Others"   # label for the non-member remainder bucket
    pass_thru_filter:
      accept_all: false

Live-verified 2026-06-12 against se-thoughtspot (set "Static Top 10" on model TEST_SV_DMSI_AI_CONTEXT). The top N [dimension] [measure] keyword search_query (anchor dimension first, then measure) is the correct representation for a fixed-N query set — no formulas, no parameter. hide_excluded_query_values is a display choice: false keeps a remainder bucket (labelled by group_excluded_query_values, e.g. "Others"); true hides non-members.

5c. SQL View TML — one per custom SQL relation

For each custom SQL relation identified in Step 3b (those with source_type: "custom-sql"), generate a .sql_view.tml file. Follow the rules in tableau-tml-rules.md "SQL View TML Rules" and the full schema in thoughtspot-sql-view-tml.md.

Template:

sql_view:
  name: "Datasource Custom SQL"
  connection:
    name: "Connection Display Name"
  sql_query: |
    SELECT col1, col2, col3
    FROM catalog.schema.table_name
    WHERE condition = 'value'
  sql_view_columns:
  - name: COL1
    sql_output_column: col1
    data_type: VARCHAR
    properties:
      column_type: ATTRIBUTE
  - name: COL2
    sql_output_column: col2
    data_type: DOUBLE
    properties:
      column_type: MEASURE
      aggregation: SUM

Key rules:

connection.name is required — use {connection_name} from Step 4
sql_query contains the full SQL text from the Tableau <relation> element (decode HTML entities)
sql_output_column must match a column name or alias from the SQL query output
Map Tableau column datatypes to ThoughtSpot types using the same mapping as table TMLs
No db, schema, db_table, or db_column_properties fields
File extension: *.sql_view.tml

Write each file to /tmp/ts_tableau_mig/output/{workbook_name}/{Name}.sql_view.tml.

The model TML (Step 5b) references these SQL Views by name in model_tables[], just like regular tables.

Step 5.5 — Spotter Enablement

Before validating, confirm whether Spotter (AI search) should be enabled for each model — the same step ts-convert-from-snowflake-sv and ts-convert-from-databricks-mv run. Spotter is the primary natural-language interface for a Model, and a migrated workbook almost always exists to be queried this way, so the default is yes.

Enable Spotter (AI search) for this model? [Y / n] (default: Y)

Write the answer into the model TML properties block (see the Step 5b template):

model:
  properties:
    spotter_config:
      is_spotter_enabled: true   # or false if the user declines

On an in-place update of an existing model, preserve its current setting unless the user asks to change it. Default new models to enabled.

ts tml import reads a JSON array of TML strings from stdin — not a zip and not a single document. Build that array with tables first, then SQL views, then models (so a model's tables are validated alongside it):

cd /tmp/ts_tableau_mig/output/{workbook_name}
python3 - > /tmp/ts_tableau_mig/{workbook_name}_payload.json <<'PY'
import json, glob
order = sorted(glob.glob("*.table.tml")) + sorted(glob.glob("*.sql_view.tml")) + sorted(glob.glob("*.model.tml")) + sorted(glob.glob("*.cohort.tml"))
print(json.dumps([open(f).read() for f in order]))
PY

Pre-import validation gate (I1 / I2 / I4 / I5)

Before running ts tml import, validate the generated Model TML against the hard invariants in ../../shared/schemas/ts-model-conversion-invariants.md. --policy VALIDATE_ONLY does not catch these — ThoughtSpot accepts the TML and then behaves wrong. Do not import until all four pass:

I1 — every formulas[] entry has a columns[] entry whose formula_id: matches its id: exactly. (Unpaired formula is silently dropped.)
I2 — no aggregation: key appears inside any formulas[] entry. (Raises "FORMULA is not a valid aggregation type".)
I4 — every model_tables[] id: (when present) equals its name: with identical case. (Mismatch makes joins silently fail: "{table} does not exist in schema".)
I5 — no physical-column columns[] entry uses aggregation: COUNT_DISTINCT; distinct counts are unique count ( [TABLE::col] ) formulas. (COUNT_DISTINCT silently flips MEASURE → ATTRIBUTE.)

Quick mechanical check on the generated file (replace <file>):

grep -nE '^\s*aggregation:\s*COUNT_DISTINCT' <file>   # I5 — expect NO matches
grep -nE '^\s*aggregation:' <file>                    # confirm none sit under a formulas[] entry (I2)

Inspect formulas[]/columns[] for I1 pairing and model_tables[] for I4 id==name. If any check fails, fix the TML and re-validate before importing.

Validate (up to 10 fix cycles). --policy VALIDATE_ONLY checks without persisting:

cat /tmp/ts_tableau_mig/{workbook_name}_payload.json \
  | ts tml import --policy VALIDATE_ONLY --profile {profile_name}

For each cycle:

Parse the validation response. Each element has a status.status_code of OK, WARNING, or ERROR. Only ERROR blocks; WARNING does not.
Expected WARNING (ignore): Table with id null not found. Matching with db/schema/dbTable with status_code: WARNING. A freshly generated table TML has no GUID, so ThoughtSpot matches it by db/schema/dbTable instead — this is normal for a new table and is not a problem. (Note: a clean binding still shows this warning; it does not mean the connection failed.)
Real ERRORs to fix: connection not found (wrong connection.name/case) and column not found in connection (the connection doesn't expose that db_table/column) are genuine ERRORs — the table won't bind. Fix the name or the column mapping.
For any other errors, identify the affected TML file and the specific issue. Apply the fix from the error table in tableau-tml-rules.md.
Rewrite the affected TML file and rebuild the JSON payload.
Re-validate.

After 10 cycles with remaining errors, stop and report to the user:

Errors that persist after all retries
Which fix was attempted for each
Ask whether to proceed with import anyway or make manual corrections

Step 7 — Review Checkpoint & Import

Before importing, show the user a review summary — the same convention the ts-convert-from-snowflake-sv and ts-convert-from-databricks-mv skills use. The user should see exactly how every calculated field was translated, and what (if anything) will not migrate, before committing — not discover omissions only in the Step 12 report afterward. Reuse the formula tier classification from Step A3/Step 5b.

Ready to import to {base_url}:

Tables:
  ✓ {TABLE_NAME}   → create new on connection "{connection_name}"
  ↺ {TABLE_NAME}   → reuse existing object (GUID {guid})        # if Step 4.5 reuse
  …

Model: {datasource_name}
  Columns: {n} total — {a} attribute(s), {m} measure(s), {f} formula(s)
  Parameters: {p}  ({names or "none"})
  Spotter (AI search): enabled / disabled   # from Step 5.5

Formula translations ({F} total):
  ✓ {name}  [{tier}]:        {tableau_expr}  →  {ts_expr}
  ⚙ {name}  [pass-through]:   {tableau_expr}  →  {sql_*_op expr}
       (works only with SQL Passthrough Functions enabled in ThoughtSpot admin)
  ⚠ {name}  [untranslatable]: OMITTED — {reason}

Sets ({S}) — semantic reinterpretations, REVIEW each matches intent:   # omit section if no sets
  ✓ {name} → column set ({GROUP_BASED, N members | NE except | {Null} | formula-col anchor})  [what to verify]
  ✓ {name} → query set (rank {desc|asc} by SUM {measure}, N={param|literal})   [verify ranking + N]
  ⚙ {name} → interactive filter on {anchor} (set control; IF-[Set] calcs collapsed to measure+filter)
  ⊘ {name} → DEFERRED ({intersect/computed except 2c | set action}) — manual

Will NOT migrate ({K}):
  - {name}: {reason}
  # if none: "Nothing omitted — full coverage."

Dashboards: {N}  (liveboard migration offered after import)

Blended models: {N} model(s) merged from {M} datasources via data blending
  - {primary_ds} ← {secondary_ds} on [{col1}, {col2}]  (LEFT_OUTER, {cardinality})

Proceed?
  yes   — import the table + model TMLs
  no    — cancel
  file  — write the TMLs to /tmp/ts_tableau_mig/output/{workbook_name}/ without importing

Tiers are the Step A3 set: Native, LOD, Cumulative, Moving, Pass-through, Parameter ref, Untranslatable. Show ⚠ … OMITTED for every untranslatable formula (and its dropped columns[] entry) and ⚙ … pass-through for every formula needing SQL Passthrough — so the un-migratable and caveated items are flagged here, up front, for the user to weigh. Always include the Sets section when the workbook has sets (per the MANDATORY set-review rule in Step 5b) — set conversions are semantic reinterpretations, so the user must confirm each matches intent before import.

Reviewer checks before import:

Every translated division has a div-by-zero guard (FT "Division-by-zero" section)

Wait for confirmation. no cancels. file writes the TMLs and skips to Step 12 (report only, no import). yes imports:

On confirmation, reuse the JSON payload from Step 6 (rebuild it if any TML changed). Pass --create-new because these are brand-new objects with no GUID — without it, the default --no-create-new only updates existing objects. (Do not pass --create-new if you are re-importing TML that already carries a GUID — that silently creates a duplicate.)

cat /tmp/ts_tableau_mig/{workbook_name}_payload.json \
  | ts tml import --policy ALL_OR_NONE --create-new --profile {profile_name}

Parse the response. Extract the GUID for each imported object. On failure, show the error and stop.

Updating something that already exists. If Step 4.5 found an existing object, or a first import already created one and you need to re-import (e.g. to set Spotter, fix a column type), do not re-run with --create-new. Pin the object's guid at the TML root and import with --no-create-new — this is true for tables, models, AND liveboards alike. Re-importing without the root guid does not reliably update in place: it can create a duplicate with a new GUID (observed on tables — a re-import without guid churned the table's identity and left an orphan), even though the object "matches" by name/db/schema. Always capture the id_guid from the first import, write it back into the TML root, and re-import with --no-create-new. Verify the returned id_guid matches; a new GUID means you just made a duplicate — delete the orphan.

Save the imported GUIDs internally as {datasource_guids} and {table_guids} — these are used by Step 10 if the user proceeds with dashboard migration. Also save {formula_column_map} (Tableau calc field caption → ThoughtSpot formula display name) and {parameter_map} from the TWB parse.

Step 7.5 — Confirm the Model (before any liveboards)

Pause and have the user verify the model is correct before building liveboards on it. Every liveboard viz references this model's columns and formulas, so a wrong model means re-doing every tile — far cheaper to catch it here. (Do this even when there are no dashboards — a verified model is the deliverable either way.)

Present a confirmation summary and wait:

Model imported: {model_name}
  {base_url}/#/data/tables/{model_guid}

  Tables:     {table list}
  Columns:    {N} — {a} attribute, {m} measure, {f} formula
  Parameters: {names + type}            (or "none")
  Spotter:    enabled / disabled

  Translated formulas — please sanity-check:
    {name}: {ts_expr}
    ...
  Omitted (untranslatable): {names}     (or "none")

  Try these in Search/Spotter to confirm it behaves:
    - "{suggested NL question 1}"
    - "{suggested NL question 2}"
    - "{suggested NL question 3}"

Does the model look correct? (yes → continue / describe changes)

Suggest 3–5 natural-language test questions grounded in the model's actual columns and formulas (mirrors the snowflake/databricks skills). If the user asks for changes, edit the model TML and re-import in place — include the model's guid at the document root and import with --no-create-new (a model has no natural key, so omitting the root guid creates a duplicate; see Step 7). Re-confirm, then proceed. Do not start Step 8 until the user confirms the model.

Step 8 — Migrate Dashboards?

If Step 3d found zero <dashboard> elements, skip to Step 11.5 (a model-only workbook still benefits from coverage answers), then Step 12.

Otherwise, present the decision:

The workbook contains {N} dashboard(s):
  - {dashboard_name_1}
  - {dashboard_name_2}
  ...

Would you like to migrate these to ThoughtSpot Liveboards?
This maps Tableau dashboard layout to a 12-column grid with chart and note tiles.

  Y  Yes — migrate dashboards to liveboards
  N  No  — skip to summary

Enter Y / N:

If N, skip to Step 12.

When there are 2+ dashboards, also ask how to package them:

This workbook has {N} dashboards. Create:
  S  Separate liveboards — one per dashboard
  T  A single liveboard with one tab per dashboard   (+ the Migration Summary tab)

Enter S / T:

ThoughtSpot liveboards support layout.tabs[], so T puts each dashboard on its own tab in one liveboard (often tidier for a related set), while S keeps them independent. Either way, add the Step 10g Migration Summary as a final tab.

Step 9 — Parse Dashboard Layout and Map to Grid

9a. Zone extraction

For each <dashboard> element in the TWB, walk <zones> → <zone> elements recursively. For each leaf zone, extract:

Field	Source
`zone_id`	`id` attribute
`zone_type`	`type` attribute (`text`, `title`, `viz`, `bitmap`, `web`, `extension`, `metric`)
`worksheet_name`	`name` attribute (for `viz` zones)
`x`, `y`, `w`, `h`	`x`, `y`, `w`, `h` attributes (Tableau uses 0–100,000 coordinate space)
`text_content`	`<formatted-text>` child text (for `text` / `title` zones)

Classify each zone:

Chart zones: a worksheet viz — a leaf zone carrying a name (worksheet name) and no more specific sub-type. These become visualization tiles.
Text/title zones: type="text" or type="title" → becomes a note tile (Step 10c).
Skip: type="bitmap" (images), type="web", type="extension", type="metric", type="filter" (quick filters — handled via liveboard filters[], not as tiles), type="paramctrl" (parameter controls — the migrated model parameters[] cover these), type="color"/type="legend" (legend zones — ThoughtSpot draws its own), type="flipboard"/type="flipboard-nav" (Tableau Story-style flipboards — no ThoughtSpot liveboard equivalent). Before skipping a flipboard/story dashboard, salvage its content: a flipboard usually re-presents worksheets already migrated from another dashboard (check — it may reference no unique worksheets), but it often carries narrative captions (analyst commentary). Migrate any unique worksheets as vizzes and preserve the narrative text as note tiles rather than losing it; only the flip interaction itself is dropped. A single worksheet often emits several zones (the viz plus its color/filter companions); keep the viz zone, drop the companions, and de-duplicate by worksheet name.

9b. Worksheet shelf data

For each chart zone's worksheet_name, find the corresponding <worksheet> element in the TWB. Extract:

Columns shelf (<datasource-dependencies> → <column> with shelf column)
Rows shelf → shelf row
Mark type: <mark class="{type}"> (bar, line, circle/scatter, square, text, pie)
Color encoding: column on color shelf
Size encoding: column on size shelf
Aggregation: from column caption prefix (SUM(...), AVG(...), etc.)

9c. Map coordinates to ThoughtSpot 12-column grid

ThoughtSpot liveboards use a 12-column responsive grid. Tableau dashboards use absolute pixel coordinates (0–100,000 range).

Use a band-based approach:

Group zones by y-band — zones within 2,000 units of each other vertically are in the same row band.
Sort bands from smallest y to largest y (top to bottom).
Within each band, sort zones by x (left to right).
Assign columns: divide 12 columns proportionally by each zone's w relative to the total dashboard width. Round to nearest integer; ensure columns sum to 12.
Assign height: convert Tableau h to ThoughtSpot height units (1 unit ≈ 1/20th of the dashboard height; minimum 4 units).
Assign y position: start from 0; each new row band starts at the bottom of the previous band.

Save the grid layout as a list of tiles with zone_id, zone_type, worksheet_name, col, col_span, row_span, y.

9d. Orphan worksheets — surface and prompt to include

A workbook often contains worksheets that aren't placed on any dashboard being migrated. By default they produce no tile — but the author built them for a reason, and the model fully supports them, so the user should decide, not have them silently dropped (surface → recommend → resolve).

Detect. Compute the set of worksheets referenced by the dashboard(s) being migrated (the name on each chart zone). Any <worksheet> in the TWB not in that set is an orphan.
Describe each. Read the orphan's shelves (as in 9b) and state, in one line, what it shows and its ThoughtSpot equivalent — not just the name. E.g. "Attrition Yes/No Count — pie of headcount split by Attrition (Yes/No) → PIE [Attrition] [Total Employee Count]." A bare name leaves the user unable to choose.
Recommend. Say whether each looks worth adding (a meaningful, distinct view) or is likely a draft/superseded by a tile already on the dashboard.
Prompt (per the references — ask, don't assume). Offer: add all, add a subset (name which), or none. For any the user picks, build them as additional tiles in Step 10 (same chart-type resolution, theming, and grid placement as dashboarded vizzes) and append them after the dashboard's own tiles.
Record the outcome in the Migration Summary (Step 10g): which orphans existed, which were added, which were left off (and that the model still supports them via Spotter).

This is a MUST-ASK step — never skip the prompt or decide on the user's behalf. Orphans frequently include an overall-rate or breakdown view the author drafted but forgot to place. Even when the dashboard looks complete, the user may want the extra coverage. When recommending, default to "add as tiles" for orphans that represent a distinct, useful view (a different aggregation, a different dimension breakdown) — the user can always decline.

Step 10 — Generate Liveboard TML

10-pre. Export model and check for parameters (BEFORE generating TML)

Do this first, before writing any liveboard YAML. Export each model referenced by the liveboard to discover parameters and their UUIDs:

source ~/.zshenv && ts tml export {model_guid} --profile {profile_name}

Parse the exported model for parameters[] entries. If any exist, record:

name — the parameter display name
id — the UUID assigned by ThoughtSpot (needed for parameter_overrides[].key)

These will be used in Step 10f to add parameter_overrides and ordered_chips to the liveboard TML. If you skip this step, Step 10f cannot be completed — the UUIDs are not available from the TWB or from the import response.

10a. Resolve chart types

Default to CHART_MODE with the closest chart type — TABLE_MODE is a last resort. Only use TABLE_MODE for explicit crosstabs (Tableau text mark class) or when there is genuinely no chart type that can render the data. For untranslatable visualizations (k-means cluster, forecast), build a CHART_MODE placeholder with the most representative type (SCATTER for cluster inputs, LINE for forecast historical trend) and flag for review in the description — never fall back to TABLE_MODE as a lazy alternative.

Tableau mark class / zone	ThoughtSpot `chart.type`
`bar`	`BAR`
`line`	`LINE`
`circle` / `point`	`SCATTER`
`square`	`BAR`
`pie`	`PIE`
`area`	`AREA`
`text` (crosstab)	`TABLE` (display_mode `TABLE_MODE`)
Map (lat/long generated + geo role)	`GEO_BUBBLE` (or `GEO_AREA` for a filled/choropleth map)
"Measure Names / Measure Values" KPI block	`KPI` — one tile per measure (see KPI rule below)

KPI rule. A Tableau scorecard/KPI worksheet (Measure Names + Measure Values, no dimension) maps poorly to a single tile. Emit one KPI viz per measure — that's the idiomatic ThoughtSpot KPI (headline + sparkline + period-over-period). ALWAYS include a date when the model has one — this applies to every KPI tile (not just measure blocks), and is easy to forget. Date selection: 0 date fields → static KPI (measure only); exactly 1 → include it automatically; 2+ → ask the user which. Use the data's grain ([Date].yearly for annual data, [Date].monthly otherwise) — the default is monthly, so set .yearly explicitly for annual sources. So a "count of sectors" KPI in a workbook with a Fiscal Year column is [Total Sectors] [Fiscal Year].yearly, not a bare [Total Sectors].

For the trend/sparkline to actually render, three things are required:

The date must be in both chart_columns and on axis x, with the measure on y
A table: block with table_columns and ordered_column_ids
client_state_v2 on the chart: block with showSparkline: true in the kpiColumnProperties — without this, the KPI renders as a plain number with no trend line

See thoughtspot-liveboard-tml.md "KPI sparkline client_state_v2" for the verified template. The template requires:

kpiDisplayProperties at the chart level (showChange, showChangeAs: "PERCENT")
Per-column kpiColumnProperties with showSparkline: true on both the date and measure columns
axisProperties with fresh UUIDs (use python3 -c "import uuid; print(uuid.uuid4())")
Optional seriesColors to match the chosen theme palette

Full KPI viz template (substitute column names, UUIDs, and colors):

chart:
  type: KPI
  chart_columns:
  - column_id: "{ResolvedMeasure}"
  - column_id: "{Reso

> Content truncated for page performance. Open the source repository for the full SKILL.md file.

Tableau Workbook → ThoughtSpot

References

Prerequisites

Working principle — surface, recommend, resolve

Step 0 — Overview

Modes

Steps (Migrate mode)

Steps (Audit mode)

Step A1 — Locate TWB File(s) (Audit Mode)

Step A2 — Parse TWB XML (Audit Mode)

Step A3 — Classify Formulas (Audit Mode)

Classifier implementation notes

Step A4 — Migration Coverage Report (Audit Mode)

Step 1 — Authenticate

Step 2 — Extract TWB File

Step 3 — Parse TWB XML

3a. Workbook name

3b. For each <datasource> element (skip those named Parameters)

3c. Topological sort of calculated fields

3d. Dashboard metadata (for Step 8 decision)

3e. Extract blend relationships (data blending)

Step 4 — Select ThoughtSpot Connection

Step 4.5 — Confirm Source Tables & Search Decision

4.5a — Present the table list and ask (do NOT search yet)

4.5b — Act on the answer

4.5c — Confirm any missing sources before proceeding

Step 5 — Generate TML Files

5a. Table TML — one per physical table (skip custom SQL relations)

5b. Model TML — one per datasource (strict separation)

Parameter migration (Tableau → ThoughtSpot parameters[])

Formula reference translation

Tableau Sets → ThoughtSpot column sets (Phase 2a)

Set IN/OUT semantics — the column set IS the In/Out classification

Column-set TML emission (static set → GROUP_BASED cohort)

Query-set TML emission (Top-N/Bottom-N → ADVANCED cohort)

5c. SQL View TML — one per custom SQL relation

Step 5.5 — Spotter Enablement

Pre-import validation gate (I1 / I2 / I4 / I5)

Step 7 — Review Checkpoint & Import

Step 7.5 — Confirm the Model (before any liveboards)

Step 8 — Migrate Dashboards?

Step 9 — Parse Dashboard Layout and Map to Grid

9a. Zone extraction

9b. Worksheet shelf data

9c. Map coordinates to ThoughtSpot 12-column grid

9d. Orphan worksheets — surface and prompt to include

Step 10 — Generate Liveboard TML

10-pre. Export model and check for parameters (BEFORE generating TML)

10a. Resolve chart types

3b. For each `<datasource>` element (skip those named `Parameters`)

Parameter migration (Tableau → ThoughtSpot `parameters[]`)

Column-set TML emission (static set → `GROUP_BASED` cohort)