name: dtge-populate-template description: Populate styled ODT/OTT templates with content while preserving formatting
ODT Template Population
Populate styled ODT/OTT templates with structured content by replacing placeholder text while preserving all paragraph styles, table formatting, heading levels, and document structure.
Overview
LibreOffice MCP tools (libreoffice/create_document, libreoffice/insert_text_at_position) can create documents with content but cannot selectively replace text within a pre-styled template while preserving formatting. This skill solves that problem by manipulating the ODT's internal XML (content.xml) directly via Python and lxml.
The expected input is a markdown file containing structured content. The populate script parses the markdown to extract sections, tables, and list items, then maps them onto the template's XML elements. Markdown is the ideal intermediate format because it's trivial to parse, easy to review, and naturally represents the document structure (headings, paragraphs, tables, lists).
Use this skill when:
- You have a branded/styled ODT template with placeholder text
- You have structured content (typically as markdown) that needs to be placed into the template
- The template has styled headings, colored tables, separator borders, cover page layouts, or other visual elements that must be preserved
Do NOT use this skill when:
- You just need to create a plain document from scratch (use
libreoffice/create_documentinstead) - The document has no special formatting to preserve
Prerequisites
- Python with
lxmlavailable in the project's devShell (checkscripts/flake.nix) - If
lxmlis not available, add it to the flake's Python packages before proceeding
Workflow
Phase 1: Analyze Template
Write a small Python script to dump the template's body elements with indices, tags, styles, and text content. This reveals the exact XML structure needed for targeted replacement.
Analysis script pattern:
#!/usr/bin/env python3
"""Dump template structure for analysis."""
import zipfile
from lxml import etree
NS = {
'text': 'urn:oasis:names:tc:opendocument:xmlns:text:1.0',
'table': 'urn:oasis:names:tc:opendocument:xmlns:table:1.0',
'office': 'urn:oasis:names:tc:opendocument:xmlns:office:1.0',
}
def get_full_text(elem):
parts = []
if elem.text:
parts.append(elem.text)
for child in elem:
parts.append(get_full_text(child))
if child.tail:
parts.append(child.tail)
return ''.join(parts)
TEMPLATE = '<path-to-template.odt>'
with zipfile.ZipFile(TEMPLATE, 'r') as z:
root = etree.fromstring(z.read('content.xml'))
body = root.find('.//office:body/office:text', NS)
for i, elem in enumerate(body):
tag = etree.QName(elem.tag).localname
style = elem.get('{urn:oasis:names:tc:opendocument:xmlns:text:1.0}style-name', '')
text = get_full_text(elem).strip()[:80]
if tag == 'table':
tname = elem.get('{urn:oasis:names:tc:opendocument:xmlns:table:1.0}name', '')
rows = elem.findall('.//table:table-row', NS)
print(f"[{i:3d}] <{tag}> name={tname} rows={len(rows)}")
for ri, row in enumerate(rows):
cells = row.findall('table:table-cell', NS)
cell_texts = [get_full_text(c).strip()[:30] for c in cells]
print(f" row[{ri}]: {cell_texts}")
elif tag == 'h':
level = elem.get('{urn:oasis:names:tc:opendocument:xmlns:text:1.0}outline-level', '')
print(f"[{i:3d}] <{tag}> level={level} style={style} | {text}")
else:
print(f"[{i:3d}] <{tag}> style={style} | {text}")
Run this from the scripts/ directory. The output tells you:
- Which element index corresponds to which content
- What paragraph styles are used (needed to preserve formatting)
- How tables are structured (header rows vs data rows)
- Where placeholder text appears
Phase 2: Map Content
Write a markdown file containing all the content that will go into the template. Use a consistent structure that mirrors the template layout so the populate script can parse it predictably.
Markdown content conventions:
- Use heading levels (
#,##,###,####) to delimit major sections - Use markdown tables (
| col | col |) for tabular data - Use numbered/bulleted lists for list items
- Use
**bold**markers to tag field labels (e.g.,**Severity:** Critical) - Use
---horizontal rules to separate repeating blocks (e.g., between gap findings) - Keep section headings consistent so the parser can locate content by heading text
Then define the mapping between markdown sections and template elements. Identify three operation types:
- Simple text replacement — A markdown section's body text replaces the text in a
text:portext:helement at a known index - Table row cloning — A markdown table's rows are parsed and each row clones a template data row
- Paragraph block cloning — A repeating markdown section (delimited by
---) clones a group of template elements for each entry
For each placeholder in the template, note:
- Element index (from Phase 1 analysis)
- Operation type (replace, clone row, clone block)
- Which markdown heading/section provides the source content
Phase 3: Write Populate Script
Using the utility functions below, write a script specific to the template that:
- Reads and parses the markdown file to extract content
- Copies the template to produce the output file (template stays pristine)
- Replaces placeholder content in the output with parsed markdown content
Script structure:
#!/usr/bin/env python3
"""Populate <template-name> with content from markdown."""
import zipfile, shutil, copy, re, sys
from lxml import etree
# --- Paths ---
TEMPLATE = '<path-to-template.odt>' # Pristine template (never modified)
MARKDOWN = '<path-to-content.md>' # Markdown content source
OUTPUT = '<path-to-output.odt>' # Populated output file
# --- ODF Namespaces ---
NS = { ... } # (see utility functions below)
# --- Utility functions ---
# (see utility functions below)
# --- Markdown parsing functions ---
def parse_markdown(md_path):
"""Parse markdown file into a structured dict of sections."""
with open(md_path, 'r') as f:
text = f.read()
# Parse headings, tables, lists, field values...
# Return a dict keyed by section name
...
# --- Main script ---
def main():
# Parse markdown content
content = parse_markdown(MARKDOWN)
# Copy template to output (template stays untouched)
shutil.copy2(TEMPLATE, OUTPUT)
# Read output ODT
with zipfile.ZipFile(OUTPUT, 'r') as z:
content_xml = z.read('content.xml')
all_files = z.namelist()
file_data = {n: z.read(n) for n in all_files if n != 'content.xml'}
root = etree.fromstring(content_xml)
body = root.find('.//office:body/office:text', NS)
elements = list(body)
# Map parsed markdown content onto template elements...
# Save output — mimetype must be first entry and uncompressed per ODF spec
new_content_xml = etree.tostring(root, xml_declaration=True, encoding='UTF-8')
with zipfile.ZipFile(OUTPUT, 'w', zipfile.ZIP_DEFLATED) as zout:
if 'mimetype' in file_data:
zout.writestr('mimetype', file_data['mimetype'], compress_type=zipfile.ZIP_STORED)
zout.writestr('content.xml', new_content_xml)
for name, data in file_data.items():
if name != 'mimetype':
zout.writestr(name, data)
if __name__ == '__main__':
main()
Key difference from in-place modification: The template is copied to the output path first, then the output is modified. The template file is never altered, so it can be reused for future reports.
Phase 4: Verify
After running the populate script, verify the output:
- Check for remaining placeholders — Search the output document's XML for any
<or>placeholder markers that weren't replaced - Check element counts — Verify cloned sections have the expected number of elements (e.g., 15 gap blocks should produce 15 headings)
- Open in LibreOffice — Use
libreoffice/open_document_in_libreofficeto visually confirm formatting is preserved - Spot-check content — Verify representative content appears correctly in the output
Verification script pattern:
#!/usr/bin/env python3
"""Verify populated template."""
import zipfile
from lxml import etree
NS = { ... }
with zipfile.ZipFile('<output.odt>', 'r') as z:
root = etree.fromstring(z.read('content.xml'))
body = root.find('.//office:body/office:text', NS)
# Check for remaining placeholders
for elem in body.iter():
text = (elem.text or '') + (elem.tail or '')
if '<' in text and '>' in text:
print(f"PLACEHOLDER REMAINING: {text[:80]}")
# Count specific elements
headings = body.findall('.//text:h', NS)
tables = body.findall('.//table:table', NS)
print(f"Headings: {len(headings)}")
print(f"Tables: {len(tables)}")
Utility Functions Reference
These are the reusable Python functions for ODT template manipulation. Copy them into your populate script.
ODF Namespace Constants
NS = {
'text': 'urn:oasis:names:tc:opendocument:xmlns:text:1.0',
'table': 'urn:oasis:names:tc:opendocument:xmlns:table:1.0',
'office': 'urn:oasis:names:tc:opendocument:xmlns:office:1.0',
'style': 'urn:oasis:names:tc:opendocument:xmlns:style:1.0',
'fo': 'urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0',
}
Core Functions
def qn(ns_prefix, local):
"""Create a qualified XML name from namespace prefix and local name."""
return '{%s}%s' % (NS[ns_prefix], local)
def get_full_text(elem):
"""Recursively get all text from an element and its children."""
parts = []
if elem.text:
parts.append(elem.text)
for child in elem:
parts.append(get_full_text(child))
if child.tail:
parts.append(child.tail)
return ''.join(parts)
def set_para_text(elem, new_text):
"""Set paragraph text, removing any child spans while preserving the paragraph's style."""
for child in list(elem):
elem.remove(child)
elem.text = new_text
def set_cell_text(cell, new_text):
"""Set text in a table cell's first paragraph, creating one if needed."""
paras = cell.findall('text:p', NS)
if paras:
set_para_text(paras[0], new_text)
else:
p = etree.SubElement(cell, qn('text', 'p'))
p.text = new_text
def clone_table_row(table, template_row_index, data_list):
"""
Clone a template data row for each entry in data_list.
Args:
table: The table:table element
template_row_index: Index of the row to use as template (0 = header, 1 = first data row)
data_list: List of lists — one list of cell values per new row
The template row is cloned once per entry, populated, and the original template row is removed.
"""
rows = table.findall('table:table-row', NS)
template_row = rows[template_row_index]
parent = template_row.getparent()
insert_point = template_row
for row_data in data_list:
new_row = copy.deepcopy(template_row)
cells = new_row.findall('table:table-cell', NS)
for ci, val in enumerate(row_data):
if ci < len(cells):
set_cell_text(cells[ci], val)
insert_point.addnext(new_row)
insert_point = new_row
parent.remove(template_row)
ODT ZIP Read/Write Pattern
import zipfile, shutil
# Copy template to output (template stays pristine)
shutil.copy2(template_path, output_path)
# Read output ODT
with zipfile.ZipFile(output_path, 'r') as z:
content_xml = z.read('content.xml')
all_files = z.namelist()
file_data = {n: z.read(n) for n in all_files if n != 'content.xml'}
root = etree.fromstring(content_xml)
body = root.find('.//office:body/office:text', NS)
# ... modify body elements ...
# Write — repack ALL files in the output, replacing only content.xml
# ODT spec (ODF 1.2, §3.3) requires mimetype as the first entry, stored uncompressed
new_content_xml = etree.tostring(root, xml_declaration=True, encoding='UTF-8')
with zipfile.ZipFile(output_path, 'w', zipfile.ZIP_DEFLATED) as zout:
if 'mimetype' in file_data:
zout.writestr('mimetype', file_data['mimetype'], compress_type=zipfile.ZIP_STORED)
zout.writestr('content.xml', new_content_xml)
for name, data in file_data.items():
if name != 'mimetype':
zout.writestr(name, data)
ODT XML Structure Reference
Quick reference for common ODF element patterns encountered in templates.
Paragraphs
<text:p text:style-name="P9">Body text content here</text:p>
- Style name (e.g.,
P9,Standard) controls font, size, spacing, alignment set_para_text()preserves the style while replacing text
Headings
<text:h text:style-name="P10" text:outline-level="2">Section Title</text:h>
text:outline-leveldetermines heading level (1 = H1, 2 = H2, etc.)- Style name controls appearance (font size, bold, color, etc.)
Tables
<table:table table:name="Table1">
<table:table-column ... />
<table:table-row> <!-- header row -->
<table:table-cell>
<text:p text:style-name="P12">Header</text:p>
</table:table-cell>
</table:table-row>
<table:table-row> <!-- data row (clone this) -->
<table:table-cell>
<text:p text:style-name="P13">Data</text:p>
</table:table-cell>
</table:table-row>
</table:table>
- Row 0 is typically the header row — never clone it
- Row 1 is typically the template data row — clone it for each data entry
clone_table_row()handles this pattern
Line Breaks
# Insert a line break within a paragraph
lb = etree.SubElement(paragraph, qn('text', 'line-break'))
lb.tail = "Text after the line break"
- Use instead of
\n— ODT does not interpret newlines as line breaks
Element Navigation
next_elem = elem.getnext() # Next sibling element
prev_elem = elem.getprevious() # Previous sibling element
parent = elem.getparent() # Parent element
elem.addnext(new_elem) # Insert new_elem after elem
elem.addprevious(new_elem) # Insert new_elem before elem
parent.remove(elem) # Remove elem from parent
Finding Elements After Modification
After inserting or removing elements, the original elements = list(body) snapshot is stale. To find elements after modification:
# By table name attribute
for t in body.findall('.//table:table', NS):
if t.get(qn('table', 'name')) == 'Table4':
target_table = t
# By text content
for elem in body:
if get_full_text(elem).strip().startswith('Some heading text'):
target = elem
Template Design Conventions
When creating ODT templates intended for population with this skill:
Use
< angle bracket >placeholders for variable content (e.g.,<Client Name>,<Date>). These are easy to search for during verification.Include exactly one example data row in tables that need cloning. The
clone_table_row()function uses this row as a template and removes it after cloning.Include exactly one example block for repeating sections (e.g., gap findings). The populate script clones this block for each entry and removes the original.
Keep placeholder text in paragraph
.text(not inside child<text:span>elements) for simplest replacement. If you need styled sub-ranges within a paragraph, use spans, butset_para_text()will strip them.Name tables in LibreOffice (right-click table > Table Properties > Name) so they can be found reliably by name after element indices shift during population.
Use consistent paragraph styles — each visual pattern (body text, citations, sub-headings, list items) should use a distinct named paragraph style so cloned elements inherit the correct appearance.
Reference Implementation
A complete working example is available:
- Template:
Work/Clients/Qualira/Higi/special_510k/11-Gap_Assessment/[DTG] Higi_Gap_Analysis_QP-07-10.odt - Populate script:
Work/Clients/Qualira/Higi/scripts/populate_gap_report.py - Markdown content:
Work/Clients/Qualira/Higi/special_510k/11-Gap_Assessment/Higi_Gap_Analysis_QP-07-10.md
This implementation demonstrates the full pattern:
- Gap analysis outputs structured content as markdown
- Analyze template structure (element indices, styles, tables)
- Write a populate script that parses the markdown and maps content onto template elements
- Populate cover page, executive summary, multiple tables (with row cloning), repeating gap finding blocks (with block cloning), and recommendation sections
- Handle line breaks, dynamic paragraph insertion, and post-modification element lookup
Integration
With /dtge-gap-analysis
The gap analysis skill generates structured content as a markdown file, then uses this skill to populate a branded [DTG]-prefixed template. The workflow is:
- Gap analysis produces a markdown file with all analysis content
- This skill's populate script reads the markdown, parses it, and maps content onto the template
- The template is copied (never modified) and the copy is populated to produce the formatted deliverable
With /dtge-create-dhf
DHF document generation can use this skill for template-based document creation where branded formatting must be preserved.
General Usage
Any skill that produces structured content for a pre-formatted ODT template can use this skill's approach. The expected flow is:
- Upstream skill generates content as markdown
- This skill's workflow populates a branded template from that markdown
- The template file remains pristine and reusable