name: pdf-export
description: Convert any Ariadne-rendered Office artifact (.pptx gate-review deck, .xlsx compliance matrix, .docx proposal volume draft) into a frozen sibling .pdf via headless LibreOffice. Use this skill whenever the user wants to build, edit, or debug Python code that turns an existing .pptx / .docx / .xlsx into PDF for emailing - including writing core/pdf_export.py, adding "Render .pdf" / "Download .pdf" controls to the Pursuit Detail Compliance Matrix tab, the Pursuit Detail Volume Drafts tab, or the Gate Reviews milestone deck section, debugging LibreOffice profile lock-up, or handling cross-platform soffice discovery. Trigger whenever the user mentions "pdf export", "render pdf", "to pdf", "freeze the artifact", "email-ready pdf", "soffice", "libreoffice headless", "docx2pdf alternative", or references .pdf alongside .pptx / .xlsx / .docx in the Streamlit UI. This is the authoritative recipe - follow it instead of reaching for python-docx2pdf (Windows-Word-only), pypandoc, or hand-rolled subprocess invocations.
pdf-export
Render .pptx / .xlsx / .docx -> sibling .pdf via headless LibreOffice.
Single helper module, three UI surfaces, one skill.
When to use
- Adding a "Render .pdf" / "Download .pdf" pair next to an existing Office-artifact render flow on any Ariadne page.
- Producing an email-ready frozen artifact a capture lead can send to the customer or to internal leadership without exposing edits.
- Debugging or extending `core/pdf_export.py` (cross-platform soffice discovery, profile isolation, timeout handling).
Do not use this skill for:
- Generating PDFs from scratch (no source .pptx/.xlsx/.docx). Use ReportLab or weasyprint in a separate module.
- OCR or PDF -> text. That is Theseus's job (MinerU).
- Image-only PDFs.
Inputs / outputs
- Input: an existing path with suffix in `{".pptx", ".docx", ".xlsx"}` produced by `core.milestone_deck.render_deck`, `core.compliance_xlsx.render_xlsx`, or `core.proposal_doc.render_docx`.
- Output: a sibling `.pdf` at `src.with_suffix(".pdf")`. Always the same filename stem; PDF lands in the same folder as the source.
- No new file conventions. We piggyback on the source artifact's location (deck under `06_reviews/briefs/`, matrix under `04_proposal/compliance/`, draft under `04_proposal/volumes/<volume>/drafts/`).
Public API
from core.pdf_export import to_pdf, SUPPORTED_SUFFIXES
pdf_path = to_pdf(office_path)  # returns a pathlib.Path to the sibling .pdf
SUPPORTED_SUFFIXES  # frozenset({".pptx", ".docx", ".xlsx"})
Errors raised:
| Condition | Exception |
|---|---|
| `src` does not exist | `FileNotFoundError` |
| `src.suffix` not in `SUPPORTED_SUFFIXES` | `ValueError` |
| LibreOffice missing | `RuntimeError` |
| Conversion timeout (>90s) | `RuntimeError` |
| LibreOffice ran, no PDF emitted | `RuntimeError` |
The helper has no LLM, no Theseus, no Ariadne-specific assumptions about the source path - it works on any .pptx/.docx/.xlsx anywhere.
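The first two rows of the error table can be sketched as a pre-flight check. This is a minimal illustration, not the module's actual code: the function name `check_source` is hypothetical, and checking existence before the suffix is an assumption taken from the row order of the table.

```python
from pathlib import Path

SUPPORTED_SUFFIXES = frozenset({".pptx", ".docx", ".xlsx"})


def check_source(src: Path) -> Path:
    """Hypothetical pre-flight check; returns where the sibling PDF would land."""
    src = Path(src)
    # Assumed order: existence first, then suffix (mirrors the table rows).
    if not src.exists():
        raise FileNotFoundError(f"source artifact not found: {src}")
    if src.suffix not in SUPPORTED_SUFFIXES:
        raise ValueError(
            f"unsupported suffix {src.suffix!r}; expected one of {sorted(SUPPORTED_SUFFIXES)}"
        )
    # Same stem, same folder, .pdf suffix.
    return src.with_suffix(".pdf")
```

Callers that only need the target path can rely on `src.with_suffix(".pdf")` directly; the check matters when the error type has to be distinguished in the UI.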
How LibreOffice is invoked
Always with these flags, in order, via subprocess.run:
soffice
-env:UserInstallation=file:///<isolated-temp-profile>
--headless --norestore --nologo --nofirststartwizard
--convert-to pdf
--outdir <src.parent>
<src>
Why each flag matters:
- `-env:UserInstallation` points LibreOffice at a throwaway profile in `tempfile.TemporaryDirectory()`. Without this, an already-running foreground LibreOffice (developer has Writer open) will hold a lock on the shared profile and the headless invocation will silently produce no PDF.
- `--headless --norestore --nologo --nofirststartwizard`: server mode, suppress recovery dialog, splash, and the "first run" wizard.
- `--outdir <src.parent>`: PDF lands beside the source so callers can compute the result with `src.with_suffix(".pdf")`.
HOME and USERPROFILE are also overridden to point at the temp dir
so LibreOffice cannot fall back to a shared user profile.
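The invocation described above could look roughly like the following. It is a sketch under stated assumptions: `build_command` and `convert` are hypothetical names, the 90-second timeout is taken from the error table, and the way the real module captures output or reports stderr is not shown in this document.

```python
import os
import subprocess
import tempfile
from pathlib import Path

TIMEOUT_S = 90  # taken from the "Conversion timeout (>90s)" row


def build_command(soffice: str, src: Path, profile_dir: Path) -> list[str]:
    """Assemble the argument list in the flag order given above."""
    return [
        soffice,
        # as_uri() yields file:///... and requires an absolute path.
        f"-env:UserInstallation={profile_dir.as_uri()}",
        "--headless", "--norestore", "--nologo", "--nofirststartwizard",
        "--convert-to", "pdf",
        "--outdir", str(src.parent),
        str(src),
    ]


def convert(soffice: str, src: Path) -> Path:
    """Hypothetical wrapper: isolated profile, overridden HOME, bounded runtime."""
    src = Path(src).resolve()
    with tempfile.TemporaryDirectory() as tmp:
        # Point HOME and USERPROFILE at the temp dir as well.
        env = dict(os.environ, HOME=tmp, USERPROFILE=tmp)
        try:
            subprocess.run(
                build_command(soffice, src, Path(tmp)),
                env=env, capture_output=True, timeout=TIMEOUT_S, check=False,
            )
        except subprocess.TimeoutExpired as exc:
            raise RuntimeError(f"LibreOffice timed out after {TIMEOUT_S}s") from exc
    pdf = src.with_suffix(".pdf")
    if not pdf.exists():
        # LibreOffice can exit cleanly without writing anything.
        raise RuntimeError(f"LibreOffice ran but emitted no PDF for {src.name}")
    return pdf
```

Checking for the output file rather than trusting the exit code is what turns the silent profile-lock failure into a visible `RuntimeError`.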
Cross-platform soffice discovery
`_find_soffice()` tries, in order:
- `shutil.which("soffice")`
- `shutil.which("libreoffice")`
- Probes a static list of well-known install paths (Windows two `Program Files` variants, macOS `.app` bundle, Linux `/usr/bin`, `/usr/local/bin`, `/snap/bin`).
Raises RuntimeError with the probed list in the message if all miss.
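A minimal sketch of that lookup order follows. The candidate paths are illustrative guesses at typical install locations, not the module's actual list, and the injectable `which` / `exists` parameters are an assumption added here so the logic can be exercised without LibreOffice installed.

```python
import shutil
from pathlib import Path

# Illustrative candidates only; the real static list is not shown in this document.
WELL_KNOWN_PATHS = (
    r"C:\Program Files\LibreOffice\program\soffice.exe",
    r"C:\Program Files (x86)\LibreOffice\program\soffice.exe",
    "/Applications/LibreOffice.app/Contents/MacOS/soffice",
    "/usr/bin/soffice",
    "/usr/local/bin/soffice",
    "/snap/bin/libreoffice",
)


def find_soffice(
    which=shutil.which,
    candidates=WELL_KNOWN_PATHS,
    exists=lambda p: Path(p).is_file(),
) -> str:
    """Return the first usable LibreOffice binary, or raise RuntimeError."""
    # Steps 1 and 2: look on PATH under both common names.
    for name in ("soffice", "libreoffice"):
        found = which(name)
        if found:
            return found
    # Step 3: fall back to the static install locations.
    for candidate in candidates:
        if exists(candidate):
            return candidate
    # Include the probed list so the user can see what was tried.
    raise RuntimeError("LibreOffice not found; probed: " + ", ".join(candidates))
```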
UI wiring (Streamlit)
Pattern: extend the existing render src + download src column pair to a 4-column row by appending a third "Render .pdf" button and a fourth "Download .pdf" button. Cache the resulting path under a session-state key parallel to the source key.
from pathlib import Path
import streamlit as st
from core.pdf_export import to_pdf
src_state_key = f"<artifact>_<slug>_<bits>"  # already exists
pdf_state_key = f"{src_state_key}_pdf"
# auto-restore PDF if it already lives on disk
src_path = st.session_state.get(src_state_key)
if src_path and Path(src_path).with_suffix(".pdf").exists():
    st.session_state[pdf_state_key] = Path(src_path).with_suffix(".pdf")
c1, c2, c3, c4 = st.columns(4)
with c1:
    ...  # Render src (existing button)
with c2:
    ...  # Download src (existing button)
with c3:
    src_path = st.session_state.get(src_state_key)
    # Keep the button disabled until the source artifact exists on disk.
    disabled = not (src_path and Path(src_path).exists())
    if st.button("Render .pdf", key=f"render_pdf_{src_state_key}",
                 use_container_width=True, disabled=disabled,
                 help=("Render the source artifact first" if disabled else None)):
        try:
            with st.spinner("Converting to PDF..."):
                pdf_path = to_pdf(Path(src_path))
            st.session_state[pdf_state_key] = pdf_path
            st.success(f"Wrote {pdf_path.name}")
        except Exception as e:
            st.error(f"PDF render failed: {e}")
with c4:
    pdf_path = st.session_state.get(pdf_state_key)
    if pdf_path and Path(pdf_path).exists():
        st.download_button(
            "Download .pdf",
            data=Path(pdf_path).read_bytes(),
            file_name=Path(pdf_path).name,
            mime="application/pdf",
            key=f"dl_pdf_{src_state_key}",
            use_container_width=True,
        )
    else:
        st.caption("Render PDF to enable download.")
Place the 4-col row in the same `with tabs[N]:` block as the existing 2-col render/download row, which it replaces. The "Render .pdf" button stays disabled until the source artifact has been rendered (otherwise the user gets a FileNotFoundError).
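Because the same auto-restore step recurs on all three surfaces, it could be factored into a small helper. The sketch below is one possible shape, not part of the existing module: `restore_pdf_state` is a hypothetical name, and it takes any mutable mapping so that `st.session_state` or a plain dict can be passed in.

```python
from pathlib import Path
from typing import MutableMapping, Optional


def restore_pdf_state(state: MutableMapping, src_state_key: str) -> Optional[Path]:
    """Hypothetical helper: cache the sibling PDF path if it already exists on disk."""
    pdf_state_key = f"{src_state_key}_pdf"  # parallel key, as in the snippet above
    src_path = state.get(src_state_key)
    if not src_path:
        return None  # source not rendered yet, nothing to restore
    pdf_path = Path(src_path).with_suffix(".pdf")
    if pdf_path.exists():
        state[pdf_state_key] = pdf_path
        return pdf_path
    return None
```

On a page this would be called once before the column row, for example `restore_pdf_state(st.session_state, src_state_key)`.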
Workflow checklist (when adding to a new surface)
- Confirm the surface already produces an Office artifact via one of the three render functions and stores the path in `st.session_state[<src_state_key>]`.
- Define `pdf_state_key = f"{src_state_key}_pdf"`.
- Auto-restore the PDF path if `src.with_suffix(".pdf")` already exists on disk (mirrors how src auto-restores).
- Replace the existing 2-col row with a 4-col row using the snippet above.
- Ensure `from core.pdf_export import to_pdf` is at the top of the page along with the existing render imports.
- Smoke: render src first, click Render .pdf, verify Download .pdf appears within ~10s and the file opens in a PDF viewer.
Validation / smoke
.\.venv\Scripts\Activate.ps1
python _smoke.py # creates tiny .docx -> to_pdf -> assert .pdf > 1KB
Remove-Item _smoke.py -Force
The smoke must:
- Build a one-paragraph .docx with `python-docx`.
- Call `to_pdf(docx)` and assert the returned `.pdf` exists and is > 1 KB (LibreOffice produces a ~25 KB PDF for a single paragraph).
- Assert PDF magic bytes (`%PDF-`).
- Assert error paths: missing path -> `FileNotFoundError`; `.txt` source -> `ValueError`.
- Static-parse all three UI pages with `ast` and grep for `from core.pdf_export import`, `Render .pdf`, `Download .pdf`, and `_pdf` state-key suffix.
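The file-level and page-level checks in that list might be written as two small helpers. This is a sketch only: `check_pdf` and `missing_markers` are hypothetical names, and the document does not show how `_smoke.py` is actually structured or which page files it scans.

```python
import ast
from pathlib import Path

# Strings the smoke list says each UI page must contain.
REQUIRED_MARKERS = (
    "from core.pdf_export import",
    "Render .pdf",
    "Download .pdf",
    "_pdf",
)


def check_pdf(pdf: Path, min_bytes: int = 1024) -> None:
    """Assert the file exists, exceeds 1 KB, and starts with the PDF magic bytes."""
    assert pdf.exists(), f"{pdf} was not written"
    data = pdf.read_bytes()
    assert len(data) > min_bytes, f"{pdf.name} is only {len(data)} bytes"
    assert data.startswith(b"%PDF-"), f"{pdf.name} lacks the %PDF- header"


def missing_markers(page_source: str) -> list[str]:
    """Parse a page to prove it is valid Python, then report absent markers."""
    ast.parse(page_source)  # raises SyntaxError on a broken page
    return [m for m in REQUIRED_MARKERS if m not in page_source]
```

A smoke script would call `check_pdf(to_pdf(docx))` after building the one-paragraph document, then run `missing_markers(path.read_text())` over each UI page and fail if any list is non-empty.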
Anti-patterns (block at review)
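- Calling `docx2pdf` (requires an installed Microsoft Word, so it runs only on Windows or macOS, and it handles .docx alone).
- Calling `pypandoc` for Office formats (Pandoc renders PDF through a separate engine such as LaTeX rather than reproducing the Office layout, and adds an extra dep).
- Reusing the user's foreground LibreOffice profile (locks fail silently with no PDF).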
- Eagerly converting on every page render (PDF generation is slow; always gate behind an explicit button).
- Hand-rolling MIME types - use `application/pdf`.
- Skipping the `disabled=` guard on Render .pdf when src does not exist - users will hit FileNotFoundError on first click.