pdf-export

Convert any Ariadne-rendered Office artifact (.pptx gate-review deck, .xlsx compliance matrix, .docx proposal volume draft) into a frozen sibling .pdf via headless LibreOffice. Use this skill whenever the user wants to build, edit, or debug Python code that turns an existing .pptx / .docx / .xlsx into PDF for emailing - including writing core/pdf_export.py, adding "Render .pdf" / "Download .pdf" controls to the Pursuit Detail Compliance Matrix tab, the Pursuit Detail Volume Drafts tab, or the Gate Reviews milestone deck section, debugging LibreOffice profile lock-up, or handling cross-platform soffice discovery. Trigger whenever the user mentions "pdf export", "render pdf", "to pdf", "freeze the artifact", "email-ready pdf", "soffice", "libreoffice headless", "docx2pdf alternative", or references .pdf alongside .pptx / .xlsx / .docx in the Streamlit UI. This is the authoritative recipe - follow it instead of reaching for python-docx2pdf (Windows-Word-only), pypandoc, or hand-rolled subprocess invocations.

By BdM-15. Updated 4/23/2026.

pdf-export

Render .pptx / .xlsx / .docx -> sibling .pdf via headless LibreOffice. Single helper module, three UI surfaces, one skill.

When to use

  • Adding a "Render .pdf" / "Download .pdf" pair next to an existing Office-artifact render flow on any Ariadne page.
  • Producing an email-ready frozen artifact a capture lead can send to the customer or to internal leadership without exposing edits.
  • Debugging or extending core/pdf_export.py (cross-platform soffice discovery, profile isolation, timeout handling).

Do not use this skill for:

  • Generating PDFs from scratch (no source .pptx/.xlsx/.docx). Use ReportLab or weasyprint in a separate module.
  • OCR or PDF -> text. That is Theseus's job (MinerU).
  • Image-only PDFs.

Inputs / outputs

  • Input: an existing path with suffix in {".pptx", ".docx", ".xlsx"} produced by core.milestone_deck.render_deck, core.compliance_xlsx.render_xlsx, or core.proposal_doc.render_docx.
  • Output: a sibling .pdf at src.with_suffix(".pdf"). Always the same filename stem; PDF lands in the same folder as the source.
  • No new file conventions. We piggyback on the source artifact's location (deck under 06_reviews/briefs/, matrix under 04_proposal/compliance/, draft under 04_proposal/volumes/<volume>/drafts/).
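
The sibling-path rule can be illustrated with pathlib alone. This is a minimal sketch: sibling_pdf is a hypothetical helper name and the file names are made up for illustration; only the with_suffix(".pdf") convention comes from the text above.

```python
from pathlib import PurePosixPath


def sibling_pdf(src: PurePosixPath) -> PurePosixPath:
    """Return the path the PDF is expected to land at: same folder, same stem."""
    return src.with_suffix(".pdf")


# Hypothetical deck path under the documented 06_reviews/briefs/ folder.
deck = PurePosixPath("06_reviews/briefs/gate2_deck.pptx")
print(sibling_pdf(deck))  # 06_reviews/briefs/gate2_deck.pdf

# with_suffix replaces only the last suffix, so dotted stems survive intact.
print(sibling_pdf(PurePosixPath("drafts/volume1.v2.docx")))  # drafts/volume1.v2.pdf
```

Because callers derive the PDF location from the source path, no separate bookkeeping of output paths is needed.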

Public API

from core.pdf_export import to_pdf, SUPPORTED_SUFFIXES

pdf_path = to_pdf(office_path)          # returns a Path to the sibling .pdf
SUPPORTED_SUFFIXES                       # frozenset({".pptx", ".docx", ".xlsx"})

Errors raised:

Condition                            Exception
-----------------------------------  -----------------
src does not exist                   FileNotFoundError
src.suffix not in supported          ValueError
LibreOffice missing                  RuntimeError
Conversion timeout (>90s)            RuntimeError
LibreOffice ran, no PDF emitted      RuntimeError

The helper makes no LLM calls, does not depend on Theseus, and makes no Ariadne-specific assumptions about the source path. It works on any .pptx/.docx/.xlsx anywhere.
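
The first two rows of the error table can be read as pre-flight checks that run before LibreOffice is touched. The sketch below is an assumption about how they might be enforced, not the actual core/pdf_export.py: check_source is a hypothetical name, the check order (existence first, then suffix) is assumed from the table order, and SUPPORTED_SUFFIXES is redefined locally so the block runs on its own.

```python
from pathlib import Path

# Local stand-in for core.pdf_export.SUPPORTED_SUFFIXES (assumed shape).
SUPPORTED_SUFFIXES = frozenset({".pptx", ".docx", ".xlsx"})


def check_source(src: Path) -> Path:
    """Hypothetical pre-flight check: fail fast before spawning LibreOffice."""
    src = Path(src)
    if not src.exists():
        raise FileNotFoundError(f"Source artifact not found: {src}")
    if src.suffix not in SUPPORTED_SUFFIXES:
        raise ValueError(
            f"Unsupported suffix {src.suffix!r}; "
            f"expected one of {sorted(SUPPORTED_SUFFIXES)}"
        )
    return src
```

Keeping FileNotFoundError and ValueError distinct from the three RuntimeError cases lets a caller tell "bad input" apart from "LibreOffice problem" without parsing messages.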

How LibreOffice is invoked

Always with these flags, in order, via subprocess.run:

soffice
  -env:UserInstallation=file:///<isolated-temp-profile>
  --headless --norestore --nologo --nofirststartwizard
  --convert-to pdf
  --outdir <src.parent>
  <src>

Why each flag matters:

  • -env:UserInstallation points LibreOffice at a throwaway profile in tempfile.TemporaryDirectory(). Without this, an already-running foreground LibreOffice (developer has Writer open) will hold a lock on the shared profile and the headless invocation will silently produce no PDF.
  • --headless --norestore --nologo --nofirststartwizard: server mode, suppress recovery dialog, splash, and the "first run" wizard.
  • --outdir <src.parent>: PDF lands beside the source so callers can compute the result with src.with_suffix(".pdf").

HOME and USERPROFILE are also overridden to point at the temp dir so LibreOffice cannot fall back to a shared user profile.
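
Put together, the invocation described above might look like the following sketch. It is not the real helper: build_command, isolated_env, and convert are hypothetical names, and the runner parameter is an assumption added so the flow can be exercised without LibreOffice installed. The flag order, the 90-second timeout, and the HOME / USERPROFILE override come from this document.

```python
import os
import subprocess
import tempfile
from pathlib import Path


def build_command(soffice: str, src: Path, profile_dir: Path) -> list[str]:
    """Assemble the argument list in the documented flag order."""
    return [
        soffice,
        # as_uri() needs an absolute path and yields a file:/// URL.
        f"-env:UserInstallation={Path(profile_dir).resolve().as_uri()}",
        "--headless", "--norestore", "--nologo", "--nofirststartwizard",
        "--convert-to", "pdf",
        "--outdir", str(src.parent),
        str(src),
    ]


def isolated_env(profile_dir: Path) -> dict[str, str]:
    """Copy the environment and point HOME / USERPROFILE at the temp dir."""
    env = dict(os.environ)
    env["HOME"] = str(profile_dir)
    env["USERPROFILE"] = str(profile_dir)
    return env


def convert(soffice: str, src: Path, timeout: int = 90,
            runner=subprocess.run) -> Path:
    """Run one conversion in a throwaway profile and return the sibling PDF."""
    src = Path(src)
    with tempfile.TemporaryDirectory() as tmp:
        profile = Path(tmp)
        try:
            runner(build_command(soffice, src, profile),
                   env=isolated_env(profile),
                   capture_output=True, timeout=timeout, check=False)
        except subprocess.TimeoutExpired as exc:
            raise RuntimeError(
                f"LibreOffice timed out after {timeout}s on {src.name}"
            ) from exc
    pdf = src.with_suffix(".pdf")
    # The exit code alone is not trusted: a locked profile can yield no PDF.
    if not pdf.exists():
        raise RuntimeError(f"LibreOffice ran but emitted no PDF for {src.name}")
    return pdf
```

Checking for the output file rather than the return code matches the "LibreOffice ran, no PDF emitted" row in the error table.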

Cross-platform soffice discovery

_find_soffice() tries, in order:

  1. shutil.which("soffice")
  2. shutil.which("libreoffice")
  3. A static list of well-known install paths (the two Windows Program Files variants, the macOS .app bundle, and on Linux /usr/bin, /usr/local/bin, /snap/bin).

Raises RuntimeError with the probed list in the message if all miss.
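
A minimal sketch of that lookup order follows. The name find_soffice, the injectable which parameter, and the exact entries in WELL_KNOWN_PATHS are assumptions for illustration; the real candidate list lives in core/pdf_export.py and may differ.

```python
import shutil
from pathlib import Path

# Illustrative candidates only (typical default install locations).
WELL_KNOWN_PATHS = (
    r"C:\Program Files\LibreOffice\program\soffice.exe",
    r"C:\Program Files (x86)\LibreOffice\program\soffice.exe",
    "/Applications/LibreOffice.app/Contents/MacOS/soffice",
    "/usr/bin/soffice",
    "/usr/local/bin/soffice",
    "/snap/bin/libreoffice",
)


def find_soffice(candidates=WELL_KNOWN_PATHS, which=shutil.which) -> str:
    """Return a LibreOffice executable path, or raise RuntimeError."""
    # Steps 1 and 2: honour PATH first, under either binary name.
    for name in ("soffice", "libreoffice"):
        found = which(name)
        if found:
            return found
    # Step 3: fall back to the static install locations.
    for candidate in candidates:
        if Path(candidate).is_file():
            return str(candidate)
    probed = ", ".join(str(c) for c in candidates)
    raise RuntimeError(f"LibreOffice not found on PATH or in: {probed}")
```

Listing the probed paths in the error message tells the user exactly where an install would have been picked up.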

UI wiring (Streamlit)

Pattern: extend the existing render src + download src column pair to a 4-column row by appending a third "Render .pdf" button and a fourth "Download .pdf" button. Cache the resulting path under a session-state key parallel to the source key.

from pathlib import Path

import streamlit as st

from core.pdf_export import to_pdf

src_state_key = f"<artifact>_<slug>_<bits>"     # already exists
pdf_state_key = f"{src_state_key}_pdf"

# auto-restore PDF if it already lives on disk
src_path = st.session_state.get(src_state_key)
if src_path and Path(src_path).with_suffix(".pdf").exists():
    st.session_state[pdf_state_key] = Path(src_path).with_suffix(".pdf")

c1, c2, c3, c4 = st.columns(4)
with c1:
    ...     # Render src     (existing button goes here)
with c2:
    ...     # Download src   (existing button goes here)
with c3:
    src_path = st.session_state.get(src_state_key)
    disabled = not (src_path and Path(src_path).exists())
    if st.button("Render .pdf", key=f"render_pdf_{src_state_key}",
                 use_container_width=True, disabled=disabled,
                 help=("Render the source artifact first" if disabled else None)):
        try:
            with st.spinner("Converting to PDF..."):
                pdf_path = to_pdf(Path(src_path))
            st.session_state[pdf_state_key] = pdf_path
            st.success(f"Wrote {pdf_path.name}")
        except Exception as e:
            st.error(f"PDF render failed: {e}")
with c4:
    pdf_path = st.session_state.get(pdf_state_key)
    if pdf_path and Path(pdf_path).exists():
        st.download_button(
            "Download .pdf",
            data=Path(pdf_path).read_bytes(),
            file_name=Path(pdf_path).name,
            mime="application/pdf",
            key=f"dl_pdf_{src_state_key}",
            use_container_width=True,
        )
    else:
        st.caption("Render PDF to enable download.")

Place the 4-col row in the same with tabs[N]: block as the existing 2-col render/download row, which it replaces. The "Render .pdf" button stays disabled until the source artifact has been rendered (otherwise the user gets a FileNotFoundError).

Workflow checklist (when adding to a new surface)

  1. Confirm the surface already produces an Office artifact via one of the three render functions and stores the path in st.session_state[<src_state_key>].
  2. Define pdf_state_key = f"{src_state_key}_pdf".
  3. Auto-restore the PDF path if src.with_suffix(".pdf") already exists on disk (mirrors how src auto-restores).
  4. Replace the existing 2-col row with a 4-col row using the snippet above.
  5. Ensure from core.pdf_export import to_pdf is at the top of the page along with the existing render imports.
  6. Smoke: render src first, click Render .pdf, verify Download .pdf appears within ~10s and the file opens in a PDF viewer.

Validation / smoke

.\.venv\Scripts\Activate.ps1
python _smoke.py            # creates tiny .docx -> to_pdf -> assert .pdf > 1KB
Remove-Item _smoke.py -Force

The smoke must:

  • Build a one-paragraph .docx with python-docx.
  • Call to_pdf(docx) and assert the returned .pdf exists and is > 1 KB (LibreOffice produces a ~25 KB PDF for a single paragraph).

  • Assert PDF magic bytes (%PDF-).
  • Assert error paths: missing path -> FileNotFoundError; .txt source -> ValueError.
  • Static-parse all three UI pages with ast and grep for from core.pdf_export import, Render .pdf, Download .pdf, and _pdf state-key suffix.

Anti-patterns (block at review)

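  • Calling docx2pdf (it drives an installed Microsoft Word, so it cannot run on a Linux host or anywhere Word is absent).
  • Calling pypandoc for Office formats (Pandoc re-typesets the content through its own PDF engine rather than rendering the Office layout, and it adds an extra dep).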
  • Reusing the user's foreground LibreOffice profile (locks fail silently with no PDF).
  • Eagerly converting on every page render (PDF generation is slow; always gate behind an explicit button).
  • Hand-rolling MIME types - use application/pdf.
  • Skipping the disabled= guard on Render .pdf when src does not exist - users will hit FileNotFoundError on first click.
Install via CLI
npx skills add https://github.com/BdM-15/project-ariadne --skill pdf-export