name: docx
description: >
  Word document handler for creating, reading, editing, and manipulating .docx files. Use when: (1) creating Word documents with formatting, (2) reading/extracting from .docx, (3) working with tables of contents, headings, page numbers, (4) tracked changes/comments. Triggers: "Word doc", "word document", ".docx", "report", "memo", "letter", "template", "manuscript", "essay", "documentation", "tracked changes".
Word Document Handler
Create, read, and edit Microsoft Word (.docx) documents with python-docx.
Capabilities
- Create professional documents from scratch
- Read and extract text from .docx files
- Format with headings, styles, fonts, colors
- Add tables, images, and lists
- Create tables of contents
- Work with headers, footers, page numbers
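- Handle tracked changes and comments (python-docx has no high-level API for tracked changes, so this means working with the underlying XML)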
- Use templates and document sections
Python Library
from docx import Document
from docx.shared import Inches, Pt, RGBColor
from docx.enum.text import WD_ALIGN_PARAGRAPH
from docx.enum.style import WD_STYLE_TYPE
from docx.enum.table import WD_TABLE_ALIGNMENT
Common Operations
Create New Document
from docx import Document
from docx.shared import Pt
doc = Document()
# Add title (level=0 applies the "Title" style rather than a heading style)
doc.add_heading("Document Title", level=0)
# Add paragraph
doc.add_paragraph("This is the first paragraph of the document.")
# Add formatted paragraph
p = doc.add_paragraph()
run = p.add_run("Bold text")
run.bold = True
p.add_run(" and ")
run = p.add_run("italic text")
run.italic = True
doc.save("document.docx")
Add Headings
doc.add_heading("Main Section", level=1)
doc.add_heading("Subsection", level=2)
doc.add_heading("Sub-subsection", level=3)
Format Text
from docx.shared import Pt, RGBColor
p = doc.add_paragraph()
run = p.add_run("Formatted text")
run.font.name = "Arial"
run.font.size = Pt(14)
run.font.bold = True
run.font.color.rgb = RGBColor(0x00, 0x66, 0xCC)
Add Bullet List
doc.add_paragraph("First item", style="List Bullet")
doc.add_paragraph("Second item", style="List Bullet")
doc.add_paragraph("Third item", style="List Bullet")
Add Numbered List
doc.add_paragraph("Step one", style="List Number")
doc.add_paragraph("Step two", style="List Number")
doc.add_paragraph("Step three", style="List Number")
Add Table
table = doc.add_table(rows=3, cols=3)
table.style = "Table Grid"
# Header row
hdr_cells = table.rows[0].cells
hdr_cells[0].text = "Name"
hdr_cells[1].text = "Age"
hdr_cells[2].text = "City"
# Data rows
row = table.rows[1].cells
row[0].text = "Alice"
row[1].text = "30"
row[2].text = "New York"
Add Image
from docx.shared import Inches
doc.add_picture("image.png", width=Inches(4))
Add Header and Footer
section = doc.sections[0]
# Header
header = section.header
header_para = header.paragraphs[0]
header_para.text = "Document Header"
# Footer
footer = section.footer
footer_para = footer.paragraphs[0]
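# This sets only the static label; python-docx has no page-number helper,
# so the number itself requires a PAGE field code in the footer paragraph.
footer_para.text = "Page "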
Add Page Break
doc.add_page_break()
Read Existing Document
doc = Document("existing.docx")
for para in doc.paragraphs:
    print(para.text)
for table in doc.tables:
    for row in table.rows:
        for cell in row.cells:
            print(cell.text)
Find and Replace
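def find_replace(doc, find_text, replace_text):
    """Replace find_text in body paragraphs, run by run, keeping run formatting."""
    for para in doc.paragraphs:  # body paragraphs only; table cells are not visited
        if find_text in para.text:
            for run in para.runs:
                # Only matches that sit entirely inside one run are replaced;
                # text that Word has split across several runs is left unchanged.
                if find_text in run.text:
                    run.text = run.text.replace(find_text, replace_text)
    return doc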
Set Margins
from docx.shared import Inches
section = doc.sections[0]
section.top_margin = Inches(1)
section.bottom_margin = Inches(1)
section.left_margin = Inches(1.25)
section.right_margin = Inches(1.25)
Add Table of Contents
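# Note: python-docx only inserts the TOC field code. The entries are generated
# by Word when the field is updated (right-click the TOC and choose Update Field,
# or select all and press F9); until then the TOC appears empty.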
from docx.oxml.ns import qn
from docx.oxml import OxmlElement
paragraph = doc.add_paragraph()
run = paragraph.add_run()
fldChar = OxmlElement('w:fldChar')
fldChar.set(qn('w:fldCharType'), 'begin')
run._r.append(fldChar)
run = paragraph.add_run()
instrText = OxmlElement('w:instrText')
instrText.text = 'TOC \\o "1-3" \\h \\z \\u'
run._r.append(instrText)
run = paragraph.add_run()
fldChar = OxmlElement('w:fldChar')
fldChar.set(qn('w:fldCharType'), 'end')
run._r.append(fldChar)
Document Styles
| Style Name | Use Case |
|---|---|
| Title | Document title |
| Heading 1-9 | Section headings |
| Normal | Body text |
| List Bullet | Bullet points |
| List Number | Numbered lists |
| Quote | Block quotes |
| Intense Quote | Emphasized quotes |
Best Practices
- Use styles - Maintain consistency with built-in styles
- Structure - Use headings for document outline
- Tables - Use "Table Grid" style for visibility
- Images - Set only the width (or only the height); python-docx scales the other dimension to keep the aspect ratio
- Fonts - Stick to common fonts (Arial, Times New Roman, Calibri)
- Spacing - Use paragraph spacing, not empty paragraphs