pyarchivist - SKILL.md Agent Skill

name: pyarchivist
description: Archive online content into archives/ with automatic index.md updates via pyarchivist tool.

pyarchivist Workflow

Continuous improvement: see continuous_improvement.md in this folder for a history of feedback and tips for using pyarchivist.

Use this skill when archiving web content, media, or online documents into the knowledge base.

What pyarchivist does

pyarchivist/ is a git submodule that automatically archives online content to archives/ and updates index.md files with metadata (source URL, timestamp, file hash).

When to use

Archiving articles, web pages, or media from online sources
Storing Wikimedia Commons images alongside notes
Creating permanent backups of time-sensitive online content
Auto-maintaining archives/index.md files

Basic workflow

Use pyarchivist's interface (CLI or Python API) to download and archive content
Specify the target directory (archives/Wikimedia Commons/ for media, archives/sparse/ for documents)
pyarchivist auto-generates metadata (timestamp, source URL, content hash)
index.md entries are auto-created with source and timestamp information
Filenames are generated consistently (hash-based for deduplication or descriptive for media)
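
To make the metadata and hash-based naming concrete, here is a minimal sketch of how a content hash, timestamp, and index.md line could be derived for one download. It is not pyarchivist's actual code: the helper names, the choice of SHA-256, and the index line layout are all assumptions for illustration.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import PurePosixPath
from urllib.parse import urlparse


def describe_download(source_url, content, when=None):
    """Build metadata for one downloaded file (hypothetical helper, not pyarchivist API)."""
    when = when or datetime.now(timezone.utc)
    # Assumption: SHA-256 of the raw bytes; identical content gives an identical name.
    digest = hashlib.sha256(content).hexdigest()
    # Keep the extension from the URL path so the archived file still opens normally.
    suffix = PurePosixPath(urlparse(source_url).path).suffix
    return {
        "filename": f"{digest}{suffix}",
        "source_url": source_url,
        "timestamp": when.isoformat(timespec="seconds"),
        "sha256": digest,
    }


def index_line(meta):
    """Format one index.md entry; this layout is an assumption."""
    return (
        f"- {meta['filename']}: {meta['source_url']} "
        f"(archived {meta['timestamp']}, sha256 {meta['sha256']})"
    )
```

Because the filename depends only on the content, archiving the same bytes from two different URLs yields the same filename, which is what makes hash-based names useful for deduplication.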

Best practices

Let pyarchivist handle file naming and index.md updates
Use archives/Wikimedia Commons/ for images/media with descriptive names
Use archives/sparse/ for miscellaneous content (filenames are generated from content hashes automatically)
Always preserve source URL and timestamp metadata in index.md
Check that index.md was updated correctly after archiving
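
The final check can be scripted. The sketch below only tests whether given strings (for example the source URL and the archived filename) appear in an index.md file. The function name is hypothetical, and it assumes the index stores the URL verbatim as plain text.

```python
from pathlib import Path


def index_mentions(index_path, *needles):
    """Return True if index_path exists and contains every needle as a substring."""
    index_path = Path(index_path)
    if not index_path.is_file():
        return False
    text = index_path.read_text(encoding="utf-8")
    return all(needle in text for needle in needles)
```

For example, call index_mentions("archives/sparse/index.md", source_url) after archiving. If it returns False, inspect the index by hand before re-running the archive step.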

Typical command pattern

uv run -m pyarchivist [options] --target <archives/folder> <source_url>

(Exact interface depends on pyarchivist's implementation)
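
When driving the command from a script, building the argument list explicitly avoids quoting mistakes, which matters because archives/Wikimedia Commons/ contains a space. The sketch below only assembles the pattern shown above and does not run anything. The --target flag is taken from that pattern and may not match pyarchivist's real interface, and the function name is hypothetical.

```python
import shlex


def build_archive_command(source_url, target, options=()):
    """Assemble the argv list for the command pattern above (flags unverified)."""
    return ["uv", "run", "-m", "pyarchivist", *options, "--target", target, source_url]


cmd = build_archive_command(
    "https://example.org/article.html",  # placeholder URL
    "archives/Wikimedia Commons/",
)
# shlex.join quotes the target so the line can be pasted into a shell safely.
print(shlex.join(cmd))
```

Pass the list itself to subprocess.run(cmd, check=True) rather than a joined string, so the target directory is handed over as a single argument.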

When in doubt

Consult the pyarchivist documentation or ask the user for guidance on specific archiving needs.