name: e2e-tests
description: "End-to-end validation suite for vykar backup tool on a Linux sandbox server"
Vykar E2E Test Suite
You are a Linux sysadmin testing vykar on a dedicated sandbox server. Your goal is to validate backup and restore correctness across all supported backends, database integrations, container workflows, filesystem snapshot patterns, and performance benchmarks.
Work autonomously. This is a disposable sandbox — do not ask for permission or confirmation before running commands, installing packages, creating/deleting files, or making destructive changes. If something fails, diagnose and fix it yourself. Only stop to ask the user if you are completely stuck with no viable path forward.
Sandbox Environment
The test server provides:
| Resource | Path | Purpose |
|---|---|---|
| Large corpus | ~/corpus-local | Test data for local backend (large) |
| Small corpus | ~/corpus-remote | Test data for S3/SFTP/REST backends (bandwidth-aware) |
| Base config | ~/vykar.sample.yaml | Repo definitions, credentials, and connection details |
| Vykar docs | https://vykar.borgbase.com/ | Recipe reference for hooks, command_dumps, etc. |
Installed tools: vykar, vykar-server, rclone, docker, podman, database clients (pg, mariadb, mongo).
Install missing packages with sudo apt-get install ....
Sub-skills
Run each sub-skill to execute a specific test area. Results go to ~/runtime/.
Backends — Corpus backup + restore validation
- e2e-tests:backends:local - Full backup/restore with large corpus on local backend
- e2e-tests:backends:local-none - Full backup/restore with large corpus on local backend using encryption.mode: none
- e2e-tests:backends:rest - Backup/restore against local vykar-server REST backend
- e2e-tests:backends:s3 - Backup/restore with small corpus on S3 backend
- e2e-tests:backends:sftp - Backup/restore with small corpus on SFTP backend (timeout-bounded)
- e2e-tests:backends:interrupt-minimal - Interrupt backup mid-run and verify resume + restore integrity across local, rest, s3 (local MinIO), and sftp
Databases — Hooks and command_dumps patterns (large realistic data)
- e2e-tests:databases:postgres - PostgreSQL with hooks dump and command_dumps variants on ~10 GiB randomized schema
- e2e-tests:databases:mariadb - MariaDB with hooks dump and command_dumps variants on ~10 GiB randomized schema
- e2e-tests:databases:mongodb - MongoDB with command_dumps (mongodump --archive) on ~2.5 GiB randomized collections
Containers — Volume backups and container integration
- e2e-tests:containers:docker - Static volumes, downtime hooks, DB exec dumps via Docker
- e2e-tests:containers:podman - Same scenarios using Podman commands
Filesystems — Snapshot hooks patterns
- e2e-tests:filesystems:btrfs - Btrfs read-only subvolume snapshot hooks
- e2e-tests:filesystems:zfs - ZFS dataset snapshot hooks via .zfs/snapshot path
- e2e-tests:filesystems:vm-image - Ubuntu cloud image mutate+backup dedupe validation via guestmount/chroot
Benchmarks
- e2e-tests:benchmarks - Compare vykar performance against restic and rustic (use benchmarks.md + bundled scripts under scripts/)
- e2e-tests:stress - Run long-loop backup/restore/delete stress validation against local corpus (use stress.md + scripts/stress.sh)
Recommended Execution Order
- Backends first (establishes corpus validation baseline)
- Databases (large-data, container-based tests)
- Containers (reuses DB patterns with volume workflows)
- Filesystems (requires disk/partition setup; run VM image dedupe scenario here)
- Stress next (long-loop correctness/locking pressure on local backend)
- Benchmarks last (long-running, independent)
Shared Conventions
Environment Setup
export VYKAR_PASSPHRASE=123 # non-interactive passphrase
- Use sudo for package installs and root-owned paths
- Working directory for all test artifacts: ~/runtime/
Database Data Volume Baseline
- Database scenarios are not small-smoke tests by default.
- Seed randomized, high-entropy data before backup using these defaults:
  - PostgreSQL: ~10 GiB
  - MariaDB: ~10 GiB
  - MongoDB: ~2.5 GiB (faster validation target)
- Prefer these helper scripts from repo root:
  - scripts/postgres-generate-random-data.sh --container <name> --target-gib 10
  - scripts/mariadb-generate-random-data.sh --container <name> --target-gib 10
  - scripts/mongodb-generate-random-data.sh --container <name> --target-gib 2.5
- Record resulting size and table/collection counts in scenario logs/reports.
VM Image Package Baseline (Large-Mutation)
- For e2e-tests:filesystems:vm-image, prefer large package mutations by default (not tiny smoke installs).
- Ubuntu desktop-class package to use: ubuntu-desktop-minimal (primary).
- Optional heavier variant when disk space allows: ubuntu-desktop.
- Recommended phase split for online VM-image tests:
  - Phase 1: apt-get install -y ubuntu-desktop-minimal
  - Phase 2: apt-get install -y --no-install-recommends thunderbird libreoffice-core htop curl jq git
- If the cloud image runs out of free space, document the deviation and use the largest subset that fits.
Config Strategy
- Copy ~/vykar.sample.yaml to a scenario-specific config (e.g., config.postgres.yaml)
- Add test-specific sources blocks per scenario; keep repo definitions from sample
- Keep each scenario in a separate config file to avoid source overlap
- Reference repos by label: -R local, -R rest, -R s3, -R sftp
- Include one dedicated local-backend scenario with encryption.mode: none; treat it as a required backend validation, not an optional smoke test
- For the e2e-tests:backends:local-none scenario, set:
  - encryption.mode: none
  - omit encryption.passphrase / VYKAR_PASSPHRASE
  - keep the same backup/list/restore/diff validation standard as encrypted local runs
- For local REST server mode, use a single repository URL root (e.g. http://127.0.0.1:8585) with:
  - access_token: "<token>"
  - allow_insecure_http: true
  - do not append /<repo-name> in single-repo mode
- For e2e-tests:backends:interrupt-minimal, use a dedicated config file and point S3 to local MinIO:
  - url: "s3+http://127.0.0.1:9000/vykar-stress/<run-id>"
  - allow_insecure_http: true
  - region: "us-east-1"
  - access_key_id: "minioadmin"
  - secret_access_key: "minioadmin"
Validation Standard
Every test must verify:
- vykar backup exits 0
- vykar list shows new snapshot for expected source label
- vykar --config <config> snapshot list -R <repo> <snapshot_id> confirms expected files or artifacts
- Restore into temp directory and verify:
  - Corpus tests: diff -qr --no-dereference <source> <restore_dir> reports no differences
  - Database tests: restore dump, verify row/document counts and sampled content match seeded large dataset
  - After verification, immediately remove restored files: rm -rf <restore_dir>
- Optional: SHA256 manifest comparison for stronger content verification
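The corpus restore check and the optional manifest comparison can be sketched as two small shell helpers. The function names `verify_restore` and `sha256_manifest` are hypothetical, and the sketch assumes GNU diffutils, findutils, and coreutils (for `--no-dereference`, `sort -z`, and `xargs -0 -r`). Keeping the restore tree on mismatch is a choice made here to aid debugging, not a requirement stated above.

```shell
# Compare a restored tree against its source; remove the restore tree on success.
# Symlinks are compared as links, not followed (--no-dereference is a GNU diff option).
verify_restore() {
  src=$1
  restore_dir=$2
  if diff -qr --no-dereference "$src" "$restore_dir"; then
    rm -rf "$restore_dir"   # clean up immediately after a successful check
    return 0
  fi
  echo "MISMATCH: $src vs $restore_dir (restore tree kept for debugging)" >&2
  return 1
}

# Optional stronger check: a sorted SHA256 manifest of all regular files.
# Paths are relative to the tree root, so two trees can be compared directly.
sha256_manifest() {
  (cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum)
}
```

Two trees have matching content when `sha256_manifest <source>` and `sha256_manifest <restore_dir>` print identical output; run that comparison before `verify_restore`, since a successful `verify_restore` deletes the restore tree.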
Cleanup Standard
- Reset repo before reruns: run vykar --config <config> delete -R <repo> --yes-delete-this-repo before init
  - Treat not found/missing repo as non-fatal
  - REST single-repo servers may reject delete (for example 400/404); if so, continue with init/backup and record it
- Local: remove temporary directories (dumps, restores, configs)
  - Do not keep restore trees after diff/integrity checks unless actively debugging
- Local REST server data: if using single-repo mode, wipe server data dir between reruns (for this sandbox: /mnt/repos/bench-vykar/vykar-server-data/*)
- Remote storage: rclone delete --rmdirs <remote:path> between runs
  - Do NOT use rclone purge (may fail with 403 on restricted buckets)
  - Treat directory not found from rclone as non-fatal
- Containers: stop and remove after each scenario
- Filesystems: unmount/destroy test pools after runs
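Several cleanup steps above are allowed to fail without aborting the run. One way to express that is a small wrapper that records the failure and carries on. `best_effort` is a hypothetical helper name, and the sketch is deliberately coarse: it treats every non-zero exit as non-fatal and does not try to distinguish a missing repo from other errors.

```shell
# Run a cleanup command; log a non-zero exit to stderr but never fail the caller.
best_effort() {
  rc=0
  "$@" || rc=$?
  if [ "$rc" -ne 0 ]; then
    echo "non-fatal: '$*' exited with $rc" >&2
  fi
  return 0
}
```

For example, `best_effort vykar --config <config> delete -R <repo> --yes-delete-this-repo` or `best_effort rclone delete --rmdirs <remote:path>` lets the run proceed to init while leaving a trace of the rejected step in the log.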
Run Matrix
For tests that span multiple backends, run in this order:
- local first (fast feedback loop)
- rest second (local server path, still exercises HTTP backend)
- s3 third
- sftp last (known instability, use timeouts)
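A minimal sketch of this ordering as a loop, assuming each scenario is wrapped in a shell function that takes the repo label as its only argument. `run_matrix` and the `matrix_failed` variable are hypothetical names. A failure on one backend is recorded and the loop continues, so an SFTP problem does not hide results from the other backends.

```shell
# Run one scenario function against each backend in the recommended order.
# Failed backends are collected in matrix_failed; remaining backends still run.
run_matrix() {
  scenario=$1
  matrix_failed=""
  for repo in local rest s3 sftp; do
    if ! "$scenario" "$repo"; then
      matrix_failed="$matrix_failed $repo"
    fi
  done
  # Succeed only if every backend succeeded.
  [ -z "$matrix_failed" ]
}
```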
SFTP Guardrails
SFTP can be intermittent even when rclone works fine against the same server:
- Wrap all vykar commands with timeout: timeout 120s vykar init ..., timeout 3600s vykar backup ...
- On timeout (exit 124), mark test as BLOCKED, kill stuck process, continue cleanup
- Do NOT rerun the entire test suite if only SFTP failed; isolate SFTP results
- Ensure no stuck vykar process remains after aborted SFTP steps
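The timeout guardrail can be sketched as a wrapper around GNU coreutils `timeout`, which exits with 124 when the time limit is hit. `sftp_step` is a hypothetical helper name. It returns the wrapped command's exit code unchanged, so the caller can still tell a timeout from an ordinary failure.

```shell
# Run one command under a time limit and flag timeouts on stderr.
# Usage: sftp_step <limit> <command> [args...]
sftp_step() {
  limit=$1
  shift
  rc=0
  timeout "$limit" "$@" || rc=$?
  if [ "$rc" -eq 124 ]; then
    echo "BLOCKED: timed out after $limit: $*" >&2
  fi
  return "$rc"
}
```

After an aborted step, `pgrep -x vykar` prints nothing once no vykar process remains.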
Interruption Minimal Test (All Backends)
Run order:
- local
- rest
- s3 (local MinIO only)
- sftp
Source mapping:
- local -> ~/corpus-local/snapshot-1
- rest -> ~/corpus-remote/snapshot-1
- s3 (local MinIO) -> ~/corpus-remote/snapshot-1
- sftp -> ~/corpus-remote/snapshot-1
Interrupt method (backup only):
- Start backup in background: vykar --config <config> backup -R <repo> <src> > <backup_log> 2>&1 & pid=$!
- Sleep briefly (1-5s, backend-specific), then send SIGTERM: kill -TERM "$pid"
- If still running after 2 seconds, send SIGKILL: kill -KILL "$pid"
- Wait and record interrupt exit code (interrupt_rc).
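The interrupt steps above can be sketched as one function that works for any command, so it can be tried against a harmless `sleep` before pointing it at a real backup. `interrupt_after` is a hypothetical helper name. The 2-second grace period is checked once per second, which is a simplification.

```shell
# Start a command in the background, interrupt it after a delay, and record
# its exit status in the global variable interrupt_rc.
# Usage: interrupt_after <delay_seconds> <backup_log> <command> [args...]
interrupt_after() {
  delay=$1
  backup_log=$2
  shift 2
  "$@" >"$backup_log" 2>&1 &
  pid=$!
  sleep "$delay"
  kill -TERM "$pid" 2>/dev/null || true   # the command may already have exited
  # Give the process up to 2 seconds to exit after SIGTERM.
  for attempt in 1 2; do
    kill -0 "$pid" 2>/dev/null || break
    sleep 1
  done
  # Escalate to SIGKILL only if it is still running.
  if kill -0 "$pid" 2>/dev/null; then
    kill -KILL "$pid" 2>/dev/null || true
  fi
  interrupt_rc=0
  wait "$pid" || interrupt_rc=$?
}
```

A process ended by SIGTERM is reported by the shell as 143 (128+15) and one ended by SIGKILL as 137 (128+9). A value of 0 means the command completed on its own before the signal arrived.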
Resume + validation sequence:
- Re-run backup (for sftp, use timeout-bounded commands per SFTP guardrails).
- Parse latest snapshot_id from Snapshot created: <id> in resume backup log.
- vykar --config <config> list -R <repo> --last 5 and confirm snapshot appears.
- vykar --config <config> snapshot list -R <repo> <snapshot_id>.
- Restore into empty directory.
- Validate with diff -qr --no-dereference <src> <restore_dir>.
- Run vykar --config <config> check -R <repo>.
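Extracting the snapshot ID from the resume log might look like the sketch below. `latest_snapshot_id` is a hypothetical helper name. It assumes the ID is the first whitespace-delimited token after `Snapshot created:` and that any other text on the line can be ignored; adjust the pattern if the real log line differs.

```shell
# Print the ID from the last "Snapshot created: <id>" line in a log file.
# Prints nothing if the log contains no such line.
latest_snapshot_id() {
  sed -n 's/.*Snapshot created: *\([^[:space:]]*\).*/\1/p' "$1" | tail -n 1
}
```

An empty result is worth treating as a failed resume, since the later list, restore, and check steps all depend on the ID.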
Status rules:
- PASS: interrupted backup + resume backup + list/snapshot list/restore/diff/check all succeed.
- FAIL: non-timeout command failure or diff mismatch.
- BLOCKED: timeout (rc=124) on timeout-bounded steps.
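These rules can be sketched as a function over the exit codes of the resume and validation steps, in the order they ran. `classify_status` is a hypothetical name. The sketch assumes the first non-zero code decides the outcome, a tie-break the rules above do not spell out. Do not pass `interrupt_rc`: the interrupted backup is expected to exit non-zero.

```shell
# Map step exit codes (resume backup, list, snapshot list, restore, diff, check)
# to PASS, FAIL, or BLOCKED. The first non-zero code decides.
classify_status() {
  for rc in "$@"; do
    case $rc in
      0) ;;                               # step succeeded, keep scanning
      124) echo BLOCKED; return 0 ;;      # timeout(1) reported a timeout
      *) echo FAIL; return 0 ;;           # any other failure, including diff mismatch
    esac
  done
  echo PASS
}
```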
Deliverables
Each sub-skill should produce:
- Scenario-specific config file saved under ~/runtime/
- Log file under ~/runtime/logs/
- Pass/fail summary report under ~/runtime/reports/
- For interruption scenarios, include per-backend result fields: backend, status, reason, interrupt_rc, resume_rc, snapshot_id, log_dir, report_path
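The report format is not specified above. As one possibility, the per-backend fields could be written as a single JSON object per line. `write_result` is a hypothetical helper, and the sketch assumes the values contain no quotes or backslashes, since it performs no JSON escaping.

```shell
# Emit one JSON line with the per-backend result fields.
# Usage: write_result <backend> <status> <reason> <interrupt_rc> <resume_rc> \
#                     <snapshot_id> <log_dir> <report_path>
write_result() {
  printf '{"backend":"%s","status":"%s","reason":"%s","interrupt_rc":%s,"resume_rc":%s,"snapshot_id":"%s","log_dir":"%s","report_path":"%s"}\n' \
    "$1" "$2" "$3" "$4" "$5" "$6" "$7" "$8"
}
```

Appending one line per backend to a file under ~/runtime/reports/ keeps the results for all four backends in a single place.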
Common Gotchas
- Mixing sudo vykar and regular vykar creates root-owned repo files; use sudo rm -rf for cleanup
- Command dump artifacts appear under vykar-dumps/ in snapshot listings
- Prefer vykar --config <config> ... in automation; keep --config explicit in all commands
- --label (or -l) is for ad-hoc backup paths only. If sources/command_dumps are already defined in config, run vykar --config <config> backup -R <repo> without -l.
- vykar snapshot CLI forms:
  - vykar --config <config> snapshot list -R <repo> <snapshot_id>
  - vykar --config <config> snapshot delete -R <repo> <snapshot_id>
- REST local server may run in single-repo mode (http://127.0.0.1:8585 root URL) and reject path-style repos
- Use diff -qr --no-dereference to avoid false negatives on broken symlinks in corpora
- MariaDB modern images use mariadb, mariadb-dump, mariadb-admin (not mysql* names)
- For MariaDB docker exec dumps, prefer socket protocol with retries; in-container TCP to 127.0.0.1 can be intermittently unreliable on this sandbox
- For high-entropy PostgreSQL seed data, use scripts/postgres-generate-random-data.sh --container <name> --target-gib <N>
- For high-entropy MariaDB seed data, use scripts/mariadb-generate-random-data.sh --container <name> --target-gib <N>
- For high-entropy MongoDB seed data, use scripts/mongodb-generate-random-data.sh --container <name> --target-gib <N>
- Btrfs hook snapshots require backing up a real Btrfs subvolume (not a plain directory)
- ZFS restore diffs should ignore the virtual .zfs directory
- guestmount/chroot image workflows need bind mounts for /dev, /proc, /sys, and /run before apt operations
- For Ubuntu VM-image scenarios, use ubuntu-desktop-minimal as the default large package mutation target
- MongoDB host tools may be missing; use docker exec or podman exec as fallback
- Pre-pull container images before timed runs to avoid skewing measurements
- Sample config repo paths may need adjustment for the sandbox; verify and update before first run
- In interruption tests, interrupt_rc=0 usually means the backup finished before the signal landed. Re-run with shorter sle