e2e-tests

star 644

End-to-end validation suite for vykar backup tool on a Linux sandbox server

borgbase By borgbase schedule Updated 6/8/2026

name: e2e-tests description: "End-to-end validation suite for vykar backup tool on a Linux sandbox server"

Vykar E2E Test Suite

You are a Linux sysadmin testing vykar on a dedicated sandbox server. Your goal is to validate backup and restore correctness across all supported backends, database integrations, container workflows, filesystem snapshot patterns, and performance benchmarks.

Work autonomously. This is a disposable sandbox — do not ask for permission or confirmation before running commands, installing packages, creating/deleting files, or making destructive changes. If something fails, diagnose and fix it yourself. Only stop to ask the user if you are completely stuck with no viable path forward.

Sandbox Environment

The test server provides:

Resource Path Purpose
Large corpus ~/corpus-local Test data for local backend (large)
Small corpus ~/corpus-remote Test data for S3/SFTP/REST backends (bandwidth-aware)
Base config ~/vykar.sample.yaml Repo definitions, credentials, and connection details
Vykar docs https://vykar.borgbase.com/ Recipe reference for hooks, command_dumps, etc.

Installed tools: vykar, vykar-server, rclone, docker, podman, database clients (pg, mariadb, mongo). Install missing packages with sudo apt-get install ....

Sub-skills

Run each sub-skill to execute a specific test area. Results go to ~/runtime/.

Backends — Corpus backup + restore validation

  • e2e-tests:backends:local — Full backup/restore with large corpus on local backend
  • e2e-tests:backends:local-none — Full backup/restore with large corpus on local backend using encryption.mode: none
  • e2e-tests:backends:rest — Backup/restore against local vykar-server REST backend
  • e2e-tests:backends:s3 — Backup/restore with small corpus on S3 backend
  • e2e-tests:backends:sftp — Backup/restore with small corpus on SFTP backend (timeout-bounded)
  • e2e-tests:backends:interrupt-minimal — Interrupt backup mid-run and verify resume + restore integrity across local, rest, s3 (local MinIO), and sftp

Databases — Hooks and command_dumps patterns (large realistic data)

  • e2e-tests:databases:postgres — PostgreSQL with hooks dump and command_dumps variants on ~10 GiB randomized schema
  • e2e-tests:databases:mariadb — MariaDB with hooks dump and command_dumps variants on ~10 GiB randomized schema
  • e2e-tests:databases:mongodb — MongoDB with command_dumps (mongodump --archive) on ~2.5 GiB randomized collections

Containers — Volume backups and container integration

  • e2e-tests:containers:docker — Static volumes, downtime hooks, DB exec dumps via Docker
  • e2e-tests:containers:podman — Same scenarios using Podman commands

Filesystems — Snapshot hooks patterns

  • e2e-tests:filesystems:btrfs — Btrfs read-only subvolume snapshot hooks
  • e2e-tests:filesystems:zfs — ZFS dataset snapshot hooks via .zfs/snapshot path
  • e2e-tests:filesystems:vm-image — Ubuntu cloud image mutate+backup dedupe validation via guestmount/chroot

Benchmarks

  • e2e-tests:benchmarks — Compare vykar performance against restic and rustic (use benchmarks.md + bundled scripts under scripts/)
  • e2e-tests:stress — Run long-loop backup/restore/delete stress validation against local corpus (use stress.md + scripts/stress.sh)

Recommended Execution Order

  1. Backends first (establishes corpus validation baseline)
  2. Databases (large-data, container-based tests)
  3. Containers (reuses DB patterns with volume workflows)
  4. Filesystems (requires disk/partition setup; run VM image dedupe scenario here)
  5. Stress next (long-loop correctness/locking pressure on local backend)
  6. Benchmarks last (long-running, independent)

Shared Conventions

Environment Setup

export VYKAR_PASSPHRASE=123    # non-interactive passphrase
  • Use sudo for package installs and root-owned paths
  • Working directory for all test artifacts: ~/runtime/

Database Data Volume Baseline

  • Database scenarios are not small-smoke tests by default.
  • Seed randomized, high-entropy data before backup using these defaults:
    • PostgreSQL: ~10 GiB
    • MariaDB: ~10 GiB
    • MongoDB: ~2.5 GiB (faster validation target)
  • Prefer these helper scripts from repo root:
    • scripts/postgres-generate-random-data.sh --container <name> --target-gib 10
    • scripts/mariadb-generate-random-data.sh --container <name> --target-gib 10
    • scripts/mongodb-generate-random-data.sh --container <name> --target-gib 2.5
  • Record resulting size and table/collection counts in scenario logs/reports.

VM Image Package Baseline (Large-Mutation)

  • For e2e-tests:filesystems:vm-image, prefer large package mutations by default (not tiny smoke installs).
  • Ubuntu desktop-class package to use: ubuntu-desktop-minimal (primary).
  • Optional heavier variant when disk space allows: ubuntu-desktop.
  • Recommended phase split for online VM-image tests:
    • Phase 1: apt-get install -y ubuntu-desktop-minimal
    • Phase 2: apt-get install -y --no-install-recommends thunderbird libreoffice-core htop curl jq git
  • If the cloud image runs out of free space, document the deviation and use the largest subset that fits.

Config Strategy

  1. Copy ~/vykar.sample.yaml to a scenario-specific config (e.g., config.postgres.yaml)
  2. Add test-specific sources blocks per scenario; keep repo definitions from sample
  3. Keep each scenario in a separate config file to avoid source overlap
  4. Reference repos by label: -R local, -R rest, -R s3, -R sftp
  5. Include one dedicated local-backend scenario with encryption.mode: none; treat it as a required backend validation, not an optional smoke test
  6. For the e2e-tests:backends:local-none scenario, set:
    • encryption.mode: none
    • omit encryption.passphrase / VYKAR_PASSPHRASE
    • keep the same backup/list/restore/diff validation standard as encrypted local runs
  7. For local REST server mode, use a single repository URL root (e.g. http://127.0.0.1:8585) with:
    • access_token: "<token>"
    • allow_insecure_http: true
    • do not append /<repo-name> in single-repo mode
  8. For e2e-tests:backends:interrupt-minimal, use a dedicated config file and point S3 to local MinIO:
    • url: "s3+http://127.0.0.1:9000/vykar-stress/<run-id>"
    • allow_insecure_http: true
    • region: "us-east-1"
    • access_key_id: "minioadmin"
    • secret_access_key: "minioadmin"

Validation Standard

Every test must verify:

  1. vykar backup exits 0
  2. vykar list shows new snapshot for expected source label
  3. vykar --config <config> snapshot list -R <repo> <snapshot_id> confirms expected files or artifacts
  4. Restore into temp directory and verify:
    • Corpus tests: diff -qr --no-dereference <source> <restore_dir> reports no differences
    • Database tests: restore dump, verify row/document counts and sampled content match seeded large dataset
    • After verification, immediately remove restored files: rm -rf <restore_dir>
  5. Optional: SHA256 manifest comparison for stronger content verification

Cleanup Standard

  1. Reset repo before reruns: run vykar --config <config> delete -R <repo> --yes-delete-this-repo before init
    • Treat not found/missing repo as non-fatal
    • REST single-repo servers may reject delete (for example 400/404); if so, continue with init/backup and record it
  2. Local: remove temporary directories (dumps, restores, configs)
    • Do not keep restore trees after diff/integrity checks unless actively debugging
  3. Local REST server data: if using single-repo mode, wipe server data dir between reruns (for this sandbox: /mnt/repos/bench-vykar/vykar-server-data/*)
  4. Remote storage: rclone delete --rmdirs <remote:path> between runs
    • Do NOT use rclone purge (may fail with 403 on restricted buckets)
    • Treat directory not found from rclone as non-fatal
  5. Containers: stop and remove after each scenario
  6. Filesystems: unmount/destroy test pools after runs

Run Matrix

For tests that span multiple backends, run in this order:

  1. local first (fast feedback loop)
  2. rest second (local server path, still exercises HTTP backend)
  3. s3 third
  4. sftp last (known instability, use timeouts)

SFTP Guardrails

SFTP can be intermittent even when rclone works fine against the same server:

  • Wrap all vykar commands with timeout: timeout 120s vykar init ..., timeout 3600s vykar backup ...
  • On timeout (exit 124), mark test as BLOCKED, kill stuck process, continue cleanup
  • Do NOT rerun the entire test suite if only SFTP failed — isolate SFTP results
  • Ensure no stuck vykar process remains after aborted SFTP steps

Interruption Minimal Test (All Backends)

Run order:

  1. local
  2. rest
  3. s3 (local MinIO only)
  4. sftp

Source mapping:

  • local -> ~/corpus-local/snapshot-1
  • rest -> ~/corpus-remote/snapshot-1
  • s3 (local MinIO) -> ~/corpus-remote/snapshot-1
  • sftp -> ~/corpus-remote/snapshot-1

Interrupt method (backup only):

  1. Start backup in background:
    vykar --config <config> backup -R <repo> <src> > <backup_log> 2>&1 &
    pid=$!
    
  2. Sleep briefly (1-5s, backend-specific), then send SIGTERM:
    kill -TERM "$pid"
    
  3. If still running after 2 seconds, send SIGKILL:
    kill -KILL "$pid"
    
  4. Wait and record interrupt exit code (interrupt_rc).

Resume + validation sequence:

  1. Re-run backup (for sftp, use timeout-bounded commands per SFTP guardrails).
  2. Parse latest snapshot_id from Snapshot created: <id> in resume backup log.
  3. vykar --config <config> list -R <repo> --last 5 and confirm snapshot appears.
  4. vykar --config <config> snapshot list -R <repo> <snapshot_id>.
  5. Restore into empty directory.
  6. Validate with diff -qr --no-dereference <src> <restore_dir>.
  7. Run vykar --config <config> check -R <repo>.

Status rules:

  • PASS: interrupted backup + resume backup + list/snapshot list/restore/diff/check all succeed.
  • FAIL: non-timeout command failure or diff mismatch.
  • BLOCKED: timeout (rc=124) on timeout-bounded steps.

Deliverables

Each sub-skill should produce:

  1. Scenario-specific config file saved under ~/runtime/
  2. Log file under ~/runtime/logs/
  3. Pass/fail summary report under ~/runtime/reports/
  4. For interruption scenarios, include per-backend result fields:
    • backend,status,reason,interrupt_rc,resume_rc,snapshot_id,log_dir,report_path

Common Gotchas

  • Mixing sudo vykar and regular vykar creates root-owned repo files — use sudo rm -rf for cleanup
  • Command dump artifacts appear under vykar-dumps/ in snapshot listings
  • Prefer vykar --config <config> ... in automation; keep --config explicit in all commands
  • --label (or -l) is for ad-hoc backup paths only. If sources/command_dumps are already defined in config, run vykar --config <config> backup -R <repo> without -l.
  • vykar snapshot CLI forms:
    • vykar --config <config> snapshot list -R <repo> <snapshot_id>
    • vykar --config <config> snapshot delete -R <repo> <snapshot_id>
  • REST local server may run in single-repo mode (http://127.0.0.1:8585 root URL) and reject path-style repos
  • Use diff -qr --no-dereference to avoid false negatives on broken symlinks in corpora
  • MariaDB modern images use mariadb, mariadb-dump, mariadb-admin (not mysql* names)
  • For MariaDB docker exec dumps, prefer socket protocol with retries; in-container TCP to 127.0.0.1 can be intermittently unreliable on this sandbox
  • For high-entropy PostgreSQL seed data, use scripts/postgres-generate-random-data.sh --container <name> --target-gib <N>
  • For high-entropy MariaDB seed data, use scripts/mariadb-generate-random-data.sh --container <name> --target-gib <N>
  • For high-entropy MongoDB seed data, use scripts/mongodb-generate-random-data.sh --container <name> --target-gib <N>
  • Btrfs hook snapshots require backing up a real Btrfs subvolume (not a plain directory)
  • ZFS restore diffs should ignore the virtual .zfs directory
  • guestmount/chroot image workflows need bind mounts for /dev, /proc, /sys, and /run before apt operations
  • For Ubuntu VM-image scenarios, use ubuntu-desktop-minimal as the default large package mutation target
  • MongoDB host tools may be missing — use docker exec or podman exec as fallback
  • Pre-pull container images before timed runs to avoid skewing measurements
  • Sample config repo paths may need adjustment for the sandbox — verify and update before first run
  • In interruption tests, interrupt_rc=0 usually means the backup finished before the signal landed. Re-run with shorter sle
Install via CLI
npx skills add https://github.com/borgbase/vykar --skill e2e-tests
Repository Details
star Stars 644
call_split Forks 17
navigation Branch main
article Path SKILL.md
More from Creator