name: deploying-with-isucon-ansible
description: Deploys ISUCON contest code and configs to competition servers using the isucon-ansible layout (Ansible playbook for provisioning + Makefile-over-SSH for the per-benchmark loop). The two deploy commands are make bench (regular deploy, with instrumentation ON) and make maji (final-run deploy, with instrumentation OFF). Use when running benchmarks, deploying app/nginx/MySQL config changes, reassigning roles between servers, or when working in a repo that uses mazrean/isucon-ansible (server.yaml, hosts inventory, .make.env, remote/Makefile).
Deploying with isucon-ansible
ISUCON deployment with this layout is a two-layer workflow:
- Ansible (server.yaml): one-shot / occasional. Provisions every competition server: installs tools, clones the contest repo, sets kernel params, installs systemd units for app/nginx/mysql, and enables or disables each service based on the host group it belongs to.
- Makefile (Makefile + remote/Makefile): every benchmark run. Pulls the latest contest-repo commit on the target server, copies app code and nginx/mysql configs into place, toggles instrumentation (access log, slow query log, app metrics), rebuilds the binary, and restarts services.
Use this skill when you need to push a change and re-benchmark (make bench), do the final scored run (make maji), swap which ISU runs which
role (mysql/app/nginx), or read pprof / kataribe / slow query output from a
target server.
Repo-specific values (service names, paths, repo URL, host IPs) live in
group_vars/all/var.yaml, .make.env, host_vars/isuN, and hosts — never
hard-code them; let the existing variables drive everything.
The Two Deploy Commands
There are exactly two end-to-end deploy commands. Pick one based on what the next run is for:
| Command | Purpose | Instrumentation |
|---|---|---|
| make bench REMOTE_ID=N | Regular deploy for iteration / measured runs. | ON |
| make maji REMOTE_ID=N | Final-run deploy for the scored / "本気" run. | OFF |
- bench is the default for everyday work: it deploys and turns on every measurement channel (fluent-bit, app metrics endpoint, nginx access_log in kataribe format, MySQL slow_query_log) so the next benchmark produces analyzable data.
- maji ("マジ" = serious / final) is for when you're done iterating and want the highest score: same deploy steps, but every measurement channel is turned off so logging/profiling overhead doesn't cost you points. Use it for the last submitted run.
If in doubt, use bench. Only switch to maji when you explicitly want to
sacrifice observability for throughput.
Mental Model
local                                        servers (isu1, isu2, isu3, ...)
─────                                        ──────────────────────────────
ansible-playbook server.yaml ──provision──▶  install tools, deploy systemd
                                             units, enable/disable services
                                             per [active] host-group membership
make bench REMOTE_ID=N       ──ssh──▶        remote/Makefile on isuN:
                                             git pull → cp configs → instrumentation ON
                                             → build → restart (regular deploy)
make maji REMOTE_ID=N        ──ssh──▶        remote/Makefile on isuN:
                                             git pull → cp configs → instrumentation OFF
                                             → build → restart (final deploy)
The root Makefile does nothing locally except make -C remote $TARGET,
where remote/Makefile runs with SHELL=ssh -t -A isu$REMOTE_ID. Every
recipe in remote/Makefile therefore executes on the target server over
SSH with agent forwarding (so git pull against a private repo works).
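As a rough mental aid for that delegation (not the repo's actual Makefile), the shell function below prints the SSH command that one recipe line approximately turns into. The name remote_recipe_cmd is hypothetical, and the exact way make hands the recipe to ssh (shell flags, quoting) depends on the repo's Makefile, so the output is an approximation only.

```shell
# Hypothetical helper: show the ssh invocation behind one remote recipe line.
# Assumes hosts are named isu<N>, matching SHELL=ssh -t -A isu$REMOTE_ID.
remote_recipe_cmd() (
  remote_id="${1:-1}"                  # REMOTE_ID defaults to 1
  if [ "$#" -gt 0 ]; then shift; fi    # the remaining arguments are the recipe
  # -t allocates a TTY; -A forwards the SSH agent so a remote git pull can authenticate
  printf 'ssh -t -A isu%s %s\n' "$remote_id" "$*"
)
```

For example, remote_recipe_cmd 2 git pull prints the command that would run git pull on isu2.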
Inventory & Role Assignment
hosts defines two kinds of groups:
[isucon] # all servers — common provisioning runs here
isu1
isu2
[app] # which servers run the app
isu1
isu2
[mysql] # which server runs MySQL
isu1
[nginx] # which servers terminate HTTP
isu1
isu2
[active:children]
app
mysql
nginx
server.yaml then uses the active:!app, active:!mysql, and active:!nginx
patterns to disable each role on every active host that isn't in that role's
group. So to move MySQL from isu1 to isu2, edit hosts (add isu2 to
[mysql], remove isu1 from it) and re-run the playbook; the role swap is
declarative.
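To preview which hosts an active:!role pattern would select before running the playbook, a small set-difference over the inventory is enough. The sketch below is an approximation, not Ansible's own resolver: the function name inactive_for_role is hypothetical, it assumes [active] is exactly the union of [app], [mysql], and [nginx] as in the inventory above, and it does not expand host ranges or nested children.

```shell
# Hypothetical helper: list hosts that are active but NOT in the given role,
# i.e. a rough stand-in for the Ansible pattern active:!<role>.
# Usage: inactive_for_role <inventory-file> <role>
inactive_for_role() (
  inventory="$1"; role="$2"
  awk -v role="$role" '
    # Section header such as "[mysql]": remember the current group name
    /^\[/ { group = substr($1, 2, index($1, "]") - 2); next }
    # Skip blank lines and comment lines
    NF == 0 || $1 ~ /^#/ { next }
    # Assumption: the active set is the union of the three role groups
    group == "app" || group == "mysql" || group == "nginx" { active[$1] = 1 }
    group == role { in_role[$1] = 1 }
    END { for (h in active) if (!(h in in_role)) print h }
  ' "$inventory" | sort
)
```

With the inventory shown above, inactive_for_role hosts mysql lists isu2, the host where MySQL would be disabled. When Ansible is installed, ansible -i hosts 'active:!mysql' --list-hosts gives the authoritative answer.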
Each host is bound to its SSH target via host_vars/isuN (ansible_host,
ansible_user, ansible_ssh_private_key_file).
Provisioning (run on setup or after host-group changes)
# Install role dependencies once per checkout
ansible-galaxy install -r requirements.yml
# Full provision (all roles, all hosts in [active])
ansible-playbook -i hosts server.yaml
# Re-run a single role only — every role has a tag matching its name
ansible-playbook -i hosts server.yaml --tags nginx
ansible-playbook -i hosts server.yaml --tags mysql,mysql_down
ansible-playbook -i hosts server.yaml --tags repo
The *_down tags are the disable-on-inactive-hosts side; pair them with the
enable tag when reshuffling roles.
Local-only monitoring stack (Grafana / Loki / etc., on the operator's machine):
ansible-playbook -i hosts monitor.yaml
Per-Benchmark Loop
REMOTE_ID selects which server to deploy to (isu$REMOTE_ID). It defaults
to 1; set it explicitly when deploying to other ISUs. The two top-level
deploy commands are:
# Regular deploy (instrumentation ON) — use this for every iteration
make bench REMOTE_ID=1
# Final-run deploy (instrumentation OFF) — use this only for the scored run
make maji REMOTE_ID=1
bench (regular) vs. maji (final): step-by-step
Both share the same deploy steps; the only difference is whether each measurement channel is enabled.
| Step | bench (regular) | maji (final) |
|---|---|---|
| backup access/slow logs | ✅ | ✅ |
| pull git repo | ✅ | ✅ |
| replace app/nginx/mysql | ✅ | ✅ |
| fluent-bit log shipper | enable | disable |
| app process metrics env var | on | off |
| nginx access_log (kataribe) | on | off |
| MySQL slow_query_log | on | off |
| build Go binary | ✅ | ✅ |
| restart services | ✅ | ✅ |
Use bench while iterating (you want the data); use maji for the final
scored run (you want max throughput, no logging overhead).
Targeting a Single Subsystem
When you only changed one layer, deploy only that layer:
make app-replace REMOTE_ID=1 # copy app source from cloned repo
make build REMOTE_ID=1 # rebuild Go binary in BUILD_DIR
make app-restart REMOTE_ID=1 # systemctl restart $APP_SRV_NAME
make nginx-replace REMOTE_ID=1 # cp nginx.conf, conf.d/, sites-available/
make nginx-restart REMOTE_ID=1 # nginx -t, then restart
make mysql-replace REMOTE_ID=1 # cp my.cnf, conf.d/, mysql.conf.d/
make mysql-restart REMOTE_ID=1 # also greps journal for "ignored" config
The *-replace targets copy from $REPO_DIR as it currently exists on the server, so run make pull (or make bench / make maji) first to pick up new commits.
Inspection & Profiling
Every diagnostic target also runs over SSH against isu$REMOTE_ID:
make log REMOTE_ID=1 # journalctl -e -u $APP_SRV_NAME
make log-cont REMOTE_ID=1 # journalctl -e -f -u $APP_SRV_NAME (follow)
make slow REMOTE_ID=1 # pt-query-digest on $MYSQL_LOG
make kataribe REMOTE_ID=1 # kataribe over $NGX_LOG
make mysql REMOTE_ID=1 # MySQL CLI as the app user
make mysql-root REMOTE_ID=1 # MySQL CLI as root
pprof / fgprof are local targets: they call go tool pprof -http=...
against http://localhost:606$REMOTE_ID/debug/pprof/profile (or
/debug/fgprof for fgprof). That URL is reachable only if local port
606$REMOTE_ID is SSH-forwarded to the app's profiling port on the target server, e.g.:
ssh -L 6061:localhost:6060 isu1 # in another shell, while make pprof runs
make pprof REMOTE_ID=1
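The port arithmetic generalizes to any ISU. The sketch below prints the tunnel command for a given REMOTE_ID; the function name pprof_tunnel_cmd is hypothetical, and it assumes the app exposes its profiling endpoint on remote port 6060, as in the isu1 example above.

```shell
# Hypothetical helper: print the port-forward command for one ISU.
# Assumes remote profiling port 6060 and local port 606<REMOTE_ID>.
pprof_tunnel_cmd() (
  remote_id="${1:-1}"
  # -N: forward the port only, without running a remote command
  # -L: listen on the local port and forward to localhost:6060 on the server
  printf 'ssh -N -L 606%s:localhost:6060 isu%s\n' "$remote_id" "$remote_id"
)
```

Run the printed command in a second terminal, then start make pprof with the same REMOTE_ID.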
Toggling Instrumentation Independently
If you don't want to redeploy just to flip a knob:
make metrics-on REMOTE_ID=N # sed-edits ISUTOOLS_ENABLE in $APP_SRV_FILE + daemon-reload
make metrics-off REMOTE_ID=N
make access-on REMOTE_ID=N # rewrites nginx access_log to kataribe format
make access-off REMOTE_ID=N # access_log off;
make slow-on REMOTE_ID=N # SET GLOBAL slow_query_log=ON, long_query_time=0
make slow-off REMOTE_ID=N
make fluentbit-enable REMOTE_ID=N
make fluentbit-disable REMOTE_ID=N
A metrics-* or slow-* flip on its own doesn't restart the app/MySQL —
follow with app-restart / mysql-restart if the change must take effect
immediately. The bench/maji macros already do this in the right order.
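When all four channels need flipping together without a redeploy, a thin wrapper keeps the on/off versus enable/disable naming straight. This is a sketch only: set_instrumentation and the MAKE_BIN override are hypothetical names, and the order of the targets is an assumption rather than the order the bench/maji recipes use.

```shell
# Hypothetical wrapper: flip every measurement channel on one ISU.
# Usage: set_instrumentation on|off [REMOTE_ID]
# Set MAKE_BIN=echo for a dry run that only prints the make arguments.
set_instrumentation() (
  state="$1"; remote_id="${2:-1}"
  case "$state" in
    on)  fluentbit=enable ;;     # fluent-bit targets use enable/disable
    off) fluentbit=disable ;;
    *)   echo "usage: set_instrumentation on|off [REMOTE_ID]" >&2; exit 2 ;;
  esac
  for target in "metrics-$state" "access-$state" "slow-$state" "fluentbit-$fluentbit"; do
    # Stop at the first failing target
    ${MAKE_BIN:-make} "$target" "REMOTE_ID=$remote_id" || exit 1
  done
)
```

The wrapper only flips the knobs; whether a restart is still needed follows the guidance above.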
Typical Workflows
"I changed Go code" — regular deploy
make bench REMOTE_ID=1 # regular deploy: pull, replace, instrumentation ON, build, restart
# run benchmark
make kataribe REMOTE_ID=1 # nginx breakdown
make slow REMOTE_ID=1 # slow query digest
"I changed only nginx config"
# commit + push nginx/ in the contest repo first
make pull REMOTE_ID=1
make nginx-replace REMOTE_ID=1
make nginx-restart REMOTE_ID=1
"Move MySQL from isu1 to isu2"
- Edit hosts: move isu2 into [mysql], remove isu1 from it.
- ansible-playbook -i hosts server.yaml --tags mysql,mysql_down
- Update the app DB host wherever it's configured (commonly group_vars/all/var.yaml's mysql.connection.host, plus the contest app's env/config), commit, push, then make bench REMOTE_ID=<app-host>.
"Final scoring run" — final deploy
make maji REMOTE_ID=1 # final deploy: same as bench, but instrumentation OFF
make maji REMOTE_ID=2
make maji REMOTE_ID=3
# trigger benchmark
Run maji against every active host so logs/metrics are off everywhere.
After the scored run finishes, switch back to make bench for the next
iteration so measurements come back on.
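Because REMOTE_ID selects exactly one server per invocation, the three commands above can be folded into a loop. The sketch below is one way to do that; deploy_all and the MAKE_BIN override are hypothetical names, not part of the repo.

```shell
# Hypothetical wrapper: run one deploy target against several ISUs in turn.
# Usage: deploy_all <target> <REMOTE_ID>...
# Set MAKE_BIN=echo for a dry run that only prints the make arguments.
deploy_all() (
  if [ "$#" -lt 2 ]; then
    echo "usage: deploy_all <target> <REMOTE_ID>..." >&2
    exit 2
  fi
  target="$1"; shift
  for remote_id in "$@"; do
    # Stop at the first failure so a broken deploy is not silently skipped
    ${MAKE_BIN:-make} "$target" "REMOTE_ID=$remote_id" || {
      echo "deploy failed on isu$remote_id" >&2
      exit 1
    }
  done
)
```

deploy_all maji 1 2 3 is then equivalent to the three make maji lines, and deploy_all bench 1 2 3 turns measurements back on afterwards.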
Pre-flight Checklist
Before the first deploy in a fresh checkout:
- ansible-galaxy install -r requirements.yml ran cleanly
- host_vars/isuN has the right ansible_host / SSH key for each ISU
- hosts [app] / [mysql] / [nginx] reflects the intended topology
- .make.env GIT_REPO / REPO_BRANCH point at the contest repo
- group_vars/all/var.yaml service names + BUILD_CMD match the contest's app (e.g., isuride-go.service for ISUCON14)
- SSH agent forwarding works end-to-end (ssh -A isu1 'ssh -T git@github.com')
- ansible-playbook -i hosts server.yaml completed at least once
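Part of the checklist can be automated. The sketch below only confirms that the files this layout relies on are present in a checkout; it does not validate their contents, SSH access, or whether the playbook has run. The function name preflight_files is hypothetical.

```shell
# Hypothetical pre-flight check: confirm the files named in this document exist.
# Usage: preflight_files [repo-root]   (defaults to the current directory)
# Prints one "missing: <path>" line per absent file; exit status 1 if any is missing.
preflight_files() (
  repo_root="${1:-.}"
  missing=0
  for f in requirements.yml hosts server.yaml .make.env \
           group_vars/all/var.yaml Makefile remote/Makefile; do
    if [ ! -e "$repo_root/$f" ]; then
      echo "missing: $f"
      missing=1
    fi
  done
  exit "$missing"
)
```

A clean run prints nothing and returns 0, so it composes with other checks, for example preflight_files . && ssh -A isu1 'ssh -T git@github.com'.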
Gotchas
- bench is regular; maji is final-only. Don't run maji while iterating: you'll lose the kataribe / slow-query / metrics data you need to decide what to optimize next. Conversely, don't submit a bench-prepped run as the scored run: instrumentation overhead is non-trivial.
- REMOTE_ID is per-invocation, not sticky. make bench / make maji deploys to exactly one server; run it once per active app host.
- replace is destructive on the server side: it cp -r -T's repo contents over /etc/nginx, /etc/mysql, $APP_BASE. Hand-edits on the server are lost on the next deploy. Always edit in the contest repo.
- mysql-restart greps journalctl for ignored to catch silent config-rejection (wrong perms / unknown options). If it fails after a config change, look for chmod / chown issues in $MYSQL_CFG_DIR.
- bench / maji always run backup first, moving the previous access_log + slow_query_log into ~/logs/<unix-timestamp>/ on the server. Pull old logs from there if you need to compare runs.
- active:!role patterns require the host to be in [active]. If you add a new ISU, add it to [isucon] and to at least one of [app] / [mysql] / [nginx], otherwise the disable side won't run on it.
- make pprof / fgprof need port forwarding. They're local-only and hit localhost:606$REMOTE_ID; without an SSH tunnel they fail with a connection-refused error.
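The timestamped backup behavior described above can be pictured with a short sketch. This is an illustration of the idea, not the repo's backup recipe: backup_logs is a hypothetical name, and the real target may differ in paths, privileges, and which files it moves.

```shell
# Hypothetical sketch of a timestamped log backup.
# Usage: backup_logs <dest-root> <log-file>...
# Moves each existing log into <dest-root>/<unix-timestamp>/ and prints that directory.
backup_logs() (
  if [ "$#" -lt 1 ]; then
    echo "usage: backup_logs <dest-root> <log-file>..." >&2
    exit 2
  fi
  dest_root="$1"; shift
  dest="$dest_root/$(date +%s)"     # seconds since the epoch
  mkdir -p "$dest" || exit 1
  for f in "$@"; do
    # Skip logs that do not exist (first run, or logging was off)
    if [ -f "$f" ]; then
      mv "$f" "$dest/" || exit 1
    fi
  done
  echo "$dest"
)
```

Since the directory names are plain integers, sorting them numerically orders the runs chronologically, which makes it easy to pick two runs to compare.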
Resources
- Ansible inventory patterns (group:!other): https://docs.ansible.com/ansible/latest/inventory_guide/intro_patterns.html
- kataribe (nginx access_log analyzer): https://github.com/matsuu/kataribe
- pt-query-digest (slow query analyzer): https://docs.percona.com/percona-toolkit/pt-query-digest.html