semaphore-test-results

name: semaphore-test-results
description: Publish JUnit test reports from Semaphore jobs and surface them in the UI's Test Reports tab. Covers the `test-results` CLI, why it must live in `epilogue` (not the test command itself), and per-framework JUnit configuration for Go, pytest, RSpec, Jest, Vitest, ExUnit, Java. Use when a pipeline writes JUnit / test reports, the Test Reports tab is empty after a failure, failures publish silently, the user asks about flaky test surfacing or per-framework JUnit setup, or any time `test-results publish` / `test-results gen-pipeline-report` is involved.

Publishing test results on Semaphore

The test-results CLI (part of the toolbox, preinstalled) uploads JUnit-formatted reports to Semaphore's artifact store and renders them in the Test Reports tab of each job / pipeline. Without it, a failed test is just a red job — no per-test detail, no failure history, no flake detection.

This skill is the deep dive that semaphore-toolbox defers to. See semaphore-toolbox for the broader CLI surface.

The two commands

test-results publish <junit-file-or-dir> [--name <suite>] [--generate-mcp-summary]
test-results gen-pipeline-report [--generate-mcp-summary]

publish: uploads JUnit data as an artifact scoped to the current job; the file(s) appear under that job's Test Reports tab. It accepts either a single XML file or a directory. Given a directory, it publishes every XML file inside (handy for sharded runners that emit cypress-results/junit-shard-1.xml, cypress-results/junit-shard-2.xml, …). --name labels the suite in the UI (useful when one job emits multiple JUnit files for different test types).
gen-pipeline-report: runs once at the end of the pipeline, gathers every JUnit artifact uploaded by the publish calls, and produces the pipeline-level aggregated report (the rolled-up view across all jobs).

--generate-mcp-summary (on either command) also emits mcp-summary.json — a compact, AI-readable summary of pass/fail counts and failure messages.

Why epilogue, not inline (THE rule)

Put test-results publish in the job's epilogue, never in the main commands: list. Here's why:

Where you put it	What happens when tests fail
Inline (after test command)	Test command exits non-zero → shell stops → publish never runs → no report in UI
`epilogue.always.commands`	Runs even when the job fails — publish always fires → failure detail surfaces in UI

The epilogue under always: is Semaphore's "finally" block. Without it, the only thing you ever see for a failing job is the raw stderr, which buries the actually useful information.
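The difference between the two placements can be reproduced locally. The sketch below is a simulation only, assuming stop-on-first-failure behaviour like `set -e`: `false` stands in for a failing test command, `echo "publish ran"` stands in for `test-results publish`, and an EXIT trap plays the role of `epilogue.always`. Nothing in it talks to Semaphore.

```shell
#!/usr/bin/env bash
# Local simulation only: `false` is the failing test command and
# `echo "publish ran"` is a stand-in for `test-results publish`.

# Inline placement: the failing command aborts the script, so the
# publish line is never reached.
inline_job() {
  bash -c 'set -e; false; echo "publish ran"'
}

# Epilogue-style placement: the EXIT trap fires even though the test
# step failed, which is the role epilogue.always plays in a real job.
epilogue_job() {
  bash -c 'trap "echo publish ran" EXIT; set -e; false'
}

echo "inline:   [$(inline_job)]"     # prints: inline:   []
echo "epilogue: [$(epilogue_job)]"   # prints: epilogue: [publish ran]
```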

Canonical pipeline shape

global_job_config:
  prologue:
    commands:
      - checkout
  epilogue:
    always:
      commands:
        # Publish every junit-*.xml the job produced.
        # Loop pattern lets a single job emit multiple suite reports.
        - 'for f in junit-*.xml; do [ -f "$f" ] || continue; n="${f#junit-}"; n="${n%.xml}"; test-results publish "$f" --name "$n"; done'

blocks:
  - name: Test
    task:
      jobs:
        - name: ...
          commands:
            - ...
            # produces junit-unit.xml and junit-integration.xml
            - gotestsum --junitfile junit-unit.xml -- ./...
            - gotestsum --junitfile junit-integration.xml -- ./integration

after_pipeline:
  task:
    jobs:
      - name: Pipeline test report
        commands:
          - test-results gen-pipeline-report

The loop pattern is what we use in sem-ai's own pipeline — it lets a job emit junit-tests.xml, junit-race.xml, etc., and each becomes its own labeled suite without changing the epilogue.
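Written out over several lines, the one-line epilogue loop reads as follows. This is a sketch of the same logic wrapped in a function; `PUBLISH` is a hypothetical override added here so the loop can be dry-run outside Semaphore, and is not part of the `test-results` CLI.

```shell
# Expanded form of the one-line epilogue loop.
# PUBLISH is a hypothetical override for local dry runs; on Semaphore it
# is left unset and the real `test-results publish` is called.
publish_all() {
  local f name
  for f in junit-*.xml; do
    [ -f "$f" ] || continue     # glob matched nothing: skip the literal pattern
    name="${f#junit-}"          # junit-race.xml -> race.xml
    name="${name%.xml}"         # race.xml -> race
    ${PUBLISH:-test-results publish} "$f" --name "$name"
  done
}
```

Running `PUBLISH=echo publish_all` in a directory that contains junit-tests.xml and junit-race.xml prints the two publish invocations instead of executing them, which is a quick way to check what the glob will pick up.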

Per-framework JUnit configuration

Each test framework needs a flag, plugin, or formatter to write JUnit XML. The publishing step is identical across frameworks; only the test invocation changes.

Go — gotestsum

go test itself doesn't emit JUnit. Use gotestsum (small wrapper).

- go install gotest.tools/gotestsum@latest
- gotestsum --junitfile junit-tests.xml --format pkgname -- ./...

For race detection in a separate job:

- gotestsum --junitfile junit-race.xml --format pkgname -- -race ./...

Python — pytest

Built-in:

- pytest --junitxml=junit-pytest.xml

Ruby — RSpec

Needs the rspec_junit_formatter gem in your bundle:

# Gemfile
group :test do
  gem 'rspec_junit_formatter'
end

Then:

- bundle exec rspec --format progress --format RspecJunitFormatter --out junit-rspec.xml

Ruby — Minitest

Needs minitest-junit:

gem 'minitest-junit'

- ruby -Itest -e 'Dir["test/**/*_test.rb"].each { |f| require File.expand_path(f) }' -- --junit --junit-filename=junit-minitest.xml

JavaScript — Jest

Install jest-junit:

npm i -D jest-junit

Config (in package.json or jest.config.js):

"jest": {
  "reporters": ["default", ["jest-junit", { "outputFile": "junit-jest.xml" }]]
}

JavaScript — Vitest

Built-in JUnit reporter:

- npx vitest run --reporter=default --reporter=junit --outputFile.junit=junit-vitest.xml

Elixir — ExUnit

Add junit_formatter:

# mix.exs
{:junit_formatter, "~> 3.0", only: [:test]}

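# config/test.exs
config :junit_formatter,
  report_dir: ".",
  report_file: "junit-exunit.xml"

# test/test_helper.exs
# Register the JUnit formatter alongside the default CLI formatter;
# without this, mix test writes no XML.
ExUnit.configure(formatters: [JUnitFormatter, ExUnit.CLIFormatter])
ExUnit.start()

- mix test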

Java — Maven

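Maven Surefire writes XML reports to target/surefire-reports/TEST-*.xml by default. Rename them so the epilogue loop picks them up, and run the rename in the epilogue ahead of the publish loop: placed after mvn test in the main commands, it would be skipped whenever tests fail.

- mvn test

# epilogue.always.commands, before the publish loop
- 'for f in target/surefire-reports/TEST-*.xml; do [ -f "$f" ] || continue; mv "$f" "junit-$(basename "$f" .xml | sed s/^TEST-//).xml"; done'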
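To see what the rename produces, the name mapping can be isolated in a small function. This is a sketch that uses shell parameter expansion in place of the `sed` call; the class name in the usage line is hypothetical.

```shell
# Map a Surefire report path onto the junit-<suite>.xml convention.
# Uses parameter expansion instead of sed; the result is the same name.
surefire_suite_file() {
  local base
  base=$(basename "$1" .xml)               # TEST-com.example.FooTest
  printf 'junit-%s.xml\n' "${base#TEST-}"  # junit-com.example.FooTest.xml
}

# Hypothetical class name, for illustration only.
surefire_suite_file target/surefire-reports/TEST-com.example.FooTest.xml
```

With the epilogue loop's naming rule, the suite label in the UI would then be the part between junit- and .xml, here the fully qualified test class name.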

Cypress (E2E)

Use cypress-junit-reporter through cypress-multi-reporters:

npm i -D cypress-multi-reporters cypress-junit-reporter

// cypress.config.js
reporter: 'cypress-multi-reporters',
reporterOptions: {
  reporterEnabled: 'cypress-junit-reporter',
  cypressJunitReporterReporterOptions: {
    mochaFile: 'junit-cypress-[hash].xml'
  }
}

When sharding Cypress across N jobs (see semaphore-blocks parallelism), each shard writes its own files. The [hash] placeholder in mochaFile keeps per-spec reports from overwriting one another, and the epilogue loop publishes each file under its own name.
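One way to make the shard visible in the report name is to build the mochaFile value from the shard index. A minimal sketch, assuming Semaphore exposes the index as SEMAPHORE_JOB_INDEX for jobs created with parallelism, and assuming cypress.config.js is adapted to read a hypothetical JUNIT_MOCHA_FILE variable:

```shell
# Assumption: SEMAPHORE_JOB_INDEX holds the 1-based shard index for jobs
# created with `parallelism`; fall back to 1 for a non-sharded run.
shard_mocha_file() {
  # [hash] stays in the name so per-spec reports do not overwrite each other.
  printf 'junit-cypress-%s-[hash].xml\n' "${SEMAPHORE_JOB_INDEX:-1}"
}

# JUNIT_MOCHA_FILE is a hypothetical variable name; cypress.config.js would
# have to read it (for example via process.env) and pass it on as mochaFile.
JUNIT_MOCHA_FILE="$(shard_mocha_file)"
export JUNIT_MOCHA_FILE
echo "$JUNIT_MOCHA_FILE"
```

The resulting files still match the junit-*.xml glob, so the epilogue loop needs no change.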

What shows up in the UI

Per-job Test Reports tab — populated by test-results publish inside that job. Lists each test with status, duration, error message; failed tests sorted first.
Pipeline-level Test Report — populated by test-results gen-pipeline-report in after_pipeline. Aggregates every junit artifact across every job. This is what shows you "8 failed across 4 jobs" at a glance.
mcp-summary.json artifact — when --generate-mcp-summary is passed; compact JSON suitable for downstream tooling (CI annotations, AI summarization).

If you skip gen-pipeline-report, the per-job tabs still work, but the rolled-up pipeline view is empty.

Debugging a failed job — pull, don't tweak

If your pipeline already publishes JUnit and a test fails, don't change the reporter to also dump to stdout, and don't re-run with -v. The failure detail is already published — pull it.

sem-ai test summary --pipeline <pipeline-id>
# → verdict, total/passed/failed/skipped, per-failure: test name, file:line, message

sem-ai test report --pipeline <pipeline-id>
# → per-job view; tries the junit artifact first, falls back to log parsing

sem-ai artifact get --scope jobs --id <job-id> --path junit-<name>.xml --output local.xml
# → raw artifact when you want to inspect it yourself

See the test-intelligence skill for the full command surface.

When to use which

test summary — first move when a pipeline went red. Compact digest. Right answer 90% of the time.
test report — when summary doesn't have enough; per-job breakdown.
artifact get — when you want the raw junit XML (e.g. to diff against a previous run, feed into other tooling).

Anti-pattern — junit-only reporter

A test runner configured with only a JUnit reporter (Vitest --reporter=junit and nothing else; mocha-junit-reporter with toConsole: false; Jest with junit-only setup) produces zero human-readable output in the job log. When something fails, the log shows "exit 1" and nothing about which test broke.

Don't configure runners this way to begin with. Keep the framework's default reporter alongside the JUnit one — the default reporter is calibrated to be useful without flooding the log (test names, failure messages, summary; no per-assertion noise). You get readable logs and the structured Test Reports tab.

Concrete shapes:

Framework	Don't	Do
Vitest	`--reporter=junit`	`--reporter=default --reporter=junit`
mocha / cypress-junit	`toConsole: false`	`toConsole: true` (or `mocha-multi-reporters` with `spec` + `mocha-junit-reporter`)
Jest	only `jest-junit` in `reporters:`	`["default", ["jest-junit", { ... }]]`
pytest	`--quiet --junitxml=...`	`--junitxml=...` (drop `--quiet`)
RSpec	only `--format RspecJunitFormatter`	`--format progress --format RspecJunitFormatter --out junit-rspec.xml`
gotestsum	`--format=junit` only	default `--format=pkgname` writes log; `--junitfile` adds junit

Rule: more log = better debugging, up to the point of flooding. Defaults across frameworks are well-tuned for this: keep them on and add JUnit alongside. Avoid -v / --verbose flags unless you actually need a per-assertion trace; they inflate the log considerably for marginal debugging value.

If you DID end up junit-only, still don't change the pipeline first

sem-ai test summary --pipeline <id> parses the junit artifact and prints a digest in under a second. Use that to identify what failed before deciding whether the right fix is a reporter config change or a code change. Patching the pipeline to dump more output, re-running, and waiting another few minutes is rarely the right first move — pull what's already published.

Common failure modes

"I see no test report, but the job's command produced junit-foo.xml"

Almost always: publish ran inline rather than in the epilogue, and the test command exited non-zero before the publish line was reached. Move publish to epilogue.always.commands.

"Report is partial — only some suites show up"

Either:

The epilogue loop missed a file (glob doesn't match — check the actual filename produced by the framework)
A test-results publish call returned non-zero and set -e killed subsequent publishes. Fix the offending file rather than papering over with || true — see below.

Don't mask `test-results publish` failures with `|| true`

Common (and wrong) pattern:

- test-results publish cypress-results/ || true

Semaphore reports a job as failed when ANY epilogue command exits non-zero. Wrapping test-results publish in || true to "make sure the job doesn't fail on publish errors" actively hides real publish failures: malformed XML, missing file, CLI error — all swallowed. The Test Reports tab silently stays empty and there's no signal that anything went wrong.

Let it fail. If publish errors are blocking real runs, fix the cause (file path, framework JUnit config, missing dir) instead of masking. The same applies to gen-pipeline-report in after_pipeline.

The only legitimate use of || true here is in a fan-out loop where you genuinely want best-effort across N independent files AND you can stomach a silent miss — and even then, prefer to fix the offending file.
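If best-effort across several files is really what you want, a fan-out loop can attempt every file and still report the failure at the end, so nothing is swallowed. A sketch under that assumption; `PUBLISH` is a hypothetical override for local dry runs and is not part of the `test-results` CLI.

```shell
# Attempt every file, remember whether any publish failed, and return
# non-zero at the end instead of masking the error with `|| true`.
# PUBLISH is a hypothetical override for local dry runs.
publish_each() {
  local status=0 f
  for f in "$@"; do
    ${PUBLISH:-test-results publish} "$f" || {
      echo "publish failed: $f" >&2   # leave a visible trace in the job log
      status=1
    }
  done
  return "$status"
}
```

Called as `publish_each junit-a.xml junit-b.xml`, a bad first file no longer blocks the second one, and the command still exits non-zero so the failure stays visible.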

"Test Reports tab is empty after a passing job"

When the job passed and the tab is still empty, either the JUnit file was never written (config issue) or the path doesn't match (the file sits in a subdirectory the epilogue glob doesn't reach). Confirm that the framework actually emits the file, and where.
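A quick way to confirm where reports actually land is to list XML files that look like JUnit output but that the junit-*.xml glob in the working directory would miss. This is a diagnostic sketch, assuming JUnit files can be recognised by a `<testsuite` element; `find_stray_reports` is a hypothetical helper name.

```shell
# List XML files that contain a <testsuite element but would NOT be
# matched by the epilogue glob (junit-*.xml directly under the root).
find_stray_reports() {
  local root="${1:-.}" path
  find "$root" -type f -name '*.xml' -exec grep -l '<testsuite' {} + |
    while IFS= read -r path; do
      if [ "$(dirname "$path")" = "$root" ]; then
        case "$(basename "$path")" in
          junit-*.xml) continue ;;   # the epilogue loop already matches this one
        esac
      fi
      printf '%s\n' "$path"          # wrong name or wrong directory
    done
}
```

Running it as a temporary command at the end of the job (or locally after a test run) shows which files need a rename or a move before the publish loop can see them.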

"Pipeline-level report is empty even though per-job reports work"

gen-pipeline-report not running. Check:

It's in after_pipeline.task.jobs[].commands, not inside a block
The job in after_pipeline actually runs (no missing prologue dep)

Naming convention recommendation

Adopt junit-<suite>.xml everywhere (the loop pattern depends on it). Examples:

junit-tests.xml (default Go unit tests)
junit-race.xml (race detection)
junit-trivy.xml (security scan as test results)
junit-cypress-1.xml, junit-cypress-2.xml, … (sharded Cypress)
junit-vitest.xml

The --name arg in the loop strips junit- and .xml, so the suite labels in the UI become tests, race, cypress-1, etc.

Related skills

semaphore-toolbox — broader CLI surface (cache, artifact, retry, sem-version, sem-service); test-results is the deep dive
semaphore-blocks — where epilogue lives; how block / global_job_config / after_pipeline relate
test-intelligence — analyzing the results after publication (flaky detection, trends); complements this skill
gha-to-semaphore — when translating GHA test-reporter actions, point here