name: buildbuddy-flaky-tests
description: Fetch and triage recent BuildBuddy flaky-test data with BuildBuddyService RPCs. Use when Codex needs to list the latest flaky targets, match the Test Analytics flakes UI, get sample flaky invocations/log pointers, rank recent flakes, check whether a PR is already addressing a flaky target, or start reproducing and fixing the top flaky test.
BuildBuddy Flaky Tests
Overview
Use BuildBuddyService target-stat RPCs to fetch the same data behind the Test Analytics flakes page, then use the top entries to drive focused PR search, reproduction, and fixes.
The helper script calls:
- GetTargetStats for the flaky target list.
- GetDailyTargetStats for recent daily aggregate counts.
- GetTargetFlakeSamples for sample flaky invocations for selected labels.
- GetUser or GetGroup only to resolve the group ID when needed.
Load references/requests.md when you need raw HTTP JSON request templates or
field mapping details.
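The RPC calls listed above can be sketched as plain HTTP JSON requests. This is a minimal illustration, not the helper script's actual code: the /rpc/BuildBuddyService/<Method> path, the x-buildbuddy-api-key header name, and the build_rpc_request helper are assumptions, and the request body fields are left to the caller because the real field names live in references/requests.md.

```python
import json
import urllib.request

# Assumption: RPCs accept JSON over HTTP at <base-url>/rpc/BuildBuddyService/<Method>.
# Confirm the path and payload fields against references/requests.md.
RPC_PREFIX = "/rpc/BuildBuddyService/"


def build_rpc_request(base_url, method, body, api_key):
    """Build (but do not send) a POST request for one BuildBuddyService RPC.

    build_rpc_request is a hypothetical helper; `body` is passed through
    unchanged so no request field names are guessed here.
    """
    url = base_url.rstrip("/") + RPC_PREFIX + method
    headers = {
        "Content-Type": "application/json",
        # Assumed header name for API-key authentication.
        "x-buildbuddy-api-key": api_key,
    }
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


if __name__ == "__main__":
    # Placeholder host and key; the request is only printed, never sent.
    req = build_rpc_request("https://example.invalid/", "GetTargetStats", {}, "PLACEHOLDER")
    print(req.get_method(), req.full_url)
```

Building the request separately from sending it keeps the API key out of logs and makes it easy to compare the payload with the templates in references/requests.md before any network call.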
Quick Start
From a BuildBuddy repo with bb login already done:
FLAKY_SKILL_DIR="${BUILDBUDDY_FLAKY_TESTS_SKILL_DIR:-<path-to-this-skill>}"
python3 "$FLAKY_SKILL_DIR/scripts/fetch_flaky_tests.py" \
--org-slug buildbuddy \
--repo auto \
--branch master \
--days 7 \
--limit 10 \
--samples-for-top 1
Use --format json --out <file> when the next step needs structured data.
Use --sort flake-percent to mirror the default table sort in
enterprise/app/tap/flakes.tsx; use the default --sort total-flakes for
automation because it prioritizes the highest-volume flakes.
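The difference between the two sort modes can be sketched as follows. The FlakyTarget fields and the rank function are hypothetical names chosen for illustration, and treating "total flakes" as flaky plus likely-flaky runs is an assumption about how the script counts; check the script output for the real field names.

```python
from dataclasses import dataclass


@dataclass
class FlakyTarget:
    # Hypothetical record shape; the real script output may differ.
    label: str
    total_runs: int
    flaky_runs: int
    likely_flaky_runs: int

    @property
    def total_flakes(self):
        # Assumption: volume counts flaky plus likely-flaky runs.
        return self.flaky_runs + self.likely_flaky_runs

    @property
    def flake_percent(self):
        if self.total_runs == 0:
            return 0.0
        return 100.0 * self.total_flakes / self.total_runs


def rank(targets, sort="total-flakes", limit=10):
    """Return the top `limit` targets, highest first, for the given sort mode."""
    if sort == "total-flakes":
        key = lambda t: t.total_flakes
    elif sort == "flake-percent":
        key = lambda t: t.flake_percent
    else:
        raise ValueError(f"unknown sort: {sort}")
    return sorted(targets, key=key, reverse=True)[:limit]


if __name__ == "__main__":
    targets = [
        FlakyTarget("//pkg/a:a_test", total_runs=1000, flaky_runs=20, likely_flaky_runs=5),
        FlakyTarget("//pkg/b:b_test", total_runs=10, flaky_runs=4, likely_flaky_runs=0),
    ]
    for mode in ("total-flakes", "flake-percent"):
        print(mode, [t.label for t in rank(targets, sort=mode)])
```

In this made-up data, a rarely run target with a high flake rate tops the percentage sort, while the frequently run target tops the volume sort, which is why the volume sort suits automation.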
Preconditions
- Check API key presence without printing it: git config --local buildbuddy.api-key | wc -c.
- The script also accepts BUILDBUDDY_API_KEY, BUILD_BUDDY_API_KEY, BUILDBUDDY_GROUP_ID, BUILD_BUDDY_GROUP_ID, BUILDBUDDY_ORG_SLUG, and BUILD_BUDDY_ORG_SLUG.
- If group resolution fails, rerun with --group-id GR... or --org-slug <url-identifier>.
- For the public BuildBuddy repo, use --org-slug buildbuddy; the selected group inferred from GetUser may not be the group that owns the repo data.
- Keep windows small by default. The proto default is 7 days; the flakes UI can also pass explicit start/end filters.
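One way the environment-variable fallbacks above could be resolved is sketched below. The variable names come from this document, but the precedence (explicit flag first, then the BUILDBUDDY_ spelling, then the BUILD_BUDDY_ spelling) and the helper names resolve_setting and describe_api_key are assumptions, not the script's confirmed behavior.

```python
import os

# Variable names are from this document; the lookup order is an assumption.
ENV_FALLBACKS = {
    "api_key": ("BUILDBUDDY_API_KEY", "BUILD_BUDDY_API_KEY"),
    "group_id": ("BUILDBUDDY_GROUP_ID", "BUILD_BUDDY_GROUP_ID"),
    "org_slug": ("BUILDBUDDY_ORG_SLUG", "BUILD_BUDDY_ORG_SLUG"),
}


def resolve_setting(name, explicit=None, env=None):
    """Return the explicit value if given, else the first non-empty env var."""
    env = os.environ if env is None else env
    if explicit:
        return explicit
    for var in ENV_FALLBACKS[name]:
        value = env.get(var, "").strip()
        if value:
            return value
    return None


def describe_api_key(value):
    """Report whether a key is set without ever echoing the key itself."""
    if not value:
        return "missing"
    return f"present ({len(value)} chars)"


if __name__ == "__main__":
    print("api key:", describe_api_key(resolve_setting("api_key")))
    print("org slug:", resolve_setting("org_slug") or "(unset)")
```

Reporting only the key's length mirrors the wc -c check above: it confirms presence while keeping the secret out of terminal output and logs.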
Triage Workflow
- Fetch a recent ranked list with fetch_flaky_tests.py.
- Pick the top entry using the requested sort. For daily automation, prefer total flaky plus likely-flaky runs over pure percentage.
- For that label, inspect sample invocation IDs from the script output. If deeper logs are needed, use buildbuddy-invocation-troubleshoot on a sample invocation and target label.
- Check whether an open PR already addresses it before editing: search GitHub PRs for the exact label, package path, test suite/class name, and distinctive error text from logs.
- If no PR is clearly addressing it, reproduce narrowly. Start with the target:
  bazel test <label> --config=remote-minimal --nocache_test_results --runs_per_test=30 --test_output=errors
  Use the sample invocation's branch, commit, flags, and environment if local reproduction does not fail.
- Make the smallest plausible fix, then validate with the target test using a higher repeat count when the failure was reproduced.
- Report the flaky target label, ranking evidence, sample invocation links, PR-search result, reproduction result, and validation commands.
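The PR-search and reproduction steps above can be sketched as two small helpers. Both function names are hypothetical, the label in the usage example is made up, and the label parsing assumes the plain //package:target form; the bazel flags are the ones given in the workflow.

```python
import shlex


def pr_search_terms(label, test_name=None, error_text=None):
    """Collect search strings for an open-PR check: label, package, test name, error text.

    Assumes a label of the form //package/path:target (no external repo prefix).
    """
    package = label.split(":", 1)[0].lstrip("/")
    terms = [label, package]
    if test_name:
        terms.append(test_name)
    if error_text:
        terms.append(error_text)
    return terms


def repro_command(label, runs=30):
    """Build the narrow reproduction command as an argv list."""
    return [
        "bazel", "test", label,
        "--config=remote-minimal",
        "--nocache_test_results",
        f"--runs_per_test={runs}",
        "--test_output=errors",
    ]


if __name__ == "__main__":
    label = "//some/pkg:example_test"  # made-up label for illustration
    print(pr_search_terms(label, test_name="TestExample"))
    print(shlex.join(repro_command(label)))
    # After a fix, validate with a higher repeat count.
    print(shlex.join(repro_command(label, runs=100)))
```

Returning the command as a list rather than a string avoids shell-quoting surprises if it is later passed to a process runner, and shlex.join gives a copy-pasteable form for the final report.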
Resources
- scripts/fetch_flaky_tests.py: fetch, rank, and summarize flaky target stats.
- references/requests.md: RPC field mapping and raw request templates.