name: "galaxy-cli"
description: "Operate Galaxy through the galaxy-cli command line interface with low-token, progressive command lookup."
galaxy-cli
Use this skill when the task requires Galaxy operations through galaxy-cli.
Keep token use low: read this file once, then use galaxy-cli <command> --help
only for the specific command you are about to run.
Token-Cheap Defaults
galaxy-cli is agent-first. The default path is:
- Submit each tool with galaxy-cli tool run ... --inputs-json FILE.
- Let tool run wait. Do not add --no-wait unless the task explicitly asks for asynchronous submission.
- Use the returned outputs array for output IDs, state, datatype, and size.
- Do not call job show, dataset show, collection show, dataset list, or collection list for routine verification.
- Do not call job show --logs unless debugging a failed job.
- Do not call tool show when task files already provide exact tool IDs and parameter JSON.
Rules
- Use only galaxy-cli for Galaxy actions in this condition.
- Do not use BioBlend, raw HTTP clients, MCP tools, or Galaxy source code.
- Do not inspect or print API keys. Use GALAXY_URL and GALAXY_API_KEY from the environment.
- Compact JSON output is the default. Use --human only when a task needs human-readable terminal output.
- Pass --history-id explicitly on every history-scoped command. Do not rely on shared session state when multiple agents or concurrent runs may touch the same machine.
- Prefer --inputs-json FILE for tool runs with conditionals, repeats, or more than two parameters.
- Store large command output in files and extract only needed fields with jq.
- Safe GET calls retry 429, 502, 503, and 504 automatically. Do not blindly retry tool run or dataset upload after an unknown submission state.
- If the task already provides exact tool IDs and parameter JSON, submit the tool directly. Do not call tool show just to re-discover supplied params.
- Use tool run --dry-run-payload or --save-payload PATH when you need to validate inputs and inspect the exact Galaxy POST body before submission.
- Tool and workflow submissions validate obvious mistakes before POSTing to Galaxy: unknown input names, missing required dataset or collection inputs, invalid dataset-vs-collection source prefixes, and simple select, boolean, integer, and float values.
- Do not download datasets or reports to local files unless the task explicitly asks for a local artifact. Reuse Galaxy dataset ids and collection ids directly in downstream tool runs.
- For workflow run, explicit source prefixes must be hda:, hdca:, or ldda:. Treat any other prefix as invalid input and fix it before submit.
- workflow run --wait should be trusted only when the invocation reaches Galaxy's scheduled or completed state and all discovered jobs are terminal; this avoids reporting success while later steps are still being scheduled.
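The source-prefix rule above can be checked locally before anything is submitted. The sketch below is only an illustration: check_source_ref is a hypothetical helper, not part of galaxy-cli, and it assumes that a bare ID with no colon is acceptable, as in the workflow run recipes in this file.

```shell
# Hypothetical helper (not a galaxy-cli command): accept a bare ID or a
# reference whose explicit prefix is hda:, hdca:, or ldda:.
check_source_ref() {
  case "$1" in
    hda:?*|hdca:?*|ldda:?*) return 0 ;;  # allowed explicit prefixes
    *:*) return 1 ;;                     # any other prefix, or an empty ID
    ?*) return 0 ;;                      # bare ID without a prefix (assumed OK)
    *) return 1 ;;                       # empty string
  esac
}

# Usage sketch (REF is whatever you are about to pass to -i):
# check_source_ref "$REF" || { echo "invalid source reference: $REF" >&2; exit 1; }
```

Rejecting the reference in the shell costs nothing and avoids spending a CLI call on an input that would be refused anyway.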
Minimal Command Recipes
Create a fresh history:
HID=$(galaxy-cli history create "task run" | jq -r .id)
echo "$HID" > history_id.txt
Copy a prepared source history into a fresh working history:
HID=$(galaxy-cli history copy "$SOURCE_HISTORY_ID" "task run copy" | jq -r .id)
echo "$HID" > history_id.txt
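Both recipes above store whatever jq -r .id prints, and jq prints the literal string null when the key is absent. A minimal guard, assuming a hypothetical helper named require_id (it is not part of galaxy-cli), might look like this:

```shell
# Hypothetical helper: fail when a captured ID is empty or the string "null",
# which is what jq -r prints for a missing key.
# $1 is a label for the error message, $2 is the captured value.
require_id() {
  if [ -z "$2" ] || [ "$2" = "null" ]; then
    echo "missing $1" >&2
    return 1
  fi
}

# Usage sketch, right after capturing HID:
# require_id "history id" "$HID" || exit 1
```

Checking once here keeps a bad ID from propagating into every later --history-id argument.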
Upload local datasets:
FWD=$(galaxy-cli dataset upload inputs/reads_1.fastq.gz --history-id "$HID" --file-type fastqsanger.gz | jq -r .id)
REV=$(galaxy-cli dataset upload inputs/reads_2.fastq.gz --history-id "$HID" --file-type fastqsanger.gz | jq -r .id)
dataset upload waits by default. Do not create collections from uploaded
datasets until the returned upload JSON reports state: "ok".
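One way to enforce the state check is to fold it into the ID extraction, so that a non-ok upload never yields an ID. This is a sketch only: id_if_ok is a hypothetical helper, and it relies on the upload JSON exposing top-level id and state keys as described above.

```shell
# Hypothetical helper: read upload JSON on stdin and print .id only when
# .state is "ok". With -e, jq exits non-zero when nothing is selected,
# so the caller can stop before building a collection.
id_if_ok() {
  jq -er 'select(.state == "ok") | .id'
}

# Usage sketch (same command as the upload recipe, with the helper swapped in):
# FWD=$(galaxy-cli dataset upload inputs/reads_1.fastq.gz --history-id "$HID" \
#         --file-type fastqsanger.gz | id_if_ok) || exit 1
```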
For large files, set an upload/request timeout explicitly:
DATASET=$(galaxy-cli dataset upload matrix.tsv --history-id "$HID" --file-type tabular --upload-timeout 7200 --timeout 7200 | jq -r .id)
--timeout is the upload job wait timeout and also the HTTP upload timeout
when --upload-timeout is not set. GALAXY_CLI_REQUEST_TIMEOUT controls
regular API request reads, and GALAXY_CLI_UPLOAD_TIMEOUT controls upload POSTs.
Create collections:
PAIR=$(galaxy-cli collection create "pair" --history-id "$HID" --collection-type paired --forward "$FWD" --reverse "$REV" | jq -r .id)
PAIR_ALT=$(galaxy-cli collection create "pair" --history-id "$HID" --collection-type paired -e forward="$FWD" -e reverse="$REV" | jq -r .id)
LIST_PAIR=$(galaxy-cli collection create "reads" --history-id "$HID" --collection-type list:paired -p "pair:$FWD:$REV" | jq -r .id)
LIST=$(galaxy-cli collection create "reports" --history-id "$HID" --collection-type list -e pair="$DATASET_ID" | jq -r .id)
collection create includes resolved element IDs in JSON mode. Save its output if the
next tool needs a nested collection element; do not call collection show unless
the create output is insufficient.
Run a tool:
cat > tool_inputs.json <<EOF
{
"input": "hda:$DATASET_ID"
}
EOF
galaxy-cli tool run "$TOOL_ID" --history-id "$HID" --inputs-json tool_inputs.json > tool_result.json
JOB=$(jq -r '.jobs[0].id' tool_result.json)
Search for tools with bounded output:
galaxy-cli tool search "fastqc" --limit 5
galaxy-cli tool search "machine learning" --limit 10 --cache
galaxy-cli tool search "machine learning" --limit 10 --refresh-cache
Default tool search output is limited and does not resolve every string-only
hit. Add --resolve only when the search result lacks enough detail.
Inspect a payload before submitting:
galaxy-cli tool run "$TOOL_ID" --history-id "$HID" --inputs-json tool_inputs.json --dry-run-payload
galaxy-cli tool run "$TOOL_ID" --history-id "$HID" --inputs-json tool_inputs.json --save-payload payload.json
galaxy-cli workflow run "$WF_ID" --history-id "$HID" -i 0="$DATASET_ID" --dry-run-payload
galaxy-cli workflow run "$WF_ID" --history-id "$HID" -i 0="$DATASET_ID" --save-payload workflow_payload.json
If dry-run returns invalid_request, fix the payload and rerun dry-run. Do not
submit the job or invocation until the dry-run payload validates.
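The dry-run gate can be scripted so that submission only happens after a clean dry run. The sketch below assumes that a rejected dry run includes the text invalid_request somewhere in its captured output; this file does not state whether that appears on stdout, on stderr, or as an exit code, so the usage comment captures both streams. dry_run_ok is a hypothetical helper.

```shell
# Hypothetical helper: succeed only if the captured dry-run output is
# non-empty and does not mention invalid_request.
# Assumption: a rejected payload is reported with that literal string.
dry_run_ok() {
  [ -s "$1" ] && ! grep -q 'invalid_request' "$1"
}

# Usage sketch:
# galaxy-cli tool run "$TOOL_ID" --history-id "$HID" --inputs-json tool_inputs.json \
#   --dry-run-payload > dry_run.json 2>&1
# dry_run_ok dry_run.json || { echo "fix tool_inputs.json and rerun dry-run" >&2; exit 1; }
```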
Check job and output states:
jq '{job:.jobs[0], wait_result, outputs}' tool_result.json
galaxy-cli job show "$JOB" --full
tool run waits by default. In JSON mode, the outputs array includes final
dataset or dataset-collection state/type/size metadata after wait. Do not call
job show --full, dataset show, or collection show for those outputs
unless a needed field is missing.
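Since the outputs array already carries final state, verification can stay local to tool_result.json. The following is a sketch under one assumption: each entry in outputs is taken to have id and state keys, which this file implies but does not spell out. Both function names are hypothetical.

```shell
# Assumption: every entry in .outputs has "id" and "state" keys.

# Succeed only when there is at least one output and all of them are "ok".
outputs_all_ok() {
  jq -e '(.outputs | length) > 0 and all(.outputs[]; .state == "ok")' "$1" > /dev/null
}

# Print "id<TAB>state" for each output, one per line.
list_outputs() {
  jq -r '.outputs[] | [.id, .state] | @tsv' "$1"
}

# Usage sketch:
# outputs_all_ok tool_result.json || list_outputs tool_result.json >&2
```

Only when this check fails is it worth spending a job show --logs call on the failed job.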
Preview wide datasets compactly:
galaxy-cli dataset peek "$DATASET_ID" --history-id "$HID" --lines 5 --max-fields 20 --max-chars-per-line 500
dataset peek returns compact lines plus per-row field_count and first
fields under rows, so broad expression matrices do not flood context.
Download outputs only when the task explicitly asks for local artifacts:
galaxy-cli dataset download "$DATASET_ID" results/output.dat
Input Encoding
- Dataset: hda:DATASET_ID
- Dataset collection: hdca:COLLECTION_ID
- For dataset or collection inputs nested inside conditionals or repeats, use the native JSON object form: {"src": "hda", "id": "DATASET_ID"} or {"src": "hdca", "id": "COLLECTION_ID"}.
- Flattened nested data keys are also normalized, for example library|input_1=hda:DATASET_ID and select_data|countsFile=hdca:COLLECTION_ID.
- Library dataset: ldda:DATASET_ID
- Boolean: true or false
- Conditional or repeat params: prefer nested JSON in --inputs-json.
- Optional repeat blocks with min: 0 can be omitted. If a repeat item is supplied, its required child inputs still need valid values.
- Flattened conditional paths use pipes when needed, for example single_paired|paired_input.
- Repeated and conditional inputs should mirror galaxy-cli tool show TOOL_ID.
- Current IUC MultiQC FastQC inputs use results -> software_cond -> output:
{
  "results": [
    {
      "software_cond": {
        "software": "fastqc",
        "output": [
          {
            "type": "data",
            "input": [
              {"src": "hda", "id": "FASTQC_RAW_DATA_1"},
              {"src": "hda", "id": "FASTQC_RAW_DATA_2"}
            ]
          }
        ]
      }
    }
  ]
}
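When building nested inputs like the one above from IDs held in shell variables, a small converter from the prefixed string form to the native object form can help. This is a sketch: to_src_object is a hypothetical helper, it covers only the hda and hdca object forms listed in this section, and it assumes IDs contain no quotes or backslashes.

```shell
# Hypothetical helper: turn "hda:ID" or "hdca:ID" into the native JSON
# object form used inside conditionals and repeats.
# Assumption: IDs are plain tokens that need no JSON escaping.
to_src_object() {
  case "$1" in
    hda:?*|hdca:?*)
      # ${1%%:*} is the prefix before the first colon, ${1#*:} is the rest.
      printf '{"src": "%s", "id": "%s"}\n' "${1%%:*}" "${1#*:}" ;;
    *)
      echo "unsupported reference: $1" >&2
      return 1 ;;
  esac
}

# Usage sketch:
# to_src_object "hda:$DATASET_ID"
```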
What To Read Next
Publish/import a completed history when a run needs a shareable result:
galaxy-cli history update "$HID" --published true --importable true
- For command syntax, run galaxy-cli <group> --help or galaxy-cli <group> <command> --help.
- For tool parameters, use the task's workflow/step_specs.json, workflow/required_step_params.json, and workflow/step_execution_hints.json.
- Only run galaxy-cli tool show TOOL_ID when those task files do not provide enough input names/options to build the submission JSON.
- Do not read package source code. The command help and task files are enough.