name: nvflare-convert-pytorch
description: "Convert existing PyTorch training code into an NVFLARE federated job using Client API model exchange, local validation, and job export; do not use for other frameworks or deployment-only tasks."
min_flare_version: "2.8.0"
blast_radius: edits_files
skill_version: "0.1.0"
NVFLARE Convert PyTorch
Use When
Use when the user asks to convert an existing PyTorch training script,
torch.nn.Module, state_dict workflow, data loader, checkpoint, or metric
loop into an NVFLARE federated training job.
Do Not Use When
Do not use for PyTorch Lightning, Hugging Face Trainer, TensorFlow, XGBoost, scikit-learn, Kubernetes deployment, production submission, or generic PyTorch debugging that does not ask for FLARE conversion.
Workflow
- Before Python import/inspect commands, install applicable source `requirements*.txt` files in the active `nvflare` environment. Use `uv pip` when available; see the shared lifecycle for interpreter selection and avoid `uv pip install --system` with virtual environments.
- Follow the shared conversion workflow contract in `../_shared/nvflare-job-lifecycle.md`.
- Identify the PyTorch model definition, required `nn.Module.__init__` arguments, training loop, data loading, metrics, and checkpoint behavior. Determine the concrete constructor values that server and client models must share before creating `job.py`.
- Run `nvflare recipe list --framework pytorch --format json` and select the recipe from the requested FL workflow, not from PyTorch alone. Use FedAvg only for standard horizontal model-parameter aggregation. For standard FedAvg, use the portable fast path in `references/recipe-selection.md`; do not add per-site recipe config unless the sites actually need different training scripts, arguments, or launch settings.
- Convert training exchange to the FLARE Client API: initialize FLARE, receive an `FLModel`, load `params` into the PyTorch model, train or evaluate, and send an `FLModel` with updated `params`, metrics, and useful metadata.
- Add or update a `job.py` that uses the selected PyTorch recipe or job API path. Follow the shared lifecycle for generated layout, validation, export, runtime locations, and evidence reporting.
- Validate and export through the shared lifecycle. Use `references/job-validation.md` for PyTorch-specific checks before calling the conversion complete.
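The Client API exchange step in the workflow above can be sketched roughly as follows. This is a minimal illustration, not the exact send pattern from `references/pytorch-client-api-conversion.md`. It assumes `flare` is the `nvflare.client` module (normally `import nvflare.client as flare`) and that `model` behaves like a `torch.nn.Module`. The function name `run_client_rounds` and the `train_one_round` hook are hypothetical names introduced only for this sketch; both objects are passed in as arguments so the loop can be exercised without a running FLARE job.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical hook type: trains the model in place, returns (metrics, meta).
TrainFn = Callable[[Any], Tuple[Dict[str, float], Dict[str, Any]]]


def run_client_rounds(flare: Any, model: Any, train_one_round: TrainFn) -> int:
    """Receive global params, train locally, and send the update each round.

    `flare` is assumed to expose init(), is_running(), receive(), send(), and
    FLModel, as the nvflare.client module does. `model` is assumed to expose
    load_state_dict() and state_dict(), as a torch.nn.Module does.
    """
    flare.init()                                    # initialize FLARE
    rounds = 0
    while flare.is_running():
        input_model = flare.receive()               # FLModel from the server
        model.load_state_dict(input_model.params)   # load global params

        # Local training (or evaluation) supplied by the converted script.
        metrics, meta = train_one_round(model)

        output_model = flare.FLModel(
            params=model.state_dict(),              # values are passed through unconverted
            metrics=metrics,
            meta=meta,
        )
        flare.send(output_model)                    # return the update to the server
        rounds += 1
    return rounds
```

In a converted training script, the existing epoch loop would move inside `train_one_round`, and the script's own checkpoint loading would be replaced by `load_state_dict(input_model.params)`.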
Requirements
- Must audit model constructor arguments before writing `job.py`. If the model has required non-default `__init__` parameters, generate explicit recipe model config with `path` or `class_path` and `args`, then verify recipe construction and export preserve those arguments.
- Must follow the shared job lifecycle guidance for validation evidence, including final/best metrics, round/per-site metrics, and artifact paths when those artifacts are present.
- Must keep outbound PyTorch model weights as `torch.Tensor` values in `FLModel(params=...)` when using `PTInProcessClientAPIExecutor`; load `references/pytorch-client-api-conversion.md` for the exact send pattern.
- Must not require `rg` to be installed; the shared lifecycle defines fallback search options.
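The constructor audit in the first requirement could be approached along these lines. This is a sketch using only the standard library's `inspect` module. The helper names `required_init_args` and `build_model_config`, the stand-in class `TinyNet`, and the module path `model.TinyNet` are hypothetical. The returned dict uses the `class_path` and `args` keys named above, but the exact shape a given recipe expects is an assumption here and should be checked against the selected recipe.

```python
import inspect
from typing import Any, Dict, List


def required_init_args(model_cls: type) -> List[str]:
    """Return the names of __init__ parameters that have no default value."""
    required = []
    for name, param in inspect.signature(model_cls.__init__).parameters.items():
        if name == "self":
            continue
        if param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
            continue  # *args / **kwargs are never strictly required
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return required


def build_model_config(model_cls: type, class_path: str, args: Dict[str, Any]) -> Dict[str, Any]:
    """Build a {class_path, args} config, failing early on missing values."""
    missing = [n for n in required_init_args(model_cls) if n not in args]
    if missing:
        raise ValueError(f"missing constructor values for {class_path}: {missing}")
    model_cls(**args)  # construct once to confirm the values are accepted
    return {"class_path": class_path, "args": dict(args)}


class TinyNet:
    """Stand-in for a torch.nn.Module subclass with required arguments."""

    def __init__(self, in_features: int, num_classes: int, dropout: float = 0.1):
        self.in_features = in_features
        self.num_classes = num_classes
        self.dropout = dropout


config = build_model_config(
    TinyNet, "model.TinyNet", {"in_features": 8, "num_classes": 3}
)
```

Running the same check against the exported job would then help confirm that the arguments survive recipe construction and export, since the server and client models must share them.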
Agent Responsibilities
- Run project inspection and recipe discovery before selecting a recipe.
- Explain the selected recipe when the user's algorithm intent is ambiguous.
- Convert PyTorch Client API model exchange and generate or update `job.py`.
- Keep PyTorch conversion choices, validation blockers, recipe comparisons, and data-prep decisions within this skill, its references, and the shared lifecycle guidance.
- Report PyTorch-specific blockers such as non-`state_dict` model state, incompatible checkpoint loading, unsupported metric serialization, or data loaders that cannot be parameterized per site.
User Input And Approval
- Ask the user to clarify FL workflow intent when recipe selection is uncertain.
- Follow the shared lifecycle approval boundary for data-path changes, non-fixture validation data, POC, production, and startup-kit based runtime submission.
Load ../_shared/nvflare-job-lifecycle.md for every conversion. Load the
smallest PyTorch-specific reference needed for the current phase:
references/recipe-selection.md before selecting or constructing a recipe,
references/pytorch-client-api-conversion.md when converting training code to
Client API model exchange, and references/job-validation.md before validation,
export, or debugging PyTorch-specific validation failures. Do not load every
reference preemptively, and do not depend on NVFLARE repository examples being
present in the user's environment.