name: minicpm5-finetune-ms-swift
description: Fine-tune MiniCPM5-1B with ms-swift (ModelScope's SFT / DPO / KTO / ORPO toolkit). Use when the user mentions "ms-swift", "swift sft", "swift rlhf", or wants ModelScope-native training. The two mandatory flags --model_type llama --template chatml are baked in.
Fine-tune MiniCPM5-1B with ms-swift
ModelScope-native SFT / DPO / KTO / ORPO. ChatML template + standard llama model_type.
⚠️ ms-swift 4.x renamed --train_type → --tuner_type. Older tutorials still use --train_type lora, which on 4.x produces ValueError: remaining_argv: ['--train_type', 'lora']. Use --tuner_type lora (or just omit it; lora is the default in 4.x).
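For scripts that must run on both older and newer installs, the rename can be handled by picking the flag from the installed version. The sketch below is a minimal illustration; `tuner_flag_for_version` is a hypothetical helper name, and it assumes the version string starts with a numeric major component.

```shell
# Hypothetical helper: map an ms-swift version string to the matching flag name.
tuner_flag_for_version() {
  local major="${1%%.*}"            # "4.0.1" -> "4"
  # A non-numeric major (e.g. a dev build label) falls through to the else branch.
  if [ "${major}" -ge 4 ] 2>/dev/null; then
    echo "--tuner_type"
  else
    echo "--train_type"
  fi
}

# Possible usage (assumes pip metadata for ms-swift is available):
# SWIFT_VERSION="$(pip show ms-swift | awk '/^Version:/ {print $2}')"
# swift sft $(tuner_flag_for_version "${SWIFT_VERSION}") lora ...
```

A dev-branch install may report an unusual version string, so check the output of `pip show ms-swift` before relying on this.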
Required input
| Var | Example | Default |
|---|---|---|
| BASE_MODEL | openbmb/MiniCPM5-1B | required |
| DATA | path to messages-format jsonl | required |
| OUTPUT_DIR | ./runs/minicpm5_swift | required |
| GPU_ID | 0 | 0 |
Each line of DATA: {"messages": [{"role":"...","content":"..."}, ...]}.
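A quick shape check before training can catch records that are missing the `messages` key. The sketch below is a rough filter rather than a JSON parser: it only confirms that each non-blank line opens with `{"messages": [`. The helper name `check_messages_jsonl` is hypothetical.

```shell
# Hypothetical helper: print (with line numbers) every non-blank line that does
# not start with a {"messages": [ record. Shape check only, not JSON validation.
check_messages_jsonl() {
  local file="$1"
  [ -r "${file}" ] || { echo "cannot read ${file}" >&2; return 2; }
  # -v inverts the match, so grep selects lines that are neither blank nor well-shaped.
  if grep -nvE '^[[:space:]]*$|^[[:space:]]*\{[[:space:]]*"messages"[[:space:]]*:[[:space:]]*\[' "${file}"; then
    return 1    # at least one offending line was printed
  fi
  return 0
}

# Possible usage:
# check_messages_jsonl "${DATA}" || echo "fix the lines listed above"
```

This will not detect malformed JSON further along a line or invalid role values; a full parse is still worthwhile for untrusted data.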
Steps
1. Install (once)
# The commands below use --tuner_type, which needs 4.x; on 3.x the flag is --train_type.
pip install "ms-swift>=3.0"
# or for the dev branch:
pip install git+https://github.com/modelscope/ms-swift.git
2. Train (LoRA SFT)
CUDA_VISIBLE_DEVICES="${GPU_ID:-0}" swift sft \
--model "${BASE_MODEL}" \
--model_type llama \
--template chatml \
--tuner_type lora \
--dataset "${DATA}" \
--output_dir "${OUTPUT_DIR}" \
--num_train_epochs 2 \
--per_device_train_batch_size 4 \
--gradient_accumulation_steps 4 \
--learning_rate 2e-4 \
--lora_rank 16 --lora_alpha 32 --lora_dropout 0.05 \
--target_modules q_proj k_proj v_proj o_proj gate_proj up_proj down_proj \
--max_length 4096 \
--warmup_ratio 0.03 \
--bf16 true \
--logging_steps 10 \
--save_steps 200
🔑 --model_type llama AND --template chatml are MANDATORY. Without them ms-swift errors with Multiple possible types found: ['codefuse_codellama', 'llama', ...] because MiniCPM5's disk-level structure is shared with several Llama-family models and ms-swift refuses to guess.
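With the batch flags in the training command, each optimizer step consumes 4 × 4 = 16 samples on one GPU, so --save_steps 200 writes a checkpoint roughly every 3,200 samples. The sketch below estimates total steps for a run; `total_steps` is a hypothetical helper, the 10,000-sample dataset size is an assumed figure, and the trainer's own count may differ slightly in how it rounds a partial final batch.

```shell
# Hypothetical helper: approximate optimizer steps for a run.
# Args: num_samples per_device_batch grad_accum num_gpus epochs
total_steps() {
  local samples="$1" bs="$2" accum="$3" gpus="$4" epochs="$5"
  local effective=$(( bs * accum * gpus ))
  # Ceiling division: a partial final batch still counts as a step here.
  local per_epoch=$(( (samples + effective - 1) / effective ))
  echo $(( per_epoch * epochs ))
}

total_steps 10000 4 4 1 2    # prints 1250
```

If the estimate is below --save_steps, no intermediate checkpoint will be written before the run ends, which is worth knowing before starting a short job.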
3. Validate
Loss should decrease over the first few hundred steps:
{'loss': 4.52, 'token_acc': 0.26, 'epoch': 0.04}
{'loss': 3.57, 'token_acc': 0.35, 'epoch': 1.00}
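The check can also be scripted. The sketch below reads log lines in the format shown above from stdin and succeeds when the last reported loss is lower than the first; `loss_decreased` is a hypothetical helper, and where the log lines come from (for example a captured console log) is left to the caller.

```shell
# Hypothetical helper: exit 0 if the last 'loss' value on stdin is below the first.
loss_decreased() {
  # Pull the number that follows 'loss': on each line; lines without it are dropped.
  sed -n "s/.*'loss': \([0-9.]*\).*/\1/p" \
    | awk 'NR == 1 { first = $1 } { last = $1 } END { exit !(NR > 1 && last < first) }'
}

# Possible usage, assuming the console output was saved to train.log:
# loss_decreased < train.log && echo "loss is trending down"
```

Comparing only the endpoints is deliberately crude; a noisy run can pass or fail by chance, so treat it as a smoke test.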
Adapter is at ${OUTPUT_DIR}/v0-${TIMESTAMP}/checkpoint-${STEP}/.
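Because the timestamp and step are only known after the run, it can help to resolve the newest checkpoint path programmatically. The sketch below assumes the `v*/checkpoint-*` layout described above and a `sort` that supports `-V` (GNU coreutils); `latest_checkpoint` is a hypothetical helper.

```shell
# Hypothetical helper: print the newest checkpoint directory under an output dir.
latest_checkpoint() {
  local out_dir="$1"
  # sort -V orders checkpoint-200 before checkpoint-1000 (plain sort would not).
  ls -d "${out_dir}"/v*/checkpoint-* 2>/dev/null | sort -V | tail -n 1
}

# Possible usage:
# CKPT="$(latest_checkpoint "${OUTPUT_DIR}")"
```

The function prints nothing when no checkpoint exists, so callers should test for an empty result.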
Merge for serving
swift export \
--model "${BASE_MODEL}" \
--adapters "${OUTPUT_DIR}/v0-${TIMESTAMP}/checkpoint-${STEP}" \
--merge_lora true \
--output_dir ./minicpm5-swift-merged
The merged model is a regular LlamaForCausalLM and serves with any minicpm5-deploy-* skill.
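One way to confirm the merge produced a plain Llama checkpoint is to look for the architecture name in the exported config. The sketch below assumes the export directory contains a Hugging Face style `config.json`; `is_llama_checkpoint` is a hypothetical helper.

```shell
# Hypothetical helper: succeed if config.json in the given directory names
# LlamaForCausalLM. Assumes a Hugging Face style config.json is present.
is_llama_checkpoint() {
  local model_dir="$1"
  grep -q '"LlamaForCausalLM"' "${model_dir}/config.json" 2>/dev/null
}

# Possible usage:
# is_llama_checkpoint ./minicpm5-swift-merged || echo "unexpected architecture"
```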
Other trainers (DPO / KTO / ORPO / SimPO)
Same flag surface, just swap the trainer:
swift rlhf --rlhf_type dpo \
--model "${BASE_MODEL}" --model_type llama --template chatml \
--dataset preference.jsonl \
--output_dir "${OUTPUT_DIR}" ...
dpo / kto / orpo / simpo all work with the chatml template.
Multi-GPU
NPROC_PER_NODE=8 swift sft \
--model "${BASE_MODEL}" --model_type llama --template chatml \
--tuner_type lora --deepspeed default-zero2 \
...
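Adding GPUs multiplies the effective batch size, assuming NPROC_PER_NODE launches one data-parallel process per GPU. The sketch below computes the gradient accumulation needed to stay near a target effective batch; `accum_for_target` is a hypothetical helper, and the target of 16 is taken from the single-GPU settings in step 2.

```shell
# Hypothetical helper: gradient accumulation needed to reach a target effective batch.
# Args: target_effective_batch per_device_batch num_gpus
accum_for_target() {
  local target="$1" bs="$2" gpus="$3"
  local accum=$(( target / (bs * gpus) ))
  # Accumulation cannot drop below 1; the effective batch then exceeds the target.
  [ "${accum}" -ge 1 ] || accum=1
  echo "${accum}"
}

accum_for_target 16 4 1    # prints 4 (the single-GPU settings)
accum_for_target 16 4 8    # prints 1 (effective batch is then 4 * 1 * 8 = 32)
```

When the result bottoms out at 1, either lower --per_device_train_batch_size or accept the larger effective batch and revisit the learning rate.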
Common pitfalls
- Failed to automatically match model_type: add --model_type llama.
- Failed to automatically match template_type: add --template chatml.
- Conflict with LLaMA-Factory in the same env: LLaMA-Factory pins transformers==4.52, ms-swift wants the latest (currently transformers ≥5.6). Use separate venvs, or set PYTHONNOUSERSITE=1 to ignore user-site transformers.