name: unsloth
description: Efficient LLM fine-tuning with 70% VRAM reduction and 2x speedup. Essential for training Irish language models on consumer hardware.
Unsloth
Version: >=2024.12 | Last Updated: 2025-04
Overview
Unsloth enables efficient fine-tuning of large language models with dramatically reduced memory requirements, making it possible to train Irish language models on consumer GPUs.
| Feature | Description |
|---|---|
| VRAM Reduction | Up to 70% less memory usage (figure reported by the Unsloth project) |
| Speed | Up to 2x faster training (figure reported by the Unsloth project) |
| 4-bit Training | QLoRA with optimizations |
| GGUF Export | Edge deployment ready |
| Multilingual Support | Enhanced training for multilingual models |
| Flash Attention | Optimized attention mechanism |
When to Use This Skill
Activate when users need:
- "Fine-tune Irish language model"
- "Train model with limited GPU memory"
- "Export model to GGUF for Ollama"
- "Reduce training time for LLM"
- "Run SFT/DPO training efficiently"
Project Integration
Research References (taighde/)
| Directory | Relevant Documents |
|---|---|
| taighde_sruth/meaisinfhoghlaim/ | Training guides, LoRA configs |
| taighde_teanga/ | Irish NLP datasets |
ML Model Dependencies (sruth/meaisinfhoghlaim/)
| Model | Usage |
|---|---|
| UCCIX-Llama2-13B | Irish text generation |
| Llama-3.2-3B | Base for Irish fine-tuning |
| Qwen2.5-Math-7B | Math reasoning |
Core Concepts
1. FastLanguageModel Setup
from unsloth import FastLanguageModel
import torch

# Load model with 4-bit quantization
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-3B-Instruct",
    max_seq_length=2048,
    dtype=None,         # Auto-detect
    load_in_4bit=True,  # 70% VRAM reduction
)
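To see where much of the saving from `load_in_4bit=True` comes from, the following back-of-the-envelope sketch compares weight storage at 16-bit and 4-bit precision. The 3.2 billion parameter count is an approximation for a 3B-class model and the function name is illustrative. The estimate ignores quantization constants, activations, LoRA weights, and optimizer state, so real VRAM usage will differ.

```python
def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Approximate memory needed to store the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


NUM_PARAMS = 3.2e9  # assumption: rough parameter count for a 3B-class model

fp16_gib = weight_memory_gib(NUM_PARAMS, 16)
nf4_gib = weight_memory_gib(NUM_PARAMS, 4)
saving = 1 - nf4_gib / fp16_gib  # fraction of weight memory saved

print(f"16-bit weights: {fp16_gib:.2f} GiB")
print(f"4-bit weights:  {nf4_gib:.2f} GiB")
print(f"Weight-only saving: {saving:.0%}")
```

The weight-only saving is only part of the picture; the overall figure quoted for Unsloth also depends on gradient checkpointing and its optimized kernels.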
2. LoRA Configuration
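# Add LoRA adapters
model = FastLanguageModel.get_peft_model(
    model,
    r=64,               # Rank: higher = more capacity, more trainable parameters
    lora_alpha=128,     # Scaling factor (effective scale is lora_alpha / r)
    lora_dropout=0.05,  # Any value works; Unsloth's notebooks note that 0 is the optimized setting
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
    ],
    bias="none",
    use_gradient_checkpointing="unsloth",  # Essential for long contexts
    random_state=42,
)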
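The rank and alpha settings above can be made concrete with a little arithmetic. The sketch below counts the trainable parameters LoRA adds for the listed target modules and computes the `lora_alpha / r` scale. The layer shapes are assumed values for Llama-3.2-3B and should be checked against the model's `config.json`; the helper name is illustrative.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """LoRA adds A (r x d_in) and B (d_out x r) for each adapted linear layer."""
    return r * (d_in + d_out)


# Assumed Llama-3.2-3B shapes; verify against the model's config.json.
HIDDEN, KV_DIM, INTERMEDIATE, NUM_LAYERS = 3072, 1024, 8192, 28
R, ALPHA = 64, 128

# (input dim, output dim) for each module named in target_modules.
MODULE_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

per_layer = sum(lora_param_count(i, o, R) for i, o in MODULE_SHAPES.values())
total = per_layer * NUM_LAYERS
scaling = ALPHA / R  # multiplier applied to the low-rank update

print(f"Trainable LoRA parameters: {total:,}")
print(f"LoRA scaling (alpha / r): {scaling}")
```

Halving `r` halves the adapter size; keeping `lora_alpha` at twice `r` keeps the scale at 2.0, which is a common convention rather than a requirement.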
3. SFT Training
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

# Irish curriculum dataset (load the train split so the trainer gets a Dataset, not a DatasetDict)
dataset = load_dataset("cianfhoghlaim/irish-curriculum-qa", split="train")

# Note: recent TRL releases may expect dataset_text_field and max_seq_length
# on SFTConfig rather than as SFTTrainer arguments.
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,  # effective batch size of 8 per device
        warmup_steps=10,
        num_train_epochs=3,
        learning_rate=2e-4,
        fp16=not torch.cuda.is_bf16_supported(),
        bf16=torch.cuda.is_bf16_supported(),
        logging_steps=10,
        output_dir="outputs",
        optim="adamw_8bit",
        seed=42,
    ),
)
trainer.train()
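The batch size, accumulation, and epoch settings above determine how many optimizer steps a run takes, which in turn shows how short a 10-step warmup is. The following is a rough sketch of that arithmetic; the dataset size of 10,000 examples is hypothetical, and the exact step count reported by the trainer can differ slightly depending on how the library version rounds partial batches.

```python
import math


def training_schedule(num_examples, per_device_batch_size, grad_accum_steps,
                      num_epochs, num_devices=1):
    """Return (effective batch size, approximate total optimizer steps)."""
    effective_batch = per_device_batch_size * grad_accum_steps * num_devices
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return effective_batch, steps_per_epoch * num_epochs


# Hypothetical dataset size; the other values mirror the settings above.
effective_batch, total_steps = training_schedule(
    num_examples=10_000,
    per_device_batch_size=2,
    grad_accum_steps=4,
    num_epochs=3,
)
warmup_fraction = 10 / total_steps

print(f"Effective batch size: {effective_batch}")
print(f"Approximate optimizer steps: {total_steps}")
print(f"Warmup share of training: {warmup_fraction:.2%}")
```

For much smaller datasets the same 10 warmup steps cover a larger share of training, so it can be worth recomputing this before each run.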
4. GGUF Export
# Save to GGUF for Ollama/llama.cpp
model.save_pretrained_gguf(
    "irish-llama-3b",
    tokenizer,
    quantization_method="q4_k_m",  # Good quality/size balance
)

# Alternative quantization methods:
#   q8_0   - 8-bit, highest quality
#   q4_k_m - 4-bit, recommended
#   q4_0   - 4-bit, fastest
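To compare the quantization methods listed above by size, the sketch below estimates the exported file size from approximate bits per weight. The `q4_0` and `q8_0` figures follow from their block layout (32 weights plus one 16-bit scale); the `q4_k_m` figure is a rough average, since that scheme mixes quantization types across tensors. The parameter count and helper name are assumptions, and real files also contain metadata and some tensors kept at higher precision.

```python
# Approximate bits per weight for each method (q4_k_m is a rough average).
BITS_PER_WEIGHT = {"q4_0": 4.5, "q4_k_m": 4.85, "q8_0": 8.5}


def estimate_gguf_size_gb(num_params: float, method: str) -> float:
    """Rough GGUF file size in GB (10^9 bytes) for a given quantization method."""
    return num_params * BITS_PER_WEIGHT[method] / 8 / 1e9


NUM_PARAMS = 3.2e9  # assumption: rough parameter count for a 3B-class model

for method in ("q4_0", "q4_k_m", "q8_0"):
    size = estimate_gguf_size_gb(NUM_PARAMS, method)
    print(f"{method:>7}: about {size:.2f} GB")
```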
Cianfhoghlaim-Specific Usage
Irish Language Fine-Tuning
from datasets import load_dataset

# Format one Q&A pair with the Llama 3 chat template.
# System prompt (Irish): "You are an Irish-language education assistant. Answer in Irish."
def format_irish_qa(example):
    return {
        "text": (
            "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
            "Is cúntóir oideachais Gaeilge tú. Freagraigh i nGaeilge.<|eot_id|>"
            "<|start_header_id|>user<|end_header_id|>\n\n"
            f"{example['question']}<|eot_id|>"
            "<|start_header_id|>assistant<|end_header_id|>\n\n"
            f"{example['answer']}<|eot_id|>"
        )
    }

# Load Irish curriculum Q&A dataset and add the formatted "text" column
dataset = load_dataset("cianfhoghlaim/irish-qa").map(format_irish_qa)

# Train with Irish-specific settings
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset["train"],
    args=TrainingArguments(
        learning_rate=5e-5,   # Lower LR for Irish
        num_train_epochs=5,   # More epochs for low-resource
        per_device_train_batch_size=4,
        gradient_accumulation_steps=4,
        output_dir="outputs",
    ),
)
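Because a malformed chat template is easy to miss and hard to debug after training, it can help to check a few formatted samples first. The following is a minimal sketch of such a check for the three-turn Llama 3 layout used above. The function name and the sample question and answer are hypothetical, and the check covers structure only, not content.

```python
import re

HEADER = re.compile(r"<\|start_header_id\|>(\w+)<\|end_header_id\|>")


def check_llama3_sample(text: str) -> list[str]:
    """Return a list of structural problems found in one formatted sample."""
    problems = []
    if not text.startswith("<|begin_of_text|>"):
        problems.append("missing <|begin_of_text|> at start")
    roles = HEADER.findall(text)
    if roles != ["system", "user", "assistant"]:
        problems.append(f"unexpected role order: {roles}")
    if text.count("<|eot_id|>") != len(roles):
        problems.append("each turn should end with exactly one <|eot_id|>")
    if not text.endswith("<|eot_id|>"):
        problems.append("sample should end with <|eot_id|>")
    return problems


# Hypothetical sample in the same shape as the formatted dataset rows.
sample = (
    "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
    "Is cúntóir oideachais Gaeilge tú. Freagraigh i nGaeilge.<|eot_id|>"
    "<|start_header_id|>user<|end_header_id|>\n\n"
    "Cad is ainm duit?<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
    "Is cúntóir Gaeilge mé.<|eot_id|>"
)

print(check_llama3_sample(sample) or "sample looks well-formed")
```

Running this over the first few rows of the mapped dataset before calling `trainer.train()` is a cheap way to catch a missing token or a swapped role.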
MLflow Integration
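import mlflow

mlflow.set_tracking_uri("http://mlflow.cianfhoghlaim.ie")
mlflow.set_experiment("irish-llm-training")

with mlflow.start_run():
    mlflow.log_params({
        "model": "Llama-3.2-3B",
        "rank": 64,
        "alpha": 128,
        "dataset": "irish-curriculum-qa"
    })
    # trainer.train() returns a TrainOutput whose training_loss is the mean loss
    # over the run. The last log_history entry is the end-of-training summary,
    # which has no "loss" key.
    train_result = trainer.train()
    mlflow.log_metrics({
        "train_loss": train_result.training_loss,
        "epochs": 3
    })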
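The trainer's `log_history` mixes per-step entries, which carry a `"loss"` key, with a final summary entry, which typically carries `"train_loss"` instead. If the most recent per-step loss is what should be logged, a small helper along these lines can pick it out. The helper name and the example history are hypothetical; the dictionaries only imitate the shape of real log entries.

```python
def last_step_loss(log_history):
    """Return the most recent per-step training loss, or None if none was logged."""
    for entry in reversed(log_history):
        if "loss" in entry:
            return entry["loss"]
    return None


# Hypothetical history, shaped like trainer.state.log_history after a run.
example_history = [
    {"loss": 1.92, "learning_rate": 2e-4, "epoch": 0.5, "step": 10},
    {"loss": 1.41, "learning_rate": 1e-4, "epoch": 1.0, "step": 20},
    {"train_runtime": 120.0, "train_loss": 1.66, "epoch": 1.0, "step": 20},
]

print(last_step_loss(example_history))
```

With a real run, `last_step_loss(trainer.state.log_history)` can be passed to `mlflow.log_metrics` alongside the run-level average.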
Best Practices
- Use gradient checkpointing - Enables longer contexts
- Start with lower learning rate for Irish - Low-resource language
- More epochs for Irish - Compensate for limited data
- Export to GGUF - Deploy with Ollama for inference
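For the Ollama deployment step, Ollama builds a model from a Modelfile that points at the GGUF file. The sketch below renders a minimal one as a string. The GGUF path is an assumption (the exact filename written by the export depends on the Unsloth version), and the temperature value and helper name are illustrative.

```python
def render_modelfile(gguf_path: str, system_prompt: str,
                     temperature: float = 0.7) -> str:
    """Build the text of a minimal Ollama Modelfile."""
    return (
        f"FROM {gguf_path}\n"
        f"PARAMETER temperature {temperature}\n"
        f'SYSTEM """{system_prompt}"""\n'
    )


modelfile = render_modelfile(
    "./irish-llama-3b/unsloth.Q4_K_M.gguf",  # assumed path; check the export directory
    "Is cúntóir oideachais Gaeilge tú. Freagraigh i nGaeilge.",
)
print(modelfile)
```

After saving the output as `Modelfile`, the model can be registered with `ollama create irish-llama-3b -f Modelfile` and tried with `ollama run irish-llama-3b`.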
Related tools (KCG canonical)
Unsloth is the KCG canonical wrapper for PEFT. The full fine-tuning stack is:
- .agents/skills/peft/SKILL.md: LoRA / QLoRA / IA³ configuration (the base technology Unsloth wraps)
- .agents/skills/trl/SKILL.md: SFTTrainer, DPOTrainer, GRPOTrainer (the alignment layer; uses RAGAS scores as preference signals)
- .agents/skills/ragas/SKILL.md: RAGAS scoring (the source of DPO preference signals)
- .agents/skills/modal/SKILL.md: Modal H100 burst training for 13B+ models
Resources
- Documentation: https://github.com/unslothai/unsloth
- HuggingFace Skills: hf-llm-trainer for full training guide
- MLflow Tracking: mlflow skill for experiment logging
- Related Skills: peft, trl, ragas, modal, huggingface, mlflow, litellm