yulan-mini

name: yulan-mini
description: Use YuLan-Mini LLM for text generation, including song lyrics, audio style prompts, and general chat completions. YuLan-Mini is a 2.4B parameter model served via llama.cpp (CPU or CUDA). Use this skill when users want to generate lyrics, create audio descriptions, or need text generation from the local LLM.
allowed-tools: Read, Write, Bash

YuLan-Mini Text Generation Skill

Use YuLan-Mini (2.4B) via llama.cpp for text generation. Always use the scripts/yulan-mini.sh script; do NOT call API endpoints directly.

Quick Start

# 1. cd to this skill's directory
cd {project_root}/{.claude or .codex}/skills/yulan-mini/

# 2. Check API service health
./scripts/yulan-mini.sh health

# 3. Generate song lyrics
./scripts/yulan-mini.sh lyrics "A melancholic ballad about leaving home"

# 4. Generate an audio style prompt for ACE-Step
./scripts/yulan-mini.sh prompt "Upbeat summer pop with acoustic guitar"

# 5. Free-form chat
./scripts/yulan-mini.sh chat "Explain the circle of fifths"

Capabilities

YuLan-Mini serves three text generation functions in the studio:

Command	Purpose	Use Case
`lyrics`	Generate song lyrics from a description	Songwriting workflow before ACE-Step generation
`prompt`	Generate audio style/caption descriptions	Create captions for ACE-Step music generation
`chat`	General-purpose chat completion	Any text generation task

Integration with ACE-Step Workflow

YuLan-Mini is the text brain behind the music generation pipeline:

1. User describes a song idea
2. YuLan-Mini generates lyrics (lyrics command)
3. YuLan-Mini generates an audio caption (prompt command)
4. Both are fed to ACE-Step via the acestep skill for music generation
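
The steps above can be chained in a small wrapper. This is a minimal sketch, not part of the skill: it assumes the script prints the generated text on stdout and exits non-zero on failure, neither of which is confirmed here. The names `generate_song_inputs` and `YULAN_CMD`, and the output file names, are hypothetical.

```shell
# Sketch only. Assumes yulan-mini.sh writes generated text to stdout
# and returns a non-zero status on failure (unverified assumptions).
YULAN_CMD="${YULAN_CMD:-./scripts/yulan-mini.sh}"

# generate_song_inputs IDEA [OUT_DIR]
# Writes lyrics.txt and caption.txt (hypothetical names) for one song idea.
generate_song_inputs() {
  local idea="$1"
  local out_dir="${2:-.}"
  mkdir -p "$out_dir" || return 1
  # Lyrics from the user's description
  "$YULAN_CMD" lyrics "$idea" > "$out_dir/lyrics.txt" || return 1
  # Audio caption for ACE-Step from the same description
  "$YULAN_CMD" prompt "$idea" > "$out_dir/caption.txt" || return 1
  # Handing both files to the acestep skill is out of scope here.
}
```

For example, `generate_song_inputs "A melancholic ballad about leaving home" ./out` would leave both text files in `./out`, ready to pass to the acestep skill.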

Script Commands

# Change to this skill's directory first
cd {project_root}/{.claude or .codex}/skills/yulan-mini/

# Generate lyrics from a description
./scripts/yulan-mini.sh lyrics "A punk rock anthem about fighting conformity"
./scripts/yulan-mini.sh lyrics -d "Jazz ballad about rainy nights" --max-tokens 800

# Generate audio style prompt
./scripts/yulan-mini.sh prompt "Dark electronic song with heavy bass"
./scripts/yulan-mini.sh prompt -d "Acoustic folk love song" --temperature 0.7

# Free-form chat completion
./scripts/yulan-mini.sh chat "What chord progression works for a sad ballad?"
./scripts/yulan-mini.sh chat -u "Suggest a song structure for a 3-minute pop song" --max-tokens 400

# Override the system prompt
./scripts/yulan-mini.sh lyrics "Spring morning" --system "You are a haiku poet. Write lyrics as a series of haiku."

# Health and info
./scripts/yulan-mini.sh health
./scripts/yulan-mini.sh models

Generation Options

Option	Default	Description
`-d`, `--description`	—	Input text (lyrics/prompt commands)
`-u`, `--user`	—	User message (chat command)
`-s`, `--system`	per-command default	Override system prompt
`--max-tokens`	`600`	Max tokens to generate
`--temperature`	`0.85`	Sampling temperature (0.0–2.0)
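
Because `--temperature` is documented with a range of 0.0 to 2.0, a caller may want to check a value before passing it on. The helper below is a hypothetical sketch (`valid_temperature` is not part of the script), and whether the script itself rejects out-of-range values is not stated here. It relies on `awk` for the floating-point comparison.

```shell
# valid_temperature VALUE
# Returns 0 if VALUE is a plain decimal number in [0.0, 2.0], else 1.
# Hypothetical helper; not part of yulan-mini.sh.
valid_temperature() {
  case "$1" in
    # Reject empty input, a lone dot, any non-numeric character,
    # and values with more than one decimal point.
    ''|.|*[!0-9.]*|*.*.*) return 1 ;;
  esac
  # The shell has no floating-point arithmetic, so compare in awk.
  awk -v t="$1" 'BEGIN { exit !(t + 0 >= 0.0 && t + 0 <= 2.0) }'
}
```

A typical use would be `valid_temperature "$temp" && ./scripts/yulan-mini.sh lyrics -d "Jazz ballad about rainy nights" --temperature "$temp"`.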

Configuration

Important: Configuration follows this priority (high to low):

Command line arguments > config.json defaults
User-specified parameters temporarily override defaults but do not modify config.json
Only the config --set command permanently modifies config.json
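
The precedence rule can be illustrated with a few lines of shell. This is a sketch of the rule only, not the script's actual code, and `resolve_option` is a hypothetical name.

```shell
# resolve_option CLI_VALUE CONFIG_DEFAULT
# Prints the command line value when one was given,
# otherwise the config.json default. Nothing is written back to config.json.
resolve_option() {
  if [ -n "$1" ]; then
    printf '%s\n' "$1"
  else
    printf '%s\n' "$2"
  fi
}

# No --max-tokens flag given, so the config default applies.
resolve_option "" 600     # prints 600
# --max-tokens 800 given, overriding the default for this call only.
resolve_option 800 600    # prints 800
```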

Default Config File (`scripts/config.json`)

{
  "api_url": "http://127.0.0.1:8003",
  "api_key": "",
  "model": "diskrot/YuLan-Mini-diskrot",
  "generation": {
    "max_tokens": 600,
    "temperature": 0.85
  },
  "prompts": {
    "default": "You are YuLan-Mini, a helpful assistant...",
    "lyrics": "You are a creative lyricist...",
    "audio_style": "You are a music producer..."
  }
}

Option	Default	Description
`api_url`	`http://127.0.0.1:8003`	llama-server address
`api_key`	`""`	API authentication key (optional for local)
`model`	`diskrot/YuLan-Mini-diskrot`	Model identifier
`generation.max_tokens`	`600`	Default max tokens
`generation.temperature`	`0.85`	Default sampling temperature
`prompts.lyrics`	(lyrics system prompt)	System prompt for lyrics generation
`prompts.audio_style`	(audio style prompt)	System prompt for audio prompt generation
`prompts.default`	(general prompt)	System prompt for chat completions

Config Commands

# View config (API key masked)
./scripts/yulan-mini.sh config --list

# Get a specific value
./scripts/yulan-mini.sh config --get api_url
./scripts/yulan-mini.sh config --get generation.max_tokens

# Set a value
./scripts/yulan-mini.sh config --set api_url "http://remote-server:8003"
./scripts/yulan-mini.sh config --set generation.max_tokens 1024

# Check if API key is configured (safe — never exposes key)
./scripts/yulan-mini.sh config --check-key

API Key Handling: To check whether an API key is configured, use config --check-key, which reports only configured or empty and never prints the actual key. NEVER use config --get api_key or read config.json directly; either would expose the user's API key. The config --list command is safe because it automatically masks API keys as *** in output.
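
A wrapper can branch on the key status without ever touching the key. The sketch below assumes the output of `config --check-key` contains the word `configured` or `empty`; the exact output format is not specified here, so treat the pattern matching as an assumption. `api_key_is_set` and `YULAN_CMD` are hypothetical names.

```shell
YULAN_CMD="${YULAN_CMD:-./scripts/yulan-mini.sh}"

# api_key_is_set
# Returns 0 if a key is configured, 1 if it is empty,
# and 2 if the output is not recognized. Never prints the key.
api_key_is_set() {
  local status
  # Assumption: the command reports "configured" or "empty" on stdout.
  status=$("$YULAN_CMD" config --check-key 2>/dev/null) || true
  case "$status" in
    *empty*)      return 1 ;;
    *configured*) return 0 ;;
    *)            return 2 ;;
  esac
}
```

For example, `api_key_is_set || echo "No API key set (fine for a local server)"`.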

Prerequisites — YuLan-Mini Service

IMPORTANT: This skill requires the YuLan-Mini llama-server service to be running.

Required Dependencies

The scripts/yulan-mini.sh script requires curl and jq.

Before First Use

You MUST check the API health before proceeding. Run:

cd "{project_root}/{.claude or .codex}/skills/yulan-mini/" && bash ./scripts/yulan-mini.sh health

If health check succeeds

Proceed with text generation.

If health check fails

Start the service with docker compose up -d yulan and wait for it to become healthy. The first startup downloads the GGUF model (~2.5 GB) and may take several minutes. Subsequent starts are fast because the model is cached in the data/yulan-gguf volume.

The service runs in one of two variants, selected with YULAN_VARIANT:

Mac (Apple Silicon): CPU mode via YULAN_VARIANT=cpu (default)
NVIDIA GPU: CUDA mode via YULAN_VARIANT=cuda + GPU overlay
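
Waiting for the service to become healthy can be scripted as a polling loop. This is a sketch under one assumption that is not confirmed here: that the `health` command exits with a non-zero status while the service is unavailable. `wait_for_yulan` and `YULAN_CMD` are hypothetical names, and the default of 30 attempts at 10-second intervals is an arbitrary choice meant to cover a first-time model download.

```shell
YULAN_CMD="${YULAN_CMD:-./scripts/yulan-mini.sh}"

# wait_for_yulan [MAX_ATTEMPTS] [DELAY_SECONDS]
# Polls the health command until it succeeds or the attempts run out.
# Returns 0 once healthy, 1 on timeout.
wait_for_yulan() {
  local max_attempts="${1:-30}"
  local delay="${2:-10}"
  local attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # Assumption: a non-zero exit status means "not healthy yet".
    if "$YULAN_CMD" health >/dev/null 2>&1; then
      return 0
    fi
    # Sleep only if another attempt remains.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}
```

After starting the service as described above, `wait_for_yulan 30 10 && echo "YuLan-Mini is ready"` would block until the health check passes or roughly five minutes have elapsed.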

Model Details

Property	Value
Model	`diskrot/YuLan-Mini-GGUF-diskrot` (GGUF quantized YuLan-Mini)
Parameters	2.4B
Context Length	4096 tokens
Format	GGUF Q8_0
Serving	llama.cpp (llama-server) — OpenAI-compatible API
Port	8003
GPU Required	No (CPU default, optional CUDA acceleration)

API Compatibility

YuLan-Mini is served via llama-server and exposes an OpenAI-compatible API:

Endpoint	Description
`GET /health`	Health check
`GET /v1/models`	List available models
`POST /v1/chat/completions`	Chat completion (used by all commands)

Tips

Lyrics quality: Be specific in your description. "A melancholic piano ballad about leaving your childhood home in autumn" produces better results than "a sad song."
Temperature tuning: Lower temperature (0.5–0.7) for more focused/predictable output, higher (0.8–1.0) for more creative/varied output.
Max tokens: For full song lyrics, consider increasing to 800–1024. The default 600 works well for shorter pieces.
Custom system prompts: Override the system prompt with --system for specialized tasks like writing in a specific language, style, or format.
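
The temperature and max-token tips can be captured as presets. The helper below is a hypothetical sketch: `tip_flags` is not part of the script, and the specific values (0.6, 0.9, 1024) are arbitrary picks from within the ranges suggested above.

```shell
# tip_flags STYLE LENGTH
# Prints option flags for yulan-mini.sh based on the tips above.
# STYLE: focused | creative (anything else uses the default 0.85)
# LENGTH: full (anything else uses the default 600)
tip_flags() {
  local temperature max_tokens
  case "$1" in
    focused)  temperature=0.6 ;;    # within the 0.5-0.7 range
    creative) temperature=0.9 ;;    # within the 0.8-1.0 range
    *)        temperature=0.85 ;;   # documented default
  esac
  case "$2" in
    full) max_tokens=1024 ;;        # upper end of 800-1024 for full songs
    *)    max_tokens=600 ;;         # documented default
  esac
  printf '%s %s %s %s\n' --temperature "$temperature" --max-tokens "$max_tokens"
}
```

The output is meant to be word-split by the shell, for example `./scripts/yulan-mini.sh lyrics -d "Acoustic folk love song" $(tip_flags focused full)`.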