setup-guide

star 147

Use when the user wants to set up synthetic data generation for the first time, or when sdg_hub is not yet installed/configured in the current environment.

By Red-Hat-AI-Innovation-Team · Updated 6/4/2026

name: setup-guide
description: "Use when the user wants to set up synthetic data generation for the first time, or when sdg_hub is not yet installed/configured in the current environment."
allowed-tools: ["Bash(${CLAUDE_PLUGIN_ROOT}/scripts/sdg_detect.sh:)", "Bash(${CLAUDE_PLUGIN_ROOT}/scripts/sdg_flows.sh:)"]

sdg_hub Setup Guide

You are helping the user set up synthetic data generation.

Step 1: Detect Environment

"${CLAUDE_PLUGIN_ROOT}/scripts/sdg_detect.sh"

Step 2: Install if Needed

If the detection output from Step 1 reports library=missing:

  • Explain: "sdg_hub is a framework for synthetic data generation — it uses composable blocks and YAML-defined flows to build LLM training datasets from seed data."
  • Ask permission: "I can install it for you. Want me to proceed?"
  • If yes and installer=uv: run uv pip install sdg_hub
  • If yes and installer=pip: run pip install sdg_hub
  • If installer=none: tell the user they need Python and pip/uv installed first
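The branching above can be sketched as a small shell function. This is only a sketch: it assumes the detection step yields values such as library=missing and installer=uv, the helper name install_command is hypothetical, and it prints the command instead of running it so that permission can still be requested first.

```shell
# Hypothetical helper: map detection results to an install command.
# Assumes values like library=missing and installer=uv|pip|none.
install_command() {
  if [ "$1" != "missing" ]; then
    return 0   # library already present: nothing to install
  fi
  case "$2" in
    uv)  echo "uv pip install sdg_hub" ;;
    pip) echo "pip install sdg_hub" ;;
    *)   echo "Python and pip or uv must be installed first" >&2
         return 1 ;;
  esac
}

install_command missing uv   # prints: uv pip install sdg_hub
```

Printing rather than executing keeps the "ask permission" bullet as a separate, explicit step.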

Step 3: Quick Setup or Custom

First, check for API keys in the environment:

echo "openai_key=${OPENAI_API_KEY:+found}" "anthropic_key=${ANTHROPIC_API_KEY:+found}"
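The ${VAR:+found} expansion used above produces found only when the variable is set and non-empty, so the key value itself is never printed. A minimal sketch of that behavior, using a hypothetical helper key_status and a placeholder value:

```shell
# Hypothetical helper: report "found" or "missing" for a value without printing it.
key_status() {
  # ${1:+found} expands to "found" only if $1 is set and non-empty.
  status="${1:+found}"
  echo "${status:-missing}"
}

key_status "sk-placeholder"   # prints: found
key_status ""                 # prints: missing
```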

If an API key was detected, offer a one-question fast path:

"I detected your OpenAI API key. I can set up with these defaults:

  • Model: openai/gpt-4o-mini
  • Temperature: 0.7
  • Concurrency: 5

Accept these defaults, or would you like to customize?"

If the user accepts, skip to Step 5 using the detected key and defaults.

If the detected key is an Anthropic key, offer the same fast path with anthropic/claude-sonnet-4-20250514 as the default model.

If no API key was detected, or the user wants to customize, proceed to Step 4.
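The fast-path choice in this step could look roughly like the sketch below. The helper name default_model is hypothetical, and preferring the OpenAI default when both keys are present is an assumption; the guide does not say which one wins.

```shell
# Hypothetical helper: choose the fast-path default model from detected keys.
# $1 is the OpenAI key value, $2 the Anthropic key value.
# Preferring OpenAI when both are set is an assumption.
default_model() {
  if [ -n "$1" ]; then
    echo "openai/gpt-4o-mini"
  elif [ -n "$2" ]; then
    echo "anthropic/claude-sonnet-4-20250514"
  else
    return 1   # no key detected: continue to Step 4
  fi
}

default_model "${OPENAI_API_KEY:-}" "${ANTHROPIC_API_KEY:-}" ||
  echo "no key detected; continue to Step 4"
```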

Step 4: Collect Configuration

Ask these questions one at a time:

  1. Model: "Which LLM model do you want to use for generation?" — e.g., openai/gpt-4o-mini, meta-llama/Llama-3.3-70B-Instruct, anthropic/claude-sonnet-4-20250514
  2. API endpoint: "What's your model endpoint URL?" — e.g., http://localhost:8000/v1 for vLLM, or leave empty for cloud provider defaults
  3. Temperature: "What temperature for generation?" (default: 0.7)
  4. Max concurrency: "How many parallel LLM requests?" (default: 5) — higher is faster but may hit rate limits
  5. Checkpoint directory: "Where should generation checkpoints be saved?" (default: ./checkpoints) — allows resuming interrupted runs
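When the user leaves an answer empty, the documented default applies. One way to sketch that fallback is with the ${value:-default} expansion; the helper name answer_or_default and the sample answers are hypothetical.

```shell
# Hypothetical helper: use the answer if one was given, otherwise the default.
answer_or_default() {
  echo "${1:-$2}"
}

# Sample answers: empty temperature, explicit concurrency, empty checkpoint dir.
temperature=$(answer_or_default "" "0.7")
max_concurrency=$(answer_or_default "10" "5")
checkpoint_dir=$(answer_or_default "" "./checkpoints")

echo "temperature=$temperature max_concurrency=$max_concurrency checkpoint_dir=$checkpoint_dir"
# prints: temperature=0.7 max_concurrency=10 checkpoint_dir=./checkpoints
```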

Step 5: Ensure API Key

API keys are read from environment variables — never store them in the config file. LiteLLM (used by sdg_hub) reads standard env vars automatically.

If no API key was detected in Step 3, tell the user to set the appropriate environment variable:

"Set your API key as an environment variable before running generation:

export OPENAI_API_KEY="sk-..."        # OpenAI models
export ANTHROPIC_API_KEY="sk-ant-..."  # Anthropic models

For local endpoints (vLLM, Ollama) that don't require authentication, no API key is needed. LiteLLM picks up these env vars automatically — no extra configuration required."
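To decide which variable to mention, the model name's provider prefix can be inspected. The sketch below is an assumption built from the example models in this guide, not a description of how LiteLLM resolves keys; the helper name required_key_var is hypothetical.

```shell
# Hypothetical helper: name the environment variable a model is expected to need.
# The prefix-to-variable mapping is an assumption based on this guide's examples.
required_key_var() {
  case "$1" in
    openai/*)    echo "OPENAI_API_KEY" ;;
    anthropic/*) echo "ANTHROPIC_API_KEY" ;;
    *)           echo "" ;;   # e.g. a model served from a local vLLM or Ollama endpoint
  esac
}

required_key_var "openai/gpt-4o-mini"   # prints: OPENAI_API_KEY
```

An empty result means no key reminder is needed for that model under these assumptions.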

Step 6: Save Config

Write the config to .sdg-hub/config.json:

{
  "model": "<model>",
  "api_base": "<endpoint>",
  "temperature": 0.7,
  "max_concurrency": 5,
  "checkpoint_dir": "./checkpoints"
}

Add .sdg-hub/ to .gitignore if not already present.
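Writing the file and updating .gitignore could be done along these lines. This is a sketch, not the plugin's own implementation: the helper name save_config is hypothetical, it writes the guide's default values for temperature, concurrency, and checkpoint directory, and it assumes the model and endpoint strings contain no characters that need JSON escaping.

```shell
# Hypothetical helper: write the config under a project directory and
# make sure .sdg-hub/ is ignored by git.
# $1 = project directory, $2 = model, $3 = endpoint (may be empty).
save_config() {
  project_dir="$1"
  model="$2"
  api_base="$3"
  mkdir -p "$project_dir/.sdg-hub"
  cat > "$project_dir/.sdg-hub/config.json" <<EOF
{
  "model": "$model",
  "api_base": "$api_base",
  "temperature": 0.7,
  "max_concurrency": 5,
  "checkpoint_dir": "./checkpoints"
}
EOF
  # Append the ignore entry only if that exact line is not already present.
  touch "$project_dir/.gitignore"
  grep -qxF '.sdg-hub/' "$project_dir/.gitignore" ||
    echo '.sdg-hub/' >> "$project_dir/.gitignore"
}
```

Calling save_config . "openai/gpt-4o-mini" "" from the project root would write the defaults. The grep -qxF check keeps repeated runs from adding duplicate .gitignore entries.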

Confirm the config file was written, then report success:

"Setup complete! To run generation, use the data-generation skill, or the flow-browser skill to browse available flows.

API keys are read from environment variables, not the config file. Make sure the appropriate variable is set in your shell:

export OPENAI_API_KEY="sk-..."        # OpenAI models
export ANTHROPIC_API_KEY="sk-ant-..."  # Anthropic models

Local endpoints (vLLM, Ollama) don't need an API key."

Step 7: Verify

List available flows to confirm the installation works:

"${CLAUDE_PLUGIN_ROOT}/scripts/sdg_flows.sh" list

Report success and remind the user they can now use the data-generation skill to run generation, or the flow-browser skill to browse available flows.

Updating Config

If this skill is invoked again and a config already exists, ask: "You already have a configuration. Do you want to update it or start fresh?"
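Detecting an existing configuration can be as simple as a file test. A minimal sketch, assuming the config lives at .sdg-hub/config.json relative to the given directory; the helper name config_exists is hypothetical.

```shell
# Hypothetical helper: succeed if a config already exists under the given directory.
config_exists() {
  [ -f "${1:-.}/.sdg-hub/config.json" ]
}

if config_exists .; then
  echo "existing config found: ask whether to update it or start fresh"
else
  echo "no config yet: run the full setup"
fi
```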

Install via CLI
npx skills add https://github.com/Red-Hat-AI-Innovation-Team/sdg_hub --skill setup-guide