name: kagenti:deploy
description: Deploy or redeploy the Kagenti Kind cluster using the Python installer - quick redeploy, manual steps, and troubleshooting
Deploy Cluster Skill
This skill guides you through deploying or redeploying the Kagenti Kind cluster using the Python installer.
Context-Safe Execution (MANDATORY)
Deploy scripts produce hundreds of lines of output. Always redirect it to a file:
export LOG_DIR="/tmp/kagenti/deploy/$(basename "$(git rev-parse --show-toplevel)")"
mkdir -p "$LOG_DIR"
# Pattern: redirect deploy output
./.github/scripts/local-setup/kind-full-test.sh ... > "$LOG_DIR/deploy.log" 2>&1; echo "EXIT:$?"
# On failure: Task(subagent_type='Explore') with Grep to find errors
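The redirect pattern above can be wrapped in a small helper so every deploy command is logged the same way. This is a minimal sketch: `run_logged` is a hypothetical function name, not part of the Kagenti scripts, and the `error|fail` pattern is only an assumed starting point for locating failures.

```shell
# Sketch: run a command with all output sent to a log file, printing only
# the exit code and, on failure, a short excerpt of matching lines.
# run_logged is a hypothetical helper, not part of the Kagenti scripts.
run_logged() {
  log_file=$1
  shift
  mkdir -p "$(dirname "$log_file")"
  status=0
  "$@" > "$log_file" 2>&1 || status=$?
  echo "EXIT:$status"
  if [ "$status" -ne 0 ]; then
    # Show a bounded excerpt instead of the whole log.
    grep -inE 'error|fail' "$log_file" | tail -n 20
  fi
  return "$status"
}
```

For example, `run_logged "$LOG_DIR/deploy.log" uv run kagenti-installer --silent` keeps the full output on disk and prints at most 21 lines to the terminal.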
When to Use
- Setting up a new local development cluster
- Full cluster redeploy after major changes
- Cluster is corrupted or unstable
- Testing clean deployment
- Running E2E tests locally
Resource Requirements
Minimum (from CLAUDE.md):
- 12GB RAM
- 4 CPU cores
- Docker Desktop, Rancher Desktop, or Podman
Recommended for development:
- 16GB RAM
- 6 CPU cores
- 50GB free disk space
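The minimum requirements can be checked before starting a deploy. The sketch below is an assumption-laden example: `meets_minimum` is a hypothetical helper, and it treats "12GB" as decimal gigabytes because container engines often report slightly less than the configured allocation.

```shell
# meets_minimum MEM_BYTES CPUS
# Returns 0 when the values satisfy the documented minimum (12GB RAM, 4 cores).
# Hypothetical helper; "12GB" is interpreted as 12 * 10^9 bytes (an assumption).
meets_minimum() {
  min_mem_bytes=$((12 * 1000 * 1000 * 1000))
  min_cpus=4
  [ "$1" -ge "$min_mem_bytes" ] && [ "$2" -ge "$min_cpus" ]
}

# Example usage with Docker (uncomment to run against a live engine):
# meets_minimum "$(docker info --format '{{.MemTotal}}')" \
#               "$(docker info --format '{{.NCPU}}')" \
#   && echo "resources OK" || echo "below minimum"
```

Podman exposes these values under different field names, so the commented usage applies to Docker only.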
Multiple Clusters
You can run multiple Kind clusters:
- agent-platform - Created by kagenti-installer (default)
- kagenti-demo - Your existing cluster
- Each cluster runs independently with its own name
Check existing clusters:
kind get clusters
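To decide between a fresh install and `--use-existing-cluster`, the cluster list can be tested for an exact name. A minimal sketch, where `cluster_exists` is a hypothetical helper name:

```shell
# cluster_exists NAME: succeeds when `kind get clusters` lists NAME exactly.
# Hypothetical helper, not part of the Kagenti installer.
cluster_exists() {
  # -x matches the whole line, -F treats NAME as a literal string.
  kind get clusters 2>/dev/null | grep -qxF -- "$1"
}

# Example usage:
# if cluster_exists agent-platform; then
#   echo "agent-platform already exists"
# fi
```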
Quick Redeploy (Full Installation)
# 1. Setup environment (first time only)
cp kagenti/installer/app/.env_template kagenti/installer/app/.env
# Edit .env with:
# - GITHUB_USER=<your-github-username>
# - GITHUB_TOKEN=<ghcr.io-token>
# - OPENAI_API_KEY=<openai-key>
# - AGENT_NAMESPACES=team1,team2
# 2. Full redeploy (creates new cluster + installs everything)
cd kagenti/installer
uv run kagenti-installer
# What it does (15-25 minutes):
# ✓ Creates Kind cluster "agent-platform"
# ✓ Installs registry (optional)
# ✓ Installs Tekton Pipelines
# ✓ Installs Cert-Manager
# ✓ Installs Platform Operator
# ✓ Installs Istio Ambient
# ✓ Installs Gateway API
# ✓ Installs SPIRE
# ✓ Installs MCP Gateway
# ✓ Installs Keycloak + PostgreSQL
# ✓ Installs Addons (Prometheus, Kiali, Phoenix)
# ✓ Installs UI
# ✓ Installs ToolHive
# ✓ Creates agent namespaces (team1, team2)
Use Existing Cluster
# Install on an already running Kind cluster
cd kagenti/installer
uv run kagenti-installer --use-existing-cluster
Cleanup and Fresh Install
# 1. Delete existing cluster
kind delete cluster --name agent-platform
# 2. Clean Docker images (optional; removes ALL unused images, not only Kagenti's)
docker system prune -a
# 3. Fresh install
cd kagenti/installer
uv run kagenti-installer
Selective Component Installation
Skip components you don't need for faster deployment:
# Minimal install (no UI, no observability, no auth)
cd kagenti/installer
uv run kagenti-installer \
--skip-install ui \
--skip-install addons \
--skip-install keycloak \
--skip-install spire
# Skip specific components
uv run kagenti-installer \
--skip-install tekton \
--skip-install operator \
--skip-install gateway \
--skip-install mcp_gateway
# Install only core platform (for testing)
uv run kagenti-installer \
--skip-install addons \
--skip-install ui \
--skip-install keycloak \
--skip-install agents
Available components to skip:
- registry - Internal container registry
- tekton - Tekton Pipelines (build system)
- cert_manager - Certificate management
- operator - Platform Operator (deprecated, being replaced by kagenti-operator)
- istio - Service mesh
- gateway - Kubernetes Gateway API
- spire - Workload identity
- mcp_gateway - MCP Gateway
- addons - Observability (Prometheus, Kiali, Phoenix)
- ui - Kagenti UI
- keycloak - Authentication
- agents - Demo agents
- metrics_server - Metrics server
- inspector - MCP inspector
- toolhive - ToolHive operator
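Repeating `--skip-install` by hand is easy to mistype. The sketch below builds the flags from a list and rejects names that are not in the component list above; `skip_flags` and `KNOWN_COMPONENTS` are hypothetical names introduced for illustration.

```shell
# Component names copied from the list of skippable components.
KNOWN_COMPONENTS="registry tekton cert_manager operator istio gateway spire mcp_gateway addons ui keycloak agents metrics_server inspector toolhive"

# skip_flags COMPONENT...
# Prints one "--skip-install NAME" pair per argument on a single line.
# Returns 1 if any name is not a known component. Hypothetical helper.
skip_flags() {
  flags=""
  for component in "$@"; do
    case " $KNOWN_COMPONENTS " in
      *" $component "*) flags="$flags --skip-install $component" ;;
      *) echo "unknown component: $component" >&2; return 1 ;;
    esac
  done
  # Trim the leading space.
  echo "${flags# }"
}

# Usage (word splitting of the unquoted substitution is intentional):
# uv run kagenti-installer $(skip_flags addons ui keycloak)
```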
Deploy Weather Agents (Demo)
# After platform is installed
kubectl apply -f kagenti/examples/components/
This creates:
- weather-tool in team1 namespace
- weather-service in team1 namespace
Check Deployment Health
Quick Health Check
# Run the health check script (from CI)
chmod +x .github/scripts/verify_deployment.sh
.github/scripts/verify_deployment.sh
# What it checks:
# ✓ Resource usage (RAM, disk, CPU, containers)
# ✓ Deployment status (weather-tool, weather-service, keycloak, operator)
# ✓ Pod health summary (total, running, pending, failed, crashloop)
# ✓ Failed pod details (events, error logs)
Manual Health Checks
# All pods
kubectl get pods -A
# Failed pods only
kubectl get pods -A --field-selector=status.phase!=Running,status.phase!=Succeeded
# Specific namespace
kubectl get pods -n team1
kubectl get pods -n keycloak
kubectl get pods -n kagenti-system
# Deployments
kubectl get deployments -A
# Services
kubectl get svc -A
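The manual checks above can be condensed into a one-line-per-pod summary of anything unhealthy. This is a sketch that assumes the default `kubectl get pods -A` column order (NAMESPACE, NAME, READY, STATUS); `unhealthy_pods` is a hypothetical helper name.

```shell
# unhealthy_pods: prints "namespace/name status" for every pod whose STATUS
# column is neither Running nor Completed. Hypothetical helper.
unhealthy_pods() {
  kubectl get pods -A --no-headers |
    awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}

# Example usage:
# unhealthy_pods > "$LOG_DIR/unhealthy.txt"
```

Unlike the `--field-selector` form, this reads the displayed STATUS column, so states such as CrashLoopBackOff and ImagePullBackOff appear by name.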
Run E2E Tests Locally
After platform is deployed:
cd kagenti
# Install test dependencies
uv pip install -r tests/requirements.txt
# Run all deployment health tests
uv run pytest tests/e2e/test_deployment_health.py -v
# Run only critical tests
uv run pytest tests/e2e/test_deployment_health.py -v --only-critical
# Run specific test
uv run pytest tests/e2e/test_deployment_health.py::TestWeatherToolDeployment::test_weather_tool_deployment_ready -v
# Exclude Keycloak tests
uv run pytest tests/e2e/test_deployment_health.py -v --exclude-app=keycloak
# Increase timeout
uv run pytest tests/e2e/test_deployment_health.py -v --app-timeout=600
Run Full CI Workflow Locally
Simulate what runs in CI:
# 1. Install platform
cd kagenti/installer
uv run kagenti-installer --silent
# 2. Deploy weather agents
cd ../..
kubectl apply -f kagenti/examples/components/
# 3. Wait for deployments
kubectl wait --for=condition=available --timeout=300s deployment/weather-tool -n team1
kubectl wait --for=condition=available --timeout=300s deployment/weather-service -n team1
# 4. Run health check
chmod +x .github/scripts/verify_deployment.sh
.github/scripts/verify_deployment.sh
# 5. Run E2E tests
cd kagenti
uv pip install -r tests/requirements.txt
uv run pytest tests/e2e/test_deployment_health.py -v \
--timeout=300 \
--tb=short
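Step 3 of the workflow above repeats the same `kubectl wait` call per deployment. A loop makes it easier to add deployments and to stop at the first failure. This is a sketch; `wait_for_deployments` is a hypothetical helper name.

```shell
# wait_for_deployments NAMESPACE TIMEOUT_SECONDS DEPLOYMENT...
# Waits for each deployment in turn and stops at the first one that does not
# become available. Hypothetical helper, not part of the Kagenti scripts.
wait_for_deployments() {
  namespace=$1
  timeout=$2
  shift 2
  for deployment in "$@"; do
    if ! kubectl wait --for=condition=available --timeout="${timeout}s" \
        "deployment/$deployment" -n "$namespace"; then
      echo "deployment/$deployment not available in $namespace" >&2
      return 1
    fi
  done
}

# Example usage:
# wait_for_deployments team1 300 weather-tool weather-service
```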
Troubleshooting Deployment
Issue: Installer Timeout or Slow
# Check Docker resource allocation
docker info | grep -E "CPUs|Total Memory"
# Image pulls can be slow and may exceed the installer's timeout.
# Re-run the installer against the existing cluster to retry:
cd kagenti/installer
uv run kagenti-installer --use-existing-cluster
Issue: "Error loading config file" or kubectl errors
# Check kubeconfig
kubectl config current-context
# Should show: kind-agent-platform
# If not, set context
kubectl config use-context kind-agent-platform
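The check-then-switch sequence above can be made idempotent so it is safe to run at the top of any script. A minimal sketch, where `ensure_context` is a hypothetical helper name:

```shell
# ensure_context EXPECTED: switch kubectl to EXPECTED unless already active.
# Hypothetical helper, not part of the Kagenti scripts.
ensure_context() {
  expected=$1
  # An unset context makes kubectl exit non-zero; treat that as "no context".
  current=$(kubectl config current-context 2>/dev/null) || current=""
  if [ "$current" = "$expected" ]; then
    echo "context already $expected"
  else
    kubectl config use-context "$expected"
  fi
}

# Example usage:
# ensure_context kind-agent-platform
```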
Issue: Pods stuck in ImagePullBackOff
# Check if images are available in Kind
docker exec agent-platform-control-plane crictl images
# Reload images (for custom builds)
kind load docker-image <image-name> --name agent-platform
# Check pod description for error
kubectl describe pod <pod-name> -n <namespace>
Issue: Keycloak Connection Issues
# Restart Keycloak
kubectl delete -n keycloak -f kagenti/installer/app/resources/keycloak.yaml
kubectl apply -n keycloak -f kagenti/installer/app/resources/keycloak.yaml
# Restart Istio ztunnel
kubectl rollout restart daemonset -n istio-system ztunnel
# Restart Gateway
kubectl rollout restart -n kagenti-system deployment http-istio
Issue: Need to Update Secrets
# Update GitHub token
kubectl -n <namespace> delete secret github-token-secret
# Re-run installer to recreate secrets
cd kagenti/installer
uv run kagenti-installer --use-existing-cluster
Issue: Blank UI on macOS
# Disable Screen Time Content & Privacy Restrictions
# System Settings > Screen Time > Content & Privacy
Issue: GitHub Token Errors
# Ensure token has correct scopes:
# - repo:all
# - write:packages
# - read:packages
# Clear cached credentials
docker logout ghcr.io
Access Platform Services
After deployment, access these services:
# Kagenti UI
open http://kagenti-ui.localtest.me:8080
# Keycloak Admin Console
open http://keycloak.localtest.me:8080
# Username: admin
# Password: (from Keycloak secret)
kubectl get secret -n keycloak keycloak-initial-admin -o jsonpath='{.data.password}' | base64 -d
# Prometheus (if addons installed)
kubectl port-forward -n observability svc/prometheus 9090:9090
open http://localhost:9090
# Grafana (if addons installed)
kubectl port-forward -n observability svc/grafana 3000:3000
open http://localhost:3000
# Kiali (if addons installed)
kubectl port-forward -n kiali svc/kiali 20001:20001
open http://localhost:20001
Platform Configuration
Environment Variables (.env file)
Required in kagenti/installer/app/.env:
# GitHub access for ghcr.io
GITHUB_USER=your-username
GITHUB_TOKEN=ghp_xxx # Classic token with repo:all, write:packages, read:packages
# OpenAI API (for agents)
OPENAI_API_KEY=sk-xxx
# Agent namespaces
AGENT_NAMESPACES=team1,team2
# Optional: Slack (for Slack tool demo)
SLACK_BOT_TOKEN=xoxb-xxx
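A missing or empty value in `.env` tends to surface late in the install. The sketch below checks the four required keys listed above before the installer runs; `check_env_file` is a hypothetical helper, and the rule "a value is set when something other than whitespace or `#` follows the `=`" is an assumption about the file format.

```shell
# check_env_file PATH: verifies the required keys exist with non-empty values.
# Prints each missing key to stderr and returns 1 if any is missing.
# Hypothetical helper; the installer's own parsing may differ.
check_env_file() {
  env_file=$1
  missing=0
  for key in GITHUB_USER GITHUB_TOKEN OPENAI_API_KEY AGENT_NAMESPACES; do
    # A key counts as set when KEY= is followed by a character that is
    # neither whitespace nor the start of a comment.
    if ! grep -Eq "^${key}=[^[:space:]#]" "$env_file"; then
      echo "missing or empty: $key" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage:
# check_env_file kagenti/installer/app/.env && echo ".env looks complete"
```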
Cluster Configuration
Edit kagenti/installer/app/config.py:
CLUSTER_NAME = "agent-platform" # Kind cluster name
DOMAIN_NAME = "localtest.me" # Domain for services
CONTAINER_ENGINE = "docker" # or "podman"
Manual Step-by-Step Deployment (Advanced)
For debugging or understanding the installer:
# 1. Create Kind cluster manually
cat <<EOF | kind create cluster --name agent-platform --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraPortMappings:
  - containerPort: 30080
    hostPort: 8080
  - containerPort: 30443
    hostPort: 9443
EOF
# 2. Set kubeconfig context
kubectl config use-context kind-agent-platform
# 3. Install components one by one
cd kagenti/installer
# Install Tekton
kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/previous/v0.66.0/release.yaml
# Install Cert-Manager
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.16.2/cert-manager.yaml
# ... (see installer code for full sequence)
Related Skills
- k8s:health: Check comprehensive platform health
- k8s:logs: Query logs for debugging
- k8s:pods: Debug pod issues
Pro Tips
- Use --use-existing-cluster: Faster reinstalls without recreating cluster
- Skip components: Use --skip-install for faster iteration
- Multiple clusters: Use different cluster names for parallel testing
- Resource allocation: Ensure Docker/Podman has enough RAM (16GB recommended)
- Cache images: Pulled images are cached - subsequent installs are faster
- Silent mode: Use --silent to skip interactive prompts
- Check logs: If installer fails, check pod logs in kagenti-system namespace
Common Workflows
Daily Development
# Use existing cluster, skip slow components
cd kagenti/installer
uv run kagenti-installer --use-existing-cluster \
--skip-install addons \
--skip-install keycloak
Full Test Before PR
# Fresh cluster, all components, run tests
kind delete cluster --name agent-platform
cd kagenti/installer
uv run kagenti-installer --silent
cd ../..  # back to the repository root
kubectl apply -f kagenti/examples/components/
.github/scripts/verify_deployment.sh
cd kagenti && uv run pytest tests/e2e/test_deployment_health.py -v
Quick Agent Testing
# Minimal platform, just enough for agents
cd kagenti/installer
uv run kagenti-installer \
--skip-install addons \
--skip-install ui \
--skip-install keycloak
cd ../..  # back to the repository root
kubectl apply -f kagenti/examples/components/