Explore AI Agent Skills & Claude Prompts

Use ACE-Step API to generate music, edit songs, and remix music. Supports text-to-music, lyrics generation, audio continuation, and audio repainting. Use this skill when users mention generating music, creating songs, music production, remix, or audio continuation.

voicebox

Text-to-speech voice toolkit. DEFAULT ACTION: When called with text (e.g. /voicebox hello), IMMEDIATELY run: uv run $SKILL_DIR/scripts/voicebox.py generate "Calm Narrator" "<text>" --play. Do NOT ask questions, do NOT greet the user — just generate and play the speech. Also supports: voice cloning, multi-speaker conversations, recording, and transcription.
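The default action above can be sketched as a small helper that assembles the command as an argument list, so text containing quotes or spaces is passed through intact. This is a minimal sketch, not the skill's own code: the function name `build_voicebox_command` and the example directory `/opt/skills/voicebox` are hypothetical, and only the command shape (`uv run $SKILL_DIR/scripts/voicebox.py generate "Calm Narrator" "<text>" --play`) comes from the description.

```python
import os
import shlex


def build_voicebox_command(text, voice="Calm Narrator", skill_dir=None):
    """Build the argv list for the default voicebox action (hypothetical helper)."""
    # Assumption: SKILL_DIR is provided by the environment, as the description implies.
    skill_dir = skill_dir or os.environ.get("SKILL_DIR", ".")
    script = skill_dir.rstrip("/") + "/scripts/voicebox.py"
    # A list of arguments avoids shell quoting problems in the spoken text.
    return ["uv", "run", script, "generate", voice, text, "--play"]


# The directory below is a placeholder for illustration only.
cmd = build_voicebox_command('say "hi" now', skill_dir="/opt/skills/voicebox")
print(shlex.join(cmd))  # shell-safe rendering, useful for logging
```

The list could then be handed to `subprocess.run(cmd, check=True)`, which runs the command without invoking a shell.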

multi-style-web-design

Studio-grade single-page web design with **swappable design shells across 16+ styles** (auto-picked by industry or manually chosen) and an opt-in **3D / motion / special-effects toolkit** (depth displacement, tilt+sheen, glass refraction, volumetric slices, light caustics, particle samplers, three.js on demand). Designs with the taste of a tier-1 studio (Apple / Pentagram / Bureau Borsche / Linear / Vercel / Aesop / Klim / Order / MSCHF). Picks aesthetic direction BEFORE writing code, pulls palette from the actual subject, ships portable single-folder static sites with built-in navigation (3 tiers), 7-language i18n via [data-i18n] slots, and zero build step. Use for personal brand sites, founder portfolios, product landings, company about pages, launch teasers, lookbooks, lesson microsites, annual reports, or any one-page site — or when the user asks for "fancy website", "multi-style website", "3D website", "Apple-style depth", "portrait website", "product landing", "brand site", "公司官网", "产品落地页", or provides

Updated 1 month ago

spark-tts

Generate speech from text using iFlytek's Spark TTS model locally on Apple Silicon via mlx-audio. Supports Chinese and English with controllable gender, pitch, and speed. Also supports voice cloning from a 3-second reference audio clip. Use when: user asks to "generate Chinese speech", "中文语音合成", "TTS in Chinese", "spark tts", "clone my voice", "read this in Chinese", or needs Chinese text-to-speech locally. Preferred over Voxtral for Chinese/CJK content. Lightweight 0.5B model (~1GB).

ai-tutor

Real-person tutor mode for any topic. Plans a stepped curriculum, actively drives the learning surface (web pages via browser-use, native macOS apps via computer-use), and teaches by *pointing at the real screen* — not by dumping textbook walls of text. Speaks in short conversational turns sized for TTS, asks eye-exercises after each concept, opens the matching Obsidian deep-dive note one step at a time, and logs the full curriculum into the user's Obsidian vault for later self-study. Use when the user says "teach me X", "be my tutor", "walk me through X like I'm a newbie", "tutor mode", "/ai-tutor", or hands you a live app/webpage and asks you to teach against it.

Updated 1 month ago

self-improvement

Captures learnings, errors, and corrections to enable continuous improvement. Use when: (1) A command or operation fails unexpectedly, (2) User corrects Claude ('No, that's wrong...', 'Actually...'), (3) User requests a capability that doesn't exist, (4) An external API or tool fails, (5) Claude realizes its knowledge is outdated or incorrect, (6) A better approach is discovered for a recurring task. Also review learnings before major tasks.

Updated 3 months ago

lux-fashion-advisor

Team-wide luxury fashion advisor. Reads any agent's profile (IDENTITY.md + MEMORY.md), cross-references SS26 runway intelligence, decides what to wear based on day of week + time of day + occasion, then builds the optimised generation prompt. Activates when user says 'what should I wear', 'fashion advice', 'style me', 'outfit today', 'plan my outfit', 'consult luxury brands', 'selfie', or any portrait request. Covers ALL occasions — work, social, travel, holiday, morning through late night. Each agent reads their own workspace. When no occasion or day is specified, the agent makes its own decision based on current day, time, and season.

Updated 3 months ago

grok-video-gen

Generate videos using Grok AI via Chrome browser automation. Supports T2V (text-to-video) and I2V (image-to-video) with reference image uploads. Uses grok.com/imagine. Use when user says "Grok video", "create video with Grok", or wants AI video generation through Grok.

mtv-maker

Full end-to-end MTV music video creator. From a song concept and optional character reference photo, produces a complete cinematic MTV: (1) writes lyrics and generates music with ACE-Step, (2) generates cinematic scene images, (3) animates with I2V (Grok/Seedance), (4) assembles clips with audio and crossfade transitions, (5) transcribes audio for accurate lyric timing, (6) burns synced lyric subtitles + opening/ending credits with branding. Use when user says "make an MTV", "create a music video", "generate MTV", "/mtv", or describes a song they want turned into a full visual music video.

screen-to-promo

Turn screen recordings into polished videos — marketing promos, user guides, product demos, and more. Goal-aware pipeline: detects user intent, selects strategy, recommends a plan, then executes. Full pipeline: intent detection → strategy selection → source analysis → storyboard planning → source prep → VO generation → frame-by-frame compositing → audio mixing → final encode. Supports animated presenters (AI animal/character with rembg cutout), per-word caption sync (pop, karaoke, static styles), multi-zoom animations, overlay dissolve transitions, time-mapped VO-to-source sync, CJK-aware captions, and letterbox-aware cropping. Use when: (1) user has screen recordings and wants a polished video — marketing, tutorial, demo, or changelog, (2) user says "make a promo video", "tutorial from this recording", "TikTok video", "marketing video", "user guide", "highlight reel", (3) user provides .mov/.mp4 screen recordings to turn into any kind of video with narration and captions.

Updated 1 month ago

video-creativity

Tier-1 creative agency for end-to-end video production. Owns creative direction, scriptwriting, rich-media generation (T2I/I2I/T2V/I2V), music, word-synced captions, HyperFrames rendering, and QA — delivers a broadcast-quality MP4. Use when user says "make me a video", "product reel", "brand film", "60-second explainer", "cinematic intro". User describes the idea; this skill owns the rest. Never ships AI-slop.

video-prompt-enhancer

Transform simple video prompts into cinematic, structured prompts for AI video generation (Veo 3, Seedance, Grok, Kling, Runway, etc). Adds real camera/lens specs, camera movement, and anti-AI directives without overriding creative intent. Use when: user says 'enhance video prompt', 'make video realistic', 'video prompt', or when a basic video prompt needs upgrading.