381,784 Collected SKILL.md files

Explore AI Agent Skills & Claude Prompts

Discover open-source agent skills for Claude Code, Codex, ChatGPT, and any tool that uses SKILL.md.

Active:
wandb
Showing 12 of 27 skills
wandb

bump-version

by wandb
star 1.1k

Bump the Weave Python SDK version for release. Use when preparing a new release.

navigation main article SKILL.md
Updated 5 months ago
wandb

publish-pypi

by wandb
star 1.1k

Build and publish the Weave Python SDK to PyPI. Use when releasing a new version.

navigation main article SKILL.md
Updated 5 months ago
wandb

wandb-primary

by wandb
star 59

Primary W&B skill for broad or mixed Weights & Biases work: project overviews, W&B runs and artifacts, Weave traces and evaluations, Reports, Signal Builder, and Launch workflows. Use when the task spans multiple W&B surfaces or the user asks generally what is happening in a W&B project.

navigation main article SKILL.md
Updated 20 days ago
wandb

wandb-primary

by wandb
star 13

Comprehensive primary skill for agents working with Weights & Biases. Covers both the W&B SDK (training runs, metrics, artifacts, sweeps) and the Weave SDK (GenAI traces, evaluations, scorers). Includes helper libraries, gotcha tables, and data analysis patterns. Use this skill whenever the user asks about W&B runs, Weave traces, evaluations, training metrics, loss curves, model comparisons, or any Weights & Biases data — even if they don't say "W&B" explicitly. Also trigger on training-curve diagnostics questions — run health, divergence, overfit/convergence/plateau, spikes, LR-schedule/grad-norm/grad-histogram reading, dead layers, step-axis choice, and run comparisons.

navigation main article SKILL.md
Updated 1 month ago
wandb

wandb-primary

by wandb
star 13

Comprehensive primary skill for agents working with Weights & Biases. Covers both the W&B SDK (training runs, metrics, artifacts, sweeps) and the Weave SDK (GenAI traces, evaluations, scorers). Includes helper libraries, gotcha tables, and data analysis patterns. Use this skill whenever the user asks about W&B runs, Weave traces, evaluations, training metrics, loss curves, model comparisons, or any Weights & Biases data — even if they don't say "W&B" explicitly.

navigation main article SKILL.md
Updated 2 months ago
wandb

check-human-issues

by wandb
star 13

Check and respond to GitHub Issues from the human researcher team. Runs in a forked context (no access to main conversation). Use this skill whenever you need to: check for human messages, respond to human issues, poll for team communications, check GitHub issues. Also triggers for: "any human messages?", "check issues", "respond to humans".

navigation main article SKILL.md
Updated 2 months ago
wandb

alphaxiv-paper-lookup

by wandb
star 13

Look up any arxiv paper on alphaxiv.org to get a structured AI-generated overview. This is faster and more reliable than trying to read a raw PDF.

navigation main article SKILL.md
Updated 2 months ago
wandb

analyze-experiments

by wandb
star 13

Analyzes and categorizes all ML experiment PRs in the senpai research track. Use this skill whenever the user asks to: analyze experiments, categorize PRs, bucket experiments, summarize what's been tried, understand experiment history, review merged vs closed results, or asks "what experiments have we run / worked / failed". Also triggers for: "pull the latest experiments", "what's been tried so far", "category breakdown of PRs", "which experiments succeeded", "noam track analysis". When a branch name is mentioned (e.g. "noam branch", "on the noam branch"), pass it as the base branch to scope the fetch to just those PRs.

navigation main article SKILL.md
Updated 2 months ago
wandb

git-research-log

by wandb
star 13

How to document and publish ML experiment results to GitHub as a pull request. Use this skill whenever a student agent has finished running experiments and needs to create or update a PR — even if the instruction is just "wrap up" or "log your results" or "open a PR". Also use it mid-session to update an in-progress PR as trials complete.

navigation main article SKILL.md
Updated 2 months ago
wandb

list-experiments

by wandb
star 13

Use this skill whenever you need to list all of the experiment ideas tried and in progress for this research programme. It outputs 3 files organized by usefulness — merged winners, a compact results table, and full details for deep dives. Use when generating new experimental ideas to check what has already been tried.

navigation main article SKILL.md
Updated 2 months ago
wandb

plot-experiment-charts

by wandb
star 13

Generate a training curve comparison chart and embed it in a GitHub PR description. Use this skill whenever a student has finished running experiments and is preparing to submit their PR for advisor review. Triggers on: "plot training curves", "add chart to PR", "visualize experiment", "training curve comparison", "plot-experiment-charts", "add chart", "generate comparison chart". Run this before marking the PR ready for review.

navigation main article SKILL.md
Updated 2 months ago
wandb

senpai-status-check

by wandb
star 13

Produce a fresh status report for the senpai ML experiment fleet. Use when the user asks for an experiment status, final status, PR/W&B/pod health check, stale student triage, training shutdown harvest, advisor-state audit, or a "what is really happening right now?" report. The report must prioritize paper-facing test metrics over validation metrics and compare test results to dataset benchmarks or targets.

navigation main article SKILL.md
Updated 2 months ago
Page 1 of 3

Browse Agent Skills by Occupation

23 major groups · 867 SOC occupations

Browse by Category

Explore agent skills organized by their primary use case

SKILLMD / CREATORS AND OCCUPATION CATEGORIES

Explore the agent skills ecosystem by occupation and creator

SkillMD is not just a keyword search box. It is an open map that organizes public skills by occupation, creator, and repository, helping you see which workflows, judgment criteria, and domain habits people are writing for AI agents.

Then follow creators and GitHub repositories back to the source: compare the skills a team maintains, whether the repo is active, and how the README frames the work before you open, install, or reuse anything.

Use it three ways: learn an unfamiliar field by occupation, study how creators organize skills, then use source context to decide what is worth opening or reusing.

01 Map a field

Browse 23 occupation groups and 867 SOC roles to learn what skills exist in adjacent domains and how they break down real work.

02 Follow creators

Use creator and repository pages to inspect maintained skill collections, recent updates, and source context before trusting a result.

03 Search with sources

Search 1.7M+ collected skills, then use occupation tags, creators, and GitHub source context to decide what is worth opening.

Start with the occupation map, then follow creators and repositories back to real code. SkillMD helps explain why a skill is worth opening, not only what it is named.

SEO KNOWLEDGE HUB & TECHNICAL OVERVIEW

Standardizing Agent Capabilities with SKILL.md and Model Context Protocol (MCP)

In the rapidly evolving landscape of artificial intelligence, LLM agents (Large Language Model agents) have transitioned from simple text predictors to autonomous problem solvers. To orchestrate complex, multi-step agentic workflows, developers require a standardized format to specify agent capabilities, prompt instructions, system rules, and database bindings. This is where SKILL.md and the Model Context Protocol (MCP) have emerged as standard developer paradigms. SkillMD serves as the central directory for indexing, exploring, and sharing these critical agent configurations.

Our open-source registry currently tracks over 1.7 million collected SKILL.md configurations and system prompts. By compiling agent configurations from active developers on GitHub, we bridge the gap between prompt engineering research and production execution. Whether you are building agents with Anthropic's Claude Code, OpenAI's GPT-4, Google's Gemini, or local models served through Ollama and orchestrated with LlamaIndex, standardized skill definitions help your agents behave more predictably across different runtime environments.

What is the Model Context Protocol (MCP)?

The Model Context Protocol (MCP) is an open standard designed to connect LLM applications to data sources, developer tools, and external environments. MCP establishes a bidirectional communication channel between client applications (like Cursor, Claude Desktop, or custom agent systems) and servers hosting data or capabilities. Through MCP servers, an agent can query databases, read local files, execute terminal commands, and call third-party APIs, while a SKILL.md file supplies the instructions for when and how to use those capabilities. SkillMD allows you to find ready-to-run MCP servers and prompt instructions for various occupations and technical tasks.
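As a rough illustration of that client-to-server channel: MCP messages are JSON-RPC 2.0 objects, and a client typically lists a server's tools and then calls one. The sketch below only builds the request payloads; it does not open a connection. The tool name `read_file` and its `path` argument are hypothetical, since each server defines its own tools.

```python
import json
from itertools import count

# Monotonic request ids, as JSON-RPC expects each request to carry an id.
_ids = count(1)


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request object as a plain dict."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# Ask the server which tools it exposes.
list_tools = jsonrpc_request("tools/list")

# Invoke one tool by name. "read_file" and "path" are hypothetical examples.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "read_file", "arguments": {"path": "README.md"}},
)

print(json.dumps(call_tool, indent=2))
```

In a real client these payloads would be written to the server over a transport such as stdio or HTTP, which is left out here.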

The Structure of a Professional SKILL.md File

A valid SKILL.md configuration is designed to be easily read by humans and parsed by LLMs. It contains precise system instructions, trigger conditions, required parameters, and execution examples. Below is the typical architectural blueprint of a professional agent skill:

  • Metadata & Core Scope: Declares the name of the skill, author details, target models, and a description of the capability.
  • Triggers & Intent Detection: Details semantic triggers that help the agent decide when to invoke this skill.
  • System Prompts: Explicit system-level instructions that direct the agent's behavior, personality, safety guardrails, and formatting preferences.
  • Capabilities & Tools: Lists the files, databases, or APIs the agent must access to complete the tasks.
  • Few-Shot Examples: Demonstrates real inputs and outputs, helping the model generalize behavior through in-context learning.
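To make the metadata part of that blueprint concrete, here is a minimal sketch that splits a SKILL.md-style file into its metadata and its instruction body. It assumes the file opens with a frontmatter block delimited by `---` lines containing flat `key: value` pairs. The name and description are taken from the bump-version listing above; the body text is a placeholder, not the real file's contents.

```python
# Illustrative file contents; only the name and description come from the listing.
SAMPLE_SKILL_MD = """\
---
name: bump-version
description: Bump the Weave Python SDK version for release. Use when preparing a new release.
---

# Bump version

Steps for the agent go here.
"""


def split_skill_md(text):
    """Return (metadata dict, body string) for a SKILL.md-style document.

    Handles only flat `key: value` lines; this is not a full YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block at all
    try:
        end = lines.index("---", 1)  # closing delimiter
    except ValueError:
        return {}, text  # unterminated frontmatter: treat everything as body
    meta = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


meta, body = split_skill_md(SAMPLE_SKILL_MD)
print(meta["name"])
print(meta["description"])
```

An agent runtime could load only `meta` for every installed skill and read `body` once the description matches the user's request, which keeps the trigger text cheap to scan.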

Optimizing Agent Workflows for Modern LLMs

Writing effective agent skills requires a solid grasp of prompt engineering. With the release of advanced models like Claude 3.5 Sonnet, OpenAI o1, and DeepSeek-V3, prompt templates should focus on structured thinking. Developers are encouraged to use XML tags (e.g., <thought>, <context>, and <rules>) to keep the different parts of a prompt clearly separated. Standardized prompts help reduce context drift, so that long-running tasks stay aligned with the initial system instructions.
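One way to apply the XML-tag advice is to assemble prompts from tagged sections in code instead of hand-editing one long string. The sketch below is only illustrative: `<context>` and `<rules>` come from the text above, `<task>` is a hypothetical extra tag, and the helper names are invented for this example.

```python
def tagged(tag, content):
    """Wrap content in <tag>...</tag>, refusing content that would close the tag early."""
    closing = f"</{tag}>"
    if closing in content:
        raise ValueError(f"content must not contain {closing}")
    return f"<{tag}>\n{content.strip()}\n{closing}"


def build_prompt(context, rules, task):
    """Assemble a prompt with one tagged section per concern."""
    rules_text = "\n".join(f"- {rule}" for rule in rules)
    sections = [
        tagged("context", context),
        tagged("rules", rules_text),
        tagged("task", task),  # <task> is a hypothetical tag name
    ]
    return "\n\n".join(sections)


prompt = build_prompt(
    context="The user is asking about a W&B project.",
    rules=["Report test metrics before validation metrics."],
    task="Summarize the health of the latest training run.",
)
print(prompt)
```

Rejecting content that contains a closing tag is a small guard against untrusted text breaking out of its section; whether that check is needed depends on where the content comes from.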

Exploring by SOC Occupations and Creator Profiles

What makes SkillMD unique is its taxonomy. Instead of simple text search, we parse and organize files according to the Standard Occupational Classification (SOC) system. This means you can discover skills written for Computer and Mathematical roles, Business and Financial Operations, Legal, Design, and Educational Instruction fields. By tracking creator profiles, developers can study how different teams organize their custom instructions, compare version updates, and fork public configs for specialized enterprise use cases.

SkillMD operates as a high-performance index running on a fast Go backend and a highly responsive Astro SSR frontend. Search queries execute in milliseconds, and the search box debounces input so that typing does not fire a separate API request on every keystroke, all while keeping user data secure. Join our community of developers to standardize your AI agent instructions and optimize your LLM prompting workflows today.
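For readers unfamiliar with the debouncing mentioned above, the idea is to send a query only after the user has paused typing for some interval. The sketch below simulates that rule over a list of timestamped keystrokes; it is not SkillMD's actual frontend code, and the 300 ms window and function name are assumptions chosen for illustration.

```python
def debounced_queries(events, wait_ms):
    """Simulate trailing-edge debouncing.

    events: list of (timestamp_ms, query_text) pairs sorted by time.
    Returns the (fire_time_ms, query_text) pairs that would actually be sent:
    a query fires only if no newer keystroke arrives within wait_ms.
    """
    fired = []
    for i, (timestamp, text) in enumerate(events):
        is_last = i == len(events) - 1
        if is_last or events[i + 1][0] - timestamp >= wait_ms:
            fired.append((timestamp + wait_ms, text))
    return fired


# Five keystrokes: four typed quickly, then a pause before the last one.
keystrokes = [(0, "w"), (80, "wa"), (150, "wan"), (230, "wand"), (900, "wandb")]
print(debounced_queries(keystrokes, wait_ms=300))
# → [(530, 'wand'), (1200, 'wandb')]
```

Five keystrokes produce two requests instead of five, which is the saving debouncing is meant to provide.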

8 QUESTIONS

Frequently Asked Questions

A practical guide to agent skills: what they are, how to inspect them, and how SkillMD helps you explore the ecosystem.