openai-advanced-agents

star 2

Design and implement advanced, reliable tool-using agents with OpenAI models (including GPT-5 series). Use for agent architecture, tool design, memory/state, planning/execution loops, multi-agent handoffs, evaluation harnesses, tracing/observability, and safety controls (prompt injection, data exfiltration). Trigger when building assistants/agents, tool calling, RAG, multi-step workflows, or when reliability and evaluations matter.

ijindal1 By ijindal1 schedule Updated 2/5/2026

name: openai-advanced-agents description: Design and implement advanced, reliable tool-using agents with OpenAI models (including GPT-5 series). Use for agent architecture, tool design, memory/state, planning/execution loops, multi-agent handoffs, evaluation harnesses, tracing/observability, and safety controls (prompt injection, data exfiltration). Trigger when building assistants/agents, tool calling, RAG, multi-step workflows, or when reliability and evaluations matter.

Advanced Agents (OpenAI)

Overview

Use this skill to turn a vague “build an agent” request into a concrete architecture: tool contracts, state model, control loop, evals, and guardrails. For API details and up-to-date capabilities, consult $openai-docs.

Architecture Defaults

  1. Prefer a single-agent “router + tools” design first.
  2. Add specialized sub-agents only when you have clear boundaries and measurable wins.
  3. Separate:
  • Policy: system/developer prompt and tool rules.
  • Planning: break down tasks.
  • Execution: tool calls and transformations.
  • Verification: checks, validators, and fallback paths.

Tool Design (Most Important)

  1. Make tool inputs/outputs strict and small.
  2. Prefer tools that are:
  • Idempotent where possible.
  • Easy to validate.
  • Explicit about permissions and scopes.
  1. Add “read-only” tools for retrieval/inspection and separate “mutating” tools for side effects.

Control Loop (Plan/Act/Check)

Use a tight loop:

  1. Parse task + constraints.
  2. Decide whether to ask a clarification.
  3. Plan steps (short).
  4. Execute with tools.
  5. Verify against acceptance criteria and invariants.
  6. If failing: retry with bounded attempts or escalate to a human.

State, Memory, and Context

  1. Keep short-term state explicit (a struct/object).
  2. For long-term memory, store only:
  • Stable user preferences.
  • Durable facts with provenance.
  1. Always separate:
  • User-provided data.
  • Retrieved documents.
  • Model-generated hypotheses.

Retrieval (RAG) Guidance

  1. Retrieve fewer, higher-quality chunks.
  2. Require citations to retrieved content when answering factual questions.
  3. Add a “no answer” path when evidence is insufficient.
  4. Prefer lightweight re-ranking and document filters before increasing context size.

Safety and Prompt Injection

  1. Treat tool outputs and retrieved text as untrusted input.
  2. Never allow retrieved text to override system/developer rules.
  3. Enforce data boundaries:
  • Do not reveal secrets.
  • Do not execute arbitrary code unless explicitly allowed.
  1. For web/RAG agents, strip or annotate instructions found in documents.

Evaluation and Observability

  1. Start with a small golden set (10-30 cases) that match real usage.
  2. Track:
  • Task success/failure.
  • Tool-call success/failure rate.
  • Latency and cost.
  • Safety incidents.
  1. Log traces with:
  • User input (redacted as needed).
  • Tool calls and outputs.
  • Final answer.
  • Decision points (why a tool was called).

Reference

See references/checklists.md for a concrete build checklist, tool contract template, and an eval plan outline.

Install via CLI
npx skills add https://github.com/ijindal1/jaunt --skill openai-advanced-agents
Repository Details
star Stars 2
call_split Forks 1
navigation Branch main
article Path SKILL.md
More from Creator