---
name: agentic-rag
description: >-
  Build, review, or refactor Agentic RAG systems with planning, query rewriting, cross-corpus routing, retrieval fanout, Sufficient Context checks, iterative follow-up retrieval, and grounded synthesis with citations. Use for multi-hop RAG, multi-source RAG, context sufficiency, or Agent Skill scaffolds based on the public Google Research and Google Cloud pattern.
license: MIT
compatibility: "Agent Skills-compatible clients including Codex. Python scaffold targets Python 3.11+."
metadata:
  version: "0.1.0"
  origin: "Public Google Research and Google Cloud documentation, summarized as implementation guidance."
  tags:
    - agentic-rag
    - rag
    - retrieval-augmented-generation
    - cross-corpus-retrieval
    - sufficient-context
    - query-rewriting
    - grounded-synthesis
    - codex-skills
    - python
---
Agentic RAG
Use this skill to build a new RAG system on, or refactor an existing one into, the public Agentic RAG pattern described by Google Research and Google Cloud: plan, route, retrieve, check sufficiency, iterate, and synthesize grounded answers.
Read first
- For the detailed behavior model, read `references/agentic-rag-behavior.md`.
- For Korean summary notes, read `references/agentic-rag-behavior-ko.md`.
- For JSON prompts and output schemas, read `references/prompts-and-schemas.md`.
- For completion tasks and TODO order, read `references/codex-completion-brief.md`.
- For source URLs and fact map, read `references/source-map.md`.
Activation signals
Activate this skill when the task includes any of these terms or intents:
- Agentic RAG, Gemini Enterprise Agent Platform RAG, Cross-Corpus Retrieval, Agentic Retrieval.
- Sufficient Context Agent, Sufficient Context Awareness, context sufficiency, iterative retrieval.
- Multi-hop RAG, multi-source RAG, cross-corpus RAG, query planning, query rewriting, search fanout.
- Build a Codex/Claude/Gemini Agent Skill for RAG.
- Refactor a “vanilla RAG” pipeline that fails when information is split across corpora.
Core workflow
Follow this workflow exactly unless the user asks for a narrower task.
Classify mode
- Native Google mode: use Gemini Enterprise Agent Platform RAG Engine Cross Corpus Retrieval APIs if the user has a Google Cloud project, location, RAG corpora, IAM setup, and region requirements.
- Portable mode: implement the public pattern using local or third-party retrievers.
Build corpus catalog
- Require a concise `description` for each corpus.
- Treat descriptions as routing metadata.
- Do not naively search every corpus unless the query is broad or the planner justifies it.
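A minimal sketch of a corpus catalog whose descriptions drive routing. The `CorpusEntry` dataclass, the `route` function, and the sample corpora are hypothetical names introduced for illustration, and the keyword-overlap match is only a stand-in for a planner or model-based router.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CorpusEntry:
    """Hypothetical catalog record; field names are illustrative."""

    corpus_id: str
    description: str  # concise text used as routing metadata

    def __post_init__(self) -> None:
        # Reject corpora that have no usable routing description.
        if not self.description.strip():
            raise ValueError(f"corpus {self.corpus_id!r} needs a description")


def _words(text: str) -> set[str]:
    return {w.strip(".,?!").lower() for w in text.split()}


def route(question: str, catalog: list[CorpusEntry]) -> list[str]:
    """Select corpora whose description shares a word with the question.

    Keyword overlap is an assumption made for this sketch, not the
    routing method of any particular product.
    """
    question_words = _words(question)
    return [e.corpus_id for e in catalog if question_words & _words(e.description)]


catalog = [
    CorpusEntry("hr-policies", "Employee leave, benefits, and payroll policies"),
    CorpusEntry("eng-runbooks", "Incident runbooks and deployment procedures"),
]
selected = route("How many days of parental leave do employees get?", catalog)
print(selected)  # ['hr-policies']
```

Only the corpus whose description overlaps with the question is searched. A real router would also need to handle stop words and synonyms, which this sketch ignores.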
Plan
- Decompose the user question into required facts.
- Map each fact to candidate corpora.
- Produce a retrieval plan with expected evidence and stop conditions.
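The planning step could be captured in a small structured record, sketched here under assumptions. `PlanItem`, `RetrievalPlan`, their field names, and the sample question are hypothetical; the actual schemas for this skill live in `references/prompts-and-schemas.md` and may differ.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class PlanItem:
    item_id: str
    required_fact: str            # one fact the answer depends on
    candidate_corpora: list[str]  # where that fact is expected to live
    expected_evidence: str        # what a supporting snippet should look like


@dataclass
class RetrievalPlan:
    question: str
    items: list[PlanItem]
    stop_conditions: list[str]


plan = RetrievalPlan(
    question="Which team owns the billing service, and who is on call this week?",
    items=[
        PlanItem("p1", "Owning team of the billing service",
                 ["service-catalog"], "A service record naming the owner team"),
        PlanItem("p2", "Current on-call engineer for that team",
                 ["oncall-schedules"], "A schedule entry covering this week"),
    ],
    stop_conditions=[
        "every plan item has supporting evidence",
        "iteration budget exhausted",
    ],
)

# Serialize with sorted keys so the same plan always yields the same JSON.
plan_json = json.dumps(asdict(plan), indent=2, sort_keys=True)
print(plan_json)
```

Keeping one plan item per required fact makes it straightforward to check later whether each fact was actually supported.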
Rewrite and fan out
- Generate targeted search queries for each required fact and corpus route.
- Include follow-up queries when a previous sufficiency check reports missing facts.
- Preserve query lineage: original question → plan item → rewritten query → retrieved snippets.
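One way to keep lineage through the fanout is to have every rewritten query carry the original question and its plan item id. In this sketch `RewrittenQuery` and `fan_out` are hypothetical names, and the "rewrite" simply reuses the fact text where a real rewriter would likely call a model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewrittenQuery:
    original_question: str  # lineage: the user's question
    plan_item_id: str       # lineage: the plan item this query serves
    corpus_id: str          # the route chosen for this query
    text: str               # the targeted search string


def fan_out(
    question: str,
    plan_items: list[tuple[str, str, list[str]]],
) -> list[RewrittenQuery]:
    """Emit one targeted query per (plan item, corpus) route.

    Each plan item is (item_id, required_fact, candidate_corpora).
    """
    queries = []
    for item_id, fact, corpora in plan_items:
        for corpus_id in corpora:
            queries.append(RewrittenQuery(question, item_id, corpus_id, fact))
    return queries


queries = fan_out(
    "Which team owns billing, and who is on call?",
    [
        ("p1", "billing service owner team", ["service-catalog", "wiki"]),
        ("p2", "billing on-call engineer this week", ["oncall-schedules"]),
    ],
)
for q in queries:
    print(f"{q.plan_item_id} -> {q.corpus_id}: {q.text}")
```

Retrieved snippets can then store a reference to the `RewrittenQuery` that produced them, which completes the chain from question to evidence.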
Retrieve
- Retrieve snippets from selected corpora.
- Keep snippet ids, corpus ids, document ids, scores, text spans, and metadata.
- Deduplicate near-identical snippets before synthesis.
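Near-duplicate removal can be sketched with the standard library's `difflib.SequenceMatcher`. The `Snippet` fields mirror the list above, while the `dedupe` helper, the 0.9 threshold, and the sample data are assumptions for illustration; a production system might use embeddings or MinHash instead.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class Snippet:
    snippet_id: str
    corpus_id: str
    document_id: str
    score: float
    text: str
    metadata: dict = field(default_factory=dict)


def _normalize(text: str) -> str:
    # Lowercase and collapse whitespace so formatting noise is ignored.
    return " ".join(text.lower().split())


def dedupe(snippets: list[Snippet], threshold: float = 0.9) -> list[Snippet]:
    """Keep the highest-scoring snippet of each near-identical group."""
    kept: list[Snippet] = []
    for candidate in sorted(snippets, key=lambda s: s.score, reverse=True):
        text = _normalize(candidate.text)
        if all(
            SequenceMatcher(None, text, _normalize(k.text)).ratio() < threshold
            for k in kept
        ):
            kept.append(candidate)
    return kept


snippets = [
    Snippet("s1", "hr", "doc-1", 0.9, "Parental leave is 16 weeks for full-time staff."),
    Snippet("s2", "wiki", "doc-7", 0.8, "Parental  leave is 16 weeks for full-time staff"),
    Snippet("s3", "eng", "doc-3", 0.5, "Deployments freeze on Fridays before a release."),
]
unique = dedupe(snippets)
print([s.snippet_id for s in unique])  # ['s1', 's3']
```

The pairwise comparison is quadratic in the number of snippets, which is usually acceptable for the small result sets passed to synthesis.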
Draft
- Create an intermediate answer only from retrieved snippets.
- Mark unsupported claims as missing rather than filling gaps.
Sufficient Context check
- Judge the original question, retrieval plan, snippets, and draft together.
- Return one of: `sufficient`, `insufficient`, `irrelevant`, or `unanswerable`.
- If insufficient, list missing facts and concrete feedback queries.
- If a corpus is irrelevant, state why and suggest a better route when possible.
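A sketch of parsing and validating the judge's output, assuming it arrives as JSON. The four status values come from the list above; the `Verdict` class, the `parse_verdict` helper, and the JSON field names (`missing_facts`, `feedback_queries`, `reason`) are assumptions, and the actual schema is defined in `references/prompts-and-schemas.md`.

```python
import json
from dataclasses import dataclass, field
from enum import Enum


class Status(str, Enum):
    SUFFICIENT = "sufficient"
    INSUFFICIENT = "insufficient"
    IRRELEVANT = "irrelevant"
    UNANSWERABLE = "unanswerable"


@dataclass
class Verdict:
    status: Status
    missing_facts: list[str] = field(default_factory=list)
    feedback_queries: list[str] = field(default_factory=list)
    reason: str = ""


def parse_verdict(raw: str) -> Verdict:
    data = json.loads(raw)
    verdict = Verdict(
        status=Status(data["status"]),  # raises ValueError on an unknown status
        missing_facts=list(data.get("missing_facts", [])),
        feedback_queries=list(data.get("feedback_queries", [])),
        reason=data.get("reason", ""),
    )
    # An insufficient verdict is only actionable with concrete feedback.
    if verdict.status is Status.INSUFFICIENT and not (
        verdict.missing_facts and verdict.feedback_queries
    ):
        raise ValueError("insufficient verdict needs missing facts and feedback queries")
    return verdict


verdict = parse_verdict(
    '{"status": "insufficient",'
    ' "missing_facts": ["on-call engineer for this week"],'
    ' "feedback_queries": ["billing team on-call rotation current week"]}'
)
print(verdict.status.value, verdict.feedback_queries)
```

Rejecting an `insufficient` verdict that lacks feedback queries prevents the loop from iterating with nothing new to search for.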
Iterate
- If status is insufficient and iteration budget remains, use the feedback to re-plan/rewrite/retrieve.
- Stop when the status is sufficient or unanswerable, or when the maximum number of iterations is reached.
- Keep an audit trail of each iteration.
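The iteration rules can be sketched as a bounded loop. The `run_loop` function, its callable parameters, and the fake retriever and judge are hypothetical stand-ins; the sketch assumes the judge returns a status string plus feedback queries, and it re-loops only on `insufficient`, leaving the handling of other statuses to the caller.

```python
from typing import Callable

Judge = Callable[[str, list[str]], tuple[str, list[str]]]


def run_loop(
    question: str,
    retrieve: Callable[[list[str]], list[str]],
    judge: Judge,
    max_iterations: int = 3,
) -> tuple[str, list[str], list[dict]]:
    queries = [question]
    snippets: list[str] = []
    audit: list[dict] = []  # one record per iteration
    status = "insufficient"
    for iteration in range(1, max_iterations + 1):
        snippets.extend(retrieve(queries))
        status, feedback = judge(question, snippets)
        audit.append({
            "iteration": iteration,
            "queries": list(queries),
            "status": status,
            "feedback_queries": list(feedback),
        })
        if status != "insufficient":
            break  # stop without spending the rest of the budget
        queries = feedback  # follow-up retrieval driven by the judge
    return status, snippets, audit


def fake_retrieve(queries: list[str]) -> list[str]:
    return [f"snippet for: {q}" for q in queries]


def fake_judge(question: str, snippets: list[str]) -> tuple[str, list[str]]:
    # Pretend two snippets are needed before the context is sufficient.
    if len(snippets) < 2:
        return "insufficient", ["parental leave eligibility rules"]
    return "sufficient", []


status, snippets, audit = run_loop("How long is parental leave?", fake_retrieve, fake_judge)
print(status, len(audit))  # sufficient 2
```

If the budget runs out, the function returns the last `insufficient` status, so the caller can report a partial answer instead of presenting it as grounded.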
Synthesize final answer
- Answer only with supported facts.
- Attach snippet citations or source identifiers to factual claims.
- If the answer is partial, say exactly what is missing and which follow-up retrieval would be needed.
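A sketch of the final synthesis guard: a claim reaches the answer only if it cites at least one snippet that was actually retrieved, and gaps are reported with the follow-up query that would address them. `Claim`, `synthesize`, the citation format, and the sample ids are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    text: str
    snippet_ids: list[str]  # citations proposed by the drafting step


def synthesize(
    claims: list[Claim],
    retrieved_ids: set[str],
    missing: list[tuple[str, str]],
) -> tuple[str, list[Claim]]:
    """Return the answer text and the claims rejected as unsupported.

    `missing` holds (missing_fact, follow_up_query) pairs.
    """
    lines: list[str] = []
    rejected: list[Claim] = []
    for claim in claims:
        # Keep only citations that point at snippets we really retrieved.
        cited = [sid for sid in claim.snippet_ids if sid in retrieved_ids]
        if cited:
            lines.append(f"{claim.text} [{', '.join(cited)}]")
        else:
            rejected.append(claim)
    for fact, follow_up in missing:
        lines.append(f"Missing: {fact} (follow-up query: {follow_up})")
    return "\n".join(lines), rejected


answer, rejected = synthesize(
    claims=[
        Claim("Parental leave is 16 weeks.", ["hr:12"]),
        Claim("The leave is fully paid.", ["hr:99"]),  # cites an unknown snippet
    ],
    retrieved_ids={"hr:12", "hr:13"},
    missing=[("whether the leave is paid", "parental leave pay policy")],
)
print(answer)
```

Returning the rejected claims separately keeps them available for logging without letting them leak into the answer.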
Implementation rules
- Start from `src/agentic_rag/contracts.py` and `src/agentic_rag/orchestrator.py` when this scaffold is present.
- Implement provider integrations as adapters, not inside the orchestrator.
- Use deterministic structured JSON for planner, query rewriter, sufficiency judge, and synthesis outputs.
- Never use a final answer from a failed sufficiency check as if it were grounded.
- Enforce max iteration and max cost limits.
- Log all plan items, subqueries, hits, missing facts, and final citations.
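The adapter rule could look like the sketch below, where the orchestrator depends only on a small interface. The `Retriever` protocol, `InMemoryRetriever`, and this `Orchestrator` are hypothetical; they are not the contents of `contracts.py` or `orchestrator.py`, which may define different contracts.

```python
from typing import Protocol


class Retriever(Protocol):
    """Hypothetical adapter interface; the scaffold's real contract may differ."""

    def search(self, corpus_id: str, query: str, top_k: int) -> list[dict]: ...


class InMemoryRetriever:
    """Toy adapter that scores documents by the fraction of query terms found."""

    def __init__(self, docs: dict[str, list[str]]) -> None:
        self._docs = docs

    def search(self, corpus_id: str, query: str, top_k: int) -> list[dict]:
        terms = query.lower().split()
        hits = []
        for index, text in enumerate(self._docs.get(corpus_id, [])):
            matched = sum(term in text.lower() for term in terms)
            if matched:
                hits.append({
                    "snippet_id": f"{corpus_id}:{index}",
                    "corpus_id": corpus_id,
                    "text": text,
                    "score": matched / len(terms),
                })
        hits.sort(key=lambda h: h["score"], reverse=True)
        return hits[:top_k]


class Orchestrator:
    """Knows only the Retriever interface, never a concrete provider."""

    def __init__(self, retriever: Retriever, max_iterations: int = 3) -> None:
        self.retriever = retriever
        self.max_iterations = max_iterations

    def retrieve(self, routes: list[tuple[str, str]], top_k: int = 3) -> list[dict]:
        hits: list[dict] = []
        for corpus_id, query in routes:
            hits.extend(self.retriever.search(corpus_id, query, top_k))
        return hits


docs = {"hr": ["Parental leave is 16 weeks.", "Payroll runs on the 25th."]}
orchestrator = Orchestrator(InMemoryRetriever(docs))
hits = orchestrator.retrieve([("hr", "parental leave")])
print([h["snippet_id"] for h in hits])  # ['hr:0']
```

Swapping in a different provider then means writing another class with the same `search` method, with no change to the orchestrator.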
Anti-patterns
Avoid these failures:
- Single-shot retrieval followed by a confident answer.
- Searching every corpus for every question without a route plan.
- Treating high vector similarity as sufficient context.
- Generating a final answer without checking every requested fact.
- Losing provenance between snippets and final claims.
- Returning “not found” before targeted follow-up queries have been attempted.