name: prior-auth-azure description: "Demo skill that processes prior authorization requests, performs initial validation checks (NPI, ICD-10, CMS Coverage, CPT), and generates medical necessity assessments using Azure MCP servers."
Prior Authorization Review Skill (Azure)
Process prior authorization requests using AI-assisted review with MCP connectors for NPI Registry, ICD-10 codes, and CMS Coverage policies. This skill generates draft recommendations for human review.
Target Users: Prior authorization specialists, utilization management nurses, medical directors
Key Features:
- Two-subskill workflow (Intake & Assessment → Decision & Notification)
- Parallel MCP validation for optimal performance
- Structured waypoint files for audit trail
- Human-in-the-loop decision confirmation
- Azure APIM integration for secure MCP access
Important Disclaimers
DRAFT RECOMMENDATIONS ONLY: This skill generates draft recommendations only. The payer organization remains fully responsible for all final authorization decisions.
HUMAN REVIEW REQUIRED: All AI-generated recommendations require review and confirmation by appropriate professionals before becoming final decisions. Users may accept, reject, or override any recommendation with documented justification.
AI DECISION BEHAVIOR: AI recommends APPROVE, PEND, or DENY. DENY is recommended when a mandatory criterion is clearly NOT_MET (≥90% confidence, based on documented evidence). All decisions require human confirmation in Subskill 2. Decision logic is configurable in the skill's rubric.md file.
COVERAGE POLICY LIMITATIONS: Coverage policies are sourced from Medicare LCDs/NCDs via the
mcp-reference-dataserver (CMS Coverage tools). If this review is for a commercial or Medicare Advantage plan, payer-specific policies may differ and were not applied.
Prerequisites
Required MCP Servers (Consolidated Architecture)
The project uses 3 consolidated MCP servers (down from ~7 individual servers). Each server bundles multiple tool domains into a single Azure Function endpoint.
mcp-reference-data— NPI + ICD-10 + CMS Coverage (12 tools)- Endpoint: Single Azure Function (
/mcp) — routed via APIM or local athttp://localhost:7071/mcp - NPI Tools:
lookup_npi(npi="..."),validate_npi(npi="..."),search_providers(...) - ICD-10 Tools:
validate_icd10(code="..."),lookup_icd10(code="..."),search_icd10(query="..."),get_icd10_chapter(code_prefix="...") - CMS Tools:
search_coverage(query="...", coverage_type="all", limit=10),check_medical_necessity(cpt_code="...", icd10_codes=[...]),get_coverage_by_cpt(cpt_code="..."),get_coverage_by_icd10(icd10_code="..."),get_mac_jurisdiction(state="...") - Use Cases: Provider verification, diagnosis code validation, coverage policy lookup, medical necessity checks
- Endpoint: Single Azure Function (
mcp-clinical-research— FHIR + PubMed + ClinicalTrials (20 tools)- Endpoint: Single Azure Function (
/mcp) — routed via APIM or local athttp://localhost:7072/mcp - PubMed Tools:
search_pubmed(query="..."),search_clinical_queries(query="...", category="therapy"),get_article(pmid="..."),get_article_abstract(pmid="..."),get_articles_batch(pmids=[...]),find_related_articles(pmid="...") - FHIR Tools:
search_patients(identifier="..."),get_patient(patient_id="..."),get_patient_conditions(patient_id="..."),get_patient_medications(patient_id="..."),get_patient_observations(patient_id="..."),get_patient_encounters(patient_id="..."),search_practitioners(...),validate_resource(...) - ClinicalTrials Tools:
search_trials(...),get_trial(nct_id="..."),get_trial_eligibility(nct_id="..."),get_trial_locations(nct_id="..."),search_by_condition(...),get_trial_results(nct_id="...") - Use Cases: Evidence search via PubMed, FHIR patient data enrichment, clinical trial discovery
- Note: PubMed and FHIR tools are optional for prior auth — they enhance but are not required
- Endpoint: Single Azure Function (
cosmos-rag— Document RAG & Audit Trail (6 tools)- Endpoint: Single Azure Function (
/mcp) — routed via APIM or local athttp://localhost:7073/mcp - Tools:
index_document(...),hybrid_search(query="..."),vector_search(query="..."),record_audit_event(...),get_audit_trail(workflow_id="..."),get_session_history(workflow_type="...") - Use Cases: Coverage policy document retrieval via RAG, workflow audit trail for compliance
- Endpoint: Single Azure Function (
Workflow Overview
flowchart TD
subgraph PA["Prior Authorization Workflow"]
subgraph SUB1["Subskill 1: Intake & Assessment (3-4 min)"]
S1A["1. Collect request details<br/>(member, service, provider)"]
S1B["2. Parallel MCP validation via mcp-reference-data:<br/>• NPI tools → Provider credentials<br/>• ICD-10 tools → Diagnosis codes<br/>• CMS tools → Applicable policies + medical necessity"]
S1C["3. Extract clinical data from documentation"]
S1Lit["4. PubMed evidence search (optional)<br/>via mcp-clinical-research"]
S1D["5. Map evidence to policy criteria"]
S1E["6. Generate recommendation (APPROVE/PEND)"]
S1A --> S1B --> S1C --> S1Lit --> S1D --> S1E
end
SUB1 -->|"OUTPUT: waypoints/assessment.json"| HUMAN
HUMAN{"HUMAN DECISION POINT<br/>Review AI findings"}
HUMAN --> SUB2
subgraph SUB2["Subskill 2: Decision & Notification (1-2 min)"]
S2A["1. Human confirms/overrides recommendation"]
S2B["2. Generate authorization number (if approved)"]
S2C["3. Create decision documentation"]
S2D["4. Generate notification letters"]
S2A --> S2B --> S2C --> S2D
end
SUB2 -->|"OUTPUT: waypoints/decision.json, outputs/letters/"| DONE(["Done"])
end
Subskill Descriptions
Subskill 1: Intake & Assessment (3-4 minutes)
- Collects PA request details (member, service, provider, clinical docs)
- Validates provider credentials, ICD-10 codes, and coverage policies via
mcp-reference-data(NPI + ICD-10 + CMS — parallel execution) - Validates CPT/HCPCS codes via WebFetch to CMS Fee Schedule
- Extracts structured clinical data from documentation
- Searches for supporting literature via
mcp-clinical-researchPubMed tools (optional, strengthens evidence) - Enriches clinical context via
mcp-clinical-researchFHIR tools (optional) - Maps clinical evidence to policy criteria
- Performs medical necessity assessment
- Generates recommendation (APPROVE/PEND)
- Output:
waypoints/assessment.json(consolidated) - Data Sources:
mcp-reference-data(NPI + ICD-10 + CMS, parallel), CMS Fee Schedule (web),mcp-clinical-research(PubMed + FHIR, optional)
Subskill 2: Decision & Notification (1-2 minutes)
- Reads assessment from Subskill 1
- Presents findings for human review
- Accepts human confirmation or override
- Generates authorization number (if approved)
- Creates decision waypoint
- Generates notification letters (approval/denial/pend)
- Generates standardized determination JSON for downstream systems
- Output:
waypoints/decision.json,outputs/determination.json,outputs/approval_letter.mdoroutputs/pend_letter.mdoroutputs/denial_letter.md
Execution Flow
Startup: Request Input Files
Option A: User provides files
To begin a prior authorization review, I need:
1. PA Request Form (member info, service details)
2. Clinical Documentation (progress notes, test results)
3. Provider Information (NPI, credentials)
Please provide the file paths or paste the content.
Option B: Demo mode
Would you like to use sample PA request files for demonstration?
Sample files are in: data/sample_cases/prior_auth_baseline/
- Demo mode note: When sample files are used, the sample data contains demo NPI and sample member ID. This combination triggers demo mode, which skips the NPI lookup for this specific provider only. All other MCP calls execute normally.
Beads (bd) Task Tracking
Use beads to track progress through the prior authorization workflow. Each major phase is a bead with a unique ID, status, and checklist. Update bead status as work progresses to maintain an auditable trail and enable reliable resume.
Bead Definitions
| Bead ID | Phase | Description |
|---|---|---|
bd-pa-001-intake |
Subskill 1, Steps 1-3 | Collect request info + parallel MCP validation |
bd-pa-002-clinical |
Subskill 1, Steps 4-6 | Clinical extraction + rubric read + evidence mapping |
bd-pa-003-recommend |
Subskill 1, Steps 7-10 | Generate recommendation + write waypoint + audit doc |
bd-pa-004-decision |
Subskill 2, Steps 1-5 | Human review + decision capture + authorization |
bd-pa-005-notify |
Subskill 2, Steps 6-7 | Generate notification letters + completion summary |
Bead Lifecycle
not-started → in-progress → completed
- not-started: Bead has not begun (default)
- in-progress: Actively executing this phase (only ONE bead active at a time)
- completed: All checklist items done, outputs verified
Tracking Rules
- Mark bead in-progress before starting its first task
- Check off items as each step completes
- Mark bead completed only after all outputs are verified
- Never skip beads — execute in order (
bd-pa-001→bd-pa-005) - On resume, scan bead statuses to find the first non-completed bead and continue from there
- Persist bead state in
waypoints/assessment.jsonunder a"beads"key so progress survives interruptions
Bead State in Waypoint Files
Include bead tracking in waypoint JSON files:
{
"beads": [
{"id": "bd-pa-001-intake", "status": "completed", "completed_at": "ISO datetime"},
{"id": "bd-pa-002-clinical", "status": "completed", "completed_at": "ISO datetime"},
{"id": "bd-pa-003-recommend", "status": "in-progress", "started_at": "ISO datetime"},
{"id": "bd-pa-004-decision", "status": "not-started"},
{"id": "bd-pa-005-notify", "status": "not-started"}
]
}
Resume via Beads
On startup, if waypoints/assessment.json exists:
- Read the
"beads"array - Find the first bead that is NOT
"completed" - Display bead progress summary to user
- Offer to resume from that bead
Prompt Modules (Context-Optimized)
Structured prompt guidance is organized into 5 phase-aligned modules in references/prompts/. Each module is loaded lazily — only when its bead starts — and released at the next context checkpoint.
Module Loading Table
| Bead | Module(s) to Read | Peak Tokens | Release After |
|---|---|---|---|
bd-pa-001-intake |
01-extraction.md + 02-policy-retrieval.md |
~270 lines | Context Checkpoint 1 (waypoint write) |
bd-pa-002-clinical |
03-clinical-assessment.md |
~120 lines | Context Checkpoint 2 (waypoint update) |
bd-pa-003-recommend |
04-determination.md + rubric.md |
~150 lines | Context Checkpoint 3 (waypoint finalize) |
bd-pa-004-decision |
(none — human review) | ~0 | Context Checkpoint 4 (decision.json write) |
bd-pa-005-notify |
05-output-formatting.md |
~100 lines | Workflow complete |
Loading Rules
- Read the prompt module(s) listed for the current bead BEFORE starting that bead's work
- Do NOT pre-load modules for future beads
- After each context checkpoint, the prior bead's modules and raw data are captured in the waypoint and no longer needed
Context Management
Waypoint files serve as context checkpoints — they compress raw data into structured summaries so downstream beads operate on compact representations instead of replaying all upstream context.
Context Checkpoints
| Checkpoint | Trigger | What Gets Compressed | Waypoint Section |
|---|---|---|---|
| CP1 | Bead 001 complete | Raw clinical docs, MCP tool results, extraction prompts | assessment.{request, clinical, policy} |
| CP2 | Bead 002 complete | Policy full text, evidence mapping work, assessment prompt | assessment.criteria_evaluation |
| CP3 | Bead 003 complete | Determination reasoning, rubric rules | assessment.recommendation |
| CP4 | Bead 004 complete | Assessment details, human decision | decision.{decision, rationale, audit} |
Context Scope Rules Per Bead
Each bead defines what to read and what to ignore:
Bead 001 (Intake): Read raw docs + prompt modules 01, 02. Full context access.
Bead 002 (Clinical): Read assessment.json sections + prompt module 03. Ignore raw docs, MCP results, modules 01-02.
Bead 003 (Recommend): Read assessment.json evaluation + rubric.md + prompt module 04. Ignore raw docs, prior modules.
Bead 004 (Decision): Read assessment.json recommendation only + human input. Ignore everything else.
Bead 005 (Notify): Read decision.json + assessment.json member/service/provider + criteria_evaluation + recommendation.gaps + prompt module 05. Ignore raw clinical documents and raw MCP responses.
The waypoint is the source of truth for downstream beads. Do not re-read raw inputs.
Implementation Requirements
Always read subskill files: Don't execute from memory. Read the actual subskill markdown file and follow instructions.
Load prompt modules per bead: Read the prompt module(s) listed in the Module Loading Table when starting each bead. Do not pre-load modules for future beads.
Auto-detect resume: Check for existing
waypoints/assessment.jsonon startup. If found, read bead state and offer to resume from the first incomplete bead.Parallel MCP execution: In Subskill 1, execute NPI, ICD-10, and Coverage MCP calls in parallel for optimal performance.
Preserve user data: Never overwrite waypoint files without asking confirmation or backing up.
Clear progress indicators: Show users what's happening during operations (MCP queries, data analysis). Update bead status in real-time.
Graceful degradation: If optional data missing, continue with available data and note limitations.
Validate outputs: Check that waypoint files have expected structure before proceeding.
Track beads: Update bead status at every phase transition. Persist bead state in waypoint files for resume capability.
Respect context scoping: At each bead, read only what the Context Scope Rules specify. Use waypoint data instead of re-reading raw inputs from prior beads.
MCP Tool Call Transparency (REQUIRED)
When invoking MCP tools, always inform the user:
- BEFORE: What tool is being called and why
- AFTER: Success notification with summary of results
Example:
Searching mcp-reference-data for applicable coverage policies...
✅ CMS coverage search completed - Found policy: L34567 - Knee Arthroplasty LCD
Common Mistakes to Avoid
MCP and Validation:
- ❌ Don't call
validate_icd10()in a loop if avoidable - consider parallelizing calls - ❌ Don't call
lookup_icd10()with array parameter - it takes a single code string only - ❌ Don't skip CPT/HCPCS validation - it's required even though no MCP exists
- ❌ Don't forget to display MCP success notifications after each connector invocation
Decision Policy Enforcement (CRITICAL):
- ❌ Don't ignore provider verification status when calculating recommendation
- ❌ Don't make decisions without first reading rubric.md
- ✅ DO read rubric.md FIRST to understand current policy
- ✅ DO apply the decision rules specified in rubric.md
Error Handling
Missing MCP Servers: If required MCP connectors not available, display error listing missing connectors.
Missing Subskill Prerequisites:
If Subskill 2 invoked without waypoints/assessment.json, notify user to complete Subskill 1 first.
File Write Errors: If unable to write waypoint files, display error with file path, check permissions/disk space, and offer retry.
Data Quality Issues: If clinical data extraction confidence <60%, warn user with confidence score and low-confidence areas. Offer options to: continue, request additional documentation, or abort.
Quality Checks
Before completing workflow, verify:
- All beads (
bd-pa-001throughbd-pa-005) marked completed - All required waypoint files created
- Decision has clear rationale documented
- All required fields populated
- Output files generated successfully
- Bead state in waypoints reflects final status
Sample Data
Sample case files are included in data/sample_cases/prior_auth_baseline/ for demonstration purposes. When using sample files, the skill operates in demo mode which:
- Skips NPI lookup for the sample provider only
- Executes all other MCP calls (ICD-10, CMS Coverage) normally
- Demonstrates the complete workflow with a pre-configured case
Evaluation Data
A curated evaluation dataset is available in data/ for benchmarking and demo purposes. All data is de-identified/synthetic.
Dataset Structure
data/
├── cases/
│ ├── ground_truth.json # Aggregated ground truth for all case variants
│ └── {001-005}/{a,b}/ # Case variants
│ ├── results.json # Per-case decision (approved/rejected)
│ ├── pa_form/ # Prior auth request form (PDF)
│ ├── doctor_notes/ # Clinical notes (PDF)
│ ├── labs/ # Lab results (PDF, if applicable)
│ └── imaging/ # Imaging reports (PDF, if applicable)
├── policies/ # Indexed retrieval corpus (coverage policy PDFs)
│ ├── 001.pdf
│ ├── 002.pdf
│ ├── 003.pdf
│ ├── 004.pdf
│ └── 005.pdf
├── sample_cases/
│ └── prior_auth_baseline/
│ ├── pa_request.json
│ ├── ct_chest_report.txt
│ ├── pet_scan_report.txt
│ └── pulmonology_consultation.txt
└── pdfs/ # Pre-rendered page images for select cases
Case Catalog
| Case | Variant | Decision | Docs Included |
|---|---|---|---|
| 001 | a | rejected | PA form, notes, labs, imaging |
| 001 | b | approved | PA form, notes, labs, imaging |
| 002 | a | approved | PA form, notes, labs, imaging |
| 002 | b | approved | PA form, notes, labs, imaging |
| 003 | a | approved | PA form, notes, labs |
| 003 | b | approved | PA form, notes, labs |
| 004 | a | approved | PA form, notes |
| 004 | b | approved | PA form, notes |
| 005 | a | approved | PA form, notes, labs, imaging |
| 005 | b | approved | PA form, notes, labs, imaging |
How to Use
Interactive demo (single case):
Run the prior-auth skill using case data/cases/001/a/ — this is a rejected case with full documentation.
Evaluation mode:
Use data/cases/ground_truth.json to compare skill recommendations against MD-reviewed ground truth decisions across all 10 case variants.
Each case variant includes:
- a/b pattern: Variant
aandbfor the same clinical scenario, sometimes with different outcomes (e.g., 001_a rejected vs 001_b approved), enabling testing of the skill's sensitivity to documentation quality and completeness. results.json: Per-variant ground truth withcase_id,decision,evaluation_time, andnotes.../policies/: Coverage policy PDFs (one per case number) used as the indexed retrieval corpus.
Ground Truth Schema
{
"<case_id>": {
"doctor_notes": ["cases/.../notes.pdf"],
"imaging": ["cases/.../imaging.pdf"],
"labs": ["cases/.../labs.pdf"],
"pa_form": ["cases/.../form.pdf"],
"policies": [],
"evaluation_time": "ISO datetime",
"decision": "approved | rejected",
"notes": "Evaluation context"
}
}
Note: File paths in
ground_truth.jsonare relative todata/. Thepoliciesfield is currently empty and reserved for linking case-specific policy references in future iterations.
Azure-Specific Integration
Azure APIM Endpoint Configuration
mcp_servers:
mcp-reference-data:
url: https://healthcare-mcp.azure-api.net/reference-data
auth: azure_ad_token
# Serves: NPI lookup, ICD-10 validation, CMS coverage (12 tools)
mcp-clinical-research:
url: https://healthcare-mcp.azure-api.net/clinical-research
auth: azure_ad_token
# Serves: FHIR, PubMed, ClinicalTrials.gov (20 tools)
cosmos-rag:
url: https://healthcare-mcp.azure-api.net/cosmos-rag
auth: azure_ad_token
# Serves: Document RAG, audit trail (6 tools)
FHIR Integration
Assessment data can be stored as FHIR resources:
{
"resourceType": "Claim",
"use": "preauthorization",
"status": "active",
"patient": { "reference": "Patient/123" },
"insurance": [{ "coverage": { "reference": "Coverage/456" }}]
}