claude-on-foundry

By Azure-Samples. Updated 6/11/2026

name: claude-on-foundry
description: >-
  End-to-end assistant skill for the Claude on Foundry Starter Kit (Azure-Samples/claude). Walks customers through deploying, verifying, modifying, debugging, and tearing down a Claude model deployment on Microsoft Foundry using either the Bicep or Terraform IaC variant in this repo, with one-command guidance via azd up, the Anthropic SDK, and the Claude Code CLI over Microsoft Entra ID.
  USE FOR: deploy Claude on Foundry, azd up fails, quota errors (715-123420, InsufficientQuota), region or model selection, add or remove a Claude family (haiku / sonnet / opus), AnthropicOrganizationCreationException, 401 / 403 from Claude SDK, soft-deleted accounts holding quota, Claude Code CLI wiring, Entra ID token refresh for long-running processes, clean teardown of the starter kit.
  DO NOT USE FOR: general Microsoft Foundry agent development (use microsoft-foundry skill); non-Claude model deployments such as OpenAI or open-source models (use azure-deploy); Azure cost analysis (use azure-cost-optimization); cross-tenant Entra ID administration unrelated to this starter.

Claude on Foundry — Starter Kit

This skill is the deep playbook for Azure-Samples/claude (aka.ms/claude/start). The always-on layer is .github/copilot-instructions.md in the same repo — read it first for repo shape, env-var contract, and hard rules. Use this skill when the customer's task falls into one of the five flows below.


Decision tree — where to start

| The customer says... | Jump to |
| --- | --- |
| "Help me deploy", "set this up", "get me started" | PLAN → DEPLOY |
| "It failed", "I'm getting an error", "azd up didn't work" | DIAGNOSE |
| "Is it working?", "did it actually deploy?", "test it" | VERIFY |
| "Add another family", "change the region", "switch from sonnet to opus" | MODIFY |
| "I'm done", "tear it down", "free the quota", "clean up" | TEARDOWN |

If the request doesn't fit any of these, fall back to the repo's README.md.
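
The routing above can be sketched as a small keyword matcher. This is only an illustration: the keyword lists and the `route` function are hypothetical, chosen to mirror the example phrases in the decision tree, and are not part of the repo.

```python
# Hypothetical keyword router for the decision tree. Order matters: failure
# and teardown phrases are checked before the generic "deploy" keyword.
FLOW_KEYWORDS = [
    ("DIAGNOSE", ("fail", "error", "didn't work")),
    ("TEARDOWN", ("tear", "clean up", "free the quota", "i'm done")),
    ("VERIFY", ("is it working", "did it actually", "test it")),
    ("MODIFY", ("add another", "change the region", "switch from")),
    ("PLAN", ("deploy", "set this up", "get me started")),  # PLAN, then DEPLOY
]


def route(request: str) -> str:
    """Return the flow to start with, or "README" when nothing matches."""
    text = request.lower()
    for flow, keywords in FLOW_KEYWORDS:
        if any(keyword in text for keyword in keywords):
            return flow
    return "README"
```

For example, `route("did it actually deploy?")` resolves to VERIFY rather than PLAN because the more specific phrases are tested first.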


PLAN — before azd up

The most common failures come from skipping these four checks. Walk through them with the customer once, then move on:

  1. Subscription eligibility. Claude on Foundry requires an Enterprise (EA) or MCA-E subscription. If the customer doesn't know, run:

    az account show --query "{name:name, id:id, type:subscriptionPolicies.quotaId}" -o table
    

    quotaId containing EnterpriseAgreement_2014-09-01 or MCAE_2024-07-01 is the green light.

  2. Region choice. Honor what the customer asks for. If they have no preference:

    • eastus2 — all three families (haiku, sonnet, opus). Default recommendation.
    • swedencentral — all three families. Use if they're EU-data-residency-conscious.
    • westus2 — sonnet + opus only (no haiku). Use only if explicitly requested.
  3. Model selection. Run the catalog tool first:

    ./Get-ClaudeCatalog.ps1 -Latest
    

    This shows the newest version per family with regions and TPM limits. Quote the table back to the customer and let them pick.

  4. Attestation fields. These are sent to Anthropic on every request and are part of accepting their commercial terms — get them right:

    • CLAUDE_ORGANIZATION_NAME — the customer's legal entity name (no default).
    • CLAUDE_COUNTRY_CODE — ISO-2, default US.
    • CLAUDE_INDUSTRY (lowercase only): technology | finance | healthcare | education | retail | manufacturing | government | media | other.

    If CLAUDE_INDUSTRY is uppercase or unknown, deployment fails with AnthropicOrganizationCreationException — easy to miss because it looks like a transient error.
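
The four checks can be expressed as one validation pass. The sketch below copies the quotaId markers, the three-region matrix, and the industry values from the list above; the function name and its signature are hypothetical, and the matrix covers only the regions named here (the catalog tool remains the source of truth).

```python
# Values copied from the PLAN checklist; plan_problems() itself is a hypothetical helper.
ELIGIBLE_QUOTA_MARKERS = ("EnterpriseAgreement_2014-09-01", "MCAE_2024-07-01")
REGION_FAMILIES = {
    "eastus2": {"haiku", "sonnet", "opus"},
    "swedencentral": {"haiku", "sonnet", "opus"},
    "westus2": {"sonnet", "opus"},
}
INDUSTRIES = {"technology", "finance", "healthcare", "education", "retail",
              "manufacturing", "government", "media", "other"}


def plan_problems(quota_id, region, families, org_name, industry, country="US"):
    """Return human-readable problems; an empty list means the plan looks OK."""
    problems = []
    if not any(marker in quota_id for marker in ELIGIBLE_QUOTA_MARKERS):
        problems.append(f"quotaId {quota_id!r} is not an EA or MCA-E subscription")
    available = REGION_FAMILIES.get(region)
    if available is None:
        problems.append(f"region {region!r} is not in the PLAN matrix")
    else:
        for family in sorted(set(families) - available):
            problems.append(f"{family} is not available in {region}")
    if not org_name.strip():
        problems.append("CLAUDE_ORGANIZATION_NAME has no default and must be set")
    if len(country) != 2 or not country.isalpha():
        problems.append("CLAUDE_COUNTRY_CODE must be a two-letter ISO code")
    if industry not in INDUSTRIES:  # also rejects uppercase such as "Technology"
        problems.append(f"CLAUDE_INDUSTRY {industry!r} must be one of the lowercase values")
    return problems
```

Running this before `azd up` would catch the uppercase-industry case locally instead of as an AnthropicOrganizationCreationException at deploy time.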


DEPLOY — running azd up

# Pick a variant. Default to Bicep unless the customer prefers Terraform.
cd infra-bicep   # or: cd infra-terraform

azd auth login                          # add --tenant-id <id> if needed
azd env new <name>                      # one env per region + variant combination
azd env set CLAUDE_ORGANIZATION_NAME "<legal-entity>"
azd env set AZURE_LOCATION "eastus2"

# Pick families — comment out any line to skip that family.
azd env set CLAUDE_HAIKU_MODEL  "claude-haiku-4-5"
azd env set CLAUDE_SONNET_MODEL "claude-sonnet-4-6"
azd env set CLAUDE_OPUS_MODEL   "claude-opus-4-8"

# Optional: auto-install the Claude Code CLI as part of postprovision.
azd env set CLAUDE_CODE_AUTO_INSTALL true

azd up

The preprovision hook runs scripts/preflight-claude.ps1 automatically. It is best-effort: it warns and continues when CLAUDE_ORGANIZATION_NAME / AZURE_LOCATION aren't set (azd / Bicep will prompt) or when az isn't installed / logged in (the RP will surface any errors at deploy time). It still hard-fails on a missing offer in the Anthropic catalog (exit 4) when that check can run — that's the case that turns into an opaque deploy failure otherwise. The quota check is a hard fail on Terraform (because azapi_resource swallows quota into the opaque 715-123420) and a warning on Bicep (because azd's own ARM preflight already prints InsufficientQuota and prompts to continue). Never suggest bypassing the preflight.
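
As a reading aid, the gate described above can be modelled as a pure function. This is a sketch of the documented behaviour, not the script's actual code: `preflight_exit_code` and its parameters are hypothetical. Exit code 4 is taken from this paragraph and exit code 6 from the DIAGNOSE table.

```python
def preflight_exit_code(variant, offer_found, quota_ok, az_available=True):
    """Model the documented preflight outcomes. Returns (exit_code, warnings)."""
    if not az_available:
        # Best-effort: without az the checks cannot run, so warn and continue.
        return 0, ["az not installed or not logged in; checks skipped"]
    if not offer_found:
        return 4, []  # a missing catalog offer hard-fails on both variants
    warnings = []
    if not quota_ok:
        if variant == "terraform":
            return 6, []  # would otherwise surface as the opaque 715-123420
        # Bicep: azd's own ARM preflight prints InsufficientQuota and prompts.
        warnings.append("quota looks short; expect InsufficientQuota from ARM preflight")
    return 0, warnings
```

The asymmetry is the point: the same quota shortfall yields `(6, [])` on Terraform and `(0, [warning])` on Bicep.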

The postprovision hook runs scripts/configure-claude-code.ps1 to wire up Claude Code. The customer can re-run it any time without re-deploying:

pwsh -File scripts/configure-claude-code.ps1

Bicep vs Terraform — which to recommend by default? Bicep. It surfaces a clear InsufficientQuota message; Terraform's azapi_resource bypasses ARM preflight and returns the opaque 715-123420 instead. Only pick Terraform if the customer's shop already standardizes on HCL.


DIAGNOSE — error→fix table

Match the customer's exact error string to a row. Verify the diagnostic command output before recommending the fix.

Provisioning failures (azd up fails)

Each entry lists: error fingerprint, then root cause, diagnostic, and fix.

AnthropicOrganizationCreationException / AnthropicOrganizationCreationFailed
  • Root cause: One of the three attestation fields is missing, or CLAUDE_INDUSTRY is uppercase or not one of the allowed values.
  • Diagnostic: azd env get-values | Select-String CLAUDE_ORGANIZATION_NAME, CLAUDE_COUNTRY_CODE, CLAUDE_INDUSTRY
  • Fix: azd env set CLAUDE_INDUSTRY technology (lowercase). Re-run azd up.

Project can only be created under AIServices Kind account with allowProjectManagement set to true
  • Root cause: Account property got downgraded.
  • Diagnostic: Check that the IaC wasn't edited to remove allowProjectManagement.
  • Fix: Restore the template; re-deploy.

Marketplace offer ... not found (from preflight, exit 4)
  • Root cause: CLAUDE_*_MODEL value is misspelled or that SKU isn't in the catalog.
  • Diagnostic: ./Get-ClaudeCatalog.ps1 and grep the family.
  • Fix: Set CLAUDE_<FAMILY>_MODEL to a name from the catalog.

Quota insufficient (from preflight, exit 6)
  • Root cause: Requested capacity + existing usage > per-region limit.
  • Diagnostic: az cognitiveservices usage list -l <region> --query "[?contains(name.value,'claude-')]"
  • Fix: Lower CLAUDE_<FAMILY>_CAPACITY, free quota (see the soft-delete entry below), or request a quota bump in the Foundry portal.

Bicep: InsufficientQuota: This operation require N new capacity in quota Tokens Per Minute (thousands) - Claude <model>
  • Root cause: Same as the previous entry; Bicep gets the clear message because it goes through ARM preflight.
  • Diagnostic: Same as the previous entry.
  • Fix: Same as the previous entry.

GatewayTimeout: The gateway did not receive a response from 'Microsoft.CognitiveServices' within the specified time period. (deployment stuck in Creating)
  • Root cause: ARM-layer poll timeout on a slow LRO, not a real failure. The RP keeps provisioning after ARM gives up; the deployment usually reaches Succeeded minutes later. More likely on first-time deploys; varies by region and family.
  • Diagnostic: az cognitiveservices account deployment list -g <rg> -n <foundry-account> -o table, then check provisioningState.
  • Fix: If Succeeded: run azd env refresh and proceed. If still Creating: wait it out with pwsh -File scripts/verify-claude-code.ps1 -WaitForDeployment (POSIX: --wait-for-deployment), which polls until terminal state. Do not re-run azd up; it can collide with the in-flight LRO.

Terraform: opaque 400 715-123420 "An error occurred. Please reach out to support for additional assistance."
  • Root cause: Almost always insufficient quota. Terraform's azapi_resource skips ARM preflight, so the RP returns this generic code.
  • Diagnostic: az cognitiveservices usage list -l <region> --query "[?contains(name.value,'<model>')].{quota:name.value, used:currentValue, limit:limit}" -o table
  • Fix: If used + requested > limit: lower capacity OR purge soft-deleted accounts (next entry). Re-run on the Bicep variant if you need a clearer error.

Quota looks full but no live deployments exist
  • Root cause: Soft-deleted Cognitive Services accounts hold quota for up to 48 h.
  • Diagnostic: az cognitiveservices account list-deleted -o table
  • Fix: Confirm with the user first, then for each account: az cognitiveservices account purge --name <n> --location <loc> --resource-group <rg>. The original RG name is the 9th '/'-separated field of the deleted account's id (index 8 after a zero-based split).

Marketplace Subscription purchase eligibility check failed
  • Root cause: Subscription can't purchase the Anthropic offer (no entitlement / sandbox / paid-offer policy).
  • Diagnostic: Confirm sub type (see PLAN).
  • Fix: Either use a Claude-eligible sub, or pre-accept explicitly: az term accept --publisher anthropic --product anthropic-<model>-offer --plan anthropic-<model>-plan-new.

Region not available
  • Root cause: Region doesn't host the requested family.
  • Diagnostic: Compare AZURE_LOCATION to the per-region matrix in PLAN step 2.
  • Fix: Use eastus2 or swedencentral (all three families), or westus2 (sonnet/opus only).
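
Matching an error string to a fingerprint can be sketched as an ordered substring lookup. The fingerprints below are substrings of the error texts listed above; the `triage` function and the one-line hints are hypothetical shorthand for the full fixes, not code from the repo.

```python
# Ordered (substring, hint) pairs. 715-123420 is checked before the generic
# InsufficientQuota so the Terraform-specific hint wins when both could apply.
PROVISIONING_FINGERPRINTS = [
    ("AnthropicOrganizationCreation", "attestation fields: check CLAUDE_INDUSTRY is lowercase"),
    ("allowProjectManagement", "restore allowProjectManagement in the template"),
    ("GatewayTimeout", "ARM poll timeout: check provisioningState, do not re-run azd up"),
    ("715-123420", "likely quota on Terraform: compare used + requested to limit"),
    ("InsufficientQuota", "quota: lower capacity, purge soft-deleted accounts, or request more"),
    ("purchase eligibility check failed", "subscription cannot buy the offer: confirm sub type"),
]


def triage(error_text):
    """Return a short hint for the first matching fingerprint, else None."""
    for needle, hint in PROVISIONING_FINGERPRINTS:
        if needle in error_text:
            return hint
    return None
```

A `None` result means the error is not in this list, in which case the diagnostic output should be collected before guessing at a fix.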

Inference / runtime failures

Each entry lists: error fingerprint, then root cause, diagnostic, and fix.

404 Not Found on first SDK call
  • Root cause: base_url is missing the /anthropic suffix.
  • Diagnostic: Print the base_url the script is using.
  • Fix: Append /anthropic so it's https://<resource>.services.ai.azure.com/anthropic.

401 Unauthorized on first call
  • Root cause: Token scope wrong, or no az login.
  • Diagnostic: az account get-access-token --scope https://ai.azure.com/.default --query expiresOn
  • Fix: az login (add --tenant <id> if Foundry is in a different tenant). Scope must be https://ai.azure.com/.default.

401 Unauthorized after ~1 hour of running
  • Root cause: Captured token expired; the plain Anthropic client doesn't auto-refresh.
  • Diagnostic: Check how long the process has been alive.
  • Fix: Switch to src/hello_claude_token_refresh.py, which uses AnthropicIdentity + get_bearer_token_provider for per-request refresh.

401 PermissionDenied: Principal does not have access to API/Operation (intermittent; passes seconds later)
  • Root cause: Data-plane RBAC propagation lag right after a role grant.
  • Diagnostic: az role assignment list --assignee <oid> --scope <foundry-account-id> -o table
  • Fix: Wait 1-3 minutes and retry. Do NOT suggest disabling retries.

403 Forbidden consistently
  • Root cause: Caller has no data-plane role on the Foundry account.
  • Diagnostic: Same az role assignment list query.
  • Fix: Grant Cognitive Services User (minimum), Foundry User, or Azure AI Developer. See the one-liner in README → Granting data-plane roles after azd up.

claude -p says: The model claude-<family>-... is not available on your foundry deployment
  • Root cause: User-global ~/.claude/settings.json pins a family this workspace didn't deploy, overriding the workspace pin.
  • Diagnostic: cat .claude/settings.json and cat ~/.claude/settings.json.
  • Fix: Re-run pwsh -File scripts/configure-claude-code.ps1, OR pass --model <sonnet|opus|haiku> explicitly, OR (with user OK) edit the user-global file to remove the "model" line.

Windows: UnicodeEncodeError: 'charmap' codec can't encode character '\U0001f60a'
  • Root cause: Console codepage is cp1252; Claude returned emoji.
  • Diagnostic: chcp shows cp1252.
  • Fix: $env:PYTHONIOENCODING = "utf-8" or chcp 65001.

check_claude_quota.py exits with Could not resolve a subscription id ... [WinError 2]
  • Root cause: Azure CLI not on PATH in the active shell.
  • Diagnostic: Get-Command az returns nothing.
  • Fix: $env:AZURE_SUBSCRIPTION_ID = "<sub-id>" or pass --subscription <sub-id>.
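
The 404 case above comes down to one string check on the endpoint. A minimal sketch, assuming the endpoint shape shown in that entry; `foundry_base_url` is a hypothetical helper, not a function from the repo or the Anthropic SDK.

```python
def foundry_base_url(endpoint: str) -> str:
    """Return the endpoint with exactly one trailing /anthropic path segment."""
    url = endpoint.rstrip("/")
    if not url.endswith("/anthropic"):
        url += "/anthropic"
    return url
```

Normalizing the value once, where the client is constructed, avoids both the missing suffix and a doubled one when the environment variable already includes it.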

VERIFY — prove it works

Run the bundled verifier; it covers every check below in one shot:

pwsh -File scripts/verify-claude-code.ps1                    # all checks + claude -p per deployed family
pwsh -File scripts/verify-claude-code.ps1 -SkipClaudeCall    # config only (no tokens spent)
pwsh -File scripts/verify-claude-code.ps1 -RunPythonSample   # also runs python src/hello_claude.py

POSIX: bash scripts/verify-claude-code.sh [--skip-claude-call|--run-python-sample]. Exits non-zero on hard failures — wire into CI if needed.
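
For CI wiring, the invocation can be built as an argument list and run with `subprocess`. The sketch below uses only the script paths and flags listed above and picks one mode per run, as documented; `verify_command` is a hypothetical helper.

```python
def verify_command(mode="all", posix=False):
    """Build the argv for the bundled verifier in one of its documented modes."""
    flags = {
        "all": (None, None),
        "skip-claude-call": ("-SkipClaudeCall", "--skip-claude-call"),
        "run-python-sample": ("-RunPythonSample", "--run-python-sample"),
    }
    if mode not in flags:
        raise ValueError(f"unknown mode: {mode!r}")
    powershell_flag, posix_flag = flags[mode]
    if posix:
        argv = ["bash", "scripts/verify-claude-code.sh"]
        flag = posix_flag
    else:
        argv = ["pwsh", "-File", "scripts/verify-claude-code.ps1"]
        flag = powershell_flag
    return argv + ([flag] if flag else [])
```

Passing the result to `subprocess.run(..., check=True)` from the repo root raises `CalledProcessError` on a non-zero exit, which is how a CI job would surface a hard failure.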

If the customer wants to spot-check manually:

. ./claude-code.env.ps1                # PowerShell. POSIX: source ./claude-code.env.sh
'who are you?' | claude -p             # one-shot non-interactive probe
claude                                 # interactive REPL; try /status and /model

/status should report API provider: Microsoft Foundry. If it doesn't, the activator wasn't sourced or .vscode/settings.json wasn't picked up — reload the VS Code window and retry.


MODIFY — common post-deploy changes

| Task | Steps |
| --- | --- |
| Add another family | azd env set CLAUDE_<FAMILY>_MODEL "claude-<family>-x-y" → azd up (incremental; existing deployments untouched) → re-source claude-code.env.ps1 |
| Remove a family | Delete it in the portal (or edit Bicep/TF to remove the resource) → azd up → re-source the activator |
| Bump capacity | azd env set CLAUDE_<FAMILY>_CAPACITY <new> → azd up. Preflight will block if quota is short. |
| Switch region | azd env new <name-region> in the same variant folder, then redo the DEPLOY flow. Don't try to mutate AZURE_LOCATION on an existing env; the account is region-stamped. |
| Switch variants (Bicep ↔ Terraform) | They produce equivalent infra but with different azd env state. Create a new env in the other folder: cd infra-terraform && azd env new <name> && .... |
| Refresh Claude Code wiring | pwsh -File scripts/configure-claude-code.ps1 (or the .sh variant). Idempotent; runs without re-deploying. |
| Wire up the Claude Code VS Code extension | azd env set CLAUDE_WRITE_VSCODE_SETTINGS 1 (then re-run azd provision or the configure script). Opt-in because the activator + .claude/settings.json are enough for the CLI and SDK; only the Anthropic Claude Code VS Code extension needs claudeCode.* keys in workspace settings. |
| Convert to long-running auth | Replace Anthropic(auth_token=...) with AnthropicIdentity(azure_ad_token_provider=...) from src/hello_claude_token_refresh.py. |
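
The idea behind the long-running auth conversion is that the client asks for a token on every request and the provider decides when to fetch a new one. The sketch below illustrates that pattern without any Azure dependency. It is not the repo's AnthropicIdentity: `RefreshingToken`, its `fetch` callable, and the five-minute skew are all assumptions made for the example.

```python
import time


class RefreshingToken:
    """Callable that returns a cached token and refetches it shortly before expiry."""

    def __init__(self, fetch, skew_seconds=300, clock=time.time):
        self._fetch = fetch        # hypothetical callable -> (token, expires_on_epoch)
        self._skew = skew_seconds  # refresh this many seconds before expiry
        self._clock = clock        # injectable so the behaviour can be tested
        self._token = None
        self._expires_on = 0.0

    def __call__(self):
        if self._token is None or self._clock() >= self._expires_on - self._skew:
            self._token, self._expires_on = self._fetch()
        return self._token
```

A token captured once at startup fails after roughly an hour, whereas a provider like this is called per request and only hits the identity service when the cached token is close to expiring.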

TEARDOWN — the full cleanup sequence

azd down alone does not free quota. Soft-deleted Cognitive Services accounts continue to count against per-region TPM for up to 48 hours. The correct sequence:

# 1. Tear down what azd provisioned. Confirm with the user before running.
cd infra-bicep   # or infra-terraform — whichever variant deployed
azd down --purge --force

# 2. List soft-deleted Cognitive Services accounts in the region you used.
az cognitiveservices account list-deleted --query "[?location=='eastus2']" -o table

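# 3. Confirm with the user, then purge each account in that region. The RG name is the
#    9th '/'-separated field of the id (index 8 after a zero-based split).
$accounts = az cognitiveservices account list-deleted --query "[?location=='eastus2']" -o json | ConvertFrom-Json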
foreach ($a in $accounts) {
    $rg = ($a.id -split '/')[8]
    az cognitiveservices account purge --name $a.name --location $a.location --resource-group $rg
}

# 4. Verify the quota is freed.
az cognitiveservices usage list -l eastus2 --query "[?contains(name.value,'claude-')]" -o table

Always confirm with the user before running step 1 or step 3. Both are destructive and irreversible.
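
The confirmation rule can be enforced in code by refusing to build purge commands until a flag is set. A sketch under stated assumptions: both function names are hypothetical, and the id layout (`.../resourceGroups/<rg>/deletedAccounts/<name>`) is inferred from the index used in step 3 rather than quoted from the repo.

```python
def resource_group_from_deleted_id(resource_id: str) -> str:
    """Return the 9th '/'-separated field (index 8) of a soft-deleted account id."""
    parts = resource_id.split("/")
    # Assumed layout: index 7 is the literal "resourceGroups" segment.
    if len(parts) < 9 or parts[7].lower() != "resourcegroups":
        raise ValueError(f"unexpected deleted-account id: {resource_id!r}")
    return parts[8]


def purge_commands(accounts, confirmed=False):
    """Build one az purge argv per account; refuse unless the user has confirmed."""
    if not confirmed:
        raise PermissionError("show the account list and get explicit confirmation first")
    return [
        ["az", "cognitiveservices", "account", "purge",
         "--name", account["name"],
         "--location", account["location"],
         "--resource-group", resource_group_from_deleted_id(account["id"])]
        for account in accounts
    ]
```

Building the argument lists separately from running them also gives the customer a chance to review exactly which accounts would be purged.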

The full POSIX parallel-purge snippet lives in README → Free quota held by soft-deleted accounts.
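
Step 4 of the sequence can also be checked programmatically by parsing the JSON form of `az cognitiveservices usage list` (run with `-o json`). The field names `name.value`, `currentValue`, and `limit` are the ones used in the DIAGNOSE query; `claude_headroom` and the quota name in the usage example are hypothetical.

```python
import json


def claude_headroom(usage_json: str, requested: dict) -> dict:
    """Map each Claude quota name to limit - used - requested (negative means short)."""
    headroom = {}
    for row in json.loads(usage_json):
        name = row["name"]["value"]
        if "claude-" not in name:
            continue  # same filter as the contains(name.value,'claude-') query
        headroom[name] = row["limit"] - row["currentValue"] - requested.get(name, 0)
    return headroom
```

With a limit of 100, 80 in use, and 30 requested under an illustrative quota name such as `claude-sonnet-tpm`, the result for that name is -10, which is the `used + requested > limit` condition from DIAGNOSE.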


Scripts cheat sheet (paths from repo root)

| Script | Purpose | Key flags |
| --- | --- | --- |
| Get-ClaudeCatalog.ps1 | Browse models × regions × quota | -Latest, -View Detail/Matrix/Summary |
| scripts/preflight-claude.ps1 | Standalone catalog + quota gate | runs automatically via preprovision hook |
| scripts/configure-claude-code.ps1 | Generate Claude Code wiring (activator + .vscode/settings.json + .claude/settings.json) | idempotent; safe to re-run |
| scripts/verify-claude-code.ps1 | End-to-end smoke test | -SkipClaudeCall, -RunPythonSample, -AutoInstall |
| src/check_claude_quota.py | Programmatic quota + capacity inspection | --regions, --models, --subscription, --tenant, --json |
| src/hello_claude.py | One-shot Messages call (Entra ID) | |
| src/hello_claude_token_refresh.py | Long-running variant with per-request refresh | use for daemons / notebooks |
| src/chat_stream.py | Streaming multi-turn REPL | exit to quit |

Each .ps1 has a .sh POSIX equivalent next to it.


Safety checklist

  • Before any az cognitiveservices account purge — show the customer the account list and get explicit confirmation. The operation is irreversible.
  • Before azd down — confirm the customer is ready to lose the deployment. Suggest --purge only if they want quota freed immediately.
  • Before az role assignment delete — show the role and scope first.
  • Never write CLAUDE_API_KEY, subscription IDs, tenant IDs, or tokens into any tracked file.
  • Never suggest bypassing preflight-claude.ps1 or passing --no-prompt to skip hooks.
  • Never edit ~/.claude/settings.json without first showing the customer the current contents and getting OK.
  • Never mix Bicep and Terraform variants in the same azd env.
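
The rule about tracked files can be backed by a quick scan before committing. This is a rough sketch, not a replacement for a real secret scanner: the two patterns and the `find_secret_like` function are assumptions for illustration, and a GUID match only indicates a possible subscription or tenant ID.

```python
import re

# GUID shape used by Azure subscription and tenant IDs.
GUID = re.compile(r"\b[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\b")
# CLAUDE_API_KEY followed by = or : and a non-empty value.
API_KEY_ASSIGNMENT = re.compile(r"CLAUDE_API_KEY\s*[=:]\s*\S+")


def find_secret_like(text: str) -> list:
    """Return (line_number, reason) pairs for lines that look like they embed an ID or key."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if API_KEY_ASSIGNMENT.search(line):
            hits.append((number, "CLAUDE_API_KEY value"))
        elif GUID.search(line):
            hits.append((number, "GUID (possible subscription or tenant ID)"))
    return hits
```

Any hit should be reviewed by hand; placeholders such as `<sub-id>` do not match either pattern.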

Reference

Install via CLI
npx skills add https://github.com/Azure-Samples/claude --skill claude-on-foundry