name: aoai-model-migration
description: >
  Migrate Azure OpenAI applications from GPT-4o/GPT-4o-mini to newer models (GPT-4.1, GPT-5, GPT-5.1 through GPT-5.4, o-series). Covers API changes, client configuration, parameter adaptation, prompt adjustments, and authentication.
  USE FOR: migrate model, switch model, upgrade model, GPT-4o replacement, AzureOpenAI to OpenAI client, v1 API, max_completion_tokens, reasoning_effort, developer role, system role, parameter adaptation, client factory, model classification.
  DO NOT USE FOR: retirement dates or lifecycle planning (use aoai-model-lifecycle), evaluation or A/B testing (use aoai-migration-evaluation).
Azure OpenAI Model Migration Skill
⚠️ Retirement dates and model availability change frequently. Always verify against the official Azure OpenAI Model Retirements page.
Purpose
Guide developers through migrating Azure OpenAI applications from GPT-4o / GPT-4o-mini to newer model families (GPT-4.1, GPT-5, GPT-5.1 through GPT-5.4) and between o-series reasoning models (o1 → o3, o3-mini → o4-mini). This skill covers API surface changes, client configuration, parameter adaptation, and prompt adjustments.
When to Use
- Migrating from GPT-4o or GPT-4o-mini to any newer Azure OpenAI model
- Migrating o-series models (o1 → o3, o3-mini → o4-mini)
- Adapting code to the new v1 API (/openai/v1/) used by GPT-4.1+ and GPT-5+
- Adapting parameters and system prompts for reasoning models (GPT-5, GPT-5.1, GPT-5.2, o-series)
- Choosing the right replacement model for a given workload
Migration Paths
GPT Series
| Source Model | Target Model | Type | Best For |
|---|---|---|---|
| GPT-4o / GPT-4.1 | GPT-5.4-mini | Reasoning | Recommended — comparable quality at lower cost/latency (tier-down strategy) |
| GPT-4o-mini / GPT-4.1-mini | GPT-5.4-nano | Reasoning | Recommended — comparable quality at a fraction of the cost |
| GPT-4o | GPT-5.1 | Reasoning | Official auto-migration target (Standard deployments, completed March 2026) |
| GPT-4o | GPT-5.4 | Reasoning | Best overall quality (Mar 2026), longest runway |
| GPT-4o-mini | GPT-4.1-mini | Standard | Official auto-migration target (Standard deployments) |
💡 Tier-down strategy: Newer-generation smaller models match or exceed older-generation larger ones with better latency and lower cost. Target GPT-5.4-mini instead of GPT-4.1/GPT-5, and GPT-5.4-nano instead of GPT-4.1-mini — longer runway (Sep 2027), better quality-to-cost tradeoff.
📝 Note: GPT-4o Standard deployments were auto-upgraded to GPT-5.1 and retired on 2026-03-31. GPT-4.1 family was deprecated on 2026-04-14 (no new customers).
o-Series (Reasoning Models)
| Source Model | Target Model | Type | Best For |
|---|---|---|---|
| o1 | o3 | Reasoning | Successor reasoning model |
| o3-mini | o4-mini | Reasoning | Faster, cheaper reasoning |
| o1-pro | o3-pro | Reasoning | Pro-tier reasoning |
How to Choose
| Priority | GPT-4o replacement | GPT-4o-mini replacement |
|---|---|---|
| Best quality/latency tradeoff | GPT-5.4-mini | GPT-5.4-nano |
| Best overall quality | GPT-5.4 | GPT-5.4-mini |
| Best reasoning / agentic | GPT-5.4 | GPT-5.4-mini |
| Lowest cost | GPT-5.4-nano | GPT-5.4-nano |
Key API Changes
1. Client Configuration
GPT-4.1+ and GPT-5+ use the v1 API, which requires the OpenAI client instead of AzureOpenAI.
Before (GPT-4o — versioned API):
from openai import AzureOpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

# AZURE_OPENAI_ENDPOINT is assumed to hold your resource URL,
# e.g. https://<resource>.openai.azure.com
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(),
    "https://cognitiveservices.azure.com/.default"
)

client = AzureOpenAI(
    azure_ad_token_provider=token_provider,
    api_version="2024-12-01-preview",
    azure_endpoint=AZURE_OPENAI_ENDPOINT
)
After (GPT-4.1 / GPT-5 — v1 API):
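from openai import OpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(),
    "https://cognitiveservices.azure.com/.default"
)

# token_provider() fetches one Entra ID token, and that token expires.
# For long-running processes, recreate the client periodically or, if your
# openai SDK version accepts a callable api_key, pass token_provider itself.
client = OpenAI(
    api_key=token_provider(),
    base_url=f"{AZURE_OPENAI_ENDPOINT}/openai/v1/"
)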
2. Model Family Classification
Use these sets to determine which API and parameters a model requires:
# Models using the new v1 API (OpenAI client with /openai/v1/ endpoint)
V1_MODELS = {
    "gpt-4.1", "gpt-4.1-mini", "gpt-4.1-nano",
    "gpt-5", "gpt-5.1", "gpt-5.2", "gpt-5-mini", "gpt-5-nano",
    "gpt-5-pro", "gpt-5-codex", "gpt-5.1-codex", "gpt-5.1-codex-mini",
    "gpt-5.2-codex", "gpt-5.3-codex",
    "gpt-5.4", "gpt-5.4-pro", "gpt-5.4-mini", "gpt-5.4-nano",
    "codex-mini",
}

# Reasoning models (no temperature/top_p, use max_completion_tokens, developer role)
REASONING_MODELS = {
    "gpt-5", "gpt-5.1", "gpt-5.2", "gpt-5-mini", "gpt-5-nano",
    "gpt-5-pro", "gpt-5.3-codex", "gpt-5.2-codex",
    "gpt-5.4", "gpt-5.4-pro", "gpt-5.4-mini", "gpt-5.4-nano",
}

# o-series reasoning models (also no temperature/top_p, use max_completion_tokens)
# Note: o-series use the classic AzureOpenAI client, NOT the v1 API
O_SERIES_MODELS = {
    "o1", "o1-pro", "o3-mini", "o3", "o3-pro", "o3-deep-research", "o4-mini",
}
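The sets above can drive small lookup helpers. The sketch below is one possible shape: the helper names echo those listed for src/config.py later in this skill, but the bodies, the lowercasing of names, and the describe() summary are assumptions for illustration, not the repo's implementation. The sets are abbreviated so the block runs on its own.

```python
# Abbreviated copies of the sets above, so this sketch is self-contained.
V1_MODELS = {"gpt-4.1", "gpt-4.1-mini", "gpt-5", "gpt-5.1", "gpt-5.4", "gpt-5.4-mini"}
REASONING_MODELS = {"gpt-5", "gpt-5.1", "gpt-5.4", "gpt-5.4-mini"}
O_SERIES_MODELS = {"o1", "o3", "o3-mini", "o4-mini"}


def is_v1(model_name: str) -> bool:
    """True when the model is served through the /openai/v1/ endpoint."""
    return model_name.lower() in V1_MODELS


def is_reasoning(model_name: str) -> bool:
    """True for GPT-5.x reasoning models."""
    return model_name.lower() in REASONING_MODELS


def is_o_series(model_name: str) -> bool:
    """True for o-series reasoning models."""
    return model_name.lower() in O_SERIES_MODELS


def describe(model_name: str) -> dict:
    """Summarize the client, token parameter, and system role a model expects."""
    reasoning = is_reasoning(model_name) or is_o_series(model_name)
    # Per the parameter table in this skill, only GPT-4o-era models keep max_tokens.
    new_token_param = is_v1(model_name) or reasoning
    return {
        "client": "OpenAI" if is_v1(model_name) else "AzureOpenAI",
        "token_param": "max_completion_tokens" if new_token_param else "max_tokens",
        "system_role": "developer" if reasoning else "system",
    }
```

Calling describe("o3") shows why the three sets are kept separate: o-series models stay on the classic client but still take the reasoning-style parameters and role.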
3. Parameter Adaptation
| Parameter | GPT-4o | GPT-4.1 | GPT-5 / GPT-5.x | o-series (o1, o3, o4-mini) |
|---|---|---|---|---|
| max_tokens | Supported | Use max_completion_tokens | Use max_completion_tokens | Use max_completion_tokens |
| temperature | Supported | Supported | Not supported (remove it) | Not supported (remove it) |
| top_p | Supported | Supported | Not supported (remove it) | Not supported (remove it) |
| reasoning_effort | N/A | N/A | See below | Supported |
| System role | "system" | "system" | "developer" | "developer" |
Parameter adaptation pattern:
def adapt_params(model_name: str, params: dict) -> dict:
    """Adapt parameters for the target model."""
    adapted = params.copy()
    # max_tokens → max_completion_tokens for v1 models and o-series models
    # (o-series are not in V1_MODELS but also require max_completion_tokens)
    needs_new_token_param = model_name in V1_MODELS or model_name in O_SERIES_MODELS
    if needs_new_token_param and "max_tokens" in adapted:
        adapted["max_completion_tokens"] = adapted.pop("max_tokens")
    # Reasoning models don't support temperature/top_p
    if model_name in REASONING_MODELS or model_name in O_SERIES_MODELS:
        adapted.pop("temperature", None)
        adapted.pop("top_p", None)
    return adapted
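Silently dropping parameters can hide behavior changes, so it may help to log what an adaptation would do before applying it. The following is a minimal sketch under that assumption; plan_param_changes is a hypothetical helper (not part of the repo), and the model sets are abbreviated so the block runs standalone.

```python
# Abbreviated model sets; see the full sets in the classification section.
REASONING_MODELS = {"gpt-5", "gpt-5.1", "gpt-5.4-mini"}
O_SERIES_MODELS = {"o1", "o3", "o4-mini"}
V1_MODELS = {"gpt-4.1", "gpt-4.1-mini"} | REASONING_MODELS


def plan_param_changes(model_name: str, params: dict) -> list:
    """List the edits the parameter table calls for, without applying them."""
    changes = []
    reasoning = model_name in REASONING_MODELS or model_name in O_SERIES_MODELS
    # Every family after GPT-4o expects max_completion_tokens.
    if "max_tokens" in params and (model_name in V1_MODELS or reasoning):
        changes.append("rename max_tokens -> max_completion_tokens")
    # Sampling parameters are rejected by reasoning models.
    if reasoning:
        for key in ("temperature", "top_p"):
            if key in params:
                changes.append(f"remove {key}")
    return changes


planned = plan_param_changes("gpt-5.1", {"max_tokens": 500, "temperature": 0.2})
for change in planned:
    print(f"gpt-5.1: {change}")
```

Logging the plan during a canary phase makes it easier to spot callers that relied on temperature or top_p for output style.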
4. Reasoning Effort
| Model | Type | reasoning_effort levels | Default |
|---|---|---|---|
| GPT-4.1 / 4.1-mini / 4.1-nano | Standard | N/A (no reasoning) | — |
| GPT-5 / 5-mini / 5-nano | Reasoning | minimal, low, medium, high | medium |
| GPT-5.1 | Reasoning | none, low, medium, high | none |
| GPT-5.2 / 5.3-codex / 5.4 / 5.4-pro | Reasoning | none, low, medium, high | none |
| GPT-5.4-mini / 5.4-nano | Reasoning | none, low, medium, high | none |
| o-series (o1, o3, o4-mini) | Reasoning | low, medium, high | medium |
Important: reasoning_effort="none" is only supported from GPT-5.1 onwards (GPT-5.1, GPT-5.2, and the later GPT-5.x models listed in the table above). For GPT-5, GPT-5-mini, and GPT-5-nano the minimum is "minimal", which still incurs reasoning tokens and added latency.
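Because the accepted levels differ by family, a shared setting such as "none" may be rejected by some targets. The sketch below shows one way to resolve a requested level against the table above. The resolve_effort helper and its policy (round up to the nearest supported level, never down) are assumptions for illustration, and the model list is abbreviated.

```python
from typing import Optional

# Levels ordered from least to most reasoning.
ORDER = ("none", "minimal", "low", "medium", "high")

GPT5_LEVELS = ("minimal", "low", "medium", "high")
GPT51_PLUS_LEVELS = ("none", "low", "medium", "high")
O_SERIES_LEVELS = ("low", "medium", "high")

# Allowed levels per model, copied from the table above (abbreviated).
EFFORT_LEVELS = {
    "gpt-5": GPT5_LEVELS, "gpt-5-mini": GPT5_LEVELS, "gpt-5-nano": GPT5_LEVELS,
    "gpt-5.1": GPT51_PLUS_LEVELS, "gpt-5.4": GPT51_PLUS_LEVELS,
    "gpt-5.4-mini": GPT51_PLUS_LEVELS, "gpt-5.4-nano": GPT51_PLUS_LEVELS,
    "o1": O_SERIES_LEVELS, "o3": O_SERIES_LEVELS, "o4-mini": O_SERIES_LEVELS,
}


def resolve_effort(model_name: str, requested: str) -> Optional[str]:
    """Return a level the model accepts, or None if it takes no reasoning_effort."""
    levels = EFFORT_LEVELS.get(model_name)
    if levels is None:
        return None  # e.g. GPT-4.1: omit the parameter entirely
    if requested not in ORDER:
        raise ValueError(f"unknown reasoning_effort: {requested!r}")
    # Assumed policy: walk upward from the requested level to the first supported one.
    for level in ORDER[ORDER.index(requested):]:
        if level in levels:
            return level
    return levels[-1]  # defensive fallback; "high" is supported by every family listed
```

With this policy, a request for "none" on GPT-5 resolves to "minimal", which matches the note above that GPT-5 cannot fully disable reasoning.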
5. System Role for Reasoning Models
GPT-5.x and o-series models use "developer" instead of "system" for the system message role:
# GPT-4o / GPT-4.1
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": query},
]

# GPT-5.x / o-series
messages = [
    {"role": "developer", "content": "You are a helpful assistant."},
    {"role": "user", "content": query},
]
Automatic role adaptation pattern:
def uses_developer_role(model_name: str) -> bool:
    return model_name in REASONING_MODELS or model_name in O_SERIES_MODELS

# In your calling code:
if uses_developer_role(model_name):
    messages = [
        {**m, "role": "developer"} if m.get("role") == "system" else m
        for m in messages
    ]
6. Client Factory Pattern
Use a factory function to create the right client for any model:
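from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI, OpenAI

# Entra ID token provider shared by both client types
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(),
    "https://cognitiveservices.azure.com/.default"
)

def create_client(model_name: str, endpoint: str, api_key: str | None = None) -> AzureOpenAI | OpenAI:
    """Create the appropriate client for a given model."""
    if model_name in V1_MODELS:
        # v1 models: plain OpenAI client pointed at the /openai/v1 endpoint
        base_url = endpoint.rstrip("/") + "/openai/v1"
        return OpenAI(base_url=base_url, api_key=api_key or token_provider())
    else:
        # GPT-4o and o-series: classic client with a versioned API.
        # api_key is ignored here; this branch always uses Entra ID.
        return AzureOpenAI(
            azure_endpoint=endpoint,
            azure_ad_token_provider=token_provider,
            api_version="2024-12-01-preview",
        )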
Error Handling
call_model() provides actionable error messages for common failures:
from src.clients import call_model, create_client

client = create_client("gpt-5.1")
try:
    response = call_model(client, "gpt-5.1", messages)
except RuntimeError as e:
    # Raises descriptive errors:
    # - "Deployment 'gpt-5.1' not found. Check your deployment name..."
    # - "Authentication failed. Run 'az login' for Entra ID auth..."
    print(f"Migration issue: {e}")
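For readers without the repo at hand, the sketch below illustrates how such messages could be derived from an HTTP status code. It is a hypothetical helper, not the implementation of call_model(); the message wording is paraphrased from the comments above, and the 400 hint is an added assumption about how unsupported parameters commonly surface. In the openai Python SDK, status-bearing exceptions expose a status_code attribute that could feed a function like this.

```python
from typing import Optional


def describe_failure(status_code: int, deployment: str) -> Optional[str]:
    """Map an HTTP status from the service to a migration-oriented hint."""
    if status_code == 404:
        # Usually a deployment-name mismatch rather than a missing model.
        return f"Deployment '{deployment}' not found. Check your deployment name."
    if status_code == 401:
        return "Authentication failed. Run 'az login' for Entra ID auth."
    if status_code == 400:
        # Assumption: rejected requests after a migration often involve
        # parameters the target model no longer accepts.
        return (
            "Request rejected. Check for unsupported parameters such as "
            "temperature, top_p, or max_tokens on the target model."
        )
    return None  # no migration-specific hint for other statuses


hint = describe_failure(404, "gpt-5.1")
if hint is not None:
    print(f"Migration issue: {hint}")
```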
Repository Resources
This repo provides reusable modules under src/:
- src/config.py: Model family helpers (is_v1(), is_reasoning(), is_o_series(), uses_developer_role()), environment config
- src/clients.py: Client factory (create_client()), parameter-adapting call_model() with automatic role adaptation
- src/evaluate/: Full evaluation framework for comparing models (see aoai-migration-evaluation skill)
Deep-dive documentation (always check these for the latest dates and guidance):
- docs/retirement-timeline.md: authoritative retirement dates and planning matrix
- docs/migration-paths.md: detailed migration paths with decision trees
- docs/api-changes-by-model.md: comprehensive API changes reference
- docs/evaluation-guide.md: evaluation methodology and setup
- samples/rag_pipeline/: working end-to-end migration example
💡 Tip: This skill provides quick guidance for common migration tasks. For the latest model dates, detailed walkthroughs, and working code samples, always check the repo documentation above — it is updated more frequently than this skill.
Steps for a Migration
- Identify your target model using the migration paths table above. Consider the tier-down strategy (GPT-5.4-mini/nano) for the best cost-quality tradeoff.
- Update client initialization: switch from AzureOpenAI to OpenAI for v1 models.
- Adapt parameters: replace max_tokens with max_completion_tokens, and remove temperature/top_p for reasoning models.
- Update system message role: use "developer" for GPT-5.x and o-series models.
- Set reasoning_effort if using a reasoning model: GPT-5.4-mini supports "none" for zero reasoning overhead; start with "low" for cost-sensitive workloads.
- Run evaluations to validate that the new model matches or exceeds the old model's quality (see aoai-migration-evaluation skill).
- Deploy progressively: use a canary rollout for high-traffic workloads.
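The parameter, role, and reasoning_effort steps above can be combined into a single request builder. This is a minimal sketch, assuming abbreviated model sets and a hypothetical build_request helper; it only assembles keyword arguments and does not call the service.

```python
from typing import Optional

# Abbreviated model sets; see the full sets in the classification section.
REASONING_MODELS = {"gpt-5", "gpt-5.1", "gpt-5.4", "gpt-5.4-mini"}
O_SERIES_MODELS = {"o1", "o3", "o4-mini"}
V1_MODELS = {"gpt-4.1", "gpt-4.1-mini"} | REASONING_MODELS


def build_request(model_name: str, messages: list, params: dict,
                  reasoning_effort: Optional[str] = None) -> dict:
    """Assemble chat-completion kwargs adapted to the target model."""
    reasoning = model_name in REASONING_MODELS or model_name in O_SERIES_MODELS
    kwargs = dict(params)  # copy so the caller's dict is left untouched

    # Adapt parameters: rename the token limit, drop sampling knobs.
    if "max_tokens" in kwargs and (model_name in V1_MODELS or reasoning):
        kwargs["max_completion_tokens"] = kwargs.pop("max_tokens")
    if reasoning:
        kwargs.pop("temperature", None)
        kwargs.pop("top_p", None)
        # Update role: "system" becomes "developer" for reasoning models.
        messages = [
            {**m, "role": "developer"} if m.get("role") == "system" else m
            for m in messages
        ]
        # Set reasoning_effort only where the model understands it.
        if reasoning_effort is not None:
            kwargs["reasoning_effort"] = reasoning_effort

    return {"model": model_name, "messages": messages, **kwargs}
```

The result can be passed as client.chat.completions.create(**build_request(...)), which keeps model-specific branching out of the calling code.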
Validate After Migration
After updating your code, verify output quality hasn't regressed:
from src.evaluate.core import MigrationEvaluator

evaluator = MigrationEvaluator(
    source_model="gpt-4o",
    target_model="gpt-5.1",
    test_cases="data/golden_rag.jsonl",  # 54 pre-built test cases in data/
    metrics=["coherence", "relevance", "groundedness"],
)
report = evaluator.run()
report.print_report()
See the aoai-migration-evaluation skill for full evaluation guidance, including custom evaluators and PII redaction for production data.
Must Not
- Hard-code model names deep in application code. Use config/environment variables.
- Use temperature or top_p with reasoning models (GPT-5.x, o-series); they are not supported.
- Assume GPT-4.1 family is still available for new deployments; it was deprecated April 14, 2026.
- Use max_tokens with v1 API models; use max_completion_tokens instead.
- Skip evaluation before deploying a new model in production.
- Assume reasoning_effort="none" works on GPT-5/GPT-5-mini; only GPT-5.1+ supports it.
- Use AzureOpenAI client with v1 models; use OpenAI client with base_url pointing to /openai/v1/.
- Use "system" role with reasoning models; use "developer" role instead.
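As an illustration of the first rule, the sketch below reads the deployment name and reasoning effort from the environment instead of hard-coding them. The variable names AZURE_OPENAI_DEPLOYMENT and AZURE_OPENAI_REASONING_EFFORT, the ModelSettings type, and the defaults are all hypothetical choices for this example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ModelSettings:
    deployment: str
    reasoning_effort: Optional[str]


def load_model_settings(env: Mapping[str, str] = os.environ) -> ModelSettings:
    """Read model choices from the environment so a migration is a config change."""
    return ModelSettings(
        # Hypothetical variable names; the default follows the tier-down recommendation.
        deployment=env.get("AZURE_OPENAI_DEPLOYMENT", "gpt-5.4-mini"),
        reasoning_effort=env.get("AZURE_OPENAI_REASONING_EFFORT"),
    )


settings = load_model_settings()
print(f"Using deployment: {settings.deployment}")
```

Passing env explicitly also makes the lookup easy to exercise in tests without touching the real process environment.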
Structured Outputs & Responses API
Structured Outputs
If your application uses response_format for JSON output, it works across model generations:
| Feature | GPT-4o | GPT-4.1 | GPT-5+ |
|---|---|---|---|
| { "type": "json_object" } | Supported | Supported | Supported |
| { "type": "json_schema", ... } | Supported (2024-08-06+) | Supported | Supported |
| Strict mode | Supported | Supported | Supported |
Test your JSON schemas against the new model — different models may interpret schema constraints differently.
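To make the json_schema row concrete, here is a minimal sketch of a strict response_format payload for Chat Completions. The make_response_format helper and the example schema are illustrative assumptions; the sketch only covers flat objects and marks every property as required with additionalProperties disabled, which is what strict mode expects.

```python
def make_response_format(name: str, properties: dict) -> dict:
    """Build a strict json_schema response_format for a flat object."""
    return {
        "type": "json_schema",
        "json_schema": {
            "name": name,
            "strict": True,
            "schema": {
                "type": "object",
                "properties": properties,
                # Strict mode expects every property to be listed as required
                # and additional properties to be disallowed.
                "required": list(properties),
                "additionalProperties": False,
            },
        },
    }


# Hypothetical schema for a support-ticket classifier.
ticket_format = make_response_format(
    "ticket_triage",
    {"category": {"type": "string"}, "urgent": {"type": "boolean"}},
)
```

The same payload can be sent to the old and new deployments, for example as response_format=ticket_format, to compare how each model fills the schema.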
Responses API
Azure OpenAI now supports the Responses API alongside Chat Completions. It offers built-in tool use, file search, and web search. Existing Chat Completions code continues to work. See the Responses API docs.
References
- Azure OpenAI Model Retirements — authoritative retirement dates
- Azure OpenAI Models Overview — model capabilities & availability
- GPT-5 vs GPT-4.1: Choosing the Right Model
- Responses API — new API surface
- Azure OpenAI SDKs — all supported languages
- What's New in Azure OpenAI — latest changes