name: model-resolver-local-mixed
description: Local mixed-tier model resolver (fast:7b, balanced:32b, deep:opus)
Local Mixed-Tier Model Resolver
Contract
Input Descriptor (via orchestrator context):
{
  "role": "string (architect, implementer, reviewer, test)",
  "step": "string (1a, 1b, 2, 2.1, 2.2, 3, 3.5, 5)",
  "complexity": "string (low, moderate, high, novel)",
  "brief_size_chars": "integer",
  "files_in_scope": "integer",
  "is_retry": "boolean",
  "is_escalation": "boolean",
  "is_cross_cutting": "boolean",
  "stack": "string"
}
Output:
{
  "model": "string (qwen2.5-coder:7b | qwen2.5-coder:32b | opus)",
  "tier_label": "string (fast, balanced, deep)",
  "tier_collapse": "boolean (false)",
  "rationale": "string"
}
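The contract above can be mirrored as typed records so that malformed descriptors fail early. The following is a minimal sketch, not the resolver's actual implementation: the class names `TaskDescriptor` and `Resolution` and the validation behavior are assumptions, while the field names and allowed values are taken from the contract.

```python
from dataclasses import dataclass

# Allowed values copied from the contract.
ROLES = {"architect", "implementer", "reviewer", "test"}
STEPS = {"1a", "1b", "2", "2.1", "2.2", "3", "3.5", "5"}
COMPLEXITIES = {"low", "moderate", "high", "novel"}
MODELS = {"qwen2.5-coder:7b", "qwen2.5-coder:32b", "opus"}
TIER_LABELS = {"fast", "balanced", "deep"}


@dataclass(frozen=True)
class TaskDescriptor:  # hypothetical name
    role: str
    step: str
    complexity: str
    brief_size_chars: int
    files_in_scope: int
    is_retry: bool
    is_escalation: bool
    is_cross_cutting: bool
    stack: str

    def __post_init__(self):
        # Reject values outside the enumerations listed in the contract.
        if self.role not in ROLES:
            raise ValueError(f"unknown role: {self.role!r}")
        if self.step not in STEPS:
            raise ValueError(f"unknown step: {self.step!r}")
        if self.complexity not in COMPLEXITIES:
            raise ValueError(f"unknown complexity: {self.complexity!r}")


@dataclass(frozen=True)
class Resolution:  # hypothetical name
    model: str
    tier_label: str
    rationale: str
    tier_collapse: bool = False  # always false for this resolver

    def __post_init__(self):
        if self.model not in MODELS:
            raise ValueError(f"unknown model: {self.model!r}")
        if self.tier_label not in TIER_LABELS:
            raise ValueError(f"unknown tier label: {self.tier_label!r}")
        if self.tier_collapse:
            raise ValueError("this resolver never collapses tiers")
```

Frozen dataclasses keep a descriptor immutable once the orchestrator has built it, which makes a resolution reproducible from its input.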
Decision Rules
Tasks are routed to distinct local and cloud models based on tier:
Fast (fast-tier):
- Assignee: implementer, test-architect
- Tasks: single-file, mechanical, fully-specified (<150 lines)
- Model: qwen2.5-coder:7b
Balanced (balanced-tier):
- Assignee: architect (1a, 1b), reviewer, bug-fix implementer
- Tasks: multi-file, design decisions, moderate reasoning
- Model: qwen2.5-coder:32b
Deep (deep-tier):
- Assignee: architect (1b novel), cross-cutting
- Tasks: novel algorithms, security-critical design
- Model: opus
- Note: deep/high-complexity work is expected to be routed to the cloud resolver by policy; this branch is the cloud-fallback safety net (resolves directly to the cloud escalation model).
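One way to read the rules above as a tier-selection function is sketched below. The function name `select_tier` and the exact thresholds are assumptions: the descriptor carries no line count or bug-fix flag, so "single-file, mechanical" is approximated as `files_in_scope <= 1` with `low` complexity, and both `high` and `novel` complexity are treated as deep based on the note about deep/high-complexity work.

```python
# Roles eligible for the fast tier. The contract lists "test" while the rules
# say "test-architect"; accepting both is an assumption of this sketch.
FAST_ROLES = {"implementer", "test", "test-architect"}

# Assumption: "high" and "novel" complexity both fall through to the deep branch.
DEEP_COMPLEXITIES = {"high", "novel"}


def select_tier(role: str, complexity: str, files_in_scope: int,
                is_cross_cutting: bool) -> str:
    """Return "fast", "balanced", or "deep" for a task descriptor."""
    # Deep: novel or high-complexity work and anything cross-cutting.
    if is_cross_cutting or complexity in DEEP_COMPLEXITIES:
        return "deep"
    # Fast: mechanical, single-file work for implementer or test roles.
    if role in FAST_ROLES and complexity == "low" and files_in_scope <= 1:
        return "fast"
    # Balanced: everything else (architects, reviewers, multi-file work).
    return "balanced"
```

Checking the deep conditions first means a cross-cutting task is never routed to the 7b model, even when its role and file count would otherwise qualify it as fast.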
When is_escalation or is_retry is true, fast tasks bump to qwen2.5-coder:32b, balanced tasks stay on qwen2.5-coder:32b, and deep tasks resolve directly to opus without emitting a sentinel value.
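The retry and escalation behavior reduces to two lookup tables. The sketch below assumes a hypothetical helper named `resolve_model`; the model names and the bump rules come from the text above.

```python
# Default model per tier.
BASE_MODEL = {
    "fast": "qwen2.5-coder:7b",
    "balanced": "qwen2.5-coder:32b",
    "deep": "opus",
}

# Model per tier when the task is a retry or an escalation:
# fast bumps to 32b, balanced stays on 32b, deep stays on opus.
BUMPED_MODEL = {
    "fast": "qwen2.5-coder:32b",
    "balanced": "qwen2.5-coder:32b",
    "deep": "opus",
}


def resolve_model(tier_label: str, is_retry: bool = False,
                  is_escalation: bool = False) -> str:
    """Map a tier label to a concrete model name (never a sentinel)."""
    table = BUMPED_MODEL if (is_retry or is_escalation) else BASE_MODEL
    return table[tier_label]
```

For example, `resolve_model("fast", is_retry=True)` returns `"qwen2.5-coder:32b"`, while `resolve_model("deep")` returns `"opus"` whether or not either flag is set.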
Tier Collapse
tier_collapse: false for every task. Fast tasks run on the 7b model for throughput; balanced tasks run on the 32b model for reasoning quality; deep tasks resolve directly to the cloud model (opus) without emitting a sentinel. This resolver trades operational simplicity for tier-appropriate cost and quality.
Pricing
This resolver advertises pricing as null; token-analysis MUST skip cost computation and report only call counts and escalation rates when this resolver is active.
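A consumer such as token-analysis could honor the null pricing as sketched below. Everything here is illustrative: the function name `summarize_usage`, the shape of the call records (`model`, `is_escalation`, `tokens`), and the per-token cost branch for other resolvers are assumptions, and the escalation rate is assumed to be the fraction of calls flagged as escalations.

```python
from collections import Counter
from typing import Optional


def summarize_usage(calls: list[dict], pricing: Optional[dict]) -> dict:
    """Report call counts and escalation rate; add cost only if pricing exists."""
    total = len(calls)
    escalated = sum(1 for call in calls if call.get("is_escalation", False))
    report = {
        "call_counts": dict(Counter(call["model"] for call in calls)),
        "escalation_rate": escalated / total if total else 0.0,
    }
    # Pricing is null for this resolver, so no "cost" key is produced.
    if pricing is not None:
        # Hypothetical branch for resolvers that do advertise per-token prices.
        report["cost"] = sum(
            pricing[call["model"]] * call.get("tokens", 0) for call in calls
        )
    return report
```

Leaving the `cost` key out entirely, instead of reporting zero, keeps a skipped computation distinguishable from a genuinely free run.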