name: project-cost-estimator description: Analyzes a software project's source code, git history, and architecture to estimate what it actually cost to build — and compares that against what it would cost using a modern AI agent fleet (Claude Code, OpenCode, Goose, Codex, Kimi Code) managed by a single engineer. Use this skill whenever the user wants to: audit a project's build cost, estimate software value, understand the ROI of AI-assisted development, calculate gross margins from AI vs human teams, evaluate outsourcing vs in-house costs, or benchmark a codebase's investment. Trigger on phrases like "how much did this cost to build", "estimate project cost", "AI vs human development cost", "gross margin from using agents", "what's this codebase worth", "value of the code", or "audit this project's effort".
Project Cost Estimator
Purpose
This skill estimates the real engineering cost of a software project and evaluates whether it was executed efficiently. It is designed for internal agency use — not for client-facing presentations.
The core question it answers: Did we execute this project well, financially?
Secondary questions:
- Was AI leverage maximized?
- Where was effort concentrated relative to complexity?
- What is our gross margin at market rate vs. our actual cost?
- What would it cost to rebuild this today, with a modern AI fleet?
The tone of all output is analytical and neutral — not consultative or judgmental.
Step 1: Gather Project Signals
1a. Repository metrics
# Project root
ls -la <project-root>
# Total LOC by language (cloc is preferred)
cloc <project-root> --exclude-dir=node_modules,dist,.next,coverage,generated
# TypeScript/JS files only
find <project-root> -name "*.ts" -o -name "*.tsx" | xargs wc -l 2>/dev/null | tail -1
# Git commit history
git -C <project-root> log --oneline | wc -l
git -C <project-root> log --format="%ad" --date=short | sort -u | head -1
git -C <project-root> log --format="%ad" --date=short | sort -u | tail -1
# Commit size distribution (lines changed per commit)
git -C <project-root> log --shortstat --oneline | grep "files changed" | \
awk '{sum+=$4+$6; count++} END {print "Total lines changed:", sum, "| Commits:", count, "| Avg:", sum/count}'
# Author contributions
git -C <project-root> shortlog -sn --no-merges | head -10
# High-churn files (modified most often)
git -C <project-root> log --name-only --format="" | sort | uniq -c | sort -rn | head -20
# Branch history
git -C <project-root> branch -a | wc -l
1b. Feature surface metrics
These are far stronger effort predictors than LOC for modern TypeScript stacks.
# API endpoints (NestJS controllers)
grep -r "@Get\|@Post\|@Put\|@Patch\|@Delete" <project-root>/src --include="*.ts" -l | wc -l
grep -r "@Get\|@Post\|@Put\|@Patch\|@Delete" <project-root>/src --include="*.ts" | wc -l
# Database models (Prisma)
grep -c "^model " <project-root>/prisma/schema.prisma 2>/dev/null
# Background jobs/queues
grep -r "@Processor\|@Queue\|BullModule\|@InjectQueue" <project-root>/src --include="*.ts" -l | wc -l
# External integrations (HTTP clients, SDKs)
grep -r "HttpModule\|axios\|fetch\|new.*Client\|SDK" <project-root>/src --include="*.ts" -l | sort -u | wc -l
# UI screens / pages (Refine/Next/React)
find <project-root> -name "*.tsx" -path "*/pages/*" -o -name "*.tsx" -path "*/views/*" 2>/dev/null | wc -l
# Role-based actions / guards
grep -r "@Roles\|@UseGuards\|RolesGuard\|Permissions" <project-root>/src --include="*.ts" | wc -l
# State machines / complex flows
grep -r "createMachine\|useMachine\|state.*machine\|FSM\|StateMachine" <project-root>/src --include="*.ts" -l | wc -l
# Tests
find <project-root> -name "*.spec.ts" -o -name "*.test.ts" | wc -l
1c. Module inventory
Manually identify and categorize all modules into three complexity tiers:
| Tier | Definition | Effort Range |
|---|---|---|
| Simple | CRUD module, single entity, no side effects | 20–40 hrs |
| Standard | Multi-entity, business logic, integrations | 40–80 hrs |
| Complex | State machines, external APIs, concurrency, auth flows | 80–160 hrs |
Use the feature surface metrics above to inform module classification. Do not use LOC to classify modules.
Step 2: Build the System Scope Snapshot
Before any estimation, produce a single clear table that summarizes project scope. This is the anchor for all subsequent analysis.
System Scope
────────────────────────────────
Applications (apps/packages): N
API modules: N
Database models: N
Background queues: N
External API integrations: N
User roles: N
Primary workflows: N
UI screens (estimated): N
Browser extension flows: N
────────────────────────────────
If a number cannot be determined with confidence, mark it ~N (approximate) or ? (unknown).
Step 3: Estimate Development Effort
3a. Module-based estimation (primary method)
For each identified module, assign a complexity tier and estimate hours:
Module Inventory
──────────────────────────────────────────────────────────
Module Name Tier Hours (range)
──────────────────────────────────────────────────────────
[module 1] Complex 80–120
[module 2] Standard 50–70
[module 3] Simple 25–35
...
──────────────────────────────────────────────────────────
Subtotal — Implementation XXX–XXX hrs
+ Infrastructure / DevOps (10–15%) XX–XX hrs
+ Architecture & design (5–10%) XX–XX hrs
+ Integration & QA (10–15%) XX–XX hrs
──────────────────────────────────────────────────────────
TOTAL ESTIMATED EFFORT XXX–XXX hrs
3b. Git-signal cross-check
Use git signals to validate or adjust the module estimate:
- Commit volume: N commits over X weeks = Y avg commits/week. High-frequency projects (>5/day) suggest intensive development. Low-frequency (<1/day) may indicate design-heavy or async work.
- Avg lines changed per commit: Very high (>500) suggests generated code, squash merges, or AI bursts. Very low (<20) suggests iterative polish. Neither directly indicates effort.
- High-churn files: Files modified in >10% of commits are complexity hotspots. If they align with a "simple" module classification, revise upward.
- Author split: If 80%+ of commits are from one author, solo velocity assumptions apply. Multi-author projects have coordination overhead (add 10–20%).
Important: Git signals are cross-checks, not primary drivers. A squash-merge repo will look artificially low-commit.
3c. Do NOT use COCOMO
Classic COCOMO is designed for large waterfall teams building from scratch with minimal framework leverage. It produces estimates 3–10x higher than reality for modern TypeScript stacks with NestJS, Prisma, Refine, and AI-assisted development. If COCOMO is referenced anywhere in prior reports, discard that number entirely.
Step 4: Estimate AI-Assisted Rebuild Cost
4a. AI code generation split
Estimate the portion of the codebase that was or could be AI-generated:
Code Authorship Split (estimated)
──────────────────────────────────────
AI-generated / assisted: ~X% of LOC
Human-written: ~X% of LOC
──────────────────────────────────────
Typical ranges for AI-assisted projects:
- Boilerplate (CRUD, DTOs, migrations): 70–90% AI-generatable
- Business logic, integrations: 30–50% AI-generatable
- Custom algorithms, state machines: 10–30% AI-generatable
4b. Token consumption model
Do NOT estimate AI cost by mapping "AI hours" to tokens. Instead:
Token Estimation
──────────────────────────────────────────────────────────
Estimated AI-generated LOC: X lines
Code generation tokens: X lines × 10 tokens/LOC = X tokens
Context + iteration (3–5×): X tokens
──────────────────────────────────────────────────────────
Estimated total tokens: ~X M tokens
Cost @ Claude Sonnet: ~$X (at $3/M input + $15/M output)
Cost @ GPT-4o: ~$X
──────────────────────────────────────────────────────────
4c. AI-assisted rebuild timeline
Break delivery into phases and show human vs AI-assisted timelines:
Delivery Timeline by Phase
──────────────────────────────────────────────────────────
Phase Human team AI-assisted (1 eng)
──────────────────────────────────────────────────────────
Discovery & design 2–3 weeks 2–3 weeks (unchanged)
Architecture 2–3 weeks 1–2 weeks
Implementation X months X months
Integration X weeks X weeks
Testing & QA X weeks X weeks
Iteration X weeks X weeks
──────────────────────────────────────────────────────────
TOTAL ~X months ~X months
──────────────────────────────────────────────────────────
Note: AI accelerates implementation, not discovery, product iteration, or stakeholder feedback cycles. A 3–4× speedup claim must be scoped to implementation phases only.
Step 5: Financial Assessment
This is the core output for the agency. The goal is to evaluate execution efficiency.
5a. Cost comparison table
Cost Analysis
──────────────────────────────────────────────────────────
Hours Rate Total
──────────────────────────────────────────────────────────
Actual cost (billed/spent) XXX hrs $XX/hr $XX,XXX
Market rate (human team) XXX hrs $XX/hr $XX,XXX
AI-assisted rebuild XXX hrs $XX/hr $XX,XXX
──────────────────────────────────────────────────────────
5b. Gross margin analysis
Gross Margin
──────────────────────────────────────────────────────────
Contract value (if known): $XX,XXX
Actual cost (labor + tools): $XX,XXX
Gross margin: XX%
──────────────────────────────────────────────────────────
If rebuilt with AI fleet today:
Estimated cost: $XX,XXX
Margin at same contract value: XX%
──────────────────────────────────────────────────────────
5c. AI leverage delta
AI Leverage Assessment
──────────────────────────────────────────────────────────
Estimated AI usage in delivery: Low / Medium / High
Potential AI usage for this scope: High
Unrealized leverage: Significant / Moderate / Minimal
──────────────────────────────────────────────────────────
Step 6: Risk & Complexity Drivers
List the top 4–8 complexity risks that drove cost. This explains why certain modules are expensive and adds credibility to the estimates.
Complexity Risk Table
──────────────────────────────────────────────────────────
Risk Factor Impact
──────────────────────────────────────────────────────────
[e.g. OEM portal automation] Brittle UI scraping, high maintenance
[e.g. Queue-based sync] Concurrency errors, idempotency
[e.g. Browser extension auth] Security surface, cross-context state
[e.g. Claim state machine] Business logic depth, edge cases
──────────────────────────────────────────────────────────
Step 7: Estimator Confidence Model
Always include this section to communicate reliability of each estimate.
Estimator Confidence
──────────────────────────────────────────────────────────
Signal Confidence Basis
──────────────────────────────────────────────────────────
LOC & file counts High Direct measurement
Feature surface metrics High Direct analysis
Module complexity tiers Medium Expert judgment
Git signal analysis Medium Repo history
Actual hours billed Medium Requires external data
AI usage intensity Low Inferred from patterns
Historical benchmarks Low Limited comparators
──────────────────────────────────────────────────────────
Step 8: Produce the Report
Use the following report structure. Maintain an objective, analytical tone throughout. Avoid value judgments ("you did well", "you left money on the table"). Use neutral framing: "Cost Efficiency Assessment", "Delivery Efficiency", "Engineering Quality Indicators".
Report Format
# Project Cost & Execution Audit
**Project**: [name]
**Date**: [date]
**Analyzed by**: BlackBox Vision — Internal
---
## System Scope
[Scope snapshot table from Step 2]
---
## Effort Estimation
[Module inventory and total from Step 3a]
### Git Signal Analysis
[Cross-check findings from Step 3b — commit volume, avg size, churn files, author split]
---
## AI-Assisted Rebuild
### Code Authorship Split
[From Step 4a]
### Token Consumption Model
[From Step 4b]
### Delivery Timeline
[Phase table from Step 4c]
---
## Financial Assessment
### Cost Comparison
[Table from Step 5a]
### Gross Margin Analysis
[From Step 5b]
### AI Leverage Assessment
[From Step 5c]
---
## Complexity & Risk Drivers
[Risk table from Step 6]
---
## Estimator Confidence
[Confidence table from Step 7]
---
## Key Findings
[3–5 bullet points — factual observations only, no prescriptions]
- Estimated effort: XXX–XXX hrs; actual delivery: XXX hrs (ΔX%)
- AI leverage: [low/medium/high relative to scope potential]
- Highest-complexity areas: [module names]
- Gross margin at contract price: XX%
- Estimated rebuild cost with AI fleet today: $XX,XXX
Important Constraints
- Never use COCOMO. It is not appropriate for NestJS/Prisma/Refine/Turborepo stacks.
- LOC is a secondary signal only. Use it for context, not as a primary effort driver. Modern TypeScript stacks (Prisma, NestJS decorators, generated DTOs) are inherently verbose.
- Approximate clearly. When a value is estimated rather than measured, prefix with
~and note the basis. - No prescriptions in the main report. The report describes what happened. If the user asks for recommendations, provide them separately.
- Module tiers beat formulas. An experienced engineer's judgment on module complexity is more reliable than any algorithmic estimate for projects of this size.
- This report is internal. Write for a technical co-founder reviewing their own work, not for a client. Be direct about gaps, overestimates, and unrealized leverage.