name: hm-l2-product-validation
description: >
  Product-lens methodology for validating technical decisions against user impact,
  product vision, and business value. Use when the user asks "is this feature valuable",
  "what's the user impact", "RICE score", "product validation", "validate against users",
  "how do I measure success", "feature prioritization", "product lens", "product context",
  "end user value", "does this solve the right problem", "problem vs solution",
  "stakeholder perspective", "should we build this", "define success metrics",
  "quantify user benefit", "anti-solution-check", or "prioritize features by impact".
  NOT for initial brainstorming (use hm-brainstorm), requirements gap analysis
  (use hm-requirements-analysis), dependency graph design (use hm-feature-ecosystem),
  or long-term maintainability scoring (use hm-roadmap-maintainability).
metadata:
  layer: "3"
  role: "domain-execution"
  pattern: P2
  version: "1.0.0"
allowed-tools:
  - Read
  - Write
  - Edit
  - Bash
  - Glob
  - Grep
Product Validation: Technical Decisions → User Impact
Overview
Every feature, architecture choice, and implementation decision must trace back to user value. This skill teaches the product-lens methodology: a structured way to step out of the implementation mindset and validate that what gets built actually delivers value to real users. It connects the product-management perspective to technical implementation across all development stages.
This skill answers five questions about every technical decision:
- Who benefits? Which user segment experiences the improvement, and how large is that segment?
- How much? What is the measurable difference in their experience — speed, accuracy, satisfaction?
- Is this the right problem? Are we solving a real user pain, or implementing a solution looking for a problem?
- How do we know? What falsifiable success metric will confirm delivery after deployment?
- What's the cost of being wrong? What user harm occurs if this feature misses the mark?
Third-Party Source Synthesis
This skill synthesizes patterns from three inspected third-party sources:
| Source | Adopt/Adapt Decision | Local Transformation |
|---|---|---|
| skillmd.ai/product-requirements-2 | Adopt the iterative dialogue model and 100-point quality scoring; adapt away from PRD-generation framing. | Use iterative Q&A as the product-lens probe during design and implementation. Score decisions against user-impact criteria rather than generating documents. |
| skillmd.ai/requirements-clarity | Adopt the 4-dimension diagnostic model (Functional, User Interaction, Technical, Business) and scoring rubric; adapt away from the requirements-gap-analysis framing. | Apply the 4 dimensions to product decisions, looking for value gaps rather than document gaps. |
| skillmd.ai/product-management-3 | Adopt the RICE prioritization (Reach, Impact, Confidence, Effort) and launch-plan structure; adapt away from the full-PM-workflow framing. | Use RICE as the scoring lens for technical decisions. Adapt launch-plan as validation-checklist, not project-management artifact. |
Boundary Rules
| Nearby Skill | Boundary |
|---|---|
| hm-brainstorm | SURFACES user needs and produces requirements briefs. This skill VALIDATES that what's built delivers those needs. Input: requirements brief. Output: user-impact score. |
| hm-requirements-analysis | DIAGNOSES requirement gaps, contradictions, and vagueness. This skill EVALUATES product impact of each requirement. Input: validated requirements. Output: value-prioritized requirements. |
| hm-feature-ecosystem | DESIGNS feature interdependence and cross-dependency graphs. This skill PRIORITIZES features by user value and feeds that prioritization IN. Input: RICE scores. Output: dependency-informed priorities. |
| hm-cross-cutting-change | ASSESSES implementation impact of changes. This skill PROVIDES product-priority context so cross-cutting decisions optimize for user value. |
| hm-roadmap-maintainability | SCORES long-term health and technical debt. This skill ENSURES the roadmap serves users by validating maintainability trade-offs against user impact. |
| hm-spec-driven-authoring | LOCKS specifications into falsifiable requirements. This skill PROVIDES success-metric definitions so specs have measurable user validation criteria. |
At-a-glance distinction: Brainstorm finds problems. Requirements-analysis diagnoses gaps. Product-validation tests value. Feature-ecosystem maps dependencies. Roadmap-maintainability scores health. This skill's unique role: "Before we build this, will users actually benefit?"
Entry Gate
Proceed when any of these are true:
- User asks "should we build this", "is this valuable", "what's the ROI"
- A feature design is ready for product-lens review before implementation begins
- Prioritization is needed across multiple features or requirements
- Success metrics need to be defined before development starts
- User describes a solution ("build a dashboard") without articulating the problem
- A coordinator skill (hm-coordinating-loop) routes to this skill during the planning phase
Do NOT proceed when:
- No requirements or feature ideas exist yet → route to hm-brainstorm
- Requirements exist but need gap diagnosis → route to hm-requirements-analysis
- Features are already prioritized and need dependency mapping → route to hm-feature-ecosystem
- The question is purely about long-term code health → route to hm-roadmap-maintainability
- The user explicitly says "just build it, I know what my users want" → accept and move to implementation with a note
Main Workflow: 4-Phase Product Lens
Follow phases in order. Complete each before proceeding to the next.
Phase 1: Problem Articulation
Objective: Establish WHAT user pain exists and WHO experiences it — before discussing any solution.
1.1 Extract the Problem From the Solution Description
When users describe solutions ("build a dashboard with 6 widgets"), apply the anti-solution-check from references/problem-vs-solution.md:
- Rephrase the requested feature as: "The user described building [X]. What problem does [X] solve?"
- Ask: "Who has this problem? How many of them? How severe is it today?"
- Ask: "What do users do today to work around this problem? What's the cost of that workaround?"
- Classify: Validated problem (evidence of real pain) vs. Unvalidated problem (assumption without evidence)
If the problem is unvalidated, recommend user research before proceeding. Do not continue to scoring.
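The classification step above can be sketched as a small record plus a rule. This is a minimal illustration, not part of the referenced procedure: the `ProblemStatement` fields and the rule "validated means problem, segment, and evidence are all present" are assumptions chosen to mirror the questions in this section.

```python
from dataclasses import dataclass, field


@dataclass
class ProblemStatement:
    """Answers collected by the anti-solution-check questions (hypothetical shape)."""
    requested_solution: str   # what the user asked to build
    problem: str = ""         # the pain the solution is meant to remove
    segment: str = ""         # who has the problem
    workaround: str = ""      # what users do today instead
    evidence: list[str] = field(default_factory=list)  # e.g. tickets, analytics, interviews


def classify(statement: ProblemStatement) -> str:
    """Assumed rule: 'validated' only when problem, segment, and evidence all exist."""
    if statement.problem and statement.segment and statement.evidence:
        return "validated"
    return "unvalidated"


def next_action(statement: ProblemStatement) -> str:
    """Unvalidated problems stop here; scoring must not start."""
    if classify(statement) == "validated":
        return "continue to segment identification"
    return "recommend user research; do not score"


dashboard = ProblemStatement(requested_solution="dashboard with 6 widgets")
print(classify(dashboard), "->", next_action(dashboard))
```

A solution description with no problem, segment, or evidence attached is classified as unvalidated, which is the case the anti-solution-check is meant to catch.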
1.2 Identify Target User Segments
For each feature or decision under review:
| Dimension | Questions to Answer |
|---|---|
| Segment | Which user group experiences this? (All users, new users, power users, admins?) |
| Size | How many users in this segment? (Absolute count or % of user base) |
| Context | What are they doing when this problem arises? |
| Today's pain | What's their current workaround? Time cost? Error rate? Frustration level? |
Record results. Each feature must have at least one identified user segment.
1.3 Pause for Confirmation
Restate the problem in user terms:
**User Problem Summary:**
- **Segment:** [who]
- **Problem:** [what pain they experience]
- **Scope:** [how many users, how often]
- **Current cost:** [workaround effort, error rate, abandonment rate]
Before we evaluate solutions — does this accurately describe the real user situation?
Wait for confirmation. If the user cannot confirm the problem description, stop and recommend user research.
Phase 2: Impact Quantification
Objective: Score the feature's potential user impact using RICE methodology (Reach × Impact × Confidence / Effort). Reference: references/user-impact-scoring.md.
2.1 Score Each Dimension
For each feature or decision:
| RICE Dimension | Definition | Scale |
|---|---|---|
| Reach | How many users experience the benefit? | 1 (fraction) – 10 (all users) |
| Impact | How much does the user experience improve? | 0.25 (minimal) – 3.0 (massive) |
| Confidence | How sure are we about the Reach and Impact estimates? | 0.2 (guess) – 1.0 (measured data) |
| Effort | How much work to deliver? Higher effort lowers the score. | Person-weeks (1–12+); the RICE formula divides by this value |
Scoring ground rules:
- Reach: Use actual user counts when available, mapped onto the 1–10 scale. 10 if every user benefits, 1 if only edge cases.
- Impact: 3 = transforms daily workflow. 2 = significant improvement. 1 = noticeable. 0.5 = minor. 0.25 = barely perceptible.
- Confidence: 1.0 = backed by analytics data. 0.8 = strong qualitative evidence. 0.5 = informed estimate. 0.2 = pure guess.
- Effort: Estimate in person-weeks. The RICE formula divides by this raw estimate. Where a standalone effort score is needed, convert it as max(0.25, 10 / effort_weeks): a 4-week feature scores 2.5; a 12-week feature scores 0.83.
Compute RICE score: RICE = (Reach × Impact × Confidence) / Effort_weeks
Alternatively, use ICE (Impact × Confidence × Ease) for rapid triage when precise effort estimates are unavailable.
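The arithmetic in this section can be checked with a short sketch. The function names `rice`, `effort_score`, and `ice` are hypothetical; the formulas are the ones stated above (RICE divides by person-weeks, the standalone effort score is `max(0.25, 10 / effort_weeks)`, and ICE is a plain product).

```python
def rice(reach: float, impact: float, confidence: float, effort_weeks: float) -> float:
    """RICE = (Reach x Impact x Confidence) / effort in person-weeks."""
    if effort_weeks <= 0:
        raise ValueError("effort_weeks must be positive")
    return reach * impact * confidence / effort_weeks


def effort_score(effort_weeks: float) -> float:
    """Standalone effort score: higher effort gives a lower score, floored at 0.25."""
    if effort_weeks <= 0:
        raise ValueError("effort_weeks must be positive")
    return max(0.25, 10 / effort_weeks)


def ice(impact: float, confidence: float, ease: float) -> float:
    """ICE = Impact x Confidence x Ease, for rapid triage."""
    return impact * confidence * ease


print(round(rice(8, 2.0, 0.8, 3), 1))   # 4.3
print(effort_score(4))                  # 2.5
print(round(effort_score(12), 2))       # 0.83
```

The floor of 0.25 means that any estimate above 40 person-weeks gets the same effort score, so very large features are not distinguished from each other by this conversion.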
2.2 Create a Value-Effort Matrix
Plot features on a 2×2 grid:
| | High effort | Low effort |
|---|---|---|
| High value | INVEST (high value, high effort) | SHIP NOW (high value, low effort) |
| Low value | KILL (low value, high effort) | LOW-HANGING (low value, low effort) |
For each feature, classify its quadrant and recommend: Ship now, Invest (worth the effort), Low-hanging (do if spare capacity), Kill (don't build).
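A quadrant assignment needs explicit cutoffs, which this skill does not specify. The sketch below assumes "value" means Reach × Impact × Confidence and uses hypothetical cutoffs (`value_cutoff=5.0`, `effort_cutoff=4.0` person-weeks); both should be tuned per team.

```python
def classify_quadrant(value: float, effort_weeks: float,
                      value_cutoff: float = 5.0,
                      effort_cutoff: float = 4.0) -> str:
    """Map a feature onto the 2x2 grid. Cutoffs are illustrative assumptions."""
    high_value = value >= value_cutoff
    low_effort = effort_weeks <= effort_cutoff
    if high_value and low_effort:
        return "Ship now"
    if high_value:
        return "Invest"
    if low_effort:
        return "Low-hanging"
    return "Kill"


# value = Reach x Impact x Confidence (assumed definition)
print(classify_quadrant(8 * 2.0 * 0.8, 3))    # Ship now
print(classify_quadrant(6 * 1.5 * 0.6, 5))    # Invest
print(classify_quadrant(3 * 0.5 * 0.4, 10))   # Kill
```

Keeping the cutoffs as parameters makes the classification auditable: a feature that moves quadrants when a cutoff shifts slightly is a borderline case worth discussing rather than a clear recommendation.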
2.3 Prioritize by RICE Score
Rank features by RICE score. Present a sorted table:
| Rank | Feature | Reach | Impact | Confidence | Effort (wks) | RICE | Quadrant |
|------|---------|-------|--------|------------|--------------|------|----------|
| 1 | [Name] | 8 | 2.0 | 0.8 | 3 | 4.3 | Ship Now |
| 2 | [Name] | 6 | 1.5 | 0.6 | 5 | 1.1 | Invest |
| 3 | [Name] | 3 | 0.5 | 0.4 | 10 | 0.06 | Kill |
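The ranking can be reproduced with a few lines. This sketch uses the three example rows from the table; the feature names are placeholders, and the dictionary layout is an assumption made for illustration.

```python
features = [
    {"name": "Feature C", "reach": 3, "impact": 0.5, "confidence": 0.4, "effort_weeks": 10},
    {"name": "Feature A", "reach": 8, "impact": 2.0, "confidence": 0.8, "effort_weeks": 3},
    {"name": "Feature B", "reach": 6, "impact": 1.5, "confidence": 0.6, "effort_weeks": 5},
]


def rice(f: dict) -> float:
    """RICE = (Reach x Impact x Confidence) / effort in person-weeks."""
    return f["reach"] * f["impact"] * f["confidence"] / f["effort_weeks"]


# Highest RICE first.
ranked = sorted(features, key=rice, reverse=True)

for rank, f in enumerate(ranked, start=1):
    print(f"{rank} | {f['name']} | {rice(f):.2f}")
```

Printed to two decimals, the scores are 4.27, 1.08, and 0.06, which the table shows rounded as 4.3, 1.1, and 0.06.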
Phase 3: Success Metric Definition
Objective: Define falsifiable success metrics BEFORE implementation begins. Reference: references/success-metrics.md.
3.1 Define Metrics for Each Feature
For every feature with a RICE score ≥ threshold (default: 1.0), define:
| Metric Component | Example |
|---|---|
| What to measure | Time to complete checkout |
| Baseline (current) | 4.2 minutes (from analytics, last 30 days) |
| Target (post-launch) | <2.0 minutes |
| Measurement window | 30 days post-launch |
| Minimum detectable effect | 30% improvement (smaller would be noise) |
| How to measure | Analytics event checkout_complete — duration from cart page load to confirmation |
| Counter-metric | Cart abandonment rate (must not increase) |
3.2 Validate Metrics Are Falsifiable
Run this test on every metric:
- Can the metric be measured with existing instrumentation?
- Is the baseline known? If not, flag: Metric requires pre-launch instrumentation.
- Is the target specific and numeric? "Better" or "faster" fail this check.
- Is there a counter-metric to detect unintended harm?
- Can the metric be measured within the defined window?
- Is the minimum detectable effect realistic given sample size?
If any metric fails this falsifiability test, flag it and define a replacement before proceeding to Phase 4.
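The six questions above can be run as a checklist over a metric record. This is a sketch under assumptions: the `SuccessMetric` fields are hypothetical, and the two judgment calls (measurable within the window, realistic minimum detectable effect) are recorded as reviewer-supplied booleans rather than computed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SuccessMetric:
    name: str
    baseline: Optional[float]        # None when the current value is unknown
    target: Optional[float]          # None when the target is not numeric
    window_days: int
    counter_metric: Optional[str]
    instrumented: bool               # measurable with existing instrumentation?
    measurable_in_window: bool = True
    mde_realistic: bool = True       # reviewer judgment given sample size


def falsifiability_failures(m: SuccessMetric) -> list[str]:
    """Return one message per failed check; an empty list means the metric passes."""
    failures = []
    if not m.instrumented:
        failures.append("cannot be measured with existing instrumentation")
    if m.baseline is None:
        failures.append("Metric requires pre-launch instrumentation.")
    if m.target is None:
        failures.append("target is not specific and numeric")
    if not m.counter_metric:
        failures.append("no counter-metric defined")
    if m.window_days <= 0 or not m.measurable_in_window:
        failures.append("cannot be measured within the defined window")
    if not m.mde_realistic:
        failures.append("minimum detectable effect is unrealistic for the sample size")
    return failures


checkout = SuccessMetric("Time to complete checkout (min)", 4.2, 2.0, 30,
                         "Cart abandonment rate", instrumented=True)
vague = SuccessMetric("Users are happier", None, None, 30, None, instrumented=False)
print(falsifiability_failures(checkout))
print(len(falsifiability_failures(vague)), "failures")
```

The checkout example from the table passes all checks, while a metric such as "users are happier" fails on instrumentation, baseline, target, and counter-metric.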
3.3 Define Product Success Gates
Create a validation gate for post-launch:
## Product Validation Gate — [Feature Name]
- **Launch criteria:** All metrics instrumented and baselines captured.
- **1-week check:** Early signal detection. Pause and investigate if counter-metric degrades.
- **30-day check:** Full measurement window complete.
  - If target met → SUCCESS. Feature is validated.
  - If target not met but trending positively → EXTEND. Monitor for 15 more days.
  - If flat or negative → ROLLBACK or ITERATE. Feature did not deliver value.
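The 30-day check reduces to a three-way decision. The sketch below is one possible reading, with a stated assumption: "trending positively" is approximated as "better than baseline but short of target", because the gate does not define the trend more precisely. The function name and the `lower_is_better` flag are hypothetical.

```python
def thirty_day_outcome(baseline: float, target: float, observed: float,
                       lower_is_better: bool = True) -> str:
    """Classify the 30-day result as SUCCESS, EXTEND, or ROLLBACK or ITERATE."""
    if lower_is_better:
        target_met = observed < target
        improved = observed < baseline
    else:
        target_met = observed > target
        improved = observed > baseline

    if target_met:
        return "SUCCESS"
    if improved:
        # Assumption: improvement over baseline stands in for "trending positively".
        return "EXTEND"
    return "ROLLBACK or ITERATE"


# Checkout time in minutes: baseline 4.2, target below 2.0.
print(thirty_day_outcome(4.2, 2.0, 1.9))   # SUCCESS
print(thirty_day_outcome(4.2, 2.0, 3.0))   # EXTEND
print(thirty_day_outcome(4.2, 2.0, 4.2))   # ROLLBACK or ITERATE
```

A real gate would also consult the counter-metric before declaring success; that check is left out here to keep the three outcomes visible.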
Phase 4: Stakeholder Communication
Objective: Translate technical decisions into product/business language for stakeholders. Reference: references/stakeholder-communication.md.
4.1 Frame Technical Trade-Offs as Product Decisions
Every technical trade-off has a product consequence. Translate:
| Technical Trade-Off | Product Framing |
|---|---|
| "We chose a queue-based async architecture" | "Users will see confirmation within 5 seconds instead of blocking for 30 seconds." |
| "We deferred edge case handling for X" | "0.5% of users will see a fallback message instead of an error. We chose speed-to-market over 100% coverage." |
| "We used a simpler data model" | "Query performance improved by 40% but the admin dashboard only shows last 30 days." |
| "We chose framework A over B" | "Initial load is 200ms faster. Trade-off: fewer community plugins available for future." |
4.2 Produce a Decision Brief
For major architectural decisions, produce a structured brief:
- Decision: What was chosen (1 sentence)
- User impact: What users experience differently (2-3 sentences)
- Alternatives considered: What else was evaluated and why rejected (bullet list)
- Risk to users: What could go wrong and how it's mitigated (2-3 sentences)
- Success measurement: How we'll know it was right (1 sentence + metric)
4.3 Cross-Stage Application Check
Apply the product lens at each development stage:
| Stage | Product-Lens Question |
|---|---|
| Design | "Does this design solve the validated user problem? What's the simplest version?" |
| Implementation | "Is this implementation complexity justified by user impact? Are there shortcuts that don't hurt users?" |
| Code review | "Does this change maintain or improve the user experience? Any regression risk?" |
| Deployment | "What's the rollback plan if the success metric degrades? Who monitors the counter-metric?" |
| Post-launch | "Did we hit the metric? If not, is the problem the implementation or the problem statement?" |
Decision Tree
- Is there a validated user problem statement?
  - YES → Are there multiple features/requirements to prioritize?
    - YES → Apply RICE scoring (Phase 2) → Rank → Proceed
    - NO → Apply anti-solution-check (Phase 1) → Define success metrics (Phase 3) → Proceed
  - NO → Can we articulate the user problem in one sentence?
    - YES → Write it down → Validate with user → Proceed
    - NO → Route to hm-brainstorm for user needs discovery
- Is a success metric defined for this decision?
  - YES → Is it falsifiable (numeric target, known baseline)?
    - YES → Proceed to stakeholder communication (Phase 4)
    - NO → Define falsifiable metric (Phase 3) → Re-check
  - NO → Define falsifiable metric (Phase 3) → Re-check
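The two trees above translate directly into a pair of functions. This is a sketch: the function names and the boolean inputs are hypothetical, and the returned strings simply restate the branches.

```python
def problem_branch(problem_validated: bool, multiple_features: bool,
                   one_sentence_problem: bool) -> str:
    """First tree: decide what to do about the problem statement."""
    if problem_validated:
        if multiple_features:
            return "Apply RICE scoring (Phase 2), rank, proceed"
        return "Apply anti-solution-check (Phase 1), define success metrics (Phase 3), proceed"
    if one_sentence_problem:
        return "Write it down, validate with user, proceed"
    return "Route to hm-brainstorm"


def metric_branch(metric_defined: bool, falsifiable: bool) -> str:
    """Second tree: decide whether the success metric is ready."""
    if metric_defined and falsifiable:
        return "Proceed to stakeholder communication (Phase 4)"
    return "Define falsifiable metric (Phase 3), then re-check"


print(problem_branch(True, True, True))
print(problem_branch(False, False, False))
print(metric_branch(True, False))
```

Both "metric not defined" and "metric defined but not falsifiable" lead to the same Phase 3 step, so the second function collapses them into one branch.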
Routing Table
| Situation | Skill to Load |
|---|---|
| No user problem articulated yet | hm-brainstorm |
| Requirements exist but need gap diagnosis | hm-requirements-analysis |
| Features are prioritized, need dependency mapping | hm-feature-ecosystem |
| Long-term code health vs. feature velocity trade-off | hm-roadmap-maintainability |
| Feature is validated, ready for spec-locking | hm-spec-driven-authoring |
| Cross-cutting implementation impact assessment | hm-cross-cutting-change |
| New technology, no benchmarks for effort estimation | hm-deep-research |
Gate System
| Gate | When | Criteria | Exit |
|---|---|---|---|
| G1: Problem | Before any scoring | Validated user problem statement exists with identified segment | Block: route to hm-brainstorm |
| G2: RICE | After impact quantification | Every feature has Reach, Impact, Confidence, Effort scored | Fix missing scores |
| G3: Metrics | Before any implementation | Every feature ≥ threshold has falsifiable success metric + counter-metric | Block implementation until defined |
| G4: Stakeholder | Before major decisions ship | Decision brief produced for architectural choices affecting user experience | Fix missing briefs |
| G5: Launch | Before deployment | Success gate defined with 1-week and 30-day check criteria | Block deployment until gate defined |
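The gates are ordered, so a simple walk finds the first one that blocks. In this sketch the `GATES` list copies the gate names and exit actions from the table, while the `passed` dictionary and the function name are hypothetical; a gate missing from the dictionary is treated as not passed.

```python
from typing import Optional

# Gate name and exit action, in evaluation order (taken from the table above).
GATES = [
    ("G1: Problem", "Block: route to hm-brainstorm"),
    ("G2: RICE", "Fix missing scores"),
    ("G3: Metrics", "Block implementation until defined"),
    ("G4: Stakeholder", "Fix missing briefs"),
    ("G5: Launch", "Block deployment until gate defined"),
]


def first_failed_gate(passed: dict[str, bool]) -> Optional[tuple[str, str]]:
    """Return (gate, exit action) for the first gate not passed, or None if all pass."""
    for gate, exit_action in GATES:
        if not passed.get(gate, False):
            return gate, exit_action
    return None


status = {"G1: Problem": True, "G2: RICE": True, "G3: Metrics": False}
print(first_failed_gate(status))
```

Treating unknown gates as failed is a deliberately conservative default: work cannot proceed past a gate nobody has evaluated.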
Validation Checklist
Before routing to the next skill or proceeding to implementation:
- Every feature has a validated user problem statement (Phase 1)
- RICE scores are computed and features are ranked (Phase 2)
- Every feature ≥ threshold 1.0 has falsifiable success metrics (Phase 3)
- Counter-metrics are defined to detect unintended harm (Phase 3)
- "Kill" quadrant features are explicitly recommended for removal
- Major architectural decisions have a stakeholder decision brief (Phase 4)
- Launch validation gate is defined with time-boxed checkpoints (Phase 3.3)
Anti-Patterns
| Anti-Pattern | Detection | Correction |
|---|---|---|
| Solution-first thinking: proposing "build X" without articulating the problem | Feature description starts with implementation detail, not user pain | Run anti-solution-check. Ask: "What problem does X solve? Who has it?" |
| Missing user segment: scoring impact without identifying who benefits | RICE score computed but no segment defined | Go back to Phase 1.2. "Who are these users? How many?" |
| Vague success metrics: "users will be happier" or "better performance" | Metric cannot be measured or has no numeric target | Replace with falsifiable metric. "Better" → "P95 latency <200ms." "Happier" → "NPS increases by ≥5 points." |
| Confidence inflation: scoring Confidence at 0.8 or higher without evidence | Confidence ≥ 0.8 but no analytics data or user research cited | Drop confidence to 0.5 (informed estimate) or 0.2 (guess). Require evidence for ≥ 0.8. |
| Ignoring counter-metrics: measuring only positive outcomes, not harm | Success metric defined but no counter-metric | Add counter-metric. Every change can cause regression. What would indicate the feature made things worse? |
| Skipping product lens post-design: applying validation only at ideation, skipping implementation review | Product validation discussed once then never revisited | Apply cross-stage checks (Phase 4.3): design, implementation, review, deployment, post-launch. |
| Effort-as-afterthought: RICE computed but effort guessed on the spot without engineering input | Effort score used without team validation | Flag: "Effort estimate needs engineering validation. Current score assumes [X weeks]. Confirm with team." |
| Stakeholder-bypass: making product-impact decisions without translating to business language | Technical decision documented with implementation-only justification | Produce stakeholder decision brief (Phase 4.2). Technical trade-offs have product consequences. Communicate them. |
Self-Correction
When anti-solution-check fails (building the wrong feature well)
Detection: RICE score is high, all metrics defined, stakeholder brief produced, but the problem statement was never validated. Phase 1 was skipped or rubber-stamped. No user segment identified with real evidence.
Recovery: Halt scoring and metric work. Return to Phase 1.1. Ask: "Who actually reported this problem? When did they report it? What were they trying to do?" If no evidence exists, recommend user research before proceeding. Do not score features for problems nobody has confirmed.
When RICE scores are miscalibrated
Detection: All features score within 10% of each other. Confidence is uniformly 0.8 without cited data. Effort estimates are round numbers (2, 4, 8 weeks) without a breakdown. Impact scores cluster at 1.5–2.0 with nothing to justify the differences.
Recovery: Apply the scoring ground rules explicitly: (1) Force-rank features instead of scoring them in absolute terms; relative ordering exposes inflation. (2) Demand evidence for any Confidence ≥ 0.8: cite analytics, user research, or team data. (3) Break effort estimates into sub-tasks, so that "4 weeks" (20 working days) becomes "API design (3d) + implementation (10d) + testing (5d) + review (2d)." If estimates still cluster, the scoring is not discriminating; redesign the evaluation criteria.
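The four detection signals can be checked mechanically before a ranking is presented. This sketch makes several assumptions: "within 10% of each other" is read as (max − min) ≤ 10% of max, "round numbers" is read as the literal set {2, 4, 8}, and features are plain `(reach, impact, confidence, effort_weeks)` tuples. The function name is hypothetical.

```python
ROUND_EFFORTS = {2, 4, 8}  # assumed reading of "round numbers (2, 4, 8 weeks)"


def calibration_warnings(features: list[tuple[float, float, float, float]]) -> list[str]:
    """features: (reach, impact, confidence, effort_weeks). Returns miscalibration signals."""
    warnings = []
    if len(features) < 2:
        return warnings  # clustering is meaningless for a single feature

    scores = [r * i * c / e for r, i, c, e in features]
    if max(scores) - min(scores) <= 0.10 * max(scores):
        warnings.append("all RICE scores within 10% of each other")
    if all(c == 0.8 for _, _, c, _ in features):
        warnings.append("confidence uniformly 0.8")
    if all(e in ROUND_EFFORTS for _, _, _, e in features):
        warnings.append("effort estimates are all round numbers")
    if all(1.5 <= i <= 2.0 for _, i, _, _ in features):
        warnings.append("impact scores cluster at 1.5-2.0")
    return warnings


clustered = [(8, 1.5, 0.8, 4), (6, 2.0, 0.8, 4)]
healthy = [(8, 2.0, 0.8, 3), (6, 1.5, 0.6, 5), (3, 0.5, 0.4, 10)]
print(calibration_warnings(clustered))
print(calibration_warnings(healthy))
```

A warning is a prompt to ask for evidence, not proof of inflation: two features can legitimately share a confidence of 0.8 when both cite strong qualitative evidence.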
When stakeholder needs were assumed without validation
Detection: Decision brief describes "users" or "stakeholders" without naming specific segments. Product framing uses generic language ("improves the experience") instead of segment-specific impact. No stakeholder was consulted before the brief was produced.
Recovery: Before producing any Phase 4 output, ask: "Have we confirmed these assumptions with [specific stakeholder role]?" If no, flag the decision brief as DRAFT, pending stakeholder validation. Do not route to implementation. Add a mandatory checkpoint: the brief must be confirmed by at least one stakeholder representative before any code is written.
When metrics measure what's easy, not what matters
Detection: Success metrics are all instrumentation-convenient (page views, click rates, API latency) while the actual user problem is qualitative (trust, comprehension, workflow friction). Counter-metrics are absent. The minimum detectable effect is set to match existing variance rather than user-meaningful improvement.
Recovery: For each metric, ask: "If this metric moved by the target amount, would a user notice and care?" If the answer is "not sure" or "not really," replace the metric. Prioritize outcome metrics (task completion, error recovery, time-to-value) over output metrics (page views, button clicks). Always pair with a counter-metric that detects harm. If no meaningful metric exists, the feature may not be worth building; flag it for reconsideration.
Files
| Resource | Purpose |
|---|---|
| references/user-impact-scoring.md | Full RICE and ICE methodology with scoring scales, examples, and anti-inflation checks. |
| references/success-metrics.md | Falsifiable metric definition, counter-metric patterns, baseline measurement, measurement-window selection. |
| references/problem-vs-solution.md | Anti-solution-check procedure, Five Whys adaptation for product decisions, user-problem articulation templates. |
| references/stakeholder-communication.md | Technical-to-product translation tables, decision-brief template, cross-stage application checklist, stakeholder-decision frameworks. |