name: domain-problem-interviewer-researcher
description: Interview stakeholders and research domain context before data science work begins. Use when a data science request starts from a business need, vague query, opportunity, operational pain point, domain problem, product question, policy question, or research question and Codex needs to ask structured discovery questions, identify the business/domain/end goal, perform web research from authoritative sources when internet access is available and not forbidden, and produce a domain briefing plus next-stage data science plan.
Domain Problem Interviewer Researcher
Use this before EDA or modeling when the real-world problem is not fully understood. Act like a senior data scientist interviewing a stakeholder: direct, specific, and evidence-seeking.
Interview Workflow
- Ask only the highest-value questions needed to understand the work. Prefer 6-10 questions in one pass; ask follow-ups only for blockers.
- Extract the business problem, domain, stakeholder, decision, and end goal before discussing algorithms.
- Separate what the user knows from assumptions, guesses, and unknowns.
- Identify whether the analysis is descriptive, exploratory, inferential, predictive, causal, optimization, monitoring, or decision-support.
- If the user provides enough context, proceed to domain research and the briefing. If not, produce a provisional brief with explicit assumptions and the remaining questions.
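The last workflow rule (proceed when context is sufficient, otherwise produce a provisional brief) can be sketched as a small readiness check. This is a minimal illustration, not part of the skill: the `InterviewState` class, the key names in `REQUIRED_CONTEXT`, and the step labels are hypothetical, chosen to mirror the five items the workflow asks to extract before discussing algorithms.

```python
from dataclasses import dataclass, field

# Hypothetical keys mirroring the five items to extract before algorithms
# are discussed. The skill itself does not define a schema.
REQUIRED_CONTEXT = ("business_problem", "domain", "stakeholder", "decision", "end_goal")


@dataclass
class InterviewState:
    # What the user actually stated, kept apart from guesses.
    answers: dict[str, str] = field(default_factory=dict)
    # Assumptions and guesses, recorded separately so they stay explicit.
    assumptions: dict[str, str] = field(default_factory=dict)

    def missing(self) -> list[str]:
        """Return the required items that have no non-empty user answer."""
        return [k for k in REQUIRED_CONTEXT if not self.answers.get(k, "").strip()]

    def next_step(self) -> str:
        """Pick the next step: full research and brief, or a provisional brief."""
        return "research_and_brief" if not self.missing() else "provisional_brief"
```

For example, a state holding only `business_problem` and `domain` reports `stakeholder`, `decision`, and `end_goal` as missing, so the provisional brief would list those as remaining questions.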
Core Interview Questions
Ask questions like these, adapting to the domain:
- What business or domain problem are you trying to solve?
- Who is the decision-maker or user of the result?
- What action will change if the analysis succeeds?
- What is the end goal: reduce cost, increase revenue, reduce risk, improve safety, automate a workflow, detect anomalies, forecast demand, allocate resources, or something else?
- What domain does this belong to, and what local terminology or constraints matter?
- What data exists today, who owns it, and how reliable is it?
- What is the unit of analysis: customer, transaction, patient, asset, well, part, event, location, time window, or other entity?
- What is the target outcome, label, KPI, or decision variable?
- What are the operational constraints: latency, explainability, compliance, privacy, fairness, safety, cost, deployment environment, or human review?
- What does a useful first version look like, and how will success be measured?
- What failure would be unacceptable?
Web Research Rules
When internet access is available and the user has not forbidden browsing:
- Search after the initial interview or after enough context is available to target the research.
- Prefer authoritative sources: official docs, regulators, standards bodies, peer-reviewed papers, company docs, competition pages, domain glossaries, and primary industry sources.
- Use recent sources for markets, regulations, products, benchmarks, or live competitions.
- Capture source links and dates where relevant.
- Clearly label researched facts, user-provided facts, and inferences.
- Do not overfit the plan to generic web summaries when local project docs or user context contradict them.
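One way to keep researched facts, user-provided facts, and inferences clearly labeled is to attach provenance to every claim. The sketch below is illustrative only: `Provenance`, `Claim`, and the rendered line format are hypothetical, and the rule that a researched fact must carry a source link is an assumption that is stricter than the "where relevant" wording above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Provenance(Enum):
    RESEARCHED = "researched fact"
    USER_PROVIDED = "user-provided fact"
    INFERENCE = "inference"


@dataclass(frozen=True)
class Claim:
    text: str
    provenance: Provenance
    source_url: Optional[str] = None
    source_date: Optional[str] = None  # free-form date string, e.g. ISO 8601

    def __post_init__(self) -> None:
        # Assumption for this sketch: researched facts always need a link.
        if self.provenance is Provenance.RESEARCHED and not self.source_url:
            raise ValueError("researched facts need a source link")

    def render(self) -> str:
        """Format the claim with its label and, if present, its source."""
        line = f"[{self.provenance.value}] {self.text}"
        if self.source_url:
            line += f" (source: {self.source_url}"
            if self.source_date:
                line += f", {self.source_date}"
            line += ")"
        return line
```

Rendering each claim through one function keeps the labels consistent across the briefing, so a reader can tell at a glance which statements came from research and which are guesses.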
Domain Context Contract
Always include this compact block near the top of the output. Downstream Handy Data Science skills must carry it forward and update it when new evidence changes the understanding.
## Domain Context Contract
Domain:
Business problem:
Stakeholders/users:
Decision or workflow affected:
End goal:
Unit of analysis:
Target/KPI/label:
Success metric:
Operational constraints:
Domain terminology:
Data assumptions:
Research-backed facts:
Open questions:
Do-not-assume / prohibited claims:
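Because downstream skills must carry the contract forward and update it, it can help to treat the block as data and render it on demand. The following is a minimal sketch under stated assumptions: `CONTRACT_FIELDS` copies the field names from the block above, while `render_contract`, `update_contract`, and the `UNKNOWN` placeholder for unfilled fields are hypothetical choices, not something the skill prescribes.

```python
# Field names copied from the contract block, in the same order.
CONTRACT_FIELDS = (
    "Domain",
    "Business problem",
    "Stakeholders/users",
    "Decision or workflow affected",
    "End goal",
    "Unit of analysis",
    "Target/KPI/label",
    "Success metric",
    "Operational constraints",
    "Domain terminology",
    "Data assumptions",
    "Research-backed facts",
    "Open questions",
    "Do-not-assume / prohibited claims",
)


def render_contract(values: dict[str, str]) -> str:
    """Render the contract block; unfilled fields get an explicit placeholder."""
    unknown = set(values) - set(CONTRACT_FIELDS)
    if unknown:
        raise KeyError(f"not a contract field: {sorted(unknown)}")
    lines = ["## Domain Context Contract"]
    for name in CONTRACT_FIELDS:
        # "UNKNOWN" is an assumed placeholder so gaps stay visible.
        lines.append(f"{name}: {values.get(name, 'UNKNOWN')}")
    return "\n".join(lines)


def update_contract(contract: dict[str, str], changes: dict[str, str]) -> dict[str, str]:
    """Return a new contract with changes applied, leaving the input untouched."""
    return {**contract, **changes}
```

Returning a new dictionary from `update_contract` rather than mutating in place makes it easier to compare the contract before and after new evidence arrives.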
Domain Briefing Output
Return a report with this structure:
## Problem Charter
Business/domain problem:
Stakeholders:
Decision or workflow affected:
End goal:
Question type:
Unit of analysis:
Target/KPI:
Success criteria:
Constraints:
Failure modes:
## Domain Notes
Key terminology:
Domain mechanics:
Known benchmarks or baselines:
Regulatory/compliance/safety concerns:
Relevant source links:
## Data Science Framing
Available data:
Missing data:
Candidate datasets/features:
Potential leakage risks:
Evaluation strategy:
Recommended first baseline:
Recommended next data checks:
## Assumptions And Open Questions
Assumptions:
Open questions:
Risks:
Next-stage skill:
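A quick structural check can confirm that a drafted briefing contains all four sections before it is returned. This is only a sketch: `REQUIRED_SECTIONS` is taken from the headings in the structure above, and `missing_sections` is a hypothetical helper that checks heading presence, not content quality.

```python
# Section headings taken from the report structure above.
REQUIRED_SECTIONS = (
    "## Problem Charter",
    "## Domain Notes",
    "## Data Science Framing",
    "## Assumptions And Open Questions",
)


def missing_sections(report: str) -> list[str]:
    """Return required headings that do not appear as their own line."""
    present = {line.strip() for line in report.splitlines()}
    return [heading for heading in REQUIRED_SECTIONS if heading not in present]
```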
Handoff
- Use `analytic-question-framing` after the domain brief when the data science question still needs sharpening.
- Use `data-checking-eda` after the domain brief when data files are available.
- Use `modeling-strategy-review` when the question type and data contract are clear enough to choose methods.
- Use `reproducible-analysis-reporting` when the domain brief must become a stakeholder report.
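The handoff rules can be read as a simple mapping from conditions to skill names. The sketch below assumes the four conditions have already been judged and are passed in as booleans; the function name and parameter names are hypothetical, while the skill names are the ones listed above.

```python
def next_stage_skills(
    question_needs_sharpening: bool,
    data_files_available: bool,
    question_and_data_contract_clear: bool,
    needs_stakeholder_report: bool,
) -> list[str]:
    """Return every handoff skill whose condition holds, in the listed order."""
    skills: list[str] = []
    if question_needs_sharpening:
        skills.append("analytic-question-framing")
    if data_files_available:
        skills.append("data-checking-eda")
    if question_and_data_contract_clear:
        skills.append("modeling-strategy-review")
    if needs_stakeholder_report:
        skills.append("reproducible-analysis-reporting")
    return skills
```

Returning a list instead of a single name reflects that more than one condition can hold at once, for instance when data files exist and the question still needs sharpening. How to order or prioritize the results is left to the caller.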