question-generation


Generate research questions from structural gaps to guide knowledge exploration. Use when user asks what to explore next, needs research direction, or requires gap-bridging guidance.

By mrhpython. Updated 11/14/2025.

name: question-generation
description: Generate research questions from structural gaps to guide knowledge exploration. Use when user asks what to explore next, needs research direction, or requires gap-bridging guidance.

Question Generation Skill

Purpose: Auto-generate research questions from structural gaps to guide knowledge exploration

Backend: backend/services/question-generator.cjs (264 lines)

Status: ✅ Operational (implemented in Phase 2)


What This Skill Does

Generates targeted research questions from detected gaps in knowledge graphs, helping users explore connections and fill structural holes.

Key Capabilities:

  • Template-based question generation (4 types)
  • Bridge concept suggestions
  • Question deduplication and ranking
  • Output formatting (simple/detailed/markdown)
  • Gap metadata integration

When to Use This Skill

Use question generation when:

  • The user asks "what should I explore next?"
  • Graph analysis reveals structural gaps
  • An investigation needs guidance to go deeper
  • You are planning research directions
  • You are prioritizing knowledge expansion

Question Template System

4 Question Types (by Gap Type)

Structural Questions (disconnected clusters):

  • "How do {clusterA} and {clusterB} relate?"
  • "What connects {conceptsA} with {conceptsB}?"
  • "What is the intermediate concept between {keyA} and {keyB}?"
  • "Why might {keyA} influence {keyB}?"
  • "What bridges the gap between {topicA} and {topicB}?"

Topical Questions (topic mismatch):

  • "How does {clusterA} impact {clusterB}?"
  • "What role does {keyA} play in {topicB}?"
  • "How can {conceptsA} enhance {conceptsB}?"
  • "What insights from {topicA} apply to {topicB}?"
  • "Where do {clusterA} and {clusterB} overlap?"

Semantic Questions (meaning gap):

  • "What semantic relationship exists between {keyA} and {keyB}?"
  • "How are {conceptsA} conceptually similar to {conceptsB}?"
  • "What shared meaning connects {topicA} and {topicB}?"
  • "What underlying principle unites {clusterA} and {clusterB}?"
  • "How can we reframe {keyA} in terms of {keyB}?"

Logical Questions (missing causal links):

  • "What logical steps connect {keyA} to {keyB}?"
  • "What causes the relationship between {topicA} and {topicB}?"
  • "What are the implications of connecting {clusterA} with {clusterB}?"
  • "How does {keyA} lead to {keyB}?"
  • "What conditions allow {conceptsA} to influence {conceptsB}?"
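The mapping from gap type to templates can be pictured as a simple lookup table. The sketch below is illustrative only and is not taken from question-generator.cjs: the names `TEMPLATES` and `templatesForGap` are hypothetical, only the first three templates of each type are copied in, and both the "first N in listed order" selection and the structural fallback for unknown types are assumptions (the former matches the order seen in the examples later in this document).

```javascript
// Abbreviated copy of the template lists above (first three per type).
const TEMPLATES = {
  structural: [
    'How do {clusterA} and {clusterB} relate?',
    'What connects {conceptsA} with {conceptsB}?',
    'What is the intermediate concept between {keyA} and {keyB}?',
  ],
  topical: [
    'How does {clusterA} impact {clusterB}?',
    'What role does {keyA} play in {topicB}?',
    'How can {conceptsA} enhance {conceptsB}?',
  ],
  semantic: [
    'What semantic relationship exists between {keyA} and {keyB}?',
    'How are {conceptsA} conceptually similar to {conceptsB}?',
    'What shared meaning connects {topicA} and {topicB}?',
  ],
  logical: [
    'What logical steps connect {keyA} to {keyB}?',
    'What causes the relationship between {topicA} and {topicB}?',
    'What are the implications of connecting {clusterA} with {clusterB}?',
  ],
};

// Hypothetical selector: returns the first `limit` templates for a gap type.
// Falling back to the structural list for unknown types is an assumption.
function templatesForGap(gapType, limit = 3) {
  const known = Object.prototype.hasOwnProperty.call(TEMPLATES, gapType);
  const list = known ? TEMPLATES[gapType] : TEMPLATES.structural;
  return list.slice(0, limit);
}

console.log(templatesForGap('logical', 2));
```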

Template Variables

Cluster-Level:

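  • {clusterA} / {clusterB} - Cluster labels derived from cluster IDs (rendered as "cluster 0", "cluster 3" in the examples in this document)
  • {topicA} / {topicB} - Cluster topics (if labeled; the examples show "cluster N" where no topic label exists)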

Concept-Level:

  • {conceptsA} / {conceptsB} - Top 3 concepts from each cluster (comma-separated)
  • {keyA} / {keyB} - Highest centrality concept from each cluster

Example Substitution:

Template: "What logical steps connect {keyA} to {keyB}?"
Variables: { keyA: "revenue", keyB: "management" }
Output: "What logical steps connect revenue to management?"
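The substitution above can be sketched as a single regex pass. This is a minimal sketch under stated assumptions, not the code in question-generator.cjs: the function name `fillTemplate` is hypothetical, and leaving unknown placeholders untouched is an assumed behavior.

```javascript
// Hypothetical helper: replaces every {name} placeholder in one pass.
// Placeholders with no matching variable are left as-is (an assumption).
function fillTemplate(template, variables) {
  return template.replace(/\{(\w+)\}/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(variables, name)
      ? String(variables[name])
      : match
  );
}

const question = fillTemplate(
  'What logical steps connect {keyA} to {keyB}?',
  { keyA: 'revenue', keyB: 'management' }
);
console.log(question);
// prints "What logical steps connect revenue to management?"
```

Because the replacer runs once per match, a substituted value that itself contains braces is not expanded again, which is consistent with the lack of nested template support.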

Question Output Structure

{
  question: "What logical steps connect revenue to management?",
  gap: {
    from: 0,        // Cluster A ID
    to: 3,          // Cluster B ID
    severity: "major",
    score: 0.805
  },
  type: "logical",
  bridgeConcepts: ["relationship", "connection"]
}

Bridge Concept Suggestions

Heuristic Rules:

  1. Look for substring overlaps between concepts from each cluster
  2. Suggest concept combinations (e.g., "revenue-metrics")
  3. Fall back to generic bridges if no specific matches are found

Generic Bridge Concepts:

  • relationship
  • connection
  • impact
  • influence
  • correlation
  • dependency

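Example (no concept in one cluster is a substring of a concept in the other, so the generic fallback applies):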

Cluster A: ["revenue", "projections", "cost"]
Cluster B: ["management", "strategy", "decisions"]

Suggested bridges: ["relationship", "connection"]
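One way to read the three heuristic rules is sketched below. The function name `suggestBridges`, the `maxBridges` limit of 2 (chosen to match the two bridges shown in the example), and the exact interpretation of "substring overlap" are all assumptions; the actual backend may differ.

```javascript
const GENERIC_BRIDGES = [
  'relationship', 'connection', 'impact',
  'influence', 'correlation', 'dependency',
];

// Hypothetical sketch of the three heuristic rules.
function suggestBridges(conceptsA, conceptsB, maxBridges = 2) {
  const specific = [];
  for (const a of conceptsA) {
    for (const b of conceptsB) {
      const x = a.toLowerCase();
      const y = b.toLowerCase();
      // Rule 1: treat "one concept contains the other" as an overlap.
      if (x !== y && (x.includes(y) || y.includes(x))) {
        // Rule 2: propose the combination as a candidate bridge.
        specific.push(`${a}-${b}`);
      }
    }
  }
  // Rule 3: fall back to generic bridges when nothing specific matched.
  const candidates = specific.length > 0 ? specific : GENERIC_BRIDGES;
  return candidates.slice(0, maxBridges);
}

console.log(suggestBridges(
  ['revenue', 'projections', 'cost'],
  ['management', 'strategy', 'decisions']
));
// prints [ 'relationship', 'connection' ]
```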

Generation Options

{
  maxQuestionsPerGap: 3,      // Questions generated per gap
  maxTotalQuestions: 15,      // Total questions across all gaps
  minGapSeverity: 'minor'     // Only generate for minor+ gaps
}

Defaults:

  • 3 questions per gap
  • 15 total questions maximum
  • Minor severity threshold (excludes "bridged" gaps)
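The sketch below shows how these options could be applied: a severity threshold filters gaps, then the per-gap and total caps trim the output. The helper names `selectGaps` and `capQuestions` and the numeric `SEVERITY_RANK` table are hypothetical; only the option names and default values come from this document.

```javascript
// Defaults as documented above.
const DEFAULT_OPTIONS = {
  maxQuestionsPerGap: 3,
  maxTotalQuestions: 15,
  minGapSeverity: 'minor',
};

// Assumed ordering from least to most severe; "bridged" ranks below "minor".
const SEVERITY_RANK = { bridged: 0, minor: 1, major: 2, critical: 3 };

// Hypothetical helper: keep only gaps at or above the severity threshold.
function selectGaps(gaps, options = {}) {
  const { minGapSeverity } = { ...DEFAULT_OPTIONS, ...options };
  const threshold = SEVERITY_RANK[minGapSeverity];
  return gaps.filter((gap) => SEVERITY_RANK[gap.severity] >= threshold);
}

// Hypothetical helper: apply the per-gap cap, then the overall cap.
// `questionsByGap` is an array of question arrays, one per gap.
function capQuestions(questionsByGap, options = {}) {
  const { maxQuestionsPerGap, maxTotalQuestions } = { ...DEFAULT_OPTIONS, ...options };
  return questionsByGap
    .flatMap((questions) => questions.slice(0, maxQuestionsPerGap))
    .slice(0, maxTotalQuestions);
}

const gaps = [
  { from: 0, to: 3, severity: 'major' },
  { from: 0, to: 2, severity: 'minor' },
  { from: 1, to: 2, severity: 'bridged' },
];
console.log(selectGaps(gaps).length); // prints 2
```

With the defaults, six gaps that each offer four candidate questions would yield 18 questions after the per-gap cap and 15 after the total cap.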

Question Ranking

Priority Order:

  1. Gap severity (critical > major > minor)
  2. Gap score (higher scores prioritized)
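The priority order can be expressed as a two-key comparator. This is a sketch, assuming each question carries a `gap` object with `severity` and `score` as in the output structure above; the name `rankQuestions` and the numeric weights are hypothetical.

```javascript
// Assumed numeric weights for the documented order: critical > major > minor.
const SEVERITY_ORDER = { critical: 3, major: 2, minor: 1 };

// Hypothetical helper: sorts a copy, leaving the input array untouched.
function rankQuestions(questions) {
  return [...questions].sort((a, b) => {
    const severityDiff =
      (SEVERITY_ORDER[b.gap.severity] ?? 0) - (SEVERITY_ORDER[a.gap.severity] ?? 0);
    // Ties on severity are broken by gap score, highest first.
    return severityDiff !== 0 ? severityDiff : b.gap.score - a.gap.score;
  });
}

// Placeholder questions; severities and scores echo the examples in this document.
const ranked = rankQuestions([
  { question: 'Q1', gap: { severity: 'minor', score: 0.786 } },
  { question: 'Q2', gap: { severity: 'major', score: 0.702 } },
  { question: 'Q3', gap: { severity: 'major', score: 0.805 } },
]);
console.log(ranked.map((q) => q.question)); // prints [ 'Q3', 'Q2', 'Q1' ]
```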

Deduplication:

  • Normalizes each question (lowercase, trimmed) before comparison
  • Removes questions that are identical after normalization, so matching is case-insensitive

Output Formats

Simple Format

[
  "What logical steps connect revenue to management?",
  "How do cluster 0 and cluster 3 relate?",
  "What causes the relationship between cluster 0 and cluster 2?"
]

Detailed Format

[
  {
    question: "What logical steps connect revenue to management?",
    gap: "0 → 3",
    severity: "major",
    score: "0.805",
    type: "logical",
    suggestedBridges: "relationship, connection"
  }
]

Markdown Format

1. **What logical steps connect revenue to management?**
   - Gap: 0 → 3 (major)
   - Suggested bridges: relationship, connection

2. **How do cluster 0 and cluster 2 relate?**
   - Gap: 0 → 2 (minor)
   - Suggested bridges: relationship, connection
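The three formats can all be derived from the question objects described under Question Output Structure. The formatter below is a sketch: `formatQuestions` and its `format` argument are hypothetical names, and the three-decimal score string is inferred from the "0.805" shown in the detailed example.

```javascript
// Hypothetical formatter covering the three documented output formats.
function formatQuestions(questions, format = 'simple') {
  switch (format) {
    case 'detailed':
      return questions.map((q) => ({
        question: q.question,
        gap: `${q.gap.from} → ${q.gap.to}`,
        severity: q.gap.severity,
        score: q.gap.score.toFixed(3),
        type: q.type,
        suggestedBridges: q.bridgeConcepts.join(', '),
      }));
    case 'markdown':
      return questions
        .map((q, i) => [
          `${i + 1}. **${q.question}**`,
          `   - Gap: ${q.gap.from} → ${q.gap.to} (${q.gap.severity})`,
          `   - Suggested bridges: ${q.bridgeConcepts.join(', ')}`,
        ].join('\n'))
        .join('\n\n');
    default:
      // Simple format: question strings only.
      return questions.map((q) => q.question);
  }
}

const sampleQuestion = {
  question: 'What logical steps connect revenue to management?',
  gap: { from: 0, to: 3, severity: 'major', score: 0.805 },
  type: 'logical',
  bridgeConcepts: ['relationship', 'connection'],
};
console.log(formatQuestions([sampleQuestion], 'markdown'));
```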

Integration Points

Called by:

  • backend/services/agent-graph-service.cjs:145-151 (automatic on every analysis)

Depends on:

  • Enhanced gaps from gap-scorer.cjs
  • Cluster concepts from community detection
  • Gap metadata (severity, type, score)

Outputs:

  • Research questions array
  • Question summary statistics
  • Formatted question strings

Performance

  • Speed: <2ms overhead per analysis
  • Questions Generated: 3-15 per analysis
  • Cost: $0 (template-based, no API calls)

No regression: Analysis time 16-76ms maintained from Phase 1


Testing

# Test question generation
node -e "
const { analyzeForAgent } = require('./backend/services/agent-graph-service.cjs');
(async () => {
  const result = await analyzeForAgent('finance', 'revenue projections cost structure burn rate management metrics');
  console.log('Questions generated:', result.researchQuestions.length);
  console.log('');
  result.researchQuestions.forEach((q, i) => {
    console.log(\`\${i+1}. \${q.question}\`);
    console.log(\`   Gap: \${q.gap.from} → \${q.gap.to} (\${q.gap.severity})\`);
  });
})();
"

# Expected output:
# Questions generated: 9
#
# 1. What logical steps connect revenue to management?
#    Gap: 0 → 3 (major)
# 2. What causes the relationship between cluster 0 and cluster 3?
#    Gap: 0 → 3 (major)
# ...

Question Summary Statistics

Provided fields:

  • total - Total questions generated
  • byType - Breakdown by structural/topical/semantic/logical
  • bySeverity - Breakdown by critical/major/minor
  • topQuestions - Top 5 most important questions

Example:

{
  total: 9,
  byType: { structural: 0, topical: 0, semantic: 0, logical: 9 },
  bySeverity: { critical: 0, major: 3, minor: 6 },
  topQuestions: [
    "What logical steps connect revenue to management?",
    "What causes the relationship between cluster 0 and cluster 3?",
    ...
  ]
}
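The summary fields can be computed in one pass over the question list. The sketch below assumes the list is already ranked, so the first five entries are the top questions; `summarizeQuestions` is a hypothetical name and the sample data is placeholder content, not real output.

```javascript
// Hypothetical helper: builds the four documented summary fields.
function summarizeQuestions(questions) {
  const byType = { structural: 0, topical: 0, semantic: 0, logical: 0 };
  const bySeverity = { critical: 0, major: 0, minor: 0 };
  const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);
  for (const q of questions) {
    if (has(byType, q.type)) byType[q.type] += 1;
    if (has(bySeverity, q.gap.severity)) bySeverity[q.gap.severity] += 1;
  }
  return {
    total: questions.length,
    byType,
    bySeverity,
    // Assumes `questions` is already ranked by severity and score.
    topQuestions: questions.slice(0, 5).map((q) => q.question),
  };
}

// Nine placeholder questions: 3 from a major gap, 6 from minor gaps.
const sample = Array.from({ length: 9 }, (_, i) => ({
  question: `Question ${i + 1}`,
  type: 'logical',
  gap: { severity: i < 3 ? 'major' : 'minor' },
}));
const summary = summarizeQuestions(sample);
console.log(summary.total, summary.byType.logical, summary.bySeverity);
```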

Real-World Examples

@finance (Financial Analysis):

9 questions generated from 3 gaps:

1. What logical steps connect revenue to management?
   Gap: 0 → 3 (major, score=0.805)
   Bridges: relationship, connection

2. What causes the relationship between cluster 0 and cluster 3?
   Gap: 0 → 3 (major)

3. What are the implications of connecting cluster 0 with cluster 3?
   Gap: 0 → 3 (major)

4. What logical steps connect revenue to forecasting?
   Gap: 0 → 2 (minor, score=0.786)

... (9 total)

@marketing (Campaign Strategy):

3 questions generated from 1 gap:

1. What logical steps connect product to conversion?
   Gap: 0 → 1 (major, score=0.702)
   Bridges: relationship, connection

2. What causes the relationship between cluster 0 and cluster 1?
   Gap: 0 → 1 (major)

3. What are the implications of connecting cluster 0 with cluster 1?
   Gap: 0 → 1 (major)

@seo (Search Optimization):

3 questions generated from 1 gap:

1. What semantic relationship exists between keywords and performance?
   Gap: 1 → 2 (minor, score=0.654)
   Bridges: relationship, connection

2. How are keyword, research conceptually similar to performance, rankings?
   Gap: 1 → 2 (minor)

3. What shared meaning connects cluster 1 and cluster 2?
   Gap: 1 → 2 (minor)

Algorithmic Details

Concept Extraction:

  • Top 3 concepts per cluster (by centrality or order)
  • Falls back to all concepts if fewer than 3 are available
  • Empty clusters skipped
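The extraction rules above can be sketched as a function that turns two clusters into template variables. Everything about the cluster shape here is an assumption (`id`, optional `topic`, and `concepts` with a numeric `centrality`), as are the names `topConcepts` and `buildVariables`; the "cluster N" label follows the way unlabeled clusters appear in this document's examples.

```javascript
// Hypothetical cluster shape: { id, topic?, concepts: [{ name, centrality }] }.
function topConcepts(cluster, limit = 3) {
  return [...cluster.concepts]
    .sort((x, y) => y.centrality - x.centrality)
    .slice(0, limit) // yields all concepts when fewer than `limit` exist
    .map((concept) => concept.name);
}

// Returns template variables, or null when either cluster is empty (skipped).
function buildVariables(clusterA, clusterB) {
  const a = topConcepts(clusterA);
  const b = topConcepts(clusterB);
  if (a.length === 0 || b.length === 0) return null;
  const label = (cluster) => `cluster ${cluster.id}`;
  return {
    clusterA: label(clusterA),
    clusterB: label(clusterB),
    topicA: clusterA.topic ?? label(clusterA),
    topicB: clusterB.topic ?? label(clusterB),
    conceptsA: a.join(', '),
    conceptsB: b.join(', '),
    keyA: a[0], // highest-centrality concept
    keyB: b[0],
  };
}

// Illustrative data only; centrality values and the topic label are made up.
const vars = buildVariables(
  { id: 0, concepts: [
    { name: 'cost', centrality: 0.4 },
    { name: 'revenue', centrality: 0.9 },
    { name: 'projections', centrality: 0.6 },
    { name: 'burn rate', centrality: 0.1 },
  ] },
  { id: 3, topic: 'leadership', concepts: [
    { name: 'management', centrality: 0.8 },
    { name: 'strategy', centrality: 0.5 },
  ] }
);
console.log(vars.keyA, '|', vars.conceptsA, '|', vars.topicA, '|', vars.topicB);
```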

Template Filling:

  • Regex replacement: \{variableName\} → actual value
  • All placeholders replaced in single pass
  • No nested template support

Deduplication Algorithm:

  • Lowercase + trim normalization
  • Set-based duplicate detection
  • Preserves first occurrence
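The deduplication steps above map directly onto a `Set` keyed by the normalized question text. This is a minimal sketch; the name `dedupeQuestions` is hypothetical and the question objects are assumed to expose a `question` string.

```javascript
// Hypothetical helper: keeps the first occurrence of each normalized question.
function dedupeQuestions(questions) {
  const seen = new Set();
  return questions.filter((q) => {
    const key = q.question.toLowerCase().trim();
    if (seen.has(key)) return false; // later duplicate, drop it
    seen.add(key);
    return true;
  });
}

const unique = dedupeQuestions([
  { question: 'How does revenue lead to management?' },
  { question: '  how does REVENUE lead to management? ' },
  { question: 'What logical steps connect revenue to management?' },
]);
console.log(unique.length); // prints 2
```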

Related Skills

  • Gap Scoring - Provides gap metadata for question generation
  • Evolution Tracking - Questions inform next-stage recommendations
  • AI Enhancement - AI insights complement generated questions

Limitations

Current constraints:

  • Template-based only (no AI generation)
  • Maximum 5 templates per gap type
  • Generic bridge concepts when no specific matches
  • No context-aware question filtering

Future enhancements (not implemented):

  • AI-powered custom question generation
  • Domain-specific template libraries
  • Bridge concept prediction via embeddings
  • Question quality scoring

Implementation: Phase 2 (2025-11-06)

Test Status: ✅ Verified operational across @marketing, @seo, @finance

Documentation: workspace/docs/Obsidian-v2/daily/2025-11-06-PHASE2-COMPLETE.md

Install via CLI
npx skills add https://github.com/mrhpython/Soulfield --skill question-generation