name: knowledge-graph-research
description: Deep discourse analysis of credit market documents using InfraNodus text network analysis. Transforms document collections into knowledge graphs, revealing conceptual relationships, thematic clusters, structural gaps, and research questions. Use when analyzing credit documentation, developing investment theses, conducting due diligence, or synthesizing market intelligence.
license: Complete terms in LICENSE.txt
Knowledge Graph Research Skill
Overview
This skill enables deep discourse analysis of credit market documents using InfraNodus text network analysis. It transforms collections of documents into knowledge graphs, revealing conceptual relationships, thematic clusters, structural gaps, and research questions that emerge from the discourse structure itself.
Process
High-Level Workflow
The workflow has four phases: understanding the capabilities, preparing documents, analyzing discourse structure, and interpreting results:
Phase 1: Understand Core Capabilities
1.1 Text Network Analysis
- Convert documents into knowledge graphs with nodes (concepts) and edges (relationships)
- Extract main topical clusters from credit reports, research, and market commentary
- Identify structural discourse gaps - areas where concepts are isolated or poorly connected
- Generate research questions based on network topology
1.2 Discourse Gap Detection
- Find clusters that lack connections, suggesting blind spots in analysis
- Identify bridge concepts that could connect disparate themes
- Reveal assumptions embedded in document structure
- Surface areas requiring additional due diligence
1.3 Concept Clustering
- Group related concepts using network modularity algorithms
- Extract dominant themes from credit market discussions
- Track thematic evolution across document sets
- Identify emerging narratives in market sentiment
1.4 Graph-Based AI Insights (Graph-RAG)
- Use knowledge graph structure to provide contextually-aware AI responses
- Query across graph topology rather than just text similarity
- Generate insights that account for concept relationships and discourse patterns
- Produce responses grounded in structural understanding of the domain
1.5 Visual Graph Generation
- Export DOT format graphs for Graphviz rendering
- Create entity-based graphs (high-level overview, sparser)
- Create concept-based graphs (detailed, shows all relationships)
- Visualize discourse structure for presentations and reports
Phase 2: Prepare Documents and Plan Analysis
Load the Document Preparation Guide for comprehensive document preparation strategies.
2.1 Document Selection
Include Context-Rich Documents:
- Credit agreements and amendments
- Research reports and market commentary
- Earnings call transcripts
- Regulatory filings (10-K, 10-Q risk factors)
- Internal memos and analysis
Mix Document Types:
- Combine different perspectives (internal vs. external)
- Include multiple time periods for evolution tracking
- Mix document types to reveal cross-cutting themes
- Balance depth and breadth of coverage
2.2 Text Quality Requirements
Clean, Well-Formatted Text:
- Use OCR-corrected text when needed
- Remove excessive boilerplate if it dominates
- Preserve meaningful structure (headings, sections)
- Ensure sufficient length for meaningful analysis
Consider Document Length:
- Very long documents may need chunking
- Very short documents may not generate meaningful graphs
- Balance between detail and manageability
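A minimal sketch of chunking a long document before analysis. The word limit and overlap are illustrative assumptions, not limits documented for InfraNodus, and `chunk_text` is a hypothetical helper rather than part of this skill.

```python
def chunk_text(text: str, max_words: int = 2000, overlap: int = 100) -> list[str]:
    """Split text into word-based chunks that share `overlap` words with the previous chunk."""
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    if len(words) <= max_words:
        # Short documents pass through unchanged; empty input yields no chunks.
        return [text] if words else []
    chunks = []
    step = max_words - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # The last chunk already reaches the end of the document.
    return chunks
```

Each chunk can then be passed as a separate entry in `texts`. The overlap keeps concepts that straddle a chunk boundary co-occurring in at least one chunk.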
2.3 Context Naming Strategy
Use Descriptive, Structured Names:
- Include date/version for tracking evolution
- Examples: bsl_covenants_q4_2024, healthcare_risk_analysis_2024
- Helps organize multiple analyses
- Enables longitudinal tracking
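One way to keep context names consistent is to generate them from a topic and a period. This is a sketch only; `build_context_name` is a hypothetical helper, and the lowercase-with-underscores convention is inferred from the examples above.

```python
import re


def build_context_name(topic: str, period: str) -> str:
    """Join topic and period into a lowercase, underscore-separated context name."""
    raw = f"{topic} {period}".lower()
    # Collapse every run of non-alphanumeric characters into a single underscore.
    return re.sub(r"[^a-z0-9]+", "_", raw).strip("_")


name = build_context_name("BSL Covenants", "Q4 2024")
```

Generating names this way makes it easier to sort contexts by period when comparing analyses over time.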
Phase 3: Execute Analysis Workflows
3.1 Covenant Precedent Research
Scenario: Analyzing covenant structures across multiple credit agreements
Workflow:
# Step 1: Collect credit agreements
agreements = [
    credit_agreement_1,
    credit_agreement_2,
    credit_agreement_3
]
# Step 2: Build knowledge graph
graph = analyze_text_network(
    texts=agreements,
    context_name="financial_covenant_structures"
)
# Step 3: Identify gaps in covenant coverage
gaps = detect_discourse_gaps(
    texts=agreements,
    context_name="covenant_gap_analysis"
)
# Step 4: Ask strategic questions
insights = get_graph_based_insights(
    texts=agreements,
    prompt="What covenant structures are common across these agreements? What variations exist?",
    mode="question",
    context_name="covenant_comparison"
)
# Step 5: Visualize for presentation
viz = generate_knowledge_graph_visualization(
    texts=agreements,
    extract_entities_only=False,
    context_name="covenant_network"
)
3.2 Market Sentiment Analysis
Scenario: Tracking thematic evolution in credit market commentary
Workflow:
# Analyze thematic evolution
reports = [weekly_report_1, weekly_report_2, weekly_report_3]
# Extract clusters
clusters = cluster_credit_concepts(
    texts=reports,
    min_cluster_size=2,
    context_name="weekly_credit_themes"
)
# Identify what's NOT being discussed
gaps = detect_discourse_gaps(
    texts=reports,
    context_name="market_blind_spots"
)
# Generate strategic insights
insights = get_graph_based_insights(
    texts=reports,
    prompt="What emerging themes are gaining connectivity in credit discussions?",
    mode="chat",
    context_name="theme_evolution"
)
3.3 Due Diligence Enhancement
Scenario: Comprehensive analysis of deal documentation
Workflow:
# Combine multiple document types
docs = [
    credit_agreement,
    security_agreement,
    intercreditor_agreement,
    fee_letter
]
# Build comprehensive graph
graph = analyze_text_network(
    texts=docs,
    context_name="comprehensive_deal_structure"
)
# Find structural gaps
gaps = detect_discourse_gaps(
    texts=docs,
    context_name="documentation_completeness"
)
# Generate diligence questions
questions = get_graph_based_insights(
    texts=docs,
    prompt="What key topics are isolated or poorly connected in this deal documentation?",
    mode="question",
    context_name="diligence_gaps"
)
3.4 Investment Thesis Development
Scenario: Building comprehensive understanding of market narratives
Workflow:
- Collect research reports, earnings calls, and market commentary
- Build knowledge graph to identify well-connected vs. isolated themes
- Detect discourse gaps that competitors may have missed
- Generate research questions based on structural analysis
- Use Graph-RAG to synthesize insights across the discourse structure
3.5 Longitudinal Analysis
Scenario: Tracking how knowledge graphs evolve over time
Workflow:
# Quarter 1
q1_graph = analyze_text_network(texts=q1_docs, context_name="q1_2024")
# Quarter 2
q2_graph = analyze_text_network(texts=q2_docs, context_name="q2_2024")
# Compare modularity, cluster evolution, and emerging concepts between the two graphs
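The comparison step could be sketched as follows, assuming each result is a dict shaped like the Knowledge Graph example under Output Structure (`nodes`, `clusters`, `modularity`). `compare_graphs` is a hypothetical helper, and the two small dicts are made-up sample data for illustration only.

```python
def compare_graphs(earlier: dict, later: dict) -> dict:
    """Summarize how a knowledge graph changed between two periods."""
    earlier_ids = {node["id"] for node in earlier["nodes"]}
    later_ids = {node["id"] for node in later["nodes"]}
    return {
        "modularity_change": round(later["modularity"] - earlier["modularity"], 4),
        "emerging_concepts": sorted(later_ids - earlier_ids),
        "dropped_concepts": sorted(earlier_ids - later_ids),
        "cluster_count_change": len(later["clusters"]) - len(earlier["clusters"]),
    }


# Made-up sample results standing in for two quarterly analyses.
q1 = {
    "nodes": [{"id": "leverage_ratio"}, {"id": "covenant_lite"}],
    "clusters": [{"id": 1}],
    "modularity": 0.38,
}
q2 = {
    "nodes": [{"id": "leverage_ratio"}, {"id": "private_credit"}],
    "clusters": [{"id": 1}, {"id": 2}],
    "modularity": 0.45,
}
delta = compare_graphs(q1, q2)
```

A rising modularity together with new concepts suggests that a separate narrative is forming rather than being absorbed into existing themes.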
Phase 4: Interpret and Apply Results
Load the Best Practices Guide for comprehensive interpretation strategies.
4.1 Graph Interpretation
Modularity Scores:
- Low modularity (<0.3): Interconnected, cohesive discussion
- 0.3-0.5: Ideal range - clear themes with some integration
- High modularity (>0.5): Fragmented discourse with clearly separated clusters
Betweenness Centrality:
- High betweenness = bridge concepts connecting different parts
- Critical for understanding the whole discourse
- Potential areas for innovation or disruption
Cluster Analysis:
- Isolated clusters: Specialized domains or potential blind spots
- Dense clusters: Well-understood, thoroughly discussed topics
- Sparse graphs: Fragmented understanding, need for synthesis
4.2 Gap Analysis
Interpreting Discourse Gaps:
- Gaps don't always mean problems - sometimes specialization is appropriate
- Bridge concepts are critical - they connect otherwise isolated themes
- Use gaps to guide further research and due diligence
- Cross-reference gaps with actual documentation
4.3 Quality Indicators
Good Graph Characteristics:
- Modularity between 0.3-0.5 (clear themes, some integration)
- Multiple clusters of similar size (balanced coverage)
- Bridge concepts connecting major themes
- Interpretable cluster themes
Warning Signs:
- Single dominant cluster (overly general or unfocused)
- Many tiny clusters (fragmented, noisy data)
- No clear thematic structure (poor source quality)
- Modularity >0.6 (siloed, disconnected discourse)
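The modularity thresholds can be turned into a quick check. This is a sketch, not part of the skill: `interpret_modularity` is a hypothetical helper, the 0.3-0.5 and >0.6 bands come from the quality indicators above, and the label for the 0.5-0.6 band is an assumption.

```python
def interpret_modularity(score: float) -> str:
    """Map a modularity score to a coarse reading of discourse structure."""
    if score < 0.3:
        return "cohesive: interconnected discussion with weak thematic separation"
    if score <= 0.5:
        return "balanced: clear themes with some integration"
    if score <= 0.6:
        # Assumed intermediate band; the guide only names 0.3-0.5 and >0.6.
        return "fragmenting: clusters are starting to separate"
    return "siloed: disconnected discourse, treat as a warning sign"


reading = interpret_modularity(0.38)
```

Treat the result as a prompt for closer reading, since the right threshold depends on the document set.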
Reference
Key Functions
analyze_text_network(texts, context_name)
Purpose: Create knowledge graph from document collection
Parameters:
- texts: List of document strings (articles, reports, agreements)
- context_name: Identifier for this analysis context
Returns:
- Knowledge graph with nodes, edges, and clusters
- Main topics and key concepts
- Thematic clusters with modularity scores
- Network statistics
Example:
# Analyze a set of credit agreements
texts = [agreement1_text, agreement2_text, agreement3_text]
result = analyze_text_network(
    texts=texts,
    context_name="syndicated_loan_covenants_q4_2024"
)
# Result contains:
# - nodes: [{id, label, size, metrics}, ...]
# - edges: [{source, target, weight}, ...]
# - clusters: [{id, concepts, theme}, ...]
# - main_topics: ["financial covenants", "collateral", ...]
detect_discourse_gaps(texts, context_name)
Purpose: Identify structural gaps and missing connections
Parameters:
- texts: Document collection
- context_name: Analysis identifier
Returns:
- Discourse gaps (disconnected clusters)
- Bridge concepts that could connect themes
- Research questions generated from gaps
- Recommendations for further investigation
Example:
# Find what's missing in credit analysis
gaps = detect_discourse_gaps(
    texts=research_reports,  # a flat list of document strings
    context_name="healthcare_lending_gap_analysis"
)
# Result might show:
# - "ESG considerations" cluster isolated from "credit metrics" cluster
# - Bridge concept: "sustainability-linked pricing"
# - Research question: "How do ESG factors impact covenant structures?"
cluster_credit_concepts(texts, min_cluster_size, context_name)
Purpose: Extract and cluster thematic concepts
Parameters:
- texts: Document collection
- min_cluster_size: Minimum concepts per cluster (default: 2)
- context_name: Analysis identifier
Returns:
- Topical clusters with key concepts
- Cluster coherence metrics
- Dominant themes and sub-themes
Example:
# Cluster concepts from market commentary
clusters = cluster_credit_concepts(
    texts=market_reports,  # a flat list of document strings
    min_cluster_size=3,
    context_name="bsl_clo_market_themes"
)
# Result groups related concepts:
# Cluster 1: ["spread compression", "covenant-lite", "EBITDA adjustments"]
# Cluster 2: ["CLO issuance", "AAA tranche", "arbitrage opportunities"]
get_graph_based_insights(texts, prompt, mode, context_name)
Purpose: Generate AI insights using graph structure (Graph-RAG)
Parameters:
- texts: Document collection
- prompt: Question or research query
- mode: "chat" | "question" | "summary"
- context_name: Analysis identifier
Returns:
- AI-generated insights grounded in graph topology
- Main topics and key concepts from graph
- Thematic clusters and their relationships
- Discourse gaps relevant to the query
- Graph modularity metrics
Example:
# Ask strategic questions informed by discourse structure
insights = get_graph_based_insights(
    texts=credit_docs + market_research,  # concatenate both lists into one flat list
    prompt="What hidden risks exist in the healthcare lending market?",
    mode="question",
    context_name="healthcare_risk_analysis"
)
# Response includes:
# - AI answer informed by graph structure
# - Identification of isolated risk concepts
# - Gaps between discussed and undiscussed risks
# - Structural insights about discourse patterns
generate_knowledge_graph_visualization(texts, extract_entities_only, context_name)
Purpose: Create exportable DOT graph for visualization
Parameters:
- texts: Document collection
- extract_entities_only: True for high-level (entities), False for detailed (concepts)
- context_name: Visualization identifier
Returns:
- DOT format graph string
- Graph summary and insights
- Usage instructions for rendering
Example:
# Create high-level entity graph
viz = generate_knowledge_graph_visualization(
    texts=agreements,  # a flat list of document strings
    extract_entities_only=True,  # Sparser, cleaner graph
    context_name="loan_agreement_structure"
)
# Save and render:
# 1. Save viz['dot_graph'] to file.dot
# 2. Run: dot -Tpng file.dot -o graph.png
# 3. Or: dot -Tsvg file.dot -o graph.svg
# Create detailed concept graph
viz_detailed = generate_knowledge_graph_visualization(
    texts=agreements,
    extract_entities_only=False,  # Dense, comprehensive graph
    context_name="covenant_network_detailed"
)
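The save-and-render steps in the comments above can be scripted. This sketch assumes the DOT text is available under the `dot_graph` key, as those comments indicate, and that Graphviz may or may not be installed. `save_and_render_dot` is a hypothetical helper.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Optional


def save_and_render_dot(dot_graph: str, dot_path: str, fmt: str = "png") -> Optional[Path]:
    """Write DOT text to disk and render it if the Graphviz `dot` binary is on PATH."""
    source = Path(dot_path)
    source.write_text(dot_graph, encoding="utf-8")
    if shutil.which("dot") is None:
        return None  # Graphviz is not installed; the .dot file is still saved.
    output = source.with_suffix(f".{fmt}")
    # Equivalent to: dot -Tpng file.dot -o file.png
    subprocess.run(["dot", f"-T{fmt}", str(source), "-o", str(output)], check=True)
    return output


rendered = save_and_render_dot("digraph G { a -> b; }", "example_graph.dot")
```

In practice the first argument would be `viz['dot_graph']`. Passing `fmt="svg"` produces a scalable image that is usually better suited to reports.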
Technical Approach
Knowledge Graph Construction
The system uses InfraNodus, which:
- Extracts key concepts from text using NLP
- Builds a network where concepts are nodes and co-occurrence creates edges
- Calculates network metrics (modularity, betweenness centrality, clustering)
- Identifies topical communities using graph algorithms
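To make the node-and-edge idea concrete, here is a toy co-occurrence network built with the standard library only. It illustrates the general technique, not InfraNodus's actual pipeline: the tokenizer, stopword list, and window size are all simplifying assumptions.

```python
import re
from collections import Counter

# A deliberately tiny stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "for", "on", "with"}


def build_cooccurrence_graph(texts: list[str], window: int = 4) -> Counter:
    """Count term pairs that fall inside a sliding window of `window` consecutive tokens."""
    edges: Counter = Counter()
    for text in texts:
        tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]
        for i, term in enumerate(tokens):
            for other in tokens[i + 1:i + window]:
                if other != term:
                    # Sort the pair so (a, b) and (b, a) count as one undirected edge.
                    edges[tuple(sorted((term, other)))] += 1
    return edges


def weighted_degree(edges: Counter) -> Counter:
    """Sum edge weights per node as a simple measure of concept prominence."""
    degree: Counter = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


edges = build_cooccurrence_graph(["leverage ratio covenant", "leverage ratio test"], window=2)
degree = weighted_degree(edges)
```

Metrics such as modularity and betweenness centrality are then computed on a graph of this kind, typically with a dedicated graph library.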
Discourse Gap Analysis
Gaps are identified by:
- Finding clusters with low inter-connectivity
- Identifying bridge concepts that could connect themes
- Measuring graph modularity to assess discourse fragmentation
- Comparing expected vs. actual concept relationships
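The first step, finding clusters with low inter-connectivity, can be approximated from the graph output. The sketch below assumes the `nodes` (with `id` and `cluster`) and `edges` (with `source`, `target`, `weight`) fields shown under Output Structure. The connectivity measure and the 0.05 threshold are assumptions chosen for illustration, not the definition the system uses.

```python
from itertools import combinations


def cluster_connectivity(graph: dict) -> dict:
    """Share of total edge weight that links each pair of clusters."""
    cluster_of = {node["id"]: node["cluster"] for node in graph["nodes"]}
    cross: dict = {}
    total = 0.0
    for edge in graph["edges"]:
        a, b = cluster_of[edge["source"]], cluster_of[edge["target"]]
        total += edge["weight"]
        if a != b:
            pair = tuple(sorted((a, b)))
            cross[pair] = cross.get(pair, 0.0) + edge["weight"]
    clusters = sorted(set(cluster_of.values()))
    return {
        pair: (cross.get(pair, 0.0) / total if total else 0.0)
        for pair in combinations(clusters, 2)
    }


def find_gaps(graph: dict, threshold: float = 0.05) -> list:
    """Return cluster pairs whose share of connecting edge weight is below the threshold."""
    return [pair for pair, share in cluster_connectivity(graph).items() if share < threshold]


# Made-up sample graph: cluster 3 has no links to the other clusters.
sample_graph = {
    "nodes": [
        {"id": "a", "cluster": 1},
        {"id": "b", "cluster": 1},
        {"id": "c", "cluster": 2},
        {"id": "d", "cluster": 3},
    ],
    "edges": [
        {"source": "a", "target": "b", "weight": 8},
        {"source": "a", "target": "c", "weight": 2},
    ],
}
gap_pairs = find_gaps(sample_graph)
```

Pairs flagged this way are candidates for bridge concepts. Whether a flagged pair is a real blind spot still requires domain judgment.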
Graph-RAG Workflow
- Documents are converted to knowledge graph
- Graph topology is analyzed for structure and patterns
- AI queries use both semantic content AND graph structure
- Responses incorporate understanding of concept relationships
Output Structure
Knowledge Graph
{
  "nodes": [
    {
      "id": "financial_covenants",
      "label": "financial covenants",
      "size": 15,
      "betweenness": 0.42,
      "cluster": 1
    }
  ],
  "edges": [
    {
      "source": "financial_covenants",
      "target": "leverage_ratio",
      "weight": 8
    }
  ],
  "clusters": [
    {
      "id": 1,
      "concepts": ["financial_covenants", "leverage_ratio", "EBITDA"],
      "theme": "Financial Metrics"
    }
  ],
  "main_topics": ["financial covenants", "collateral", "guarantees"],
  "modularity": 0.38
}
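A short sketch of reading this structure: ranking bridge concepts by their `betweenness` value and counting concepts per cluster. It assumes the result is a dict with the fields shown above. Both helpers are hypothetical, and the sample data is made up.

```python
def top_bridge_concepts(graph: dict, n: int = 5) -> list[tuple[str, float]]:
    """Return the n node labels with the highest betweenness, highest first."""
    ranked = sorted(
        graph["nodes"],
        key=lambda node: node.get("betweenness", 0.0),
        reverse=True,
    )
    return [(node["label"], node.get("betweenness", 0.0)) for node in ranked[:n]]


def cluster_sizes(graph: dict) -> dict[str, int]:
    """Map each cluster theme to its number of concepts."""
    return {cluster["theme"]: len(cluster["concepts"]) for cluster in graph["clusters"]}


# Made-up sample in the documented shape.
sample = {
    "nodes": [
        {"id": "leverage_ratio", "label": "leverage ratio", "betweenness": 0.10},
        {"id": "financial_covenants", "label": "financial covenants", "betweenness": 0.42},
    ],
    "clusters": [
        {"id": 1, "concepts": ["financial_covenants", "leverage_ratio", "EBITDA"], "theme": "Financial Metrics"},
    ],
}
bridges = top_bridge_concepts(sample, n=1)
sizes = cluster_sizes(sample)
```

Very uneven cluster sizes in `sizes` are a quick way to spot a single dominant cluster.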
Discourse Gaps
{
  "gaps": [
    {
      "cluster_a": "pricing_mechanisms",
      "cluster_b": "environmental_covenants",
      "connectivity": 0.12,
      "suggested_bridges": ["sustainability_linked_pricing"]
    }
  ],
  "research_questions": [
    "How do ESG considerations affect loan pricing?",
    "What covenant structures support sustainability goals?"
  ]
}
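Gap output like this can be turned into follow-up prompts, for example to pass to `get_graph_based_insights`. The sketch assumes the fields shown above. The prompt wording and the `max_connectivity` cutoff are illustrative assumptions, and `gaps_to_prompts` is a hypothetical helper.

```python
def gaps_to_prompts(gap_result: dict, max_connectivity: float = 0.2) -> list[str]:
    """Build follow-up questions from weakly connected cluster pairs, weakest first."""
    prompts = []
    for gap in sorted(gap_result["gaps"], key=lambda g: g["connectivity"]):
        if gap["connectivity"] > max_connectivity:
            continue  # Pair is connected well enough; skip it.
        bridges = ", ".join(gap["suggested_bridges"]) or "no suggested bridge"
        prompts.append(
            f"How are {gap['cluster_a']} and {gap['cluster_b']} related? "
            f"Consider: {bridges}."
        )
    # Append the research questions the analysis already generated.
    return prompts + list(gap_result.get("research_questions", []))


gap_result = {
    "gaps": [
        {
            "cluster_a": "pricing_mechanisms",
            "cluster_b": "environmental_covenants",
            "connectivity": 0.12,
            "suggested_bridges": ["sustainability_linked_pricing"],
        }
    ],
    "research_questions": ["How do ESG considerations affect loan pricing?"],
}
prompts = gaps_to_prompts(gap_result)
```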
Integration with Other Skills
With Document Semantic Search
- Use graph analysis to identify key covenant types
- Feed concept clusters into semantic search queries
- Validate graph insights with specific document retrieval
- Cross-reference gaps with actual documentation
Example:
from document_search import semantic_search_credit_documents
# Step 1: Build knowledge graph
graph = analyze_text_network(
    texts=credit_agreements,
    context_name="covenant_structures"
)
# Step 2: Extract key concepts (one theme label per cluster)
key_concepts = [cluster['theme'] for cluster in graph['clusters']]
# Step 3: Run one semantic search per concept, keeping results keyed by concept
results_by_concept = {}
for concept in key_concepts:
    results_by_concept[concept] = semantic_search_credit_documents(
        query_text=f"{concept} in credit agreements",
        limit=10
    )
With SEC Intelligence
- Analyze 10-K risk factors using discourse analysis
- Track thematic evolution across quarterly filings
- Compare discourse structure across peer companies
- Identify gaps in regulatory disclosure
Example:
# Get 10-K risk factors
from sec_skill import get_latest_sec_10k
sec_doc = get_latest_sec_10k("TICKER")
# Analyze discourse structure
graph = analyze_text_network(
    texts=[sec_doc['risk_factors']],
    context_name="risk_factor_analysis"
)
# Identify gaps
gaps = detect_discourse_gaps(
    texts=[sec_doc['risk_factors']],
    context_name="risk_gaps"
)
With Counterparty Network
- Map conceptual relationships alongside financial exposure
- Identify sector themes affecting network risk
- Track how discourse about counterparties evolves
- Find gaps in sector coverage or understanding
Advanced Techniques
Multi-Source Synthesis
Combine different document types:
# Mix internal analysis, external research, market data
all_sources = internal_memos + sell_side_research + news_articles
comprehensive_graph = analyze_text_network(
    texts=all_sources,
    context_name="multi_source_synthesis"
)
Iterative Refinement
Use gaps to guide further research:
# Initial analysis
gaps = detect_discourse_gaps(texts=docs, context_name="initial")
# Focus research on gap areas
# ... gather more documents on identified gaps ...
# Re-analyze
refined = analyze_text_network(
    texts=docs + gap_focused_docs,
    context_name="refined_analysis"
)
Interpretation Guide
High Betweenness Centrality: Concepts with high betweenness are "bridges" - they connect different parts of the discourse. These are often:
- Integration points between topics
- Critical concepts for understanding the whole
- Potential areas for innovation or disruption
Isolated Clusters: Clusters with few connections to others suggest:
- Specialized domains requiring expert knowledge
- Potential blind spots or incomplete analysis
- Opportunities for novel connections
Dense Clusters: Heavily interconnected clusters indicate:
- Well-understood, thoroughly discussed topics
- Possible groupthink or conventional wisdom
- Areas where consensus exists
Sparse Graphs: Low overall connectivity suggests:
- Fragmented understanding across sources
- Multiple independent narratives
- Need for synthesis or integration
Limitations & Considerations
Text Quality Matters
- OCR errors can create spurious concepts
- Legal boilerplate may dominate graphs if not filtered
- Very short documents may not generate meaningful graphs
Graph Interpretation
- Networks show co-occurrence, not causation
- High-frequency terms may overshadow important rare terms
- Context is critical - interpret results with domain knowledge
Scale Considerations
- Very large document sets may need batching
- Entity-only graphs are better for high-level overviews
- Concept graphs can become dense with many documents
Complementary Approaches
- Graph analysis reveals structure, not semantic depth
- Combine with traditional reading for full understanding
- Use as hypothesis generator, not definitive answer source
Future Enhancements
Planned Capabilities
- Temporal graph evolution tracking
- Automated gap-filling recommendations
- Integration with entity extraction for named entities
- Custom concept filtering and weighting
Integration Opportunities
- Direct SurrealDB storage of graphs
- Cross-referencing with semantic search results
- Automated report generation from graph insights
- Real-time graph updates as new documents arrive
Reference Files
Documentation Library
Load these resources as needed during knowledge graph research:
Core Guides (Load First)
Document Preparation Guide - Best practices for preparing documents for knowledge graph analysis including:
- Document selection strategies
- Text quality requirements
- Context naming conventions
- Multi-source synthesis techniques
Graph Analysis Guide - Comprehensive guide on interpreting knowledge graphs including:
- Modularity and centrality metrics
- Cluster interpretation
- Gap analysis strategies
- Quality indicators and warning signs
Best Practices Guide - Complete best practices for knowledge graph research including:
- Workflow patterns
- Result interpretation guidelines
- Integration strategies
- Advanced techniques
Skill Version: 1.0
Last Updated: 2025-01-06
Maintained By: Arda Insights Team
Dependencies: InfraNodus API, arda-insights MCP server