anti-corruption-investigation


Advanced anti-corruption investigation system for analyzing chat logs and communications to detect suspicious patterns, corruption indicators, and relationship networks. Supports Chinese and English chat logs in JSON/TXT formats, handles million-scale datasets, and provides human-friendly relationship analysis with evidence-backed conclusions.

By s1366560. Updated 2/10/2026.


Anti-Corruption Investigation v7.0

What's New in v7.0

Multi-Language Pattern Detection

  • Chinese Patterns: Financial corruption, power abuse, secret meetings, collusion
  • English Patterns: Evidence destruction, insider trading, pressure manipulation, enterprise fraud
  • Universal Enterprise Fraud Detection: special purpose entities (SPE), accounting manipulation, financial metrics manipulation
  • Semantic Pattern Matching: Detects implicit expressions and euphemisms

Social Network Analysis

  • Person Profile Analysis: Comprehensive profiling including role detection, activity patterns, and risk assessment
  • Intermediary Detection: Automatically identifies bridge persons connecting corruption networks
  • Community Detection: Discovers corruption groups based on communication patterns
  • Influence Analysis: Ranks individuals by network influence and centrality
  • Connection Path Analysis: Finds shortest paths between high-risk individuals
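
The intermediary and connection-path ideas above can be illustrated with a small standard-library sketch. It assumes each message is a dict with `sender` and `receiver` keys, as in the input format. The names `build_graph`, `brokerage_score`, and `shortest_path` are hypothetical and are not the skill's actual implementation; brokerage is approximated here as the number of contact pairs that have no direct link to each other.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_graph(messages):
    """Build an undirected contact graph from sender/receiver pairs."""
    graph = defaultdict(set)
    for msg in messages:
        a, b = msg["sender"], msg["receiver"]
        if a != b:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def brokerage_score(graph, person):
    """Count pairs of the person's contacts with no direct link (assumed metric)."""
    contacts = sorted(graph.get(person, ()))
    return sum(1 for a, b in combinations(contacts, 2) if b not in graph[a])


def shortest_path(graph, start, goal):
    """Breadth-first search; returns a list of names, or None if unreachable."""
    if start not in graph or goal not in graph:
        return None
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(graph[node]):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


sample_messages = [
    {"sender": "A", "receiver": "B"},
    {"sender": "B", "receiver": "C"},
    {"sender": "B", "receiver": "D"},
    {"sender": "C", "receiver": "D"},
]
graph = build_graph(sample_messages)
print(brokerage_score(graph, "B"))
print(shortest_path(graph, "A", "D"))
```

In this toy graph, B is the only link between A and the other two people, so B receives the highest brokerage score. A production system would likely use a weighted graph and a centrality measure such as betweenness instead of this simple count.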

Validation Framework

  • Pattern Validation: Validates detection accuracy against known corruption patterns
  • Report Validation: Ensures report completeness and quality
  • False Positive Control: Estimates and controls false positive rates
  • Continuous Improvement: Generates recommendations for pattern enhancement
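
As a rough illustration of what estimating a false positive rate involves, the sketch below compares flagged message IDs against a hand-labeled ground truth. It assumes such labels exist; the function `detection_metrics` and its argument names are hypothetical and do not describe how `case_validator.py` computes its figures.

```python
def detection_metrics(flagged_ids, truly_suspicious_ids, all_ids):
    """Compute recall, precision, and false positive rate from labeled IDs."""
    flagged = set(flagged_ids)
    positives = set(truly_suspicious_ids)
    negatives = set(all_ids) - positives

    true_positives = len(flagged & positives)
    false_positives = len(flagged & negatives)

    return {
        # Share of truly suspicious messages that were flagged.
        "recall": true_positives / len(positives) if positives else 0.0,
        # Share of flagged messages that were truly suspicious.
        "precision": true_positives / len(flagged) if flagged else 0.0,
        # Share of innocent messages that were flagged anyway.
        "false_positive_rate": false_positives / len(negatives) if negatives else 0.0,
    }


metrics = detection_metrics(
    flagged_ids=[0, 1, 2, 7],
    truly_suspicious_ids=[0, 1, 2, 3],
    all_ids=range(10),
)
print(metrics)
```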

When to Use This Skill

Use when analyzing chat logs, messages, or communications for:

  • Corruption detection: Financial corruption, power abuse, secret meetings, collusion
  • Relationship analysis: Identifying key players, corruption networks, intermediaries
  • Social network analysis: Understanding person profiles, influence, and group structures
  • Large-scale analysis: Processing 100K+ messages efficiently
  • Evidence gathering: Extracting specific evidence for relationships
  • Risk assessment: Evaluating corruption risk levels

Quick Start

Basic Analysis

from anti_corruption import ChatAnalyzer

# Analyze chat data
messages = [...]  # Load your messages
analyzer = ChatAnalyzer(messages)
results = analyzer.analyze()

# View results
print(f"Risk Level: {results['risk_level']}")
print(f"Suspicious Messages: {len(results['suspicious_messages'])}")

Multi-Language Pattern Detection

from multi_lang_patterns import analyze_text, analyze_email

# Analyze text in any language
result = analyze_text("We need to delete these documents before audit")
print(f"Risk Score: {result['risk_score']}")
print(f"Categories: {result['categories']}")

# Analyze email with enterprise fraud detection
email_data = {
    'sender': 'john@company.com',
    'receiver': 'jane@company.com',
    'subject': 'Q4 Results',
    'content': 'We need to hit the target number.',
    'title': 'CFO'
}
result = analyze_email(email_data)
print(f"Risk Level: {result['risk_level']}")

Relationship Analysis

from anti_corruption import RelationshipAnalyzer

# Analyze relationships
analyzer = RelationshipAnalyzer(messages)
relationships = analyzer.analyze()

# View top relationships
for rel in relationships['top_relationships'][:10]:
    print(f"{rel['person_a']} ↔ {rel['person_b']}")
    print(f"  Type: {rel['relationship_type']}")
    print(f"  Evidence: {len(rel['evidence'])} items")
    print(f"  Risk: {rel['risk_level']}")

Social Network Analysis

from anti_corruption import SocialNetworkAnalyzer

# Analyze social network
analyzer = SocialNetworkAnalyzer(messages)
results = analyzer.analyze()

# View person profiles
for name, profile in results['person_profiles'].items():
    print(f"{name}: {profile['primary_role']} - {profile['risk_level']}")

# View intermediaries
for inter in results['intermediaries'][:5]:
    print(f"Intermediary: {inter['name']} (Score: {inter['brokerage_score']})")

# View communities
for comm in results['communities']:
    print(f"Community: {', '.join(comm['members'][:5])}")

Validation

from case_validator import validate_analysis, generate_validation_report

# Validate analysis results
# (analysis_results is assumed here to be the output of an earlier analyze() call)
validation = validate_analysis(analysis_results)
print(f"Detection Accuracy: {validation.detection_accuracy:.1%}")
print(f"False Positive Rate: {validation.false_positive_rate:.1%}")

# Generate validation report
report = generate_validation_report(validation, 'validation_report.txt')

Core Scripts

anti_corruption.py

Unified analysis tool with all features.

Usage:

# Basic corruption analysis
python anti_corruption.py analyze input.jsonl report.json

# Relationship analysis
python anti_corruption.py relationships input.jsonl relationships.json --text-report report.txt

# Social network analysis
python anti_corruption.py social-network input.jsonl social_network.json --text-report social_report.txt

# Full analysis with all features
python anti_corruption.py full input.jsonl output_dir/

Commands:

  • analyze: Basic corruption pattern detection
  • relationships: Relationship network analysis
  • social-network: Social network and person profile analysis
  • full: Run all analyses

multi_lang_patterns.py

Multi-language pattern detection module.

Features:

  • Automatic language detection (Chinese, English, Mixed)
  • Direct pattern matching for corruption indicators
  • Semantic pattern matching for implicit expressions
  • Enterprise-fraud-specific patterns (special purpose entities, accounting manipulation)
  • Risk scoring based on pattern matches
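
Language detection of the kind listed above can be approximated by checking which scripts appear in the text. The following is a minimal sketch, not the module's actual logic: the function name `detect_language` and the returned labels are assumptions, and only the basic CJK Unified Ideographs block is checked.

```python
def detect_language(text):
    """Classify text as 'chinese', 'english', 'mixed', or 'unknown'."""
    # Basic CJK Unified Ideographs block (U+4E00 to U+9FFF).
    has_chinese = any("\u4e00" <= ch <= "\u9fff" for ch in text)
    # ASCII letters only; digits and punctuation are ignored.
    has_english = any(ch.isascii() and ch.isalpha() for ch in text)

    if has_chinese and has_english:
        return "mixed"
    if has_chinese:
        return "chinese"
    if has_english:
        return "english"
    return "unknown"


print(detect_language("那笔钱准备好了吗?"))
print(detect_language("We need to delete these documents before audit"))
print(detect_language("这个 SPE 要处理"))
```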

Usage:

from multi_lang_patterns import MultiLangPatternMatcher, EnterpriseFraudDetector

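# Pattern matching
text = "We need to delete these documents before audit"  # sample input from Quick Start
matcher = MultiLangPatternMatcher()
matches = matcher.match_patterns(text)
summary = matcher.get_summary(matches)

# Enterprise fraud detection
# email_data is a dict with sender, receiver, subject, content, and title keys (see Quick Start)
detector = EnterpriseFraudDetector()
result = detector.analyze_email(email_data)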

case_validator.py

Validation framework for analysis results.

Features:

  • Pattern detection validation
  • False positive rate estimation
  • Report completeness checking
  • Improvement recommendations

Usage:

from case_validator import PatternValidator, ReportValidator

# Validate analysis (results: the output of an analysis run)
validation = PatternValidator.validate_analysis(results)

# Validate report (report: the generated report to check)
report_check = ReportValidator.validate_report(report)

Data Format

Input Format (JSONL)

{"timestamp": "2024-01-15T14:30:00", "sender": "张三", "receiver": "李四", "content": "那笔钱准备好了吗?"}
{"timestamp": "2024-01-15T14:31:00", "sender": "李四", "receiver": "张三", "content": "已经准备好了"}
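
A file in this format can be read one line at a time, which keeps memory use flat for large datasets. The loader below is a sketch under the assumption that all four fields shown above are required; `iter_jsonl_messages` and `REQUIRED_FIELDS` are hypothetical names, not part of the skill.

```python
import json

# Assumption: every record carries the four fields shown in the sample.
REQUIRED_FIELDS = ("timestamp", "sender", "receiver", "content")


def iter_jsonl_messages(lines):
    """Yield one message dict per non-empty JSONL line."""
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        missing = [field for field in REQUIRED_FIELDS if field not in record]
        if missing:
            raise ValueError(f"line {line_number}: missing fields {missing}")
        yield record


sample_lines = [
    '{"timestamp": "2024-01-15T14:30:00", "sender": "张三", "receiver": "李四", "content": "那笔钱准备好了吗?"}',
    "",
    '{"timestamp": "2024-01-15T14:31:00", "sender": "李四", "receiver": "张三", "content": "已经准备好了"}',
]
messages = list(iter_jsonl_messages(sample_lines))
print(len(messages), messages[0]["sender"])
```

Because the function accepts any iterable of strings, an open file handle (opened with `encoding="utf-8"`) can be passed in place of `sample_lines`.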

Input Format (TXT)

[2024-01-15 14:30:00] 张三 -> 李四: 那笔钱准备好了吗?
[2024-01-15 14:31:00] 李四 -> 张三: 已经准备好了
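
Lines in this layout can be converted into the same dict shape as the JSONL records with a regular expression. This is a sketch that assumes the exact `[timestamp] sender -> receiver: content` layout shown above; `parse_txt_line` is a hypothetical helper, and lines that do not match are returned as `None`.

```python
import re
from datetime import datetime

LINE_PATTERN = re.compile(
    r"^\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] "
    r"(?P<sender>.+?) -> (?P<receiver>.+?): (?P<content>.*)$"
)


def parse_txt_line(line):
    """Parse one TXT log line into a message dict, or return None."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Normalize to the ISO 8601 form used in the JSONL format.
    parsed = datetime.strptime(fields["timestamp"], "%Y-%m-%d %H:%M:%S")
    fields["timestamp"] = parsed.isoformat()
    return fields


message = parse_txt_line("[2024-01-15 14:30:00] 张三 -> 李四: 那笔钱准备好了吗?")
print(message)
```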

Output Format

Social Network Analysis Output

{
  "person_profiles": {
    "张三": {
      "name": "张三",
      "message_count": 150,
      "contact_count": 8,
      "contacts": ["李四", "王五", ...],
      "primary_role": "official",
      "detected_roles": ["official", "business"],
      "suspicious_message_count": 25,
      "corruption_patterns": {
        "financial_corruption": 15,
        "power_abuse": 10
      },
      "risk_score": 7.5,
      "risk_level": "🔴 高风险",
      "activity_anomaly": {
        "anomaly_score": 6.2,
        "late_night_ratio": 0.31,
        "peak_hours": [22, 23, 0]
      }
    }
  },
  "intermediaries": [
    {
      "name": "王五",
      "brokerage_score": 8,
      "contact_count": 15,
      "primary_role": "intermediary",
      "risk_level": "🔴 高风险"
    }
  ],
  "communities": [
    {
      "id": 0,
      "members": ["张三", "李四", "王五"],
      "member_count": 3,
      "average_risk_score": 7.2,
      "risk_level": "🔴 高风险"
    }
  ]
}
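
The `late_night_ratio` and `peak_hours` fields above could be derived from message timestamps along the following lines. This is only a sketch: the late-night window (22:00 to 05:59) is an assumption, and `activity_summary` is a hypothetical function, not the skill's own anomaly scoring.

```python
from collections import Counter
from datetime import datetime

# Assumed definition of "late night": 22:00 through 05:59.
LATE_NIGHT_HOURS = frozenset({22, 23, 0, 1, 2, 3, 4, 5})


def activity_summary(timestamps, top_n=3):
    """Return the late-night share and busiest hours for ISO 8601 timestamps."""
    hours = [datetime.fromisoformat(ts).hour for ts in timestamps]
    if not hours:
        return {"late_night_ratio": 0.0, "peak_hours": []}

    late_count = sum(1 for hour in hours if hour in LATE_NIGHT_HOURS)
    counts = Counter(hours)
    # Busiest hours first; ties broken by the earlier hour.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return {
        "late_night_ratio": round(late_count / len(hours), 2),
        "peak_hours": [hour for hour, _ in ranked[:top_n]],
    }


summary = activity_summary([
    "2024-01-15T23:10:00",
    "2024-01-15T23:40:00",
    "2024-01-16T00:05:00",
    "2024-01-16T14:30:00",
])
print(summary)
```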

Pattern Categories

Financial Corruption

  • Chinese: 转账 (transfer), 汇款 (remittance), 回扣 (kickback), 贿赂 (bribe), 好处费 (gratuity)
  • English: kickback, bribe, hidden payment, secret fee

Evidence Destruction

  • Chinese: 删除 (delete), 销毁 (destroy), 清理 (clean up), 不留痕迹 (leave no trace)
  • English: delete, destroy, shred, clean up, off the record

Insider Trading

  • English: stock option, insider information, before announcement

Enterprise Fraud

  • SPE, off-balance sheet, mark-to-market
  • Aggressive accounting, earnings management
  • EBITDA manipulation, pro forma adjustments

Pressure Manipulation

  • English: pressure, hit the target, make it happen, adjust numbers, bridge the gap, find a way
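
The categories above lend themselves to a simple keyword lookup. The sketch below uses a subset of the listed keywords with naive substring matching; the `PATTERNS` table and `match_categories` function are hypothetical and far simpler than the semantic matching the skill describes. Substring matching ignores context, so hits should be treated as leads to review rather than as findings.

```python
# Keywords copied from the category lists above (subset for brevity).
PATTERNS = {
    "financial_corruption": [
        "转账", "汇款", "回扣", "贿赂", "好处费",
        "kickback", "bribe", "hidden payment", "secret fee",
    ],
    "evidence_destruction": [
        "删除", "销毁", "清理", "不留痕迹",
        "delete", "destroy", "shred", "clean up", "off the record",
    ],
    "insider_trading": [
        "stock option", "insider information", "before announcement",
    ],
    "pressure_manipulation": [
        "pressure", "hit the target", "make it happen",
        "adjust numbers", "bridge the gap", "find a way",
    ],
}


def match_categories(text):
    """Return {category: [matched keywords]} using case-insensitive substrings."""
    lowered = text.lower()
    hits = {}
    for category, keywords in PATTERNS.items():
        found = [keyword for keyword in keywords if keyword in lowered]
        if found:
            hits[category] = found
    return hits


print(match_categories("We need to delete these documents before audit"))
print(match_categories("回扣已经转账了"))
```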

Version History

  • v7.0: Added multi-language pattern detection, enterprise fraud patterns, validation framework
  • v6.0: Added social network analysis, person profiling, intermediary detection
  • v5.0: Refactored for clarity, human-friendly output, improved performance
  • v4.0: Added relationship network analysis
  • v3.0: Large-scale processing support
  • v2.0: Semantic pattern matching
  • v1.0: Initial release with keyword-based detection

Install via CLI

npx skills add https://github.com/s1366560/agi-demos --skill anti-corruption-investigation

Repository Details

  • Stars: 3
  • Forks: 2
  • Branch: main
  • Path: SKILL.md