nba-ai-core - SKILL.md Agent Skill

name: nba-ai-core
description: Core knowledge for the NBA AI project, including data pipeline, prediction engines, and system architecture.

This skill provides comprehensive context for maintaining and upgrading the NBA AI system.

The data pipeline is orchestrated by src/database_updater/database_update_manager.py and consists of the following sequential stages:

Schedule Update: Fetch game schedule from NBA Stats API.
Players Update: Update player reference data.
Injuries Update: Fetch official NBA injury report PDFs.
Betting Update: Fetch unified betting lines from the ESPN API and Covers.com.
PbP Collection: Fetch play-by-play data (CDN primary, Stats API fallback).
GameStates Parsing: Transform raw PBP into structured snapshots.
Boxscores Collection: Fetch traditional PlayerBox and TeamBox stats.
Pre-Game Data: Generate prior states and feature sets (34 rolling average features).
Predictions: Generate pre-game and live predictions.
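
The sequential ordering above can be sketched as a simple stage runner. This is a minimal illustration, not the actual contents of database_update_manager.py: the stage names, the `run_pipeline` function, and the no-op stage callables are all hypothetical placeholders. The point is only that each stage depends on the ones before it, so a failure should stop the run and report which stage broke.

```python
from typing import Callable

# Hypothetical stage names mirroring the list above, in execution order.
STAGE_NAMES = [
    "schedule", "players", "injuries", "betting", "pbp",
    "game_states", "boxscores", "pre_game_data", "predictions",
]


def run_pipeline(stages: list[tuple[str, Callable[[str], None]]], season: str) -> list[str]:
    """Run each stage in order and return the names of completed stages.

    Stops at the first failure, because later stages read the output
    of earlier ones.
    """
    completed: list[str] = []
    for name, stage in stages:
        try:
            stage(season)
        except Exception as exc:
            raise RuntimeError(f"stage '{name}' failed; completed: {completed}") from exc
        completed.append(name)
    return completed


def make_noop_stage(name: str) -> Callable[[str], None]:
    """Build a placeholder stage; a real stage would fetch or transform data."""
    def stage(season: str) -> None:
        pass
    return stage


stages = [(name, make_noop_stage(name)) for name in STAGE_NAMES]
completed = run_pipeline(stages, "2025-2026")
```

Wrapping the original exception with the stage name keeps the traceback intact while making it obvious where a partial update stopped.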

SQLite: Primary storage.
Three-DB Setup:
- current: (~500MB) Latest season data.
- dev: (~3GB) Last 3 seasons (Default for work).
- full: (~25GB) Historical archive (1999-present).
Timezones: Always store timestamps in UTC. Query logic uses Eastern Time (NBA ops). Display uses the user's local time.
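
The timezone rule can be illustrated with the standard library alone. This sketch assumes timestamps are stored as ISO 8601 strings, which may not match the actual schema (see DATA_MODEL.md). The helper names `to_utc_iso` and `eastern_game_date` are hypothetical. It shows why the conversion matters: an evening tipoff in Eastern Time falls on the next calendar day in UTC.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

EASTERN = ZoneInfo("America/New_York")


def to_utc_iso(dt: datetime) -> str:
    """Convert a timezone-aware datetime to a UTC ISO 8601 string for storage."""
    if dt.tzinfo is None:
        raise ValueError("refusing to store a naive datetime")
    return dt.astimezone(timezone.utc).isoformat()


def eastern_game_date(utc_iso: str) -> str:
    """Return the Eastern Time calendar date for a stored UTC timestamp."""
    return datetime.fromisoformat(utc_iso).astimezone(EASTERN).date().isoformat()


# A 7:30 PM Eastern tipoff in January (EST, UTC-5).
tipoff = datetime(2026, 1, 15, 19, 30, tzinfo=EASTERN)
stored = to_utc_iso(tipoff)
game_date = eastern_game_date(stored)
```

Here `stored` carries a January 16 UTC date while `game_date` is still January 15, so filtering games by date directly on the UTC column would put this game on the wrong day.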

All predictors must inherit from or follow the pattern in src/predictions/prediction_manager.py.

Future Goal: Transition to GenAI-based engines using sequential PBP data.

Always use the virtual environment:

# Windows
.\venv\Scripts\activate
# Unix
source venv/bin/activate

Start Web App: python start_app.py --predictor=Tree --log_level=INFO
Batch Update: python -m src.database_updater.database_update_manager --season=2025-2026 --predictor=Tree
Health Check: python -m src.health_check --season=2025-2026
Run Tests: pytest -v

No verbose comments: Use self-documenting code.
No hardcoded paths: Use src/config.py loaded config.
Always ask before committing: Solo dev setup doesn't mean auto-commits.
Transitive dependencies: Never add them to requirements.txt.
CRITICAL: Row Factory: When querying SQLite, always set conn.row_factory = sqlite3.Row. Without it, rows come back as plain tuples, which raises a TypeError in API layers that expect dictionary-like access.
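
The row factory rule is easy to reproduce in isolation. The sketch below uses an in-memory database with a made-up `games` table. The table and column names are illustrative only and are not taken from DATA_MODEL.md.

```python
import sqlite3


def make_demo_db(use_row_factory: bool) -> sqlite3.Connection:
    """Create an in-memory database with one hypothetical games row."""
    conn = sqlite3.connect(":memory:")
    if use_row_factory:
        conn.row_factory = sqlite3.Row
    conn.execute("CREATE TABLE games (game_id TEXT, home_team TEXT)")
    conn.execute("INSERT INTO games VALUES ('0022500001', 'BOS')")
    return conn


QUERY = "SELECT game_id, home_team FROM games"

# With sqlite3.Row, columns are accessible by name and rows convert to dicts.
row = make_demo_db(use_row_factory=True).execute(QUERY).fetchone()
home_team = row["home_team"]
as_dict = dict(row)

# Without it, fetchone() returns a plain tuple and name-based access fails.
plain_row = make_demo_db(use_row_factory=False).execute(QUERY).fetchone()
try:
    plain_row["home_team"]
    raised_type_error = False
except TypeError:
    raised_type_error = True
```

Setting the factory on the connection, rather than per cursor, means every cursor created from it inherits the behavior.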

config.yaml: Central configuration.
DATA_MODEL.md: Schema source of truth.
project-instructions.md: Deep AI context.
TODO.md: Sprint and task tracking.
AGENT_HANDOVER_REPORT.md: Detailed history of fixes and project state (Feb 2026).