name: omnifinan
description: Guide AI to use the OmniFinan Python library for financial analysis, including multi-agent hedge fund workflow, data fetching (AkShare/YFinance/FRED/DBnomics/SEC EDGAR), macro indicators, technical/fundamental/sentiment/valuation analysis, bull-bear debate, risk management, portfolio decisions, backtesting, visualization, and report pipelines. Use when working with OmniFinan code, running analyses, modifying agents, or adding features to this financial analysis system.
OmniFinan Library Guide
OmniFinan is an AI-driven multi-agent hedge fund analysis system supporting US, HK, China, and crypto workflows. It uses LangGraph for agent orchestration, LangChain for LLM calls, and multiple data providers (AkShare, YFinance, SEC EDGAR, FRED, World Bank, IMF Datamapper, DBnomics, Tavily, Brave).
Project Structure
src/omnifinan/
├── __init__.py # Exports: MarketType, run_hedge_fund
├── main.py # CLI entrypoint -> presentation.cli.run_cli()
├── unified_api.py # Low-level AkShare/FRED/WorldBank data functions (3800+ lines)
├── data_models.py # Pydantic models: Price, FinancialMetrics, LineItem, etc.
├── visualize.py # Plotly charting: StockFigure, macro dashboards
├── backtester.py # Backtester class for strategy evaluation
├── agents/ # LangGraph agent nodes
│ ├── graphs.py # Graph builders: create_trading_graph()
│ ├── state.py # AgentState TypedDict
│ ├── nodes.py # Node factories
│ ├── edges.py # Conditional routing functions
│ ├── prompts.py # Prompt constants
│ ├── market_data.py # Data collection agent
│ ├── technicals.py # Technical analysis agent
│ ├── fundamentals.py # Fundamental analysis agent
│ ├── macro.py # Macro analyst agent
│ ├── sentiment.py # Sentiment analysis agent (uses LLM)
│ ├── valuation.py # DCF/multiples valuation agent
│ ├── researcher_bull.py # Bullish thesis generator (uses LLM)
│ ├── researcher_bear.py # Bearish thesis generator (uses LLM)
│ ├── debate_room.py # Bull-bear debate judge (uses LLM)
│ ├── risk_manager.py # Position sizing / risk constraints
│ └── portfolio_manager.py # Final buy/sell/hold decisions (uses LLM)
├── core/
│ ├── config.py # RuntimeConfig (env vars / YAML / JSON)
│ ├── workflow.py # run_hedge_fund() orchestration
│ ├── observability.py # RunTrace for metrics/cost tracking
│ └── experiment.py # ExperimentRecorder for run comparison
├── data/
│ ├── cache.py # DataCache: file-based request + dataset cache
│ ├── unified_service.py # UnifiedDataService: cached provider wrapper
│ ├── symbols.py # is_crypto_ticker() helper
│ └── providers/
│     ├── base.py # DataProvider ABC
│     ├── factory.py # create_data_provider(name)
│     ├── akshare_provider.py
│     ├── yfinance_provider.py
│     ├── sec_edgar_provider.py
│     └── moomoo_options_provider.py # stock options provider
├── analysis/
│ ├── indicators.py # XMA, cross_over, cross_under (TA-Lib wrappers)
│ ├── transform.py # Feature engineering: returns, rolling features
│ ├── options.py # BS pricing, IV, Greeks, max pain, OI levels, GEX, term structure, skew
│ ├── factor_mining.py # CustomFactorSpec, add_candidate_factors, IC/RankIC evaluation
│ └── factor_backtest.py # Cross-sectional factor backtesting, perf_stats
├── research/
│ ├── valuation.py # dcf_intrinsic_value(), valuation_signal()
│ ├── factors.py # Qlib-style DSL: ref, mean, std, rank, apply_factor
│ ├── report_pipeline.py # PDF report -> LLM synthesis
│ └── report_parser.py # ParsedReport via pypdf
├── llm/
│ ├── client.py # call_llm(): unified LLM call with cache + retry
│ └── providers.py # PROVIDER_REGISTRY: gpt/claude/gemini/deepseek
├── presentation/
│ ├── cli.py # argparse CLI with questionary analyst selector
│ └── api.py # Flask REST API: POST /analyze, GET /healthz
└── utils/
    ├── analysts.py # ANALYST_CONFIG registry, get_analyst_nodes()
    ├── holidays.py # Trading calendar filtering
    ├── normalization.py # confidence_to_unit()
    ├── progress.py # Progress tracking
    ├── display.py # Console output formatting
    ├── scratchpad.py # Scratchpad for run artifacts
    └── llm.py # Convenience LLM wrapper
Core Concepts
AgentState
All agents receive and return AgentState:
class AgentState(TypedDict):
    messages: Annotated[Sequence[BaseMessage], operator.add]
    data: Annotated[dict[str, Any], merge_dicts]
    metadata: Annotated[dict[str, Any], merge_dicts]
- `data` holds tickers, dates, prices, financial_metrics, macro_indicators, analyst_signals
- `metadata` holds show_reasoning, model_name, provider_api, language, data_service, trace, scratchpad
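To make the reducer annotations concrete, the following is a minimal sketch of how two partial updates could be combined. The body of `merge_dicts` is an assumption (a shallow merge); the project's real reducer may behave differently, and plain strings stand in for `BaseMessage` objects. `apply_update` is a hypothetical stand-in for what the graph runtime does.

```python
import operator
from typing import Any


def merge_dicts(left: dict[str, Any], right: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical reducer: shallow merge where keys in `right` win."""
    return {**left, **right}


# Reducers named in the Annotated[...] metadata, keyed by state channel.
REDUCERS = {"messages": operator.add, "data": merge_dicts, "metadata": merge_dicts}


def apply_update(state: dict[str, Any], update: dict[str, Any]) -> dict[str, Any]:
    """Combine a node's partial update into the state, one channel at a time."""
    merged = dict(state)
    for key, value in update.items():
        merged[key] = REDUCERS[key](state[key], value) if key in state else value
    return merged


state = {"messages": ["start"], "data": {"tickers": ["AAPL"]}, "metadata": {"language": "Chinese"}}
update = {"messages": ["technicals done"], "data": {"analyst_signals": {}}}
new_state = apply_update(state, update)
```

Under these assumptions, messages are appended while `data` and `metadata` keep keys the node did not touch, which is why an agent only needs to return the keys it changed.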
Trading Graph Pipeline
start_node -> market_data_agent -> [analyst agents in parallel] -> investment_debate_start
-> researcher_bull_agent <-> researcher_bear_agent -> debate_room_agent
-> risk_start -> risk_management_agent -> execution_start -> portfolio_management_agent -> END
Analyst agents (run in parallel after market_data_agent):
- `technical_analyst_agent` - trend, momentum, mean reversion, volatility, stat arb
- `fundamentals_agent` - profitability, growth, financial health, valuation ratios
- `macro_analyst_agent` - central bank rates, inflation, employment
- `sentiment_agent` - news/inference-driven sentiment analysis node; data discovery is separate from `get_company_news`
- `valuation_agent` - DCF, owner earnings, residual income, comparable multiples
Data Providers
| Provider | Name strings | Markets | Capabilities |
|---|---|---|---|
| AkShare | `"akshare"` | CN, US, HK | Prices, financials, news, macro (CN+intl) |
| YFinance | `"yfinance"`, `"yf"`, `"yahoo"` | US, global, crypto | Prices, financials, news |
| SEC EDGAR | `"sec_edgar"`, `"sec"` | US | Financial metrics, line items from XBRL |
| DBnomics | n/a (API helper in `unified_api.py`) | Global macro | Macro time series catalog (providers/datasets/series). Use workflow: `/search` → `/datasets/{provider}/{dataset}` → fetch via `series_ids` or provider+dataset+dimensions. Avoid calling `/series` without selectors (can be slow/terminated). |
| Moomoo (options-only) | `provider="moomoo"` | US stock/index options | Stock option chain fetch only |
Ticker normalization includes volatility indices:
- `VIX` / `.VIX` -> `^VIX`
- `VVIX` / `.VVIX` -> `^VVIX`
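The alias mapping above can be pictured as a small lookup table. This is a sketch only: `normalize_volatility_ticker` is a hypothetical helper name, and the upper-casing and whitespace trimming are assumptions rather than documented behavior.

```python
# Alias table taken from the mapping above; every other ticker passes through.
VOLATILITY_INDEX_ALIASES = {
    "VIX": "^VIX",
    ".VIX": "^VIX",
    "VVIX": "^VVIX",
    ".VVIX": "^VVIX",
}


def normalize_volatility_ticker(ticker: str) -> str:
    """Hypothetical helper: map volatility-index aliases to caret symbols."""
    cleaned = ticker.strip().upper()  # assumption: matching is case-insensitive
    return VOLATILITY_INDEX_ALIASES.get(cleaned, cleaned)
```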
Default stock-options routing:
- `auto` uses `moomoo`.
- Futures option chains are explicitly unavailable in the current provider stack.
Create with: create_data_provider("akshare")
DataProvider ABC
All providers implement:
- `get_prices(ticker, start_date, end_date, interval) -> list[Price]`
- `get_financial_metrics(ticker, end_date, period, limit) -> list[FinancialMetrics]`
- `search_line_items(ticker, period, limit) -> list[LineItem]`
- `get_company_news_raw(ticker, start_date, end_date, limit) -> list[CompanyNews]`
- `get_insider_trades(ticker, end_date, start_date, limit) -> list[InsiderTrade]`
- `get_market_cap(ticker, end_date) -> float | None`
- `get_macro_indicators(start_date, end_date) -> dict`
UnifiedDataService
Wraps a DataProvider with intelligent caching via DataCache. Key behaviors:
- Price data: Incremental fetch - backfills gaps, appends new data, avoids redundant downloads
- Financial metrics/line items: Refetch when latest report is >30 days stale
- Macro indicators: Datasets-first strategy (stored in `datasets/macro_indicators_master/`, immune to TTL-based `cleanup_expired`). Uses series-level staleness + subset refresh. When `force=False`, the service must never auto full-refresh; if subset refresh is unavailable or fails, it logs ERROR and continues with the cached payload. Metrics explicitly retired from the active macro interface are removed from outputs rather than emitted as stale placeholders.
- Company news:
  - A-shares use AkShare raw news
  - US/HK use Tavily-first and Brave supplemental search
  - results are clustered into integrated events
  - cross-verification uses actual publisher/domain weights, not search-engine counts
  - Public API note:
    - `UnifiedDataService.get_company_news(...)` returns integrated news events
    - provider-layer `get_company_news_raw(...)` is internal raw discovery only
- Insider trades: Incremental fetch by filing_date
- Crypto: Auto-routes to YFinance for crypto tickers
- Options:
  - `get_stock_option_chain()` defaults to `provider="auto"` (moomoo)
  - `get_futures_option_chain()` returns an explicit "unavailable" result in the current provider stack
  - default snapshot mode is previous-business-day close (`snapshot_mode="prev_close"`)
from omnifinan.data.cache import DataCache
from omnifinan.data.providers.factory import create_data_provider
from omnifinan.data.unified_service import UnifiedDataService
service = UnifiedDataService(
    provider=create_data_provider("akshare"),
    cache=DataCache(),
    ttl_seconds=3600,
)
prices = service.get_prices("600519", "2025-01-01", "2025-12-31")
macro = service.get_macro_indicators("2025-01-01", "2025-12-31")
structured = service.get_macro_indicators_structured("2025-01-01", "2025-12-31")
stock_opts = service.get_stock_option_chain("AAPL", expiration="2026-06-19")
fut_opts = service.get_futures_option_chain("ES", expiration="2026-06-19")
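The incremental price fetch (backfill gaps, append new data, skip what is already cached) can be sketched as a date-range comparison. This is an illustration under assumptions: `missing_ranges` is a hypothetical helper, the cache is modeled as one contiguous window, and bounds are inclusive calendar dates. The real service may track gaps at a finer grain.

```python
from datetime import date, timedelta
from typing import Optional


def missing_ranges(
    requested_start: date,
    requested_end: date,
    cached_start: Optional[date],
    cached_end: Optional[date],
) -> list[tuple[date, date]]:
    """Return the date ranges that still need to be fetched (inclusive bounds)."""
    if cached_start is None or cached_end is None:
        return [(requested_start, requested_end)]  # nothing cached yet
    gaps = []
    if requested_start < cached_start:
        # Backfill older history that precedes the cached window.
        gaps.append((requested_start, min(requested_end, cached_start - timedelta(days=1))))
    if requested_end > cached_end:
        # Append newer data that follows the cached window.
        gaps.append((max(requested_start, cached_end + timedelta(days=1)), requested_end))
    return gaps
```

A request fully inside the cached window yields an empty list, so no provider call is needed.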
Macro Data Architecture
- Source policy: `fixed_sources_with_dbnomics_proxies`
- Data-source credentials: read from `OMNIX_PATH/finn_api.json`
  - Recommended node keys: `FRED.api_key`, `tavily.api_key`, `brave.api_key`
- Master payload stored as dataset with snapshot history
- Staleness detection: per-series, based on a `cycle_days * 3` threshold
- Refetch cooldown: `fetched_at` + frequency-based minimum interval
- Subset refresh when the provider supports `get_macro_indicators_subset()`
- Structured output keys: `meta`, `dimensions`, `metrics`, `chart_data`
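The staleness and cooldown rules can be read as two independent checks. The sketch below is one plausible interpretation: the function names are hypothetical, the strict comparisons are an assumption, and the mapping from series frequency to a minimum interval is left to the caller because the guide does not specify it.

```python
from datetime import datetime, timedelta


def is_series_stale(last_observation: datetime, cycle_days: int, now: datetime) -> bool:
    """Stale once the latest observation is older than cycle_days * 3."""
    return now - last_observation > timedelta(days=cycle_days * 3)


def in_cooldown(fetched_at: datetime, min_interval: timedelta, now: datetime) -> bool:
    """True while the last fetch is younger than the minimum refetch interval."""
    return now - fetched_at < min_interval


def should_refetch(
    last_observation: datetime,
    fetched_at: datetime,
    cycle_days: int,
    min_interval: timedelta,
    now: datetime,
) -> bool:
    """Refetch only series that are stale and no longer cooling down."""
    return is_series_stale(last_observation, cycle_days, now) and not in_cooldown(
        fetched_at, min_interval, now
    )
```

Separating the two checks keeps a source that has published nothing new from being polled on every run.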
Structured Macro Output
get_macro_indicators_structured() returns:
- `meta`: snapshot_at, source_policy, counts
- `dimensions`: growth, inflation, liquidity, credit, market_feedback
- `metrics`: per-series cards with yoy, mom, qoq, trend_short, trend_medium, volatility
- `chart_data.long`: flattened list for plotting
Factor Mining Framework
Quantitative factor mining pipeline inspired by qlib. Supports the full cycle: factor creation -> IC evaluation -> cross-sectional backtest.
Dependencies
`scipy` is required for Spearman Rank IC computation (`evaluate_factors`).
Built-in Factor Generation
from omnifinan.analysis.factor_mining import add_candidate_factors
# Input: panel DataFrame with columns [date, symbol, close, high, low, volume]
factored = add_candidate_factors(panel, forward_horizon=5)
# Adds: ret_1, ret_5, ret_20, mom_ma_5_20, mom_ma_20_60,
# volatility_20, amplitude_1, vol_ratio_20, rev_5, fwd_ret_5
Cross-Sectional Z-Score
from omnifinan.analysis.factor_mining import zscore_by_date
zscored, z_cols = zscore_by_date(factored, ["ret_5", "mom_ma_5_20"])
# z_cols = ["ret_5_z", "mom_ma_5_20_z"]
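For readers unfamiliar with cross-sectional standardization, here is a dependency-free sketch of the idea: each value is compared only with other symbols on the same date. It does not reproduce `zscore_by_date`; the function name, the list-of-dicts input, the population standard deviation, and the zero fallback for flat dates are all assumptions.

```python
from collections import defaultdict
from statistics import mean, pstdev


def zscore_rows_by_date(rows: list[dict], column: str) -> list[dict]:
    """Add `<column>_z`: the value standardized against same-date peers."""
    by_date = defaultdict(list)
    for row in rows:
        by_date[row["date"]].append(row[column])
    out = []
    for row in rows:
        values = by_date[row["date"]]
        mu, sigma = mean(values), pstdev(values)
        # A date with identical values has no cross-sectional spread; emit 0.0.
        z = (row[column] - mu) / sigma if sigma > 0 else 0.0
        out.append({**row, f"{column}_z": z})
    return out


rows = [
    {"date": "2025-01-02", "symbol": "A", "ret_5": 0.01},
    {"date": "2025-01-02", "symbol": "B", "ret_5": 0.03},
    {"date": "2025-01-03", "symbol": "A", "ret_5": 0.02},
    {"date": "2025-01-03", "symbol": "B", "ret_5": 0.02},
]
zscored_rows = zscore_rows_by_date(rows, "ret_5")
```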
IC Evaluation
from omnifinan.analysis.factor_mining import evaluate_factors, daily_ic
# Full report: IC mean/std/IR + Rank IC mean/std/IR for each factor
report = evaluate_factors(zscored, z_cols, label_col="fwd_ret_5")
# Single factor IC time series
ic_ts = daily_ic(zscored, "ret_5_z", label_col="fwd_ret_5", method="spearman")
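Rank IC for a single date is the Spearman correlation between factor values and forward returns, which equals the Pearson correlation of their ranks. The sketch below shows that computation without `scipy`. The helper names are hypothetical, and it assumes neither input is constant (a constant input would divide by zero).

```python
from statistics import mean


def average_ranks(values: list[float]) -> list[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average position of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def rank_ic(factor: list[float], forward_return: list[float]) -> float:
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    return pearson(average_ranks(factor), average_ranks(forward_return))
```

Averaging this value across dates gives a Rank IC mean, and dividing the mean by the standard deviation gives an information ratio of the kind the report lists.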
Custom Factors
from omnifinan.analysis.factor_mining import apply_custom_factors, CustomFactorSpec
def rolling_sharpe(g, window=20):
    ret = g["close"].pct_change()
    return ret.rolling(window).mean() / ret.rolling(window).std()

specs = [
    CustomFactorSpec(name="sharpe_20", func=rolling_sharpe, kwargs={"window": 20}),
]
extended = apply_custom_factors(factored, specs)
# Or dict shorthand:
extended = apply_custom_factors(factored, {"ret_3": lambda g: g["close"].pct_change(3)})
Factor Backtest
from omnifinan.analysis.factor_backtest import (
    build_cross_sectional_weights, run_daily_backtest, perf_stats,
)
weights = build_cross_sectional_weights(
    df, score_col="mom_ma_20_60_z", quantile=0.2, long_short=True,
)
bt = run_daily_backtest(df, weights, cost_rate=0.001)
stats = perf_stats(bt["net_ret"], bt["equity"])
# Returns: total_return, annual_return, annual_vol, sharpe, max_drawdown, win_rate
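Two of the statistics listed above, the equity curve and `max_drawdown`, are easy to misread, so here is a small sketch of how they are commonly defined. It is not the implementation of `perf_stats`: the function names, the starting equity of 1.0, and the negative-fraction sign convention are assumptions.

```python
def equity_curve(net_returns: list[float]) -> list[float]:
    """Compound per-period net returns into an equity curve starting from 1.0."""
    equity, level = [], 1.0
    for r in net_returns:
        level *= 1.0 + r
        equity.append(level)
    return equity


def max_drawdown(equity: list[float]) -> float:
    """Largest peak-to-trough decline as a negative fraction (0.0 if none)."""
    peak, worst = equity[0], 0.0
    for level in equity:
        peak = max(peak, level)
        worst = min(worst, level / peak - 1.0)
    return worst


curve = equity_curve([0.10, -0.50, 0.20])
drawdown = max_drawdown(curve)
```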
Qlib-Style Factor DSL
from omnifinan.research.factors import ref, mean, std, rank, apply_factor
# Primitives
ref(series, 1) # shift(1)
mean(series, 5) # rolling(5).mean()
std(series, 20) # rolling(20).std()
rank(series, 20) # rolling(20) rank of latest value
# String expression interface
result = apply_factor("Ref($close,1)", df)
result = apply_factor("Mean($close,5)", df)
LLM Execution Guidance
This section is for LLM runtime orchestration guidance only.
Preferred Data-First API Path
For non-LLM analytics tasks, prefer direct UnifiedDataService APIs instead of full multi-agent orchestration:
- `get_prices`
- `get_financial_metrics`
- `get_line_items`
- `get_company_news`
- `get_insider_trades`
- `get_macro_indicators`
- `get_macro_indicators_structured`
- `get_stock_option_chain`
- `get_futures_option_chain`
- `get_stock_option_chain_analytics` (new; non-LLM)
- `get_stock_option_gex` (new; non-LLM; estimated GEX summary)
  - supports `gex_expiration` (default `None`):
    - `None` => full-chain GEX
    - `YYYY-MM-DD` => GEX for that expiry only
  - supports `exp_date` post-fetch expiry bucket filters:
    - `all`
    - `0dte`
    - `Ndte` (nearest available DTE bucket, e.g. `7dte`)
    - `monthly`
    - `quarterly`
    - combinable with `+`, e.g. `7dte+monthly`
- `get_futures_option_chain_analytics` (new; non-LLM)
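The `exp_date` expression grammar above is small enough to sketch as a tokenizer. The parser below is hypothetical and only splits and validates the expression. How the service combines buckets joined by `+` is not stated in this guide, so no selection logic is shown.

```python
import re

NAMED_BUCKETS = {"all", "monthly", "quarterly"}
DTE_PATTERN = re.compile(r"^(\d+)dte$")


def parse_exp_date_filter(expr: str) -> list:
    """Split an expression such as '7dte+monthly' into (kind, days) tokens."""
    tokens = []
    for part in expr.lower().split("+"):
        part = part.strip()
        match = DTE_PATTERN.match(part)
        if match:
            tokens.append(("dte", int(match.group(1))))
        elif part in NAMED_BUCKETS:
            tokens.append((part, None))
        else:
            raise ValueError(f"unknown expiry bucket: {part!r}")
    return tokens
```

Rejecting unknown tokens early makes a typo such as `weekly` fail loudly instead of silently returning an unfiltered chain.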
Option Analytics (Non-LLM) Guidance
When the task asks for IV/skew/term-structure/Greeks, use analytics APIs first:
- `UnifiedDataService.get_stock_option_chain_analytics(...)`
  - `risk_free_rate` can be omitted; the service resolves it from macro yields (`us_treasury_2y`, then `us_treasury_10y`) before falling back.
  - `contract_multiplier` supports explicit override; default is 100.
  - provider `iv` inputs are normalized to decimal volatility at chain level before Greeks/GEX computation, so values such as `87.4` are interpreted as `87.4%`, not `87.4x` vol.
- `UnifiedDataService.get_futures_option_chain_analytics(...)`
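The risk-free-rate resolution order described above can be sketched as a simple fallback chain. `resolve_risk_free_rate` is a hypothetical helper, the final fallback value is supplied by the caller because the guide does not state it, and the units of the macro yields (decimal or percent) are not specified here.

```python
from typing import Mapping, Optional


def resolve_risk_free_rate(
    explicit: Optional[float],
    macro_yields: Mapping[str, Optional[float]],
    fallback: float,
) -> float:
    """Pick the first available rate: explicit, 2y yield, 10y yield, fallback."""
    if explicit is not None:
        return explicit
    for key in ("us_treasury_2y", "us_treasury_10y"):
        value = macro_yields.get(key)
        if value is not None:
            return value
    return fallback
```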
Market compatibility rule:
- China A-share / HK equity tickers do not have options support in current provider stack.
- For those tickers, stock-option APIs return `meta.source = "fixed_sources_unavailable"` with explicit `meta.error`.
- LLM should continue the broader analysis flow without treating this as a fatal error.
- Crypto pair symbols are normalized to base asset for options endpoints:
  - `BTC-USDT`, `BTC-USD`, `BTCUSDT` -> `BTC`
  - `ETH-USDT`, `ETH-USD`, `ETHUSDT` -> `ETH`
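The pair-to-base-asset normalization can be illustrated with a suffix strip. This is a sketch covering only the quote currencies shown in the examples above. `option_base_asset` is a hypothetical name, and the function should only be applied to symbols already identified as crypto, since it would also trim an equity ticker that happens to end in `USD`.

```python
QUOTE_SUFFIXES = ("USDT", "USD")  # quote currencies appearing in the examples


def option_base_asset(symbol: str) -> str:
    """Hypothetical helper: reduce a crypto pair symbol to its base asset."""
    cleaned = symbol.strip().upper().replace("-", "")
    for quote in QUOTE_SUFFIXES:
        # Require at least one character of base asset before the quote suffix.
        if cleaned.endswith(quote) and len(cleaned) > len(quote):
            return cleaned[: -len(quote)]
    return cleaned
```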
Return contract from analytics APIs:
- `meta` (inherits chain metadata + `analytics_version`)
- `data` (raw option rows)
- `raw` (provider raw payload)
- `analytics.summary` (`option_count`, `enriched_count`, `underlying_price`, `median_iv`)
- `analytics.surface` (per-contract normalized metrics with IV/Greeks)
- `analytics.term_structure` (ATM IV by expiry)
- `analytics.skew_by_expiry` (`risk_reversal_25d`, `butterfly_25d`, ATM IV)
- `analytics.smile_by_expiry` (IV smile points by strike/moneyness per expiry)
- `analytics.max_pain` (overall and per-expiry max pain strike)
- `analytics.levels` (primary support/resistance from put/call OI walls)
- `analytics.implied_vs_realized` (`current_atm_iv`, `historical_volatility`, `iv_minus_hv`, `iv_to_hv_ratio`)
- `analytics.summary.iv_historical_percentile` (requires `iv_history` input)
- `analytics.errors` (explicit calculation issues)
get_stock_option_gex(...) return notes:
- `gex_data.metadata.gamma_flip_price` is only populated when net GEX crosses zero inside the internal spot sweep band (`0.7x` to `1.3x` current spot).
- If no sign change occurs inside that band, `gamma_flip_price` remains `null`; this means “no zero-gamma root found in search band”, not “missing data”.
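The `null` semantics are easier to see with a toy root search. The sketch below sweeps a grid across the `0.7x` to `1.3x` band and reports the first sign change. The function name, the grid size of 60 steps, and the linear interpolation between grid points are assumptions, and `net_gex_at` is a caller-supplied function standing in for the real GEX model.

```python
from typing import Callable, Optional


def find_gamma_flip(
    net_gex_at: Callable[[float], float],
    spot: float,
    low: float = 0.7,
    high: float = 1.3,
    steps: int = 60,
) -> Optional[float]:
    """Sweep spot*low..spot*high; return the first zero crossing or None."""
    prev_price = spot * low
    prev_value = net_gex_at(prev_price)
    for i in range(1, steps + 1):
        price = spot * (low + (high - low) * i / steps)
        value = net_gex_at(price)
        if prev_value == 0.0:
            return prev_price  # exact root on a grid point
        if prev_value * value < 0:
            # Linear interpolation between the two bracketing grid points.
            return prev_price + (price - prev_price) * (-prev_value) / (value - prev_value)
        prev_price, prev_value = price, value
    return prev_price if prev_value == 0.0 else None
```

A return value of `None` here corresponds to the documented `null`: the sweep ran, but net GEX kept the same sign across the whole band.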
LLM-as-Glue Fallback Pattern
If a step requires semantic generation but nested runtime model calls are not desired:
- Pause at the step and collect exact upstream state.
- Generate the structured output in the current LLM context.
- Write back to the exact state path expected downstream.
- Resume remaining deterministic steps.
Required compatibility write-back path example:
- `state["data"]["analyst_signals"]["sentiment_agent"][ticker]`
- fields: `signal`, `confidence`, `reasoning`
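The write-back step can be sketched with plain dictionaries. `write_sentiment_signal` is a hypothetical helper, and the example values (`"bullish"`, `0.7`) are placeholders: the guide does not define the allowed signal vocabulary or the confidence scale.

```python
from typing import Any


def write_sentiment_signal(
    state: dict[str, Any], ticker: str, signal: str, confidence: float, reasoning: str
) -> None:
    """Store a signal at state['data']['analyst_signals']['sentiment_agent'][ticker]."""
    signals = state.setdefault("data", {}).setdefault("analyst_signals", {})
    signals.setdefault("sentiment_agent", {})[ticker] = {
        "signal": signal,
        "confidence": confidence,
        "reasoning": reasoning,
    }


state = {"data": {"tickers": ["AAPL"], "analyst_signals": {}}, "metadata": {}}
write_sentiment_signal(state, "AAPL", "bullish", 0.7, "Placeholder reasoning text.")
```

Using `setdefault` at each level leaves signals already written by other analysts in place.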
Keep external contracts stable:
- Top-level result keys (e.g., `decisions`, `analyst_signals`)
- Macro structured keys (`meta`, `dimensions`, `metrics`, `chart_data`)
Orchestration Interfaces (Still Available)
These orchestration interfaces remain supported; use them when task intent is end-to-end execution rather than atomic data/analytics calls.
- `omnifinan.run_hedge_fund(...)` (full multi-agent workflow)
- `omnifinan.core.workflow.run_hedge_fund(...)` (workflow module entry)
- `omnifinan.backtester.Backtester` (backtest runner)
- `omnifinan.presentation.api.create_app()` (REST API entry)
- `omnifinan.visualize.StockFigure` / `omnifinan.visualize.create_macro_figure` (charting)
Configuration
Environment Variables
| Variable | Default | Description |
|---|---|---|
| `OMNIFINAN_DATA_PROVIDER` | `akshare` | Data provider name |
| `OMNIFINAN_MARKET_TYPE` | `china` | Market: us, china, hongkong |
| `OMNIFINAN_DATA_CACHE_TTL` | `3600` | Cache TTL in seconds |
| `OMNIFINAN_MODEL_NAME` | `deepseek-chat` | LLM model |
| `OMNIFINAN_PROVIDER_API` | `deepseek` | LLM provider |
| `OMNIFINAN_LANGUAGE` | `Chinese` | Output language |
| `OMNIFINAN_MODEL_TEMPERATURE` | `0.2` | LLM temperature |
| `OMNIFINAN_DEBATE_ROUNDS` | `1` | Bull-bear debate rounds |
| `OMNIFINAN_DETERMINISTIC_MODE` | `1` | Enable LLM response caching |
| `OMNIFINAN_LLM_SEED` | `7` | LLM seed for reproducibility |
| `OMNIFINAN_ENABLED_ANALYSTS` | (all) | Comma-separated analyst keys |
| `OMNIFINAN_CONFIG_PATH` | (none) | Path to YAML/JSON config file |
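As an illustration of how the defaults in the table apply, the sketch below reads a few of these variables with standard-library calls. `EnvSettings` and `load_env_settings` are hypothetical names used only for this example. The project's own loader is `RuntimeConfig` in `core/config.py`, whose interface is not shown in this guide.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class EnvSettings:
    """Hypothetical container for a subset of the variables in the table."""

    data_provider: str
    market_type: str
    data_cache_ttl: int
    model_temperature: float
    debate_rounds: int


def load_env_settings(env: Mapping[str, str] = os.environ) -> EnvSettings:
    """Read settings from `env`, using the table's defaults for unset keys."""
    return EnvSettings(
        data_provider=env.get("OMNIFINAN_DATA_PROVIDER", "akshare"),
        market_type=env.get("OMNIFINAN_MARKET_TYPE", "china"),
        data_cache_ttl=int(env.get("OMNIFINAN_DATA_CACHE_TTL", "3600")),
        model_temperature=float(env.get("OMNIFINAN_MODEL_TEMPERATURE", "0.2")),
        debate_rounds=int(env.get("OMNIFINAN_DEBATE_ROUNDS", "1")),
    )
```

Passing an explicit mapping instead of the process environment, for example `load_env_settings({"OMNIFINAN_DEBATE_ROUNDS": "2"})`, keeps tests independent of the shell.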
Config File (YAML/JSON)
data_provider: akshare
market_type: china
data_cache_ttl_seconds: 3600
debate_rounds: 2
deterministic_mode: true
enabled_analysts:
  - technical_analyst
  - fundamentals_analyst
  - macro_analyst
llm:
  model_name: deepseek-chat
  provider_api: deepseek
  temperature: 0.2
  max_retries: 3
language: Chinese
Available Analyst Keys
- `technical_analyst` - Technical indicators and pattern analysis
- `fundamentals_analyst` - Financial statement analysis
- `macro_analyst` - Macroeconomic indicators analysis
- `sentiment_analyst` - News and insider trading sentiment
- `valuation_analyst` - Intrinsic value estimation
Data Models
Key Pydantic Models
| Model | Key Fields |
|---|---|
| `Price` | open, close, high, low, volume, time, market |
| `FinancialMetrics` | ticker, report_period, period, currency, market_cap, PE/PB/PS ratios, margins, ROE/ROA, growth rates |
| `LineItem` | ticker, report_period, net_income, operating_revenue, free_cash_flow (`extra="allow"`) |
| `CompanyNews` | provider-level raw article row: ticker, title, source, date, url |
| `IntegratedNewsEvent` | event_id, ticker, headline, published_at, primary_source, weighted_source_score, consensus_passed |
| `InsiderTrade` | ticker, filing_date, transaction_date, transaction_shares, transaction_value |
| `MarketType` | US, CHINA, CHINA_SZ, CHINA_SH, HK, UNKNOWN |
Runtime Data Paths
All runtime data under OMNIX_PATH/omnifinan/:
- `request_cache/` - API response cache (hashed JSON files)
- `datasets/` - Persistent datasets (prices, financials, macro history)
- `reports/` - Output reports and experiment records
- `logs/` - Application logs
LLM Integration
from omnifinan.llm.client import call_llm
# Plain text response
text = call_llm(
    prompt="Analyze this data...",
    model_name="deepseek-chat",
    provider_api="deepseek",
)
# Structured Pydantic response
from pydantic import BaseModel

class Analysis(BaseModel):
    signal: str
    confidence: float

result = call_llm(
    prompt="...",
    model_name="deepseek-chat",
    provider_api="deepseek",
    pydantic_model=Analysis,
    deterministic_mode=True,  # enables response caching
    trace=trace,  # optional RunTrace
    scratchpad=scratchpad,  # optional Scratchpad
)
Supported providers: deepseek, openai (gpt), anthropic (claude), google (gemini)
Testing
Run verification tests after changes:
# Core macro logic
pytest tests/test_macro_source_policy.py
pytest tests/test_macro_structured.py
pytest tests/test_macro_visualize.py
# Factor mining and backtest (requires scipy)
pytest tests/test_factor_mining.py
pytest tests/test_factor_backtest.py
# Other test suites
pytest tests/test_agent_graphs.py
pytest tests/test_agent_edges.py
pytest tests/test_data_cache.py
pytest tests/test_llm_client.py
pytest tests/test_runtime_config.py
pytest tests/test_sec_edgar_provider.py
pytest tests/test_symbols.py
Key Constraints (from AGENTS.md)
- Source policy: Do not change macro source policy unless explicitly requested
- Runtime data: Never write hot data into the repo; use `OMNIX_PATH/omnifinan/`
- Output stability: Preserve unified API structures (`meta`, `dimensions`, `metrics`, `chart_data`)
- Minimal edits: Keep changes deterministic and technically concise
- Anti-loop: Avoid repeated full refresh loops when sources have no delta
- Local-first: Prefer cached data for repeated analysis/report generation
Adding a New Analyst Agent
- Create `src/omnifinan/agents/your_analyst.py` with function signature:

      def your_analyst_agent(state: AgentState) -> AgentState:
          # Read from state["data"], state["metadata"]["data_service"]
          # Write signals to state["data"]["analyst_signals"][ticker]["your_analyst_agent"]
          return {"messages": state["messages"], "data": {...}, "metadata": state["metadata"]}

- Register in `src/omnifinan/utils/analysts.py` (`ANALYST_CONFIG`)
- The graph builder (`graphs.py`) auto-wires registered analysts
Adding a New Data Provider
- Create `src/omnifinan/data/providers/your_provider.py` implementing the `DataProvider` ABC
- Register in `src/omnifinan/data/providers/factory.py`
- All 7 abstract methods must be implemented
Ticker Format
- China A-shares: 6-digit code (e.g., `600519`, `000001`)
- Hong Kong: 5-digit zero-padded (e.g., `00700`)
- US: Standard symbols (e.g., `AAPL`, `MSFT`)
- Crypto: Pair format (e.g., `BTC-USD`, `ETHUSDT`) - auto-routed to YFinance
For detailed macro series reference and structured output schema, see macro-reference.md.