adata

Agent-Native financial data infrastructure for A-shares.

adata is designed for AI agents, not humans. Agents say what data they need — adata handles where to get it, how to cache it, and when to refresh it. Zero configuration, zero data-source knowledge required.

from adata import DataClient

client = DataClient()
df = client.query_daily(["sh.600519", "sz.000001"], "2024-01-01")
# Auto: check cache → fetch missing from best available provider → cache → return DataFrame

Architecture

LLM Agent
    ↓ MCP tools (query_daily / query_benchmark / data_status ...)
adata MCP Server
    ↓
DataClient — smart routing layer
    ↓ cache-first → provider fallback chain
┌──────────┬──────────┬──────────┬──────────┬──────────┐
│ Parquet   │ PolarDB  │ rqdatac  │ baostock │ akshare  │
│ (cache)   │          │          │          │          │
└──────────┴──────────┴──────────┴──────────┴──────────┘

DataClient is the single entry point. It transparently:

Checks local Parquet cache first
Falls back through a configurable provider chain
Caches fetched data for future queries
Normalizes codes across formats (sh.600519 ↔ 600519.XSHG ↔ sh600519)

MCP Server exposes DataClient as MCP tools so any LLM agent can discover and query data through the standard MCP protocol.

Installation

pip install -e .

# With specific data providers:
pip install -e ".[rqdatac]"    # 米筐
pip install -e ".[baostock]"   # baostock
pip install -e ".[akshare]"    # 东方财富
pip install -e ".[polardb]"    # Aliyun PolarDB
pip install -e ".[mcp]"        # MCP server

# Everything:
pip install -e ".[all]"

Usage

Python API (for agents and applications)

from adata import DataClient

client = DataClient()

# Daily OHLCV — auto-routed, auto-cached
df = client.query_daily(["sh.600519"], "2024-01-01")

# Benchmark index
bm = client.query_benchmark("hs300", "2024-01-01")

# Index constituents
codes = client.query_universe("hs300")

# Cache status
status = client.data_status()
# → {"data_dir": "~/.adata", "categories": {"stocks": {"files": 300, ...}}, "providers": [...]}

# Data freshness check
freshness = client.data_freshness(["sh.600519", "sz.000001"])
# → [{"code": "sh.600519", "last_date": "2026-05-14", "stale": false}, ...]

MCP Server (for LLM agents)

python -m adata mcp    # Start stdio MCP server

MCP tools available:

Tool	Description
`query_daily`	Query daily OHLCV data
`query_benchmark`	Query benchmark index (hs300, zz500, etc.)
`query_universe`	List index constituents
`data_status`	Cache overview: files, date ranges, providers
`data_freshness`	Check data staleness
`update_data`	Trigger data update by universe or codes

Claude Desktop / Claude Code config:

{
  "mcpServers": {
    "adata": {
      "command": "python",
      "args": ["-m", "adata", "mcp"]
    }
  }
}

CLI (for humans)

# Fetch data from a specific provider
python -m adata fetch --source rqdatac --universe hs300 --mode incremental

# Fetch specific codes
python -m adata fetch --source baostock --codes sh.600519,sz.000001

# Fetch benchmarks
python -m adata fetch --source rqdatac --benchmark hs300,zz500

# Full re-download
python -m adata fetch --source akshare --codes sh.600519 --mode full --start 2015-01-01

# Check cache status
python -m adata status

Data Providers

Provider	Asset Types	Auth Required	Notes
rqdatac	Stock, ETF, Index	Yes (`RQDATAC_USERNAME/PASSWORD`)	Fastest, most complete
tushare	Stock, Index	Yes (`TUSHARE_TOKEN`)	Tushare Pro, batch API
baostock	Stock	No	Free, sequential per-stock
akshare	Stock, ETF	No	Free, East Money data
polardb	Stock, Index	Yes (`POLARDB_PASSWORD`)	Private Aliyun DB

Provider priority is configurable:

client = DataClient(provider_priority=["rqdatac", "tushare", "baostock", "akshare"])

Default: ["polardb", "rqdatac", "tushare", "baostock", "akshare"]. Unavailable providers are automatically skipped.

Configuration

Environment variables (or .env file):

# Data directory (default: ~/.adata)
# ADATA_DIR=/path/to/data

# rqdatac
RQDATAC_USERNAME=
RQDATAC_PASSWORD=

# tushare pro (https://tushare.pro)
TUSHARE_TOKEN=

# PolarDB (private)
POLARDB_HOST=
POLARDB_PASSWORD=

Data Schema

Daily OHLCV (trade_date, stock_code, open, high, low, close, volume, amount, pct_change):

All providers normalize to this schema. Stored as per-instrument Parquet files under ~/.adata/{category}/.

Code format: {exchange}.{number} — e.g., sh.600519, sz.000001. Automatic normalization from rqdatac (600519.XSHG), PolarDB (sh600519), and raw (600519) formats.

Testing

pip install pytest
python -m pytest tests/ -v

94 tests covering all modules. All tests use in-memory fixtures, no network calls.

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
adata		adata
scripts		scripts
tests		tests
.env.example		.env.example
.gitignore		.gitignore
.mcp.json		.mcp.json
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

adata

Architecture

Installation

Usage

Python API (for agents and applications)

MCP Server (for LLM agents)

CLI (for humans)

Data Providers

Configuration

Data Schema

Testing

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

adata

Architecture

Installation

Usage

Python API (for agents and applications)

MCP Server (for LLM agents)

CLI (for humans)

Data Providers

Configuration

Data Schema

Testing

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages