QuantGPT

Agent-Driven Factor Research Engine — Autonomous Mining, Independent Validation on QuantGPT Cloud

LLM Agent 自治因子挖矿 → 批量回测 → 多维评分 → 反过拟合验证 → QuantGPT Cloud 独立验证 + 样本外跟踪 | 全程零人工干预

Quick Start · Architecture · API Docs · MCP Guide · Factor Mining · Roadmap · Contributing

WeChat / 微信交流群: Add quantgpt_ai on WeChat to join the community

What Is QuantGPT

QuantGPT is an agent-driven factor research engine — not a backtest library, not a chatbot wrapper. It gives an LLM Agent (Claude, via MCP) a complete toolkit to autonomously discover, evaluate, iterate, and validate alpha factors, with zero human intervention per research cycle. Factors that pass local validation are automatically uploaded to QuantGPT Cloud for independent verification and out-of-sample tracking — ensuring research results are reproducible and auditable.

The core architecture:

LLM Agent (Claude Code / Claude Desktop)
    │
    ├── MCP Tools (15 个)         ← Agent 的工具箱
    │   ├── run_backtest           ← 全市场分组回测
    │   ├── score_factor           ← 0-100 综合评分
    │   ├── diagnose_factor        ← 失败模式诊断
    │   ├── run_anti_overfit       ← 4 项反过拟合检验
    │   ├── run_rolling_validation ← Walk-forward 验证
    │   ├── validate_expression    ← 语法校验
    │   ├── list_operators         ← 60+ 算子文档
    │   ├── list_universes         ← 股票池和基准
    │   ├── wq_brain_submit        ← WQ BRAIN 单因子提交
    │   ├── wq_brain_batch_submit  ← 批量参数扫描提交
    │   ├── wq_brain_submit_by_ids ← 按 ID 提交
    │   ├── wq_brain_list_alphas   ← 查询已提交 alpha
    │   ├── wq_brain_check_alphas  ← 检查 alpha 状态
    │   └── wq_brain_finalize_submissions ← 最终提交确认
    │
    ├── QuantGPT Cloud             ← 独立验证平台 (quant-gpt.com)
    │   ├── A 级因子自动上传
    │   ├── 独立 IC/IR/Turnover 验证
    │   ├── 样本外实时跟踪
    │   ├── 因子去重（Self-Correlation 检测）
    │   └── 公开可审计的研究记录
    │
    ├── Evolution Engine           ← 因子进化引擎
    │   ├── MutationEngine (8 方向突变)
    │   ├── CrossoverEngine (高分因子交叉)
    │   ├── MetaEvolutionSelector (自适应策略)
    │   └── TrajectoryAnalyzer (轨迹分析)
    │
    ├── WQ BRAIN Integration       ← WorldQuant 直连（可选）
    │   ├── Dollar-neutral 模拟
    │   └── 一键正式提交
    │
    └── Knowledge Base             ← 跨会话知识积累
        ├── rules/    (已验证规则)
        ├── findings/ (经验发现)
        └── failures/ (已证伪路径)

How It Differs from "AI Backtest Tools"

传统工具（包括 ChatGPT + 回测库）的模式是：人类想因子 → 工具跑回测 → 人类看结果。Agent 是执行者，人类是决策者。

QuantGPT 的模式是：人类定义目标 → Agent 自治研究 → 人类审阅产出。Agent 是研究者，人类是审阅者。

这不是接口层的区别（自然语言 vs. 代码），而是决策权的区别。Agent 自主决定：探索哪个方向、生成什么表达式、评估哪些指标、何时迭代、何时放弃、何时提交。

Production Track Record

Metric	Value
累计回测任务	370+
单轮迭代（8 候选因子）	~15 分钟
表达式算子	60+（含非线性、三元、技术指标）
Cloud 独立验证	A 级因子自动上传 quant-gpt.com 验证 + 样本外跟踪
WQ BRAIN 提交（可选）	3 个因子 IS 全部 PASS，已提交（最佳 Fitness 1.26）

Validated Results

QuantGPT 产出的因子通过两层验证：本地多维评分 + QuantGPT Cloud 独立 IC/IR 验证。高质量因子还可选提交至 WQ BRAIN。以下 3 个因子已通过 WQ BRAIN IS 检测并正式提交：

Factor	Expression	WQ Sharpe	WQ Fitness	WQ Returns	IS Tests	Status
Debt-Momentum Composite	`-1 * rank(ts_av_diff(close, 10)) + rank(debt / enterprise_value)`	1.77	1.26	20.18%	ALL PASS	Submitted
VWAP Decay Reversal	`-1 * rank(ts_decay_linear(close / vwap, 10))`	1.69	1.07	18.63%	ALL PASS	Submitted
Returns-Volume Momentum	`-1 * rank(ts_decay_linear(returns * volume / adv20, 5))`	1.60	1.03	24.15%	ALL PASS	Submitted

3 个因子代表不同的 alpha 来源：Debt-Momentum 结合动量反转与基本面（债务/企业价值），行业中性化；VWAP Decay Reversal 捕捉价格偏离 VWAP 的衰减回归，市场中性化；Returns-Volume Momentum 捕捉收益率与相对成交量的衰减动量，市场中性化。全程 Agent 自治完成。

_{Debt-Momentum Composite — 已正式提交：Sharpe 1.77, Fitness 1.26, Returns 20.18%, IS 全部 PASS}

_{VWAP Decay Reversal — 已正式提交：Sharpe 1.69, Fitness 1.07, Returns 18.63%, IS 全部 PASS}

_{Returns-Volume Momentum — 已正式提交：Sharpe 1.60, Fitness 1.03, Returns 24.15%, IS 全部 PASS}

Autonomous Factor Mining — The Core Loop

This is QuantGPT's defining capability.

Agent 读知识库、设计假设、批量实验、分析结果、积累知识、自我迭代，每个结论经过双模型交叉验证。A 级因子自动上传 QuantGPT Cloud 独立验证，建立可审计的样本外跟踪记录。

Research Cycle

                    ┌─────────────────────────────┐
                    │  Research Notes & Knowledge  │
                    │  (Rules / Findings / Fails)  │
                    └──────────┬──────────────────┘
                               │ read
                               ▼
┌──────────┐    ┌──────────────────────────┐    ┌──────────────────┐
│  Phase 0 │───▶│  Phase 1: Factor Design  │───▶│  Phase 2: Batch  │
│  Context │    │  Hypothesis → Expression │    │  Backtest (10-20 │
│  Loading │    │  1-3 candidates per idea │    │  concurrent)     │
└──────────┘    └──────────────────────────┘    └────────┬─────────┘
                                                         │
                               ┌─────────────────────────┘
                               ▼
                ┌──────────────────────────────────────────┐
                │  Phase 3: Four-Step Analysis             │
                │                                          │
                │  ① Fact Collection (metrics vs baseline) │
                │  ② Independent Judgment (Agent)          │
                │  ③ Cross-Review (DeepSeek Reasoner)      │
                │  ④ Consensus or Divergence Resolution    │
                └──────────────────┬───────────────────────┘
                                   │
                    ┌──────────────┴──────────────┐
                    ▼                             ▼
          ┌─────────────────┐          ┌──────────────────┐
          │  Phase 4: Update │          │  Phase 5: Stop?  │
          │  Notes + Knowledge│         │  Converged /     │
          │  Base             │◀────────│  Time / Rounds   │
          └─────────────────┘          └──────────────────┘
                    │                         │ no
                    │                         └──▶ back to Phase 1
                    ▼
          ┌──────────────────┐
          │  Phase 6: Report │
          │  A/B factors +   │
          │  new knowledge   │
          └──────────────────┘

Key Mechanisms

Dual-LLM Cross-Review

每个结论性判断（采用/不采用/关闭方向）必须经过第二个 LLM 独立评审。把事实数据和第一个模型的推理链一起发给 DeepSeek Reasoner，要求独立评估推理是否合理、是否有遗漏角度。

共识 → 直接输出。分歧 → 呈现双方证据，采用更保守结论。

这解决了单模型因子研究的核心问题：confirmation bias。

Persistent Knowledge Base

research_notes/knowledge/
├── rules/       ← 已验证的稳定规则（必须遵守）
├── findings/    ← 经验发现（参考）
└── failures/    ← 已证伪路径（禁止重复）

知识库跨会话积累。第 10 次研究会话可以直接利用前 9 次的所有发现，避免重复实验，遵守已验证规则，绕开已证伪路径。

这不是 chat history——是结构化的研究资产。

Batch Concurrent Evaluation

单次提交 10-20 个因子表达式，并发回测 + 三波重试。结果按 fitness 降序排列。hs300 fitness < 0.1 时自动跳过 csi500 验证，节省算力。

from scripts.factor_miner import batch_evaluate
results = batch_evaluate(
    server, expressions, params,
    max_concurrent=10
)

Research Discipline (Enforced)

不是建议，是硬性规则：

每次实验只改一个变量
提交前检查是否已做过（笔记 + 知识库）
分析结论标注"仅为假设"
失败实验同样记录原因
表达式 > 4 层嵌套需额外论证
简单清晰 > 复杂精巧

上面 Validated Results 中的因子就是这个流程的产出。 多轮迭代，A 级因子自动上传 QuantGPT Cloud 独立验证。完整方法论见 Factor Mining Guide。

Architecture

┌────────────────────────────────────────────────────────────────────┐
│                     QuantGPT Research Engine                       │
├─────────────┬──────────────────────────────┬───────────────────────┤
│             │         Core Engine          │                       │
│  Agent      │  ┌──────────────────────┐   │   Data Layer          │
│  Interface  │  │  Expression Parser   │   │  ┌─────────────────┐  │
│             │  │  60+ operators       │   │  │ baostock (free)  │  │
│ MCP Tools   │  │  Cloud + WQ compat.  │   │  │ akshare (free)   │  │
│ REST API    │  └──────────┬───────────┘   │  │ PolarDB (opt)    │  │
│ Web UI      │  ┌──────────▼───────────┐   │  │ Parquet cache    │  │
│ (monitor)   │  │  Backtest Engine     │   │  └─────────────────┘  │
│             │  │  Rank-based grouping │   │                       │
│             │  │  Cloud aligned       │   │   AI Layer            │
│             │  └──────────┬───────────┘   │  ┌─────────────────┐  │
│             │  ┌──────────▼───────────┐   │  │ DeepSeek LLM    │  │
│             │  │  Validation Suite    │   │  │ Factor design   │  │
│             │  │  Anti-overfit (4x)   │   │  │ Cross-review    │  │
│             │  │  Walk-forward        │   │  │ Mutation engine │  │
│             │  │  QuantGPT Cloud      │   │  └─────────────────┘  │
│             │  │  WQ BRAIN (optional) │   │                       │
│             │  └──────────────────────┘   │   Storage             │
│             │                             │  ┌─────────────────┐  │
│             │  Evolution Engine           │  │ SQLite (default)│  │
│             │  Trajectory → Meta-Evo →    │  │ PostgreSQL (opt)│  │
│             │  Mutation / Crossover       │  └─────────────────┘  │
├─────────────┴──────────────────────────────┴───────────────────────┤
│  Agent Orchestrator: Claude Code skill loop / Claude Desktop MCP   │
└────────────────────────────────────────────────────────────────────┘

Tech Stack

Layer	Technology
Agent	Claude Code (skill loop) / Claude Desktop (MCP)
Backend	Python 3.10+, FastAPI, uvicorn, SQLAlchemy 2.0 async
Database	SQLite (default, zero-config) / PostgreSQL (optional)
AI/LLM	DeepSeek (factor generation + cross-review)
Market Data	baostock + akshare (free) → Parquet cache
Cloud Validation	QuantGPT Cloud — independent IC/IR verification
Frontend	React 18 + TypeScript + Tailwind CSS 4 (monitoring dashboard)
MCP	FastMCP (stdio / SSE / streamable-http)
Report	QuantStats HTML

Key Engineering Decisions

1. Expression Parser — The Core Differentiator

自研的表达式解析器（expression_parser.py, 1000+ lines）是整个系统的核心：

60+ 算子：截面（rank, zscore）、时序（ts_corr, decay_linear）、非线性（sigmoid, tanh, sign_power）、条件/三元（where, trade_when, ? :）、技术指标（rsi, macd, atr）
双模式：mode="local" 开放全部 60+ 算子，mode="wq" 仅允许 WQ BRAIN 兼容子集（可选提交前校验）
语义正确的截面/时序分离：rank() 按 trade_date 分组（截面），ts_mean() 按 stock_code 分组（时序），自动处理分组逻辑
安全约束：递归深度限制、窗口上限、表达式长度限制，防止恶意输入

2. Three-Layer Anti-Overfit System

Layer	Module	Method
Statistical Tests	`anti_overfit.py`	IC 稳定性 + 子样本压力测试（牛/熊/震荡）+ 安慰剂检验 + 半衰期估计
Walk-Forward	`rolling_validator.py`	滚动 train/valid/test 窗口，评估样本外 IC 衰减
WQ Simulation	`wq_simulate.py`	Dollar-neutral 多空模拟，Sharpe/Turnover/Fitness 计算
QuantGPT Cloud	`cloud_client.py`	独立验证 — A 级因子自动上传 quant-gpt.com + IC/IR 检验 + 样本外跟踪
WQ BRAIN API (optional)	`wq_brain_client.py`	直连 BRAIN 平台 — 真实模拟 + 一键正式提交

3. Evolutionary Factor Iteration

受 QuantaAlpha 启发的三阶段自动搜索：

TrajectoryAnalyzer → MetaEvolutionSelector → Strategy Execution
 (质量指标评估)       (EXPLOIT/EXPLORE/        (MutationEngine ×8 方向
                      RECOMBINE/SIMPLIFY)       / CrossoverEngine)

8 种定向突变：时间窗口变异、算子替换、复杂度调整、截面变换叠加等。5 维评分驱动迭代方向。

4. Agent-First Access Model

Mode	Role	Use Case
MCP (primary)	Agent toolkit	Claude Code / Claude Desktop 通过 MCP 调用所有研究工具，驱动自治研究循环
REST API	Programmatic access	批量回测、外部系统集成、CI/CD 因子验证
Web UI	Monitoring dashboard	任务监控、报告查看、因子库管理

MCP Tools (15 个)

Tool	Description
`list_operators`	全部算子文档
`list_universes`	股票池和基准
`validate_expression`	语法校验
`run_backtest`	完整回测
`score_factor`	评分（0–100, A/B/C/D）
`diagnose_factor`	失败模式诊断 + 改进建议
`run_anti_overfit`	4 项反过拟合检验
`run_rolling_validation`	Walk-forward 验证

Competitive Landscape

Capability	JoinQuant	Backtrader	ChatGPT + Backtest	QuantGPT
Research mode	Human writes code	Human writes code	Human prompts, tool executes	Agent autonomously researches
Factor discovery	Manual	Manual	One-shot LLM	Multi-round evolution + knowledge base
Anti-bias	Researcher judgment	None	None	Dual-LLM mandatory cross-review
Knowledge accumulation	Personal notes	None	Lost between sessions	Structured KB across sessions
Anti-overfit	--	--	--	4 statistical tests + walk-forward
Independent validation	--	--	--	QuantGPT Cloud — 独立 IC/IR 验证 + 样本外跟踪
WQ BRAIN (optional)	--	--	--	Operator-aligned + direct submission
MCP / AI Agent	--	--	--	15 tools, skill-loop orchestration
Live trading	Yes	Limited	--	--
Intraday data	Yes	Yes	--	Daily only

Quick Start

Option 1: Agent Mode (Recommended)

git clone https://github.com/Miasyster/QuantGPT.git && cd QuantGPT
make setup   # creates venv, installs deps, generates .env
make run     # starts server at http://localhost:8003

Add MCP configuration to Claude Code or Claude Desktop:

{
  "mcpServers": {
    "quantgpt": {
      "command": "python",
      "args": ["-m", "quantgpt"]
    }
  }
}

Then let the Agent work: "在沪深300上挖掘高质量因子，自动上传 Cloud 验证"

Option 2: Expression Mode (No LLM Required)

# Direct expression backtest via API
curl -X POST http://localhost:8003/api/v1/auto_backtest \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <token>" \
  -d '{"expression": "rank(close / ts_mean(close, 20))", "universe": "hs300"}'

Windows Quick Start

Windows 用户不需要 make 和 restart.sh，手动执行即可：

# 1. 克隆项目
git clone https://github.com/Miasyster/QuantGPT.git
cd QuantGPT

# 2. 创建虚拟环境并安装依赖
python -m venv .venv
.venv\Scripts\activate
pip install -e .
# 可选：装 PostgreSQL 支持
# pip install -e ".[postgresql]"

# 3. 构建前端（需要 Node.js，从 nodejs.org 下载 LTS 版本）
cd frontend && npm install && npm run build && cd ..

# 4. 启动服务
python -m quantgpt --transport http
# 浏览器打开 http://localhost:8003

注意：

推荐 Python 3.11 或 3.12（3.14 太新，部分依赖可能不兼容）

如果端口被占用：netstat -ano | findstr :8003 查进程，taskkill /PID <pid> /F 杀掉

也可以使用 WSL2（wsl --install），体验与 macOS/Linux 完全一致

Zero config by default: SQLite database, baostock + akshare free data. See full Quick Start guide for details.

Optional: DeepSeek API (for factor generation & cross-review)

# Edit .env, add your DeepSeek API key (~$0.001 per query)
DEEPSEEK_API_KEY=sk-your-key-here

Optional: PostgreSQL (for production)

pip install "quantgpt[postgresql]"
# Edit .env:
DATABASE_URL=postgresql+asyncpg://quantgpt:password@localhost:5432/quantgpt
alembic upgrade head

Optional: QuantGPT Cloud (independent validation)

# Edit .env — A-grade factors auto-upload for independent validation
QUANTGPT_CLOUD_EMAIL=your_email@example.com
QUANTGPT_CLOUD_PASSWORD=your_password

Register at quant-gpt.com, then add credentials to .env. A-grade factors will be automatically uploaded and independently validated (IC/IR/turnover/correlation checks).

Expression Examples

Local backtest — works out of the box with baostock/akshare data:

# 20-day momentum
rank(close / ts_mean(close, 20))

# Low volatility
rank(-1 * ts_std(close/ts_shift(close,1)-1, 20))

# Decay-weighted correlation
decay_linear(rank(ts_corr(vwap, volume, 10)), 5)

# Momentum + profitability composite
-1 * rank(ts_av_diff(close, 10)) + rank(roe)

WQ BRAIN remote (optional) — these expressions use fields only available via WQ BRAIN submission:

# Debt-momentum composite — BRAIN submitted, Fitness 1.26, Sharpe 1.77
-1 * rank(ts_av_diff(close, 10)) + rank(debt / enterprise_value)

# VWAP decay reversal — BRAIN submitted, Fitness 1.07, Sharpe 1.69
-1 * rank(ts_decay_linear(close / vwap, 10))

# Returns-volume momentum — BRAIN submitted, Fitness 1.03, Sharpe 1.60
-1 * rank(ts_decay_linear(returns * volume / adv20, 5))

Project Structure

quantgpt/
├── quantgpt/                    # Backend (Python)
│   ├── expression_parser.py     # Factor expression parser (60+ ops, WQ compatible)
│   ├── backtest.py              # Rank-based group backtest engine
│   ├── market_data.py           # baostock/akshare → Parquet cache
│   ├── api_server.py            # FastAPI REST API + SSE
│   ├── mcp_server.py            # FastMCP server (15 tools — Agent's toolkit)
│   ├── iteration.py             # 3-phase evolutionary iteration
│   ├── mutation_engine.py       # 8 directed mutation strategies
│   ├── crossover_engine.py      # High-score factor crossover
│   ├── meta_evolution.py        # Adaptive strategy selector
│   ├── trajectory_analyzer.py   # Trajectory quality metrics
│   ├── anti_overfit.py          # 4 statistical anti-overfit tests
│   ├── rolling_validator.py     # Walk-forward validation
│   ├── wq_simulate.py           # Dollar-neutral simulator
│   ├── wq_brain_client.py       # WQ BRAIN API (optional)
│   ├── cloud_client.py          # QuantGPT Cloud upload + independent validation
│   ├── neutralize.py            # Industry & cap neutralization
│   ├── daily_summary.py         # LLM-powered daily market report
│   └── routes/                  # API route modules
├── frontend/                    # React 18 + TypeScript + Tailwind CSS 4
│   └── src/components/          # Monitoring dashboard
├── scripts/
│   └── factor_miner.py          # Batch factor evaluation toolkit
├── tests/                       # 74 tests (parser + backtest + WQ simulate)
├── example_factor/              # BRAIN validation screenshots
└── docs/                        # Architecture, API, MCP, Mining guides

Limitations

Daily frequency only — no intraday backtesting
A-share market only — China mainland equities
Agent quality depends on LLM — better models produce better factors

License

This repository is the original source of the QuantGPT factor research engine. Derivative works should retain the copyright notice and comply with the MIT License terms. See NOTICE for details.

_{Past factor performance does not guarantee future returns. This project does not constitute investment advice.}

Name		Name	Last commit message	Last commit date
Latest commit History 154 Commits
.github		.github
docs		docs
engine		engine
example_factor		example_factor
frontend		frontend
quantgpt		quantgpt
scripts		scripts
tests		tests
.dockerignore		.dockerignore
.env.example		.env.example
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
CONTRIBUTING.md		CONTRIBUTING.md
Dockerfile		Dockerfile
LICENSE		LICENSE
Makefile		Makefile
NOTICE		NOTICE
PROMPT.md		PROMPT.md
README.md		README.md
ROADMAP.md		ROADMAP.md
docker-compose.yml		docker-compose.yml
pyproject.toml		pyproject.toml
restart.sh		restart.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

QuantGPT

What Is QuantGPT

How It Differs from "AI Backtest Tools"

Production Track Record

Validated Results

Autonomous Factor Mining — The Core Loop

Research Cycle

Key Mechanisms

Architecture

Tech Stack

Key Engineering Decisions

1. Expression Parser — The Core Differentiator

2. Three-Layer Anti-Overfit System

3. Evolutionary Factor Iteration

4. Agent-First Access Model

Competitive Landscape

Quick Start

Option 1: Agent Mode (Recommended)

Option 2: Expression Mode (No LLM Required)

Windows Quick Start

Project Structure

Limitations

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

QuantGPT

What Is QuantGPT

How It Differs from "AI Backtest Tools"

Production Track Record

Validated Results

Autonomous Factor Mining — The Core Loop

Research Cycle

Key Mechanisms

Architecture

Tech Stack

Key Engineering Decisions

1. Expression Parser — The Core Differentiator

2. Three-Layer Anti-Overfit System

3. Evolutionary Factor Iteration

4. Agent-First Access Model

Competitive Landscape

Quick Start

Option 1: Agent Mode (Recommended)

Option 2: Expression Mode (No LLM Required)

Windows Quick Start

Project Structure

Limitations

License

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages