Local CrewAI-based toolkit (v2.3) for building embeddings from WorldQuant Brain materials and running a multi-agent alpha research workflow with retrieval-augmented generation (RAG). The repo includes the v2.3 agent pipeline, embedding notebook, API simulator client, and test utilities.
- wqbagent_v2_3.py: v2.3 CrewAI pipeline (retrieval tools, LLM routing, and simulation integration).
- wqbagent_embedding.ipynb: embedding build notebook for PDF/text sources.
- wqbagent-v2.3.ipynb: interactive notebook for the full v2.3 agent workflow.
- wqbagent_output_test.py, wqbagent_output_test.ipynb: output/log formatting and LLM connectivity checks (update BASE_DIR if needed).
- wqbquant_searchtool_test.py, wqbquant_searchtool_test.ipynb: health check helper for search/retrieval tools.
- wqbagentcore/: core modules (LLM setup, embeddings, tools, crews).
- wqb_api/: WorldQuant Brain API client and simulation helpers.
- config/: configuration constants (plus gitignored API keys).
- utils/: logging and data-cleanup helpers.
- materials/: reference materials and notes.
- scripts/: Windows batch helpers and launchers.
- releases/: archived v1/v2 artifacts.
- requirements.txt: Python dependencies.
- 🚨 Python version must be <= 3.12.
- Windows is recommended for the provided launch scripts (they can be adapted for other OSes).
- Access to an OpenAI-compatible LLM endpoint (Moonshot, DeepSeek, Gemini, or a local proxy).
- WorldQuant Brain credentials if you plan to run the simulator API.
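The Python version limit above can be checked before installing anything. The following is a minimal sketch, assuming the limit means any 3.12.x release is acceptable and 3.13 or later is not; `python_version_ok` is a hypothetical helper name and is not part of the repo.

```python
import sys

# Upper bound taken from the requirement above: Python <= 3.12.
MAX_SUPPORTED = (3, 12)


def python_version_ok(version_info=None, max_supported=MAX_SUPPORTED):
    """Return True if (major, minor) is at or below the supported limit."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) <= max_supported


if __name__ == "__main__":
    running = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if python_version_ok() else "too new for this toolkit"
    print(f"Python {running}: {status}")
```

Only the major and minor components are compared, so a patch release such as 3.12.4 still passes.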
- Create and activate a virtual environment.
- Install dependencies:

      pip install -r requirements.txt

- Ensure config/ exists, then create config/api_key.py (gitignored):

      API_KEY_MOONSHOT = "YOUR_KEY_HERE"
      API_KEY_GEMINI_C26 = "YOUR_KEY_HERE"
      API_KEY_GEMINI_CU = "YOUR_KEY_HERE"
      API_KEY_DEEPSEEK = "YOUR_KEY_HERE"
      API_KEY_GOOGLE_CLOUD = "YOUR_KEY_HERE"

  Define all five variables; for any provider you are not using, set the value to an empty string.
- Add WorldQuant Brain credentials (only required if you use the API simulator). Create Credentials/brain_credentials_0.txt with JSON content:

      ["username", "password"]

- Place your documents and metadata under the expected folders (or update paths in wqbagent_v2_3.py / wqbagent_embedding.ipynb):
  - Docs/Forums/wqb_china_consultant_pdf
  - Docs/Forums/wqb_global_consultant_pdf
  - Docs/Forums/wqb_research_pdf
  - Docs/Forums/wqb_brain_tips_pdf
  - Docs/OfficialDocs
  - Operators/Operators-Agent.json
  - DataFields/Datafield-Dataset-Category-Description.json

  Note: If migrating from older versions with PaymentPolicy PDFs in Docs/PaymentPolicy, move them into Docs/Forums/wqb_brain_tips_pdf (v2.3 treats PaymentPolicy content as part of the brain tips corpus).
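The setup steps above can be verified with a small preflight script before the first run. This is a sketch and not part of the repo: the function names are hypothetical, the variable names and paths are copied from the steps above, and it assumes the keys are plain top-level assignments in config/api_key.py. The key file is parsed with `ast` and never executed, so no key values are loaded or printed.

```python
import ast
import json
from pathlib import Path

# Variable names and paths copied from the setup steps above.
REQUIRED_KEYS = [
    "API_KEY_MOONSHOT",
    "API_KEY_GEMINI_C26",
    "API_KEY_GEMINI_CU",
    "API_KEY_DEEPSEEK",
    "API_KEY_GOOGLE_CLOUD",
]
EXPECTED_PATHS = [
    "Docs/Forums/wqb_china_consultant_pdf",
    "Docs/Forums/wqb_global_consultant_pdf",
    "Docs/Forums/wqb_research_pdf",
    "Docs/Forums/wqb_brain_tips_pdf",
    "Docs/OfficialDocs",
    "Operators/Operators-Agent.json",
    "DataFields/Datafield-Dataset-Category-Description.json",
]


def missing_api_keys(api_key_file):
    """Return required names that are not assigned at the top level of the file."""
    path = Path(api_key_file)
    if not path.is_file():
        return list(REQUIRED_KEYS)
    tree = ast.parse(path.read_text(encoding="utf-8"))
    assigned = {
        target.id
        for node in tree.body
        if isinstance(node, ast.Assign)
        for target in node.targets
        if isinstance(target, ast.Name)
    }
    return [name for name in REQUIRED_KEYS if name not in assigned]


def credentials_ok(credentials_file):
    """Return True if the file holds a JSON list of exactly two strings."""
    try:
        data = json.loads(Path(credentials_file).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return False
    return (
        isinstance(data, list)
        and len(data) == 2
        and all(isinstance(item, str) for item in data)
    )


def missing_paths(base_dir):
    """Return the expected folders/files that do not exist under base_dir."""
    base = Path(base_dir)
    return [rel for rel in EXPECTED_PATHS if not (base / rel).exists()]


if __name__ == "__main__":
    base = Path(".")  # assumption: run from the repo root
    print("Missing API key variables:", missing_api_keys(base / "config" / "api_key.py"))
    print("Credentials file valid:", credentials_ok(base / "Credentials" / "brain_credentials_0.txt"))
    print("Missing data paths:", missing_paths(base))
```

If you do not use the API simulator, a failed credentials check can be ignored, since that file is only required for simulation.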
- Update BASE_DIR and doc paths in wqbagent_embedding.ipynb if you keep data outside the repo.
- Run the embedding build workflow (recommended: wqbagent_embedding.ipynb):

      jupyter lab

- Execute the ingestion cells once to build the embedding DBs.
- Embeddings are stored under embedding_db/ with v2.3 subfolders:
  - wqb_forum_china_embedding_db
  - wqb_forum_global_embedding_db
  - wqb_forum_research_embedding_db
  - wqb_forum_tips_embedding_db
  - wqb_official_docs_embedding_db

  Ingest tracking is stored as ingested_files.json inside each docs folder.
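After the ingestion cells finish, the expected outputs can be checked from the filesystem alone. The sketch below is illustrative and not part of the repo: both helper names are hypothetical, and it only tests for the presence of the folders and tracking files named above. It makes no assumption about the contents of ingested_files.json or about the vector store's internal format.

```python
from pathlib import Path

# Subfolder names copied from the list above.
EXPECTED_STORES = [
    "wqb_forum_china_embedding_db",
    "wqb_forum_global_embedding_db",
    "wqb_forum_research_embedding_db",
    "wqb_forum_tips_embedding_db",
    "wqb_official_docs_embedding_db",
]


def missing_stores(base_dir, root="embedding_db"):
    """Return expected store folders that are absent or empty."""
    root_path = Path(base_dir) / root
    missing = []
    for name in EXPECTED_STORES:
        store = root_path / name
        if not store.is_dir() or not any(store.iterdir()):
            missing.append(name)
    return missing


def tracked_doc_folders(docs_dir):
    """Return folders under docs_dir (as relative paths) holding ingested_files.json."""
    docs = Path(docs_dir)
    return sorted(
        found.parent.relative_to(docs).as_posix()
        for found in docs.rglob("ingested_files.json")
    )


if __name__ == "__main__":
    base = Path(".")  # assumption: run from the repo root
    print("Stores not built yet:", missing_stores(base))
    print("Docs folders with ingest tracking:", tracked_doc_folders(base / "Docs"))
```

An empty first list together with one tracking entry per docs folder suggests the build completed for every corpus.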
Run the v2.3 pipeline:

    python .\wqbagent_v2_3.py

- Output/log formatting test:

      python .\wqbagent_output_test.py

- Search tool health check (import test_agents and pass your tool functions plus the LLM instance from your pipeline):

      python .\wqbquant_searchtool_test.py

scripts/wqbagent.bat, scripts/wqbagent_test.bat, and scripts/wqbtool_test.bat are templates that:
- activate a venv
- force UTF-8 output
- pipe ANSI output to HTML using ansi2html
Update the venv path and the Python entry point to match an available script like wqbagent_v2_3.py,
wqbagent_output_test.py, or wqbquant_searchtool_test.py.
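On other operating systems, the same three steps can be approximated in Python instead of a batch file. The following is a rough sketch, not a port of the provided scripts: `run_with_utf8` and `ansi_to_html` are hypothetical names, calling the venv's own interpreter stands in for activating the venv, and the HTML step assumes the third-party ansi2html package is installed.

```python
import os
import subprocess
import sys


def run_with_utf8(python_exe, args):
    """Run `python_exe args...` with UTF-8 forced; return combined stdout/stderr."""
    # PYTHONUTF8 and PYTHONIOENCODING make the child process emit UTF-8.
    env = dict(os.environ, PYTHONUTF8="1", PYTHONIOENCODING="utf-8")
    result = subprocess.run(
        [str(python_exe), *args],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        encoding="utf-8",
        errors="replace",
    )
    return result.stdout


def ansi_to_html(ansi_text):
    """Convert ANSI-colored text to an HTML page (needs the ansi2html package)."""
    from ansi2html import Ansi2HTMLConverter  # third-party, imported lazily

    return Ansi2HTMLConverter().convert(ansi_text)


if __name__ == "__main__":
    # sys.executable is a placeholder for the venv interpreter,
    # and the inline command is a placeholder for the real entry point.
    output = run_with_utf8(sys.executable, ["-c", "print('\\x1b[32mhello\\x1b[0m')"])
    print(repr(output))
```

For a real run, pass the venv's interpreter path and an entry point such as `["wqbagent_v2_3.py"]`, then write `ansi_to_html(output)` to an .html file.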
Additional helpers include scripts/add_user_path.bat, scripts/add_user_path_here.py, and scripts/hf_wqb_sync.bat.
The following are created at runtime and are excluded from git:
- logs/ (run logs)
- cache/ (HF/transformers cache)
- embedding_db/ (v2.3 vector stores)
- Docs/, DataFields/, Operators/, Credentials/ (local datasets and credentials)
- wqb_embedding_db/, quant_forum_chroma/, quant_forum_bgem3/ (legacy vector stores from earlier versions)
No license file is currently included.