| Student's name | SCIPER |
|---|---|
| Haodong Zheng | 387661 |
| Youliang Zhu | 415773 |
| Yueyang Pan | 350575 |
AI Value Chain Investor Dashboard is the final COM-480 Milestone 3 visualization for comparing companies across the AI stack: physical infrastructure, compute and silicon, AI infrastructure software, foundation models, and AI-enabled applications.
Audience • Usage • Setup • Data pipeline • Repository layout • Milestone 3
The dashboard is designed for people who need to compare AI companies across layers, regions, and business maturity:
- Emerging fund managers deciding where to allocate capital across the AI value chain.
- Angel investors looking for capital-efficient private and public opportunities.
- High-skill job seekers evaluating where valuable technical work is being created.
The dashboard helps users move from a market-level overview to company-level evidence. It starts with the AI value-chain structure, then supports filtering by layer and geography, comparison of valuation and efficiency metrics, and detailed inspection of each company's source-backed trend data.
Typical use cases:
- Fund managers: use the value-chain and geography views to see where companies sit in the AI stack by region. This makes it easier to compare, for example, US compute and model companies against Chinese infrastructure suppliers or European application/software companies, then decide where different asset strategies may fit.
- Angel investors: use the capital-efficiency view to compare how much enterprise or market value each company has created per dollar of disclosed capital raised. They can also inspect GitHub star trends and open-source signals to identify companies gaining developer attention before that attention is fully reflected in financial metrics.
- Job seekers: use the density view to find companies with high value creation per employee. This highlights organizations that appear to produce unusually high market or private value with concentrated teams, which can be a useful proxy for technical leverage and talent density.
Each company also has a detail view with richer trend data, source links, confidence labels, and missing-data indicators. Missing metrics are kept as null rather than guessed.
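The capital-efficiency and density views both reduce to a ratio that must stay null when an input is missing. The following is a minimal sketch of that rule, not the project's implementation: the helper `ratio_or_none` and the field names `value_usd`, `capital_raised_usd`, and `employees` are hypothetical, and the numbers are placeholders.

```python
from typing import Optional


def ratio_or_none(numerator: Optional[float], denominator: Optional[float]) -> Optional[float]:
    """Return numerator / denominator, or None when an input is missing
    or the denominator is not positive."""
    if numerator is None or denominator is None or denominator <= 0:
        return None
    return numerator / denominator


# Hypothetical record with placeholder numbers; not taken from the dataset.
company = {"value_usd": 2.0e9, "capital_raised_usd": 4.0e8, "employees": None}

# Value created per dollar of disclosed capital raised.
capital_efficiency = ratio_or_none(company["value_usd"], company["capital_raised_usd"])

# Value per employee; stays None because the headcount is missing.
value_per_employee = ratio_or_none(company["value_usd"], company["employees"])

print(capital_efficiency, value_per_employee)
```

Returning `None` instead of a default such as `0` keeps a missing metric distinguishable from a genuinely low one.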
Install Node dependencies for the Vite, TypeScript, and D3 frontend:
npm install

Set up the Python data-pipeline environment:
python3.11 -m venv .agent_venv
.agent_venv/bin/pip install -e src

Optional provider credentials are documented in .env_example. Copy it only if you plan to refresh live data:
cp .env_example .env

Do not commit .env or real API keys.
Start the local development server:
npm run dev

Open the local URL printed by Vite.
Build the static site:
npm run build

Preview the production build locally:
npm run preview

The static build output is written to dist/.
The frontend reads the frozen Milestone 3 snapshot at data/snapshot-milestone3.json. Seed files are not used directly by the dashboard; they are validated, enriched, stored in SQLite, and exported into generated JSON snapshots.
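The SQLite-to-JSON export step can be pictured with the sketch below. It is an illustration under stated assumptions, not the pipeline's code: the `observations` table, its columns, the `export_snapshot` helper, and the placeholder company ids are all hypothetical, and the real snapshot schema may differ.

```python
import json
import sqlite3

# In-memory database standing in for the metric store (assumed table layout).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE observations (company_id TEXT, metric TEXT, value REAL, source TEXT)"
)
conn.executemany(
    "INSERT INTO observations VALUES (?, ?, ?, ?)",
    [
        ("company_a", "employees", 1000.0, "example-source"),
        ("company_b", "employees", None, None),  # missing metric stays NULL
    ],
)


def export_snapshot(connection: sqlite3.Connection) -> dict:
    """Group observations by company and metric, preserving NULL values as None."""
    snapshot: dict = {}
    rows = connection.execute(
        "SELECT company_id, metric, value, source FROM observations "
        "ORDER BY company_id, metric"
    )
    for company_id, metric, value, source in rows:
        snapshot.setdefault(company_id, {})[metric] = {"value": value, "source": source}
    return snapshot


# json.dumps serializes None as null, so missing metrics survive the export.
snapshot_json = json.dumps(export_snapshot(conn), indent=2, sort_keys=True)
print(snapshot_json)
```

Because the JSON is derived from the database on every export, the generated files can be regenerated at any time without losing acquired observations.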
Validate seed files and write the base generated data:
npm run data:update

Initialize the SQLite metric store if needed:
npm run data:init-db

Refresh one public company by company_id or ticker:
npm run data:enrich-public -- --company nvidia

Refresh one private company by company_id:
npm run data:enrich-private -- --company openai

Extract public-company customer concentration for one company:
npm run data:extract-public-customers -- --company nvidia

Refresh deterministic non-financial signals such as open-source activity:
npm run data:enrich-signals -- --company mistral_ai

Freeze the current enriched outputs for Milestone 3:
.agent_venv/bin/python -m pipeline snapshot --name milestone3

Run pipeline tests:
npm run data:test

- Public companies are enriched from structured sources such as yfinance, AkShare, SEC EDGAR, CNInfo/HKEX filing routes, Financial Datasets, and FMP when credentials are available.
- Private companies are enriched from deterministic public-web and source-backed signals first. Optional LLM extraction is available for founding facts, funding, ARR, and commercial relationships when structured sources are incomplete.
- data/metrics.sqlite is the source of truth for acquired observations. JSON files under data/ are generated frontend and review artifacts.
- Source fetches and LLM outputs are cached under data/cache/ where applicable.
- Low-confidence, missing, or failed observations are written to data/review/ instead of being silently filled.
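The review rule in the last bullet can be sketched as a simple split. This is a hedged illustration only: the `Observation` shape, the confidence labels, and the `split_for_review` helper are assumptions, not the pipeline's actual types, and the sample values are placeholders.

```python
from typing import List, Optional, Tuple, TypedDict


class Observation(TypedDict):
    company_id: str
    metric: str
    value: Optional[float]
    confidence: str  # assumed labels: "high", "medium", "low"


def split_for_review(
    observations: List[Observation],
) -> Tuple[List[Observation], List[Observation]]:
    """Separate usable observations from those that need manual review.

    Missing values and low-confidence values go to the review list
    unchanged; nothing is imputed.
    """
    accepted: List[Observation] = []
    review: List[Observation] = []
    for obs in observations:
        if obs["value"] is None or obs["confidence"] == "low":
            review.append(obs)
        else:
            accepted.append(obs)
    return accepted, review


# Placeholder observations; ids and numbers are illustrative only.
sample: List[Observation] = [
    {"company_id": "company_a", "metric": "employees", "value": 1000.0, "confidence": "high"},
    {"company_id": "company_b", "metric": "employees", "value": None, "confidence": "high"},
    {"company_id": "company_c", "metric": "employees", "value": 50.0, "confidence": "low"},
]

accepted, review = split_for_review(sample)
print(len(accepted), len(review))
```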
- src/dashboard/ - D3 dashboard views, interaction logic, data loading, primitives, and callouts.
- src/main.ts - frontend entrypoint loaded by Vite.
- src/data/scripts/ - Python data pipeline CLI, source adapters, schema logic, storage, extractors, prompts, and tests.
- src/data/seeds/ - acquisition-class seed CSVs for public, private, and manual-review companies.
- data/ - generated machine-readable outputs, SQLite metrics store, frozen Milestone 3 snapshot, and review files.
- doc/ - milestone requirements, data/schema documentation, process-book notes, and screencast planning.
- report/ - final process book source and PDF.
- public/maps/ - static geographic map assets used by the dashboard.
- .env_example - documented provider keys and feature toggles.
Run the frontend build:
npm run build

Run the data-pipeline unit tests:
npm run data:test

Milestone 3 requires a GitHub repository with clean code, data, process book PDF, and setup/usage README; a screencast of at most 2 minutes; and a process book of at most 8 pages.
Repository deliverable links:
- Requirements summary: doc/milestone3.md
- Original milestone PDF: doc/Milestone3.pdf
- Process book PDF: report/process-book.pdf
- Process book source: report/process-book.tex
- Screencast outline: doc/screencast-outline.md