Project of Data Visualization (COM-480)

Student's name	SCIPER
Haodong Zheng	387661
Youliang Zhu	415773
Yueyang Pan	350575

The AI Value Chain Investor Dashboard is the final COM-480 Milestone 3 visualization. It compares companies across five layers of the AI stack: physical infrastructure, compute and silicon, AI infrastructure software, foundation models, and AI-enabled applications.

Audience • Usage • Setup • Data pipeline • Repository layout • Milestone 3

Target Audience

The dashboard is designed for people who need to compare AI companies across layers, regions, and business maturity:

Emerging fund managers deciding where to allocate capital across the AI value chain.
Angel investors looking for capital-efficient private and public opportunities.
High-skill job seekers evaluating where valuable technical work is being created.

Final Visualization And Intended Usage

The dashboard helps users move from a market-level overview to company-level evidence. It starts with the AI value-chain structure, then supports filtering by layer and geography, comparison of valuation and efficiency metrics, and detailed inspection of each company's source-backed trend data.

Typical use cases:

Fund managers: use the value-chain and geography views to see where companies sit in the AI stack by region. This makes it easier to compare, for example, US compute and model companies against Chinese infrastructure suppliers or European application/software companies, and to judge which layers and regions suit a given allocation strategy.
Angel investors: use the capital-efficiency view to compare how much enterprise or market value each company has created per dollar of disclosed capital raised. They can also inspect GitHub star trends and open-source signals to identify companies gaining developer attention before that attention is fully reflected in financial metrics.
Job seekers: use the density view to find companies with high value creation per employee. This highlights organizations that appear to produce unusually high market or private value with concentrated teams, which can be a useful proxy for technical leverage and talent density.

Each company also has a detail view with richer trend data, source links, confidence labels, and missing-data indicators. Missing metrics are kept as null rather than guessed.
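The ratio metrics behind the capital-efficiency and density views, together with the null policy, can be sketched as follows. This is a minimal illustration and not the pipeline's actual code: the function names `safe_ratio`, `capital_efficiency`, and `value_per_employee`, and their parameter names, are assumptions made for this example.

```python
from typing import Optional


def safe_ratio(numerator: Optional[float], denominator: Optional[float]) -> Optional[float]:
    """Return numerator / denominator, or None when the ratio cannot be computed.

    A missing input or a non-positive denominator yields None, so the
    metric stays null instead of being guessed.
    """
    if numerator is None or denominator is None:
        return None
    if denominator <= 0:
        return None
    return numerator / denominator


def capital_efficiency(value_usd: Optional[float], capital_raised_usd: Optional[float]) -> Optional[float]:
    """Value created per dollar of disclosed capital raised (hypothetical helper)."""
    return safe_ratio(value_usd, capital_raised_usd)


def value_per_employee(value_usd: Optional[float], employees: Optional[float]) -> Optional[float]:
    """Value created per employee (hypothetical helper)."""
    return safe_ratio(value_usd, employees)
```

Returning `None` rather than `0` keeps a missing ratio distinguishable from a genuinely low one when the views are sorted or colored.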

Setup

Install Node dependencies for the Vite, TypeScript, and D3 frontend:

npm install

Set up the Python data-pipeline environment:

python3.11 -m venv .agent_venv
.agent_venv/bin/pip install -e src

Optional provider credentials are documented in .env_example. Copy it only if you plan to refresh live data:

cp .env_example .env

Do not commit .env or real API keys.
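A script can check which optional credentials are present without ever printing their values. The sketch below only illustrates that idea: the variable names `FMP_API_KEY` and `FINANCIAL_DATASETS_API_KEY` are assumptions, and the real names are the ones documented in .env_example.

```python
import os
from typing import Dict, Mapping

# Hypothetical provider -> environment variable mapping; check .env_example
# for the names the pipeline actually expects.
EXAMPLE_KEYS: Dict[str, str] = {
    "fmp": "FMP_API_KEY",
    "financial_datasets": "FINANCIAL_DATASETS_API_KEY",
}


def available_providers(
    required_keys: Mapping[str, str],
    env: Mapping[str, str] = os.environ,
) -> Dict[str, bool]:
    """Report, per provider, whether a non-empty credential is set.

    Only booleans are returned, so secrets never end up in logs.
    """
    return {
        provider: bool(env.get(variable, "").strip())
        for provider, variable in required_keys.items()
    }
```

Calling `available_providers(EXAMPLE_KEYS)` before a refresh shows which optional sources would be skipped.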

Run The Website

Start the local development server:

npm run dev

Open the local URL printed by Vite.

Build the static site:

npm run build

Preview the production build locally:

npm run preview

The static build output is written to dist/.

Data Pipeline

The frontend reads the frozen Milestone 3 snapshot at data/snapshot-milestone3.json. Seed files are not used directly by the dashboard; they are validated, enriched, stored in SQLite, and exported into generated JSON snapshots.
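To inspect the frozen snapshot outside the dashboard, a small script along these lines may help. It makes no assumption about the snapshot's schema: it loads the JSON and counts null values, which is one rough way to see how much data is missing. The helper names `load_snapshot` and `count_nulls` are invented for this sketch.

```python
import json
from pathlib import Path
from typing import Any

# Snapshot path as given in this README.
SNAPSHOT_PATH = Path("data/snapshot-milestone3.json")


def count_nulls(node: Any) -> int:
    """Recursively count JSON null values in a parsed document."""
    if node is None:
        return 1
    if isinstance(node, dict):
        return sum(count_nulls(value) for value in node.values())
    if isinstance(node, list):
        return sum(count_nulls(item) for item in node)
    return 0


def load_snapshot(path: Path = SNAPSHOT_PATH) -> Any:
    """Parse the snapshot file and return it as plain Python objects."""
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

For example, `count_nulls(load_snapshot())` run from the repository root gives the total number of null fields in the snapshot.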

Validate seed files and write the base generated data:

npm run data:update

Initialize the SQLite metric store if needed:

npm run data:init-db

Refresh one public company by company_id or ticker:

npm run data:enrich-public -- --company nvidia

Refresh one private company by company_id:

npm run data:enrich-private -- --company openai

Extract public-company customer concentration for one company:

npm run data:extract-public-customers -- --company nvidia

Refresh deterministic non-financial signals such as open-source activity:

npm run data:enrich-signals -- --company mistral_ai

Freeze the current enriched outputs for Milestone 3:

.agent_venv/bin/python -m pipeline snapshot --name milestone3

Run pipeline tests:

npm run data:test

Pipeline Notes

Public companies are enriched from structured sources such as yfinance, AkShare, SEC EDGAR, CNInfo/HKEX filing routes, Financial Datasets, and FMP when credentials are available.
Private companies are enriched from deterministic public-web and source-backed signals first. Optional LLM extraction is available for founding facts, funding, ARR, and commercial relationships when structured sources are incomplete.
data/metrics.sqlite is the source of truth for acquired observations. JSON files under data/ are generated frontend and review artifacts.
Source fetches and LLM outputs are cached under data/cache/ where applicable.
Low-confidence, missing, or failed observations are written to data/review/ instead of being silently filled.
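The review routing described in the last note can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's implementation: the `Observation` fields, the label values `"high"`, `"medium"`, and `"low"`, and the function name `split_for_review` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Observation:
    company_id: str
    metric: str
    value: Optional[float]  # None means the metric could not be acquired
    confidence: str         # hypothetical labels: "high", "medium", "low"


def split_for_review(
    observations: Iterable[Observation],
) -> Tuple[List[Observation], List[Observation]]:
    """Separate usable observations from those that need manual review.

    Missing values and low-confidence values go to the review list
    unchanged; nothing is imputed.
    """
    accepted: List[Observation] = []
    review: List[Observation] = []
    for obs in observations:
        if obs.value is None or obs.confidence == "low":
            review.append(obs)
        else:
            accepted.append(obs)
    return accepted, review
```

Keeping the rejected observations in a separate list, rather than dropping them, mirrors the idea of writing them to data/review/ for later inspection.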

Repository Layout

src/dashboard/ - D3 dashboard views, interaction logic, data loading, primitives, and callouts.
src/main.ts - frontend entrypoint loaded by Vite.
src/data/scripts/ - Python data pipeline CLI, source adapters, schema logic, storage, extractors, prompts, and tests.
src/data/seeds/ - acquisition-class seed CSVs for public, private, and manual-review companies.
data/ - generated machine-readable outputs, SQLite metrics store, frozen Milestone 3 snapshot, and review files.
doc/ - milestone requirements, data/schema documentation, process-book notes, and screencast planning.
report/ - final process book source and PDF.
public/maps/ - static geographic map assets used by the dashboard.
.env_example - documented provider keys and feature toggles.

Development Checks

Run the frontend build:

npm run build

Run the data-pipeline unit tests:

npm run data:test

Milestone 3 Deliverables

Milestone 3 requires a GitHub repository with clean code, data, process book PDF, and setup/usage README; a screencast of at most 2 minutes; and a process book of at most 8 pages.

Repository deliverable links:

Requirements summary: doc/milestone3.md
Original milestone PDF: doc/Milestone3.pdf
Process book PDF: report/process-book.pdf
Process book source: report/process-book.tex
Screencast outline: doc/screencast-outline.md
