MonkeyTime/model-talents

Model Talents

Experimental toolkit for routing prompts to talent-oriented language-model specialists.

Project Value

The value of this project lies in its hands-on learning approach; it does not claim a new learning algorithm.

Most techniques used here are already known in the AI community: probes, talent vectors, distillation, LoRA/adapters, quantization, pruning, early exit, routing, and guarded generation. The interesting part is putting them together in one transparent local workflow so you can see what each method does, where it helps, and where it fails.

In that sense, this repository is closer to an experimental lab than to an algorithmic breakthrough. It is useful for learning how model specialization behaves in practice: measuring trade-offs, comparing quality/cost, exposing routing decisions, and understanding why a small or pruned model may need guardrails.
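As a rough illustration of what "comparing quality/cost" can mean in practice, the sketch below computes a Pareto front over a handful of model variants. It is not taken from this repository: the `Candidate` type, the variant names, and all numbers are made up for the example, and the cost axis could stand for latency, memory, or parameter count.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    """Hypothetical benchmark row; not the repository's report schema."""
    name: str
    quality: float  # higher is better
    cost: float     # lower is better (latency, memory, ...)


def dominates(a: Candidate, b: Candidate) -> bool:
    """a dominates b if it is no worse on both axes and strictly better on one."""
    no_worse = a.quality >= b.quality and a.cost <= b.cost
    strictly_better = a.quality > b.quality or a.cost < b.cost
    return no_worse and strictly_better


def pareto_front(candidates: list[Candidate]) -> list[Candidate]:
    """Keep every candidate that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Illustrative numbers only; they are not measurements from this project.
candidates = [
    Candidate("base", 0.80, 10.0),
    Candidate("distilled", 0.74, 5.0),
    Candidate("pruned", 0.60, 6.0),
    Candidate("quantized", 0.73, 3.0),
]

front = pareto_front(candidates)
print([c.name for c in front])
```

With these toy values the pruned variant drops out because the distilled one is both better and cheaper, while the other three each represent a different trade-off.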

The project explores:

  • hidden-state talent probes
  • talent-vector scoring
  • model benchmarking with Pareto reports
  • knowledge distillation
  • LoRA/adapters
  • dynamic quantization
  • layer pruning and early-exit probes
  • automatic routing to specialist models
  • a local web UI for transparent prompt routing
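To make talent-vector scoring and routing more concrete, here is a minimal sketch under stated assumptions. It assumes a talent vector is a direction in hidden-state space and that a prompt is routed to the talent whose vector has the highest cosine similarity with a pooled hidden state. The function names, the three-dimensional toy vectors, and the "math" and "writing" labels are hypothetical; the repository's own scoring may differ.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def route(hidden_state: list[float],
          talent_vectors: dict[str, list[float]]) -> tuple[str, dict[str, float]]:
    """Score the hidden state against every talent vector and pick the best one.

    Returning the full score table keeps the routing decision inspectable.
    """
    scores = {talent: cosine(hidden_state, vec)
              for talent, vec in talent_vectors.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Toy vectors for illustration; real hidden states have hundreds of dimensions.
talent_vectors = {
    "coding": [1.0, 0.0, 0.0],
    "math": [0.0, 1.0, 0.0],
    "writing": [0.0, 0.0, 1.0],
}
hidden_state = [0.9, 0.3, 0.1]  # stands in for a pooled hidden state

best, scores = route(hidden_state, talent_vectors)
print(best, scores)
```

Returning the scores alongside the winning talent mirrors the project's emphasis on exposing routing decisions rather than hiding them.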

Setup

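python -m venv .venv
.\.venv\Scripts\python.exe -m pip install -r requirements.txt

The commands in this README use Windows path syntax. On Linux or macOS, the interpreter inside the virtual environment is .venv/bin/python instead of .\.venv\Scripts\python.exe.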

Large downloaded models, caches, virtual environments, and generated specialist checkpoints are intentionally ignored by Git.

Quick Start

Generate a basic analysis report:

.\.venv\Scripts\python.exe .\examples\generate_report.py --model distilgpt2

Run the optimization dashboard builder:

.\.venv\Scripts\python.exe .\examples\build_optimization_dashboard.py

Serve the local routing UI:

.\.venv\Scripts\python.exe .\examples\serve_talent_router.py --local-files-only

Then open:

http://127.0.0.1:8765

Main Scripts

| Script | Purpose | Example command | HTML / JSON output |
| --- | --- | --- | --- |
| `examples/generate_report.py` | Attention, hidden-state and talent report for one model. | `.\.venv\Scripts\python.exe .\examples\generate_report.py --model distilgpt2 --local-files-only` | `reports/distilgpt2_report.html`, `reports/distilgpt2_report.json` |
| `examples/talent_vectors_usage.py` | Console example for talent vectors and supervised probes. | `.\.venv\Scripts\python.exe .\examples\talent_vectors_usage.py --local-files-only` | No HTML/JSON report; prints scores and metrics to the console. |
| `examples/benchmark_models.py` | Model quality/cost Pareto benchmark. | `.\.venv\Scripts\python.exe .\examples\benchmark_models.py --models distilgpt2 .\models\talent_distilled_distilgpt2 --local-files-only` | `reports/benchmark_pareto.html`, `reports/benchmark_pareto.json` |
| `examples/distill_student.py` | Talent-oriented knowledge distillation. | `.\.venv\Scripts\python.exe .\examples\distill_student.py --teacher gpt2 --student distilgpt2 --model-output-dir .\models\talent_distilled_distilgpt2 --local-files-only` | `reports/distillation_report.html`, `reports/distillation_report.json` |
| `examples/train_lora_adapters.py` | Train one LoRA adapter per talent. | `.\.venv\Scripts\python.exe .\examples\train_lora_adapters.py --base-model distilgpt2 --local-files-only` | `reports/lora_adapters_report.html`, `reports/lora_adapters_report.json` |
| `examples/use_lora_adapter.py` | Load a saved LoRA adapter and generate a sample. | `.\.venv\Scripts\python.exe .\examples\use_lora_adapter.py --base-model distilgpt2 --adapter .\models\lora_adapters\distilgpt2\coding.pt --local-files-only` | No HTML/JSON report; prints one generation to the console. |
| `examples/quantization_benchmark.py` | fp32 vs dynamic int8 benchmark. | `.\.venv\Scripts\python.exe .\examples\quantization_benchmark.py --models distilgpt2 .\models\talent_distilled_distilgpt2 --local-files-only` | `reports/quantization_report.html`, `reports/quantization_report.json` |
| `examples/layer_pruning_early_exit.py` | Layer pruning and early-exit probe analysis. | `.\.venv\Scripts\python.exe .\examples\layer_pruning_early_exit.py --model .\models\talent_distilled_distilgpt2 --local-files-only` | `reports/layer_pruning_report.html`, `reports/layer_pruning_report.json` |
| `examples/train_talent_models.py` | Small pruned specialist checkpoints from a local base model. | `.\.venv\Scripts\python.exe .\examples\train_talent_models.py --base-model .\models\talent_distilled_distilgpt2 --local-files-only` | `reports/talent_specialists_report.html`, `reports/talent_specialists_report.json` |
| `examples/minify_distilgpt2_specialists.py` | distilgpt2 minified specialists using pruning plus light alignment. | `.\.venv\Scripts\python.exe .\examples\minify_distilgpt2_specialists.py --teacher-model distilgpt2 --student-seed-model distilgpt2 --local-files-only` | `reports/distilgpt2_minified_specialists_report.html`, `reports/distilgpt2_minified_specialists_report.json` |
| `examples/specialize_recent_llm.py` | Recent instruction LLM pruning / specialist export. | `.\.venv\Scripts\python.exe .\examples\specialize_recent_llm.py --base-model Qwen/Qwen3-1.7B --keep-layers 24 --epochs 0 --local-files-only` | `reports/recent_llm_specialists_report.html`, `reports/recent_llm_specialists_report.json` |
| `examples/build_optimization_dashboard.py` | Aggregate existing reports into one dashboard. | `.\.venv\Scripts\python.exe .\examples\build_optimization_dashboard.py` | `reports/optimization_suite.html`, `reports/optimization_suite.json` |
| `examples/serve_talent_router.py` | Local web UI and routing API. | `.\.venv\Scripts\python.exe .\examples\serve_talent_router.py --local-files-only` | No report; serves http://127.0.0.1:8765. |

Notes

This is a research/learning project, not a production LLM serving stack. The route is shown transparently in the UI, and guarded responses are used for some code cases where small local models are not reliable enough.
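One possible shape for a guarded response on code prompts is sketched below. This is an assumption for illustration, not the repository's actual guard: it checks only whether the generated text parses as Python and otherwise substitutes a fallback message. The function name `guarded_code_response` and the returned dictionary keys are hypothetical.

```python
import ast


def guarded_code_response(generated: str, fallback: str) -> dict:
    """Return the model output if it parses as Python, else a flagged fallback.

    Hypothetical guard: a syntax check is a weak filter that catches malformed
    code but says nothing about whether the code is correct.
    """
    try:
        ast.parse(generated)
    except SyntaxError:
        return {"text": fallback, "guarded": True}
    return {"text": generated, "guarded": False}


fallback = "The specialist could not produce valid code for this prompt."

ok = guarded_code_response("def add(a, b):\n    return a + b\n", fallback)
bad = guarded_code_response("def add(a, b:\n    return a + b\n", fallback)
print(ok["guarded"], bad["guarded"])
```

Keeping the `guarded` flag in the result lets a UI show when the guard replaced the model's output, in line with showing the route transparently.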
