Agentic post-training orchestrator on phantom-mesh: cross-device and self-hosted, with an agent that automatically picks the base model and hyperparameters and runs the fine-tune. Aligned with hiring at NVIDIA / Anthropic / Modal.
docs/demo.cast — asciinema recording of the Tier 1 --dry-run fine-tune plan.
# play in a terminal (requires asciinema)
asciinema play docs/demo.cast
# or view the captured text without any tooling:
cat docs/demo.cast | jq -r '.[] | select(.[1]=="o") | .[2]'
Self-hosted on purpose: no upload to asciinema.org, no third-party tracking.
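If jq is not installed, the same text can be recovered with the Python standard library. This is a minimal sketch that assumes the recording uses the asciicast v2 layout (one JSON header object, then one `[time, type, data]` array per line); the function name `cast_output` and the sample events are illustrative and not part of this repo.

```python
import json

def cast_output(lines):
    """Concatenate the data of every "o" (output) event in an asciicast v2 stream."""
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # The header is a JSON object; events are [time, type, data] arrays.
        if isinstance(record, list) and len(record) == 3 and record[1] == "o":
            chunks.append(record[2])
    return "".join(chunks)

# Tiny in-memory sample standing in for docs/demo.cast (assumed content).
sample = [
    '{"version": 2, "width": 80, "height": 24}',
    '[0.1, "o", "plan: "]',
    '[0.2, "i", "x"]',
    '[0.3, "o", "dry-run\\r\\n"]',
]
text = cast_output(sample)
```

To run it against the real file, pass an open file handle instead of `sample`, for example `cast_output(open("docs/demo.cast"))`.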
The first headless, agentic, cross-device post-training orchestrator that sits on top of Unsloth/Axolotl and runs over a personal device mesh. Unsloth ships kernels, Axolotl ships YAML, AutoTrain ships SaaS; phantom-training is the missing piece: an LLM agent that pulls a dataset from your own phantom-mesh FTS5 memory, picks hyperparameters, dispatches to whichever mesh node has the right GPU, and publishes the resulting LoRA back as a phantom skill.
Think terraform apply for fine-tuning, where the planner is an agent and the
state lives in your phantom mesh.
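To make the "pull a dataset from FTS5 memory" step concrete, here is a minimal sketch using Python's built-in `sqlite3`. The table name `sessions`, its `prompt` / `response` columns, and the `extract_sessions` helper are assumptions for illustration; the real schema of `~/.phantom-mesh/memory.db` is not documented here. It also assumes the local SQLite build has FTS5 enabled.

```python
import sqlite3

def extract_sessions(conn, query, limit=100):
    """Return instruction/output records whose text matches an FTS5 query."""
    rows = conn.execute(
        "SELECT prompt, response FROM sessions "
        "WHERE sessions MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return [{"instruction": p, "output": r} for p, r in rows]

# In-memory stand-in for memory.db with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE sessions USING fts5(prompt, response)")
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?)",
    [
        ("Fix this Rust borrow checker error", "Clone the value before the move."),
        ("Write a Python list comprehension", "[x * 2 for x in xs]"),
    ],
)

# The default FTS5 tokenizer is case-insensitive, so "rust" matches "Rust".
dataset = extract_sessions(conn, "rust")
```

Ordering by `rank` returns the best full-text matches first, which is a reasonable default before any judge-based filtering is applied.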
- ✅ Tier 1 shipped: phantom-train CLI (--skill / --base / --dry-run / --commit), FTS5 dataset extractor against ~/.phantom-mesh/memory.db, Curator judge interface stub, example training recipe (examples/rust-coder.toml), pytest smoke test. Dry-run prints a structured plan; no real training yet.
- 🟡 Tier 2 next (M2): Unsloth backend wired, real LoRA fine-tune on Mac M-series, eval pipeline (HumanEval / MBPP).
- 🟡 Tier 3 (M3 W11-12, ~2026-08 target full MVP): agent-driven hyperparam search (LaMDAgent-style loop), cross-device dispatch via phantom-mesh, skill publish loop.
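As a rough illustration of the Tier 1 surface, the sketch below wires the four flags listed above into `argparse` and prints a plan as JSON. The plan fields, the step names, and the choice to make `--dry-run` and `--commit` mutually exclusive are assumptions; the actual `phantom_training.cli` module may be organized differently.

```python
import argparse
import json

def build_parser():
    parser = argparse.ArgumentParser(prog="phantom-train")
    parser.add_argument("--skill", required=True, help="skill name to train")
    parser.add_argument("--base", required=True, help="base model identifier")
    # Assumption: a run is either a dry run or a commit, never both.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--dry-run", action="store_true", help="print the plan only")
    mode.add_argument("--commit", action="store_true", help="apply the plan")
    return parser

def make_plan(args):
    # Illustrative plan shape, not the real output schema.
    return {
        "skill": args.skill,
        "base_model": args.base,
        "mode": "commit" if args.commit else "dry-run",
        "steps": ["extract", "judge", "build-dataset", "train", "eval", "publish"],
    }

def main(argv=None):
    args = build_parser().parse_args(argv)
    plan = make_plan(args)
    print(json.dumps(plan, indent=2))
    return plan

plan = main(["--skill", "rust-coder", "--base", "qwen2.5-coder-7b", "--dry-run"])
```

Returning the plan as plain data before anything executes is what gives the `terraform plan` / `apply` feel: the same structure can be printed, diffed, or handed to a backend later.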
git clone https://github.com/markl-a/phantom-training
cd phantom-training
pip install -e .
python -m phantom_training.cli --skill rust-coder --base qwen2.5-coder-7b --dry-run
pytest -v
phantom-training is the P3 (進化網, "evolution network") measure-upgrade layer of phantom-mesh:
the Hermes 6-step loop (judge → extract → store → recall → apply → measure)
gains a real measure arm that turns logged sessions into a new fine-tuned
LoRA, not just a metric row.
User: "make phantom's coder agent better at Rust"
|
phantom-training agent (this repo):
1. Pull Rust sessions from phantom-mesh FTS5 memory.db
2. Filter to success cases via Hermes Curator judge
3. Build instruction-tuning dataset
4. Pick base model (Qwen2.5-Coder-7B / CodeLlama-7B / ...)
5. Pick LoRA rank / lr / batch via agent proposal
6. Dispatch to Mac M-series or a mesh GPU node (via phantom-mesh)
7. Eval on holdout + HumanEval / MBPP
8. If pass, publish as phantom skill "rust-coder-v2"
9. If fail, agent re-proposes (LaMDAgent loop)
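Steps 5 to 9 above amount to a propose, train, evaluate, re-propose loop. The sketch below shows only that control flow: `propose` is a random stand-in for the agent, and `train_and_eval` returns a made-up score instead of running a fine-tune. All names, the search space, and the pass threshold are hypothetical.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class Proposal:
    rank: int
    lr: float
    batch: int

def propose(rng, history):
    """Stand-in for the agent: sample a config that has not been tried yet."""
    tried = {p for p, _ in history}
    while True:
        p = Proposal(
            rank=rng.choice([8, 16, 32, 64]),
            lr=rng.choice([1e-4, 2e-4, 5e-4]),
            batch=rng.choice([4, 8, 16]),
        )
        if p not in tried:
            return p

def train_and_eval(p):
    """Placeholder score; a real backend would fine-tune and run evals here."""
    return min(1.0, 0.5 + p.rank / 128)

def run_loop(threshold=0.7, max_rounds=5, seed=0):
    rng = random.Random(seed)
    history = []
    for _ in range(max_rounds):
        p = propose(rng, history)
        score = train_and_eval(p)
        history.append((p, score))
        if score >= threshold:
            return p, score, history  # pass: candidate for publishing
    return None, None, history  # no proposal passed within the budget

best, score, history = run_loop()
```

Passing `history` back into `propose` is the part that matters: it is where a real agent would read earlier scores and adjust its next proposal rather than sampling blindly.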
Pillars served: P3 (進化網, "evolution network": self-improving), P4 (加密為先, "encryption first": training data never leaves the mesh, weights encrypted at rest).
- Recruiters: hits JD keywords for NVIDIA training infra, Anthropic post-training research, Modal / Together (serverless GPU spill), and ITRI (工研院) / Academia Sinica (中研院) (academic LaMDAgent-style work). Demonstrates Rust+Python systems, agent loops, distributed GPU dispatch, and public benchmark eval.
- Co-builders: anyone who wants terraform apply ergonomics for LoRA fine-tuning over a personal-device mesh; fork-friendly Tier 1 surface (~300 LOC entry points).
The full design, competitor matrix, and risk register live in
docs/02-phantom-training.md (a sanitized copy of the internal spec). The
3-bullet version:
- M2: Unsloth backend, real LoRA on M-series, eval pipeline.
- M3 W11-12: agent hyperparam search, cross-device dispatch, skill publish loop, 1 end-to-end demo (phantom train --skill rust-coder).
- Post-M3: DPO / preference learning, model registry, 9-Agent Landscape benchmark auto-runs.
Apache-2.0. © 2026 Mark Lai (markl-a).