Scalable Auditable Intelligence through Neural Grafting
A framework for controlled, modular, and auditable AI growth.
SAINT-G is a research framework for growing AI systems through small, validated, recomposable neural grafts instead of opaque monolithic retraining.
The long-term thesis is simple:
The safer path to more capable AI may not be only making models larger.
It may be making growth modular, testable, reversible, and governed.
SAINT-G is designed around a different unit of progress:
base model
+ candidate grafts
+ validation gates
+ retention checks
+ safety checks
+ rollback
+ audit trail
+ periodic consolidation
The first backbone is drm_transformer, a custom geometric Transformer based on Directional Relational Manifolds. The first growth method is DRM grafting, but the broader project is now SAINT-G: Scalable Auditable Intelligence through Neural Grafting.
- Why This Exists
- Core Idea
- SAINT-G vs Traditional Training
- Architecture
- Current Research Stage
- Quick Start
- DRM Transformer Bridge
- Scalability
- Continual Growth
- What This Does Not Claim
- Roadmap
- License
Today, model improvement is usually treated as a dense training problem: update huge tensors, store huge optimizer state, publish a new monolithic checkpoint, and hope the behavioral changes are acceptable.
That works, but it is hard to audit.
SAINT-G explores another path:
freeze most of the model
find where growth may help
train compact graft candidates
validate them against the composed model
accept only what improves real metrics
keep every change removable and traceable
The goal is not merely parameter efficiency. The goal is controlled growth:
- every graft has metadata, metrics, hashes, and provenance;
- every accepted change can be recomposed and evaluated;
- every risky or regressive graft can be removed;
- every consolidation step can be audited;
- every gain is compared against strong baselines.
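As a minimal sketch of the first bullet above, the record below shows one way a graft could carry metadata, metrics, hashes, and provenance. The class name, field names, and sample values are all hypothetical and are not taken from the SAINT-G codebase.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class GraftRecord:
    """Hypothetical registry entry; every field name here is illustrative."""
    graft_id: str
    target_module: str
    payload_sha256: str
    parent_checkpoint: str
    metrics: dict = field(default_factory=dict)

    def fingerprint(self) -> str:
        # Hash a canonical JSON form so identical records always match.
        canonical = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Placeholder payload and placeholder numbers, not real results.
payload = b"serialized graft tensors would go here"
record = GraftRecord(
    graft_id="g-0001",
    target_module="layers.3.mlp",
    payload_sha256=hashlib.sha256(payload).hexdigest(),
    parent_checkpoint="base-5m",
    metrics={"val_loss_before": 3.20, "val_loss_after": 3.15},
)
print(record.graft_id, record.fingerprint()[:12])
```

Hashing a canonical JSON form means any edit to the metadata or metrics changes the fingerprint, which is what makes the record usable in an audit trail.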
The current strongest technical object is a neural graft:
Delta W = A Phi B
Where:
- W is a frozen target matrix or module;
- A projects into the graft space;
- Phi is the compact trainable operator;
- B projects back to the target space;
- Delta W is applied by hook, sparse update, or consolidation.
In the DRM experiments, grafts are trained, validated, accepted/rejected, and stored as recomposable artifacts.
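The formula above can be illustrated with a small pure-Python sketch. The shapes are an assumption made for illustration: W is 3x3, A is 3x2, Phi is 2x2, and B is 2x3. The helper names are hypothetical, and the values are chosen so that the floating-point arithmetic is exact.

```python
def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def graft_delta(a, phi, b):
    """Delta W = A Phi B."""
    return matmul(matmul(a, phi), b)


def add(w, delta, sign=1.0):
    """Return w + sign * delta without modifying w."""
    return [[wi + sign * di for wi, di in zip(wr, dr)] for wr, dr in zip(w, delta)]


# Frozen 3x3 target and a rank-2 graft (illustrative values only).
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
A = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
Phi = [[0.5, 0.0], [0.0, -0.25]]
B = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

delta = graft_delta(A, Phi, B)
composed = add(W, delta)                    # graft applied
restored = add(composed, delta, sign=-1.0)  # graft removed again
print(composed)
print(restored == W)
```

Subtracting a delta is only bit-exact for values like these. Keeping W frozen and composing the delta on demand avoids relying on that.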
Variants explored so far include:
- dense Phi;
- diagonal Phi;
- upper triangular Phi;
- Hadamard Phi;
- low-rank Phi;
- least-squares initialized Phi;
- Phi with sparse residual;
- trainable A/B under a parameter cap;
- staged graft growth;
- validation-routed graft selection;
- fine-grained second-stage growth.
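To make the cost differences between some of these variants concrete, the sketch below counts trainable entries in an r x r Phi. It assumes the upper triangular variant includes the diagonal and that the low-rank variant factors Phi as U V with U of shape r x k and V of shape k x r. Both are assumptions for illustration, and the function name is hypothetical.

```python
def phi_param_count(variant: str, r: int, k: int = 1) -> int:
    """Trainable entries in an r x r Phi for a few of the listed variants."""
    if variant == "dense":
        return r * r
    if variant == "diagonal":
        return r
    if variant == "upper_triangular":
        # Assumes the diagonal is included.
        return r * (r + 1) // 2
    if variant == "low_rank":
        # Assumes Phi = U V with U (r x k) and V (k x r).
        return 2 * r * k
    raise ValueError(f"unknown variant: {variant}")


for name in ("dense", "diagonal", "upper_triangular", "low_rank"):
    print(name, phi_param_count(name, r=64, k=4))
```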
| Component | Traditional full training | LoRA/QLoRA | SAINT-G |
|---|---|---|---|
| Base weights | updated | frozen or quantized | frozen by default |
| Trainable object | full tensors | low-rank adapter | validated graft |
| Delta shape | dense | low-rank | structured A Phi B / graft block |
| Selection | all layers or manual | target modules | routing + validation gates |
| Acceptance | final training objective | adapter validation | composed-model validation |
| Checkpoint | full model or adapter | adapter | graft artifact + registry metadata |
| Growth | fixed retraining run | task adaptation | progressive, reversible growth |
| Auditability | low | medium | design goal |
SAINT-G does not assume it beats LoRA or QLoRA. Those are required baselines. The project advances only where SAINT-G shows an advantage in at least one serious axis: memory, checkpoint size, gain per parameter, reversibility, validation-gated growth, or auditability.
data / evals / safety checks
|
v
+--------------------+
| sensitivity maps |
+--------------------+
|
v
+--------------------+
| candidate router |
+--------------------+
|
v
frozen base ---- target layer/module ---- candidate grafts
|
v
+--------------------+
| train graft |
+--------------------+
|
v
+--------------------+
| composed validation|
+--------------------+
|
accept / reject / defer
|
v
+--------------------+
| graft registry |
+--------------------+
|
v
+--------------------+
| rollback / merge |
+--------------------+
Main modules:
saint/
adapters/ DRM, Hugging Face, graft application
blocks/ block partitioning and reconstruction
checkpoints/ compact/sharded payloads and checksums
codebook/ block dictionaries and reuse
memory/ memory estimation and dtype planning
routing/ budget, sensitivity, validation rerank
sensitivity/ gradient, Fisher, activation and proxy maps
training/ toy tasks, linear tasks, mini-transformer tasks
cli/ runtime commands
SAINT-G has moved through several layers of validation:
- traditional LLM training paradigm documentation;
- block-codebook reconstruction;
- routed sparse delta training;
- linear-layer learning benchmarks;
- mini-transformer experiments;
- sensitivity maps;
- robust and scalable checkpoint formats;
- Hugging Face small-model bridge;
- 3B and 14B partial adaptation probes;
- DRM progressive grafting;
- Phi/graft variants;
- full DRM 125M smoke baseline;
- DRM 5M + grafted-to-125M comparison path.
The current bridge is:
DRM full 125M/350M
vs
DRM 5M + SAINT-G grafted
vs
GPT-2/OPT size-band calibration
Recent Phase 16 results showed that staged grafting can produce small but real validation gains with exact recomposition:
base DRM 5M
-> 4 accepted grafts
-> fine-grained G2 accepted
-> checkpoint recomposes with zero drift
This does not mean a 5M model has reached full 125M quality. It means the growth path is operational and measurable.
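One way to check exact recomposition is to rebuild the composed weights from the frozen base plus the stored deltas and compare a digest against the one recorded at acceptance time. The sketch below assumes that "zero drift" means byte-identical weights. The helper names and the tiny matrices are hypothetical.

```python
import hashlib
import struct


def weights_digest(matrix):
    """SHA-256 over the float64 bytes of a matrix stored as lists of rows."""
    h = hashlib.sha256()
    for row in matrix:
        h.update(struct.pack(f"<{len(row)}d", *row))
    return h.hexdigest()


def recompose(base, deltas):
    """Add stored deltas to a copy of the frozen base, in registry order."""
    out = [row[:] for row in base]
    for delta in deltas:
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                out[i][j] += value
    return out


base = [[1.0, 2.0], [3.0, 4.0]]
accepted = [
    [[0.5, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, -0.25]],
]

digest_at_acceptance = weights_digest(recompose(base, accepted))
digest_after_reload = weights_digest(recompose(base, accepted))
drift = digest_at_acceptance != digest_after_reload
print("drift detected:", drift)
```

Applying deltas in a fixed registry order matters here, because floating-point addition is not associative and a different order could change the bytes.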
Create an environment (Windows PowerShell):
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
Linux equivalent:
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
Run the CLI (the command is the same on Windows and Linux):
python -m saint.cli --help
Run tests:
python -m pytest
Inspect a small runtime command:
python -m saint.cli estimate --help
The current full-model comparison uses real drm_transformer scaling configs:
configs/scaling/multilingual/125m.yaml
configs/scaling/multilingual/350m.yaml
Prepare the 350M dataset once:
python scripts/prepare_multilingual_data.py `
--output-dir data/multilingual_350m `
--max-tokens 7000000000 `
--vocab-size 50000 `
--langs en,pt,es,fr,de
Linux equivalent:
python scripts/prepare_multilingual_data.py \
--output-dir data/multilingual_350m \
--max-tokens 7000000000 \
--vocab-size 50000 \
--langs en,pt,es,fr,de
Finalize and clean raw shards:
python scripts/prepare_multilingual_data.py `
--output-dir data/multilingual_350m `
--vocab-size 50000 `
--finalize --clean-raw
Linux equivalent:
python scripts/prepare_multilingual_data.py \
--output-dir data/multilingual_350m \
--vocab-size 50000 \
--finalize --clean-raw
Derive the 125M dataset:
python scripts/prepare_multilingual_data.py `
--derive-subset-from data/multilingual_350m `
--output-dir data/multilingual_125m `
--max-tokens 3500000000 `
--subset-copy-mode hardlink
Linux equivalent:
python scripts/prepare_multilingual_data.py \
--derive-subset-from data/multilingual_350m \
--output-dir data/multilingual_125m \
--max-tokens 3500000000 \
--subset-copy-mode hardlink
Smoke test the full 125M DRM:
python scripts/train_distributed.py `
--config configs/scaling/multilingual/125m.yaml `
--device cuda `
--override batch_size=1 gradient_accumulation_steps=8 total_tokens=819200 save_interval=100 eval_interval=100 log_interval=10 save_dir=checkpoints/multilingual_125m/smoke_100
Linux equivalent:
python scripts/train_distributed.py \
--config configs/scaling/multilingual/125m.yaml \
--device cuda \
--override batch_size=1 gradient_accumulation_steps=8 total_tokens=819200 save_interval=100 eval_interval=100 log_interval=10 save_dir=checkpoints/multilingual_125m/smoke_100
SAINT-G is designed to scale in two ways.
On a consumer GPU, the priority is controlled memory:
- frozen base model;
- micro-batch 1;
- sparse or compact deltas;
- checkpoint payloads that avoid dense materialization;
- routed training instead of full updates;
- cheap validation before expensive consolidation.
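A back-of-the-envelope estimate shows why freezing the base matters for memory. The sketch assumes an Adam-style optimizer with fp32 trainable state (weight, gradient, and two moments per parameter) and a frozen base stored at 2 bytes per parameter. It ignores activations and framework overhead. The function name and the graft size are hypothetical, and this is not the output of the saint.cli estimate command.

```python
def training_state_bytes(trainable, frozen, trainable_bytes=4, frozen_bytes=2):
    """Rough bytes of weights and optimizer state, ignoring activations.

    Trainable parameters hold a weight, a gradient, and two Adam moments.
    Frozen parameters hold only their weight.
    """
    per_trainable = trainable_bytes * 4  # weight, grad, Adam m, Adam v
    return trainable * per_trainable + frozen * frozen_bytes


total = 125_000_000  # parameters in the base model
graft = 500_000      # hypothetical trainable graft parameters

dense = training_state_bytes(trainable=total, frozen=0)
grafted = training_state_bytes(trainable=graft, frozen=total)
print(f"dense:   {dense / 2**30:.2f} GiB")
print(f"grafted: {grafted / 2**30:.2f} GiB")
```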
On a cluster, the main opportunity is parallel graft search:
- GPU 1 tests graft candidates for layer A;
- GPU 2 tests graft candidates for layer B;
- GPU 3 runs LoRA/dense controls;
- GPU 4 validates old examples for regression;
- a coordinator approves, rejects, defers, or retries grafts.
This is not one huge synchronized dense run. It is distributed search for useful growth modules.
base model frozen
|
v
workers train candidate grafts
|
v
central validator measures composed gain
|
v
accept / reject / defer
|
v
recomposable checkpoint
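The worker and validator flow above can be sketched with a thread pool standing in for separate GPUs. The worker here returns a pre-set fake loss instead of training anything, and the function names, candidate fields, and acceptance threshold are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def train_candidate(candidate):
    """Stand-in worker: a real one would train a graft on its own device."""
    # Placeholder result; "fake_loss" replaces a real composed validation run.
    return {"name": candidate["name"], "composed_loss": candidate["fake_loss"]}


def coordinate(candidates, base_loss, min_gain=0.01):
    """Run workers in parallel, then accept or reject centrally."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(train_candidate, candidates))  # keeps input order
    accepted, rejected = [], []
    for result in results:
        gain = base_loss - result["composed_loss"]
        (accepted if gain >= min_gain else rejected).append(result["name"])
    return accepted, rejected


candidates = [
    {"name": "layer_a", "fake_loss": 2.90},
    {"name": "layer_b", "fake_loss": 3.05},
    {"name": "layer_c", "fake_loss": 2.999},
]
accepted, rejected = coordinate(candidates, base_loss=3.0)
print("accepted:", accepted)
print("rejected:", rejected)
```

Because each candidate is trained independently against the same frozen base, the workers need no gradient synchronization. Only the coordinator sees all results.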
If validation-gated grafting works at 125M/350M and later at cluster scale, SAINT-G becomes a continual growth system:
base model
+ verified graft registry
+ distributed graft search
+ continual safety gates
+ rollback
+ distillation
+ governance layer
Planned components:
- Graft Registry: versioned metadata, datasets, evals, hashes, compatibility.
- Rollback: remove one graft without discarding the whole model.
- Graft Distillation: consolidate many grafts into a new compact base.
- Safety-Gated Growth: quality, retention, safety, interpretability, conflict, rollback gates.
- Specialized Graft Libraries: code, math, Portuguese, legal, medical, safety, tool use.
- Auditable Composition: identify which graft changed which metric or behavior.
- Governed Self-Improvement: candidates can be proposed automatically, but accepted only through external validation and policy gates.
The larger research question:
Can an AI system improve continuously without losing traceability,
correctability, and control?
SAINT-G does not currently claim:
- full 70B pretraining on a consumer GPU;
- universal superiority over LoRA/QLoRA;
- replacement for dense pretraining;
- proof that grafting beats full training in general;
- autonomous self-modification without governance.
The honest claim is narrower:
SAINT-G is a research system for testing controlled AI growth through
small, validated, auditable, and reversible neural grafts.
Near-term:
- Finish the full DRM 125M/350M vs grafted comparison.
- Replicate with more seeds, splits, and at least one additional config.
- Compare against stronger LoRA/QLoRA/full-module/sparse baselines.
- Add retention, regression, and safety/control evals.
- Formalize the DRM-Growth Protocol.
- Prototype DRM-GOS: distributed validation-gated graft search.
Long-term:
- 1.3B bridge before 70B;
- 70B partial adaptation with quantized/frozen base;
- cluster-scale online graft search;
- graft registry and rollback;
- continual safety-gated growth;
- distillation of accumulated grafts;
- publication-quality reports.
Full roadmap:
docs/roadmap.md
docs/process/
SAINT-G is available under a dual-license model:
- AGPL-3.0 for open-source use compatible with AGPL obligations.
- Commercial license for proprietary, closed-source, SaaS, OEM, or other deployments that need different terms.
For commercial licensing, contact felipe@truthagi.ai.
See:
- LICENSE
- LICENSE-COMMERCIAL.md
- COPYRIGHT
- CLA.md
- CONTRIBUTING.md
- SECURITY.md
- PRIOR_ART.md
