This is the official repository for "Thinking in Flow: A Dissipative Stabilization Operator for Robust Autoregressive Reasoning" (ICML 2026). We propose a Neural ODE-based controller that adaptively regulates token-level reasoning depth, injected into the residual stream of a transformer backbone.
The implementation is built on LLaMA-Factory.
The trained checkpoint is available on HuggingFace:
➡️ IKEJAY/gsm8k-ode-llama32-3b
pip install -e .
pip install ".[deepspeed,vllm]"
Train the ODE-augmented LLaMA-3.2-3B on GSM8K:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 llamafactory-cli train configs/train.yaml
Key configuration:
- Base model: LLM-Research/Llama-3.2-3B (downloaded from HuggingFace)
- Dataset: GSM8K train set (gsm8k_train_clean)
- ODE: Euler solver, 1 step, α=100, progressive unfreezing at epoch 2
- DeepSpeed: ZeRO Stage 3
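The "progressive unfreezing at epoch 2" setting can be pictured with a small sketch. This is not the repository's code: it assumes, purely for illustration, that the ODE parameters train from the start while the backbone stays frozen until the unfreeze epoch, and that epochs are counted from 0. The names `apply_progressive_unfreezing`, `FakeParam`, and the `ode_controller` name tag are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Tuple


@dataclass
class FakeParam:
    """Stand-in for a framework parameter; only `requires_grad` matters here."""
    requires_grad: bool = True


def apply_progressive_unfreezing(
    named_params: Iterable[Tuple[str, Any]],
    epoch: int,
    unfreeze_epoch: int = 2,
    ode_tag: str = "ode_controller",  # hypothetical name tag for ODE parameters
) -> int:
    """Toggle `requires_grad` per parameter and return how many are trainable.

    Assumption: ODE parameters are always trainable; everything else is
    frozen until `epoch >= unfreeze_epoch`.
    """
    trainable = 0
    for name, param in named_params:
        # The tag must be specific: a bare "ode" would also match "model".
        is_ode = ode_tag in name
        param.requires_grad = is_ode or epoch >= unfreeze_epoch
        trainable += int(param.requires_grad)
    return trainable


params = [
    ("model.layers.0.mlp.weight", FakeParam()),
    ("ode_controller.fc1.weight", FakeParam()),
]
early = apply_progressive_unfreezing(params, epoch=0)
early_flags = [p.requires_grad for _, p in params]
late = apply_progressive_unfreezing(params, epoch=2)
late_flags = [p.requires_grad for _, p in params]
```

Because the function only touches `requires_grad`, it would work the same way on any `(name, parameter)` pairs that expose that attribute. Check configs/train.yaml for the schedule the training run actually uses.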
Run inference with vLLM:
CUDA_VISIBLE_DEVICES=0 python scripts/vllm_infer.py --config configs/eval.yaml
Evaluate on the GSM8K test set (gsm8k_test_clean) with greedy decoding.
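For scoring the generated outputs, a minimal exact-match sketch is shown below. It is not part of this repository: it assumes references follow the original GSM8K convention of ending with `#### <answer>`, which may not hold for the cleaned files, and the helper names `extract_final_number`, `is_correct`, and `accuracy` are hypothetical.

```python
import re
from typing import List, Optional

# Integers or decimals, optionally negative, with optional thousands separators.
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def extract_final_number(text: str) -> Optional[str]:
    """Return the last number in `text` with commas removed, or None."""
    # Prefer whatever follows the last "####" marker when one is present.
    tail = text.split("####")[-1] if "####" in text else text
    matches = _NUMBER.findall(tail)
    if not matches:
        return None
    return matches[-1].replace(",", "")


def is_correct(prediction: str, reference: str) -> bool:
    """Compare the final numbers of a prediction and a reference numerically."""
    pred = extract_final_number(prediction)
    ref = extract_final_number(reference)
    if pred is None or ref is None:
        return False
    return float(pred) == float(ref)


def accuracy(predictions: List[str], references: List[str]) -> float:
    """Fraction of predictions whose final number matches the reference."""
    if not predictions:
        return 0.0
    hits = sum(is_correct(p, r) for p, r in zip(predictions, references))
    return hits / len(predictions)
```

Taking the last number in the generation is a common heuristic, but it can misread outputs that keep talking after the answer, so spot-check a few predictions before trusting the aggregate.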
├── configs/
│   ├── train.yaml              # Training configuration
│   ├── eval.yaml               # Evaluation configuration
│   └── ds_z3_config.json       # DeepSpeed ZeRO-3 config
├── data/
│   ├── gsm8k_train_clean.json  # GSM8K training data
│   ├── gsm8k_test_clean.json   # GSM8K test data
│   └── dataset_info.json       # Dataset registry (sharegpt format)
├── scripts/
│   ├── vllm_infer.py           # vLLM inference script
│   └── vllm_ode_patch.py       # ODE solvers for vLLM
└── src/                        # Core source code (LLaMA-Factory + ODE module)
    └── llamafactory/
        └── model/ode/          # ODE controller & wrapper
- src/llamafactory/model/ode/controller.py: Neural ODE g_theta MLP with Euler/RK4/Heun solvers
- src/llamafactory/model/ode/wrapper.py: ODEModelWrapper that injects ODE states into the transformer residual stream
- scripts/vllm_ode_patch.py: Standalone ODE solvers for vLLM inference compatibility
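To make the three solver names concrete, here is a generic fixed-step sketch of Euler, Heun, and classical RK4 for an autonomous system dz/dt = f(z). It is not taken from controller.py: the state is a plain list of floats rather than a tensor, `f` stands in for the learned g_theta, and all function names are hypothetical.

```python
from typing import Callable, List

Vector = List[float]
Field = Callable[[Vector], Vector]


def _axpy(a: float, x: Vector, y: Vector) -> Vector:
    """Return y + a * x elementwise."""
    return [yi + a * xi for xi, yi in zip(x, y)]


def euler_step(f: Field, z: Vector, dt: float) -> Vector:
    """First-order step: z + dt * f(z)."""
    return _axpy(dt, f(z), z)


def heun_step(f: Field, z: Vector, dt: float) -> Vector:
    """Second-order step: average the slope at z and at the Euler prediction."""
    k1 = f(z)
    k2 = f(_axpy(dt, k1, z))
    avg = [(a + b) / 2.0 for a, b in zip(k1, k2)]
    return _axpy(dt, avg, z)


def rk4_step(f: Field, z: Vector, dt: float) -> Vector:
    """Fourth-order step: weighted average of four slope evaluations."""
    k1 = f(z)
    k2 = f(_axpy(dt / 2.0, k1, z))
    k3 = f(_axpy(dt / 2.0, k2, z))
    k4 = f(_axpy(dt, k3, z))
    slope = [(a + 2.0 * b + 2.0 * c + d) / 6.0
             for a, b, c, d in zip(k1, k2, k3, k4)]
    return _axpy(dt, slope, z)


def decay(z: Vector) -> Vector:
    """Toy dissipative field dz/dt = -z, used only to compare the solvers."""
    return [-zi for zi in z]


z0 = [1.0]
z_euler = euler_step(decay, z0, 0.1)
z_heun = heun_step(decay, z0, 0.1)
z_rk4 = rk4_step(decay, z0, 0.1)
```

On this toy field the exact solution after one step is exp(-0.1), and the error shrinks from Euler to Heun to RK4 at the cost of one, two, and four evaluations of `f` per step. That trade-off is the usual reason to prefer a single Euler step when `f` is evaluated for every token.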
@inproceedings{thinkinginflow2026,
title={Thinking in Flow: A Dissipative Stabilization Operator for Robust Autoregressive Reasoning},
author={},
booktitle={International Conference on Machine Learning},
year={2026}
}
This implementation is built on LLaMA-Factory. We thank the LLaMA-Factory team for their excellent framework.
This project is released under the Apache 2.0 License.