Skip to content

MAiTL-Group/TIF

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Thinking in Flow: A Dissipative Stabilization Operator for Robust Autoregressive Reasoning

ICML 2026

This is the official repository for "Thinking in Flow: A Dissipative Stabilization Operator for Robust Autoregressive Reasoning" (ICML 2026). We propose a Neural ODE-based controller that adaptively regulates token-level reasoning depth, injected into the residual stream of a transformer backbone.

The implementation is built on LLaMA-Factory.

Model

The trained checkpoint is available on HuggingFace:

➡️ IKEJAY/gsm8k-ode-llama32-3b

Installation

pip install -e .
pip install ".[deepspeed,vllm]"

Training

Train the ODE-augmented LLaMA-3.2-3B on GSM8K:

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 llamafactory-cli train configs/train.yaml

Key configuration:

  • Base model: LLM-Research/Llama-3.2-3B (downloaded from HuggingFace)
  • Dataset: GSM8K train set (gsm8k_train_clean)
  • ODE: Euler solver, 1 step, α=100, progressive unfreezing at epoch 2
  • DeepSpeed: ZeRO Stage 3

Evaluation

Run inference with vLLM:

CUDA_VISIBLE_DEVICES=0 python scripts/vllm_infer.py --config configs/eval.yaml

Evaluate on GSM8K test set (gsm8k_test_clean) with greedy decoding.

Project Structure

├── configs/
│   ├── train.yaml          # Training configuration
│   ├── eval.yaml           # Evaluation configuration
│   └── ds_z3_config.json   # DeepSpeed ZeRO-3 config
├── data/
│   ├── gsm8k_train_clean.json   # GSM8K training data
│   ├── gsm8k_test_clean.json    # GSM8K test data
│   └── dataset_info.json        # Dataset registry (sharegpt format)
├── scripts/
│   ├── vllm_infer.py            # vLLM inference script
│   └── vllm_ode_patch.py        # ODE solvers for vLLM
└── src/                         # Core source code (LLaMA-Factory + ODE module)
    └── llamafactory/
        └── model/ode/           # ODE controller & wrapper

Key Components

  • src/llamafactory/model/ode/controller.py — Neural ODE g_theta MLP with Euler/RK4/Heun solvers
  • src/llamafactory/model/ode/wrapper.py — ODEModelWrapper that injects ODE states into transformer residual stream
  • scripts/vllm_ode_patch.py — Standalone ODE solvers for vLLM inference compatibility

Citation

@inproceedings{thinkinginflow2026,
  title={Thinking in Flow: A Dissipative Stabilization Operator for Robust Autoregressive Reasoning},
  author={},
  booktitle={International Conference on Machine Learning},
  year={2026}
}

Acknowledgments

This implementation is built on LLaMA-Factory. We thank the LLaMA-Factory team for their excellent framework.

License

This project is released under the Apache 2.0 License.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages