A batched implementation of an affine/linear transformation and the minGRU as proposed in "Were RNNs All We Needed?". Useful for neural additive/ensemble modeling of tabular and longitudinal data.
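As a rough illustration of what "batched" can mean here, the sketch below applies an independent affine map to each member of a batch, for example one small model per feature in an additive model. The function name `batched_affine` and the nested-list layout `[B][N][I]` are assumptions made for this example and are not taken from linear.py. On tensors, the same computation would typically be a single batched matrix multiply plus a broadcast bias.

```python
def batched_affine(x, weight, bias):
    """Apply B independent affine maps: y[b] = x[b] @ weight[b] + bias[b].

    Hypothetical layout (an assumption, not the repository's API):
      x:      [B][N][I]  B members, N samples, I input features
      weight: [B][I][O]  one weight matrix per member
      bias:   [B][O]     one bias vector per member
    Returns a nested list of shape [B][N][O].
    """
    out = []
    for x_b, w_b, b_b in zip(x, weight, bias):
        columns = list(zip(*w_b))  # O columns, each of length I
        rows = []
        for row in x_b:
            rows.append([
                sum(x_i * w_i for x_i, w_i in zip(row, col)) + b_o
                for col, b_o in zip(columns, b_b)
            ])
        out.append(rows)
    return out


# Two members, one sample each, two inputs, one output.
x = [[[1.0, 2.0]], [[3.0, 4.0]]]
weight = [[[1.0], [1.0]], [[0.5], [-1.0]]]
bias = [[0.0], [1.0]]
y = batched_affine(x, weight, bias)
```

Each member only ever sees its own slice of `x`, which is what keeps the per-member contributions separable in an additive or ensemble setting.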
The initialization used in linear.py is adopted from the PyTorch nn.Linear initialization (BSD-3-Clause license). The code in gru.py is adapted from Phil Wang's implementation of the minGRU (MIT License), which itself incorporates Franz Heinsen's log-space implementation of the parallel scan algorithm (MIT License). The minGRU was proposed in "Were RNNs All We Needed?", which simplifies the GRU to a minimal variant that can be trained efficiently in parallel. The implemented cell uses the parallel scan proposed in "Efficient Parallelization of a Ubiquitous Sequential Computation".
License texts are available here.
Last accessed: 2025-12-13
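The minGRU recurrence and the log-space scan mentioned above can be sketched in plain Python as follows. This is a minimal scalar illustration, not the code in gru.py: the function names are hypothetical, the gate values `z` and candidate states `h_tilde` are made up, and the explicit loops stand in for the cumulative-sum and cumulative-logsumexp operations that run in parallel on tensors. The minGRU update `h_t = (1 - z_t) * h_{t-1} + z_t * h_tilde_t` is a linear recurrence `h_t = a_t * h_{t-1} + b_t`, which is the form the scan handles. The log-space version assumes all values are strictly positive.

```python
import math


def logaddexp(x, y):
    """Numerically stable log(exp(x) + exp(y)); handles -inf inputs."""
    m = max(x, y)
    if m == -math.inf:
        return -math.inf
    return m + math.log(math.exp(x - m) + math.exp(y - m))


def sequential_scan(a, b, h0):
    """Reference: h_t = a_t * h_{t-1} + b_t, one step at a time."""
    h, out = h0, []
    for a_t, b_t in zip(a, b):
        h = a_t * h + b_t
        out.append(h)
    return out


def log_space_scan(log_a, log_b, log_h0):
    """Same recurrence via prefix sums in log space."""
    # a_star[t] = sum of log a_k for k <= t, with a_star[0] = 0 for h0.
    a_star = [0.0]
    for la in log_a:
        a_star.append(a_star[-1] + la)
    log_values = [log_h0] + list(log_b)
    out, running = [], -math.inf
    for t in range(len(log_values)):
        # Running logsumexp of (log b_k - a_star[k]) over k <= t.
        running = logaddexp(running, log_values[t] - a_star[t])
        if t > 0:
            out.append(math.exp(a_star[t] + running))
    return out


# Illustrative (made-up) gates and candidate states for three time steps.
z = [0.2, 0.5, 0.9]
h_tilde = [1.0, 2.0, 0.5]
h0 = 0.5

a = [1.0 - z_t for z_t in z]                     # carry-over coefficients
b = [z_t * c_t for z_t, c_t in zip(z, h_tilde)]  # gated candidates

h_seq = sequential_scan(a, b, h0)
h_log = log_space_scan(
    [math.log(v) for v in a], [math.log(v) for v in b], math.log(h0)
)
```

Both functions produce the same hidden states up to floating-point error. The log-space form is the useful one in practice because prefix sums can be evaluated in parallel over the sequence, and working with logarithms avoids underflow when many coefficients below one are multiplied together.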
$ pip install git+https://github.com/kachelriess/rnam

RNAM is briefly demonstrated in example.ipynb.
@inproceedings{Feng2024WereRA,
title = {Were RNNs All We Needed?},
author = {Leo Feng and Frederick Tung and Mohamed Osama Ahmed and Yoshua Bengio and Hossein Hajimirsadeghi},
year = {2024},
url = {https://api.semanticscholar.org/CorpusID:273025630},
}

@misc{heinsen2023efficientparallelizationubiquitoussequential,
title = {Efficient Parallelization of a Ubiquitous Sequential Computation},
author = {Franz A. Heinsen},
year = {2023},
url = {https://arxiv.org/abs/2311.06281v4},
}

@misc{rnam2025,
title = {RNAM: Recurrent Neural Additive Model},
author = {Kachelrieß, Lucas},
year = {2025},
note = {Software available at \url{https://github.com/kachelriess/rnam}},
}