AXEN-M (Attention eXtended Efficient Network - Model)

Overview

AXEN-M is a transformer-based framework designed for large-scale language modeling, fine-tuning, and efficient attention handling. Built by Vinkura, it integrates Infini-Attention with LoRA-based fine-tuning to provide scalability and optimized performance for long-context tasks.

Features

Advanced Attention Mechanisms
- Infini-Attention implementation for extended sequence handling
- Optimized memory usage with sliding window attention
- Support for sequences up to 32K tokens
Model Architecture
- LoRA (Low-Rank Adaptation) integration for efficient fine-tuning
- Compatible with LLaMA model architectures
- Modular transformer components
Training Capabilities
- 4-bit quantization support
- Gradient checkpointing
- Mixed-precision training
- Distributed training support
Development Tools
- Comprehensive data preprocessing pipeline
- Attention pattern visualization
- Performance metrics and evaluation
- Extensive test coverage

AXEN-M Installation Guide

Prerequisites

System Requirements

Python 3.8 or higher
CUDA 11.7+ (for GPU support)
16GB RAM minimum (32GB recommended)
Linux, macOS, or Windows (via WSL2)
Git

Required Dependencies

PyTorch 2.0+
CUDA toolkit (for GPU support)
cuDNN (for GPU support)

Project Structure

infini-attention/
│
│  
│
├── configs/                     
│   ├── single_node.yaml         # Config for single-node training
│   ├── two_node.yaml            # Config for two-node distributed training
│   └── zero3_offload.json       # DeepSpeed Zero3 offload config
│
├── scripts/
│   └── train.sh                 # Shell script for training
│
├── src/                         # Source code directory
│   ├── __init__.py
│   ├── main.py                  
│   ├── fine_tune.py             
│   ├── llama_model.py           
│   ├── temp_utils.py            
│   └── train.py                 
│
├── tests/                       
│   └── test_llama_model.py
├   └── test_utils.py 
│
├── .gitignore
├── requirements.txt              # List of dependencies
├── README.md                     # Complete project documentation
└── LICENSE

Installation Methods

Clone the repository:

git clone https://github.com/vinkuraai/axen-m.git
cd axen-m

Create and activate a virtual environment:

python -m venv venv
source venv/bin/activate  # On Linux/macOS
# or
venv\Scripts\activate  # On Windows

Install the package:

# For basic installation
pip install -e .

# For development installation with extra tools
pip install -e ".[dev]"

GPU Support Setup

CUDA Installation

Download CUDA Toolkit 11.7 or higher from NVIDIA website
Install CUDA Toolkit following NVIDIA's instructions
Verify CUDA installation:

nvcc --version

Install GPU Dependencies

pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu117

Installation Verification

Run the verification script:

python -c "from axen_m import Model; print(Model.version_check())"

Common Installation Issues

CUDA Not Found

# Add CUDA to PATH (Linux/macOS)
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

# Add CUDA to PATH (Windows)
set PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\bin;%PATH%

Memory Issues During Installation

# Reduce number of build workers
pip install -e . --no-cache-dir --no-build-isolation

Development Setup

Install development dependencies:

pip install -e ".[dev,test,docs]"

Install pre-commit hooks:

pre-commit install

Setup documentation build environment:

cd docs
make html

Configuration

Environment Variables

# Required for distributed training
export MASTER_ADDR="localhost"
export MASTER_PORT="12355"
export WORLD_SIZE="1"
export LOCAL_RANK="0"

# Optional performance tweaks
export CUDA_LAUNCH_BLOCKING="1"
export CUDA_VISIBLE_DEVICES="0,1"

Project Configuration File

Create config.yaml in your project root:

model:
  type: "llama"
  size: "7b"
  quantization: 4

training:
  batch_size: 32
  gradient_accumulation: 4
  mixed_precision: "fp16"

system:
  num_workers: 4
  pin_memory: true

Troubleshooting

Error: CUDA capability sm_XX not supported

Solution: Update your GPU drivers or modify CUDA architecture flags:

export TORCH_CUDA_ARCH_LIST="7.5;8.0;8.6"
pip install -e .

Error: Memory allocation failed

Solution: Enable gradient checkpointing in your configuration:

model_config = {
    "gradient_checkpointing": True,
    "max_memory": {0: "12GB"}
}

Performance Benchmarks

Model Size	Sequence Length	Memory Usage	Training Speed	Inference Speed
7B	2048	14GB	32 samples/s	50 tokens/s
7B	8192	20GB	24 samples/s	45 tokens/s
7B	32768	28GB	16 samples/s	35 tokens/s

Contributing

We welcome contributions!

Code style and standards
Pull request process
Development setup
Testing requirements

Testing

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/

License

This project is licensed under the MIT License. See LICENSE for details.

Citation

@software{axen_m2025,
  author = {VinkuraAI},
  title = {AXEN-M: Attention eXtended Efficient Network},
  year = {2025},
  publisher = {GitHub},
  url = {https://github.com/vinkuraAI/axen-m}
}

Support

Documentation: https://axen-m.readthedocs.io/
Issue Tracker: GitHub Issues
Discussions: GitHub Discussions

Acknowledgments

The LLaMA team for their foundational work
The LoRA authors for their efficient fine-tuning approach
The open-source AI community

_{Built by Vinkura with a focus on efficiency and scalability.}

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
configs		configs
scripts		scripts
src		src
tests		tests
LICENSE		LICENSE
README.md		README.md

Folders and files

Latest commit

History

Repository files navigation

AXEN-M (Attention eXtended Efficient Network - Model)

Overview

Features

AXEN-M Installation Guide

Prerequisites

System Requirements

Required Dependencies

Project Structure

Installation Methods

GPU Support Setup

CUDA Installation

Install GPU Dependencies

Installation Verification

Common Installation Issues

CUDA Not Found

Memory Issues During Installation

Development Setup

Configuration

Environment Variables

Project Configuration File

Troubleshooting

Error: CUDA capability sm_XX not supported

Error: Memory allocation failed

Performance Benchmarks

Contributing

Testing

License

Citation

Support

Acknowledgments

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages