QLoRA Legal Support Assistant

This repository contains a learning-focused LoRA fine-tuning setup for an English legal support assistant. The current default config is sized for local experimentation on a Mac with Qwen/Qwen2.5-0.5B-Instruct. Quantization is implemented in the code path but disabled by default, because 4-bit QLoRA with bitsandbytes requires a CUDA environment.

What the project does

loads an instruction-tuned base model from Hugging Face
applies LoRA adapters with PEFT
formats chat-style JSONL data into the model's chat template
fine-tunes with trl.SFTTrainer
saves the LoRA adapter separately from the base model
reloads the base model + adapter for inference
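
The chat-template formatting step in the list above can be sketched roughly as follows. This is an illustration, not the code in src/dataset.py: the function name format_example is hypothetical, and it assumes the tokenizer is a Hugging Face tokenizer that provides apply_chat_template, as the tokenizers shipped with chat models generally do.

```python
from typing import Any, Dict


def format_example(example: Dict[str, Any], tokenizer: Any) -> Dict[str, str]:
    """Render one {"messages": [...]} record into a single training string.

    `tokenizer` is assumed to be a Hugging Face tokenizer with a chat template.
    Hypothetical helper; the repository's own implementation may differ.
    """
    text = tokenizer.apply_chat_template(
        example["messages"],
        tokenize=False,               # return a string rather than token ids
        add_generation_prompt=False,  # the assistant turn is already in the data
    )
    return {"text": text}
```

With a datasets.Dataset, a helper like this could be applied through Dataset.map, passing the tokenizer via fn_kwargs.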

Project structure

qlora-project/
├── configs/
│   └── qlora.yaml
├── data/
│   └── processed/
│       └── train.jsonl
├── models/
│   ├── base/
│   └── lora/
├── outputs/
│   ├── checkpoints/
│   └── logs/
├── src/
│   ├── dataset.py
│   ├── inference.py
│   ├── model.py
│   ├── prompts.py
│   ├── train.py
│   └── utils.py
├── .gitignore
├── README.md
└── requirements.txt

Install

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Training data format

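Each line in data/processed/train.jsonl must be a single JSON object with a messages array. The example below is pretty-printed for readability; in the file, each record occupies exactly one line: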

{
  "messages": [
    {"role": "system", "content": "System instruction"},
    {"role": "user", "content": "User question"},
    {"role": "assistant", "content": "Assistant answer"}
  ]
}

Assistant responses should stay within general legal information, note that laws differ by jurisdiction, and recommend consulting a licensed attorney for specific advice.
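
A quick structural check before training can catch malformed records early. The sketch below is not part of the repository. The function name validate_jsonl_lines, the allowed role set, and the rule that the last message must come from the assistant are assumptions based on the example record above.

```python
import json
from typing import Iterable, List

# Roles seen in the example record; assumed to be the only valid ones.
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_jsonl_lines(lines: Iterable[str]) -> List[str]:
    """Return human-readable problems; an empty list means every record passed."""
    problems: List[str] = []
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {lineno}: invalid JSON ({exc.msg})")
            continue
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append(f"line {lineno}: missing or empty 'messages' array")
            continue
        for i, msg in enumerate(messages):
            if not isinstance(msg, dict):
                problems.append(f"line {lineno}: message {i} is not an object")
                continue
            role = msg.get("role")
            if not isinstance(role, str) or role not in ALLOWED_ROLES:
                problems.append(f"line {lineno}: message {i} has unexpected role {role!r}")
            if not isinstance(msg.get("content"), str):
                problems.append(f"line {lineno}: message {i} has non-string content")
        last = messages[-1]
        # Assumption: each training record ends with the assistant answer.
        if isinstance(last, dict) and last.get("role") != "assistant":
            problems.append(f"line {lineno}: last message should be the assistant answer")
    return problems
```

An open file handle for data/processed/train.jsonl can be passed directly, since file objects iterate line by line.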

Train

python -m src.train --config configs/qlora.yaml

Notes:

the default config uses Qwen/Qwen2.5-0.5B-Instruct
save_strategy: "no" disables intermediate checkpoints, which avoids the shared-tensor checkpoint error seen during local training
the LoRA adapter is saved to models/lora/legal-support-qwen-lora

Run inference

python -m src.inference \
  --config configs/qlora.yaml \
  --question "My employer fired me without warning. What general legal issues should I review?"

Config overview

configs/qlora.yaml controls:

model name and dtype
optional quantization settings
LoRA hyperparameters
training arguments
inference generation settings
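
To give a feel for what the LoRA rank trades off: for one adapted weight matrix of shape d_out x d_in, rank-r LoRA trains two small matrices (r x d_in and d_out x r) instead of the full matrix. The sketch below does that arithmetic. The function name and the layer size are illustrative and are not read from configs/qlora.yaml.

```python
def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds for one linear layer.

    A has shape (r x d_in) and B has shape (d_out x r), so the total is r * (d_in + d_out).
    """
    return r * (d_in + d_out)


if __name__ == "__main__":
    d_in = d_out = 1024  # illustrative layer size, not taken from any model config
    for r in (4, 8, 16):
        added = lora_trainable_params(d_in, d_out, r)
        share = added / (d_in * d_out)
        print(f"r={r:>2}: {added} trainable params ({share:.2%} of the full matrix)")
```

For a 1024 x 1024 layer, r = 8 gives 16,384 trainable parameters, about 1.6% of the full matrix, which is why adapter files stay small relative to the base model.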

To move from local LoRA to true QLoRA later:

switch to a CUDA environment
set quantization.enabled: true
update model.torch_dtype and training flags for the target hardware
replace the small base model with your target Qwen checkpoint
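
As a rough sketch of the quantization switch in the steps above, the helper below turns a quantization config section into keyword arguments for transformers.BitsAndBytesConfig. Only quantization.enabled comes from this README. The other keys (quant_type, double_quant, compute_dtype), their defaults, and the function name build_bnb_kwargs are assumptions chosen for illustration.

```python
from typing import Any, Dict


def build_bnb_kwargs(quant_cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Map a `quantization` config section to BitsAndBytesConfig keyword arguments.

    Returns an empty dict when quantization is disabled, so the caller can skip
    bitsandbytes entirely on machines without CUDA.
    """
    if not quant_cfg.get("enabled", False):
        return {}
    return {
        "load_in_4bit": True,
        # Hypothetical config keys below; adjust to match the real YAML layout.
        "bnb_4bit_quant_type": quant_cfg.get("quant_type", "nf4"),
        "bnb_4bit_use_double_quant": quant_cfg.get("double_quant", True),
        # Kept as a dtype name here; convert with getattr(torch, name) before use if needed.
        "bnb_4bit_compute_dtype": quant_cfg.get("compute_dtype", "bfloat16"),
    }
```

On a CUDA machine, a non-empty result could be expanded into BitsAndBytesConfig(**kwargs) and passed to AutoModelForCausalLM.from_pretrained through its quantization_config argument.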
