human-again/LedgerLens

LedgerLens

AI-powered tool to extract financial transactions from PDF bank/credit card statements and populate Excel templates for tax purposes.

Features

  • Intelligent Parsing: Uses local Llama 3.1 8B model via Ollama to handle varied statement formats
  • Auto-categorization: Automatically separates money in vs money out
  • Batch Processing: Processes multiple PDFs in one run
  • Template Preservation: Maintains your Excel template formatting
  • Privacy: All processing happens locally - your financial data never leaves your machine
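As a rough illustration of the parsing and auto-categorization features, the sketch below turns a model reply into rows with separate money-in and money-out values. The JSON schema (a list of objects with `date`, `description`, and a signed `amount`) and the function name `split_transactions` are assumptions made for this example; the format actually used by ai_parser.py is not shown in this README.

```python
import json
from decimal import Decimal
from typing import List


def split_transactions(model_output: str) -> List[dict]:
    """Parse a JSON array of transactions and split signed amounts.

    Assumed (hypothetical) schema per item:
        {"date": "...", "description": "...", "amount": "-12.50"}
    Positive amounts count as money in, negative amounts as money out.
    """
    rows = []
    for item in json.loads(model_output):
        # Go through str() so floats do not carry binary rounding noise.
        amount = Decimal(str(item["amount"]))
        rows.append({
            "date": item["date"],
            "description": item["description"],
            "money_in": amount if amount > 0 else None,
            "money_out": -amount if amount < 0 else None,
        })
    return rows
```

Using `Decimal` rather than `float` keeps currency values exact, which matters when the totals feed a tax worksheet.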

Prerequisites

  1. Ollama installed (download from ollama.ai)
  2. Python 3.8+
  3. Llama 3.1 8B model (downloaded by the ollama pull step during installation)
  4. Excel template with columns for date, description, money in, money out

Installation

  1. Clone or download this repository

  2. Create a virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt
  4. Ensure Ollama is running and Llama 3.1 8B is available:
ollama pull llama3.1:8b
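To confirm the last step from Python, one option is to ask the local Ollama server which models it has. This is a minimal sketch, assuming Ollama listens on its default address `http://localhost:11434` and using its `/api/tags` endpoint; the helper names are made up for this example and are not part of the project.

```python
import json
import urllib.request
from typing import List

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address


def installed_models(tags_payload: dict) -> List[str]:
    """Return the model names listed in an Ollama /api/tags response body."""
    return [model["name"] for model in tags_payload.get("models", [])]


def model_is_available(name: str, base_url: str = OLLAMA_URL) -> bool:
    """Query the local Ollama server and check whether `name` has been pulled.

    Raises OSError (for example urllib.error.URLError) if the server
    cannot be reached.
    """
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        payload = json.load(resp)
    return name in installed_models(payload)
```

Calling `model_is_available("llama3.1:8b")` returns `False` when the model still needs to be pulled, and raises an `OSError` when Ollama is not running.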

Configuration

Edit config.json to match your Excel template structure:

{
  "pdf_folder": "./statements",
  "output_folder": "./output",
  "excel_template": "./template.xlsx",
  "model_name": "llama3.1:8b",
  "column_mapping": {
    "date": "A",
    "description": "B",
    "money_in": "C",
    "money_out": "D"
  },
  "start_row": 2
}
  • pdf_folder: Folder containing your PDF statements
  • output_folder: Where processed Excel files will be saved
  • excel_template: Path to your Excel template
  • model_name: Ollama model tag used for parsing (must match a model you have pulled)
  • column_mapping: Map transaction fields to Excel column letters
  • start_row: First row to write data to (2 skips a one-line header row)
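A quick sanity check on config.json can catch mapping mistakes before anything is written. The sketch below is an assumption about what such a check could look like; `validate_config` and `load_config` are hypothetical names, not functions from this repository.

```python
import json
import re
from pathlib import Path

REQUIRED_FIELDS = ("date", "description", "money_in", "money_out")


def validate_config(config: dict) -> dict:
    """Raise ValueError if column_mapping or start_row look wrong."""
    mapping = config["column_mapping"]
    for field in REQUIRED_FIELDS:
        column = str(mapping.get(field, ""))
        # Excel column letters run from A to XFD (one to three letters).
        if not re.fullmatch(r"[A-Z]{1,3}", column):
            raise ValueError(
                f"column_mapping.{field} must be a column letter, got {column!r}"
            )
    if len({mapping[f] for f in REQUIRED_FIELDS}) != len(REQUIRED_FIELDS):
        raise ValueError("column_mapping assigns two fields to the same column")
    start_row = config["start_row"]
    if not isinstance(start_row, int) or start_row < 1:
        raise ValueError("start_row must be a positive integer")
    return config


def load_config(path: str = "config.json") -> dict:
    """Read and validate the configuration file."""
    return validate_config(json.loads(Path(path).read_text(encoding="utf-8")))
```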

Usage

Web UI (Recommended)

  1. Start the web server:
./run_web.sh
# Or manually (--host 0.0.0.0 listens on all network interfaces; use 127.0.0.1 to keep the server local-only):
source venv/bin/activate
python -m uvicorn app:app --host 0.0.0.0 --port 8000 --reload
  2. Open your browser and navigate to:
http://localhost:8000
  3. Use the web interface to:
    • Drag and drop PDF files or click to select
    • Monitor real-time processing progress
    • Preview extracted transactions
    • Download the Excel file

Command Line

  1. Place your PDF statements in the statements/ folder
  2. Place your Excel template in the project root (or update path in config.json)
  3. Run the script:
python main.py
  4. Review the extraction summary
  5. Confirm to write transactions to Excel
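For a sense of how a batch run might pick up its inputs, here is a small sketch that lists the PDFs in the configured folder. It is an illustration only; `find_statements` is a hypothetical helper and main.py may collect files differently.

```python
from pathlib import Path
from typing import List


def find_statements(pdf_folder: str) -> List[Path]:
    """Return the PDF files in `pdf_folder`, sorted for a stable order."""
    folder = Path(pdf_folder)
    return sorted(
        path for path in folder.iterdir()
        # Compare the suffix case-insensitively so "Statement.PDF" is included.
        if path.is_file() and path.suffix.lower() == ".pdf"
    )
```

With the default configuration this would be called as `find_statements("./statements")`.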

Excel Template Format

Your Excel template should have columns for:

  • Date: Transaction date
  • Description: Transaction description/merchant
  • Money In: Deposits, credits, refunds
  • Money Out: Purchases, withdrawals, fees

The script will preserve all existing formatting, formulas, and styles in your template.
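The relationship between column_mapping, start_row, and the cells that get filled can be sketched as follows. The function `plan_cells` is hypothetical and only computes cell addresses; it assumes that the real writer sets cell values in place and leaves styles alone, which is how a library such as openpyxl behaves when a cell's value is assigned. Which library excel_writer.py uses is not stated in this README.

```python
from typing import Dict, List


def plan_cells(transactions: List[dict], column_mapping: Dict[str, str],
               start_row: int) -> Dict[str, object]:
    """Map each transaction field to a cell address such as "A2".

    Fields whose value is None are skipped, so an empty money_in or
    money_out cell in the template is left untouched.
    """
    cells: Dict[str, object] = {}
    for offset, transaction in enumerate(transactions):
        row = start_row + offset
        for field, column in column_mapping.items():
            value = transaction.get(field)
            if value is not None:
                cells[f"{column}{row}"] = value
    return cells
```

With the default mapping and start_row of 2, the first transaction lands in A2 through D2, the second in A3 through D3, and so on.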

Performance

  • Processing speed: ~2-5 seconds per PDF page (varies with hardware)
  • Accuracy: Best on text-based statements with clear transaction tables; review the extracted data before relying on it
  • Cost: $0 - completely free, unlimited usage
  • Privacy: All data stays on your local machine

Troubleshooting

Model not found

If you get an error about the model not being found:

ollama pull llama3.1:8b

Low extraction accuracy

  • Ensure PDFs are text-based (not scanned images)
  • Check that statements have clear transaction tables
  • Review extracted data and manually correct if needed
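One way to spot a scanned (image-only) PDF is to check how much text came out of each page. The heuristic below is a sketch under stated assumptions: it takes the per-page text that a PDF library has already extracted, and both the function name `looks_scanned` and the 50-character threshold are arbitrary choices for illustration.

```python
from typing import List


def looks_scanned(page_texts: List[str], min_chars_per_page: int = 50) -> bool:
    """Heuristic: True if most pages yielded almost no extractable text."""
    if not page_texts:
        return True  # Nothing extracted at all
    sparse_pages = sum(
        1 for text in page_texts if len(text.strip()) < min_chars_per_page
    )
    return sparse_pages > len(page_texts) / 2
```

A statement flagged this way would need OCR before the model can parse it, which is outside what this README describes.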

Excel write errors

  • Ensure template file exists and is not open in Excel
  • Check that column mappings in config.json match your template
  • Verify start_row doesn't overwrite important data

Project Structure

.
├── main.py              # Main orchestration script
├── pdf_extractor.py     # PDF text extraction
├── ai_parser.py         # AI transaction parsing
├── excel_writer.py      # Excel template writer
├── config.json          # Configuration file
├── requirements.txt     # Python dependencies
├── statements/          # Place PDFs here
├── output/              # Processed Excel files appear here
└── template.xlsx        # Your Excel template

License

Free to use for personal tax preparation purposes.

About

Privacy-first PDF bank statement extraction for Excel tax prep using local Ollama models
