UniversalDocsGrabber

Desktop tool for automatically downloading, converting, and organizing documents from IMAP mailboxes.

German documentation: README-DE.md

Features

Multi-account IMAP support
Search profiles with sender, subject, and date filters
Downloads PDF, DOCX, DOC, JPG, PNG, and other document types
Automatic PDF conversion for documents, images, and text bodies
OCR support for scanned PDFs via Tesseract
SHA-256 hash-based duplicate detection
Built-in scheduler for recurring scans from 15 minutes to 24 hours
Rule-based auto-categorization for invoices, shipping, contracts, taxes, insurance, and related mail
Drag-and-drop profile ordering and batch runs for all active profiles
Local-first storage for account settings and indexed document metadata
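
As an illustration of the rule-based auto-categorization listed above, the following is a minimal sketch of keyword matching against subject and sender. The rule table, the keywords, and the function name `categorize` are assumptions made for this example; the actual rules used by UniversalDocsGrabber are not documented here.

```python
from typing import Dict, List

# Hypothetical keyword rules (assumption); category names follow the feature list.
RULES: Dict[str, List[str]] = {
    "invoices": ["invoice", "receipt", "rechnung"],
    "shipping": ["shipment", "tracking", "delivery"],
    "contracts": ["contract", "agreement"],
    "taxes": ["tax"],
    "insurance": ["insurance", "policy"],
}


def categorize(subject: str, sender: str = "", default: str = "uncategorized") -> str:
    """Return the first category whose keyword occurs in the subject or sender."""
    haystack = f"{subject} {sender}".lower()
    # Rules are checked in insertion order, so earlier categories win on ties.
    for category, keywords in RULES.items():
        if any(keyword in haystack for keyword in keywords):
            return category
    return default


if __name__ == "__main__":
    print(categorize("Your invoice for March"))
    print(categorize("Tracking number for your order"))
```

Plain substring matching is simple but coarse (for example, "tax" also matches "syntax"), so a real rule set would likely need word boundaries or per-sender rules.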

Privacy Model

UniversalDocsGrabber runs locally on your Windows machine. Mail credentials are stored in the operating system keyring when one is available; account settings and document metadata are kept in your user profile. The application ships without telemetry, cloud sync, or a hosted backend.

Installation

Requirements

Python 3.8+
Windows (needed for Word conversion via win32com)
Optional: Tesseract OCR
Optional: Poppler

Setup

pip install -r requirements.txt

Optional: Poppler

Download from https://github.com/oschwartz10612/poppler-windows/releases
Extract to C:\Program Files\poppler\
Adjust POPPLER_PATH in UniversalDocsGrabberV1.py if needed

Optional: Tesseract

Download from https://github.com/UB-Mannheim/tesseract/wiki
Install to C:\Program Files\Tesseract-OCR\
Add the install folder to PATH
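
To check whether the optional tools can be found after installation, a small script along these lines may help. It is a sketch, not part of the project: `find_tool` is a hypothetical helper, `pdftoppm` is assumed to be the Poppler executable of interest, and the fallback folders are the install locations suggested above.

```python
import shutil
from pathlib import Path
from typing import Optional

# Default install locations taken from the setup steps (adjust if you chose others).
TESSERACT_DIR = Path(r"C:\Program Files\Tesseract-OCR")
POPPLER_DIR = Path(r"C:\Program Files\poppler")


def find_tool(executable: str, fallback_dir: Path) -> Optional[str]:
    """Look for an executable on PATH first, then anywhere below fallback_dir."""
    found = shutil.which(executable)
    if found:
        return found
    if fallback_dir.is_dir():
        # Poppler archives usually nest the binaries, so search recursively.
        for candidate in fallback_dir.rglob(executable + ".exe"):
            return str(candidate)
    return None


if __name__ == "__main__":
    print("tesseract:", find_tool("tesseract", TESSERACT_DIR) or "not found")
    print("pdftoppm: ", find_tool("pdftoppm", POPPLER_DIR) or "not found")
```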

Run

python UniversalDocsGrabberV1.py

or double-click START.bat.

Typical Workflow

Add an IMAP account in the Accounts tab
Create a search profile with group, filters, and target folder
Set a date range
Start a single profile or scan all active profiles with START
Browse results in the Documents tab

Features in Detail

Search Profiles

Group-based organization for thematic sorting
Drag-and-drop sorting between groups
Profile-specific override settings
Per-run date filters

Conversion

Word to PDF via win32com or docx2pdf
TXT to PDF via reportlab
Images to PDF via Pillow
OCR for PDFs without a text layer
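
The conversion routes above can be thought of as a lookup by file extension. The following sketch only picks a route name and does not call win32com, docx2pdf, reportlab, or Pillow; the mapping and the route labels are assumptions for illustration.

```python
from pathlib import Path

# Extension -> conversion route, following the list above (labels are hypothetical).
CONVERTERS = {
    ".docx": "word",          # Word to PDF
    ".doc": "word",
    ".txt": "text",           # TXT to PDF
    ".jpg": "image",          # Images to PDF
    ".jpeg": "image",
    ".png": "image",
    ".pdf": "ocr-if-needed",  # Already a PDF; OCR only when no text layer exists
}


def pick_conversion(filename: str) -> str:
    """Return the conversion route for a file, or 'keep-as-is' if none applies."""
    return CONVERTERS.get(Path(filename).suffix.lower(), "keep-as-is")


if __name__ == "__main__":
    for name in ("Contract.DOCX", "scan.png", "notes.txt", "archive.zip"):
        print(name, "->", pick_conversion(name))
```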

Scheduler & Auto-Categorization

Recurring scans from 15 minutes to 24 hours
A scheduled run is skipped if another scan is already active
Batch execution processes all active profiles grouped by account
Rule-based auto-categorization for invoices, shipping, contracts, cancellations, taxes, insurance, applications, and banking
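
The scheduler behaviour described above (interval limits, skipping overlapping runs, grouping by account) could be sketched as follows. This is an illustration under assumptions: the class and function names are hypothetical, and profiles are modelled as plain dictionaries with `account` and `active` keys, which may not match the real data model.

```python
import threading
from collections import defaultdict
from typing import Callable, Dict, Iterable, List

MIN_INTERVAL_MIN = 15        # 15 minutes
MAX_INTERVAL_MIN = 24 * 60   # 24 hours


def clamp_interval(minutes: int) -> int:
    """Force a requested interval into the supported 15 min to 24 h range."""
    return max(MIN_INTERVAL_MIN, min(MAX_INTERVAL_MIN, minutes))


class ScanGuard:
    """Lets only one scan run at a time; overlapping runs are skipped, not queued."""

    def __init__(self) -> None:
        self._lock = threading.Lock()

    def run_if_idle(self, scan: Callable[[], None]) -> bool:
        """Run scan() unless another scan is active. Return True if it ran."""
        if not self._lock.acquire(blocking=False):
            return False  # another scan holds the lock: skip this run
        try:
            scan()
            return True
        finally:
            self._lock.release()


def group_by_account(profiles: Iterable[dict]) -> Dict[str, List[dict]]:
    """Collect active profiles per account so each mailbox is opened once."""
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for profile in profiles:
        if profile.get("active", True):
            grouped[profile["account"]].append(profile)
    return dict(grouped)
```

Skipping rather than queueing keeps a slow mailbox from piling up pending runs; the next scheduled tick simply tries again.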

Deduplication

SHA-256 hash check
Configurable per profile
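
A hash-based duplicate check of this kind can be sketched with the standard library alone. The function names and the in-memory `known_hashes` set are assumptions for illustration; how the application actually stores hashes is not described here.

```python
import hashlib
from pathlib import Path
from typing import Set


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large attachments are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_duplicate(path: Path, known_hashes: Set[str], enabled: bool = True) -> bool:
    """Return True if the file content was seen before; otherwise remember it.

    `enabled` stands in for the per-profile switch (hypothetical parameter).
    """
    if not enabled:
        return False
    file_hash = sha256_of_file(path)
    if file_hash in known_hashes:
        return True
    known_hashes.add(file_hash)
    return False
```

Because the hash covers file content only, the same attachment received under two different file names is still detected as a duplicate.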

Local Data

%USERPROFILE%\.univ_docs_grabber\config_v1.json
%USERPROFILE%\.univ_docs_grabber\documents.json
%USERPROFILE%\Downloads\UnivDocs\

These files are intentionally ignored by Git because they can contain account names, local paths, document metadata, and downloaded documents.
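
For scripts that need to inspect or back up this data, the locations can be resolved in Python roughly as follows. This is a sketch: `local_paths` is a hypothetical helper, and it assumes `Path.home()` resolves to `%USERPROFILE%`, which is the usual case on Windows.

```python
from pathlib import Path
from typing import Dict, Optional


def local_paths(home: Optional[Path] = None) -> Dict[str, Path]:
    """Return the config file, document index, and download folder listed above."""
    base = home or Path.home()
    data_dir = base / ".univ_docs_grabber"
    return {
        "config": data_dir / "config_v1.json",
        "documents": data_dir / "documents.json",
        "downloads": base / "Downloads" / "UnivDocs",
    }


if __name__ == "__main__":
    for name, path in local_paths().items():
        status = "exists" if path.exists() else "missing"
        print(f"{name:10s} {path} ({status})")
```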

Known Limitations

OCR requires Tesseract and Poppler
Word conversion requires Windows components
Search is intentionally conservative and limits the mail count per profile

Development

python -m pytest -q
python -m py_compile UniversalDocsGrabberV1.py

Related Tools

Part of the doc-bricks mail suite:

Tool	Description
MailProcessor	System tray launcher for all Universal Mail Tools
UniversalMailCleaner	Rule-based IMAP mailbox cleaner with safe mode
UniversalInvoiceMail	Extract invoices and receipts from IMAP mail

License

MIT - Lukas Geiger
