Desktop tool for automatically downloading, converting, and organizing documents from IMAP mailboxes.
German documentation: README-DE.md
- Multi-account IMAP support
- Search profiles with sender, subject, and date filters
- Downloads PDF, DOCX, DOC, JPG, PNG, and other document types
- Automatic PDF conversion for documents, images, and text bodies
- OCR support for scanned PDFs via Tesseract
- SHA-256 hash-based duplicate detection
- Built-in scheduler for recurring scans from 15 minutes to 24 hours
- Rule-based auto-categorization for invoices, shipping, contracts, taxes, insurance, and related mail
- Drag-and-drop profile ordering and batch runs for all active profiles
- Local-first storage for account settings and indexed document metadata
UniversalDocsGrabber runs locally on your Windows machine. Mail credentials are stored through the operating system keyring when available, while project metadata is kept in the user profile. The application does not ship with telemetry, cloud sync, or a hosted backend.
- Python 3.8+
- Windows for Word conversion via `win32com`
- Optional: Tesseract OCR
- Optional: Poppler
`pip install -r requirements.txt`

- Download from https://github.com/oschwartz10612/poppler-windows/releases
- Extract to `C:\Program Files\poppler\`
- Adjust `POPPLER_PATH` in `UniversalDocsGrabberV1.py` if needed
- Download from https://github.com/UB-Mannheim/tesseract/wiki
- Install to `C:\Program Files\Tesseract-OCR\`
- Add to `PATH`
`python UniversalDocsGrabberV1.py`

or double-click `START.bat`.
- Add an IMAP account in the `Accounts` tab
- Create a search profile with group, filters, and target folder
- Set a date range
- Start a single profile or scan all active profiles with `START`
- Browse results in the `Documents` tab
- Group-based organization for thematic sorting
- Drag-and-drop sorting between groups
- Profile-specific override settings
- Per-run date filters
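
A search profile's sender, subject, and date filters map naturally onto standard IMAP SEARCH keys. The following is a minimal sketch of that mapping, not the application's actual code: the function `build_search_criteria` and its parameter names are hypothetical, and only the search keys (`FROM`, `SUBJECT`, `SINCE`, `BEFORE`) come from the IMAP protocol itself.

```python
from datetime import date
from typing import List, Optional

# IMAP dates always use English month abbreviations, so avoid locale-dependent strftime("%b").
_MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def _imap_date(d: date) -> str:
    """Format a date as DD-Mon-YYYY, the form IMAP SEARCH expects."""
    return f"{d.day:02d}-{_MONTHS[d.month - 1]}-{d.year}"


def _quote(value: str) -> str:
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_search_criteria(sender: Optional[str] = None,
                          subject: Optional[str] = None,
                          since: Optional[date] = None,
                          before: Optional[date] = None) -> List[str]:
    """Hypothetical helper: turn profile filters into IMAP SEARCH tokens."""
    criteria: List[str] = []
    if sender:
        criteria += ["FROM", _quote(sender)]
    if subject:
        criteria += ["SUBJECT", _quote(subject)]
    if since:
        criteria += ["SINCE", _imap_date(since)]    # on or after this date
    if before:
        criteria += ["BEFORE", _imap_date(before)]  # strictly before this date
    return criteria or ["ALL"]
```

With the standard library, the resulting tokens could be passed to `imaplib.IMAP4_SSL(...).search(None, *criteria)` after selecting a mailbox. Non-ASCII filter values would additionally need a charset, which this sketch does not handle.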
- Word to PDF via `win32com` or `docx2pdf`
- TXT to PDF via `reportlab`
- Images to PDF via Pillow
- OCR for PDFs without a text layer
- Recurring scans from 15 minutes to 24 hours
- Runs skipped if another scan is already active
- Batch execution processes all active profiles grouped by account
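
One common way to get the "skip if a scan is already active" behavior is a non-blocking lock. The sketch below illustrates that idea under stated assumptions: `ScanGuard` and `clamp_interval_minutes` are hypothetical names, and the tool's real scheduler may be implemented differently.

```python
import threading
from typing import Callable


class ScanGuard:
    """Allow at most one scan at a time; overlapping triggers are skipped, not queued."""

    def __init__(self) -> None:
        self._lock = threading.Lock()

    def run_if_idle(self, scan: Callable[[], None]) -> bool:
        # Non-blocking acquire returns False immediately if a scan is running.
        if not self._lock.acquire(blocking=False):
            return False  # skipped
        try:
            scan()
            return True
        finally:
            self._lock.release()


def clamp_interval_minutes(minutes: int) -> int:
    """Keep a requested interval inside the documented 15 minute to 24 hour range."""
    return max(15, min(minutes, 24 * 60))
```

A timer thread would call `guard.run_if_idle(run_all_active_profiles)` on every tick, where `run_all_active_profiles` stands in for whatever function performs the batch run. A skipped tick simply waits for the next interval.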
- Rule-based auto-categorization for invoices, shipping, contracts, cancellations, taxes, insurance, applications, and banking
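
Rule-based categorization of this kind can be as simple as keyword matching on the subject and sender. The following sketch is illustrative only: the category names come from the list above, but the keywords, the `RULES` table, and the `categorize` function are assumptions, not the rules shipped with the tool.

```python
from typing import Dict, List

# Hypothetical keyword rules; the first matching category wins.
RULES: Dict[str, List[str]] = {
    "invoices": ["invoice", "receipt"],
    "shipping": ["shipment", "tracking", "delivery"],
    "contracts": ["contract", "agreement"],
    "cancellations": ["cancellation"],
    "taxes": ["tax"],
    "insurance": ["insurance", "policy"],
    "applications": ["application"],
    "banking": ["bank", "statement"],
}


def categorize(subject: str, sender: str = "", default: str = "uncategorized") -> str:
    """Return the first category whose keyword occurs in the subject or sender."""
    text = f"{subject} {sender}".lower()
    for category, keywords in RULES.items():
        if any(keyword in text for keyword in keywords):
            return category
    return default
```

Plain substring matching is deliberately naive: "tax" also matches "taxi". Word-boundary regular expressions would be the obvious next step if false positives matter.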
- SHA-256 hash check
- Configurable per profile
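
Hash-based duplicate detection compares the SHA-256 digest of each attachment's bytes against digests already seen, so identical files are caught even when their file names differ. A minimal sketch using the standard `hashlib` module follows; `is_duplicate`, the `seen` set, and the `enabled` flag (standing in for the per-profile setting) are hypothetical names.

```python
import hashlib
from typing import Set


def sha256_of_bytes(data: bytes) -> str:
    """Return the hex SHA-256 digest of an attachment's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def is_duplicate(data: bytes, seen: Set[str], enabled: bool = True) -> bool:
    """Report whether this content was seen before; remember new digests.

    `enabled` models a per-profile switch: when False, nothing is treated
    as a duplicate and the digest set is left untouched.
    """
    if not enabled:
        return False
    digest = sha256_of_bytes(data)
    if digest in seen:
        return True
    seen.add(digest)
    return False
```

In a real run, `seen` would presumably be loaded from and saved back to the local document index rather than kept only in memory.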
- `%USERPROFILE%\.univ_docs_grabber\config_v1.json`
- `%USERPROFILE%\.univ_docs_grabber\documents.json`
- `%USERPROFILE%\Downloads\UnivDocs\`

These files are intentionally ignored by Git because they can contain account names, local paths, document metadata, and downloaded documents.
- OCR requires Tesseract and Poppler
- Word conversion requires Windows components
- Search is intentionally conservative and limits the mail count per profile
`python -m pytest -q`

`python -m py_compile UniversalDocsGrabberV1.py`

Part of the doc-bricks mail suite:

| Tool | Description |
|---|---|
| MailProcessor | System tray launcher for all Universal Mail Tools |
| UniversalMailCleaner | Rule-based IMAP mailbox cleaner with safe mode |
| UniversalInvoiceMail | Extract invoices and receipts from IMAP mail |
MIT - Lukas Geiger
