Automatically fetch invoices from online providers and import them into Paperless-NGX.
paperflow runs as a Docker container, periodically logs into your provider accounts, downloads invoices as PDFs, and uploads them to your Paperless-NGX instance, fully automatically. A SQLite database tracks which invoices have already been processed to avoid duplicates.
A built-in web interface (port 8085) lets you configure everything, manage providers, view the invoice history, and watch live logs, with no terminal needed.
- Automatic invoice download from Amazon.de / Amazon.com, IKEA, and Klarna
- Paperless-NGX upload via REST API: sets tags, correspondent, date, and title automatically
- Product title extraction: the Paperless title shows the actual product name, not just the order number
- Duplicate prevention: a SQLite database tracks every processed invoice
- Year-skip optimization: past years that were fully scanned are skipped on subsequent runs
- Incremental scan mode: optionally scan only the last 30 days for fast daily runs
- Parallel uploads: multiple PDFs are uploaded simultaneously (configurable workers)
- Correspondent dropdown: select the correct Paperless-NGX correspondent from a live list
- Year tags: each invoice is automatically tagged with its year (e.g. `2024`)
- Progress bar: real-time upload progress shown in the web UI
- Error categories: the history shows whether a failure was a missing PDF (`no PDF`), a failed download, or a failed upload
- Plugin architecture: add new providers by dropping in a single `.py` file
- CDP browser mode: connects to a persistent Chrome instance via Remote Debugging (no repeated logins, supports 2FA)
- Cookie import: log in via the Cookie Editor extension instead of VNC (useful for IKEA, Amazon)
- Docker-first: two containers, `paperflow` (FastAPI + Python) and `paperflow-chrome` (Chrome + noVNC)
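
The duplicate-prevention and year-skip features listed above both come down to small lookups in SQLite. The following is a minimal sketch of that idea, not paperflow's actual schema: the table names `invoices` and `scanned_years`, the helper functions, and the choice to always rescan the current year are assumptions made for illustration.

```python
import sqlite3


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the tracking database and create the (hypothetical) tables."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS invoices ("
        " provider TEXT NOT NULL, invoice_id TEXT NOT NULL,"
        " PRIMARY KEY (provider, invoice_id))"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scanned_years ("
        " provider TEXT NOT NULL, year INTEGER NOT NULL,"
        " PRIMARY KEY (provider, year))"
    )
    return conn


def mark_processed(conn: sqlite3.Connection, provider: str, invoice_id: str) -> bool:
    """Record an invoice; return False if it was already known (a duplicate)."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO invoices (provider, invoice_id) VALUES (?, ?)",
        (provider, invoice_id),
    )
    conn.commit()
    return cur.rowcount == 1


def mark_year_scanned(conn: sqlite3.Connection, provider: str, year: int) -> None:
    """Remember that a past year was fully scanned for this provider."""
    conn.execute(
        "INSERT OR IGNORE INTO scanned_years (provider, year) VALUES (?, ?)",
        (provider, year),
    )
    conn.commit()


def years_to_scan(
    conn: sqlite3.Connection, provider: str, first_year: int, current_year: int
) -> list[int]:
    """Return the years that still need scanning, oldest first."""
    done = {
        row[0]
        for row in conn.execute(
            "SELECT year FROM scanned_years WHERE provider = ?", (provider,)
        )
    }
    # Assumption: the current year is always rescanned because new
    # invoices can still appear in it.
    return [
        y
        for y in range(first_year, current_year + 1)
        if y == current_year or y not in done
    ]
```

The composite primary key lets the database itself reject duplicates, so the caller only has to check the return value of `mark_processed` before uploading.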

    git clone https://github.com/Caps3n/paperflow.git
    cd paperflow
    cp .env.example .env

Edit `.env`:

| Variable | Description | Default |
|---|---|---|
| `PAPERLESS_URL` | Your Paperless-NGX URL | `http://paperless:8000` |
| `PAPERLESS_TOKEN` | API token from Paperless-NGX admin | (none) |
| `AMAZON_EMAIL` | Amazon account email | (none) |
| `AMAZON_PASSWORD` | Amazon account password | (none) |
| `AMAZON_DOMAIN` | `amazon.de` or `amazon.com` | `amazon.de` |
| `AMAZON_MONTHS_BACK` | How many months back to scan | `12` |
| `IKEA_EMAIL` | IKEA account email | (none) |
| `IKEA_PASSWORD` | IKEA account password | (none) |
| `UPLOAD_WORKERS` | Parallel upload threads | `3` |
| `RUN_INTERVAL_HOURS` | How often to run (hours) | `24` |
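
To illustrate how these variables could be consumed, here is a small sketch that reads them with the defaults from the table. The `Settings` dataclass and `load_settings` function are hypothetical names, not part of paperflow's code, and treating a missing `PAPERLESS_TOKEN` as an error is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    paperless_url: str
    paperless_token: str
    amazon_domain: str
    amazon_months_back: int
    upload_workers: int
    run_interval_hours: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from environment variables, using the documented defaults."""
    token = env.get("PAPERLESS_TOKEN", "")
    if not token:
        # Assumption: without a token no upload can succeed, so fail early.
        raise ValueError("PAPERLESS_TOKEN is required (it has no default)")
    return Settings(
        paperless_url=env.get("PAPERLESS_URL", "http://paperless:8000").rstrip("/"),
        paperless_token=token,
        amazon_domain=env.get("AMAZON_DOMAIN", "amazon.de"),
        amazon_months_back=int(env.get("AMAZON_MONTHS_BACK", "12")),
        upload_workers=int(env.get("UPLOAD_WORKERS", "3")),
        run_interval_hours=int(env.get("RUN_INTERVAL_HOURS", "24")),
    )
```

Passing the environment as a mapping keeps the function easy to test: `load_settings({"PAPERLESS_TOKEN": "abc"})` returns the defaults without touching the real environment.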

    docker compose up -d

Open the web interface at http://localhost:8085.
On first run, open the noVNC browser at http://localhost:6080 and log into Amazon or IKEA manually once. The session is then reused automatically.
- In Portainer: Stacks → Add Stack → Repository
- Set:
  - Repository URL: `https://github.com/Caps3n/paperflow`
  - Compose path: `docker-compose.portainer.yml`
- Add environment variables in the Environment variables tab (see table above)
- Click Deploy
Portainer builds the paperflow-chrome browser container from source and pulls paperflow from ghcr.io automatically.
| Page | Description |
|---|---|
| Dashboard | Stats, progress bar, last run status, manual trigger |
| Settings | Edit all credentials and intervals in-browser |
| Providers | Enable/disable providers, edit tags & correspondent, upload custom .py scripts |
| History | Invoice history with status, error category, and link to Paperless document |
| Logs | Live log output with auto-refresh |
By default the web UI is accessible without authentication. To enable login protection:

    UI_USER=admin
    UI_PASSWORD=yourpassword

Or set it in Settings → Security in the web UI.
Note: paperflow runs HTTP only. For external access, place it behind a reverse proxy with TLS (e.g. Caddy for automatic HTTPS).

    ┌─────────────────────────┐    CDP     ┌──────────────────────┐
    │  paperflow              │ ─────────► │  paperflow-chrome    │
    │  FastAPI + Python       │            │  Chrome + noVNC      │
    │  port 8085 (Web UI)     │            │  port 6080 (VNC)     │
    └──────────┬──────────────┘            └──────────────────────┘
               │ REST API
               ▼
    ┌─────────────────────────┐
    │  Paperless-NGX          │
    └─────────────────────────┘

paperflow connects to Chrome over CDP (Chrome DevTools Protocol), uses the live browser session to download invoice PDFs, then uploads them to Paperless-NGX via REST API.
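
The upload step can be sketched with the standard library alone. The example below builds (but does not send) a multipart request for Paperless-NGX's document upload endpoint, `/api/documents/post_document/`, using token authentication. The function name `build_upload_request` and its parameters are hypothetical; paperflow's own `paperless_client.py` may be structured differently.

```python
import uuid
from typing import Iterable, Optional
from urllib.request import Request


def build_upload_request(
    base_url: str,
    token: str,
    pdf_bytes: bytes,
    filename: str,
    title: str,
    created: str,
    tag_ids: Iterable[int] = (),
    correspondent_id: Optional[int] = None,
) -> Request:
    """Build a multipart POST request that uploads one PDF to Paperless-NGX."""
    boundary = uuid.uuid4().hex

    # Plain form fields; "tags" is repeated once per tag ID.
    fields = [("title", title), ("created", created)]
    if correspondent_id is not None:
        fields.append(("correspondent", str(correspondent_id)))
    fields += [("tags", str(tag_id)) for tag_id in tag_ids]

    parts = []
    for name, value in fields:
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode()
        )
    # The PDF itself goes into the "document" field.
    file_header = (
        f'--{boundary}\r\nContent-Disposition: form-data; name="document"; '
        f'filename="{filename}"\r\nContent-Type: application/pdf\r\n\r\n'
    ).encode()
    parts.append(file_header + pdf_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())

    return Request(
        url=f"{base_url.rstrip('/')}/api/documents/post_document/",
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Sending the request is then a matter of passing it to `urllib.request.urlopen`. Tags and correspondents are referenced by their numeric IDs, which is why a client typically looks them up (or creates them) before uploading.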
paperflow uses a persistent Chrome browser (paperflow-chrome) so you only log in once:
- Open http://<server>:6080 in your browser (noVNC web UI)
- Log into Amazon, IKEA, or Klarna, including any 2FA prompts
- Start a scan from the web UI. Your session is reused automatically
Alternative: cookie import (no VNC needed):
- Install the Cookie Editor browser extension
- Log into the provider in your regular browser
- Export cookies as JSON via Cookie Editor
- Paste the JSON in Settings → Amazon / IKEA → Import Cookies
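
As a rough illustration of what a cookie import involves, the sketch below parses an exported JSON array and keeps only unexpired cookies for one provider domain. It assumes each cookie object carries `name`, `value`, `domain`, and an optional `expirationDate` (Unix seconds), which is how Cookie Editor exports typically look; `parse_cookie_export` is a hypothetical helper, not paperflow's actual import code.

```python
import json
import time
from typing import Optional


def parse_cookie_export(
    raw_json: str, domain: str, now: Optional[float] = None
) -> dict[str, str]:
    """Return name -> value for unexpired cookies that apply to `domain`."""
    now = time.time() if now is None else now
    cookies = json.loads(raw_json)
    if not isinstance(cookies, list):
        raise ValueError("expected a JSON array of cookie objects")

    result: dict[str, str] = {}
    for cookie in cookies:
        # Cookie domains often start with a dot (".amazon.de").
        cookie_domain = str(cookie.get("domain", "")).lstrip(".")
        if cookie_domain != domain and not cookie_domain.endswith("." + domain):
            continue
        expires = cookie.get("expirationDate")
        if expires is not None and expires < now:
            continue  # skip cookies that have already expired
        result[cookie["name"]] = cookie["value"]
    return result
```

Filtering by domain means a single export that contains cookies for several sites can be pasted as-is.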
paperflow has a plugin system. To add a new provider:
- Create a file `myprovider.py` following this template:

      from app.providers import BaseProvider, Invoice
      from pathlib import Path

      class MyproviderProvider(BaseProvider):
          provider_name = "myprovider"

          def fetch_invoices(self) -> list[Invoice]:
              # Your download logic here
              return [
                  Invoice(
                      invoice_id="2024-001",
                      file_path=Path("/app/downloads/myprovider/invoice.pdf"),
                      title="My Provider Invoice 2024-001",
                      date="2024-01-15",
                      extra_tags=["2024"],
                  )
              ]

- Upload via the Providers page in the web UI, or place the file in `providers_custom/`
- Enable the provider in the web UI. Done!

Convention: the class name must be `<Providername>Provider` (capitalized), and the file name must be `<providername>.py` (lowercase).
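
The naming convention makes it possible to discover a provider class from its file name alone. The following sketch shows one way such a loader could work using `importlib`; `expected_class_name` and `load_provider_class` are hypothetical helpers, and paperflow's real plugin loader may differ.

```python
import importlib.util
from pathlib import Path


def expected_class_name(file_path: Path) -> str:
    """Derive the class name from the file name: ikea.py -> IkeaProvider."""
    return file_path.stem.capitalize() + "Provider"


def load_provider_class(file_path: Path) -> type:
    """Import a provider file and return the class that matches the convention."""
    spec = importlib.util.spec_from_file_location(file_path.stem, file_path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {file_path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the plugin's top-level code

    class_name = expected_class_name(file_path)
    try:
        return getattr(module, class_name)
    except AttributeError:
        raise ImportError(
            f"{file_path.name} must define a class named {class_name}"
        ) from None
```

Because importing a plugin executes its code, custom provider files should only come from sources you trust.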

    paperflow/
    ├── app/
    │   ├── main.py                   # Entry point: scheduler + parallel uploads
    │   ├── web.py                    # FastAPI web interface + API endpoints
    │   ├── ui.html                   # Single-page web UI
    │   ├── database.py               # SQLite tracking (invoices + scanned years)
    │   ├── paperless_client.py       # Paperless-NGX API client
    │   ├── state.py                  # Shared scan progress state
    │   └── providers/
    │       ├── __init__.py           # BaseProvider + Invoice dataclass
    │       ├── amazon.py             # Amazon provider (CDP mode + fallback)
    │       ├── ikea.py               # IKEA provider (CDP mode + cookie import)
    │       └── klarna.py             # Klarna provider (CDP mode, Kaufbelege)
    ├── chrome-desktop/               # Chrome + noVNC Docker image (paperflow-chrome)
    │   ├── Dockerfile
    │   └── start.sh
    ├── providers_custom/             # Drop custom provider .py files here
    ├── data/                         # SQLite DB + logs + settings (persisted volume)
    ├── downloads/                    # Temporary PDF storage
    ├── Dockerfile
    ├── docker-compose.yml            # Local development
    ├── docker-compose.portainer.yml  # Portainer / production deployment
    └── .env.example

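
The entry point is described above as handling parallel uploads. A minimal sketch of that pattern with a thread pool follows; `upload_all` and the `upload_one` callback are hypothetical names, and the default of three workers simply mirrors the `UPLOAD_WORKERS` default.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Callable, Iterable


def upload_all(
    paths: Iterable[Path],
    upload_one: Callable[[Path], None],
    workers: int = 3,
) -> tuple[list[Path], dict[Path, str]]:
    """Upload files concurrently; return (uploaded paths, path -> error message)."""
    uploaded: list[Path] = []
    failed: dict[Path, str] = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(upload_one, path): path for path in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                future.result()  # re-raises any exception from upload_one
                uploaded.append(path)
            except Exception as exc:
                # One failed upload must not abort the remaining ones.
                failed[path] = str(exc)
    return sorted(uploaded), failed
```

Threads suit this workload because uploads spend most of their time waiting on the network, and collecting failures per file is what allows a history view to show which invoice failed and why.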
- eBay provider
- Email/IMAP provider (catch invoices sent by email)
- Notification on completion (Telegram / ntfy)
- Dark/light mode toggle in web UI
Contributions are welcome! Please read CONTRIBUTING.md first.
The easiest way to contribute is to write a provider for a service you use and open a pull request.
MIT, see LICENSE