A web app for online document scanning, featuring built-in ML-powered perspective correction.
- Install all dependencies before proceeding.
    pip install -r requirements.txt
- Navigate to the app directory.
    cd app
- Run the app with Flask, changing the host and port to suit your needs.
    flask run --host=0.0.0.0 --port=5001
- Navigate to the server's location over HTTP. Further instructions are located in the UI.
The general workflow is as follows:
The complex components of the workflow are explained below.
We use deep learning to simplify a crucial step in the perspective correction process: contour detection. The model is a custom UNet for binary segmentation of rectangular documents.
Implementation details on this model can be found in this repo: https://github.com/LukeIngram/DocuSegement-Pytorch
Example inference:
These masks remove all other subjects from the image, which greatly improves the accuracy of the contour detection algorithms.
Four corner points are required for accurate perspective correction. When the detected contour isn't a quadrilateral, a minimum bounding quadrilateral is computed around the subject.
Approximation example:
The implementation details can be found in app/backend/utils/boundingQuad.py.
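To illustrate the "four corners or fall back" decision, here is a minimal, dependency-free sketch. It is not the code in boundingQuad.py. The function names `convex_hull` and `four_corners` are hypothetical, and the fallback used here is a plain axis-aligned bounding rectangle, which is a much simpler stand-in for the minimum bounding quadrilateral described above.

```python
def convex_hull(points):
    """Andrew's monotone chain: return the hull vertices in order around the hull."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); its sign gives the turn direction
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each half is the first point of the other half
    return lower[:-1] + upper[:-1]


def four_corners(contour):
    """Return four corner points for a contour given as (x, y) tuples.

    Assumption: if the convex hull already has exactly four vertices it is
    used directly; otherwise an axis-aligned bounding rectangle is returned
    as a simplified substitute for a minimum bounding quadrilateral.
    """
    hull = convex_hull(contour)
    if len(hull) == 4:
        return hull
    xs = [x for x, _ in hull]
    ys = [y for _, y in hull]
    return [(min(xs), min(ys)), (max(xs), min(ys)),
            (max(xs), max(ys)), (min(xs), max(ys))]
```

A contour whose hull is already a quadrilateral passes through unchanged, while a five-sided shape is replaced by its enclosing rectangle. A true minimum bounding quadrilateral would fit a skewed document more tightly than this rectangle does.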
The core feature of this app is the perspective transform.
The first step is finding the new, corrected corners of the quadrilateral. This is done by computing a four-point linear transform from the quadrilateral approximated in the previous step. More details can be found in app/backend/utils/transforms.py.
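One common way to choose the corrected corners is sketched here, as an assumption and not necessarily what transforms.py does. The four points are ordered, the output width and height are taken from the longest opposing edges, and the destination corners form an upright rectangle of that size. The names `order_corners` and `corrected_corners` are hypothetical.

```python
import math


def order_corners(points):
    """Order four (x, y) points as top-left, top-right, bottom-right, bottom-left.

    Uses image coordinates (y grows downward). The sum/difference heuristic
    assumes the document is not rotated by roughly 45 degrees or more.
    """
    by_sum = sorted(points, key=lambda p: p[0] + p[1])
    top_left, bottom_right = by_sum[0], by_sum[-1]
    by_diff = sorted(points, key=lambda p: p[1] - p[0])
    top_right, bottom_left = by_diff[0], by_diff[-1]
    return [top_left, top_right, bottom_right, bottom_left]


def corrected_corners(quad):
    """Return the destination rectangle corners for a source quadrilateral."""
    tl, tr, br, bl = order_corners(quad)
    # Use the longer of each pair of opposing edges so no content is squeezed
    width = round(max(math.dist(tl, tr), math.dist(bl, br)))
    height = round(max(math.dist(tl, bl), math.dist(tr, br)))
    return [(0, 0), (width - 1, 0), (width - 1, height - 1), (0, height - 1)]
```

For example, `corrected_corners([(200, 30), (5, 280), (10, 20), (220, 300)])` orders the points first, so the input order does not matter.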
Then the perspective is corrected using a process called a homography transformation.
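A homography is a 3x3 matrix H that maps each source point (x, y) to a destination point (u, v) in homogeneous coordinates. Four point correspondences give eight linear equations, which is enough to solve for H when its last entry is fixed to 1. The following is a minimal pure-Python sketch of that calculation, written for illustration only. The function names are hypothetical. In practice, OpenCV provides `cv2.getPerspectiveTransform` and `cv2.warpPerspective` for this, though whether this app uses them is not stated here.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def find_homography(src, dst):
    """Return the 3x3 matrix H (last entry fixed to 1) mapping src[i] to dst[i]."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), rearranged to be linear in h
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        # v = (h3*x + h4*y + h5) / (h6*x + h7*y + 1)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(H, point):
    """Map a single (x, y) point through H, dividing by the homogeneous term."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)
```

Warping a whole image applies the same mapping to every pixel, usually by mapping each output pixel back through the inverse of H and interpolating the source image.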
The results are as follows:
Luke Ingram - lukeingram01@gmail.com
- Containerization
- Device Camera Support
- Event Logging
- File Cleanup (both uploads and outbox)




