DocuScanner Flask

A web app for online document scanning, featuring built-in ML-powered perspective correction.

Usage

Install all dependencies before proceeding.

pip install -r requirements.txt

Running Locally

Navigate to the app directory:

cd app

Run the app with Flask, changing the host and port to suit your needs:

flask run --host=0.0.0.0 --port=5001

Open the server's address in a browser over HTTP (with the command above, http://localhost:5001 on the same machine). Further instructions are provided in the UI.

How it All Works

The general workflow is as follows:

The more complex components of the workflow are explained below.

Image Segmentation

We use deep learning to simplify a crucial step in the perspective correction process: contour detection. The model is a custom UNet for binary segmentation of rectangular documents.

Implementation details on this model can be found in this repo: https://github.com/LukeIngram/DocuSegement-Pytorch

Example inference:

These masks remove all other subjects from the image, which greatly improves the accuracy of the contour detection algorithms.
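To illustrate why a clean mask makes contour detection easier, here is a minimal sketch that binarizes a model output and extracts the outline of the foreground region. It assumes the model returns a per-pixel probability map and that 0.5 is a reasonable threshold; the function names `binarize` and `mask_boundary` are hypothetical and are not taken from this repo, which may rely on a library contour routine instead.

```python
import numpy as np

def binarize(prob_map, threshold=0.5):
    """Turn a per-pixel probability map into a boolean document mask.
    The 0.5 threshold is an assumption, not a value taken from the app."""
    return np.asarray(prob_map) >= threshold

def mask_boundary(mask):
    """Return a boolean array marking foreground pixels that touch the background.
    A pixel is interior when all four direct neighbours are foreground."""
    padded = np.pad(mask, 1, constant_values=False)
    interior = (
        padded[:-2, 1:-1]    # neighbour above
        & padded[2:, 1:-1]   # neighbour below
        & padded[1:-1, :-2]  # neighbour to the left
        & padded[1:-1, 2:]   # neighbour to the right
    )
    return mask & ~interior

# Usage: a 5x5 map with a confident 3x3 "document" in the middle.
prob = np.zeros((5, 5))
prob[1:4, 1:4] = 0.9
outline = mask_boundary(binarize(prob))
```

Because the mask contains only the document, the outline is a single closed shape with no competing edges from the background.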

Bounding Quadrilateral Approximation

Accurate perspective correction requires four corner points. When the detected contour is not a quadrilateral, a minimum bounding quadrilateral is computed around the subject.

Approximation example:

The implementation details can be found at app/backend/utils/boundingQuad.py
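As a rough illustration of the idea, the sketch below computes a minimum-area bounding rectangle around a set of contour points. This is a simpler stand-in for a general minimum bounding quadrilateral and is not the algorithm in boundingQuad.py; the function names are hypothetical. It relies on the fact that a minimum-area bounding rectangle has one side lying along an edge of the convex hull, so it is enough to try each hull edge in turn.

```python
import math

def convex_hull(points):
    """Andrew's monotone chain. Returns the hull vertices in order around the hull."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def min_area_bounding_rect(points):
    """Return the four corners of the smallest rectangle enclosing the points.
    Assumes at least three points that are not all on one line."""
    hull = convex_hull(points)
    best_area, best_corners = None, None
    for i in range(len(hull)):
        (x0, y0), (x1, y1) = hull[i], hull[(i + 1) % len(hull)]
        length = math.hypot(x1 - x0, y1 - y0)
        ux, uy = (x1 - x0) / length, (y1 - y0) / length  # unit vector along the edge
        vx, vy = -uy, ux                                 # unit vector perpendicular to it
        us = [x * ux + y * uy for x, y in hull]          # hull coordinates in the edge frame
        vs = [x * vx + y * vy for x, y in hull]
        area = (max(us) - min(us)) * (max(vs) - min(vs))
        if best_area is None or area < best_area:
            frame = [(min(us), min(vs)), (max(us), min(vs)),
                     (max(us), max(vs)), (min(us), max(vs))]
            # Convert the rectangle corners back to image coordinates.
            best_corners = [(u * ux + v * vx, u * uy + v * vy) for u, v in frame]
            best_area = area
    return best_corners

# Usage: a diamond-shaped contour with one interior point.
quad = min_area_bounding_rect([(2, 0), (4, 2), (2, 4), (0, 2), (2, 2)])
```

A rectangle is always a valid bounding quadrilateral, but a general quadrilateral can fit a skewed document more tightly, which is why the app computes the latter.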

Perspective Correction

The core feature of this app is the perspective transform.

The first step is finding the new, corrected corners of the quadrilateral. This is done by computing a four-point linear transform from the quadrilateral approximated in the previous step (more details can be found in app/backend/utils/transforms.py).
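One common way to derive the corrected corners, shown here as a sketch rather than as the contents of transforms.py, is to order the four input corners and size the output rectangle from the longest opposing edges. The helper names `order_corners` and `target_corners` are hypothetical, and the sum/difference ordering heuristic assumes the document is not rotated by close to 45 degrees.

```python
import math

def order_corners(pts):
    """Order four points as top-left, top-right, bottom-right, bottom-left.
    Uses image coordinates, where y grows downwards."""
    by_sum = sorted(pts, key=lambda p: p[0] + p[1])
    by_diff = sorted(pts, key=lambda p: p[1] - p[0])
    top_left, bottom_right = by_sum[0], by_sum[-1]
    top_right, bottom_left = by_diff[0], by_diff[-1]
    return [top_left, top_right, bottom_right, bottom_left]

def target_corners(quad):
    """Return the corners of the upright output rectangle and its (width, height)."""
    tl, tr, br, bl = order_corners(quad)
    # Use the longer of each pair of opposing edges so no content is squeezed.
    width = round(max(math.dist(tl, tr), math.dist(bl, br)))
    height = round(max(math.dist(tl, bl), math.dist(tr, br)))
    corners = [(0, 0), (width - 1, 0), (width - 1, height - 1), (0, height - 1)]
    return corners, (width, height)

# Usage: corners may arrive in any order.
dst, size = target_corners([(100, 220), (10, 10), (5, 200), (110, 20)])
# size == (100, 200)
```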

The perspective is then corrected with a homography transformation, a projective mapping described by a 3x3 matrix that sends the four detected corners to the four corrected ones.
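For readers who want to see what the homography involves, the following sketch solves for the 3x3 matrix from four point pairs using NumPy and applies it to a single point. It is an illustration under the usual convention that the bottom-right matrix entry is fixed to 1, and the function names are hypothetical. OpenCV offers the same operations as `cv2.getPerspectiveTransform` and `cv2.warpPerspective`; whether the app uses them is not stated here.

```python
import numpy as np

def find_homography(src, dst):
    """Solve for H (3x3, with H[2, 2] = 1) such that each src point maps to its dst point.
    Requires exactly four point pairs with no three source points on one line."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # From u = (h11*x + h12*y + h13) / (h31*x + h32*y + 1), and likewise for v.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def apply_homography(H, point):
    """Map one (x, y) point through H, dividing out the homogeneous coordinate."""
    x, y, w = H @ np.array([point[0], point[1], 1.0])
    return (x / w, y / w)

# Usage: map a skewed quadrilateral onto an upright 100x200 rectangle.
src = [(10, 10), (110, 20), (100, 220), (5, 200)]
dst = [(0, 0), (99, 0), (99, 199), (0, 199)]
H = find_homography(src, dst)
```

Warping a full image applies the same mapping to every pixel, typically by iterating over output pixels and sampling the input through the inverse matrix.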

The results are as follows:

Contact

Luke Ingram - lukeingram01@gmail.com

Future Work

Containerization
Device camera support
Event logging
File cleanup (both uploads and outbox)
