DocuSegment-Pytorch: UNet Based Document Segmentation in PyTorch.

A custom implementation of the UNet architecture, designed and tuned specifically for the task of binary segmentation of documents and backgrounds.

Usage

Before you begin, ensure all dependencies are installed

pip install -r requirements.txt

For detailed on the synthetic dataset mentioned earlier, see the Synthetic Dataset section.

If you choose to use your own dataset, the following structure is recommended:

DocuSegment-Pytorch/
├── data/
│    └── <dataset name>/
│           ├── train/
│           │   ├── images/
│           │   └── masks/
│           └── valid/
│               ├── images/
│               └── masks/
├── runs/
│  
├── samples/ 
│  
├── src/
│
⋮
└── README.md

You may need to adapt the DocumentDataset class in utils/datasets.py to suit your custom dataset.

Training

Note: Installing CUDA is strongly recommended for on-device training.

> python -m train -h 
usage: train.py [-h] [-nb N] [-nc N] [-n N] [-lr eta] [-bs b] [-c c] [-sc s] [-d DEVICE] [-tdp path [path ...]] [-vdp path [path ...]] -sn filename [-vbo] [-udi]

options:
  -h, --help            show this help message and exit
  -nb N, --num_blocks N
                        Number of down sampling & upsampling blocks featured in the UNet.
  -nc N, --num_start_channels N
                        Number of channels after the first convolution block.
  -n N, --num_epochs N  Number of epochs.
  -lr eta, --learning_rate eta
                        Learning Rate.
  -bs b, --batch_size b
                        Batch Size.
  -c c, --num_classes c
                        Number of classes
  -sc s, --scale_fact s
                        Factor to reduce / increase the inputs by.
  -d DEVICE, --device DEVICE
                        Device to run model.
  -tdp path [path ...], --train_data_paths path [path ...]
                        Paths of training image and mask directories.
  -vdp path [path ...], --validation_data_paths path [path ...]
                        Paths of validation image and mask directories.
  -sn filename, --save_name filename
                        Same of save file (.pth).
  -vbo, --verbose       Verbose Output.
  -udi, --use_dice_and_iou
                        Add DICE and IoU score to loss during training.

Saved models are always stored in the models/saves directory, and training results are saved into the runs directory.

Inference

> python -m inference -h
usage: inference.py [-h] -w filename [-inputs path [path ...]] -odir dir [-sc s] [-d DEVICE]

options:
  -h, --help            show this help message and exit
  -w filename, --weight_file filename
                        The saved model's filename.
  -inputs path [path ...], --input_paths path [path ...]
                        Paths to the input(s).
  -odir dir, --output_dir dir
                        Path to Directory where predictions are saved.
  -sc s, --scale_fact s
                        Factor to reduce / increase the inputs by.
  -d DEVICE, --device DEVICE
                        Device to run model.

Weight files must be stored in the models/saves

Synthetic Dataset

tehwq

Results

All provided models were trained with the following specifics:

Batch Size: 8
Learning Rate: 0.0003
Epochs: 50
scale Factor: 1.5394

Model Name	Num Params	Supported Size	Validation DICE	Validation IOU
unet_16	1.9M	480 x 480	0.970725	0.945591
unet_32	7.7M	480 x 480	0.981021	0.960021

The '16' and '32' refer to the number of filters in the first convolutionary block.

These models are all available for download via scripts/downloadWeights.sh

bash scripts/downloadWeights.sh <MODEL NAME>

Future Work

Flexible BasicDocumentDataset class for easy training on custom datasets
Proper Logging instead of printing to stdout
Config file based experiment setup
Checkpointing / Loading and training existing model saves

References

Data

"Preparing Synthetic Dataset For Robust Document Segmentation," LearnOpenCV, Deep Learning-based Document Segmentation Using Semantic Segmentation with DeepLabV3 on a Custom Dataset. Link

Research

Olaf Ronneberger, Philipp Fischer, and Thomas Brox, "U-Net: Convolutional Networks for Biomedical Image Segmentation," arXiv preprint arXiv:1505.04597 (2015). Link
Ke Ma, Zhixin Shu, Xue Bai, Jue Wang, Dimitris Samaras, "DocUNet: Document Image Unwarping via A Stacked U-Net", Stony Brook University & Megvii Inc. Link

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

DocuSegment-Pytorch: UNet Based Document Segmentation in PyTorch.

Contents

Usage

Training

Inference

Synthetic Dataset

Results

Future Work

References

Data

Research

About

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
data		data
models		models
runs		runs
samples		samples
scripts		scripts
utils		utils
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
build_dataset.py		build_dataset.py
evaluate.py		evaluate.py
inference.py		inference.py
loss.py		loss.py
requirements.txt		requirements.txt
train.py		train.py

Folders and files

Latest commit

History

Repository files navigation

DocuSegment-Pytorch: UNet Based Document Segmentation in PyTorch.

Contents

Usage

Training

Inference

Synthetic Dataset

Results

Future Work

References

Data

Research

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Uh oh!

Contributors

Uh oh!

Languages