This repository contains an implementation based on the paper UNet++: A Nested U-Net Architecture for Medical Image Segmentation (doi).
Several experiments were conducted to reach the results reported below. There is still considerable room for improvement, so if you spot any flaws in the experiments, feel free to open an issue in the repository.
This network architecture is built on nested skip connections with densely connected nodes, as shown below.
The model also uses deep supervision: instead of evaluating only the final network output, the outputs of the last four nodes of the top layer are evaluated as well.
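As a minimal sketch of how deep supervision can enter the loss, the snippet below averages a soft Dice loss over four supervised outputs. The function names, the equal weighting, and the toy arrays are assumptions made for illustration; they are not taken from this repository's training code.

```python
import numpy as np

def soft_dice_loss(prob, target, eps=1e-6):
    """1 - soft Dice between a probability map and a binary mask."""
    prob = np.asarray(prob, dtype=float)
    target = np.asarray(target, dtype=float)
    intersection = (prob * target).sum()
    return 1.0 - (2.0 * intersection + eps) / (prob.sum() + target.sum() + eps)

def deep_supervision_loss(outputs, target, weights=None):
    """Weighted average of the loss over every supervised output."""
    if weights is None:
        weights = [1.0] * len(outputs)  # assumption: all outputs weigh the same
    total = sum(w * soft_dice_loss(o, target) for w, o in zip(weights, outputs))
    return total / sum(weights)

# Toy ground truth and four hypothetical supervised outputs
target = np.array([[0, 1], [1, 1]])
outputs = [
    np.full((2, 2), 0.5),
    np.array([[0.2, 0.8], [0.8, 0.8]]),
    np.array([[0.1, 0.9], [0.9, 0.9]]),
    target.astype(float),
]
loss = deep_supervision_loss(outputs, target)
print(f"deep supervision loss: {loss:.4f}")
```

Averaging keeps the loss on the same scale regardless of how many outputs are supervised; other weightings are equally plausible.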
The model incorporates Attention Gates (Oktay et al., 2018) on every skip connection in the dense decoder. Each gate uses the upsampled feature map from the deeper layer as a gating signal to compute a spatial attention map, which suppresses irrelevant regions in the shallower skip connection before concatenation. Four shared gates (one per scale) keep the parameter overhead minimal while covering every skip connection in the nested architecture.
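The gating computation described above can be sketched in NumPy as an additive attention gate, with the 1x1 convolutions written as per-pixel matrix products. This is an illustrative sketch only: the function name, argument names, and tensor sizes are assumptions, and the repository's own gate may differ in detail (normalization layers, for instance, are omitted here).

```python
import numpy as np

def attention_gate(x, g, w_x, w_g, psi, b_int=0.0, b_psi=0.0):
    """Additive attention gate on channel-first feature maps.

    x:   skip features, shape (C_x, H, W)
    g:   gating features, shape (C_g, H, W), already upsampled to match x
    w_x: (C_int, C_x), w_g: (C_int, C_g), psi: (C_int,)
         weights of the 1x1 convolutions, written as matrices
    """
    # A 1x1 convolution is a linear map over the channel axis at each pixel
    theta_x = np.einsum("ic,chw->ihw", w_x, x)
    phi_g = np.einsum("ic,chw->ihw", w_g, g)
    q = np.maximum(theta_x + phi_g + b_int, 0.0)      # ReLU
    logits = np.einsum("i,ihw->hw", psi, q) + b_psi   # collapse to one channel
    alpha = 1.0 / (1.0 + np.exp(-logits))             # sigmoid, shape (H, W)
    return x * alpha, alpha                           # alpha broadcasts over channels

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16, 16))    # hypothetical skip features
g = rng.normal(size=(16, 16, 16))   # hypothetical upsampled gating signal
w_x = rng.normal(size=(4, 8))
w_g = rng.normal(size=(4, 16))
psi = rng.normal(size=4)

gated, alpha = attention_gate(x, g, w_x, w_g, psi)
print(gated.shape, alpha.shape)
```

Because the attention map has a single channel, every channel of the skip connection is scaled by the same spatial mask before concatenation.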
Hyperparameters were tuned with grid search: a set of candidate values is chosen for each hyperparameter, and the same experiment is run on the same dataset for every combination in order to find the one that delivers the best result. In this study, the loss was the monitored metric.
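A minimal sketch of this procedure, using only the standard library, is shown below. The hyperparameter names, the candidate values, and the `fake_experiment` stand-in are hypothetical; in practice the callback would train the model and return the monitored loss.

```python
from itertools import product

def grid_search(param_grid, run_experiment):
    """Evaluate every combination in param_grid and keep the lowest loss."""
    names = list(param_grid)
    best_params, best_loss = None, float("inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        loss = run_experiment(**params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss

# Hypothetical grid; not the values used in this study
param_grid = {"learning_rate": [1e-2, 1e-3, 1e-4], "batch_size": [4, 6, 8]}

def fake_experiment(learning_rate, batch_size):
    """Stand-in objective that replaces a real training run."""
    return abs(learning_rate - 1e-3) + abs(batch_size - 6) / 100

best_params, best_loss = grid_search(param_grid, fake_experiment)
print(best_params, best_loss)
```

The number of runs grows as the product of the list lengths (nine here), which is why grids are usually kept small.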
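The Dice coefficient measures the overlap between the predicted mask and the ground truth mask: twice the size of their intersection divided by the sum of their sizes. Because it ignores true-negative background pixels, it is well suited to small objects.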
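For binary masks, the Dice coefficient can be computed as in the sketch below. The function name and the `eps` smoothing term are assumptions made for illustration, not the repository's metric code.

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for binary masks."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    intersection = np.logical_and(pred, target).sum()
    # eps avoids a division by zero when both masks are empty
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

target = np.array([[0, 1, 1],
                   [0, 1, 0]])
pred = np.array([[0, 1, 0],
                 [0, 1, 1]])
dice = dice_coefficient(pred, target)
print(f"Dice: {dice:.4f}")
```

In this toy case each mask has three foreground pixels and two of them overlap, so the score is 4/6.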
The Jaccard index (intersection over union) is the ratio between the intersection and the union of the ground truth mask and the mask predicted by the model.
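A small sketch of the Jaccard index for binary masks follows, together with the identity J = D / (2 - D) that links it to the Dice coefficient D of the same pair of masks. The function names and the `eps` term are assumptions made for illustration.

```python
import numpy as np

def jaccard_index(pred, target, eps=1e-7):
    """Jaccard = |A ∩ B| / |A ∪ B| for binary masks."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    # eps avoids a division by zero when both masks are empty
    return (intersection + eps) / (union + eps)

def jaccard_from_dice(dice):
    """Convert a Dice score into the Jaccard index of the same mask pair."""
    return dice / (2.0 - dice)

target = np.array([[0, 1, 1],
                   [0, 1, 0]])
pred = np.array([[0, 1, 0],
                 [0, 1, 1]])
jaccard = jaccard_index(pred, target)
print(f"Jaccard: {jaccard:.4f}")
```

The identity is exact for a single pair of masks; scores averaged over many slices need not satisfy it, although the Dice (0.6450) and Jaccard (0.4760) values reported in the results table happen to be consistent with it.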
Here is a step-by-step guide to running the experiments carried out in this study.
# Clone the repository
$ git clone https://github.com/felipersteles/att-nested-unet.git
# Enter the directory
$ cd att-nested-unet
Before performing the full preprocessing, train the ResNet50 classifier so that only slices containing the pancreas are kept.
# Convert 3D volumes to 2D slices
$ python preprocess/transform.py --input path/to/nifti --output path/to/slices
# Train the ResNet50 slice classifier (uses patient/slice/ + patient/pancreas/ pairs)
$ python train_classificator.py \
--data_dir path/to/slices \
--weights_path ./checkpoint/classifier.pth \
--epochs 20 \
--batch_size 16
# Run the trained classifier over the full dataset and produce dataset.json
$ python generate_filter_json.py \
--data_dir path/to/slices \
--weights_path ./checkpoint/classifier.pth \
--output dataset.json
# Filter: copy only slices where class=1 to the destination folder
$ python preprocess/filter.py \
--dataset_path dataset.json \
--origin_folder path/to/slices \
--destination_folder path/to/filtered
# Calculate the probabilistic atlas
$ python preprocess/atlas.py
# Crop images based on the atlas
$ python preprocess/crop.py
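To give an idea of what the last two steps compute, here is a minimal sketch of a probabilistic atlas and an atlas-based crop. It assumes that the atlas is the per-pixel mean of aligned binary masks and that the crop is the bounding box of every pixel with non-zero probability; the function names, the threshold, and the margin are hypothetical and not taken from `preprocess/atlas.py` or `preprocess/crop.py`.

```python
import numpy as np

def probabilistic_atlas(masks):
    """Per-pixel frequency of the organ across aligned binary masks."""
    stack = np.stack([np.asarray(m, dtype=float) for m in masks])
    return stack.mean(axis=0)

def atlas_bounding_box(atlas, threshold=0.0, margin=0):
    """Box (row_start, row_stop, col_start, col_stop), stops exclusive."""
    rows, cols = np.nonzero(atlas > threshold)
    if rows.size == 0:
        raise ValueError("atlas has no pixel above the threshold")
    r0 = max(int(rows.min()) - margin, 0)
    r1 = min(int(rows.max()) + 1 + margin, atlas.shape[0])
    c0 = max(int(cols.min()) - margin, 0)
    c1 = min(int(cols.max()) + 1 + margin, atlas.shape[1])
    return r0, r1, c0, c1

def crop(image, box):
    """Crop a 2D image to the given box."""
    r0, r1, c0, c1 = box
    return image[r0:r1, c0:c1]

# Two toy 6x6 masks standing in for pancreas masks of different patients
mask_a = np.zeros((6, 6), dtype=int)
mask_a[2:4, 1:3] = 1
mask_b = np.zeros((6, 6), dtype=int)
mask_b[3:5, 2:4] = 1

atlas = probabilistic_atlas([mask_a, mask_b])
box = atlas_bounding_box(atlas)
cropped = crop(np.arange(36).reshape(6, 6), box)
print(box, cropped.shape)
```

Raising the threshold shrinks the crop to regions where the organ appears more often, at the risk of cutting off part of it in some patients.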
# To run the grid search, simply choose the directory where the filtered data is located.
$ python grid_search.py --data_dir path/to/filtered
# Training has several parameters that can be configured before execution
$ python train.py \
--data_dir path/to/data \
--batch_size 6 \
--epochs 300 \
--save_model True \
--model_path path/to/model \
--model_name model_name.pth
The best results, obtained after a 100-epoch experiment with a Dice-based loss, were:
| Metric | Value |
|---|---|
| Loss | 0.3600 |
| Dice Coefficient | 0.6450 |
| Jaccard Index | 0.4760 |

