An attention-enhanced deep learning framework for automated microplastic detection
with implications for human health and environmental sustainability
Marine microplastic pollution is one of the most pressing environmental and public health challenges of the 21st century. These microscopic plastic fragments (<5 mm) have been detected in marine ecosystems, drinking water, food chains, and even human biological tissues.
Traditional identification methods such as FTIR and Raman spectroscopy are expensive, slow, and require specialized expertise. This project presents an automated microplastic detection framework based on deep learning and transfer learning, making large-scale environmental monitoring feasible.
- CBAM-Enhanced ResNet50: Integrates the Convolutional Block Attention Module for focused feature extraction on microplastic particles
- Multi-Architecture Comparison: Comprehensive evaluation across 4 model implementations
- Cross-Framework Validation: PyTorch and TensorFlow implementations on identical data
- Grad-CAM Interpretability: Visual explanations confirming attention on particle-relevant regions
- Sustainability Aligned: Contributes to UN SDGs 3, 6, and 14
Input (224×224) → Conv1 (BN+ReLU) → Layer1 (frozen) → Layer2 (frozen) → Layer3 (1024 ch) → CBAM (1024 ch) → Layer4 (2048 ch) → CBAM (2048 ch) → Global AvgPool → FC Head (2048→512→2) → Output (2 classes)
CBAM Module: Feature Map F → Channel Attention → Spatial Attention → Refined F''
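The pipeline above can be sanity-checked by tracing tensor shapes through the backbone. The sketch below is illustrative only: it assumes torchvision-style ResNet50 kernel sizes, strides, and padding, including a max-pool after `Conv1` that the diagram does not show. The helper names `conv_out` and `trace_shapes` are hypothetical and not part of this repository.

```python
# Shape trace for a ResNet50-style backbone with two CBAM blocks.
# Assumption: torchvision-style kernel sizes, strides, and padding.

def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_shapes(input_size: int = 224) -> list:
    """Return a list of (stage, channels, spatial_size) tuples."""
    stages = []
    size = conv_out(input_size, kernel=7, stride=2, padding=3)  # 7x7 stem conv
    stages.append(("conv1 + BN + ReLU", 64, size))
    size = conv_out(size, kernel=3, stride=2, padding=1)  # 3x3 max pool (assumed)
    stages.append(("maxpool", 64, size))
    # Each residual stage downsamples at most once, in its first block.
    for name, channels, stride in [
        ("layer1", 256, 1),
        ("layer2", 512, 2),
        ("layer3", 1024, 2),
        ("layer4", 2048, 2),
    ]:
        size = conv_out(size, kernel=3, stride=stride, padding=1)
        stages.append((name, channels, size))
        if name in ("layer3", "layer4"):
            # CBAM only rescales features, so the shape is unchanged.
            stages.append((f"CBAM after {name}", channels, size))
    stages.append(("global avgpool", 2048, 1))
    return stages


for stage, channels, size in trace_shapes():
    print(f"{stage:<20} {channels:>4} ch  {size}x{size}")
```

Under these assumptions the two CBAM blocks operate on 14×14 and 7×7 feature maps, which is why they add attention at relatively low spatial cost.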
| Model | Framework | Test Acc. | F1-Score | AUC-ROC | Parameters | Final Loss |
|---|---|---|---|---|---|---|
| ResNet50 | PyTorch | 100.00% | 1.00 | 1.00 | 25.0M | 0.1099 |
| ResNet50 + CBAM | PyTorch | 100.00% | 1.00 | 1.00 | 25.2M | 0.1014 |
| EfficientNet-B0 | PyTorch | 100.00% | 1.00 | 1.00 | 4.7M | 0.1159 |
| ResNet50 (TF) | TensorFlow | 95.25% | 0.95 | 0.97 | 23.6M | 0.2700 |
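- ResNet50 + CBAM reaches the lowest final loss (0.1014), which is consistent with more confident, better-calibrated predictions.
- EfficientNet-B0 matches the test performance of the PyTorch ResNet50 models with about 5.4× fewer parameters (4.7M vs. 25.2M), making it a good candidate for edge deployment.
- Cross-framework validation shows that implementation-level differences affect model performance: the TensorFlow ResNet50 reaches 95.25% test accuracy versus 100.00% for the PyTorch models.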
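The parameter-efficiency figure can be reproduced directly from the results table. This is a small arithmetic sketch; the dictionary simply restates the "Parameters" column, and the helper name `param_ratio` is hypothetical.

```python
# Parameter counts in millions, copied from the results table above.
PARAMS_M = {
    "ResNet50": 25.0,
    "ResNet50 + CBAM": 25.2,
    "EfficientNet-B0": 4.7,
    "ResNet50 (TF)": 23.6,
}


def param_ratio(larger: str, smaller: str) -> float:
    """How many times more parameters `larger` has than `smaller`."""
    return PARAMS_M[larger] / PARAMS_M[smaller]


ratio_cbam = param_ratio("ResNet50 + CBAM", "EfficientNet-B0")
ratio_vanilla = param_ratio("ResNet50", "EfficientNet-B0")
print(f"ResNet50 + CBAM vs EfficientNet-B0: {ratio_cbam:.1f}x")
print(f"ResNet50 vs EfficientNet-B0:        {ratio_vanilla:.1f}x")
```

The 5.4× figure corresponds to the comparison against ResNet50 + CBAM; against vanilla ResNet50 the ratio is slightly lower.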
    Microplastics_detection/
    │
    ├── dataset/
    │   ├── train/
    │   │   ├── microplastic/            # 1,600 images
    │   │   └── non_microplastic/        # 1,600 images
    │   ├── val/
    │   │   ├── microplastic/            # 200 images
    │   │   └── non_microplastic/        # 200 images
    │   └── test/
    │       ├── microplastic/            # 200 images
    │       └── non_microplastic/        # 200 images
    │
    ├── results/
    │   ├── resnet50_cbam_best.pth       # Best model weights
    │   ├── training_curves.png          # Loss & accuracy plots
    │   ├── confusion_matrix.png         # Test set confusion matrix
    │   ├── roc_curve.png                # ROC curve with AUC
    │   ├── classification_report.txt    # Precision, recall, F1
    │   └── gradcam_*.png                # Grad-CAM heatmaps
    │
    ├── results_vanilla/                 # Vanilla ResNet50 results
    ├── results_efficientnet/            # EfficientNet-B0 results
    │
    ├── augment_mp.py                    # Image augmentation (781 → 2000)
    ├── split_dataset.py                 # Train/val/test split
    ├── train_resnet50_cbam.py           # ResNet50 + CBAM training
    ├── train_resnet50_vanilla.py        # Vanilla ResNet50 training
    ├── train_efficientnet.py            # EfficientNet-B0 training
    │
    ├── microplastic_ieee_paper.tex      # IEEE format research paper
    └── README.md
    pip install torch torchvision matplotlib scikit-learn seaborn tqdm pillow numpy

Expand 781 original microplastic images to 2,000 using microscopy-appropriate augmentations:

    # Update INPUT_DIR in augment_mp.py to your image folder
    python augment_mp.py

Augmentations applied: rotation, flipping, brightness/contrast jitter, Gaussian blur, noise injection, random crop-resize, and sharpening (2 to 4 per image).
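To make the augmentation step concrete, the sketch below plans which operations would be applied to reach 2,000 images from 781 originals. It is not the logic of `augment_mp.py`: it assumes that "2 to 4 per image" means 2 to 4 randomly chosen operations per generated image, and the operation names and the `plan_augmentations` helper are hypothetical. It only builds a plan; no image I/O is performed.

```python
import random

# Operation names mirror the list above; the identifiers are illustrative.
AUGMENTATIONS = [
    "rotation",
    "flip",
    "brightness_contrast_jitter",
    "gaussian_blur",
    "noise_injection",
    "random_crop_resize",
    "sharpening",
]


def plan_augmentations(n_original: int = 781, n_target: int = 2000, seed: int = 0):
    """Return one (source_index, operations) pair per image to generate."""
    rng = random.Random(seed)
    n_new = n_target - n_original  # images that still need to be synthesized
    plan = []
    for i in range(n_new):
        source = i % n_original  # cycle through originals for even coverage
        k = rng.randint(2, 4)  # assumed: 2 to 4 operations per new image
        plan.append((source, rng.sample(AUGMENTATIONS, k)))
    return plan


plan = plan_augmentations()
print(f"{len(plan)} augmented images planned; first entry: {plan[0]}")
```

Cycling through the originals keeps any single source image from dominating the augmented set, which matters when the originals are as few as 781.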
Create balanced train/val/test splits (80/10/10):

    # Update paths in split_dataset.py
    python split_dataset.py

Train each model:

    # Train ResNet50 + CBAM (primary model)
    python train_resnet50_cbam.py
    # Train vanilla ResNet50 (baseline)
    python train_resnet50_vanilla.py
    # Train EfficientNet-B0 (efficiency comparison)
    python train_efficientnet.py

Training time: ~25 min per model on Apple M-series (MPS); ~15 min on an NVIDIA GPU.
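For reference, the balanced 80/10/10 split described above can be sketched with the standard library alone. This is an assumption about how such a split might be done, not the contents of `split_dataset.py`; the file names and the `split_items` helper are hypothetical.

```python
import random


def split_items(items, percents=(80, 10, 10), seed=42):
    """Shuffle items and cut them into train/val/test by integer percentages."""
    if sum(percents) != 100:
        raise ValueError("percents must sum to 100")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed for a reproducible split
    n_train = len(shuffled) * percents[0] // 100
    n_val = len(shuffled) * percents[1] // 100
    return {
        "train": shuffled[:n_train],
        "val": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],
    }


# Placeholder file names: 2,000 images per class, as in the dataset table.
class_files = {
    name: [f"{name}_{i:04d}.png" for i in range(2000)]
    for name in ("microplastic", "non_microplastic")
}
# Splitting each class separately keeps every subset balanced.
splits = {name: split_items(files) for name, files in class_files.items()}
for name, parts in splits.items():
    print(name, {k: len(v) for k, v in parts.items()})
```

Note that splitting after augmentation can place augmented variants of the same original image in different subsets; whether the project's script guards against this is not stated here.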
All results are automatically saved to their respective results/ directories:
- Training curves (loss & accuracy)
- Confusion matrices
- ROC curves with AUC scores
- Classification reports
- Grad-CAM heatmaps (ResNet50 models)
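As a reminder of how the numbers in a classification report relate to a confusion matrix, here is a minimal sketch for the binary case. The counts are made up purely for illustration and are not results from this project; the helper name `binary_metrics` is hypothetical.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall, and F1 for the positive class."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for a 400-image test set (200 per class).
metrics = binary_metrics(tp=180, fp=10, fn=20, tn=190)
for key, value in metrics.items():
    print(f"{key:<10} {value:.4f}")
```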
The Convolutional Block Attention Module applies two sequential attention mechanisms:
Channel Attention ("what features to focus on"):
M_c(F) = σ(MLP(AvgPool(F)) + MLP(MaxPool(F)))
Spatial Attention ("where to focus"):
M_s(F') = σ(Conv7×7([AvgPool(F'); MaxPool(F')]))
CBAM adds only ~213K parameters (+0.9%) to ResNet50 while providing:
- More focused attention on particle regions
- Faster convergence during training
- Better-calibrated prediction confidence
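The two attention formulas can be illustrated with a dependency-free forward pass on a toy feature map. This is a didactic sketch, not the project's PyTorch module: it assumes a bias-free shared MLP with ReLU, zero padding in the 7×7 convolution, and random weights; all names (`channel_attention`, `spatial_attention`, `cbam`) are hypothetical.

```python
import math
import random


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(F, W1, W2):
    """M_c(F): one weight in (0, 1) per channel. F is indexed [c][i][j]."""
    n = len(F[0]) * len(F[0][0])
    avg = [sum(map(sum, ch)) / n for ch in F]  # global average pool
    mx = [max(map(max, ch)) for ch in F]  # global max pool

    def mlp(v):  # shared two-layer MLP with ReLU, no biases (assumption)
        hidden = [max(0.0, sum(w * x for w, x in zip(row, v))) for row in W1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in W2]

    return [sigmoid(a + m) for a, m in zip(mlp(avg), mlp(mx))]


def spatial_attention(F, kernel):
    """M_s(F'): one weight in (0, 1) per pixel. kernel is indexed [2][k][k]."""
    C, H, W = len(F), len(F[0]), len(F[0][0])
    avg = [[sum(F[c][i][j] for c in range(C)) / C for j in range(W)] for i in range(H)]
    mx = [[max(F[c][i][j] for c in range(C)) for j in range(W)] for i in range(H)]
    maps = [avg, mx]  # the two-channel map [AvgPool; MaxPool]
    k = len(kernel[0])
    pad = k // 2
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            acc = 0.0
            for m in range(2):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < H and 0 <= x < W:  # zero padding
                            acc += kernel[m][di][dj] * maps[m][y][x]
            out[i][j] = sigmoid(acc)
    return out


def cbam(F, W1, W2, kernel):
    """F'' = M_s(F') * F', where F' = M_c(F) * F."""
    mc = channel_attention(F, W1, W2)
    F1 = [[[mc[c] * v for v in row] for row in F[c]] for c in range(len(F))]
    ms = spatial_attention(F1, kernel)
    return [
        [[ms[i][j] * F1[c][i][j] for j in range(len(ms[0]))] for i in range(len(ms))]
        for c in range(len(F))
    ]


rng = random.Random(0)
C, H, W, HIDDEN = 4, 8, 8, 2  # toy sizes; the real model uses 1024 and 2048 channels
F = [[[rng.uniform(-1, 1) for _ in range(W)] for _ in range(H)] for _ in range(C)]
W1 = [[rng.uniform(-1, 1) for _ in range(C)] for _ in range(HIDDEN)]
W2 = [[rng.uniform(-1, 1) for _ in range(HIDDEN)] for _ in range(C)]
KERNEL = [[[rng.uniform(-0.1, 0.1) for _ in range(7)] for _ in range(7)] for _ in range(2)]
refined = cbam(F, W1, W2, KERNEL)
print(len(refined), len(refined[0]), len(refined[0][0]))
```

Because both attention maps lie in (0, 1), the refined map has the same shape as the input and every activation is scaled down or left nearly unchanged, never amplified.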
Grad-CAM heatmaps reveal where the model looks when making predictions:
| Aspect | CBAM Model | Vanilla Model |
|---|---|---|
| Attention Pattern | Concentrated on particles | Diffuse across image |
| Background Suppression | Strong | Weak |
| Scientific Validity | Focuses on morphology | Relies on context |
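For readers unfamiliar with how such heatmaps are produced, the core Grad-CAM computation is sketched below on plain nested lists. It assumes the activations and the class-score gradients of the chosen convolutional layer have already been extracted (in practice via framework hooks); the `grad_cam` helper and the toy inputs are hypothetical.

```python
def grad_cam(activations, gradients):
    """Grad-CAM heatmap from [K][H][W] activations and matching gradients."""
    K, H, W = len(activations), len(activations[0]), len(activations[0][0])
    # Channel importance: global average of the gradient for each channel.
    weights = [sum(map(sum, g)) / (H * W) for g in gradients]
    # Weighted sum of activation maps, followed by ReLU.
    cam = [
        [max(0.0, sum(weights[k] * activations[k][i][j] for k in range(K)))
         for j in range(W)]
        for i in range(H)
    ]
    peak = max(map(max, cam))
    if peak > 0:  # scale to [0, 1] for display as a heatmap
        cam = [[v / peak for v in row] for row in cam]
    return cam


# Toy example: channel 0 supports the class, channel 1 argues against it.
toy_activations = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 1.0]],
]
toy_gradients = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[-1.0, -1.0], [-1.0, -1.0]],
]
heatmap = grad_cam(toy_activations, toy_gradients)
print(heatmap)
```

The ReLU discards regions that only carry evidence against the class, so only the pixel activated by the supporting channel remains highlighted.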
| Property | Value |
|---|---|
| Total Images | 4,000 (2,000 per class) |
| Original Microplastic | 781 images (augmented to 2,000) |
| Original Non-Microplastic | 5,000 images (subsampled to 2,000) |
| Image Size | 224 × 224 pixels |
| Microplastic Source | Laboratory petri dish captures |
| Non-Microplastic Source | IFCB flow cytometry imaging |
| Split Ratio | 80% train / 10% val / 10% test |
| Hyperparameter | PyTorch Models | TensorFlow Model |
|---|---|---|
| Epochs | 30 | 20 |
| Optimizer | Adam | Adam |
| Learning Rate | 1×10⁻⁴ | 1×10⁻⁴ |
| Weight Decay | 1×10⁻⁴ | - |
| Scheduler | Cosine Annealing | ReduceLROnPlateau |
| Mixup α | 0.2 | - |
| Loss | Weighted CE | Weighted CE |
| Hardware | Apple MPS | NVIDIA T4 (Colab) |
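Two of the PyTorch-side settings in the table, cosine annealing and Mixup, are easy to illustrate without a framework. The sketch below assumes the schedule spans all 30 epochs and decays to a minimum learning rate of 0 (neither is stated in the table), and it applies Mixup to toy vectors rather than image batches. The helper names `cosine_lr` and `mixup` are hypothetical.

```python
import math
import random


def cosine_lr(epoch: int, total_epochs: int = 30, base_lr: float = 1e-4,
              min_lr: float = 0.0) -> float:
    """Cosine-annealed learning rate at the given epoch."""
    cos_term = 1 + math.cos(math.pi * epoch / total_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * cos_term


def mixup(x_a, x_b, y_a, y_b, alpha: float = 0.2, rng=random):
    """Blend two samples and their one-hot labels with lam ~ Beta(alpha, alpha)."""
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam


for epoch in (0, 15, 30):
    print(f"epoch {epoch:>2}: lr = {cosine_lr(epoch):.2e}")

mixed_x, mixed_y, lam = mixup(
    [1.0, 0.0], [0.0, 1.0],  # two toy inputs
    [1.0, 0.0], [0.0, 1.0],  # their one-hot labels
    rng=random.Random(0),
)
print(f"lam = {lam:.3f}, mixed label = {mixed_y}")
```

With α = 0.2 the Beta distribution concentrates near 0 and 1, so most mixed samples stay close to one of the two originals.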
The complete IEEE-format research paper is included as microplastic_ieee_paper.tex. It covers:
- Comprehensive literature review (31 references, Chicago style)
- Full methodology with CBAM mathematical formulation
- Four-model comparative analysis
- Training convergence and parameter efficiency study
- Grad-CAM interpretability analysis
- Cross-framework (PyTorch vs TensorFlow) validation
- Implications for human health and UN SDGs
- Honest limitations and 8 future work directions
Planned future work directions:
- Same-domain validation with unified imaging protocols
- Multi-class morphotype classification (fiber, fragment, film, pellet, foam)
- Spectral-visual data fusion combining CNN features with FTIR/Raman data
- Edge deployment via quantization and pruning for portable devices
- Object detection using YOLOv8 for particle-level localization and counting
- Generative augmentation using GANs/diffusion models for synthetic training data
- Polymer identification through multi-task learning
- Longitudinal monitoring integration with automated sampling stations
The microplastic and non-microplastic images originate from distinct imaging modalities (petri dish photography vs. flow cytometry), which may allow models to leverage imaging-domain features rather than particle morphology alone. This is transparently documented in the research paper. The framework is validated as a laboratory pre-screening tool, with same-domain evaluation identified as a priority for future work.
- Woo et al. (2018). CBAM: Convolutional Block Attention Module. ECCV.
- He et al. (2016). Deep Residual Learning. CVPR.
- Tan & Le (2019). EfficientNet. ICML.
- Selvaraju et al. (2017). Grad-CAM. ICCV.
- Jambeck et al. (2015). Plastic waste inputs into the ocean. Science.
Full bibliography with 31 Chicago-style references available in the research paper.
Built with 🧪 science and 💻 deep learning for a cleaner planet
If this project helped you, consider giving it a ⭐