Developer: Alfarouq Ibrahim | Robotics & Automation Intern
Project: Multi-Modal Edge-AI Inspection System for Industrial Gears
Dataset Link: https://www.kaggle.com/datasets/alfarouqibrahim/decodlabes-project-2-gear-dataset
This project is an advanced Edge-AI research prototype developed during the DecodeLabs Robotics & Automation Internship. The objective is to automate the inspection of mechanical gears on an assembly line, detecting structural defects (like broken teeth) with high industrial reliability.
Instead of relying solely on classical Computer Vision (which is sensitive to lighting and reflections), this project implements a Multi-Modal Explainable AI (XAI) Fusion Architecture.
The visual model is a Convolutional Neural Network built entirely from scratch in NumPy. It implements forward propagation, backpropagation, and cross-entropy loss without heavy frameworks such as PyTorch or TensorFlow, which keeps the dependency footprint small enough for edge devices.
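As an illustration of the kind of building block such a from-scratch network needs, here is a minimal sketch of a softmax cross-entropy loss and its gradient in plain NumPy. The function name and the toy logits are hypothetical; the actual code in scratch_cnn.py may be organized differently.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Return the mean cross-entropy loss and its gradient w.r.t. the logits.

    logits: (batch, classes) raw scores; labels: (batch,) integer class ids.
    Hypothetical helper, not taken from scratch_cnn.py.
    """
    # Subtract the row-wise maximum so np.exp cannot overflow.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    probs = exp / exp.sum(axis=1, keepdims=True)

    n = logits.shape[0]
    rows = np.arange(n)
    loss = -np.log(probs[rows, labels] + 1e-12).mean()

    # Gradient of softmax + cross-entropy: (probs - one_hot) / batch size.
    grad = probs.copy()
    grad[rows, labels] -= 1.0
    grad /= n
    return loss, grad

# Toy batch: two samples, two classes (0 = intact, 1 = defective; assumed order).
logits = np.array([[2.0, 0.5], [0.1, 1.2]])
labels = np.array([0, 1])
loss, grad = softmax_cross_entropy(logits, labels)
```

The returned gradient is what backpropagation would feed into the last dense layer; each row sums to zero because the softmax outputs sum to one.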
Inspired by the work of Kai Zhou and Jiong Tang, the system includes a data-independent physical engine. It extracts the gear's outer contour, unrolls it into a 1D signal, and applies a Fast Fourier Transform (FFT) to detect harmonic distortions caused by broken teeth.
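The idea can be sketched on a synthetic radial profile. An intact gear repeats once per tooth, so its spectrum concentrates at multiples of the tooth count, while a missing tooth spreads energy into the other bins. The function name, the cosine tooth profile, and the tooth count below are assumptions for illustration; geometric_sonification.py works on contours extracted from real images and may use a different measure.

```python
import numpy as np

def off_harmonic_ratio(radius, n_teeth):
    """Fraction of non-DC spectral energy that lies off the tooth harmonics.

    radius: contour radius sampled at evenly spaced angles over one revolution.
    Hypothetical helper, not taken from geometric_sonification.py.
    """
    # Remove the mean radius so the DC bin does not dominate.
    power = np.abs(np.fft.rfft(radius - radius.mean())) ** 2
    total = power[1:].sum()
    # An intact gear repeats n_teeth times per revolution, so its energy
    # sits at integer multiples of n_teeth.
    harmonic = power[n_teeth::n_teeth].sum()
    return 1.0 - harmonic / total

n_teeth, n_samples = 16, 1024
theta = np.linspace(0.0, 2.0 * np.pi, n_samples, endpoint=False)

# Simplified intact gear: a smooth cosine stands in for the tooth profile.
intact = 1.0 + 0.1 * np.cos(n_teeth * theta)

# Broken gear: flatten one tooth (root to root) down to the root radius.
broken = intact.copy()
per_tooth = n_samples // n_teeth
start = per_tooth // 2  # index of the first root of the cosine profile
broken[start:start + per_tooth] = 0.9

intact_score = off_harmonic_ratio(intact, n_teeth)
broken_score = off_harmonic_ratio(broken, n_teeth)
```

Here `intact_score` is essentially zero and `broken_score` is clearly larger, so a threshold on this ratio could flag a defect without any training data.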
The system doesn't act as a black box. It features a custom green-terminal dashboard that displays three independent evaluations alongside the final fused decision:
- Classical PLC Gate: Deterministic OpenCV logic (Gaussian Blur, Thresholding, Convexity Defects) with dynamic bounding boxes targeting only the gear teeth.
- Visual AI: The NumPy CNN prediction.
- Signal AI: The FFT mathematical prediction.
- Final Fused Decision: A confidence-weighted late fusion combining all modalities for maximum fault tolerance.
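One common way to implement a confidence-weighted late fusion is to weight each modality by how far its probability sits from 0.5. The sketch below follows that assumption; the function name, dictionary keys, and weighting rule are hypothetical and are not taken from fusion_coordinator.py.

```python
import numpy as np

def fuse_late(defect_probs, base_weights=None):
    """Fuse per-modality defect probabilities into a single probability.

    defect_probs: dict mapping modality name -> P(defect) in [0, 1].
    base_weights: optional dict of prior trust per modality (default 1.0).
    Hypothetical helper illustrating confidence weighting.
    """
    names = sorted(defect_probs)
    p = np.array([defect_probs[k] for k in names], dtype=float)
    base = np.array(
        [1.0 if base_weights is None else base_weights[k] for k in names]
    )
    # Confidence is 0 for an undecided 0.5 and 1 for a certain 0.0 or 1.0.
    confidence = 2.0 * np.abs(p - 0.5)
    w = base * confidence
    if w.sum() == 0.0:
        return 0.5  # every modality is undecided
    return float(np.dot(w, p) / w.sum())

fused = fuse_late({"plc_gate": 1.0, "visual_ai": 0.9, "signal_ai": 0.4})
# fused is approximately 0.9: the hesitant signal_ai vote carries little weight.
```

With this rule, a modality that is unsure contributes little, so a single confused channel does not overturn two confident ones.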
Due to the lack of high-fidelity industrial datasets, a custom Python script was written for Blender (Cycles Engine). It uses Involute Curve mathematics to procedurally generate over 2,000 photorealistic 1080p images of intact and defective Carbon Steel gears.
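For reference, the involute of a circle with base radius r_b has the parametric form x = r_b(cos t + t sin t), y = r_b(sin t - t cos t). The sketch below samples one tooth flank from the base circle to the tip circle using standard spur-gear relations. The function name and the module, tooth count, and pressure angle values are illustrative assumptions, not the parameters of the Blender script.

```python
import numpy as np

def involute_flank(module, n_teeth, pressure_angle_deg=20.0, n_points=50):
    """Sample one involute flank from the base circle to the tip circle.

    Hypothetical helper; returns (points, base_radius, tip_radius).
    """
    pitch_r = module * n_teeth / 2.0                      # pitch radius
    base_r = pitch_r * np.cos(np.radians(pressure_angle_deg))
    tip_r = pitch_r + module                              # standard addendum
    # On the involute, the distance from the centre is base_r * sqrt(1 + t^2),
    # so the tip circle is reached at this parameter value.
    t_max = np.sqrt((tip_r / base_r) ** 2 - 1.0)
    t = np.linspace(0.0, t_max, n_points)
    x = base_r * (np.cos(t) + t * np.sin(t))
    y = base_r * (np.sin(t) - t * np.cos(t))
    return np.column_stack([x, y]), base_r, tip_r

points, base_r, tip_r = involute_flank(module=2.0, n_teeth=24)
```

Mirroring this flank and repeating it around the centre at the tooth pitch angle gives a full outline that could be turned into a mesh; removing one tooth's points yields a defective variant.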
- main.py: The core fusion engine and dashboard UI.
- scratch_cnn.py: The custom NumPy-based CNN architecture.
- geometric_sonification.py: The FFT signal extraction and analysis model.
- fusion_coordinator.py: The late-fusion logic combining CNN and FFT probabilities.
- METHODOLOGY_AND_ARCHITECTURE.md: Formal academic documentation of the architecture.
- gear_data/: The generated dataset (not fully uploaded due to size limits; use --prepare-data to initialize).
1. Clone the repository and install requirements:
pip install -r requirements.txt
2. Run the inference dashboard (make sure you have downloaded the weights file gear_cnn_hd.npz and placed some test images in gear_data/unlabeled):
python main.py --load-model gear_cnn_hd.npz --epochs 0 --image-size 96 --infer-dir gear_data/unlabeled
DecodeLabs: For the internship opportunity and the classical CV pipeline baseline.
Kai Zhou, Jiong Tang: For the concept and inspiration behind the FFT/Spectral Signal analysis module. (https://data.mendeley.com/datasets/87y47nvsf4/1)