A desktop application for training and running MobileNetV2 SSD-Lite object detection models. Built with PyQt5 and powered by pytorch-ssd.

| Feature | Description |
|---|---|
| Annotation | Built-in YOLO-format labelling tool (labelimg2) |
| Dataset prep | Automatic train / val / test split via GUI or CLI |
| Training | MobileNetV2 SSD-Lite, YOLO .txt labels, SGD with momentum |
| Photo detection | Load any image, run inference, view annotated result |
| Camera detection | Real-time detection from a webcam |

Requirements:

| Package | Version |
|---|---|
| Python | 3.9 – 3.12 |
| PyTorch | ≥ 2.1.0 |
| torchvision | ≥ 0.16.0 |
| OpenCV | ≥ 4.8.0 (opencv-python) |
| PyQt5 | ≥ 5.15.9 |
| NumPy | ≥ 1.24.0 |

Install all dependencies:

pip install -r requirements.txt

Then launch the application:

python main.py

The main menu has three options:
- 数据集训练 — Dataset Training
- 摄像头模型验证 — Live Camera Detection
- 照片模型验证 — Single-Image Detection
Raw images
│
▼
[1] Annotate with labelimg2
│ python labelimg2/main.py --img_dir <images/> --save_dir <labels/>
▼
[2] Prepare dataset (train/val/test split)
│ python prepare_dataset.py --src <raw/> --dst <dataset/> --class-names beer,coke
▼
[3] Train
│ python -m mobilenet.train (or GUI)
▼
[4] Weights saved → weights/mb2-ssd-lite-best.pth
│ weights/mb2-ssd-labels.txt
▼
[5] Inference
python main.py → Photo / Camera detection
python labelimg2/main.py \
--img_dir /path/to/your/images \
--save_dir /path/to/save/labels

- Draw bounding boxes, assign class names, and save in YOLO `.txt` format.
- Each saved file is named `<image_stem>.txt` and contains one line per object:

  <class_id> <cx> <cy> <width> <height>

  All coordinates are normalized to [0, 1].

To annotate from the GUI:

- Launch `python main.py` → 数据集训练 (Dataset Training).
- Click 选择数据集目录 (Select Dataset Folder) and pick a folder that already has `images/` and `labels/` subdirectories.
- Click 开始标注 (Start Annotation): labelimg2 opens automatically, pointing at that folder.

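To make the label format concrete, here is a minimal sketch that parses one label line and converts the normalized box to pixel corner coordinates. The function name `yolo_line_to_pixels` and the 640 × 480 image size are illustrative assumptions; they are not part of labelimg2 or this project.

```python
from typing import Tuple


def yolo_line_to_pixels(line: str, img_w: int, img_h: int) -> Tuple[int, float, float, float, float]:
    """Convert '<class_id> <cx> <cy> <width> <height>' to (class_id, x1, y1, x2, y2) in pixels."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    class_id = int(parts[0])
    cx, cy, w, h = (float(v) for v in parts[1:])
    # Scale the normalized centre and size to pixels, then move from centre to corners.
    x1 = (cx - w / 2) * img_w
    y1 = (cy - h / 2) * img_h
    x2 = (cx + w / 2) * img_w
    y2 = (cy + h / 2) * img_h
    return class_id, x1, y1, x2, y2


print(yolo_line_to_pixels("1 0.5 0.5 0.25 0.5", img_w=640, img_h=480))
# (1, 240.0, 120.0, 400.0, 360.0)
```

The corner form `[x1, y1, x2, y2]` matches the box layout that the inference results use later in this document.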
The training pipeline requires images and labels split into train / val / test sub-folders. Use the provided script to create this layout automatically:
python prepare_dataset.py \
--src /path/to/raw_folder \
--dst /path/to/output_dataset \
--class-names beer,coke,sprite

Arguments:

| Argument | Default | Description |
|---|---|---|
| `--src` | (required) | Folder with images and matching `.txt` label files |
| `--dst` | (required) | Output dataset folder to create |
| `--class-names` | `""` | Comma-separated class names (no spaces), e.g. `beer,coke,sprite` |
| `--train-ratio` | `0.7` | Fraction used for training |
| `--val-ratio` | `0.2` | Fraction used for validation (remainder → test) |
| `--seed` | `42` | Random seed for reproducible splits |

Output structure:
output_dataset/
├── images/
│ ├── train/ ← 70 % of images
│ ├── val/ ← 20 %
│ └── test/ ← 10 %
├── labels/
│ ├── train/ ← matching YOLO .txt files
│ ├── val/
│ └── test/
├── annotations/ ← COCO JSON stubs (used by GUI dataset checker)
└── class_names.txt
Tip: If your images and labels are already split into subfolders, you can skip this script and point the training directly at `images/train/` and `labels/train/`.

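The split logic can be sketched as follows. This is a hedged illustration of a seeded train / val / test split using the documented default ratios and seed, not the actual code in `prepare_dataset.py`; the function name `split_stems` and the truncating rounding are assumptions.

```python
import random
from typing import Dict, List, Sequence


def split_stems(stems: Sequence[str], train_ratio: float = 0.7,
                val_ratio: float = 0.2, seed: int = 42) -> Dict[str, List[str]]:
    """Shuffle file stems with a fixed seed and cut them into train/val/test."""
    if not 0 <= train_ratio + val_ratio <= 1:
        raise ValueError("train_ratio + val_ratio must be between 0 and 1")
    shuffled = sorted(stems)               # sort first so the input order does not matter
    random.Random(seed).shuffle(shuffled)  # local RNG: the same seed gives the same split
    n_train = int(len(shuffled) * train_ratio)
    n_val = int(len(shuffled) * val_ratio)
    return {
        "train": shuffled[:n_train],
        "val": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],  # whatever remains goes to test
    }


splits = split_stems([f"img_{i:03d}" for i in range(10)])
print({name: len(items) for name, items in splits.items()})
```

With ten items and the default ratios this gives 7 / 2 / 1 items. Splitting by file stem keeps each image together with its `.txt` label.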

To train from the GUI:

- `python main.py` → 数据集训练 (Dataset Training).
- 选择数据集目录 (Select Dataset Folder): pick the folder created in Step 2.
- The GUI validates the dataset structure. Fix any errors shown.
- Enter your class names (comma-separated, e.g. `beer,coke,sprite`).
- Set Epochs and Batch size.
- (Optional) Click 选择权重文件 (Select Weights File) to continue training from an existing `.pth` checkpoint.
- Click 开始训练 (Start Training).

Training logs stream in real time. Weights are saved to weights/ automatically.
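The exact checks performed by the GUI validator are not listed here. As a rough sketch, a check of the folder layout produced in Step 2 might look like the following. `find_layout_problems` is a hypothetical helper, not a function from this project, and the rule that every image needs a label file is an assumption.

```python
from pathlib import Path
from typing import List

SPLITS = ("train", "val", "test")


def find_layout_problems(dataset_dir: str) -> List[str]:
    """Return human-readable problems; an empty list means the layout looks fine."""
    root = Path(dataset_dir)
    problems = []
    for split in SPLITS:
        image_dir = root / "images" / split
        label_dir = root / "labels" / split
        for folder in (image_dir, label_dir):
            if not folder.is_dir():
                problems.append(f"missing folder: {folder.relative_to(root).as_posix()}")
        if image_dir.is_dir() and label_dir.is_dir():
            # Assumption: each image has a YOLO label file with the same stem.
            for image in sorted(image_dir.iterdir()):
                if image.is_file() and not (label_dir / f"{image.stem}.txt").is_file():
                    problems.append(f"no label for: {image.relative_to(root).as_posix()}")
    return problems
```

Calling `find_layout_problems("/path/to/output_dataset")` before training gives a quick list of missing folders or unlabeled images to fix.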
To train from the command line:

python -m mobilenet.train \
--image-dir /path/to/dataset/images/train \
--label-dir /path/to/dataset/labels/train \
--class-names beer,coke,sprite \
--epochs 20 \
--batch-size 8 \
--lr 1e-3 \
--save-dir weights

All CLI arguments:

| Argument | Default | Description |
|---|---|---|
| `--image-dir` | (required) | Path to the training images directory |
| `--label-dir` | (required) | Path to the training YOLO `.txt` labels directory |
| `--class-names` | (required) | Comma-separated class names (no background) |
| `--epochs` | `20` | Number of training epochs |
| `--batch-size` | `8` | Batch size |
| `--lr` | `1e-3` | Initial learning rate (SGD) |
| `--weights` | `""` | Pre-trained `.pth` file to continue training from |
| `--save-dir` | `weights` | Directory to save checkpoints |
| `--device` | auto | `cuda` or `cpu` |

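For orientation, the flags in the table map naturally onto an `argparse` parser. The sketch below only mirrors the documented names and defaults. It is not the parser from `mobilenet/train.py`, and the `build_parser` name, the argument types, and the `choices` for `--device` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser that mirrors the documented flags and defaults."""
    parser = argparse.ArgumentParser(prog="mobilenet.train")
    parser.add_argument("--image-dir", required=True, help="training images directory")
    parser.add_argument("--label-dir", required=True, help="training YOLO .txt labels directory")
    parser.add_argument("--class-names", required=True, help="comma-separated class names")
    parser.add_argument("--epochs", type=int, default=20)
    parser.add_argument("--batch-size", type=int, default=8)
    parser.add_argument("--lr", type=float, default=1e-3)
    parser.add_argument("--weights", default="")
    parser.add_argument("--save-dir", default="weights")
    parser.add_argument("--device", default="auto", choices=["auto", "cuda", "cpu"])
    return parser


# Only the three required flags are given, so every other value is a default.
args = build_parser().parse_args(
    ["--image-dir", "images/train", "--label-dir", "labels/train", "--class-names", "beer,coke"]
)
print(args.epochs, args.batch_size, args.lr, args.device, args.class_names.split(","))
```

Only the three required flags need to be passed; everything else falls back to the defaults listed in the table.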

After training, the following files are written to `--save-dir` (default: `weights/`):

| File | Description |
|---|---|
| `mb2-ssd-lite-best.pth` | Checkpoint with the lowest training loss |
| `mb2-ssd-lite-last.pth` | Checkpoint from the last epoch |
| `mb2-ssd-labels.txt` | Class name list (read automatically at inference time) |

`mb2-ssd-labels.txt` must stay in the same folder as the `.pth` file; it is loaded automatically when you open the model for inference.

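A minimal sketch of such a lookup, assuming the labels file holds one class name per line (the file format is not specified here) and using a hypothetical helper name `load_class_names`:

```python
from pathlib import Path
from typing import List


def load_class_names(model_path: str, labels_name: str = "mb2-ssd-labels.txt") -> List[str]:
    """Read class names from the labels file that sits beside the checkpoint."""
    labels_path = Path(model_path).parent / labels_name
    if not labels_path.is_file():
        raise FileNotFoundError(f"{labels_name} must be in the same folder as {model_path}")
    lines = labels_path.read_text(encoding="utf-8").splitlines()
    # Assumption: one class name per line; blank lines are ignored.
    return [line.strip() for line in lines if line.strip()]
```

If a checkpoint is copied elsewhere without its labels file, a lookup like this fails early with a clear error.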
Photo detection:

- `python main.py` → 照片模型验证 (Single-Image Detection).
- Select a model file (`.pth`) from the `weights/` folder.
- Adjust the confidence threshold and image size if needed.
- Click 选择图片 (Select Image): detection results are drawn on the image.

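To illustrate what the confidence threshold does, the sketch below keeps only detections whose score reaches the threshold. It works on plain Python lists with made-up values and says nothing about how the application applies its threshold internally; `filter_detections` is a hypothetical name.

```python
from typing import List, Sequence, Tuple

Detection = Tuple[Sequence[float], float, int]  # (box, score, 1-based label)


def filter_detections(detections: Sequence[Detection], threshold: float = 0.5) -> List[Detection]:
    """Keep detections with score >= threshold, highest score first."""
    kept = [d for d in detections if d[1] >= threshold]
    return sorted(kept, key=lambda d: d[1], reverse=True)


# Made-up example detections: box as [x1, y1, x2, y2], score, label.
detections = [
    ([10, 20, 110, 220], 0.91, 1),
    ([30, 40, 90, 160], 0.42, 2),
    ([200, 50, 300, 180], 0.67, 2),
]
for box, score, label in filter_detections(detections, threshold=0.5):
    print(label, f"{score:.2f}", box)
```

Raising the threshold removes low-confidence boxes at the cost of possibly missing faint objects; lowering it does the opposite.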
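Camera detection:

- `python main.py` → 摄像头模型验证 (Live Camera Detection).
- Select a model file.
- Click 开始检测 (Start Detection): bounding boxes are overlaid on the live webcam feed in real time.

To run detection from your own Python code:
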
import cv2
from mobilenet.infer import MobileNetV2Detector
# Load model (class names are read from mb2-ssd-labels.txt automatically)
detector = MobileNetV2Detector(
model_path="weights/mb2-ssd-lite-best.pth",
use_cpu=False, # set True to force CPU
)
# Run on a single image
frame = cv2.imread("photo.jpg") # BGR format
result = detector.predict(frame, score_threshold=0.5)
# result.boxes → numpy (N, 4) pixel coords [x1, y1, x2, y2]
# result.scores → numpy (N,)
# result.labels → numpy (N,) 1-based class indices
# Draw and save
annotated = detector.draw(frame, result)
cv2.imwrite("output.jpg", annotated)
print(f"Detected {len(result.boxes)} objects")
for box, score, label in zip(result.boxes, result.scores, result.labels):
    name = detector.class_names[int(label) - 1]
    print(f" {name} {score:.2f} {box.astype(int).tolist()}")

Project structure:

MB V2/
├── main.py # GUI application entry point
├── prepare_dataset.py # Raw data → train/val/test dataset converter
├── requirements.txt # Python dependencies
│
├── mobilenet/ # MobileNetV2 module (wraps pytorch-ssd)
│ ├── model.py # build_model / load_model
│ ├── train.py # Training script (CLI entry point)
│ ├── infer.py # MobileNetV2Detector class
│ ├── yolo_dataset.py # YOLO-format dataset adapter for pytorch-ssd
│ └── config.py # AppConfig defaults
│
├── pytorch-ssd/ # Upstream SSD library (qfgaohao/pytorch-ssd)
│ └── vision/ # Model, loss, transforms, predictor
│
├── labelimg2/ # Annotation tool (YOLO .txt output)
│
└── weights/ # Created by training
├── mb2-ssd-lite-best.pth
├── mb2-ssd-lite-last.pth
└── mb2-ssd-labels.txt

Notes:

- The model's input size is fixed at 300 × 300 (the SSD standard). The size combo box in the GUI controls how input images are pre-resized before they are passed to the model, not the internal network input.
- GPU acceleration: CUDA is used automatically when it is available; pass `--device cpu` or tick 强制使用CPU (Force CPU) to force CPU mode.
- Class names must be consistent between training and inference; they are written to `mb2-ssd-labels.txt` at training time.
- The `yolov7/` folder is not required and can be deleted without affecting any functionality.