OLIVER-XYP/MB_V2

MobileNetV2 Object Detection Toolkit

A desktop application for training and running MobileNetV2 SSD-Lite object detection models. Built with PyQt5 and powered by pytorch-ssd.


Features

| Feature | Description |
| --- | --- |
| Annotation | Built-in YOLO-format labelling tool (labelimg2) |
| Dataset prep | Automatic train / val / test split via GUI or CLI |
| Training | MobileNetV2 SSD-Lite, YOLO .txt labels, SGD with momentum |
| Photo detection | Load any image, run inference, view annotated result |
| Camera detection | Real-time detection from webcam |

Requirements

| Package | Version |
| --- | --- |
| Python | 3.9 – 3.12 |
| PyTorch | ≥ 2.1.0 |
| torchvision | ≥ 0.16.0 |
| OpenCV | ≥ 4.8.0 (opencv-python) |
| PyQt5 | ≥ 5.15.9 |
| NumPy | ≥ 1.24.0 |

Install all dependencies:

pip install -r requirements.txt

Launch the Application

python main.py

The main menu has three options:

  • 数据集训练 — Dataset Training
  • 摄像头模型验证 — Live Camera Detection
  • 照片模型验证 — Single-Image Detection

Complete Workflow

Raw images
    │
    ▼
[1] Annotate with labelimg2
    │  python labelimg2/main.py --img_dir <images/> --save_dir <labels/>
    ▼
[2] Prepare dataset (train/val/test split)
    │  python prepare_dataset.py --src <raw/> --dst <dataset/> --class-names beer,coke
    ▼
[3] Train
    │  python -m mobilenet.train  (or GUI)
    ▼
[4] Weights saved → weights/mb2-ssd-lite-best.pth
    │              weights/mb2-ssd-labels.txt
    ▼
[5] Inference
       python main.py → Photo / Camera detection

Step 1 — Annotate Raw Images

Option A: Standalone annotation tool

python labelimg2/main.py \
  --img_dir  /path/to/your/images \
  --save_dir /path/to/save/labels
  • Draw bounding boxes, assign class names, save in YOLO .txt format.
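  • Each saved file is named <image_stem>.txt and contains one line per object:
    <class_id>  <cx>  <cy>  <width>  <height>
    
    Here <cx>, <cy> is the box center and <width>, <height> is the box size. All four values are normalized to [0, 1] relative to the image width and height.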
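
To make the label format concrete, the sketch below converts one YOLO label line into pixel corner coordinates. The function name `yolo_line_to_pixel_box` is hypothetical and is not part of labelimg2 or this repository; it only illustrates the arithmetic implied by the normalized center/size layout.

```python
def yolo_line_to_pixel_box(line: str, img_w: int, img_h: int):
    """Parse one YOLO label line into (class_id, x1, y1, x2, y2) in pixels.

    Hypothetical helper for illustration; not part of this repository.
    """
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")

    class_id = int(parts[0])
    cx, cy, w, h = (float(v) for v in parts[1:])

    # Scale the normalized center/size values back to pixel corners.
    x1 = (cx - w / 2) * img_w
    y1 = (cy - h / 2) * img_h
    x2 = (cx + w / 2) * img_w
    y2 = (cy + h / 2) * img_h
    return class_id, x1, y1, x2, y2
```

For example, the line `0 0.5 0.5 0.5 0.5` on a 640 × 480 image describes a box centered in the image that covers half of its width and half of its height.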

Option B: Annotate from the GUI

  1. Run python main.py and choose 数据集训练 (Dataset Training).
  2. Click 选择数据集目录 (Select Dataset Directory) and pick a folder that already has images/ and labels/ subdirectories.
  3. Click 开始标注 (Start Annotating). labelimg2 opens automatically, pointing at that folder.

Step 2 — Prepare Dataset Structure

The training pipeline requires images and labels split into train / val / test sub-folders. Use the provided script to create this layout automatically:

python prepare_dataset.py \
  --src   /path/to/raw_folder \
  --dst   /path/to/output_dataset \
  --class-names beer,coke,sprite

Arguments:

| Argument | Default | Description |
| --- | --- | --- |
| --src | (required) | Folder with images + matching .txt label files |
| --dst | (required) | Output dataset folder to create |
| --class-names | "" | Comma-separated class names (no spaces), e.g. beer,coke,sprite |
| --train-ratio | 0.7 | Fraction used for training |
| --val-ratio | 0.2 | Fraction used for validation (remainder → test) |
| --seed | 42 | Random seed for reproducible splits |

Output structure:

output_dataset/
├── images/
│   ├── train/   ← 70 % of images
│   ├── val/     ← 20 %
│   └── test/    ← 10 %
├── labels/
│   ├── train/   ← matching YOLO .txt files
│   ├── val/
│   └── test/
├── annotations/ ← COCO JSON stubs (used by GUI dataset checker)
└── class_names.txt

Tip: If your images and labels are already split into subfolders, you can skip this script and point the training step directly at your existing images/train/ and labels/train/ folders.
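
The way the ratio and seed arguments interact can be sketched as a seeded shuffle followed by slicing. This is a minimal illustration under assumptions, not the actual code in prepare_dataset.py: the function name `split_stems` is hypothetical, and the real script may round the split sizes differently.

```python
import random


def split_stems(stems, train_ratio=0.7, val_ratio=0.2, seed=42):
    """Split file stems into train / val / test lists.

    Hypothetical sketch; prepare_dataset.py may differ in its details.
    """
    # Sort first so the result does not depend on directory listing order.
    ordered = sorted(stems)
    random.Random(seed).shuffle(ordered)

    n_train = int(len(ordered) * train_ratio)
    n_val = int(len(ordered) * val_ratio)
    return {
        "train": ordered[:n_train],
        "val": ordered[n_train:n_train + n_val],
        # Whatever is left after train and val goes to test.
        "test": ordered[n_train + n_val:],
    }
```

Because the shuffle uses its own seeded random.Random instance, calling the function twice with the same stems and seed gives the same split, which is the property the --seed argument is described as providing.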


Step 3 — Train

Option A: GUI

  1. Run python main.py and choose 数据集训练 (Dataset Training).
  2. Click 选择数据集目录 (Select Dataset Directory) and pick the folder created in Step 2.
  3. The GUI validates the dataset structure. Fix any errors shown.
  4. Enter your class names (comma-separated, e.g. beer,coke,sprite).
  5. Set Epochs and Batch size.
  6. (Optional) Click 选择权重文件 (Select Weights File) to continue training from an existing .pth checkpoint.
  7. Click 开始训练 (Start Training).

Training logs stream in real time. Weights are saved to weights/ automatically.

Option B: Command Line

python -m mobilenet.train \
  --image-dir  /path/to/dataset/images/train \
  --label-dir  /path/to/dataset/labels/train \
  --class-names beer,coke,sprite \
  --epochs 20 \
  --batch-size 8 \
  --lr 1e-3 \
  --save-dir weights

All CLI arguments:

| Argument | Default | Description |
| --- | --- | --- |
| --image-dir | (required) | Path to training images directory |
| --label-dir | (required) | Path to training YOLO .txt labels directory |
| --class-names | (required) | Comma-separated class names (no background) |
| --epochs | 20 | Number of training epochs |
| --batch-size | 8 | Batch size |
| --lr | 1e-3 | Initial learning rate (SGD) |
| --weights | "" | Pre-trained .pth file to continue training from |
| --save-dir | weights | Directory to save checkpoints |
| --device | auto | cuda or cpu |
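
If you want to start training from another Python script (for example, to run several configurations in sequence), one option is to assemble the same command line programmatically. The helper `build_train_command` below is a hypothetical sketch; it uses only the flags documented above and leaves out optional flags that are not set.

```python
import sys


def build_train_command(image_dir, label_dir, class_names, epochs=20,
                        batch_size=8, lr=1e-3, save_dir="weights",
                        weights="", device=None):
    """Build the argument list for `python -m mobilenet.train`.

    Hypothetical helper; it only mirrors the documented CLI flags.
    """
    cmd = [
        sys.executable, "-m", "mobilenet.train",
        "--image-dir", str(image_dir),
        "--label-dir", str(label_dir),
        # The CLI expects a single comma-separated string without spaces.
        "--class-names", ",".join(class_names),
        "--epochs", str(epochs),
        "--batch-size", str(batch_size),
        "--lr", str(lr),
        "--save-dir", str(save_dir),
    ]
    if weights:
        cmd += ["--weights", str(weights)]
    if device:
        cmd += ["--device", device]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root, so that the `mobilenet` package is importable.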

Step 4 — Saved Weight Files

After training, the following files are written to --save-dir (default: weights/):

| File | Description |
| --- | --- |
| mb2-ssd-lite-best.pth | Checkpoint with the lowest training loss |
| mb2-ssd-lite-last.pth | Checkpoint from the last epoch |
| mb2-ssd-labels.txt | Class name list (read automatically at inference time) |

mb2-ssd-labels.txt must stay in the same folder as the .pth file — it is loaded automatically when you open the model for inference.
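
The same-folder requirement can be illustrated with a small lookup that resolves the labels file next to a given checkpoint. This is a sketch under assumptions: `read_class_names` is a hypothetical helper, and it assumes the labels file holds one class name per line, which this README does not state explicitly.

```python
from pathlib import Path


def read_class_names(model_path, labels_filename="mb2-ssd-labels.txt"):
    """Read class names from the labels file next to a .pth checkpoint.

    Hypothetical helper; assumes one class name per line.
    """
    # with_name() swaps only the file name, so the folder stays the same.
    labels_path = Path(model_path).with_name(labels_filename)
    if not labels_path.is_file():
        raise FileNotFoundError(
            f"{labels_filename} not found next to {model_path}"
        )
    lines = labels_path.read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]
```

If you move or copy a checkpoint to another machine, copying the labels file together with it keeps a lookup like this working.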


Step 5 — Inference

Option A: Photo detection (GUI)

  1. Run python main.py and choose 照片模型验证 (Single-Image Detection).
  2. Select a model file (.pth) from the weights/ folder.
  3. Adjust the confidence threshold and image size if needed.
  4. Click 选择图片 (Select Image). The detection results are drawn on the image.

Option B: Live camera detection (GUI)

  1. Run python main.py and choose 摄像头模型验证 (Live Camera Detection).
  2. Select a model file.
  3. Click 开始检测 (Start Detection). Bounding boxes are overlaid on the live webcam feed in real time.

Option C: Python API

import cv2
from mobilenet.infer import MobileNetV2Detector

# Load model (class names are read from mb2-ssd-labels.txt automatically)
detector = MobileNetV2Detector(
    model_path="weights/mb2-ssd-lite-best.pth",
    use_cpu=False,   # set True to force CPU
)

# Run on a single image
frame = cv2.imread("photo.jpg")          # BGR format
result = detector.predict(frame, score_threshold=0.5)

# result.boxes  → numpy (N, 4)  pixel coords [x1, y1, x2, y2]
# result.scores → numpy (N,)
# result.labels → numpy (N,)  1-based class indices

# Draw and save
annotated = detector.draw(frame, result)
cv2.imwrite("output.jpg", annotated)

print(f"Detected {len(result.boxes)} objects")
for box, score, label in zip(result.boxes, result.scores, result.labels):
    name = detector.class_names[int(label) - 1]
    print(f"  {name}  {score:.2f}  {box.astype(int).tolist()}")
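
Because the labels are 1-based, an unexpected index of 0 would silently map to the last class name through Python's negative indexing. The sketch below counts detections per class and rejects out-of-range indices instead. The helper `count_by_class` is hypothetical; it works on any sequence of integer labels, such as result.labels from the example above.

```python
from collections import Counter


def count_by_class(labels, class_names):
    """Count detections per class name from 1-based label indices.

    Hypothetical helper; not part of mobilenet.infer.
    """
    counts = Counter()
    for label in labels:
        index = int(label)
        # Guard against 0 or out-of-range labels instead of wrapping around.
        if not 1 <= index <= len(class_names):
            raise ValueError(f"label {index} is outside 1..{len(class_names)}")
        counts[class_names[index - 1]] += 1
    return dict(counts)
```

With the detector from the example above, this could be called as `count_by_class(result.labels, detector.class_names)`.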

Project Structure

MB V2/
├── main.py                    # GUI application entry point
├── prepare_dataset.py         # Raw data → train/val/test dataset converter
├── requirements.txt           # Python dependencies
│
├── mobilenet/                 # MobileNetV2 module (wraps pytorch-ssd)
│   ├── model.py               # build_model / load_model
│   ├── train.py               # Training script (CLI entry point)
│   ├── infer.py               # MobileNetV2Detector class
│   ├── yolo_dataset.py        # YOLO-format dataset adapter for pytorch-ssd
│   └── config.py              # AppConfig defaults
│
├── pytorch-ssd/               # Upstream SSD library (qfgaohao/pytorch-ssd)
│   └── vision/                # Model, loss, transforms, predictor
│
├── labelimg2/                 # Annotation tool (YOLO .txt output)
│
└── weights/                   # Created by training
    ├── mb2-ssd-lite-best.pth
    ├── mb2-ssd-lite-last.pth
    └── mb2-ssd-labels.txt

Notes

  • Image size used by the model is fixed at 300 × 300 (SSD standard). The size combo in the GUI affects how input images are pre-resized before passing to the model, not the internal network input.
  • GPU acceleration: if CUDA is available it is used automatically; pass --device cpu on the command line or tick 强制使用CPU (Force CPU) in the GUI to force CPU mode.
  • Class names must be consistent between training and inference; they are embedded in mb2-ssd-labels.txt at training time.
  • The yolov7/ folder is not required and can be deleted without affecting any functionality.
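
The device behavior described in the notes can be sketched as a small decision function. `resolve_device` is a hypothetical name and the repository's own logic may differ; in particular, raising an error when cuda is requested but unavailable is an assumption made for this sketch.

```python
def resolve_device(requested: str, cuda_available: bool) -> str:
    """Map a --device style value to "cuda" or "cpu".

    Hypothetical sketch of the auto-selection rule, not the repository's code.
    """
    choice = requested.lower()
    if choice not in ("auto", "cuda", "cpu"):
        raise ValueError(f"unknown device: {requested!r}")
    if choice == "auto":
        # Prefer the GPU whenever one is usable.
        return "cuda" if cuda_available else "cpu"
    if choice == "cuda" and not cuda_available:
        # Assumption: fail loudly rather than fall back silently.
        raise RuntimeError("CUDA was requested but is not available")
    return choice
```

In a PyTorch program the second argument would typically come from `torch.cuda.is_available()`.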
