WideHRNet: an Efficient Model for Human Pose Estimation Using Wide Channels in Lightweight High-Resolution Network.

Abstract

In building robust AI models, most researchers tend to add more layers, focusing on increasing depth. This led us to ask: what would happen if we did the opposite? Specifically, what if a model for human pose estimation emphasized increasing the number of neurons (i.e., a wider network) rather than adding more layers (i.e., a deeper network)? This question became the starting point of our paper.

You can read the full paper here

Figure: Building blocks. (a) The proposed block, which is inspired by several existing blocks, including (b) the conditional channel weighting (CCW) block and (c) the inverted residual block. All blocks use a stride of 1. Conv: convolution, BN: batch normalization, SE: squeeze-excitation block, CRW: cross-resolution weights block, SW: spatial weights block.

Results and models

Results on MPII val set

Results use ground-truth bounding boxes. The metric is PCKh. The channel expansion ratio and the SE reduction ratio are both set to 4.

Model	Input Size	#Params	FLOPs	PCKh	config	log	weight
Wide-HRNet-18	256x256	2.7M	0.96G	87.99	config	log	weight
Wide-HRNet-18 + SE	256x256	4.4M	0.97G	88.47	config	log	weight
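
For readers unfamiliar with the metric, PCKh counts a predicted joint as correct when its distance to the ground-truth joint is within a fraction (commonly 0.5) of the person's head segment length. The sketch below illustrates that definition in plain Python. The function name `pckh`, its argument layout, and the toy numbers are assumptions made for illustration; this is not the evaluation code used to produce the table, and boundary handling may differ between implementations.

```python
import math


def pckh(preds, gts, head_sizes, visible, thr=0.5):
    """Return PCKh in percent.

    preds, gts : per-person lists of (x, y) joint coordinates
    head_sizes : per-person head segment length in pixels
    visible    : per-person lists of booleans (True = joint is annotated)
    thr        : fraction of the head size used as the distance threshold
    """
    correct = 0
    total = 0
    for pred, gt, head, vis in zip(preds, gts, head_sizes, visible):
        for (px, py), (gx, gy), v in zip(pred, gt, vis):
            if not v:
                continue  # unannotated joints do not count either way
            total += 1
            if math.hypot(px - gx, py - gy) <= thr * head:
                correct += 1
    return 100.0 * correct / total if total else 0.0


# Toy example: one person, two annotated joints, head size 12 px.
# The first joint is 5 px off (inside the 6 px threshold), the second 10 px off.
score = pckh(
    preds=[[(0.0, 0.0), (0.0, 0.0)]],
    gts=[[(3.0, 4.0), (6.0, 8.0)]],
    head_sizes=[12.0],
    visible=[[True, True]],
)
print(score)
```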

Results on COCO val2017

Results use person detections from a detector that obtains 56 mAP on the person category. The channel expansion ratio and the SE reduction ratio are both set to 4.

Model	Input Size	#Params	FLOPs	AP	AR	config	log	weight
Wide-HRNet-18 + SE	256x192	4.4M	0.9G	70.0	75.8	config	log	weight
Wide-HRNet-18 + SE	384x288	4.4M	1.6G	70.1	76.6	config	log	weight
Wide-HRNet-30 + SE	256x192	7.19M	1.21G	71.47	77.14	config	log	weight
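
COCO keypoint AP and AR are built on object keypoint similarity (OKS), which plays the role that IoU plays in detection: a prediction matches a ground-truth person when their OKS exceeds a threshold, and the scores are averaged over thresholds from 0.50 to 0.95. The sketch below shows the OKS formula for a single person. The function name `oks` and the example values (including the per-keypoint `sigmas`) are illustrative assumptions, not the evaluation code behind the table.

```python
import math


def oks(pred, gt, visibility, area, sigmas):
    """Object keypoint similarity between one prediction and one ground truth.

    pred, gt   : lists of (x, y) keypoint coordinates
    visibility : per-keypoint ground-truth flags (> 0 means labeled)
    area       : ground-truth object area in pixels (the squared scale term)
    sigmas     : per-keypoint falloff constants
    """
    total = 0.0
    count = 0
    for (px, py), (gx, gy), v, sigma in zip(pred, gt, visibility, sigmas):
        if v <= 0:
            continue  # unlabeled keypoints are skipped
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        k2 = (2.0 * sigma) ** 2
        total += math.exp(-d2 / (2.0 * area * k2))
        count += 1
    return total / count if count else 0.0


# A perfect prediction gives an OKS of 1.0.
perfect = oks([(5.0, 5.0)], [(5.0, 5.0)], [2], area=100.0, sigmas=[0.05])
# A prediction that is off by (1, 1) pixels on a small object scores lower.
shifted = oks([(1.0, 1.0)], [(0.0, 0.0)], [2], area=100.0, sigmas=[0.05])
print(perfect, shifted)
```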

Usage

The code was developed and tested on Ubuntu 22.04. We trained and tested the model on a single RTX 3060 GPU, and we also trained WideHRNet on 8 NVIDIA V100 GPUs. Other platforms and GPUs (except the NVIDIA 1080 Ti) are not fully tested.

Requirements

Linux
Python 3.8
mmcv 1.4.8
PyTorch 1.9.0
CUDA >= 11.1

pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
pip install mmcv-full==1.4.8 -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.9.0/index.html
pip install mmdet==2.28.2
pip install mmpose==0.29.0
# then clone this repository
pip install -r requirements.txt
pip install -v -e .

After installing these libraries, install timm and einops:

pip install timm==0.4.9 einops
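
Since the setup relies on several pinned versions, it can help to confirm what actually got installed before training. The following is a minimal sketch using only the Python standard library; the helper name `check_versions` is hypothetical and not part of this repository, and the expected versions are copied from the install commands above.

```python
from importlib import metadata

# Versions pinned in the installation commands above.
EXPECTED = {
    "torch": "1.9.0",
    "torchvision": "0.10.0",
    "mmcv-full": "1.4.8",
    "mmdet": "2.28.2",
    "mmpose": "0.29.0",
    "timm": "0.4.9",
}


def check_versions(expected=EXPECTED):
    """Map each package name to (installed version or None, matches expected)."""
    report = {}
    for name, wanted in expected.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Local build tags such as "+cu111" are ignored for the comparison.
        ok = found is not None and found.split("+")[0] == wanted
        report[name] = (found, ok)
    return report


if __name__ == "__main__":
    for name, (found, ok) in check_versions().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name:12s} expected {EXPECTED[name]:8s} found {found} [{status}]")
```

A `MISMATCH` line does not necessarily mean the code will fail; it only flags a difference from the versions listed here.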

Training

We trained our model on the MPII dataset using a single RTX 3060 GPU. Later, a more powerful setup (8 NVIDIA V100 GPUs) became available, which allowed us to train the model on the COCO dataset.

Use the following command to train the model:

bash ./tools/dist_train.sh <Config PATH> <NUM GPUs> --seed 0

Examples:

Training on the MPII dataset using a single GPU

bash ./tools/dist_train.sh ./configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/mpii/widehrnet_18_mpii_256x256.py 1 --seed 0

Training on the COCO dataset using multiple GPUs (8 in this example)

bash ./tools/dist_train.sh ./configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/widehrnet_18_se_coco_256x192.py 8 --seed 0
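
When launching several runs from a script, the command pattern above can be assembled programmatically. The helper below is a small sketch; `build_train_command` is a hypothetical name, not something provided by this repository, and it only builds the argument list without running anything.

```python
import shlex


def build_train_command(config, num_gpus, seed=0, script="./tools/dist_train.sh"):
    """Build the argument list for: bash <script> <config> <num_gpus> --seed <seed>."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return ["bash", script, config, str(num_gpus), "--seed", str(seed)]


mpii_config = (
    "./configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/mpii/"
    "widehrnet_18_mpii_256x256.py"
)
cmd = build_train_command(mpii_config, num_gpus=1)
# Print the command as a shell-safe string for inspection.
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root to start training.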

Testing

Use the following command to test the model:

bash ./tools/dist_test.sh <Config PATH> <Checkpoint PATH> <NUM GPUs>

Examples:

# Testing on the MPII dataset
bash ./tools/dist_test.sh ./configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/mpii/widehrnet_18_mpii_256x256.py  ../work_dirs/epoch_210.pth 1

#  Testing on the COCO dataset
bash ./tools/dist_test.sh ./configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/widehrnet_18_se_coco_256x192.py  ../work_dirs/epoch_210.pth 1
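
The examples above point at `epoch_210.pth`. If training was stopped early or ran for a different number of epochs, the most recent checkpoint has to be picked by hand. The sketch below selects it from a list of file names, assuming checkpoints follow the `epoch_<N>.pth` pattern used in the examples; `latest_checkpoint` is a hypothetical helper, not part of this repository.

```python
import re

# Matches names such as "epoch_210.pth" and captures the epoch number.
_EPOCH_RE = re.compile(r"^epoch_(\d+)\.pth$")


def latest_checkpoint(filenames):
    """Return the epoch_<N>.pth name with the largest N, or None if there is none."""
    best_name, best_epoch = None, -1
    for name in filenames:
        match = _EPOCH_RE.match(name)
        if match and int(match.group(1)) > best_epoch:
            best_name, best_epoch = name, int(match.group(1))
    return best_name


names = ["epoch_10.pth", "epoch_210.pth", "latest.pth", "epoch_9.pth"]
print(latest_checkpoint(names))
```

In practice the list would come from `os.listdir` on the work directory, and the returned name would be joined with that directory to form the `<Checkpoint PATH>` argument.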

Get the computational complexity

python3 ./tools/summary_network.py ./configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/mpii/widehrnet_18_mpii_256x256.py
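
To give a feel for where such numbers come from, the sketch below computes the parameter count and multiply-accumulate operations (MACs) of a single 2D convolution, and compares a standard convolution with a depthwise one. It is an illustration only: `conv2d_cost` is a hypothetical helper, the layer sizes are made up, and it is not how `summary_network.py` is implemented. Tools also differ on whether they report MACs or twice that number as FLOPs.

```python
def conv2d_cost(c_in, c_out, kernel, h_out, w_out, groups=1, bias=False):
    """Return (parameters, MACs) for one 2D convolution layer.

    kernel       : side length of the square kernel
    h_out, w_out : spatial size of the output feature map
    groups       : 1 for a standard convolution, c_in for a depthwise one
    """
    if c_in % groups or c_out % groups:
        raise ValueError("c_in and c_out must both be divisible by groups")
    weights = kernel * kernel * (c_in // groups) * c_out
    params = weights + (c_out if bias else 0)
    # Every output position applies each weight exactly once.
    macs = weights * h_out * w_out
    return params, macs


# Illustrative layer: 64 -> 64 channels, 3x3 kernel, 64x64 output map.
standard = conv2d_cost(64, 64, 3, 64, 64)
depthwise = conv2d_cost(64, 64, 3, 64, 64, groups=64)
print("standard :", standard)
print("depthwise:", depthwise)
```

The gap between the two results is why lightweight blocks can afford wider channels: a depthwise convolution costs a small fraction of a standard one at the same width.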

TODO List

Upload WideHRNet code
Add abstract and link of WideHRNet paper
Add results table (MPII and COCO datasets)
Upload checkpoints
More experiments with attention modules
Add environment setup
Add instructions on how to train and test the model
Add acknowledgement
Add citation

Acknowledgement

Thanks to:

Citation

If you use our code or models in your research, please cite with:

@article{samkari2024widehrnet,
  title={WideHRNet: an Efficient Model for Human Pose Estimation Using Wide Channels in Lightweight High-Resolution Network},
  author={Samkari, Esraa and Arif, Muhammad and Alghamdi, Manal and Al Ghamdi, Mohammed A},
  journal={IEEE Access},
  year={2024},
  publisher={IEEE}
}
