AI_power

This repo is no longer maintained.

Face, audio, and LLM-related APIs have been moved to dedicated repos:

face_power - Face detection, parsing, restoration, swapping, etc.

audio_power - ASR, TTS, voice conversion, audio translation

llm_power - LLM fine-tuning, RAG, LangChain, ChatGLM

Convenience API collection for computer vision, motion capture, segmentation, AIGC, and Stable Diffusion inference.

Pretrained models: BaiDuPan (password: ibgg)

Related Projects

Project	Description
cv2box	CV utility functions used across all projects
apstone	Base inference engine for all model wrappers

Module Overview

body_lib - Body Keypoint Detection

Model	Source	Function
YOLOX-tiny/s	MMDetection	Body bbox
HRNetV2-w32	ModelScope	Body keypoints
BlazePose	PINTO_model_zoo	Body keypoints
Lightweight-Pose	lightweight-human-pose-estimation	Body keypoints
MoveNet	TFHub	Body keypoints
KAPAO	kapao	Body keypoints

hand_lib - Hand Detection & Mesh

Model	Source	Function
YOLOX-tiny	MMDetection	21 hand keypoints
hand_detector.d2	hand_detector.d2	Hand bbox
MediaPipe Hands	MediaPipe	Hand landmarks
YOLOX*	MMDetection	Hand detection
Minimal-Hand	minimal-hand	Hand mesh
FrankMoCap	frankmocap	Hand pose regressor

mocap_lib - Motion Capture Pipeline

Module	Description
SPIN	Body shape regression
MMPose	Whole body keypoints (r50, hrnet_w48_384_dark, etc.)
MediaPipe Holistic	Whole body holistic
Calibration	Camera calibration
Smooth Filter	Temporal smoothing
Triangulate	Multi-view triangulation
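
As a rough illustration of what the temporal smoothing step does, the sketch below applies an exponential moving average to per-frame 2D keypoints. This is not the filter shipped in mocap_lib (the README does not say which filter it uses); the function name `ema_smooth` and the `alpha` parameter are hypothetical and chosen only for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def ema_smooth(frames: Sequence[Sequence[Point]], alpha: float = 0.5) -> List[List[Point]]:
    """Smooth keypoints over time with an exponential moving average.

    frames: one list of (x, y) keypoints per frame; every frame has the same length.
    alpha:  weight of the current frame, in (0, 1]; lower values smooth more.
    NOTE: hypothetical helper, not part of the mocap_lib API.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[List[Point]] = []
    for frame in frames:
        if not smoothed:
            # The first frame has no history, so it passes through unchanged.
            smoothed.append([(float(x), float(y)) for x, y in frame])
            continue
        prev = smoothed[-1]
        # Blend each keypoint with its smoothed position in the previous frame.
        smoothed.append([
            (alpha * x + (1.0 - alpha) * px, alpha * y + (1.0 - alpha) * py)
            for (x, y), (px, py) in zip(frame, prev)
        ])
    return smoothed
```

With `alpha=1.0` the output equals the input; smaller values reduce jitter at the cost of added lag.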

seg_lib - Segmentation & Matting

Module	Description
CarveKit	Cloth segmentation
CIHP-PGN	Human parsing
U2Net	Object segmentation / saliency
PPMattingV2	Portrait matting
Green Screen Matting	Chroma-key video matting (BackgroundMattingV2)
SegFormer B2	Cloth segmentation
MODNet	Portrait matting
RAFT	Optical flow

art_lib - AIGC & Video Effects

Model	Function
DCTNet	Style transfer
LaMa	Image inpainting
TPSMM	Talking head synthesis
SadTalker	Talking head synthesis
Wav2Lip	Lip sync
DINet	Talking head synthesis

sd_lib - Stable Diffusion Tools

Module	Description
ControlNet	ControlNet inference
IP-Adapter	IP-Adapter for image-prompted generation
Prompt2Prompt	Prompt-based image editing
CLIP Encoder	Text encoding
DDIM Inversion	Image inversion
Tagger	Image tagging (WD14)

ocr_lib - OCR

Model	Description
PaddleOCR	General-purpose OCR

data_lib - Dataset Tools

COCO format conversion, dataset preprocessing, visualization utilities.
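
To make "COCO format conversion" concrete, here is a minimal sketch that turns corner-style boxes into a COCO-style detection dictionary. The helper names `xyxy_to_coco_bbox` and `build_coco_dataset` are hypothetical and not taken from data_lib; only the output keys (`images`, `annotations`, `categories`, `bbox`, `area`, `iscrowd`) follow the public COCO annotation layout.

```python
from typing import Dict, List, Sequence


def xyxy_to_coco_bbox(box: Sequence[float]) -> List[float]:
    """Convert [x1, y1, x2, y2] corners to COCO's [x, y, width, height]."""
    x1, y1, x2, y2 = box
    return [x1, y1, x2 - x1, y2 - y1]


def build_coco_dataset(
    images: Sequence[Dict],
    boxes_per_image: Sequence[Sequence[Sequence[float]]],
    category_name: str = "person",
) -> Dict:
    """Assemble a single-category COCO-style dict (hypothetical helper).

    images:          dicts with "file_name", "width", "height".
    boxes_per_image: for each image, a list of [x1, y1, x2, y2] boxes.
    """
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": 1, "name": category_name}],
    }
    ann_id = 1
    for image_id, (info, boxes) in enumerate(zip(images, boxes_per_image), start=1):
        coco["images"].append({"id": image_id, **info})
        for box in boxes:
            bbox = xyxy_to_coco_bbox(box)
            coco["annotations"].append({
                "id": ann_id,
                "image_id": image_id,
                "category_id": 1,
                "bbox": bbox,
                "area": bbox[2] * bbox[3],  # box area; masks would use pixel area
                "iscrowd": 0,
            })
            ann_id += 1
    return coco
```

The resulting dictionary can be written out with `json.dump` to produce an annotation file.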

math_lib / utils - Utilities

Affine transforms, Gaussian filters, K-means, timing tools, path helpers.
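
For readers unfamiliar with the affine transforms mentioned above, the following sketch builds a 2x3 similarity matrix (rotation, uniform scale, translation) and applies it to points using only the standard library. The names `make_affine` and `apply_affine` are hypothetical and do not reflect the actual math_lib interface.

```python
import math
from typing import List, Sequence, Tuple


def make_affine(angle_deg: float, scale: float, tx: float, ty: float) -> List[List[float]]:
    """Return a 2x3 matrix that rotates about the origin, scales, then translates.

    NOTE: hypothetical helper for illustration only.
    """
    c = math.cos(math.radians(angle_deg)) * scale
    s = math.sin(math.radians(angle_deg)) * scale
    return [[c, -s, tx],
            [s, c, ty]]


def apply_affine(m: Sequence[Sequence[float]],
                 points: Sequence[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Apply a 2x3 affine matrix to a list of (x, y) points."""
    return [
        (m[0][0] * x + m[0][1] * y + m[0][2],
         m[1][0] * x + m[1][1] * y + m[1][2])
        for x, y in points
    ]
```

A matrix in this 2x3 layout is also the shape that image-warping routines such as OpenCV's `cv2.warpAffine` expect.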
