ykk648/AI_power

AI_power

This repo is no longer maintained.

Face-, audio-, and LLM-related APIs have been moved to dedicated repos:

  • face_power - Face detection, parsing, restore, swap, etc.
  • audio_power - ASR, TTS, voice conversion, audio translation
  • llm_power - LLM fine-tuning, RAG, LangChain, ChatGLM

Convenience API collection for computer vision, motion capture, segmentation, AIGC, and Stable Diffusion inference.

Pretrained models: BaiDuPan (password: ibgg)

Related Projects

| Project | Description |
| --- | --- |
| cv2box | CV utility functions used across all projects |
| apstone | Base inference engine for all model wrappers |

Module Overview

body_lib - Body Keypoint Detection

| Model | Source | Function |
| --- | --- | --- |
| YOLOX-tiny/s | MMDetection | Body bbox |
| HRNetV2-w32 | ModelScope | Body keypoints |
| BlazePose | PINTO_model_zoo | Body keypoints |
| Lightweight-Pose | lightweight-human-pose-estimation | Body keypoints |
| MoveNet | TFHub | Body keypoints |
| KAPAO | kapao | Body keypoints |

hand_lib - Hand Detection & Mesh

| Model | Source | Function |
| --- | --- | --- |
| YOLOX-tiny | MMDetection | Hand 21 keypoints |
| hand_detector.d2 | hand_detector.d2 | Hand bbox |
| MediaPipe Hands | MediaPipe | Hand landmarks |
| YOLOX* | MMDetection | Hand detection |
| Minimal-Hand | minimal-hand | Hand mesh |
| FrankMoCap | frankmocap | Hand pose regressor |

mocap_lib - Motion Capture Pipeline

| Module | Description |
| --- | --- |
| SPIN | Body shape regression |
| MMPose | Whole body keypoints (r50, hrnet_w48_384_dark, etc.) |
| MediaPipe Holistic | Whole body holistic |
| Calibration | Camera calibration |
| Smooth Filter | Temporal smoothing |
| Triangulate | Multi-view triangulation |
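
The Smooth Filter entry refers to temporal smoothing of per-frame keypoints. As a rough illustration only, and not the repo's actual filter, the sketch below applies an exponential moving average to a keypoint sequence. The function name `ema_smooth` and the `alpha` parameter are hypothetical and chosen for this example.

```python
def ema_smooth(frames, alpha=0.5):
    """Exponentially smooth a sequence of keypoint frames.

    frames: list of frames, each a list of (x, y) tuples with the
            same keypoint order in every frame.
    alpha:  weight of the newest frame, 0 < alpha <= 1.
            (Hypothetical parameter; lower values smooth more.)
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")

    smoothed = []
    prev = None
    for frame in frames:
        if prev is None:
            # The first frame has no history, so it passes through unchanged.
            current = [(float(x), float(y)) for x, y in frame]
        else:
            # Blend each keypoint with its previous smoothed position.
            current = [
                (alpha * x + (1 - alpha) * px, alpha * y + (1 - alpha) * py)
                for (x, y), (px, py) in zip(frame, prev)
            ]
        smoothed.append(current)
        prev = current
    return smoothed
```

A lower `alpha` suppresses more jitter at the cost of added lag, which is the usual trade-off for this kind of filter.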

seg_lib - Segmentation & Matting

| Module | Description |
| --- | --- |
| CarveKit | Cloth segmentation |
| CIHP-PGN | Human parsing |
| U2Net | Object segmentation / saliency |
| PPMattingV2 | Portrait matting |
| Green Screen Matting | Chroma-key video matting (BackgroundMattingV2) |
| SegFormer B2 | Cloth segmentation |
| MODNet | Portrait matting |
| RAFT | Optical flow |

art_lib - AIGC & Video Effects

| Model | Function |
| --- | --- |
| DCTNet | Style transfer |
| LaMa | Image inpainting |
| TPSMM | Talking head synthesis |
| SadTalker | Talking head synthesis |
| Wav2Lip | Lip sync |
| DINet | Talking head synthesis |

sd_lib - Stable Diffusion Tools

| Module | Description |
| --- | --- |
| ControlNet | ControlNet inference |
| IP-Adapter | IP-Adapter for image-prompted generation |
| Prompt2Prompt | Prompt-based image editing |
| CLIP Encoder | Text encoding |
| DDIM Inversion | Image inversion |
| Tagger | Image tagging/WD14 |

ocr_lib - OCR

| Model | Description |
| --- | --- |
| PaddleOCR | General-purpose OCR |

data_lib - Dataset Tools

COCO format conversion, dataset preprocessing, visualization utilities.
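
For context on the format conversion mentioned above: COCO annotations store boxes as `[x, y, width, height]`, while many detectors work with corner coordinates. The sketch below shows that conversion in plain Python. The helper names `coco_to_xyxy` and `xyxy_to_coco` are hypothetical and are not taken from data_lib.

```python
def coco_to_xyxy(bbox):
    """Convert a COCO box [x, y, width, height] to [x1, y1, x2, y2]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


def xyxy_to_coco(box):
    """Convert a corner box [x1, y1, x2, y2] to COCO [x, y, width, height]."""
    x1, y1, x2, y2 = box
    return [x1, y1, x2 - x1, y2 - y1]
```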

math_lib / utils - Utilities

Affine transforms, Gaussian filters, K-means, timing tools, path helpers.
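
To make the affine-transform entry concrete, here is a minimal sketch that builds a 2x3 rotation-and-scale matrix about a center point and applies it to 2D points. It uses only the standard library; `rotation_about` and `apply_affine` are hypothetical names and do not reflect the math_lib API.

```python
import math


def rotation_about(center, degrees, scale=1.0):
    """Build a 2x3 affine matrix that rotates and scales about `center`.

    The rotation is counterclockwise in a y-up coordinate system; in
    image coordinates (y pointing down) it appears clockwise.
    """
    cx, cy = center
    theta = math.radians(degrees)
    a = scale * math.cos(theta)
    b = scale * math.sin(theta)
    # Rotate/scale about the origin, then translate so `center` stays fixed.
    return [
        [a, -b, cx - a * cx + b * cy],
        [b, a, cy - b * cx - a * cy],
    ]


def apply_affine(matrix, points):
    """Apply a 2x3 affine matrix to a list of (x, y) points."""
    (a, b, tx), (c, d, ty) = matrix
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in points]
```

The same 2x3 layout is what image-warping routines typically expect, so a matrix built this way can be checked on a few points before it is used on a full image.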

About

AI toolbox and pretrained models.
