  • Shanghai

Chaobai-Jiang/README.md

Chaobai | AI Infra & ML System Engineer

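Inference acceleration engineer at ByteDance | Focused on inference and deployment of large-scale visual generation models | Hands-on serving engineering for 10,000-card clusters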


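Core Focus Areas

Productionizing visual generation inference

  • 7×24 deployment of heterogeneous inference clusters (GPU/NPU/MLU) at the scale of 100,000 accelerator cards
  • Disaggregated deployment, cross-model role reuse, tidal scheduling, QoS traffic control

Inference performance optimization

  • Linear-layer quantization, SageAttention, hybrid Ulysses + Ring sequence parallelism
  • Communication-computation fusion, TeaCache, step distillation for diffusion models, and other optimizations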

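Representative Results

Production engineering

Project | Result
Seed visual generation series, ToB productionization | 10,000-card heterogeneous cluster; disaggregated deployment + tidal scheduling; 80%+ average utilization around the clock (7×24)
Inference optimization of the open-source releases of Wan, FLUX, and similar models | Quantization + cross-node parallelism with a high replica count + communication overlap; performance raised to 400%+ of the baseline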

Papers

  • ICCV 2025: "Robustifying Zero-Shot Vision Language Models by Subspaces Alignment"
  • EuroSys 2026 (Spring): "Handling Network Faults in Distributed AI Training: Failover is Now an Option"

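Tech Stack

Inference frameworks: PyTorch | vLLM | Megatron | in-house inference framework

High-performance systems: lock-free C++ programming | shared-memory IPC | microsecond-level performance optimization | high-precision timing with RDTSC

Engineering: 10,000-card cluster scheduling | heterogeneous hardware adaptation | 7×24 high-availability deployment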


Contact

Popular repositories

  1. vllm (Public)

    Forked from vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python

  2. BEVFormer (Public)

    Forked from fundamentalvision/BEVFormer

    [ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.

    Python

  3. Step-Video-T2V (Public)

    Forked from stepfun-ai/Step-Video-T2V

    Python

  4. xDiT (Public)

    Forked from xdit-project/xDiT

    xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

    Python

  5. Wan2.1 (Public)

    Forked from Wan-Video/Wan2.1

    Wan: Open and Advanced Large-Scale Video Generative Models

    Python

  6. Claude-yaml (Public)

    Shell