TransDex: Pre-training Visuo-Tactile Policy with Point Cloud Reconstruction for Dexterous Manipulation of Transparent Objects
🌐Project Page | 📄Arxiv | 🎬Video
Fengguan Li, Yifan Ma, Chen Qian, Wentao Rao, Weiwei Shang
University of Science and Technology of China
We propose TransDex, a 3D visuo-tactile fusion motor policy based on point cloud reconstruction pre-training.
✅ This project is recommended to run in the following environment:
- Linux: Ubuntu 20.04.
- CUDA version: 12.9.1.
- Python version: 3.9.
- PyTorch version: The official build version adapted to CUDA 12.9.
⚡ Important: Please ensure CUDA 12.9.1 and the corresponding PyTorch version are already installed. This guide does not cover CUDA/PyTorch installation.
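Before installing the remaining dependencies, it can help to confirm that the interpreter and the PyTorch build match the versions listed above. The following is a minimal sketch, not part of the repository: `check_environment` is a hypothetical helper name, and the expected values are simply the ones stated in this README.

```python
import sys


def check_environment(expected_python=(3, 9), expected_cuda="12.9"):
    """Return a list of warnings; an empty list means every check passed."""
    warnings = []

    # Compare only major.minor, e.g. (3, 9).
    if sys.version_info[:2] != expected_python:
        found = "%d.%d" % sys.version_info[:2]
        wanted = "%d.%d" % expected_python
        warnings.append("Python %s found, %s recommended" % (found, wanted))

    try:
        import torch
    except ImportError:
        warnings.append("PyTorch is not installed")
        return warnings

    # torch.version.cuda is None for CPU-only builds, else a string like "12.9".
    built_for = torch.version.cuda
    if built_for is None:
        warnings.append("PyTorch build has no CUDA support")
    elif not built_for.startswith(expected_cuda):
        warnings.append(
            "PyTorch built for CUDA %s, %s recommended" % (built_for, expected_cuda)
        )

    if not torch.cuda.is_available():
        warnings.append("No CUDA device is visible to PyTorch")

    return warnings


if __name__ == "__main__":
    for message in check_environment():
        print("WARNING:", message)
```

The script only prints warnings and never aborts, since a slightly different setup may still work.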
git clone https://github.com/LFGfg/TransDex.git
cd TransDex
conda create -n VTFusion python=3.9 -y && conda activate VTFusion
sudo apt install libgl1 -y && sudo apt-get install g++ -y
conda install pinocchio xorg-libx11 -c conda-forge -y
pip install -r requirements.txt
Adjust the path according to your project layout:
cd ~/policy_ws/src/VTFusion/src/extensions/pointops
python setup.py install
First, enter the pre-training directory:
cd ~/policy_ws/src/VTFusion/src/PretrainPoint
The dataset processing code and the model for the pre-training stage can be found in ./models/Dataset_process_nor.py and ./models/PretrainPoint.py. The pre-training data used in this project are generated in the PyBullet simulator, and the dexterous hand used is described in this paper. An example dataset will be released on Google Drive later.
😸 We strongly suggest that you generate point cloud datasets for your own dexterous hand system and process them into the format expected by the data processing code.
👉🏻 Before training, place the hand-object dataset in ~/policy_ws/src/VTFusion/hand_object_data/hand_object_dataset/, or change dataset.data_dir in ./cfgs/pretrain_hand_object.yaml to the storage location of your own dataset.
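A quick pre-flight check can catch a misplaced dataset before a training run starts. This is a minimal sketch under stated assumptions: `resolve_data_dir` is a hypothetical helper that is not part of the repository, the default path is the one given above, and the sketch does not read the YAML config, so pass your own location explicitly if you changed dataset.data_dir.

```python
from pathlib import Path

# Default location taken from this README; adjust to your project layout.
DEFAULT_DATA_DIR = Path("~/policy_ws/src/VTFusion/hand_object_data/hand_object_dataset")


def resolve_data_dir(data_dir=None):
    """Return the dataset directory as a Path, or raise if it is unusable."""
    path = Path(data_dir) if data_dir else DEFAULT_DATA_DIR
    path = path.expanduser()  # "~" is not expanded automatically by Path

    if not path.is_dir():
        raise FileNotFoundError("dataset directory not found: %s" % path)
    if not any(path.iterdir()):
        raise ValueError("dataset directory is empty: %s" % path)
    return path
```

Calling `resolve_data_dir()` with no argument checks the default location, while `resolve_data_dir("/data/my_hand_dataset")` checks a custom one.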
CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/pretrain_hand_object.yaml
The trained weight files can be found in ./experiments/pretrain_hand_object/.
To perform a simple evaluation across the entire dataset, you can run:
export PYTHONPATH="$HOME/policy_ws/src/VTFusion/src:$PYTHONPATH"
python ./models/Evaluation.py --mask_ratio 0.70 --ckpt_path ./experiments/pretrain_hand_object/ckpt-last.pth --data_dir ~/policy_ws/src/VTFusion/hand_object_data/hand_object_dataset
The real robotic system consists of a 16-DOF dexterous hand and a 7-DOF humanoid arm. The robot used in this project can also be found in this paper. The dexterous hand is equipped with Paxini array tactile sensors. Additionally, the system requires two Intel RealSense D435i depth cameras, positioned at the wrists of the robotic arms and around the workbench, respectively.
Enter the source directory:
cd ~/policy_ws/src/VTFusion/src/
The dataset processing code and the policy model can be found in VTFusion_dataset.py and FusionNetwork.py. You can collect a manipulation dataset with your own robotic system.
👉🏻 Before training the policy, please ensure:
- The pre-trained encoder weight file is copied to ./pretrain_pointencoder/ckpt.pth.
- The manipulation dataset is placed under ../data_record/task_name/, and task_name is set accordingly in the config file ./config/config.yaml.
- The URDF file of the robot is placed in ~/policy_ws/src/URDF/.
- Parameters such as pos_mins/maxs, rpy_mins/maxs, and joint_mins/maxs in the config file are adjusted to your robotic system and task. Relevant instructions are provided as comments in the sample config file config/config.yaml.
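The README does not state how the pos_mins/maxs, rpy_mins/maxs, and joint_mins/maxs bounds are consumed. A common use of such per-dimension limits in learned policies is min-max normalization to [-1, 1], and the sketch below illustrates that assumption only. The function names are hypothetical and the actual handling in VTFusion_dataset.py may differ.

```python
def normalize(values, mins, maxs):
    """Map each value from [min, max] to [-1, 1] (assumed convention)."""
    if not (len(values) == len(mins) == len(maxs)):
        raise ValueError("values, mins and maxs must have the same length")
    out = []
    for v, lo, hi in zip(values, mins, maxs):
        if hi <= lo:
            raise ValueError("each max must be greater than its min")
        out.append(2.0 * (v - lo) / (hi - lo) - 1.0)
    return out


def denormalize(values, mins, maxs):
    """Inverse of normalize: map [-1, 1] back to [min, max]."""
    return [(v + 1.0) / 2.0 * (hi - lo) + lo
            for v, lo, hi in zip(values, mins, maxs)]


# Illustrative bounds only; real limits depend on your robot and task.
pos_mins = [0.0, 0.0, 0.0]
pos_maxs = [1.0, 1.0, 1.0]
print(normalize([0.0, 0.5, 1.0], pos_mins, pos_maxs))
```

Under this convention, bounds that are too tight push recorded values outside [-1, 1], while bounds that are too loose compress the data into a small part of the range. Both are reasons to set the limits from the workspace of your own system.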
Use the following script for training:
CUDA_VISIBLE_DEVICES=0,1,2,3 python ./training.py --config ./config/config.yaml
The trained weight files can be found in ./ckpts/.
⚡ Note: Certain code files, such as ./pin_forward.py and ./VTFusion_dataset.py, contain functions written specifically for the robotic system used in this project. To use these modules with a different system, you need to adapt those functions.
This project uses ROS and TwinCAT communication for low-level motor control. You can evaluate the policy on your own robotic system with the corresponding trained networks. Before deployment, check the following:
- Ensure all hardware devices are supplied with stable power and properly connected.
- Dual RealSense cameras require hand-eye calibration and time synchronization.
- Point cloud fusion necessitates coordinate transformation and ICP registration.
- Calibrate robotic arm/dexterous hand joint zero positions and set limits.
- Visualize the point cloud to ensure that the 3D position of tactile points is calculated correctly.
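The coordinate transformation step in the checklist above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: `transform_points` and `fuse_clouds` are hypothetical names, each camera's extrinsics are assumed to be a 4x4 homogeneous matrix (from hand-eye calibration) that maps camera-frame points into a shared base frame, and ICP refinement is deliberately left out.

```python
def transform_points(points, T):
    """Apply a 4x4 homogeneous transform T (nested lists) to (x, y, z) points."""
    out = []
    for x, y, z in points:
        out.append(tuple(
            T[r][0] * x + T[r][1] * y + T[r][2] * z + T[r][3]
            for r in range(3)
        ))
    return out


def fuse_clouds(clouds_with_extrinsics):
    """Bring every cloud into the shared base frame and concatenate them."""
    fused = []
    for points, T in clouds_with_extrinsics:
        fused.extend(transform_points(points, T))
    return fused


# Example: one camera already in the base frame, one rotated 90 degrees about z
# and shifted 1 m along x (values are illustrative only).
IDENTITY = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
ROT_Z_SHIFT_X = [[0, -1, 0, 1], [1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

fused = fuse_clouds([
    ([(0.0, 0.0, 0.5)], IDENTITY),
    ([(1.0, 0.0, 0.5)], ROT_Z_SHIFT_X),
])
print(fused)
```

The same transform applies to tactile points once forward kinematics has produced them in a link frame, so plotting the fused cloud together with the tactile points is a direct way to carry out the last check in the list. Residual misalignment between the two cameras after calibration is what ICP registration would then reduce.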
😄 If you find our work useful, please consider citing:
@article{li2026transdex,
title={TransDex: Pre-training Visuo-Tactile Policy with Point Cloud Reconstruction for Dexterous Manipulation of Transparent Objects},
author={Li, Fengguan and Ma, Yifan and Qian, Chen and Rao, Wentao and Shang, Weiwei},
journal={arXiv preprint arXiv:2603.13869},
year={2026}
}
❓ If you have any questions, please contact lfguan@mail.ustc.edu.cn.
