LTX-2 Motion Transfer

Transfer motion from a reference video onto a still image, producing a new short video where the image moves like the reference. Built on top of Lightricks LTX-2.3 using the ICLoraPipeline with IC-LoRA conditioning.

   subject image                  reference video                  output video
   ┌──────────┐                   ┌──────────────┐               ┌─────────────┐
   │   🧑      │   +               │  ↻ motion ↺  │   ───────▶    │  🧑 + motion │
   └──────────┘                   └──────────────┘               └─────────────┘
                                                                   (silent .mp4)

⚖️ License & restrictions — read before using

This project is a derivative of Lightricks LTX-2 and is governed by the LTX-2 Community License Agreement. By cloning or using this repo you agree to those terms.

Free for personal and small-business use. Entities with annual revenue ≥ USD 10,000,000 must obtain a paid commercial license from Lightricks at https://ltx.io/model/licensing before any use. Unauthorized commercial use triggers liquidated damages equal to 2× the commercial fee (LICENSE §2).

Use restrictions (LICENSE Attachment A) — prohibited: deepfakes / impersonation without consent, content harming minors, disinformation, automated decision-making affecting legal rights, discrimination, harassment, and the full list in LICENSE.

Derivative obligations (LICENSE §3) — anyone who forks or modifies this repo must redistribute under the same LTX-2 Community License, ship the full LICENSE + NOTICE, mark modified files prominently, and retain attribution.

Outputs — you own the generated videos but must still comply with Attachment A.

The Gemma 3 text encoder is © Google, used under the Gemma Terms of Use.

See NOTICE for the full third-party attribution and modified-file list.

How it works (layman's version)

Subject image — locks in who/what appears in the output (the person, clothes, background style).
Reference video — supplies how things move (motion, pose, camera path).
Text prompt — guides the overall scene description.
The model — combines all three into a new video, in two stages: a fast low-res pass to lock in motion + composition, then a refinement pass that upsamples to the target resolution.

The output is a silent MP4 (audio is stripped automatically).

Quick start

# 1. Clone
git clone git@github.com:PrachiUkey/motion_transfer.git
cd motion_transfer

# 2. Install Python deps (pick ONE of the two options below — see "Install" section)
uv sync --frozen && source .venv/bin/activate    # option A — uv (recommended)
# -- or --
pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu129 \
  && pip install -e packages/ltx-core -e packages/ltx-pipelines   # option B — pip

# 3. Download model weights (~67 GB). Needs a HuggingFace token; see "Models" section.
python download_models.py

# 4. Generate
python main.py path/to/your_image.png
# → outputs/motion_transfer_<image_stem>.mp4

The default reference video is assets/idle_avatar_15_reverse.mp4 (a 5-second idle-avatar clip). Override it with --video your_motion.mp4.

Requirements


Component	Requirement
Python	≥ 3.10
GPU	NVIDIA with ≥ 24 GB VRAM (e.g. RTX 4090, A100, H100). FP8 quantization is used to fit the 22B model in 24 GB.
CUDA	12.9 (PyTorch wheels are pinned to this)
RAM	≥ 32 GB (Gemma 3 text encoder runs on CPU to keep VRAM free)
Disk	~70 GB for model weights + ~1 MB per output video
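
A quick preflight check can catch two of these requirements before the large download starts. The sketch below is not part of the repo; the function names and the 70 GB threshold constant are illustrative, taken from the table above. It uses only the standard library and does not check the GPU, CUDA, or RAM rows.

```python
import shutil
import sys

# Thresholds copied from the requirements table; names are illustrative.
MIN_PYTHON = (3, 10)
MIN_FREE_DISK_GB = 70


def python_ok(version=None):
    """Return True if the (major, minor) version meets the minimum."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


def free_disk_gb(path="."):
    """Free space on the filesystem holding `path`, in decimal GB."""
    return shutil.disk_usage(path).free / 1e9


def preflight(path="."):
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    if not python_ok():
        problems.append("Python 3.10 or newer is required")
    free = free_disk_gb(path)
    if free < MIN_FREE_DISK_GB:
        problems.append(f"only {free:.0f} GB free, about {MIN_FREE_DISK_GB} GB needed")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("WARNING:", problem)
```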

Install

You can use either uv (faster, single command, recommended) or plain pip.

Option A — uv

# Install uv if you don't have it
curl -LsSf https://astral.sh/uv/install.sh | sh

uv sync --frozen
source .venv/bin/activate

uv sync --frozen reads uv.lock and installs the exact resolved versions (including the right PyTorch CUDA 12.9 wheel and the local ltx-core + ltx-pipelines workspace packages).

Option B — pip

# Create and activate a venv (recommended)
python3 -m venv .venv && source .venv/bin/activate

# Install all third-party deps, using PyTorch's CUDA 12.9 index for torch wheels
pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu129

# Install the two local packages in editable mode
pip install -e packages/ltx-core -e packages/ltx-pipelines

requirements.txt is auto-generated from uv.lock (uv export --no-dev --no-emit-workspace) so the pinned versions stay in sync with the uv flow. The --extra-index-url is required so pip picks the CUDA 12.9 PyTorch wheel rather than the CPU-only default.

Model weights

download_models.py fetches four sets of weights into ./models/:

Subdir	File	Source	Size
`distilled/`	`ltx-2.3-22b-distilled.safetensors`	Lightricks/LTX-2.3	~43 GB
`upscaler/`	`ltx-2.3-spatial-upscaler-x2-1.0.safetensors`	Lightricks/LTX-2.3	~1 GB
`ic-lora/`	`ltx-2.3-22b-ic-lora-motion-track-control-ref0.5.safetensors`	Lightricks/LTX-2.3-22b-IC-LoRA-Motion-Track-Control	~0.3 GB
`gemma/`	Gemma 3 12B (5 shards + tokenizer)	google/gemma-3-12b-it-qat-q4_0-unquantized — gated	~23 GB
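
To confirm that a download finished, the table above can be turned into a small manifest check. This is a sketch under assumptions, not what download_models.py does: the helper names are hypothetical, and because the Gemma shard filenames are not listed here, the check only verifies that gemma/ exists and is non-empty.

```python
from pathlib import Path

# (subdir, filename or None, approximate size in GB), copied from the table.
EXPECTED_WEIGHTS = [
    ("distilled", "ltx-2.3-22b-distilled.safetensors", 43.0),
    ("upscaler", "ltx-2.3-spatial-upscaler-x2-1.0.safetensors", 1.0),
    ("ic-lora", "ltx-2.3-22b-ic-lora-motion-track-control-ref0.5.safetensors", 0.3),
    ("gemma", None, 23.0),  # sharded; shard names are not listed in this README
]


def total_size_gb():
    """Sum of the approximate sizes in the table."""
    return sum(size for _, _, size in EXPECTED_WEIGHTS)


def missing_weights(models_dir="models"):
    """Return the expected entries that are absent under `models_dir`."""
    root = Path(models_dir)
    missing = []
    for subdir, filename, _size in EXPECTED_WEIGHTS:
        target = root / subdir
        if filename is not None:
            if not (target / filename).is_file():
                missing.append(f"{subdir}/{filename}")
        elif not (target.is_dir() and any(target.iterdir())):
            missing.append(f"{subdir}/")
    return missing


if __name__ == "__main__":
    print(f"expected total: about {total_size_gb():.1f} GB")
    for entry in missing_weights():
        print("missing:", entry)
```

The sizes sum to roughly 67 GB, which matches the figure quoted in the quick start.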

HuggingFace token (one-time setup)

Gemma 3 is a gated Google model. Before running download_models.py:

Sign in at https://huggingface.co.
Visit https://huggingface.co/google/gemma-3-12b-it-qat-q4_0-unquantized, click "Acknowledge license", and submit the form.
Create a Read token at https://huggingface.co/settings/tokens.

Provide the token in any of these ways (the script checks them in order):

export HF_TOKEN=hf_...
export HUGGINGFACE_HUB_TOKEN=hf_...
huggingface-cli login (caches token in ~/.cache/huggingface/token)
Interactive prompt (the script asks if none of the above are set)

The token stays on your machine. It is not committed to the repo.
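
The lookup order above can be sketched as follows. This is an illustration of the documented order, not the code in download_models.py; the function name and its parameters are hypothetical, and the interactive prompt is left to the caller (for example via getpass.getpass) when the function returns None.

```python
import os
from pathlib import Path


def resolve_hf_token(env=None, token_file=None):
    """Return the first token found, following the documented order, or None."""
    env = os.environ if env is None else env

    # 1 and 2: environment variables, in the order listed above.
    for name in ("HF_TOKEN", "HUGGINGFACE_HUB_TOKEN"):
        value = env.get(name, "").strip()
        if value:
            return value

    # 3: token cached by `huggingface-cli login`.
    path = Path(token_file) if token_file else Path.home() / ".cache" / "huggingface" / "token"
    if path.is_file():
        cached = path.read_text().strip()
        if cached:
            return cached

    # 4: nothing found; the caller would fall back to an interactive prompt.
    return None
```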

Using `main.py`

python main.py IMAGE [--video VIDEO] [--prompt PROMPT] [--output PATH] [options]

Flag	Default	Purpose
`IMAGE`	(required)	Subject image (PNG / JPG). Convert paletted/grayscale PNGs to RGB first.
`--video`	`assets/idle_avatar_15_reverse.mp4`	Reference motion video.
`--prompt`	"A person facing the camera with subtle head and shoulder movement…"	Text prompt for the output.
`--output`	`outputs/motion_transfer_<image_stem>.mp4`	Where to write the silent MP4.
`--height` / `--width`	`768` / `768`	Output resolution (must be multiples of 64).
`--num-frames`	`121`	Frame count (≈ 5 s at 25 fps).
`--frame-rate`	`25.0`	Output fps.
`--seed`	`42`	Reproducibility.
`--lora-strength`	`0.8`	How strongly the IC-LoRA influences output. Try 0.6–1.0.
`--video-strength`	`1.0`	How strongly the reference motion drives output.
`--image-strength`	`1.0`	How strongly the subject image anchors appearance.
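
The resolution constraint and the frame-count arithmetic in the table can be checked before launching a run. The helper below is a hypothetical sketch, not part of main.py; its defaults mirror the table.

```python
def check_generation_args(height=768, width=768, num_frames=121, frame_rate=25.0):
    """Validate the resolution and return the clip duration in seconds."""
    for name, value in (("height", height), ("width", width)):
        if value <= 0 or value % 64 != 0:
            raise ValueError(f"--{name} must be a positive multiple of 64, got {value}")
    if num_frames <= 0 or frame_rate <= 0:
        raise ValueError("--num-frames and --frame-rate must be positive")
    # Duration is simply frames divided by frames per second.
    return num_frames / frame_rate


if __name__ == "__main__":
    print(f"default clip: {check_generation_args():.2f} s")
    print(f"draft clip:   {check_generation_args(512, 512, 49):.2f} s")
```

With the defaults, 121 frames at 25 fps is 4.84 s, which the table rounds to 5 s.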

Internally, main.py:

Sets LTX_TEXT_ENCODER_CPU=1 so Gemma runs on CPU (needed on 24 GB GPUs).
Invokes python -m ltx_pipelines.ic_lora with FP8 quantization.
Strips the audio stream losslessly via PyAV (no ffmpeg binary required).
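
The first two steps amount to launching the pipeline module in a child process with one extra environment variable. A minimal sketch, assuming nothing beyond the variable and module names given above: the helper name is hypothetical, and the pipeline's own command-line flags (FP8, model paths, and so on) are deliberately omitted because they are not documented in this README.

```python
import os
import sys


def build_pipeline_invocation(base_env=None):
    """Return (command, environment) for running the IC-LoRA pipeline module."""
    # Copy the environment so the parent process is not modified.
    env = dict(os.environ if base_env is None else base_env)
    env["LTX_TEXT_ENCODER_CPU"] = "1"  # keep Gemma on the CPU
    # Pipeline flags would be appended to this list by the caller.
    cmd = [sys.executable, "-m", "ltx_pipelines.ic_lora"]
    return cmd, env
```

A caller would pass the result to subprocess.run(cmd + pipeline_args, env=env, check=True), where pipeline_args stands for whatever arguments main.py forwards.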

Tuning tips

Motion too weak → raise --video-strength above its 1.0 default (try up to 1.5), or raise --lora-strength.
Motion too rigid / face distorts → lower --lora-strength to 0.5–0.7.
Subject drifts from the image → raise --image-strength and keep the image anchored at frame 0.
Out of VRAM → lower --height / --width to 512, or --num-frames to 81. FP8 quantization is already on.
Faster preview → temporarily set --num-frames 49 --height 512 --width 512 for a 2-second 512² draft.
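
When tuning, it helps to render the same image at several --lora-strength values with a fixed seed and compare the drafts side by side. The sketch below only builds the command lines, using flags from the table in "Using main.py"; the function name and the output filename pattern are hypothetical.

```python
import sys


def sweep_commands(image, strengths=(0.6, 0.8, 1.0), seed=42, draft=True):
    """Build one main.py command per LoRA strength, sharing a seed."""
    commands = []
    for strength in strengths:
        cmd = [
            sys.executable, "main.py", image,
            "--lora-strength", str(strength),
            "--seed", str(seed),
            "--output", f"outputs/sweep_lora_{strength:.1f}.mp4",
        ]
        if draft:
            # Draft settings from the "Faster preview" tip.
            cmd += ["--num-frames", "49", "--height", "512", "--width", "512"]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    for cmd in sweep_commands("path/to/your_image.png"):
        print(" ".join(cmd))
```

Each list can be handed to subprocess.run(cmd, check=True) to execute the runs one after another.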

Repo layout

.
├── main.py                            # generate a single silent video
├── download_models.py                 # fetch all weights into models/
├── assets/
│   └── idle_avatar_15_reverse.mp4     # default reference motion
├── models/                            # downloaded weights (gitignored)
├── outputs/                           # generated MP4s (gitignored)
└── packages/
    ├── ltx-core/                      # LTX-2 model implementation
    └── ltx-pipelines/                 # ICLoraPipeline + shared utils

Troubleshooting

GatedRepoError: 403 during Gemma download → your account hasn't accepted the Gemma 3 license yet. See the "HuggingFace token (one-time setup)" section.
ValueError: Expected numpy array with ndim 3 during run → your input image is grayscale or palette-indexed. Convert to RGB first:
```
from PIL import Image; Image.open("img.png").convert("RGB").save("img.png")
```
CUBLAS_STATUS_ALLOC_FAILED → out of GPU memory. Run nvidia-smi to confirm no other process is holding the card, then lower --height / --width or --num-frames (see "Tuning tips").

License & attribution

See the License & restrictions callout at the top of this README, the full LICENSE, and NOTICE for the complete attribution and modified-file list.

Pipeline source: packages/ltx-pipelines/src/ltx_pipelines/ic_lora.py.
