Bug Reports & Feature Requests (v1.0.5) -- Image upload, model download, retake, multi-reference #118

# Bug Reports & Feature Requests

## Environment
- **App Version:** v1.0.5
- **OS:** Windows 11
- **GPU:** NVIDIA RTX (CUDA available)

---

## Part 1: Bug Reports

### Bug 1: Image Upload Fails Silently in Video Editor Assets Panel

#### Description
In **Video Editor** mode, clicking the upload button in the **Assets** panel opens the file picker dialog, but after selecting an image (JPG/PNG) and confirming, nothing happens. The Assets panel shows no new files, and no error popup appears. Drag-and-drop also does not work.

Uploading **video files** works correctly; only images are affected.

#### Steps to Reproduce
1. Open LTX Desktop → switch to **Video Editor** mode
2. In the left **Assets** panel, click the upload button
3. Select a JPG or PNG image (e.g., 800×500 pixels)
4. Click "Open"; the dialog closes but the Assets panel remains empty

#### Expected Behavior
The selected image should appear in the Assets panel with a thumbnail.

#### Actual Behavior
Nothing happens. No error message, no file added.

#### Root Cause Analysis (from local debugging)
In `electron/ipc/file-handlers.ts`, the `addVisualAssetToProject` handler calls `createVisualThumbnails()` (which uses Python Pillow). When this call throws an exception (e.g., due to image format edge cases or Pillow issues), the exception propagates and causes the entire function to return `{ success: false }`.

On the frontend (`useEditorMediaImport.ts`), this result is handled with:
```typescript
if (!copied) continue;
```
This silently skips the failed file without any user-facing error.

#### Suggested Fix
Wrap `createVisualThumbnails()` and `getVisualAssetDimensions()` in try-catch blocks, log warnings, and fall back to using the original image path and default dimensions instead of failing the entire import:

```typescript
let bigThumbnailPath: string = destPath;
let smallThumbnailPath: string | undefined;
let width = 1920;
let height = 1080;

try {
  const thumbs = createVisualThumbnails(destPath, type);
  bigThumbnailPath = thumbs.bigThumbnailPath;
  smallThumbnailPath = thumbs.smallThumbnailPath;
} catch (thumbError) {
  logger.warn(`Thumbnail creation failed for ${destPath}, using original: ${thumbError}`);
}

try {
  const dims = getVisualAssetDimensions(destPath, type);
  width = dims.width;
  height = dims.height;
} catch (dimError) {
  logger.warn(`Dimension probe failed for ${destPath}, using defaults: ${dimError}`);
}
```

---

### Bug 2: "Generation Failed" in Gen Space → Generate Image

#### Description
In **Gen Space** mode, clicking **Generate** after entering a prompt always shows "Generation Failed" after a brief loading period.

#### Error Log
```
FileNotFoundError: D:\LTX Desktop\models\Z-Image-Turbo\transformer\
does not appear to have a file named
diffusion_pytorch_model-00001-of-00003.safetensors
```

#### Root Cause Analysis
The Z-Image-Turbo model download was interrupted (e.g., by app crash, network issue, or manual shutdown). The `.downloading/Z-Image-Turbo/` directory contains numerous `.incomplete` shard files, and the final `transformer/` directory is missing all `diffusion_pytorch_model-0000X-of-00003.safetensors` weight files.

However, the app's `is_cp_downloaded()` check in `model_download_specs.py` only verifies that the model directory **exists**, not that all required shard files are present. This causes the app to skip re-downloading, leaving the model in a broken state.

#### Suggested Fix
Improve `is_cp_downloaded()` to validate that all expected model files (including sharded checkpoints) are present and non-empty before marking a model as downloaded. Alternatively, add a "Verify / Repair Models" button in settings.
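
The presence check described above could look roughly like the following. This is a minimal sketch, not the app's actual code: the helper names (`expected_shard_names`, `find_missing_files`, `is_sharded_folder_complete`) are hypothetical, and the shard naming pattern and the count of 3 are taken only from the file name in the error log.

```python
from pathlib import Path


def expected_shard_names(stem: str, total: int, suffix: str = ".safetensors") -> list[str]:
    """Build shard names such as 'stem-00001-of-00003.safetensors'."""
    return [f"{stem}-{i:05d}-of-{total:05d}{suffix}" for i in range(1, total + 1)]


def find_missing_files(directory: Path, names: list[str]) -> list[str]:
    """Return the names that are absent from `directory` or are zero bytes long."""
    missing = []
    for name in names:
        candidate = directory / name
        if not candidate.is_file() or candidate.stat().st_size == 0:
            missing.append(name)
    return missing


def is_sharded_folder_complete(directory: Path, stem: str, total: int) -> bool:
    """Treat a folder as downloaded only if every shard exists and is non-empty."""
    if not directory.is_dir():
        return False
    return not find_missing_files(directory, expected_shard_names(stem, total))
```

In practice the expected file list would more likely come from the app's model spec than from a hard-coded stem and count. A non-empty check also cannot detect a truncated shard, so comparing sizes or hashes would be a stronger test.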

**Workaround:** Manually delete `models/Z-Image-Turbo/` and `models/.downloading/Z-Image-Turbo/`, then restart the app to trigger a fresh download.

---

### Bug 3: "Generation Failed" in Video Editor → Retake

#### Description
Using the **Retake** feature on a video clip in Video Editor mode results in "Generation Failed".

#### Error Log
```
Video width and height must be multiples of 32. Got 720x1280.
```

#### Root Cause Analysis
The LTX model requires video dimensions to be multiples of 32. A 720×1280 video fails because **720 is not divisible by 32** (32×22=704, 32×23=736). The `_validate_video_metadata()` method in `retake_handler.py` raises an HTTP 400 error without any attempt to auto-correct the dimensions.

#### Suggested Fix
Before validation, automatically resize non-compliant videos with ffmpeg, snapping each side down to a multiple of 32:

```python
@staticmethod
def _ensure_video_dimensions(video_path: str) -> str:
    meta = get_videostream_metadata(video_path)
    width, height = meta.width, meta.height
    if width % 32 == 0 and height % 32 == 0:
        return video_path
    new_width = (width // 32) * 32
    new_height = (height // 32) * 32
    # ffmpeg resize to snapped dimensions
    ...
```

Then call `_ensure_video_dimensions()` before `_validate_video_metadata()` in `_run_local_retake()`.

**Workaround:** Pre-process videos with ffmpeg to ensure width/height are multiples of 32 before importing.
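
For the pre-processing step, the arithmetic and the ffmpeg invocation might be sketched as follows. The helper names `snap_down` and `build_resize_command` are hypothetical, and the command is only built here, not executed. It uses ffmpeg's `scale` filter, which stretches the frame slightly; a `crop` filter would be an alternative if preserving the aspect ratio matters more than keeping every pixel.

```python
from typing import Optional


def snap_down(value: int, multiple: int = 32) -> int:
    """Round `value` down to a multiple of `multiple`, but never below one multiple."""
    return max(multiple, (value // multiple) * multiple)


def build_resize_command(
    src: str, dst: str, width: int, height: int, multiple: int = 32
) -> Optional[list[str]]:
    """Return an ffmpeg argument list, or None if the video is already compliant."""
    new_w = snap_down(width, multiple)
    new_h = snap_down(height, multiple)
    if (new_w, new_h) == (width, height):
        return None
    # -y overwrites dst, -vf applies the scale filter, -c:a copy keeps audio as-is.
    return ["ffmpeg", "-y", "-i", src, "-vf", f"scale={new_w}:{new_h}", "-c:a", "copy", dst]


if __name__ == "__main__":
    # The 720x1280 clip from the error log becomes 704x1280.
    print(build_resize_command("input.mp4", "output.mp4", 720, 1280))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed.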

---

### Bug 4: Model Download Does Not Support Resume; Interrupted Downloads Lose All Progress

#### Description
When downloading large models (e.g., Z-Image-Turbo, ~20GB), if the download is interrupted at 90% due to a network hiccup, app crash, or system sleep, restarting the app causes the download to **start from 0% instead of resuming**.

#### Root Cause Analysis
`huggingface_hub.snapshot_download()` **natively supports resumable downloads** via HTTP Range requests and `.incomplete` shard files. However, LTX Desktop's error-handling logic in `download_handler.py` destroys this resume capability:

```python
# backend/handlers/download_handler.py
except Exception:
    self.cleanup_downloading_dir()  # deletes ALL temporary files
    raise
```

`cleanup_downloading_dir()` recursively deletes the entire `.downloading/` directory, which contains every model's in-progress shard files, not just the one that failed. This forces a full re-download.

Additionally, `is_cp_downloaded()` does not verify that sharded checkpoint files are actually complete, so partially downloaded directories can be mistaken for fully downloaded ones.

#### Suggested Fix
1. **Scope cleanup to the failed model only:**
   ```python
   except Exception:
       self._cleanup_failed_checkpoint(cp_id)  # only remove this model's temp files
       raise
   ```

2. **Trust `snapshot_download`'s built-in resume:** Do not manually wipe `.downloading/` on transient errors (network timeout, connection reset). Let `huggingface_hub` handle its own `.incomplete` files.

3. **Validate checkpoint integrity before marking as downloaded:**
   ```python
   def is_cp_downloaded(models_dir: Path, cp_id: ModelCheckpointID) -> bool:
       path = resolve_model_path(models_dir, cp_id)
       spec = get_model_cp_spec(cp_id)
       if not path.exists():
           return False
       if spec.is_folder:
           return _validate_folder_checkpoint(path, spec)  # verify shard files exist
       return path.stat().st_size == spec.expected_size_bytes
   ```

**Workaround:** Keep LTX Desktop open and prevent the system from sleeping until the download completes. If interrupted, the only recovery is to wait for a full re-download.
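
Fixes 1 and 2 above could be combined along these lines. This is a sketch under assumptions: `cleanup_failed_checkpoint` and `handle_download_error` are hypothetical names, the `.downloading/<model>/` layout is inferred from the paths quoted in this report, and the set of "transient" exception types is a placeholder (HTTP libraries often raise their own exception classes).

```python
import shutil
from pathlib import Path

# Placeholder set of errors after which partial files should be kept for resume.
TRANSIENT_ERRORS = (ConnectionError, TimeoutError)


def cleanup_failed_checkpoint(downloading_dir: Path, model_dir_name: str) -> bool:
    """Delete only one model's temp directory; return True if it was removed."""
    root = downloading_dir.resolve()
    target = (downloading_dir / model_dir_name).resolve()
    # Refuse to touch anything that is not strictly inside the temp root.
    if root not in target.parents:
        raise ValueError(f"{target} is not inside {root}")
    if target.is_dir():
        shutil.rmtree(target)
        return True
    return False


def handle_download_error(error: Exception, downloading_dir: Path, model_dir_name: str) -> str:
    """Keep partial files on transient errors; otherwise clean this model only."""
    if isinstance(error, TRANSIENT_ERRORS):
        return "kept"
    cleanup_failed_checkpoint(downloading_dir, model_dir_name)
    return "cleaned"
```

Other models' in-progress files are never touched, so one failed download no longer forces every model to restart from 0%.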

---

## Part 2: Feature Requests

### Feature 1: Multi-Image Reference for Image Generation

#### Description
Currently, **Gen Space → Generate Image** only allows uploading **a single reference image**. Users often need to combine multiple visual references (e.g., character pose + background style + lighting reference) to guide generation.

#### Proposed Behavior
- Allow uploading **multiple reference images** (two or more)
- Each image can have an optional text label (e.g., "character", "background", "style")
- The model uses all references as conditioning inputs for the generation pipeline

#### Use Cases
- Character design: combine a face reference + pose reference + costume reference
- Scene composition: combine a background reference + lighting reference + subject reference
- Style transfer: combine a content image + multiple style images

---

### Feature 2: Multi-Reference Video Generation with Character Consistency

#### Description
Currently, **Gen Space → Generate Video** (and related video workflows) supports very limited reference input. For narrative content, especially with **multiple characters and dialogue scenes**, it is critical to maintain visual consistency across frames and shots.

#### Proposed Behavior
1. **Multi-Reference Image Upload**
   - Allow uploading **multiple character reference images** (front/side/profile views)
   - Allow uploading **scene/mood board references** (lighting, color palette, environment)
   - Allow uploading **previous video clips** as motion/style references

2. **Character Consistency Lock**
   - Extract a visual embedding/identity vector from character reference images
   - Lock this embedding across all generated frames to prevent face/body drift
   - Support **multiple named characters** each with their own locked identity

3. **Multi-Person Dialogue Scene Support**
   - In the prompt/script input, allow tagging characters by name (e.g., "[Alice] says 'Hello' while [Bob] nods")
   - The generator ensures the correct character model appears in the correct position with consistent appearance
   - Auto-detect speaker turns and generate appropriate lip-sync / gesture variations
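
As an illustration of the character tagging in item 3, the `[Name]` tags could be pulled out of a prompt and checked against the registered characters before generation starts. This is only a sketch of one possible syntax; the function names and the idea of a registered-name set are assumptions, not part of LTX Desktop.

```python
import re

# Matches a bracketed tag such as [Alice]; nested brackets are not supported.
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def extract_character_tags(prompt: str) -> list[str]:
    """Return tagged names in order of first appearance, without duplicates."""
    names: list[str] = []
    for raw in TAG_PATTERN.findall(prompt):
        name = raw.strip()
        if name and name not in names:
            names.append(name)
    return names


def find_unregistered(prompt: str, registered: set[str]) -> list[str]:
    """List tagged names that have no registered character reference."""
    return [name for name in extract_character_tags(prompt) if name not in registered]


if __name__ == "__main__":
    prompt = "[Alice] says 'Hello' while [Bob] nods"
    print(extract_character_tags(prompt))
    print(find_unregistered(prompt, {"Alice"}))
```

Reporting unregistered names up front would let the UI ask for missing reference images instead of failing partway through generation.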

#### Use Cases
- AI short film production with recurring characters
- Animated dialogue scenes with 2+ characters
- Brand video production with consistent mascot/avatar appearance
- Storyboarding with visual continuity across shots

#### Technical Considerations
- May require IP-Adapter or similar identity-preserving conditioning mechanisms
- Could leverage existing face-embedding models (e.g., ArcFace, InsightFace) for consistency
- May need a "character library" UI where users pre-register characters with multi-angle photos

---

## Summary

| # | Type | Title | Priority |
|---|------|-------|----------|
| 1 | Bug | Image upload silent failure in Video Editor Assets | High |
| 2 | Bug | Gen Space Generate Image fails due to incomplete model download | High |
| 3 | Bug | Retake fails on non-32-multiple video dimensions | Medium |
| 4 | Bug | Model download does not support resume; interrupted downloads restart from 0% | High |
| 5 | Feature | Multi-image reference for image generation | Medium |
| 6 | Feature | Multi-reference video generation with character consistency & dialogue | High |

Thank you for the great work on LTX Desktop! 
