Bug Reports & Feature Requests (v1.0.5) -- Image upload, model download, retake, multi-reference #118

Description

@zk3699

Bug Reports & Feature Requests

Environment

  • App Version: v1.0.5
  • OS: Windows 11
  • GPU: NVIDIA RTX (CUDA available)

Part 1: Bug Reports

Bug 1: Image Upload Fails Silently in Video Editor Assets Panel

Description

In Video Editor mode, clicking the upload button in the Assets panel opens the file picker dialog, but after selecting an image (JPG/PNG) and confirming, nothing happens. The Assets panel shows no new files, and no error popup appears. Drag-and-drop also does not work.

Uploading video files works correctly; only images are affected.

Steps to Reproduce

  1. Open LTX Desktop → switch to Video Editor mode
  2. In the left Assets panel, click the upload button
  3. Select a JPG or PNG image (e.g., 800×500 pixels)
  4. Click "Open": the dialog closes but the Assets panel remains empty

Expected Behavior

The selected image should appear in the Assets panel with a thumbnail.

Actual Behavior

Nothing happens. No error message, no file added.

Root Cause Analysis (from local debugging)

In electron/ipc/file-handlers.ts, the addVisualAssetToProject handler calls createVisualThumbnails() (which uses Python Pillow). When this call throws an exception (e.g., due to image format edge cases or Pillow issues), the exception propagates and causes the entire function to return { success: false }.

On the frontend (useEditorMediaImport.ts), this result is handled with:

if (!copied) continue;

This silently skips the failed file without any user-facing error.

Suggested Fix

Wrap createVisualThumbnails() and getVisualAssetDimensions() in try-catch blocks, log warnings, and fall back to using the original image path and default dimensions instead of failing the entire import:

let bigThumbnailPath: string = destPath;
let smallThumbnailPath: string | undefined;
let width = 1920;
let height = 1080;

try {
  const thumbs = createVisualThumbnails(destPath, type);
  bigThumbnailPath = thumbs.bigThumbnailPath;
  smallThumbnailPath = thumbs.smallThumbnailPath;
} catch (thumbError) {
  logger.warn(`Thumbnail creation failed for ${destPath}, using original: ${thumbError}`);
}

try {
  const dims = getVisualAssetDimensions(destPath, type);
  width = dims.width;
  height = dims.height;
} catch (dimError) {
  logger.warn(`Dimension probe failed for ${destPath}, using defaults: ${dimError}`);
}

Bug 2: "Generation Failed" in Gen Space → Generate Image

Description

In Gen Space mode, clicking Generate after entering a prompt always shows "Generation Failed" after a brief loading period.

Error Log

FileNotFoundError: D:\LTX Desktop\models\Z-Image-Turbo\transformer\
does not appear to have a file named
diffusion_pytorch_model-00001-of-00003.safetensors

Root Cause Analysis

The Z-Image-Turbo model download was interrupted (e.g., by app crash, network issue, or manual shutdown). The .downloading/Z-Image-Turbo/ directory contains numerous .incomplete shard files, and the final transformer/ directory is missing all diffusion_pytorch_model-0000X-of-00003.safetensors weight files.

However, the app's is_cp_downloaded() check in model_download_specs.py only verifies that the model directory exists, not that all required shard files are present. This causes the app to skip re-downloading, leaving the model in a broken state.

Suggested Fix

Improve is_cp_downloaded() to validate that all expected model files (including sharded checkpoints) are present and non-empty before marking a model as downloaded. Alternatively, add a "Verify / Repair Models" button in settings.
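
A minimal sketch of such a shard check is shown below. It assumes the shard naming pattern seen in the error log (diffusion_pytorch_model-0000X-of-0000N.safetensors); the helper names expected_shard_names and missing_shards are hypothetical and are not part of the LTX Desktop codebase.

```python
from pathlib import Path


def expected_shard_names(total: int, stem: str = "diffusion_pytorch_model") -> list[str]:
    """Build the shard file names for a checkpoint split into `total` parts."""
    return [f"{stem}-{i:05d}-of-{total:05d}.safetensors" for i in range(1, total + 1)]


def missing_shards(folder: Path, total: int) -> list[str]:
    """Return the expected shard names that are absent or empty in `folder`."""
    missing = []
    for name in expected_shard_names(total):
        shard = folder / name
        # A zero-byte file is treated the same as a missing one.
        if not shard.is_file() or shard.stat().st_size == 0:
            missing.append(name)
    return missing
```

A check like is_cp_downloaded() could then treat the model as downloaded only when missing_shards(...) returns an empty list. A size or hash comparison would be stricter, but it needs metadata that this sketch does not assume is available.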

Workaround: Manually delete models/Z-Image-Turbo/ and models/.downloading/Z-Image-Turbo/, then restart the app to trigger a fresh download.


Bug 3: "Generation Failed" in Video Editor → Retake

Description

Using the Retake feature on a video clip in Video Editor mode results in "Generation Failed".

Error Log

Video width and height must be multiples of 32. Got 720x1280.

Root Cause Analysis

The LTX model requires video dimensions to be multiples of 32. A 720×1280 video fails because 720 is not divisible by 32 (32×22=704, 32×23=736), even though 1280 (32×40) is valid. The _validate_video_metadata() method in retake_handler.py raises an HTTP 400 error without any attempt to auto-correct the dimensions.

Suggested Fix

Before validation, automatically resize non-compliant videos with ffmpeg, snapping each dimension down to the nearest multiple of 32:

@staticmethod
def _ensure_video_dimensions(video_path: str) -> str:
    meta = get_videostream_metadata(video_path)
    width, height = meta.width, meta.height
    if width % 32 == 0 and height % 32 == 0:
        return video_path
    new_width = (width // 32) * 32
    new_height = (height // 32) * 32
    # ffmpeg resize to snapped dimensions
    ...

Then call _ensure_video_dimensions() before _validate_video_metadata() in _run_local_retake().
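
The elided resize step could be sketched roughly as follows. The helper names snap_down and build_resize_command are hypothetical; the ffmpeg options used (-y, -i, -vf scale=W:H, -c:a copy) are standard, but the exact encoding settings the app would need are an assumption left open here.

```python
def snap_down(value: int, multiple: int = 32) -> int:
    """Round `value` down to a multiple of `multiple`, never below one multiple."""
    return max(multiple, (value // multiple) * multiple)


def build_resize_command(src: str, dst: str, width: int, height: int) -> list[str]:
    """Build an ffmpeg argument list that rescales `src` to snapped dimensions."""
    new_w, new_h = snap_down(width), snap_down(height)
    return [
        "ffmpeg", "-y",                    # overwrite dst if it already exists
        "-i", src,
        "-vf", f"scale={new_w}:{new_h}",   # rescale to the snapped size
        "-c:a", "copy",                    # leave any audio stream untouched
        dst,
    ]
```

For the failing 720x1280 clip this yields scale=704:1280. The list can be passed to subprocess.run(cmd, check=True). Note that scaling changes the aspect ratio slightly; cropping instead of scaling would preserve it at the cost of losing a few edge pixels.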

Workaround: Pre-process videos with ffmpeg to ensure width/height are multiples of 32 before importing.


Bug 4: Model Download Does Not Support Resume (Interrupted Downloads Lose All Progress)

Description

When downloading large models (e.g., Z-Image-Turbo, ~20GB), if the download is interrupted at 90% due to a network hiccup, app crash, or system sleep, restarting the app causes the download to start from 0% instead of resuming.

Root Cause Analysis

huggingface_hub.snapshot_download() natively supports resumable downloads via HTTP Range requests and .incomplete shard files. However, LTX Desktop's error-handling logic in download_handler.py destroys this resume capability:

# backend/handlers/download_handler.py
except Exception:
    self.cleanup_downloading_dir()  # ← deletes ALL temporary files
    raise

cleanup_downloading_dir() recursively deletes the entire .downloading/ directory, which contains every model's in-progress shard files, not just the one that failed. This forces a full re-download.

Additionally, is_cp_downloaded() does not verify that sharded checkpoint files are actually complete, so partially downloaded directories can be mistaken for fully downloaded ones.

Suggested Fix

  1. Scope cleanup to the failed model only:

    except Exception:
        self._cleanup_failed_checkpoint(cp_id)  # only remove this model's temp files
        raise

  2. Trust snapshot_download's built-in resume: Do not manually wipe .downloading/ on transient errors (network timeout, connection reset). Let huggingface_hub handle its own .incomplete files.

  3. Validate checkpoint integrity before marking as downloaded:

    def is_cp_downloaded(models_dir: Path, cp_id: ModelCheckpointID) -> bool:
        path = resolve_model_path(models_dir, cp_id)
        spec = get_model_cp_spec(cp_id)
        if not path.exists():
            return False
        if spec.is_folder:
            return _validate_folder_checkpoint(path, spec)  # verify shard files exist
        return path.stat().st_size == spec.expected_size_bytes
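
As an illustration of step 1, a scoped cleanup might look like the sketch below. It assumes each model stages its files under .downloading/<model name>/, as the Z-Image-Turbo paths above suggest; the function name and signature are hypothetical and do not reflect the actual download_handler.py.

```python
import shutil
from pathlib import Path


def cleanup_failed_checkpoint(downloading_dir: Path, model_name: str) -> bool:
    """Delete only `downloading_dir/model_name`; return True if something was removed."""
    root = downloading_dir.resolve()
    target = (downloading_dir / model_name).resolve()
    # Refuse to delete the staging root itself or anything outside it.
    if target == root or root not in target.parents:
        raise ValueError(f"refusing to delete outside staging dir: {target}")
    if not target.is_dir():
        return False
    shutil.rmtree(target)
    return True
```

In line with step 2, a helper like this would be reserved for failures that cannot be resumed (for example a corrupted file); on a plain network timeout the staging directory would be left alone so the next attempt can pick up the .incomplete files.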

Workaround: Keep LTX Desktop open and prevent the system from sleeping until the download completes. If interrupted, the only recovery is to wait for a full re-download.


Part 2: Feature Requests

Feature 1: Multi-Image Reference for Image Generation

Description

Currently, Gen Space → Generate Image only allows uploading a single reference image. Users often need to combine multiple visual references (e.g., character pose + background style + lighting reference) to guide generation.

Proposed Behavior

  • Allow uploading two or more reference images
  • Each image can have an optional text label (e.g., "character", "background", "style")
  • The model uses all references as conditioning inputs for the generation pipeline
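
One possible shape for the labeled references, purely as an illustration: the ReferenceImage type, the group_by_label helper, the "unlabeled" bucket, and the max_refs limit are all hypothetical and do not describe an existing LTX Desktop API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReferenceImage:
    path: str
    label: Optional[str] = None  # e.g. "character", "background", "style"


def group_by_label(refs: list[ReferenceImage], max_refs: int) -> dict[str, list[str]]:
    """Group reference image paths by label, enforcing a caller-chosen count limit."""
    if not 1 <= len(refs) <= max_refs:
        raise ValueError(f"expected between 1 and {max_refs} references, got {len(refs)}")
    grouped: dict[str, list[str]] = {}
    for ref in refs:
        # References without a label share one "unlabeled" bucket.
        grouped.setdefault(ref.label or "unlabeled", []).append(ref.path)
    return grouped
```

Labels are allowed to repeat so that the "content image + multiple style images" case can be expressed as several references sharing the "style" label.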

Use Cases

  • Character design: combine a face reference + pose reference + costume reference
  • Scene composition: combine a background reference + lighting reference + subject reference
  • Style transfer: combine a content image + multiple style images

Feature 2: Multi-Reference Video Generation with Character Consistency

Description

Currently, Gen Space → Generate Video (and related video workflows) supports very limited reference input. For narrative content, especially with multiple characters and dialogue scenes, it is critical to maintain visual consistency across frames and shots.

Proposed Behavior

  1. Multi-Reference Image Upload

    • Allow uploading multiple character reference images (front/side/profile views)
    • Allow uploading scene/mood board references (lighting, color palette, environment)
    • Allow uploading previous video clips as motion/style references
  2. Character Consistency Lock

    • Extract a visual embedding/identity vector from character reference images
    • Lock this embedding across all generated frames to prevent face/body drift
    • Support multiple named characters each with their own locked identity
  3. Multi-Person Dialogue Scene Support

    • In the prompt/script input, allow tagging characters by name (e.g., "[Alice] says 'Hello' while [Bob] nods")
    • The generator ensures the correct character model appears in the correct position with consistent appearance
    • Auto-detect speaker turns and generate appropriate lip-sync / gesture variations
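
The name tagging in the prompt could be parsed along these lines. This is only a sketch of the bracket syntax used in the example above; the function name extract_character_tags is hypothetical, and mapping each name to a locked identity is outside what it covers.

```python
import re

# Matches [Name] where Name contains no nested brackets.
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def extract_character_tags(prompt: str) -> list[str]:
    """Return the unique character names tagged as [Name], in order of first appearance."""
    seen: list[str] = []
    for name in TAG_PATTERN.findall(prompt):
        name = name.strip()
        if name and name not in seen:
            seen.append(name)
    return seen
```

Each returned name would then be looked up in whatever character registry holds the reference images, and an unknown name could be reported to the user before generation starts.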

Use Cases

  • AI short film production with recurring characters
  • Animated dialogue scenes with 2+ characters
  • Brand video production with consistent mascot/avatar appearance
  • Storyboarding with visual continuity across shots

Technical Considerations

  • May require IP-Adapter or similar identity-preserving conditioning mechanisms
  • Could leverage existing face-embedding models (e.g., ArcFace, InsightFace) for consistency
  • May need a "character library" UI where users pre-register characters with multi-angle photos

Summary

| # | Type    | Title                                                                        | Priority |
|---|---------|------------------------------------------------------------------------------|----------|
| 1 | Bug     | Image upload silent failure in Video Editor Assets                           | High     |
| 2 | Bug     | Gen Space Generate Image fails due to incomplete model download              | High     |
| 3 | Bug     | Retake fails on non-32-multiple video dimensions                             | Medium   |
| 4 | Bug     | Model download does not support resume; interrupted downloads restart from 0% | High     |
| 5 | Feature | Multi-image reference for image generation                                   | Medium   |
| 6 | Feature | Multi-reference video generation with character consistency & dialogue       | High     |

Thank you for the great work on LTX Desktop!
