Add preset switch failed callback for shader error logging by djj0s3 · Pull Request #23 · projectM-visualizer/gst-projectm

djj0s3 · 2026-04-02T22:44:47Z

Summary

- Registers projectm_set_preset_switch_failed_event_callback after projectm_create()
- Logs preset filename and exact compilation error message via g_printerr

Why

ProjectM silently swallows shader compilation errors by default. When generated presets fail to compile (producing black frames), there's no way to know WHY. This callback surfaces the exact HLSL/GLSL error message so we can fix the preset generation pipeline.
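
A minimal sketch of what such a callback might look like, assuming the libprojectM 4 C API, where the failure callback is assumed to receive the preset filename, an error message, and a user-data pointer. The helper `format_preset_failure` and the message layout are hypothetical and are not the code in this PR. The registration call appears only as a comment so the block compiles without projectM headers.

```c
#include <stdio.h>

/* Hypothetical helper: builds the log line into buf and returns the
 * number of characters snprintf would have written. */
int format_preset_failure(char *buf, size_t len,
                          const char *preset_filename,
                          const char *message)
{
    return snprintf(buf, len, "[projectM] preset switch failed: %s: %s\n",
                    preset_filename ? preset_filename : "(unknown)",
                    message ? message : "(no message)");
}

/* Callback with the shape assumed for projectm_preset_switch_failed_event:
 * (const char *preset_filename, const char *message, void *user_data). */
void on_preset_switch_failed(const char *preset_filename,
                             const char *message,
                             void *user_data)
{
    char line[1024];
    (void)user_data;              /* unused in this sketch */
    format_preset_failure(line, sizeof line, preset_filename, message);
    fputs(line, stderr);          /* the PR itself logs via g_printerr */
}

/* Registration, assumed to run right after projectm_create():
 *   projectm_set_preset_switch_failed_event_callback(handle,
 *       on_preset_switch_failed, NULL);
 */
```

Keeping the formatting in a separate helper makes the message testable without a GL context or a loaded preset.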

Test plan

- Build plugin and load a preset with intentional shader errors
- Verify error message appears in stderr output
- Verify valid presets still render normally (no regression)

Generated with Claude Code

… control (pass=cbr) was ineffective for ProjectM's highly complex visual content
2. Switching to quality-based encoding: Using quantizer=35 with CRF instead of fixed bitrate
3. Adding quality constraints: qp-max=50 to prevent quality from degrading too much
4. Optimizing for speed: speed-preset=ultrafast for faster encoding

- Log stdout/stderr from convert.sh on both success and failure
- Add environment diagnostics to convert.sh (GPU detection, GStreamer plugin check)
- Add pre-flight checks for file permissions and accessibility
- Improve error visibility in Runpod logs

This should help identify why jobs are failing with exit code 1.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Update OpenGL version to 4.5 for better compatibility
- Add explicit GStreamer plugin paths and scanner location
- Respect Runpod's NVIDIA_VISIBLE_DEVICES setting (don't override)
- Add LD_LIBRARY_PATH to ensure libraries are found
- Improve NVIDIA driver capabilities configuration

These changes should resolve GPU access and library loading issues.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Added gstreamer1.0-gl package to Dockerfile dependencies
- Provides glcolorconvert and gldownload elements needed for OpenGL texture conversion
- Resolves "no element glcolorconvert/gldownload" pipeline errors
- Built and pushed as v3 and latest tags

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Clean up stale X lock files before starting Xvfb
- Kill existing Xvfb processes on display 99
- Enable GLX extension in Xvfb for better GL compatibility
- Use GLX platform instead of X11 for software rendering
- Improve gpu_accessible() to test nvidia-smi functionality
- Add sleep to ensure Xvfb is ready before use

These changes resolve the "Server is already active for display 99" error and improve GPU detection when nvidia-smi works but devices aren't exposed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Add explicit video/x-raw(memory:GLMemory),format=RGBA caps
- Ensures proper capability negotiation in headless EGL mode
- Resolves "could not link projectm0 to glcolorconvertelement0" error

The pipeline now explicitly specifies RGBA format at each GL stage:
projectm -> RGBA(GLMemory) -> glcolorconvert -> RGBA(GLMemory) -> gldownload -> RGBA

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- ProjectM plugin only supports ABGR format output
- Removed explicit format=RGBA caps that were causing negotiation failure
- Let glcolorconvert and videoconvert handle format conversion automatically
- Resolves "projectm0 can't handle caps format=(string)RGBA" error

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- ProjectM GL context fails in headless EGL mode on Runpod
- Always use Xvfb for ProjectM rendering (works reliably with X11 GL)
- Detect GPU separately for hardware encoding (nvh264enc)
- Maintains best of both: stable rendering + GPU-accelerated encoding

This resolves the persistent "could not link projectm0 to glcolorconvertelement0" errors caused by GL context initialization failures in headless EGL mode.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Calculate display number based on PID: DISPLAY_NUM = 99 + (PID % 100)
- Prevents conflicts when multiple jobs run simultaneously
- Each job gets its own X display (range :99 to :198)
- Removes only the specific lock file for this display

Resolves issues with concurrent jobs interfering with each other's Xvfb instances.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
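
The display-number arithmetic from this commit can be sketched in C as follows. The function names are hypothetical (the actual logic lives in a shell script), and the lock-file path follows the usual `/tmp/.X<n>-lock` convention used by X servers.

```c
#include <stdio.h>

/* Maps a process ID to an X display number in the range 99..198,
 * following DISPLAY_NUM = 99 + (PID % 100) from the commit message. */
int display_num_for_pid(long pid)
{
    return 99 + (int)(pid % 100);
}

/* Writes the lock-file path for that display into buf,
 * for example /tmp/.X144-lock for display 144. */
int display_lock_path(char *buf, size_t len, int display_num)
{
    return snprintf(buf, len, "/tmp/.X%d-lock", display_num);
}
```

Two concurrent jobs whose PIDs differ by a multiple of 100 still map to the same display, so this scheme lowers the chance of a conflict without ruling it out.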

- Change GST_GL_API from opengl to opengl3
- Resolves GL context creation error with Xvfb/Mesa
- Mesa provides opengl3 API, not legacy opengl
- Fixes: "Cannot create context with user requested api (opengl)"

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Start HTTP server by default when not in serverless environment
- Only use serverless handler when RUNPOD_ENDPOINT_ID or RUNPOD_JOB_ID present

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Install openssh-server in container
- Generate SSH host keys during build
- Configure SSH for root login
- Start SSH daemon in start.sh before main process
- Expose ports 22 and 8000

Container works locally but RunPod pod readiness still being debugged.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Build for amd64 architecture (was incorrectly building arm64)
- Add DRI device accessibility check before using EGL-GBM
- Disable EGL surfaceless mode due to framebuffer incompatibility with ProjectM
- Fall back to Mesa software rendering when DRI is not accessible
- Improve start.sh to handle shell commands passed via dockerArgs

The RunPod container now works reliably with Mesa software rendering when GPU DRI devices are not accessible due to permission issues.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The EGL-device mode (EGL_PLATFORM_DEVICE_EXT) successfully creates an OpenGL context on RunPod, but ProjectM has framebuffer issues because it renders to the default framebuffer (0) which doesn't exist in headless EGL modes.

This is a fundamental limitation of ProjectM's rendering approach - it expects a display surface with a real framebuffer. Fixing this would require modifying the gst-projectm plugin to use FBOs.

For now, fall back to Mesa software rendering on RunPod which is reliable, though slower than GPU rendering.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

This commit adds support for rendering to an FBO (Framebuffer Object) in headless EGL environments where the default framebuffer (0) doesn't exist.

Plugin changes (src/plugin.c):
- Add GST_PROJECTM_FORCE_FBO environment variable to force FBO mode
- Detect headless mode by checking if framebuffer 0 is complete
- Create and bind FBO before ProjectM initialization in headless mode
- Never unbind to framebuffer 0 in headless mode
- Properly manage FBO lifecycle to avoid binding framebuffer 0 during resize

Convert script changes (convert.sh):
- Add EGL-device surfaceless mode for NVIDIA GPUs without DRI access
- Set GST_PROJECTM_FORCE_FBO=1 in all headless EGL modes
- Add diagnostic output for GST_PROJECTM_FORCE_FBO and GST_GL settings

Known limitation: ProjectM-4 internally uses framebuffer 0 during its initialization phase. This causes GL_INVALID_FRAMEBUFFER_OPERATION errors in headless EGL modes even with our FBO workaround. Full headless GPU rendering requires either:
- Proper DRI device access (currently blocked on RunPod: permission 660)
- Or patches to ProjectM-4 to not use framebuffer 0 during init

The Mesa software rendering fallback continues to work reliably.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Reorder GPU rendering methods to prioritize Xvfb + NVIDIA GLX which works reliably on Vast.ai and similar cloud GPU platforms:
1. Xvfb + NVIDIA GLX (BEST) - virtual X server with HW-accelerated GL
2. Xorg dummy + NVIDIA GLX - fallback if Xvfb fails
3. EGL-GBM (experimental) - may not work with all NVIDIA drivers
4. EGL-device surfaceless - when DRI isn't accessible

Key improvements:
- Xvfb starts first without NVIDIA vendor set
- NVIDIA GLX vendor is set for client apps only (GStreamer, glxinfo)
- Verifies NVIDIA GLX actually works before proceeding
- Falls back to Mesa if driver doesn't support this mode

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

When NVIDIA GLX fails the glxinfo test, now properly falls through to EGL-GBM and EGL-device methods before falling back to software rendering. Previously, GLX failure immediately triggered Mesa fallback, causing slow software rendering on Vast.ai instances.

Changes:
- Use GPU_METHOD_FOUND flag to track successful GPU initialization
- Remove broken "Xorg dummy + NVIDIA GLX" method (requires nvidia_drv.so)
- Prioritize EGL-GBM as reliable GPU method after GLX fails
- EGL-device surfaceless as last GPU resort before Mesa

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

EGL-GBM and EGL-device surfaceless often fail with NVIDIA drivers on Vast.ai. Now these methods are opt-in (FORCE_EGL=1) and include validation tests before use. When NVIDIA GLX fails, go directly to Mesa software rendering which is more reliable.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Method 1: Use Xorg with modesetting driver + glamor acceleration
- Works through DRM/KMS with nvidia-container-runtime
- Uses xorg-nvidia.conf which enables GPU acceleration
- More reliable than Xvfb + NVIDIA GLX which requires server-side support

Method 2: Xvfb + NVIDIA GLX (kept as fallback)
- Only works when NVIDIA GLX server modules are available

Both methods test with glxinfo before proceeding.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

… frames

Root cause: projectm_opengl_render_frame() renders to ProjectM's internal buffer, not our external FBO. This caused all frames to be black.

Fix: Use projectm_opengl_render_frame_fbo(handle, fbo_id) when an FBO is available. This renders directly to our framebuffer object.

Also improved convert.sh GPU initialization:
- Add GPU environment diagnostics for debugging
- Reject llvmpipe/software rendering (causes black frames with gst-projectm)
- Make Xvfb + NVIDIA GLX the preferred method for Vast.ai
- Remove DRI requirement for Method 2 (GLX works without DRI access)
- Add detailed EGL device enumeration for container environments

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Changes:
- Switch base image from ubuntu:24.04 to nvidia/cuda:12.2.0-devel-ubuntu22.04
- Install nv-codec-headers for NVENC/NVDEC support
- Build gst-plugins-bad from source with nvcodec=enabled
- Add libnvidia-encode/decode libraries
- Include 'video' capability in NVIDIA_DRIVER_CAPABILITIES
- Update GST_PLUGIN_PATH to include nvcodec plugin location

This enables hardware H.264 encoding via nvh264enc, which is ~2x faster than software x264 encoding and offloads work from the CPU to the GPU's dedicated video encoding hardware (NVENC).

Combined with the mesh optimization (640x480 → 220x140), this should enable faster-than-realtime rendering for long audio files.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Use nvidia/cuda:12.2.0-devel-ubuntu22.04 base image
- Install nv-codec-headers for NVENC/NVDEC
- Build nvcodec GStreamer plugin from gstreamer 1.20.7 monorepo
- Add libnvidia-encode/decode libraries
- Include 'video' capability for NVENC access

The nvh264enc plugin enables hardware H.264 encoding, offloading encoding from CPU to GPU's dedicated NVENC hardware for ~2x faster video encoding.

Image size: 8.79GB (larger due to CUDA devel libraries)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

When running in Docker with -e DISPLAY=:0 -v /tmp/.X11-unix:/tmp/.X11-unix, the container should use the host's X server instead of starting its own. This enables:
- NVIDIA GPU rendering via host Xorg with NVIDIA driver
- NVENC hardware encoding (host GPU access)
- Proper FBO rendering (no black frames)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Lambda Labs and other compute-focused cloud instances have CUDA but not OpenGL by default. This change:
- Attempts to install libnvidia-gl for EGL/GLX support
- Creates /usr/share/glvnd/egl_vendor.d/10_nvidia.json so libglvnd can find NVIDIA's EGL implementation

With this, the container can use GPU-accelerated OpenGL rendering when nvidia-container-toolkit injects the host's NVIDIA libraries.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Removed libnvidia-encode-525 and libnvidia-decode-525 packages. These caused NVENC to fail with "unsupported device" when the host runs a different driver version (e.g., 570 vs 525).

Kept libnvidia-gl for ProjectM OpenGL rendering (EGL/GLX). nvidia-container-toolkit will inject the correct encode/decode libraries at runtime when NVIDIA_DRIVER_CAPABILITIES=video is set.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The easter-egg property controls a startup logo/feature that shows the ProjectM W logo. Setting it to 0 disables this.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Replaced projectM's default logo textures with user's custom VJ logo. Added multiple filename variations to cover all possible projectM texture references:
- M.tga, m_logo.tga, mlogo.tga
- projectm.tga, project.tga
- headphones.tga
- spiral.tga
- logo.tga
- pM.tga

These will be included in the Docker image and override any default projectM logos that appear during idle/startup.

- Add vj_studio_logo.png for "Made With VJ Studio" overlay
- Enable faststart=true on mp4mux for better YouTube streaming

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Load first preset immediately on init to avoid showing idle screen
- Add gst_projectm_load_first_timeline_preset() for timeline mode
- Prevent timeline_activate from resetting to index -1 if first preset already loaded
- Add COPY for vj_studio_logo.png in Dockerfile

This fixes the issue where the ProjectM "M" logo would briefly appear at the start of videos before transitioning to the first real preset.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Cropped top portion of logo to remove "Made With" text, leaving just the VJ character and "STUDIO" for a cleaner bottom-right watermark appearance.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add elapsed_seconds to timeline switch log message so we can see the actual PTS value when each switch occurs
- Add periodic PTS diagnostic (every 600 frames / ~10s) logging both audio and video buffer PTS to detect drift between them
- Add render_frame_count to GstProjectMPrivate for frame tracking

This helps diagnose an issue where timeline entries get skipped, possibly due to video PTS drifting ahead of audio PTS.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

When CPU encoding is used (x264enc fallback), video PTS runs at 0.5-0.7x of audio time, causing the timeline engine to skip entries. This resulted in only 90/190 timeline entries being visited for a 53-min DJ set.

Audio PTS advances at the true playback rate regardless of video encoding speed, ensuring all timeline entries are visited correctly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
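
To illustrate why the clock source matters, here is a rough sketch of a timeline lookup keyed on elapsed seconds. The function names and the plain array of start times are hypothetical stand-ins for the plugin's timeline structures. The only borrowed fact is that GStreamer timestamps are expressed in nanoseconds.

```c
#include <stddef.h>
#include <stdint.h>

/* Converts a nanosecond timestamp (the unit GStreamer uses for PTS)
 * to seconds. */
double pts_to_seconds(uint64_t pts_ns)
{
    return (double)pts_ns / 1e9;
}

/* Returns the index of the last entry whose start time is <= elapsed,
 * or -1 if elapsed precedes every entry. start_times must be ascending. */
int timeline_index_at(const double *start_times, size_t count, double elapsed)
{
    int index = -1;
    for (size_t i = 0; i < count && start_times[i] <= elapsed; ++i)
        index = (int)i;
    return index;
}
```

With entries starting at 0 s, 10 s and 20 s, an audio PTS of 25 s selects the third entry, while a video PTS lagging at 0.6x (15 s) still selects the second. The lookup is only as accurate as the clock it is fed.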

…tr_array_sort

g_ptr_array_sort() passes each comparison argument as a pointer to the array slot (GstProjectMTimelineEntry**), not a direct pointer to the entry. Without the extra dereference, the comparator was interpreting raw memory addresses as gdouble start_time values, resulting in a semi-random sort order.

This caused large sections of the timeline to be unreachable: the fast-path optimization in timeline_find_target_index() would stay stuck on an early index because the "next" entry in the corrupted sort order had a much later start_time, making the before_next check always true.

Symptoms: only ~89 of 190 timeline entries visited during a 53-min DJ set render, with 9-17 minute gaps where the same preset played.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
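
The same pitfall exists with standard qsort() on an array of pointers, which also hands the comparator the address of each slot. The sketch below uses qsort() so it builds without GLib. `TimelineEntry` is a simplified stand-in for GstProjectMTimelineEntry, and the function names are illustrative rather than taken from the plugin.

```c
#include <stdlib.h>

/* Stand-in for GstProjectMTimelineEntry; only the sort key is modeled. */
typedef struct {
    double start_time;
} TimelineEntry;

/* Each argument points at an array slot (TimelineEntry **), so one extra
 * dereference is needed before reading start_time. */
int compare_entries_by_start(const void *a, const void *b)
{
    const TimelineEntry *ea = *(const TimelineEntry *const *)a;
    const TimelineEntry *eb = *(const TimelineEntry *const *)b;
    return (ea->start_time > eb->start_time) - (ea->start_time < eb->start_time);
}

/* Sorts an array of entry pointers in ascending start_time order. */
void sort_timeline(TimelineEntry **entries, size_t count)
{
    qsort(entries, count, sizeof *entries, compare_entries_by_start);
}
```

Casting the arguments straight to `const TimelineEntry *` would compile without warnings, which is why this class of bug tends to surface only as a wrong ordering at runtime.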

Helps verify the sort comparator fix is working by logging start_time/duration/end_time of the first 20 entries after g_ptr_array_sort in gst_projectm_load_timeline().

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

G_DEFINE_TYPE_WITH_CODE was initializing the debug category as "gstprojectm" while plugin_init used "projectm". Since the type init runs AFTER plugin_init (via gst_element_register), it overwrote the category variable with "gstprojectm" which didn't match the GST_DEBUG=projectm:4 setting, causing INFO-level diagnostic messages (PTS tracking, sort order verification) to be suppressed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Use GST_WARNING_OBJECT instead of GST_INFO_OBJECT for timeline diagnostics so they appear regardless of debug category threshold. Includes "build v62" marker to verify correct binary is running.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The build v62 WARNING-level markers were temporary debugging aids to verify the timeline sort fix on RunPod. Now confirmed working (190/190 entries visited), downgrade back to INFO level for production.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Register projectm_set_preset_switch_failed_event_callback to log the exact error message when a preset fails to compile. ProjectM silently swallows these errors by default, making it impossible to debug why generated presets produce black frames.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

djj0s3 · 2026-04-02T22:45:03Z

Opened against upstream by mistake. This is for our fork only.

djj0s3 and others added 30 commits September 22, 2025 20:30

using gstreamer and projectM Docker

faca332

updated convert script

4a59080

conversion settings

aa6c2f6

fixes

91178ae

codex fixes

bd382d1

latest changes

40ebf2b

update docker

43ee792

Merge remote changes with local updates

a3dd765

Keep timeline-driven presets honest

56d84e1

let gst-convert respect timelines and use bitrate mode

20e221e

fixing a ton of conflicts. whoops

f3f51f6

drop unsupported vbv settings (again)

80ee497

updating logic to not rebuild the container on the fly

f6c2db6

making things work on runpod

75ace44

optimizations

dd577ef

runpod

6210cba

fixes

07c13b6

getting runpod working

b2be214

runpod

a713e93

runpod

33ee818

djj0s3 and others added 28 commits January 22, 2026 07:44

Fix start.sh to auto-detect pod vs serverless mode

ecdf277

Disable ProjectM easter-egg (W logo) at startup

f786229

Add VJ Studio logo and enable MP4 faststart for YouTube

f7d01a3

Remove "Made With" text from logo for cleaner watermark

32acca9

djj0s3 closed this Apr 2, 2026

djj0s3 wants to merge 84 commits into projectM-visualizer:master from djj0s3:feat/shader-error-callback
