System Environment:
- LM Studio Version: 0.4.17
- OS: Windows 10
- Hardware: AMD Ryzen 7 5700X3D | NVIDIA RTX 5060 Ti (16GB VRAM) | 32GB RAM
- Model Tested: Qwen 3.6 35B A3B (GGUF) with a 64K context
Description of the Bug & Visual Evidence:
The "Limit Model Offloading to GPU Dedicated Memory" toggle appears to have inverted logic when running Qwen 3.6 35B A3B. The switch does the opposite of what its label says:
- When ON (toggle enabled): VRAM usage caps at ~13.5 GB and part of the processing is offloaded to the CPU, causing CPU usage to spike. Performance drops to ~23 tok/s.
- When OFF (toggle disabled): The backend allocates the maximum VRAM (~15.2 GB), shared memory stays at 0 GB, and performance reaches 72+ tok/s with low CPU usage.
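
To make the VRAM figures above easier to reproduce, here is a minimal sketch that reads dedicated GPU memory through `nvidia-smi` while the model is loaded, once with the toggle ON and once with it OFF. It is not part of LM Studio; the helper names `parse_vram_csv` and `read_vram` are hypothetical, and it assumes `nvidia-smi` is on the PATH. As far as I know, `memory.used` covers dedicated VRAM only, so Windows shared GPU memory would still need to be checked in Task Manager.

```python
import subprocess


def parse_vram_csv(text):
    """Turn nvidia-smi 'used, total' rows (MiB) into (used_gib, total_gib) tuples."""
    rows = []
    for line in text.strip().splitlines():
        used_mib, total_mib = (float(value) for value in line.split(","))
        rows.append((round(used_mib / 1024, 2), round(total_mib / 1024, 2)))
    return rows


def read_vram():
    """Query every NVIDIA GPU once; assumes nvidia-smi is installed and on PATH."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_vram_csv(result.stdout)
```

Calling `read_vram()` during generation in each toggle state and attaching both readings to the report would let others compare their numbers against the ~13.5 GB and ~15.2 GB values observed here.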
