This repo controls an external fan for a GPU by writing PWM values to an NZXT Smart Device v2 (nzxtsmart2). The improved version in this repo avoids polling nvidia-smi in a shell loop.
nvidia-smi is a userspace CLI layered on top of NVML. It works, but it is the wrong level for a long-running control loop:
- every sample spawns a process
- parsing CLI output is less robust than calling the driver API directly
- it is slower and adds another failure surface
The better order of preference is:
- NVIDIA temperature from `sysfs hwmon` if the driver exposes it
- direct `NVML` calls via `libnvidia-ml.so`
- no `nvidia-smi` fallback in the controller loop
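
The preference order above can be sketched roughly as follows. This is a minimal illustration, not the code in gpu-fan-control.py: the function names, the `root` parameter, and the choice of `temp1_input` as the sensor file are assumptions, and a long-running controller would normally initialize NVML once rather than on every sample.

```python
import ctypes
from pathlib import Path
from typing import Optional

HWMON_ROOT = Path("/sys/class/hwmon")


def read_sysfs_temp(name: str = "nvidia", root: Path = HWMON_ROOT) -> Optional[float]:
    """Return degrees Celsius from the first hwmon device called `name`, else None."""
    if not root.is_dir():
        return None
    for dev in sorted(root.iterdir()):
        try:
            if (dev / "name").read_text().strip() != name:
                continue
            # hwmon reports temperatures in millidegrees Celsius.
            return int((dev / "temp1_input").read_text().strip()) / 1000.0
        except (OSError, ValueError):
            continue
    return None


def read_nvml_temp(index: int = 0) -> float:
    """Read the GPU core temperature through NVML using ctypes."""
    lib = None
    # The runtime soname is usually libnvidia-ml.so.1; the unversioned
    # name is tried as well because it may exist on some hosts.
    for soname in ("libnvidia-ml.so.1", "libnvidia-ml.so"):
        try:
            lib = ctypes.CDLL(soname)
            break
        except OSError:
            continue
    if lib is None:
        raise RuntimeError("NVML library not found")
    if lib.nvmlInit_v2() != 0:
        raise RuntimeError("nvmlInit_v2 failed")
    try:
        handle = ctypes.c_void_p()
        rc = lib.nvmlDeviceGetHandleByIndex_v2(ctypes.c_uint(index), ctypes.byref(handle))
        if rc != 0:
            raise RuntimeError(f"nvmlDeviceGetHandleByIndex_v2 failed: {rc}")
        temp = ctypes.c_uint()
        # Sensor type 0 is NVML_TEMPERATURE_GPU.
        rc = lib.nvmlDeviceGetTemperature(handle, ctypes.c_int(0), ctypes.byref(temp))
        if rc != 0:
            raise RuntimeError(f"nvmlDeviceGetTemperature failed: {rc}")
        return float(temp.value)
    finally:
        lib.nvmlShutdown()


def read_gpu_temp(name: str = "nvidia", root: Path = HWMON_ROOT, index: int = 0) -> float:
    """Prefer sysfs hwmon; fall back to NVML. nvidia-smi is never invoked."""
    temp = read_sysfs_temp(name, root)
    return temp if temp is not None else read_nvml_temp(index)
```

Neither path spawns a process, which is the point of avoiding `nvidia-smi` in the loop.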

Files in this repo:
- `gpu-fan-control.py`: main controller
- `gpu-fan-control.service`: systemd unit
- `gpu-fan-control.env`: sample install-time config
- `gpu-fan-control.env.example`: optional config file template
- `install-live.sh`: helper script to install the service on a host

The controller is built around fail-safe defaults:
- writes `SAFE_PWM` on startup before entering the loop
- writes `SAFE_PWM` on shutdown and on any control/read failure
- exits after repeated failures so systemd can restart it cleanly
- writes `EMERGENCY_PWM` immediately if temperature reaches `CRITICAL_TEMP`
- uses systemd watchdog heartbeats so a stuck controller is restarted
- treats unexpectedly low fan RPM at high PWM as a fault and pushes emergency PWM
- uses an exclusive lock in `/run/gpu-fan-control/lock`
- prefers kernel `sysfs` temperature when available, otherwise uses direct `NVML`
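
To make the fail-safe rules above concrete, here is a hedged sketch of how the per-sample decision, the lock, and the watchdog heartbeat could look. The `Limits` values, the linear ramp, and every function name are illustrative assumptions, not the repo's actual defaults or curve; only `SAFE_PWM`, `EMERGENCY_PWM`, `CRITICAL_TEMP`, and the lock path come from this README.

```python
import fcntl
import os
import socket
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Limits:
    # Placeholder numbers for illustration only.
    safe_pwm: int = 180
    emergency_pwm: int = 255
    critical_temp: float = 85.0
    min_pwm: int = 60
    ramp_start: float = 40.0
    stall_pwm: int = 200  # at or above this PWM the fan should be spinning
    stall_rpm: int = 300  # RPM below this at high PWM counts as a fault


def decide_pwm(temp: Optional[float], rpm: Optional[int], last_pwm: int, lim: Limits) -> int:
    """Pick the next PWM value, checking failure cases before the normal curve."""
    if temp is None:
        return lim.safe_pwm  # temperature read failed
    if temp >= lim.critical_temp:
        return lim.emergency_pwm
    if rpm is not None and last_pwm >= lim.stall_pwm and rpm < lim.stall_rpm:
        return lim.emergency_pwm  # fan feedback suggests a stalled fan
    if temp <= lim.ramp_start:
        return lim.min_pwm
    # Hypothetical linear ramp from min_pwm up to emergency_pwm.
    frac = (temp - lim.ramp_start) / (lim.critical_temp - lim.ramp_start)
    return round(lim.min_pwm + frac * (lim.emergency_pwm - lim.min_pwm))


def acquire_lock(path: str = "/run/gpu-fan-control/lock") -> int:
    """Take a non-blocking exclusive flock; raises OSError if it is already held."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        raise
    return fd  # keep the descriptor open for the life of the process


def notify_watchdog() -> bool:
    """Send WATCHDOG=1 to systemd; returns False when NOTIFY_SOCKET is unset."""
    addr = os.environ.get("NOTIFY_SOCKET")
    if not addr:
        return False
    if addr.startswith("@"):
        addr = "\0" + addr[1:]  # Linux abstract-namespace socket
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(b"WATCHDOG=1", addr)
    return True
```

Ordering the checks so that every failure path returns before the curve is evaluated means a missing or implausible reading can never produce a lower fan speed than the safe value.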

- Review and tune the config:

      cp gpu-fan-control.env.example gpu-fan-control.env

  Edit gpu-fan-control.env for your hardware before installing.
- Install the script:

      sudo install -m 0755 gpu-fan-control.py /usr/local/bin/gpu-fan-control.py

- Install the unit:

      sudo install -m 0644 gpu-fan-control.service /etc/systemd/system/gpu-fan-control.service

- Install the environment file:

      sudo install -m 0644 gpu-fan-control.env /etc/default/gpu-fan-control

- Reload and restart:

      sudo systemctl daemon-reload
      sudo systemctl enable --now gpu-fan-control.service

- Verify:

      sudo /usr/local/bin/gpu-fan-control.py --print-temp
      sudo systemctl status gpu-fan-control.service
      journalctl -u gpu-fan-control.service -f

Notes:
- `NZXT_HWMON_NAME`, `PWM_CHANNEL`, and `FAN_CHANNEL` must match your controller's hwmon layout under `/sys/class/hwmon`.
- The controller prefers `sysfs` temperature from `NVIDIA_HWMON_NAME=nvidia` when available, otherwise it falls back to `NVML`.
- If your system has multiple GPUs, set `GPU_INDEX` or uncomment and set `GPU_PCI_BUS_ID` for a stable target.
- `SAFE_PWM` is applied on startup, shutdown, and repeated read/control failures.
- `EMERGENCY_PWM` is applied when the GPU reaches `CRITICAL_TEMP` or fan feedback indicates a likely fault.
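
One way to check that `NZXT_HWMON_NAME`, `PWM_CHANNEL`, and `FAN_CHANNEL` match the hwmon layout before starting the service is a small lookup like the sketch below. It assumes the channels are 1-based indices that map to the standard hwmon files `pwmN` and `fanN_input`; the helper names and the `root` parameter are hypothetical and not part of gpu-fan-control.py.

```python
from pathlib import Path
from typing import Dict, Optional

HWMON_ROOT = Path("/sys/class/hwmon")


def find_hwmon(name: str, root: Path = HWMON_ROOT) -> Optional[Path]:
    """Return the hwmon directory whose `name` file matches, or None."""
    if not root.is_dir():
        return None
    for dev in sorted(root.iterdir()):
        try:
            if (dev / "name").read_text().strip() == name:
                return dev
        except OSError:
            continue
    return None


def channel_paths(hwmon_name: str, pwm_channel: int, fan_channel: int,
                  root: Path = HWMON_ROOT) -> Dict[str, Path]:
    """Resolve the PWM and fan-speed files, failing loudly if any is missing."""
    dev = find_hwmon(hwmon_name, root)
    if dev is None:
        raise FileNotFoundError(f"no hwmon device named {hwmon_name!r} under {root}")
    paths = {
        "pwm": dev / f"pwm{pwm_channel}",
        "fan": dev / f"fan{fan_channel}_input",
    }
    missing = [str(p) for p in paths.values() if not p.exists()]
    if missing:
        raise FileNotFoundError("missing hwmon files: " + ", ".join(missing))
    return paths


if __name__ == "__main__":
    # Example values; substitute the ones from your environment file.
    for label, path in channel_paths("nzxtsmart2", 1, 1).items():
        print(label, path)
```

The hwmon index (`hwmon0`, `hwmon1`, ...) can change between boots, which is why the lookup goes by device name rather than by a fixed path.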

If you want a single-command install flow on a host, tune gpu-fan-control.env first and then run:

    sudo ./install-live.sh