[Deepin-Kernel-SIG] [linux 6.18.y] [FROMLIST] [Intel] thermal: intel: Add support for Directed Package Thermal Interrupt#1797
…ailure

The function thermal_throttle_add_dev() may fail and abort a CPU hotplug online operation. Since the failure occurs within the online callback, thermal_throttle_online(), the CPU hotplug framework does not invoke the corresponding offline callback. As a result, the hardware and software resources set up during the failed operation are not torn down.

Since only thermal_throttle_add_dev() can fail, call it before setting up the rest of the resources.

Fixes: f665620 ("x86/mce/therm_throt: Optimize notifications of thermal throttle")
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Link: https://lore.kernel.org/linux-pm/20260528-rneri-directed-therm-intr-v2-1-8e2f9e0c1a36@linux.intel.com/
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
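The ordering fix described in this commit can be illustrated with a small user-space sketch. It is not the kernel code: `struct cpu_state`, `add_dev()`, `setup_resources()`, and `online()` are hypothetical stand-ins, and the `fail` parameter exists only to inject an error. The point is that running the only fallible step first leaves nothing to undo when it fails.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-CPU state touched by the online callback. */
struct cpu_state {
	bool sysfs_added;   /* models thermal_throttle_add_dev() having succeeded */
	bool resources_set; /* models the remaining hardware/software setup */
};

/* The only step that can fail; `fail` injects an error for illustration. */
static int add_dev(struct cpu_state *s, bool fail)
{
	if (fail)
		return -1;
	s->sysfs_added = true;
	return 0;
}

/* Steps that cannot fail. */
static void setup_resources(struct cpu_state *s)
{
	s->resources_set = true;
}

/* Online callback: fallible step first, so an early return leaves no state behind. */
static int online(struct cpu_state *s, bool fail)
{
	int err;

	assert(s);
	err = add_dev(s, fail);
	if (err)
		return err;

	setup_resources(s);
	return 0;
}
```

Calling `online()` with `fail` set returns an error with both flags still clear, which is the property the hotplug core relies on when it skips the offline callback.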
…nterrupt

Add CPUID and MSR bit definitions required to support Intel Directed Package Thermal Interrupt.

A CPU requests directed package-level thermal interrupts by setting bit 25 in IA32_THERM_INTERRUPT. Hardware acknowledges by setting bit 25 in IA32_PACKAGE_THERM_STATUS, indicating that only CPUs that opted in will receive the interrupt. If no CPU in the package requests it, delivery falls back to broadcast.

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
[WangYuli: Fix conflicts]
Link: https://lore.kernel.org/linux-pm/20260528-rneri-directed-therm-intr-v2-2-8e2f9e0c1a36@linux.intel.com/
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
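As a rough illustration of the two bits named above, the following sketch operates on plain 64-bit values standing in for the MSR contents. The macro names are taken from the identifiers in the review's sequence diagram; their exact definitions in the patch are assumed here, and the helper functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 25 in both registers, per the commit message (definitions assumed). */
#define THERM_INT_DPTI_ENABLE         (UINT64_C(1) << 25)
#define PACKAGE_THERM_STATUS_DPTI_ACK (UINT64_C(1) << 25)

static_assert(THERM_INT_DPTI_ENABLE == 0x2000000, "bit 25");

/* Hypothetical helper: value to write to IA32_THERM_INTERRUPT to opt in. */
static uint64_t dpti_request(uint64_t therm_interrupt)
{
	return therm_interrupt | THERM_INT_DPTI_ENABLE;
}

/* Hypothetical helper: value to write to opt out, preserving other bits. */
static uint64_t dpti_withdraw(uint64_t therm_interrupt)
{
	return therm_interrupt & ~THERM_INT_DPTI_ENABLE;
}

/* Hypothetical helper: did hardware acknowledge in IA32_PACKAGE_THERM_STATUS? */
static bool dpti_acked(uint64_t package_therm_status)
{
	return (package_therm_status & PACKAGE_THERM_STATUS_DPTI_ACK) != 0;
}
```

In the kernel these values would be read and written with the MSR accessors rather than passed around as integers; the sketch only shows the read-modify-write masks.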
Package-level thermal interrupts are broadcast to all online CPUs within a package, even though only one CPU needs to service them. This results in unnecessary wakeups, lock contention, and corresponding performance and power-efficiency penalties.

When supported by hardware, a CPU requests to receive directed package-level thermal interrupts by setting a designated bit in IA32_THERM_INTERRUPT. The operating system must then verify that hardware has acknowledged this request by checking a designated bit in IA32_PACKAGE_THERM_STATUS.

Enable directed package-level thermal interrupts on one CPU per package using the CPU hotplug infrastructure. Keep track of the CPUs handling package-level interrupts with an array. If the handling CPU goes offline, select a new CPU. Temporarily enable directed interrupts on both the current and new CPU until hardware acknowledges the new selection, then disable them on the outgoing CPU.

Systems without directed-interrupt support continue to broadcast the package-level interrupt to all CPUs.

Also, add a rollback mechanism in the CPU hotplug online callback to fall back to broadcast mode if the directed-interrupt acknowledgment fails in any package. This is most important during boot, when all CPUs in a package come online and would otherwise keep retrying on faulty hardware. A complete rollback is not needed in the CPU hotplug offline callback since at that point the hardware is known to work.

While here, update an inline comment to point to the correct volume of the Intel Software Developer's Manual.

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Link: https://lore.kernel.org/linux-pm/20260528-rneri-directed-therm-intr-v2-3-8e2f9e0c1a36@linux.intel.com/
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
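The per-package handler tracking described here can be sketched in user space as follows. This is a simplified model under stated assumptions: a fixed topology of 8 CPUs in 2 packages, a hypothetical `pkg_of()` mapping, and no hardware acknowledgment step, so the temporary "both CPUs enabled" window during handoff is not modeled.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 8
#define NR_PKGS 2
#define NO_CPU  (-1)

static bool cpu_online[NR_CPUS];
/* One handler CPU per package; NO_CPU means no handler is selected. */
static int handler_cpus[NR_PKGS] = { NO_CPU, NO_CPU };

/* Hypothetical topology: CPUs 0-3 are package 0, CPUs 4-7 are package 1. */
static int pkg_of(int cpu)
{
	return cpu / (NR_CPUS / NR_PKGS);
}

/* Online: the first CPU to come up in a package becomes its handler. */
static void cpu_up(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpu_online[cpu] = true;
	if (handler_cpus[pkg_of(cpu)] == NO_CPU)
		handler_cpus[pkg_of(cpu)] = cpu;
}

/* Offline: if the handler leaves, hand over to another online CPU in the package. */
static void cpu_down(int cpu)
{
	int pkg = pkg_of(cpu);

	assert(cpu >= 0 && cpu < NR_CPUS);
	cpu_online[cpu] = false;
	if (handler_cpus[pkg] != cpu)
		return;

	handler_cpus[pkg] = NO_CPU;
	for (int c = 0; c < NR_CPUS; c++) {
		if (cpu_online[c] && pkg_of(c) == pkg) {
			handler_cpus[pkg] = c;
			break;
		}
	}
}
```

With all eight CPUs brought up in order, CPUs 0 and 4 become the handlers; taking CPU 4 offline moves package 1's handler to CPU 5, while taking a non-handler CPU offline changes nothing.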
Directed package-level thermal interrupts are serviced by a single CPU per package. These handler CPUs are selected at boot through the CPU hotplug infrastructure. This mechanism is sufficient to restore the directed interrupt configuration when resuming from suspend for non-boot packages. It also keeps the handler-tracking array updated.

For the boot package, CPU0 is chosen during boot because its CPU hotplug online callback runs first. However, this callback is not invoked on resume. The directed package-level interrupt configuration for the boot package is not restored.

Add a syscore resume callback to re-enable directed package-level interrupts for this package. Disabling directed interrupts during suspend is required to keep the handler-tracking array in a consistent state for the boot package, allowing the correct configuration to be restored on resume.

The resume callback must busy-wait for hardware acknowledgment of the directed interrupt setup. Otherwise, the handler-tracking array could be left in an inconsistent state. This implies running with interrupts disabled for up to 15ms, though in practice it takes less than 1ms.

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Link: https://lore.kernel.org/linux-pm/20260528-rneri-directed-therm-intr-v2-4-8e2f9e0c1a36@linux.intel.com/
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
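The bounded busy-wait mentioned above might look roughly like the sketch below. It runs against a simulated clock and status register (`struct fake_hw`, `read_pkg_status()`, and `delay_us()` are hypothetical), the 10 us polling step is an assumption, and only the 15 ms bound and the `-ETIMEDOUT` result come from this page.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define PACKAGE_THERM_STATUS_DPTI_ACK (UINT64_C(1) << 25)
#define ACK_POLL_STEP_US 10u    /* assumed polling interval */
#define ACK_TIMEOUT_US   15000u /* 15 ms upper bound from the commit message */

/* Simulated hardware: the ack bit appears once now_us reaches ack_after_us. */
struct fake_hw {
	unsigned int now_us;
	unsigned int ack_after_us;
};

/* Stand-in for reading IA32_PACKAGE_THERM_STATUS. */
static uint64_t read_pkg_status(const struct fake_hw *hw)
{
	return hw->now_us >= hw->ack_after_us ? PACKAGE_THERM_STATUS_DPTI_ACK : 0;
}

/* Stand-in for a microsecond delay: only advances the simulated clock. */
static void delay_us(struct fake_hw *hw, unsigned int us)
{
	hw->now_us += us;
}

/* Poll for the ack; return 0 on success or -ETIMEDOUT after the bound. */
static int wait_for_dpti_ack(struct fake_hw *hw)
{
	assert(hw);
	for (unsigned int waited = 0; waited <= ACK_TIMEOUT_US;
	     waited += ACK_POLL_STEP_US) {
		if (read_pkg_status(hw) & PACKAGE_THERM_STATUS_DPTI_ACK)
			return 0;
		delay_us(hw, ACK_POLL_STEP_US);
	}
	return -ETIMEDOUT;
}
```

A simulated ack arriving after 800 us returns success well inside the bound, while hardware that never acknowledges makes the loop give up after the timeout instead of spinning forever.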
A kexec reboot may load a kernel that does not support directed package-level thermal interrupts. Without a shutdown callback, the directed interrupt configuration remains enabled across kexec but will not be handled correctly. In particular, if the CPU designated to receive the directed interrupt goes offline, no other CPU in the package will receive it.

Add a syscore shutdown callback to disable directed package-level thermal interrupts on all packages before a kexec reboot. If the post-kexec kernel does not enable directed interrupts, it falls back to broadcasting the interrupt to all CPUs.

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Link: https://lore.kernel.org/linux-pm/20260528-rneri-directed-therm-intr-v2-5-8e2f9e0c1a36@linux.intel.com/
Signed-off-by: WangYuli <wangyl5933@chinaunicom.cn>
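To make the kexec hazard concrete, here is a toy model of the delivery rule as this page describes it: if any CPU in a package has opted in, only opted-in CPUs receive the interrupt, otherwise it is broadcast. The bitmask representation, `recipients()`, and `disable_all_directed()` are hypothetical, and real hardware behavior may differ in details this model ignores.

```c
#include <assert.h>

#define NR_PKGS 2

/* Per-package bitmask of CPUs with the directed-interrupt request bit set. */
static unsigned int dpti_mask[NR_PKGS];

/* CPUs that receive a package thermal interrupt under the modeled rule. */
static unsigned int recipients(unsigned int online_mask, unsigned int requested)
{
	if (requested == 0)
		return online_mask;          /* nobody opted in: broadcast */
	return online_mask & requested;  /* directed: opted-in CPUs only */
}

/* Models the shutdown callback: clear the request on every package. */
static void disable_all_directed(void)
{
	for (int p = 0; p < NR_PKGS; p++)
		dpti_mask[p] = 0;
}
```

If CPU 0 of a four-CPU package is the handler (`dpti_mask[0] = 0x1`) and a kernel unaware of the feature takes it offline, `recipients(0xE, dpti_mask[0])` is empty; after `disable_all_directed()`, the same call returns all online CPUs.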
Reviewer's Guide
Adds support for Intel Directed Package Thermal Interrupts to the x86 thermal throttling path, including per-package handler CPU selection, hotplug- and syscore-aware enable/disable/rollback logic, and the necessary CPUID/MSR feature bits, while also fixing an existing hotplug error-handling bug.
Sequence diagram for enabling directed package thermal interrupt on CPU online
sequenceDiagram
participant CPU_hotplug
participant thermal_throttle_online
participant enable_directed_thermal_pkg_intr
participant config_directed_thermal_pkg_intr
participant check_directed_thermal_pkg_intr_ack
participant disable_all_directed_thermal_pkg_intr
participant Hardware
CPU_hotplug->>thermal_throttle_online: thermal_throttle_online(cpu)
thermal_throttle_online->>thermal_throttle_online: thermal_throttle_add_dev(dev, cpu)
thermal_throttle_online->>enable_directed_thermal_pkg_intr: enable_directed_thermal_pkg_intr(cpu)
enable_directed_thermal_pkg_intr->>enable_directed_thermal_pkg_intr: topology_logical_package_id(cpu)
enable_directed_thermal_pkg_intr->>enable_directed_thermal_pkg_intr: directed_intr_handler_cpus[pkg_id]
alt first_handler_in_package
enable_directed_thermal_pkg_intr->>Hardware: thermal_clear_package_intr_status(PACKAGE_LEVEL, PACKAGE_THERM_STATUS_DPTI_ACK)
enable_directed_thermal_pkg_intr->>config_directed_thermal_pkg_intr: config_directed_thermal_pkg_intr(&enable=true)
config_directed_thermal_pkg_intr->>Hardware: wrmsrl(MSR_IA32_THERM_INTERRUPT, THERM_INT_DPTI_ENABLE)
enable_directed_thermal_pkg_intr->>check_directed_thermal_pkg_intr_ack: check_directed_thermal_pkg_intr_ack()
check_directed_thermal_pkg_intr_ack->>Hardware: rdmsrl(MSR_IA32_PACKAGE_THERM_STATUS)
alt ack_received
check_directed_thermal_pkg_intr_ack-->>enable_directed_thermal_pkg_intr: return 0
enable_directed_thermal_pkg_intr->>enable_directed_thermal_pkg_intr: directed_intr_handler_cpus[pkg_id] = cpu
else ack_timeout
check_directed_thermal_pkg_intr_ack-->>enable_directed_thermal_pkg_intr: return -ETIMEDOUT
enable_directed_thermal_pkg_intr->>config_directed_thermal_pkg_intr: config_directed_thermal_pkg_intr(&enable=false)
config_directed_thermal_pkg_intr->>Hardware: wrmsrl(MSR_IA32_THERM_INTERRUPT, ~THERM_INT_DPTI_ENABLE)
enable_directed_thermal_pkg_intr->>disable_all_directed_thermal_pkg_intr: disable_all_directed_thermal_pkg_intr()
disable_all_directed_thermal_pkg_intr->>config_directed_thermal_pkg_intr: smp_call_function_single(handler_cpu, config_directed_thermal_pkg_intr, &enable=false, wait=true)
disable_all_directed_thermal_pkg_intr->>disable_all_directed_thermal_pkg_intr: kfree(directed_intr_handler_cpus)
end
else handler_already_set
enable_directed_thermal_pkg_intr-->>thermal_throttle_online: return
end
thermal_throttle_online-->>CPU_hotplug: return
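The rollback branch of the diagram above can be approximated in user space as follows. This is a sketch under assumptions, not the patch: the hardware acknowledgment is replaced by a hypothetical `hw_acks[]` table, and a `directed_mode` flag stands in for freeing the handler array so that later CPUs stop retrying.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_PKGS 2
#define NO_CPU  (-1)

static int handler_cpus[NR_PKGS] = { NO_CPU, NO_CPU };
static bool directed_mode = true;               /* false once rolled back to broadcast */
static bool hw_acks[NR_PKGS] = { true, false }; /* simulated: package 1 never acks */

/* Roll back every package and stop further attempts. */
static void disable_all(void)
{
	for (int p = 0; p < NR_PKGS; p++)
		handler_cpus[p] = NO_CPU;
	directed_mode = false;
}

/* Called from the (modeled) online callback for each CPU. */
static void enable_directed(int cpu, int pkg)
{
	assert(pkg >= 0 && pkg < NR_PKGS);

	/* Nothing to do after a rollback or if the package already has a handler. */
	if (!directed_mode || handler_cpus[pkg] != NO_CPU)
		return;

	if (hw_acks[pkg]) {
		handler_cpus[pkg] = cpu;
		return;
	}

	/* No acknowledgment: fall back to broadcast on all packages. */
	disable_all();
}
```

In this model, package 0 gets CPU 0 as handler, but the first CPU of the faulty package 1 triggers a global rollback, after which package 0 also has no handler and subsequent CPUs return immediately.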
Sequence diagram for syscore suspend/resume/shutdown handling of directed package thermal interrupts
sequenceDiagram
participant Syscore
participant directed_pkg_intr_syscore_suspend
participant directed_pkg_intr_syscore_resume
participant directed_pkg_intr_syscore_shutdown
participant enable_directed_thermal_pkg_intr
participant disable_directed_thermal_pkg_intr
participant disable_all_directed_thermal_pkg_intr
Syscore->>directed_pkg_intr_syscore_suspend: directed_pkg_intr_syscore_suspend(data)
directed_pkg_intr_syscore_suspend->>disable_directed_thermal_pkg_intr: disable_directed_thermal_pkg_intr(0)
disable_directed_thermal_pkg_intr-->>directed_pkg_intr_syscore_suspend: return
directed_pkg_intr_syscore_suspend-->>Syscore: return 0
Syscore->>directed_pkg_intr_syscore_resume: directed_pkg_intr_syscore_resume(data)
directed_pkg_intr_syscore_resume->>enable_directed_thermal_pkg_intr: enable_directed_thermal_pkg_intr(0)
enable_directed_thermal_pkg_intr-->>directed_pkg_intr_syscore_resume: return
directed_pkg_intr_syscore_resume-->>Syscore: return
Syscore->>directed_pkg_intr_syscore_shutdown: directed_pkg_intr_syscore_shutdown(data)
directed_pkg_intr_syscore_shutdown->>disable_all_directed_thermal_pkg_intr: disable_all_directed_thermal_pkg_intr()
disable_all_directed_thermal_pkg_intr-->>directed_pkg_intr_syscore_shutdown: return
directed_pkg_intr_syscore_shutdown-->>Syscore: return
[APPROVALNOTIFIER] This PR is NOT APPROVED
Hey - I've left some high level feedback:
- disable_all_directed_thermal_pkg_intr() unconditionally uses smp_call_function_single() but is also invoked from the syscore shutdown callback where interrupts are disabled and CPU hotplug is frozen, which contradicts the function’s own calling requirements and risks deadlock; consider either avoiding SMP calls in syscore context or splitting out a variant that is safe for shutdown.
- The comment above disable_all_directed_thermal_pkg_intr() mentions syscore resume and asserts no SMP calls will be issued in that context, but the current users are the syscore shutdown callback and cpuhp teardown path, so it would be good to update the comment to accurately describe the actual callers and constraints.
Pull request overview
This PR adds support for Intel Directed Package Thermal Interrupts (DPTI) in the x86 thermal throttling path to avoid broadcasting package thermal interrupts to all CPUs, reducing contention and unnecessary wakeups. It also integrates CPU hotplug and syscore (suspend/resume/shutdown) handling to keep the directed-interrupt “handler CPU” per package consistent across lifecycle events.
Changes:
- Add a new x86 CPU feature bit and MSR bit definitions for DPTI enable/acknowledge.
- Extend drivers/thermal/intel/therm_throt.c to opt CPUs into directed package thermal interrupts, select a per-package handler CPU via CPU hotplug, and restore/teardown state via syscore callbacks.
- Adjust thermal hotplug online error handling to avoid leaving partially initialized sysfs resources behind.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| drivers/thermal/intel/therm_throt.c | Implements directed package interrupt enable/disable, per-package handler selection, and syscore suspend/resume/shutdown integration. |
| arch/x86/include/asm/msr-index.h | Adds MSR bit definitions for enabling DPTI and checking its hardware acknowledgment. |
| arch/x86/include/asm/cpufeatures.h | Introduces X86_FEATURE_DPTI to gate the feature at runtime. |
	/*
	 * The package-level interrupt must remain directed after this CPU goes
	 * offline.
	 */
	new_cpu = cpumask_any_but(topology_core_cpumask(cpu), cpu);
#define X86_FEATURE_HWP_HIGHEST_PERF_CHANGE (14*32+15) /* HWP Highest perf change */
#define X86_FEATURE_HFI (14*32+19) /* "hfi" Hardware Feedback Interface */

#define X86_FEATURE_DPTI (14*32+24) /* Intel Directed Package Thermal Interrupt */
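The `(14*32+24)` encoding packs a capability word index and a bit position into one integer. The sketch below decodes it in user space; `CAP_WORDS`, `feature_word()`, `feature_bit()`, and `has_feature()` are hypothetical names for illustration and are not the kernel's own helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define X86_FEATURE_DPTI (14*32+24) /* value quoted from the patch */
#define CAP_WORDS 16                /* hypothetical array size for this sketch */

/* Index of the 32-bit capability word holding the feature. */
static unsigned int feature_word(unsigned int feature)
{
	return feature / 32;
}

/* Bit position of the feature within its word. */
static unsigned int feature_bit(unsigned int feature)
{
	return feature % 32;
}

/* Test a feature flag in an array of capability words. */
static bool has_feature(const uint32_t *caps, unsigned int feature)
{
	assert(feature_word(feature) < CAP_WORDS);
	return (caps[feature_word(feature)] >> feature_bit(feature)) & 1u;
}
```

For `X86_FEATURE_DPTI` this yields word 14, bit 24, the same word as the neighboring `X86_FEATURE_HFI` at bit 19.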
Hi,
This is v2 of this series. The main changes are a new patch fixing a pre-
existing bug and a redesigned error rollback strategy that disables
directed thermal interrupts across all packages when enabling fails in any
one of them. Please see the changelog for details.
Package-level thermal interrupts are currently broadcast to all CPUs in a
package. Only one CPU is needed to service package-wide events.
Broadcasting creates unnecessary resource contention. Thermal interrupts
generated for Hardware Feedback Interface[1] updates are an example: all
CPUs in the package receive the interrupt and race for a lock to update a
shared data structure. Idle CPUs are needlessly woken up.
Newer Intel processors allow directing package-level thermal interrupts
only to CPUs that explicitly request them. A CPU opts in by setting a
designated bit in IA32_THERM_INTERRUPT. Hardware acknowledges the request
by setting a designated bit in IA32_PACKAGE_THERM_STATUS.
This series enables directed package-level thermal interrupts and
designates one handler CPU per package using the CPU hotplug
infrastructure. A new CPU is selected if the handler CPU goes offline.
Because CPU0's hotplug callbacks are not invoked during suspend and resume,
syscore callbacks are added to restore the handler for the boot package.
The series also disables directed delivery during kexec reboot, avoiding
stale interrupt routing when rebooting into a kernel that does not support
the feature.
This patchset introduces a change in behavior in the
/sys/devices/system/cpu/cpuN/thermal_throttle/package* sysfs files. These files reflect per-CPU
variables updated when a CPU handles a package-level thermal interrupt. In
broadcast mode, all CPUs update their variables. When directed package-
level thermal interrupts are enabled, only the handler CPU's variables are
updated.
Lastly, nothing changes for processors that do not support this feature:
they fall back to broadcast delivery.
[1] Intel Software Developer's Manual Vol. 3, Section 17.6, March 2026 https://www.intel.com/SDM
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Changes in v2:
CPU hotplug callback.
the boot package.
failure returns small_cpumask_bits, not nr_cpu_ids.
consistency and brevity. (Boris)
Ricardo Neri (5):
thermal: intel: Fix dangling resources on thermal_throttle_online() failure
x86/thermal: Add bit definitions for Intel Directed Package Thermal Interrupt
thermal: intel: Enable the Directed Package-level Thermal Interrupt
thermal: intel: Add syscore callbacks for suspend and resume
thermal: intel: Add a syscore shutdown callback for kexec reboot
arch/x86/include/asm/cpufeatures.h | 2 +
arch/x86/include/asm/msr-index.h | 2 +
drivers/thermal/intel/therm_throt.c | 272 +++++++++++++++++++++++++++++++++++-
3 files changed, 272 insertions(+), 4 deletions(-)
base-commit: e7ae89a0c97ce2b68b0983cd01eda67cf373517d
change-id: 20260306-rneri-directed-therm-intr-9f3f8888bb3f
Best regards,
Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Link: https://lore.kernel.org/linux-pm/20260528-rneri-directed-therm-intr-v2-0-8e2f9e0c1a36@linux.intel.com/
Summary by Sourcery
Add support for Intel Directed Package Thermal Interrupts and fix thermal throttle CPU hotplug error handling.