Thermal Management Features
Occasionally the processor may operate in conditions near its maximum operating temperature, due either to internal overheating or to overheating within the platform. To protect the processor and the platform from thermal failure, several thermal management features reduce package power consumption, and thereby temperature, so that the processor remains within normal operating limits.
Adaptive Thermal Monitor
The purpose of the Adaptive Thermal Monitor is to reduce processor IA core power consumption and temperature until it operates below its maximum operating temperature. Processor IA core power reduction is achieved by:
- Adjusting the operating frequency (using the processor IA core ratio multiplier) and voltage.
- Modulating (starting and stopping) the internal processor IA core clocks (duty cycle).
The Adaptive Thermal Monitor can be activated when the package temperature, monitored by any Digital Thermal Sensor (DTS), reaches its maximum operating temperature. The maximum operating temperature corresponds to the maximum junction temperature, TjMAX.
Reaching the maximum operating temperature activates the Thermal Control Circuit (TCC). When activated, the TCC causes both the processor IA core and graphics core to reduce frequency and voltage adaptively. The Adaptive Thermal Monitor remains active as long as the package temperature remains at its specified limit, and therefore continues to reduce the package frequency and voltage until the TCC is de-activated.
Clock modulation (refer to Clock Modulation) is another means to reduce the processor core clock. The duty cycle of the clock modulation can be programmed through an MSR (refer to MSR Based On-Demand Mode).
TjMAX is factory calibrated and is not user configurable. The default value is software visible in the TEMPERATURE_TARGET (0x1A2) MSR, bits [23:16].
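As an illustration of the bit field named above, the following minimal sketch extracts TjMAX from a raw TEMPERATURE_TARGET value. The function name and the example raw value are hypothetical; how the raw 64-bit value is obtained (for example, through an operating system MSR driver) is outside the scope of this sketch.

```python
TEMPERATURE_TARGET_MSR = 0x1A2  # MSR address given in the text


def tjmax_from_temperature_target(raw: int) -> int:
    """Return the TjMAX field, bits [23:16], of a raw TEMPERATURE_TARGET value."""
    return (raw >> 16) & 0xFF


# Illustrative raw value only (not read from hardware): 0x64 placed in bits [23:16].
example_raw = 0x00640000
tjmax = tjmax_from_temperature_target(example_raw)  # 0x64 == 100
```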
The Adaptive Thermal Monitor does not require any additional hardware, software drivers, or interrupt handling routines. It is not intended as a mechanism to maintain processor thermal control to PL1 = TDP. The system design should provide a thermal solution that can maintain normal operation when PL1 = TDP within the intended usage range.
Frequency / Voltage Control
Upon Adaptive Thermal Monitor activation, the processor attempts to dynamically reduce processor temperature by lowering the frequency and voltage operating point. The operating points are automatically calculated by the processor IA core itself and do not require the BIOS to program them. The processor IA core will scale the operating points such that:
- The voltage will be optimized according to the temperature, the processor IA core bus ratio and number of processor IA cores in deep C-states.
- The processor IA core power and temperature are reduced while minimizing performance degradation.
Once the temperature has dropped below the trigger temperature, the operating frequency and voltage will transition back to the normal system operating point.
Once a target frequency/bus ratio is resolved, the processor IA core will transition to the new target automatically.
- On an upward operating point transition, the voltage transition precedes the frequency transition.
- On a downward transition, the frequency transition precedes the voltage transition.
- The processor continues to execute instructions during the transition, except that instruction execution is halted while the frequency changes.
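The ordering rules above can be sketched as a small function. This is only a model of the described sequence, not processor firmware; the function name and the step labels are hypothetical. Presumably the ordering ensures that the voltage is never below what the currently running frequency requires.

```python
from typing import List


def transition_steps(current_ratio: int, target_ratio: int) -> List[str]:
    """Return the order of voltage and frequency changes for an operating point change."""
    if target_ratio > current_ratio:
        # Upward transition: voltage first, then frequency.
        return ["raise voltage", "raise frequency"]
    if target_ratio < current_ratio:
        # Downward transition: frequency first, then voltage.
        return ["lower frequency", "lower voltage"]
    return []  # Already at the target; nothing to do.
```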
If a processor load-based Enhanced Intel SpeedStep® Technology/P-state transition (through MSR write) is initiated while the Adaptive Thermal Monitor is active, there are two possible outcomes:
- If the P-state target frequency is higher than the processor IA core optimized target frequency, the P-state transition will be deferred until the thermal event has been completed.
- If the P-state target frequency is lower than the processor IA core optimized target frequency, the processor will transition to the P-state operating point.
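The two outcomes above amount to a comparison against the thermally optimized target. The sketch below models that decision; the function name and return labels are hypothetical, and treating an equal frequency as "applied" is an assumption, since the text does not cover that case.

```python
from typing import Tuple


def resolve_pstate_request(requested_mhz: int, thermal_target_mhz: int) -> Tuple[str, int]:
    """Model a P-state request arriving while the Adaptive Thermal Monitor is active.

    Returns (outcome, frequency the core runs at for now).
    """
    if requested_mhz > thermal_target_mhz:
        # Higher than the thermal target: deferred until the thermal event completes.
        return ("deferred", thermal_target_mhz)
    # Lower than the thermal target (or equal, by assumption): applied immediately.
    return ("applied", requested_mhz)
```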
Clock Modulation
If the frequency/voltage changes are unable to end an Adaptive Thermal Monitor event, the Adaptive Thermal Monitor will utilize clock modulation. Clock modulation is done by alternately turning the clocks off and on at a duty cycle (ratio between clock “on” time and total time) specific to the processor. The duty cycle is factory configured to 25% on and 75% off and cannot be modified. The period of the duty cycle is configured to 32 microseconds when the Adaptive Thermal Monitor is active. Cycle times are independent of processor frequency.
A small amount of hysteresis has been included to prevent excessive clock modulation when the processor temperature is near its maximum operating temperature. Once the temperature has dropped below the maximum operating temperature, and the hysteresis timer has expired, the Adaptive Thermal Monitor goes inactive and clock modulation ceases.
Clock modulation is automatically engaged as part of the Adaptive Thermal Monitor activation when the frequency/voltage targets are at their minimum settings. Processor performance will be decreased when clock modulation is active. Snooping and interrupt processing are performed in the normal manner while the Adaptive Thermal Monitor is active.
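To make the duty cycle figures concrete, the following sketch works out the on and off times for the 25% duty cycle and 32 microsecond period quoted above. The function names are hypothetical, and the "effective clock" figure is only a rough first-order estimate of delivered cycles, not a specified performance number.

```python
from typing import Tuple

TCC_DUTY_ON = 0.25    # factory-configured "on" fraction, from the text
TCC_PERIOD_US = 32.0  # modulation period in microseconds, from the text


def modulation_times(duty_on: float = TCC_DUTY_ON,
                     period_us: float = TCC_PERIOD_US) -> Tuple[float, float]:
    """Return (on_time_us, off_time_us) for one modulation period."""
    on_time = duty_on * period_us
    return on_time, period_us - on_time


def effective_clock_mhz(core_mhz: float, duty_on: float = TCC_DUTY_ON) -> float:
    """Rough estimate: the fraction of core clock cycles actually delivered."""
    return core_mhz * duty_on


on_us, off_us = modulation_times()  # 8.0 us on, 24.0 us off
```

Because the cycle times are independent of processor frequency, the 8 and 24 microsecond figures hold at any core ratio.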
Clock modulation is not activated by the Package average temperature control mechanism.
Digital Thermal Sensor
Each processor has up to 12 on-die Digital Thermal Sensors (DTS) that monitor the temperature of the processor IA core (one sensor in the core), GT (nine sensors), CCU (one sensor), and display (one sensor).
Temperature values from the DTS can be retrieved through:
- A software interface using processor Model Specific Register (MSR).
When temperature is retrieved through the processor MSR, it is the instantaneous temperature of the given DTS. The average DTS temperature may not be a good indicator of package Adaptive Thermal Monitor activation or of rapid increases in temperature that trigger the Out of Specification status bit within the PACKAGE_THERM_STATUS (0x1B1) MSR and IA32_THERM_STATUS (0x19C) MSR.
Code execution is halted in C1 or deeper C-states.
Unlike traditional thermal devices, the DTS outputs a temperature relative to the maximum supported operating temperature of the processor (TjMAX), regardless of TCC activation offset. It is the responsibility of software to convert the relative temperature to an absolute temperature. The absolute reference temperature is readable in the TEMPERATURE_TARGET (0x1A2) MSR. The temperature returned by the DTS is an implied negative integer indicating the relative offset from TjMAX. The DTS does not report temperatures greater than TjMAX.
The DTS-relative temperature readout directly impacts the Adaptive Thermal Monitor trigger point. When a package DTS indicates that it has reached the TCC activation (a reading of 0x0, except when the TCC activation offset is changed), the TCC will activate and indicate an Adaptive Thermal Monitor event. A TCC activation will lower both processor IA core and graphics core frequency, voltage, or both.
Changes to the temperature can be detected using two programmable thresholds, one set above and another below the current temperature, located in the processor thermal MSRs. These thresholds have the capability of generating interrupts using the processor IA core's local APIC.
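The relative-to-absolute conversion that software is responsible for can be sketched as follows. The field positions used here (digital readout in bits [22:16] and a reading-valid flag in bit 31 of IA32_THERM_STATUS) are taken from the Intel Software Developer's Manual rather than from this section, and the function names are hypothetical.

```python
def digital_readout(therm_status_raw: int) -> int:
    """Degrees below TjMAX, from bits [22:16] of a raw IA32_THERM_STATUS value."""
    return (therm_status_raw >> 16) & 0x7F


def reading_valid(therm_status_raw: int) -> bool:
    """Bit 31 indicates whether the digital readout is valid (per the SDM layout)."""
    return bool((therm_status_raw >> 31) & 1)


def absolute_temperature_c(tjmax_c: int, readout: int) -> int:
    """The readout is an implied negative offset, so subtract it from TjMAX."""
    return tjmax_c - readout


# Illustrative raw value only: valid bit set, readout of 25.
raw = (1 << 31) | (25 << 16)
temp_c = absolute_temperature_c(100, digital_readout(raw))  # 100 - 25 == 75
```

A readout of 0 therefore corresponds to TjMAX itself, which matches the TCC activation reading of 0x0 described in the text when no activation offset is applied.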
The thermal thresholds defined for the processor are:
- Core Threshold #1 Temperature in IA32_THERM_INTERRUPT (MSR 0x19B), bits [14:8]. This value indicates the offset in degrees below TjMAX that will trigger a Thermal Threshold 1 trip.
- Package Threshold #1 Temperature in IA32_PACKAGE_THERM_INTERRUPT (MSR 0x1B2), bits [14:8]. This value indicates the offset in degrees below TjMAX that will trigger a Package Thermal Threshold 1 trip.
- Core Threshold #2 Temperature in IA32_THERM_INTERRUPT (MSR 0x19B), bits [22:16]. This value indicates the offset in degrees below TjMAX that will trigger a Thermal Threshold 2 trip. It behaves in the same way as Core Threshold #1.
- Package Threshold #2 Temperature in IA32_PACKAGE_THERM_INTERRUPT (MSR 0x1B2), bits [22:16]. This value indicates the offset in degrees below TjMAX that will trigger a Thermal Threshold 2 trip to all cores in the package. It behaves in the same way as Core Threshold #2.
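As a sketch of how the threshold fields listed above might be programmed, the helper below inserts an offset into bits [14:8] (threshold #1) or bits [22:16] (threshold #2) of a raw register value. The same field positions apply to the core register (0x19B) and the package register (0x1B2). The function names are hypothetical, and the interrupt enable bits are deliberately left untouched.

```python
CORE_THERM_INTERRUPT_MSR = 0x19B     # core thresholds, address from the text
PACKAGE_THERM_INTERRUPT_MSR = 0x1B2  # package thresholds, address from the text


def offset_for_trip(tjmax_c: int, trip_c: int) -> int:
    """Convert an absolute trip temperature into an offset below TjMAX."""
    return tjmax_c - trip_c


def set_threshold(raw: int, which: int, offset_c: int) -> int:
    """Return raw with threshold #1 or #2 replaced by offset_c (a 7-bit field)."""
    if which not in (1, 2):
        raise ValueError("which must be 1 or 2")
    if not 0 <= offset_c <= 0x7F:
        raise ValueError("offset must fit in 7 bits")
    shift = 8 if which == 1 else 16
    return (raw & ~(0x7F << shift)) | (offset_c << shift)


# Example: with TjMAX = 100, trip threshold #1 at 90 degrees (offset of 10).
new_raw = set_threshold(0, 1, offset_for_trip(100, 90))  # 0xA00
```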
Digital Thermal Sensor Accuracy (Taccuracy)
The error associated with DTS measurements does not exceed ±5 °C within the entire operating range.
Fan Speed Control with Digital Thermal Sensor
Digital Thermal Sensor based fan speed control (TFAN) is a recommended feature to achieve optimal thermal performance. The TFAN temperature (sometimes called TCONTROL) indicates the relative offset from the Thermal Monitor Trip Temperature at which fans should be engaged. For current temperature reporting, it is recommended that fan control software use the value in the PACKAGE_THERM_MARGIN (0x1A1) MSR, bits [15:0]. Intel recommends full cooling capability before the DTS reading reaches TjMAX.
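One possible fan policy consistent with the description above is a linear ramp that starts at the TFAN offset and reaches full speed some margin before TjMAX. This is a hypothetical sketch only: the function name, the 5 degree full-speed margin, and the 30% minimum duty are assumptions, and the input is taken as an already decoded offset in degrees below TjMAX, not a raw PACKAGE_THERM_MARGIN value.

```python
def fan_duty_percent(offset_below_tjmax_c: float,
                     tfan_offset_c: float,
                     full_speed_offset_c: float = 5.0,
                     min_duty: float = 30.0) -> float:
    """Map a DTS offset (degrees below TjMAX) to a fan duty cycle in percent."""
    if tfan_offset_c <= full_speed_offset_c:
        raise ValueError("TFAN offset must be larger than the full-speed offset")
    if offset_below_tjmax_c >= tfan_offset_c:
        return min_duty   # cooler than the TFAN point: fans at minimum
    if offset_below_tjmax_c <= full_speed_offset_c:
        return 100.0      # full cooling before the reading reaches TjMAX
    span = tfan_offset_c - full_speed_offset_c
    fraction = (tfan_offset_c - offset_below_tjmax_c) / span
    return min_duty + fraction * (100.0 - min_duty)
```

For example, with a TFAN offset of 20 degrees, a reading 12.5 degrees below TjMAX lies halfway along the ramp and yields a 65% duty cycle under these assumed defaults.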
PROCHOT_N Signal
PROCHOT_N (processor hot) is asserted by the processor when the TCC is active. Only a single PROCHOT_N pin exists at a package level. When any DTS temperature reaches the TCC activation temperature, the PROCHOT_N signal is asserted. PROCHOT_N assertion policies are independent of Adaptive Thermal Monitor enabling.
By default, PROCHOT_N is configured as a bi-directional pin. When configured as an input or bi-directional signal, PROCHOT_N is used for thermally protecting other platform components should they overheat as well. When PROCHOT_N is driven by an external device:
- The package will immediately transition to the lowest P-State (Pn) supported by the processor IA cores and graphics cores. This differs from the internally generated Adaptive Thermal Monitor response, which reduces frequency and voltage adaptively.
- Clock modulation is not activated.
The processor package remains at the lowest supported P-state until the system de-asserts PROCHOT_N. The processor is configured to generate an interrupt upon assertion and de-assertion of the PROCHOT_N signal.
When PROCHOT_N is configured as a bi-directional signal and PROCHOT_N is asserted by the processor, it is impossible for the processor to detect a system assertion of PROCHOT_N. The system assertion has to wait until the processor de-asserts PROCHOT_N before any PROCHOT_N action can occur due to the system assertion. While the processor is hot and asserting PROCHOT_N, the power is reduced, but the reduction rate is slower than the system PROCHOT_N response of < 100 µs. The processor thermal control is staged in smaller increments over many milliseconds. This may cause several milliseconds of delay to a system assertion of PROCHOT_N while the output function is asserted.
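The interaction between internal and external PROCHOT_N assertion described above can be summarized in a small decision model. This is a simplified sketch, not a description of the actual hardware logic; the function name, the mode strings, and the response labels are hypothetical, and the behavior in input-only mode when both sources are active is an assumption.

```python
def prochot_response(mode: str, processor_hot: bool, external_asserted: bool) -> str:
    """Return the modeled package response for a PROCHOT_N configuration and state."""
    if mode not in ("input", "bidirectional"):
        raise ValueError("mode must be 'input' or 'bidirectional'")
    if mode == "bidirectional" and processor_hot:
        # The processor is driving the pin, so a system assertion cannot be
        # detected; only the staged internal response applies.
        return "adaptive_throttle"
    if external_asserted:
        # Recognized system assertion: immediate drop to Pn, no clock modulation.
        return "lowest_pstate"
    if processor_hot:
        return "adaptive_throttle"
    return "normal"
```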
Voltage Regulator Protection using PROCHOT_N
PROCHOT_N may be used for thermal protection of voltage regulators (VR). System designers can create a circuit to monitor the VR temperature and assert PROCHOT_N and, if enabled, activate the TCC when the temperature limit of the VR is reached. When PROCHOT_N is configured as a bi-directional or input only signal, if the system assertion of PROCHOT_N is recognized by the processor, it will result in an immediate transition to the lowest P-State (Pn) supported by the processor IA cores and graphics cores. Systems should still provide proper cooling for the VR and rely on bi-directional PROCHOT_N only as a backup in case of system cooling failure. Overall, the system thermal design should allow the power delivery circuitry to operate within its temperature specification even while the processor is operating at its TDP.
Thermal Solution Design and PROCHOT_N Behavior
With a properly designed and characterized thermal solution, it is anticipated that PROCHOT_N will only be asserted for very short periods of time when running the most power intensive applications. The processor performance impact due to these brief periods of TCC activation is expected to be so minor that it would be immeasurable. However, an under-designed thermal solution that is not able to prevent excessive assertion of PROCHOT_N in the anticipated ambient environment may:
- Cause a noticeable performance loss.
- Result in prolonged operation at the specified maximum junction temperature and affect the long-term reliability of the processor.
- Be incapable of cooling the processor, in extreme situations, even when the TCC is active continuously.
Low-Power States and PROCHOT_N Behavior
Depending on package power levels during package C-states, outbound PROCHOT_N may de-assert while the processor is idle as power is removed from the signal. Upon wake up, if the processor is still hot, the PROCHOT_N will re-assert, although typically package idle state residency should resolve any thermal issues.
THRMTRIP_N Signal
Regardless of whether the automatic or On-Demand modes are enabled, in the event of a catastrophic cooling failure, the package will automatically shut down when the silicon has reached an elevated temperature that risks physical damage to the product. At this point the THRMTRIP_N signal will go active.
Critical Temperature Detection
Critical Temperature detection is performed by monitoring the package temperature. This feature is intended to allow a graceful shutdown before THRMTRIP_N is activated. However, processor execution is not guaranteed between the critical temperature and THRMTRIP_N. If the Adaptive Thermal Monitor is triggered and the temperature remains high, a critical temperature status and sticky bit are latched in the PACKAGE_THERM_STATUS (0x1B1) MSR and the condition also generates a thermal interrupt, if enabled.
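Software that polls for this condition could decode the status and sticky bits as sketched below. The bit positions (bit 4 for the critical temperature status and bit 5 for its sticky log) come from the Intel Software Developer's Manual layout of the package thermal status register, not from this section, and the function name is hypothetical.

```python
from typing import Tuple

PACKAGE_THERM_STATUS_MSR = 0x1B1  # MSR address given in the text
PKG_CRITICAL_STATUS_BIT = 4       # assumed from the SDM register layout
PKG_CRITICAL_LOG_BIT = 5          # assumed from the SDM register layout (sticky)


def critical_temperature_state(raw: int) -> Tuple[bool, bool]:
    """Return (currently_critical, critical_seen_since_log_cleared)."""
    status = bool((raw >> PKG_CRITICAL_STATUS_BIT) & 1)
    log = bool((raw >> PKG_CRITICAL_LOG_BIT) & 1)
    return status, log
```

A result of (False, True) would indicate that the critical condition has passed but was latched at some earlier point, which is the case a graceful-shutdown policy might still want to record.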
On-Demand Mode
The processor provides an auxiliary mechanism that allows system software to force the processor to reduce its power consumption using clock modulation. This mechanism is referred to as “On-Demand” mode and is distinct from Adaptive Thermal Monitor and bi-directional PROCHOT_N. Processor platforms should not rely on software usage of this mechanism to limit the processor temperature. On-Demand Mode can be accomplished using a processor MSR or chipset I/O emulation.
On-Demand Mode may be used in conjunction with the Adaptive Thermal Monitor. However, if the system software tries to enable On-Demand mode at the same time the TCC is engaged, the factory configured duty cycle of the TCC will override the duty cycle selected by the On-Demand mode. If the I/O based and MSR-based On-Demand modes are in conflict, the duty cycle selected by the I/O emulation-based On-Demand mode will take precedence over the MSR-based On-Demand Mode.
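The precedence rules above can be expressed as a short selection function. This is a sketch of the stated priority order only; the function and parameter names are hypothetical, duty cycles are expressed as "on" fractions, and `tcc_modulating` is assumed to mean that the TCC is currently applying its factory-configured clock modulation.

```python
from typing import Optional

TCC_DUTY_ON = 0.25  # factory-configured TCC duty cycle ("on" fraction), from the text


def effective_duty_on(tcc_modulating: bool,
                      io_duty_on: Optional[float],
                      msr_duty_on: Optional[float]) -> float:
    """Pick the duty cycle that applies: TCC first, then I/O emulation, then MSR."""
    if tcc_modulating:
        return TCC_DUTY_ON   # TCC overrides any On-Demand selection
    if io_duty_on is not None:
        return io_duty_on    # I/O emulation takes precedence over the MSR
    if msr_duty_on is not None:
        return msr_duty_on
    return 1.0               # no modulation requested: clocks always on
```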
MSR Based On-Demand Mode
If Bit 4 of the IA32_CLOCK_MODULATION MSR is set to 1, the processor will immediately reduce its power consumption using modulation of the internal processor IA core clock, independent of the processor temperature. The duty cycle of the clock modulation is programmable using bits [3:0] of the same IA32_CLOCK_MODULATION MSR. In this mode, the duty cycle can be programmed in 6.25% increments. Thermal throttling using this method will modulate each processor IA core's clock independently.
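The register layout described above (enable in bit 4, duty cycle in bits [3:0] at 6.25% per step) can be illustrated with a pair of encode and decode helpers. The function names are hypothetical, the MSR address 0x19A comes from the Intel Software Developer's Manual rather than this section, and rejecting a step value of 0 is an assumption made for this sketch.

```python
from typing import Tuple

IA32_CLOCK_MODULATION_MSR = 0x19A  # address per the SDM; not stated in this section
ENABLE_BIT = 1 << 4                # bit 4: On-Demand clock modulation enable
STEP_PERCENT = 6.25                # one duty cycle increment, from the text


def encode_clock_modulation(steps: int) -> int:
    """Build a register value that enables modulation at steps * 6.25% duty."""
    if not 1 <= steps <= 15:
        raise ValueError("steps must be in 1..15 (0 is treated as invalid here)")
    return ENABLE_BIT | steps


def decode_clock_modulation(raw: int) -> Tuple[bool, float]:
    """Return (enabled, duty cycle in percent) from bits [4:0] of raw."""
    return bool(raw & ENABLE_BIT), (raw & 0xF) * STEP_PERCENT


value = encode_clock_modulation(4)  # 0x14: enabled, 25% duty cycle
```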
I/O Emulation-Based On-Demand Mode
I/O emulation-based clock modulation provides legacy support for operating system software that initiates clock modulation through I/O writes to ACPI defined processor clock control registers on the chipset (PROC_CNT). Thermal throttling using this method will modulate all processor IA cores simultaneously.