Intel® Xeon® 6700-Series Processor with E-Cores
Specification Update
Errata Details
SRF1. Intel® VT-d Remapping Hardware Does Not Perform Reserved(0) Check on PGSNP Field of Scalable-Mode PASID Table Entry
Problem: Intel® Virtualization Technology (Intel® VT) for Directed I/O (Intel® VT-d) remapping hardware does not perform a Reserved(0) check on the Page Snoop (PGSNP) field in a scalable-mode Process Address Space ID (PASID) table entry when the Snoop Control capability is reported as not available in the Extended Capability Register (offset 10h, bit 7; ECAP.SC=0).
Implication: There are no known functional implications due to this erratum. Intel has not observed this erratum with any commercially available software.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF2. Remapping Hardware May Set Access/Dirty Bits in a First-Stage Page-Table Entry
Problem: When remapping hardware is configured by system software in scalable mode as Nested (PGTT=011b) and with PWSNP field Set in the PASID-table-entry, it may Set Accessed bit and Dirty bit (and Extended Access bit if enabled) in first-stage page-table entries even when second-stage mappings indicate that corresponding first-stage page-table is Read-Only.
Implication: Due to this erratum, pages mapped as Read-only in second-stage page-tables may be modified by remapping hardware Access/Dirty bit updates.
Workaround: None identified. System software enabling nested translations for a VM should ensure that there are no read-only pages in the corresponding second-stage mappings.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF3. Machine Check Bank 4 UCNA Errors May Not Be Signaled
Problem: When any UC error is not enabled in machine check bank 4 due to its associated bit being 0 in IA32_MC4_CTL (MSR 410h), and the disabled UC error and a UCNA error happen simultaneously, the UC error is logged with overflow set, but the UCNA error may not be signaled.
Implication: Due to this erratum, when UC errors are disabled in bank 4, UCNA errors may not be signaled.
Workaround: None identified. Software should keep MCAs enabled in IA32_MC4_CTL.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF4. Platform May Hang if System Software Sends a Page Group Response or DevTLB Invalidation to Non-existent Requester ID
Problem: When system software submits a Page Group Response or DevTLB Invalidation command to remapping hardware, the remapping hardware forwards commands to Root-Complex so that the Root-Complex may route the command to Requester ID specified by system software. If system software specifies a Requester ID in the command that does not exist on the platform, the command is not correctly aborted and may cause the system to hang.
Implication: If system software issues a Page Group Response or DevTLB Invalidation towards a Requester ID that does not exist on the platform, the system may hang. Intel has only observed this behavior in a synthetic test environment.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF5. Remapping Hardware Implements Bits [31:16] of The Three Event Data Registers (VTDBAR offsets 0x3C, 0xA4, and 0xE4) as Read-Writable
Problem: Bits [31:16] of the three Event Data registers (VTDBAR offsets 0x3C, 0xA4, and 0xE4) are defined to be “Reserved and Zero” (RsvdZ) but are implemented as Read-Writable (RW).
Implication: Due to this erratum, system software may write bits [31:16] to non-zero values. Intel has not observed this erratum to impact the operation of any commercially available software.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
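Software that wants to stay within the architectural definition can clear the upper half of a value before writing it to an Event Data register. The following is a minimal sketch only: the helper name sanitize_event_data is hypothetical, the offsets are taken from the erratum text, and no MMIO access is performed.

```python
# Offsets of the three Event Data registers relative to VTDBAR,
# as listed in the erratum text.
EVENT_DATA_OFFSETS = (0x3C, 0xA4, 0xE4)

# Bits [31:16] are architecturally RsvdZ, so only bits [15:0] are kept.
EVENT_DATA_MASK = 0x0000FFFF


def sanitize_event_data(value: int) -> int:
    """Return value with bits [31:16] cleared (hypothetical helper)."""
    return value & EVENT_DATA_MASK
```

Applying the mask on every write keeps software behavior identical on hardware that enforces RsvdZ and on hardware affected by this erratum.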
SRF6. Performance Monitoring Event Branch Instruction Retired Does Not Count CALLs to Next Sequential Instruction
Problem: A CALL instruction whose target is the next sequential instruction (the same address pushed onto the stack) does not increment the performance monitoring event BR_INST_RETIRED (Event: C4H, UMask: 00H, F9H).
Implication: Due to this erratum, software monitoring Branch Instruction Retired events may undercount. Since the CALL is to the next instruction, control flow tracing with the Last Branch Retired (LBR) records is not affected.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF7. Performance Monitoring Event Branch Instruction Retired Overcounts on Certain Types of Branch and Complex Instructions
Problem: On certain types of branch and complex instructions, the performance monitoring event BR_INST_RETIRED (Event: C4H, UMask: 00H / 7EH / BFH / C0H / DFH / EBH / FBH / F9H) overcounts by 1. Affected instructions include FAR CALL/JMP, RETF, IRET, VMENTRY/VMEXIT/VMPTRLD and complex SGX/SMX/CSTATE instructions/flows.
Implication: Due to this erratum, software monitoring Branch Instruction Retired events may overcount.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF8. Remapping Hardware May Encounter Incorrect Error Code in Invalidation Queue Error Record Register
Problem: When fetching a new descriptor from the Invalidation Queue, if DMA remapping hardware observes an unsupported value in the Translation Table Mode (TTM) field, it may report an invalid descriptor width programmed in the Invalidation Queue (code 5) instead of an invalid value in the TTM field of the Root Table Address (code 7) in the Invalidation Queue Error Info (IQEI) field of the IQERCD_REG (VTDBAR offset 0xB0).
Implication: Due to this erratum, software that distinguishes between error code 5 and error code 7 may not function as expected.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
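Error-handling software can account for the ambiguity by treating code 5 as covering both causes. The sketch below is illustrative only: the function name and the cause strings are hypothetical, and the position of the IQEI field inside the register is deliberately not modeled because it is not given here.

```python
# IQEI error codes named in the erratum text.
IQEI_INVALID_DESCRIPTOR_WIDTH = 5
IQEI_INVALID_TTM = 7


def possible_causes(iqei_code: int) -> set:
    """Map a reported IQEI code to the causes it may represent on affected parts."""
    if iqei_code == IQEI_INVALID_DESCRIPTOR_WIDTH:
        # On affected hardware, code 5 may also stand in for an unsupported TTM value.
        return {"invalid descriptor width", "invalid TTM value"}
    if iqei_code == IQEI_INVALID_TTM:
        return {"invalid TTM value"}
    # Other codes are outside the scope of this erratum.
    return set()
```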
SRF9. Processor Trace May Generate PSB Packets Too Infrequently
Problem: A Packet Stream Boundary (PSB) packet should be generated for every PSBFreq number of trace output bytes. Due to this erratum, PSB packets may be generated only after as many as four times that number of output bytes have been generated.
Implication: Due to this erratum, trace decoder software may see fewer PSB packets than expected. This may lead to the trace decoder software needing to search further to find a starting point to decode or, when used in circular mode, being unable to decode the trace due to lacking any PSB packets.
Workaround: None identified. Software can request more frequent PSB packets by programming PSBFreq (bits[27:24]) of IA32_RTIT_CTL MSR (570H) to a value 1/4 of the desired value.
Status: For the steppings affected, see the Summary Tables of Changes.
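The workaround arithmetic can be sketched as follows. This assumes that the PSBFreq field uses a power-of-two encoding in which encoding n requests a PSB roughly every 2K × 2^n output bytes; that encoding comes from the general Processor Trace definition and is not stated in this document, and the set of supported encodings varies by processor. Under that assumption, one quarter of the byte period corresponds to lowering the encoding by 2. All helper names are hypothetical.

```python
PSBFREQ_SHIFT = 24                      # PSBFreq occupies bits [27:24] of IA32_RTIT_CTL
PSBFREQ_MASK = 0xF << PSBFREQ_SHIFT


def psb_period_bytes(encoding: int) -> int:
    """Assumed encoding: a PSB about every 2K * 2**encoding output bytes."""
    return 2048 << encoding


def workaround_encoding(desired_encoding: int) -> int:
    """Pick the encoding whose period is 1/4 of the desired one, clamped at 0."""
    return max(desired_encoding - 2, 0)


def set_psbfreq(rtit_ctl: int, encoding: int) -> int:
    """Return an IA32_RTIT_CTL value with the PSBFreq field replaced."""
    if not 0 <= encoding <= 0xF:
        raise ValueError("PSBFreq is a 4-bit field")
    return (rtit_ctl & ~PSBFREQ_MASK) | (encoding << PSBFREQ_SHIFT)
```

When the desired encoding is already 0 or 1, the clamp means the full factor of four cannot be applied, so PSB packets may still arrive less often than requested.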
SRF10. Unsynchronized Cross-Modifying Code Operations Can Cause Unexpected Instruction Execution Results
Problem: The act of one processor or system bus master writing data into a currently executing code segment of a second processor with the intent of having the second processor execute that data as code is called cross-modifying code (XMC). XMC that does not force the second processor to execute a synchronizing instruction prior to execution of the new code is called unsynchronized XMC. Software using unsynchronized XMC to modify the instruction byte stream of a processor may see unexpected or unpredictable execution behavior from the processor that is executing the modified code.
Implication: In this case the phrase "unexpected or unpredictable execution behavior" encompasses the generation of most of the exceptions listed in the Intel Architecture Software Developer's Manual Volume 3: System Programming Guide including a General Protection Fault (GPF) or other unexpected behaviors. In the event that unpredictable execution causes a GPF the application executing the unsynchronized XMC operation would be terminated by the operating system.
Workaround: In order to avoid this erratum programmers should use the XMC synchronization algorithm as detailed in the Intel Architecture Software Developer's Manual Volume 3: System Programming Guide Section: Handling Self- and Cross-Modifying Code.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF11. Address of Poisoned Data Line May Be Incorrectly Reported
Problem: Under complex microarchitectural conditions, an incorrect address may be logged in IA32_MC3_ADDR (MSR 40Eh) when an instruction consumes Poisoned data.
Implication: Due to this erratum, IA32_MC3_ADDR (MSR 40Eh) may contain an incorrect address when a Poison fault is reported in IA32_MC3_STATUS (MSR 40Dh, MCACOD of 0135H). Poison containment is not lost. Intel has only observed this behavior under synthetic testing conditions.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF12. Locked Page Split Access May Not Be Detected by UC-lock Disable if Split-lock Disable is Not Used
Problem: The UC-lock disable feature (MSR_MEMORY_CTRL bit [28] (MSR 33h)) may not cause a fault (#AC(4)) for a page split lock that accesses a page with non-WB memory type if the split lock disable (MSR_MEMORY_CTRL bit [29]) is not set.
Implication: Due to this erratum, system software may not be able to fully prevent bus locks due to locks to non-WB memory unless they use the split-lock disable feature to prevent bus locks due to splits. Intel has not observed this erratum with any commercially available software.
Workaround: None identified. Software using the UC-lock disable feature should also enable the split lock disable feature.
Status: For the steppings affected, see the Summary Tables of Changes.
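The workaround amounts to always setting both control bits together. A minimal sketch, using the bit positions from the erratum text; the function names are hypothetical and the code only computes register values without performing any MSR access.

```python
UC_LOCK_DISABLE = 1 << 28      # MSR_MEMORY_CTRL (MSR 33h) bit 28
SPLIT_LOCK_DISABLE = 1 << 29   # MSR_MEMORY_CTRL (MSR 33h) bit 29


def with_uc_lock_disable(memory_ctrl: int) -> int:
    """Return a value that enables UC-lock disable together with split-lock disable."""
    return memory_ctrl | UC_LOCK_DISABLE | SPLIT_LOCK_DISABLE


def follows_workaround(memory_ctrl: int) -> bool:
    """True unless UC-lock disable is set without split-lock disable."""
    uc_lock = bool(memory_ctrl & UC_LOCK_DISABLE)
    split_lock = bool(memory_ctrl & SPLIT_LOCK_DISABLE)
    return (not uc_lock) or split_lock
```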
SRF13. Intel® QAT Accelerator May Violate ATS Invalidation Completion Ordering
Problem: Address Translation Service (ATS) invalidations may complete before all in-flight writes are drained from Intel® QuickAssist Technology (Intel® QAT) accelerator.
Implication: Due to this erratum, Intel QAT accelerator operation with ATS capability enabled may lead to unexpected system behavior.
Workaround: None identified. System software (OS/VMM) performing ATS invalidation on Intel QAT accelerator needs to serially execute a second (duplicate) ATS invalidation request after the first invalidation completes to drain in-flight writes.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF14. Intel QAT Accelerator Device May Not Invalidate PASID Supervisor-Privilege Translations
Problem: ATS invalidations for Process Address Space ID (PASID) with Supervisor-privilege translations may not correctly invalidate the device TLB on Intel QAT.
Implication: Due to this erratum, Intel QAT accelerator operation with ATS capability enabled and Supervisor-privilege PASID may lead to unexpected system behavior.
Workaround: None identified. System software (OS/VMM) performing ATS invalidation on Intel QAT accelerator on behalf of any supervisor-privilege PASID must set the Global Invalidate (G) bit in the ATS invalidation to avoid the erratum.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF15. RTTO May Occur at Lower Link Speed and Reduced Link Width
Problem: When a 32 GT/s, x16 PCI Express* (PCIe*) port is configured to operate at a lower link speed and reduced link width (such as 2.5 GT/s, x1 mode), Data Link Layer Packets (DLLPs), including transaction ACK packets, may incur large latencies.
Implication: Due to this erratum, large latencies at lower link speeds and reduced link widths may lead to Replay Timer Timeout (RTTO) failures from the link partner.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF16. PCIe* Root Port May Fail to Set The RICFM Bit
Problem: The PCIe Root Port may not set the Received Integrity Check Fail Message (RICFM) bit in the Selective IDE Stream Status Register (Bit [31] in SIDESSTS_1, Bus: 29, 26, 4-1; Device: 9-2; Function: 0; Offset 320h) when the Root Port receives an Integrity Data Encryption (IDE) Fail message from a PCIe device.
Implication: Due to this erratum, when the Root Port receives an IDE Fail message from a PCIe device, software that relies upon the RICFM bit may not function as expected.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF17. Intel® IAA Drop Initial Bits Field in AECS May Not Clear When Drop Initial Bits field is 8 Times Source1 Size
Problem: On systems with Intel® In-Memory Analytics Accelerator (Intel® IAA) enabled, if the Drop Initial Bits field (Offset 1DCh) in the Analytics Engine Configuration and State (AECS) structure is equal to 8 times the Source 1 Transfer Size (Offset 20h) in the descriptor, then if the AECS is written out at the end of a decompression job, the updated value of this Drop Initial Bits field is incorrect.
Implication: Due to this erratum, if AECS is used to pass information between related jobs and this condition occurs, the following job may generate incorrect output, which may lead to unpredictable system behavior.
Workaround: None identified. If this condition occurs, then when the job completes, software must manually zero the Drop Initial Bits field.
Status: For the steppings affected, see the Summary Tables of Changes.
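The condition and the manual fix-up can be sketched as below. The function name and parameters are hypothetical, and the AECS layout is not modeled: the sketch only decides which Drop Initial Bits value software should carry into the next related job.

```python
def corrected_drop_initial_bits(submitted_drop_bits: int,
                                source1_size_bytes: int,
                                written_back_drop_bits: int) -> int:
    """Return the Drop Initial Bits value to use for the following job.

    submitted_drop_bits: field value in the AECS when the job was submitted.
    source1_size_bytes: Source 1 Transfer Size from the descriptor.
    written_back_drop_bits: field value the device wrote out at job completion.
    """
    if submitted_drop_bits == 8 * source1_size_bytes:
        # Erratum condition: the written-back value is unreliable, so zero it.
        return 0
    return written_back_drop_bits
```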
SRF18. Intel IAA/Intel® DSA May Not Report Interrupt PASID Check Failure Error
Problem: On systems with Intel IAA or Intel® Data Streaming Accelerator (Intel® DSA), if Drain Descriptor encounters both an Interrupt PASID check failure and a Page Fault error on an explicit readback address, the interrupt PASID check failure may not be reported.
Implication: Due to this erratum, software that relies on the interrupt PASID check failure error being reported may not function as expected.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF19. The Processor May Drop Noncompliant Posted Peer-to-Peer Transactions
Problem: If the processor receives a noncompliant posted PCIe peer-to-peer transaction with non-zero upper tag bits [9:8], it may drop the transaction instead of forwarding it to the intended destination.
Implication: Due to this erratum, PCIe devices that perform peer-to-peer posted transactions may not operate as expected. Intel has not observed this erratum with any commercially available devices.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF20. Unexpected Rollover in MBM Counters
Problem: When using Intel® Resource Director Technology (Intel® RDT), unexpected rollover can occur when Memory Bandwidth Monitoring (MBM) counter values are close to the maximum allowed counter value. A rollover occurs when the MBM counter value read in iteration n+1 is lower than the value read in iteration n.
Implication: Due to this erratum, bandwidth computed from successive MBM readings representing a rollover may not be accurate.
Workaround: None identified. Software should discard the memory bandwidth computed over a rollover interval.
Status: For the steppings affected, see the Summary Tables of Changes.
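A monitoring loop can apply this workaround by dropping any interval in which the counter value decreases. The following is a sketch under stated assumptions: the function name is hypothetical, bytes_per_count stands for whatever scaling factor the platform reports for converting counter units to bytes, and readings are taken at a fixed interval.

```python
def bandwidth_samples(readings, bytes_per_count, interval_s):
    """Compute bandwidth per interval, returning None for rollover intervals."""
    samples = []
    for prev, curr in zip(readings, readings[1:]):
        if curr < prev:
            # Rollover as defined in the erratum: discard this interval.
            samples.append(None)
            continue
        samples.append((curr - prev) * bytes_per_count / interval_s)
    return samples
```

Returning None rather than attempting to reconstruct the wrapped delta follows the workaround text, since the rollover point is unexpected and the true delta cannot be recovered reliably.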
SRF21. Remapping Hardware May Not Generate a Page Request Group Response Message While Operating in Legacy Mode or Abort DMA Mode
Problem: Remapping hardware may not generate a Page Request Group Response Message while operating in Legacy mode or Abort DMA mode if a PCIe device generates a Page Request Message.
Implication: Due to this erratum, the failure of the remapping hardware to generate a Page Request Group Response Message may lead to unpredictable device behavior, including a device hang. The remapping hardware continues to report RTA.3 or RTA.4 faults if it receives these Page Request Group Response Messages. Intel has only observed this behavior in a synthetic test environment.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF22. Remapping Hardware May Abort ZLR to Second-Stage Write Only Pages
Problem: Remapping hardware reports a non-recoverable Intel VT-d fault and causes the Zero-Length-Read (ZLR) to be aborted if the ZLR encounters a read-only page in first-stage tables and a write-only page in second-stage tables.
Implication: Due to this erratum, device may observe an unexpected abort on a ZLR and an Intel VT-d fault may be indicated. Intel has not observed this erratum with any commercially available software.
Workaround: None identified. System software should not create write only pages in second-stage page tables.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF23. Remapping Hardware Does Not Perform Reserved (0) Check in Page Response Descriptor
Problem: Remapping hardware does not set Invalidation Queue Error field in the Fault Status Register (VTDBAR offset 0x34) when software writes non-zero value in bits [255:128] and bit[5] of the Page Response descriptor.
Implication: System software that violates the Intel VT-d architecture requirement by programming non-zero values in bits [255:128] and bit [5] of the Page Response descriptor may not fault on current processors but may fault on future processors. Intel has not observed this erratum with any commercially available system.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF24. I/O And Complex Operations May Hang in the Presence of Lines Marked Poisoned
Problem: Under certain microarchitectural conditions, IN, OUT, REP IN, REP OUT, and complex operations that utilize these primitives may hang if the processor contains a poisoned line in the first level data cache (DL1).
Implication: When this erratum occurs, the system may become unresponsive and hang with an Internal Timer Error with Machine Check Exception (MCACOD=0400h) logged into IA32_MC0_STATUS (MSR 401h). Intel has only observed this behavior in a synthetic test environment.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF25. CHA UCNA Errors May Be Incorrectly Controlled by MC7_CTL Enable Bits
Problem: Uncorrectable Error No Action required (UCNA) errors reported in Cache Home Agent (CHA) Machine Check Banks (Bank 7) IA32_MC7_STATUS (MSR 41Dh) may be incorrectly controlled by the associated IA32_MC7_CTL (MSR 41Ch).
Implication: Due to this erratum, when IA32_MC7_CTL = 0h, the UCNA error may be logged but not signaled. When IA32_MC7_CTL = FFFFFFFFFh, the UCNA error may be logged and signaled, but may incorrectly set IA32_MC7_STATUS.EN (bit 60). Intel has not observed this erratum to affect any commercially available software.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF26. Errors May Occur During TOR CrashDump
Problem: When reading the TOR_AUX array prior to a TOR CrashDump operation, a segment of the TOR_AUX memory array may not be fully initialized, and the processor may return invalid data. As a result, a TOR_AUX_DATA_PARITY_ERR error may be logged (IA32_MCi_STATUS.MSCOD = 40h).
Implication: Due to this erratum, software analyzing a CrashDump record may not behave as expected.
Workaround: None identified. Software should ignore TOR_AUX_DATA_PARITY_ERR errors after a TOR CrashDump.
Status: For the steppings affected, see the Summary Tables of Changes.
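Log-filtering software can implement this workaround by checking the model-specific error code of each record. The sketch below assumes that MSCOD occupies bits [31:16] of IA32_MCi_STATUS, which is the general machine check layout and is not stated in this document; the function names and the after_tor_crashdump flag are hypothetical.

```python
MSCOD_TOR_AUX_DATA_PARITY_ERR = 0x40  # value given in the erratum text


def mscod(status: int) -> int:
    """Extract the model-specific error code (assumed to be bits [31:16])."""
    return (status >> 16) & 0xFFFF


def should_ignore(status: int, after_tor_crashdump: bool) -> bool:
    """True for a TOR_AUX_DATA_PARITY_ERR record seen after a TOR CrashDump."""
    return after_tor_crashdump and mscod(status) == MSCOD_TOR_AUX_DATA_PARITY_ERR
```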
SRF27. TOR_TIMEOUT May Occur Due to RID Values Outside Range Limit
Problem: When Integrity and Data Encryption (IDE) is enabled, Requester ID (RID) values that fall outside the expected RID value limits defined by the Selective IDE RID Association 1 Register (Bits [23:8] in SIDERIDA1_1, Bus: 29, 26, 4-1; Device: 9-2; Function: 0; Offset 324h) and Selective IDE RID Association 2 Register (Bits [23:8] in SIDERIDA2_1, Bus: 29, 26, 4-1; Device: 9-2; Function: 0; Offset 328h) for the PCIe root port may lead to a TOR_TIMEOUT Machine Check Exception reported in MC7_STATUS (MSR 41Dh, MSCOD=0Ch), rather than a Completion Timeout.
Implication: Due to this erratum, a TOR_TIMEOUT may occur, which may lead to a system hang.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF28. DDR5 9x4 DIMMs ECS Data May Be Reported Incorrectly
Problem: For DDR5 9x4 DIMMs, after the memory controller issues a Mode Register Read (MRR) to device 8, the Error Check and Scrub (ECS) data will be reported incorrectly in mr_read_result (MEM_BAR [0-3], Offsets 22C80h-22C90h or 2AC80h-2AC90h).
Implication: Due to this erratum, the software cannot rely on ECS data for Device 8 with DDR5 9x4 DIMMs. DDR5 10x4 or 5x8 DIMMs are not affected by this erratum.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF29. RETRY_RD_ERR_LOG_MISC.DDR5_9x4_half_device Bit May Be Incorrect
Problem: On systems using 9x4 DDR5 DIMMs, when Permanent Fault Detection (PFD) is disabled, the RETRY_RD_ERR_LOG_MISC.DDR5_9x4_half_device bit (138_MEM_RRD + Offsets 22C54h, 22D80h, 2AC54h, 2AD80h, 22E60h, Bit 7) will always report 0 when an error is detected in device 8.
Implication: Due to this erratum, when an error is detected on device 8, the system software is not able to rely on the value of the RETRY_RD_ERR_LOG_MISC.DDR5_9x4_half_device bit.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF30. Intel DSA Memory Write with Incorrect Parity May Result in a System Crash
Problem: When executing a copy operation, if an Intel DSA device receives a poisoned data response to a memory read request (Bus 8:11; Device: 1; Function: 0; Offset 104h, ERRUNCSTS.PTLP, bit[12]), an associated destination memory write with incorrect parity may be generated.
Implication: Due to this erratum, the memory write with incorrect parity may result in a machine check error leading to a system crash.
Workaround: None identified. It may be possible for the BIOS to contain a mitigation for this erratum.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF31. Intel IAA Expand Operation With PRLE Format Input May Return an Error
Problem: Intel IAA Expand operation may parse beyond the required elements of Source 1 and return a Parquet Run Length Encoding (PRLE) Format Error (14h) unexpectedly.
Implication: Due to this erratum, software may receive a spurious error.
Workaround: None identified. Software should not send non-PRLE encoded stream data in Source 1.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF32. Processor Trace May Not Generate a CYC Packet Before MODE.EXEC Packets
Problem: When a Processor Trace MODE.EXEC packet is generated due to a change in RFLAGS.IF (interrupt flag) or the CS.L or CS.D bits, the processor may not generate a CYC packet before generating the MODE.EXEC packet.
Implication: Due to this erratum, trace decoder software may not be able to precisely determine when mode changes that involve changing the interrupt flag or the application’s default operand size happened.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF33. APPP Error May Be Logged Incorrectly in Machine Check Status Register ADDRV
Problem: When the processor encounters an address parity error, it will signal an Address Parity Error (APPP) and set IA32_MCi_STATUS.ADDRV (bit 58), but the failing address logged in the IA32_MCi_ADDR register may be incorrect.
Implication: Due to this erratum, software that relies upon the IA32_MCi_ADDR.ADDR bits may not function as expected.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF34. Intel DSA Does Not Report IDPT Entry in SWERROR or Event Log
Problem: The Intel DSA device may not log the Inter-Domain Permissions Table (IDPT) entry number into the error information field of the SWERROR register or Event Log as defined by the Intel® Data Streaming Accelerator Architecture Specification, document number 671116, when it attempts to update an IDPT entry that is inaccessible.
Implication: Due to this erratum, software that utilizes the Event Log or SWERROR register for debugging purposes may not be aware of the entry number that triggered this error. Intel has not observed any functional issues due to this erratum.
Workaround: None identified. Software can avoid this erratum by always providing a completion record for the Update Window descriptor.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF35. UBOXERRMISC_CFG Registers Do Not Log Errors
Problem: The UBOXERR[MISC,MISC2, & MISC3]_CFG registers (Bus: 30; Device: 0; Function: 0; Offset: [ECh,E8h, & F4h]) within the Ubox Event Control (EVNTS) PCI configuration space will not log errors and incorrectly read 0h unless written by software.
Implication: Due to this erratum, software that relies upon the UBOXERR[MISC,MISC2, & MISC3]_CFG registers may function incorrectly.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF36. Intel DSA and Intel IAA Devices May Cause a Machine Check
Problem: Reads returning greater than 8 bytes from some registers of Intel DSA and Intel IAA devices will result in a fatal Machine Check Exception (MCE) logged in MC4_STATUS (MSR=419h, MSCOD=0000h, MCACOD=0E0Bh).
Implication: Due to this erratum, reads greater than 8 bytes will trigger a fatal MCE.
Workaround: None identified. Only software at the highest privilege level should access Intel DSA and Intel IAA devices.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF37. Intel DSA and Intel IAA Devices May Cause Invalid Translation Caching
Problem: Submitting a Disable Work Queue command to Intel DSA and Intel IAA devices does not drain in-flight address translations, resulting in possible invalid translation caching, which may lead to unpredictable system behavior.
Implication: Due to this erratum, Intel DSA and Intel IAA devices may not behave as expected.
Workaround: None identified. As a mitigation, software should replace the Disable Work Queue command with a Drain Work Queue command followed by a Reset Work Queue command.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF38. S and AR Bits of MCi_STATUS Registers Unexpectedly Cleared by UCNA or CE
Problem: An Uncorrected No Action (UCNA) or Correctable Error (CE) may incorrectly clear the Signaling (S; bit 56) or Action Required (AR; bit 55) flags of IA32_MCi_STATUS registers.
Implication: Due to this erratum, software that relies on S and AR bits of IA32_MCi_STATUS registers may not function as expected.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
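For reference, the two flags named in this erratum can be extracted from a raw status value as sketched below. The bit positions come from the erratum text; the function name is hypothetical, and the sketch operates on an integer rather than reading any MSR.

```python
S_BIT = 1 << 56    # Signaling flag, IA32_MCi_STATUS bit 56
AR_BIT = 1 << 55   # Action Required flag, IA32_MCi_STATUS bit 55


def decode_s_ar(status: int) -> dict:
    """Return the S and AR flags of a raw IA32_MCi_STATUS value."""
    return {"S": bool(status & S_BIT), "AR": bool(status & AR_BIT)}
```

On affected parts, a result with both flags clear does not by itself prove that no signaled or action-required error was logged earlier in the same bank.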
SRF39. Completion Timeout When Using Link IDE And Selective IDE
Problem: When using both Link Integrity and Data Encryption (Link IDE) and Selective Integrity and Data Encryption (Selective IDE) capabilities for a PCIe port, the processor may respond with an incorrect StreamID field value in a completion packet.
Implication: Due to this erratum, an endpoint that relies upon coherent secure link StreamID field values when the Link IDE Stream State register (Bit [3:0]; Bus: 29, 26, 4-1; Device: 9-2; Function: 0; Offset 314h) and the Selective IDE Stream State register (Bit [3:0]; Bus: 29, 26, 4-1; Device: 9-2; Function: 0; Offset 320h) are enabled may not function as expected and may lead to a device Completion Timeout (CTO) error.
Workaround: None identified. Software should not enable both Link IDE and Selective IDE for a PCIe port. Additional details may be found in the Root Complex IDE Key Configuration Unit Software Programming Guide, document number 732838.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF40. Error Overflow Indication is Not Getting Set Properly When Back to Back Errors Occur
Problem: Under complex transaction flow conditions, if multiple (back to back) errors occur, the OVERFLOW bit (bit 62) in IA32_MCi_STATUS registers may not be set.
Implication: For some errors, the system will log an MCE but may not set the OVERFLOW bit in the IA32_MCi_STATUS register of the CHA, which may result in the OS performing an incorrect page map out action.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF41. Non Canonical Fault May Be Signaled on Access That Wraps Address Space When LAM is Enabled
Problem: When Linear Address Masking (LAM) is enabled, a non-canonical fault may be signaled if there is an access which splits the 64-bit linear address space (and thus touches both linear address FFFF_FFFF_FFFF_FFFFh and 0h).
Implication: Due to this erratum, software may receive an unexpected exception on such accesses. Intel has not observed this erratum with any commercially available software.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
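The kind of access described here can be identified arithmetically. A minimal sketch, assuming a hypothetical helper that takes the starting linear address and the access size in bytes:

```python
ADDRESS_SPACE = 1 << 64  # size of the 64-bit linear address space


def access_wraps(linear_address: int, size: int) -> bool:
    """True if the access touches both FFFF_FFFF_FFFF_FFFFh and 0h."""
    if not 0 <= linear_address < ADDRESS_SPACE or size <= 0:
        raise ValueError("address must fit in 64 bits and size must be positive")
    return linear_address + size > ADDRESS_SPACE
```

For example, an 8-byte access starting at FFFF_FFFF_FFFF_FFFCh wraps, while the same access starting at FFFF_FFFF_FFFF_FFF8h ends exactly at the top of the address space and does not.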
SRF42. Intel RDT Memory Bandwidth Monitor (MBM) May Overcount Memory Bandwidth Measurements
Problem: The Intel RDT MBM feature may overcount memory bandwidth measurements when both the dead block predictor and the last level cache stream prefetcher are enabled.
Implication: Due to this erratum, software that relies upon MBM counters may function incorrectly.
Workaround: None identified. Software may alternatively use uncore performance monitoring of UNC_M_CAS_COUNT_SCH0.RD, UNC_M_CAS_COUNT_SCH0.WR, UNC_M_CAS_COUNT_SCH1.RD, and UNC_M_CAS_COUNT_SCH1.WR to measure aggregate memory bandwidth. See https://perfmon-events.intel.com for more information.
Status: For the steppings affected, see the Summary Tables of Changes.
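Aggregate bandwidth can be derived from the four CAS counters roughly as sketched below. This assumes each counted CAS command transfers one 64-byte cache line, which is not stated in this document and should be confirmed against the event definitions; the function name and the dictionary-based counter snapshots are hypothetical.

```python
CAS_EVENTS = (
    "UNC_M_CAS_COUNT_SCH0.RD",
    "UNC_M_CAS_COUNT_SCH0.WR",
    "UNC_M_CAS_COUNT_SCH1.RD",
    "UNC_M_CAS_COUNT_SCH1.WR",
)
BYTES_PER_CAS = 64  # assumption: one cache line per CAS command


def aggregate_bandwidth(start: dict, end: dict, interval_s: float) -> float:
    """Bytes per second from two snapshots of the four CAS counters."""
    total_cas = sum(end[name] - start[name] for name in CAS_EVENTS)
    return total_cas * BYTES_PER_CAS / interval_s
```

Counters are per memory controller channel, so a system-wide figure requires summing the result over all channels being monitored.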
SRF43. Intel RDT Memory Bandwidth Allocation (MBA) Cannot Throttle to Minimum Bandwidth
Problem: Memory bandwidth may not be limited to the minimum possible value when the Intel RDT MBA throttle MSRs IA32_L2_QOS_EXT_BW_THRTL_[0..14] (MSR addresses D50h...D5Eh) are programmed to the maximum delay value of 90.
Implication: Due to this erratum, maximum bandwidth throttling cannot be guaranteed.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF44. Mirrored 128b ECC Mode May Not Log DRAM Address With Poison
Problem: When a poison pattern error is detected with 128 bit Error Correcting Code (ECC) mode and Memory Mirroring mode enabled, the DRAM address may not be logged in RETRY_RD_ERR_LOG (Bus:30; Device: 6-5; Function 6-1; Offset 2F10h + [0...3 × 4h]) register.
Implication: Due to this erratum, software that relies upon the RETRY_RD_ERR_LOG register may not function as expected.
Workaround: None identified. A BIOS code change has been identified and may be implemented as a mitigation for this erratum.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF45. Cache Level Wrongly Reported in Machine Check Banks
Problem: When reporting a Machine check in the module level caches (IA32_MC1_STATUS, MSR 405H), a Compound Error Code of type Cache Hierarchy Error will be reported with a Level (LL) Sub-field of 0b10[L2] instead of 0b01[L1].
Implication: Due to this erratum, system software relying on this data may wrongly categorize the cache level in which the error was reported. The severity of the error will be reported accurately.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF46. System Hang When Page Request Message Issued From Discrete Device
Problem: A Page Request Message with a Last Page In Group (LPIG) field value of 0 issued from a discrete PCIe device to the processor may behave unexpectedly.
Implication: Due to this erratum, the system may hang. Root complex integrated devices are not affected by this erratum.
Workaround: None identified. It may be possible for the BIOS to contain a mitigation for this erratum.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF47. PLR and PEM PCS_PSTATE Not Asserted or Incremented
Problem: The PCS_PSTATE status (bit 29) in the Performance Limit Reasons (PLR) (Bus: 8; Device: 3; Function: 7; Offset: 12K+10×4×1-4d) and the Power and Performance Excursion Monitor (PEM) (Bus: 8; Device: 3; Function: 7; Offset: 44K+10×4×1-4d) will not be asserted when the Baseboard Management Controller (BMC) overrides the P-state and limits core frequency, and the PCS_PSTATE counter (ID 29) in the PEM will not be incremented and will read 0.
Implication: Due to this erratum, software that relies upon the PCS_PSTATE bits may behave unexpectedly.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF48. Some PECI Wire Commands May Not be Serviced
Problem: Every 65536th transaction over PECI wire, other than a Ping(), GetTemp(), or GetDIB() Service Command, may not be serviced.
Implication: Due to this erratum, PECI wire may incorrectly continue to return an unexpected Completion Code (CC) of 83h after retrying for 850ms and fail to return the read data or write the value.
Workaround: A fix for this erratum is available in PECI driver patch peci-wa_v2 or later.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF49. Unable to Access 32-bit Address MMIO Registers Out of Band
Problem: The processor logic incorrectly handles access requests from an out-of-band agent (such as BMC) to any MMIO registers with a 32-bit address.
Implication: Due to this erratum, out-of-band access to 32-bit address MMIO registers may result in a completion code of 90h (illegal command) with values of 0.
Workaround: None identified. It may be possible for the BIOS to contain a mitigation for this erratum.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF50. PCIe TLP May be Lost After Link Down Event
Problem: If the first PCIe TLP received from the endpoint after a link down or hot-add event is a TLP with an LCRC error, one TLP may be lost on the link due to an incorrect Sequence Number in the NAK DLLP.
Implication: Due to this erratum, the system may log a Completion Timeout Status of 1 (ERRUNCSTS.CTE; Bus: 29, 26, 4-1; Device: 9-2; Function: 0; Offset 104h; Bit 14) or Configuration Request Timeout of 1 (RPPIOSTS.CFGCTO; Bus: 29, 26, 4-1; Device: 9-2; Function: 0; Offset 1ACh; Bit 2), leading to a device reset, a device (surprise) warm reset, or a failure to add the device.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF51. Invalid BMC Frame Data During Reset Cycling
Problem: The I3C_MNG interface within the processor mistakenly uses a device's static address during AC reset cycling.
Implication: Due to this erratum, invalid BMC frame data or a GETPID command may cause the BMC to panic or crash, leading to a system hang.
Workaround: It may be possible for the BIOS to contain a workaround for this erratum.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF52. Unpredictable System Behavior May Occur When C6 or Deeper Sleep States Are Used
Problem: Under complex microarchitectural conditions, a core may encounter incorrect data when other cores in the system are entering Core C6 or deeper sleep states.
Implication: When this erratum occurs, unpredictable system behavior may be observed. Intel has only observed this behavior in a synthetic test environment.
Workaround: It may be possible for the BIOS to contain a workaround for this erratum.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF53. A Core May Hang When Entering or Exiting C6 or Deeper Sleep States
Problem: Under complex microarchitectural conditions involving two or more cores within a module simultaneously entering or exiting Core C6 or deeper sleep states, one or more of those cores may hang without an MCE being logged.
Implication: Due to this erratum, the system may hang. Intel has only observed this behavior in a synthetic test environment.
Workaround: It may be possible for the BIOS to contain a workaround for this erratum.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF54. Reserved(0) Check For a PASID Table Entry May Not Happen For a DMA Request
Problem: When a DMA operation encounters a PASID table entry in which any of the Reserved(0) bits b[95:91] are incorrectly set, the processor may fail to generate Intel VT-d fault SPT.3, may incorrectly generate Intel VT-d fault SPT.4, or may fail to block the DMA request.
Implication: Due to this erratum, a DMA request may not behave as expected when it encounters non-zero Reserved(0) bits in a PASID table entry. Intel has not observed this erratum with any commercially available software.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF55. Remapping Hardware With Major Version Number 6 Incorrectly Advertises The ESRTPS Support
Problem: Remapping hardware with Major Version Number 6 (VER_REG.MAJOR_VERSION_NUMBER = 6, VTDBAR offset 0x0, bits 7:4) enables Enhanced Set Root Table Pointer Support (ESRTPS), but CAP_REG.ESRTPS (VTDBAR offset 0x8, bit 63) is incorrectly reported as 0.
Implication: Due to this erratum, software may incorrectly determine that the ESRTPS feature is not supported.
Workaround: None identified. System software may implement the ESRTPS feature if VER_REG.MAJOR_VERSION_NUMBER = 6.
Status: For the steppings affected, see the Summary Tables of Changes.
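The detection rule suggested for SRF55 can be sketched as follows. This is a minimal illustration that assumes system software has already read the raw 64-bit values at VTDBAR offsets 0x0 and 0x8. The function name and parameter names are hypothetical.

```python
def esrtps_supported(ver_reg, cap_reg):
    """Decide whether ESRTPS can be used, tolerating SRF55.

    ver_reg: raw value read from VTDBAR offset 0x0.
    cap_reg: raw value read from VTDBAR offset 0x8.
    """
    major_version = (ver_reg >> 4) & 0xF   # bits 7:4
    esrtps_bit = (cap_reg >> 63) & 0x1     # bit 63
    # Trust the capability bit when set; otherwise fall back to the
    # major version check described in the SRF55 workaround text.
    return esrtps_bit == 1 or major_version == 6
```

Hardware that reports the capability bit correctly is unaffected by the fallback, because the version check is consulted only when the bit reads 0.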
SRF56. Remapping Hardware Will Not Report The PASID Value For RTA.2 Faults in Modes Other Than Scalable Mode
Problem: When Remapping Hardware encounters RTA.2 fault condition in modes other than Scalable Mode (RTADDR.TTM==01), the Fault Recording Register (FRCDH_REG_0_0_0_VTDBAR, offset 408h) will incorrectly report a value of 0 in the PASID Present (PP) field (bit 31) and in the PASID Value (PV) field (bits [59:40]).
Implication: Due to this erratum, software cannot rely on the PASID value for RTA.2 faults in modes other than Scalable Mode. Intel has not observed this erratum to impact the operation of any commercially available software.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF57. Remapping Hardware Does Not Perform a Reserved(0) Check in Interrupt Remap Table Entry
Problem: Remapping hardware does not perform Reserved(0) check on b[127:HAW+64] of the Interrupt Remap Table Entry for a Posted Interrupt.
Implication: Due to this erratum, system software that violates the Intel VT-d architecture requirement by programming non-zero reserved values in b[127:HAW+64] of the Interrupt Remap Table Entry for a Posted Interrupt may not fault on current processors but may fault on future processors. Intel has not observed this erratum with any commercially available system.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF58. Intel IAA Decompression Logic May Return Incorrect Values in Completion Record When All Source 1 Data is Dropped
Problem: When an Intel IAA device decompress operation has a non-zero Source 1 Size, and if the sum of Drop Initial Bits of Analytics Engine Configuration and State (AECS; offset 1DCh) and Ignore End Bits (Decompression Flags, bits [8:6]) is equal to Source 1 Size times 8, the operation correctly drops all the Source 1 data. However, if the operation also results in a recoverable output buffer overflow (Completion Record Status Code 0Bh), then the value of Bytes Completed in the Completion Record (bytes 4 to 7) or the value of Drop Initial Bits field in the AECS may be incorrect.
Implication: Due to this erratum, the incorrect values may cause the subsequent decompress operation (that is, the continuation of the same job) to produce an incorrect result.
Workaround: None identified. Before submitting the descriptor, software can avoid this erratum by checking if the sum of Drop Initial Bits and Ignore End Bits equals Source 1 Size times 8. In this case, setting Source 1 Size, Drop Initial Bits, and Ignore End Bits all to 0 will produce the correct result.
Status: For the steppings affected, see the Summary Tables of Changes.
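The pre-submission check described for SRF58 can be sketched as follows. This is a minimal illustration, not driver code. The function name is hypothetical, and the three parameters stand for the descriptor and AECS fields named in the erratum.

```python
def sanitize_decompress_params(src1_size, drop_initial_bits, ignore_end_bits):
    """Return (src1_size, drop_initial_bits, ignore_end_bits) safe to submit.

    If every bit of Source 1 would be dropped, submit an empty source
    instead, as suggested in the SRF58 workaround text.
    """
    all_bits_dropped = (
        src1_size != 0
        and drop_initial_bits + ignore_end_bits == src1_size * 8
    )
    if all_bits_dropped:
        return 0, 0, 0
    return src1_size, drop_initial_bits, ignore_end_bits
```

Software would call this before building the descriptor and use the returned values in place of the originals.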
SRF59. CPUID Returns Invalid Level Type
Problem: When a Trust Domain (TD) executes the CPUID instruction (leaf Bh, ECX[7:0]), a value of 0 is incorrectly returned instead of the sub-leaf index value, resulting in an invalid level type.
Implication: Due to this erratum, software that relies upon the value of bits [7:0] in the CPUID instruction leaf Bh may function incorrectly.
Workaround: It may be possible for the BIOS to contain a workaround for this erratum.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF60. Unexpected Recoveries With ASPM L1 in CXL Endpoint
Problem: When CXL link ASPM (L1) entry flow is initiated, the CXL endpoint requests L1 for CXL.IO but not for CXL.CM, which may cause an extra Power Management (PM) REQ ACK.
Implication: Due to this erratum, the CXL endpoint may observe unexpected link recoveries.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF61. PCIe Infinite Recovery Loop During Link Equalization
Problem: When link equalization fails at Gen5 or Gen4, and the endpoint disables that rate, the processor may unexpectedly attempt to recover at the link partner's advertised speed resulting in an infinite recovery loop.
Implication: Due to this erratum, the PCIe link may fail to train, preventing DL-Init, which causes the Data Link Layer Link Active (DLLLA, bit 13) in Link Status (LINKSTS, offset 52h) to remain unset.
Workaround: None identified. To mitigate this erratum, use the Requested Link Speed setting in BIOS to disable Gen5 or Gen4 on each affected port so that the port trains at Gen4 or Gen3.
Status: For the steppings affected, see the Summary Tables of Changes.
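As an illustration of how software might detect the symptom of SRF61, the sketch below tests the DLLLA bit in a raw LINKSTS value. It assumes the 16-bit register at offset 52h has already been read. The constant and function names are hypothetical.

```python
LINKSTS_DLLLA_BIT = 13  # Data Link Layer Link Active, per the erratum text


def data_link_active(linksts):
    """Return True when DLLLA is set in a raw 16-bit LINKSTS value."""
    return bool((linksts >> LINKSTS_DLLLA_BIT) & 0x1)
```

A port on which this stays False after training has been attempted may be a candidate for the Requested Link Speed mitigation, although other failures can also leave the bit clear.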
SRF62. Incorrect B2CMI MCi_STATUS_SHADOW.CORRCOUNT Value
Problem: During memory mirror failover, the value in Bridge to Converged Memory Interface (B2CMI) MCi_STATUS_SHADOW.CORRCOUNT Offset 1C0h (bits 52:38) may not match the CORRECTED_ERROR_COUNT register values in IA32_MCi_STATUS (bits 52:38).
Implication: When this erratum occurs, software that relies upon the B2CMI MCi_STATUS_SHADOW.CORRCOUNT register may not function correctly.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF63. Incorrect RAPL PPL1 Limit
Problem: Platform Running Average Power Limit (RAPL) Platform Power Limit1 (PPL1) is clipped, resulting in the value of MAX_PPL2 (bits 48:32; Offset 665h) being used instead of MAX_PPL1 (bits 16:0; Offset 665h).
Implication: Due to this erratum, the maximum value of PPL2 may restrict PPL1, leading to an incorrect RAPL.
Workaround: It may be possible for the BIOS to contain a workaround for this erratum.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF64. Incorrect Values For CHA ALL0 And CHA ALL1
Problem: For Out-of-Band access, the BDFs for (CHA ALL0 Bus 31; Device 14-0; Function 7-0) and (CHA ALL1 Bus 30; Device 28-14; Function 7-0) PCI Configuration Space Registers are incorrectly mapped.
Implication: Due to this erratum, a BMC that relies upon the values in CHA ALL0 or CHA ALL1 may function incorrectly.
Workaround: None identified. To access the CHA ALL0 PCI Configuration Space Register, the BMC needs to use (Bus 30; Device 28-14; Function 7-0). Additionally, to access the CHA ALL1 PCI Configuration Space Register, the BMC needs to use (Bus 31; Device 14-0; Function 7-0).
Status: For the steppings affected, see the Summary Tables of Changes.
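The swapped addressing described for SRF64 can be captured in a small lookup, sketched below. The table and function names are hypothetical. Only the bus and device ranges stated in the erratum are encoded, and no per-device correspondence between the two ranges is assumed.

```python
# Documented locations: block name -> (bus, lowest device, highest device).
DOCUMENTED_BDF = {
    "CHA_ALL0": (31, 0, 14),
    "CHA_ALL1": (30, 14, 28),
}

# For Out-of-Band access the two blocks are reached at each other's location.
_SWAPPED = {"CHA_ALL0": "CHA_ALL1", "CHA_ALL1": "CHA_ALL0"}


def oob_bdf_range(block):
    """Return the (bus, dev_low, dev_high) a BMC should use for the block."""
    return DOCUMENTED_BDF[_SWAPPED[block]]
```

Functions 7-0 are unchanged in both cases, so they are omitted from the table.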
SRF65. Boot Failure During BIOS Update With Missing CHA RTID Table
Problem: A Global Reset following an early warm reset during a BIOS update may fail to clear CPLD SRAM on the non-legacy socket prior to ISCLK provisioning and may result in the CPLD becoming unsynchronized.
Implication: Due to this erratum, a boot failure may occur, resulting in a missing CHA Request Transaction ID (RTID) table in the crashdump.
Workaround: It may be possible for the BIOS to contain a workaround for this erratum.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF66. CXL Mode Incorrectly Identifies ASPM L1 Aborts
Problem: In Compute Express Link (CXL) mode, the processor may incorrectly identify Active State Power Management (ASPM) L1 aborts as errors when the ARBMUX requests L1 and transitions to an Active State before the LTSSM reaches L1, resulting in IBSTERRRCRVSTS.RECOVCNT (bits 30:16; Offset 4E4h) overcounting.
Implication: Due to this erratum, software that relies upon the IBSTERRRCRVSTS.RECOVCNT bits may not function as expected.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF67. System Hang During Sideband And Traffic To B2CXL
Problem: During the transmission of sideband traffic (both posted and non-posted) to the Bridge to CXL (B2CXL) sideband endpoint, the Intel On-Chip System Fabric - Side Band (IOSF-SB) endpoint may experience stalling due to the IP agent logic lacking an additional check for processing new back-to-back messages.
Implication: Due to this erratum, the system may hang.
Workaround: It may be possible for the BIOS to contain a workaround for this erratum.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF68. SYS_RESET_N Trigger May Lead to MCA_GPSB_TIMEOUT
Problem: When the SYS_RESET_N pin is triggered before the reset flow phase 5 completes, an unexpected microcode timeout may occur.
Implication: Due to this erratum, the system may hang with MCA_GPSB_TIMEOUT (IA32_MCi_STATUS.MCACOD=402h and IA32_MCi_STATUS.MSCOD=0B00h).
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF69. Intel® OOBM Services Module RDIAMSREX Command Unable to Access MSRs Under IERR Conditions
Problem: Under an IERR condition, the Intel® Out-of-Band Management Services Module (Intel® OOBM Services Module) RDIAMSREX command is unable to access Model Specific Registers (MSRs) and will return a completion code of 93h.
Implication: Due to this erratum, software that relies on the RDIAMSREX command under IERR conditions may not behave as expected. The RDIAMSR command is not impacted by this erratum.
Workaround: It will be possible for the BIOS to contain a workaround for this erratum.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF70. Performance Monitoring Event For Memory Bound Stalls May Undercount
Problem: The Performance Monitoring events, MEM_BOUND_STALLS_LOAD (EventID: 34h) and MEM_BOUND_STALLS_IFETCH (EventID: 35h), and their subevents, will undercount the number of cycles of core initiated requests with latencies that exceed 256 cycles. A CMASK value of 255 may be used to count instances of this erratum.
Implication: Due to this erratum, software monitoring the events for Memory Bound Stalls may undercount.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF71. PCIe Root Port May Not Reduce Link Width
Problem: During a PCIe Link Width Degrade event on PCIe devices, the root port may not successfully transition to reduced link width due to a timeout in the PCIe Configuration.Linkwidth.Accept to Configuration.Lanenum.Wait state.
Implication: Due to this erratum, the PCIe root port may encounter unexpected recoveries, unexpected link width degrade attempts, uncorrectable errors, Surprise Link Down, and LTSSM errors.
Workaround: None identified.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF72. PMON Unit Control Unfreeze May Cause COUNTERVALUE Registers to Overcount
Problem: When the PMON Unit Control (PMONUNITCTRL) FREEZECOUNTERS (Offset: 2800h; Bit: 0) is set to 0 while the PMON GLOBALPMONFREEZE register is set to 1, the unit control may unexpectedly override the global control and unfreeze the PMONCNTR_0,1,2,3,4 COUNTERVALUE (Offsets 2808h, 2810h, 2818h, 2820h, 2828h; Bits [47:0]).
Implication: Due to this erratum, software that relies on the value of COUNTERVALUE may observe higher than expected values.
Workaround: None identified. Software should use the unit control unfreeze only when the global control freeze is set to 0.
Status: For the steppings affected, see the Summary Tables of Changes.
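The guard recommended for SRF72 can be sketched as follows. This is a minimal illustration in which the caller supplies the current PMONUNITCTRL value and the state of the global freeze. The function name and the global_freeze_set parameter are hypothetical, since this document does not give the bit position of the global control.

```python
FREEZECOUNTERS_BIT = 0  # PMONUNITCTRL bit 0, per the erratum text


def unit_ctrl_after_unfreeze(unit_ctrl, global_freeze_set):
    """Return the PMONUNITCTRL value software should write.

    FREEZECOUNTERS is cleared only when the global freeze is not set,
    so the unit control never overrides an active global freeze.
    """
    if global_freeze_set:
        return unit_ctrl  # leave the unit frozen while the global freeze holds
    return unit_ctrl & ~(1 << FREEZECOUNTERS_BIT)
```

Software that must unfreeze a unit while the global freeze is set would first clear the global freeze and then call this helper.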
SRF73. HPR_CAUSE0 is Not Cleared With a Global Reset or Wake Event
Problem: When a global reset or wake from an Sx state occurs, the HPR_CAUSE0 register (PWRMBASE Offset 192Ch) is not cleared.
Implication: Due to this erratum, software that relies on the HPR_CAUSE0 register may function incorrectly.
Workaround: It may be possible for the BIOS to contain a workaround for this erratum.
Status: For the steppings affected, see the Summary Tables of Changes.
SRF74. System Reset May Incorrectly Overwrite Previous Crashlog
Problem: When the ENABLETRIGGERONCE bit is set to '1' (CRASHLOG_CTRL Bus 8; Device 3; Function 0; Offset 1B8h; bit [6] or MSM_BIOS_CRASHCONTROL Bus 8; Device 3; Function 0; Offset 158h; bit [6]) and a crashlog is triggered on a system reset, an additional crashlog may be incorrectly triggered on a subsequent system reset prior to the CRASHLOG_CTRL.REARMTRIGGER or MSM_BIOS_CRASHCONTROL.REARMTRIGGER bits being set to '1'.
Implication: As a result of this erratum, data collected during a previous crashlog reporting may be overwritten.
Workaround: It may be possible for the BIOS to contain a workaround for this erratum.
Status: For the steppings affected, see the Summary Tables of Changes.