Errata Details

4th Gen Intel Xeon Scalable Processors Codename Sapphire Rapids

NDA Specification Update

ID 772415

Date 04/19/2023

SPR1. IPSR May Not Function Correctly

Problem: Poison created within the internal memory controller may not be logged with the correct system address when operating in the following modes of operation: (1) clock gating enabled, (2) channel XOR enabled, and (3) Intel® Optane™ two-level memory.

Implication: Due to this erratum, software that relies upon the Internal Poison Source Register (IPSR) bit may function incorrectly.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR2. Poison Data Reported Instead of a CS Limit Violation

Problem: Under complex microarchitectural conditions, when poisoned data resides at an address that violates the CS (code segment) limit, a poison MCE may be signaled and logged in the IA32_MC0_STATUS MSR (MSR 401h, MCACOD 150h) instead of a CS limit violation.

Implication: Due to the erratum, the processor may signal an MCE, rather than a higher-priority CS limit violation.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR3. Monitor Instructions to Legacy VGA Region May Fail

Problem: Monitor instructions that target the legacy VGA region (A0000h - BFFFFh) may generate a Machine Check Exception (MCE) in Cache Home Agent (CHA) Machine Check Banks (Banks 9, 10, and 11) MCi_STATUS MSRs (425h, 429h, or 42Dh) with a System Address Decode error (MSCOD = 05h).

Implication: Due to this erratum, a user application or Virtual Machine (VM) guest that is allowed to use MONITOR or UMONITOR instructions to the legacy VGA region may generate a fatal machine check exception.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR4. TILEDATA State May Be Saved Incorrectly

Problem: If execution of XRSTOR or XRSTORS causes a fault or a VM exit, a subsequent execution of XSAVE, XSAVEC, XSAVEOPT, or XSAVES instructions may incorrectly save the TILEDATA state component as all zeroes. This will occur only if the execution of XRSTOR or XRSTORS is attempting to set the TILECFG state component to its initial configuration and to restore the TILEDATA state component from the XSAVE area in memory.

Implication: Due to this erratum, the data saved in the XSAVE area for the TILEDATA state may be incorrect.

Workaround: None identified. Following an execution of XRSTOR or XRSTORS that causes a fault or a VM exit, software should not use the TILEDATA state component saved by a subsequent execution of XSAVE, XSAVEC, XSAVEOPT, or XSAVES that occurs before re-executing the original instruction (after addressing the cause of the fault or VM exit).

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR5. A Poison Data Event May Not be Serviced if a Data Breakpoint Occurs on an Intel AMX Tile-Load or Intel AVX Gather or REP MOVS Instruction

Problem: Under complex microarchitectural conditions, when both data poison and data breakpoint events happen on an Intel AMX Tile-Load or Intel AVX Gather or REP MOVS instruction, one of the events may not be signaled.

Implication: Due to this erratum, either a data breakpoint or a poison data event may not be signaled.

Workaround: It may be possible for BIOS to contain a mitigation for this erratum. When applying the mitigation a collision between a poison and a data breakpoint will result in skipping the data breakpoint.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR6. IFS MSRs Will Ignore a Non-Zero EDX Value And Not Signal a #GP

Problem: A WRMSR instruction to one of the IFS (In-Field Scan) MSRs, executed outside 64-bit mode with a non-zero value in EDX, will not trigger a general-protection (#GP) fault. This affects COPY_SCAN_HASHES (MSR 2C2h) and AUTHENTICATE_AND_COPY_CHUNK (MSR 2C4h).

Implication: Due to this erratum, IFS software that runs outside 64-bit mode and attempts the above WRMSR with a non-zero value in EDX will not receive a #GP; the processor will instead use a linear address that ignores the EDX value.

Workaround: None identified. Software should make sure EDX is clear for the above instructions when not running in 64-bit mode.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR7. Processor May Signal Spurious #GP Fault

Problem: A processor that supports greater than 48-bit physical addressing (CPUID.80000008:EAX[7:0]), operates in Long Mode with 48-bit addressing, and maps the PRMRR region above 128TB may generate a spurious #GP fault.

Implication: Due to this erratum, a #GP fault may be signaled when software accesses physical addresses greater than 128TB. Intel has not observed this erratum in any commercially available software.

Workaround: None identified. BIOS should configure PRMRR region to be below 128TB.

Status: For the steppings affected, refer to the Summary Tables of Changes.
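As an illustration of the SPR7 guidance, the following minimal sketch checks whether a candidate PRMRR range lies entirely below 128TB. The function name and its parameters are hypothetical; how BIOS actually programs PRMRR is outside the scope of this document.

```python
TB = 1 << 40
LIMIT_128TB = 128 * TB  # 2**47, the boundary named in SPR7


def prmrr_below_128tb(base: int, size: int) -> bool:
    """Return True if the range [base, base + size) ends at or below 128TB.

    `base` and `size` are hypothetical inputs in bytes.
    """
    if base < 0 or size <= 0:
        raise ValueError("base must be non-negative and size must be positive")
    return base + size <= LIMIT_128TB
```

A firmware validation script could call this on the planned PRMRR base and size and reject any layout for which it returns False.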

SPR8. A Break Point May be Hit Twice When a VM Exit Without Commit Occurs

Problem: When a VM exit occurs without a commit in the middle of guest exception handling, the RF flag (bit 17 of EFLAGS) is cleared. If a code breakpoint was configured on the VM entry instruction, clearing the RF flag causes that code breakpoint to be serviced again on the VM entry back to the guest.

Implication: Due to this erratum, software may observe a code breakpoint twice on an instruction if a VM Exit without commit occurs.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR9. Faulted XRSTORS Instruction May Result in Unexpected X87 FTW Value

Problem: Under complex microarchitectural conditions, when a #GP fault (General Protection) happens on an XRSTORS instruction that attempts to INIT both x87 and UINTR states, the x87 FPU Tag Word (FTW) may result in an unexpected value.

Implication: Due to this erratum, the value of the FTW state may be incorrect.

Workaround: None identified. Software should rerun the XRSTORS instruction after handling the #GP.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR10. Error Conditions Detected During Cold Reset May Not be Cleared by Subsequent Warm Reset

Problem: Under certain microarchitectural conditions, if an IP_READY_TIMEOUT error (MCACOD = 0402h, MSCOD = 0100h) occurs following a cold reset, a subsequent warm reset may not clear the underlying error conditions, and another error may not be detected.

Implication: Due to this erratum, unpredictable system behavior may occur. Intel has only observed this erratum in a synthetic test environment.

Workaround: None identified. BIOS can detect the IERR and force a cold reset to bring the processor back to a known good state.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR11. DSA/IAX Does Not Log The E2E Prefix Bit And The Prefix-Type Bits in AERTLPPLOG1

Problem: For internal transactions that have a Process Address Space Identifier (PASID) TLP Prefix and have a TLP error, Data Streaming Accelerator (DSA) or In-Memory Analytics Accelerator (IAX) units do not correctly log the E2E prefix (Bus: 8; Device: 1; Function: 0; Offset: 138h; AERTLPPLOG1, bit[28]) and the prefix-type (AERTLPPLOG1 bits[27:24]).

Implication: Due to this erratum, an incorrect TLP prefix type may be logged in AERTLPPLOG1.

Workaround: None identified. As DSA and IAX only support TLPs with a PASID prefix, software should treat E2E prefix as 1 (AERTLPPLOG1[28]=1) and prefix-type as PASID (AERTLPPLOG1[27:24]=0001b).

Status: For the steppings affected, refer to the Summary Tables of Changes.
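The SPR11 workaround can be sketched as a small fix-up applied to the raw AERTLPPLOG1 value before software interprets it. The bit positions come from the erratum text; the function name is hypothetical, and reading the register itself is left to the caller.

```python
E2E_BIT = 1 << 28             # AERTLPPLOG1[28], E2E prefix
PREFIX_TYPE_SHIFT = 24        # AERTLPPLOG1[27:24], prefix type
PREFIX_TYPE_MASK = 0xF << PREFIX_TYPE_SHIFT
PASID_PREFIX_TYPE = 0b0001    # value named in the SPR11 workaround


def corrected_aertlpplog1(raw: int) -> int:
    """Return the 32-bit log value with E2E forced to 1 and type forced to PASID."""
    value = raw & 0xFFFFFFFF
    value |= E2E_BIT
    value = (value & ~PREFIX_TYPE_MASK) | (PASID_PREFIX_TYPE << PREFIX_TYPE_SHIFT)
    return value
```

All other bits of the register pass through unchanged, so the remaining prefix fields are still taken from hardware.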

SPR12. The Processor May Drop Noncompliant Posted Peer-to-peer Transactions

Problem: If the processor receives a noncompliant posted PCIe* peer-to-peer transaction with non-zero upper tag bits [9:8], it may drop the transaction instead of forwarding it to the intended destination.

Implication: Due to this erratum, PCIe devices that perform peer-to-peer posted transactions may not operate as expected. Intel has not observed this erratum with any commercially available devices.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR13. Certain Bits in IA32_MC5_STATUS Register Will Always Return 0

Problem: Bits [36:34] of the IA32_MC5_STATUS register (MSR 415h) always return 0 on a read.

Implication: Due to this erratum, software that attempts to write a non-zero value to bits [36:34] in IA32_MC5_STATUS will always read a 0 on subsequent reads.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR14. Occupancy Interrupt Handle is Not Checked Against Interrupt Table Size

Problem: In the In-Memory Analytics Accelerator (IAX) the value of the Occupancy Interrupt Handle (OIH) is not checked against the interrupt table size.

Implication: Due to this erratum, if the OIH is programmed incorrectly by the host driver, an incorrect entry will be used, and an incorrect interrupt will be generated if the entry has the Mask bit cleared.

Workaround: None identified. Software must ensure OIH is programmed correctly.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR15. Processor May Incorrectly Set PFD Assisted in Correction Bit in Memory Controller

Problem: When RETRY_RD_ERR_LOG_ADDRESS1.FAILED_DEV = 9 (Register MEM_BAR[0-3], Offset: 22C58h; bits 5:0) and RETRY_RD_ERR_LOG_PARITY.PAR_SYN[63:48] = FFFFh (Register MEM_BAR[0-3], Offset: 22F08h; bits 63:48), the PFD assisted in correction bit (Register MEM_BAR[0-3], Offset: 22C58h; bit 30) may be incorrectly set.

Implication: Due to this erratum, software may believe that the corruption cannot be corrected by ECC algorithm alone, which may not be true. RETRY_RD_ERR_LOG_PARITY.PAR_SYN[63:48] is sufficient to identify this condition.

Workaround: None identified. If RETRY_RD_ERR_LOG_ADDRESS1.FAILED_DEV = 9 (Register MEM_BAR[0-3], Offset: 22C58h; bits 5:0) and RETRY_RD_ERR_LOG_PARITY.PAR_SYN[63:48] = FFFFh (Register MEM_BAR[0-3], Offset: 22F08h; bits 63:48), then ignore the value of 1 in the PFD assisted in correction bit (Register MEM_BAR[0-3], Offset: 22C58h; bit 30).

Status: For the steppings affected, refer to the Summary Tables of Changes.
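The SPR15 workaround amounts to a conditional mask on bit 30. The sketch below assumes software has already read the two registers into integers; the function and parameter names are hypothetical, while the field positions are those given in the erratum.

```python
FAILED_DEV_MASK = 0x3F        # RETRY_RD_ERR_LOG_ADDRESS1 bits 5:0
PFD_ASSIST_BIT = 1 << 30      # RETRY_RD_ERR_LOG_ADDRESS1 bit 30
PAR_SYN_HI_SHIFT = 48         # RETRY_RD_ERR_LOG_PARITY bits 63:48


def effective_pfd_assist(address1: int, parity: int) -> bool:
    """Return the PFD-assisted indication, ignoring it in the SPR15 condition."""
    failed_dev = address1 & FAILED_DEV_MASK
    par_syn_hi = (parity >> PAR_SYN_HI_SHIFT) & 0xFFFF
    if failed_dev == 9 and par_syn_hi == 0xFFFF:
        # Per the workaround, a 1 in bit 30 is not meaningful here.
        return False
    return bool(address1 & PFD_ASSIST_BIT)
```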

SPR16. DSA CMDSTATUS Register May Not Reflect Correct Hardware Status

Problem: After submitting a command to the Data Streaming Accelerator (DSA) CMD register (BAR0 offset 0xA0), a subsequent read to CMDSTATUS register (BAR0 offset 0xA8) may incorrectly see a CMDSTATUS.ACTIVE (bit 31) value of 0 before it has had a chance to change to 1.

Implication: Due to this erratum, software that relies upon the CMDSTATUS.ACTIVE bit may function incorrectly.

Workaround: None identified. Software that relies on the CMDSTATUS.ACTIVE bit should read this bit twice, discarding the first read result.

Status: For the steppings affected, refer to the Summary Tables of Changes.
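A minimal sketch of the SPR16 workaround follows. It assumes a caller-supplied `read32(offset)` function that performs a 32-bit read from DSA BAR0; that accessor and the function name are hypothetical, and the offset and bit position are taken from the erratum.

```python
from typing import Callable

CMDSTATUS_OFFSET = 0xA8       # BAR0 offset of CMDSTATUS
ACTIVE_BIT = 1 << 31          # CMDSTATUS.ACTIVE


def read_cmdstatus_active(read32: Callable[[int], int]) -> bool:
    """Read CMDSTATUS twice and report ACTIVE from the second read only."""
    read32(CMDSTATUS_OFFSET)  # first result is discarded per the workaround
    return bool(read32(CMDSTATUS_OFFSET) & ACTIVE_BIT)
```

Software polling for command completion would call this helper in its loop rather than testing a single raw read.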

SPR17. Remapping Hardware May Set Access/Dirty Bits in a First-stage Page-table Entry

Problem: When remapping hardware is configured by system software in scalable mode as Nested (PGTT=011b) and with PWSNP field Set in the PASID-table-entry, it may Set Accessed bit and Dirty bit (and Extended Access bit if enabled) in first-stage page-table entries even when second-stage mappings indicate that corresponding first-stage page-table is Read-Only.

Implication: Due to this erratum, pages mapped as Read-only in second-stage page-tables may be modified by remapping hardware Access/Dirty bit updates.

Workaround: None identified. System software enabling nested translations for a VM should ensure that there are no read-only pages in the corresponding second-stage mappings.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR18. System Software May Not Receive Intel® Virtualization Technology (Intel® VT) for Directed I/O (Intel® VT-d) Fault SPT.3 For Non-Zero Writes to b[191:HAW+128]

Problem: When the PASID Granular Translation Type (PGTT) field (bits 8:6) in the Scalable-Mode PASID Table has a value of 010b (Second-level) or 100b (Pass-through), and software writes a non-zero value to b[191:HAW+128], no Intel VT-d fault is generated.

Implication: Due to this erratum, system software may not receive Intel VT-d fault SPT.3 (fault reason 5ah) when software writes a non-zero value to b[191:HAW+128]. Intel has not observed any functional implications due to this erratum.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR19. APCTL.APNGE Should be RW Instead of RWS

Problem: Alternate Protocol Negotiation Global Enable (APNGE) field (bit 8) in Alternate Protocol Control register (APCTL) (Bus: 5-0; Device: 1,3,5,7; Function: 0; Offset: B28h) has been implemented as Sticky-Read-Write (RWS) but it should be Read-Write (RW).

Implication: Due to this erratum, the APCTL register is not cleared on warm reset, which violates the PCIe Base Specification version 5.0. Intel has not observed any functional implications from this erratum.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR20. CXL Device May Not Receive Viral

Problem: When the processor detects a Data Parity Error in a downstream packet, it may fail to transmit a Viral indication to a Compute Express Link (CXL) partner. However, the processor will put the CXL Link into LinkError state.

Implication: Due to this erratum, the CXL Link will go into the LinkError state instead of going into Viral.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR21. OOBMSM TSC Will be 320ns Behind The Globally Aligned Counter

Problem: The value in the time stamp counter (TSC) in OOBMSM (Bus: , Device: , Function: , Offset h) will be 320ns behind the globally aligned TSC (BDF).

Implication: Due to this erratum, the TSC value recorded in the OOB Crashlog will differ by 320ns from the globally aligned counter.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR22. Performance Monitoring Event Coherent_ops May Undercount

Problem: The performance monitoring events Coherent_Ops.RFO (Event: 10h, Umask: 08h) or Coherent_Ops.SPECITOM (Event: 10h, Umask:10h) or Coherent_Ops.WBMTOI (Event: 10h, Umask:40h) or Coherent_Ops.CLFLUSH (Event: 10h, Umask: 80h) may incorrectly undercount when multiple coherent requests occur simultaneously.

Implication: Due to this erratum, certain Coherent_ops events may undercount.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR23. PCIe* Link Re-Equalization May Not Occur if Link is in L1 State

Problem: Link re-equalization may not occur if the PCIe link is in the L1 state when software attempts to initiate link re-equalization (Bus: 5:0; Device: 8:1; Function: 0) by setting LINKCTL3.PE (bit 0, offset: A34h), setting LINKCTL2.TLS (bits 3:0, offset: 70h) to the data rate, and setting LINKCTL.RL (bit 5, offset: 50h).

Implication: Due to this erratum, link re-equalization may not occur and the link will retain the previous equalization coefficients.

Workaround: None identified. Software may disable ASPM L1 prior to initiating re-equalization (see LINKCTL.ASPMCTL), then re-enable ASPM L1 once re-equalization is complete.

Status: For the steppings affected, refer to the Summary Tables of Changes.
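The SPR23 sequence can be sketched as follows. This is an illustration only: `read16`, `write16`, and `trigger_reequalization` are hypothetical callbacks supplied by the caller, and the position of the L1 enable within ASPMCTL (bit 1 of LINKCTL) is an assumption taken from the PCIe Base Specification's Link Control register layout rather than from this document.

```python
from typing import Callable

LINKCTL_OFFSET = 0x50         # LINKCTL offset given in SPR23
ASPM_L1_ENABLE = 1 << 1       # assumed: ASPM Control bit 1 enables L1 entry


def reequalize_with_l1_disabled(
    read16: Callable[[int], int],
    write16: Callable[[int, int], None],
    trigger_reequalization: Callable[[], None],
) -> None:
    """Disable ASPM L1, run the caller's re-equalization, then restore L1."""
    original = read16(LINKCTL_OFFSET)
    write16(LINKCTL_OFFSET, original & ~ASPM_L1_ENABLE & 0xFFFF)
    try:
        trigger_reequalization()
    finally:
        # Restore only the L1 enable to its prior state; keep other bits current.
        current = read16(LINKCTL_OFFSET)
        restored = (current & ~ASPM_L1_ENABLE & 0xFFFF) | (original & ASPM_L1_ENABLE)
        write16(LINKCTL_OFFSET, restored)
```

The `try`/`finally` keeps ASPM L1 from being left disabled if the re-equalization callback raises.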

SPR24. Machine Check Bank 4 UCNA Errors May Not be Signaled

Problem: When any UC error is not enabled in machine check bank 4 due to its associated bit being 0 in IA32_MC4_CTL (MSR 410h), and the disabled UC error and a UCNA error happen simultaneously, the UC error will be logged with overflow set, but the UCNA error may not be signaled.

Implication: Due to this erratum, when UC errors are disabled in bank 4, UCNA errors may not be signaled.

Workaround: None identified. Software should keep MCAs enabled in IA32_MC4_CTL.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR25. DSA/IAA Use of Priv and PASID

Problem: Data Streaming Accelerator (DSA) and Intel Analytics Accelerator (IAA) do not support concurrent use of user (Priv=0) and supervisor (Priv=1) privileged operations using the same PASID.

Implication: Due to this erratum, if both user-privileged and supervisor-privileged operations are used with the same PASID, the DSA/IAA behavior is undefined. Intel has not observed this erratum to affect any commercially available software.

Workaround: None identified. Use distinct PASIDs for user-privileged operations and supervisor-privileged operations.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR26. Reserved(0) Check For a PASID Table Entry May Not Happen For a DMA Request

Problem: When a DMA Operation encounters any Reserved(0) bits b[95:91] of a PASID table entry as incorrectly Set, the processor may fail to generate Intel VT-d fault SPT.3, may incorrectly generate Intel VT-d fault SPT.4, or fail to block the DMA request.

Implication: Due to this erratum, a DMA request may not behave as expected when it encounters Reserved(0) bits that are Set in a PASID table entry. Intel has not observed this erratum with any commercially available software.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR27. Remapping Hardware May Not Generate a Page Request Group Response Message While Operating in Legacy Mode or Abort DMA Mode

Problem: Remapping hardware may not generate a Page Request Group Response Message while operating in Legacy mode or Abort DMA mode if a PCIe device generates a Page Request Message.

Implication: Due to this erratum, the failure of the remapping hardware to generate a Page Request Group Response Message may lead to unpredictable device behavior, including a device hang. The remapping hardware will continue to report RTA.3 or RTA.4 faults if it receives these Page Request Group Response Messages. Intel has only observed this behavior in a synthetic test environment.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR28. Remapping Hardware May Abort ZLR to Second-Stage Write Only Pages

Problem: If a Zero-Length Read (ZLR) encounters a read-only page in first-stage tables and a write-only page in second-stage tables, remapping hardware will report a non-recoverable Intel VT-d fault and cause the ZLR to be aborted.

Implication: Due to this erratum, a device may observe an unexpected abort on a ZLR and an Intel VT-d fault may be indicated. Intel has not observed this erratum with any commercially available software.

Workaround: None identified. System software should not create write-only pages in second-stage page tables.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR29. Remapping Hardware with Major Version Number 6 Incorrectly Advertises the ESRTPS Support

Problem: Remapping hardware with Major Version Number 6 (VER_REG.MAJOR_VERSION_NUMBER = 6, VTDBAR offset 0x0, bits 7:4) enables Enhanced Set Root Table Pointer Support (ESRTPS), but CAP_REG.ESRTPS (VTDBAR offset 0x8, bit 63) is incorrectly reported as 0.

Implication: Due to this erratum, software may incorrectly determine that the ESRTPS feature is not supported.

Workaround: None identified. System software can implement ESRTPS feature if VER_REG.MAJOR_VERSION_NUMBER = 6.

Status: For the steppings affected, refer to the Summary Tables of Changes.
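One way to express the SPR29 workaround is to treat ESRTPS as available when either the capability bit is set or the major version is 6. The sketch below assumes software has already read VER_REG and CAP_REG from VTDBAR into integers; the function name is hypothetical.

```python
MAJOR_VERSION_SHIFT = 4       # VER_REG bits 7:4
MAJOR_VERSION_MASK = 0xF
CAP_ESRTPS_BIT = 1 << 63      # CAP_REG bit 63


def esrtps_supported(ver_reg: int, cap_reg: int) -> bool:
    """Return True if ESRTPS may be used, accounting for SPR29."""
    major = (ver_reg >> MAJOR_VERSION_SHIFT) & MAJOR_VERSION_MASK
    return bool(cap_reg & CAP_ESRTPS_BIT) or major == 6
```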

SPR30. Platform May Hang if System Software Sends a Page Group Response or DevTLB Invalidation to Non-existent Requester ID

Problem: When system software submits a Page Group Response or DevTLB Invalidation command to remapping hardware, the remapping hardware forwards commands to Root-Complex so that the Root-Complex may route the command to Requester ID specified by system software. If system software specifies a Requester ID in the command that does not exist on the platform, the command is not correctly aborted and may cause the system to hang.

Implication: Due to this erratum, if system software issues a Page Group Response or DevTLB Invalidation towards a Requester ID that does not exist on the platform, the system may hang. Intel has only observed this behavior in a synthetic test environment.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR31. Remapping Hardware Does Not Perform Reserved (0) Check in Page Response Descriptor

Problem: Remapping hardware will not set the Invalidation Queue Error field in the Fault Status Register (VTDBAR offset 0x34) when software writes a non-zero value in bits[255:128] and bit[5] of the Page Response descriptor.

Implication: Due to this erratum, system software that violates the Intel VT-d architecture requirement by programming non-zero values in bits[255:128] and bit[5] of the Page Response descriptor may not fault on current processors but may fault on future processors. Intel has not observed this erratum with any commercially available system.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR32. Remapping Hardware Implements b[31:16] of the three Event Data Registers (VTDBAR offsets 0x3C, 0xA4, and 0xE4) as Read-Writable

Problem: b[31:16] of the three Event Data registers (VTDBAR offsets 0x3C, 0xA4, and 0xE4) are “Reserved and Zero” (RsvdZ) but are implemented as Read-Writable (RW).

Implication: Due to this erratum, system software may write non-zero values to bits[31:16] of these registers. Intel has not observed this erratum to impact the operation of any commercially available software.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR33. IAA Does Not Report Overlap Errors For AECS Size of 2GB or Greater

Problem: Intel In-Memory Analytics Accelerator (IAA) does not report overlap errors for the AECS (Analytics Engine Configuration and State) within the source or destination data when a descriptor is submitted with an AECS Size of 2GB or greater.

Implication: Due to this erratum, software may not be able to rely on the accuracy of the output data if the output buffer overlaps the AECS buffer. Intel has only observed this behavior in a synthetic test environment.

Workaround: None identified. Software should not use an AECS size greater than or equal to 2 GB.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR34. DSA/IAA Invalid TC Not Reported in The SWERROR Register

Problem: An Intel® Data Streaming Accelerator (DSA)/Intel® In-Memory Analytics Accelerator (IAA) completion record written using an incorrectly configured Traffic Class (TC) will be written using TC0.

Implication: Due to this erratum, an incorrectly configured TC for the completion record is not reported in the SWERROR register.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR35. IAA Unaligned Completion Record Address Error is Not Reported in SWERROR Register

Problem: When an Intel® In-Memory Analytics Accelerator descriptor is submitted with an unaligned Completion Record Address, the completion record is written to the aligned address (ignoring the lower address bits). The Status byte at the Completion Record Address specified in the descriptor will be written as 0, making it appear to software that the descriptor never completed.

Implication: Due to this erratum, the unaligned Completion Record Address error is not reported in the SWERROR register and unpredictable system behavior may occur.

Workaround: None identified. Software should use properly aligned Completion Record Addresses.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR36. Intel® UPI Link Not Resetting When L1 Mismatch Occurs Between Local and Remote Sockets

Problem: When Intel® UPI L1 low power link state is disabled on the local socket and a remote socket requests L1 entry, a link reset should be initiated. However, the local socket does not initiate a link reset and replies with negative acknowledgement while remaining in current link state.

Implication: Due to this erratum, the Intel® UPI link reset does not occur. There are no known functional implications due to this erratum. Intel has only observed this behavior in a synthetic test environment.

Workaround: None identified. System Software should configure all links to the same power state.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR37. DSA/IAA May Fail to Log an MDPE Error For Back-to-Back Parity Errors

Problem: For a read completion with Error Poisoned set that is preceded back-to-back by a write with an IOSF data parity error, Intel® Data Streaming Accelerator (DSA) and Intel® In-Memory Analytics Accelerator (IAA) may fail to set the Master Data Parity Error (MDPE, bit 8) in PCI Status registers (IAA Bus: system design dependent, Device: 2, Function: 0; Offset: 6h, DSA Bus: system design dependent, Device: 1, Function: 0; Offset: 6h).

Implication: Due to this erratum, the PCI Status MDPE bit may not be set. Software that uses this bit may not function as expected.

Workaround: None identified. Software should use PCIe Advanced Error Reporting rather than PCI legacy error logging.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR38. Relaxed Ordering Not Disabled by DEVCTL.ERO bit for DSA/IAA Upstream Transactions

Problem: The Enable Relaxed Ordering bit of the PCIe Device Control configuration register (DEVCTL.ERO; In-Memory Analytics Accelerator (IAA) Bus: system design dependent, Device: 2, Function: 0, Offset: 48h, Bit 4; Intel® Data Streaming Accelerator (DSA) Bus: system design dependent, Device: 1, Function: 0, Offset: 48h, Bit 4) does not disable relaxed ordering in DSA/IAA for upstream writes.

Implication: Due to this erratum, for peer-to-peer traffic, writes from DSA/IAA can show up on a PCIe link with RO=1 even though DEVCTL.ERO is set to '0'. Intel has not observed any functional issues as a result of this erratum.

Workaround: None identified. Software requiring Strict Ordering can set the Strict Ordering (SO) flag in the descriptors to '1' in order to enforce Strict Ordering for upstream writes.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR39. System Address Logged For WDB Parity Errors May be Incorrect

Problem: When target XOR enable bit [21] MCMTR.CLUSTER_XOR_ENABLE (MEM0_BAR; Offset: 20EF8h) or channel XOR enable bit [20] MCMTR.CHANNEL_XOR_ENABLE (MEM0_BAR; Offset: 20EF8h) is set or clock gating disable bit [28] [DDRT_CLK_GATING.DIS_REVADDR_LOG_CLKGATING (MEM0_BAR; Offset 21514h)] is not set, the IMC0_POISON_SOURCE (MEM0_BAR; Offset 20E80h) register may log Write Data Buffer/Byte Enable (WDB/BE) Register File parity errors with an incorrect system address.

Implication: Due to this erratum, the IMC0_POISON_SOURCE register may log the incorrect system address when WDB_PARITY_ERR = 1 in IMC0_POISON_SOURCE.

Workaround: None identified. Software may avoid this erratum by disabling clock gating (DDRT_CLK_GATING.DIS_REVADDR_LOG_CLKGATING = 1), disabling target XOR (MCMTR.CLUSTER_XOR_ENABLE = 0), and disabling channel XOR (MCMTR.CHANNEL_XOR_ENABLE = 0).

Status: For the steppings affected, refer to the Summary Tables of Changes.
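The SPR39 exposure condition can be checked from the two register values. The sketch below assumes MCMTR and DDRT_CLK_GATING have already been read into integers; the function name is hypothetical and the bit positions are those listed in the erratum.

```python
CLUSTER_XOR_ENABLE = 1 << 21          # MCMTR bit 21 (target XOR)
CHANNEL_XOR_ENABLE = 1 << 20          # MCMTR bit 20
DIS_REVADDR_LOG_CLKGATING = 1 << 28   # DDRT_CLK_GATING bit 28


def wdb_address_may_be_incorrect(mcmtr: int, ddrt_clk_gating: int) -> bool:
    """Return True if the configuration matches the SPR39 problem condition."""
    xor_enabled = bool(mcmtr & (CLUSTER_XOR_ENABLE | CHANNEL_XOR_ENABLE))
    clock_gating_active = not (ddrt_clk_gating & DIS_REVADDR_LOG_CLKGATING)
    return xor_enabled or clock_gating_active
```

Error-handling software could use a False result as the condition for trusting the system address logged alongside WDB_PARITY_ERR = 1.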

SPR40. Incorrect MCACOD For L2 MCE

Problem: Under complex microarchitectural conditions, an L2 poison MCE that should be reported with MCACOD 189h in IA32_MC3_STATUS MSR (MSR 40dh, bits [15:0]) may be reported with an MCACOD of 101h.

Implication: Due to this erratum, the reported MCACOD for this MCE may be incorrect.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR41. System May Hang Due to Full LLRB

Problem: The processor may incorrectly fill the Link-Level Retry Buffer (LLRB) with Non-Ack Bearing Flits.

Implication: Due to this erratum, if both the processor and the link partner fill their respective LLRBs with Non-Ack Bearing Flits, the system may hang. Intel has only observed this erratum in a synthetic test environment.

Workaround: None identified. This erratum can be mitigated by configuring the LLRB to its maximum size and minimizing link latency.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR42. IAA May Fail to Properly Decode Data With a Large Header

Problem: If the In-Memory Analytics Accelerator (IAA) receives a header that is greater than 256B in size, it may flag a decompression error in the completion record or may incorrectly decompress the data, which will cause a mismatch between the original data CRC and the CRC in the completion record.

Implication: Due to this erratum, software may receive an unexpected data decompression failure. Intel has only observed this erratum in a synthetic test environment.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR43. Memory Controller Violates JEDEC RCD tCSALT Timing

Problem: The processor may violate tCSALT timing (as specified in JEDEC DDR5RCD01 Specification Rev 1.1, Section 8.6.2) by issuing either a Power Down Entry (PDE) or Power Down Exit (PDX) command during the tCSALT window.

Implication: Due to this erratum, Register Clock Drivers that receive PDE/PDX commands during the tCSALT window may not operate as expected.

Workaround: It may be possible to mitigate this issue with a BIOS code change.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR44. Wrong CKE Signal Used on 1 DPC 3DS 4H Configs

Problem: For the specific memory configuration using 1 DIMM per channel (DPC) and 4H 3DS DIMMs, the processor will not correctly assert CKE (Clock Enable) during PPD (Precharge Power Down) mode violating section 4.10.1 of the JEDEC specification revision 1.85.

Implication: Due to this erratum, when the memory subsystem enters PPD mode, the processor may experience uncorrectable memory errors.

Workaround: It may be possible for a BIOS code change to contain a workaround for this erratum.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR45. Address May Not be Logged For a UCR Error Detected in The MLC

Problem: An Uncorrected No Action Required (UCNA) error logged in machine check bank 3 with MC3_STATUS.MCACOD=0179h (MSR 40Dh, bits 15:0) may not include a valid address in MC3_ADDR (MSR 40Eh) when an ECC Uncorrected Recoverable (UCR) error is detected on an MLC (Mid-level cache) eviction.

Implication: Due to this erratum, the address of the poisoned data produced by the ECC UCR error in the MLC may not be available to software.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR46. Intel VT-d DMA Remapping Hardware May Hang if it Encounters Page Request Queue Overflow Condition

Problem: Intel VT-d DMA remapping hardware may stop processing new descriptors from the Invalidation Queue and/or stop responding to register reads when it encounters a Page Request Queue overflow condition as indicated by PRS_REG.PRO=1.

Implication: Due to this erratum, the system may hang and may not signal a Page Request Queue overflow fault.

Workaround: None identified. Software should size Page Request Queue to avoid overflow condition (PRS_REG.PRO=1). To determine the size of suitable Page Request Queue, software should sum up the value of the Outstanding Page Request Capacity register (Bus: 8-11; Device: 1; Function: 0; Offset: 248h) across all devices where Page Requests are enabled and increment by one. If the result is not a power of 2, then software should round to the nearest higher power of 2. System software that sends Page Response to the device before updating the Page Request Queue Head Register (PQH_REG) will require another doubling of the Page Request Queue Size to avoid overflow condition.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR47. Receiver Common Mode Input Impedance May be Below Specification When Interface is Powered Down

Problem: The processor may fail to meet receiver Common Mode Input Impedance as per PCIe Specification chapter 8.4.3 when the PCIe interface is powered down.

Implication: Due to this erratum, link partners may incorrectly detect the processor and initiate link training during platform reset. Intel has not observed any functional issues from this erratum when used with PCIe compliant link partners.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR48. Remapping Hardware Will Not Report The PASID Value For RTA.2 Faults in Modes Other Than Scalable Mode

Problem: When Remapping Hardware encounters RTA.2 fault condition in modes other than Scalable Mode (RTADDR.TTM==01), the Fault Recording Register (FRCDH_REG_0_0_0_VTDBAR, offset 408h) will incorrectly report a value of 0 in the PASID Present (PP) field (bit 31) and in the PASID Value (PV) field (bits 59:40).

Implication: Due to this erratum, software cannot rely on the PASID value for RTA.2 faults in modes other than Scalable Mode. Intel has not observed this sighting/erratum to impact the operation of any commercially available software.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR49. Remapping Hardware Does Not Perform a Reserved(0) Check in Interrupt Remap Table Entry

Problem: Remapping hardware does not perform Reserved(0) check on b[127:HAW+64] of the Interrupt Remap Table Entry for a Posted Interrupt.

Implication: Due to this erratum, system software that violates the Intel VT-d architecture requirement by programming non-zero reserved values in b[127:HAW+64] of an Interrupt Remap Table Entry for a Posted Interrupt may not fault on current processors but may fault on future processors. Intel has not observed this sighting/erratum with any commercially available system.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR50. Processor PCIe Root Port Link Spurious Data Parity Error May be Reported

Problem: When the processor's PCIe root port's link width is down-configured and then subsequently up-configured, the root port may log and report a spurious Local Data Parity Error on the lanes that were disabled and then re-enabled.

Implication: Due to this erratum, the Local data parity error may be observed on PCIe root port down-configured links in the 16.0 GT/s data parity status registers (Bus 5-0; Device 2; Function 0; Offset 10h/14h/18h). Intel has not observed any functional implications due to this erratum.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR51. Mismatch Between UboxErrMisc and MCI_STATUS Registers Error Logs

Problem: The logging in UboxErr Misc Registers (UboxErrMisc_CFG(Bus: 30; Device: 0; Function: 0; Offset ECh), UboxErrMisc2_CFG(Bus: 30; Device: 0; Function: 0; Offset E8h) and UboxErrMisc3_CFG (Bus: 30; Device: 0; Function: 0; Offset F4h)) and IA32_MC6_STATUS (Offset 419h) may be related to different events when a poisoned MMIO transaction and a poisoned Interrupt transaction occur concurrently due to differences in priority logic for logging into the MCI_STATUS register and logging into the UboxErrMisc registers.

Implication: Due to this erratum, the UboxErrMisc registers may show information for a different transaction than the one logged in MCI_STATUS.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR52. CHA UCNA Errors May be Incorrectly Controlled by MCi_CTL Enable Bits

Problem: UCNA (Uncorrectable No Action) errors reported in Cache Home Agent (CHA) Machine Check Banks (Banks 9, 10, and 11) MCi_STATUS MSRs (425h, 429h, or 42Dh) may be incorrectly controlled by the associated MCi_CTL MSRs (424h, 428h, or 42Ch).

Implication: Due to this erratum, when MCi_CTL = 0, the UCNA error will be logged but not signaled. When MCi_CTL = FFFFFFFFFh, the UCNA error will be logged and signaled, but will incorrectly set MCi_STATUS.EN (bit 60). Intel has not observed this erratum to affect any commercially available software.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR53. Reading The PPERF MSR May Not Return Correct Values

Problem: Under complex microarchitectural conditions, a RDMSR instruction to Productive Performance (MSR_PPERF) MSR (Offset 64eh) may not return correct values in the upper 32 bits (EDX register) if Core C6 is enabled.

Implication: Due to this erratum, software may experience a non-monotonic value when reading the MSR_PPERF multiple times.

Workaround: None identified. Software should not rely on the upper bits of the MSR_PPERF when core C6 is enabled.
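
One way to follow this guidance is to compute deltas from the lower 32 bits only. The sketch below is illustrative: the function name and the raw 64-bit sample arguments are hypothetical, and it assumes the lower 32 bits wrap at most once between two samples.

```python
MASK32 = 0xFFFFFFFF

def pperf_delta_low32(prev_raw, curr_raw):
    """Delta between two MSR_PPERF samples using only bits 31:0 (EAX).

    The upper 32 bits (EDX) of each sample are discarded, so an incorrect
    upper half does not affect the result.
    """
    prev_lo = prev_raw & MASK32
    curr_lo = curr_raw & MASK32
    # Modulo-2^32 subtraction tolerates a single wrap of the low half.
    return (curr_lo - prev_lo) & MASK32
```

Sampling often enough that the lower half cannot wrap more than once between reads keeps this delta meaningful.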

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR54. No #GP Will be Signaled When Setting MSR_MISC_PWR_MGMT.ENABLE_SDC if MSR_MISC_PWR_MGMT.LOCK is Set

Problem: If the MSR_MISC_PWR_MGMT.LOCK (MSR 1AAh, bit 13) is set, a General Protection Exception (#GP) will not be signaled when MSR_MISC_PWR_MGMT.ENABLE_SDC (MSR 1AAh, bit 10) is cleared while IA32_XSS.HDC (MSR DA0h, bit 13) is set and if IA32_PKG_HDC_CTL.HDC_PKG_Enable (MSR DB0h, bit 0) was set at least once before.

Implication: Due to this erratum, a #GP will not be signaled even though MSR_MISC_PWR_MGMT.ENABLE_SDC is cleared while the associated LOCK bit is set.

Workaround: None identified. Software should not attempt to clear MSR_MISC_PWR_MGMT.ENABLE_SDC if the above #GP conditions are met.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR55. System May Experience an Internal Timeout Error When an Internal Parity Error Occurs While Working With Intel® AMX

Problem: Under complex microarchitectural conditions, while running Intel® Advanced Matrix Extensions (Intel® AMX), an Internal Parity Error (IA32_MC0_Status (MSR 401h, bits [15:0]) set to 5h) may cause an Internal Timeout Error (IA32_MCi_Status [15:0] set to 400h) in parallel with the reporting of the parity error.

Implication: Due to this erratum, an unexpected Internal Timeout Error may occur.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR56. Last Branch Records May Not Survive Warm Reset

Problem: Last Branch Records (LBRs) are expected to survive warm reset according to Intel® architectures (SDM Vol3 Table 9-2). LBRs may be incorrectly cleared following warm reset if a valid machine check error was logged in one of the IA32_MCi_STATUS MSRs (401h, 405h, 409h, 40Dh).

Implication: Due to this erratum, reading LBRs following warm reset may show zero value even though LBRs were enabled (IA32_LBR_CTL.LBREn[0]=1) before the warm reset.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR57. Single Step on Branches Might be Missed When VMM Enables Notification On VM Exit

Problem: Under complex microarchitectural conditions, "single step on branches" (configured when IA32_DEBUGCTLMSR (Offset 1D9h, bit [1]) and TF flag in EFLAGS register are set) while in guest might be missed when VMM enables "notification on VM Exit" (IA32_VMX_PROCBASED_CTLS2 MSR, Offset 48Bh, bit [31]) while the dirty access bit is not set for the code page (bit [6] in paging-structure entry).

Implication: Due to this erratum, when "single step on branches" is enabled under the above condition, some single step branches will be missed. Intel has only observed this erratum in a synthetic test environment.

Workaround: None identified. When enabling single step on branches for debugging, software should first set the dirty bit of the code page.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR58. Incorrect #CP Error Code on UIRET

Problem: If a #CP exception is triggered during a UIRET instruction execution, the error code on the stack will report NEAR-RET instruction (code 1) instead of FAR-RET instruction (code 2).

Implication: Due to this erratum, an incorrect #CP error code is logged when #CP is triggered during UIRET instruction.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR59. #GP May be Serviced Before an Instruction Breakpoint

Problem: An instruction breakpoint should have the highest priority and needs to be serviced before any other exception. In case an instruction breakpoint is marked on an illegal instruction longer than 15 bytes that starts in bytes 0-16 of a 32B-aligned chunk, and that instruction does not complete within the same 32B-aligned chunk, a General Protection Exception (#GP) on the same instruction will be serviced before the breakpoint exception.

Implication: Due to this erratum, an illegal instruction #GP exception may be serviced before an instruction breakpoint.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR60. Unexpected #PF Exception Might Be Serviced Before a #GP Exception

Problem: Instructions longer than 15 bytes should assert a General Protection Exception (#GP). For instructions longer than 15 bytes, a Page Fault Exception (#PF) from the subsequent page might be issued before the #GP exception in the following cases:

1. The GP instruction starts at byte 1 – 16 of the last 32B-aligned chunk of a page (starting the count at byte 0), and it is not a target of taken jump, and it does not complete within the same 32B-aligned chunk it started in.

2. The GP instruction starts at byte 17 of the last 32B-aligned chunk of a page.

Implication: Due to this erratum, an unexpected #PF exception might be serviced before a #GP exception.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR61. VMX-Preemption Timer May Not Work if Configured With a Value of 1

Problem: Under complex microarchitectural conditions, the VMX-preemption timer may not generate a VM Exit if the VMX-preemption timer value is set to 1.

Implication: Due to this erratum, if the value of the VMX-preemption timer is set to 1, a VM exit may not occur.

Workaround: None identified. Software should avoid programming the VMX-preemption timer with a value of 1.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR62. User Interrupt Might be Delayed

Problem: Under complex microarchitectural conditions, if MOV SS blocking (bit 1 in the guest Interruptibility state) is enabled, when a guest resumes into CPL3 with a user interrupt pending, the awaiting interrupt might be served after the second instruction and not after the first one as expected.

Implication: Due to this erratum, a user interrupt might be delayed. Intel has not observed this erratum with any commercially available software.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR63. VM Exit Qualification May Not be Correctly Set on APIC Access While Serving a User Interrupt

Problem: A VM Exit that occurs while the processor is serving a user interrupt in non-root mode should set the “asynchronous to instruction execution” bit in the Exit Qualification field in the Virtual Machine Control Structure (bit 16). However, if a VM Exit occurs during processing a user interrupt due to an APIC access, the bit will not be set.

Implication: Due to this erratum, the “asynchronous to instruction execution” bit will not be set if an APIC Access VM Exit occurs while the processor is serving a user interrupt. Intel has not observed this erratum with any commercially available software.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR64. Software Tuning That Relies on PCLS Values May Experience Inaccurate Event Counts

Problem: While monitoring system performance, the processor will incorrectly translate a Prior Cache Line State (PCLS) field value of 0 (no performance detail) in an Intel® UPI data response packet to a one-hop near miss performance event when utilizing an External Node Controller (XNC).

Implication: Due to this erratum, software tuning that relies on correct PCLS values may experience inaccurate Intel® UPI one-hop near miss event counts.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR65. Multiple SGX_Doorbell_Errors on Ubox Response Mismatch

Problem: In the event of a mismatch between UPI LT_Doorbell Response completion and SGX_Secure_En configuration bit in Ubox, SGX_Doorbell_Errors may overflow the NCEVENTS_CR_UBOX_MCI_STATUS (MCA Bank 6, MSR 419h) register and signal a redundant Machine Check Exception (MCE) with MSCOD 801Ch and MCACOD of 0407h to the cores and PUNIT.

Implication: Due to this erratum, a redundant MCE may be signaled.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR66. ECS Readout Fails on Mixed Mode Systems

Problem: In platform configurations utilizing both DDR5 and DDRT2 memory technologies on the same channel, accesses to Error Check and Scrub (ECS) data may cause the system to hang.

Implication: Due to this erratum, the system may hang.

Workaround: It may be possible for BIOS to work around this erratum.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR67. Intel DSA/IAA Completion Record is Not Written For Non-Completion Record Invalid Traffic Classes

Problem: For Intel® Data Streaming Accelerator (DSA)/Intel® In-Memory Analytics Accelerator (IAA), when any Traffic Class (TC) selected by a descriptor is invalid, the completion record is not written and the error is reported in SWERROR.

Implication: Due to this erratum, software that expects a completion record may not function as expected.

Workaround: None Identified. Software should check the status of the SWERROR register if it does not receive a completion record as expected.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR68. Intel IAA Expand Operation With PRLE Format Input May Return an Error

Problem: Intel® In-Memory Analytics Accelerator (IAA) Expand operation may parse beyond the required elements of Source 1 and return a Parquet Run Length Encoding (PRLE) Format Error (14h) unexpectedly.

Implication: Due to this erratum, software may receive a spurious error.

Workaround: None Identified. Software should not send non-PRLE encoded stream data in Source 1.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR69. Intel IAA Compression with Compress Bit Order Set May Produce an Odd Number of Bytes

Problem: When a compress job specifies “Compress Bit Order” flag and not “Stats Mode”, the Intel In-Memory Analytics Accelerator (IAA) may incorrectly produce an odd number of bytes.

Implication: Due to this erratum, when this occurs, the end of the generated bit-stream is lost.

Workaround: None Identified. Use the Intel Query Processing Library (QPL) to handle this case.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR70. Intel IAA Source 2 Not Written Properly When Source 2 Size is 32 Bytes

Problem: For Intel® In-Memory Analytics Accelerator (IAA) operations, if the Source 2 size is specified as 32 bytes and Source 2 is being written, then no Source 2 data will be written.

Implication: Due to this erratum, software that writes 32 bytes of Source 2 data may not function as expected.

Workaround: None Identified. When Source 2 is being written, software should specify Source 2 size to be at least 64 bytes.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR71. Intel IAA May Not Report Invalid Filter Flags Status Code When Source 2 Bit Order Field is Set

Problem: Intel® In-Memory Analytics Accelerator (IAA) may not report an error when Source 2 Bit Order field of Filter Flag is erroneously set.

Implication: Due to this erratum, software may not receive an error when expected. Intel has not observed any functional implication as a result of this erratum.

Workaround: None Identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR72. Intel IAA Does Not Allow Source 1 Size to be 0 For Expand Operation

Problem: Intel® In-Memory Analytics Accelerator (IAA) Expand operation will incorrectly return an error code when Source 1 size is 0.

Implication: Due to this erratum, software that uses the Expand operation with Source 1 size of 0 may not behave as expected.

Workaround: None identified. For the Expand operation, software can work around this issue by supplying IAA with a dummy Source 1 input that contains at least one byte.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR73. Intel® DSA/IAA And WQ Configuration Registers May be Incorrectly Updated

Problem: Software may incorrectly update Intel® Data Streaming Accelerator (DSA)/Intel® In-Memory Analytics Accelerator (IAA) and Work Queue (WQ) configuration registers when the device state has changed from "Enabled" to "Disable-in-Progress."

Implication: Due to this erratum, unpredictable DSA/IAA device behavior may occur.

Workaround: None Identified. Software should not change DSA/IAA device WQ configuration registers until CMDSTATUS Register (offset A8h) bit 31=0.
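
The check described in this workaround can be sketched as below. This is a hedged illustration only: the helper names are hypothetical, and read_cmdstatus stands in for whatever platform-specific routine reads the CMDSTATUS register at offset A8h.

```python
CMDSTATUS_BIT = 31  # bit named in the workaround text

def wq_config_write_allowed(cmdstatus):
    """True when CMDSTATUS bit 31 reads 0."""
    return ((cmdstatus >> CMDSTATUS_BIT) & 1) == 0

def wait_until_config_allowed(read_cmdstatus, max_polls=1000):
    """Poll a caller-supplied register reader until bit 31 clears.

    Returns True if bit 31 was seen as 0 within max_polls reads,
    otherwise False so the caller can decide how to handle the timeout.
    """
    for _ in range(max_polls):
        if wq_config_write_allowed(read_cmdstatus()):
            return True
    return False
```

Bounding the number of polls is a design choice made for this sketch so that a device that never clears the bit does not stall the caller indefinitely.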

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR74. Invalid Flags Field of The Completion Record May Not be Set Correctly For Intel IAA Compression Operation

Problem: In the Intel® In-Memory Analytics Accelerator (IAA) Compress descriptor, if the compression flag Stats Mode = 0 and Read Source 2 flag = 0, then the IAA Compress operation will return an Invalid Operation Status Code 11h but not set the Invalid Flags field.

Implication: Due to this erratum, software cannot tell which flags are invalid based on the Invalid Flags field.

Workaround: None Identified. User may examine the descriptor and the documentation to determine which flags are invalid.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR75. With Intel® SGX Disabled, Software That Relies on ENCLVexiting May Not Function as Expected

Problem: On processors with Intel SGX disabled, the enable ENCLVexiting bit (bit 60 of the IA32_VMX_PROCBASED_CTLS2 MSR, index 48Bh) is incorrectly set.

Implication: Due to this erratum, software that relies on the ENCLVexiting bit may not function as expected.

Workaround: None identified. Software should not rely on the ENCLVexiting bit when Intel SGX is disabled.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR76. Headers Logged in AERHDRLOG for an AER Error for Intel DSA/IAA may be Incorrect

Problem: The header log (AERHDRLOG(1-4), [Bus: system design dependent, Device: 1, Function: 0, Offset: 11Ch, 120h, 124h, 128h respectively]) for Intel® Data Streaming Accelerator (DSA)/Intel® In-Memory Analytics Accelerator (IAA), will be overwritten as all 1's when a simultaneous and independent Correctable Error occurs.

Implication: Due to this erratum, software relying upon AERHDRLOG (1-4) for an Advanced Error Reporting error may not function as expected.

Workaround: None Identified. Software should ignore AERHDRLOG (1-4) when its value is all 1's.
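
A minimal sketch of this filter follows. It assumes the four AERHDRLOG registers have already been read as 32-bit values and interprets "all 1's" as every one of the four DWORDs reading FFFFFFFFh; the function name is hypothetical.

```python
ALL_ONES = 0xFFFFFFFF

def aer_header_log_usable(dwords):
    """Return False when AERHDRLOG1..4 all read as all 1's.

    dwords: sequence of the four 32-bit header log values.
    """
    if len(dwords) != 4:
        raise ValueError("expected four values, AERHDRLOG1 through AERHDRLOG4")
    return not all(d == ALL_ONES for d in dwords)
```

An error handler could call this before decoding the logged header and skip the decode when it returns False.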

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR77. Intel DSA/IAA May Fail to Send an ERR_FATAL Message if a Non-Fatal Error Occurs in The Same Cycle

Problem: When fatal and non-fatal uncorrectable errors occur in the same cycle, an ERR_FATAL message is not sent and only an ERR_NONFATAL message is sent from the Intel® Data Streaming Accelerator (DSA)/Intel® In-Memory Analytics Accelerator (IAA).

Implication: Due to this erratum, software may not function as expected due to a fatal error not being reported.

Workaround: None Identified. Software should check for fatal errors during the handling of non-fatal errors.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR78. Intel DSA/IAA May Fail to Log an Unexpected Completion Error For an Invalid ATS Response

Problem: Under complex microarchitectural conditions when there are multiple simultaneous errors, Intel® Data Streaming Accelerator (DSA)/Intel® In-Memory Analytics Accelerator (IAA) may fail to log an Unexpected Completion error for an Address Translation Services (ATS) response with an incorrect PASID Privilege Mode Requested value.

Implication: Due to this erratum, when there are multiple simultaneous errors, software may be unaware of an ATS Unexpected Completion error.

Workaround: None Identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR79. Intel IAA Compression Output Buffer Overflow Error May be Incorrectly Reported

Problem: For Intel® In-Memory Analytics Accelerator (IAA), when checking for compression output overflow, the upper bits (31:29) of the Maximum Destination Size descriptor field (Offset 48-51) will be discarded.

Implication: Due to this erratum, an output buffer overflow error may be incorrectly reported for buffer sizes greater than or equal to 2000_0000h.

Workaround: None Identified. Software should avoid Maximum Destination Size greater than or equal to 2000_0000h.
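
The threshold follows from the bit positions: dropping bits 31:29 leaves a 29-bit value, so any size of 2000_0000h or more loses information. The sketch below models "discarded" as a simple mask, which is an assumption made for illustration, and the function names are hypothetical.

```python
AFFECTED_THRESHOLD = 0x2000_0000   # first size with a bit set in 31:29
LOW_29_BITS = 0x1FFF_FFFF          # bits 28:0, which the check keeps in this model

def max_dest_size_is_safe(size):
    """True when a Maximum Destination Size avoids the affected range."""
    return 0 <= size < AFFECTED_THRESHOLD

def size_seen_by_overflow_check(size):
    """Illustrative model: the value left after bits 31:29 are dropped."""
    return size & LOW_29_BITS
```

Under this model a requested size of 2000_0000h would be treated as 0 by the overflow check, which is consistent with a spurious overflow report.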

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR80. Intel® QuickAssist Technology Accelerator May Violate ATS Invalidation Completion Ordering

Problem: Address Translation Service (ATS) invalidations may complete before all in-flight writes are drained from Intel® QuickAssist Technology (Intel® QAT) accelerator.

Implication: Due to this erratum, Intel® QAT accelerator operation with ATS capability enabled may lead to unexpected system behavior.

Workaround: System software (OS/VMM) performing ATS invalidation on Intel® QAT accelerator needs to serially execute a second (duplicate) ATS invalidation request after the first invalidation completes to drain in-flight writes.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR81. Intel® QuickAssist Technology Accelerator Device May Not Invalidate PASID Supervisor-Privilege Translations

Problem: Address Translation Service (ATS) invalidations for Process Address Space ID (PASID) with Supervisor-privilege translations may not correctly invalidate the device TLB on Intel® QuickAssist accelerator (Intel® QAT).

Implication: Due to this erratum, Intel® QAT accelerator operation with ATS capability enabled and Supervisor-privilege PASID may lead to unexpected system behavior.

Workaround: System software (OS/VMM) performing ATS invalidation on Intel® QAT accelerator on behalf of any supervisor-privilege PASID must set the Global Invalidate (G) bit in the ATS invalidation to avoid the erratum.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR82. The Time-Stamp Counter May Report an Incorrect Value

Problem: Under complex microarchitectural conditions, the Time-Stamp Counter (TSC) may incorrectly report the time stamp to be less than the expected time stamp after exiting C6 power saving state.

Implication: Due to this erratum, systems that rely upon a monotonically increasing value reported by the TSC may exhibit unpredictable system behavior.

Workaround: It may be possible for BIOS to contain a workaround for this erratum.

Status: For the steppings affected, refer to the Errata Summary Table.

SPR83. UPI Machine Check Bank May Not Report The Most Recently Logged Error

Problem: If multiple UPI ports log corrected errors and KTI_MCA_CFG LATCH_FIRST_CE is not set for one or more UPI links (Bus: 30; Device: 1-4; Function: 1; Offset 498h; bit: 0=0), the UPI Machine Check Bank (Bank 5; MSRs: 415h - 417h) may not report the most recently logged error.

Implication: Due to this erratum, software can not rely on the most recent error being logged in the UPI Machine Check Bank.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR84. PECI Wire Host may Continuously Receive a Completion Code of 0x80

Problem: Regardless of targeted PECI endpoint, if a PECI wire host issues a PECI transaction within one second of a previous PECI transaction that received a completion code of 0x80 (command response timeout), it may continuously receive a command response timeout.

Implication: Due to this erratum, a PECI endpoint may be perceived as unresponsive.

Workaround: None identified. A PECI wire host must wait for at least 1 second if a previous request had timed out before sending a command to a different endpoint.
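
The one-second hold-off could be tracked on the host side along these lines. This is a sketch under assumptions: the class and method names are hypothetical, the clock is injectable so the logic can be exercised without real delays, and issuing the PECI transaction itself is outside its scope.

```python
import time

PECI_CC_TIMEOUT = 0x80     # completion code for command response timeout
BACKOFF_SECONDS = 1.0      # minimum wait stated in the workaround

class PeciBackoff:
    """Tracks how long the host should wait after a 0x80 completion code."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._last_timeout = None

    def record(self, completion_code):
        # Remember when the most recent command response timeout was seen.
        if completion_code == PECI_CC_TIMEOUT:
            self._last_timeout = self._clock()

    def seconds_to_wait(self):
        # Remaining hold-off before the next transaction, 0.0 if none.
        if self._last_timeout is None:
            return 0.0
        elapsed = self._clock() - self._last_timeout
        return max(0.0, BACKOFF_SECONDS - elapsed)
```

A host loop would call record() with each completion code and sleep for seconds_to_wait() before issuing the next transaction to any endpoint.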

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR85. DDR5 9x4 DIMMs ECS Data May be Reported Incorrectly

Problem: For DDR5 9x4 DIMMs, after the memory controller issues a Mode Register Read (MRR) to device 8, the Error Check and Scrub (ECS) data will be reported incorrectly in mr_read_result (MEM_BAR [0-3], Offsets 22C80h-22C90h or 2AC80h-2AC90h).

Implication: Due to this erratum, the software cannot rely on ECS data for Device 8 with DDR5 9x4 DIMMs. DDR5 10x4 or 5x8 DIMMs are not affected by this erratum.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR86. RETRY_RD_ERR_LOG_MISC.DDR5_9x4_half_device Bit May be Incorrect

Problem: On systems using 9x4 DDR5 DIMMs, when Permanent Fault Detection (PFD) is disabled, the RETRY_RD_ERR_LOG_MISC.DDR5_9x4_half_device bit (138_MEM_RRD + Offsets 22C54h, 22D80h, 2AC54h, 2AD80h, 22E60h, Bit 7) will always report 0 when an error is detected in device 8.

Implication: Due to this erratum, when an error is detected on device 8, the system software is not able to rely on the value of the RETRY_RD_ERR_LOG_MISC.DDR5_9x4_half_device bit.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR87. PIROM Reports The Wrong 2 DPC Speed For Processors With Less Than 4800 MT/s 1 DPC Speed

Problem: PIROM incorrectly reports the 2 DIMM Per Channel (DPC) speed as 400 MT/s less than the top speed on processors with a 1 DPC speed of less than 4800 MT/s.

Implication: Due to this erratum, software that relies on the 2 DPC speed reported by PIROM may use an incorrect memory speed on processors with a 1 DPC speed of less than 4800 MT/s.

Workaround: None identified. Software should assume that the 2 DPC speed for any processors with 1 DPC speed less than 4800 MT/s is equal to the 1 DPC speed.
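
The substitution described in this workaround is small enough to sketch directly. The function and parameter names are hypothetical, and both speeds are assumed to be expressed in MT/s.

```python
def effective_2dpc_speed(one_dpc_mts, pirom_two_dpc_mts):
    """Return the 2 DPC speed software should assume, in MT/s.

    For processors with a 1 DPC speed below 4800 MT/s, the PIROM 2 DPC
    value is not trusted and the 1 DPC speed is used instead.
    """
    if one_dpc_mts < 4800:
        return one_dpc_mts
    return pirom_two_dpc_mts
```

For a part with a 1 DPC speed of 4400 MT/s whose PIROM reports 4000 MT/s for 2 DPC, this returns 4400.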

Status: For the steppings affected, refer to the Summary Tables of Changes.

SPR88. An MDF Parity Error May Incorrectly Set The Overflow Bit

Problem: The Overflow bit (bit 62) of IA32_MC[7-8]_STATUS MSRs (41Dh, 421h) may be incorrectly set when a Modular Die Fabric (MDF) parity error occurs (MCACOD = 0405h).

Implication: Due to this erratum, software that relies upon the Machine Check Overflow bit may not operate as expected.

Workaround: None identified.

Status: For the steppings affected, refer to the Summary Tables of Changes.

4th Gen Intel Xeon Scalable Processors Codename Sapphire Rapids

NDA Specification Update

Errata Details

SPR1. IPSR May Not Function Correctly

SPR2. Poison Data Reported Instead of a CS Limit Violation

SPR3. Monitor Instructions to Legacy VGA Region May Fail

SPR4. TILEDATA State May Be Saved Incorrectly

SPR5. A Poison Data Event May Not be Serviced if a Data Breakpoint Occurs on an Intel AMX Tile-Load or Intel AVX Gather or REP MOVS Instruction

SPR6. IFS MSRs Will Ignore a Non-Zero EDX Value And Not Signal a #GP

SPR7. Processor May Signal Spurious #GP Fault

SPR8. A Break Point May be Hit Twice When a VM Exit Without Commit Occurs

SPR9. Faulted XRSTORS Instruction May Result in Unexpected X87 FTW Value

SPR10. Error Conditions Detected During Cold Reset May Not be Cleared by Subsequent Warm Reset

SPR11. DSA/IAX Does Not Log The E2E Prefix Bit And The Prefix-Type Bits in AERTLPPLOG1

SPR12. The Processor May Drop Noncompliant Posted Peer-to-peer Transactions

SPR13. Certain Bits in IA32_​MC5_​STATUS Register Will Always Return 0

SPR14. Occupancy Interrupt Handle is Not Checked Against Interrupt Table Size

SPR15. Processor May Incorrectly Set PFD Assisted in Correction Bit in Memory Controller

SPR16. DSA CMDSTATUS Register May Not Reflect Correct Hardware Status

SPR17. Remapping Hardware May Set Access/Dirty Bits in a First-stage Page-table Entry

SPR18. System Software May Not Receive Intel® Virtualization Technology (Intel® VT) for Directed I/O (Intel® VT-d) Fault SPT.3 For Non-Zero Writes to b[191:HAW+128]

SPR19. APCTL.APNGE Should be RW Instead of RWS

SPR20. CXL Device May Not Receive Viral

SPR21. OOBMSM TSC Will be 320ns Behind The Globally Aligned Counter

SPR22. Performance Monitoring Event Coherent_​ops May Undercount

SPR23. PCIe* Link Re-Equalization May Not Occur if Link is in L1 State

SPR24. Machine Check Bank 4 UCNA Errors May Not be Signaled

SPR25. DSA/IAA Use of Priv and PASID

SPR26. Reserved(0) Check For a PASID Table Entry May Not Happen For a DMA Request

SPR27. Remapping Hardware May Not Generate a Page Request Group Response Message While Operating in Legacy Mode or Abort DMA Mode

SPR28. Remapping Hardware May Abort ZLR to Second-Stage Write Only Pages

SPR29. Remapping Hardware with Major Version Number 6 Incorrectly Advertises the ESRTPS Support

SPR30. Platform May Hang if System Software Sends a Page Group Response or DevTLB Invalidation to Non-existent Requester ID

SPR31. Remapping Hardware Does Not Perform Reserved (0) Check in Page Response Descriptor

SPR32. Remapping Hardware Implements b[31:16] of the three Event Data Registers (VTDBAR offsets 0x3C, 0xA4, and 0xE4) as Read-Writable

SPR33. IAA Do Not Report Overlap Errors For AECS Size of 2GB or Greater

SPR34. DSA/IAA Invalid TC Not Reported in The SWERROR Register

SPR35. IAA Unaligned Completion Record Address Error is Not Reported in SWERROR Register

SPR36. Intel® UPI Link Not Resetting When L1 Mismatch Occurs Between Local and Remote Sockets

SPR37. DSA/IAA May Fail to Log an MDPE Error For Back-to-Back Parity Errors

SPR38. Relaxed Ordering Not Disabled by DEVCTL.ERO bit for DSA/IAA Upstream Transactions

SPR39. System Address Logged For WDB Parity Errors May be Incorrect

SPR40. Incorrect MCACOD For L2 MCE

SPR41. System May Hang Due to Full LLRB

SPR42. IAA May Fail to Properly Decode Data With a Large Header

SPR43. Memory Controller Violates JEDEC RCD tCSALT Timing

SPR44. Wrong CKE Signal Used on 1 DPC 3DS 4H Configs

SPR45. Address May Not be Logged For a UCR Error Detected in The MLC

SPR46. Intel VT-d DMA Remapping Hardware May Hang if it Encounters Page Request Queue Overflow Condition

SPR47. Receiver Common Mode Input Impedance May be Below Specification When Interface is Powered Down

SPR48. Remapping Hardware Will Not Report The PASID Value For RTA.2 Faults in Modes Other Than Scalable Mode

SPR49. Remapping Hardware Does Not Perform a Reserved(0) Check in Interrupt Remap Table Entry

SPR50. Processor PCIe Root Port Link Spurious Data Parity Error May be Reported

SPR51. Mismatch Between UboxErrMisc and MCI_​STATUS Registers Error Logs

SPR52. CHA UCNA Errors May be Incorrectly Controlled by MCi_​CTL Enable Bits

SPR53. Reading The PPERF MSR May Not Return Correct Values

SPR54. No #GP Will be Signaled When Setting MSR_​MISC_​PWR_​MGMT.ENABLE_​SDC if MSR_​MISC_​PWR_​MGMT.LOCK is Set

SPR55. System May Experience an Internal Timeout Error When an Internal Parity Error Occurs While Working With Intel® AMX

SPR56. Last Branch Records May Not Survive Warm Reset

SPR57. Single Step on Branches Might be Missed When VMM Enables Notification On VM Exit

SPR58. Incorrect #CP Error Code on UIRET

SPR59. #GP May be Serviced Before an Instruction Breakpoint

SPR60. Unexpected #PF Exception Might Be Serviced Before a #GP Exception

SPR61. VMX-Preemption Timer May Not Work if Configured With a Value of 1

SPR62. User Interrupt Might be Delayed

SPR63. VM Exit Qualification May Not be Correctly Set on APIC Access While Serving a User Interrupt

SPR64. Software Tuning That Relies on PCLS Values May Experience Inaccurate Event Counts

SPR65. Multiple SGX_Doorbell_Errors on Ubox Response Mismatch

SPR66. ECS Readout Fails on Mixed Mode Systems

SPR67. Intel DSA/IAA Completion Record is Not Written For Non-Completion Record Invalid Traffic Classes

SPR68. Intel IAA Expand Operation With PRLE Format Input May Return an Error

SPR69. Intel IAA Compression with Compress Bit Order Set May Produce an Odd Number of Bytes

SPR70. Intel IAA Source 2 Not Written Properly When Source 2 Size is 32 Bytes

SPR71. Intel IAA May Not Report Invalid Filter Flags Status Code When Source 2 Bit Order Field is Set

SPR72. Intel IAA Does Not Allow Source 1 Size to be 0 For Expand Operation

SPR73. Intel® DSA/IAA And WQ Configuration Registers May be Incorrectly Updated

SPR74. Invalid Flags Field of The Completion Record May Not be Set Correctly For Intel IAA Compression Operation

SPR75. With Intel® SGX Disabled, Software That Relies on ENCLV Exiting May Not Function as Expected

SPR76. Headers Logged in AERHDRLOG for an AER Error for Intel DSA/IAA may be Incorrect

SPR77. Intel DSA/IAA May Fail to Send an ERR_FATAL Message if a Non-Fatal Error Occurs in The Same Cycle

SPR13. Certain Bits in IA32_MC5_STATUS Register Will Always Return 0

SPR22. Performance Monitoring Event Coherent_ops May Undercount

SPR86. RETRY_RD_ERR_LOG_MISC.DDR5_9x4_half_device Bit May Be Incorrect