
1st Generation Intel® Xeon® Scalable Processors

Software and workloads used in performance tests may have been optimized for performance only on Intel® microprocessors. Performance tests, such as SYSmark* and MobileMark*, are measured using specific computer systems, components, software, operations, and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available updates. See configuration disclosure for details. No product or component can be absolutely secure.

Refer to https://software.intel.com/articles/optimization-notice for more information regarding performance and optimization choices in Intel software products.

Estimates of SPECrate®2017_int_base and SPECrate®2017_fp_base based on Intel internal measurements. SPEC®, SPECrate® and SPEC CPU® are registered trademarks of the Standard Performance Evaluation Corporation. See www.spec.org for more information.

Claim Processor Family System Configuration Measurement Measurement Period
41X since 2006 1st Generation Intel® Xeon® Platinum processor Results estimated or published at www.spec.org using historic SPECint*_​rate_​base2006 results as of Jun 11 2017. 1-Node, 2 x Intel® Xeon® Platinum 8180 processor on Neon City with 384 GB Total Memory on Red Hat Enterprise Linux* 7.2-kernel 3.10.0-327 using CPU2006_​FOR-OEMs-cpu2006-1.2-ic17.0-lin-binaries-20160922. Data Source: Request Number: 2498, Benchmark: SPECint*_​rate_​base2006, Score: 2550 Higher is better. Results estimated or published at www.spec.org using historic SPECint*_​rate_​base2006 results

Baseline: 2006

New configuration: June 11, 2017

1.65X Average Performance 1st Generation Intel® Xeon® Platinum processor

a. Up to 1.36x claim based on Brokerage Firm OLTP: 1-Node, 2 x Intel® Xeon® processor E5-2699 v4 on Grantley-EP (Wellsburg) with 512 GB Total Memory on Windows Server* 2012 R2 Standard using SQL Server* 2014. Data Source: Request Number: 1640, Benchmark: Brokerage Firm OLTP, Score: 4373 transactions per second (tps) for OLTP vs. 1-Node, 2 x Intel® Xeon® Platinum 8180 processor on Purley-EP (Lewisburg) with 764 GB Total Memory on Windows Server* 2016 RTM Standard using SQL Server 2016 Data, Score: 5979 tps for OLTP. Higher is better.

b. Up to 1.40x claim based on 2-Tier SAP* SD: 1-Node, 2 x Intel® Xeon® processor E5-2699 v4 on Grantley-EP (Wellsburg) with 512 GB Total Memory on SUSE Linux Enterprise Server* 10 SP4 using SAP EHP* 5.0 for ERP* 6.0 and Sybase ASE* 16.0. Data Source: Request Number: 2473, Benchmark: SAP SD 2-Tier enhancement package 5 for SAP ERP 6.0, Score: 19721 vs. 1-Node, 2 x Intel® Xeon® Platinum 8180 processor on Purley-EP (Lewisburg) with 768 GB Total Memory on SUSE Linux Enterprise Server* 12 using SAP ERP 6.0/EHP 5. Data Source: Request Number: 2558, Benchmark: SAP SD 2-Tier enhancement package 5 for SAP ERP 6.0, Score: 27678 Higher is better.

c. Up to 1.49x claim based on Server-side Java: 1-Node, 2 x Intel® Xeon® processor E5-2699 v4 on Wildcat Pass with 128 GB Total Memory on Red Hat Enterprise Linux* 6.5 kernel 2.6.32-431 using Java 8 SE, JDK8U60, Java Hotspot V1.8.0_​60 (if appropriate). Data Source: Request Number: 1633, Benchmark: Server-side Java workload - MultiJVM, Score: 112054 Higher is better, vs. 1-Node, 2 x Intel® Xeon® Platinum 8180 processor on Purley-EP (Lewisburg) with 384 GB Total Memory on Red Hat Enterprise Linux* 7.3 using jdk1.8u121. Data Source: Request Number: 2513, Benchmark: Server-side Java workload - MultiJVM, Score: 167696 Higher is better.

d. Up to 1.52x claim based on SPECint*_​rate_​base2006: 1-Node, 2 x Intel® Xeon® processor E5-2699 v4 on Grantley-EP (Wellsburg) with 256 GB Total Memory on Red Hat Enterprise Linux* 7.2-kernel 3.10.0-327 using Compiler: C/C++: Version 16.0.0.101 of Intel® C++ Studio XE for Linux; - Fortran: Version 16.0.0.101 of Intel® Fortran Studio XE for Linux. Data Source: Request Number: 2342, Benchmark: SPECint*_​rate_​base2006, Score: 1670 vs. 1-Node, 2 x Intel® Xeon® Platinum 8180 processor on Neon City with 384 GB Total Memory on Red Hat Enterprise Linux* 7.2-kernel 3.10.0-327 using CPU2006_​FOR-OEMs-cpu2006-1.2-ic17.0-lin-binaries-20160922. Data Source: Request Number: 2498, Benchmark: SPECint*_​rate_​base2006, Score: 2550 Higher is better.

e. Up to 1.55x claim based on server virtualization workload: 1-Node, 2 x Intel® Xeon® processor E5-2699 v4 on Grantley-EP (Wellsburg) with 512 GB Total Memory on VMware ESXi* 6.0 Update 1 with guest VMs running RHEL* 6 64-bit OS. Data Source: Request Number: 1637, Benchmark: server virtualization workload, Score: 1034 @ 58 VMs vs. 1-Node, 2 x Intel® Xeon® Platinum 8180 processor on Wolf Pass SKX with 768 GB Total Memory on VMware ESXi 6.0 U3 GA with guest VMs running RHEL 6 64-bit OS. Data Source: Request Number: 2563, Benchmark: server virtualization workload, Score: 1580 @ 90 VMs. Higher is better.

f. Up to 1.63x claim based on SPECfp*_​rate_​base2006: 1-Node, 2 x Intel® Xeon® processor E5-2699 v4 on Grantley-EP (Wellsburg) with 256 GB Total Memory on Red Hat Enterprise Linux* 7.2-kernel 3.10.0-327 using Compiler: C/C++: Version 16.0.0.101 of Intel® C++ Studio XE for Linux; - Fortran: Version 16.0.0.101 of Intel® Fortran Studio XE for Linux. Data Source: Request Number: 2340, Benchmark: SPECfp*_​rate_​base2006, Score: 1050 Higher is better vs. 1-Node, 2 x Intel® Xeon® Platinum 8180 processor on Neon City with 384 GB Total Memory on Red Hat Enterprise Linux* 7.2-kernel 3.10.0-327 using CPU2006_​FOR-OEMs-cpu2006-1.2-ic17.0-lin-binaries-20160922. Data Source: Request Number: 2503, Benchmark: SPECfp*_​rate_​base2006, Score: 1720 Higher is better.

g. Up to 1.65x claim based on STREAM - triad: 1-Node, 2 x Intel® Xeon® processor E5-2699 v4 on Grantley-EP (Wellsburg) with 256 GB Total Memory on Red Hat Enterprise Linux* 6.5 kernel 2.6.32-431 using Stream NTW avx2 measurements. Data Source: Request Number: 1709, Benchmark: STREAM - Triad, Score: 127.7 Higher is better vs. 1-Node, 2 x Intel® Xeon® Platinum 8180 processor on Neon City with 384 GB Total Memory on Red Hat Enterprise Linux* 7.2-kernel 3.10.0-327 using STREAM AVX 512 Binaries. Data Source: Request Number: 2500, Benchmark: STREAM - Triad, Score: 199 Higher is better.

h. Up to 1.73x claim based on HammerDB: 1-Node, 2 x Intel® Xeon® processor E5-2699 v4 on Grantley-EP (Wellsburg) with 384 GB Total Memory on Red Hat Enterprise Linux* 7.1 kernel 3.10.0-229 using Oracle* 12.1.0.2.0 (including database and grid) with 800 warehouses, HammerDB 2.18. Data Source: Request Number: 1645, Benchmark: HammerDB, Score: 4.13568e+006 Higher is better vs. 1-Node, 2 x Intel® Xeon® Platinum 8180 processor on Purley-EP (Lewisburg) with 768 GB Total Memory on Oracle Linux* 7.2 using Oracle 12.1.0.2.0, HammerDB 2.18. Data Source: Request Number: 2510, Benchmark: HammerDB, Score: 7.18049e+006 Higher is better.

i. Up to 1.73x claim based on LAMMPS: LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. It is used to simulate the movement of atoms to develop better therapeutics, improve alternative energy devices, develop new materials, and more. E5-2697 v4: 2S Intel® Xeon® processor E5-2697 v4, 2.3GHz, 36 cores, Intel® Turbo Boost Technology and Intel® Hyper-Threading Technology (Intel® HT Technology) on, BIOS 86B0271.R00, 8x16GB 2400MHz DDR4, Red Hat Enterprise Linux* 7.2 kernel 3.10.0-327. Gold 6148: 2S Intel® Xeon® Gold 6148 processor, 2.4GHz, 40 cores, Intel® Turbo Boost Technology and Intel® Hyperthreading Technology on, BIOS 86B.01.00.0412.R00, 12x16GB 2666MHz DDR4, Red Hat Enterprise Linux* 7.2 kernel 3.10.0-327.

j. Up to 1.77x claim based on DPDK L3 Packet Forwarding: E5-2658 v4: 5 x Intel® XL710-QDA2, DPDK 16.04. Benchmark: DPDK l3fwd sample application Score: 158 Gbits/s packet forwarding at 256B packet using cores. Gold 6152: Estimates based on Intel internal testing on Intel® Xeon® 6152 2.1 GHz, 2x Intel® FM10420(RRC) Gen Dual Port 100GbE Ethernet controller (100Gbit/card) 2x Intel® XXV710 PCI Express* Gen Dual Port 25GbE Ethernet controller (2x25G/card), DPDK 17.02. Score: 281 Gbits/s packet forwarding at 256B packet using cores, IO and memory on a single socket.

k. Up to 1.87x claim based on Black-Scholes, a popular mathematical model used in finance for European option valuation. This is a double precision version. E5-2697 v4: 2S Intel® Xeon® processor CPU E5-2697 v4, 2.3GHz, 36 cores, turbo and HT on, BIOS 86B0271.R00, 128GB total memory, 8 x 16GB 2400 MHz DDR4 RDIMM, 1 x 1TB SATA, Red Hat Enterprise Linux* 7.2 kernel 3.10.0-327. Gold 6148: Intel® Xeon® Gold 6148 processor @ 2.4GHz, H0QS, 40 cores 150W. QMS1, turbo and HT on, BIOS SE5C620.86B.01.00.0412.020920172159, 192GB total memory, 12 x 16 GB 2666 MHz DDR4 RDIMM, 1 x 800GB INTEL® SSD SC2BA80, Red Hat Enterprise Linux* 7.2 kernel 3.10.0-327.

l. Up to 2.27x claim based on LINPACK*: 1-Node, 2 x Intel® Xeon® processor E5-2699 v4 on Grantley-EP (Wellsburg) with 64 GB Total Memory on Red Hat Enterprise Linux* 7.0 kernel 3.10.0-123 using MP_​LINPACK 11.3.1 (Composer XE 2016 U1). Data Source: Request Number: 1636, Benchmark: Intel® Distribution of LINPACK*, Score: 1446.4 Higher is better vs. 1-Node, 2 x Intel® Xeon® Platinum 8180 processor on Wolf Pass SKX with 384 GB Total Memory on Red Hat Enterprise Linux* 7.3 using mp_​linpack_​2017.1.013. Data Source: Request Number: 3753, Benchmark: Intel® Distribution of LINPACK*, Score: 3295.57 Higher is better.
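
The "up to" figures in items a through l can be related to the quoted scores by simple division. The sketch below is an illustration only, not Intel's published method: the scores are copied from the items above, the names `scores` and `truncate2` are made up for this example, and items i and k are omitted because they quote no scores.

```python
import math

# (baseline score, new score) copied from items a-l above.
# Items i (LAMMPS) and k (Black-Scholes) list no scores and are left out.
scores = {
    "a": (4373, 5979),            # Brokerage Firm OLTP, tps
    "b": (19721, 27678),          # SAP SD 2-Tier
    "c": (112054, 167696),        # Server-side Java, MultiJVM
    "d": (1670, 2550),            # SPECint_rate_base2006
    "e": (1034, 1580),            # server virtualization workload
    "f": (1050, 1720),            # SPECfp_rate_base2006
    "g": (127.7, 199),            # STREAM Triad
    "h": (4.13568e6, 7.18049e6),  # HammerDB
    "j": (158, 281),              # DPDK l3fwd, Gbits/s
    "l": (1446.4, 3295.57),       # LINPACK
}

def truncate2(x: float) -> float:
    """Drop everything past the second decimal place (no rounding)."""
    return math.floor(x * 100) / 100

# Speedup is new score divided by baseline score (higher is better).
ratios = {item: new / base for item, (base, new) in scores.items()}

for item, ratio in ratios.items():
    print(f"{item}: {ratio:.4f} (truncated: {truncate2(ratio):.2f})")
```

With these inputs, most headline figures equal the score ratio truncated to two decimals. Items e and g come out lower than their headline figures when computed from the quoted scores alone, and the text above does not explain the difference.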

Geomean based on Normalized Generational Performance (estimated based on Intel internal testing of OLTP Brokerage, SAP SD* 2-Tier, HammerDB*, Server-side Java*, SPECint*_rate_base2006, SPECfp*_rate_base2006, Server Virtualization, STREAM* triad, LAMMPS, DPDK L3 Packet Forwarding, Black-Scholes, Intel® Distribution for LINPACK*). Test configuration: May 2017
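
As a rough check on the 1.65X average, the geometric mean of the twelve "up to" figures in items a through l can be computed directly. This is a minimal sketch that assumes the headline figures themselves are the inputs to the geomean; the disclosure does not say whether rounded or unrounded ratios were used.

```python
import math

# "Up to" figures from items a-l, in order.
claims = [1.36, 1.40, 1.49, 1.52, 1.55, 1.63,
          1.65, 1.73, 1.73, 1.77, 1.87, 2.27]

def geomean(values):
    """Geometric mean: exp of the arithmetic mean of the logs."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

result = geomean(claims)
print(f"{result:.2f}")  # prints 1.65
```

A geometric mean is the usual choice for averaging ratios because it is not dominated by a single large speedup such as the 2.27x LINPACK result.
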
2.2X deep learning training and inference performance compared with the prior generation 1st Generation Intel® Xeon® Platinum processor Platform: 2S Intel® Xeon® Platinum 8180 CPU @ 2.50GHz (28 cores), HT disabled, turbo disabled, scaling governor set to “performance” via intel_pstate driver, 384GB DDR4-2666 ECC RAM. CentOS Linux release 7.3.1611 (Core), Linux kernel 3.10.0-514.10.2.el7.x86_64. SSD: Intel® SSD DC S3700 Series (800GB, 2.5in SATA 6Gb/s, 25nm, MLC). Performance measured with: Environment variables: KMP_AFFINITY='granularity=fine, compact', OMP_NUM_THREADS=56, CPU Freq set with cpupower frequency-set -d 2.5G -u 3.8G -g performance. Compared with Platform: 2S Intel® Xeon® CPU E5-2699 v4 @ 2.20GHz (22 cores), HT enabled, turbo disabled, scaling governor set to “performance” via acpi-cpufreq driver, 256GB DDR4-2133 ECC RAM. CentOS Linux release 7.3.1611 (Core), Linux kernel 3.10.0-514.10.2.el7.x86_64. SSD: Intel® SSD DC S3500 Series (480GB, 2.5in SATA 6Gb/s, 20nm, MLC). Performance measured with: Environment variables: KMP_AFFINITY='granularity=fine, compact,1,0', OMP_NUM_THREADS=44, CPU Freq set with cpupower frequency-set -d 2.2G -u 2.2G -g performance. Neon: ZP/MKL_CHWN branch commit id:52bd02acb947a2adabb8a227166a7da5d9123b6d. Dummy data was used. The main.py script was used for benchmarking, in mkl mode. ICC version used: 17.0.3 20170404, Intel® Math Kernel Library (Intel® MKL) small libraries version 2018.0.20170425; Inference and training throughput uses FP32 instructions. Intel® Math Kernel Library (Intel® MKL) Test configuration: May 2017
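
The OMP_NUM_THREADS values quoted above (56 and 44) match one OpenMP thread per physical core across both sockets. The sketch below shows how such an environment could be assembled before launching a benchmark. The helper names are hypothetical, the one-thread-per-core rule is inferred from the numbers rather than stated in the disclosure, and the KMP_AFFINITY strings are copied as printed, including the space after the comma.

```python
import os

def omp_threads(sockets: int, cores_per_socket: int) -> int:
    """Assumed rule: one OpenMP thread per physical core."""
    return sockets * cores_per_socket

def benchmark_env(affinity: str, sockets: int, cores_per_socket: int) -> dict:
    """Copy the current environment and add the two OpenMP settings."""
    env = dict(os.environ)
    env["KMP_AFFINITY"] = affinity
    env["OMP_NUM_THREADS"] = str(omp_threads(sockets, cores_per_socket))
    return env

def cpupower_cmd(min_freq: str, max_freq: str) -> list:
    """Build (but do not run) the frequency-pinning command quoted above."""
    return ["cpupower", "frequency-set",
            "-d", min_freq, "-u", max_freq, "-g", "performance"]

# 2S Platinum 8180, 28 cores per socket.
new_env = benchmark_env("granularity=fine, compact", 2, 28)
# 2S E5-2699 v4, 22 cores per socket.
old_env = benchmark_env("granularity=fine, compact,1,0", 2, 22)

print(new_env["OMP_NUM_THREADS"], old_env["OMP_NUM_THREADS"])
print(" ".join(cpupower_cmd("2.5G", "3.8G")))
```

The resulting dictionaries could be passed as the `env` argument of `subprocess.run` when starting the benchmark script; running `cpupower` itself normally requires root privileges.
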
18X deep learning training and inference performance compared with a 4-year-old system 1st Generation Intel® Xeon® Platinum processor Platform: 2S Intel® Xeon® Platinum 8180 CPU @ 2.50GHz (28 cores), HT disabled, turbo disabled, scaling governor set to “performance” via intel_pstate driver, 384GB DDR4-2666 ECC RAM. CentOS Linux* release 7.3.1611 (Core), Linux kernel 3.10.0-514.10.2.el7.x86_64. SSD: Intel® SSD DC S3700 Series (800GB, 2.5in SATA 6Gb/s, 25nm, MLC). Performance measured with: Environment variables: KMP_AFFINITY='granularity=fine, compact', OMP_NUM_THREADS=56, CPU Freq set with cpupower frequency-set -d 2.5G -u 3.8G -g performance. Compared with Platform: 2S Intel® Xeon® CPU E5-2697 v2 @ 2.70GHz (12 cores), HT enabled, turbo enabled, scaling governor set to “performance” via intel_pstate driver, 256GB DDR3-1600 ECC RAM. CentOS Linux release 7.3.1611 (Core), Linux kernel 3.10.0-514.21.1.el7.x86_64. SSD: Intel® SSD 520 Series 240GB, 2.5in SATA 6Gb/s, 25nm, MLC. Intel Caffe: (https://github.com/intel/caffe/), revision b0ef3236528a2c7d2988f249d347d5fdae831236. Inference measured with “caffe time --forward_only” command, training measured with “caffe time” command. For “ConvNet” topologies, dummy dataset was used. For other topologies, data was stored on local storage and cached in memory before training. Topology specs from https://github.com/intel/caffe/tree/master/models/intel_optimized_models (GoogLeNet, AlexNet, and ResNet-50), GCC 4.8.5, MKLML version 2017.0.2.20170110. Test configuration: May 2017
Over 100X deep learning training and inference performance compared with a 3-year-old system using unoptimized software 1st Generation Intel® Xeon® Platinum processor Platform: 2S Intel® Xeon® Platinum 8180 CPU @ 2.50GHz (28 cores), HT disabled, turbo disabled, scaling governor set to “performance” via intel_pstate driver, 384GB DDR4-2666 ECC RAM. CentOS Linux release 7.3.1611 (Core), Linux kernel 3.10.0-514.10.2.el7.x86_64. SSD: Intel® SSD DC S3700 Series (800GB, 2.5in SATA 6Gb/s, 25nm, MLC). Performance measured with: Environment variables: KMP_AFFINITY='granularity=fine, compact', OMP_NUM_THREADS=56, CPU Freq set with cpupower frequency-set -d 2.5G -u 3.8G -g performance. Compared with Platform: 2S Intel® Xeon® CPU E5-2699 v3 @ 2.30GHz (18 cores), HT enabled, turbo disabled, scaling governor set to “performance” via intel_pstate driver, 256GB DDR4-2133 ECC RAM. CentOS Linux release 7.3.1611 (Core), Linux kernel 3.10.0-514.el7.x86_64. OS drive: Seagate* Enterprise ST2000NX0253 2 TB 2.5" Internal Hard Drive. Performance measured with: Environment variables: KMP_AFFINITY='granularity=fine, compact,1,0', OMP_NUM_THREADS=36, CPU Freq set with cpupower frequency-set -d 2.3G -u 2.3G -g performance. Intel Caffe: (https://github.com/intel/caffe/), revision b0ef3236528a2c7d2988f249d347d5fdae831236. Inference measured with “caffe time --forward_only” command, training measured with “caffe time” command. For “ConvNet” topologies, dummy dataset was used. For other topologies, data was stored on local storage and cached in memory before training. Topology specs from https://github.com/intel/caffe/tree/master/models/intel_optimized_models (GoogLeNet, AlexNet, and ResNet-50), GCC 4.8.5, MKLML version 2017.0.2.20170110. BVLC-Caffe: https://github.com/BVLC/caffe, Inference & Training measured with “caffe time” command. For “ConvNet” topologies, dummy dataset was used. For other topologies, data was stored on local storage and cached in memory before training. BVLC Caffe (https://github.com/BVLC/caffe), revision 91b09280f5233cafc62954c98ce8bc4c204e7475 (commit date 5/14/2017). BLAS: atlas ver. 3.10.1. Test configuration: May 2017
Neusoft SaCa Aclome* 1st Generation Intel® Xeon® Platinum processor OS: CentOS 7.3.1611. Testing by Intel and Neusoft May 2017. BASELINE: 2S Intel® Xeon® processor E5-2699 v4, 2.2GHz, 22 cores, turbo and HT on, 128GB total memory, 8 slots / 16GB / 2400 MT/s / DDR4, SATA SSD. NEW: 2S Intel® Xeon® Platinum 8180 processor, 2.5GHz, 28 cores, turbo and HT on, Intel® C627 chipset, 128GB total memory, 8 slots / 16GB / 2666 MT/s / DDR4, SATA SSD. Notes: Data compression/decompression using Zlib 1.2.8. QAT Driver version: S4PR1-Linux-QAT1.7.Upstream.L.0.9.0-36. SaCa Aclome* workload (for general performance) and compressing/decompressing workload (for QAT). Test configuration: May 2017
AsiaInfo Telco BSS* 1st Generation Intel® Xeon® Platinum processor OS: RHEL* 7.3. Testing by Intel & AsiaInfo May 2017. BASELINE: 4S Intel® Xeon® processor E7-8890 v4, 2.2GHz, 24 cores, turbo and HT on, 256GB total memory, 16 slots / 16GB / 1600 MT/s / DDR4, P3700 2T SSD x 2. NEW 1 (for general workload benchmarking and Intel® QAT benchmarking): 4S Intel® Xeon® Platinum 8180 processor, 2.5GHz, 28 cores, turbo and HT on, Intel® C627 chipset, 384GB total memory, 24 slots / 16GB / 2666 MT/s / DDR4, Intel® SSD DC P3700 2TB x 2. NEW 2 (for Intel® Optane™ SSD benchmarking): 4S Intel® Xeon® Platinum 8180 processor, 2.5GHz, 28 cores, turbo and HT on, Intel® C627 chipset , 384GB total memory, 24 slots / 16GB / 2666 MT/s / DDR4, Intel® SSD DC P3700 2TB x 2, Intel® Optane™ SSD DC P4800X 375G x 2. AsiaInfo Telco BSS workload Test configuration: May 2017
IBM DB2* 1st Generation Intel® Xeon® Platinum processor Testing by Intel and IBM. April/May 2017. BASELINE: 4S Intel® Xeon® processor E7-8890 v4, 2.2GHz, 24 cores, turbo on, HT on, BIOS 335.R00, 1.5TB total memory, 96 slots / 16GB / 1600 MT/s / DDR4 LRDIMM, 1 x 800GB, Intel® SSD DC S3700, Red Hat Enterprise Linux* 7.3 kernel 3.10.0-514.16.1.el7.x86_64. NEW: 4S Intel® Xeon® Platinum 8180 processor, 2.5GHz, 28 cores, turbo on, HT on, BIOS 119.R05, 1.5TB total memory, 48 slots / 32GB / 2677 MT/s / DDR4 LRDIMM, 1 x 800GB, Intel® SSD DC S3700, Red Hat Enterprise Linux* X.X kernel 3.10.0-514.16.1.el7.x86_64. IBM Big Data Insights Internal Heavy Multiuser Workload (BDInsights) is a multi-user data warehousing workload based on a retail environment. The workload comprises a mix of complex and intermediate queries. The scale factor for the workload is 3TB with 12 users. Test configuration: April/May 2017
Aerospike database* 1st Generation Intel® Xeon® Platinum processor BASELINE: Aerospike Server Enterprise* 3.6.4 , CentOS* 6.7, kernel version 2.6.32-573.3.1.el6.x86_​64, 2 Intel® Xeon® processor E5-2697 v3, 2.6GHz, 28 cores, 128GB DDR4/1866, regular DIMM, 2x 10Gb network Intel® X540-AT2 not bonded, no disk used – in memory workload. NEXT GEN (old software): Aerospike Server Enterprise* 3.6.4, CentOS 6.7, kernel version 2.6.32-573.3.1.el6.x86_​64, 2 Intel® Xeon® processor E5-2699 v4, 2.2GHz, 44cores, 128GB DDR4/2134, regular DIMM, 2x 10Gb network Intel X540-AT2 not bonded, no disk used – in memory workload. Clients: 8 client systems were used to concurrently submit queries to the servers and drive the workload. The same clients were used in both “baseline” and “new”. The clients were configured as follows: E5-2697 v3 128GB of memory and 10GB Intel X540-AT2 network. The database was populated with 400 M records of 100 bytes each and benchmarked with the Aerospike Java Benchmark tool (https://github.com/aerospike/aerospike-client-java ). The workload simulated 95%/5% read/update ratio. Two Aerospike instances were launched on a single server forming a cluster. NEXT GEN (new software): Aerospike Server Enterprise 3.12.1, OS: CentOS 7.2 with kernel updated to 4.4.59, Intel® Xeon® processor E5-2699 v4, 2.2GHz, 22 cores, turbo and HT on, BIOS SE5C610.86B.01.01.0016.033120161139, 128GB total memory, 16 DIMMs / 8GB / Configured Clock Speed: 1866 MHz / DDR4 DIMM, 2 x Intel® 82599ES 10 Gigabit Ethernet Controllers – all 4 ports on the 2 network controllers were bonded for an aggregate 40000Mb/s bond. No storage – in-memory workload. 
NEW: Aerospike Server Enterprise 3.12.1, OS: CentOS 7.2 with kernel updated to 4.4.59, Intel® Xeon® Platinum processor 8180, 2.5GHz, 28 cores, turbo and HT on, BIOS SE5C620.86B.01.00.0412.020920172159 , 384GB total memory, 12 DIMMs / 32GB / Configured Clock Speed: 2666 MHz / DDR4 DIMM, 2 x Intel® 82599ES 10 Gigabit Ethernet Controllers – all 4 ports on the 2 network controllers were bonded for an aggregate 40000Mb/s bond. No storage – in-memory workload. Clients: 8 client systems were used to concurrently submit queries to the servers and drive the workload. The same clients were used in both “baseline” and “new”. The clients were configured as follows: CentOS 7.2 with kernel 3.10.0-327. Intel® Xeon® processor E5-2697 v4, 2.3GHz, 18 cores, turbo and HT on, BIOS SE5C610.86B.01.01.0016.033120161139, 128GB total memory, 8 DIMMs / 16GB / Configured Clock Speed: 2400 MHz, 1 x Intel® 82599ES 10 Gigabit Ethernet Controllers. The database was populated with 200 M records of 100 bytes each and benchmarked with the Aerospike Java Benchmark tool (https://github.com/aerospike/aerospike-client-java ) . The workload simulated 95%/5% read/update ratio. Two Aerospike instances were launched on a single server forming a cluster. Each Aerospike instance was affinitized to a CPU socket and configured to use one of the 10GB NICs. Each 10GB NIC had its interrupt IRQs affinitized to a CPU socket. Test configuration: April 2017
Up to 1.63x for Technical Computing workloads 1st Generation Intel® Xeon® Gold processor

a. PERMAS by INTES is an advanced Finite Element software system that offers a complete range of physical models at high performance, quality, and reliability. It plays a mission-critical role in the design process at customers from automotive, ship design, aerospace, and more. E5-2697 v4: 2S Intel® Xeon® processor E5-2697v4, 2.3GHz, 18 cores, turbo on, HT off, NUMA on, BIOS 338.R00, 256 GB total memory (8x 32GB w/ 2400 MT/s, DDR4 LRDIMM), 4x Intel® SSD DC P3600 2 TB in RAID 0 (stripe size 64k). CentOS Linux* release 7.2, kernel 3.10.0-327.13.1.el7.x86_​64. Intel® Composer 2015.5.223. INTES PERMAS V16.00. Gold 6148: Intel® Xeon® Gold 6148 processor, 2.4 GHz, 20 cores, turbo on, HT off, NUMA on, BIOS SE5C620.86B.01.00.0412.020920172159, 384 GB total memory (12x 32GB w/ 2400 MT/s, DDR4 LRDIMM), 3x Intel® SSD DC P3600 2 TB in RAID 0 (stripe size 64k), CentOS* Linux* release 7.3, kernel 3.10.0-514.10.2.el7.x86_​64. Intel® Composer 2015.7.235. INTES PERMAS V16.00.

b. LS-DYNA is the leading product in the crash simulation market. It is used by the automobile, aerospace, construction, military, manufacturing, and bioengineering industries worldwide. Workload: 2M elements Car2car model with 120ms simulation time. LS-DYNA explicit standard benchmarks tested by Intel, March 2017. E5-2697 v4: 2S Intel® Xeon® processor E5-2697 v4, 2.3GHz, 18 cores, turbo and HT on, BIOS SE5C610.86B.01.01.0016.033120161139, 128GB total memory, 8 memory channels / 8x16GB / 2400 MT/s / DDR4, Red Hat Enterprise Linux* 7.3 kernel 3.10.0-229.20.1.el6.x86_64.knl2. Gold 6148: 2S Intel® Xeon® Gold 6148 processor, 2.4GHz, 20 cores, turbo and HT on, BIOS version 412, 192GB total memory, 12 memory channels / 12x16GB / 2400 MT/s / DDR4, Red Hat Enterprise Linux* 7.3 kernel 3.10.0-514.el7.x86_64.

c. Binomial option pricing is a lattice-based approach that uses a discrete-time model of the varying price over time of the underlying financial instrument. This is a compute-bound, double-precision workload. FSI Binomial workload. OS: Red Hat Enterprise Linux* 7.2 kernel 3.10.0-327. Testing by Intel March 2017. E5-2697 v4: 2S Intel® Xeon® processor CPU E5-2697 v4, 2.3GHz, 36 cores, turbo and HT on, BIOS 86B0271.R00, 128GB total memory, 8 slots / 16GB / 2400 MT/s / DDR4 RDIMM, 1 x 1TB SATA, Red Hat Enterprise Linux* 7.2 kernel 3.10.0-327. Gold 6148: Intel® Xeon® Gold 6148 processor, 2.4GHz, 40 cores, turbo and HT on, BIOS 86B.01.00.0412, 192GB total memory, 12 slots / 16 GB / 2666 MT/s / DDR4 RDIMM, 1 x 800GB INTEL® SSD SC2BA80, Red Hat Enterprise Linux* 7.2 kernel 3.10.0-327.

d. Monte Carlo is a numerical method that uses statistical sampling techniques to approximate solutions to quantitative problems. In finance, Monte Carlo algorithms are used to evaluate complex instruments, portfolios, and investments. This is a compute-bound, double-precision workload. FSI Monte Carlo workload. OS: Red Hat Enterprise Linux* 7.2 kernel 3.10.0-327. Testing by Intel March 2017. E5-2697 v4: 2S Intel® Xeon® processor CPU E5-2697 v4, 2.3GHz, 36 cores, turbo and HT on, BIOS 86B0271.R00, 128GB total memory, 8 x 16GB 2400 MHz DDR4 RDIMM, 1 x 1TB SATA, Red Hat Enterprise Linux* 7.2 kernel 3.10.0-327. Gold 6148: Intel® Xeon® Gold 6148 processor @ 2.4GHz, H0QS, 40 cores 150W. QMS1, turbo and HT on, BIOS SE5C620.86B.01.00.0412.020920172159, 192GB total memory, 12 x 16 GB 2666 MHz DDR4 RDIMM, 1 x 800GB INTEL SSD SC2BA80, Red Hat Enterprise Linux* 7.2 kernel 3.10.0-327.

e. Black-Scholes is a popular mathematical model used in finance for European option valuation. This is a double precision version. E5-2697 v4: 2S Intel® Xeon® processor CPU E5-2697 v4, 2.3GHz, 36 cores, turbo and HT on, BIOS 86B0271.R00, 128GB total memory, 8 x16GB 2400 MHz DDR4 RDIMM, 1 x 1TB SATA, Red Hat Enterprise Linux* 7.2 kernel 3.10.0-327. Gold 6148: Intel® Xeon® Gold 6148 processor@ 2.4GHz, H0QS, 40 cores 150W. QMS1, turbo and HT on, BIOS SE5C620.86B.01.00.0412.020920172159, 192GB total memory, 12 x 16 GB 2666 MHz DDR4 RDIMM, 1 x 800GB INTEL SSD SC2BA80, Red Hat Enterprise Linux* 7.2 kernel 3.10.0-327.

f. Amber* is a suite of programs for classical molecular dynamics and statistical analysis. The main MD program, PMEMD (Particle Mesh Ewald Molecular Dynamics), employs two separate algorithms for implicit- and explicit-solvent dynamics. Here performance for explicit solvent (PME) is presented. Amber: Version 16 with all patches applied as of December 2016. Workloads: PME Cellulose NVE (408K atoms), PME stmv (1M atoms), GB Nucleosome (25K), GB Rubisco (75K). No cut-off was used for GB workloads. Compiled with -mic2_spdp -intelmpi -openmp, -DMIC2 * defined. Tests performed in March 2017. E5-2697 v4: Executed with 36 MPI, 2 OpenMP. 2S Intel® Xeon® processor E5-2697 v4, 2.3GHz, 36 cores, turbo and HT on, BIOS 86B0271.R00, 8x16GB 2400MHz DDR4, Red Hat Enterprise Linux* 7.2 kernel 3.10.0-327. Gold 6148: Executed with 40 MPI and 2 OpenMP. 2S Intel® Xeon® Gold 6148 processor, 2.4GHz, 40 cores, turbo on, HT on, BIOS 86B.01.00.0412.R00, 12x16GB 2666MHz DDR4, Red Hat Enterprise Linux* 7.2 kernel 3.10.0-327.

Based on Geomean of Weather Research Forecasting - Conus 12Km, HOMME, LSTC LS-DYNA Explicit, INTES PERMAS V16, MILC, GROMACS water 1.5M_pme, VASP Si256, NAMD stmv, LAMMPS, Amber GB Nucleosome, Binomial option pricing, Black-Scholes, Monte Carlo European options. Test configuration: March/April 2017
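
The Amber runs in item f pair MPI ranks with OpenMP threads ("36 MPI, 2 OpenMP" and "40 MPI and 2 OpenMP"). The sketch below checks that these decompositions fill every logical CPU on each system. It assumes Hyper-Threading exposes two logical CPUs per physical core, and the function names are invented for this illustration.

```python
def logical_cpus(physical_cores: int, hyperthreading: bool) -> int:
    """Logical CPUs visible to the OS (assumes 2 threads per core with HT on)."""
    return physical_cores * (2 if hyperthreading else 1)

def threads_in_use(mpi_ranks: int, omp_threads_per_rank: int) -> int:
    """Total software threads in a hybrid MPI + OpenMP run."""
    return mpi_ranks * omp_threads_per_rank

# 2S E5-2697 v4: 36 cores in total, HT on, run as 36 MPI x 2 OpenMP.
e5_used, e5_avail = threads_in_use(36, 2), logical_cpus(36, True)
# 2S Gold 6148: 40 cores in total, HT on, run as 40 MPI x 2 OpenMP.
gold_used, gold_avail = threads_in_use(40, 2), logical_cpus(40, True)

print(e5_used, e5_avail)      # prints 72 72
print(gold_used, gold_avail)  # prints 80 80
```

In both cases one MPI rank lands on each physical core and its two OpenMP threads occupy that core's two hardware threads, so neither system is oversubscribed.
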
RADIOSS* 1st Generation Intel® Xeon® Gold processor BASELINE: Altair RADIOSS 14 on Red Hat Enterprise Linux* 6.5, 2 Intel® Xeon® processor E5-2697 v3, 2.6GHz, 28 cores, Hyperthreading with 28 MPI x 2 OpenMP, 64GB DDR3/1833, regular DIMM, Intel® SSD DC S3700 800GB, 1Gb network, Source is Altair internal as of April 1, 2016. Next GEN: Altair RADIOSS 2017 on Red Hat Enterprise Linux 6.5, 2 Intel® Xeon® processor E5-2699 v4, 2.2GHz, 44 cores, Hyperthreading with 44 MPI x 2 OpenMP, 64GB DDR3/1833, regular DIMM, Intel® SSD DC S3700 800GB, 1Gb network, Bios SE5C620.86B.01.00.0412, Source is Altair internal as of April 1, 2017. NEW: Altair RADIOSS 2017 on CentOS Linux 7.2, 2 Intel® Xeon® Gold 6148 processor, 2.4GHz, 40 cores, Hyperthreading with 40 MPI x 2 OpenMP, 192GB DDR4/2666, regular DIMM, Intel® SSD DC S3700 800GB, 1Gb network, Bios 0271.R00 RADIOSS 2017, Neon 1M 8ms benchmark workload. NEON front car crash refined model with 1 million of elements, first 8ms run. Source is Altair internal as of April 11, 2017
Tencent InGame Purchase Machine Learning Platform* 1st Generation Intel® Xeon® Platinum processor OS: CentOS 7.3.1611. Testing by Intel May 2017. BASELINE: 2S Intel® Xeon® processor E5-2699 v4, 2.2GHz, 22 cores, turbo and HT on, 128GB total memory, 8 slots / 16GB / 2400 MT/s / DDR4, Intel® SSD DC S3700 800GB. NEW: 2S Intel® Xeon® Platinum 8180 pro Tencent InGame Purchase Machine Learning Platform* Test configuration: May 2017
Storage: Up to 5X IOPS while reducing latency by up to 70% 1st Generation Intel® Xeon® Platinum processor

BASELINE: Skylake w/ 4x P4600 no SPDK – CPU: 2S Intel® Xeon® Platinum 8170 CPU @ 2.10GHz; Memory: 196GB, 6x Memory Channels per socket, 1 16GB 2666 DDR4 DIMM per channel; Board: Intel Wolf Pass, BIOS: SE5C620.86B.01.00.0511.051220170820; Storage: 4x Intel P4600 1.6TB, 2 on socket 0 + 2 on socket 1; OS: Ubuntu 16.04.1; Linux Kernel: 4.11.0_​x86_​64; Turbo: On; HT: Disabled; C-States: Disabled; Power & Performance: Performance; Speed Stepping: Enabled; BenchMark: SPDK Perf; IODepth: 128; Block Size: 4096; RunTime: 300 sec: No. of Runs: 3 Times; Num of Cores: Core 0 (Single Core); IOPS – 4k random read: 614531; IOPS – 4k random writes: 588277; Lat – 4K random read: 833; Lat – 4K random writes: 870.

NEW: Skylake with 6x Intel® Optane + SPDK – CPU: 2S Intel® Xeon® Platinum 8168 CPU @ 2.70GHz; Memory: 196GB, 6x Memory Channels per socket, 1 16GB 2666 DDR4 DIMM per channel; Board: Intel Wolf Pass, BIOS: SE5C620.86B.01.00.0511.051220170820; Storage: 6x Intel P4800X 375GB, 1 on socket 0 + 5 on socket 1; OS: Ubuntu 16.04.1; Linux Kernel: 4.11.0_​x86_​64; SPDK Commit: 730a63d02b6; DPDK: 17.02; Turbo: On; HT: Disabled; C-States: Disabled; Power & Performance: Performance; Speed Stepping: Enabled; BenchMark: SPDK Perf; IODepth: 32; Block Size: 4096; RunTime: 300 sec: No. of Runs: 3 Times; Num of Cores: Core 0 (Single Core); IOPS – 4k random read: 3207706; IOPS – 4k random writes: 3005696; Lat – 4K random read: 239; Lat – 4K random writes: 255.

FIO with SPDK Test configuration: April 2017
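The headline figures for this row can be reproduced from the IOPS and latency values quoted in the BASELINE and NEW configurations above. The Python sketch below is illustrative only: the dictionary layout and helper names are assumptions made for this example. The source does not state the latency unit, which does not affect the result because only ratios are computed.

```python
# IOPS and latency values copied from the BASELINE and NEW configurations above.
# Dictionary keys and helper names are illustrative, not part of any Intel tool.
BASELINE = {"read_iops": 614531, "write_iops": 588277, "read_lat": 833, "write_lat": 870}
NEW = {"read_iops": 3207706, "write_iops": 3005696, "read_lat": 239, "write_lat": 255}


def speedup(new: float, old: float) -> float:
    """Ratio of a higher-is-better metric; above 1.0 means the new setup is faster."""
    return new / old


def reduction(new: float, old: float) -> float:
    """Fractional reduction of a lower-is-better metric such as latency."""
    return 1.0 - new / old


read_iops_gain = speedup(NEW["read_iops"], BASELINE["read_iops"])
write_iops_gain = speedup(NEW["write_iops"], BASELINE["write_iops"])
read_lat_cut = reduction(NEW["read_lat"], BASELINE["read_lat"])
write_lat_cut = reduction(NEW["write_lat"], BASELINE["write_lat"])

if __name__ == "__main__":
    print(f"4K random read IOPS gain:   {read_iops_gain:.2f}x")
    print(f"4K random write IOPS gain:  {write_iops_gain:.2f}x")
    print(f"4K random read latency cut:  {read_lat_cut:.1%}")
    print(f"4K random write latency cut: {write_lat_cut:.1%}")
```

Both IOPS ratios come out slightly above 5, and both latency reductions slightly above 70%, which is consistent with the "up to 5X" and "up to 70%" wording of the claim.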
Keepixo workload 1st Generation Intel® Xeon® Platinum processor OS: CentOS Linux* 7.3 kernel 3.10.0. Testing by Keepixo May 2017. BASELINE: 2S Intel® Xeon® processor E5-2699 v4, 2.2GHz, 22 cores, turbo and HT on, BIOS 251.R01, 64GB total memory, 8 slots / 8GB / 2133 MT/s / DDR4 LRDIMM, CentOS Linux* 7.1 kernel 3.10.0. NEW: 2S Intel® Xeon® Platinum 8168 processor, 2.7GHz, 24 cores, turbo and HT on, BIOS 412, 192GB total memory, 12 slots / 16GB / 2600 MT/s / DDR4 LRDIMM, CentOS Linux* 7.3 kernel 3.10.0. Keepixo workload Tested by Keepixo May 2017
Up to 4.2x VMs 1st Generation Intel® Xeon® Platinum processor Based on Intel® internal estimates 1-Node, 2 x Intel® Xeon® processor E5-2690 on Romley-EP with 256 GB Total Memory on VMware ESXi* 6.0 GA using Guest OS RHEL6.4, glassfish3.1.2.2, postgresql* 9.2. Data Source: Request Number: 1718, Benchmark: server virtualization workload, Score: 377.6 @ 21 VMs Higher is better vs. 1-Node, 2 x Intel® Xeon® Platinum 8180 processor on Wolf Pass SKX with 768 GB Total Memory on VMware ESXi 6.0 U3 GA using guest VMs running the RHEL 6 64-bit OS. Data Source: Request Number: 2563, Benchmark: server virtualization workload, Score: 1580 @ 90 VMs Higher is better. Server Virtualization workloads (VMware ESXi 6.0 GA) Test configuration: May 2017
Up to 65% lower 4-year TCO estimate 1st Generation Intel® Xeon® Platinum processor

TCO estimate example based on equivalent rack performance using a VMware ESXi* virtualized consolidation workload, comparing 20 installed 2-socket servers with Intel® Xeon® processor E5-2690 (formerly “Sandy Bridge-EP”) running VMware ESXi* 6.0 GA with Guest OS RHEL 6.4, at a total cost of $919,362, to 5 new Intel® Xeon® Platinum 8180 (Skylake) servers running VMware ESXi 6.0 U3 GA with Guest OS RHEL 6 64-bit, at a total cost of $320,879 including basic acquisition. Server pricing assumptions are based on current OEM retail published pricing for 2-socket servers with Broadwell-based Intel® Xeon® processor systems, and are subject to change based on actual pricing of systems offered.

Figure: Cost Table

TCO estimates based on system running VMware ESXi 6.0 Test configuration: May 2017
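As a quick check on the arithmetic behind the TCO row, the two total-cost figures and server counts quoted above give the consolidation ratio and the relative saving directly. This is a minimal sketch; the variable names are illustrative and the cost breakdown behind each total (shown only in the Cost Table figure) is not modeled.

```python
# Figures copied from the TCO estimate above; variable names are illustrative.
OLD_SERVERS, OLD_TOTAL_COST = 20, 919_362  # Intel Xeon processor E5-2690 servers, USD
NEW_SERVERS, NEW_TOTAL_COST = 5, 320_879   # Intel Xeon Platinum 8180 servers, USD

# Number of installed servers replaced by each new server.
consolidation_ratio = OLD_SERVERS / NEW_SERVERS

# Fraction of the 4-year total cost saved by the new configuration.
tco_saving = 1.0 - NEW_TOTAL_COST / OLD_TOTAL_COST

if __name__ == "__main__":
    print(f"Consolidation ratio: {consolidation_ratio:.0f}:1")
    print(f"Estimated TCO saving: {tco_saving:.1%}")
```

The saving evaluates to just over 65%, in line with the "up to 65% lower" wording of the claim.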
Up to 1.59x claim based on SAP testing of SAP HANA* workload 1st Generation Intel® Xeon® Platinum processor 1-Node, 4S Intel® Xeon® processor E7-8890 v4 on Grantley-EX-based platform with 1024 GB Total Memory on SLES12SP1 vs. estimates based on SAP internal testing on 1-Node, 4S Intel® Xeon® processor Scalable family (codename Skylake-SP) system. SAP HANA Tested by SAP, 2017
Security resulting in near zero encryption overhead for stored data 1st Generation Intel® Xeon® Platinum processor Near Zero encryption overhead: BigBench query Runtime/second. Testing done by Intel. BASELINE: Platform 8168, NODES 1 Mgmt + 6 Workers, Make Intel Corporation, Model S2600WFD, Form Factor 2U, Processor Intel® Xeon® Platinum 8168 processor, Base Clock 2.70 GHz, Cores per socket 24, Hyper-Threading Enabled, NUMA mode Enabled, RAM 384GB DDR4, RAM Type 12x 32GB DDR4, OS Drive Intel® SSD DC S3710 Series (800GB, 2.5in SATA 6Gb/s, 20nm, MLC), Data Drives 8x - Seagate Enterprise 2.5 HDD ST2000NX0403 2TB, Intel® SSD DC P3520 Series (2.0TB), Temp Drive DC 3520 2TB, NIC Intel® X722 10GbE - Dual Port, Hadoop Cloudera* 5.11, Benchmark BigBench*, Operating System CentOS Linux release 7.3.1611 (Core); HDFS encryption turned OFF. vs. NEW: Platform 8168, NODES 1 Mgmt + 6 Workers, Make Intel Corporation, Model S2600WFD, Form Factor 2U, Processor Intel® Xeon® Platinum 8168 processor, Base Clock 2.70 GHz, Cores per socket 24, Hyper-Threading Enabled, NUMA mode Enabled, RAM 384GB DDR4, RAM Type 12x 32GB DDR4, OS Drive Intel® SSD DC S3710 Series (800GB, 2.5in SATA 6Gb/s, 20nm, MLC), Data Drives 8x - Seagate Enterprise 2.5 HDD ST2000NX0403 2TB, Intel® SSD DC P3520 Series (2.0TB), Temp Drive DC 3520 2TB, NIC Intel X722 10GbE - Dual Port, Hadoop Cloudera 5.11, Benchmark BigBench, Operating System CentOS Linux release 7.3.1611 (Core); HDFS encryption turned ON. BigBench query Runtime/second Test configuration: April 2017
Technicolor proof of concept accelerated 3D rendering workload times by nearly 3X 1st Generation Intel® Xeon® Platinum processor Approximately 3x performance claim based on internal Technicolor, Inc. rendering workload: one-node, 2x Intel® Xeon® processor E5-2699 v4 with 32 GB total memory, 400 GB Intel® SSD DC S3510 Series, running on Windows® 10 Standard. Scores were normalized based on system 1 configuration as 1.0 baseline performance. Compared against one-node, 2x Intel® Xeon® Platinum 8180 processor with 32 GB total memory, Intel® Optane™ SSD DC P4800X, running on Windows® 10 Standard. Score: 2.95, as normalized against system 1. Technicolor, Inc. rendering workload Test configuration: April 2017
3.1X in SHA Algorithms for cryptographic hashing 1st Generation Intel® Xeon® Platinum processor Performance on single core with frequency obfuscation comparing Intel® Xeon® Platinum 8180 processor vs Intel® Xeon® E5-2650v4 processor. ISA-L Configuration: Intel® Xeon® processor Scalable family: Platinum 8180 processor, 28C, 2.5 GHz, H0, Neon City CRB, 12x16 GB DDR4 2666 MT/s ECC RDIMM, BIOS PLYCRB1.86B.0128.R08.1703242666. Intel® Xeon® E5-2650v4 processor, 12C, 2.2 GHz, Aztec City CRB, 4x8 GB DDR4 2400 MT/s ECC RDIMM, BIOS GRRFCRB1.86B.0276.R02.1606020546. Operating System: Redhat Enterprise Linux 7.3, Kernel 4.2.3, ISA-L 2.18, BIOS Configuration, P-States: Disabled, Turbo: Disabled, Speed Step: Disabled, C-States: Disabled, ENERGY_​PERF_​BIAS_​CFG: PERF Intelligent Storage Acceleration Library (ISA-L) Test configuration: April 2017
Up to 5x claim based on OLTP Warehouse 4S 1st Generation Intel® Xeon® Platinum processor 1-Node, 4 x Intel® Xeon® processor E7-4870 on Emerald Ridge with 512 GB Total Memory on Oracle Linux* 6.4 using Oracle* 12c running 800 warehouses. Data Source: Request Number: 56, Benchmark: HammerDB, Score: 2.46322e+006 Higher is better vs. 1-Node, 4 x Intel® Xeon® Platinum 8180 processor on Lightning Ridge SKX with 768 GB Total Memory on Red Hat Enterprise Linux* 7.3 using Oracle 12.2.0.1 (including database and grid) with 800 warehouses. Data Source: Request Number: 2542, Benchmark: HammerDB, Score: 1.2423e+007 Higher is better. Oracle* 12c running 800 warehouses Test configuration: May 2017
Up to 4.89x claim based on OLTP Warehouse 2S 1st Generation Intel® Xeon® Platinum processor 1-Node, 2 x Intel® Xeon® processor E5-2690 on Intel® Server Board S2600CP2 with 128 GB Total Memory on Oracle Linux* 6.4 using Oracle 11.2.0.3 with 5000 warehouses. Data Source: Request Number: 408, Benchmark: HammerDB, Score: 1.46826e+006 Higher is better vs. 1-Node, 2 x Intel® Xeon® Platinum 8180 processor on Purley-EP (Lewisburg) with 768 GB Total Memory on Oracle Linux* 7.2 using Oracle 12.1.0.2.0, HammerDB 2.18. Data Source: Request Number: 2510, Benchmark: HammerDB, Score: 7.18049e+006 Higher is better. Oracle Linux* 6.4 using Oracle 11.2.0.3 with 5000 warehouses Test configuration: February 2017
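The two OLTP warehouse claims above follow from dividing the new HammerDB score by the baseline score. The sketch below reproduces that division with the scores quoted in the two rows. The dictionary layout and function name are assumptions made for this example.

```python
# HammerDB scores copied from the two OLTP warehouse rows above (higher is better).
# Each entry is (baseline score, new score); the layout is illustrative only.
OLTP_RESULTS = {
    "4S: E7-4870 vs Platinum 8180": (2.46322e6, 1.2423e7),
    "2S: E5-2690 vs Platinum 8180": (1.46826e6, 7.18049e6),
}


def generational_gain(baseline_score: float, new_score: float) -> float:
    """Return new/baseline for a higher-is-better benchmark score."""
    return new_score / baseline_score


gains = {label: generational_gain(*scores) for label, scores in OLTP_RESULTS.items()}

if __name__ == "__main__":
    for label, gain in gains.items():
        print(f"{label}: {gain:.2f}x")
```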
Performance claims of HPC workloads comparing c4.8xlarge instances to c5.18xlarge instances 1st Generation Intel® Xeon® Platinum processor

AWS c4.8xlarge instance details: 1-Node, 2x Intel® Xeon® Processor E5-2666 v3 (2.9GHz, 9C, 36T, 24MB L3 cache) with 60GB Total Memory on RHEL 7.4 3.10.0-693.el7.x86_​64, Xen Hypervisor, Turbo ON, Intel® Compiler ICC18.0. AWS c5.18xlarge instance: 1-Node, 2x Intel® Xeon® Platinum processor (3.0GHz, 18C, 72T, 24MB L3 cache) with 144GB Total Memory on RHEL 7.4 3.10.0-693.el7.x86_​64, KVM Hypervisor, Turbo ON, Intel® Compiler ICC18.0.

Figure: HPC Results

Testing conducted on HPC applications and workloads comparing AWS c4.8xlarge vs. c5.18xlarge instances Test configuration: March/April 2017
Cost savings for model training by comparing c4.8xlarge instances to c5.18xlarge instances 1st Generation Intel® Xeon® Platinum processor TCO model: Given Cost: C5=$3.06/Hr, C4=$1.59/Hr. Cost Ratio C5/C4=1.92. TCO Ratio between C5/C4 = Cost Ratio (C5/C4) / Perf Ratio (C5/C4). The C5/C4 perf ratio needs to be > 1.92 to be cost efficient. Cost model based on current EC2 pricing: (https://aws.amazon.com/ec2/pricing/on-demand ). c5 cost=$3.06/hr, c4 cost=$1.59/hr. Workload cost based on pro-rated hourly rental costs for all instances used. Test configuration: March/April 2017
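The TCO model in this row reduces to two divisions, sketched below in Python. The hourly prices are the ones quoted above. The function names and the sample performance ratios in the usage section are hypothetical and are not measured results.

```python
# Hourly prices quoted in the row above (USD per hour).
C5_COST_PER_HOUR = 3.06
C4_COST_PER_HOUR = 1.59

# Cost ratio C5/C4, quoted above as 1.92.
COST_RATIO = C5_COST_PER_HOUR / C4_COST_PER_HOUR


def tco_ratio(perf_ratio: float) -> float:
    """TCO ratio C5/C4 = cost ratio (C5/C4) / perf ratio (C5/C4)."""
    return COST_RATIO / perf_ratio


def is_cost_efficient(perf_ratio: float) -> bool:
    """C5 is cost efficient when its TCO ratio against C4 falls below 1.0."""
    return tco_ratio(perf_ratio) < 1.0


if __name__ == "__main__":
    print(f"Cost ratio C5/C4: {COST_RATIO:.2f}")
    # Hypothetical performance ratios, chosen only to show both outcomes.
    for perf in (1.5, 2.0, 3.0):
        print(f"perf ratio {perf:.1f}: TCO ratio {tco_ratio(perf):.2f}, "
              f"cost efficient: {is_cost_efficient(perf)}")
```

The break-even point is a performance ratio equal to the cost ratio, which is why the row states that the C5/C4 performance ratio must exceed 1.92.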
Throughput performance claims of images/sec comparing c4.8xlarge instances to c5.18xlarge instances 1st Generation Intel® Xeon® Platinum processor Throughput performance claims of images/sec comparing c4.8xlarge instances to c5.18xlarge instances. Testing by Intel on AWS EC2 c4.8xlarge instance (1+16 nodes, 2x Intel® Xeon® processor E5 v4 (18C), 60GB total memory) and c5.18xlarge instance (1+16 nodes, 2x Intel® Xeon® processor Scalable family (36C), 144GB total memory), using BigDL framework (version https://github.com/intel-analytics/BigDL, dataset version Cifar-10), VGG cifar-10, ResNet-50, ResNet-152 topology, EBS-optimized GP2 storage, running RHEL 7.4 3.10.0-693.el7.x86_​64, HT on, Turbo on. Variables: performance command - Training throughput measured as an average of per-iteration throughput in images/sec; data setup - data was stored on local storage and cached in memory before training; Oracle Java 1.8.0_​152, Apache Hadoop 2.7.4, Apache Spark 2.2.0, Apache maven 3.5.2, Protobuf 2.5. Tests conducted comparing c5.18xlarge with c4.8xlarge Test configuration: April 2017
1.73x Average Performance 1st Generation Intel® Xeon® Platinum processor

a. Up to 1.33x on TPC*-E: 1-Node, 2 x Intel® Xeon® processor E5-2699 v4 on Lenovo* Group Limited with 512 GB Total Memory on Windows Server* 2012 Standard using SQL Server 2016 Enterprise Edition. Data Source: http://www.tpc.org/4076, Benchmark: TPC Benchmark* E (TPC-E), Score: 4938.14 vs. 1-Node, 2 x Intel® Xeon® Platinum 8180 processor on Lenovo Group Limited with 1536 GB Total Memory on Windows Server* 2016 Standard using SQL Server 2017 Enterprise Edition. Data Source: http://www.tpc.org/4080, Benchmark: TPC Benchmark* E (TPC-E), Score: 6598.36. Higher is better.

b. Up to 1.40x on SPECvirt_sc* 2013: Claim based on best-published 2-socket SPECvirt_sc* 2013 result submitted to/published at http://www.spec.org/virt_sc2013/results/res2016q3/virt_sc2013-20160823-00060-perf.html as of 11 July 2017, Score: 2359 @ 137 VMs vs. 1-Node, 2 x Intel® Xeon® Platinum 8180 processor with 768 GB (24 x 32 GB, 2R x4 PC4-2666 DDR4 2666MHz RDIMM) Total Memory on SUSE Linux Enterprise Server 12 SP2. Data Source: https://www.spec.org/virt_sc2013/results/res2016q3/virt_sc2013-20160823-00060-perf.html, Benchmark: SPECvirt_sc* 2013, Score: 3323 @ 189 VMs. Higher is better.

c. Up to 1.44x on 2-Tier SAP* SD: Claim based on best-published two-socket SAP SD 2-Tier on Linux* results published at https://www.sap.com/dmc/exp/2018-benchmark-directory/#/sd as of 11 July 2017. New configuration: 2-tier, 2 x Intel® Xeon® Platinum 8180 processor (56 cores/112 threads) on DellEMC PowerEdge* R740xd with 768 GB total memory on Red Hat Enterprise Linux* 7.3 using SAP Enhancement Package 5 for SAP ERP 6.0, SAP NetWeaver 7.22 pl221, and Sybase ASE 16.0. Source: Certification #: 2017017: https://www.sap.com/dmc/exp/2018-benchmark-directory/#/sd, SAP* SD 2-Tier enhancement package 5 for SAP ERP 6.0 score: 32,085 benchmark users. Higher is better.

d. Up to 1.53x on SPECint*_​rate_​base2006: Claim based on best-published two-socket SPECint*_​rate_​base2006 result submitted to/published at http://www.spec.org/cpu2006/results/ as of 11 July 2017. New configuration: 1-Node, 2 x Intel® Xeon® Platinum 8180 processor on Huawei 2288H V5 with 384 GB total memory on SUSE Linux Enterprise Server 12 SP2 (x86_​64) Kernel 4.4.21-69-default, using C/C++: Version 17.0.1.132 of Intel® C/C++ Compiler for Linux. Source: https://www.spec.org/cpu2006/results/res2017q3/cpu2006-20170627-47389.pdf, SPECint*_​rate_​base2006 Score: 2800. Higher is better.

e. Up to 1.58x on SPECjbb*2015 MultiJVM critical-jOPS: Claim based on best-published two-socket SPECjbb*2015 MultiJVM critical-jOPS results published at http://www.spec.org/jbb2015/results/jbb2015multijvm.html as of 11 July 2017. New configuration: 1-Node, 2 x Intel® Xeon® Platinum 8180 processor on Cisco* Systems UCS C240 M5 with 1536 GB total memory on Red Hat Enterprise Linux* 7.3 (Maipo) using Java* HotSpot 64-bit Server VM, version 1.8.0_​131. Source: https://www.spec.org/jbb2015/results/res2017q3/jbb2015-20170622-00197.html, SPECjbb2015* - MultiJVM scores: 141,360 max-jOPS and 118,551 critical-jOPS. Higher is better.

f. Up to 1.65x on SPECfp*_​rate_​base2006: Claim based on best-published two-socket SPECfp*_​rate_​base2006 result submitted to/published at http://www.spec.org/cpu2006/results/ as of 11 July 2017. New configuration: 1-Node, 2 x Intel® Xeon® Platinum 8180 processor on Huawei 2288H V5 with 384 GB total memory on SUSE Linux Enterprise Server 12 SP2 (x86_​64) Kernel 4.4.21-69-default, using C/C++ and Fortran: Version 17.0.0.098 of Intel® C/C++ and Intel® Fortran Compiler for Linux. Source: http://www.spec.org/cpu2006/results/res2017q3/cpu2006-20170627-47387.pdf, SPECfp*_​rate_​base2006 Score: 1850. Higher is better.

g. Up to 1.69x on STREAM - triad: 1-Node, 2 x Intel® Xeon® processor E5-2699 v4 on Grantley-EP (Wellsburg) with 256 GB Total Memory on Red Hat Enterprise Linux* 6.5 kernel 2.6.32-431 using Stream NTW avx2 measurements. Data Source: Request Number: 1709, Benchmark: STREAM - Triad, Score: 127.7 Higher is better vs. 1-Node, 2 x Intel® Xeon® Platinum 8180 processor on Neon City with 384 GB Total Memory on Red Hat Enterprise Linux* 7.2-kernel 3.10.0-327 using STREAM AVX 512 Binaries. Data Source: Request Number: 2500, Benchmark: STREAM - Triad, Score: 216. Higher is better.

h. Up to 1.73x on HammerDB: 1-Node, 2 x Intel® Xeon® processor E5-2699 v4 on Grantley-EP (Wellsburg) with 384 GB Total Memory on Red Hat Enterprise Linux* 7.1 kernel 3.10.0-229 using Oracle 12.1.0.2.0 (including database and grid) with 800 warehouses, HammerDB 2.18. Data Source: Request Number: 1645, Benchmark: HammerDB, Score: 4.13568e+006 Higher is better vs. 1-Node, 2 x Intel® Xeon® Platinum 8180 processor on Purley-EP (Lewisburg) with 768 GB Total Memory on Oracle Linux* 7.2 using Oracle 12.1.0.2.0, HammerDB 2.18. Data Source: Request Number: 2510, Benchmark: HammerDB, Score: 7.18049e+006 Higher is better.

i. Up to 1.73x on LAMMPS: LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. It is used to simulate the movement of atoms to develop better therapeutics, improve alternative energy devices, develop new materials, and more. E5-2697 v4: 2S Intel® Xeon® processor E5-2697 v4, 2.3GHz, 36 cores, Intel® Turbo Boost Technology and Intel® Hyperthreading Technology on, BIOS 86B0271.R00, 8x16GB 2400MHz DDR4, Red Hat Enterprise Linux* 7.2 kernel 3.10.0-327. Gold 6148: 2S Intel® Xeon® Gold 6148 processor, 2.4GHz, 40 cores, Intel® Turbo Boost Technology and Intel® Hyperthreading Technology on, BIOS 86B.01.00.0412.R00, 12x16GB 2666MHz DDR4, Red Hat Enterprise Linux* 7.2 kernel 3.10.0-327.

j. Up to 1.77x on DPDK L3 Packet Forwarding: E5-2658 v4: 5 x Intel® XL710-QDA2, DPDK 16.04. Benchmark: DPDK l3fwd sample application Score: 158 Gbits/s packet forwarding at 256B packet using cores. Gold 6152: Estimates based on Intel internal testing on Intel® Xeon® 6152 2.1 GHz, 2x Intel® FM10420 (RRC) Gen Dual Port 100GbE Ethernet controller (100Gbit/card), 2x Intel® XXV710 PCI Express Gen Dual Port 25GbE Ethernet controller (2x25G/card), DPDK 17.02. Score: 281 Gbits/s packet forwarding at 256B packet using cores, IO and memory on a single socket.

k. Up to 1.87x on Black-Scholes: Black-Scholes is a popular mathematical model used in finance for European option valuation. This is a double precision version. E5-2697 v4: 2S Intel® Xeon® processor CPU E5-2697 v4, 2.3GHz, 36 cores, turbo and HT on, BIOS 86B0271.R00, 128GB total memory, 8 x 16GB 2400 MHz DDR4 RDIMM, 1 x 1TB SATA, Red Hat Enterprise Linux* 7.2 kernel 3.10.0-327. Gold 6148: Intel® Xeon® Gold processor 6148 @ 2.4GHz, H0QS, 40 cores 150W. QMS1, turbo and HT on, BIOS SE5C620.86B.01.00.0412.020920172159, 192GB total memory, 12 x 16 GB 2666 MHz DDR4 RDIMM, 1 x 800GB INTEL® SSD SC2BA80, Red Hat Enterprise Linux* 7.2 kernel 3.10.0-327.

l. Up to 2.27x on LINPACK*: 1-Node, 2 x Intel® Xeon® processor E5-2699 v4 on Grantley-EP (Wellsburg) with 64 GB Total Memory on Red Hat Enterprise Linux* 7.0 kernel 3.10.0-123 using MP_​LINPACK 11.3.1 (Composer XE 2016 U1). Data Source: Request Number: 1636, Benchmark: Intel® Distribution of LINPACK, Score: 1446.4 Higher is better vs. 1-Node, 2 x Intel® Xeon® Platinum 8180 processor on Wolf Pass SKX with 384 GB Total Memory on Red Hat Enterprise Linux* 7.3 using mp_​linpack_​2017.1.013. Data Source: Request Number: 3753, Benchmark: Intel® Distribution of LINPACK, Score: 3295.57. Higher is better.

m. 2.2X and 2.4X deep learning training and inference ResNet-18 performance: Inference throughput batch size 1, Training throughput batch size 256. Source: Intel measured as of June 2017. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Optimization Notice: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel® microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel® microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel® microprocessors. Platform: 2S Intel® Xeon® Platinum 8180 CPU @ 2.50GHz (28 cores), HT disabled, turbo disabled, scaling governor set to “performance” via intel_pstate driver, 384GB DDR4-2666 ECC RAM. CentOS Linux release 7.3.1611 (Core), Linux kernel 3.10.0-514.10.2.el7.x86_64. SSD: Intel® SSD DC S3700 Series (800GB, 2.5in SATA 6Gb/s, 25nm, MLC). Performance measured with: Environment variables: KMP_AFFINITY='granularity=fine, compact', OMP_NUM_THREADS=56, CPU Freq set with cpupower frequency-set -d 2.5G -u 3.8G -g performance. Deep Learning Frameworks: Neon: ZP/MKL_CHWN branch commit id: 52bd02acb947a2adabb8a227166a7da5d9123b6d. Dummy data was used. The main.py script was used for benchmarking, in mkl mode. ICC version used: 17.0.3 20170404, Intel® Math Kernel Library (Intel® MKL) small libraries version 2018.0.20170425. Platform: 2S Intel® Xeon® CPU E5-2699 v4 @ 2.20GHz (22 cores), HT enabled, turbo disabled, scaling governor set to “performance” via acpi-cpufreq driver, 256GB DDR4-2133 ECC RAM. CentOS Linux release 7.3.1611 (Core), Linux kernel 3.10.0-514.10.2.el7.x86_64. SSD: Intel® SSD DC S3500 Series (480GB, 2.5in SATA 6Gb/s, 20nm, MLC). Performance measured with: Environment variables: KMP_AFFINITY='granularity=fine, compact,1,0', OMP_NUM_THREADS=44, CPU Freq set with cpupower frequency-set -d 2.2G -u 2.2G -g performance. Deep Learning Frameworks: Neon: ZP/MKL_CHWN branch commit id: 52bd02acb947a2adabb8a227166a7da5d9123b6d. Dummy data was used. The main.py script was used for benchmarking, in mkl mode. ICC version used: 17.0.3 20170404, Intel® Math Kernel Library (Intel® MKL) small libraries version 2018.0.20170425.

Geomean based on Normalized Generational Performance (estimated based on Intel internal testing of OLTP Brokerage, SAP SD 2-Tier, HammerDB, Server-side Java, SPEC*int_rate_base2006, SPEC*fp_rate_base2006, Server Virtualization, STREAM* triad, LAMMPS, DPDK L3 Packet Forwarding, Black-Scholes, Intel® Distribution for LINPACK, AI deep learning training on Neon ResNet18, AI deep learning inference on Neon ResNet18). Test configuration: June 2017
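A geometric mean of normalized results is the n-th root of their product. The sketch below applies that definition to the "up to" factors listed in items a through m. Using those factors as the inputs is an assumption made for this example, since the source does not list the exact normalized values behind the 1.73x figure. The labels and the assignment of 2.2X to training and 2.4X to inference (following the word order in item m) are likewise illustrative.

```python
import math

# "Up to" factors copied from items a through m above. Treating them as the
# geomean inputs is an assumption; the source does not publish the inputs.
FACTORS = {
    "a. TPC-E": 1.33,
    "b. SPECvirt_sc 2013": 1.40,
    "c. SAP SD 2-Tier": 1.44,
    "d. SPECint_rate_base2006": 1.53,
    "e. SPECjbb2015 critical-jOPS": 1.58,
    "f. SPECfp_rate_base2006": 1.65,
    "g. STREAM triad": 1.69,
    "h. HammerDB": 1.73,
    "i. LAMMPS": 1.73,
    "j. DPDK L3 forwarding": 1.77,
    "k. Black-Scholes": 1.87,
    "l. LINPACK": 2.27,
    "m. ResNet-18 training": 2.2,
    "m. ResNet-18 inference": 2.4,
}


def geomean(values) -> float:
    """Geometric mean computed in log space to avoid a large running product."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


average_gain = geomean(FACTORS.values())

if __name__ == "__main__":
    print(f"Geomean of {len(FACTORS)} factors: {average_gain:.2f}x")
```

Under this assumption the geomean works out to roughly 1.73, matching the headline figure for this row.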
Deep Learning: Up to 198x Inference Throughput Performance claim 1st Generation Intel® Xeon® Platinum processor Based on Intel® Optimized Caffe* GoogleNetV1 compared to 3 year old system using un-optimized software. INFERENCE using FP32 Batch Size Caffe GoogleNet v1 256. Processor :2 socket Intel® Xeon® Platinum 8180 CPU @ 2.50GHz / 28 cores HT ON , Turbo ON Total Memory 376.46GB (12slots / 32 GB / 2666 MHz).CentOS Linux-7.3.1611-Core , SSD sda RS3WC080 HDD 744.1GB,sdb RS3WC080 HDD 1.5TB,sdc RS3WC080 HDD 5.5TB , Deep Learning Framework caffe version: f6d01efbe93f70726ea3796a4b89c612365a6341 Topology :googlenet_​v1 BIOS:SE5C620.86B.00.01.0004.071220170215 MKLDNN: version: ae00102be506ed0fe2099c6557df2aa88ad57ec1 NoDataLayer. Measured: 1190 imgs/sec vs Platform: 2S Intel® Xeon® CPU E5-2699 v3 @ 2.30GHz (18 cores), HT enabled, turbo disabled, scaling governor set to “performance” via intel_​pstate driver, 256GB DDR4-2133 ECC RAM. CentOS Linux release 7.3.1611 (Core), Linux kernel 3.10.0-514.el7.x86_​64. OS drive: Seagate* Enterprise ST2000NX0253 2 TB 2.5" Internal Hard Drive. Performance measured with: Environment variables: KMP_​AFFINITY='granularity=fine, compact,1,0‘, OMP_​NUM_​THREADS=36, CPU Freq set with cpupower frequency-set -d 2.3G -u 2.3G -g performance. Deep Learning Frameworks: Intel® Caffe*: (http://github.com/intel/caffe/ ), revision b0ef3236528a2c7d2988f249d347d5fdae831236. Inference measured with “caffe time --forward_​only” command, training measured with “caffe time” command. For “ConvNet” topologies, dummy dataset was used. For other topologies, data was stored on local storage and cached in memory before training. 
Topology specs from https://github.com/intel/caffe/tree/master/models/intel_optimized_models (GoogLeNet, AlexNet, and ResNet-50), https://github.com/intel/caffe/tree/master/models/default_vgg_19 (VGG-19), and https://github.com/soumith/convnet-benchmarks/tree/master/caffe/imagenet_winners (ConvNet benchmarks; files were updated to use newer Caffe prototxt format but are functionally equivalent). GCC 4.8.5, MKLML version 2017.0.2.20170110. BVLC-Caffe: https://github.com/BVLC/caffe, Inference & Training measured with “caffe time” command. For “ConvNet” topologies, dummy dataset was used. For other topologies, data was stored on local storage and cached in memory before training BVLC Caffe (http://github.com/BVLC/caffe ), revision 91b09280f5233cafc62954c98ce8bc4c204e7475 (commit date 5/14/2017). BLAS: atlas ver. 3.10.1. Intel® Optimized Caffe* GoogleNetV1 Test configuration: May 2017
Deep Learning: Up to 127x Training Throughput Performance claim 1st Generation Intel® Xeon® Platinum processor Based on Intel® Optimized Caffe* AlexNet compared to 3 year old system using un-optimized software. Batch Size AlexNet 256. Processor: 2x Intel® Xeon® Platinum 8180 CPU @ 2.50GHz / 28 cores HT ON , Turbo ON Total Memory 376.28GB (12slots / 32 GB / 2666 MHz).CentOS Linux-7.3.1611-Core , SSD sda RS3WC080 HDD 744.1GB,sdb RS3WC080 HDD 1.5TB,sdc RS3WC080 HDD 5.5TB , Deep Learning Framework caffe version: f6d01efbe93f70726ea3796a4b89c612365a6341 Topology :alexnet BIOS:SE5C620.86B.00.01.0009.101920170742 MKLDNN: version: ae00102be506ed0fe2099c6557df2aa88ad57ec1 NoDataLayer. Measured: 1023 imgs/sec vs Platform: 2S Intel® Xeon® CPU E5-2699 v3 @ 2.30GHz (18 cores), HT enabled, Turbo disabled, scaling governor set to “performance” via intel_​pstate driver, 256GB DDR4-2133 ECC RAM. CentOS Linux release 7.3.1611 (Core), Linux kernel 3.10.0-514.el7.x86_​64. OS drive: Seagate* Enterprise ST2000NX0253 2 TB 2.5" Internal Hard Drive. Performance measured with: Environment variables: KMP_​AFFINITY='granularity=fine, compact,1,0‘, OMP_​NUM_​THREADS=36, CPU Freq set with cpupower frequency-set -d 2.3G -u 2.3G -g performance. Deep Learning Frameworks: Intel® Caffe*: (http://github.com/intel/caffe/ ), revision b0ef3236528a2c7d2988f249d347d5fdae831236. Inference measured with “caffe time --forward_​only” command, training measured with “caffe time” command. For “ConvNet” topologies, dummy dataset was used. For other topologies, data was stored on local storage and cached in memory before training. 
Topology specs from https://github.com/intel/caffe/tree/master/models/intel_optimized_models (GoogLeNet, AlexNet, and ResNet-50), https://github.com/intel/caffe/tree/master/models/default_vgg_19 (VGG-19), and https://github.com/soumith/convnet-benchmarks/tree/master/caffe/imagenet_winners (ConvNet benchmarks; files were updated to use newer Caffe prototxt format but are functionally equivalent). GCC 4.8.5, MKLML version 2017.0.2.20170110. BVLC-Caffe: https://github.com/BVLC/caffe, Inference & Training measured with “caffe time” command. For “ConvNet” topologies, dummy dataset was used. For other topologies, data was stored on local storage and cached in memory before training BVLC Caffe (http://github.com/BVLC/caffe ), revision 91b09280f5233cafc62954c98ce8bc4c204e7475 (commit date 5/14/2017). BLAS: atlas ver. 3.10.1. Intel® Optimized Caffe* AlexNe Test configuration: May 2017
Up to 1.48x performance per core vs. competing x86 processors claim 1st Generation Intel® Xeon® Platinum processor Based on 2S Intel® Xeon® Platinum 8180 Scalable processor vs. 2S AMD EPYC* 7601 using SPECrate*2017_​int_​base benchmark by running 1 copy of the test on 1 core with hyper-threading on. Configuration 1: Intel® Xeon® Platinum 8180: Intel® Xeon®-based Reference Platform with 2x Intel® Xeon® 8180 (2.5GHz, 28 core) processors, BIOS version SE5C620.86B.00.01.0014.070920180847, 07/09/2018, microcode: 0x200004d, HT ON, Turbo ON, 12x 32GB DDR4-2666, 1 SSD, Ubuntu 18.04.1 LTS (4.17.0-041700-generic Retpoline), 1-copy SPEC* CPU2017 integer rate base benchmark compiled with Intel® Compiler 18.0.2 -O3, executed on 1 core using taskset and numactl on core 0. Estimated score = 6.59, as of 8/2/2018 tested by Intel. Configuration 2: AMD EPYC 7601: Supermicro AS-2023US-TR4 with 2S AMD EPYC 7601 with 2 AMD EPYC 7601 (2.2GHz, 32 core) processors, BIOS ver 1.1a, 4/26/2018, microcode: 0x8001227, SMT ON, Turbo ON, 16x32GB DDR4-2666, 1 SSD, Ubuntu 18.04.1 LTS (4.17.0-041700-generic Retpoline), 1-copy SPEC CPU2017 integer rate base benchmark compiled with AOCC ver 1.0 -Ofast, -march=znver1, executed on 1 core using taskset and numactl on core 0. Estimated score = 4.45. Testing by Intel. Performance results are based on testing as of 8/2/2018 and may not reflect all publicly available security updates. No product can be absolutely secure. SPECrate*2017_​int_​base Test configuration: August 2018
