Intel® Data Center GPU Max 1550 has up to 1.83x higher performance than NVIDIA A100 PCIe 80G and up to 1.48x higher performance than NVIDIA H100 PCIe on a cross-platform astrophysics code | DPEcho: Intel® Data Center GPU Max 1550: Testing by Intel as of 1/31/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 256GB DDR4-3200, 1x Intel® Data Center GPU Max 1550, BIOS Version WLYDCRB1.SYS.0021.P25.2107280557, Ubuntu 20.04, Kernel 5.15, oneAPI icpx Nightly 20230109 Intel® Xeon® 8480+: Testing by Intel as of 1/24/2023, 1-node 2x Intel® Xeon® 8480+, HT On, Turbo Enabled, 512GB DDR5-4800, Rocky Linux 8.7, Kernel 4.18, oneAPI icpx Nightly 2023.0.0 NVIDIA A100: Testing by Intel as of 1/18/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 128GB DDR4-3200, 1x PCIe NVIDIA A100 80G, BIOS Version SE5C6200.86B.0022.D08.2103221623, Ubuntu 20.04, Kernel 5.15, GPU Driver 510.73.05, Intel LLVM 20230109, CUDA 11.6 NVIDIA H100: Testing as of 1/18/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 128GB DDR4-3200, 1x PCIe NVIDIA H100, BIOS Version SE5C6200.86B.0022.D08.2103221623, Ubuntu 20.04, Kernel 5.15, GPU Driver 525.60.13, Intel LLVM 20230109, CUDA 12.0 Workload settings: Alfvén wave for grid sizes: 36³, 48³, 72³, 96³, 132³, 192³, 264³, 390³, 516³ cells. DPEcho GitHub: https://github.com/LRZ-BADW/DPEcho |
Intel® Xeon® 8480+ has 1.5x higher geomean HPC performance across 27 benchmarks and applications than AMD EPYC 7763. Intel® Xeon® Max 9480 has 2x higher geomean HPC performance across 29 benchmarks and applications than AMD EPYC 7773X. Intel® Xeon® Max 9480 has 1.4x higher geomean HPC performance across 29 benchmarks and applications than Intel® Xeon® 8480+ | Stream Triad: AMD EPYC 7763: Test by Intel as of 10/7/2022. 1-node, 2x AMD EPYC 7763, HT On, Turbo On, cTDP - 280, 256 GB DDR4-3200, BIOS Version 2.4 Rev 5.22, ucode revision=0xa001173, Rocky Linux 8.6, Linux version 4.18.0-372.26.1.el8_6.crt1.x86_64, Stream v5.10 AMD EPYC 7773X: Test by Intel as of 10/7/2022. 1-node, 2x AMD EPYC , HT On, Turbo On, cTDP - 280, 256 GB DDR4-3200, BIOS Version M10 rev5.22, ucode revision=0xa001224, Rocky Linux 8.6, Linux version 4.18.0-372.26.1.el8_6.crt1.x86_64, Stream v5.10 Intel® Xeon® 8480+: Test by Intel as of 10/7/2022. 1-node, 2x Intel® Xeon® 8480+, HT On, Turbo On, SNC4, 512 GB DDR5-4800, BIOS Version SE5C7411.86B.8713.D03.2209091345, ucode revision=0x2b000070, Rocky Linux 8.6, Linux version 4.18.0-372.26.1.el8_6.crt1.x86_64, Stream v5.10 Intel® Xeon® Max 9480: Test by Intel as of 9/2/2022. 1-node, 2x Intel® Xeon® Max 9480, HT On, Turbo On, SNC4, 128 GB HBM2e, BIOS Version SE5C7411.86B.8424.D03.2208100444, ucode revision=0x2c000020, CentOS Stream 8, Linux version 5.19.0-rc6.0712.intel_next.1.x86_64+server, Stream v5.10 HPL: AMD EPYC 7763: Test by Intel as of 10/7/2022. 1-node, 2x AMD EPYC 7763, HT On, Turbo On, cTDP - 280, 256 GB DDR4-3200, BIOS Version 2.4 Rev 5.22, ucode revision=0xa001173, Rocky Linux 8.6, Linux version 4.18.0-372.26.1.el8_6.crt1.x86_64, HPL v2.3_BLIS-3.0_AMD_OFFICIAL AMD EPYC 7773X: Test by Intel as of 10/7/2022. 
1-node, 2x AMD EPYC , HT On, Turbo On, cTDP - 280, 256 GB DDR4-3200, BIOS Version M10 rev5.22, ucode revision=0xa001224, Rocky Linux 8.6, Linux version 4.18.0-372.26.1.el8_6.crt1.x86_64, HPL v2.3_BLIS-3.0_AMD_OFFICIAL Intel® Xeon® 8480+: Test by Intel as of 10/7/2022. 1-node, 2x Intel® Xeon® 8480+, HT On, Turbo On, 512 GB DDR5-4800, BIOS Version SE5C7411.86B.8713.D03.2209091345, ucode revision=0x2b000070, Ubuntu 22.04.1 LTS, Linux version 5.15.0-50-generic, HPL from MKL_v2022.1.0 Intel® Xeon® Max 9480: Test by Intel as of 9/2/2022. 1-node, 2x Intel® Xeon® Max 9480, HT On, Turbo On, SNC4, 128 GB HBM2e, BIOS Version SE5C7411.86B.8424.D03.2208100444, ucode revision=0x2c000020, CentOS Stream 8, Linux version 5.19.0-rc6.0712.intel_next.1.x86_64+server, HPL from MKL_v2022.1.0 HPCG AMD EPYC 7763: Test by Intel as of 10/7/2022. 1-node, 2x AMD EPYC 7763, HT On, Turbo On, cTDP - 280, 256 GB DDR4-3200, BIOS Version 2.4 Rev 5.22, ucode revision=0xa001173, Rocky Linux 8.6, Kernel 4.18, HPCG from MKL_v2022.1.0 AMD EPYC 7773X: Test by Intel as of 10/7/2022. 1-node, 2x AMD EPYC , HT On, Turbo On, cTDP - 280, 256 GB DDR4-3200, BIOS Version M10 rev5.22, ucode revision=0xa001224, Rocky Linux 8.6, Kernel 4.18, HPCG from MKL_v2022.1.0 Intel® Xeon® 8480+: Test by Intel as of 10/7/2022. 1-node, 2x Intel® Xeon® 8480+, HT On, Turbo On, 512 GB DDR5-4800, BIOS Version SE5C7411.86B.8713.D03.2209091345, ucode revision=0x2b000070, Ubuntu 22.04.1 LTS, Kernel 5.15, HPCG from MKL_v2022.1.0 Intel® Xeon® Max 9480: Test by Intel as of 9/2/2022. 1-node, 2x Intel® Xeon® Max 9480, HT On, Turbo On, SNC4, 128 GB HBM2e, BIOS Version SE5C7411.86B.8424.D03.2208100444, ucode revision=0x2c000020, CentOS Stream 8, Kernel 5.19, HPCG from MKL_v2022.1.0 MPAS-A (MPAS-A V7.3 60-km dynamical core) AMD EPYC 7763: Test by Intel as of 10/12/2022. 
1-node, 2x AMD EPYC 7763, HT On, Turbo On, NUMA configuration NPS=4, 256 GB DDR4-3200, BIOS Version V2.4, ucode revision=0xa001173, Rocky Linux 8.6, Kernel 4.18, MPAS-A V7.3 build with Intel® Fortran Compiler Classic and Intel® MPI from 2022.3 Intel® oneAPI HPC Toolkit AMD EPYC 7773X: Test by Intel as of 10/12/2022. 1-node, 2x AMD EPYC , HT On, Turbo On, NUMA configuration NPS=4, 256 GB DDR4-3200, BIOS Version M10, ucode revision=0xa001224, Rocky Linux 8.6, Kernel 4.18, MPAS-A V7.3 build with Intel® Fortran Compiler Classic and Intel® MPI from 2022.3 Intel® oneAPI HPC Toolkit Intel® Xeon® 8480+: Test by Intel as of 10/12/2022. 1-node, 2x Intel® Xeon® 8480+, HT On, Turbo On, NUMA configuration SNC4, 512 GB DDR5-4800, BIOS Version SE5C7411.86B.8713.D03.2209091345, ucode revision=0x2b000070, Rocky Linux 8.6, Kernel 4.18, MPAS-A V7.3 build with Intel® Fortran Compiler Classic and Intel® MPI from 2022.3 Intel® oneAPI HPC Toolkit Intel® Xeon® Max 9480: Test by Intel as of 10/12/22. 1-node, 2x Intel® Xeon® Max 9480, HT ON, Turbo ON, NUMA configuration SNC4, 128 GB HBM2e, BIOS Version SE5C7411.86B.8424.D03.2208100444, ucode revision=0x2c000020, CentOS Stream 8, Kernel 5.19, MPAS-A V7.3 build with Intel® Fortran Compiler Classic and Intel® MPI from 2022.3 Intel® oneAPI HPC Toolkit NEMO (GYRE_PISCES_25, BENCH ORCA-1) AMD EPYC 7763: Test by Intel as of 10/12/2022. 1-node, 2x AMD EPYC 7763, HT On, Turbo On, NUMA configuration NPS=4, 256 GB DDR4-3200, BIOS Version V2.4, ucode revision=0xa001173, Rocky Linux 8.6, Kernel 4.18, NEMO v4.2 build with Intel® Fortran Compiler Classic and Intel® MPI from 2022.3 Intel® oneAPI HPC Toolkit AMD EPYC 7773X: Test by Intel as of 10/12/2022. 
1-node, 2x AMD EPYC , HT On, Turbo On, NUMA configuration NPS=4, 256 GB DDR4-3200, BIOS Version M10, ucode revision=0xa001224, Rocky Linux 8.6, Kernel 4.18, NEMO v4.2 build with Intel® Fortran Compiler Classic and Intel® MPI from 2022.3 Intel® oneAPI HPC Toolkit Intel® Xeon® 8480+: Test by Intel as of 10/12/2022. 1-node, 2x Intel® Xeon® 8480+, HT On, Turbo On, NUMA configuration SNC4, 512 GB DDR5-4800, BIOS Version SE5C7411.86B.8713.D03.2209091345, ucode revision=0x2b000070, Rocky Linux 8.6, Kernel 4.18, NEMO v4.2 build with Intel® Fortran Compiler Classic and Intel® MPI from 2022.3 Intel® oneAPI HPC Toolkit Intel® Xeon® Max 9480: Test by Intel as of 10/12/2022. 1-node, 2x Intel® Xeon® Max 9480, HT ON, Turbo ON, NUMA configuration SNC4, 128 GB HBM2e, BIOS Version SE5C7411.86B.8424.D03.2208100444, ucode revision=0x2c000020, CentOS Stream 8, Kernel 5.19, NEMO v4.2 build with Intel® Fortran Compiler Classic and Intel® MPI from 2022.3 Intel® oneAPI HPC Toolkit ROMS (benchmark3 (2048x256x30), benchmark3 (8192x256x30)) AMD EPYC 7763: Test by Intel as of 10/12/2022. 1-node, 2x AMD EPYC 7763, HT On, Turbo On, NUMA configuration NPS=4, 256 GB DDR4-3200, BIOS Version V2.4, ucode revision=0xa001173, Rocky Linux 8.6, Kernel 4.18, ROMS V4 build with Intel® Fortran Compiler Classic and Intel® MPI from 2022.3 Intel® oneAPI HPC Toolkit AMD EPYC 7773X: Test by Intel as of 10/12/2022. 1-node, 2x AMD EPYC , HT On, Turbo On, NUMA configuration NPS=4, 256 GB DDR4-3200, BIOS Version M10, ucode revision=0xa001224, Rocky Linux 8.6, Kernel 4.18, ROMS V4 build with Intel® Fortran Compiler Classic and Intel® MPI from 2022.3 Intel® oneAPI HPC Toolkit Intel® Xeon® 8480+: Test by Intel as of 10/12/2022. 
1-node, 2x Intel® Xeon® 8480+, HT On, Turbo On, NUMA configuration SNC4, 512 GB DDR5-4800, BIOS Version SE5C7411.86B.8713.D03.2209091345, ucode revision=0x2b000070, Rocky Linux 8.6, Kernel 4.18, ROMS V4 build with Intel® Fortran Compiler Classic and Intel® MPI from 2022.3 Intel® oneAPI HPC Toolkit Intel® Xeon® Max 9480: Test by Intel as of 10/12/2022. 1-node, 2x Intel® Xeon® Max 9480, HT ON, Turbo ON, NUMA configuration SNC4, 128 GB HBM2e, BIOS Version SE5C7411.86B.8424.D03.2208100444, ucode revision=0x2c000020, CentOS Stream 8, Linux version 5.19, ROMS V4 build with Intel® Fortran Compiler Classic and Intel® MPI from 2022.3 Intel® oneAPI HPC Toolkit WRF (CONUS 2.5KM) AMD EPYC 7763: Test by Intel as of 10/12/2022. 1-node, 2x AMD EPYC 7763, HT On, Turbo On, NUMA configuration NPS=4, 256 GB DDR4-3200, BIOS Version V2.4, ucode revision=0xa001173, Rocky Linux 8.6, Kernel 4.18, WRF v4.4 built with Intel® Fortran Compiler Classic and Intel® MPI from 2022.3 Intel® oneAPI HPC Toolkit AMD EPYC 7773X: Test by Intel as of 10/12/2022. 1-node, 2x AMD EPYC , HT On, Turbo On, NUMA configuration NPS=4, 256 GB DDR4-3200, BIOS Version M10, ucode revision=0xa001224, Rocky Linux 8.6, Kernel 4.18, WRF v4.4 built with Intel® Fortran Compiler Classic and Intel® MPI from 2022.3 Intel® oneAPI HPC Toolkit Intel® Xeon® 8480+: Test by Intel as of 10/12/2022. 1-node, Intel® Xeon® 8480+, HT On, Turbo On, NUMA configuration SNC4, 512 GB DDR5-4800, BIOS Version SE5C7411.86B.8713.D03.2209091345, ucode revision=0x2b000070, Rocky Linux 8.6, Kernel 4.18, WRF v4.4 built with Intel® Fortran Compiler Classic and Intel® MPI from 2022.3 Intel® oneAPI HPC Toolkit Intel® Xeon® Max 9480: Test by Intel as of 10/12/2022. 
1-node, 2x Intel® Xeon® Max 9480, HT ON, Turbo ON, NUMA configuration SNC4, Total Memory 128 GB HBM2e, BIOS Version SE5C7411.86B.8424.D03.2208100444, ucode revision=0x2c000020, CentOS Stream 8, Kernel 5.19, WRF v4.4 built with Intel® Fortran Compiler Classic and Intel® MPI from 2022.3 Intel® oneAPI HPC Toolkit ENERGY Workloads (AWP, ISO3DFD, SSG) AMD EPYC 7763: Test by Intel as of 10/7/2022. 1-node, 2x AMD EPYC 7763, HT On, Turbo On, cTDP - 280, 256 GB DDR4-3200, BIOS Version 2.4 Rev 5.22, ucode revision=0xa001173, Rocky Linux 8.6, Kernel 4.18, YASK v3.05.07 AMD EPYC 7773X: Test by Intel as of 10/7/2022. 1-node, 2x AMD EPYC , HT On, Turbo On, cTDP - 280, 256 GB DDR4-3200, BIOS Version M10 rev5.22, ucode revision=0xa001224, Rocky Linux 8.6, Kernel 4.18, YASK v3.05.07 Intel® Xeon® 8480+: Test by Intel as of 10/7/2022. 1-node, Intel® Xeon® 8480+, HT On, Turbo On, 512 GB DDR5-4800, BIOS Version EGSDCRB1.86B.0083.D22.2206290535, ucode revision=0xaa0000a0, CentOS Stream 8, Linux version 4.18, YASK v3.05.07 Intel® Xeon® Max 9480: Test by Intel as of 9/2/2022. 1-node, 2x Intel® Xeon® Max 9480, HT On, Turbo On, SNC4, 128 GB HBM2e, BIOS Version SE5C7411.86B.8424.D03.2208100444, ucode revision=0x2c000020, CentOS Stream 8, Linux version 5.19, YASK v3.05.07 FSI Kernels (Binomial Options v1.1, Black Scholes v1.4, Monte Carlo v1.2) AMD EPYC 7763: Test by Intel as of 10/7/2022. 1-node, 2x AMD EPYC 7763, HT On, Turbo On, cTDP - 280, 256 GB DDR4-3200, BIOS Version 2.4 Rev 5.22, ucode revision=0xa001173, Rocky Linux 8.6, Kernel 4.18, Binomial Options v1.1, Black Scholes v1.4, Monte Carlo v1.2 AMD EPYC 7773X: Test by Intel as of 10/7/2022. 1-node, 2x AMD EPYC , HT On, Turbo On, cTDP - 280, 256 GB DDR4-3200, BIOS Version M10 rev5.22, ucode revision=0xa001224, Rocky Linux 8.6, kernel 4.18, Binomial Options v1.1, Black Scholes v1.4, Monte Carlo v1.2 Intel® Xeon® 8480+: Test by Intel as of 10/7/2022. 
1-node, 2x Intel® Xeon® 8480+, HT On, Turbo On, SNC4, 512 GB DDR5-4800, BIOS Version SE5C7411.86B.8713.D03.2209091345, ucode revision=0x2b000070, Rocky Linux 8.6, Kernel 4.18, Binomial Options v1.1, Black Scholes v1.4, Monte Carlo v1.2 Intel® Xeon® Max 9480: Test by Intel as of 9/2/2022. 1-node, 2x Intel® Xeon® Max 9480, HT On, Turbo On, SNC4, 128 GB HBM2e, BIOS Version SE5C7411.86B.8424.D03.2208100444, ucode revision=0x2c000020, CentOS Stream 8, Kernel 5.19, Binomial Options v1.1, Black Scholes v1.4, Monte Carlo v1.2 AlphaFold 2 (Inference on Eight Streams) AMD EPYC 7773X: Test by Intel as of 10/18/2022. 1-node, 2x AMD EPYC 7773X, HT On, AMD Turbo Core On, 512 GB DDR4-3200, BIOS M10, ucode 0xa001229, OS CentOS Stream 8, Kernel 4.18, AVX-512, FP32, PyTorch 1.11.0, Intel Extension for PyTorch 1.11.200, jax 0.3.14 Intel® Xeon® Max 9480: Test by Intel as of 9/16/2022, 2x Intel® Xeon® Max 9480, HT On, Turbo On, 128GB HBM2e, BIOS SE5C7411.86B.8424.D03.2208100444, ucode 0x2c000020, CentOS Stream 8, Kernel 5.19, AVX-512, FP32, PyTorch 1.11.0, Intel Extension for PyTorch 1.11.200, jax 0.3.14 DeePMD (Multi-Instance Training) AMD EPYC 7773X: Test by Intel as of 10/25/2022. 1-node, 2x AMD EPYC 7773X, 256 GB DDR4-3200, Rocky Linux 8.6, Kernel 4.18, compiler gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-10), Tensorflow 2.9, Horovod 0.24.0, oneCCL-2021.5.2, Python 3.9 Intel® Xeon® 8480+: Test by Intel as of 10/12/2022. 1-node, 2x Intel® Xeon® 8480+, 512 GB DDR5-4800, Rocky Linux 8.6, Kernel 4.18, compiler gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-10), Tensorflow 2.9, Horovod 0.24.0, oneCCL-2021.5.2, Python 3.9 Intel® Xeon® Max 9480: Test by Intel as of 10/12/2022. 
1-node, 2x Intel® Xeon® Max 9480, 128 GB HBM2e, CentOS Stream 8, Kernel 5.19, compiler gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-13), Tensorflow 2.9, Horovod 0.24.0, oneCCL-2021.5.2, Python 3.9 GROMACS (benchMEM, benchPEP, benchPEP-h, benchRIB, hecbiosim-3m, hecbiosim-465k, hecbiosim-61k, ion_channel_pme_large, lignocellulose_rf_large, rnase_cubic, stmv, water1.5M_pme_large, water1.5M_rf_large) AMD EPYC 7763: Test by Intel as of 10/7/2022. 1-node, 2x AMD EPYC 7763, HT On, Turbo On, NUMA configuration NPS=4, 256 GB DDR4-3200, BIOS Version Ver 2.4 Rev 5.22, ucode revision= 0xa001173, Rocky Linux 8.6, Kernel 4.18, GROMACS v2021.4_SP AMD EPYC 7773X: Test by Intel as of 10/7/2022. 1-node, 2x AMD EPYC 7773X, HT On, Turbo On, NUMA configuration NPS=4, 256 GB DDR4-3200, BIOS Version M10, ucode revision=0xa001224, Rocky Linux 8.6 , kernel 4.18, GROMACS v2021.4_SP Intel® Xeon® 8480+: Test by Intel as of 10/7/2022. 1-node, 2x Intel® Xeon® 8480+, HT On, Turbo On, SNC4, 512 GB DDR5-4800, BIOS Version SE5C7411.86B.8713.D03.2209091345, ucode revision=0x2b000070, Rocky Linux 8.6, Kernel 4.18, GROMACS v2021.4_SP Intel® Xeon® Max 9480: Test by Intel as of 9/2/2022. 1-node, 2x Intel® Xeon® Max 9480, HT ON, Turbo ON, NUMA configuration SNC4, 128 GB HBM2e, BIOS Version SE5C7411.86B.8424.D03.2208100444, ucode revision=0x2c000020, CentOS Stream 8, Kernel 5.19, GROMACS v2021.4_SP LAMMPS (Atomic Fluid, Copper, DPD, Liquid_crystal, Polyethylene, Protein, Stillinger-Weber, Tersoff, Water) AMD EPYC 7763: Test by Intel as of 10/6/2022. 1-node, 2x AMD EPYC 7763, HT On, Turbo On, NUMA configuration NPS=4, 256 GB DDR4-3200, BIOS Version Ver 2.4 Rev 5.22, ucode revision= 0xa001173, Rocky Linux 8.6, Kernel 4.18, LAMMPS v2021-09-29 cmkl:2022.1.0, icc:2021.6.0, impi:2021.6.0, tbb:2021.6.0 AMD EPYC 7773X: Test by Intel as of 10/6/2022. 
1-node, 2x AMD EPYC 7773X, HT On, Turbo On, NUMA configuration NPS=4, 256 GB DDR4-3200, BIOS Version M10, ucode revision=0xa001224, Rocky Linux 8.6, Kernel 4.18, LAMMPS v2021-09-29 cmkl:2022.1.0, icc:2021.6.0, impi:2021.6.0, tbb:2021.6.0 Intel® Xeon® 8480+: Test by Intel as of 9/29/2022. 1-node, 2x Intel® Xeon® 8480+, HT On, Turbo On, SNC4, 512 GB DDR5-4800, BIOS Version SE5C7411.86B.8713.D03.2209091345, ucode revision=0x2b000070, Rocky Linux 8.6, Kernel 4.18, LAMMPS v2021-09-29 cmkl:2022.1.0, icc:2021.6.0, impi:2021.6.0, tbb:2021.6.0 Intel® Xeon® Max 9480: Test by Intel as of 9/29/2022. 1-node, 2x Intel® Xeon® Max 9480, HT ON, Turbo ON, NUMA configuration SNC4, 128 GB HBM2e, BIOS Version SE5C7411.86B.8424.D03.2208100444, ucode revision=0x2c000020, CentOS Stream 8, Kernel 5.19, LAMMPS v2021-09-29 cmkl:2022.1.0, icc:2021.6.0, impi:2021.6.0, tbb:2021.6.0 NAMD (apoa1, apoa1_npt_2fs, stmv, stmv_npt_2fs, f1atpase, hecbiosim-61k, hecbiosim-465k, hecbiosim-3000k) AMD EPYC 7763: Test by Intel as of 3/17/2023. 1-node, 2x AMD EPYC 7763, HT On, Turbo On, cTDP - 280, 256 GB DDR4-3200, BIOS Version 2.4 Rev 5.22, ucode revision=0xa001173, Rocky Linux 8.6, Kernel 4.18, NAMD V2.15alpha AMD EPYC 7773X: Test by Intel as of 3/17/2023. 1-node, 2x AMD EPYC 7773X, HT On, Turbo On, cTDP - 280, 256 GB DDR4-3200, BIOS Version M10 rev5.22, ucode revision=0xa001224, Rocky Linux 8.6, Kernel 4.18, NAMD V2.15alpha Intel® Xeon® 8480+: Test by Intel as of 3/17/2023. 1-node, 2x Intel Xeon 8480+, HT On, Turbo On, SNC4, 512 GB DDR5-4800, BIOS Version SE5C7411.86B.9525.D13.2302071332, ucode revision=0x2b000190, Rocky Linux 8.6, kernel 4.18, NAMD V2.15alpha Intel® Xeon® Max 9480: Test by Intel as of 3/17/2023. 1-node, 2x Intel Xeon Max 9480, HT On, Turbo On, SNC4, 128 GB HBM2e, BIOS Version SE5C7411.86B.9105.D93.2211240636, ucode revision=0xac000100, CentOS Stream 8, Kernel 4.18, NAMD V2.15alpha RELION (Plasmodium Ribosome 2D and 3D classification) AMD EPYC 7763: Test by Intel as of 3/24/2023. 
1-node: 2x AMD EPYC 7763, HT On, Turbo On, cTDP - 280, 256 GB DDR4-3200, BIOS Version 2.4 Rev 5.22, ucode revision=0xa001173, Rocky Linux 8.7, Kernel 4.18, RELION 3.13 AMD EPYC 7773X: Test by Intel as of 3/24/2023. 1-node: 2x AMD EPYC 7773X, HT On, Turbo On, cTDP - 280, 256 GB DDR4-3200, BIOS Version M10 rev5.22, ucode revision=0xa001229, Rocky Linux 8.7, Kernel 4.18, RELION 3.13 Intel Xeon 8480+: Test by Intel as of 3/24/2023. 1-node: 2x Intel Xeon 8480+, HT On, Turbo On, 512 GB DDR5-4800, BIOS Version SE5C7411.86B.9525.D13.2302071332, ucode revision=0x2b000190, Rocky Linux 8.7, Kernel 4.18, RELION 3.13 Intel Xeon Max 9480 HBM Only: Test by Intel as of 3/24/2023. 1-node: 2x Intel Xeon Max 9480, HT On, Turbo On, 128 GB HBM2e, BIOS Version SE5C7411.86B.9525.D13.2302071332, ucode revision=0x2c000170, Rocky Linux 8.7, Kernel 4.18, RELION 3.13 VASP (CuC, Si, PdO4, PdO4_k221) AMD EPYC 7763: Test by Intel as of 10/7/2022. 1-node, 2x AMD EPYC 7763, HT On, Turbo On, NUMA configuration NPS=4, 256 GB DDR4-3200, BIOS Version Ver 2.4 Rev 5.22, ucode revision= 0xa001173, Rocky Linux 8.6, Kernel 4.18, VASP6.3.2 AMD EPYC 7773X: Test by Intel as of 10/7/2022. 1-node, 2x AMD EPYC 7773X, HT On, Turbo On, NUMA configuration NPS=4, 256 GB DDR4-3200, BIOS Version M10, ucode revision=0xa001224, Rocky Linux 8.6, Kernel 4.18, VASP6.3.2 Intel® Xeon® 8480+: Test by Intel as of 10/7/2022. 1-node, 2x Intel® Xeon® 8480+, HT On, Turbo On, SNC4, 512 GB DDR5-4800, BIOS Version SE5C7411.86B.8713.D03.2209091345, ucode revision=0x2b000070, Rocky Linux 8.6, Kernel 4.18, VASP6.3.2 Intel® Xeon® Max 9480: Test by Intel as of 9/2/2022. 1-node, 2x Intel® Xeon® Max 9480, HT ON, Turbo ON, NUMA configuration SNC4, Total Memory 128 GB HBM2e, BIOS Version SE5C7411.86B.8424.D03.2208100444, ucode revision=0x2c000020, CentOS Stream 8, Kernel 5.19, VASP6.3.2 Altair AcuSolve (HQ Model) AMD EPYC 7763: Test by Intel as of 9/27/2022. 
1-node, 2x AMD EPYC 7763, HT On, Turbo On, NPS4, 256 GB DDR4-2300, BIOS Version 2.1 Rev 5.22, ucode 0xa001144, Rocky Linux 8.6, kernel 4.18, Altair AcuSolve 2021R2 AMD EPYC 7773X: Test by Intel as of 9/27/2022. 1-node, 2x AMD EPYC 7773X, HT On, Turbo On, NPS4, 256 GB DDR4-3200, BIOS Version M10, ucode 0xa001224, Rocky Linux 8.6, Kernel 4.18, Altair AcuSolve 2021R2 Intel® Xeon® 8480+: Test by Intel as of 09/28/2022. 1-node, 2x Intel® Xeon® 8480+, HT ON, Turbo ON, SNC4, 512 GB DDR5-4800, BIOS Version EGSDCRB1.86B.0083.D22.2206290535, ucode 0xaa0000a0, CentOS Stream 8, Kernel 4.18, Altair AcuSolve 2021R2 Intel® Xeon® Max 9480: Test by Intel as of 10/03/2022. 1-node, 2x Intel® Xeon® Max 9480, HT ON, Turbo ON, SNC4, 128 GB HBM2e, BIOS Version SE5C7411.86B.8424.D03.2208100444, ucode 2c000020, CentOS Stream 8, Kernel 5.19, Altair AcuSolve 2021R2 Altair RADIOSS (Neon1M @ 80 ms, t10M @ 8 ms) AMD EPYC 7763: Test by Intel as of 10/7/2022. 1-node, 2x AMD EPYC 7763, HT On, Turbo On, NUMA configuration NPS=4, 256 GB DDR4-3200, BIOS Version Ver 2.4 Rev 5.22, ucode revision= 0xa001173, Rocky Linux 8.6, Kernel 4.18, Altair RADIOSS 2022.2, Intel MPI 2021.7 AMD EPYC 7773X: Test by Intel as of 10/7/2022. 1-node, 2x AMD EPYC 7773X, HT On, Turbo On, NUMA configuration NPS=4, 256 GB DDR4-3200, BIOS Version M10, ucode revision=0xa001224, Rocky Linux 8.6, Kernel 4.18, Altair RADIOSS 2022.2, Intel MPI 2021.7 Intel® Xeon® 8480+: Test by Intel as of 10/7/2022. 1-node, 2x Intel® Xeon® 8480+, HT On, Turbo On, SNC4, 512 GB DDR5-4800, BIOS Version SE5C7411.86B.8713.D03.2209091345, ucode revision=0x2b000070, Rocky Linux 8.6, Kernel 4.18, Altair RADIOSS 2022.2, Intel MPI 2021.7 Intel® Xeon® Max 9480: Test by Intel as of 9/2/2022. 
1-node, 2x Intel® Xeon® Max 9480, HT ON, Turbo ON, NUMA configuration SNC4, 128 GB HBM2e, BIOS Version SE5C7411.86B.8424.D03.2208100444, ucode revision=0x2c000020, CentOS Stream 8, Kernel 5.19, Altair RADIOSS 2022.2, Intel MPI 2021.7 Ansys Fluent (pump_2m, sedan_4m, rotor_3m, aircraft_wing_14m, combustor_12m, exhaust_system_33m) AMD EPYC 7763: Test by Intel as of 8/24/2022. 1-node, 2x AMD EPYC 7763, HT On, Turbo On, NPS4, 256 GB DDR4-3200, BIOS ver. Ver 2.1 Rev 5.22, ucode 0xa001144, Rocky Linux 8.6, Kernel 4.18, Ansys Fluent 2022R1 AMD EPYC 7773X: Test by Intel as of 8/24/2022. 1-node, 2x AMD EPYC 7773X, HT On, Turbo On, NPS4, 256 GB DDR4-3200, BIOS ver. M10, ucode 0xa001224, Rocky Linux 8.6, Kernel 4.18, Ansys Fluent 2022R1 Intel® Xeon® 8480+: Test by Intel as of 09/02/2022. 1-node, 2x Intel® Xeon® 8480+, HT ON, Turbo ON, SNC4, 512 GB DDR5-4800, BIOS Version EGSDCRB1.86B.0083.D22.2206290535, ucode 0xaa0000a0, CentOS Stream 8, Kernel 4.18, Ansys Fluent 2022R1 Intel® Xeon® Max 9480: Test by Intel as of 08/31/2022. 1-node, 2x Intel® Xeon® Max 9480, HT On, Turbo ON, SNC4, 128 GB HBM2e, BIOS Version SE5C7411.86B.8424.D03.2208100444, ucode 2c000020, CentOS Stream 8, Kernel 5.19, Ansys Fluent 2022R1 Ansys LS-DYNA (ODB-10M) AMD EPYC 7763: Test by Intel as of 8/24/2022. 1-node, 2x AMD EPYC 7763, HT On, Turbo On, cTDP - 280, 256 GB DDR4-3200, BIOS Version 2.4 Rev 5.22, ucode revision=0xa001173, Rocky Linux 8.6, Kernel 4.18, LS-DYNA R11 AMD EPYC 7773X: Test by Intel as of 10/7/2022. 1-node, 2x AMD EPYC , HT On, Turbo On, cTDP - 280, 256 GB DDR4-3200, BIOS Version M10 rev5.22, ucode revision=0xa001224, Rocky Linux 8.6, Kernel 4.18, LS-DYNA R11 Intel® Xeon® 8480+: Test by Intel as of ww41'22. 1-node, 2x Intel® Xeon® 8480+, HT On, Turbo On, SNC4, 512 GB DDR5-4800, BIOS Version SE5C7411.86B.8713.D03.2209091345, ucode revision=0x2b000070, Rocky Linux 8.6, Kernel 4.18, LS-DYNA R11 Intel® Xeon® Max 9480: Test by Intel as of ww36'22. 
1-node, 2x Intel® Xeon® Max 9480, HT On, Turbo On, SNC4, 128 GB HBM2e, BIOS Version SE5C7411.86B.8424.D03.2208100444, ucode revision=0x2c000020, CentOS Stream 8, kernel 5.19, LS-DYNA R11 Ansys Mechanical (V22iter-1, V22iter-2, V22iter-3, V22iter-4, V22direct-1, V22direct-2, V22direct-3) AMD EPYC 7763: Test by Intel as of 8/24/2022. 1-node, 2x AMD EPYC 7763, HT On, Turbo On, NPS2, 512 GB DDR4-3200, BIOS ver. Ver 2.1 Rev 5.22, ucode 0xa001144, Rocky Linux 8.6, Kernel 4.18, Ansys Mechanical 2022 R2 AMD EPYC 7773X: Test by Intel as of 8/24/2022. 1-node, 2x AMD EPYC 7773X, HT On, Turbo On, NPS4, 512 GB DDR4-3200, BIOS ver. M10, ucode 0xa001229, CentOS Stream 8, Kernel 4.18, Ansys Mechanical 2022 R2 Intel® Xeon® 8480+: Test by Intel as of 09/02/2022. 1-node, 2x Intel® Xeon® 8480+, HT ON, Turbo ON, SNC4, 512 GB DDR5-4800, BIOS Version EGSDCRB1.86B.0083.D22.2206290535, ucode 0xaa0000a0, CentOS Stream 8, Kernel 4.18, Ansys Mechanical 2022 R2 Intel® Xeon® Max 9480 (cache mode): Test by Intel as of 08/31/2022. 1-node, 2x Intel® Xeon® Max 9480, HT On, Turbo ON, SNC4, 512 GB DDR5-4800 and 128 GB HBM2e, BIOS Version SE5C7411.86B.8424.D03.2208100444, ucode 2c000020, CentOS Stream 8, Kernel version 5.19, Ansys Mechanical 2022 R2 CONVERGE (SI8_engine_PFI_SAGE_transient_RAN) AMD EPYC 7763: Test by Intel as of 10/7/2022. 1-node, 2x AMD EPYC 7763, HT On, Turbo On, NUMA configuration NPS=4, 256 GB DDR4-3200, BIOS Version Ver 2.4 Rev 5.22, ucode revision= 0xa001173, Rocky Linux 8.6, Kernel 4.18, Converge CFD 3.0.17 AMD EPYC 7773X: Test by Intel as of 10/7/2022. 1-node, 2x AMD EPYC 7773X, HT On, Turbo On, NUMA configuration NPS=4, 256 GB DDR4-3200, BIOS Version M10, ucode revision=0xa001224, Rocky Linux 8.6, Kernel 4.18, Converge CFD 3.0.17 Intel® Xeon® 8480+: Test by Intel as of 10/7/2022. 
1-node, 2x Intel® Xeon® 8480+, HT On, Turbo On, SNC4, 512 GB DDR5-4800, BIOS Version SE5C7411.86B.8713.D03.2209091345, ucode revision=0x2b000070, Rocky Linux 8.6, Kernel 4.18, Converge CFD 3.0.17 Intel® Xeon® Max 9480: Test by Intel as of 9/2/2022. 1-node, 2x Intel® Xeon® Max 9480, HT ON, Turbo ON, NUMA configuration SNC4, 128 GB HBM2e, BIOS Version SE5C7411.86B.8424.D03.2208100444, ucode revision=0x2c000020, CentOS Stream 8, Kernel 5.19, Converge CFD 3.0.17 OpenFOAM (Geomean of Motorbike 20M, Motorbike 42M) AMD EPYC 7763: Test by Intel as of 9/2/2022. 1-node, 2x AMD EPYC 7763, HT On, Turbo On, 256 GB DDR4-3200, BIOS Version 2.1 rev5.22, ucode revision=0xa001144, Rocky Linux 8.6, Kernel 4.18, OpenFOAM 8 AMD EPYC 7773X: Test by Intel as of 9/2/2022. 1-node, 2x AMD EPYC 7773X, HT On, Turbo On, 256 GB DDR4-3200, BIOS Version M10 rev5.22, ucode revision=0xa001224, Rocky Linux 8.6, Kernel 4.18, OpenFOAM 8 Intel® Xeon® 8480+: Test by Intel as of 9/2/2022. 1-node, 2x Intel® Xeon® 8480+, HT On, Turbo On, 512 GB DDR5-4800, BIOS Version EGSDCRB1.86B.0083.D22.2206290535, ucode revision=0xaa0000a0, CentOS Stream 8, Kernel 4.18, OpenFOAM 8 Intel® Xeon® Max 9480: Test by Intel as of 9/2/2022. 1-node, 2x Intel® Xeon® Max 9480, HT On, Turbo On, SNC4, 128 GB HBM2e, BIOS Version SE5C7411.86B.8424.D03.2208100444, ucode revision=0x2c000020, CentOS Stream 8, Kernel 5.19, OpenFOAM 8 ParSeNet (SplineNet) AMD EPYC 7763: Test by Intel as of 10/18/2022. 1-node, 2x AMD EPYC 7763, HT On, Turbo On, 256 GB DDR4-3200, BIOS Version 2.4 rev5.22, ucode revision=0xa001173, Rocky Linux 8.6, Kernel 4.18, ParSeNet (SplineNet), PyTorch 1.13.0, IPEX 1.13.0-cpu, MKL (20220804), oneDNN (v2.6.0) AMD EPYC 7773X: Test by Intel as of 10/19/2022. 
1-node, 2x AMD EPYC 7773X, HT On, Turbo On, 256 GB DDR4-3200, BIOS Version M10 rev5.22, ucode revision=0xa001224, Rocky Linux 8.6, Kernel 4.18, ParSeNet (SplineNet), PyTorch 1.13.0, IPEX 1.13.0-cpu, MKL (20220804), oneDNN (v2.6.0) Intel® Xeon® 8480+: Test by Intel as of 10/18/2022. 1-node, 2x Intel® Xeon® 8480+, HT On, Turbo On, 512 GB DDR5-4800, BIOS Version EGSDCRB1.86B.0083.D22.2206290535, ucode revision=0xaa0000a0, CentOS Stream 8, Kernel 4.18, ParSeNet (SplineNet), PyTorch 1.11.0, Torch-CCL 1.2.0, IPEX 1.10.0, MKL (20220804), oneDNN (v2.6.0) Intel® Xeon® Max 9480: Test by Intel as of 09/12/2022. 1-node, 2x Intel® Xeon® Max 9480, HT On, Turbo On, SNC4, 128 GB HBM2e, BIOS Version SE5C7411.86B.8424.D03.2208100444, ucode revision=0x2c000020, CentOS Stream 8, Kernel 5.19, ParSeNet (SplineNet), PyTorch 1.11.0, Torch-CCL 1.2.0, IPEX 1.10.0, MKL (20220804), oneDNN (v2.6.0) Siemens Simcenter Star-CCM+ (civil, HlMach10AoA10Sou, kcs_with_physics, lemans_poly_17m.amg, reactor, TurboCharger7M) Intel® Xeon® 8480+: Test by Intel as of 14-Sep-22. 1-node, 2x Intel® Xeon® 8480+, HT On, Turbo On, 1024 GB DDR5-4800, BIOS Version EGSDCRB1.86B.0083.D22.2206290535, ucode revision=0xaa000090, CentOS Stream 8, Kernel 4.18, StarCCM+ 17.04.007 Intel® Xeon® Max 9480: Test by Intel as of 14-Sep-22. 1-node, 2x Intel® Xeon® Max 9480, HT On, Turbo On, SNC4, 128 GB HBM2e, BIOS Version SE5C7411.86B.8424.D03.2208100444, ucode revision=0x2c000020, CentOS Stream 8, Kernel 5.19, StarCCM+ 17.04.007 Cloverleaf (15360^2, 2955 Timesteps) AMD EPYC 7773X: Test by Intel as of 2/8/2023. Per node: 2x AMD EPYC 7773X, HT On, Turbo On, cTDP - 280, 256 GB DDR4-3200, BIOS Version M10, ucode revision=0xa001224, Rocky Linux 8.6, Kernel 4.18, Cloverleaf 0fdb917 (August 9th, 2021) build with Intel® Fortran Compiler Classic and Intel® MPI from 2022.3 Intel® oneAPI HPC Toolkit Intel Xeon 8480+: Test by Intel as of 2/8/2023. 
Per node: 2x Intel Xeon 8480+, HT On, Turbo On, 512 GB DDR5-4800, BIOS Version SE5C7411.86B.8424.D03.2208100444, ucode revision=0x2b000161, Rocky Linux 8.6, Kernel 4.18, Cloverleaf 0fdb917 (August 9th, 2021) build with Intel® Fortran Compiler Classic and Intel® MPI from 2022.3 Intel® oneAPI HPC Toolkit Intel Xeon Max 9480 HBM Only: Test by Intel as of 2/8/2023. Per node: 2x Intel Xeon Max 9480, HT On, Turbo On, 128 GB HBM2e, BIOS Version SE5C7411.86B.9409.D04.2212261349, ucode revision=0x2c000120, CentOS Stream 8, Kernel 5.19, Cloverleaf 0fdb917 (August 9th, 2021) build with Intel® Fortran Compiler Classic and Intel® MPI from 2022.3 Intel® oneAPI HPC Toolkit CosmoFlow (training on 8192 image batches) AMD EPYC 7773X: Test by Intel as of 10/7/2022. 1-node, 2x AMD EPYC 7773X, HT On, Turbo Core On, 512 GB DDR4-3200, BIOS M10, ucode 0xa001229, OS CentOS Stream 8, Kernel 4.18, Intel TensorFlow 2.8.0, horovod 0.22.1, keras 2.8.0, OpenMPI 4.1.0, ppn=8, LBS=16, ~25GB data, 16 epochs, Python 3.8 Intel® Xeon® 8480+ (AMX BF16): Test by Intel as of 10/18/2022. 1-node, 2x Intel® Xeon® 8480+, HT On, Turbo On, 512 GB DDR5-4800, BIOS EGSDCRB1.86B.0083.D22.2206290535, ucode 0xaa0000a0, CentOS Stream 8, kernel 4.18, AMX, BF16, Tensorflow 2.9.1, horovod 0.24.3, keras 2.9.0.dev2022021708, oneCCL 2021.5, ppn=8, LBS=16, ~25GB data, 16 epochs, Python 3.8 Intel® Xeon® Max 9480 (AMX BF16, cache mode): Test by Intel as of 10/18/2022. 1-node, 2x Intel® Xeon® Max 9480, HT On, Turbo On, 128 GB HBM2e and 512 GB DDR (16 slots/ 32 GB/ 4800 MHz), BIOS SE5C7411.86B.8424.D03.2208100444, ucode 0x2c000020, CentOS Stream 8, Kernel 5.19, AMX, BF16, TensorFlow 2.9.1, horovod 0.24.0, keras 2.9.0.dev2022021708, oneCCL 2021.5, ppn=8, LBS=16, ~25GB data, 16 epochs, Python 3.9 MF-LBM (240x240x260) AMD EPYC 7763: Test by Intel as of 3/3/2023. 
Per node: 2x AMD EPYC 7763, HT On, Turbo On, cTDP - 280, 256 GB DDR4-3200, BIOS Version 2.4 Rev 5.22, ucode revision=0xa001173, Rocky Linux 8.7, Kernel 4.18, MF-LBM source commit 7fcc6f0 as of Jan 8, 2023 AMD EPYC 7773X: Test by Intel as of 3/3/2023. Per node: 2x AMD EPYC 7773X, HT On, Turbo On, cTDP - 280, 256 GB DDR4-3200, BIOS Version M10 rev5.22, ucode revision=0xa001224, Rocky Linux 8.7, Kernel 4.18, MF-LBM source commit 7fcc6f0 as of Jan 8, 2023 Intel Xeon 8480+: Test by Intel as of 3/3/2023. Per node: 2x Intel Xeon 8480+, HT On, Turbo On, SNC4, 512 GB DDR5-4800, BIOS Version SE5C7411.86B.8901.D03.2210131232, ucode revision=0x2b0000a1, Rocky Linux 8.7, Kernel 4.18, MF-LBM source commit 7fcc6f0 as of Jan 8, 2023 Intel Xeon Max 9480: Test by Intel as of 3/3/2023. Per node: 2x Intel Xeon Max 9480, HT On, Turbo On, SNC4, 128 GB HBM2e, BIOS Version SE5C7411.86B.9105.D93.2211240636, ucode revision=0xac000100, Rocky Linux 8.7, Kernel 4.18, MF-LBM source commit 7fcc6f0 as of Jan 8, 2023. This offering is not approved or endorsed by OpenCFD Limited, producer and distributor of the OpenFOAM software via www.openfoam.com, and owner of the OPENFOAM® and OpenCFD® trademarks. MLPerf™ HPC-AI v0.7 Training benchmark performance. Result not verified by MLCommons Association. Unverified results have not been through an MLPerf™ review and may use measurement methodologies and/or workload implementations that are inconsistent with the MLPerf™ specification for verified results. The MLPerf™ name and logo are trademarks of MLCommons Association in the United States and other countries. All rights reserved. Unauthorized use strictly prohibited. See www.mlcommons.org for more information. |
Intel® Xeon® CPU Max series is capable of 220 GF/s of HPCG performance | Intel® Xeon® Max 9480: Test by Intel as of 9/2/2022. 1-node, 2x Intel® Xeon® Max 9480, HT On, Turbo On, SNC4, 128 GB HBM2e, BIOS Version SE5C7411.86B.8424.D03.2208100444, ucode revision=0x2c000020, CentOS Stream 8, Kernel 5.19, HPCG from MKL_v2022.1.0 |
Intel® Xeon® Max 9480 has 1.65x higher HPCG performance than AMD EPYC 9654 | Intel® Xeon® Max 9480: Test by Intel as of 9/2/2022. 1-node, 2x Intel® Xeon® Max 9480, HT On, Turbo On, SNC4, 128 GB HBM2e, BIOS Version SE5C7411.86B.8424.D03.2208100444, ucode revision=0x2c000020, CentOS Stream 8, Kernel 5.19, HPCG from MKL_v2022.1.0 AMD EPYC 9654: Test by Intel as of 03/27/23. 1-node, 2x AMD EPYC 9654, HT On, Turbo On, CTDP=360W, NPS=4, 1536GB DDR5-4800, BIOS 1.2, microcode 0xa101111, Red Hat Enterprise Linux 8.7, Kernel 4.18, AMD official HPCG binary |
One node of Intel® Xeon® Max 9480 provides the same performance on Altair AcuSolve as four nodes of 3rd Gen Intel® Xeon® Scalable processors with 57% lower CPU power and no DDR DIMMs | Altair AcuSolve (HQ Model) Intel® Xeon® 6346: Test by Intel as of 10/08/2022. 4-nodes connected via HDR-200, 2x Intel® Xeon® 6346, 16 cores, HT ON, Turbo ON, Quad, 256 GB DDR4-3200, BIOS Version SE5C6200.86B.0020.P23.2103261309, ucode 0xd000270, Rocky Linux 8.6, kernel version 4.18.0-372.19.1.el8_6.crt1.x86_64, Altair AcuSolve 2021R2, HDR200 Infiniband Intel® Xeon® Max 9480: Test by Intel as of 10/03/2022. 1-node, 2x Intel® Xeon® Max 9480, HT ON, Turbo ON, SNC4, 128 GB HBM2e, BIOS Version SE5C7411.86B.8424.D03.2208100444, ucode 2c000020, CentOS Stream 8, kernel version 5.19.0-rc6.0712.intel_next.1.x86_64+server, Altair AcuSolve 2021R2 |
32 nodes of Intel® Xeon® Max 9480 perform 77.21x better than one node of Intel Xeon 8360Y for climate modeling 8 nodes of Intel® Xeon® Max 9480 perform 25.5x better than one node of AMD EPYC 7763 for climate modeling | MPAS-A (dyncore 30-km) AMD EPYC 7763: Test by Intel as of 3/8/2023. Per node: 2x AMD EPYC 7763, HT On, Turbo On, cTDP - 280, 256 GB DDR4-3200, BIOS Version 2.4 Rev 5.22, ucode revision=0xa001173, Rocky Linux 8.7, Kernel 4.18, NVIDIA Mellanox HDR InfiniBand 200Gbps, OFED Stack mlnx-5.7-1.0.2.0. MPAS-A version 7.3. Input: 15-km dycore, double-precision. MPAS-A was compiled with Intel® Fortran Compiler Classic and Intel® MPI from 2022.3 Intel® oneAPI HPC Toolkit Intel Xeon 8360Y: Test by Intel as of 3/1/2023. Per node: 2x Intel Xeon 8380, HT On, Turbo On, 256 GB DDR4-3200, BIOS Version SE5C620.86B.01.01.0006.2207150335, ucode revision=0xd000375, Rocky Linux 8.7, Kernel 4.18, OmniPath 100Gbps, OFED Stack OPA orig-372.32.1_2.12.9. MPAS-A version 7.3. MPAS-A was compiled with Intel® Fortran Compiler Classic and Intel® MPI from 2022.3 Intel® oneAPI HPC Toolkit Intel Xeon 8480+: Test by Intel as of 3/1/2023. Per node: 2x Intel Xeon8480+, HT On, Turbo On, SNC4, 512 GB DDR5-4800, BIOS SE5C7411.86B.8713.D03.2209091345, ucode revision=0x2b000161, Rocky Linux 8.7, Kernel 4.18, NVIDIA Mellanox HDR InfiniBand 200Gbps, OFED Stack mlnx-5.7-1.0.2.0. MPAS-A version 7.3. MPAS-A was compiled with Intel® Fortran Compiler Classic and Intel® MPI from 2022.3 Intel® oneAPI HPC Toolkit Intel Xeon Max 9480 HBM Only: Test by Intel as of 3/3/2023. Per node: 2x Intel Xeon Max 9480, HT On, Turbo On, SNC4, 128 GB HBM2e, BIOS Version SE5C7411.86B.9409.D04.2212261349, ucode revision=0x2c000120, Rocky Linux 8.7, Kernel 4.18, NVIDIA Mellanox HDR InfiniBand 200Gbps, OFED Stack mlnx-5.7-1.0.2.0. MPAS-A version 7.3. MPAS-A was compiled with Intel® Fortran Compiler Classic and Intel® MPI from 2022.3 Intel® oneAPI HPC Toolkit |
Intel® Xeon® Max has 2x higher throughput at the same latency and 1/3 lower latency for the same throughput vs Intel® Xeon® 8480+ on LLM Inference | GPT-J (BF16 Inference, batch size 1,2,4,8) Intel® Xeon® 8480+: Test by Intel as of 4/25/2023. 1-node, 2x Intel Xeon 8480+, HT On, Turbo On, 1024GB DDR5-4800, BIOS Version 3A11.uh, ucode 0x2b000111, CentOS Stream 8, Kernel 5.16, GCC 11.2.1, PyTorch torch-2.0.0.dev20230228%2Bcpu, Intel PyTorch Extensions 2.0, Intel Transformers Extensions 1.0 Intel® Xeon® Max 9480: Test by Intel as of 4/25/2023. 1-node, 2x Intel Xeon Max 9480, HT On, Turbo On, 1024GB DDR5-4800 and 128GB HBM2e, BIOS Version SE5C7411.86B.9525.D13.2302071332, ucode 0x2c000170, CentOS Stream 8, Kernel 5.19, GCC 11.2.1 20220127, PyTorch torch-2.0.0.dev20230228%2Bcpu, Intel PyTorch Extensions 2.0, Intel Transformers Extensions 1.0 |
Texas Advanced Computing Center (TACC) sees 252% higher performance on Intel® Xeon® Max versus their current Frontera System across 15 workloads | Average performance increase from Frontera (2nd Gen Intel® Xeon® scalable processors) to Intel® Xeon® Max 9480. Applications compared were WRF CONUS 2.5km, Parsec liquid_water_64H2O_0.3A, Parsec Si1947H604_0.9A, Amber STMV_production_BP4_4fs, AWP-ODC Fortran, Seissol tpv5_1node, CESM PFS_Ld5_P56.mpasa120_mpasa, EWP MgB2_16, MILC grid 32x32x32x32, Enzo-E 128^3 root mech 512 root blocks, ISSM test435, MuST muffin_56x32, PSDNS 768x768x768, Plascom, and athena++ Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy |
Gaudi2 delivers 1.8x higher training performance for BERT-L | Gaudi2 is 1.5 - 2x faster than A100 for both training and inference. Habana ResNet50 Model: https://github.com/HabanaAI/Model-References/tree/master/TensorFlow/computer_vision/Resnets/resnet_keras Habana SynapseAI Container: https://vault.habana.ai/ui/repos/tree/General/gaudi-docker/1.7.0/ubuntu20.04/habanalabs/tensorflow-installer-tf-cpu-2.8.3 Habana Gaudi Performance: https://developer.habana.ai/resources/habana-training-models/ A100 / V100 Performance Source: https://ngc.nvidia.com/catalog/resources/nvidia:resnet_50_v1_5_for_tensorflow/performance, results published for DGX A100-40GB and DGX V100-32GB Habana BERT-Large Model: https://github.com/HabanaAI/Model-References/tree/master/TensorFlow/nlp/bert Habana SynapseAI Container: https://vault.habana.ai/ui/repos/tree/General/gaudi-docker/1.7.0/ubuntu20.04/habanalabs/tensorflow-installer-tf-cpu-2.8.3 Habana Gaudi Performance: https://developer.habana.ai/resources/habana-training-models/ A100 / V100 Performance Sources: https://ngc.nvidia.com/catalog/resources/nvidia:bert_for_tensorflow/performance, results published for DGX A100-40G and DGX V100-32G Measured January 2023 |
Gaudi2 delivers 2.4x higher fine-tuning performance for T5-3B | T5-3B Fine Tuning (batch size = 16) Habana Gaudi: Tested by Intel as of 12/14/2023, 1-node, 2S Intel® Xeon® 8380, HT On, Turbo On, 1024 GB DDR4-3200, 1x Habana Gaudi2 SynapseAI 1.9.0-580, NVIDIA A100: Tested by Intel as of 12/14/2023, 1-node, 2S Intel® Xeon® 8380, HT On, Turbo On, 1024 GB DDR4-3200, 1x NVIDIA A100 PCIe 80G, Driver 510.47.03, CUDA 11.6 https://huggingface.co/blog/habana-gaudi-2-benchmark |
Gaudi2 delivers 2.44x higher throughput for stable diffusion | https://huggingface.co/blog/habana-gaudi-2-benchmark#generating-images-from-text-with-stable-diffusion Habana Model scripts: https://github.com/HabanaAI/Model-References/tree/master/PyTorch/generative_models/stable-diffusion-v-2-1. Model performance: https://developer.habana.ai/resources/habana-models-performance/ Measured with SynapseAI 1.9.0 using BF16, batch size = 1, 50 steps with DDIM sampler. Results may vary. |
Gaudi2 delivers 60% higher power efficiency, measured in throughput per Watt, for inferencing large language models such as Bloom-176 Billion parameter model. Gaudi2 is 1.3X faster than A100-80G for BLOOMZ 176B inference. | BLOOMZ 176B Inference(batch size = 1, max length = 128, BF16) Habana Gaudi: Tested by Intel as of 4/3/2023, 1-node, 2S Intel® Xeon® 8380, HT On, Turbo On, 1024 GB DDR4-3200, 1x Habana Gaudi2 SynapseAI 1.9.0-580, NVIDIA A100: Tested by Intel as of 4/3/2023, 1-node, 2S Intel® Xeon® 8380, HT On, Turbo On, 1024 GB DDR4-3200, 1x NVIDIA A100 PCIe 80G, Driver 510.47.03, CUDA 11.6 Configuration used to measure power and performance: Software: Habana model scripts: https://github.com/HabanaAI/ModelReferences/tree/master/PyTorch/nlp/bloom GPU model scripts: https://huggingface.co/blog/bloom-inference-pytorch-scripts Measurements for Greedy Search, batch size = 1, max length = 128, BF16 Performance measured by Habana on following system and software configurations. Results may vary. |
Gaudi2 scales with 99% efficiency from 8 cards to 64 cards for Stable Diffusion Training | Stable Diffusion Inference (Training with BF16, batch size = 16, global batch size = 1024, for 1K iterations. Image size 256x256) Habana Gaudi: Tested by Intel as of 12/14/2023, Per Node, 2S Intel® Xeon® 8380, HT On, Turbo On, 1024 GB DDR4-3200, 8x Habana Gaudi2 SynapseAI 1.9.0-580 https://github.com/HabanaAI/Model-References/tree/master/PyTorch/generative_models/stable-diffusion-training |
Intel® Data Center GPU Max has 1.7x higher geomean HPC performance across 23 benchmarks and applications than NVIDIA A100 PCIe 80G Intel® Data Center GPU Max has 1.3x higher geomean HPC performance across 23 benchmarks and applications than NVIDIA H100 PCIe | BabelStream Triad: Intel® Data Center GPU Max 1550: Test by Intel as of 4/25/2023, 1-node 2x Intel Xeon 8360Y, HT On, Turbo On, 256 GB DDR4-3200, Ubuntu 20.04, Kernel 5.15 , ucode=0xd000375, IFWI 22WW51.5, oneAPI 2023.1, Triad array size 134217728, 100 iterations NVIDIA A100: Test by Intel as of 4/25/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 256 GB DDR4-3200, 1x PCIe NVIDIA A100 80G, Ubuntu 22.04, Kernel 5.15, ucode=0xd000363, driver 525.60.13, nvhpc 22.2, CUDA 12, Triad array size 134217728, 100 iterations NVIDIA H100: Test by Intel as of 4/25/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, total memory 512 GB DDR4-3200, 1x PCIe NVIDIA H100, Ubuntu 20.04, Kernel 5.15, ucode=0xd000389, driver 530.30.02, nvhpc 22.2, CUDA 12.0, Triad array size 134217728, 100 iterations RINF(SIZE=400000000, icase=39, stride=1, dyn.allocation=TRUE, avg.performance=L): Intel® Data Center GPU Max 1550: Test by Intel as of 4/25/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo On, 256 GB DDR4-3200, 1x Intel® Data Center GPU Max 1550, Ubuntu 22.04, Kernel 5.15, ucode=0xd000375, IFWI_WW50.5, ifx (IFX) 2023.1.0 20230320 NVIDIA A100: Test by Intel as of 4/25/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 256 GB DDR4-3200, 1x PCIe NVIDIA A100 80G, Ubuntu 22.04, Kernel 5.15, ucode=0xd000363, driver 525.60.13, nvhpc 23.3, nvfortran 23.3-0 NVIDIA H100: Test by Intel as of 4/26/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, total memory 512 GB DDR4-3200, 1x PCIe NVIDIA H100, Ubuntu 20.04, Kernel 5.15, ucode=0xd000389, driver 530.30.02, nvhpc 23.3, nvfortran 23.3-0 ISO3DFD Intel Data Center GPU Max: Test by Intel as of 4/27/2023, 1-node 1x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 256GB 
DDR4-3200, 1x Intel® Data Center GPU Max 1550, BIOS Version WLYDCRB1.SYS.0021.P25.2107280557, ucode = 0x8d0002e0, Ubuntu 20.04, Kernel 5.15, IFWI_WW50.5, Intel oneAPI C++ Compiler 2023.1.0 NVIDIA A100: Test by Intel as of 4/27/2023, 1-node 1x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 512GB DDR4-3200, 1x NVIDIA A100 80G PCIe, BIOS Version SE5C6200.86B.0022.D08.2103221623, ucode = 0xd000363, Ubuntu 22.04, Kernel 5.15, CUDA Driver 530.30.02, CUDA 11.8 nvcc NVIDIA H100: Test by Intel as of 4/27/2023, 1-node 1x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 512GB DDR4-3200, 1x NVIDIA H100 PCIe, BIOS Version SE5C6200.86B.0021.D40.2101090208, ucode = 0xd000375, Ubuntu 22.04, Kernel 5.15, CUDA Driver 525.60.13, CUDA 12.0 nvcc SpecFEM3D_Globe (global_s362ani_shakemovie) Intel Data Center GPU Max: Test by Intel as of 4/25/2023, 1-node 1x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 256GB DDR4-3200, 1x Intel® Data Center GPU Max 1550, BIOS Version WLYDCRB1.SYS.0021.P25.2107280557, ucode = 0x8d0002e0, Ubuntu 20.04, Kernel 5.15, IFWI_WW50.5, Intel oneAPI C++ Compiler 2023.1.0 NVIDIA A100: Test by Intel as of 4/25/2023, 1-node 1x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 512GB DDR4-3200, 1x NVIDIA A100 80G PCIe, BIOS Version SE5C6200.86B.0022.D08.2103221623, ucode = 0xd000363, Ubuntu 22.04, Kernel 5.15, CUDA 11.8, CUDA Driver 530.30.02, NVIDIA H100: Test by Intel as of 4/25/2023, 1-node 1x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 512GB DDR4-3200, 1x NVIDIA H100 PCIe, BIOS Version SE5C6200.86B.0021.D40.2101090208, ucode = 0xd000375, Ubuntu 22.04, Kernel 5.15, CUDA 12.0, CUDA Driver, 525.60.13 FSI Kernels (American Monte Carlo, Binomial Options, Black Scholes, European Monte Carlo) Intel® Data Center GPU Max 1550: Test by Intel as of 4/27/2023, 1-node 1x Intel® Xeon® 8360Y, HT On, Turbo On, 256GB DDR4-3200, ucode=0xd000375, BIOS Version WLYDCRB1.SYS.0021.P16.2105280638, Ubuntu 20.04, Kernel 5.15, 1x Intel® Data Center GPU Max 1550, fw_ver: PVC2_1.22505, oneAPI 2023.1 NVIDIA A100: Test by 
Intel as of 4/27/2023, 1-node 2x Intel® Xeon® Platinum 8360Y, HT On, Turbo On, 512GB DDR-3200, ucode=0xd000375, BIOS Version SE5C6200.86B.0022.D08.2103221623, CentOS Stream 8, Kernel 4.18, 1x NVIDIA A100 80GB PCIe, Driver Version: 530.30.02, CUDA SDK 12.1 NVIDIA H100: Test by Intel as of 4/27/2023, 1-node 2x Intel® Xeon® Platinum 8360Y, HT On, Turbo On, 512GB DDR-3200, ucode=0xd000389, BIOS Version SE5C6200.86B.0021.D40.2101090208, Ubuntu 22.04, Kernel 4.18, 1x NVIDIA H100 PCIe, Driver Version: 525.60.13, CUDA SDK 12.0 Riskfuel Credit Option Pricing Training (8 layers, width=1024, input_dims=5) Intel® Data Center GPU Max 1550: Test by Intel as of 2/8/2022. 1-node, 2x Intel® Xeon® 8480+, HT On, Turbo On, 1024 GB DDR5-4800, 1x Intel Datacenter GPU Max 1550, GPU Driver 1.3.23937, Ubuntu 20.04, Kernel 5.15; oneAPI DPC++/C++ Compiler 2023.0.0, Intel Python 3.9, intel-extension-for-pytorch 1.13.10+xpu, torch 1.13.0a0+gitb1dde16 NVIDIA A100: Test by Intel as of 2/16/2022. 1-node, 2x Intel® Xeon® 8360Y, HT On, Turbo On, 512 GB DDR4-3200, 1x NVIDIA A100 80G PCIe, GPU Driver 515.48.07, CentOS Stream 8, Kernel 4.18; CUDA 11.7, Python 3.9, torch 1.12.1+cu113 NVIDIA H100: Test by Intel as of 2/8/2022. 
1-node, 2x Intel® Xeon® 8360Y, HT On, Turbo On, 512 GB DDR4-3200, 1x NVIDIA H100 PCIe, GPU Driver 525.60.13, Ubuntu 22.04, Kernel 5.15, CUDA 12, Python 3.9, torch 2.0.0.dev20230202 AutoDock: Intel® Data Center GPU Max 1550: Test by Intel as of April 2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 256GB DDR4-3200, 1x Intel® Data Center GPU Max 1550, BIOS Version WLYDCRB1.SYS.0021.P25.2107280557, ucode 0x8d0002e0, Ubuntu 22.04, Kernel 5.15, GPU Driver 22.32.23937.16, Intel(R) oneAPI DPC++/C++ Compiler 2023.1.0 (2023.x.0.20221201) NVIDIA A100: Test by Intel as of April 2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 512GB DDR4-3200, 1x PCIe NVIDIA A100 80G, BIOS Version SE5C6200.86B.0022.D08.2103221623, ucode=0xd000363 Ubuntu 22.04, Kernel 5.15, driver 525.60.13, CUDA 12 NVIDIA H100: Test by Intel as of April 2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 512GB DDR4-3200, 1x PCIe NVIDIA A100 80G, BIOS Version SE5C6200.86B.0022.D08.2103221623, ucode=0xd000389 Ubuntu 22.04, Kernel 5.15, driver 530.30.02, CUDA 12 Workload settings: ffile, lfile=Proteins 1ac8, 1stp, 3ce3, 3tmn, 7cpa from github source repository, nrun=100 (number of runs of the genetic algorithm), lsmet=ad, sw (the local-search method), autostop=0 (no autostopping), heuristics=0 (no stopping by heuristic), xmloutput=0 (no xmloutput), resnam= (location of results directory providing energy evaluations). 
Measured docking time LAMMPS (Copper Liquid Crystal, Water): Intel Data Center GPU Max: Test by Intel as of 4/25/2023, 1-node 1x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 256GB DDR4-3200, 1x Intel® Data Center GPU Max 1550, BIOS Version WLYDCRB1.SYS.0021.P25.2107280557, ucode = 0x8d0002e0, Ubuntu 20.04, Kernel 5.15, IFWI_WW50.5, oneAPI 2023.0, LAMMPS Develop 8Feb2023 NVIDIA A100: Test by Intel as of 4/25/2023, 1-node 1x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 512GB DDR4-3200, 1x NVIDIA A100 80G PCIe, BIOS Version SE5C6200.86B.0022.D08.2103221623, ucode = 0xd000363, Ubuntu 22.04, Kernel 5.15, CUDA 12.0, CUDA Driver 530.30.02, LAMMPS Develop 8Feb2023 NVIDIA H100: Test by Intel as of 4/25/2023, 1-node 1x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 512GB DDR4-3200, 1x NVIDIA H100 PCIe, BIOS Version SE5C6200.86B.0021.D40.2101090208, ucode = 0xd000375, Ubuntu 22.04, Kernel 5.15, CUDA 12.0, CUDA Driver, 525.60.13, LAMMPS Develop 8Feb2023 miniBUDE (BM2 Long) : Intel® Data Center GPU Max 1550: Test by Intel as of 4/26/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo On, 256 GB DDR4-3200, 1x Intel® Data Center GPU Max 1550, Ubuntu 22.04, Kernel 5.15, ucode=0xd000375, driver 22.32.23937.16, oneAPI icpx 2023.1.0 NVIDIA A100: Test by Intel as of 4/26/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 256 GB DDR4-3200, 1x PCIe NVIDIA A100 80G, Ubuntu 22.04, Kernel 5.15, ucode=0xd000363, driver 525.60.13, CUDA 11.8 NVIDIA H100: Test by Intel as of 4/26/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, total memory 512 GB DDR4-3200, 1x PCIe NVIDIA H100, Ubuntu 20.04, Kernel 5.15, ucode=0xd000389, driver 530.30.02, CUDA 12.0 NAMD (STMV, APOA-1, Rest2): Intel® Data Center GPU Max 1550: Test by Intel as of 4/26/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo On, 256 GB DDR4-3200, 1x Intel® Data Center GPU Max 1550, Ubuntu 22.04, Kernel 5.15, ucode=0xd000375, driver 22.32.23937.16 NVIDIA A100: Test by Intel as of 4/26/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo 
Enabled, 256 GB DDR4-3200, 1x PCIe NVIDIA A100 80G, Ubuntu 22.04, Kernel 5.15, ucode=0xd000363, driver 525.60.13, CUDA 11.8 NVIDIA H100: Test by Intel as of 4/26/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, total memory 512 GB DDR4-3200, 1x PCIe NVIDIA H100, Ubuntu 22.04, Kernel 5.15, ucode=0xd000389, driver 530.30.02, CUDA 12.0 CoMLSim Intel® Data Center GPU Max 1550: Test by Intel as of 04/21/2023. 1-node, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz, HT On, Turbo On, Total Memory 256 GB (16x16GB 3200MT/s, Dual-Rank), BIOS Version WLYDCRB1.SYS.0021.P25.2107280557, CoMLSim, batch-size 2560, TensorFlow 2.12.0, ITEX 1.2.0, MKL (In basekit 2023.1.0.46401), oneDNN (Branch: rls-v3.1 commit id: 08638f8c) NVIDIA A100: Test by Intel as of 04/21/2023. GPU, GA100-80G, PCI ID 20B5, Intel® Xeon® Platinum 8360Y CPU @ 2.40GHz, 36 cores, HT On, Turbo On, Total Memory 512GB (16x32GB DDR4 3200 MT/s [3200 MT/s]), SE5C6200.86B.0022.D08.2103221623, Ubuntu 20.04.1 LTS, 5.13.0-28-generic, CoMLSim, TensorFlow 2.11, NVIDIA CUDA® 12.0.1, NVIDIA cuBLAS from CUDA 12.0.1, NVIDIA cuDNN 8.7.0 NVIDIA H100: Test by Intel as of 04/21/2023. 
GPU, H100, PCI ID 20B5, Intel® Xeon® Platinum 8360Y CPU @ 2.40GHz, 36 cores, HT On, Turbo On, Total Memory 512GB (16x32GB DDR4 3200 MT/s [3200 MT/s]), SE5C6200.86B.0022.D08.2103221623, Ubuntu 20.04.1 LTS, 5.13.0-28-generic, CoMLSim, TensorFlow 2.11, NVIDIA CUDA® 12.0.1, NVIDIA cuBLAS from CUDA 12.0.1, NVIDIA cuDNN 8.7.0 Jacobi Solver: Intel Data Center GPU Max: Test by Intel as of 4/24/2023, 1-node 1x Intel® Xeon® 8480+, HT On, Turbo Enabled, 512GB DDR5-4800, 1x Intel® Data Center GPU Max 1550, ucode = 0x8f000300, Ubuntu 20.04, Kernel 5.15, IFWI_WW50.5, driver 1.3.26002, oneAPI 2023.1.0, NVIDIA A100: Test by Intel as of 4/23/2023, 1-node 1x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 256GB DDR4-3200, 1x NVIDIA A100 80G PCIe, ucode = 0xd000363, Ubuntu 22.04, Kernel 5.15, CUDA 12.0, CUDA Driver 530.30.02, CUDA 12.0 NVIDIA H100: Test by Intel as of 4/24/2023, 1-node 1x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 256GB DDR4-3200, 1x NVIDIA H100 PCIe, ucode = 0xd000389, Ubuntu 22.04, Kernel 5.15, CUDA 12.0, CUDA Driver, 525.60.13, CUDA 12.0 3D-GAN Inference for Particle Shower Simulations Intel® Data Center GPU Max 1550: Test by Intel as of 3/31/2023, 1-node 2x Intel® Xeon® 8480+, HT On, Turbo On, 1024 GB DDR5-4800, 1x Intel® Data Center GPU Max 1550, Ubuntu 20.04, Kernel 5.15, BIOS Version EGSDCRB1.SYS.0077.D01.2203211346m, ucode=0x8f000300, agama-ci-devel-582, Intel(R) oneAPI DPC++/C++ Compiler 2023.1.0 (2023.1.0.20230302), Python 3.9.15, TensorFlow 2.11, Intel MPI 2021.9, NVIDIA A100: Test by Intel as of 3/31/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 256 GB DDR4-3200, 1x PCIe NVIDIA A100 80G, Ubuntu 20.04, Kernel 5.15, BIOS Version SE5C6200.86B.0022.D08.210322, ucode=0xd000363, GPU Driver 530.30.02, CUDA 12.1, Docker: nvidia/tensorflow:23.01-tf2-py3 (23.01 version for TF2.11) NVIDIA H100: Test by Intel as of 3/31/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 512 GB DDR4-3200, 1x PCIe NVIDIA H100, Ubuntu 20.04, Kernel 5.15, BIOS Version 
SE5C6200.86B.0021.D40.2102090208, ucode=0xd000389, GPU Driver 530.30.02, GNU Fortran (GCC) 9.4.0 g++ (GCC) 9.4.0, CUDA 12.1, Docker: nvidia/tensorflow:23.01-tf2-py3 (23.01 version for TF2.11) BigDFT: (https://gitlab.com/max-centre/benchmarks/-/tree/master/BigDFT/H2O/GPU) Intel® Data Center GPU Max 1550: Test by Intel as of 4/18/2023, 2x Intel® Xeon® 8480+, HT On, Turbo On, 1024 GB DDR5-4800, 1x Intel® Data Center GPU Max 1550 limited to 450W TDP, Ubuntu 20.04, Kernel 5.15, oneAPI 2023.1+intel-comp-rt/agama-ci-devel/543 NVIDIA A100: Test by Intel as of 4/19/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 256 GB DDR4-3200, 1x PCIe NVIDIA A100 80G, Ubuntu 20.04, Kernel 5.15, GPU Driver 530.30.02, oneAPI 2023.1+Cuda 12.0 NVIDIA H100: Test by Intel as of 4/18/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 512 GB DDR4-3200, 1x PCIe NVIDIA H100, Ubuntu 20.04, Kernel 5.15, GPU Driver 525.60.13, oneAPI 2023.1+Cuda 12.0 CloverLeaf Intel® Data Center GPU Max 1550: Test by Intel as of 4/23/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo On, 256 GB DDR4-3200, 1x Intel® Data Center GPU Max 1550, Ubuntu 20.04, Kernel 5.15, ucode=0xd000375, Agama-devel-543, intel/oneapi/2023.1.0 NVIDIA A100: Test by Intel as of 4/25/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 256 GB DDR4-3200, 1x PCIe NVIDIA A100 80G, Ubuntu 22.04, Kernel 5.15, ucode=0xd000363, driver 525.60.13, nvhpc 22.2, CUDA 11.8 NVIDIA H100: Test by Intel as of 4/25/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, total memory 512 GB DDR4-3200, 1x PCIe NVIDIA H100, Ubuntu 20.04, Kernel 5.15, ucode=0xd000389, driver 530.30.02, nvhpc 22.2, CUDA 12.0 DeepGalaxy Intel® Data Center GPU Max 1550: Test by Intel as of 4/14/2023, 1-node 2x Intel® Xeon® 8480+, HT On, Turbo On, 1024 GB DDR5-4800, 1x Intel® Data Center GPU Max 1550, Ubuntu 20.04, Kernel 5.15, ucode=0x8f000300, TensorFlow 2.12, Intel Python 3.9.16, Intel MPI 2021.9 oneAPI 2023.1.0.46401 NVIDIA A100: Test by Intel as of 
4/14/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 256 GB DDR4-3200, 1x PCIe NVIDIA A100 80G, Ubuntu 22.04, Kernel 5.15, ucode=0xd000363, CUDA 12.1, Docker: nvidia/tensorflow:23.01-tf2-py3 (23.01 version for TF2.11) NVIDIA H100: Test by Intel as of 4/14/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, total memory 512 GB DDR4-3200, 1x PCIe NVIDIA H100, Ubuntu 20.04, Kernel 5.15, ucode=0xd000389, CUDA 12.1, Docker: nvidia/tensorflow:23.01-tf2-py3 (23.01 version for TF2.11) DPEcho: Intel® Data Center GPU Max 1550: Test by Intel as of 1/31/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 256GB DDR4-3200, 1x Intel® Data Center GPU Max 1550, BIOS Version WLYDCRB1.SYS.0021.P25.2107280557, Ubuntu 20.04, Kernel 5.15, oneAPI icpx Nightly 20230109 Intel® Xeon® 8480+: Test by Intel as of 1/24/2023, 1-node 2x Intel® Xeon® 8480+, HT On, Turbo Enabled, 512GB DDR5-4800, Rocky Linux 8.7, Kernel 4.18, oneAPI icpx Nightly 2023.0.0 NVIDIA A100: Test by Intel as of 1/18/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 128GB DDR4-3200, 1x PCIe NVIDIA A100 80G, BIOS Version SE5C6200.86B.0022.D08.2103221623, Ubuntu 20.04, Kernel 5.15, GPU Driver 510.73.05, Intel LLVM 20230109, CUDA 11.6 NVIDIA H100: Testing as of 1/18/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 128GB DDR4-3200, 1x PCIe NVIDIA H100, BIOS Version SE5C6200.86B.0022.D08.2103221623, Ubuntu 20.04, Kernel 5.15, GPU Driver 525.60.13, Intel LLVM 20230109, CUDA 12.0 Workload settings: Alfvén wave for Grid sizes: 36³, 48³, 72³, 96³, 132³, 192³, 264³, 390³, 516³ cells. 
DPEcho GitHub: https://github.com/LRZ-BADW/DPEcho GENE: Intel® Data Center GPU Max 1550: Test by Intel as of 4/21/2023, 1-node 2x Intel® Xeon® 8480+, HT On, Turbo On, 1024 GB DDR5-4800, 1x Intel® Data Center GPU Max 1550, Ubuntu 20.04, Kernel 5.15, GPU Driver Agama-devel-573, IFWI_WW51.5, ifort (IFORT) 2021.9.0 2023010 Intel(R) oneAPI DPC++/C++ Compiler 2023.1.0 (2023.x.0.20230309) NVIDIA A100: Test by Intel as of 4/21/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, total memory 256 GB DDR4-3200, 1x PCIe NVIDIA A100 80G, Ubuntu 20.04, Kernel 5.15, GPU Driver 530.30.02, GNU Fortran (GCC) 9.4.0 g++ (GCC) 9.4.0 NVIDIA H100: Test by Intel as of 4/21/2023, 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, total memory 512 GB DDR4-3200, 1x PCIe NVIDIA H100, Ubuntu 20.04, Kernel 5.15, GPU Driver 525.60.13, GNU Fortran (GCC) 9.4.0 g++ (GCC) 9.4.0 GridQCD (Benchmark_dwf_fp32 - picked grid size per GPU for the highest TF throughput): Intel® Data Center GPU Max 1550: Test by Intel as of 4/27/2023, 1-node 1x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 256GB DDR4-3200, 1x Intel® Data Center GPU Max 1550, BIOS Version WLYDCRB1.SYS.0021.P16.2105280638, ucode = 0xd000375, Ubuntu 20.04, Kernel 5.15, Intel OneAPI C++ compiler (20230426), NEO UMD 026093, Grid Size 16x16x16x16 for 1T, 32x32x32x32 for 2T. 
NVIDIA A100: Test by Intel as of 4/27/2023, 1-node 1x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 256GB DDR4-3200, 1x NVIDIA A100 80G PCIe, BIOS Version SE5C6200.86B.0022.D08.2103221623, ucode = 0xd000363, Ubuntu 22.04, Kernel 5.15, CUDA 12, Grid Size 16x16x16x32 NVIDIA H100: Test by Intel as of 4/27/2023, 1-node 1x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 512GB DDR4-3200, 1x NVIDIA H100 PCIe, BIOS Version SE5C6200.86B.0021.D40.2101090208, ucode = 0xd000389, Ubuntu 22.04, Kernel 5.15, CUDA 12, Grid Size 16x16x16x32 QMCPack(NiO workload, 512 atoms NiO bulk, Nel = 6144, 8 and 16 walkers(threads per GPU)) Intel® Data Center GPU Max 1550: Test by Intel as of 2/1/2023, 1-node, 2x Intel® Xeon® Max 9480, HT On, Turbo On, 1024 GB DDR5-4800 and 128 GB HBM2e, 6x Intel® Data Center GPU Max 1550, SUSE Linux Enterprise Server 15 SP3, Kernel 5.3.18, nightly build software stack NVIDIA H100: 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 128GB DDR4-3200, 1x PCIe NVIDIA H100, Ubuntu 20.04, Kernel 5.15, GPU Driver 525.60.13, CUDA 12.0 |
Intel® Data Center GPU Max leads competitors in single GPU and single node performance for plasma fusion simulations with XGC | Sunspot, Intel® Data Center GPU Max 1550: Testing as of 5/12/2023 by Argonne National Laboratory. Each Node: 2x 52C Intel® Xeon® Max CPU, 6x Intel® Data Center GPU Max Polaris, NVIDIA A100: Testing as of 5/12/2023 by Argonne National Laboratory. Each Node: 1x AMD EPYC Milan, 4x NVIDIA A100 40G PCIe Crusher, AMD Instinct MI250X: Testing as of 5/12/2023 by Argonne National Laboratory. Each Node: 1x Optimized 3rd Gen AMD EPYC, 4x AMD Instinct MI250X Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy |
Intel® Data Center GPU Max leads competitors in OpenMC performance up to 100 GPUs with twice the performance of competing GPUs | Sunspot, Intel® Data Center GPU Max 1550: Testing as of 5/12/2023 by Argonne National Laboratory. Each Node: 2x 52C Intel® Xeon® Max CPU, 6x Intel® Data Center GPU Max Polaris, NVIDIA A100: Testing as of 5/12/2023 by Argonne National Laboratory. Each Node: 1x AMD EPYC Milan, 4x NVIDIA A100 40G PCIe Crusher, AMD Instinct MI250X: Testing as of 5/12/2023 by Argonne National Laboratory. Each Node: 1x Optimized 3rd Gen AMD EPYC, 4x AMD Instinct MI250X Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy |
Intel® Data Center GPU Max scales up to 192 ranks for CosmicTagger | Sunspot, Intel® Data Center GPU Max 1550: Testing as of 5/12/2023 by Argonne National Laboratory. Each Node: 2x 52C Intel® Xeon® Max CPU, 6x Intel® Data Center GPU Max Polaris, NVIDIA A100: Testing as of 5/12/2023 by Argonne National Laboratory. Each Node: 1x AMD EPYC Milan, 4x NVIDIA A100 40G PCIe Crusher, AMD Instinct MI250X: Testing as of 5/12/2023 by Argonne National Laboratory. Each Node: 1x Optimized 3rd Gen AMD EPYC, 4x AMD Instinct MI250X Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy |
Intel® Data Center GPU Max leads competitors in single GPU and single node performance for Quantum Monte Carlo | QMCPack(NiO workload, 512 atoms NiO bulk, Nel = 6144) Intel® Data Center GPU Max 1550: Test by Intel as of 2/1/2023, 1-node, 2x Intel® Xeon® Max 9480, HT On, Turbo On, 1024 GB DDR5-4800 and 128 GB HBM2e, 6x Intel® Data Center GPU Max 1550, SUSE Linux Enterprise Server 15 SP3, Kernel 5.3.18, nightly build software stack NVIDIA H100: 1-node 2x Intel® Xeon® 8360Y, HT On, Turbo Enabled, 128GB DDR4-3200, 1x PCIe NVIDIA H100, Ubuntu 20.04, Kernel 5.15, GPU Driver 525.60.13, CUDA 12.0 |
Intel® Data Center GPU Max performs computational chemistry at scale 72% faster than competing GPUs | Sunspot, Intel® Data Center GPU Max 1550: Testing as of 5/12/2023 by Argonne National Laboratory. Each Node: 2x 52C Intel® Xeon® Max CPU, 6x Intel® Data Center GPU Max Polaris, NVIDIA A100: Testing as of 5/12/2023 by Argonne National Laboratory. Each Node: 1x AMD EPYC Milan, 4x NVIDIA A100 40G PCIe Crusher, AMD Instinct MI250X: Testing as of 5/12/2023 by Argonne National Laboratory. Each Node: 1x Optimized 3rd Gen AMD EPYC, 4x AMD Instinct MI250X Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy |
Intel® Data Center GPU Max scales up to 192 ranks for Connectomics | Sunspot, Intel® Data Center GPU Max 1550: Testing as of 5/12/2023 by Argonne National Laboratory. Each Node: 2x 52C Intel® Xeon® Max CPU, 6x Intel® Data Center GPU Max Polaris, NVIDIA A100: Testing as of 5/12/2023 by Argonne National Laboratory. Each Node: 1x AMD EPYC Milan, 4x NVIDIA A100 40G PCIe Crusher, AMD Instinct MI250X: Testing as of 5/12/2023 by Argonne National Laboratory. Each Node: 1x Optimized 3rd Gen AMD EPYC, 4x AMD Instinct MI250X Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy |
oneAPI drives cross-platform performance on fusion simulations with GRILLIX and PARALLAX, including a 9.3x speedup on Intel® Data Center GPU Max Series | GRILLIX/PARALLAX Intel® Xeon® 8380: Test by Intel as of 4/19/2023, 1-node, 2S Intel® Xeon® 8380, HT On, Turbo On, 256 GB DDR4-3200, BIOS Version: SE5C620.86B.01.01.0006.2207150335, ucode 0xd000375, Rocky Linux 4.7, Kernel 4.18, Intel LLVM Fortran Compiler 2023.1.0 Intel® Xeon® 8480+: Test by Intel as of 4/19/2023, 1-node, 2S Intel® Xeon® 8480+, HT On, Turbo On, 256 GB DDR5-4800, BIOS Version S2EG1SI21A, ucode 0x2b000161, Ubuntu 20.04, Kernel 5.15, Intel LLVM Fortran Compiler 2023.1.0 Intel® Data Center GPU Max 1550: Test by Intel as of 4/19/2023, 1-node, 2S Intel® Xeon® 8480+, HT On, Turbo On, 256 GB DDR5-4800, BIOS Version S2EG1SI21A, ucode 0x2b000161, 1x Intel® Data Center GPU Max 1550, Ubuntu 20.04, Kernel 5.15, Intel LLVM Fortran Compiler 2023.1.0 |
Intel® Xeon® Max has 4.48x higher performance than competing offerings for quantum chemistry | CP2K: NVIDIA A100: Test by Intel as of 3/31/2023. 1-node, 2x Intel® Xeon® 8360Y, HT On, Turbo On, 256 GB DDR4-3200, BIOS Version SEC6200.86B.0022.D08.2103221623, ucode = 0xd000363, 1x NVIDIA A100 PCIe 80G, Ubuntu 22.04, Kernel 5.15, Intel(R) MPI Library, Version 2021.9 Build 20230307, 4 ranks, 36 threads Intel® Xeon® 8380: Test by Intel as of 4/18/2023. 1-node, 2x Intel® Xeon® 8380, HT On, Turbo On, 256 GB DDR4-3200, BIOS Version SEC620.86B.01.01.0006.2207150335, ucode = 0xd000375, Rocky Linux 8.7, Kernel 4.18, Intel(R) MPI Library for Linux* OS, Version 2021.8 Build 20221129, 40 ranks, 4 threads Intel® Xeon® 8480+: Test by Intel as of 1/25/2023. 1-node, 2x Intel® Xeon® 8480+, HT On, Turbo On, 512 GB DDR5-4800, BIOS Version SE5C7411.86B9525.D132302071332, ucode = 0x2b000190, Rocky Linux 8.7, Kernel 4.18, Intel(R) MPI Library for Linux* OS, Version 2021.8 Build 20221129, 32 ranks, 7 threads Intel® Xeon® Max 9480: Test by Intel as of 1/25/2023. 1-node, 2x Intel® Xeon® Max 9480, HT On, Turbo On, 128 GB HBM2e, BIOS Version SE5C7411.86B9525.D132302071332, ucode = 0x20000170, Rocky Linux 8.7, Kernel 4.18, Intel(R) MPI Library for Linux* OS, Version 2021.8 Build 20221129, 32 ranks, 7 threads |