Performance Index

ID Date Classification
615781 11/27/2024 Public

Intel® Gaudi® AI Accelerator

Series Presented Use Case Claim Processor Systems Measured Measurement Measurement Period Software
Intel® Gaudi® 3 accelerator Refresh of Gaudi 3 collateral Inference throughput vs NV H200 1.6X perf/$ Inference throughput vs NV H200 1* Intel Gaudi 3 AI accelerator, 1*Nvidia H200 GPU Data Source: TensorRT-LLM/docs/source/performance/perf-overview.md at main · NVIDIA/TensorRT-LLM · GitHub. Pricing estimates based on publicly available information and Intel Internal Analysis.

Input Output sequences per GPU vs Intel Gaudi 3 AI Accelerator.

Intel measured results obtained on October 18 2024 Software: Intel Gaudi software release 1.18.0. See Nvidia link for H200 software details
Intel® Gaudi® 3 accelerator Refresh of Gaudi 3 collateral Inference throughput vs NV H200 0.86 Speedup relative to NV H200 measured for LLaMA 3 8B, 128 input 128 output 1* Intel Gaudi 3 AI accelerator, 1*Nvidia H200 GPU Data Source: TensorRT-LLM/docs/source/performance/perf-overview.md at main · NVIDIA/TensorRT-LLM · GitHub. Pricing estimates based on publicly available information and Intel Internal Analysis.

Input Output sequences per GPU vs Intel Gaudi 3 AI Accelerator.

Intel measured results obtained on October 18 2024 Software: Intel Gaudi software release 1.18.0. See Nvidia link for H200 software details
Intel® Gaudi® 3 accelerator Refresh of Gaudi 3 collateral Inference throughput vs NV H200 0.81 Speedup relative to NV H200 measured for LLaMA 3 8B, 128 input 2048 output 1* Intel Gaudi 3 AI accelerator, 1*Nvidia H200 GPU Data Source: TensorRT-LLM/docs/source/performance/perf-overview.md at main · NVIDIA/TensorRT-LLM · GitHub. Pricing estimates based on publicly available information and Intel Internal Analysis.

Input Output sequences per GPU vs Intel Gaudi 3 AI Accelerator.

Intel measured results obtained on October 18 2024 Software: Intel Gaudi software release 1.18.0. See Nvidia link for H200 software details
Intel® Gaudi® 3 accelerator Refresh of Gaudi 3 collateral Inference throughput vs NV H200 0.73 Speedup relative to NV H200 measured for LLaMA 3 8B, 2048 input 128 output 1* Intel Gaudi 3 AI accelerator, 1*Nvidia H200 GPU Data Source: TensorRT-LLM/docs/source/performance/perf-overview.md at main · NVIDIA/TensorRT-LLM · GitHub. Pricing estimates based on publicly available information and Intel Internal Analysis.

Input Output sequences per GPU vs Intel Gaudi 3 AI Accelerator.

Intel measured results obtained on October 18 2024 Software: Intel Gaudi software release 1.18.0. See Nvidia link for H200 software details
Intel® Gaudi® 3 accelerator Refresh of Gaudi 3 collateral Inference throughput vs NV H200 0.70 Speedup relative to NV H200 measured for LLaMA 3 8B, 2048 input 2048 output 1* Intel Gaudi 3 AI accelerator, 1*Nvidia H200 GPU Data Source: TensorRT-LLM/docs/source/performance/perf-overview.md at main · NVIDIA/TensorRT-LLM · GitHub. Pricing estimates based on publicly available information and Intel Internal Analysis.

Input Output sequences per GPU vs Intel Gaudi 3 AI Accelerator.

Intel measured results obtained on October 18 2024 Software: Intel Gaudi software release 1.18.0. See Nvidia link for H200 software details
Intel® Gaudi® 3 accelerator Refresh of Gaudi 3 collateral Inference throughput vs NV H200 0.70 Speedup relative to NV H200 measured for LLaMA 2 70B, 128 input 128 output 1* Intel Gaudi 3 AI accelerator, 1*Nvidia H200 GPU Data Source: TensorRT-LLM/docs/source/performance/perf-overview.md at main · NVIDIA/TensorRT-LLM · GitHub. Pricing estimates based on publicly available information and Intel Internal Analysis.

Input Output sequences per GPU vs Intel Gaudi 3 AI Accelerator.

Intel measured results obtained on October 18 2024 Software: Intel Gaudi software release 1.18.0. See Nvidia link for H200 software details
Intel® Gaudi® 3 accelerator Refresh of Gaudi 3 collateral Inference throughput vs NV H200 0.93 Speedup relative to NV H200 measured for LLaMA 2 70B, 128 input 2048 output 1* Intel Gaudi 3 AI accelerator, 1*Nvidia H200 GPU Data Source: TensorRT-LLM/docs/source/performance/perf-overview.md at main · NVIDIA/TensorRT-LLM · GitHub. Pricing estimates based on publicly available information and Intel Internal Analysis.

Input Output sequences per GPU vs Intel Gaudi 3 AI Accelerator.

Intel measured results obtained on October 18 2024 Software: Intel Gaudi software release 1.18.0. See Nvidia link for H200 software details
Intel® Gaudi® 3 accelerator Refresh of Gaudi 3 collateral Inference throughput vs NV H200 0.67 Speedup relative to NV H200 measured for LLaMA 2 70B, 2048 input 128 output 1* Intel Gaudi 3 AI accelerator, 1*Nvidia H200 GPU Data Source: TensorRT-LLM/docs/source/performance/perf-overview.md at main · NVIDIA/TensorRT-LLM · GitHub. Pricing estimates based on publicly available information and Intel Internal Analysis.

Input Output sequences per GPU vs Intel Gaudi 3 AI Accelerator.

Intel measured results obtained on October 18 2024 Software: Intel Gaudi software release 1.18.0. See Nvidia link for H200 software details
Intel® Gaudi® 3 accelerator Refresh of Gaudi 3 collateral Inference throughput vs NV H200 0.77 Speedup relative to NV H200 measured for LLaMA 2 70B, 2048 input 2048 output 1* Intel Gaudi 3 AI accelerator, 1*Nvidia H200 GPU Data Source: TensorRT-LLM/docs/source/performance/perf-overview.md at main · NVIDIA/TensorRT-LLM · GitHub. Pricing estimates based on publicly available information and Intel Internal Analysis.

Input Output sequences per GPU vs Intel Gaudi 3 AI Accelerator.

Intel measured results obtained on October 18 2024 Software: Intel Gaudi software release 1.18.0. See Nvidia link for H200 software details
Intel® Gaudi® 3 accelerator Launch event 24th Sept 2024 Inference LLaMA 3 8B ~1.09X Inference Throughput LLaMA 3 8B Intel Gaudi 3 AI accelerator vs. H100 1* Intel Gaudi 3 AI accelerator (128GB HBM), 1*Nvidia H100 GPU (80GB HBM) Data Source: https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/performance/perf-overview.md input-output sequences: 128-2048tps on 1 accelerator/GPU Intel measured results obtained on September 9th 2024 Software: Intel Gaudi software release 1.18.0. See Nvidia link for H100 software details
Intel® Gaudi® 3 accelerator Launch event 24th Sept 2024 Inference LLaMA 3 8B 1.8X perf/$ Inference Throughput LLaMA 3 8B Intel Gaudi 3 AI accelerator vs. H100 1* Intel Gaudi 3 AI accelerator (128GB HBM), 1*Nvidia H100 GPU (80GB HBM) Data Source: https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/performance/perf-overview.md input-output sequences: 128-2048tps on 1 accelerator/GPU Intel measured results obtained on September 9th 2024 Software: Intel Gaudi software release 1.18.0. See Nvidia link for H100 software details
Intel® Gaudi® 3 accelerator Launch event 24th Sept 2024 Inference LLaMA 2 70B 1.19X Inference Throughput LLaMA 2 70B Intel Gaudi 3 AI accelerator vs. H100 2* Intel Gaudi 3 AI accelerator (128GB HBM), 2*Nvidia H100 GPU (80GB HBM) Data Source: https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/performance/perf-overview.md input-output sequences: 128-2048tps on 2 accelerators/GPUs Intel measured results obtained on September 9th 2024 Software: Intel Gaudi software release 1.18.0. See Nvidia link for H100 software details
Intel® Gaudi® 3 accelerator Launch event 24th Sept 2024 Inference LLaMA 2 70B ~2X perf/$ Inference Throughput LLaMA 2 70B Intel Gaudi 3 AI accelerator vs. H100 2* Intel Gaudi 3 AI accelerator (128GB HBM), 2*Nvidia H100 GPU (80GB HBM) Data Source: https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/performance/perf-overview.md input-output sequences: 128-2048tps on 2 accelerators/GPUs Intel measured results obtained on September 9th 2024 Software: Intel Gaudi software release 1.18.0. See Nvidia link for H100 software details
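
The perf/$ claims above combine a relative throughput figure with a relative price. The table does not state the prices used or how the per-configuration H200 ratios were aggregated into the 1.6X figure, so the following is only a minimal sketch of the arithmetic. The use of a simple mean, the function names, and the resulting implied price ratio are illustrative assumptions, not Intel's stated method or data.

```python
from statistics import geometric_mean, mean

# Gaudi 3 throughput relative to H200, copied from the H200 rows above.
# Keys are (model, input tokens, output tokens).
h200_ratios = {
    ("LLaMA 3 8B", 128, 128): 0.86,
    ("LLaMA 3 8B", 128, 2048): 0.81,
    ("LLaMA 3 8B", 2048, 128): 0.73,
    ("LLaMA 3 8B", 2048, 2048): 0.70,
    ("LLaMA 2 70B", 128, 128): 0.70,
    ("LLaMA 2 70B", 128, 2048): 0.93,
    ("LLaMA 2 70B", 2048, 128): 0.67,
    ("LLaMA 2 70B", 2048, 2048): 0.77,
}


def perf_per_dollar(throughput_ratio: float, price_ratio: float) -> float:
    """Relative perf/$ = relative throughput / relative price."""
    return throughput_ratio / price_ratio


def implied_price_ratio(throughput_ratio: float, perf_per_dollar_ratio: float) -> float:
    """Relative price that would make the two ratios consistent."""
    return throughput_ratio / perf_per_dollar_ratio


# ASSUMPTION: aggregate the eight configurations with a simple mean.
# The source does not say which aggregate (if any) was used.
avg_ratio = mean(h200_ratios.values())
geo_ratio = geometric_mean(h200_ratios.values())

# Hypothetical: the price ratio under which the averaged throughput
# would correspond to the 1.6X perf/$ claim. Not a published price.
hypothetical_price = implied_price_ratio(avg_ratio, 1.6)

print(f"mean throughput ratio:      {avg_ratio:.3f}")
print(f"geometric mean ratio:       {geo_ratio:.3f}")
print(f"hypothetical price ratio:   {hypothetical_price:.3f}")
print(f"perf/$ at that price ratio: {perf_per_dollar(avg_ratio, hypothetical_price):.2f}")
```

The point of the sketch is only that a throughput ratio below 1 can still correspond to a perf/$ ratio above 1 when the relative price is low enough. The actual pricing inputs are those referenced in the Data Source entries.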