Performance Index

ID: 615781
Date: 11/18/2024
Classification: Public
Edge

GPU(s) | Use Case | Claim | System Configuration | Measurement Period
Intel® Arc™ A750 | H.265 1080p video decoding | Intel Arc A750 delivers up to 84% better media processing performance than NVIDIA RTX 4070-S | Host setup: 12th Gen Intel® Core™ i7-12700, 2.1GHz, Turbo & Hyperthreading Enabled, 64GB RAM, 465.8 GB WDC WDS500G2B0B-00YS70. BIOS A.D0, OS Ubuntu 22.04.3 LTS. Intel Arc A750 GPU, 28 Xe-cores, 2050MHz Graphics Clock, 8GB GDDR6, 225W TBP. OS Ubuntu 22.04.1 LTS, Intel Compute Runtime driver 23.30.26918.9, OpenVINO 2024.0, DLStreamer 2024.0.1. NVIDIA RTX 4070-Super, 7168 cores, 12GB GDDR6X, 2550MHz boost clock, 220W TBP. NVIDIA RTX 3070, 5888 cores, 1695MHz boost clock, 8GB GDDR6, 220W TBP. OS Ubuntu 22.04.1 LTS, NVIDIA driver 545.23.08, DeepStream 6.4, TensorRT 24.01. Maximum capability is the number of streams that can run at 30 FPS. The input was an H.265 1080p 2 Mbps single-object video. | August 2024
Intel® Arc™ A750 | H.265 1080p video decoding | Intel Arc A750 delivers over 120% better media processing performance than NVIDIA RTX 3070 | Host setup: 12th Gen Intel® Core™ i7-12700, 2.1GHz, Turbo & Hyperthreading Enabled, 64GB RAM, 465.8 GB WDC WDS500G2B0B-00YS70. BIOS A.D0, OS Ubuntu 22.04.3 LTS. Intel Arc A750 GPU, 28 Xe-cores, 2050MHz Graphics Clock, 8GB GDDR6, 225W TBP. OS Ubuntu 22.04.1 LTS, Intel Compute Runtime driver 23.30.26918.9, OpenVINO 2024.0, DLStreamer 2024.0.1. NVIDIA RTX 4070-Super, 7168 cores, 12GB GDDR6X, 2550MHz boost clock, 220W TBP. NVIDIA RTX 3070, 5888 cores, 1695MHz boost clock, 8GB GDDR6, 220W TBP. OS Ubuntu 22.04.1 LTS, NVIDIA driver 545.23.08, DeepStream 6.4, TensorRT 24.01. Maximum capability is the number of streams that can run at 30 FPS. The input was an H.265 1080p 2 Mbps single-object video. | August 2024
Intel® Arc™ A750 | AI Inference: Yolo-V5S, BS16, INT8 | Intel Arc A750 delivers up to 28% better AI inference performance than NVIDIA RTX 4070-S | Host setup: 12th Gen Intel® Core™ i7-12700, 2.1GHz, Turbo & Hyperthreading Enabled, 64GB RAM, 465.8 GB WDC WDS500G2B0B-00YS70. BIOS A.D0, OS Ubuntu 22.04.3 LTS. Intel Arc A750 GPU, 28 Xe-cores, 2050MHz Graphics Clock, 8GB GDDR6, 225W TBP. OS Ubuntu 22.04.1 LTS, Intel Compute Runtime driver 23.30.26918.9, OpenVINO 2024.0, DLStreamer 2024.0.1. NVIDIA RTX 4070-Super, 7168 cores, 12GB GDDR6X, 2550MHz boost clock, 220W TBP. NVIDIA RTX 3070, 5888 cores, 1695MHz boost clock, 8GB GDDR6, 220W TBP. OS Ubuntu 22.04.1 LTS, NVIDIA driver 545.23.08, DeepStream 6.4, TensorRT 24.01. AI benchmark target networks: NVIDIA: YoloV5-M (INT8 model), YoloV5-S (INT8 model). Intel: Yolo-V5S, Yolo-V5M (quantized models follow the standard guide). | August 2024
Intel® Arc™ A750 | AI Inference: Yolo-V5S, BS16, INT8 | Intel Arc A750 delivers up to 68% better AI inference performance than NVIDIA RTX 3070 | Host setup: 12th Gen Intel® Core™ i7-12700, 2.1GHz, Turbo & Hyperthreading Enabled, 64GB RAM, 465.8 GB WDC WDS500G2B0B-00YS70. BIOS A.D0, OS Ubuntu 22.04.3 LTS. Intel Arc A750 GPU, 28 Xe-cores, 2050MHz Graphics Clock, 8GB GDDR6, 225W TBP. OS Ubuntu 22.04.1 LTS, Intel Compute Runtime driver 23.30.26918.9, OpenVINO 2024.0, DLStreamer 2024.0.1. NVIDIA RTX 4070-Super, 7168 cores, 12GB GDDR6X, 2550MHz boost clock, 220W TBP. NVIDIA RTX 3070, 5888 cores, 1695MHz boost clock, 8GB GDDR6, 220W TBP. OS Ubuntu 22.04.1 LTS, NVIDIA driver 545.23.08, DeepStream 6.4, TensorRT 24.01. AI benchmark target networks: NVIDIA: YoloV5-M (INT8 model), YoloV5-S (INT8 model). Intel: Yolo-V5S, Yolo-V5M (quantized models follow the standard guide). | August 2024
Intel® Arc™ A750 | AI Inference: Yolo-V5S, BS16, INT8 | Intel Arc A750 delivers up to 22% better AI Performance/Watt than NVIDIA RTX 4070-S | Host setup: 12th Gen Intel® Core™ i7-12700, 2.1GHz, Turbo & Hyperthreading Enabled, 64GB RAM, 465.8 GB WDC WDS500G2B0B-00YS70. BIOS A.D0, OS Ubuntu 22.04.3 LTS. Intel Arc A750 GPU, 28 Xe-cores, 2050MHz Graphics Clock, 8GB GDDR6, 225W TBP. OS Ubuntu 22.04.1 LTS, Intel Compute Runtime driver 23.30.26918.9, OpenVINO 2024.0, DLStreamer 2024.0.1. NVIDIA RTX 4070-Super, 7168 cores, 12GB GDDR6X, 2550MHz boost clock, 220W TBP. NVIDIA RTX 3070, 5888 cores, 1695MHz boost clock, 8GB GDDR6, 220W TBP. OS Ubuntu 22.04.1 LTS, NVIDIA driver 545.23.08, DeepStream 6.4, TensorRT 24.01. AI benchmark target networks: NVIDIA: YoloV5-M (INT8 model), YoloV5-S (INT8 model). Intel: Yolo-V5S, Yolo-V5M (quantized models follow the standard guide). | August 2024
Intel® Arc™ A750 | AI Inference: Yolo-V5S, BS16, INT8 | Intel Arc A750 delivers up to 64% better AI Performance/Watt than NVIDIA RTX 3070 | Host setup: 12th Gen Intel® Core™ i7-12700, 2.1GHz, Turbo & Hyperthreading Enabled, 64GB RAM, 465.8 GB WDC WDS500G2B0B-00YS70. BIOS A.D0, OS Ubuntu 22.04.3 LTS. Intel Arc A750 GPU, 28 Xe-cores, 2050MHz Graphics Clock, 8GB GDDR6, 225W TBP. OS Ubuntu 22.04.1 LTS, Intel Compute Runtime driver 23.30.26918.9, OpenVINO 2024.0, DLStreamer 2024.0.1. NVIDIA RTX 4070-Super, 7168 cores, 12GB GDDR6X, 2550MHz boost clock, 220W TBP. NVIDIA RTX 3070, 5888 cores, 1695MHz boost clock, 8GB GDDR6, 220W TBP. OS Ubuntu 22.04.1 LTS, NVIDIA driver 545.23.08, DeepStream 6.4, TensorRT 24.01. AI benchmark target networks: NVIDIA: YoloV5-M (INT8 model), YoloV5-S (INT8 model). Intel: Yolo-V5S, Yolo-V5M (quantized models follow the standard guide). | August 2024
Intel® Arc™ A750 | AI Inference: Yolo-V5M, BS32, INT8 | Intel Arc A750 delivers up to 19% better AI inference performance than NVIDIA RTX 4070-S | Host setup: 12th Gen Intel® Core™ i7-12700, 2.1GHz, Turbo & Hyperthreading Enabled, 64GB RAM, 465.8 GB WDC WDS500G2B0B-00YS70. BIOS A.D0, OS Ubuntu 22.04.3 LTS. Intel Arc A750 GPU, 28 Xe-cores, 2050MHz Graphics Clock, 8GB GDDR6, 225W TBP. OS Ubuntu 22.04.1 LTS, Intel Compute Runtime driver 23.30.26918.9, OpenVINO 2024.0, DLStreamer 2024.0.1. NVIDIA RTX 4070-Super, 7168 cores, 12GB GDDR6X, 2550MHz boost clock, 220W TBP. NVIDIA RTX 3070, 5888 cores, 1695MHz boost clock, 8GB GDDR6, 220W TBP. OS Ubuntu 22.04.1 LTS, NVIDIA driver 545.23.08, DeepStream 6.4, TensorRT 24.01. AI benchmark target networks: NVIDIA: YoloV5-M (INT8 model), YoloV5-S (INT8 model). Intel: Yolo-V5S, Yolo-V5M (quantized models follow the standard guide). | August 2024
Intel® Arc™ A750 | AI Inference: Yolo-V5M, BS32, INT8 | Intel Arc A750 delivers up to 58% better AI inference performance than NVIDIA RTX 3070 | Host setup: 12th Gen Intel® Core™ i7-12700, 2.1GHz, Turbo & Hyperthreading Enabled, 64GB RAM, 465.8 GB WDC WDS500G2B0B-00YS70. BIOS A.D0, OS Ubuntu 22.04.3 LTS. Intel Arc A750 GPU, 28 Xe-cores, 2050MHz Graphics Clock, 8GB GDDR6, 225W TBP. OS Ubuntu 22.04.1 LTS, Intel Compute Runtime driver 23.30.26918.9, OpenVINO 2024.0, DLStreamer 2024.0.1. NVIDIA RTX 4070-Super, 7168 cores, 12GB GDDR6X, 2550MHz boost clock, 220W TBP. NVIDIA RTX 3070, 5888 cores, 1695MHz boost clock, 8GB GDDR6, 220W TBP. OS Ubuntu 22.04.1 LTS, NVIDIA driver 545.23.08, DeepStream 6.4, TensorRT 24.01. AI benchmark target networks: NVIDIA: YoloV5-M (INT8 model), YoloV5-S (INT8 model). Intel: Yolo-V5S, Yolo-V5M (quantized models follow the standard guide). | August 2024
Intel® Arc™ A750 | AI Inference: Yolo-V5M, BS32, INT8 | Intel Arc A750 delivers up to 13% better AI Performance/Watt than NVIDIA RTX 4070-S | Host setup: 12th Gen Intel® Core™ i7-12700, 2.1GHz, Turbo & Hyperthreading Enabled, 64GB RAM, 465.8 GB WDC WDS500G2B0B-00YS70. BIOS A.D0, OS Ubuntu 22.04.3 LTS. Intel Arc A750 GPU, 28 Xe-cores, 2050MHz Graphics Clock, 8GB GDDR6, 225W TBP. OS Ubuntu 22.04.1 LTS, Intel Compute Runtime driver 23.30.26918.9, OpenVINO 2024.0, DLStreamer 2024.0.1. NVIDIA RTX 4070-Super, 7168 cores, 12GB GDDR6X, 2550MHz boost clock, 220W TBP. NVIDIA RTX 3070, 5888 cores, 1695MHz boost clock, 8GB GDDR6, 220W TBP. OS Ubuntu 22.04.1 LTS, NVIDIA driver 545.23.08, DeepStream 6.4, TensorRT 24.01. AI benchmark target networks: NVIDIA: YoloV5-M (INT8 model), YoloV5-S (INT8 model). Intel: Yolo-V5S, Yolo-V5M (quantized models follow the standard guide). | August 2024
Intel® Arc™ A750 | AI Inference: Yolo-V5M, BS32, INT8 | Intel Arc A750 delivers up to 58% better AI Performance/Watt than NVIDIA RTX 3070 | Host setup: 12th Gen Intel® Core™ i7-12700, 2.1GHz, Turbo & Hyperthreading Enabled, 64GB RAM, 465.8 GB WDC WDS500G2B0B-00YS70. BIOS A.D0, OS Ubuntu 22.04.3 LTS. Intel Arc A750 GPU, 28 Xe-cores, 2050MHz Graphics Clock, 8GB GDDR6, 225W TBP. OS Ubuntu 22.04.1 LTS, Intel Compute Runtime driver 23.30.26918.9, OpenVINO 2024.0, DLStreamer 2024.0.1. NVIDIA RTX 4070-Super, 7168 cores, 12GB GDDR6X, 2550MHz boost clock, 220W TBP. NVIDIA RTX 3070, 5888 cores, 1695MHz boost clock, 8GB GDDR6, 220W TBP. OS Ubuntu 22.04.1 LTS, NVIDIA driver 545.23.08, DeepStream 6.4, TensorRT 24.01. AI benchmark target networks: NVIDIA: YoloV5-M (INT8 model), YoloV5-S (INT8 model). Intel: Yolo-V5S, Yolo-V5M (quantized models follow the standard guide). | August 2024
Intel® Arc™ A750 | End-to-End Video Analytics Pipeline | Intel Arc A750 has comparable performance to NVIDIA RTX 4070-S | Host setup: 12th Gen Intel® Core™ i7-12700, 2.1GHz, Turbo & Hyperthreading Enabled, 64GB RAM, 465.8 GB WDC WDS500G2B0B-00YS70. BIOS A.D0, OS Ubuntu 22.04.3 LTS. Intel Arc A750 GPU, 28 Xe-cores, 2050MHz Graphics Clock, 8GB GDDR6, 225W TBP. OS Ubuntu 22.04.1 LTS, Intel Compute Runtime driver 23.30.26918.9, OpenVINO 2024.0, DLStreamer 2024.0.1. NVIDIA RTX 4070-Super, 7168 cores, 12GB GDDR6X, 2550MHz boost clock, 220W TBP. NVIDIA RTX 3070, 5888 cores, 1695MHz boost clock, 8GB GDDR6, 220W TBP. OS Ubuntu 22.04.1 LTS, NVIDIA driver 545.23.08, DeepStream 6.4, TensorRT 24.01. System 1: Target IPs: Media, GPU. Test cases: end-to-end AI pipeline. (WL1): Detection benchmark using BenchmarkApp (YoloV5-M-640 and YoloV5S). (WL2): 1080p H.265 2 Mbps media decode + pre-processing + detection (YoloV5-M-640 @ 10fps) + pre-processing + 2 classification models (Resnet50 @ 10fps + Mobilenet-V2 @ 10fps). System 2: Target IPs: Media, GPU. Test cases: end-to-end AI pipeline. (WL1): Detection benchmark using TensorRT (YoloV5-M-640 and YoloV5S). (WL2): 1080p H.265 2 Mbps media decode + pre-processing + detection (YoloV5-M-640 @ 10fps) + pre-processing + 2 classification models (Resnet50 @ 10fps + Mobilenet-V2 @ 10fps). End-to-end AI pipeline: data was collected at different batch sizes for 1080p input videos, and the maximum performance is listed for each system. Measured KPIs are based on the number of streams. | August 2024
Intel® Arc™ A750 | End-to-End Video Analytics Pipeline | Intel Arc A750 delivers up to 26% better performance than NVIDIA RTX 3070 | Host setup: 12th Gen Intel® Core™ i7-12700, 2.1GHz, Turbo & Hyperthreading Enabled, 64GB RAM, 465.8 GB WDC WDS500G2B0B-00YS70. BIOS A.D0, OS Ubuntu 22.04.3 LTS. Intel Arc A750 GPU, 28 Xe-cores, 2050MHz Graphics Clock, 8GB GDDR6, 225W TBP. OS Ubuntu 22.04.1 LTS, Intel Compute Runtime driver 23.30.26918.9, OpenVINO 2024.0, DLStreamer 2024.0.1. NVIDIA RTX 4070-Super, 7168 cores, 12GB GDDR6X, 2550MHz boost clock, 220W TBP. NVIDIA RTX 3070, 5888 cores, 1695MHz boost clock, 8GB GDDR6, 220W TBP. OS Ubuntu 22.04.1 LTS, NVIDIA driver 545.23.08, DeepStream 6.4, TensorRT 24.01. System 1: Target IPs: Media, GPU. Test cases: end-to-end AI pipeline. (WL1): Detection benchmark using BenchmarkApp (YoloV5-M-640 and YoloV5S). (WL2): 1080p H.265 2 Mbps media decode + pre-processing + detection (YoloV5-M-640 @ 10fps) + pre-processing + 2 classification models (Resnet50 @ 10fps + Mobilenet-V2 @ 10fps). System 2: Target IPs: Media, GPU. Test cases: end-to-end AI pipeline. (WL1): Detection benchmark using TensorRT (YoloV5-M-640 and YoloV5S). (WL2): 1080p H.265 2 Mbps media decode + pre-processing + detection (YoloV5-M-640 @ 10fps) + pre-processing + 2 classification models (Resnet50 @ 10fps + Mobilenet-V2 @ 10fps). End-to-end AI pipeline: data was collected at different batch sizes for 1080p input videos, and the maximum performance is listed for each system. Measured KPIs are based on the number of streams. | August 2024
Intel® Arc™ A370M | Resnet 50 AI Inference | Up to 2.4x higher performance at AI Inference using Resnet50, Batch Size 32 compared to NVIDIA® T1000 | Test by Intel as of 31/08/23. 1-node, 1x 13th Gen Intel(R) Core(TM) i9-13900K, 24 cores, HT On, Turbo On, NUMA 1, Integrated Accelerators Available [used]: DLB 0 [0], DSA 0 [0], IAA 0 [0], QAT 0 [0], Total Memory 32GB (2x16GB DDR5 4800 MT/s [4800 MT/s]), BIOS A.C0, microcode 0x119, 1x Ethernet Controller I225-V, 1x 931.5G WDC WDS100T2B0B-00YS70, 1x 1.8T WDC WDS200T2B0B-00YS70, 1x 28.8G DataTraveler 2.0, Ubuntu 22.04.3 LTS, 6.2.0-26-generic, WORKLOAD+VERSION, COMPILER, LIBRARIES, OTHER_SW, score=?UNITS., NVRM version: NVIDIA UNIX x86_64 Kernel Module 535.86.10, NVIDIA Corporation TU117GL [T1000 8GB] (rev a1). Test by Intel as of 31/08/23. 1-node, 1x 12th Gen Intel(R) Core(TM) i7-12800HE, 14 cores, HT On, Turbo On, NUMA 1, Integrated Accelerators Available [used]: DLB 0 [0], DSA 0 [0], IAA 0 [0], QAT 0 [0], Total Memory 32GB (1x32GB DDR5 4800 MT/s [4800 MT/s]), BIOS 0.06.07, microcode 0x429, 1x Ethernet Connection (16) I219-LM, 1x 1.1T INTEL SSDSC2BB012T6, Ubuntu 22.04.3 LTS, 6.2.0-26-generic, WORKLOAD+VERSION, COMPILER, LIBRARIES, OTHER_SW, score=?UNITS. VGA compatible controller: Intel Corporation Device 5693 (rev 05). OpenVINO and TensorRT are frameworks that each GPU vendor provides as an environment for building AI inference applications. Each framework includes an application that can be used to benchmark the performance of AI models. For the A370M, a Caffe ResNet50 model from the OpenVINO Open Model Zoo was used in the benchmark application in the OpenVINO Runtime. For the T1000, a Caffe ResNet50 model from NVIDIA was used and is available from: https://github.com/NVIDIA-AI-IOT/jetson_benchmarks/blob/master/benchmark_csv/orin-benchmarks.csv Trtexec, which is part of the TensorRT framework, was used for the NVIDIA T1000. An input size of 224x224 was used for both the A370M and the T1000. Batch size 32 was chosen for the claim because it provides the highest performance for the T1000. | August 31, 2023
Intel® Arc™ A370M | H.264 Video Decoding | Up to 2.28x higher performance at decoding 2 streams of H.264 1080p30 Video compared to NVIDIA® T1000 | Test by Intel as of 31/08/23. 1-node, 1x 13th Gen Intel(R) Core(TM) i9-13900K, 24 cores, HT On, Turbo On, NUMA 1, Integrated Accelerators Available [used]: DLB 0 [0], DSA 0 [0], IAA 0 [0], QAT 0 [0], Total Memory 32GB (2x16GB DDR5 4800 MT/s [4800 MT/s]), BIOS A.C0, microcode 0x119, 1x Ethernet Controller I225-V, 1x 931.5G WDC WDS100T2B0B-00YS70, 1x 1.8T WDC WDS200T2B0B-00YS70, 1x 28.8G DataTraveler 2.0, Ubuntu 22.04.3 LTS, 6.2.0-26-generic, WORKLOAD+VERSION, COMPILER, LIBRARIES, OTHER_SW, score=?UNITS., NVRM version: NVIDIA UNIX x86_64 Kernel Module 535.86.10, NVIDIA Corporation TU117GL [T1000 8GB] (rev a1). Test by Intel as of 31/08/23. 1-node, 1x 12th Gen Intel(R) Core(TM) i7-12800HE, 14 cores, HT On, Turbo On, NUMA 1, Integrated Accelerators Available [used]: DLB 0 [0], DSA 0 [0], IAA 0 [0], QAT 0 [0], Total Memory 32GB (1x32GB DDR5 4800 MT/s [4800 MT/s]), BIOS 0.06.07, microcode 0x429, 1x Ethernet Connection (16) I219-LM, 1x 1.1T INTEL SSDSC2BB012T6, Ubuntu 22.04.3 LTS, 6.2.0-26-generic, WORKLOAD+VERSION, COMPILER, LIBRARIES, OTHER_SW, score=?UNITS. VGA compatible controller: Intel Corporation Device 5693 (rev 05). A Bash script started two decodes, and the time to complete both decodes was recorded using the time command. | August 31, 2023
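
The H.264 decoding claim for the A370M was measured by starting two decodes together and timing them to completion. The Bash script used for that test is not reproduced in this document, so the following is only a minimal Python sketch of the same idea, not the script Intel ran. The names time_concurrent and speedup are hypothetical, and the placeholder command stands in for whatever decoder invocation was actually used, which the source does not specify.

```python
import subprocess
import sys
import time
from typing import Sequence


def time_concurrent(commands: Sequence[Sequence[str]]) -> float:
    """Start every command at once; return wall-clock seconds until all exit."""
    start = time.perf_counter()
    # Launch all processes before waiting on any, so they run side by side.
    procs = [
        subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        for cmd in commands
    ]
    exit_codes = [proc.wait() for proc in procs]
    elapsed = time.perf_counter() - start
    if any(code != 0 for code in exit_codes):
        raise RuntimeError(f"a command failed, exit codes: {exit_codes}")
    return elapsed


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """Return how many times faster the candidate finished than the baseline."""
    return baseline_seconds / candidate_seconds


if __name__ == "__main__":
    # ASSUMPTION: placeholder workload. Replace each entry with the real
    # decode command for the GPU under test; none is given in the source.
    placeholder = [sys.executable, "-c", "import time; time.sleep(0.2)"]
    elapsed = time_concurrent([placeholder, placeholder])
    print(f"two concurrent runs finished in {elapsed:.2f} s")
```

Under this reading, an "up to 2.28x" figure would be the ratio of the two systems' completion times for the same pair of streams, for example speedup(t1000_seconds, a370m_seconds), where both arguments are wall-clock times measured as above on each system.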