Intel® Ethernet 700 Series
Linux Performance Tuning Guide
Application Settings
A single thread, which corresponds to a single network queue, is often not sufficient to achieve maximum bandwidth. Some platforms, such as those with AMD processors, tend to drop more Rx packets with a single thread than platforms with Intel-based processors.
Consider using tools like taskset or numactl to pin applications to the NUMA node or CPU cores local to the network device. For some workloads, such as storage I/O, moving the application to a non-local node can instead provide a benefit.
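As an illustration of pinning to device-local cores, the sketch below reads the CPU list that Linux sysfs reports for the adapter's PCI device and restricts the calling process to those CPUs. It is not taken from this guide: the helper names (parse_cpulist, local_cpus, pin_to_device) and the interface name eth0 are hypothetical, and it assumes a physical interface whose /sys/class/net/<iface>/device/local_cpulist file uses the plain "a-b,c" list format.

```python
from __future__ import annotations

import os
from pathlib import Path


def parse_cpulist(text: str) -> set[int]:
    """Expand a kernel CPU list such as "0-3,8-11" into a set of CPU ids.

    Assumption: only plain ids and "first-last" ranges appear in the list.
    """
    cpus: set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty list or stray comma
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def local_cpus(iface: str) -> set[int]:
    """Return the CPUs sysfs reports as local to the interface's PCI device."""
    path = Path("/sys/class/net") / iface / "device" / "local_cpulist"
    return parse_cpulist(path.read_text())


def pin_to_device(iface: str) -> set[int]:
    """Restrict the calling process to the CPUs local to iface (Linux only)."""
    cpus = local_cpus(iface)
    if not cpus:
        raise RuntimeError(f"no local CPUs reported for {iface}")
    os.sched_setaffinity(0, cpus)  # pid 0 means the calling process
    return cpus


# Example call (hypothetical interface name):
#   pin_to_device("eth0")
```

This sets CPU affinity only and does not change memory placement. Running an unmodified application under taskset -c with the same CPU list, or under numactl with --cpunodebind and --membind set to the device's NUMA node, achieves a similar result from the command line.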
Experiment with increasing the number of threads used by your application if possible.