SO_ATTACH_REUSEPORT_CBPF Approach
In this approach, listen, accept, and processing are all performed by the same thread: each application thread calls both socket:listen and socket:accept. Because every thread listens, each one sets the SO_REUSEPORT socket option so that all threads can share the same listen port. When a connection is established with a client, the kernel therefore decides which application thread to wake up to accept it, and each application thread blocks inside the kernel while it waits. This model differs from the one discussed previously, where the main thread performs the socket:accept and dispatches the connection to an application worker thread.
To establish a 1:1 model of application thread to hardware queue in this multipurpose thread model, you must use the socket option SO_ATTACH_REUSEPORT_CBPF, which allows the application to attach a small classic Berkeley Packet Filter (cBPF) program that load-balances among the application threads with awareness of the hardware network adapter queues. cBPF is a packet capture/filtering solution originally proposed by Steven McCanne and Van Jacobson of Lawrence Berkeley Laboratory in 1992. It is widely used by network applications such as tcpdump and Wireshark. A major goal of cBPF and eBPF (extended Berkeley Packet Filter) is to make the kernel configurable at runtime, so that custom code can be injected into the kernel without rebuilding it. To that end, a small special-purpose virtual machine (VM) with a simplified instruction set and registers executes the custom program supplied by the user. Because the program runs in the VM, it is best suited to performing simple, specific tasks efficiently. For cBPF, that task is packet filtering.
For more information, see the following cBPF references:
The introduction of the SO_ATTACH_REUSEPORT_CBPF socket option to the kernel is covered in this LWN.net article. The article gives an example of using this capability to select a REUSEPORT socket based on the CPU core ID handling the incoming packet, but for ADQ it is used to select the application thread based on the receive queue it is mapped to. Each thread that performs a listen attaches a small cBPF program using setsockopt(listen_fd, SOL_SOCKET, SO_ATTACH_REUSEPORT_CBPF, [structure], [structure size]). The kernel then executes this program in the accept connection code path, using the receive queue mapping information available in the newly created socket to select which thread needs to be woken up to accept the connection. See the Cloudflare blog for a similar approach.
Applications that have successfully used this model for ADQ enablement include NGINX.