Still, a solution has emerged in the form of "large receive offload" (LRO), which takes a very similar approach: incoming packets are merged at reception time so that the operating system sees far fewer of them. This merging can be done either in the driver or in the hardware; even LRO emulation in the driver has performance benefits. LRO is widely supported by 10G drivers under Linux.
But LRO is a bit of a flawed solution, according to Herbert; the real problem is that it "merges everything in sight." This transformation is lossy; if there are important differences between the headers in incoming packets, those differences will be lost. And that breaks things. If a system is serving as a router, it really should not be changing the headers on packets as they pass through. LRO can totally break satellite-based connections, where some very strange header tricks are done by providers to make the whole thing work. And bridging breaks, which is a serious problem: most virtualization setups use a virtual network bridge between the host and its clients. One might simply avoid using LRO in such situations, but these also tend to be the workloads that one really wants to optimize. Virtualized networking, in particular, is already slower; any possible optimization in this area is much needed.
The solution is generic receive offload (GRO). In GRO, the criteria for merging packets are greatly restricted: the MAC headers must be identical, and only a few TCP or IP header fields may differ. In fact, the set of fields which can differ is very small: checksums are necessarily different, and the IP ID field is allowed to increment. Even the TCP timestamps must be identical, which is less of a restriction than it may seem; the timestamp is a relatively low-resolution field, so it's not uncommon for lots of packets to have the same timestamp. As a result of these restrictions, merged packets can be resegmented losslessly; as an added benefit, the GSO code can be used to perform resegmentation.
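The merge rules above can be illustrated with a small model. What follows is a minimal sketch in Python, not the kernel's implementation: the `Packet` and `Merged` types and the `can_merge()`, `gro_receive()`, and `resegment()` functions are hypothetical names invented for this example. It assumes a single flow arriving in order, leaves checksums out entirely (they would simply be recomputed), and records the size of each merged segment so that the originals can be rebuilt exactly.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Packet:
    """Hypothetical, heavily simplified view of a received TCP segment."""
    mac_header: bytes
    src: str
    dst: str
    sport: int
    dport: int
    ip_id: int
    seq: int
    tcp_timestamp: int
    payload: bytes


@dataclass
class Merged:
    """A merged packet: the first segment's headers plus all payloads."""
    first: Packet
    payload: bytes
    seg_sizes: List[int]  # payload length of each original segment


def can_merge(m: Merged, pkt: Packet) -> bool:
    """Apply the restrictions described in the text (illustrative only)."""
    f = m.first
    return (
        pkt.mac_header == f.mac_header                    # identical MAC headers
        and (pkt.src, pkt.dst, pkt.sport, pkt.dport)
        == (f.src, f.dst, f.sport, f.dport)               # same flow
        and pkt.tcp_timestamp == f.tcp_timestamp          # identical timestamps
        and pkt.ip_id == (f.ip_id + len(m.seg_sizes)) % 2**16  # IP ID increments
        and pkt.seq == (f.seq + len(m.payload)) % 2**32   # next bytes in stream
    )


def gro_receive(packets: List[Packet]) -> List[Merged]:
    """Merge each packet into the previous merged packet when allowed."""
    out: List[Merged] = []
    for pkt in packets:
        if out and can_merge(out[-1], pkt):
            out[-1].payload += pkt.payload
            out[-1].seg_sizes.append(len(pkt.payload))
        else:
            out.append(Merged(pkt, pkt.payload, [len(pkt.payload)]))
    return out


def resegment(m: Merged) -> List[Packet]:
    """Rebuild the original segments from a merged packet."""
    segments = []
    offset = 0
    for i, size in enumerate(m.seg_sizes):
        segments.append(replace(
            m.first,
            ip_id=(m.first.ip_id + i) % 2**16,
            seq=(m.first.seq + offset) % 2**32,
            payload=m.payload[offset:offset + size],
        ))
        offset += size
    return segments
```

Because the only fields allowed to vary are ones that can be recomputed from the first header and the segment sizes, feeding the output of `gro_receive()` through `resegment()` in this model yields packets equal to the ones that arrived. A packet with a different timestamp or a non-consecutive IP ID simply starts a new merged packet rather than being folded in with its header differences discarded.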
One other nice thing about GRO is that, unlike LRO, it is not limited to TCP/IPv4.