[DIVERT] IP header checksums - why calculate twice?

dohmniq · Aug 26, 2013

I've been toying with using IPDIVERT to adjust values in an IPv4 header. When adjusting an incoming IP header, the man page for divert(4) says:

Packets written as incoming and having incorrect checksums will be dropped.

My main issue was with trying to leverage the optimised kernel functions for checksumming an IP header, for example in_cksum_hdr(). Processes that use DIVERT sockets run in user-land, so in_cksum_hdr() isn't readily available at compile time.

Eventually the thought hit me that if some part of the kernel has to validate checksums (to decide whether to drop a packet) AND if my user-land process has to calculate a checksum to avoid its packet being dropped THEN surely there are two wasted checksum calculations going on?

If a root-owned process (root is needed to open a RAW socket) can be trusted to inject packets back into the IP stack, then surely we can skip the checksum test and save a few CPU cycles plus a bit of latency.

Very simple patch for /usr/src/sys/netinet/ip_divert.c (based on rev 224575):

Code:

--- ip_divert.c.orig    2013-08-26 20:52:18.000000000 +0100
+++ ip_divert.c 2013-08-26 20:52:44.000000000 +0100
@@ -496,6 +496,12 @@
                /* Send packet to input processing via netisr */
                switch (ip->ip_v) {
                case IPVERSION:
+                       /* mark mbuf as having valid checksum
+                          to save userland divert process from 
+                          calculating checksum, and kernel having
+                          to check it */
+                       m->m_pkthdr.csum_flags |= CSUM_IP_CHECKED | 
+                                                       CSUM_IP_VALID;
                        netisr_queue_src(NETISR_IP, (uintptr_t)so, m);
                        break;
 #ifdef INET6

SirDice · Aug 27, 2013

You'll probably get a lot more responses if you send this to the freebsd-net@ mailing list. There aren't a lot of developers on this board.

throAU · Aug 28, 2013

Erm... if you are modifying the packet, this invalidates the checksum? Even if you allow the packet through the kernel with an invalid checksum, surely it will be dropped upstream in any case due to the checksum being invalid?
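To illustrate the point about modification invalidating the checksum: a changed header field does not force a full recomputation in user-land, because the checksum can be patched incrementally using the RFC 1624 formula HC' = ~(~HC + ~m + m'). The following is only a minimal sketch under the assumption that the header is held as a raw byte buffer in network order; the names cksum_adjust() and ip_set_ttl() are hypothetical and are not part of divert(4) or the kernel.

```c
#include <stdint.h>

/*
 * Incrementally update a 16-bit Internet checksum after one 16-bit word
 * of the covered data changes from old_word to new_word (RFC 1624, eqn 3).
 * All values are in host order, built from the big-endian bytes.
 * Hypothetical helper, not a kernel or libc function.
 */
static uint16_t
cksum_adjust(uint16_t old_cksum, uint16_t old_word, uint16_t new_word)
{
	uint32_t sum = (uint16_t)~old_cksum;

	sum += (uint16_t)~old_word;
	sum += new_word;
	/* Fold carries back into the low 16 bits (one's complement add). */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return ((uint16_t)~sum);
}

/*
 * Example use: change the TTL (byte 8) of an IPv4 header and patch the
 * header checksum (bytes 10-11) to match. TTL shares its 16-bit word
 * with the protocol field (byte 9).
 */
static void
ip_set_ttl(uint8_t *hdr, uint8_t ttl)
{
	uint16_t old_word = (uint16_t)((hdr[8] << 8) | hdr[9]);
	uint16_t new_word = (uint16_t)((ttl << 8) | hdr[9]);
	uint16_t old_cksum = (uint16_t)((hdr[10] << 8) | hdr[11]);
	uint16_t new_cksum = cksum_adjust(old_cksum, old_word, new_word);

	hdr[8] = ttl;
	hdr[10] = (uint8_t)(new_cksum >> 8);
	hdr[11] = (uint8_t)(new_cksum & 0xff);
}
```

This only keeps the header checksum consistent with the header as written back; it says nothing about whether the kernel should trust it.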

Having the kernel blindly trust that a user-space application (even one running as root) has checksummed its IP header properly just smells of potential bad news to me. Essentially you are having the kernel trust user space at a fairly low level and I'm not sure that's a good idea.

I am not a FreeBSD developer though.

If the patch improves performance for you (have you done before/after benchmarks?), great. Not sure it belongs in the official source though.

dohmniq · Aug 28, 2013

[Will be moving this thread to freebsd-net mailing list as per SirDice's suggestion]

Thanks for the reply, throAU.

The situation I mentioned in my original post was for packets that had arrived from the outside world on a network interface and were making their way through the kernel. These are termed "incoming" in the divert(4) man page. I am reading these packets via the DIVERT socket, modifying them and then writing them back out via the same DIVERT socket. So this is why I picked up on this sentence from the man page:

Packets written as incoming and having incorrect checksums will be dropped.

Interestingly, packets that are outgoing automatically have their checksum(s?) recalculated. (Might just be the IP payload checksum and not the IP header checksum - haven't checked, sorry.) So even if a process writes an outgoing packet with an invalid checksum, the kernel takes care of it and the packet should be accepted upstream.

Back to incoming packets: it's not too hard to write the code to generate the correct checksum - probably easier than trying to shoe-horn the optimised kernel code in. My issue is with having to do this ONLY because DIVERT has to tie in with common kernel network code, and this tie-in point comes before the kernel does its checksum tests. Basically I'm jumping through a hoop, spending time calculating a checksum only because the kernel code is less likely to change to make life easier.
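For anyone wanting the user-land side, a minimal sketch of the standard Internet checksum (RFC 1071) over an IPv4 header follows. It is a plain, unoptimised version, not the kernel's in_cksum_hdr(), and the names ip_hdr_cksum() and ip_hdr_set_cksum() are hypothetical. It assumes the header sits in a byte buffer in network order and that hlen is the header length in bytes (ip_hl * 4).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * One's complement sum of the header taken as big-endian 16-bit words,
 * folded to 16 bits and inverted (RFC 1071). Working on bytes avoids
 * alignment and host byte-order concerns. hlen is always even for IPv4.
 * With the checksum field zeroed this returns the value to store; over a
 * header with a correct checksum in place it returns 0.
 */
static uint16_t
ip_hdr_cksum(const uint8_t *hdr, size_t hlen)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < hlen; i += 2)
		sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
	/* Fold carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return ((uint16_t)~sum);
}

/* Zero the checksum field (bytes 10-11), recompute it and store it. */
static void
ip_hdr_set_cksum(uint8_t *hdr, size_t hlen)
{
	uint16_t c;

	hdr[10] = 0;
	hdr[11] = 0;
	c = ip_hdr_cksum(hdr, hlen);
	hdr[10] = (uint8_t)(c >> 8);
	hdr[11] = (uint8_t)(c & 0xff);
}
```

The idea would be to call ip_hdr_set_cksum() on the modified packet just before writing it back to the DIVERT socket as incoming.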

Whether the kernel should trust user-space packets is a good point to raise - it's always good to have security in mind. That's a decision for the powers that be! I don't suppose it's really that much different from having another machine on the network generating packets, or a root process with a RAW socket.

I admit I haven't done benchmarks, but I'm confident it should be an improvement as I am replacing two checksum calculations with one flag set and one flag test. I really don't know how much DIVERT is used these days, especially in scenarios that change the IP header.

Hope this goes some way to responding to your reply!
