Solved IPv6 selective failures on Internet, need testing

PMc · Dec 18, 2024

Hi all,
I put a series of free IP addresses (aliases, globally routable) on an interface, like so:

Code:

vtnet0: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=80028<VLAN_MTU,JUMBO_MTU,LINKSTATE>
        ether 06:1d:92:01:03:01
        inet6 2003:e7:17ff:18d8:41d:92ff:fe01:20 prefixlen 64
        inet6 2003:e7:17ff:18d8:41d:92ff:fe01:21 prefixlen 64
        inet6 2003:e7:17ff:18d8:41d:92ff:fe01:22 prefixlen 64
        inet6 2003:e7:17ff:18d8:41d:92ff:fe01:23 prefixlen 64
        inet6 2003:e7:17ff:18d8:41d:92ff:fe01:24 prefixlen 64
        inet6 2003:e7:17ff:18d8:41d:92ff:fe01:25 prefixlen 64
        inet6 2003:e7:17ff:18d8:41d:92ff:fe01:26 prefixlen 64
        inet6 2003:e7:17ff:18d8:41d:92ff:fe01:27 prefixlen 64
        inet6 2003:e7:17ff:18d8:41d:92ff:fe01:28 prefixlen 64
        inet6 2003:e7:17ff:18d8:41d:92ff:fe01:29 prefixlen 64

Then I try to ping a remote site, setting the source to any of the addresses:

Code:

$ for i in 20 21 22 23 24 25 26 27 28 29
do ping -q -c 1 -S 2003:e7:17ff:18d8:41d:92ff:fe01:$i moon.daemon.contact >/dev/null 2>&1 && echo $i okay || echo $i FAIL
done
20 FAIL
21 okay
22 FAIL
23 FAIL
24 FAIL
25 FAIL
26 okay
27 okay
28 okay
29 okay

A random subset of the addresses fails. Which ones fail changes with the source address, the destination address, or the protocol. There is no obvious pattern, but the failure rate is always about 50%. Sites like Google or freebsd.org are not affected.

JordanG · Dec 18, 2024

PMc said:
A random subset of the addresses fails. Which ones fail changes with the source address, the destination address, or the protocol.

What do you mean by a different protocol?

Sites like Google and FreeBSD.org are handled by CDNs. If you ping them, you may be pinging a host that is very close to you hop-wise. Try to assess how far away you are from the hosts you are trying to ping with traceroute6(8).

ICMPv6 echo responses may not be generated by the host you are trying to ping or they may be getting lost for a variety of reasons:

bad network connectivity (packets are genuinely getting lost)
network congestion
ICMP response rate limiting (it is a protection; FreeBSD has it implemented)
network intrusion detection and prevention systems that find your pattern of pings suspicious

Sending just one ping is a game of chance. Try pinging one host continuously (at least a hundred pings) and try varying the ping packet size. Repeat with a different source address. After you accumulate some results, contact your Internet provider to discuss the situation.
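The repeated test suggested above could be scripted roughly as follows. This is only a sketch: it assumes FreeBSD's ping(8) with the same -q, -c and -S options used earlier in this thread plus -s for the payload size. The function name probe and the PING override variable are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the thread.
# Sends COUNT single echo requests from SRC to DST and reports how many
# were answered. PING can be overridden, e.g. for a dry run.
PING=${PING:-ping}

probe() {
    src=$1; dst=$2; count=${3:-100}; size=${4:-56}
    ok=0; n=0
    while [ "$n" -lt "$count" ]; do
        # -q quiet, -c 1 one packet, -s payload size, -S source address
        if "$PING" -q -c 1 -s "$size" -S "$src" "$dst" >/dev/null 2>&1; then
            ok=$((ok + 1))
        fi
        n=$((n + 1))
    done
    echo "$src size=$size $ok/$count"
}

# Example, commented out so that sourcing this file sends nothing:
# for size in 56 512 1232; do
#     probe 2003:e7:17ff:18d8:41d:92ff:fe01:20 moon.daemon.contact 100 "$size"
# done
```

A result that is always 0/100 or 100/100 for a given source address would point away from random loss and rate limiting, which should produce figures in between.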

PMc · Dec 18, 2024

JordanG said:
What do you mean by a different protocol?

ICMP6 may work, UDP may fail, TCP may work again. Or any such pattern.
The point is that the pattern is constant: if UDP fails, it always fails; if ICMP works, it always works.

JordanG said:
Sites like Google and FreeBSD.org are handled by CDNs. If you ping them, you may be pinging a host that is very close to you hop-wise. Try to assess how far away you are from the hosts you are trying to ping with traceroute6(8).

This is not distance-related. I have destinations nearby and destinations in Africa with 200 ms delay, and the problem is the same for all of them.

JordanG said:
ICMPv6 echo responses may not be generated by the host you are trying to ping or they may be getting lost for a variety of reasons:

bad network connectivity (packets are genuinely getting lost)
network congestion
ICMP response rate limiting (it is a protection; FreeBSD has it implemented)
network intrusion detection and prevention systems that find your pattern of pings suspicious

I own some of the hosts I am trying to ping, so I know they do create answers.

Also, it is not related to packet loss. When I get a pattern like the one shown above:

PMc said:

Code:

$ for i in 20 21 22 23 24 25 26 27 28 29
do ping -q -c 1 -S 2003:e7:17ff:18d8:41d:92ff:fe01:$i moon.daemon.contact >/dev/null 2>&1 && echo $i okay || echo $i FAIL
done
20 FAIL
21 okay
22 FAIL
23 FAIL
24 FAIL
25 FAIL
26 okay
27 okay
28 okay
29 okay

I can repeat this again and again, and the pattern will be exactly the same on every try.
A few hours later it will still be the same. It changes only when some part of the tuple (source address, destination address, protocol) changes.

JordanG said:
Sending just one ping is a game of chance.

That is why I do repeat it.

JordanG said:
Try pinging one host continuously (at least a hundred pings)

I did.
If one ping works, then all hundred pings work. If one ping fails, no ping will ever get through.

JordanG said:
and try varying the ping packet size.

I did.
There is no difference depending on size.

JordanG said:
Repeat with a different source address.

I did.
Each source address behaves differently.

Overall, the chance that any given flow works is close to 50%.
It looks like somebody computes a checksum over both IP addresses and the protocol, and one bit of that checksum is broken.
(I checked the checksums of my outgoing packets; they seem OK.)

JordanG said:
After you accumulate some results, contact your Internet provider to discuss the situation.

I did.
They tried to connect me to somebody who would tell me how to install Windows, and didn't even manage to do that.
That is an entirely dead end - one doesn't reach anybody who would have heard about IPv6, or know what an ASN is.
They expect you to install Windows and then surf thru the shopping offers of their affiliates, and if that works, all is fine.
Deutsche Telekom, ASN 3320, be warned.

JordanG · Dec 18, 2024

PMc said:
I can repeat this again and again, and the pattern will be exactly the same on every try.
A few hours later it will still be the same. It changes only when some part of the tuple (source address, destination address, protocol) changes.

PMc said:
Overall, the chance that any given flow works is close to 50%.

It looks like somebody does a checksum over both IP addresses and the protocol, and one bit of that checksum is broken.

You have almost discovered the root cause of this problem.

I know of two situations in which a hash value is computed based on IPv6 header fields like source address, destination address and protocol and TCP/UDP fields like source and destination ports:

Routers for purposes of quickly matching a packet to its corresponding flow or state kept per connection.
Network interface cards doing some form of hardware offloading
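To illustrate why a hash of this kind would produce a stable, roughly half-and-half pattern, here is a toy model. It is purely an illustration: cksum(1) stands in for whatever hash a real router or NIC uses (those are vendor-specific and usually include the ports as well), and the function name pick_link is made up.

```shell
#!/bin/sh
# Illustration only: cksum(1) is a stand-in for a device's real flow hash.
# Maps a flow (source, destination, protocol) to one of NLINKS parallel links.
pick_link() {
    src=$1; dst=$2; proto=$3; nlinks=${4:-2}
    crc=$(printf '%s|%s|%s' "$src" "$dst" "$proto" | cksum | cut -d ' ' -f 1)
    echo $((crc % nlinks))
}

# The same flow always lands on the same link; changing any field may move it.
for i in 20 21 22 23; do
    echo "$i -> link $(pick_link "2003:e7:17ff:18d8:41d:92ff:fe01:$i" moon.daemon.contact icmp6)"
done
```

If one of two such links silently dropped traffic, about half of all flows would fail, and each individual flow would either always work or always fail, which is the behaviour described above.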

You could try:

Disabling all hardware offloading on all your NICs
Testing with a different OS. If no problem is observed with it, then the problem is not in the network routers and other network hardware along the path but is more local.
Testing with a different NIC. You can use a USB Ethernet adapter, for example.
Pinging another host in the same local network. This will exclude the router connecting you to the Internet.
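For the first suggestion, a minimal sketch on FreeBSD could look like this. The flag names are the ones documented in ifconfig(8); which of them a given driver actually supports varies, and the interface name is only an example. The helper offload_off_cmd is hypothetical and merely prints the command so that it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) an ifconfig(8) command that
# turns off common offload features on one FreeBSD interface.
offload_off_cmd() {
    ifname=$1
    echo "ifconfig $ifname -rxcsum -txcsum -rxcsum6 -txcsum6 -tso -lro"
}

# Review the output, then run it as root, for example:
#   offload_off_cmd vtnet0 | sh
#   ifconfig vtnet0      # check the options= line afterwards
```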

PMc · Dec 18, 2024

I contacted my Internet Provider again. This time they told me that they are not responsible for packet transport, and I should contact postal services instead.
Whenever one may think the madness cannot be topped anymore, one gets corrected.

I got a little bit further, using traceroute6.
This is a connection that apparently is successful:

Code:

traceroute6 to wand.daemon.contact (2001:470:1f23:1b::1:0) from 2003:e7:17ff:18d8:41d:92ff:fe01:22, 64 hops max, 20 byte packets
 1  [AS3320] 2003:0:8305:c000::1  7.013 ms  7.117 ms  7.250 ms
 2  [AS6939] dtag-as3320.e0-50.switch2.fra2.he.net  9.796 ms * *
 3  * * *
 4  * * *
 5  * * *
 6  * * *
 7  * * *
 8  [AS0] e0-33.core1.cpt1.he.net  152.267 ms  151.597 ms  192.996 ms

And this is one that is not successful:

Code:

traceroute6 to wand.daemon.contact (2001:470:1f23:1b::1:0) from 2003:e7:17ff:18d8:41d:92ff:fe01:23, 64 hops max, 20 byte packets
 1  [AS3320] 2003:0:8305:c000::1  7.392 ms  6.394 ms  6.973 ms
 2  * * *
 3  * * *
 4  * * *
 5  * * *
 6  * * *
 7  * * *
 8  * *^C

It seems that indeed these packets do not make it from AS3320 to AS6939.

OTOH, a friend did a cross-check, and from their site everything seems to work. However, they see a slightly different traceroute at the handover from AS 3320.

JordanG said:
I know of two situations in which a hash value is computed based on IPv6 header fields like source address, destination address and protocol and TCP/UDP fields like source and destination ports:

Routers for purposes of quickly matching a packet to its corresponding flow or state kept per connection.

Yeah, I was told that routers might do load balancing based on a hash, and this might even explain the 50% failure rate (obviously implying that some of these devices are malfunctioning).

JordanG said:
Network interface cards doing some form of hardware offloading

You could try:

Disabling all hardware offloading on all your NICs

I switched off all the options from the outbound igb already - but there is still a lot of bhyve and netgraph in the path...

JordanG said:
Testing with a different OS. If no problem is observed with it, then the problem is not in the network routers and other network hardware along the path but is more local.

Testing with a different NIC. You can use a USB Ethernet adapter, for example.

This is really good advice, indeed. But it is also laborious and requires ripping apart some of my vital infrastructure, so I am putting that task off. (I am actually happy when the most vital of my links are among the working 50%.)

JordanG said:
Pinging another host in the same local network. This will exclude the router connecting you to the Internet.

That one is not applicable. The problem appears only on connections outbound to the Internet, and only to those that go to AS 6939.

dark.initr0 · Dec 20, 2024

By default, traceroute6 sends UDP packets with the port number increasing at each hop.
Such UDP packets may be discarded.
Try -I to use ICMP6, or -T -p 80 for TCP probes to a fixed, popular port.
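A small wrapper makes it easy to compare the three probe types from the same source address. This is a sketch, assuming FreeBSD's traceroute6(8) with the -I, -T and -p options mentioned above plus -s for the source address and -a for AS lookups. The function name trace and the TRACEROUTE6 override variable are made up.

```shell
#!/bin/sh
# Hypothetical wrapper around FreeBSD traceroute6(8).
# TRACEROUTE6 can be overridden to substitute another command for testing.
trace() {
    mode=$1; src=$2; dst=$3
    case "$mode" in
        udp)  set -- ;;              # default probes: UDP with changing port
        icmp) set -- -I ;;           # ICMP6 echo requests
        tcp)  set -- -T -p 80 ;;     # TCP probes to port 80
        *)    echo "usage: trace udp|icmp|tcp SRC DST" >&2; return 2 ;;
    esac
    # -a prints the AS number per hop, -s sets the source address
    "${TRACEROUTE6:-traceroute6}" -a "$@" -s "$src" "$dst"
}

# Example, commented out so that sourcing this file sends nothing:
# for mode in udp icmp tcp; do
#     trace "$mode" 2003:e7:17ff:18d8:41d:92ff:fe01:23 wand.daemon.contact
# done
```

If a source address fails at the same hop with all three probe types, filtering of the default UDP probes is unlikely to be the explanation.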

PMc · Dec 23, 2024

dark.initr0 said:
By default, traceroute6 sends UDP packets with the port number increasing at each hop.
Such UDP packets may be discarded.
Try -I to use ICMP6, or -T -p 80 for TCP probes to a fixed, popular port.

ICMP could also be discarded (maybe not so frequently with IPv6).

Anyway, the traceroute obtained is already significant enough to identify the location of the failure, though not the cause.
So after passing this information around to the providers involved for long enough, the issue finally disappeared on Friday morning at 0410Z, as suddenly as it had appeared.
Nobody sent a letter of confession.