PF - Strange problem with Linux Debian/Ubuntu clients only

Hi!

I have a very strange problem. The setup is as follows:
- FreeBSD (6.4-RELEASE-p11) router/firewall with PF doing nat, scrub and packet filtering;
- WWW servers behind this FreeBSD box;
- several workstations behind this FreeBSD box.

There are no problems accessing these WWW servers from the Internet with many client operating systems (including Windows XP, Vista and 7; FreeBSD 6.4, 7.3 and 8.1-RELEASE; some Linux distributions such as Fedora, CentOS and Slackware; even Symbian-based smartphones). Unfortunately, external Debian and Ubuntu boxes can't access the WWW servers behind this FreeBSD box. The same Debian/Ubuntu clients can access the WWW servers on the internal segment. I suspect something is wrong in the interaction between Debian's TCP/IP stack and my PF firewall rules on the FreeBSD router.

Thanks in advance!
Lyubomir.

Here are excerpts from the /etc/pf.conf file.
For obvious reasons, public IP addresses are replaced with capital letters.

Code:
=========Start of pf.conf=========

ext_if="em0"    
int_if="em1"    
loopback_if="lo0"


internal_addr="192.168.100.100"
internal_net="192.168.100.0/24"

external_addr="A.B.C.D"

# 1st WWW Server
www1_int="192.168.100.240"
www1_ext="E.F.G.H"


# 2nd WWW Server
www2_int="192.168.100.246"
www2_ext="X.Y.Z.W"

# Options: tune the behavior of pf, default values are given.
#set timeout { interval 10, frag 30 }
#set timeout { tcp.first 120, tcp.opening 30, tcp.established 86400 }
#set timeout { tcp.closing 900, tcp.finwait 45, tcp.closed 90 }
#set timeout { udp.first 60, udp.single 30, udp.multiple 60 }
#set timeout { icmp.first 20, icmp.error 10 }
#set timeout { other.first 60, other.single 30, other.multiple 60 }
#set timeout { adaptive.start 0, adaptive.end 0 }
#set limit { states 10000, frags 5000 }
#set loginterface none
#set optimization normal
#set block-policy drop
#set require-order yes
#set fingerprints "/etc/pf.os"

# Normalization: reassemble fragments and resolve or reduce traffic ambiguities.
scrub in all

# Queueing: rule-based bandwidth control.
#altq on $ext_if bandwidth 2Mb cbq queue { dflt, developers, marketing }
#queue dflt bandwidth 5% cbq(default)
#queue developers bandwidth 80%
#queue marketing  bandwidth 15%

# Translation: specify how addresses are to be mapped or redirected.
# nat: packets going out through $ext_if with source address $internal_net will
# get translated as coming from the address of $ext_if, a state is created for
# such packets, and incoming packets will be redirected to the internal address.

# Static 1:1 NAT for the internal WWW servers
binat on $ext_if from $www1_int to any -> $www1_ext
binat on $ext_if from $www2_int to any -> $www2_ext
# NAT for the rest of the workstations
nat on $ext_if from $internal_net to any -> ($ext_if)

# block all incoming packets but allow ssh, pass all outgoing tcp and udp
# connections and keep state, logging blocked packets.
block in log all

pass in inet proto icmp all keep state

pass in on $ext_if proto tcp from any to $www1_int port 80 keep state
pass in on $ext_if proto tcp from any to $www2_int port 80 keep state

pass in on $int_if from $internal_net to any keep state

pass in on lo0 proto { tcp,udp } from any to any

pass out on $ext_if proto { tcp,udp,icmp } from $external_addr to any keep state
pass out on $int_if proto { tcp,udp,icmp } from $internal_net to any keep state

=========End of pf.conf=========
 
I don't see anything wrong with your config, except maybe the last two lines: they're not needed, but they shouldn't cause any problems.

Try grabbing the session with tcpdump(1) and see if that has any clues.
 
Thank you for your reply SirDice!

Here are the matching packets from the "tcpdump -i em0 port 80" output. Again, real IP addresses are substituted with their descriptions.

Code:
08:31:21.234600 IP [Ubuntu Internet Client].64952 > E.F.G.H.http: S 3511544883:3511544883(0) win 5840 <mss 1380,sackOK,timestamp 158708 0,nop,wscale 6>
08:31:21.234802 IP E.F.G.H.http > [Ubuntu Internet Client].64952: S 2807722954:2807722954(0) ack 3511544884 win 16384 <mss 1460,nop,wscale 0,nop,nop,timestamp 0 0,nop,nop,sackOK>
08:31:21.236849 IP [Ubuntu Internet Client].64952 > E.F.G.H.http: . ack 1 win 92 <nop,nop,timestamp 158709 0>
08:31:21.237099 IP [Ubuntu Internet Client].64952 > E.F.G.H.http: P 1:378(377) ack 1 win 92 <nop,nop,timestamp 158709 0>
08:31:21.460595 IP [Ubuntu Internet Client].64952 > E.F.G.H.http: P 1:378(377) ack 1 win 92 <nop,nop,timestamp 158765 0>
08:31:21.460897 IP E.F.G.H.http > [Ubuntu Internet Client].64952: . ack 378 win 65158 <nop,nop,timestamp 193370631 158709>
08:32:18.944983 IP [Ubuntu Internet Client].64952 > E.F.G.H.http: F 378:378(0) ack 1 win 92 <nop,nop,timestamp 173135 0>
08:32:18.945182 IP E.F.G.H.http > [Ubuntu Internet Client].64952: . ack 379 win 65158 <nop,nop,timestamp 193371206 173135>
 
Ok. From the tcpdump trace it's clear that the Ubuntu client correctly does the three-way handshake (Syn, Syn/Ack, Ack). It's also clear it's the Ubuntu client that terminates the connection (Fin).

Have a look at your web server logs. See if there are any errors there. The connection is correctly made and taken down so your firewall rules aren't the problem.
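
A minimal sketch, assuming the capture is saved as text in the older tcpdump output format shown above: summing the payload length each sender put on the wire is a quick way to check whether any data actually came back from the server. The names payload_by_sender and SAMPLE are illustrative and not part of any tool; continuation lines from wrapped output are simply skipped.

```python
import re
from collections import defaultdict

# Old-style tcpdump text line: "<time> IP <src> > <dst>: <flags and details>"
LINE_RE = re.compile(r"^\S+ IP (?P<src>.+?) > (?P<dst>.+?): (?P<rest>.*)$")
# Sequence range with payload length, e.g. "1:378(377)"
SEG_RE = re.compile(r"\d+:\d+\((?P<length>\d+)\)")


def payload_by_sender(lines):
    """Sum the TCP payload bytes each sender sent (retransmissions included)."""
    totals = defaultdict(int)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # wrapped continuation lines and anything unrecognised
        sender = match.group("src")
        totals[sender] += 0  # list the sender even if it only sent bare ACKs
        segment = SEG_RE.search(match.group("rest"))
        if segment:
            totals[sender] += int(segment.group("length"))
    return dict(totals)


# Lines taken from the capture above, with the TCP option lists left out.
SAMPLE = [
    "08:31:21.234600 IP [Ubuntu Internet Client].64952 > E.F.G.H.http: S 3511544883:3511544883(0) win 5840",
    "08:31:21.234802 IP E.F.G.H.http > [Ubuntu Internet Client].64952: S 2807722954:2807722954(0) ack 3511544884 win 16384",
    "08:31:21.236849 IP [Ubuntu Internet Client].64952 > E.F.G.H.http: . ack 1 win 92",
    "08:31:21.237099 IP [Ubuntu Internet Client].64952 > E.F.G.H.http: P 1:378(377) ack 1 win 92",
    "08:31:21.460595 IP [Ubuntu Internet Client].64952 > E.F.G.H.http: P 1:378(377) ack 1 win 92",
    "08:31:21.460897 IP E.F.G.H.http > [Ubuntu Internet Client].64952: . ack 378 win 65158",
]

print(payload_by_sender(SAMPLE))
```

For these lines the client side totals 754 bytes (the 377-byte request plus its retransmission) and the server side totals 0, meaning only bare ACKs from the server show up in this capture.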
 
Hi!

The web server has no log entries for this HTTP session, because the session never really starts.
I decoded another capture (using Ethereal) and it shows the problem at the TCP layer:

Code:
No.	Time				Source		Destination	Protocol	Info
19	2010-11-22 14:12:22.829746	192.168.16.72	WWW-server	TCP		45308 > http [SYN] Seq=0 Len=0 MSS=1460 TSV=33858 TSER=0 WS=6
20	2010-11-22 14:12:22.832167	WWW-Server	192.168.16.72	TCP		http > 45308 [SYN, ACK] Seq=0 Ack=1 Win=16384 Len=0 MSS=1380 WS=0 TSV=0 TSER=0
21	2010-11-22 14:12:22.832219	192.168.16.72	WWW-Server	TCP		45308 > http [ACK] Seq=1 Ack=1 Win=5888 Len=0 TSV=33859 TSER=0
22	2010-11-22 14:12:22.832321	192.168.16.72	WWW-Server	HTTP		GET / HTTP/1.1	
23	2010-11-22 14:12:23.035190	192.168.16.72	WWW-Server	HTTP		[TCP Retransmission] GET / HTTP/1.1
24	2010-11-22 14:12:23.039036	WWW-Server	192.168.16.72	TCP		[TCP Previous segment lost] http > 45308 [ACK] Seq=2737 Ack=378 Win=65158
 ****** All the previous TCP segments were lost *******
1391	2010-11-22 14:13:16.680791	192.168.16.72	WWW-Server	TCP		45308 > http [FIN, ACK] Seq=378 Ack=1 Win=5888 Len=0 TSV=47321 TSER=0
1392	2010-11-22 14:13:16.683786	WWW-Server	192.168.16.72	TCP		http > 45308 [ACK] Seq=1381 Win=65158 Len=0 TSV=197031743 TSER=47321

I found reports on the Internet of similar problems with Debian/Ubuntu clients trying to obtain DHCP parameters through a PF firewall; the cause there was the clients' kernel parameters.
However, there is no way to tune the default kernel parameters on every Debian/Ubuntu machine out there.
I tried disabling scrub in PF, with no effect so far.
Possibly even Google's search robots run on Debian derivatives, because this site has not been indexed for more than 3 months.

Any ideas?

Thanks,
Lyubomir.
 
Did you turn off tcp_extensions? Or similar settings like syncookies, drop_synfin, etc?

Can you do the same tcpdump on the webserver itself?
 
Hi!

As can be seen in my pf configuration file posted at the beginning of this thread, I am using the default settings. All the filter rules use stateful inspection, for example:
Code:
 pass in on $ext_if proto tcp from any to $www1_int port 80 keep state

Here is the decoded raw capture from the WWW-server itself:
Internal WWW server has IP address 192.168.100.240.

Code:
No.	Time				Source		Destination	Protocol	Info
810	2010-11-23 09:33:41.137889	Ubuntu Client	192.168.100.240	TCP		6509 > http [SYN] Seq=0 Len=0 MSS=1380 TSV=340767 TSER=0 WS=6
811	2010-11-23 09:33:41.138120	192.168.100.240	Ubuntu Client	TCP		http > 6509 [SYN, ACK] Seq=0 Ack=1 Win=16384 Len=0 MSS=1460 WS=0 TSV=0 TSER=0
812	2010-11-23 09:33:41.140107	Ubuntu Client	192.168.100.240	TCP		6509 > http [ACK] Seq=1 Ack=1 Win=5888 Len=0 TSV=340768 TSER=0
813	2010-11-23 09:33:41.140219	Ubuntu Client	192.168.100.240	HTTP		GET / HTTP/1.1
820	2010-11-23 09:33:41.194837	192.168.100.240	Ubuntu Client	HTTP		HTTP/1.1 200 OK
821	2010-11-23 09:33:41.194960	192.168.100.240	Ubuntu Client	HTTP		Continuation or non-HTTP traffic
836	2010-11-23 09:33:41.342743	Ubuntu Client	192.168.100.240	HTTP		[TCP Retransmission] GET / HTTP/1.1
837	2010-11-23 09:33:41.343000	192.168.100.240	Ubuntu Client	TCP		[TCP Dup ACK 821#1] http > 6509 [ACK] Seq=2737 Ack=378 Win=65158 Len=0 TSV=197727981 TSER=340767
1591	2010-11-23 09:33:49.166749	192.168.100.240	Ubuntu Client	HTTP		[TCP Retransmission] HTTP/1.1 200 OK
2511	2010-11-23 09:33:49.296416	192.168.100.240	Ubuntu Client	HTTP		[TCP Retransmission] Continuation or non-HTTP traffic
2825	2010-11-23 09:33:50.202019	192.168.100.240	Ubuntu Client	HTTP		[TCP Retransmission] HTTP/1.1 200 OK
4612	2010-11-23 09:33:55.230996	192.168.100.240	Ubuntu Client	HTTP		[TCP Retransmission] Continuation or non-HTTP traffic
6469	2010-11-23 09:34:02.272184	192.168.100.240	Ubuntu Client	HTTP		[TCP Retransmission] HTTP/1.1 200 OK
8566	2010-11-23 09:34:07.402101	192.168.100.240	Ubuntu Client	HTTP		[TCP Retransmission] Continuation or non-HTTP traffic
.............
67538	2010-11-23 09:37:12.980069	Ubuntu Client	192.168.100.240	TCP		6509 > http [FIN, ACK] Seq=378 Ack=1 Win=5888 Len=0 TSV=393726 TSER=0
67539	2010-11-23 09:37:12.980209	192.168.100.240	Ubuntu Client	TCP		http > 6509 [ACK] Seq=1381 Ack=379 Win=65158 Len=0 TSV=197739987 TSER=393726

Thanks,
Lyubomir
 
Those settings have nothing to do with PF. They're sysctls.
 
Thanks, SirDice!

There are no changes in the kernel parameters of the FreeBSD box. All of them are the default ones. Here are the running parameters:

# sysctl -a | grep tcp

Code:
tcpreass:         20,     1690,      0,    676,    94746
tcptw:            48,     2496,      4,    620,    24673
tcpcb:           464,    12328,     12,    140,    47589
net.inet.tcp.rfc1323: 1
net.inet.tcp.mssdflt: 512
net.inet.tcp.keepidle: 7200000
net.inet.tcp.keepintvl: 75000
net.inet.tcp.sendspace: 32768
net.inet.tcp.recvspace: 65536
net.inet.tcp.keepinit: 75000
net.inet.tcp.delacktime: 100
net.inet.tcp.v6mssdflt: 1024
net.inet.tcp.hostcache.cachelimit: 15360
net.inet.tcp.hostcache.hashsize: 512
net.inet.tcp.hostcache.bucketlimit: 30
net.inet.tcp.hostcache.count: 59
net.inet.tcp.hostcache.expire: 3600
net.inet.tcp.hostcache.prune: 300
net.inet.tcp.hostcache.purge: 0
net.inet.tcp.log_in_vain: 0
net.inet.tcp.blackhole: 0
net.inet.tcp.delayed_ack: 1
net.inet.tcp.rfc3042: 1
net.inet.tcp.rfc3390: 1
net.inet.tcp.insecure_rst: 0
net.inet.tcp.reass.maxsegments: 1600
net.inet.tcp.reass.cursegments: 0
net.inet.tcp.reass.maxqlen: 48
net.inet.tcp.reass.overflows: 0
net.inet.tcp.path_mtu_discovery: 1
net.inet.tcp.slowstart_flightsize: 1
net.inet.tcp.local_slowstart_flightsize: 4
net.inet.tcp.newreno: 1
net.inet.tcp.sack.enable: 1
net.inet.tcp.sack.maxholes: 128
net.inet.tcp.sack.globalmaxholes: 65536
net.inet.tcp.sack.globalholes: 0
net.inet.tcp.minmss: 216
net.inet.tcp.minmssoverload: 0
net.inet.tcp.tcbhashsize: 512
net.inet.tcp.do_tcpdrain: 1
net.inet.tcp.pcbcount: 16
net.inet.tcp.icmp_may_rst: 1
net.inet.tcp.isn_reseed_interval: 0
net.inet.tcp.maxtcptw: 2465
net.inet.tcp.nolocaltimewait: 0
net.inet.tcp.inflight.enable: 1
net.inet.tcp.inflight.debug: 0
net.inet.tcp.inflight.rttthresh: 10
net.inet.tcp.inflight.min: 6144
net.inet.tcp.inflight.max: 1073725440
net.inet.tcp.inflight.stab: 20
net.inet.tcp.syncookies: 1
net.inet.tcp.syncache.bucketlimit: 30
net.inet.tcp.syncache.cachelimit: 15359
net.inet.tcp.syncache.count: 0
net.inet.tcp.syncache.hashsize: 512
net.inet.tcp.syncache.rexmtlimit: 3
net.inet.tcp.msl: 30000
net.inet.tcp.rexmit_min: 30
net.inet.tcp.rexmit_slop: 200
net.inet.tcp.always_keepalive: 1
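
A minimal sketch of picking the settings asked about earlier in the thread out of saved `sysctl -a | grep tcp` output. The mapping of tcp_extensions to net.inet.tcp.rfc1323 and the name net.inet.tcp.drop_synfin are assumptions about the corresponding sysctl names; parse_sysctl, pick and WATCHED are hypothetical helper names.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines from sysctl output into a dict of strings."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        # Keep only real sysctl variables; this skips the memory zone
        # statistics (tcpreass, tcptw, tcpcb) that grep also matched.
        if sep and name.startswith("net."):
            settings[name] = value.strip()
    return settings


# Assumed sysctl names for tcp_extensions, syncookies and drop_synfin.
WATCHED = (
    "net.inet.tcp.rfc1323",
    "net.inet.tcp.syncookies",
    "net.inet.tcp.drop_synfin",
)


def pick(settings, names=WATCHED):
    """Return the watched settings; None marks a name missing from the output."""
    return {name: settings.get(name) for name in names}


# A few lines copied from the listing above.
SAMPLE = """\
tcpreass:         20,     1690,      0,    676,    94746
net.inet.tcp.rfc1323: 1
net.inet.tcp.blackhole: 0
net.inet.tcp.syncookies: 1
"""

print(pick(parse_sysctl(SAMPLE)))
```

In the listing above, rfc1323 and syncookies are both 1, and no drop_synfin entry appears at all.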

I would like to kindly remind you that the problem exists with Debian/Ubuntu clients ONLY. I have a problem accessing the WWW server from a laptop running an Ubuntu Live CD. My friends can't access the WWW server behind the FreeBSD box from PCs running Debian. The same laptop running Windows XP with the same network setup has no problems with this web page. There are no problems with other clients either.

Lyubomir.
 
Hello,

Are the www1_ext and www2_ext IPs aliases on em0?

You can try changing your NAT rule to, for example,

Code:
nat on $ext_if from $internal_net to any -> $external_addr static-port
or just add static-port at the end of your current NAT rule, depending on your configuration.

Please paste the output of tcptraceroute from your Ubuntu station to your website (on port 80), without any extra parameters.
 
Hi, quintessence!

No, www1_ext and www2_ext IPs are NOT aliases on em0.
www1_ext belongs to a /29 subnet which is directly connected to em0.
www2_ext belongs to another /29 subnet which is routed through em0.
So I'm afraid your idea for changing the nat rule would not work.

The Debian/Ubuntu problem exists with both www1_ext and www2_ext addresses.


Here is the requested output of tcptraceroute WWW-Server 80:

Code:
ubuntu:~$ sudo tcptraceroute www.nij.bg 80
traceroute to www.nij.bg (195.138.135.99), 30 hops max, 60 byte packets
 1  hop1 (A1.B1.C1.D1)  1.364 ms  1.351 ms  1.341 ms
 2  hop2 (A2.B2.C2.D2)  1.808 ms  1.801 ms  1.792 ms
 3  hop3 (A3.B3.C3.D3)  2.556 ms  3.213 ms *
 4  WWW-Server (E.F.G.H)  2.447 ms  2.453 ms  2.444 ms
 5  * * *
 6  * * *
 7  * * *
 8  * * *
 9  * * WWW-Server(E.F.G.H)  2.567 ms
ubuntu@ubuntu:~$

Thanks!
 
lyubomirrussev said:
Hi, quintessence!

No, www1_ext and www2_ext IPs are NOT aliases on em0.
www1_ext belongs to a /29 subnet which is directly connected to em0.
www2_ext belongs to another /29 subnet which is routed through em0.
So I'm afraid your idea for changing the nat rule would not work.

Try adding static-port at the end of your current NAT rule instead of replacing ($ext_if) with $external_addr.

lyubomirrussev said:
Here is the requested output of tcptraceroute WWW-Server 80:

Code:
ubuntu:~$ sudo tcptraceroute www.nij.bg 80
traceroute to www.nij.bg (195.138.135.99), 30 hops max, 60 byte packets
 1  hop1 (A1.B1.C1.D1)  1.364 ms  1.351 ms  1.341 ms
 2  hop2 (A2.B2.C2.D2)  1.808 ms  1.801 ms  1.792 ms
 3  hop3 (A3.B3.C3.D3)  2.556 ms  3.213 ms *
 4  WWW-Server (E.F.G.H)  2.447 ms  2.453 ms  2.444 ms
 5  * * *
 6  * * *
 7  * * *
 8  * * *
 9  * * WWW-Server(E.F.G.H)  2.567 ms
ubuntu@ubuntu:~$

It seems to reach it after all, so you should take a look at your Microsoft station (network configuration, web server restrictions, etc.) instead of at the FreeBSD or PF side.
 
Thanks for your suggestion, quintessence!

I'm sorry, but I couldn't get your point. Could you please clarify your idea about adding static-port at the end of the general translation rule? The internal WWW servers are accessed through binat translation statements.
On the other hand, the internal Microsoft IIS server was the first thing I checked when I started troubleshooting. Its network configuration is fine. The problem also occurs when accessing Apache httpd on Linux boxes in the internal network through binat on PF.
The technical staff are now inspecting the L2 switch ports in the internal network for errors (duplex mismatch, CRC, etc.). However, I doubt this will show anything, because switches don't interfere with the TCP handshake of specific HTTP client operating systems.

BR,
Lyubomir.
 
Hello ,

I meant to change from

Code:
nat on $ext_if from $internal_net to any -> ($ext_if)

to
Code:
nat on $ext_if from $internal_net to any -> ($ext_if) static-port

instead of
Code:
nat on $ext_if from $internal_net to any -> $external_addr static-port

Do you have any restrictions on the web server side (e.g. denying some browsers), a header size limit, or anything else related to HTTP headers?

EDIT: Sorry, the point is to prevent modifying the source port
 
No, I don't have any custom restrictions on the WWW servers. All their parameters are the defaults. Again, this happens both with MS IIS (Windows) and with Apache on Linux.
Of course I can put static-port at the end of the general translation rule, but that rule works in the opposite direction.
Let me explain in more detail:
The remote WWW servers and some workstations are protected by a FreeBSD box with the PF firewall at their site. The WWW servers are accessible from the Internet through binat and the corresponding pass in filter rules on this FreeBSD box. The general
Code:
nat on $ext_if from $internal_net to any -> ($ext_if)
translation rule is there to give the remote workstations access to the Internet. These workstations reside on the same segment as the WWW servers (behind PF).
Thus, source port modification is not the issue here.
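
The argument above can be sketched as a toy model of which translation rule touches which packet. This is a simplified illustration, not how PF is implemented: it ignores ports and state, and assumes translation rules are matched in the order they appear in the ruleset. Because the thread masks the public addresses, the 203.0.113.x values are hypothetical stand-ins taken from the documentation range, and the function names are made up for this example.

```python
import ipaddress

# Hypothetical stand-ins for the masked public addresses in the thread.
INTERNAL_NET = ipaddress.ip_network("192.168.100.0/24")
EXTERNAL_ADDR = ipaddress.ip_address("203.0.113.1")  # stands in for A.B.C.D
BINAT = {
    # internal address -> external address, as in the two binat rules
    ipaddress.ip_address("192.168.100.240"): ipaddress.ip_address("203.0.113.10"),
    ipaddress.ip_address("192.168.100.246"): ipaddress.ip_address("203.0.113.20"),
}


def translate_outbound(src):
    """Source address of a packet leaving through the external interface."""
    src = ipaddress.ip_address(src)
    if src in BINAT:  # the binat rules come first in the ruleset
        return BINAT[src], "binat"
    if src in INTERNAL_NET:  # the rest of the LAN falls through to the nat rule
        return EXTERNAL_ADDR, "nat"
    return src, None


def translate_inbound(dst):
    """Destination address of a new connection arriving from the Internet."""
    dst = ipaddress.ip_address(dst)
    for internal, external in BINAT.items():
        if dst == external:  # binat maps in both directions
            return internal, "binat"
    return dst, None  # the nat rule does not apply to new inbound connections


print(translate_inbound("203.0.113.10"))
print(translate_outbound("192.168.100.50"))
```

In this model an inbound HTTP request to a server's public address only ever passes through binat, so an option on the nat rule would affect the workstations' outbound connections rather than the failing sessions.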

I'll try to play with RFC1323 TCP extensions to find out whether this will change anything.

Thanks for your ideas!
 