PF Ports opening and closing randomly

Hi guys,

I have a really weird problem. I have a FreeBSD 10.2 machine running an OpenVPN service that listens on TCP port 1194 on the machine's LAN IP address, 192.168.1.5.

This machine is also guarded by a firewall, which I extensively use for routing to and from the jails, which works pretty well.

TCP port 1194 is open for connections coming from any address and from inside the LAN (192.168.1.0)

To connect from the outside, the big wide internet, I have a DDNS service. However, seen from outside, the OpenVPN machine is behind 3 routers.

Router 1 is actually the uplink device to the ISP without a firewall.
Router 2 redirects any incoming packets on a particular TCP port (99000, port number changed for security reasons) to Router 3 on the same port.

Router 3 accepts packets on TCP port 99000 and redirects them to 192.168.1.5:1194 TCP, which is the IP and port of the OpenVPN service. From my understanding, the firewall on the OpenVPN machine sees those packets as originating from 192.168.1.1.

Now here is the broken part:
When I have another machine (e.g. my workstation laptop) connected to the LAN 192.168.1.0 and I initiate an OpenVPN connection targeting the EXTERNAL IP ADDRESS OF ROUTER 1, the connection is established and works just fine.

Here is the console output from such a successful connection attempt, made from inside the LAN 192.168.1.0 to the external IP
(WAN IP renamed to the invalid 321.321.321.321 for security reasons, port also renamed):

Code:
[root@localhost VPN_client_michael]# openvpn laplace_passwd.ovpn
Fri Feb 12 17:33:55 2016 OpenVPN 2.3.10 x86_64-redhat-linux-gnu [SSL (OpenSSL)] [LZO] [EPOLL] [PKCS11] [MH] [IPv6] built on Jan  4 2016
Fri Feb 12 17:33:55 2016 library versions: OpenSSL 1.0.2f-fips  28 Jan 2016, LZO 2.08
Enter Auth Username: *******
Enter Auth Password: *************
Fri Feb 12 17:34:01 2016 Socket Buffers: R=[87380->87380] S=[16384->16384]
Fri Feb 12 17:34:01 2016 Attempting to establish TCP connection with [AF_INET]321.321.321.321:99000 [nonblock]
Fri Feb 12 17:34:02 2016 TCP connection established with [AF_INET]321.321.321.321:99000
Fri Feb 12 17:34:02 2016 TCPv4_CLIENT link local: [undef]
Fri Feb 12 17:34:02 2016 TCPv4_CLIENT link remote: [AF_INET]321.321.321.321:99000
Fri Feb 12 17:34:02 2016 TLS: Initial packet from [AF_INET]321.321.321.321:99000, sid=90fd48a0 69567732
Fri Feb 12 17:34:02 2016 WARNING: this configuration may cache passwords in memory -- use the auth-nocache option to prevent this
Fri Feb 12 17:34:02 2016 VERIFY OK: depth=1, C=CH, ST=ZH, L=Zurich, O=private, OU=private, CN=Fort-Funston CA, name=laplace_VPN_ca, emailAddress=mm@michaelmichael.org
Fri Feb 12 17:34:02 2016 Validating certificate key usage
Fri Feb 12 17:34:02 2016 ++ Certificate has key usage  00a0, expects 00a0
Fri Feb 12 17:34:02 2016 VERIFY KU OK
Fri Feb 12 17:34:02 2016 Validating certificate extended key usage
Fri Feb 12 17:34:02 2016 ++ Certificate has EKU (str) TLS Web Server Authentication, expects TLS Web Server Authentication
Fri Feb 12 17:34:02 2016 VERIFY EKU OK
Fri Feb 12 17:34:02 2016 VERIFY OK: depth=0, C=CH, ST=ZH, L=private, O=private, OU=MyOrganizationalUnit, CN=laplace, name=laplace_VPN_key, emailAddress=mm@michaelmichael.org
Fri Feb 12 17:34:02 2016 Data Channel Encrypt: Cipher 'AES-256-CBC' initialized with 256 bit key
Fri Feb 12 17:34:02 2016 Data Channel Encrypt: Using 160 bit message hash 'SHA1' for HMAC authentication
Fri Feb 12 17:34:02 2016 Data Channel Decrypt: Cipher 'AES-256-CBC' initialized with 256 bit key
Fri Feb 12 17:34:02 2016 Data Channel Decrypt: Using 160 bit message hash 'SHA1' for HMAC authentication
Fri Feb 12 17:34:02 2016 Control Channel: TLSv1.2, cipher TLSv1/SSLv3 AES256-GCM-SHA384, 2048 bit RSA
Fri Feb 12 17:34:02 2016 [laplace] Peer Connection Initiated with [AF_INET]321.321.321.321:99000
Fri Feb 12 17:34:05 2016 SENT CONTROL [laplace]: 'PUSH_REQUEST' (status=1)
Fri Feb 12 17:34:05 2016 PUSH: Received control message: 'PUSH_REPLY,route 192.168.1.0 255.255.255.0,redirect-gateway def1 bypass-dhcp,dhcp-option DNS 192.168.1.1,dhcp-option DNS 8.8.8.8,route 10.8.1.0 255.255.255.0,topology net30,ping 30,ping-restart 300,ifconfig 10.8.1.6 10.8.1.5'
Fri Feb 12 17:34:05 2016 OPTIONS IMPORT: timers and/or timeouts modified
Fri Feb 12 17:34:05 2016 OPTIONS IMPORT: --ifconfig/up options modified
Fri Feb 12 17:34:05 2016 OPTIONS IMPORT: route options modified
Fri Feb 12 17:34:05 2016 OPTIONS IMPORT: --ip-win32 and/or --dhcp-option options modified
Fri Feb 12 17:34:05 2016 ROUTE_GATEWAY 192.168.1.1/255.255.255.0 IFACE=enp3s0f2 HWADDR=00:90:f5:cc:97:a9
Fri Feb 12 17:34:05 2016 TUN/TAP device tun0 opened
Fri Feb 12 17:34:05 2016 TUN/TAP TX queue length set to 100
Fri Feb 12 17:34:05 2016 do_ifconfig, tt->ipv6=0, tt->did_ifconfig_ipv6_setup=0
Fri Feb 12 17:34:05 2016 /usr/sbin/ip link set dev tun0 up mtu 1500
Fri Feb 12 17:34:05 2016 /usr/sbin/ip addr add dev tun0 local 10.8.1.6 peer 10.8.1.5
Fri Feb 12 17:34:05 2016 /usr/sbin/ip route add 321.321.321.321/32 via 192.168.1.1
Fri Feb 12 17:34:05 2016 /usr/sbin/ip route add 0.0.0.0/1 via 10.8.1.5
Fri Feb 12 17:34:05 2016 /usr/sbin/ip route add 128.0.0.0/1 via 10.8.1.5
Fri Feb 12 17:34:05 2016 /usr/sbin/ip route add 192.168.1.0/24 via 10.8.1.5
Fri Feb 12 17:34:05 2016 /usr/sbin/ip route add 10.8.1.0/24 via 10.8.1.5
Fri Feb 12 17:34:05 2016 Initialization Sequence Completed

If I connect the same workstation to another network, for example at a friend's place on his WIFI, and then try to initiate an OpenVPN connection the same way I did before, the connection attempt fails with a "connection failed, retrying in 5 seconds" or something along those lines.

And this is the WEIRD part:
I also did a portscan from inside the LAN on the external IP several times in a row and I get different results: the port switches between open and closed. By trying several times, and by that I mean starting openvpn client over and over again, I am lucky sometimes and I get a connection, but sometimes not, and a couple of days later the port closes completely. During all this I do not make any changes to the pf or the openvpn config.

It has also happened that I was in another town with THE SAME WORKSTATION LAPTOP connected to some local network and the OpenVPN connection would not work, but then (still in the same environment) after tethering the same laptop to my smartphone, which was on the mobile data network, I got a connection.

Here are the relevant (I hope all of them) pf.conf settings:

Code:
### *** /etc/pf.conf

ifce = "re0"
ip = 192.168.1.5


host_ports_tcp = 22 443 1194 8200
host_ports_udp = 22 1194 1900

www = 10.0.0.2
www_ports_tcp = 80 443 1500 1600 1700
www_ports_udp = 80
www_ifce = $ifce lo0

table <vpn> persist { 10.8.1.0/8 }
tunnels = "{ tun0, tun1, tun2, tun3, tun4, tun5 }"

#                           OpenVPN        Jails          My LAN  
table <rfc1918> persist { 10.8.1.0/24, 10.0.0.0/24,  192.168.1.0/24}
# note: I know this table is NOT rfc1918, but this line was inherited from an example and I adapted it to my needs
# the jail nat and rdr rules are also omitted in this
icmp_types = "echoreq"

jails = "{" $www "}"
open_tcp_world = "{" 80 443 1194 "}"
open_udp_world = "{" 80 1194 "}"
open_tcp_lan   = "{" $host_ports_tcp $www_ports_tcp "}"
open_udp_lan   = "{" $host_ports_udp $www_ports_udp "}"

set block-policy drop

# lo0 nicht filtern
set skip on lo0
set timeout { interval 10, frag 30 }
set timeout { tcp.first 120, tcp.opening 30, tcp.established 86400 }
set timeout { tcp.closing 900, tcp.finwait 45, tcp.closed 90 }
set timeout { udp.first 60, udp.single 30, udp.multiple 60 }
set timeout { icmp.first 20, icmp.error 10 }
set timeout { other.first 60, other.single 30, other.multiple 60 }
set timeout { adaptive.start 0, adaptive.end 0 }
set limit { states 10000, frags 5000 }
set loginterface re0
set optimization normal
set require-order yes
set fingerprints "/etc/pf.os"
set ruleset-optimization basic

scrub in all fragment reassemble random-id

# www traffic redirection to the jail
rdr pass on { $www_ifce } proto tcp from any to { $ifce } port { $www_ports_tcp } -> $www
rdr pass on { $www_ifce } proto udp from any to { $ifce } port { $www_ports_udp } -> $www

# enable jails to talk to the outside
nat on { $ifce } proto {tcp udp icmp} from $jails to any -> $ip

# that is necessary to enable VPN clients with their 10.8.1.0 IP addresses to talk to anywhere
nat on { $ifce lo0 } proto {tcp udp icmp} from <vpn> to any -> $ip

block log all
block return

antispoof quick for {$ifce}

# Allow SSH connection from standard LAN subnet adresses (all protocols)
pass in quick on {$ifce} proto { tcp, udp } from <rfc1918> to $ifce port 22

# opens specified ports for origins specified in the LAN table
pass in on {$ifce} proto tcp from <rfc1918> to {$ifce} port $open_tcp_lan
pass in on {$ifce} proto udp from <rfc1918> to {$ifce} port $open_udp_lan

pass in log (all) on $tunnels all keep state

# opens the specified ports for the rest of the internet
pass in on {$ifce} inet proto tcp from any port $open_tcp_world keep state
pass in on {$ifce} inet proto udp from any port $open_udp_world keep state

# Allow all outgoing traffic
pass out quick proto { tcp icmp udp } keep state

# This is the ping rule comment out to block (if block all is on above)
pass in on {$ifce} inet proto icmp all icmp-type $icmp_types keep state


Here is the nmap command and its output, with the workstation connected to the same LAN as the OpenVPN server. Each scan was started within seconds of the previous one completing; note the times.

Code:
[root@localhost VPN_client_michael]# nmap -p 99000 -sS my.ddns.given.external.ip

Starting Nmap 7.00 ( https://nmap.org ) at 2016-02-12 18:12 CET
Nmap scan report for my.ddns.given.external.ip (321.321.321.321)
Host is up (0.0039s latency).
rDNS record for 321.321.321.321: 321-321-321-321.my.cool.isp
PORT      STATE  SERVICE
99000/tcp closed unknown

Nmap done: 1 IP address (1 host up) scanned in 0.63 seconds
[root@localhost VPN_client_michael]# nmap -p 99000 -sS my.ddns.given.external.ip

Starting Nmap 7.00 ( https://nmap.org ) at 2016-02-12 18:12 CET
Nmap scan report for my.ddns.given.external.ip (321.321.321.321)
Host is up (0.0029s latency).
rDNS record for 321.321.321.321: 321-321-321-321.my.cool.isp
PORT      STATE SERVICE
99000/tcp open  unknown

Nmap done: 1 IP address (1 host up) scanned in 0.13 seconds
[root@localhost VPN_client_michael]# nmap -p 99000 -sS my.ddns.given.external.ip

Starting Nmap 7.00 ( https://nmap.org ) at 2016-02-12 18:12 CET
Nmap scan report for my.ddns.given.external.ip (321.321.321.321)
Host is up (0.0044s latency).
rDNS record for 321.321.321.321: 321-321-321-321.my.cool.isp
PORT      STATE  SERVICE
99000/tcp closed unknown

Nmap done: 1 IP address (1 host up) scanned in 0.14 seconds
[root@localhost VPN_client_michael]# nmap -p 99000 -sS my.ddns.given.external.ip

Starting Nmap 7.00 ( https://nmap.org ) at 2016-02-12 18:12 CET
Nmap scan report for my.ddns.given.external.ip (321.321.321.321)
Host is up (0.0040s latency).
rDNS record for 321.321.321.321: 321-321-321-321.my.cool.isp
PORT      STATE  SERVICE
99000/tcp closed unknown

Nmap done: 1 IP address (1 host up) scanned in 0.15 seconds
[root@localhost VPN_client_michael]# nmap -p 99000 -sS my.ddns.given.external.ip

Starting Nmap 7.00 ( https://nmap.org ) at 2016-02-12 18:12 CET
Nmap scan report for my.ddns.given.external.ip (321.321.321.321)
Host is up (0.0039s latency).
rDNS record for 321.321.321.321: 321-321-321-321.my.cool.isp
PORT      STATE SERVICE
99000/tcp open  unknown

Nmap done: 1 IP address (1 host up) scanned in 0.11 seconds
[root@localhost VPN_client_michael]# nmap -p 99000 -sS my.ddns.given.external.ip

Starting Nmap 7.00 ( https://nmap.org ) at 2016-02-12 18:12 CET
Nmap scan report for my.ddns.given.external.ip (321.321.321.321)
Host is up (0.0022s latency).
rDNS record for 321.321.321.321: 321-321-321-321.my.cool.isp
PORT      STATE SERVICE
99000/tcp open  unknown

Nmap done: 1 IP address (1 host up) scanned in 0.11 seconds
[root@localhost VPN_client_michael]# nmap -p 99000 -sS my.ddns.given.external.ip

Starting Nmap 7.00 ( https://nmap.org ) at 2016-02-12 18:12 CET
Nmap scan report for my.ddns.given.external.ip (321.321.321.321)
Host is up (0.0051s latency).
rDNS record for 321.321.321.321: 321-321-321-321.my.cool.isp
PORT      STATE SERVICE
99000/tcp open  unknown

Nmap done: 1 IP address (1 host up) scanned in 0.14 seconds
[root@localhost VPN_client_michael]# nmap -p 99000 -sS my.ddns.given.external.ip

Starting Nmap 7.00 ( https://nmap.org ) at 2016-02-12 18:12 CET
Nmap scan report for my.ddns.given.external.ip (321.321.321.321)
Host is up (0.0032s latency).
rDNS record for 321.321.321.321: 321-321-321-321.my.cool.isp
PORT      STATE  SERVICE
99000/tcp closed unknown

Nmap done: 1 IP address (1 host up) scanned in 0.11 seconds

Now, at the time of writing, the port remains closed and I cannot get a connection from the outside at all (I checked that with my smartphone, trying to connect first while on the WIFI and then on the mobile data network, which gives me a real external IP). I also did the nmap scan from inside the LAN and see the same switching behaviour there; however, from the LAN I can connect, but not from the world. I really cannot see a pattern here.

Note: I also connect to the HTTPS port from the outside, with the routers forwarding the packets the same way, and this works like a charm. I have to mention that the external HTTPS port is left at 443 and is not changed during the redirection by the routers, whereas for OpenVPN it is changed from 99000 to 1194, which is the only difference here.

Anyway, this drives me nuts, as I might be able to quickly stop by the machine while I am in town, but regularly I have to travel and I start chewing on my shirt when I find myself hundreds of km away and not being able to connect.

I hope this is detailed enough, if you need it, then I can also post the openvpn config and some netstat output from the openvpn machine.

Thanks,
mm
 
I think your biggest problem is having those three routers in between. Especially as it appears each one of them uses NAT. Any one of those can have states time-out or have some IPS function kick in and cause problems. NAT in and of itself isn't a problem, I have several OpenVPN instances running behind a firewall with NAT. But never three in a row...
 
thanks for the quick reply.

The thing that puzzles me is that I am not experiencing the same with the HTTPS port, which is forwarded the same way. Earlier I had an OpenVPN service running on a Synology Diskstation to which the packets were forwarded from the outside using the same settings on the routers (in fact I only changed the target IP on one router when I switched to OpenVPN on the FreeBSD machine, so I figured the problem originates not on the router(s)).

However, I just did some port scans right now on the LAN IP of the server (192.168.1.5) and the port is always reported to be open.

The routers typically do not distinguish between HTTPS traffic and openvpn traffic, so why would the timeout be a problem for OpenVPN but not for HTTPS?

The first router is only bridging btw. and only functions as a hardware interface, thus should be transparent for Layer 3
 
Alright, after prolonged and intense studies, I found the culprit that cost me weeks of spare time if summed up.

Code:
pass in on {$ifce} inet proto tcp from any port $open_tcp_world keep state
This rule, which was supposed to make ports on the machine (not the router) reachable from the outside, has a semantic error: it means that packets from ANY ADDRESS AND FROM THE SPECIFIED PORTS (source ports) shall pass.
What I actually intended was: packets from ANY address TO the specified ports (destination ports) shall pass.

The working rule looks like this (note the "to" between "any" and "port"):
Code:
pass in on {$ifce} inet proto tcp from any to port $open_tcp_world keep state

Apparently the OpenVPN client uses a more or less random source port for every connection attempt, which is (to me) the most probable reason why the connection could sometimes be established and sometimes not. The same goes for the port scans (I have no clue how nmap works in detail). The port change performed by the router (external port 99000 to internal port 1194) added to the confusion. I still find that a bit strange, though, because when the router port-forwards a packet, the source and destination ports and addresses are rewritten, as far as I know.

It is also clear now why the HTTPS connection worked: the source and destination ports are the same on the client, the server and the forwarding on the router (ext port = int port).
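
For comparison, a minimal pf.conf sketch of the two forms, reusing the macros from the config posted above. The UDP line is an assumption: the posted UDP rule has the same "from any port" shape, so it presumably needs the same change, though this thread only confirms the TCP fix.

```
# Original form: "port" directly after "from any" binds to the SOURCE side,
# so only packets whose source port is 80, 443 or 1194 match.
# pass in on {$ifce} inet proto tcp from any port $open_tcp_world keep state

# Corrected form: "port" after "to" binds to the DESTINATION side.
pass in on {$ifce} inet proto tcp from any to port $open_tcp_world keep state

# Assumed equivalent fix for the UDP rule (not tested in this thread).
pass in on {$ifce} inet proto udp from any to port $open_udp_world keep state
```

After editing, `pfctl -nf /etc/pf.conf` parses the file without loading it, and `pfctl -sr` lists the loaded rules with macros expanded, which makes a source-side "port" easier to spot. Note also that the posted config redirects port 443 to the jail with "rdr pass", and according to pf.conf(5) packets matching such a rule are passed without consulting the filter rules, so that may be another reason HTTPS was never affected.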

Obviously I was not familiar enough with the pf syntax and the exact meanings of "on", "from", "to" and so on. After reading the marvellous "The Book of PF" by Peter N.M. Hansteen, I am much more sensitized to the difference and the resulting logic. In his book he also points out this very common and widespread misconception.

Cheers and out,
mm
 
Yeah, trying to be too smart too early and writing rules with shortcuts when you don't quite yet have a knack for the rule idioms is a bad idea. I tend to write PF rules as verbosely as possible so that when I revisit the rules there's less chance of missing the important bits and getting it all wrong. The rule parser cannot read your mind, and you can write dozens of variations of the same rule that parse OK but of which only one makes sense and works.
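
As a sketch of that verbose style, the OpenVPN rule could be spelled out with every clause explicit. This assumes the $ifce and $ip macros from the pf.conf earlier in the thread; the flags and state options are written out only for readability, since pf on FreeBSD 10 should already apply "flags S/SA keep state" to TCP pass rules by default.

```
# interface, address family, protocol, source, destination host and
# destination port are all written out, so nothing is left implicit
pass in on $ifce inet proto tcp from any to $ip port 1194 flags S/SA keep state
```

Written this way, a "port" that ended up on the wrong side of "to" stands out when the rule is read back.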
 
To be precise, the macros did not have anything to do with the error.
 