TIP: Producing detailed reports from pflog

anomie · Apr 24, 2009

Synopsis

Following some pflog reporting frustration, I decided to re-familiarize myself with Python and create my own reports.

The script below reads your pflog files (e.g. /var/log/pflog.0.bz2, after decoding with tcpdump) and reports, for each action/direction/interface combination, which source IPs sent how many logged packets to which target IP and port.

Be sure to read the caveats and Q&A below before using it.

The Script

Code:

#!/usr/local/bin/python

#
# FPFparse.py
#
# Author:  anomie
# Purpose: Reads a "tcpdump -enr"-formatted PF logfile and produces a 
#          detailed packet activity report.
#
# Example usages:   
#    # tcpdump -enr pflog.0 | ./FPFparse.py - 
#    # tcpdump -enr pflog.0 > /tmp/mylog && ./FPFparse.py /tmp/mylog
#    # bzcat pflog.0.bz2 | tcpdump -enr - | ./FPFparse.py -
#
# See bottom of script for copyright info.
#

import re
import sys

class FPFparse:


  def __init__(self):
    self.pfdict = {}
    self.usingstdin = False
    self.infile = ""

    cliarg = sys.argv[1:]

    if not cliarg: 
      print "Fatal error: a tcpdump -enr formatted file needs to be" 
      print "passed in as an argument. (Or use '-' to provide stdin"
      print "as the argument.)" 

      sys.exit(1)

    if cliarg[0] == "-":
      self.usingstdin = True
    else:
      self.infile = cliarg[0]


  def parsedata(self):
    if self.usingstdin:
      for logline in sys.stdin:
        loglist = self.chopline(logline)
        self.dictstuff(loglist)

      self.report()

    else:
      try:
        pflog = open(self.infile, 'r')
      except IOError:
        print 'Cannot open', self.infile
      else:
        for logline in pflog.readlines():
          loglist = self.chopline(logline)
          self.dictstuff(loglist)

        self.report()

        pflog.close()


  def chopline(self, logline):
    pattern = re.compile(r"""
      \s                     # whitespace
      ([pb][al][so][sc]k?    # (pass|block)
      \s                     # whitespace
      [io][nu]t?             # (in|out)
      \son\s                 # ' on '
      \w{1,5}:)              # 1-5 word chars + ':' -- e.g. xl0:, le0:, etc.
      \s                     # whitespace
      (\d{1,3}\.             # source IP, part 1
      \d{1,3}\.              #          , part 2
      \d{1,3}\.              #          , part 3
      \d{1,3})\.             #          , part 4
      .*                     # match anything
      \s                     # white space
      (\d{1,3}\.             # target IP, part 1
      \d{1,3}\.              #          , part 2
      \d{1,3}\.              #          , part 3
      \d{1,3}\.              #          , part 4
      \d{1,5}):              # target port
    """, re.VERBOSE)

    if not pattern.search(logline):
      print "Fatal error: A line in your input file is malformed." 
      print "(Are you sure you used tcpdump -enr to create it??)"
      print "Line contents:" 
      print logline

      sys.exit(1)

    logtuple = pattern.search(logline).groups()
    loglist = list(logtuple)

    return loglist


  def dictstuff(self, loglist):
    #
    # At this point we have list entries that look something like: 
    # ['pass in on le0:', '172.16.39.146', '10.6.17.212.3128']
    #
    netflow = loglist[0]
    srcip = loglist[1]
    target = loglist[2]

    if not netflow in self.pfdict:
      self.pfdict[netflow] = {}

    if not srcip in self.pfdict[netflow]:
      self.pfdict[netflow][srcip] = []

    self.pfdict[netflow][srcip].append(target)
    #
    # By the end of this function, we have a dictionary object 
    # which contains dictionary entries itself. The key for the 
    # outer dictionary is e.g. 'pass in on le0:'. The key for the
    # inner dictionary is '<src ip>'. 
    #


  def report(self):
    #
    # Lots of object swapping voodoo going on in this function. 
    # This can be modified to produce whatever format of report
    # you fancy. 
    #
    for netflow in self.pfdict:
      print netflow 
      print '----------------'

      srclist = self.pfdict[netflow].keys()
      srclist.sort()

      for srcip in srclist:
        trglist = self.pfdict[netflow][srcip]
        trgset = set(trglist)
        trglist = list(trgset)
        trglist.sort()

        print
        print "from %s" % (srcip)

        for target in trglist:
          # target is '<ip>.<port>': fields 0-3 are the IP, field 4 the port
          ftrglist = target.split('.')
          ftrgip = ".".join(ftrglist[0:4])
          ftrgport = ftrglist[4]
 
          print " -> %s (%s) :  %s packets" % (ftrgip, \
                   ftrgport, self.pfdict[netflow][srcip].count(target))

      print

if __name__ == "__main__":

  obj = FPFparse()

  obj.parsedata()


#  Copyright (c) 2009 anomie
#  All rights reserved.
#
#  Redistribution and use in source and binary forms, with or without
#  modification, are permitted provided that the following conditions
#  are met:
#  1. Redistributions of source code must retain the above copyright
#     notice, this list of conditions and the following disclaimer.
#  2. Redistributions in binary form must reproduce the above copyright
#     notice, this list of conditions and the following disclaimer in the
#     documentation and/or other materials provided with the distribution.
#
#  THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
#  ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
#  ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
#  FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
#  DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
#  OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
#  HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
#  LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
#  OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
#  SUCH DAMAGE.
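
To make the regexp in chopline() less opaque, here is a small standalone sketch that runs the same pattern against a single log line and prints the three captured groups. The sample line is made up for illustration (modeled on tcpdump -enr output, not copied from a real log), and the names PATTERN and sample are hypothetical. It should run on both Python 2.6+ and Python 3.

```python
import re

# Same pattern as chopline() in the script, with the IP octets written inline.
PATTERN = re.compile(r"""
  \s
  ([pb][al][so][sc]k?    # (pass|block)
  \s
  [io][nu]t?             # (in|out)
  \son\s
  \w{1,5}:)              # interface name plus trailing colon
  \s
  (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\.   # source IP (source port follows)
  .*
  \s
  (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\.    # target IP
  \d{1,5}):                                # target port
""", re.VERBOSE)

# Hypothetical line, for illustration only.
sample = ("10:02:01.431735 rule 11/0(match): block in on le0: "
          "192.0.2.10.51234 > 10.99.174.231.5584: Flags [S], length 0")

groups = list(PATTERN.search(sample).groups())
print(groups)
```

Note that the third group keeps the target IP and port glued together with a dot, so the target has to be split on '.' again at report time.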

Sample Output

Code:

pass out on ath0:
----------------

from 10.0.0.2
 -> 10.10.254.210 (80) :  4 packets
 -> 10.10.186.201 (80) :  1 packets
 -> 10.10.186.208 (80) :  1 packets
 -> 10.10.186.209 (80) :  2 packets
 -> 10.10.186.216 (80) :  4 packets
...snip...

pass in on le0:
----------------

from 10.1.174.124
 -> 10.1.174.212 (3128) :  202 packets

from 10.1.174.133
 -> 10.1.174.212 (3128) :  35 packets

from 10.1.174.147
 -> 10.1.174.212 (3128) :  74 packets
...snip...

block in on xl0:
----------------

from 10.3.174.1
 -> 10.3.255.255 (138) :  4 packets

from 10.3.174.104
 -> 10.3.255.255 (138) :  5 packets

from 10.3.174.107
 -> 10.3.255.255 (138) :  5 packets
 -> 255.255.255.255 (67) :  2 packets
...snip...

Caveats

This has been tested on FreeBSD 6.4 with Python 2.5. Nearby FreeBSD and Python 2.x versions should work, but I can't vouch for that. Python 3 will not run it as-is (the print statements are syntax errors there, for a start), so it would need a port.
I am a Python novice, and I may have used bad form in some areas. We all have to start somewhere.
If you run this script on very large pflog files, expect very large reports.
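
On the Python 3 and large-file caveats: below is a minimal sketch of just the aggregation and report steps (not the full script) that should run on Python 3. It counts packets with collections.Counter instead of keeping a list entry for every logged packet, so memory grows with the number of distinct targets rather than the number of packets. The function names aggregate and format_report are hypothetical, and the input is assumed to be the 3-item lists that chopline() returns.

```python
from collections import Counter, defaultdict


def aggregate(entries):
    """Count packets per (netflow, source IP, target) triple."""
    flows = defaultdict(lambda: defaultdict(Counter))
    for netflow, srcip, target in entries:
        flows[netflow][srcip][target] += 1
    return flows


def format_report(flows):
    """Return report lines in the same layout as the original script."""
    lines = []
    for netflow in flows:
        lines.append(netflow)
        lines.append('----------------')
        for srcip in sorted(flows[netflow]):
            lines.append('')
            lines.append('from %s' % srcip)
            for target in sorted(flows[netflow][srcip]):
                # The target is '<ip>.<port>'; split off the last field.
                ip, port = target.rsplit('.', 1)
                count = flows[netflow][srcip][target]
                lines.append(' -> %s (%s) :  %s packets' % (ip, port, count))
        lines.append('')
    return lines


# Made-up entries in the shape chopline() returns.
entries = [
    ['pass in on le0:', '10.1.174.124', '10.1.174.212.3128'],
    ['pass in on le0:', '10.1.174.124', '10.1.174.212.3128'],
    ['pass in on le0:', '10.1.174.133', '10.1.174.212.3128'],
]
report = format_report(aggregate(entries))
print('\n'.join(report))
```

Sorting here is plain string sorting, as in the original, so 10.1.174.2 would sort after 10.1.174.124.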

Q & A

Questions:

1. Why didn't you write this in Bourne shell or <your favorite scripting language>?
2. Why didn't you use <pflog reporting tool>?
3. What sort of traffic are you reporting on?

Answers:

1. I attempted this with Bourne shell first, but it got unwieldy and required writing intermediate temp files along the way. I don't like <your favorite scripting language>, so I selected Python, which I am fond of.
2. See the synopsis section at the start of this post.
3. The script generates reports on packets that were logged via (pass|block) (in|out) rules. If separate entries are created by pflog for NAT or redirection, I'm not currently capturing those. (A quick edit to the regexp should solve that.)

anomie · May 24, 2009

Patch

In case anyone is actually using this reporting script: I made a change so that in the event of a malformed (read: unmatched) logfile line, the script no longer errors out. Now the script just prints the malformed line, skips processing it further, and continues on with the next line.

Patch.txt

Code:

--- FPFparse.py 2009/05/24 17:48:57     1.4
+++ FPFparse.py 2009/05/24 18:03:50
@@ -45,7 +45,8 @@
     if self.usingstdin:
       for logline in sys.stdin:
         loglist = self.chopline(logline)
-        self.dictstuff(loglist)
+        if loglist:
+          self.dictstuff(loglist)
 
       self.report()
 
@@ -57,7 +58,8 @@
       else:
         for logline in pflog.readlines():
           loglist = self.chopline(logline)
-          self.dictstuff(loglist)
+          if loglist:
+            self.dictstuff(loglist)
 
         self.report()
 
@@ -87,12 +89,10 @@
     """, re.VERBOSE)
 
     if not pattern.search(logline):
-      print "Fatal error: A line in your input file is malformed." 
-      print "(Are you sure you used tcpdump -enr to create it??)"
-      print "Line contents:" 
+      print "Skipping unmatched line:" 
       print logline
 
-      sys.exit(1)
+      return []
 
     logtuple = pattern.search(logline).groups()
     loglist = list(logtuple)

Example report snippet

Code:

Skipping unmatched line:
10:02:01.431735 rule 11/0(match): block in on le0: 210.51.56.65 > 10.99.174.216: ICMP time exceeded in-transit, length 36

block in on le0:
----------------

from 113.15.177.204
 -> 10.99.174.231 (5584) :  1 packets

from 113.22.116.57
 -> 10.99.174.231 (19010) :  1 packets

from 113.61.196.126
 -> 10.99.174.212 (24044) :  1 packets

from 115.134.60.227
 -> 10.99.174.216 (10211) :  3 packets

...
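
The skipped line in this snippet is ICMP, which carries no ports, so the pattern's mandatory ".port" parts can never match it. If you would rather count such packets than skip them, one option is to make both port parts optional. This is a rough sketch of that idea and not part of the patch; LOOSE and chop_loose are hypothetical names, and it has only been checked against the one ICMP line shown here and a made-up TCP-style line. It returns four items instead of three, so report() would need a matching adjustment.

```python
import re

LOOSE = re.compile(r"""
  \s
  ((?:pass|block)\s(?:in|out)\son\s\w+:)   # action, direction, interface
  \s
  (\d{1,3}(?:\.\d{1,3}){3})                # source IP
  (?:\.\d{1,5})?                           # optional source port
  \s>\s
  (\d{1,3}(?:\.\d{1,3}){3})                # target IP
  (?:\.(\d{1,5}))?                         # optional target port
  :
""", re.VERBOSE)


def chop_loose(logline):
    """Return [netflow, srcip, target IP, port or '-'], or [] if unmatched."""
    match = LOOSE.search(logline)
    if not match:
        return []
    netflow, srcip, trgip, port = match.groups()
    return [netflow, srcip, trgip, port or '-']


# The ICMP line from the report snippet in this post.
icmp = ("10:02:01.431735 rule 11/0(match): block in on le0: "
        "210.51.56.65 > 10.99.174.216: ICMP time exceeded in-transit, length 36")
print(chop_loose(icmp))
```

Lines that still do not match (IPv6 traffic, for example) come back as an empty list, the same convention the patch uses.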