TIP: Producing detailed reports from pflog

Synopsis

After some frustration with existing pflog reporting tools, I decided to re-familiarize myself with Python and write my own reports.

The following script reads your pflog files (e.g. /var/log/pflog.0.bz2) and reports in detail on the traffic logged there: for each flow and source IP, which targets were hit and how many packets each one received.

Be sure to read the caveats and Q&A below before using it.

The Script

Code:
#!/usr/local/bin/python

#
# FPFparse.py
#
# Author:  anomie
# Purpose: Reads a "tcpdump -enr"-formatted PF logfile and produces a 
#          detailed packet activity report.
#
# Example usages:   
#    # tcpdump -enr pflog.0 | ./FPFparse.py - 
#    # tcpdump -enr pflog.0 > /tmp/mylog && ./FPFparse.py /tmp/mylog
#    # bzcat pflog.0.bz2 | tcpdump -enr - | ./FPFparse.py -
#
# See bottom of script for copyright info.
#

import re
import sys

class FPFparse:


  def __init__(self):
    self.pfdict = {}
    self.usingstdin = False
    self.infile = ""

    cliarg = sys.argv[1:]

    if not cliarg: 
      print "Fatal error: a tcpdump -enr formatted file needs to be" 
      print "passed in as an argument. (Or use '-' to provide stdin"
      print "as the argument.)" 

      sys.exit(1)

    if cliarg[0] == "-":
      self.usingstdin = True
    else:
      self.infile = cliarg[0]


  def parsedata(self):
    if self.usingstdin:
      for logline in sys.stdin:
        loglist = self.chopline(logline)
        self.dictstuff(loglist)

      self.report()

    else:
      try:
        pflog = open(self.infile, 'r')
      except IOError:
        print 'Cannot open', self.infile
        sys.exit(1)
      else:
        for logline in pflog.readlines():
          loglist = self.chopline(logline)
          self.dictstuff(loglist)

        self.report()

        pflog.close()


  def chopline(self, logline):
    pattern = re.compile(r"""
      \s                     # whitespace
      ([pb][al][so][sc]k?    # (pass|block)
      \s                     # whitespace
      [io][nu]t?             # (in|out)
      \son\s                 # ' on '
      \w{1,5}:)              # 1-5 word chars plus ':' -- e.g. xl0:, le0:, etc.
      \s                     # whitespace
      (\d{1,3}\.             # source IP, part 1
      \d{1,3}\.              #          , part 2
      \d{1,3}\.              #          , part 3
      \d{1,3})\.             #          , part 4
      .*                     # match anything
      \s                     # white space
      (\d{1,3}\.             # target IP, part 1
      \d{1,3}\.              #          , part 2
      \d{1,3}\.              #          , part 3
      \d{1,3}\.              #          , part 4
      \d{1,5}):              # target port
    """, re.VERBOSE)

    if not pattern.search(logline):
      print "Fatal error: A line in your input file is malformed." 
      print "(Are you sure you used tcpdump -enr to create it??)"
      print "Line contents:" 
      print logline

      sys.exit(1)

    logtuple = pattern.search(logline).groups()
    loglist = list(logtuple)

    return loglist


  def dictstuff(self, loglist):
    #
    # At this point we have list entries that look something like: 
    # ['pass in on le0:', '172.16.39.146', '10.6.17.212.3128']
    #
    netflow = loglist[0]
    srcip = loglist[1]
    target = loglist[2]

    if not netflow in self.pfdict:
      self.pfdict[netflow] = {}

    if not srcip in self.pfdict[netflow]:
      self.pfdict[netflow][srcip] = []

    self.pfdict[netflow][srcip].append(target)
    #
    # By the end of this function, we have a dictionary of
    # dictionaries. The key for the outer dictionary is e.g.
    # 'pass in on le0:'. The key for the inner dictionary is
    # '<src ip>', and each inner value is a list of
    # '<target ip>.<port>' strings, one per logged packet.
    #


  def report(self):
    #
    # Lots of object swapping voodoo going on in this function. 
    # This can be modified to produce whatever format of report
    # you fancy. 
    #
    for netflow in self.pfdict:
      print netflow 
      print '----------------'

      srclist = self.pfdict[netflow].keys()
      srclist.sort()

      for srcip in srclist:
        trglist = self.pfdict[netflow][srcip]
        trgset = set(trglist)
        trglist = list(trgset)
        trglist.sort()

        print
        print "from %s" % (srcip)

        for target in trglist:
          ftrglist = target.split('.')
          ftrgip = ".".join(ftrglist[0:4])
          ftrgport = ftrglist[4]
 
          print " -> %s (%s) :  %s packets" % (ftrgip, \
                   ftrgport, self.pfdict[netflow][srcip].count(target))

      print

if __name__ == "__main__":

  obj = FPFparse()

  obj.parsedata()


#  Copyright (c) 2009 anomie
#  All rights reserved.
#
#  Redistribution and use in source and binary forms, with or without
#  modification, are permitted provided that the following conditions
#  are met:
#  1. Redistributions of source code must retain the above copyright
#     notice, this list of conditions and the following disclaimer.
#  2. Redistributions in binary form must reproduce the above copyright
#     notice, this list of conditions and the following disclaimer in the
#     documentation and/or other materials provided with the distribution.
#
#  THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
#  ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
#  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
#  ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
#  FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
#  DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
#  OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
#  HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
#  LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
#  OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
#  SUCH DAMAGE.

Sample Output

Code:
pass out on ath0:
----------------

from 10.0.0.2
 -> 10.10.254.210 (80) :  4 packets
 -> 10.10.186.201 (80) :  1 packets
 -> 10.10.186.208 (80) :  1 packets
 -> 10.10.186.209 (80) :  2 packets
 -> 10.10.186.216 (80) :  4 packets
...snip...

pass in on le0:
----------------

from 10.1.174.124
 -> 10.1.174.212 (3128) :  202 packets

from 10.1.174.133
 -> 10.1.174.212 (3128) :  35 packets

from 10.1.174.147
 -> 10.1.174.212 (3128) :  74 packets
...snip...

block in on xl0:
----------------

from 10.3.174.1
 -> 10.3.255.255 (138) :  4 packets

from 10.3.174.104
 -> 10.3.255.255 (138) :  5 packets

from 10.3.174.107
 -> 10.3.255.255 (138) :  5 packets
 -> 255.255.255.255 (67) :  2 packets
...snip...
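
The packet counts in a report like this come from grouping each parsed line by flow and source IP, then counting repeated targets. Here is a rough sketch of that grouping. It is not the script's own code: it assumes Python 2.7 or later for collections.Counter, the names tally and split_target are made up for the illustration, and the entries are sample data in the shape the script produces.

```python
from collections import Counter, defaultdict

def tally(entries):
    """Group (netflow, srcip, target) tuples into netflow -> srcip -> Counter."""
    flows = defaultdict(lambda: defaultdict(Counter))
    for netflow, srcip, target in entries:
        flows[netflow][srcip][target] += 1
    return flows

def split_target(target):
    """Split 'a.b.c.d.port' into ('a.b.c.d', 'port') at the last dot."""
    ip, _, port = target.rpartition(".")
    return ip, port

# Illustrative entries only, shaped like the lists chopline() returns.
entries = [
    ("pass in on le0:", "172.16.39.146", "10.6.17.212.3128"),
    ("pass in on le0:", "172.16.39.146", "10.6.17.212.3128"),
    ("pass in on le0:", "172.16.39.147", "10.6.17.212.3128"),
]

flows = tally(entries)
for netflow in sorted(flows):
    print(netflow)
    print("----------------")
    for srcip in sorted(flows[netflow]):
        print("from %s" % srcip)
        for target, count in sorted(flows[netflow][srcip].items()):
            ip, port = split_target(target)
            print(" -> %s (%s) :  %d packets" % (ip, port, count))
```

Counting while parsing avoids calling list.count() once per target at report time, which may matter on large logs.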

Caveats

  • This has been tested on FreeBSD 6.4 with Python 2.5. Nearby versions should work, but I can't vouch for that. Python 3 will not run the script as written (it uses print statements, for one) and would need a port.
  • I am a Python novice, and I may have used bad form in some areas. We all have to start somewhere.
  • If you run this script on very large pflog files, expect very large reports.
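
On the Python 3 caveat: below is a minimal sketch of what the parsing step might look like there. It has not been tried against real pflog output. The pattern is the one from chopline() with the comments stripped, parse() is a hypothetical helper name, and the first sample line is made up to fit the pattern. The second sample line is the ICMP line from the example report snippet further down.

```python
import re

# Same pattern as chopline() in the script, comments stripped.
PATTERN = re.compile(r"""
  \s
  ([pb][al][so][sc]k?
  \s
  [io][nu]t?
  \son\s
  \w{1,5}:)
  \s
  (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\.
  .*
  \s
  (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,5}):
""", re.VERBOSE)

def chopline(logline):
    """Return [netflow, srcip, target] for a matching line, else an empty list."""
    match = PATTERN.search(logline)
    return list(match.groups()) if match else []

def parse(lines):
    """Build the nested dictionary, skipping lines that do not match."""
    pfdict = {}
    for logline in lines:
        loglist = chopline(logline)
        if not loglist:
            print("Skipping unmatched line:")
            print(logline)
            continue
        netflow, srcip, target = loglist
        pfdict.setdefault(netflow, {}).setdefault(srcip, []).append(target)
    return pfdict

sample = [
    # Hypothetical line, made up to fit the pattern.
    "10:02:03.000000 rule 11/0(match): block in on le0: "
    "113.15.177.204.4321 > 10.99.174.231.5584: tcp 0",
    # ICMP line from the example report snippet; it has no ports, so it is skipped.
    "10:02:01.431735 rule 11/0(match): block in on le0: "
    "210.51.56.65 > 10.99.174.216: ICMP time exceeded in-transit, length 36",
]

pfdict = parse(sample)
```

To read from a pipe the same way the script does, pass sys.stdin to parse() instead of the sample list.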

Q & A

Questions:
  1. Why didn't you write this in Bourne shell or <your favorite scripting language>?
  2. Why didn't you use <pflog reporting tool>?
  3. What sort of traffic are you reporting on?

Answers:
  1. I attempted this with Bourne shell first, but it got unwieldy and required writing intermediate temp files along the way. I don't like <your favorite scripting language>, so I selected Python, which I am fond of.
  2. See the synopsis section at the start of this post.
  3. The script generates reports on packets that were logged via (pass|block) (in|out) rules. If separate entries are created by pflog for NAT or redirection, I'm not currently capturing those. (A quick edit to the regexp should solve that.)
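
As for the "quick edit to the regexp" in answer 3, one way to do it is to make the list of action words configurable. This is a sketch under assumptions: build_pattern is a made-up name, the pattern spells out pass/block and in/out as literal words rather than using the script's character classes, and whether your tcpdump prints a word such as "rdr" or "nat" in that position is something to check against your own output first. The sample line is hypothetical.

```python
import re

# Template for the script's pattern with a slot for the action words.
TEMPLATE = (
    r"\s((?:%s)\s(?:in|out)\son\s\w{1,5}:)"     # action, direction, interface
    r"\s(\d{1,3}(?:\.\d{1,3}){3})\."             # source IP (a source port follows)
    r".*\s(\d{1,3}(?:\.\d{1,3}){3}\.\d{1,5}):"   # target IP and port
)

def build_pattern(actions=("pass", "block")):
    """Compile the pattern for the given action words (an assumed extension point)."""
    alternation = "|".join(re.escape(action) for action in actions)
    return re.compile(TEMPLATE % alternation)

# Hypothetical log line; the 'rdr' action word is an assumption, not verified output.
rdr_line = ("10:02:05.000000 rule 3/0(match): rdr in on xl0: "
            "10.3.174.104.50000 > 10.3.174.1.8080: tcp 0")

default_match = build_pattern().search(rdr_line)
extended_match = build_pattern(("pass", "block", "rdr")).search(rdr_line)
```

With the default action words the line is not matched. With "rdr" added, the same three groups come back as for pass/block lines, so dictstuff() and report() would not need changes.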
 
Patch

In case anyone is actually using this reporting script: I made a change so that in the event of a malformed (read: unmatched) logfile line, the script no longer errors out. Now the script just prints the malformed line, skips processing it further, and continues on with the next line.

Patch.txt
Code:
--- FPFparse.py 2009/05/24 17:48:57     1.4
+++ FPFparse.py 2009/05/24 18:03:50
@@ -45,7 +45,8 @@
     if self.usingstdin:
       for logline in sys.stdin:
         loglist = self.chopline(logline)
-        self.dictstuff(loglist)
+        if loglist:
+          self.dictstuff(loglist)
 
       self.report()
 
@@ -57,7 +58,8 @@
       else:
         for logline in pflog.readlines():
           loglist = self.chopline(logline)
-          self.dictstuff(loglist)
+          if loglist:
+            self.dictstuff(loglist)
 
         self.report()
 
@@ -87,12 +89,10 @@
     """, re.VERBOSE)
 
     if not pattern.search(logline):
-      print "Fatal error: A line in your input file is malformed." 
-      print "(Are you sure you used tcpdump -enr to create it??)"
-      print "Line contents:" 
+      print "Skipping unmatched line:" 
       print logline
 
-      sys.exit(1)
+      return []
 
     logtuple = pattern.search(logline).groups()
     loglist = list(logtuple)

Example report snippet

Code:
Skipping unmatched line:
10:02:01.431735 rule 11/0(match): block in on le0: 210.51.56.65 > 10.99.174.216: ICMP time exceeded in-transit, length 36

block in on le0:
----------------

from 113.15.177.204
 -> 10.99.174.231 (5584) :  1 packets

from 113.22.116.57
 -> 10.99.174.231 (19010) :  1 packets

from 113.61.196.126
 -> 10.99.174.212 (24044) :  1 packets

from 115.134.60.227
 -> 10.99.174.216 (10211) :  3 packets

...
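
The skipped line in this snippet is an ICMP packet. It has no source or target port, which is why the script's pattern does not match it. If you would rather count such packets than skip them, here is a sketch of a pattern that treats the ports as optional. It assumes the " > " separator seen in the line above, chop is a hypothetical name, and it returns four items instead of three, so report() would need adjusting to use it. The TCP sample line is made up to fit the script's pattern.

```python
import re

# Ports are optional here, so lines without ports (such as ICMP) also match.
PORTS_OPTIONAL = re.compile(r"""
  \s
  ((?:pass|block)\s(?:in|out)\son\s\w{1,5}:)   # e.g. block in on le0:
  \s
  (\d{1,3}(?:\.\d{1,3}){3})                    # source IP
  (?:\.\d{1,5})?                               # optional source port (not captured)
  \s>\s                                        # separator, assumed from the sample line
  (\d{1,3}(?:\.\d{1,3}){3})                    # target IP
  (?:\.(\d{1,5}))?                             # optional target port
  :
""", re.VERBOSE)

def chop(logline):
    """Return (netflow, srcip, dstip, dstport), with '-' for a missing port."""
    match = PORTS_OPTIONAL.search(logline)
    if match is None:
        return None
    netflow, srcip, dstip, dstport = match.groups()
    return netflow, srcip, dstip, dstport or "-"

# The unmatched line from the report snippet above.
icmp_line = ("10:02:01.431735 rule 11/0(match): block in on le0: "
             "210.51.56.65 > 10.99.174.216: ICMP time exceeded in-transit, length 36")
# Hypothetical line with ports, made up to fit the script's pattern.
tcp_line = ("10:02:03.000000 rule 11/0(match): block in on le0: "
            "113.15.177.204.4321 > 10.99.174.231.5584: tcp 0")

icmp_result = chop(icmp_line)
tcp_result = chop(tcp_line)
```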
 