Sunday, December 8, 2013

digpcap.py - Network Capture files mangling


If you store huge amounts of network capture files (tcpdump output, or output from similar programs), you might be interested in trying this Python script. It aims at finding "the" correct capture file(s) among all the dated directories that pile up when tcpdump is told to write its captures to disk.

Example of such tcpdump command line:

tcpdump -s 0 -v -C 100 -W 10000  -w $dumpdir/`date +"%Y/%m/%d/"`dump

where:

  • -s 0 (to capture entire packet)
  • -v (verbose, prints more details. See tcpdump man page for additional v options, e.g. -vvv)
  • -C 100 (capture up to 100MB of packets per file)
  • -W 10000 (store up to 10000 rollover files before starting from 0 again, e.g. dump00001, dump00002, etc.)
  • -w (write to file)
  • $dumpdir (the root folder of the directory tree where captures are stored, e.g. /var/log/pcap). $dumpdir is important: you will have to modify the Python script to reflect your own value (I could have this passed by command-line argument but, for my own usage, it doesn't make sense as this folder is not likely to change often: YMMV).
  • `date +"%Y/%m/%d/"`dump (e.g. if today's date is the 15th of November 2013, then it will yield 2013/11/15/dump)
  • So -w $dumpdir/`date +"%Y/%m/%d/"`dump will store the capture file as /var/log/pcap/2013/11/15/dump00001, and the -C switch will create dump00002 as soon as dump00001 reaches the 100MB limit.
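The date-based layout described above can be reproduced from Python, which helps when deciding which folders are worth scanning for a given time window. This is only a sketch: DUMPDIR, day_folder and day_folders are hypothetical names, not taken from digpcap.py, and DUMPDIR simply reuses the /var/log/pcap example.

```python
from datetime import date, timedelta
from pathlib import PurePosixPath

# Assumption: same root folder as the $dumpdir example above.
DUMPDIR = "/var/log/pcap"


def day_folder(day, root=DUMPDIR):
    """Return the folder used for captures started on `day` (root/YYYY/MM/DD)."""
    return PurePosixPath(root) / day.strftime("%Y/%m/%d")


def day_folders(first, last, root=DUMPDIR):
    """Return one folder per calendar day from `first` to `last`, inclusive."""
    span = (last - first).days
    return [day_folder(first + timedelta(days=n), root) for n in range(span + 1)]


if __name__ == "__main__":
    for folder in day_folders(date(2013, 11, 4), date(2013, 11, 5)):
        print(folder)
```

Building the path with strftime("%Y/%m/%d") mirrors the `date +"%Y/%m/%d/"` part of the tcpdump command line, so both sides agree on the folder names.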

The script returns a list of files matching start and end time constraints provided as arguments.

Some examples of arguments the script is able to handle:

raskal$ ./digpcap.py 
usage: digpcap.py [-h] -f FROMDATE [-t TODATE] [-s SEARCH] [-v]
digpcap.py: error: argument -f/--fromdate is required

  • -f (from date) is the only mandatory argument required to avoid the above error message. It is the date you want to start searching for ... something.
  • Without -t (to date), the end of the search window defaults to now().
  • -s is an optional argument that is not in use at the moment (look at the script content for the roadmap).
  • -v switches to verbose mode. The extra output means the result can no longer be piped to another command like tshark (in a for loop context, see below).
  • -h ... well, usage and some examples....
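For readers who want to see how such a command line could be declared, here is a minimal argparse sketch. Only -f/--fromdate is confirmed by the error message above; the long names --todate, --search and --verbose are guesses derived from the usage line, and the help texts are mine. The dates are kept as plain strings here, whereas the real script relies on python-dateutil to parse them.

```python
import argparse


def build_parser():
    """Build a parser that accepts the options listed in the usage line."""
    parser = argparse.ArgumentParser(prog="digpcap.py")
    # Confirmed by the error message: -f/--fromdate is mandatory.
    parser.add_argument("-f", "--fromdate", required=True,
                        help="start of the search window")
    # Assumed long names below; None for --todate stands for "now".
    parser.add_argument("-t", "--todate", default=None,
                        help="end of the search window (default: now)")
    parser.add_argument("-s", "--search", default=None,
                        help="reserved, not used at the moment")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print extra details about each capture file")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Leaving the -s default at None would be consistent with the "Will dig None" line printed in verbose mode, but that is an inference from the output, not something the script confirms.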

Script dependencies:

- python-dateutil
- capinfos

- Update the CAP_FOLDER global variable to reflect your own setup: it is the place where tcpdump capture files are stored.

CAP_FOLDER example (no trailing slash at the end, please):

CAP_FOLDER = '/var/log/pcap'

The -v argument switch (the -v output is everything before the final list of file names):
raskal$ ./digpcap.py -v -f "Dec. 11, 2013 14:40:00"
Will dig None from 2013-12-11 14:40:00 to 2013-12-11 14:49:51.544586
2013/12/11/dump11007 interval: 2013-12-11 14:39:08 to 2013-12-11 14:47:22
2013/12/11/dump11008 interval: 2013-12-11 14:47:22 to 2013-12-11 14:53:51
2013/12/11/dump11007
2013/12/11/dump11008


As you can see, the -v argument produces extra output that is not suitable for further piping and processing. Use it to check that the script is not messing around.
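The verbose output suggests that a capture file is kept when its time interval overlaps the requested window. The sketch below shows one way to express that test. It is my assumption about the selection rule, not code lifted from digpcap.py; the names overlaps and select_captures are hypothetical, and dump11006 is an invented entry added only to show a file being rejected.

```python
from datetime import datetime


def overlaps(file_start, file_end, search_from, search_to):
    """Return True when the capture interval and the search window intersect."""
    return file_start <= search_to and file_end >= search_from


def select_captures(intervals, search_from, search_to):
    """Return the names, sorted, whose (start, end) interval overlaps the window."""
    return [name for name, (start, end) in sorted(intervals.items())
            if overlaps(start, end, search_from, search_to)]


# Intervals for dump11007 and dump11008 are copied from the -v output above.
intervals = {
    # Hypothetical earlier file, ends before the search window starts.
    "2013/12/11/dump11006": (datetime(2013, 12, 11, 14, 30, 0),
                             datetime(2013, 12, 11, 14, 39, 8)),
    "2013/12/11/dump11007": (datetime(2013, 12, 11, 14, 39, 8),
                             datetime(2013, 12, 11, 14, 47, 22)),
    "2013/12/11/dump11008": (datetime(2013, 12, 11, 14, 47, 22),
                             datetime(2013, 12, 11, 14, 53, 51)),
}

selected = select_captures(intervals,
                           datetime(2013, 12, 11, 14, 40, 0),
                           datetime(2013, 12, 11, 14, 49, 51))

if __name__ == "__main__":
    for name in selected:
        print(name)
```

With the window from the example (14:40:00 to 14:49:51), dump11007 is kept even though it starts before the window, because it still contains packets captured after 14:40:00.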
Last example: searching for capture files stored between November the 4th at 11:30am and November the 5th at 3:46:50am (no -v switch this time).
raskal$ ./digpcap.py -f "Nov. 4, 2013 11:30" -t "Nov. 5, 2013 3:46:50" 
2013/11/04/dump09007
2013/11/04/dump09008
2013/11/04/dump09009
2013/11/04/dump09010
2013/11/05/dump00001
2013/11/05/dump00002
Note: at midnight, a cron job restarts the tcpdump process, so the dump##### numbering starts afresh.
Without -v you can further process each file one by one; your imagination is the limit. (Note: no need to enter the time in HH:MM:SS format, HH is enough.)
raskal$ for f in `./digpcap.py -f "Aug. 23, 2012 14" -t "Aug. 24, 2012 14"`; do echo "Look Ma, got " $f; done
Look Ma, got 2012/08/23/ipv4.pcap
Look Ma, got 2012/08/23/ipv6.pcap
Look Ma, got 2012/08/24/fragment.pcap
Look Ma, got 2012/08/24/link.pcap

The script... digpcap.py (version 0.3)