Add timestamp for pcap file’s name

I wrote a post about splitting large pcap file into small ones before. After that, you should add timestamp for pcap‘s file name, and it will be easy for you to find related pcap files to process.

Assume there is a folder which includes all pcap files generated by following tcpdump command:

tcpdump -r input.pcap -w ./pcap/frag -C 1000

It will generate ./pcap/frag./pcap/frag1, …, etc. You can use following script to add timestamp for every file:

#!/bin/sh

directory=./pcap
cd "$directory" || exit 1

for old_file_name in *
do
    timestamp=$(tshark -nr "${old_file_name}" -T fields -e frame.time_epoch -c 1)
    new_file_name="${old_file_name}.${timestamp}.pcap"
    mv "${old_file_name}" "${new_file_name}"
done

The file’s name will be fragxx.1542222065.974954000.pcap now.

P.S., the script can be downloaded here.

Process large pcap file

To process large pcap file, usually it is better to split it into small chunks first, then process every chunk in parallel. I implement a simple shell script to do it:

#!/bin/sh

input_pcap=input.pcap
output_pcap=./pcap/frag.pcap
spilt_size=1000
output_index=1
loop_count=10
exit_flag=0

command() {
    echo "$1" "$2" > log"$2"
}

tcpdump -r ${input_pcap} -w ${output_pcap} -C ${spilt_size}

command ${output_pcap}

while :
do
    loop_index=0
    while test ${loop_index} -lt ${loop_count}
    do
        if test -e ${output_pcap}${output_index}
        then
            command ${output_pcap} ${output_index} &
            output_index=$((output_index + 1))
            loop_index=$((loop_index + 1))
        else
            exit_flag=1
            break
        fi
    done
    wait

    if test ${exit_flag} -eq 1
    then
        exit 0
    fi
done

First of all, split input pcap file into 1GB chunks. Then launch 10 processes to crunch data (in above example, just simple output). Definitely you can customize it.

BTW, the code can be downloaded here.

Search IP fragmentation pcap files

The following shell script searches IP fragment pcap files in a folder:

#!/bin/sh

for file in ./*.pcap
do
    frag_packets=$(tshark -r $file -Y "ip.flags.mf==1 || ip.frag_offset>0")
    if [[ "${frag_packets}" != "" ]]
    then
        echo "$file"
    fi
done

We should pay attention to -Y option which is for display filters; if what you want is capture filters, -f is the right choice.

P.S., the code can be downloaded here.

The display format of Bash’s built-in time command

Check the default output format of bash‘s built-in time command:

# time

real    0m0.000s
user    0m0.000s
sys     0m0.000s

You can use -p option to output in POSIX format:

# time -p
real 0.00
user 0.00
sys 0.00

From bash source code, we know the the definitions of these two formats:

#define POSIX_TIMEFORMAT "real %2R\nuser %2U\nsys %2S"
#define BASH_TIMEFORMAT  "\nreal\t%3lR\nuser\t%3lU\nsys\t%3lS"

To decipher the meanings of them, we need to refer bash manual:

TIMEFORMAT

The value of this parameter is used as a format string specifying how the timing information for pipelines prefixed with the time reserved word should be displayed. The ‘%’ character introduces an escape sequence that is expanded to a time value or other information. The escape sequences and their meanings are as follows; the braces denote optional portions.

%%
A literal ‘%’.

%[p][l]R
The elapsed time in seconds.

%[p][l]U
The number of CPU seconds spent in user mode.

%[p][l]S
The number of CPU seconds spent in system mode.

%P
The CPU percentage, computed as (%U + %S) / %R.

The optional p is a digit specifying the precision, the number of fractional digits after a decimal point. A value of 0 causes no decimal point or fraction to be output. At most three places after the decimal point may be specified; values of p greater than 3 are changed to 3. If p is not specified, the value 3 is used.

The optional l specifies a longer format, including minutes, of the form MMmSS.FFs. The value of p determines whether or not the fraction is included.
……

Take POSIX_TIMEFORMAT as an example: %2R denotes using second as time unit, and the precision is two digits after a decimal point; %2U and %2S are similar.

Now you can comprehend the output of time, correct? Try using BASH_TIMEFORMAT as a practice.

 

Beware “No such file or directory” error in using ksh

Check following simple script:

#!/usr/bbin/python
print("hello world!")

I misspelled python path intentionally. On Linux Bash, it reported following error:

$ ./hello.py
-bash: ./hello.py: /usr/bbin/python: bad interpreter: No such file or directory

It prompted me that “/usr/bbin/python” couldn’t be found. While on OpenBSD ksh:

$ ./hello.py
ksh: ./hello.py: No such file or directory

It gave an illusion that hello.py didn’t exist. So be careful about this error information if you use ksh.