Parse pcapng file

Libpcap can be used to parse pcapng file, but only provides features compatible to pcap file (Please refer pcapng wiki).

PcapPlusPlus is a better choice and it can parse more information about the file and packet (e.g., the comment field of every packet). The following is a simple program which prints the metadata of pcapng file and the first 10 packets’ comments:

#include <iostream>
#include <PcapFileDevice.h>

int
main()
{
    std::cout << pcap_lib_version() << '\n';
    pcpp::PcapNgFileReaderDevice input_file("/Users/nanxiao/Downloads/capture.pcapng");
    if (input_file.open()) {
        std::cout << "Open successfully\n";
    } else {
        std::cerr << "Open failed\n";
        return 1;
    }

    std::cout << input_file.getOS() << '\n';
    std::cout << input_file.getHardware() << '\n';
    std::cout << input_file.getCaptureApplication() << '\n';
    std::cout << input_file.getCaptureFileComment() << '\n';

    pcpp::RawPacket packet;
    std::string comment;
    for (size_t i = 1; i < 10; i++)
    {
        if (input_file.getNextPacket(packet, comment)) {
            std::cout << i << ":" << comment << '\n';
        } else {
            std::cerr << "Get packet failed\n";
            return 1;
        }
    }

}

P.S., the code can be downloaded here.

tshark can’t process macOS’s pcapng file well

Wireshark‘s tshark program can’t process macOS‘s pcapng file well. E.g.:

$ sudo tcpdump -w foo.pcapng
Password:
tcpdump: data link type PKTAP
tcpdump: listening on pktap, link-type PKTAP (Apple DLT_PKTAP), capture size 262144 bytes
^C24 packets captured
27 packets received by filter
0 packets dropped by kernel

Use tshark to read and write the generated foo.pcapng:

$ tshark -r foo.pcapng -w bar.pcapng
tshark: An error occurred while writing to the file "bar.pcapng": Internal error.

I also met following error before:

$ tshark -r apsd-107.pcapng -w foo.pcapng
tshark: The capture file being read can't be written as a "pcapng" file.

macOS has its own bespoke libpcap and tcpdump, so if the pcapng file is generated by tcpdump, using tcpdump itself to process pcapng file seems the only choice.

A workaround is if you don’t care about losing information, you can use wireshark to convert the pcapng file to pcap first:

pcap_next_ex() blocks in Void Linux forever

My Void Linux is a virtual machine. I implemented a simple capturing packets program using libpcap, but found the program is blocked in pcap_next_ex():

(gdb) bt
#0  0x00007ffff7ad7763 in __GI___poll (fds=fds@entry=0x7fffffffe160, nfds=nfds@entry=1, timeout=-1)
    at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007ffff7f8309d in poll (__timeout=<optimized out>, __nfds=1, __fds=0x7fffffffe160) at /usr/include/bits/poll2.h:46
#2  pcap_wait_for_frames_mmap (handle=handle@entry=0x55555556eba0) at ./pcap-linux.c:5018
#3  0x00007ffff7f886f4 in pcap_read_linux_mmap_v3 (handle=0x55555556eba0, max_packets=1, callback=0x7ffff7f82bf0 <pcap_oneshot_mmap>,
    user=0x7fffffffe1f0 "\260\355VUUU") at ./pcap-linux.c:5577
#4  0x00007ffff7f8c6a2 in pcap_next_ex (p=<optimized out>, pkt_header=<optimized out>, pkt_data=<optimized out>) at ./pcap.c:505
......

The libpcap version in Void Linux is 1.9.1. After checking code, I found although I set timeout in pcap_open_live(), this value doesn’t take effect in set_poll_timeout:

    ......
    if (handlep->tp_version == TPACKET_V3 && !broken_tpacket_v3)
        handlep->poll_timeout = -1; /* block forever, let TPACKET_V3 wake us up */
    ......

I don’t have physical machine, so not sure this issue only happens in virtual machine or not.

Enhance Hyperscan’s pcapCorpus.py script

Hyperscan‘s pcapCorpus.py script can convert a pcap file containing UDP and TCP packets to a corpus file which can be processed by hsbench program, I did some improvements to this script:

a) Support parsing VLAN packets;
b) Continue to handle instead of exit when meeting exceptional packets.

P.S., the code can be downloaded here.

Handle IP fragmentation pcap file

Wireshark has a handy feature which can follow TCP stream, but sometimes, it may not work as you expect. Check following diagram:

The IP packet carries a GTP payload, but since it is fragmented, and only first one is captured, so Wireshark won’t dissect it, and if you try follow TCP stream of this session, this packet will be ignored.

stripe is a cool tool which can peel away encapsulating headers. But from my testing, you should add -f option, otherwise the IP fragmented packet which I mentioned previously will be skipped, but even with this option, stripe will not remove the headers. So I write a simple program which just removes headers for specified packet (The code is here for reference).