Porting google/benchmark into OpenBSD

I want to use google/benchmark on OpenBSD, but find it support many platforms whereas lacks OpenBSD (the code is here):

#if defined(__CYGWIN__)
  #define BENCHMARK_OS_CYGWIN 1
#elif defined(_WIN32)
  #define BENCHMARK_OS_WINDOWS 1
#elif defined(__APPLE__)
  #define BENCHMARK_OS_APPLE 1
  #include "TargetConditionals.h"
  #if defined(TARGET_OS_MAC)
    #define BENCHMARK_OS_MACOSX 1
    #if defined(TARGET_OS_IPHONE)
      #define BENCHMARK_OS_IOS 1
    #endif
  #endif
#elif defined(__FreeBSD__)
  #define BENCHMARK_OS_FREEBSD 1
#elif defined(__NetBSD__)
  #define BENCHMARK_OS_NETBSD 1
#elif defined(__linux__)
  #define BENCHMARK_OS_LINUX 1
#elif defined(__native_client__)
  #define BENCHMARK_OS_NACL 1
#elif defined(EMSCRIPTEN)
  #define BENCHMARK_OS_EMSCRIPTEN 1
#elif defined(__rtems__)
  #define BENCHMARK_OS_RTEMS 1
#elif defined(__Fuchsia__)
#define BENCHMARK_OS_FUCHSIA 1
#elif defined (__SVR4) && defined (__sun)
#define BENCHMARK_OS_SOLARIS 1
#endif

Although it can be built successfully on OpenBSD, but “make test” reports some failures:

# make test

......

91% tests passed, 5 tests failed out of 54

Total Test time (real) =  40.18 sec

The following tests FAILED:
          1 - benchmark (Child aborted)
         38 - options_benchmarks (Child aborted)
         39 - basic_benchmark (Child aborted)
         43 - fixture_test (Child aborted)
         47 - reporter_output_test (Child aborted)
Errors while running CTest
*** Error 8 in /root/Project/benchmark/build (Makefile:130 'test': /usr/local/bin/ctest --force-new-ctest-process --exclude-regex "CMake.Fil...)

Check the following simple test file:

# cat test.cc
#include <benchmark/benchmark.h>

static void BM_StringCreation(benchmark::State& state) {
  for (auto _ : state)
    std::string empty_string;
}
// Register the function as a benchmark
BENCHMARK(BM_StringCreation);

// Define another benchmark
static void BM_StringCopy(benchmark::State& state) {
  std::string x = "hello";
  for (auto _ : state)
    std::string copy(x);
}
BENCHMARK(BM_StringCopy);

BENCHMARK_MAIN();

Compile and run it:

# c++ -I/usr/local/include -L/usr/local/lib -std=c++11 test.cc -o test -lbenchmark
root:/root/Project# ./test
failed to open /proc/cpuinfo
2018-05-02 17:14:11
Running ./test
Run on (-1 X 2545.25 MHz CPU )
***WARNING*** Library was built as DEBUG. Timings may be affected.
---------------------------------------------------------
Benchmark                  Time           CPU Iterations
---------------------------------------------------------
BM_StringCreation         40 ns         40 ns   17597275
BM_StringCopy             13 ns         13 ns   53511385

failed to open /proc/cpuinfo“? “-1 X 2545.25 MHz CPU“? Messy output, so I decided to port it on OpenBSD:

(1) The first thing to do is defining BENCHMARK_OS_OPENBSDin src/internal_macros.h:

......
#elif defined(__OpenBSD__)
  #define BENCHMARK_OS_OPENBSD 1
......

(2) The second task should fill the value of CPUInfo‘s members:

struct CPUInfo {
  ......
  int num_cpus;
  double cycles_per_second;
  std::vector<CacheInfo> caches;
  bool scaling_enabled;
  ......
};

Check CPUInfo‘s constructor:

CPUInfo::CPUInfo()
    : num_cpus(GetNumCPUs()),
      cycles_per_second(GetCPUCyclesPerSecond()),
      caches(GetCacheSizes()),
      scaling_enabled(CpuScalingEnabled(num_cpus)) {}

I know I need to implement GetNumCPUs(), GetCPUCyclesPerSecond(), etc. For FreeBSD and NetBSD, benchmark uses sysctlbyname function:

......
if (sysctlbyname(Name.c_str(), nullptr, &CurBuffSize, nullptr, 0) == -1)
    return ValueUnion();
......

Unfortunately, OpenBSD doesn’t support sysctlbyname, so I use sysctl to get CPU’s number and speed:

 if ((Name == "hw.ncpu") || (Name == "hw.cpuspeed")){
    ValueUnion buff(sizeof(int));
    ......
    if (sysctl(mib, 2, buff.data(), &buff.Size, nullptr, 0) == -1) {
      return ValueUnion();
    }
    return buff;
  }

For cache information, the OpenBSD can’t provide ready-made information, and I think it is not worthy to use other work-around method to get it. E.g., use CPUID instruction on X86 architectures (If you really want to know it, lscpu will give you a hand on X86 platform) . Regarding to whether CPU support scaling or not, I can’t find any help about OpenBSD, so just leave it here.

The whole patch is here. Not sure whether google likes to merge it or not (Update: it is already merged), But at least all test cases can pass on OpenBSD now:

# make test

......

100% tests passed, 0 tests failed out of 54

Total Test time (real) =  17.54 sec

And the test program also outputs normal log:

# ./test
2018-05-02 14:49:03
Running ./a.out
Run on (2 X 2534 MHz CPU s)
***WARNING*** Library was built as DEBUG. Timings may be affected.
---------------------------------------------------------
Benchmark                  Time           CPU Iterations
---------------------------------------------------------
BM_StringCreation         42 ns         42 ns   16761725
BM_StringCopy             13 ns         13 ns   51990267

P.S., if you want to use google/benchmark on OpenBSD, you can consider importing my patch. 🙂

Install vtop on OpenBSD

vtop is a tool which gives you an intuition of CPU/memory usage during a period. This post introduces how to install it on OpenBSD (assuming working asroot account):

(1) Install node.js (refer here):

# pkg_add node

(2) Install vtop:

# npm install -g vtop

(3) Since vtop uses Unicode braille characters to CPU and Memory charts, I need to change OpenBSD to use UTF-8 encoding (refer here and here):

export LC_CTYPE="en_US.UTF-8"

Now, vtop works well on my Cygwin terminal:

Capture

Use “cat” to concatenate files

cat is a neat command to concatenate files on Unix (please see this post). Let’s see some examples:

# cat 1.txt
1
# cat 2.txt
2
# cat 1.txt 2.txt > new.txt
# cat new.txt
1
2
# cat 1.txt 2.txt >> new.txt
# cat new.txt
1
2
1
2

Please notice if the output file is also the input file, the input file content will be truncated first:

# cat 1.txt
1
# cat 2.txt
2
# cat 1.txt 2.txt > 1.txt
# cat 1.txt
2
# cat 2.txt
2

So the correct appending file method is using >>:

# cat 1.txt
1
# cat 2.txt
2
# cat 2.txt >> 1.txt
# cat 1.txt
1
2
# cat 2.txt
2

Check the implementation of cat in OpenBSD, and the core parts are iterating files and reading content from them:

(1) Iterate every file (raw_args):

void
raw_args(char **argv)
{
    int fd;

    fd = fileno(stdin);
    filename = "stdin";
    do {
        if (*argv) {
            if (!strcmp(*argv, "-"))
                fd = fileno(stdin);
            else if ((fd = open(*argv, O_RDONLY, 0)) < 0) {
                warn("%s", *argv);
                rval = 1;
                ++argv;
                continue;
            }
            filename = *argv++;
        }
        raw_cat(fd);
        if (fd != fileno(stdin))
            (void)close(fd);
    } while (*argv);
}

You need to pay attention that cat use - to identify stdin.

(2) Read content from every file (raw_cat):

void
raw_cat(int rfd)
{
    int wfd;
    ssize_t nr, nw, off;
    static size_t bsize;
    static char *buf = NULL;
    struct stat sbuf;

    wfd = fileno(stdout);
    if (buf == NULL) {
        if (fstat(wfd, &sbuf))
            err(1, "stdout");
        bsize = MAXIMUM(sbuf.st_blksize, BUFSIZ);
        if ((buf = malloc(bsize)) == NULL)
            err(1, "malloc");
    }
    while ((nr = read(rfd, buf, bsize)) != -1 && nr != 0)
        for (off = 0; nr; nr -= nw, off += nw)
            if ((nw = write(wfd, buf + off, (size_t)nr)) == 0 ||
                 nw == -1)
                err(1, "stdout");
    if (nr < 0) {
        warn("%s", filename);
        rval = 1;
    }
}

a) When reading the first file, the cat command uses fstat to get the st_blksize attribute of stdout which is “optimal blocksize for I/O”, then allocates the memory:

    ......
    if (buf == NULL) {
        if (fstat(wfd, &sbuf))
            err(1, "stdout");
        bsize = MAXIMUM(sbuf.st_blksize, BUFSIZ);
        if ((buf = malloc(bsize)) == NULL)
            err(1, "malloc");
    }
    ......

b) Read the content of file and write it into stdout:

    ......
    while ((nr = read(rfd, buf, bsize)) != -1 && nr != 0)
        for (off = 0; nr; nr -= nw, off += nw)
            if ((nw = write(wfd, buf + off, (size_t)nr)) == 0 ||
                 nw == -1)
                err(1, "stdout");
    ......

When read returns 0, it means reaching the end of file. If write doesn’t return the number you want to write, it is also considered as an error.

Upgrade OpenBSD from 6.2 to 6.3

Since OpenBSD 6.3 is released, it is time to upgrade 6.2.

The upgrade manual is here. But for newbies like me, I think the most challenge step is booting from ramdisk kernel, bsd.rd: Download and copy it into root file system:

# mv bsd.rd /

Then reboot machine, during prompting boot>, input boot /bsd.rd:

boot> boot /bsd.rd

Then upgrade OpenBSD according to the instructions.

For me, there are 2 important aspects of OpenBSD 6.3:
(1) The vim is upgrade to 8.0.1589, and the strange display issue is fixed when using terminal (Please refer this thread).
(2) My lscpu is included into ports since OpenBSD 6.3, so you can install it on x86 architectures:

# pkg_add lscpu
# lscpu
Architecture:            amd64
Byte Order:              Little Endian
Active CPU(s):           2
Total CPU(s):            2
......

The anatomy of tee program on OpenBSD

The tee command is used to read content from standard input and displays it not only in standard output but also saves to other files simultaneously. The source code of tee in OpenBSD is very simple, and I want to give it an analysis:

(1) tee leverages Singlely-linked List defined in sys/queue.h to manage outputted files (including standard output):

struct list {
    SLIST_ENTRY(list) next;
    int fd;
    char *name;
};
SLIST_HEAD(, list) head;

......

static void
add(int fd, char *name)
{
    struct list *p;
    ......
    SLIST_INSERT_HEAD(&head, p, next);
}

int
main(int argc, char *argv[])
{
    struct list *p;
    ......
    SLIST_INIT(&head);
    ......
    SLIST_FOREACH(p, &head, next) {
        ......
    }
}

To understand it easily, I extract the macros from sys/queue.h and created a file which utilizes the marcos:

#define SLIST_HEAD(name, type)                      \
struct name {                               \
    struct type *slh_first; /* first element */         \
}

#define SLIST_ENTRY(type)                       \
struct {                                \
    struct type *sle_next;  /* next element */          \
}

#define SLIST_FIRST(head)   ((head)->slh_first)
#define SLIST_END(head)     NULL
#define SLIST_EMPTY(head)   (SLIST_FIRST(head) == SLIST_END(head))
#define SLIST_NEXT(elm, field)  ((elm)->field.sle_next)

#define SLIST_FOREACH(var, head, field)                 \
    for((var) = SLIST_FIRST(head);                  \
        (var) != SLIST_END(head);                   \
        (var) = SLIST_NEXT(var, field))

#define SLIST_INIT(head) {                      \
    SLIST_FIRST(head) = SLIST_END(head);                \
}

#define SLIST_INSERT_HEAD(head, elm, field) do {            \
    (elm)->field.sle_next = (head)->slh_first;          \
    (head)->slh_first = (elm);                  \
} while (0)

struct list {
    SLIST_ENTRY(list) next;
    int fd;
    char *name;
};
SLIST_HEAD(, list) head;

int
main(int argc, char *argv[])
{
    struct list *p;
    SLIST_INIT(&head);

    SLIST_INSERT_HEAD(&head, p, next);
    SLIST_FOREACH(p, &head, next) {

    }
}

Then employed gcc‘s pre-processing function:

# gcc -E slist.c
# 1 "slist.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "slist.c"
# 30 "slist.c"
struct list {
 struct { struct list *sle_next; } next;
 int fd;
 char *name;
};
struct { struct list *slh_first; } head;

int
main(int argc, char *argv[])
{
 struct list *p;
 { ((&head)->slh_first) = NULL; };

 do { (p)->next.sle_next = (&head)->slh_first; (&head)->slh_first = (p); } while (0);
 for((p) = ((&head)->slh_first); (p) != NULL; (p) = ((p)->next.sle_next)) {

 }
}

It becomes clear now! The head node in list contains only 1 member: slh_first, which points to the first valid node. For the elements in the list, it is embedded with next struct which uses sle_next to refer to next buddy.

(2) By default, tee will overwrite the output files. If you want to append it, use -a option, and the code is as following:

while (*argv) {
    if ((fd = open(*argv, O_WRONLY | O_CREAT |
        (append ? O_APPEND : O_TRUNC), DEFFILEMODE)) == -1) {
        ......
    } 
    ......
}

(3) The next part is the skeleton of saving content to files:

while ((rval = read(STDIN_FILENO, buf, sizeof(buf))) > 0) {
    SLIST_FOREACH(p, &head, next) {
        n = rval;
        bp = buf;
        do {
            if ((wval = write(p->fd, bp, n)) == -1) {
                ......
            }
            bp += wval;
        } while (n -= wval);
    }
}

We need to iterates every opened file descriptor and write contents into it.

(4) Normally, theinterrupt signal will cause tee exit:

# tee
fdkfkdfjk
fdkfkdfjk
^C
#

To disable this feature, use -i option:

# tee -i
fdhfhd
fdhfhd
^C^C

The corresponding code is like this:

......
case 'i':
    (void)signal(SIGINT, SIG_IGN);
    break;