Differentiate “application operator” and “function composition” in Haskell

I don’t know about others, but I sometimes confuse $ (the “application operator”) with . (“function composition”) in Haskell, so I want to write a short summary to tell them apart.

Check the type of these two operators:

> :t ($)
($) :: (a -> b) -> a -> b
> :t (.)
(.) :: (b -> c) -> (a -> b) -> a -> c

For the infix operator $, the left operand must be a function of type a -> b, and the right operand must be that function’s argument of type a; the result has type b. Check the following example:

countWords :: String -> Int
countWords s = length $ words s

words returns a list of Strings, which becomes the argument of length; length then returns how many words the string has. Run countWords in ghci:

> let countWords s = length $ words s
> countWords "a b c"
3

Now look at the . operator; writing its type with explicit parentheses may make it easier to comprehend:

(.) :: (b -> c) -> (a -> b) -> (a -> c)

We can think of both operands of . as functions, and the return value is also a function. However, the three functions are not arbitrary: their argument and result types must line up. Modify countWords and run it in ghci:

> let countWords s = length . words s

<interactive>:12:29: error:
    * Couldn't match expected type `a -> [a0]'
                  with actual type `[String]'
    * Possible cause: `words' is applied to too many arguments
      In the second argument of `(.)', namely `words s'
      In the expression: length . words s
      In an equation for `countWords': countWords s = length . words s
    * Relevant bindings include
        countWords :: String -> a -> Int (bound at <interactive>:12:5)

Oops! An error is reported. The reason is that function application (the space “ ”) has the highest precedence, so words s is evaluated first:

> :t words
words :: String -> [String]

The return value is a [String], not a function, so it doesn’t satisfy the type of ., which requires both operands to be functions. Change the definition of countWords:

> let countWords s = (length . words) s
> countWords "a b c"
3

This time it works like a charm! For the same reason, the second operand of $ must match the input parameter of the first operand. Passing the function words itself to length does not type-check, because length expects a Foldable container:

> let countWords = length $ words

<interactive>:18:18: error:
    * No instance for (Foldable ((->) String))
        arising from a use of `length'
    * In the expression: length $ words
      In an equation for `countWords': countWords = length $ words

It is time to wrap up: the operands of . are functions, and we can use . to chain many functions into a final one, which serves as the left operand of $; feed it one argument and it produces the final result. Like this:

> length . words $ "1 2 3"

The same as:

> length $ words $ "1 2 3"

Porting google/benchmark into OpenBSD

I want to use google/benchmark on OpenBSD, but found that it supports many platforms while lacking OpenBSD (the code is here):

#if defined(__CYGWIN__)
#elif defined(_WIN32)
#elif defined(__APPLE__)
  #include "TargetConditionals.h"
  #if defined(TARGET_OS_MAC)
    #if defined(TARGET_OS_IPHONE)
      #define BENCHMARK_OS_IOS 1
#elif defined(__FreeBSD__)
#elif defined(__NetBSD__)
#elif defined(__linux__)
#elif defined(__native_client__)
#elif defined(EMSCRIPTEN)
#elif defined(__rtems__)
#elif defined(__Fuchsia__)
#elif defined (__SVR4) && defined (__sun)

Although it builds successfully on OpenBSD, “make test” reports some failures:

# make test


91% tests passed, 5 tests failed out of 54

Total Test time (real) =  40.18 sec

The following tests FAILED:
          1 - benchmark (Child aborted)
         38 - options_benchmarks (Child aborted)
         39 - basic_benchmark (Child aborted)
         43 - fixture_test (Child aborted)
         47 - reporter_output_test (Child aborted)
Errors while running CTest
*** Error 8 in /root/Project/benchmark/build (Makefile:130 'test': /usr/local/bin/ctest --force-new-ctest-process --exclude-regex "CMake.Fil...)

Check the following simple test file:

# cat test.cc
#include <benchmark/benchmark.h>

static void BM_StringCreation(benchmark::State& state) {
  for (auto _ : state)
    std::string empty_string;
}
// Register the function as a benchmark
BENCHMARK(BM_StringCreation);

// Define another benchmark
static void BM_StringCopy(benchmark::State& state) {
  std::string x = "hello";
  for (auto _ : state)
    std::string copy(x);
}
BENCHMARK(BM_StringCopy);

BENCHMARK_MAIN();

Compile and run it:

# c++ -I/usr/local/include -L/usr/local/lib -std=c++11 test.cc -o test -lbenchmark
root:/root/Project# ./test
failed to open /proc/cpuinfo
2018-05-02 17:14:11
Running ./test
Run on (-1 X 2545.25 MHz CPU )
***WARNING*** Library was built as DEBUG. Timings may be affected.
Benchmark                  Time           CPU Iterations
BM_StringCreation         40 ns         40 ns   17597275
BM_StringCopy             13 ns         13 ns   53511385

“failed to open /proc/cpuinfo”? “-1 X 2545.25 MHz CPU”? The output is messy, so I decided to port it to OpenBSD:

(1) The first thing to do is to define BENCHMARK_OS_OPENBSD in src/internal_macros.h:

#elif defined(__OpenBSD__)
  #define BENCHMARK_OS_OPENBSD 1

(2) The second task is to fill in the values of CPUInfo’s members:

struct CPUInfo {
  int num_cpus;
  double cycles_per_second;
  std::vector<CacheInfo> caches;
  bool scaling_enabled;

Check CPUInfo’s constructor:

    : num_cpus(GetNumCPUs()),
      scaling_enabled(CpuScalingEnabled(num_cpus)) {}

So I need to implement GetNumCPUs(), GetCPUCyclesPerSecond(), etc. For FreeBSD and NetBSD, benchmark uses the sysctlbyname function:

if (sysctlbyname(Name.c_str(), nullptr, &CurBuffSize, nullptr, 0) == -1)
    return ValueUnion();

Unfortunately, OpenBSD doesn’t support sysctlbyname, so I use sysctl to get the number of CPUs and the CPU speed:

  if ((Name == "hw.ncpu") || (Name == "hw.cpuspeed")) {
    ValueUnion buff(sizeof(int));
    // mib holds the two-level MIB name for Name, e.g. {CTL_HW, HW_NCPU}
    if (sysctl(mib, 2, buff.data(), &buff.Size, nullptr, 0) == -1) {
      return ValueUnion();
    }
    return buff;
  }
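
The sysctl call above can be tried outside of benchmark with a small standalone sketch. The names HwKey, ReadHwInt and PrintCpuSummary are hypothetical and are not part of google/benchmark or of my patch; the sketch only assumes the OpenBSD MIB constants CTL_HW, HW_NCPU and HW_CPUSPEED from <sys/sysctl.h>, and it returns -1 on any other system:

```cpp
#include <cstddef>
#include <iostream>

#if defined(__OpenBSD__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

// Hypothetical key type for this sketch; not part of google/benchmark.
enum class HwKey { NumCpus, CpuSpeedMHz };

// Returns the requested hw.* value, or -1 on failure or on a non-OpenBSD system.
int ReadHwInt(HwKey key) {
#if defined(__OpenBSD__)
  // Two-level MIB name: hw.ncpu or hw.cpuspeed.
  int mib[2] = {CTL_HW, key == HwKey::NumCpus ? HW_NCPU : HW_CPUSPEED};
  int value = 0;
  std::size_t len = sizeof(value);
  if (sysctl(mib, 2, &value, &len, nullptr, 0) == -1) return -1;
  return value;
#else
  (void)key;  // these MIB names are OpenBSD-specific
  return -1;
#endif
}

// Prints the CPU count and speed (MHz) that ReadHwInt reports.
void PrintCpuSummary() {
  std::cout << "cpus=" << ReadHwInt(HwKey::NumCpus)
            << " speed_mhz=" << ReadHwInt(HwKey::CpuSpeedMHz) << '\n';
}
```

Calling PrintCpuSummary() from main on OpenBSD should print the same two numbers that appear in the “Run on (N X M MHz CPU s)” line of the benchmark output; on other systems both values are -1.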

As for cache information, OpenBSD doesn’t provide it ready-made, and I don’t think it is worth using a workaround to get it, e.g., the CPUID instruction on x86 (if you really want to know it, lscpu will give you a hand on x86 platforms). As for whether the CPU supports scaling or not, I can’t find any documentation for OpenBSD, so I just leave it as it is.

The whole patch is here. I was not sure whether Google would merge it or not (update: it has already been merged), but at least all test cases pass on OpenBSD now:

# make test


100% tests passed, 0 tests failed out of 54

Total Test time (real) =  17.54 sec

And the test program now prints a normal log as well:

# ./test
2018-05-02 14:49:03
Running ./a.out
Run on (2 X 2534 MHz CPU s)
***WARNING*** Library was built as DEBUG. Timings may be affected.
Benchmark                  Time           CPU Iterations
BM_StringCreation         42 ns         42 ns   16761725
BM_StringCopy             13 ns         13 ns   51990267

P.S., if you want to use google/benchmark on OpenBSD, you can consider importing my patch. 🙂

First taste of google/benchmark

Today, I tried google/benchmark. The build process is the usual CMake routine:

# git clone https://github.com/google/benchmark.git
# git clone https://github.com/google/googletest.git benchmark/googletest
# cd benchmark/
# mkdir build
# cd build/
# cmake ..
# make

But “make test” generates an error (please refer to this issue). Write a simple test.cc:

# cat test.cc
#include <benchmark/benchmark.h>

static void BM_StringCreation(benchmark::State& state) {
  for (auto _ : state)
    std::string empty_string;
}
// Register the function as a benchmark
BENCHMARK(BM_StringCreation);

// Define another benchmark
static void BM_StringCopy(benchmark::State& state) {
  std::string x = "hello";
  for (auto _ : state)
    std::string copy(x);
}
BENCHMARK(BM_StringCopy);

BENCHMARK_MAIN();

Build and run it:

# g++ -pthread test.cc -o test -lbenchmark
# ./test
2018-04-30 18:14:59
Running ./test
Run on (2 X 2394 MHz CPU s)
CPU Caches:
  L1 Data 32K (x2)
  L1 Instruction 32K (x2)
  L2 Unified 4096K (x1)
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
***WARNING*** Library was built as DEBUG. Timings may be affected.
Benchmark                  Time           CPU Iterations
BM_StringCreation         10 ns         10 ns   71084124
BM_StringCopy             52 ns         52 ns   10000000

Maybe I will dig into this project further.

Small examples show copy elision in C++

“Return value optimization” is a common form of copy elision, whose goal is to eliminate unnecessary copying of objects. Check the following example:

#include <iostream>
using namespace std;

class Foo {
public:
    Foo() {cout<<"default constructor is called"<<endl;}
    Foo(const Foo& other) {cout<<"copy constructor is called"<<endl;}
    Foo(Foo&& other) {cout<<"move constructor is called"<<endl;}
};

Foo func()
{
    Foo f;
    return f;
}

int main()
{
    Foo a = func();
    return 0;
}

The compiler is clang 5.0.1:

# c++ --version
OpenBSD clang version 5.0.1 (tags/RELEASE_501/final) (based on LLVM 5.0.1)
Target: amd64-unknown-openbsd6.3
Thread model: posix
InstalledDir: /usr/bin

Build and execute it:

# c++ -std=c++11 test.cpp
# ./a.out
default constructor is called

You may expect Foo’s copy constructor to be called at least once:

Foo a = func();

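However, the compiler is allowed to elide this copy: it knows the content of the local variable f of func() will end up in a, so it reserves the storage for a first and lets func() construct f directly in it. Conceptually, it works like this:

Foo a;      // storage for a only; the constructor has not run yet
func(&a);   // func() constructs its local f directly in that storage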

Let’s modify the program:

#include <iostream>
using namespace std;

class Foo {
public:
    Foo() {cout<<"default constructor is called"<<endl;}
    Foo(const Foo& other) {cout<<"copy constructor is called"<<endl;}
    Foo(Foo&& other) {cout<<"move constructor is called"<<endl;}
};

Foo func(Foo f)
{
    return f;
}

int main()
{
    Foo a;
    Foo b = func(a);
    return 0;
}

This time, f is a by-value parameter of func() rather than a local variable. Build and run it:

# c++ -std=c++11 test.cpp
# ./a.out
default constructor is called
copy constructor is called
move constructor is called

The parameter f of func() is constructed from a by the copy constructor:

Foo b = func(a);

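In the statement above, func(a) returns its parameter f by value. Copy elision is not permitted for a function parameter, but the return statement treats f as an rvalue, so Foo’s move constructor is used to construct the returned object, which becomes b. If Foo doesn’t define a move constructor: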

Foo(Foo&& other) {cout<<"move constructor is called"<<endl;}

Then “Foo b = func(a);” will use the copy constructor to initialize b.
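
The two cases can also be checked by counting constructor calls instead of reading the printed lines. This is a minimal sketch, assuming a hypothetical Counted type and the helper names make_local, pass_through and report, none of which appear in the programs above:

```cpp
#include <iostream>

// Hypothetical instrumented type: counts how often each constructor runs.
struct Counted {
    static int copies;
    static int moves;
    Counted() = default;
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) noexcept { ++moves; }
};
int Counted::copies = 0;
int Counted::moves = 0;

// Returning a local variable: the compiler may elide the move entirely.
Counted make_local() {
    Counted c;
    return c;
}

// Returning a by-value parameter: elision is not allowed, so it is moved.
Counted pass_through(Counted c) {
    return c;
}

// Prints the counters collected so far.
void report(const char* label) {
    std::cout << label << ": copies=" << Counted::copies
              << " moves=" << Counted::moves << '\n';
}
```

With “Counted a; Counted b = pass_through(a);”, exactly one copy is counted (a into the parameter) and at least one move (the parameter into the result). With “Counted c = make_local();”, no copy is counted; whether a move is counted depends on whether the compiler applies the elision.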

A performance issue about copy constructor

Over the past two days, I debugged a performance issue related to a copy constructor: class A has a member b of type NTL::ZZX:

class A
{
    enum class type {zzx_t, ...} t;
    NTL::ZZX b;
};

When member t’s value is zzx_t, b is valid. Otherwise, b’s content is stale and should be ignored.

There are two ways to implement A’s copy constructor:

A(const A& other) : t(other.t), b(other.b) {}

In this method, NTL::ZZX’s copy constructor is always called, regardless of the value of t.

A(const A& other) : t(other.t)
{
    if (t == type::zzx_t)
        b = other.b;
}

In this case, NTL::ZZX’s default constructor is called first, and NTL::ZZX’s copy assignment operator is invoked only if t equals zzx_t.

NTL::ZZX’s default constructor does almost nothing, and its copy constructor does roughly the same work as its copy assignment operator. But in our scenario, t’s value is not zzx_t about 80 percent of the time, so the second implementation of the copy constructor gets a big performance boost compared to the first one.
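
The effect of the second implementation can be reproduced without NTL. This is a minimal sketch, assuming a hypothetical Poly type as a stand-in for NTL::ZZX (cheap default constructor, costly copy) and an extra other_t enumerator in place of the elided ones; it counts deep copies rather than measuring time:

```cpp
#include <cstddef>
#include <vector>

// Stand-in for NTL::ZZX (an assumption for illustration only):
// default construction is cheap, copying duplicates all coefficients.
struct Poly {
    static std::size_t deep_copies;
    std::vector<long> coeffs;
    Poly() = default;
    explicit Poly(std::size_t n) : coeffs(n, 1) {}
    Poly(const Poly& o) : coeffs(o.coeffs) { ++deep_copies; }
    Poly& operator=(const Poly& o) {
        coeffs = o.coeffs;
        ++deep_copies;
        return *this;
    }
};
std::size_t Poly::deep_copies = 0;

class A {
public:
    enum class type { zzx_t, other_t };  // other_t is hypothetical

    A(type t_, std::size_t n) : t(t_), b(n) {}

    // Second implementation: default-construct b, copy it only when it is valid.
    A(const A& other) : t(other.t) {
        if (t == type::zzx_t)
            b = other.b;
    }

    type t;
    Poly b;
};
```

Copying an A whose t is zzx_t performs one deep copy of b, while copying an A with any other tag performs none and leaves b empty, which is where the saving comes from when most objects are not zzx_t.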