Be aware of blocking IO APIs

Yesterday, I did a performance analysis for a project. This is the output of the mpstat command for the old version:

And this is the CPU utilisation for the new version:

For the new version, the iowait ratio is remarkably high. After checking the code, I found that the original serialisation step was just an fflush, but for some reason it had been replaced by fdatasync, a blocking API that only returns once the data has been transferred to the storage device. The thread that invokes fdatasync is therefore stuck until the write completes and can’t process any other messages in the meantime. So pay attention when using blocking IO APIs; sometimes they bring results you don’t want.
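To make the difference concrete, here is a minimal sketch of the two flush levels, assuming a Linux/POSIX system. The function names flush_to_kernel and flush_to_device are made up for illustration and are not from the project's code:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: hand stdio's user-space buffer to the kernel.
 * fflush returns as soon as the data is in the page cache, so the
 * caller normally does not wait for the storage device. */
int flush_to_kernel(FILE *fp)
{
    return fflush(fp) == 0 ? 0 : -1;
}

/* Hypothetical helper: same as above, then block in fdatasync until
 * the kernel reports that the file data has reached the device.
 * This is the call that shows up as iowait. */
int flush_to_device(FILE *fp)
{
    if (fflush(fp) != 0)
        return -1;
    return fdatasync(fileno(fp));
}
```

If durability is really needed, one option is to call something like flush_to_device from a dedicated writer thread, so the thread that processes messages never waits on the disk.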

Fix “cannot find libasan_preinit.o” issue in Void Linux

On my Void Linux machine, building with clang and the -fsanitize=address option works:

$ clang -std=c11 -fsanitize=address test.c
$

gcc, however, reports the following error:

$ gcc -std=c11 -fsanitize=address test.c
/usr/bin/ld: cannot find libasan_preinit.o: No such file or directory
/usr/bin/ld: cannot find -lasan
collect2: error: ld returned 1 exit status

The solution is to install libsanitizer-devel:

$ sudo xbps-install -Su libsanitizer-devel
$ gcc -std=c11 -fsanitize=address test.c
$

Check the libraries linked into the executable file:

$ ldd a.out
    linux-vdso.so.1 (0x00007fff1a58b000)
    libasan.so.5 => /usr/lib/libasan.so.5 (0x00007fc1f0f02000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007fc1f0d3f000)
    libdl.so.2 => /usr/lib/../lib/libdl.so.2 (0x00007fc1f0d3a000)
    librt.so.1 => /usr/lib/../lib/librt.so.1 (0x00007fc1f0d2f000)
    libpthread.so.0 => /usr/lib/../lib/libpthread.so.0 (0x00007fc1f0d0e000)
    libstdc++.so.6 => /usr/lib/../lib/libstdc++.so.6 (0x00007fc1f0a99000)
    libm.so.6 => /usr/lib/../lib/libm.so.6 (0x00007fc1f0952000)
    libgcc_s.so.1 => /usr/lib/../lib/libgcc_s.so.1 (0x00007fc1f0938000)
    /lib/ld-linux-x86-64.so.2 (0x00007fc1f193d000)
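Besides ldd, the program itself can report whether it was compiled with the sanitizer. The following is only a sketch: the names BUILT_WITH_ASAN, built_with_asan and asan_status are mine. It relies on clang's __has_feature(address_sanitizer) and on the __SANITIZE_ADDRESS__ macro that gcc defines under -fsanitize=address:

```c
#include <stdio.h>

/* clang (and recent gcc) expose __has_feature(address_sanitizer);
 * gcc defines __SANITIZE_ADDRESS__ when -fsanitize=address is used. */
#if defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define BUILT_WITH_ASAN 1
#  endif
#endif
#if !defined(BUILT_WITH_ASAN) && defined(__SANITIZE_ADDRESS__)
#  define BUILT_WITH_ASAN 1
#endif
#ifndef BUILT_WITH_ASAN
#  define BUILT_WITH_ASAN 0
#endif

/* Returns 1 if this translation unit was built with ASan, else 0. */
int built_with_asan(void)
{
    return BUILT_WITH_ASAN;
}

/* Human-readable form, handy for a startup log line. */
const char *asan_status(void)
{
    return built_with_asan() ? "AddressSanitizer enabled"
                             : "AddressSanitizer disabled";
}
```

Printing asan_status() at startup makes it obvious whether a given binary is the instrumented one.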

Reference:
How to use gcc with sanitizers in Void Linux?

The alignment of dynamically allocated memory

See the Notes for max_align_t:

Pointers returned by allocation functions such as malloc are suitably aligned for any object, which means they are aligned at least as strictly as max_align_t.

It means that dynamically allocated memory is guaranteed to be aligned to at least alignof(max_align_t) bytes.
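This guarantee can be checked at run time. The sketch below uses two helper names of my own, is_aligned and malloc_honours_max_align. It only requests sizes that are multiples of alignof(max_align_t), because some allocators are known to use a weaker alignment for very small requests:

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if p is a multiple of alignment, else 0. */
int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}

/* Hypothetical check: allocate `count` blocks of growing size and
 * verify each pointer is aligned to alignof(max_align_t).
 * Returns 1 if every allocation passed, 0 otherwise. */
int malloc_honours_max_align(size_t count)
{
    for (size_t i = 1; i <= count; i++) {
        void *p = malloc(i * alignof(max_align_t));
        if (p == NULL)
            return 0;
        int ok = is_aligned(p, alignof(max_align_t));
        free(p);
        if (!ok)
            return 0;
    }
    return 1;
}
```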

See the Notes for aligned_alloc:

Passing a size which is not an integral multiple of alignment or an alignment which is not valid or not supported by the implementation causes the function to fail and return a null pointer (C11, as published, specified undefined behavior in this case, this was corrected by DR 460).

It means that the set of alignments aligned_alloc accepts is implementation-defined.

Here is a simple program to test the behavior of aligned_alloc on macOS and Linux (x86_64):

$ cat align.c
#include <stdalign.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

int main()
{
    printf("alignof(max_align_t)=%zu\n\n", alignof(max_align_t));

    size_t size = 1024;
    size_t align[] = {1, 2, 4, 8, 16, 32, 64};
    for (size_t i = 0; i < sizeof(align) / sizeof(align[0]); i++)
    {
        void *p = aligned_alloc(align[i], size);
        printf("align=%zu, pointer is %p\n", align[i], p);
        free(p);
    }
}

Build and run it on macOS:

$ cc align.c -o align
$ ./align
alignof(max_align_t)=16

align=1, pointer is 0x0
align=2, pointer is 0x0
align=4, pointer is 0x0
align=8, pointer is 0x7fbd48801600
align=16, pointer is 0x7fbd48801600
align=32, pointer is 0x7fbd48801600
align=64, pointer is 0x7fbd48801600

In Linux (X86_64):

$ cc align.c -o align
$ ./align
alignof(max_align_t)=16

align=1, pointer is 0x5645aec676b0
align=2, pointer is 0x5645aec676b0
align=4, pointer is 0x5645aec676b0
align=8, pointer is 0x5645aec676b0
align=16, pointer is 0x5645aec676b0
align=32, pointer is 0x5645aec67ac0
align=64, pointer is 0x5645aec67f40

On both macOS and Linux (x86_64), alignof(max_align_t) is 16, so memory allocated from the free store is at least 16-byte aligned. On macOS, aligned_alloc returns a null pointer when the requested alignment is smaller than 8 bytes, whilst Linux (x86_64) accepts those smaller alignments too.
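Given these results, a thin wrapper can smooth over the difference. This is only a sketch under my own assumptions: the name aligned_alloc_compat is invented, and the sizeof(void *) floor is inferred from the macOS output above rather than taken from any documentation:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper around aligned_alloc:
 *  - rejects alignments that are not a power of two,
 *  - raises small alignments to sizeof(void *) (assumed minimum),
 *  - rounds size up to a multiple of the alignment, as the
 *    standard requires.
 * Returns NULL on failure; release the result with free(). */
void *aligned_alloc_compat(size_t alignment, size_t size)
{
    if (alignment == 0 || (alignment & (alignment - 1)) != 0)
        return NULL;
    if (alignment < sizeof(void *))
        alignment = sizeof(void *);
    if (size > SIZE_MAX - (alignment - 1))
        return NULL; /* rounding up would overflow */

    size_t rounded = (size + alignment - 1) & ~(alignment - 1);
    if (rounded == 0)
        rounded = alignment; /* avoid a zero-sized request */
    return aligned_alloc(alignment, rounded);
}
```

With this wrapper, a request such as aligned_alloc_compat(1, 100) is turned into aligned_alloc(8, 104) on a 64-bit system instead of failing.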

P.S., the code can be downloaded here.

Be aware of huge pages in Linux

On a freshly installed Linux machine, I found that my application crashed unexpectedly:

==5611==AddressSanitizer's allocator is terminating the process instead of returning 0
==5611==If you don't like this behavior set allocator_may_return_null=1
==5611==AddressSanitizer CHECK failed: ../../../../libsanitizer/sanitizer_common/sanitizer_allocator.cc:216 "((0)) != (0)" (0x0, 0x0)
    #0 0x7f7dc13b94a2  (/lib64/libasan.so.5+0xf94a2)
    #1 0x7f7dc13d60a9 in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) (/lib64/libasan.so.5+0x1160a9)
    #2 0x7f7dc13bf3d6  (/lib64/libasan.so.5+0xff3d6)
    #3 0x7f7dc13bf43a  (/lib64/libasan.so.5+0xff43a)
    #4 0x7f7dc12e9319  (/lib64/libasan.so.5+0x29319)
    #5 0x7f7dc12e6f56  (/lib64/libasan.so.5+0x26f56)
    #6 0x7f7dc13adeba in malloc (/lib64/libasan.so.5+0xedeba)
......

After debugging, I found that the root cause was insufficient memory. This is the memory usage in the idle state:

Memory usage is this high even in the idle state because of the huge pages configuration:

$ cat /etc/default/grub
......
GRUB_CMDLINE_LINUX="rd.lvm.lv=centos/root rhgb quiet default_hugepagesz=1G hugepagesz=1G hugepages=100"

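This configuration reserves 100 huge pages of 1 GiB each at boot, that is 100 GiB that ordinary allocations can no longer use. After reducing the number of reserved huge pages, the application runs smoothly. About huge pages, this post is a good reference.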
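To see how much memory the default huge page pool takes, the HugePages_Total and Hugepagesize fields of /proc/meminfo can be multiplied. Below is a rough sketch; the function names meminfo_value and hugepage_reserved_kb are mine, and it only looks at the default huge page size:

```c
#include <stdio.h>
#include <string.h>

/* Returns the number that follows "key:" in a /proc/meminfo-style
 * stream, or -1 if the key is not present. */
long long meminfo_value(FILE *fp, const char *key)
{
    char line[256];
    size_t keylen = strlen(key);

    rewind(fp);
    while (fgets(line, sizeof line, fp) != NULL) {
        long long value;
        if (strncmp(line, key, keylen) == 0 && line[keylen] == ':' &&
            sscanf(line + keylen + 1, "%lld", &value) == 1)
            return value;
    }
    return -1;
}

/* Memory reserved by the default huge page pool, in kB:
 * HugePages_Total (a page count) times Hugepagesize (in kB). */
long long hugepage_reserved_kb(FILE *fp)
{
    long long total = meminfo_value(fp, "HugePages_Total");
    long long size_kb = meminfo_value(fp, "Hugepagesize");

    if (total < 0 || size_kb < 0)
        return -1;
    return total * size_kb;
}
```

Opening /proc/meminfo with fopen and passing the stream to hugepage_reserved_kb gives the reserved amount; with 100 pages of 1048576 kB that is 104857600 kB.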

Enable Pressure Stall Information (PSI) on Void Linux

There are two kernel configuration options for the Pressure Stall Information (PSI) feature: CONFIG_PSI, which controls whether PSI is supported at all, and CONFIG_PSI_DEFAULT_DISABLED, which controls whether PSI is disabled by default. Here is the configuration on my Void Linux machine:

$ zgrep PSI /proc/config.gz
CONFIG_PSI=y
CONFIG_PSI_DEFAULT_DISABLED=y

We can see that PSI is compiled in but disabled by default, which leads to the following error:

$ cat /proc/pressure/cpu
cat: /proc/pressure/cpu: Operation not supported

To enable this feature, add “psi=1” to GRUB_CMDLINE_LINUX_DEFAULT in the /etc/default/grub file:

$ cat /etc/default/grub
#
# Configuration file for GRUB.
#
......
GRUB_CMDLINE_LINUX_DEFAULT="loglevel=4 slub_debug=P page_poison=1 psi=1"

Then regenerate the GRUB configuration and reboot:

$ sudo grub-mkconfig -o /boot/grub/grub.cfg
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-5.2.13_1
Found initrd image: /boot/initramfs-5.2.13_1.img
done
$ sudo reboot

PSI is now enabled:

$ cat /proc/pressure/cpu
some avg10=0.00 avg60=0.00 avg300=0.00 total=890919
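A monitoring program can read the same line and pull the numbers out. This is a small sketch that parses one line in the format shown above; struct psi_line and parse_psi_line are names I made up. The avg fields are percentages over 10, 60 and 300 seconds, and total is the accumulated stall time in microseconds:

```c
#include <stdio.h>

/* One line of a /proc/pressure/<resource> file. */
struct psi_line {
    char kind[8];             /* "some" or "full" */
    double avg10;             /* % of time stalled, last 10 s */
    double avg60;             /* % of time stalled, last 60 s */
    double avg300;            /* % of time stalled, last 300 s */
    unsigned long long total; /* total stall time in microseconds */
};

/* Hypothetical parser: fills *out from a line such as
 * "some avg10=0.00 avg60=0.00 avg300=0.00 total=890919".
 * Returns 0 on success, -1 if the line does not match. */
int parse_psi_line(const char *line, struct psi_line *out)
{
    int n = sscanf(line, "%7s avg10=%lf avg60=%lf avg300=%lf total=%llu",
                   out->kind, &out->avg10, &out->avg60, &out->avg300,
                   &out->total);
    return n == 5 ? 0 : -1;
}
```

Reading /proc/pressure/cpu with fgets and passing each line to parse_psi_line is enough to sample the values periodically.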

References:
Getting Started with PSI;
Where are the current kernel build options stored?;
grub.