Enhance libunwind on illumos

In my last post, I mentioned that I used libunwind to debug a memory leak issue recently. I actually run this program on illumos too, but unfortunately met following errors:

$ cat /tmp/backtrace.log
0x401b3b: -- error(unspecified (general) error): unable to obtain symbol name for this frame
0x401b47: -- error(unspecified (general) error): unable to obtain symbol name for this frame
0x401b5e: -- error(unspecified (general) error): unable to obtain symbol name for this frame
0x401757: -- error(unspecified (general) error): unable to obtain symbol name for this frame
0x4016b8: -- error(unspecified (general) error): unable to obtain symbol name for this frame

I used gdb to do single-step debug, then found the libunwind illumos implementation just reuses the Linux APIs:

......
#include "os-linux.h" // using linux header for map_iterator implementation
......

On Linux, the map file is /proc/$pid/maps, but on illumos, the file is /proc/$pid/map. Hmm, the first step is wrong, then no need to progress further.

I tried to see what is in /proc/$pid/map:

$ cat /proc/$$/map
cat: input error on /proc/511/map: Value too large for defined data type

cat couldn’t help. Then resorted to vim:

$ vim /proc/$$/map
^@^@@^@......

Just messy code. Now that it is not plain test, how to decrypt it? Aha, since pmap can display it correctly:

$ pmap $$
511:    -bash
0000000000400000        828K r-x--  /usr/bin/bash
00000000004DE000         20K rw---  /usr/bin/bash
00000000004E3000         60K rw---  /usr/bin/bash
0000000000F09000       1872K rw---    [ heap ]
FFFFFC7FEC110000          4K rwx--    [ anon ]
......

Let me check pmap implementation. After going through pmap code, I found I should use libproc APIs to extract related information. I referred the code from pmap and implemented a total new tdep_get_elf_image API, and it worked:

$ cat /tmp/backtrace.log
0x401b3b: (foo+0x9)
0x401b47: (bar+0x9)
0x401b5e: (main+0x14)
0x401757: (_start_crt+0x87)
0x4016b8: (_start+0x18)

I submitted a Pull Request as well, and hope it can be finally merged.

Building OmniOS requires some memory

I followed Building OmniOS to build OmniOS, but met following errors:

$ pfexec /opt/ooce/bin/omni build_world
......
./buildctl: fork: Not enough space
../lib/functions.sh: line 627: wait_for: No record of process 22984
../lib/functions.sh: fork: Not enough space
Waiting for illumos build...
......

The reason is the virtual machine which I used has only ~1.5G memory, and it is not enough. Increase the memory to 4G, then it is OK to build.

Build OpenSSL on OmniOS

Clone OpenSSL and create a build directory:

# git clone https://github.com/openssl/openssl.git
# cd openssl
# mkdir build
# cd build

List the supported build platforms of openssl:

# ../Configure LIST
BS2000-OSD
BSD-generic32
BSD-generic64
......

Because there is no illumos option, use Solaris instead. At the same time, I want to build in debug mode:

# ../Configure solaris64-x86_64-gcc --debug
# make
# make test

That’s all!

Does CPU’s run queue include current running task on illumos?

vmstat is a useful tool to check system performance on Unix Operating Systems. The following is its output from illumos:

# vmstat
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr ro s0 -- --   in   sy   cs us sy id
 1 0 0 1831316 973680 14 38  0  0  1  0 5314 3 3  0  0 1513  472  391 83  9  7

From the manual, the first column kthr:r’ meaning is:

kthr
                 Report the number of kernel threads in each of the three following states:

                 r
                      the number of kernel threads in run queue

So the question is how to define “run queue” here? After some googling, I find this good post:

The reported operating system run queue includes both processes waiting to be serviced and those currently being serviced.

But from another excerpt from Solaris 9 System Monitoring and Tuning:

The column labeled r under the kthr section is the run queue of processes waiting to get on the CPU(s).

Does illumos‘s run queue include both current running and wait-to-run tasks? I can’t get a clear answer (I know illumos is derived from Solaris, but not sure whether the description in Solaris 9 is still applicable for illumos).

I tried to get help from illumos discussion mailing list, and Joshua gave a detailed explanation. Though he didn’t tell me the final answer, I knew where I should delve into:

(1) The statistic of runque is actually from disp_nrunnable (the code is here):

        ......
        /*
        * First count the threads waiting on kpreempt queues in each
        * CPU partition.
        */
        uint_t cpupart_nrunnable = cpupart->cp_kp_queue.disp_nrunnable;
        cpupart->cp_updates++;
        nrunnable += cpupart_nrunnable;
        ......
        /* Now count the per-CPU statistics. */
        uint_t cpu_nrunnable = cp->cpu_disp->disp_nrunnable;
        nrunnable += cpu_nrunnable;
        ......
        if (nrunnable) {
            sysinfo.runque += nrunnable;
            sysinfo.runocc++;
        }

The definition of disp_nrunnable:

/*
 * Dispatch queue structure.
 */
typedef struct _disp {
......
volatile int    disp_nrunnable; /* runnable threads in cpu dispq */
......
} disp_t;

(2) Checked the code related to disp_nrunnable:

/*
 * disp() - find the highest priority thread for this processor to run, and
 * set it in TS_ONPROC state so that resume() can be called to run it.
 */
static kthread_t *
disp()
{
    ......
    dq = &dp->disp_q[pri];
    tp = dq->dq_first;
    ......
    /*
     * Found it so remove it from queue.
     */
    dp->disp_nrunnable--;
    ......
    thread_onproc(tp, cpup);        /* set t_state to TS_ONPROC */
    ......
}

Checked definition of TS_ONPROC:

/*
 * Values that t_state may assume. Note that t_state cannot have more
 * than one of these flags set at a time.
 */
......
#define TS_RUN      0x02    /* Runnable, but not yet on a processor */
#define TS_ONPROC   0x04    /* Thread is being run on a processor */
......

Umm, it became clear: when kernel picks one task to run, removes it from the dispatch queue, decreases disp_nrunnable by 1, and set task’s state as TS_ONPROC. Based on above analysis, illumos‘s run queue should include only wait-to-run tasks, not current running ones.

To verify my thought, I implemented following simple CPU-intensive program:

# cat foo.c
int main(){
    while(1);
}

My virtual machine has only 1 CPU. Use vmstat when it is in idle state:

# vmstat 1
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr ro s0 -- --   in   sy   cs us sy id
 1 0 0 1829088 970732 8  22  0  0  0  0 2885 1 2  0  0 1523  261  343 85 10  6
 0 0 0 1827484 968844 16 53  0  0  0  0  0  0  0  0  0 2173  330  285  0 10 90
 0 0 0 1827412 968808 0   1  0  0  0  0  0  0  0  0  0 2177  284  258  0 10 90
 0 0 0 1827412 968808 0   0  0  0  0  0  0  0  0  0  0 2167  296  301  0  9 91
 0 0 0 1827412 968808 0   0  0  0  0  0  0  0  0  0  0 2173  278  298  0  9 91
 0 0 0 1827412 968808 0   0  0  0  0  0  0  0  0  0  0 2173  280  283  0  9 91
 0 0 0 1827412 968808 0   0  0  0  0  0  0  0  0  0  0 2175  279  329  0 10 90
 ......;

kthr:r was 0, and cpu:id (the last column) was more than 90. Launched one instance of foo program:

# ./foo &
[1] 668

Checked vmstat again:

# vmstat 1
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr ro s0 -- --   in   sy   cs us sy id
 1 0 0 1829076 970720 8  21  0  0  0  0 2860 1 2  0  0 1528  260  343 84 10  6
 0 0 0 1826220 968100 16 53  0  0  0  0  0  0  0  0  0 1550  334  262 90 10  0
 0 0 0 1826148 968064 0   1  0  0  0  0  0  0  0  0  0 1399  288  264 91  9  0
 0 0 0 1826148 968064 0   0  0  0  0  0  0  0  0  0  0 1283  277  253 92  8  0
 0 0 0 1826148 968064 0   0  0  0  0  0  0  0  0  0  0 1367  281  247 91  9  0
 0 0 0 1826148 968064 0   0  0  0  0  0  0  0  0  0  0 1420  277  239 91  9  0
 0 0 0 1826148 968064 0   0  0  0  0  0  0  0  0  0  0 1371  281  239 91  9  0
 0 0 0 1826148 968064 0   0  0  0  0  0  0  0  0  0  0 1289  278  250 92  8  0
 ......

This time kthr:r was still 0, but cpu:id became 0. Launched another foo:

# ./foo &
[2] 675

Checked vmstat again:

# vmstat 1
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr ro s0 -- --   in   sy   cs us sy id
 1 0 0 1828912 970572 7  20  0  0  0  0 2672 1 1  0  0 1518  244  337 84  9  6
 1 0 0 1825748 967656 16 53  0  0  0  0  0  0  0  0  0 1554  335  284 90 10  0
 1 0 0 1825676 967620 0   1  0  0  0  0  0  0  0  0  0 1497  286  271 90 10  0
 1 0 0 1825676 967620 0   0  0  0  0  0  0  0  0  0  0 1387  309  288 92  8  0
 2 0 0 1825676 967620 0   0  0  0  0  0  0  0  0  0  0 1643  365  291 90 10  0
 1 0 0 1825676 967620 0   0  0  0  0  0  0  0  0  0  0 1446  276  273 91  9  0
 1 0 0 1825676 967620 0   0  0  0  0  0  0  0  0  0  0 1325  645  456 92  8  0
 1 0 0 1825676 967620 0   0  0  0  0  0  0  0  0  0  0 1375  296  300 91  9  0

kthr:r became 1 (yes, it was 2 once). There were 2 CPU-intensive foo program running, whereas kthr:r‘s value was only 1, it means kthr:r excludes the on-CPU task. I also run another foo process, and as expected, kthr:r became 2.

Finally, the test results proved the statement from Solaris 9 System Monitoring and Tuning is right:

The column labeled r under the kthr section is the run queue of processes waiting to get on the CPU(s).

A minimal manual of contributing to illumos

Compared to fire-and-forget mode of submitting patch to OpenBSD, i.e., you just need to copy-and-paste .diff file into mail and send itto the tech@openbsd.org, contributing to illumos seems a little complicated (This is the detailed document). How about I just found a typo and want to submit a patch? This post will guide you step by step.

prerequisites: illumos project uses git as the version control tool. Though its source code is also hosted in githubillumos project doesn’t accept Pull Request directly. You must follow below steps.

(1) Every commit should be associated with an issue, if you found an issue no one noticed before, just create an issue in bug-tracker and assign it to yourself (An issue example).

(2) Clone project, modify code, compile and do some tests. Once it’s done, upload the .diff file to Review Board (A Review Request example) and shoot a mail into developer@lists.illumos.org to invite other people to review it (A review mail example).

(3) Once reviewers think your code is ready, you can follow this guide to prepare patch and send RTI mail to advocates@lists.illumos.org (A RTI example). Depends on the situation, you may also need to further polish your code. Once the patch meets the quality requirements, congratulations, your commit will be merged, like this one.

Happy contributing to illumos!