Linux | Nan Xiao's Blog

The doubt about “perf record -p pid sleep 10”

In Brendan Gregg’s awesome perf Examples, there is a command that samples for a fixed amount of time, e.g., 10 seconds:

# Sample on-CPU functions for the specified PID, at 99 Hertz, for 10 seconds:
perf record -F 99 -p PID sleep 10

I have a doubt about it: wouldn’t perf sample two processes, the one with the specified PID and the sleep task? Yes, I know sleep does almost nothing and should hardly affect the sampling, but as someone who likes to find the root cause, I want to know whether sleep is sampled or not.

First of all, check the manual page. Unfortunately, I couldn’t find the answer there.

Secondly, experiment. I wrote a simple program which just prints something forever:

$ cat a.c
#include <stdio.h>
int main()
{
    while (1)
        printf("Hello\n");
}

Compile and use perf record to profile it for a while:

$ gcc a.c
$ sudo perf record ./a.out
Hello
Hello
......
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.106 MB perf.data (2409 samples) ]

Check perf.data with perf report:

Samples: 2K of event 'cycles:ppp', Event count (approx.): 853929764
Overhead  Command  Shared Object      Symbol
   9.66%  a.out    [kernel.kallsyms]  [k] retint_userspace_restore_args
   7.69%  a.out    [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
   4.29%  a.out    [kernel.kallsyms]  [k] n_tty_write
   3.75%  a.out    [kernel.kallsyms]  [k] system_call_after_swapgs
   3.63%  a.out    [kernel.kallsyms]  [k] irq_return
   3.01%  a.out    [kernel.kallsyms]  [k] native_write_msr_safe
......

Nothing special; this just makes sure that the a.out process can be profiled. Next, launch htop (its PID is 39741 here), then run the following command for several seconds:

$ sudo perf record -p 39741 ./a.out
Hello
Hello
......
^CHello
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.019 MB perf.data (98 samples) ]

a.out ran smoothly, but from perf report:

Samples: 98  of event 'cycles:ppp', Event count (approx.): 77850922
Overhead  Command  Shared Object       Symbol
   1.02%  htop     [kernel.kallsyms]   [k] sys_read
   1.02%  htop     [kernel.kallsyms]   [k] getname
   1.02%  htop     [kernel.kallsyms]   [k] __d_lookup_rcu
   1.02%  htop     [kernel.kallsyms]   [k] generic_permission
   1.02%  htop     [kernel.kallsyms]   [k] do_task_stat
......

I couldn’t see any samples from a.out; all of them are from htop. So it seems a.out isn’t profiled in this scenario.

Third, check the perf source code to verify my speculation. Roughly speaking, “perf record [options] command [options]” forks a child process to run command, and this happens in evlist__prepare_workload:

int evlist__prepare_workload(struct evlist *evlist, struct target *target, const char *argv[],
                 bool pipe_output, void (*exec_error)(int signo, siginfo_t *info, void *ucontext))
{
    ......
    if (target__none(target)) {
        if (evlist->core.threads == NULL) {
            fprintf(stderr, "FATAL: evlist->threads need to be set at this point (%s:%d).\n",
                __func__, __LINE__);
            goto out_close_pipes;
        }
        perf_thread_map__set_pid(evlist->core.threads, 0, evlist->workload.pid);
    }
    ......
}

For target__none:

static inline bool target__none(struct target *target)
{
    return !target__has_task(target) && !target__has_cpu(target);
}

Because I have already specified the process of interest with -p, target__has_task(target) returns true, which makes target__none() return false. Therefore perf_thread_map__set_pid isn’t invoked and the forked process isn’t added to evlist->core.threads. That’s why the command process isn’t profiled. Similarly, if target__has_cpu(target) returns true, the command process isn’t tracked either.

In summary, to resolve a doubt: read the code, write code to verify, and repeat. That’s all.

Mount ftrace control directory on Void Linux

On Void Linux, /sys/kernel/tracing is empty, which means the ftrace control directory is not mounted automatically, so you need to mount it manually:

# mount -t tracefs nodev /sys/kernel/tracing
# cd /sys/kernel/tracing
# ls
README                      free_buffer               set_event               trace_clock
available_events            function_profile_enabled  set_event_notrace_pid   trace_marker
available_filter_functions  hwlat_detector            set_event_pid           trace_marker_raw
......

Fix “Address 0xxxxxxxxx out of bounds” issue

Yesterday, I came across a crash where the program accessed “out of bounds” memory:

......
func = 0x7ffff7eb1dc8 <Address 0x7ffff7eb1dc8 out of bounds>
......

After some debugging, I found the cause: the program uses dlopen to load a dynamic library and records a memory address inside the library, but still reads that memory after calling dlclose on it.

Introduction of Unix pseudo-random number functions

I couldn’t find a detailed introduction to the Unix pseudo-random number functions, so I decided to write one myself. Please note that many platforms, such as Linux, provide reentrant versions. Since the non-reentrant versions simply invoke the reentrant ones with global data (refer to glibc), I will use the non-reentrant versions for demonstration in this post.

(1) Call random() only:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    for (int i = 0; i < 5; i++)
    {
        printf("%ld\n", random());
    }
    return 0;
}

Compile and run it twice:

# gcc random.c
# ./a.out
1804289383
846930886
1681692777
1714636915
1957747793
# ./a.out
1804289383
846930886
1681692777
1714636915
1957747793

The outputs are the same, as expected, because the numbers are only “pseudo” random.

(2) Call srandom(1). According to spec:

Like rand(), random() shall produce by default a sequence of numbers that can be duplicated by calling srandom() with 1 as the seed.

Verify it:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    srandom(1);
    for (int i = 0; i < 5; i++)
    {
        printf("%ld\n", random());
    }
    return 0;
}

Compile and run:

# gcc random.c
# ./a.out
1804289383
846930886
1681692777
1714636915
1957747793

Yes, the output is the same as in the first test (call random() only).

(3) Call srandom(2):

#include <stdio.h>
#include <stdlib.h>

int main()
{
    srandom(2);
    for (int i = 0; i < 5; i++)
    {
        printf("%ld\n", random());
    }
    return 0;
}

Build and run it:

# gcc random.c
# ./a.out
1505335290
1738766719
190686788
260874575
747983061

Hmm, this time a different output is generated.

(4) Let’s see an example of using initstate():

#include <stdio.h>
#include <stdlib.h>

int main()
{
    unsigned int seed = 1;
    char state[128];

    if (initstate(seed, state, sizeof(state)) == NULL)
    {
        printf("initstate error\n");
        return 1;
    }

    for (int i = 0; i < 5; i++)
    {
        printf("%ld\n", random());
    }
    return 0;
}

From the spec:

The initstate() function allows a state array, pointed to by the state argument, to be initialized for future use.

So how does initstate() initialize the state array? Let’s look at the glibc implementation:

  ......
  int32_t *state = &((int32_t *) arg_state)[1]; /* First location.  */
  /* Must set END_PTR before srandom.  */
  buf->end_ptr = &state[degree];

  buf->state = state;

  __srandom_r (seed, buf);
  ......

initstate() actually uses the same seeding routine as srandom() (__srandom_r in glibc) to initialize the state array. Build and run the program:

# gcc random.c
# ./a.out
1804289383
846930886
1681692777
1714636915
1957747793

The output is the same as in the first test (call random() only), which complies with another quote from the spec:

If initstate() has not been called, then random() shall behave as though initstate() had been called with seed=1 and size=128.

Change seed from 1 to 2:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    unsigned int seed = 2;
    char state[128];

    if (initstate(seed, state, sizeof(state)) == NULL)
    {
        printf("initstate error\n");
        return 1;
    }

    for (int i = 0; i < 5; i++)
    {
        printf("%ld\n", random());
    }
    return 0;
}

Compile and run again:

# gcc random.c
# ./a.out
1505335290
1738766719
190686788
260874575
747983061

This time, the output is the same as in the third test (call srandom(2) only). Of course, you can also change the size of the state array and modify the seed at run time, like this:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    unsigned int seed = 2;
    char state[64];

    if (initstate(seed, state, sizeof(state)) == NULL)
    {
        printf("initstate error\n");
        return 1;
    }

    for (int i = 0; i < 5; i++)
    {
        srandom(seed + i);
        printf("%ld\n", random());
    }
    return 0;
}

(5) Finally, let’s look at setstate(). Check the following example:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    unsigned seed = 1;
    char state_1[128], state_2[128];

    if (initstate(seed, state_1, sizeof(state_1)) == NULL)
    {
        printf("initstate error\n");
        return 1;
    }

    seed = 2;
    if (initstate(seed, state_2, sizeof(state_2)) == NULL)
    {
        printf("initstate error\n");
        return 1;
    }

    for (int i = 0; i < 5; i++)
    {
        printf("%ld\n", random());
    }

    if (setstate(state_1) == NULL)
    {
        printf("setstate error\n");
        return 1;
    }

    for (int i = 0; i < 5; i++)
    {
        printf("%ld\n", random());
    }

    return 0;
}

Compile and run the program:

# gcc random.c
# ./a.out
1505335290
1738766719
190686788
260874575
747983061
1804289383
846930886
1681692777
1714636915
1957747793

You can see the first 5 numbers are the same as those produced after srandom(2), whilst the last 5 numbers are the same as those produced after srandom(1).

Last but not least, make sure the state memory stays valid for as long as these pseudo-random number functions use it.

Configure static IP address on Arch Linux

I use NetworkManager to manage the network on Arch Linux:

# systemctl --type=service
  UNIT                               LOAD   ACTIVE SUB     DESCRIPTION                               >
  dbus.service                       loaded active running D-Bus System Message Bus                  >
  getty@tty1.service                 loaded active running Getty on tty1                             >
  kmod-static-nodes.service          loaded active exited  Create list of static device nodes for the>
  NetworkManager.service             loaded active running Network Manager
......

So I will use nmcli to configure a static IP address:

(1) Check the current connection:

# nmcli con show
NAME                UUID                                  TYPE      DEVICE
Wired connection 1  f1014ad6-4291-3c29-9b0d-b552a0c5eb02  ethernet  enp0s3

(2) Configure IP address, gateway and DNS:

# nmcli con modify f1014ad6-4291-3c29-9b0d-b552a0c5eb02 ipv4.addresses 192.168.1.124/24
# nmcli con modify f1014ad6-4291-3c29-9b0d-b552a0c5eb02 ipv4.gateway 192.168.1.1
# nmcli con modify f1014ad6-4291-3c29-9b0d-b552a0c5eb02 ipv4.dns "8.8.8.8"
# nmcli con modify f1014ad6-4291-3c29-9b0d-b552a0c5eb02 ipv4.method manual

(3) Activate the connection:

# nmcli con up f1014ad6-4291-3c29-9b0d-b552a0c5eb02

It is done! BTW, the connection file can be found here:

# cat /etc/NetworkManager/system-connections/Wired\ connection\ 1.nmconnection
[connection]
id=Wired connection 1
uuid=f1014ad6-4291-3c29-9b0d-b552a0c5eb02
type=ethernet
autoconnect-priority=-999
interface-name=enp0s3
......

Reference:
How to Configure Network Connection Using ‘nmcli’ Tool.