kernel | Nan Xiao's Blog

Enable Pressure Stall Information (PSI) on Void Linux

Two kernel configuration options control the Pressure Stall Information (PSI) feature: CONFIG_PSI, which determines whether PSI support is built in, and CONFIG_PSI_DEFAULT_DISABLED, which determines whether PSI is disabled by default. Here is the configuration on my Void Linux system:

$ zgrep PSI /proc/config.gz
CONFIG_PSI=y
CONFIG_PSI_DEFAULT_DISABLED=y

We can see that PSI is built in but disabled by default, so reading the PSI files fails with the following error:

$ cat /proc/pressure/cpu
cat: /proc/pressure/cpu: Operation not supported

To enable this feature, add “psi=1” to GRUB_CMDLINE_LINUX_DEFAULT in the /etc/default/grub file:

$ cat /etc/default/grub
#
# Configuration file for GRUB.
#
......
GRUB_CMDLINE_LINUX_DEFAULT="loglevel=4 slub_debug=P page_poison=1 psi=1"

Then regenerate the GRUB configuration and reboot:

$ sudo grub-mkconfig -o /boot/grub/grub.cfg
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-5.2.13_1
Found initrd image: /boot/initramfs-5.2.13_1.img
done
$ sudo reboot

After the reboot, PSI is enabled:

$ cat /proc/pressure/cpu
some avg10=0.00 avg60=0.00 avg300=0.00 total=890919

References:
Getting Started with PSI;
Where are the current kernel build options stored?;
grub.

Tips for learning the Linux kernel

As the Linux kernel has become one of the largest and most complex software projects in the world, its complexity scares many novices away. In this post, I will share some personal experience on how to learn the Linux kernel, and I hope these tips can help newcomers.

(1) Download the vanilla kernel and install it.

I suggest you find a physical machine; if you really don’t have one at hand, a virtual machine is also OK. Download the newest vanilla kernel from kernel.org, then build and install it. This process isn’t too hard, and it helps you conquer the fear of the Linux kernel. After you have set up the kernel successfully for the first time, read the release version number from the uname -r output:

# uname -r
4.6.0

I think this will give you more confidence.

(2) Study the elementary skills of Linux kernel programming.

Think back to when you began user-space C programming on a *nix platform: you needed to know how to allocate memory through malloc, open files through fopen/open, use the pthread library to build concurrent programs, and so on. The Linux kernel is just another platform, and you likewise need to learn its rules. For example, you should be familiar with how to manipulate lists (list.h), and you should allocate memory with kmalloc. There are many classic books and tutorials that explain this knowledge. Although some of them seem outdated (they still cover kernel 2.6.x), they are still largely applicable to current kernels.

(3) Dive into one module.

Once you have the basic skills of Linux kernel programming, you should focus on one area of the kernel. If you are a full-time kernel programmer, congratulations! You should concentrate on your work area and try to become an expert in that domain. If the kernel is just your hobby, pick one module that you are really interested in. For example, if you are curious about debugging, kdump should be to your taste; if you care about dynamic tracing, BPF is the right thing to look at. After picking the part you want to contribute to, dig into the code and try to master every detail of it. You should also subscribe to the related mailing list to keep up with the latest progress. The final goal is to get meaningful patches into the kernel, from a trivial typo fix to an enhanced feature. Just think: your code will run on millions of devices, which is really amazing!

(4) Others

When you run into an issue, you can try to get help from mailing lists or forums. You can also take part in your local community to get to know like-minded people. In any case, endeavor to use every resource you can find.

Happy hacking!

Upgrade Linux kernel on RHEL 7

My OS is RHEL 7.2 (minimal installation). To use some new kernel features (such as BPF), I need to upgrade the kernel to 4.x.

(1) Register the system and apply a subscription:

# subscription-manager register --username <username> --password <password> --auto-attach

(2) Use yum install to install the following software packages:

openssl-devel
ncurses-devel
bc
gcc
perl

BTW, yum install perl reported errors on my system, so I downloaded the source code from the official Perl website and built it from scratch:

./configure.gnu
make 
make test
make install

(3) Download the stable kernel from kernel.org and extract it, then build it:

make menuconfig
make 
make modules_install install

Depending on your requirements, you may also need to install the header files:

make INSTALL_HDR_PATH=/usr/local headers_install

(4) Reboot the system, select the right kernel at boot time, and enjoy it:

# uname -a
Linux localhost.localdomain 4.5.0 #1 SMP Mon Apr 11 09:56:46 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux

Why doesn’t a Linux device driver need to update filp->f_pos in its read/write functions?

From LDD3, “char drivers” section:

loff_t f_pos;

The current reading or writing position. loff_t is a 64-bit value on all platforms ( long long in gcc terminology). The driver can read this value if it needs to know the current position in the file but should not normally change it; read and write should update a position using the pointer they receive as the last argument instead of acting on filp->f_pos directly. The one exception to this rule is in the llseek method, the purpose of which is to change the file position.

Why should “read and write update a position using the pointer they receive as the last argument instead of acting on filp->f_pos directly”? After checking the kernel code (version 3.0), I got the answer.

Take the read system call as an example; the others are similar. First, look at the read code (fs/read_write.c):

SYSCALL_DEFINE3(read, unsigned int, fd, char __user *, buf, size_t, count)
{
    struct file *file;
    ssize_t ret = -EBADF;
    int fput_needed;

    file = fget_light(fd, &fput_needed);
    if (file) {
        loff_t pos = file_pos_read(file);
        ret = vfs_read(file, buf, count, &pos);
        file_pos_write(file, pos);
        fput_light(file, fput_needed);
    }

    return ret;
}

The core part is the following:

loff_t pos = file_pos_read(file);
ret = vfs_read(file, buf, count, &pos);
file_pos_write(file, pos);

file_pos_read is very simple, just one statement:

static inline loff_t file_pos_read(struct file *file)
{
    return file->f_pos;
}

It returns the current file position.

Then let’s look at vfs_read:

ssize_t vfs_read(struct file *file, char __user *buf, size_t count, loff_t *pos)
{
    ssize_t ret;

    if (!(file->f_mode & FMODE_READ))
        return -EBADF;
    if (!file->f_op || (!file->f_op->read && !file->f_op->aio_read))
        return -EINVAL;
    if (unlikely(!access_ok(VERIFY_WRITE, buf, count)))
        return -EFAULT;

    ret = rw_verify_area(READ, file, pos, count);
    if (ret >= 0) {
        count = ret;
        if (file->f_op->read)
            ret = file->f_op->read(file, buf, count, pos);
        else
            ret = do_sync_read(file, buf, count, pos);
        if (ret > 0) {
            fsnotify_access(file);
            add_rchar(current, ret);
        }
        inc_syscr(current);
    }

    return ret;
}

Setting aside the many condition checks, the skeleton is just this:

if (file->f_op->read)
    ret = file->f_op->read(file, buf, count, pos);
else
    ret = do_sync_read(file, buf, count, pos);

If the driver provides a read function, it is used; otherwise do_sync_read is called. Whichever function is used, it must store the new file position in the memory that pos points to.

Finally, file_pos_write stores the new position back into the file structure:

static inline void file_pos_write(struct file *file, loff_t pos)
{
    file->f_pos = pos;
}

From the above analysis, we can see that there is no need for every device driver to write file->f_pos itself: the driver only updates the position through the pos pointer it receives, and file_pos_read/file_pos_write load and store f_pos uniformly. Other functions are similar, so we can now answer the question posed at the beginning of the article.

How to modify the local version of Linux kernel?

Execute the “make menuconfig” command, then select “General setup” -> “Local version - append to kernel release”. Add your preferred name, e.g. “.nan”, then save it.

Check that it was saved in the .config file successfully:

[root@linux]# grep -i ".nan" .config
CONFIG_LOCALVERSION=".nan"

After making sure it is saved successfully, you can execute the “make” command.