Linux kernel notes (43): do_sys_open

Below is the code of do_sys_open from kernel 3.12:

long do_sys_open(int dfd, const char __user *filename, int flags, umode_t mode)
{
    struct open_flags op;
    int fd = build_open_flags(flags, mode, &op);
    struct filename *tmp;

    if (fd)
        return fd;

    tmp = getname(filename);
    if (IS_ERR(tmp))
        return PTR_ERR(tmp);

    fd = get_unused_fd_flags(flags);
    if (fd >= 0) {
        struct file *f = do_filp_open(dfd, tmp, &op);
        if (IS_ERR(f)) {
            put_unused_fd(fd);
            fd = PTR_ERR(f);
        } else {
            fsnotify_open(f);
            fd_install(fd, f);
        }
    }
    putname(tmp);
    return fd;
}

The core steps are:

a) get_unused_fd_flags obtains a file descriptor;
b) do_filp_open obtains a struct file;
c) fd_install associates the file descriptor with the struct file.

struct file contains an f_op member:

struct file {
    ......
    const struct file_operations    *f_op;
    ......
    void            *private_data;
    ......
};

struct file_operations in turn contains an open member:

struct file_operations {
    ......
    int (*open) (struct inode *, struct file *);
    ......
};

The open member takes two parameters: the inode of the actual file and the struct file.

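Before the open system call invokes the driver's open method (the open member of struct file_operations), private_data is set to NULL; the driver can then set private_data to whatever it needs (see the do_dentry_open function).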

 

openat VS open

Starting with kernel 2.6.16, GNU/Linux provides the openat system call:

#define _XOPEN_SOURCE 700 /* Or define _POSIX_C_SOURCE >= 200809 */
#include <fcntl.h>
int openat(int  dirfd , const char * pathname , int  flags , ... /* mode_t  mode */);
Returns file descriptor on success, or –1 on error

Compared with open, it takes an additional dirfd parameter. Its usage is explained as follows:

If pathname specifies a relative pathname, then it is interpreted relative to the directory referred to by the open file descriptor dirfd, rather than relative to the process’s current working directory.

If pathname specifies a relative pathname, and dirfd contains the special value AT_FDCWD , then pathname is interpreted relative to the process’s current working directory (i.e., the same behavior as open(2)).

If pathname specifies an absolute pathname, then dirfd is ignored.

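In summary: if pathname is an absolute path, dirfd is ignored. If pathname is a relative path and dirfd is not AT_FDCWD, pathname is resolved relative to the directory that dirfd refers to, not relative to the process's current working directory. If pathname is a relative path and dirfd is AT_FDCWD, pathname is resolved relative to the process's current working directory, which is exactly the behavior of open. The kernel code makes this clear: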

SYSCALL_DEFINE3(open, const char __user *, filename, int, flags, umode_t, mode)
{
    if (force_o_largefile())
        flags |= O_LARGEFILE;

    return do_sys_open(AT_FDCWD, filename, flags, mode);
}

SYSCALL_DEFINE4(openat, int, dfd, const char __user *, filename, int, flags,
        umode_t, mode)
{
    if (force_o_largefile())
        flags |= O_LARGEFILE;

    return do_sys_open(dfd, filename, flags, mode);
}

openat (and the other functions whose names end in "at") was introduced for the following two reasons:

First, openat() allows an application to avoid race conditions that could occur when using open(2) to open files in directories other than the current working directory. These race conditions result from the fact that some component of the directory prefix given to open(2) could be changed in parallel with the call to open(2). Such races can be avoided by opening a file descriptor for the target directory, and then specifying that file descriptor as the dirfd argument of openat().

Second, openat() allows the implementation of a per-thread “current working directory”, via file descriptor(s) maintained by the application. (This functionality can also be obtained by tricks based on the use of /proc/self/fd/dirfd, but less efficiently.)

References:
openat(2) – Linux man page
The Linux programming interface

 

Linux kernel notes (42): container_of

container_of is defined in <linux/kernel.h>:

/**
 * container_of - cast a member of a structure out to the containing structure
 * @ptr:    the pointer to the member.
 * @type:   the type of the container struct this is embedded in.
 * @member: the name of the member within the struct.
 *
 */
#define container_of(ptr, type, member) ({          \
    const typeof( ((type *)0)->member ) *__mptr = (ptr);    \
    (type *)( (char *)__mptr - offsetof(type,member) );})

Given the address of a structure member, it yields the address of the containing structure. For example:

struct st_A
{
        int member_b;
        int member_c;
};

struct st_A a;

container_of(&(a.member_c), struct st_A, member_c) yields the address of the variable a, that is, the value of &a.