How to understand "load average"

On *nix systems, running top or uptime shows the current load average (the load averaged over the past 1, 5, and 15 minutes):

# uptime
 14:43:37 up 22 days,  1:47,  5 users,  load average: 0.00, 0.01, 0.05

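Load is the total number of processes that are using the CPU or waiting to use it (on Linux, processes in uninterruptible sleep, typically waiting on disk I/O, are counted as well). So on a single-core system, a load average below 1.00 means the system still has idle capacity, 1.00 means it has reached 100% utilization, and anything above 1.00 deserves attention.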

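The load value that corresponds to 100% utilization depends on the number of processors: it is 1.00 on a single-core system, 2.00 on a dual-core system, and so on. On a multiprocessor system, therefore, the load average can look high while the CPUs are in fact still largely idle.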

P.S.: one way to get the number of CPUs:

# grep 'model name' /proc/cpuinfo | wc -l
8
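
To compare a load average against the processor count, the load can be divided by the number of CPUs. The following is a minimal sketch; the helper name load_per_cpu is made up for illustration, and the live part assumes a Linux system with /proc/loadavg and the nproc command:

```shell
# load_per_cpu LOAD NCPU: print LOAD divided by NCPU with two decimals.
# (Hypothetical helper, not part of any standard tool.)
load_per_cpu() {
    awk -v load="$1" -v ncpu="$2" 'BEGIN { printf "%.2f\n", load / ncpu }'
}

# Fixed example: a load of 4.00 on 8 CPUs is half of full utilization.
load_per_cpu 4.00 8

# Live values, only where /proc/loadavg and nproc are available.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r one five fifteen rest < /proc/loadavg
    echo "1-minute load per CPU: $(load_per_cpu "$one" "$(nproc)")"
fi
```

A result below 1.00 suggests there is still spare CPU capacity; a result above 1.00 suggests processes are queuing.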

References:
WHAT ABOUT MULTI-PROCESSORS? MY LOAD SAYS 3.00, BUT THINGS ARE RUNNING FINE!

What’s the difference between load average and CPU load?

Examining Load Average

load average video

 

An introduction to Bash quoting

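Bash quoting turns off the special meaning that metacharacters have in Bash:
a) Single quotes: the special meaning of every metacharacter is turned off;
b) Double quotes: the special meaning of most metacharacters is turned off, except for a few such as $;
c) Backslash (\): only the metacharacter immediately following the \ is turned off.
This explains why, when extracting several zip files, you write "unzip '*.zip'" rather than "unzip *.zip". With the second form the shell first replaces *.zip with all matching file names, so unzip treats the first name as the archive and the remaining names as members to extract from it. With the first form the shell leaves the pattern alone and unzip does the matching itself.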
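
The three quoting rules can be observed directly by counting how many arguments a command receives. This is a small sketch using a throwaway directory; count_args and the file names are invented for the example:

```shell
tmp=$(mktemp -d)
touch "$tmp/a.zip" "$tmp/b.zip"

# count_args prints how many arguments it received.
count_args() { echo "$#"; }

unquoted=$(count_args "$tmp"/*.zip)    # the shell expands the glob: 2 arguments
single=$(count_args '*.zip')           # literal string: 1 argument
double=$(count_args "$tmp/*.zip")      # double quotes also suppress globbing: 1
escaped=$(count_args "$tmp"/\*.zip)    # backslash protects only the *: 1

name=world
in_single='hello $name'                # $ is literal inside single quotes
in_double="hello $name"                # $ still expands inside double quotes

echo "$unquoted $single $double $escaped"
echo "$in_single / $in_double"
rm -rf "$tmp"
```

Only the unquoted form hands the command two file names; in every quoted form the pattern arrives as a single literal argument.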

References:
Shell十三问 (Thirteen Questions About the Shell)
How do I unzip multiple / many files under Linux?

 

shmmax and shmall

The Linux kernel has two important settings for shared memory: shmmax and shmall.

shmmax defines the maximum size of a single shared memory allocation, in bytes:

# cat /proc/sys/kernel/shmmax
18446744073692774399

shmall defines the total amount of shared memory that can be allocated, in pages:

Maximum "shared memory" = shmall (cat /proc/sys/kernel/shmall) * page size (getconf PAGE_SIZE)
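
As a worked instance of the formula above, here is a minimal sketch. The helper name shm_total_bytes and the sample numbers (2097152 pages of 4096 bytes) are assumptions chosen for illustration, not values from any particular system:

```shell
# shm_total_bytes PAGES PAGE_SIZE: print the total shared memory limit in bytes.
# (Hypothetical helper; it only multiplies its two arguments.)
shm_total_bytes() {
    echo $(( $1 * $2 ))
}

# Assumed example: shmall = 2097152 pages, page size = 4096 bytes.
shm_total_bytes 2097152 4096    # 2^21 * 2^12 = 2^33 bytes, i.e. 8 GiB
```

On a live system the arguments would come from /proc/sys/kernel/shmall and getconf PAGE_SIZE. Note that if shmall is as large as the shmmax value shown above, the product does not fit in 64-bit shell arithmetic.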

Taking shmmax as an example, here is how to change the value:

(1) Check the current value of shmmax:

# sysctl -a | grep shmmax
kernel.shmmax = 18446744073692774399

(2) Change the value of shmmax:

# echo "536870912" > /proc/sys/kernel/shmmax
# sysctl -a | grep shmmax
kernel.shmmax = 536870912

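The value has changed. After a reboot, however, shmmax reverts to its previous value. To make the setting persistent, add it to /etc/sysctl.conf and load it with sysctl -p: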

# echo "kernel.shmmax = 536870912" >>  /etc/sysctl.conf
# sysctl -a | grep shmmax
kernel.shmmax = 18446744073692774399
# sysctl -p
kernel.shmmax = 536870912
# sysctl -a | grep shmmax
kernel.shmmax = 536870912
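
Appending with >> adds a new line every time it is run, so repeated changes leave several kernel.shmmax entries in the file. The sketch below updates an existing entry in place and appends only when none exists. The helper set_conf_value is hypothetical, and it is shown against a temporary file rather than the real /etc/sysctl.conf:

```shell
# set_conf_value FILE KEY VALUE: set "KEY = VALUE" in FILE, replacing an
# existing entry for KEY if there is one. (Hypothetical helper. KEY is used
# as a regular expression, so the dots in it match any character.)
set_conf_value() {
    file=$1 key=$2 value=$3
    if grep -q "^[[:space:]]*$key[[:space:]]*=" "$file" 2>/dev/null; then
        # Rewrite the existing line through a temporary copy.
        sed "s|^[[:space:]]*$key[[:space:]]*=.*|$key = $value|" "$file" > "$file.tmp" &&
            mv "$file.tmp" "$file"
    else
        echo "$key = $value" >> "$file"
    fi
}

conf=$(mktemp)                                   # stand-in for /etc/sysctl.conf
set_conf_value "$conf" kernel.shmmax 268435456   # first call appends
set_conf_value "$conf" kernel.shmmax 536870912   # second call replaces
cat "$conf"
```

Pointed at the real file (as root), this would still need to be followed by sysctl -p to load the new value.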

For guidance on choosing values for shmall and shmmax, see also this script.

References:
The Mysterious World of Shmmax and Shmall
Configuring SHMMAX and SHMALL for Oracle in Linux
What is shmmax, shmall, shmmni? Shared Memory Max

 

The "-exec COMMAND \;" option of the find command

The following find command lists the *.stp files under the current directory:

# find . -name '*.stp' -exec ls {} \;
./Documents/one.stp
./Documents/two.stp

About the "-exec COMMAND \;" option of find:

find

-exec COMMAND \;

Carries out COMMAND on each file that find matches. The command sequence terminates with ; (the “;” is escaped to make certain the shell passes it to find literally, without interpreting it as a special character).

If COMMAND contains {}, then find substitutes the full path name of the selected file for “{}”.

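The ; marks the end of the command, and writing it as \; makes the shell pass the ; to find unchanged. {} is replaced with the full path name of each file that find locates.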
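
The effect of the terminator can be checked by counting how many times COMMAND runs. This is a small sketch in a throwaway directory with invented file names. It also shows that ';' in quotes works the same as \;, and, as a comparison that goes beyond the text above, the POSIX "+" terminator, which passes many files to one invocation:

```shell
tmp=$(mktemp -d)
mkdir "$tmp/Documents"
touch "$tmp/Documents/one.stp" "$tmp/Documents/two.stp" "$tmp/Documents/notes.txt"

# With \; (or a quoted ';') echo runs once per matching file: two output lines.
per_file=$(find "$tmp" -name '*.stp' -exec echo {} \; | wc -l | tr -d ' ')
quoted=$(find "$tmp" -name '*.stp' -exec echo {} ';' | wc -l | tr -d ' ')

# With + the matches are batched into a single echo call: one output line.
batched=$(find "$tmp" -name '*.stp' -exec echo {} + | wc -l | tr -d ' ')

echo "per file: $per_file, quoted: $quoted, batched: $batched"
rm -rf "$tmp"
```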

References:
16.2. Complex Commands

 

Linux kernel notes (49): ERESTARTSYS and EINTR

LDD3 explains how driver code should choose between returning ERESTARTSYS and EINTR:

Note the check on the return value of down_interruptible; if it returns nonzero, the operation was interrupted. The usual thing to do in this situation is to return -ERESTARTSYS. Upon seeing this return code, the higher layers of the kernel will either restart the call from the beginning or return the error to the user. If you return -ERESTARTSYS, you must first undo any user-visible changes that might have been made, so that the right thing happens when the system call is retried. If you cannot undo things in this manner, you should return -EINTR instead.

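In other words, if the device state visible to the user can be rolled back completely to what it was before the driver code ran, return ERESTARTSYS; otherwise return EINTR. EINTR makes the system call fail and hands the error code EINTR to the application, whereas ERESTARTSYS may cause the kernel to restart the operation without the application ever noticing. See this post for details.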

 

The "/run" and "/var/run" directories on Linux

The following is quoted from Wikipedia:

Modern Linux distributions include a /run directory as a temporary filesystem (tmpfs) which stores volatile runtime data, following the FHS version 3.0. According to the FHS version 2.3, such data were stored in /var/run but this was a problem in some cases because this directory isn’t always available at early boot. As a result, these programs have had to resort to trickery, such as using /dev/.udev, /dev/.mdadm, /dev/.systemd or /dev/.mount directories, even though the device directory isn’t intended for such data.[19] Among other advantages, this makes the system easier to use normally with the root filesystem mounted read-only.

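/run is a temporary filesystem that holds information about the system since it was booted. Files under this directory should be deleted or truncated when the system reboots. If your system has a /var/run directory, it should point to /run. Here is how SuSE 12 does it: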

# df -h
Filesystem      Size  Used Avail Use% Mounted on
......
tmpfs           431M  7.1M  424M   2% /run
......

# ls -lt /var/run
lrwxrwxrwx 1 root root 4 Nov  5 21:14 /var/run -> /run

 

/dev/mem, /dev/kmem and /dev/port

The three files /dev/mem, /dev/kmem and /dev/port represent physical memory, kernel virtual memory and the I/O ports, respectively. See the following:

/dev/mem is a character device file that is an image of the main memory of the computer. It may be used, for example, to examine (and even patch) the system. Byte addresses in /dev/mem are interpreted as physical memory addresses. References to nonexistent locations cause errors to be returned.

The file /dev/kmem is the same as /dev/mem, except that the kernel virtual memory rather than physical memory is accessed.

/dev/port is similar to /dev/mem, but the I/O ports are accessed.

 

openat VS open

Starting with version 2.6.16, GNU/Linux introduced the openat system call:

#define _XOPEN_SOURCE 700 /* Or define _POSIX_C_SOURCE >= 200809 */
#include <fcntl.h>
int openat(int  dirfd , const char * pathname , int  flags , ... /* mode_t  mode */);
Returns file descriptor on success, or -1 on error

Compared with open, it takes an extra dirfd argument. Its usage is explained as follows:

If pathname specifies a relative pathname, then it is interpreted relative to the directory referred to by the open file descriptor dirfd, rather than relative to the process’s current working directory.

If pathname specifies a relative pathname, and dirfd contains the special value AT_FDCWD , then pathname is interpreted relative to the process’s current working directory (i.e., the same behavior as open(2)).

If pathname specifies an absolute pathname, then dirfd is ignored.

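To summarize: if pathname is an absolute path, the dirfd argument is ignored. If pathname is a relative path and dirfd is not AT_FDCWD, pathname is resolved relative to the directory that dirfd refers to rather than the process's current working directory. Conversely, if dirfd is AT_FDCWD, pathname is resolved relative to the process's current working directory, which is equivalent to open. The kernel code makes this plain: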

SYSCALL_DEFINE3(open, const char __user *, filename, int, flags, umode_t, mode)
{
    if (force_o_largefile())
        flags |= O_LARGEFILE;

    return do_sys_open(AT_FDCWD, filename, flags, mode);
}

SYSCALL_DEFINE4(openat, int, dfd, const char __user *, filename, int, flags,
        umode_t, mode)
{
    if (force_o_largefile())
        flags |= O_LARGEFILE;

    return do_sys_open(dfd, filename, flags, mode);
}

openat (and the other functions whose names end in "at") was introduced for the following two reasons:

First, openat() allows an application to avoid race conditions that could occur when using open(2) to open files in directories other than the current working directory. These race conditions result from the fact that some component of the directory prefix given to open(2) could be changed in parallel with the call to open(2). Such races can be avoided by opening a file descriptor for the target directory, and then specifying that file descriptor as the dirfd argument of openat().

Second, openat() allows the implementation of a per-thread “current working directory”, via file descriptor(s) maintained by the application. (This functionality can also be obtained by tricks based on the use of /proc/self/fd/dirfd, but less efficiently.)

References:
openat(2) – Linux man page
The Linux programming interface

 

The difference between "/dev/tty", "/dev/console" and "/dev/tty0"

This note comes from a Stack Overflow post. The answer reads:

From the documentation(http://www.kernel.org/doc/Documentation/devices.txt):

    /dev/tty        Current TTY device
    /dev/console    System console
    /dev/tty0       Current virtual console

In the good old days /dev/console was System Administrator console. And TTYs were users' serial devices attached to a server.
Now /dev/console and /dev/tty0 represent current display and usually are the same. You can override it for example by adding console=ttyS0 to grub.conf. After that your /dev/tty0 is a monitor and /dev/console is /dev/ttyS0.

An exercise to show the difference between /dev/tty and /dev/tty0:

Switch to the 2nd console by pressing Ctrl+Alt+F2. Login as root. Type "sleep 5; echo tty0 > /dev/tty0". Press Enter and switch to the 3rd console by pressing Alt+F3.
Now switch back to the 2nd console by pressing Alt+F2. Type "sleep 5; echo tty > /dev/tty", press Enter and switch to the 3rd console.

You can see that "tty" is the console where process starts, and "tty0" is a always current console.

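In the early days, /dev/console was the system administrator's console, while TTYs were the serial devices through which users connected to a server. Nowadays /dev/console and /dev/tty0 both refer to the current display and are usually the same. You can change the device associated with /dev/console, for example by adding console=ttyS0 to grub.conf. After that, /dev/tty0 is associated with the monitor, while /dev/console is associated with /dev/ttyS0.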

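/dev/tty is the controlling tty of the current process, while /dev/tty0 is the current console. If you run "sleep 5; echo tty0 > /dev/tty0" in one terminal and then switch to another, "tty0" appears in the terminal you switched to. If you run "sleep 5; echo tty > /dev/tty" instead, "tty" always appears in the terminal where the command was typed, whichever terminal you switch to.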