Zombie processes and orphan processes in Unix

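In Unix, after a child process exits, if its parent has not called wait() to collect the child's exit status, the child's entry still occupies a slot in the system process table; such a child is called a zombie process. If the parent exits before the child, the child is called an orphan process, and the init process becomes the orphan's new parent. init routinely reaps (waits on) any zombie processes whose parent is init.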
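The zombie state can be observed from a shell. The following is a minimal sketch, assuming a Linux system with /proc mounted; the function name zombie_state and the 1/2/3-second timings are illustrative choices, not part of the original notes. A helper shell starts a short-lived child and then replaces itself with sleep via exec, so the child's parent never calls wait().

```shell
# Sketch: observe a zombie on Linux (assumes /proc is mounted).
# zombie_state is a hypothetical helper name used only for this example.
zombie_state() {
    pidfile=$(mktemp)
    # The inner shell starts a 1-second child, records its PID, then
    # exec()s into "sleep 3". That sleep is now the child's parent and
    # never calls wait(), so the exited child stays in the process table.
    sh -c 'sleep 1 & echo $! > "$0"; exec sleep 3' "$pidfile" &
    sleep 2                       # the child has exited, the parent is still alive
    child=$(cat "$pidfile")
    rm -f "$pidfile"
    # The "State:" line of /proc/PID/status carries a one-letter state code.
    awk '/^State:/ { print $2 }' "/proc/$child/status"
    wait                          # let the parent finish before returning
}

state=$(zombie_state)
echo "child state while its parent is alive: $state"
```

On a typical Linux system the reported state code should be Z (zombie). Once the parent exits, the leftover child is handed to init, which reaps it.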

References:
Zombie process
Zombie process vs Orphan process

Notes on locales in Linux

The following notes are excerpted from The Linux Programming Interface:

(1)

The actual locales that are defined on a system can vary. SUSv3 doesn’t make any requirements about this, except that a standard locale called POSIX (and synonymously, C, a name that exists for historical reasons) must be defined. This locale mirrors the historical behavior of UNIX systems. Thus, it is based on an ASCII character set, and uses English for names of days and months, and for yes/no responses. The monetary and numeric components of this locale are undefined.

The locale command displays information about the current locale environment (within the shell). The command locale -a lists the full set of locales defined on the system.
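The ASCII basis of the POSIX (C) locale mentioned in excerpt (1) is easy to see from the shell. This is a small sketch; the function name c_locale_sort is made up for the example. With LC_ALL=C, sort compares by byte value, so uppercase ASCII letters come before lowercase ones.

```shell
# Sketch: in the POSIX ("C") locale, sort orders lines by byte value.
# c_locale_sort is a hypothetical name used only for this illustration.
c_locale_sort() {
    printf '%s\n' b A a B | LC_ALL=C sort
}

sorted=$(c_locale_sort | tr '\n' ' ')
echo "C locale order: $sorted"
```

Under a locale such as en_US.UTF-8 the same input is usually ordered differently, because collation rules there are not plain byte comparisons.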

(2)

There are two different methods of setting the locale using setlocale(). The locale argument may be a string specifying one of the locales defined on the system (i.e., the name of one of the subdirectories under /usr/lib/locale), such as de_DE or en_US. Alternatively, locale may be specified as an empty string, meaning that locale settings should be taken from environment variables:

setlocale(LC_ALL, "");

We must make this call in order for a program to be cognizant of the locale environment variables. If the call is omitted, these environment variables will have no effect on the program.

The following is quoted from stackoverflow.com:

A C program inherits its locale environment variables when it starts up. This happens automatically. However, these variables do not automatically control the locale used by the library functions, because ISO C says that all programs start by default in the standard ‘C’ locale.

The sysctl functions in FreeBSD

The FreeBSD sysctl family of functions is declared as follows:

#include <sys/types.h>
#include <sys/sysctl.h>

int
sysctl(const int *name, u_int namelen, void *oldp, size_t *oldlenp,
    const void *newp, size_t newlen);

int
sysctlbyname(const char *name, void *oldp, size_t *oldlenp,
    const void *newp, size_t newlen);

int
sysctlnametomib(const char *name, int *mibp, size_t *sizep);

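In the arguments of sysctl, name and namelen identify the kernel parameter (its ID); oldp and oldlenp receive the parameter's current value; and newp and newlen supply a new value to set. Arguments that are not needed can be passed as NULL (with newlen as 0).
Here is the implementation of sysctlbyname: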

int
sysctlbyname(const char *name, void *oldp, size_t *oldlenp,
    const void *newp, size_t newlen)
{
    int real_oid[CTL_MAXNAME+2];
    size_t oidlen;

    oidlen = sizeof(real_oid) / sizeof(int);
    if (sysctlnametomib(name, real_oid, &oidlen) < 0)
        return (-1);
    return (sysctl(real_oid, oidlen, oldp, oldlenp, newp, newlen));
}

As shown, sysctlbyname first calls sysctlnametomib to resolve the name to the real ID, then calls sysctl to do the actual work.

References:
SYSCTL(3)
Grokking SYSCTL and the Art of Smashing Kernel Variables

Sed basics

The following is excerpted from sed-awk-101-hacks-ebook.
Sed syntax:

sed [options] {sed-commands} {input-file}
or
sed [options] -f {sed-commands-in-a-file} {input-file}

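Sed execution flow: read one line into a temporary buffer (the pattern space), execute the commands on that buffer, print the buffer's contents, then clear the buffer and read the next line.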
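The cycle can be imitated with a plain shell read loop. The sketch below mimics what `sed '2 p'` does; sed_cycle_2p is a hypothetical name, and the loop is only an illustration of the flow, not how sed is implemented.

```shell
# Sketch: emulate the read / execute / print cycle of `sed '2 p'`.
# sed_cycle_2p is a hypothetical name used only for this illustration.
sed_cycle_2p() {
    lineno=0
    # Step 1: read one line into the "pattern space".
    while IFS= read -r pattern_space; do
        lineno=$((lineno + 1))
        # Step 2: execute the command; `2 p` prints the pattern space on line 2.
        if [ "$lineno" -eq 2 ]; then
            printf '%s\n' "$pattern_space"
        fi
        # Step 3: automatic print at the end of the cycle (what -n turns off).
        printf '%s\n' "$pattern_space"
        # Step 4: the next read overwrites the buffer with the next line.
    done
}

printf '%s\n' one two three | sed_cycle_2p
```

Line 2 appears twice in the output: once from the p command and once from the automatic print at the end of the cycle.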

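By default, sed does not modify the original file and always writes to stdout. Because it also prints the pattern space automatically, the -n option is often used to suppress that default output. For example: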

# cat employee.txt
101,John Doe,CEO
102,Jason Smith,IT Manager
103,Raj Reddy,Sysadmin
104,Anand Ram,Developer
105,Jane Miller,Sales Manager
# sed '2 p' employee.txt
101,John Doe,CEO
102,Jason Smith,IT Manager
102,Jason Smith,IT Manager
103,Raj Reddy,Sysadmin
104,Anand Ram,Developer
105,Jane Miller,Sales Manager
# sed -n '2 p' employee.txt
102,Jason Smith,IT Manager

To save sed's output, redirect it to a file: >filename
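A short sketch of that redirection, reusing the employee.txt data from the example above. The temporary directory and the file name output.txt are arbitrary choices for the illustration.

```shell
# Sketch: save the output of sed to a file with shell redirection.
workdir=$(mktemp -d)
cat > "$workdir/employee.txt" <<'EOF'
101,John Doe,CEO
102,Jason Smith,IT Manager
103,Raj Reddy,Sysadmin
104,Anand Ram,Developer
105,Jane Miller,Sales Manager
EOF

# -n suppresses the automatic print; `2 p` prints only line 2.
sed -n '2 p' "$workdir/employee.txt" > "$workdir/output.txt"

saved=$(cat "$workdir/output.txt")
original_lines=$(wc -l < "$workdir/employee.txt" | tr -d ' ')
echo "saved: $saved"
echo "employee.txt still has $original_lines lines"
rm -rf "$workdir"
```

The input file is left untouched; only output.txt receives the selected line.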

 

A brief introduction to the uptime command

The uptime command shows how long the system has been running:

# uptime
 19:05:33 up  3:16,  2 users,  load average: 0.00, 0.01, 0.05

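19:05:33 is the current system time, and up 3:16 means the system has been running for 3 hours and 16 minutes. These are followed by the number of logged-in users and the system load averages. If you only care about how long the system has been running, use the following command: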

# uptime -p
up 3 hours, 16 minutes

uptime obtains the running time by reading the /proc/uptime file:

# cat /proc/uptime
11984.78 95454.77

The first field is the number of seconds since the system booted; the second field is the idle time in seconds, summed across all CPU cores.
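As a rough sketch, the first field can be converted into a string similar to what uptime -p prints. format_uptime is a hypothetical helper; it ignores days and singular forms, so it is not a faithful reimplementation of uptime.

```shell
# Sketch: turn the first field of /proc/uptime into "up H hours, M minutes".
# format_uptime is a hypothetical helper; it does not handle days or singular forms.
format_uptime() {
    seconds=${1%.*}                          # drop the fractional part
    hours=$(( seconds / 3600 ))
    minutes=$(( (seconds % 3600) / 60 ))
    echo "up $hours hours, $minutes minutes"
}

formatted=$(format_uptime 11984.78)          # sample value from the listing above
echo "$formatted"

# On Linux, the live value can be read directly from /proc/uptime.
if [ -r /proc/uptime ]; then
    read -r up_seconds idle_seconds < /proc/uptime
    format_uptime "$up_seconds"
fi
```

For the sample value, 11984 seconds is 3 hours, 19 minutes and 44 seconds, so the /proc/uptime listing was taken a few minutes after the uptime output shown earlier.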

 

devfs, tmpfs and devtmpfs

The following is quoted from Specfs, Devfs, Tmpfs, and Others:

specfs – specfs, or Special FileSystem, is a virtual filesystem used to access special device files. This filesystem is odd compared to other filesystems in general because this filesystem does not require a mount-point, yet the OS can still use specfs. However, specfs can be mounted by the user (mount -t specfs none /dev/streams). The device files for character devices in the /dev/ directory use specfs.

devfs – devfs is a device manager in the form of a filesystem. The Device FileSystem is largely the same as specfs except for some differences in the way they function and their uses. devfs is used for most of the device files in /dev/. Most Unix and Unix-like systems use devfs including Mac OS X, *BSD, and Solaris. Nearly all Unix and Unix-like systems that use devfs place it on the kernelspace. However, Linux uses a userspace-kernelspace hybrid approach. This means the devfs virtual filesystem is on the kernelspace and userspace.

tmpfs – The Temporary filesystem is a virtual filesystem for storing temporary files. This filesystem is really in the memory and/or in the swap space. Obviously, all data on this filesystem are lost when the system is shutdown. The mount point is /tmp/.

devtmpfs – This is an improved devfs. The purpose of devtmpfs is to boost boot-time. devtmpfs is more like tmpfs than devfs. The mount-point is /dev/. devtmpfs only creates device files for currently available hardware on the local system.

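To summarize:
devfs is a device manager in the form of a filesystem. tmpfs lives in memory and swap, so it can only hold temporary files. devtmpfs is an improved devfs; it also lives in memory, and its mount point is /dev/.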

 

What does an idle CPU do?

The article What Does an Idle CPU Do? explains very clearly what a CPU does while it is idle. To summarize:

When the operating system has nothing to do, it runs the idle task. Take the Linux idle task code as an example:

while (1) {
    while(!need_resched()) {
        cpuidle_idle_call();
    }

    /*
      [Note: Switch to a different task. We will return to this loop when the
      idle task is again selected to run.]
    */
    schedule_preempt_disabled();
}

As shown, as long as no rescheduling is needed (need_resched), the cpuidle_idle_call function is executed. On Intel processors, staying idle means executing the hlt instruction.

 

Monitoring CPU usage with the vmstat command

The vmstat command can be used to monitor CPU usage. For example:

# vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 5201924   1328 5578060    0    0     0     0 1582 6952  2  1 98  0  0
 1  0      0 5200984   1328 5577996    0    0     0     0 2020 20567  9  1 90  0  0
 0  0      0 5198668   1328 5577952    0    0     0     0 1568 7617  5  1 94  0  0
 0  0      0 5194844   1328 5578000    0    0     0   187 1249 7057  1  1 98  0  0
 0  0      0 5199956   1328 5578232    0    0     0     0 1496 7306  4  1 95  0  0

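The command above prints the system state every second; the last 5 columns describe the CPU. The man page explains the meaning of these 5 columns clearly: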

CPU
       These are percentages of total CPU time.
       us: Time spent running non-kernel code.  (user time, including nice time)
       sy: Time spent running kernel code.  (system time)
       id: Time spent idle.  Prior to Linux 2.5.41, this includes IO-wait time.
       wa: Time spent waiting for IO.  Prior to Linux 2.5.41, included in idle.
       st: Time stolen from a virtual machine.  Prior to Linux 2.6.11, unknown.

vmstat actually obtains the system state from the /proc/stat file:

# cat /proc/stat
cpu  381584 711 299364 1398303520 429839 0 251 0 0 0
cpu0 90740 58 44641 174627550 131209 0 120 0 0 0
cpu1 43141 26 22925 174746812 108219 0 10 0 0 0
cpu2 41308 35 25097 174831161 25877 0 40 0 0 0
cpu3 39301 70 27514 174836084 27792 0 4 0 0 0
cpu4 39187 78 46191 174750027 109013 0 0 0 0 0
......

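Note that these numbers are tick counts (jiffies), not seconds; proc(5) specifies the unit as USER_HZ, which is 1/100 of a second on most architectures.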
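To make the layout of a cpu line concrete, here is a small sketch that splits the aggregate line from the listing above into named counters. It assumes the field order documented in proc(5): user, nice, system, idle, iowait, irq, softirq, steal, followed by guest counters that are ignored here. cpu_idle_pct is a hypothetical helper, and it uses totals since boot rather than per-interval deltas, so it does not reproduce vmstat's per-second figures.

```shell
# Sketch: compute the idle share from one "cpu" line of /proc/stat.
# cpu_idle_pct is a hypothetical helper; field order assumed from proc(5).
cpu_idle_pct() {
    # $1 is the label ("cpu"); the following arguments are tick counters.
    user=$2 nice=$3 system=$4 idle=$5 iowait=$6 irq=$7 softirq=$8 steal=$9
    total=$(( user + nice + system + idle + iowait + irq + softirq + steal ))
    echo $(( 100 * idle / total ))       # integer division, truncates
}

sample='cpu  381584 711 299364 1398303520 429839 0 251 0 0 0'
# $sample is intentionally unquoted so the shell splits it into fields.
idle_pct=$(cpu_idle_pct $sample)
echo "idle since boot: ${idle_pct}%"
```

Because every counter is in the same unit, the percentage does not depend on the actual tick length.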

In addition, vmstat rounds each CPU time percentage to the nearest integer, with halves rounded up (vmstat.c):

static void new_format(void){
    ......
    duse = *cpu_use + *cpu_nic;
    dsys = *cpu_sys + *cpu_xxx + *cpu_yyy;
    didl = *cpu_idl;
    diow = *cpu_iow;
    dstl = *cpu_zzz;
    Div = duse + dsys + didl + diow + dstl;
    if (!Div) Div = 1, didl = 1;
    divo2 = Div / 2UL;
    printf(w_option ? wide_format : format,
           running, blocked,
           unitConvert(kb_swap_used), unitConvert(kb_main_free),
           unitConvert(a_option?kb_inactive:kb_main_buffers),
           unitConvert(a_option?kb_active:kb_main_cached),
           (unsigned)( (unitConvert(*pswpin  * kb_per_page) * hz + divo2) / Div ),
           (unsigned)( (unitConvert(*pswpout * kb_per_page) * hz + divo2) / Div ),
           (unsigned)( (*pgpgin        * hz + divo2) / Div ),
           (unsigned)( (*pgpgout           * hz + divo2) / Div ),
           (unsigned)( (*intr          * hz + divo2) / Div ),
           (unsigned)( (*ctxt          * hz + divo2) / Div ),
           (unsigned)( (100*duse            + divo2) / Div ),
           (unsigned)( (100*dsys            + divo2) / Div ),
           (unsigned)( (100*didl            + divo2) / Div ),
           (unsigned)( (100*diow            + divo2) / Div ),
           (unsigned)( (100*dstl            + divo2) / Div )
    );
    ......
}

As a result, the CPU percentages can add up to more than 100: 2 + 1 + 98 = 101
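The effect can be reproduced with shell integer arithmetic using the same expression as the excerpt, (100*x + Div/2) / Div. The tick counts below are made up for illustration (they are not taken from a real vmstat run), and round_pct is a hypothetical helper name.

```shell
# Sketch: the rounding from the vmstat.c excerpt, (100*x + Div/2) / Div.
# round_pct and the tick counts below are made-up values for illustration.
round_pct() {
    part=$1
    whole=$2
    echo $(( (100 * part + whole / 2) / whole ))
}

interval_ticks=1000                           # hypothetical ticks in one interval
us_pct=$(round_pct 15 "$interval_ticks")      # 1.5 percent rounds up to 2
sy_pct=$(round_pct 5 "$interval_ticks")       # 0.5 percent rounds up to 1
id_pct=$(round_pct 980 "$interval_ticks")     # 98.0 percent stays 98
sum=$(( us_pct + sy_pct + id_pct ))
echo "us=$us_pct sy=$sy_pct id=$id_pct sum=$sum"
```

Each column is rounded independently, so two values that both sit on a .5 boundary are both rounded up and the total becomes 101.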

Also, on Linux the r field is the total number of tasks that are currently running or waiting to run.

 

References:
/proc/stat explained
procps

 

Test expressions in Bash

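In the Bash shell, every executed command returns a value indicating its exit status: 0 means true and non-zero means false. The test command evaluates an expression and reports the result through its own exit status. Its format is: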

test expression
or:
[ expression ]

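test currently supports only 3 kinds of operands: strings, integers (whole numbers only, no decimal fractions) and files. When expression evaluates to true, test returns 0 (true); otherwise it returns non-zero (false). For examples and explanations of test expressions, see How to understand if condition in bash?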
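The three kinds of operands can be tried out as follows. This is a minimal sketch; status_of is a hypothetical helper that runs a command and prints its exit status.

```shell
# Sketch: one test per operand kind, reporting the exit status of each.
# status_of is a hypothetical helper: it runs its arguments as a command.
status_of() {
    if "$@"; then
        echo 0
    else
        echo $?          # exit status of the command that just failed
    fi
}

string_status=$(status_of [ "abc" = "abc" ])    # string comparison
integer_status=$(status_of [ 3 -gt 10 ])        # integer comparison
file_status=$(status_of test -d /)              # file test: is / a directory?

echo "string=$string_status integer=$integer_status file=$file_status"
```

The string and file tests succeed and return 0, while 3 -gt 10 is false and returns a non-zero status.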

References:
Shell十三问