April 2015 | My Site

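A CPU executes an instruction in the following five steps, each handled by a dedicated functional unit of the CPU:
(1) instruction fetch;
(2) decode;
(3) execute;
(4) memory access;
(5) register write-back.
The last two steps are optional, because many instructions only touch registers and never access memory. Each step takes at least one clock cycle. Memory access is usually the slowest step and can take many clock cycles.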

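Instruction pipeline: a CPU architecture that executes multiple instructions in parallel by working on different stages of different instructions at the same time. Suppose each of the five steps above takes one clock cycle; an instruction then needs five clock cycles to complete (assuming it goes through steps 4 and 5). Without pipelining, only one functional unit is busy at each step while the others sit idle. With pipelining, several functional units can be active at once: for example, while one instruction is being decoded, the next one can be fetched. This greatly improves efficiency. In the ideal case the CPU completes one instruction per clock cycle, even though each individual instruction still takes five cycles from start to finish.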
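
The cycle counts above can be sketched with a little shell arithmetic. This is an idealized model, assuming one cycle per stage and no stalls; the function names cycles_sequential and cycles_pipelined are made up for this illustration.

```shell
#!/bin/bash
# Idealized cycle counts for n instructions on a k-stage CPU,
# assuming every stage takes exactly 1 cycle and nothing ever stalls.

# Without pipelining: each instruction occupies the CPU for all k stages.
cycles_sequential() {   # usage: cycles_sequential <n> <k>
    echo $(( $1 * $2 ))
}

# With pipelining: the first instruction needs k cycles to fill the pipeline,
# after which one instruction completes every cycle.
cycles_pipelined() {    # usage: cycles_pipelined <n> <k>
    if [ "$1" -eq 0 ]; then
        echo 0
        return
    fi
    echo $(( $2 + $1 - 1 ))
}

echo "sequential: $(cycles_sequential 100 5) cycles"   # 500
echo "pipelined:  $(cycles_pipelined 100 5) cycles"    # 104
```

For 100 instructions the pipelined count is 104 cycles, which is close to one cycle per instruction; the 4 extra cycles are the cost of filling the pipeline.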

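Going further, if the CPU has multiple functional units of the same type, even more instructions can complete in each clock cycle. Such an architecture is called superscalar. Instruction width describes the number of instructions that can be processed in parallel. Modern CPUs are typically 3-wide or 4-wide, meaning they can process up to 3 or 4 instructions per clock cycle.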

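Cycles per instruction (CPI) is an important metric for seeing where the CPU spends its clock cycles and for interpreting CPU utilization. It can also be expressed as its reciprocal, instructions per cycle (IPC). CPI reflects how efficiently instructions are processed, not how efficient the instructions themselves are.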
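
As a small worked example of the CPI and IPC relationship, the sketch below divides a cycle count by an instruction count. The numbers are invented for illustration, and the helper names cpi and ipc are hypothetical; awk does the division because shell arithmetic is integer-only.

```shell
#!/bin/bash
# CPI = cycles / instructions; IPC = instructions / cycles (the reciprocal).
# The counts passed in below are made-up sample values, not measurements.

cpi() {   # usage: cpi <cycles> <instructions>
    LC_ALL=C awk -v c="$1" -v i="$2" 'BEGIN { printf "%.2f\n", c / i }'
}

ipc() {   # usage: ipc <cycles> <instructions>
    LC_ALL=C awk -v c="$1" -v i="$2" 'BEGIN { printf "%.2f\n", i / c }'
}

echo "CPI: $(cpi 1200 1000)"   # 1.20
echo "IPC: $(ipc 1200 1000)"   # 0.83
```

On a 4-wide superscalar CPU the best possible IPC is 4, so the best possible CPI is 0.25; a CPI of 1.20 would mean most of that width goes unused.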

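When you run an external command in a bash shell, bash forks a child process, runs the command in that child, and waits for it to exit. (Shell builtins are the exception: they run inside the shell itself.) Run the following in one terminal: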

[root@localhost ~]# echo $$
19954
[root@localhost ~]# sleep 100

In another terminal, run:

[root@localhost bin]# ps -ef | grep 19954
root     19954 19353  0 03:01 pts/3    00:00:00 /bin/bash
root     20265 19954  0 04:39 pts/3    00:00:00 sleep 100
root     20267 19354  0 04:39 pts/0    00:00:00 grep --color=auto 19954

You can see that the bash shell in the first terminal (PID 19954) forked the sleep 100 process.
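
The same parent/child relationship can be checked from a single terminal. In this minimal sketch the child is a second bash that prints its own parent PID through the PPID variable; the temporary file and the variable name child_ppid are only there for the illustration.

```shell
#!/bin/bash
# Launch a child process and let it report who its parent is.
# The output goes to a temp file so that the child is forked directly
# by this shell rather than by a command-substitution subshell.
out=$(mktemp)
bash -c 'echo $PPID' > "$out"
child_ppid=$(cat "$out")
rm -f "$out"

echo "this shell:          $$"
echo "parent of the child: $child_ppid"
```

Both lines should print the same PID, matching what the ps output shows for sleep 100.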

When you run a bash script from a bash shell, the shell first forks a child shell process, and that child then executes the commands in the script. An example makes this clearer.

Here is a simple bash script (test.sh):

#!/bin/bash
sleep 100

Run the script in one terminal:

[root@localhost ~]# echo $$
19954
[root@localhost ~]# ./test.sh

In another terminal, run:

[root@localhost bin]# ps -ef | grep test
root     20309 19954  0 05:01 pts/3    00:00:00 /bin/bash ./test.sh
root     20316 19354  0 05:02 pts/0    00:00:00 grep --color=auto test
[root@localhost bin]# ps -ef | grep 20309
root     20309 19954  0 05:01 pts/3    00:00:00 /bin/bash ./test.sh
root     20310 20309  0 05:01 pts/3    00:00:00 sleep 100
root     20318 19354  0 05:02 pts/0    00:00:00 grep --color=auto 20309

You can see that the bash shell in the first terminal (PID 19954) forked the test.sh process (PID 20309), which in turn forked the sleep 100 process.
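
A script can also report this itself. The sketch below writes a throwaway script that prints its own PID and its parent's PID, then runs it as a separate bash process. The temporary files and the variable names script_pid and script_ppid are assumptions of this example, and it uses "bash script" instead of "./script" only to avoid a chmod step.

```shell
#!/bin/bash
# Create a temporary script that prints "<its own PID> <its parent's PID>".
script=$(mktemp)
out=$(mktemp)
printf '%s\n' '#!/bin/bash' 'echo "$$ $PPID"' > "$script"

# Run it as a separate bash process and capture what it reports.
bash "$script" > "$out"
read -r script_pid script_ppid < "$out"
rm -f "$script" "$out"

echo "invoking shell PID: $$"
echo "script PID:         $script_pid (parent: $script_ppid)"
```

The script gets a PID of its own, and its parent is the invoking shell, just like 20309 and 19954 in the ps output.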

Next, consider the bash builtin source (the "." command does the same thing). man source describes it as follows:

.  filename [arguments]
source filename [arguments]
      Read and execute commands from filename in the current shell environment and return the exit status of the last command executed from filename.  If filename does not contain a slash, file names in  PATH  are used to find the directory containing filename.  The file searched for in PATH need not be executable.  When bash is not in posix mode, the current directory is searched if no  file is  found in PATH.  If the sourcepath option to the shopt builtin command is turned off, the PATH is not searched.  If any arguments are supplied, they become the positional parameters when filename  is  executed.  Otherwise the positional parameters are unchanged.  The return status is the status of the last command exited within the script (0 if no commands are executed), and false if filename is not found or cannot be read.

As this shows, "source filename [arguments]" executes the commands in the file in the current shell environment, so no child shell process is created. Here is the same test.sh script again:

Run the script with source in one terminal:

[root@localhost ~]# echo $$
19954
[root@localhost ~]# source ./test.sh

In another terminal, run:

[root@localhost bin]# ps -ef | grep test
root     20349 19354  0 05:24 pts/0    00:00:00 grep --color=auto test
[root@localhost bin]# ps -ef | grep 19954
root     19954 19353  0 03:01 pts/3    00:00:00 /bin/bash
root     20345 19954  0 05:24 pts/3    00:00:00 sleep 100
root     20347 19354  0 05:24 pts/0    00:00:00 grep --color=auto 19954

This time there is no test.sh process; sleep 100 was forked directly by the bash shell (PID 19954).
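
One practical consequence of running in the current shell is that variables set by a sourced file stay visible afterwards, while variables set by an executed script disappear with its child process. A minimal sketch, assuming a throwaway script and a made-up variable named MARKER:

```shell
#!/bin/bash
# A temporary script that does nothing but set a variable.
script=$(mktemp)
echo 'MARKER=set_by_script' > "$script"

unset MARKER
bash "$script"                    # executed: runs in a child bash process
after_exec=${MARKER:-unset}       # the child's variable never reaches us

. "$script"                       # sourced: runs in the current shell
after_source=${MARKER:-unset}     # the assignment happened in this shell

rm -f "$script"
echo "after executing: MARKER=$after_exec"     # unset
echo "after sourcing:  MARKER=$after_source"   # set_by_script
```

This is why files that are meant to change the current environment, such as ~/.bashrc, are sourced rather than executed.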

References:
Shell十三问 ("Thirteen Questions About the Shell")


Chatting About CPUs (Part 3)

A Brief Look at Bash Shell Processes