Technology | My Site

Docker Notes (2): "docker daemon", "image", and "container"

Running "service docker start" actually starts the Docker daemon. You can confirm this with "ps":

[root@localhost ~]# ps -ef | grep docker
root     24791     1  0 Mar18 ?        00:00:03 /usr/bin/docker -d --selinux-enabled
root     24969 24950  0 03:25 pts/0    00:00:00 grep --color=auto docker

The "-d" option tells docker to run in daemon mode.

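The two most important concepts in Docker are images and containers. An image is comparable to the virtual disk image file used by virtual machine software such as VirtualBox or VMware (*.vdi for VirtualBox, *.vmdk for VMware). One or more containers, which run as processes, can be created from an image, much as virtual machines are created from a virtual disk image file.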

The docker run command creates a container and runs a command inside it, as shown below:

[root@localhost ~]# docker run --rm -ti ubuntu /bin/bash
root@88129ebf9b61:/# ls

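The new container is created from the ubuntu image and drops into an interactive bash session. Here "--rm" removes the container when it exits, "-t" allocates a pseudo-TTY, and "-i" keeps STDIN open.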

Docker Notes (1): Installing Docker on RHEL 7.0

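The official documentation for installing Docker on RHEL 7.0 is here. Because that route requires user registration, I describe another approach: using the CentOS docker RPM package.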

(1) The CentOS packages are here: http://cbs.centos.org/repos/virt7-testing/x86_64/os/. You can add this location to a yum repository configuration file, like this:

[centos-extra]
name=centos extra
baseurl=http://cbs.centos.org/repos/virt7-testing/x86_64/os/
enabled=1
gpgcheck=0

(2) Run "yum install docker".

(3) After the installation succeeds, run "docker version":

[root@localhost yum.repos.d]# docker version
Client version: 1.5.0
Client API version: 1.17
Go version (client): go1.3.3
Git commit (client): a8a31ef/1.5.0
OS/Arch (client): linux/amd64
FATA[0000] Get http:///var/run/docker.sock/v1.17/version: dial unix /var/run/docker.sock: no such file or directory. Are you trying to connect to a TLS-enabled daemon without TLS?

Note the "FATA[0000]......" message: it appears because the Docker daemon is not running. Start it with "service docker start", then run "docker version" again:

[root@localhost bin]# docker version
Client version: 1.5.0
Client API version: 1.17
Go version (client): go1.3.3
Git commit (client): a8a31ef/1.5.0
OS/Arch (client): linux/amd64
Server version: 1.5.0
Server API version: 1.17
Go version (server): go1.3.3
Git commit (server): a8a31ef/1.5.0

The "FATA[0000]......" message is gone.

(4) For the remaining steps, refer here.
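
The "FATA[0000]" error in step (3) says that /var/run/docker.sock does not exist, which is what you would expect when the daemon has not been started. As a rough sketch (the function name daemon_socket_present is my own, and checking the socket file is only an approximation of "the daemon is running"), the same condition can be tested from Python:

```python
import os
import stat


def daemon_socket_present(path="/var/run/docker.sock"):
    """Return True if `path` exists and is a Unix domain socket.

    Assumption: the default socket path is the one shown in the error
    message above. A present socket does not prove the daemon is healthy.
    """
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing file or no permission: treat as "not present".
        return False
    return stat.S_ISSOCK(mode)


if __name__ == "__main__":
    if daemon_socket_present():
        print("docker.sock found; the daemon has probably been started")
    else:
        print("docker.sock missing; try: service docker start")
```

If the function returns False, "docker version" would most likely print only the client section followed by the FATA line.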

Concurrency and Parallelism Explained

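Today I came across an explanation of the terms concurrency and parallelism that struck me as very precise. I am recording it here for future reference (the original is here):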

Concurrency and parallelism are two related but distinct concepts.

Concurrency means, essentially, that task A and task B both need to happen independently of each other, and A starts running, and then B starts before A is finished.

There are various different ways of accomplishing concurrency. One of them is parallelism--having multiple CPUs working on the different tasks at the same time. But that's not the only way. Another is by task switching, which works like this: Task A works up to a certain point, then the CPU working on it stops and switches over to task B, works on it for a while, and then switches back to task A. If the time slices are small enough, it may appear to the user that both things are being run in parallel, even though they're actually being processed in serial by a multitasking CPU.

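Concurrency means that multiple tasks in a system have no dependencies on one another and therefore can run at the same time. Take two tasks (A and B) on a system with two CPU cores (CPU A and CPU B). If at a given instant task A is running on CPU A while task B is running on CPU B, the two tasks are running in parallel. On a single-core CPU, only one task can execute at any instant, so the two tasks can only be interleaved by task switching; at any given moment they run sequentially.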

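In one sentence: concurrency describes the logical relationship between tasks (whether they can run at the same time), while parallelism describes their actual state at run time (whether they really are running at the same time).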
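
To make the task-switching case concrete, here is a small sketch of two concurrent tasks sharing a single "CPU". It uses Python generators as a stand-in for tasks and a round-robin queue as a stand-in for a scheduler; the names task and run_round_robin are mine, and a real operating system scheduler is of course far more involved.

```python
from collections import deque


def task(name, steps):
    """A task that yields after each unit of work so it can be switched out."""
    for i in range(1, steps + 1):
        yield f"{name}{i}"


def run_round_robin(tasks):
    """Run tasks on one 'CPU', switching after every step.

    Returns the order in which the units of work were executed.
    """
    queue = deque(tasks)
    trace = []
    while queue:
        current = queue.popleft()
        try:
            trace.append(next(current))  # run one time slice
        except StopIteration:
            continue  # task finished, do not reschedule it
        queue.append(current)  # switch to the next task
    return trace


if __name__ == "__main__":
    # B starts before A has finished, yet only one step runs at a time.
    print(run_round_robin([task("A", 3), task("B", 2)]))
    # prints ['A1', 'B1', 'A2', 'B2', 'A3']
```

The two tasks are concurrent (B starts before A finishes) but never parallel, because every step in the trace runs one after another.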

Perf Notes (2)

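The hardware events handled by perf_events require CPU support, and most mainstream CPUs today include a PMU (Performance Monitoring Unit). The PMU collects performance-related statistics such as cache hit rates and instruction cycles. Because the counting is done in hardware, the CPU overhead is very small.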

Taking the x86 architecture as an example, the PMU contains two kinds of MSRs (Model-Specific Registers, so called because some registers differ between CPU models): Performance Event Select Registers and Performance Monitoring Counters (PMCs). To count a given performance event, you program a Performance Event Select Register, and the result accumulates in the corresponding Performance Monitoring Counter.

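When perf_events works in sampling mode (which is how the perf record command operates), there is a delay between the moment a sampling event occurs and the moment it is actually handled, and CPU pipelining and out-of-order execution add to the effect. As a result, the recorded instruction address (IP, Instruction Pointer) is not the IP of the instruction that triggered the sampling event. This discrepancy is called skid. To reduce it and make the IP more accurate, Intel uses PEBS (Precise Event-Based Sampling), while AMD uses IBS (Instruction-Based Sampling).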

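Take PEBS as an example: each time a sampling event occurs, the hardware first writes the sample data into a buffer (the PEBS buffer), and the buffered records are processed in one batch once the buffer fills to a threshold. Because the hardware captures the IP itself instead of waiting for an interrupt handler to run, skid is greatly reduced.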

Running perf list --help shows the following:

The p modifier can be used for specifying how precise the instruction address should be. The p modifier can be specified multiple times:

       0 - SAMPLE_IP can have arbitrary skid
       1 - SAMPLE_IP must have constant skid
       2 - SAMPLE_IP requested to have 0 skid
       3 - SAMPLE_IP must have 0 skid

For Intel systems precise event sampling is implemented with PEBS which supports up to precise-level 2.

This explains the pp in commonly seen commands such as "perf record -e 'cpu/mem-loads/pp' -a": it specifies the IP precision level, in this case level 2.
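
As a small illustration of how the p modifier maps to a precision level, the sketch below appends the right number of p characters to an event name. The helper with_precise_level is hypothetical (it is not part of perf); the level descriptions are copied from the help text quoted above, and the two spellings it produces ("cycles:pp" and "cpu/mem-loads/pp") are the event syntaxes I am aware of.

```python
# Level descriptions as quoted from `perf list --help` above.
SKID_LEVELS = {
    0: "SAMPLE_IP can have arbitrary skid",
    1: "SAMPLE_IP must have constant skid",
    2: "SAMPLE_IP requested to have 0 skid",
    3: "SAMPLE_IP must have 0 skid",
}


def with_precise_level(event, level):
    """Return `event` with `level` copies of the 'p' modifier appended.

    Hypothetical helper: it only builds the string passed to `perf -e`.
    """
    if level not in SKID_LEVELS:
        raise ValueError(f"precise level must be 0-3, got {level}")
    if level == 0:
        return event
    suffix = "p" * level
    if event.endswith("/"):
        # PMU-style event such as "cpu/mem-loads/": modifiers follow the slash.
        return event + suffix
    # Plain event name such as "cycles": modifiers follow a colon.
    return f"{event}:{suffix}"


if __name__ == "__main__":
    for level, meaning in SKID_LEVELS.items():
        print(with_precise_level("cpu/mem-loads/", level), "->", meaning)
```

Whether a given level is accepted still depends on the CPU and the event, so treat the output as a string to try rather than a guarantee.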

Perf Notes (1)

Perf_events is a profiling/tracing tool widely used on Linux today. Besides being part of the kernel itself, it also provides user-space command-line tools ("perf", "perf-record", "perf-stat", and so on).

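perf_events offers two working modes: sampling and counting. The "perf record" command works in sampling mode: it samples events periodically and records the information, saving it to the perf.data file by default. The "perf stat" command works in counting mode: it only counts how many times an event occurs.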
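
The difference between the two modes can be shown with a toy model. This is not how perf_events is implemented; it only contrasts "keep a total" with "record context every N events". The event stream is a made-up list of function names standing in for the code location of each event, and count_mode and sample_mode are names I chose for the illustration.

```python
def count_mode(events):
    """Counting: keep only the total number of events that occurred."""
    return sum(1 for _ in events)


def sample_mode(events, period):
    """Sampling: on every `period`-th event, record where it happened."""
    samples = []
    for n, location in enumerate(events, start=1):
        if n % period == 0:
            samples.append(location)
    return samples


if __name__ == "__main__":
    # Hypothetical stream: the function executing when each event fired.
    events = ["f", "g", "g", "h", "g", "f", "g", "g"]
    print(count_mode(events))      # total number of events
    print(sample_mode(events, 4))  # locations of the 4th and 8th events
```

Counting answers "how many", while sampling answers "where", at the cost of storing a record for each sample.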

Commands like "perf record -a ...... sleep 10" are common. Here "sleep" acts as a dummy command that does no meaningful work; its purpose is to let "perf record" sample the whole system and stop automatically after 10 seconds.

MiB (Mebibyte) and MB (Megabyte)

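While discussing a problem with a colleague last Friday, I learned that besides MB (Megabyte) there is also MiB (Mebibyte). I am embarrassed to say that when I had seen MiB in a colleague's slides, I assumed it was a typo. Simply put, 1 MiB is 1024^2 bytes, and since MiB was introduced, 1 MB tends to mean 1000^2 bytes. By the same logic, there are also the Kibibyte and the Gibibyte.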
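
The arithmetic is easy to check. The constants and the helper mib_to_mb below are just names I picked for this sketch:

```python
KB, MB, GB = 1000, 1000 ** 2, 1000 ** 3      # decimal (SI) units, in bytes
KIB, MIB, GIB = 1024, 1024 ** 2, 1024 ** 3   # binary (IEC) units, in bytes


def mib_to_mb(mib):
    """Convert a size in MiB to the equivalent size in MB."""
    return mib * MIB / MB


if __name__ == "__main__":
    print(MIB)             # 1048576 bytes in 1 MiB
    print(MB)              # 1000000 bytes in 1 MB
    print(mib_to_mb(1))    # 1.048576
    print(mib_to_mb(512))  # 536.870912
```

The gap is under 5% at the mega scale but grows with each prefix: 1 GiB is about 7.4% larger than 1 GB.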

References:
What is the difference between 1 MiB and 1 MB and why should we care?
Mebibyte

Chatting About CPUs (1)

This series consists of my reading notes on Chapter 6, "CPUs", of Brendan Gregg's book Systems Performance: Enterprise and the Cloud.

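The CPU slot on a system board is called a socket, and a physical CPU chip can be called a processor. CPUs have long since entered the multi-core era: one CPU processor can contain multiple cores, and one core can in turn contain multiple hardware threads. To the operating system, each hardware thread is a logical CPU, that is, a CPU instance that can be scheduled. For example, if a CPU processor contains 4 cores and each core contains 2 hardware threads, the operating system sees 8 usable "CPUs" in total (1 * 4 * 2 = 8).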

Take the output of lscpu as an example:

[root@linux ~]# lscpu
......
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                120
On-line CPU(s) list:   0-119
Thread(s) per core:    2
Core(s) per socket:    15
Socket(s):             4
......

120 = 2 * 15 * 4, that is, CPU(s) = Thread(s) per core * Core(s) per socket * Socket(s).
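
The relationship can be checked mechanically. The sketch below parses lscpu-style "key: value" text and recomputes the logical CPU count; the sample text is the relevant part of the output shown above, and parse_lscpu and logical_cpus are names I made up for the example.

```python
SAMPLE = """\
CPU(s):                120
On-line CPU(s) list:   0-119
Thread(s) per core:    2
Core(s) per socket:    15
Socket(s):             4
"""


def parse_lscpu(text):
    """Turn 'key: value' lines into a dict, ignoring lines without a colon."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def logical_cpus(fields):
    """Threads per core * cores per socket * sockets."""
    return (
        int(fields["Thread(s) per core"])
        * int(fields["Core(s) per socket"])
        * int(fields["Socket(s)"])
    )


if __name__ == "__main__":
    fields = parse_lscpu(SAMPLE)
    print(logical_cpus(fields), "computed;", fields["CPU(s)"], "reported")
```

On a live machine the two numbers can differ, for example when some CPUs are offline, so a mismatch is not necessarily a parsing bug.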

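To improve memory access performance, a CPU processor provides registers and three levels of cache. The full storage hierarchy is shown below (moving down the list, capacity grows and CPU access becomes slower):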
* Register
* L1 cache
* L2 cache
* L3 cache
* Main memory
* Storage device (external storage)

Installing Solaris 11 on VirtualBox

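To make it easier to study DTrace, I spent an afternoon setting up a Solaris 11 virtual machine. I solved essentially every problem with Google. Here is a brief list of the steps, in the hope that it helps anyone with the same need: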

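(1) To install Solaris 11 on VirtualBox, you can largely follow this article;
(2) Because VirtualBox does not provide a scroll bar, I prefer to log in with an ssh client and work from there. By default, Solaris does not allow the root user to log in directly over ssh; to change this, see this article;
(3) The Solaris 11 installation does not include gcc by default; see this Stack Overflow post to download and install it. If you need to configure the DNS service on Solaris 11 along the way, see this post.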

Unix/Linux Command-Line Tips (20): Listing Directories by Size

Use "du --block-size=kB | sort -n" or "du --block-size=kB | sort -nr" to list directories by size in ascending or descending order, respectively.
For example:

[root@localhost /]$ du --block-size=kB | sort -n
 0kB    ./dev/bsg
 0kB    ./dev/bus
......
[root@localhost /]$ du --block-size=kB | sort -nr
 1179418kB    .
 937862kB    ./usr
......
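
For comparison, roughly the same listing can be produced in Python. This is only an approximation of du: it sums apparent file sizes in bytes rather than allocated disk blocks, so the numbers will not match du exactly. The function names dir_sizes and largest_first are my own.

```python
import os


def dir_sizes(root):
    """Map each directory under `root` to its cumulative file size in bytes."""
    sizes = {}
    # Bottom-up walk, so every subdirectory is totalled before its parent.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        total = 0
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # do not follow symbolic links
            try:
                total += os.path.getsize(path)
            except OSError:
                pass  # file vanished or is unreadable
        for name in dirnames:
            # Symlinked directories are not walked, so they contribute 0.
            total += sizes.get(os.path.join(dirpath, name), 0)
        sizes[dirpath] = total
    return sizes


def largest_first(root):
    """Return (directory, bytes) pairs sorted like `sort -nr`."""
    return sorted(dir_sizes(root).items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for path, size in largest_first(".")[:10]:
        print(f"{size // 1000}kB\t{path}")
```
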

Source: https://twitter.com/nixcraft/status/290924082088775681.

Unix/Linux Command-Line Tips (19): Showing System Memory Usage

Use "ps -A --sort -rss -o pid,comm,pmem,rss | less" to show the memory usage of every process, sorted by RSS in descending order.
For example:

[root@home]$ ps -A --sort -rss -o pid,comm,pmem,rss | less
 PID COMMAND         %MEM  RSS
1386 abrtd           0.1   4824
1223 hald            0.0   3888
1423 login           0.0   3420
......

The output shows the percentage of memory each process uses, along with its RSS.
Source: https://twitter.com/nixcraft/status/288158831551332353.
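
If you want to post-process this kind of output, the sketch below parses text in the "pid,comm,pmem,rss" layout and sorts it by RSS. The sample rows are the ones shown above, deliberately shuffled; parse_ps and top_by_rss are names invented for the example, and the parser assumes exactly these four columns with a single header line.

```python
SAMPLE = """\
 PID COMMAND         %MEM  RSS
1423 login           0.0   3420
1386 abrtd           0.1   4824
1223 hald            0.0   3888
"""


def parse_ps(text):
    """Parse `ps -o pid,comm,pmem,rss` output into a list of dicts."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header line
        if not line.strip():
            continue
        pid, rest = line.split(None, 1)
        # Split from the right so a command name with spaces stays intact.
        comm, pmem, rss = rest.rsplit(None, 2)
        rows.append(
            {"pid": int(pid), "comm": comm, "pmem": float(pmem), "rss_kb": int(rss)}
        )
    return rows


def top_by_rss(rows, n):
    """Return the `n` rows with the largest resident set size."""
    return sorted(rows, key=lambda row: row["rss_kb"], reverse=True)[:n]


if __name__ == "__main__":
    for row in top_by_rss(parse_ps(SAMPLE), 3):
        print(row["pid"], row["comm"], row["rss_kb"])
```

Sorting in Python gives the same order as "--sort -rss", which is handy when the listing comes from a saved file rather than a live ps call.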
