技术 | 我的站点

C语言中使用scanf的一些注意事项

（1）scanf读取double类型数据应该使用%lf；而使用printf打印double时，要注意%lf用在printf中是C99才支持的，因此如果编译器不支持C99则要使用%f：

double d;
scanf("%lf", &d);
printf("%f", d);

参考资料：
Reading in double values with scanf in c；
Why does scanf() need “%lf” for doubles, when printf() is okay with just “%f”?。

（2）

char str[10];
scanf ("%[^\n]%*c", str);

%[^\n]含义是从stdin读取输入保存到str，直到遇到第一个\n；而%*c则丢弃掉这个\n。
参考资料：
scanf: “%[^\n]” skips the 2nd input but “ %[^\n]” does not. why?；
What does scanf(“%*[^\n]%*c”) mean?。

（3）限制输入字符串的长度，预留结尾的NUL：

char str[10]
scanf("%9s", str);

由一道题理解DFS算法

GENERATE PARENTHESES这篇文章讲解了如何使用Depth First Search，即DFS算法生成所有有效的括号组合：

void generateParentheses(int n) {
    dfs("", n, n);
}
void dfs(String s, int left, int right) // build the set of solutions using DFS approach
{
    if(left == 0 && right == 0) // BASE CASE: there is no more parentheses to add? we have a solution!
        System.out.println(s);
    if(left > 0) // while we have left parentheses to add, just add them
        dfs(s + "(", left - 1, right); // we call our function recursively with a left parentheses added
    if(right > left) // We are gonna add a right parentheses if we have more right parentheses than left ones
        dfs(s + ")", left, right - 1);
}

由此我也总结了一下DFS算法：
（1）DFS算法也是使用递归来解决问题；
（2）

if(left == 0 && right == 0) // BASE CASE: there is no more parentheses to add? we have a solution!
    System.out.println(s);

上述代码是递归的终止条件。
（3）

if(left > 0) // while we have left parentheses to add, just add them
    dfs(s + "(", left - 1, right); // we call our function recursively with a left parentheses added
if(right > left) // We are gonna add a right parentheses if we have more right parentheses than left ones
    dfs(s + ")", left, right - 1);

每调用一次dfs()函数，会为字符串添加(或)。

if(left > 0) // while we have left parentheses to add, just add them
    dfs(s + "(", left - 1, right); // we call our function recursively with a left parentheses added

会递归地生成当前位置是(的所有情况。if(right > left)分支则会生成当前位置是)的所有情况。

strace命令介绍

strace是Linux上的一个很好用的工具，它可以用来输出程序在运行过程中发生的系统调用以及收到的信号的相关信息，因此在调试和诊断问题时有很大的帮助，特别是在程序没有源码，或是在前期做一些粗略的分析时。strace命令格式如下：

strace [options] command [args]

举个例子：

＃ strace sleep 300
execve("/usr/bin/sleep", ["sleep", "300"], [/* 24 vars */]) = 0
brk(0)                                  = 0x22fa000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f70d1ef8000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
......
--- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=20243, si_uid=0} ---
......

从上面例子可以看出，对于系统调用，比如open，access，strace都会输出详细的参数和返回值，如果发生了错误，也会输出细致的错误信息。而对于接收到的信号，除了输出信息外，还要注意信号信息的前后都加了“---”，以示与系统调用的区别。

以下是一些常用的选项：
（1）-o：把strace执行结果输出到指定文件里：

# strace -o out ls

（2）-t：打印时间：

# strace -t ls
10:30:07 execve("/usr/bin/ls", ["ls"], [/* 24 vars */]) = 0
10:30:07 brk(0)
......

(3）-e：只关注某一系统调用：

# strace -e open ls
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3
......

（4）-y：显示和文件描述符关联的文件路径：

# strace -y ls
......
fstat(3</etc/ld.so.cache>, {st_mode=S_IFREG|0644, st_size=32951, ...}) = 0
mmap(NULL, 32951, PROT_READ, MAP_PRIVATE, 3</etc/ld.so.cache>, 0) = 0x7fba3db13000
close(3</etc/ld.so.cache>)              = 0
......

（5）-f：追踪运行进程所生成的子进程。

参考资料：
strace(1) – Linux man page；
A swiss army knife of debugging tools。

什么是semaphore？

以下摘自The Little Book of Semaphores：

A semaphore is like an integer, with three differences:
1. When you create the semaphore, you can initialize its value to any integer, but after that the only operations you are allowed to perform are increment (increase by one) and decrement (decrease by one). You cannot read the current value of the semaphore.
2. When a thread decrements the semaphore, if the result is negative, the thread blocks itself and cannot continue until another thread increments the semaphore.
3. When a thread increments the semaphore, if there are other threads waiting, one of the waiting threads gets unblocked.
To say that a thread blocks itself (or simply “blocks”) is to say that it notifies the scheduler that it cannot proceed. The scheduler will prevent the thread from running until an event occurs that causes the thread to become unblocked. In the tradition of mixed metaphors in computer science, unblocking is often called “waking”.
That’s all there is to the definition, but there are some consequences of the definition you might want to think about.
• In general, there is no way to know before a thread decrements a semaphore whether it will block or not (in specific cases you might be able to prove that it will or will not).
• After a thread increments a semaphore and another thread gets woken up, both threads continue running concurrently. There is no way to know which thread, if either, will continue immediately.
• When you signal a semaphore, you don’t necessarily know whether another thread is waiting, so the number of unblocked threads may be zero or one.
Finally, you might want to think about what the value of the semaphore means. If the value is positive, then it represents the number of threads that can decrement without blocking. If it is negative, then it represents the number of threads that have blocked and are waiting. If the value is zero, it means there are no threads waiting, but if a thread tries to decrement, it will block.

semaphore的初始值为1，通常就用来构建mutex。

Linux kernel 笔记（63）——改变启动的kernel

原文在这里。

得到当前系统运行的kernel（系统为CentOS）：

# egrep ^menuentry /etc/grub2.cfg | cut -f 2 -d \'
CentOS Linux (4.8.3) 7 (Core)
CentOS Linux (3.10.0-327.el7.x86_64) 7 (Core)
CentOS Linux (0-rescue-d07a2009dd34415fa45624985dccbdf6) 7 (Core)

使用grub2-set-default改变启动的kernel：

# grub2-set-default 0

如果仅仅想生效一次，可以使用grub2-reboot命令：

＃ grub2-reboot 0

inode，“hard link”和“symbol link”

在*nix文件系统上，每个文件的存储实际可以看成包含两部分：inode和实际存储文件内容的数据块。其中inode存储文件的metadata，包含创建时间，访问权限，等等，当然还有指向文件具体数据块的指针。正是通过这个指针，将indoe和数据块关联起来。

要注意，inode中并不保存文件的名字。关于文件名字和inode的映射存储在目录文件中。因此，当访问一个文件时，其实是通过这个文件所在的目录文件访问到这个文件的inode信息，继而进行文件操作的。

接下来，看一下hard link，symbol link和inode之间的关系。首先创建一个文件和指向这个文件的hard link和symbol link：

# echo 'Hello, World!' > myfile.txt
# ln myfile.txt my-hard-link
# ln -s myfile.txt my-soft-link

查看这3个文件的inode信息：

# ls -ailt my*
325332 lrwxr-xr-x  1 root  wheel  10 Oct 24 05:26 my-soft-link -> myfile.txt
325331 -rw-r--r--  2 root  wheel  14 Oct 24 05:25 my-hard-link
325331 -rw-r--r--  2 root  wheel  14 Oct 24 05:25 myfile.txt

可以看到myfile.txt和my-hard-link其实对应的是同一个inode节点：325331，而my-soft-link对应的是另一个inode节点：325332。接下来删除myfile.txt，然后分别读取my-hard-link和my-soft-link文件内容：

# rm myfile.txt
# ls -ailt my*
325332 lrwxr-xr-x  1 root  wheel  10 Oct 24 05:26 my-soft-link -> myfile.txt
325331 -rw-r--r--  1 root  wheel  14 Oct 24 05:25 my-hard-link
# cat my-hard-link
Hello, World!
# cat my-soft-link
cat: my-soft-link: No such file or directory

可以看到，因为my-hard-link和myfile.txt对应相同的inode节点：325331，因此删除myfile.txt后，仍然可以通过my-hard-link读取325331这个inode节点所对应的文件内容。而my-soft-link仅仅是指向myfile.txt这个文件名字，因此一旦myfile.txt被删除，也就无法读取文件内容了。

参考资料：
Inodes – an Introduction；
What is the difference between a symbolic link and a hard link?。

得到一个有符号数的符号位

Compute the sign of an integer描述了如何得到一个有符号数的符号位：

int v;      // we want to find the sign of v
int sign;   // the result goes here 

// CHAR_BIT is the number of bits per byte (normally 8).
sign = -(v < 0);  // if v < 0 then -1, else 0. 
// or, to avoid branching on CPUs with flag registers (IA32):
sign = -(int)((unsigned int)((int)v) >> (sizeof(int) * CHAR_BIT - 1));
// or, for one less instruction (but not portable):
sign = v >> (sizeof(int) * CHAR_BIT - 1);

关于方法1：

sign = -(v < 0);  // if v < 0 then -1, else 0.

参考C规范6.5.8:6：

Each of the operators < (less than), > (greater than), <= (less than or equal to), and >= (greater than or equal to) shall yield 1 if the specified relation is true and 0 if it is false.92) The result has type int.

因此(v < 0)会得到1或0。

关于方法2和3：

// or, to avoid branching on CPUs with flag registers (IA32):
sign = -(int)((unsigned int)((int)v) >> (sizeof(int) * CHAR_BIT - 1));
// or, for one less instruction (but not portable):
sign = v >> (sizeof(int) * CHAR_BIT - 1);

参考stackoverflow这篇帖子。因为有符号数类型中的负数右移是C规范中未定义行为，取决与具体体系结构的实现，因此需要先把它转化成无符号数进行右移操作：

((unsigned int)((int)v) >> (sizeof(int) * CHAR_BIT - 1))

Linux下使用vmstat命令获得系统CPU的使用状态

本文是使用vmstat命令监控CPU使用的续文。

在Linux下使用vmstat命令可以得到系统CPU的使用状态：

# vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0      0 1860352    948 131040    0    0  2433   137  252  897  2  7 90  1  0

其中描述CPU状态的是最后5列：

------cpu-----
us sy id wa st
2  7 90  1  0

要注意，上面数字的含义是百分比。即CPU运行user space程序的时间占2%，。。。

各列含义如下：

us（user time）：CPU运行user space代码的时间；
sy（system time）：CPU运行kernel代码的时间，比如执行系统调用；
id（idle time）：CPU处于idle状态的时间；
wa（IO-wait time）：CPU处于idle状态，因为所有正在运行的进程都在等待I/O操作完成，因此当前无可以调度的进程；
st（stolen time）：CPU花费在执行系统上运行的虚拟机的时间。

参考资料：
The precise meaning of I/O wait time in Linux；
Linux Performance Analysis in 60,000 Milliseconds。

Memoization和dynamic programming

这篇笔记摘自Tutorial for Dynamic Programming：

Dynamic programming (usually referred to as DP ) is a very powerful technique to solve a particular class of problems. It demands very elegant formulation of the approach and simple thinking and the coding part is very easy. The idea is very simple, If you have solved a problem with the given input, then save the result for future reference, so as to avoid solving the same problem again.. shortly ‘Remember your Past’ 🙂 . If the given problem can be broken up in to smaller sub-problems and these smaller subproblems are in turn divided in to still-smaller ones, and in this process, if you observe some over-lappping subproblems, then its a big hint for DP. Also, the optimal solutions to the subproblems contribute to the optimal solution of the given problem ( referred to as the Optimal Substructure Property ).

There are two ways of doing this.

1.) Top-Down : Start solving the given problem by breaking it down. If you see that the problem has been solved already, then just return the saved answer. If it has not been solved, solve it and save the answer. This is usually easy to think of and very intuitive. This is referred to as Memoization.

2.) Bottom-Up : Analyze the problem and see the order in which the sub-problems are solved and start solving from the trivial subproblem, up towards the given problem. In this process, it is guaranteed that the subproblems are solved before solving the problem. This is referred to as Dynamic Programming.

Memoization是一种“自顶向下”解决问题的方式，并且顾名思义，它具有保存结果的含义：它把一个问题细化成子问题，如果子问题已经解决，就直接获得结果，反正则解决子问题，并把结果保存起来。
Dynamic Programming则是“自下而上”的解决问题方式，它从处理子问题入手，在解决最终问题之前，保证它的所有子问题都已解决。

分类：技术