My Site

Casual scribbles of a systems software engineer

Month: July 2015 (Page 1 of 5)

Shark Code Analysis Notes (1): Makefile

The Makefile of the Shark project:

#Define BPF_ENABLE if Linux kernel is 4.0+
#BPF_DISABLE=1

CFLAGS=-I. -I core/ -I core/libuv/include -I core/luajit/src/ -I bpf/libbpf/

CORE_LIB=core/luajit/src/libluajit.a core/libuv/.libs/libuv.a

PERF_LIBS= perf/libperf.a perf/libtraceevent.a perf/libapikfs.a
LIB=$(CORE_LIB) $(PERF_LIBS) -lm -ldl -lelf -lc -lpthread

OBJS=core/shark.o core/luv/luv.o perf/perf.o

BUILTIN_LUA_OBJS = perf/perf_builtin_lua.o
OBJS += $(BUILTIN_LUA_OBJS)

ifndef BPF_DISABLE
OBJS += bpf/bpf.o bpf/libbpf/bpf_load.o bpf/libbpf/libbpf.o
BUILTIN_LUA_OBJS += bpf/bpf_builtin_lua.o
else
CFLAGS += -DBPF_DISABLE
endif

TARGET=shark

#ffi need to call some functions in library, so add -rdynamic option
$(TARGET) : core/luajit/src/libluajit.a core/libuv/.libs/libuv.a core/shark_init.h $(OBJS) force
    $(CC) -o $(TARGET) -rdynamic $(OBJS) $(LIB)

core/luajit/src/libluajit.a:
    @cd core/luajit; make

core/libuv/.libs/libuv.a:
    @cd core/libuv; ./autogen.sh; ./configure; make


DEPS := $(OBJS:.o=.d)
-include $(DEPS)

%.o : %.c
    $(CC) -MD -g -c $(CFLAGS) $< -o $@

LUAJIT_BIN=core/luajit/src/luajit

core/shark_init.h : core/shark_init.lua
    cd core/luajit/src; ./luajit -b ../../shark_init.lua ../../shark_init.h

bpf/bpf_builtin_lua.o : bpf/bpf.lua
    cd core/luajit/src; ./luajit -b ../../../bpf/bpf.lua ../../../bpf/bpf_builtin_lua.o

perf/perf_builtin_lua.o : perf/perf.lua
    cd core/luajit/src; ./luajit -b ../../../perf/perf.lua ../../../perf/perf_builtin_lua.o

force:
    true

clean:
    @rm -rf $(TARGET) *.d *.o core/*.d core/*.o bpf/*.d bpf/*.o perf/*.d perf/*.o core/shark_builtin.h bpf/bpf_builtin_lua.h perf/perf_builtin_lua.h

(1) Some BPF options are only supported on newer Linux kernels, so a build switch, BPF_DISABLE, was added to turn the BPF feature off (to build without BPF: make BPF_DISABLE=1).

(2)

PERF_LIBS= perf/libperf.a perf/libtraceevent.a perf/libapikfs.a

core/luajit/src/libluajit.a:
    @cd core/luajit; make

core/libuv/.libs/libuv.a:
    @cd core/libuv; ./autogen.sh; ./configure; make

Three perf-related libraries built from the Linux kernel tree are used: perf/libperf.a, perf/libtraceevent.a, and perf/libapikfs.a.

There are also three third-party libraries: luajit (core/luajit), libuv (core/libuv), and luv (core/luv).

(3)

core/shark_init.h : core/shark_init.lua
    cd core/luajit/src; ./luajit -b ../../shark_init.lua ../../shark_init.h

bpf/bpf_builtin_lua.o : bpf/bpf.lua
    cd core/luajit/src; ./luajit -b ../../../bpf/bpf.lua ../../../bpf/bpf_builtin_lua.o

perf/perf_builtin_lua.o : perf/perf.lua
    cd core/luajit/src; ./luajit -b ../../../perf/perf.lua ../../../perf/perf_builtin_lua.o

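core/shark_init.lua is used to generate core/shark_init.h, while bpf/bpf.lua and perf/perf.lua are used to generate bpf/bpf_builtin_lua.o and perf/perf_builtin_lua.o respectively. (luajit -b saves the compiled bytecode and picks the output format from the file extension: a C header for .h, an object file for .o.)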

(4)

DEPS := $(OBJS:.o=.d)
-include $(DEPS)

%.o : %.c
    $(CC) -MD -g -c $(CFLAGS) $< -o $@

This compiles each source file into an object file and, because of -MD, also writes a dependency (.d) file next to it.
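To make the $(OBJS:.o=.d) substitution reference concrete, here is a small Python sketch (not part of Shark) that imitates what make does with it: every word ending in .o has that suffix replaced by .d, and other words pass through unchanged. The helper name subst_suffix is hypothetical.

```python
def subst_suffix(words, old, new):
    """Imitate make's $(VAR:old=new): swap a trailing `old` for `new` in each word."""
    return [w[: -len(old)] + new if w.endswith(old) else w for w in words]


# A few of the object files listed in OBJS above.
objs = "core/shark.o core/luv/luv.o perf/perf.o".split()
deps = subst_suffix(objs, ".o", ".d")
print(" ".join(deps))  # prints: core/shark.d core/luv/luv.d perf/perf.d
```

These are the files that -include $(DEPS) pulls in; the leading "-" tells make not to complain when they do not exist yet, as on a first build.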

(5)

TARGET=shark

#ffi need to call some functions in library, so add -rdynamic option
$(TARGET) : core/luajit/src/libluajit.a core/libuv/.libs/libuv.a core/shark_init.h $(OBJS) force
    $(CC) -o $(TARGET) -rdynamic $(OBJS) $(LIB)

The final build step produces a single executable: shark.

*NIX & Hacking, Issue 6

Putting together a magazine of things that interest me; it's that simple!

CPU

Is there a way to dump a CPU’s CPUID information?

DTrace

Reducing RAM usage in pkgin

Git

365GIT
Deal with git am failures
Ry’s Git Tutorial

Golang

cgasm
Embedding Lua in Go
GopherCon 2015
GoWork
go-torch

Kernel

Getting into Linux Kernel Development
KernelDebuggingTricks
vmlinuz Definition

Python

Functional Programming in Python
Fun with BPF, or, shutting down a TCP listening socket the hard way

Unix

On Hurd, Linux and the (mis)adventures of cross-compiling a GNU Hurd toolchain
Unix as IDE: Introduction

Easter egg

What are some good computer tricks that are not commonly known?
Why aren’t there a lot of old programmers at software companies?
x86 Exploitation 101: “House of Lore” – People and traditions

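A Brief Introduction to git patch

This post gives a brief introduction to git patches.

First, create a working directory that contains a git repository: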

git init git_repo

Next, create a text file (a.txt) in this directory:

aaaa
bbbb
cccc
dddd
eeee
ffff

Add the file to git version control:

git add a.txt
git commit -m "Initialize a.txt"

Then create a branch named patch; all the remaining changes are made on this branch:

git checkout -b patch

Then change bbbb on the second line of a.txt to bb11:

aaaa
bb11
cccc
dddd
eeee
ffff

Commit:

git commit -a -m "Modify a.txt"

Next, generate a patch relative to the master branch:

[root@localhost git_repo]# git format-patch master
0001-Modify-a.txt.patch

Take a look at the patch file 0001-Modify-a.txt.patch:

From 9512ec20468586e0632ece9e97e4e89b3a68c40e Mon Sep 17 00:00:00 2001
From: root <root@localhost.localdomain>
Date: Thu, 30 Jul 2015 02:30:45 -0400
Subject: [PATCH 1/3] Modify a.txt

---
 a.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/a.txt b/a.txt
index 1707b56..1ba3da1 100644
--- a/a.txt
+++ b/a.txt
@@ -1,5 +1,5 @@
 aaaa
-bbbb
+bb11
 cccc
 dddd
 eeee
--
2.4.3

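Pay particular attention to @@ -1,5 +1,5 @@: -1,5 describes the original file and +1,5 describes the modified file. The 1 is the starting line number, and the 5 is the number of lines covered, counting from that starting line. Below it, -bbbb is the content removed from the original file and +bb11 is the content in the modified file.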
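The hunk header described above can be picked apart mechanically. The following is a minimal Python sketch, not anything git itself ships; the function name parse_hunk_header is hypothetical. It relies on the unified diff convention that an omitted line count means 1.

```python
import re

HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parse_hunk_header(line):
    """Return ((old_start, old_count), (new_start, new_count)) for a hunk header."""
    m = HUNK_RE.match(line)
    if m is None:
        raise ValueError("not a hunk header: %r" % line)
    old_start, old_count, new_start, new_count = m.groups()
    # In the unified diff format an omitted count means 1.
    return (
        (int(old_start), int(old_count or 1)),
        (int(new_start), int(new_count or 1)),
    )


old, new = parse_hunk_header("@@ -1,5 +1,5 @@")
print("original: starts at line %d, covers %d lines" % old)
print("modified: starts at line %d, covers %d lines" % new)
```

For the patch in this post both sides start at line 1 and cover 5 lines, because one line was replaced and none were added or dropped.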

Now switch back to the master branch:

[root@localhost git_repo]# git checkout master
Switched to branch 'master'

Look at a.txt on the master branch:

[root@localhost git_repo]# cat a.txt
aaaa
bbbb
cccc
dddd
eeee
ffff

Nothing has changed.

Next, apply the patch with git am:

[root@localhost git_repo]# git am < 0001-Modify-a.txt.patch
Applying: Modify a.txt

Check a.txt again:

[root@localhost git_repo]# cat a.txt
aaaa
bb11
cccc
dddd
eeee
ffff

You can see that the patch has been applied to a.txt successfully.

Linux Kernel IOMMU Code Analysis Notes (4): The Fault Handler

dmar_fault is the handler that runs when a DMA remapping fault or an interrupt remapping fault occurs. The code is as follows:

irqreturn_t dmar_fault(int irq, void *dev_id)
{
    struct intel_iommu *iommu = dev_id;
    int reg, fault_index;
    u32 fault_status;
    unsigned long flag;

    raw_spin_lock_irqsave(&iommu->register_lock, flag);
    fault_status = readl(iommu->reg + DMAR_FSTS_REG);
    if (fault_status)
        pr_err("DRHD: handling fault status reg %x\n", fault_status);

    /* TBD: ignore advanced fault log currently */
    if (!(fault_status & DMA_FSTS_PPF))
        goto unlock_exit;

    fault_index = dma_fsts_fault_record_index(fault_status);
    reg = cap_fault_reg_offset(iommu->cap);
    while (1) {
        u8 fault_reason;
        u16 source_id;
        u64 guest_addr;
        int type;
        u32 data;

        /* highest 32 bits */
        data = readl(iommu->reg + reg +
                fault_index * PRIMARY_FAULT_REG_LEN + 12);
        if (!(data & DMA_FRCD_F))
            break;

        fault_reason = dma_frcd_fault_reason(data);
        type = dma_frcd_type(data);

        data = readl(iommu->reg + reg +
                fault_index * PRIMARY_FAULT_REG_LEN + 8);
        source_id = dma_frcd_source_id(data);

        guest_addr = dmar_readq(iommu->reg + reg +
                fault_index * PRIMARY_FAULT_REG_LEN);
        guest_addr = dma_frcd_page_addr(guest_addr);
        /* clear the fault */
        writel(DMA_FRCD_F, iommu->reg + reg +
            fault_index * PRIMARY_FAULT_REG_LEN + 12);

        raw_spin_unlock_irqrestore(&iommu->register_lock, flag);

        dmar_fault_do_one(iommu, type, fault_reason,
                source_id, guest_addr);

        fault_index++;
        if (fault_index >= cap_num_fault_regs(iommu->cap))
            fault_index = 0;
        raw_spin_lock_irqsave(&iommu->register_lock, flag);
    }

    writel(DMA_FSTS_PFO | DMA_FSTS_PPF, iommu->reg + DMAR_FSTS_REG);

unlock_exit:
    raw_spin_unlock_irqrestore(&iommu->register_lock, flag);
    return IRQ_HANDLED;
}  

Let's walk through this code:

(1)

fault_status = readl(iommu->reg + DMAR_FSTS_REG);
if (fault_status)
    pr_err("DRHD: handling fault status reg %x\n", fault_status);

/* TBD: ignore advanced fault log currently */
if (!(fault_status & DMA_FSTS_PPF))
    goto unlock_exit;

fault_index = dma_fsts_fault_record_index(fault_status);

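DMAR_FSTS_REG is the Fault Status Register. Whether its PPF (Primary Pending Fault) bit is set indicates whether the fault recording registers still hold fault information. fault_index is the index of the first fault recording register that holds fault information.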
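As a rough illustration of how the status value is picked apart, here is a Python sketch that mirrors the macros used above. The bit positions are written from my recollection of include/linux/intel-iommu.h, so treat them as assumptions to check against your kernel tree; describe_fault_status is a hypothetical helper, not a kernel function.

```python
# Assumed bit layout (verify against include/linux/intel-iommu.h):
DMA_FSTS_PFO = 1 << 0  # Primary Fault Overflow
DMA_FSTS_PPF = 1 << 1  # Primary Pending Fault


def fault_record_index(fault_status):
    """Mirror dma_fsts_fault_record_index(): bits 15:8 of the status value."""
    return (fault_status >> 8) & 0xFF


def describe_fault_status(fault_status):
    """Summarize the fields dmar_fault() looks at before entering its loop."""
    return {
        "pending": bool(fault_status & DMA_FSTS_PPF),
        "overflow": bool(fault_status & DMA_FSTS_PFO),
        "first_index": fault_record_index(fault_status),
    }


# Hypothetical status value: PPF set, first pending record at index 3.
print(describe_fault_status(0x0302))
```

When "pending" is false the handler jumps straight to unlock_exit, which is why first_index only matters when PPF is set.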

(2)

reg = cap_fault_reg_offset(iommu->cap);

This computes the address offset of the fault recording registers.

(3) The while loop that follows reads the fault recording registers that contain fault information.

(4)

    data = readl(iommu->reg + reg +
            fault_index * PRIMARY_FAULT_REG_LEN + 12);
    if (!(data & DMA_FRCD_F))
        break;

    fault_reason = dma_frcd_fault_reason(data);
    type = dma_frcd_type(data);

    data = readl(iommu->reg + reg +
            fault_index * PRIMARY_FAULT_REG_LEN + 8);
    source_id = dma_frcd_source_id(data);

    guest_addr = dmar_readq(iommu->reg + reg +
            fault_index * PRIMARY_FAULT_REG_LEN);
    guest_addr = dma_frcd_page_addr(guest_addr);
    /* clear the fault */
    writel(DMA_FRCD_F, iommu->reg + reg +
        fault_index * PRIMARY_FAULT_REG_LEN + 12);

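This reads the fault information from the fault recording registers. Note that because x86 is little-endian, the high-order part of a register sits at the higher memory addresses, which is why the highest 32 bits are read at offset +12. Also, after reading each fault recording register, DMA_FRCD_F has to be written back to it (the "clear the fault" step) to tell the hardware that software has finished reading the record.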
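The byte offsets 0, 8 and 12 are easier to see on a concrete 16-byte record. The Python sketch below decodes one little-endian fault record the same way the loop does. The field positions mirror the dma_frcd_* macros as I remember them, and PAGE_SHIFT = 12 is assumed; decode_fault_record and the sample values are hypothetical.

```python
import struct

DMA_FRCD_F = 1 << 31  # assumed: F (fault) bit in the highest 32 bits
PAGE_SHIFT = 12       # assumed: 4 KiB pages


def decode_fault_record(raw):
    """Decode one 16-byte fault recording register image (little-endian)."""
    if len(raw) != 16:
        raise ValueError("a fault record is 16 bytes")
    # Offsets 0, 8 and 12 match the three reads in dmar_fault().
    (low64,) = struct.unpack_from("<Q", raw, 0)
    (mid32,) = struct.unpack_from("<I", raw, 8)
    (high32,) = struct.unpack_from("<I", raw, 12)
    if not high32 & DMA_FRCD_F:
        return None  # F bit clear: this register holds no fault
    return {
        "fault_reason": high32 & 0xFF,
        "type": (high32 >> 30) & 1,
        "source_id": mid32 & 0xFFFF,
        "page_addr": low64 & ~((1 << PAGE_SHIFT) - 1),
    }


# Made-up record: type bit set, reason 6, source-id 0x0010, address 0x12345678.
raw = struct.pack("<QII", 0x12345678, 0x0010, DMA_FRCD_F | (1 << 30) | 0x06)
print(decode_fault_record(raw))
```

Because the record is packed little-endian, the 32 bits holding the F bit end up in the last four bytes, which corresponds to the "+ 12" in the kernel code.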

(5) dmar_fault_do_one prints the fault information in a formatted way. Its code is as follows:

static int dmar_fault_do_one(struct intel_iommu *iommu, int type,
        u8 fault_reason, u16 source_id, unsigned long long addr)
{
    const char *reason;
    int fault_type;

    reason = dmar_get_fault_reason(fault_reason, &fault_type);

    if (fault_type == INTR_REMAP)
        pr_err("INTR-REMAP: Request device [[%02x:%02x.%d] "
               "fault index %llx\n"
            "INTR-REMAP:[fault reason %02d] %s\n",
            (source_id >> 8), PCI_SLOT(source_id & 0xFF),
            PCI_FUNC(source_id & 0xFF), addr >> 48,
            fault_reason, reason);
    else
        pr_err("DMAR:[%s] Request device [%02x:%02x.%d] "
               "fault addr %llx \n"
               "DMAR:[fault reason %02d] %s\n",
               (type ? "DMA Read" : "DMA Write"),
               (source_id >> 8), PCI_SLOT(source_id & 0xFF),
               PCI_FUNC(source_id & 0xFF), addr, fault_reason, reason);
    return 0;
}
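The source_id arithmetic in the pr_err() calls can be tried out in isolation. The Python sketch below repeats the same bit operations as PCI_SLOT() and PCI_FUNC() (upper five and lower three bits of the devfn byte); format_source_id is a hypothetical helper and the sample IDs are made up.

```python
def pci_slot(devfn):
    """Same arithmetic as the kernel's PCI_SLOT(): upper 5 bits of devfn."""
    return (devfn >> 3) & 0x1F


def pci_func(devfn):
    """Same arithmetic as the kernel's PCI_FUNC(): lower 3 bits of devfn."""
    return devfn & 0x07


def format_source_id(source_id):
    """Render a 16-bit source-id as bus:slot.func, like the pr_err() format string."""
    devfn = source_id & 0xFF
    return "%02x:%02x.%d" % (source_id >> 8, pci_slot(devfn), pci_func(devfn))


print(format_source_id(0x00FA))  # prints: 00:1f.2
```

The result has the same bus:device.function shape that lspci shows, which makes it easy to match a logged fault to a device.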

(6)

writel(DMA_FSTS_PFO | DMA_FSTS_PPF, iommu->reg + DMAR_FSTS_REG);

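Finally, writing DMA_FSTS_PFO | DMA_FSTS_PPF to the Fault Status Register indicates that software has finished handling all the fault information.

LuaJIT Notes (1): Introduction to LuaJIT

LuaJIT (official site: http://luajit.org/) is a JIT (Just-In-Time) compiler for the Lua language. It currently fully supports Lua 5.1.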

After downloading the LuaJIT source code, building and installing it is simple:

make
make install

This produces an executable, luajit (actually a symbolic link):

[root@Fedora lua_program]# ls -lt /usr/local/bin/luajit
lrwxrwxrwx. 1 root root 12 Jul 27 21:06 /usr/local/bin/luajit -> luajit-2.0.4

luajit can be used to run Lua scripts and statements:

[root@Fedora lua_program]# luajit
LuaJIT 2.0.4 -- Copyright (C) 2005-2015 Mike Pall. http://luajit.org/
JIT: ON CMOV SSE2 SSE3 fold cse dce fwd dse narrow loop abc sink fuse
> print("Hello world\n")
Hello world

The install also produces shared and static libraries for applications to link against:

lrwxrwxrwx.  1 root root      22 Jul 27 21:06 libluajit-5.1.so -> libluajit-5.1.so.2.0.4
lrwxrwxrwx.  1 root root      22 Jul 27 21:06 libluajit-5.1.so.2 -> libluajit-5.1.so.2.0.4
-rwxr-xr-x.  1 root root  458144 Jul 27 21:06 libluajit-5.1.so.2.0.4
-rw-r--r--.  1 root root  790748 Jul 27 21:06 libluajit-5.1.a

Take the following C program as an example:

#include <stdio.h>
#include <string.h>
#include "luajit-2.0/lua.h"
#include "luajit-2.0/lualib.h"
#include "luajit-2.0/lauxlib.h"


int main (void) {
    char buff[256];
    int error;
    lua_State *L = lua_open();   /* opens Lua */
    luaL_openlibs(L);

    while (fgets(buff, sizeof(buff), stdin) != NULL) {
        error = luaL_loadbuffer(L, buff, strlen(buff), "line") ||
                lua_pcall(L, 0, 0, 0);
        if (error) {
          fprintf(stderr, "%s", lua_tostring(L, -1));
          lua_pop(L, 1);  /* pop error message from the stack */
        }
    }

    lua_close(L);
    return 0;
}

Compile (linking against the LuaJIT library):

gcc -g -o a a.c -lluajit-5.1

Run:

[root@Fedora test]# ./a
print("hello")
hello

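Linux Kernel Notes (10): A Brief Introduction to the Commands for Building and Installing the Linux Kernel

Commands commonly used when building and installing the Linux kernel: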

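make
Builds the Linux kernel image file, i.e. vmlinuz.

make modules
Builds every configuration item that was set to M during configuration into its own module (items set to Y are already built into vmlinuz, and items set to N are ignored). These modules are linked against the newly built kernel image.

make install
Installs the vmlinuz file, for example by saving it into the /boot directory.

make modules_install
Installs the module files into /lib/modules or /lib/modules/<version>.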
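To see which options fall into the Y, M and N groups mentioned above, one can scan a .config file. The Python sketch below is a simplified illustration (classify_config is a hypothetical helper and the sample options are only examples): it relies on the usual .config line shapes CONFIG_X=y, CONFIG_X=m and "# CONFIG_X is not set", and ignores options with other values.

```python
NOT_SET = " is not set"


def classify_config(text):
    """Group CONFIG_ symbols from .config-style text into built-in, module and unset."""
    groups = {"y": [], "m": [], "n": []}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(NOT_SET):
            # Strip the leading "# " and the trailing " is not set".
            groups["n"].append(line[2 : -len(NOT_SET)])
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            if value in ("y", "m"):
                groups[value].append(name)
    return groups


sample = """\
CONFIG_EXT4_FS=y
CONFIG_BTRFS_FS=m
# CONFIG_XFS_FS is not set
CONFIG_HZ=250
"""
print(classify_config(sample))
```

In this sample only CONFIG_BTRFS_FS would be produced by make modules; CONFIG_EXT4_FS is part of the kernel image and CONFIG_XFS_FS is skipped entirely.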

References:
What happens in each step of the Linux kernel-building process?

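Linux Kernel Notes (9): How to Understand “make oldconfig”?

After copying the .config file of an older kernel into the source directory of a newer kernel, run “make oldconfig”. It checks whether the existing .config file is consistent with the rules in the Kconfig files. If they are consistent it does nothing; otherwise it prompts the user about the options that exist in the source code but are missing from the .config file.

References: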
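The core of that check is a set difference: options the new source tree knows about minus options the old .config already mentions. The Python sketch below illustrates only that idea. The real tool parses the Kconfig files and their dependencies, whereas here the list of known symbols is simply passed in; config_symbols, new_symbols and the CONFIG_A/B/C names are hypothetical.

```python
import re

# Matches both "CONFIG_X=value" and "# CONFIG_X is not set".
SYMBOL_RE = re.compile(r"^(?:# )?(CONFIG_\w+)(?:=| is not set)")


def config_symbols(config_text):
    """Collect every symbol a .config mentions, whether set or explicitly unset."""
    found = set()
    for line in config_text.splitlines():
        m = SYMBOL_RE.match(line.strip())
        if m:
            found.add(m.group(1))
    return found


def new_symbols(old_config_text, kconfig_symbols):
    """Symbols the new tree knows about but the old .config never mentions."""
    return sorted(set(kconfig_symbols) - config_symbols(old_config_text))


old_config = "CONFIG_A=y\n# CONFIG_B is not set\n"
print(new_symbols(old_config, ["CONFIG_A", "CONFIG_B", "CONFIG_C"]))  # prints: ['CONFIG_C']
```

Note that "# CONFIG_B is not set" counts as an answer already given, so only CONFIG_C would need a prompt in this example.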
What does “make oldconfig” do exactly – Linux kernel makefile
How to understand ‘make oldconfig’?

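Linux Kernel Notes (8): vmlinux, vmlinuz, zImage, bzImage

vmlinux
An uncompressed, statically linked, executable Linux kernel file that is not bootable. It is an intermediate step in producing vmlinuz.

vmlinuz
A compressed, bootable Linux kernel file. vmlinuz is the historical name of the Linux kernel file; in practice it is either a zImage or a bzImage: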

[root@Fedora boot]# file vmlinuz-4.0.4-301.fc22.x86_64
vmlinuz-4.0.4-301.fc22.x86_64: Linux kernel x86 boot executable bzImage, version 4.0.4-301.fc22.x86_64 (mockbuild@bkernel02.phx2.fedoraproject.o, RO-rootFS, swap_dev 0x5, Normal VGA

zImage
A Linux kernel file that only works within 640k of memory.

bzImage
Big zImage, a Linux kernel file suited to larger memory.

To sum up, when a modern Linux system boots, the kernel file that actually runs is a bzImage.

References:
vmlinuz Definition

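Job Openings at an American Internet Company on Beijing's East Third Ring Road

Posting a job ad on behalf of a friend:

Company:

An American internet company on Beijing's East Third Ring Road

Open positions: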

(1) Sr. Software Engineer
C/C++, Go, Python, database design, SQL and/or knowledge of TCP/IP and network programming.

(2) Sr./Lead Software Engineer
Java/Ruby, Web system development.

(3) Sr./Lead QA Engineer
Extensive hands-on experience in devising test methods and automation framework design and implementation.

(4) Sr./Lead Software Engineer
Solid programming skills and passion for elegant, well-abstracted, reusable code components; C++, Python, Go experience.

(5) Sr./Lead Software Engineer
Data structure, Algorithm, Linux C++, billable ratio, identify user profiles, accurately predict traffic.

(6) Lead/Principal QA Engineer
Extensive hands-on experience in devising test methods and automation framework design and implementation.

(7) Principal Engineer/Architect
Experienced on large-scale web, cloud and/or Big Data framework or applications (like Hadoop, Spark).

(8) Lead/Principal Dev-Ops Engineer
Design and develop tools targeting applications monitoring and software release automation, Python/C++/Java is preferred.

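I have not given detailed JDs, only some keywords. Personally I think this is enough information.

Compensation:

This depends on individual ability and has to be negotiated, but it will definitely not be lower than the salaries at China's top-tier internet companies.

Benefits:

Free morning and evening shuttle buses, free lunch, unlimited drinks, snacks, and so on.

Overtime:

I can't say there is absolutely none, because system upgrades sometimes need someone to keep an eye on them. Otherwise there is basically no overtime.

Contact:

Please write to my email: nan#chinadtrace.org (replace # with @). I will try to answer every message, but because of work my replies may be late; thank you for understanding.

Valid until:

Valid long-term. Once it expires, I will say so in the title.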

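Reunion with Old Friends

On July 17, Chen and his family came to Beijing for a visit. In the afternoon we went to the planetarium, and in the evening we ate at “Donglaishun”.
On the evening of July 18, the four families got together at “Bianyifang” and took a group photo as a keepsake.
On July 19, I saw their family off as they left Beijing.
Time flies; the last time the four of us got together was 10 years ago. I hope there will be more chances like this in the future...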
