Crash tool notes (3): Using crash in a Xen environment

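Over the past two weeks I have been discussing on the crash mailing list how to use crash to debug the Dom0 kernel on SuSE Xen. The thread went back and forth many times (see the list archive), and in the end it even turned up a bug. I will skip the details and just summarize the final results: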

(1) SuSE kernels are built with the CONFIG_STRICT_DEVMEM option enabled by default, so crash cannot fully access /dev/mem; /proc/kcore can be used instead.

(2) SuSE ships a crash.ko driver (at "/lib/modules/`uname -r`/updates/crash.ko"), but it is not loaded by default. You can load it manually with insmod, after which crash works:

# crash

crash 7.1.3
Copyright (C) 2002-2014  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.

crash: /boot/xen-4.5.gz: original filename unknown
       Use "-f /boot/xen-4.5.gz" on command line to prevent this message.

WARNING: machine type mismatch:
         crash utility: X86_64
         /var/tmp/xen-4.5.gz_ud3IRy: X86

crash: /boot/symtypes-3.12.49-6-default.gz: original filename unknown
       Use "-f /boot/symtypes-3.12.49-6-default.gz" on command line to prevent this message.

crash: /boot/symvers-3.12.49-6-default.gz: original filename unknown
       Use "-f /boot/symvers-3.12.49-6-default.gz" on command line to prevent this message.

GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

      KERNEL: /boot/vmlinux-3.12.49-6-xen.gz
   DEBUGINFO: /usr/lib/debug/boot/vmlinux-3.12.49-6-xen.debug
    DUMPFILE: /dev/crash
        CPUS: 128
        DATE: Fri Nov 20 06:55:06 2015
      UPTIME: 18:51:36
LOAD AVERAGE: 1.76, 1.48, 1.21
       TASKS: 1230
    NODENAME: dl980-5
     RELEASE: 3.12.49-6-xen
     VERSION: #1 SMP Mon Oct 26 16:05:37 UTC 2015 (11560c3)
     MACHINE: x86_64  (1995 Mhz)
      MEMORY: 125.9 GB
         PID: 6618
     COMMAND: "crash"
        TASK: ffff881ea93b2140  [THREAD_INFO: ffff881e869f2000]
         CPU: 112
       STATE: TASK_RUNNING (ACTIVE)
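
As a rough sketch of the two points above, the shell snippet below checks whether the running kernel's config enables CONFIG_STRICT_DEVMEM and looks for crash.ko at the path from item (2). The helper names and the /boot/config-<release> location are assumptions made for illustration, not something taken from the mailing list thread, and the insmod step it suggests needs root.

```shell
#!/bin/sh
# Hypothetical helpers; the names and the config location are assumptions.

# Succeeds if the given kernel config file enables CONFIG_STRICT_DEVMEM.
strict_devmem_enabled() {
    grep -q '^CONFIG_STRICT_DEVMEM=y' "$1"
}

# Prints the live-memory source suggested by item (1).
pick_memory_source() {
    if strict_devmem_enabled "$1"; then
        echo /proc/kcore
    else
        echo /dev/mem
    fi
}

# Prints the crash.ko path from item (2) for a kernel release string.
crash_module_path() {
    echo "/lib/modules/$1/updates/crash.ko"
}

release=$(uname -r)
config="/boot/config-$release"
module=$(crash_module_path "$release")

if [ -r "$config" ]; then
    echo "suggested memory source: $(pick_memory_source "$config")"
else
    echo "no readable kernel config at $config"
fi

if [ -f "$module" ]; then
    echo "found $module; load it as root with: insmod $module"
else
    echo "crash.ko not found at $module"
fi
```

If the suggested source is /proc/kcore, the session would presumably be started as "crash /proc/kcore"; once crash.ko is loaded, a plain "crash" picks up /dev/crash, as the DUMPFILE line in the session above shows.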

 

Crash tool notes (2): Printing debug information when running the "crash" command

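The -d number option makes crash print debug information while it runs. The larger the number, the more information is printed. Currently, -d8 prints all of the debug information. For example: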

# crash -d8

crash 7.0.2-6.el7
Copyright (C) 2002-2013  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.


find_booted_kernel: search for [Linux version 3.10.0-123.el7.x86_64.debug (mockbuild@x86-017.build.eng.bos.redhat.com) (gcc version 4.8.2 20140120 (Red Hat 4.8.2-16) (GCC) ) #1 SMP Mon May 5 11:24:18 EDT 2014]
mount_points[0]: / (167c600)
mount_points[1]: /proc (167c620)
mount_points[2]: /sys (167c640)
mount_points[3]: /dev (167c660)
mount_points[4]: /sys/kernel/security (167c680)
mount_points[5]: /dev/shm (167c6b0)
mount_points[6]: /dev/pts (167c6d0)
mount_points[7]: /run (167c6f0)
mount_points[8]: /sys/fs/cgroup (167c710)
mount_points[9]: /sys/fs/cgroup/systemd (167c740)
mount_points[10]: /sys/fs/pstore (167c780)
mount_points[11]: /sys/fs/cgroup/cpuset (167c7b0)
mount_points[12]: /sys/fs/cgroup/cpu,cpuacct (167c7f0)
mount_points[13]: /sys/fs/cgroup/memory (167c830)
......
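
Because -d8 output is long, it can be handy to save it to a file instead of reading it on the terminal. The sketch below is one possible way to do that; the helper name crash_debug_log and the log file name are hypothetical, and it assumes crash reads commands from standard input when that is not a terminal, so a single "quit" ends the session.

```shell
#!/bin/sh
# Hypothetical helper: run crash once at the given debug level, feed it a
# single "quit" on stdin, and save everything it prints to a log file.
crash_debug_log() {
    level=$1
    logfile=$2
    echo quit | crash -d"$level" > "$logfile" 2>&1
}

if command -v crash >/dev/null 2>&1; then
    crash_debug_log 8 crash-d8.log
    # Show how much output that debug level produced.
    wc -l crash-d8.log
else
    echo "crash is not installed on this machine"
fi
```

Running the helper with a smaller level, such as crash_debug_log 1 crash-d1.log, and comparing the two logs gives a feel for how much each level adds.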

 

Crash tool notes (1): The "current context"

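After a crash session starts successfully, one task is designated as the current context. Some commands are context-sensitive, meaning that what they do depends on the current context, so it is important to know what the current context is.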

How the current context is chosen:
a) For a coredump file, one of:

The task that was running when die() was called.
The task that was running when panic() was called.
The task that was running when an ALT-SYSRQ-c keyboard interrupt was received.
The task that was running when the character "c" was echoed to /proc/sysrq-trigger. 

b) For a live system:

The `crash` command itself.

Run the set command with no arguments to display the current context:

crash> set
    PID: 2366
COMMAND: "crash"
   TASK: ffff88001ae60000  [THREAD_INFO: ffff88001c1f0000]
    CPU: 0
  STATE: TASK_RUNNING (ACTIVE)

The set command can also change the current context, for example to the task with PID 1:

crash> set 1
    PID: 1
COMMAND: "systemd"
   TASK: ffff88001dfd8000  [THREAD_INFO: ffff88001dfe0000]
    CPU: 0
  STATE: TASK_INTERRUPTIBLE
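
To illustrate why the current context matters, the sketch below writes a small crash input file that first switches the context with set and then runs bt, which with no arguments acts on the current context. The helper name write_context_script and the temporary file are hypothetical; the snippet only generates the file and does not start crash itself.

```shell
#!/bin/sh
# Hypothetical helper: write a crash input file that switches the current
# context to the given PID and then runs one context-sensitive command.
write_context_script() {
    pid=$1
    outfile=$2
    {
        echo "set $pid"   # make this PID the current context
        echo "bt"         # bt with no arguments uses the current context
        echo "quit"
    } > "$outfile"
}

cmds=$(mktemp)
write_context_script 1 "$cmds"
cat "$cmds"
```

The generated file could then be passed to crash with its -i option, which runs the commands in the file before accepting interactive input.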