addr2line: How It Works and How to Use It
Software version | Hardware version | Changes |
---|---|---|
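1. Overview
After a program crashes, the system usually performs a dump_stack for us, so we can see the call stack at the time of the crash. To work backwards from that call stack to the exact line of source code, we use addr2line.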
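2. How it works
Besides the sections that the ELF specification itself defines, an ELF file can carry additional sections that store other information. When we compile with gcc and the -g option, the resulting ELF file contains a set of debug sections that are not part of the ELF specification itself, such as .debug_info and .debug_line. The address-to-line mapping that addr2line needs is stored in .debug_line.
These debug sections are encoded in the format defined by DWARF, so parsing them means following the DWARF specification. addr2line is exactly such a program: it uses the DWARF data to translate a program address into a file name:line number pair.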
For example, suppose we have the following simple program, test.c:
#include <stdio.h>

int add(int a, int b) {
    return a + b;
}

int main(int argc, char *argv[])
{
    int a = 0;
    int b = 1;
    int c = 0;

    c = add(a, b);

    return 0;
}
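Compile it with gcc -g -o test ./test.c to produce an ELF file named test.
Then run readelf -w ./test to dump its DWARF debug sections. The excerpt below is the line-number program stored in .debug_line: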
...
The File Name Table (offset 0x9f):
Entry Dir Time Size Name
1 1 0 0 test.c
2 2 0 0 stddef.h
3 3 0 0 types.h
4 4 0 0 struct_FILE.h
5 4 0 0 FILE.h
6 5 0 0 stdio.h
7 3 0 0 sys_errlist.h
Line Number Statements:
[0x000000f8] Set column to 23
[0x000000fa] Extended opcode 2: set Address to 0x1129
[0x00000105] Special opcode 7: advance Address by 0 to 0x1129 and Line by 2 to 3
[0x00000106] Set column to 11
[0x00000108] Special opcode 202: advance Address by 14 to 0x1137 and Line by 1 to 4
[0x00000109] Set column to 1
[0x0000010b] Special opcode 118: advance Address by 8 to 0x113f and Line by 1 to 5
[0x0000010c] Special opcode 36: advance Address by 2 to 0x1141 and Line by 3 to 8
[0x0000010d] Set column to 6
[0x0000010f] Advance PC by constant 17 to 0x1152
[0x00000110] Special opcode 34: advance Address by 2 to 0x1154 and Line by 1 to 9
[0x00000111] Special opcode 104: advance Address by 7 to 0x115b and Line by 1 to 10
[0x00000112] Special opcode 104: advance Address by 7 to 0x1162 and Line by 1 to 11
[0x00000113] Special opcode 105: advance Address by 7 to 0x1169 and Line by 2 to 13
[0x00000114] Set column to 9
[0x00000116] Advance PC by constant 17 to 0x117a
[0x00000117] Special opcode 21: advance Address by 1 to 0x117b and Line by 2 to 15
[0x00000118] Set column to 1
[0x0000011a] Special opcode 76: advance Address by 5 to 0x1180 and Line by 1 to 16
[0x0000011b] Advance PC by 2 to 0x1182
[0x0000011d] Extended opcode 1: End of Sequence
...
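The output above records the mapping between machine instructions and source line numbers. Take the row [0x00000110] Special opcode 34: advance Address by 2 to 0x1154 and Line by 1 to 9: the 9 at the end is line 9 of test.c, and 0x1154 is the address of a machine instruction.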
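To make the opcode arithmetic concrete, here is a minimal Python sketch that replays the special opcodes from the dump. It rests on assumptions: the .debug_line header is elided in the excerpt, so `LINE_BASE = -5`, `LINE_RANGE = 14` and `OPCODE_BASE = 13` are values GCC commonly emits and that happen to reproduce every row shown; the function names are illustrative only. The number readelf prints after "Special opcode" is treated as the adjusted opcode (raw opcode minus the opcode base).

```python
# Assumed .debug_line header fields (not shown in the excerpt above).
LINE_BASE = -5
LINE_RANGE = 14
OPCODE_BASE = 13
MIN_INST_LENGTH = 1  # x86-64


def decode_special(adjusted_opcode):
    """Return (address_advance, line_advance) for an adjusted special opcode,
    i.e. the number readelf prints after "Special opcode"."""
    addr_adv = (adjusted_opcode // LINE_RANGE) * MIN_INST_LENGTH
    line_adv = LINE_BASE + (adjusted_opcode % LINE_RANGE)
    return addr_adv, line_adv


def const_add_pc():
    """Address advance of "Advance PC by constant" (DW_LNS_const_add_pc)."""
    return ((255 - OPCODE_BASE) // LINE_RANGE) * MIN_INST_LENGTH


def replay(start_address, ops):
    """Run a simplified line-number program and return (address, line) rows.

    ops is a list of ("special", n) or ("const_add_pc",) tuples.
    """
    address, line, rows = start_address, 1, []
    for op in ops:
        if op[0] == "special":
            addr_adv, line_adv = decode_special(op[1])
            address += addr_adv
            line += line_adv
            rows.append((address, line))  # a special opcode emits a row
        elif op[0] == "const_add_pc":
            address += const_add_pc()  # moves the address, emits no row
    return rows


# Opcodes transcribed from the readelf excerpt (column changes omitted).
OPS = [
    ("special", 7), ("special", 202), ("special", 118), ("special", 36),
    ("const_add_pc",), ("special", 34), ("special", 104), ("special", 104),
    ("special", 105), ("const_add_pc",), ("special", 21), ("special", 76),
]

for address, line in replay(0x1129, OPS):
    print(f"{address:#x} -> test.c:{line}")
```

Starting from address 0x1129 and line 1, the replay yields the same address/line pairs that readelf prints, including 0x1154 -> test.c:9.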
Next, disassemble the binary with objdump -d ./test, which gives:
...
0000000000001129 <add>:
1129: f3 0f 1e fa endbr64
112d: 55 push %rbp
112e: 48 89 e5 mov %rsp,%rbp
1131: 89 7d fc mov %edi,-0x4(%rbp)
1134: 89 75 f8 mov %esi,-0x8(%rbp)
1137: 8b 55 fc mov -0x4(%rbp),%edx
113a: 8b 45 f8 mov -0x8(%rbp),%eax
113d: 01 d0 add %edx,%eax
113f: 5d pop %rbp
1140: c3 retq
0000000000001141 <main>:
1141: f3 0f 1e fa endbr64
1145: 55 push %rbp
1146: 48 89 e5 mov %rsp,%rbp
1149: 48 83 ec 20 sub $0x20,%rsp
114d: 89 7d ec mov %edi,-0x14(%rbp)
1150: 48 89 75 e0 mov %rsi,-0x20(%rbp)
1154: c7 45 f4 00 00 00 00 movl $0x0,-0xc(%rbp)
115b: c7 45 f8 01 00 00 00 movl $0x1,-0x8(%rbp)
1162: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%rbp)
1169: 8b 55 f8 mov -0x8(%rbp),%edx
116c: 8b 45 f4 mov -0xc(%rbp),%eax
116f: 89 d6 mov %edx,%esi
1171: 89 c7 mov %eax,%edi
1173: e8 b1 ff ff ff callq 1129 <add>
1178: 89 45 fc mov %eax,-0x4(%rbp)
117b: b8 00 00 00 00 mov $0x0,%eax
1180: c9 leaveq
1181: c3 retq
1182: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
1189: 00 00 00
118c: 0f 1f 40 00 nopl 0x0(%rax)
...
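As you can see, the instruction at 0x1154 corresponds to line 9 of the source, where a variable is defined and initialized. That is the basic principle behind addr2line. The finer details require a careful study of DWARF, which I personally don't think is worth the effort; knowing the general principle is enough.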
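The lookup itself can be pictured as a range search: each row of the line table covers the addresses from its own address up to the next row. The following Python sketch illustrates that idea only. The rows are transcribed by hand from the readelf output above, `lookup` is a hypothetical name, and the real addr2line additionally has to handle multiple sequences and compilation units.

```python
from bisect import bisect_right

# (address, line) rows transcribed from the readelf excerpt for test.c.
ROWS = [
    (0x1129, 3), (0x1137, 4), (0x113f, 5), (0x1141, 8), (0x1154, 9),
    (0x115b, 10), (0x1162, 11), (0x1169, 13), (0x117b, 15), (0x1180, 16),
]
END_OF_SEQUENCE = 0x1182  # address at which the sequence ends

ADDRESSES = [address for address, _ in ROWS]


def lookup(address):
    """Return the test.c line covering address, or None if out of range."""
    if address < ADDRESSES[0] or address >= END_OF_SEQUENCE:
        return None
    # The last row whose address is <= the query covers the query.
    index = bisect_right(ADDRESSES, address) - 1
    return ROWS[index][1]


print(lookup(0x1154))  # 9: int a = 0;
print(lookup(0x1178))  # 13: storing the result of add() into c
print(lookup(0x2000))  # None: outside the covered range
```

Note that 0x1178 does not appear in the table, yet it still resolves to line 13 because it falls between the rows for 0x1169 and 0x117b.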
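3. Using addr2line
addr2line is very simple to use. The most common form is addr2line -e <ELF file> <address ...>
. If **-e** is omitted, addr2line reads **a.out** by default. To obtain the address, read the symbol's start address from the ELF file and add the offset shown in the call stack at crash time. For example, when the kernel crashes it prints stack information like the following: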
root@raspi:~# echo c /proc/sysrq-trigger
c /proc/sysrq-trigger
root@raspi:~# echo c > /proc/sysrq-trigger
[31323.615771] sysrq: Trigger a crash
[31323.619359] Kernel panic - not syncing: sysrq triggered crash
[31323.625243] CPU: 3 PID: 725 Comm: bash Not tainted 5.8.18-g6c23a5884ae7 #1
[31323.632247] Hardware name: Raspberry Pi 4 Model B (DT)
[31323.637489] Call trace:
[31323.640007] dump_backtrace+0x0/0x188
[31323.643756] show_stack+0x28/0x38
[31323.647150] __dump_stack+0x2c/0x3c
[31323.650718] dump_stack+0x23c/0x2ec
[31323.654289] panic+0x2e8/0x578
[31323.657415] sysrq_handle_reboot+0x0/0x2c
[31323.661509] __handle_sysrq+0xd8/0x1fc
[31323.665342] write_sysrq_trigger+0xb0/0xc0
[31323.669528] pde_write+0x54/0x68
[31323.672829] proc_reg_write+0x8c/0xa8
[31323.676570] vfs_write+0xf0/0x210
[31323.679957] ksys_write+0x68/0xf0
[31323.683344] __arm64_sys_write+0x1c/0x28
[31323.687356] __invoke_syscall+0x20/0x2c
[31323.691277] invoke_syscall+0x80/0xd0
[31323.695021] el0_svc_common+0xbc/0x150
[31323.698853] do_el0_svc+0x34/0x44
[31323.702249] el0_svc+0x40/0x50
[31323.705378] el0_sync_handler+0x134/0x200
[31323.709475] el0_sync+0x158/0x180
[31323.712880] SMP: stopping secondary CPUs
[31323.716902] Kernel Offset: disabled
[31323.720466] CPU features: 0x240022,20006000
[31323.724731] Memory Limit: none
[31323.727877] ---[ end Kernel panic - not syncing: sysrq triggered crash ]---
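Each line ends with **+0xXX/0xXX**: the 0xXX before the "/" is the offset into the function, and the 0xXX after it is the size of that function. The function's start address can be read from vmlinux, or from the System.map generated by the build; adding the offset to it gives the address to look up. Passing that address to addr2line then tells you which file and which line it belongs to.
Taking pde_write as an example: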
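The symbol-plus-offset arithmetic can be automated. Below is a minimal Python sketch under a few assumptions: the helper names (`parse_frame`, `parse_system_map`, `frame_address`) are made up for illustration, frames are assumed to look like `symbol+0xOFF/0xSIZE`, and System.map lines are assumed to have the usual `address type name` layout. The sample data is the pde_write frame from the trace and its System.map entry.

```python
import re

# Matches e.g. "pde_write+0x54/0x68" anywhere in a log line.
FRAME_RE = re.compile(
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-fA-F]+)/0x(?P<size>[0-9a-fA-F]+)"
)


def parse_frame(text):
    """Split 'symbol+0xOFF/0xSIZE' into (symbol, offset, size)."""
    match = FRAME_RE.search(text)
    if match is None:
        raise ValueError(f"not a call-trace frame: {text!r}")
    return match["symbol"], int(match["offset"], 16), int(match["size"], 16)


def parse_system_map(lines):
    """Map symbol name -> start address from 'address type name' lines."""
    symbols = {}
    for line in lines:
        parts = line.split()
        if len(parts) >= 3:
            # Keep the first occurrence; static symbols can repeat.
            symbols.setdefault(parts[2], int(parts[0], 16))
    return symbols


def frame_address(frame, symbols):
    """Return start address of the frame's symbol plus the frame offset."""
    symbol, offset, _size = parse_frame(frame)
    return symbols[symbol] + offset


symbols = parse_system_map(["ffff8000104e9a58 t pde_write"])
address = frame_address("[31323.669528] pde_write+0x54/0x68", symbols)
print(hex(address))  # 0xffff8000104e9aac
```

In practice `parse_system_map` would be fed the lines of the real System.map file, for example via `open("System.map")`. If the same static symbol name appears more than once, the sketch keeps only the first entry, so the result needs checking by hand in that case.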
// Look up the symbol in System.map:
...
cat ./System.map | grep "pde_write"
ffff8000104e9a58 t pde_write
...
ffff8000104e9a58 + 0x54 = 0xffff8000104e9aac
// Running aarch64-linux-gnu-addr2line -e vmlinux 0xffff8000104e9aac prints:
fs/proc/inode.c:330
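When many frames need resolving, the addr2line call can be scripted. The Python sketch below is one possible wrapper, not a standard tool: `build_addr2line_cmd` and `resolve` are hypothetical names, and `resolve` assumes the chosen addr2line binary is on PATH and that the ELF file was built with debug information. It uses only the `-e` and `-f` options of addr2line.

```python
import subprocess


def build_addr2line_cmd(elf, addresses, tool="addr2line", functions=True):
    """Build the argument list for an addr2line invocation."""
    cmd = [tool, "-e", elf]
    if functions:
        cmd.append("-f")  # also print the enclosing function name
    cmd.extend(f"{address:#x}" for address in addresses)
    return cmd


def resolve(elf, addresses, tool="addr2line"):
    """Run addr2line and return its output lines.

    Requires the tool and the ELF file to exist; raises
    subprocess.CalledProcessError if addr2line exits with an error.
    """
    cmd = build_addr2line_cmd(elf, addresses, tool)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


# Only build and show the command here, so the sketch runs without vmlinux.
print(build_addr2line_cmd("vmlinux", [0xffff8000104e9aac],
                          tool="aarch64-linux-gnu-addr2line"))
```

With `-f`, addr2line prints two lines per address: the function name, then the file:line pair. Passing the cross-toolchain binary name through `tool` matters for a kernel built for another architecture, as in the aarch64 example above.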
Note
Comments and discussion are welcome; if you find a mistake, please point it out. Please credit the source when reposting! 探索者