Valgrind
Introduction
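Valgrind is an instrumentation framework for building dynamic analysis tools.
It ships with a set of useful standard tools:
- Memcheck: a memory error detector for C/C++;
- Cachegrind: a cache and branch-prediction profiler;
- Callgrind: a call-graph generating cache profiler; it overlaps somewhat with Cachegrind but collects more information;
- Helgrind: a thread error detector;
- DRD: also a thread error detector; it is similar to Helgrind but uses different analysis techniques, so it may find different problems;
- Massif: a heap profiler, intended to help a program use less memory;
- DHAT: a heap profiler of a different kind, intended to help understand block lifetimes, block utilisation, and layout inefficiencies.
In addition,
- Lackey is an example tool that illustrates some instrumentation basics;
- Nulgrind is the minimal Valgrind tool; it performs no analysis or instrumentation and is useful only for testing.
Valgrind is closely tied to details of the CPU and operating system, and to a lesser extent to the compiler and the basic C libraries.
Valgrind is built with the standard Unix ./configure, make, make install process.
Valgrind is licensed under the GNU GPL, version 2.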
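Valgrind Core: a first look
How it works
Valgrind is designed to be as non-intrusive as possible. It works directly on existing executables, so the program being checked does not need to be recompiled, relinked, or otherwise modified.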
valgrind [valgrind-options] your-prog [your-prog-options]
The most important option is --tool, which selects the Valgrind tool to run:
valgrind --tool=memcheck ls -l
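Whichever tool is used, Valgrind takes control of the program before it starts;
the linked program (the executable together with the shared libraries it needs) then runs on a synthetic CPU provided by the Valgrind core;
the core hands the code (called UCode in this description) to the selected tool, and the tool adds its instrumentation code and hands the result back to the core.
The amount of instrumentation code varies widely between tools. At one end, Memcheck adds code to check every memory access and every computed value, which makes the program run 10-50 times slower than natively. At the other end, Nulgrind adds no instrumentation at all and causes only about a 4 times slowdown in total.
Valgrind simulates every instruction the program executes. As a result, the active tool checks or profiles not only the application's own code but also every dynamically linked library it uses, including the C library, graphical libraries, and so on.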
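Getting started
First consider whether to recompile the application and its supporting libraries with debugging information (the -g option). Without debugging information, the best a Valgrind tool can do is guess which function a particular piece of code belongs to, which makes error messages and profiling output nearly useless. With -g, the messages point directly to the relevant source lines.

The Valgrind manual also recommends, for Memcheck in particular, compiling with optimization turned off (-O0), or with -O as a compromise when an unoptimized build is too slow: at -O2 and above, optimized code can occasionally lead Memcheck to report uninitialised-value errors that are not real. Compiling with -Wall helps as well, because the compiler can flag some problems that Valgrind may miss. Together these lower the chance of false positives and false negatives; note that turning optimization off does not by itself make the run under Valgrind faster.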
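Thread support
Valgrind can run multithreaded programs.
A multithreaded program uses the native threading library, but Valgrind serializes its execution so that only one (kernel) thread runs at a time. This avoids the horrible implementation problems that a truly multithreaded Valgrind would face, but it also means that a threaded application never uses more than one CPU at a time, even on a multiprocessor or multicore machine.
Signal support
Valgrind has a fairly complete signal implementation. It should be able to cope with any POSIX-compliant use of signals.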
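As a small illustration of the kind of POSIX signal usage this covers, the sketch below installs a handler with sigaction and raises SIGUSR1. The names got_signal, on_sigusr1, and deliver_sigusr1 are made up for this example; it shows ordinary client-side signal code, not anything internal to Valgrind.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set by the handler; sig_atomic_t is safe to write from a signal handler. */
static volatile sig_atomic_t got_signal = 0;

static void on_sigusr1(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Installs a SIGUSR1 handler with sigaction, raises the signal,
   and returns 1 if the handler ran. */
int deliver_sigusr1(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);

    int rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);
    (void)rc;

    got_signal = 0;
    raise(SIGUSR1);          /* the handler runs before raise() returns */
    return got_signal;
}
```

Called from main in a program started with valgrind, deliver_sigusr1 would be expected to return 1, the same as in a native run, assuming the signal support behaves as described above.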
Limitations
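- On x86 and amd64, 3DNow! instructions are not supported. If the translator encounters one, Valgrind generates a SIGILL when that instruction is executed. Apart from that, essentially all instructions are supported on x86 and amd64, up to and including AVX and AES in 64-bit mode and SSSE3 in 32-bit mode;
- Under the Memcheck tool, a program's memory consumption increases substantially. One reason is the large amount of administrative information that Valgrind maintains behind the scenes. Another is that Valgrind dynamically translates the original executable, and the translated, instrumented code is 12-18 times larger than the original;
- Valgrind's floating-point implementation is not fully IEEE754-compliant: precision and rounding can differ from IEEE754, and some floating-point arithmetic instructions may also behave differently from the native hardware.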
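One concrete case listed in the Valgrind manual is that 80-bit long double arithmetic on x86/amd64 is carried out at 64-bit precision. The sketch below is one way to probe for that difference. The function name is hypothetical, and the behaviour under Valgrind is an expectation derived from the manual, not something measured for these notes.

```c
#include <assert.h>
#include <float.h>

/* Returns 1 if adding LDBL_EPSILON to 1.0L changes the value, which
   requires the full long double precision to survive the addition. */
int long_double_keeps_epsilon(void)
{
    /* long double must be at least as precise as double */
    assert(LDBL_MANT_DIG >= DBL_MANT_DIG);

    /* volatile keeps the compiler from folding the sum at compile time */
    volatile long double one = 1.0L;
    volatile long double eps = LDBL_EPSILON;
    volatile long double sum = one + eps;
    return sum != one;
}
```

A native run returns 1 by the definition of LDBL_EPSILON. Under Valgrind on x86/amd64 the same call may return 0, because the sum is rounded to 64-bit precision.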
An example
sewardj@phoenix:~/newmat10$ ~/Valgrind-6/valgrind -v ./bogon
==25832== Valgrind 0.10, a memory error detector for x86 RedHat 7.1.
==25832== Copyright (C) 2000-2001, and GNU GPL'd, by Julian Seward.
==25832== Startup, with flags:
==25832== --suppressions=/home/sewardj/Valgrind/redhat71.supp
==25832== reading syms from /lib/ld-linux.so.2
==25832== reading syms from /lib/libc.so.6
==25832== reading syms from /mnt/pima/jrs/Inst/lib/libgcc_s.so.0
==25832== reading syms from /lib/libm.so.6
==25832== reading syms from /mnt/pima/jrs/Inst/lib/libstdc++.so.3
==25832== reading syms from /home/sewardj/Valgrind/valgrind.so
==25832== reading syms from /proc/self/exe
==25832==
==25832== Invalid read of size 4
==25832== at 0x8048724: BandMatrix::ReSize(int,int,int) (bogon.cpp:45)
==25832== by 0x80487AF: main (bogon.cpp:66)
==25832== Address 0xBFFFF74C is not stack'd, malloc'd or free'd
==25832==
==25832== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
==25832== malloc/free: in use at exit: 0 bytes in 0 blocks.
==25832== malloc/free: 0 allocs, 0 frees, 0 bytes allocated.
==25832== For a detailed leak analysis, rerun with: --leak-check=yes
Valgrind Core: a closer look
Reference: "Valgrind Memcheck source code analysis" (Valgrind Memcheck 源码分析), Linsoft1994's blog on CSDN
use-after-free.c:
#include <stdlib.h>
int main() {
    char *x = (char*)malloc(10 * sizeof(char*));  /* 10 pointers: 80 bytes on amd64 */
    free(x);
    return x[5];                                  /* read after free */
}
root@ubuntu-2204:/home/ubuntu/Desktop/ASan# gcc use-after-free.c -g
root@ubuntu-2204:/home/ubuntu/Desktop/ASan# gdb --args valgrind ./a.out
Set a breakpoint on main and run the program:
pwndbg> b main
Breakpoint 1 at 0x1360: file launcher-linux.c, line 355.
pwndbg> r
Starting program: /usr/local/bin/valgrind ./a.out
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Breakpoint 1, main (argc=argc@entry=2, argv=argv@entry=0x7fffffffe408, envp=0x7fffffffe420) at launcher-linux.c:355
355 {
When valgrind ./a.out is executed, the main function of the valgrind program itself is in launcher-linux.c.

Valgrind/coregrind/launcher-linux.c
At /home/ubuntu/Desktop/valgrind/coregrind/launcher-linux.c:520 the launcher invokes the default tool, memcheck.

Set a breakpoint there and continue:
pwndbg> b /home/ubuntu/Desktop/valgrind/coregrind/launcher-linux.c:520
Breakpoint 2 at 0x555555555683: file launcher-linux.c, line 520.
pwndbg> c
Continuing.
Breakpoint 2, main (argc=argc@entry=2, argv=argv@entry=0x7fffffffe408, envp=0x7fffffffe420) at launcher-linux.c:520

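Print the execve arguments toolfile and argv:

Print the environment variables:

As shown, the call at /home/ubuntu/Desktop/valgrind/coregrind/launcher-linux.c:520 executes memcheck-amd64-linux.
Valgrind/memcheck-amd64-linux
Memcheck exists as a Valgrind tool, and as such it must implement four functions: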
static void mc_pre_clo_init( void );
static void mc_post_clo_init ( void );
IRSB* MC_(instrument) ( VgCallbackClosure* closure,
                        IRSB* sb_in,
                        const VexGuestLayout* layout,
                        const VexGuestExtents* vge,
                        const VexArchInfo* archinfo_host,
                        IRType gWordTy, IRType hWordTy );
static void mc_fini ( Int exitcode );
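To make the role of these four functions easier to picture, here is a standalone mock of the lifecycle: a toy core calls the hooks in the order pre_clo_init, post_clo_init, instrument (once per code block), fini. Everything in it (Block, ToolHooks, run_mock_core, the demo_ functions) is a hypothetical stand-in invented for this sketch, not the Valgrind API.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the real Valgrind types; not the real API. */
typedef struct { int n_stmts; } Block;

typedef struct {
    void   (*pre_clo_init)(void);
    void   (*post_clo_init)(void);
    Block *(*instrument)(Block *in);
    void   (*fini)(int exitcode);
} ToolHooks;

static int blocks_seen = 0;
static int stmts_added = 0;

static void demo_pre_clo_init(void)  { blocks_seen = 0; stmts_added = 0; }
static void demo_post_clo_init(void) { /* command-line options are known here */ }

/* Pretends to add one check per original statement. */
static Block *demo_instrument(Block *in)
{
    blocks_seen++;
    stmts_added += in->n_stmts;
    in->n_stmts *= 2;
    return in;
}

static void demo_fini(int exitcode)
{
    printf("exit=%d blocks=%d added=%d\n", exitcode, blocks_seen, stmts_added);
}

/* Mock core: drives the hooks in the order a tool would see them. */
int run_mock_core(const ToolHooks *tool, Block *blocks, int n)
{
    assert(tool != NULL && tool->instrument != NULL);
    tool->pre_clo_init();
    /* (a real core would parse command-line options at this point) */
    tool->post_clo_init();
    for (int i = 0; i < n; i++)
        tool->instrument(&blocks[i]);   /* one call per translated block */
    tool->fini(0);
    return stmts_added;
}

/* Runs the mock with two blocks of 3 and 5 statements. */
int run_demo(void)
{
    ToolHooks tool = { demo_pre_clo_init, demo_post_clo_init,
                       demo_instrument, demo_fini };
    Block blocks[] = { {3}, {5} };
    return run_mock_core(&tool, blocks, 2);
}
```

The gdb session that follows traces where the real core makes the corresponding calls into Memcheck.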
As mentioned above, execve runs memcheck-amd64-linux. To follow execution across the execve in gdb, use the catch exec command:
pwndbg> catch exec
Catchpoint 1 (exec)
pwndbg> r
Starting program: /usr/local/bin/valgrind ./a.out
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
process 22723 is executing new program: /usr/local/libexec/valgrind/memcheck-amd64-linux
Catchpoint 1 (exec'd /usr/local/libexec/valgrind/memcheck-amd64-linux), 0x00000000580ad940 in _start ()
Execution enters the _start function of memcheck-amd64-linux:

It then calls _start_in_C_linux. Set a breakpoint on that function and continue:
pwndbg> b _start_in_C_linux
Breakpoint 2 at 0x580b3900: file m_main.c, line 3086.
pwndbg> c
Continuing.
Breakpoint 2, _start_in_C_linux (pArgc=0x7fffffffe3c0) at m_main.c:3086
Execution is now in the code of valgrind/coregrind/m_main.c:

It then calls valgrind_main:

Set a breakpoint on valgrind_main and continue:
pwndbg> b valgrind_main
Breakpoint 3 at 0x580b1b30: file m_main.c, line 1272.
pwndbg> c
Continuing.
Breakpoint 3, valgrind_main (argc=argc@entry=2, argv=argv@entry=0x7fffffffe3c8, envp=envp@entry=0x7fffffffe3e0) at m_main.c:1272
Execution enters valgrind_main:

Set a breakpoint on mc_pre_clo_init and continue:
pwndbg> b mc_pre_clo_init
Breakpoint 2 at 0x5800c6f0: file mc_main.c, line 8392.
pwndbg> c
Continuing.
Breakpoint 2, mc_pre_clo_init () at mc_main.c:8392
8392 {

The backtrace shows that mc_pre_clo_init is called at valgrind_main.isra+3700. Look at the disassembly there:

Set a breakpoint at that address and rerun to reach it:
pwndbg> b *valgrind_main+3700
Breakpoint 3 at 0x580b29a4: file m_main.c, line 1752.
pwndbg> c
Continuing.
Breakpoint 3, valgrind_main (argc=argc@entry=2, argv=argv@entry=0x7fffffffe3c8, envp=envp@entry=0x7fffffffe3e0) at m_main.c:1752
1752 if (VG_(needs).var_info)
mc_pre_clo_init

mc_post_clo_init

MC_(instrument)
Jumping to the definition of the macro MC_(str) in VS Code shows how the name is built:
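The screenshot of the definition is not reproduced in these notes. As a minimal sketch, assuming the macro is built by token pasting through a helper named VGAPPEND (verify this against mc_include.h in your own tree), the expansion can be reproduced in a standalone file. STR, STR_, and check_expansion are helper names invented for this example.

```c
#include <assert.h>
#include <string.h>

/* Assumed shape of the definitions (token pasting). */
#define VGAPPEND(str1, str2) str1##str2
#define MC_(str) VGAPPEND(vgMemCheck_, str)

/* Two-level stringification, so the macro is expanded before it is quoted. */
#define STR_(x) #x
#define STR(x)  STR_(x)

/* After preprocessing, this function is named vgMemCheck_instrument. */
int MC_(instrument)(void)
{
    return 42;
}

/* Checks the expanded name both as a string and as a callable symbol. */
void check_expansion(void)
{
    assert(strcmp(STR(MC_(instrument)), "vgMemCheck_instrument") == 0);
    assert(vgMemCheck_instrument() == 42);
}
```

The prefix gives every Memcheck symbol a distinct name, which is also why gdb needs the expanded form when setting a breakpoint.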

So MC_(instrument) should expand to vgMemCheck_instrument. Set a breakpoint on that function and run:
pwndbg> b vgMemCheck_instrument
Breakpoint 2 at 0x58035670: file mc_translate.c, line 8564.
pwndbg> c
Continuing.
==23902== Memcheck, a memory error detector
==23902== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==23902== Using Valgrind-3.22.0.GIT and LibVEX; rerun with -h for copyright info
==23902== Command: ./a.out
==23902==
Breakpoint 2, vgMemCheck_instrument (closure=0x1002da9ce0, sb_in=0x597e3a48 <temporary+74784>, layout=0x5828aaa0 <amd64guest_layout>, vge=0x1002da9d00, archinfo_host=0x1002da9db0, gWordTy=Ity_I64, hWordTy=Ity_I64) at mc_translate.c:8564
8564 {

This shows that MC_(instrument) is implemented in valgrind/memcheck/mc_translate.c. The call stack:

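mc_fini
Important: earlier, while debugging with pwndbg, the following often appeared:

The cause is that pwndbg by default intercepts SIGSEGV and stops the program:


In fact Valgrind handles this signal itself, so it should be passed on to the program. Use the following command:
pwndbg> handle SIGSEGV nostop noprint pass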
Signal Stop Print Pass to program Description
SIGSEGV No No Yes Segmentation fault
pwndbg> b mc_fini
Breakpoint 2 at 0x5800d630: file mc_main.c, line 8286.
pwndbg> c
Continuing.
==25904== Memcheck, a memory error detector
==25904== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==25904== Using Valgrind-3.22.0.GIT and LibVEX; rerun with -h for copyright info
==25904== Command: ./a.out
==25904==
==25904== Invalid read of size 1
==25904== at 0x109197: main (use-after-free.c:5)
==25904== Address 0x4a97045 is 5 bytes inside a block of size 80 free'd
==25904== at 0x484B28C: free (vg_replace_malloc.c:974)
==25904== by 0x10918E: main (use-after-free.c:4)
==25904== Block was alloc'd at
==25904== at 0x484880F: malloc (vg_replace_malloc.c:431)
==25904== by 0x10917E: main (use-after-free.c:3)
==25904==
==25904==
Breakpoint 2, mc_fini (exitcode=0) at mc_main.c:8286
8286 {

Call stack summary
Execution of the valgrind launcher:
coregrind/launcher-linux.c:main+819 → execve("memcheck-amd64-linux", argv, new_env)
Execution of memcheck-amd64-linux:
coregrind/m_main.c:(_start) → _start_in_C_linux → valgrind_main.isra (the .isra suffix marks a clone produced by GCC's -fipa-sra optimization)
- memcheck/mc_main.c:mc_pre_clo_init: valgrind_main.isra+3700 → mc_pre_clo_init
- memcheck/mc_main.c:mc_post_clo_init: valgrind_main.isra+4574 → mc_post_clo_init
- memcheck/mc_translate.c:MC_(instrument): valgrind_main.isra → run_a_thread_NORETURN+190 → [vgPlain_scheduler+2858 → vgPlain_scheduler+2858] or [vgPlain_scheduler+1662 → handle_chain_me+171] → LibVEX_Translate+81 → LibVEX_FrontEnd+1748 → tool_instrument_then_gdbserver_if_needed+34 → vgMemCheck_instrument
- memcheck/mc_main.c:mc_fini: run_a_thread_NORETURN+500 → shutdown_actions_NORETURN+266 → mc_fini
Memcheck
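Memcheck is a memory error detector. It can detect the following problems, which are common in C and C++ programs:
- Accessing memory you shouldn't, for example reading or writing past the bounds of heap blocks, overrunning the top of the stack, and accessing memory after it has been freed (use-after-free, UAF);
- Using undefined values, that is, values that have not been initialised or that were derived from other undefined values;
- Incorrect freeing of heap memory, such as double-freeing heap blocks or mismatched use of malloc/new/new[] versus free/delete/delete[];
- Overlapping source (src) and destination (dst) pointers in memcpy and related functions;
- Passing a fishy (presumably negative) value to the size parameter of a memory allocation function;
- Memory leaks.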
Error messages explained
- Illegal read / illegal write errors (here the program read 4 bytes at address 0xBFFFF0E0):
Invalid read of size 4
at 0x40F6BBCC: (within /usr/lib/libpng.so.2.1.0.9)
by 0x40F6B804: (within /usr/lib/libpng.so.2.1.0.9)
by 0x40B07FF4: read_png_image(QImageIO *) (kernel/qpngio.cpp:326)
by 0x40AC751B: QImageIO::read() (kernel/qimage.cpp:3621)
Address 0xBFFFF0E0 is not stack'd, malloc'd or free'd
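This happens when your program reads or writes memory at a place that Memcheck thinks it shouldn't. In this example, the program did a 4-byte read at address 0xBFFFF0E0. The code performing the read is somewhere inside the system-supplied library libpng.so.2.1.0.9, and it was reached from line 326 of qpngio.cpp.
- Use of uninitialised values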
Conditional jump or move depends on uninitialised value(s)
at 0x402DFA94: _IO_vfprintf (_itoa.h:49)
by 0x402E8476: _IO_printf (printf.c:36)
by 0x8048472: main (tests/manuel1.c:8)
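An uninitialised-value use error is reported when your program uses a value that has not been initialised (in other words, is undefined). Here, the undefined value is used somewhere inside the printf machinery of the C library. The error was reported when running the following program:
#include <stdio.h>
int main()
{
    int x;                   /* never initialised */
    printf ("x = %d\n", x);  /* the undefined value reaches printf */
}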
- Use of uninitialised or unaddressable values in system calls
#include <stdlib.h>
#include <unistd.h>
int main( void )
{
    char* arr = malloc(10);
    int* arr2 = malloc(sizeof(int));
    write( 1 /* stdout */, arr, 10 );
    exit(arr2[0]);
}
Syscall param write(buf) points to uninitialised byte(s)
at 0x25A48723: __write_nocancel (in /lib/tls/libc-2.3.3.so)
by 0x259AFAD3: __libc_start_main (in /lib/tls/libc-2.3.3.so)
by 0x8048348: (within /auto/homes/njn25/grind/head4/a.out)
Address 0x25AB8028 is 0 bytes inside a block of size 10 alloc'd
at 0x259852B0: malloc (vg_replace_malloc.c:130)
by 0x80483F1: main (a.c:5)
Syscall param exit(error_code) contains uninitialised byte(s)
at 0x25A21B44: __GI__exit (in /lib/tls/libc-2.3.3.so)
by 0x8048426: main (a.c:8)
- Invalid frees (for example, double frees)
Invalid free()
at 0x4004FFDF: free (vg_clientmalloc.c:577)
by 0x80484C7: main (tests/doublefree.c:10)
Address 0x3807F7B4 is 0 bytes inside a block of size 177 free'd
at 0x4004FFDF: free (vg_clientmalloc.c:577)
by 0x80484C7: main (tests/doublefree.c:10)
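One common way to make an accidental second free harmless is to clear the pointer right after freeing it, since free(NULL) is defined to do nothing. The sketch below shows the idea. free_and_clear and release_twice are hypothetical names, and the size 177 is only borrowed from the report above. This guards a single pointer variable; other copies of the same pointer are not protected.

```c
#include <assert.h>
#include <stdlib.h>

/* Frees *p and clears it, so that a second call becomes free(NULL). */
void free_and_clear(char **p)
{
    assert(p != NULL);
    free(*p);
    *p = NULL;
}

/* Returns 1 if the pointer is NULL after two release calls. */
int release_twice(void)
{
    char *buf = malloc(177);
    if (buf == NULL)
        return 0;
    free_and_clear(&buf);
    free_and_clear(&buf);   /* harmless: free(NULL) does nothing */
    return buf == NULL;
}
```

With this pattern the second call should no longer produce an Invalid free() report, because the block is only handed to free once.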
- A heap block is freed with an inappropriate deallocation function (for example, new[] followed by free)
Mismatched free() / delete / delete []
at 0x40043249: free (vg_clientfuncs.c:171)
by 0x4102BB4E: QGArray::~QGArray(void) (tools/qgarray.cpp:149)
by 0x4C261C41: PptDoc::~PptDoc(void) (include/qmemarray.h:60)
by 0x4C261F0E: PptXml::~PptXml(void) (pptxml.cc:44)
Address 0x4BB292A8 is 0 bytes inside a block of size 64 alloc'd
at 0x4004318C: operator new[](unsigned int) (vg_clientfuncs.c:152)
by 0x4C21BC15: KLaola::readSBStream(int) const (klaola.cc:314)
by 0x4C21C155: KLaola::stream(KLaola::OLENode const *) (klaola.cc:416)
by 0x4C21788F: OLEFilter::convert(QCString const &) (olefilter.cc:272)
- Overlapping source and destination blocks:
==27492== Source and destination overlap in memcpy(0xbffff294, 0xbffff280, 21)
==27492== at 0x40026CDC: memcpy (mc_replace_strmem.c:71)
==27492== by 0x804865A: main (overlap.c:40)
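In the report above the destination starts 20 bytes after the source and 21 bytes are copied, so the two ranges share one byte. A sketch of the usual remedy, memmove, follows. ranges_overlap, shift_right, and shift_demo are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if [dst, dst+n) and [src, src+n) share at least one byte.
   Only meaningful when both pointers point into the same array. */
int ranges_overlap(const char *dst, const char *src, size_t n)
{
    return n > 0 && dst < src + n && src < dst + n;
}

/* Moves the first n bytes of buf to buf + shift within the same buffer.
   memmove is defined for overlapping ranges; memcpy is not. */
void shift_right(char *buf, size_t n, size_t shift)
{
    memmove(buf + shift, buf, n);
}

/* Shifts "abcdef" right by two bytes and checks the result. */
int shift_demo(void)
{
    char buf[16] = "abcdef";
    assert(ranges_overlap(buf + 2, buf, 6));
    shift_right(buf, 6, 2);
    return memcmp(buf, "ababcdef", 9) == 0;
}
```

Replacing memmove with memcpy in shift_right would be undefined behaviour for this call, and is the kind of call Memcheck reports as an overlap.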
- Fishy argument values
==32233== Argument 'size' of function malloc has a fishy (possibly negative) value: -3
==32233== at 0x4C2CFA7: malloc (vg_replace_malloc.c:298)
==32233== by 0x400555: foo (fishy.c:15)
==32233== by 0x400583: main (fishy.c:23)
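A negative size usually comes from signed arithmetic whose result is converted to size_t on the way into malloc. The sketch below shows what the conversion produces and one way to reject such values first. alloc_checked, as_size, and fishy_demo are hypothetical helpers for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Shows what an unchecked conversion to size_t produces. */
size_t as_size(long count)
{
    return (size_t)count;
}

/* Allocates `count` bytes, rejecting negative counts before they are
   converted to size_t, where -3 would become SIZE_MAX - 2. */
void *alloc_checked(long count)
{
    if (count < 0)
        return NULL;
    return malloc((size_t)count);
}

/* Exercises both helpers with the value from the report. */
void fishy_demo(void)
{
    assert(as_size(-3) == SIZE_MAX - 2);
    assert(alloc_checked(-3) == NULL);

    void *p = alloc_checked(8);
    free(p);    /* free(NULL) is fine if the allocation failed */
}
```

Passing -3 straight to malloc is what triggers the fishy-value report. With the check in place the bad value never reaches the allocator.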
- realloc with size 0
==77609== realloc() with size 0
==77609== at 0x48502B8: realloc (vg_replace_malloc.c:1450)
==77609== by 0x201989: main (realloczero.c:8)
==77609== Address 0x5464040 is 0 bytes inside a block of size 4 alloc'd
==77609== at 0x484CBB4: malloc (vg_replace_malloc.c:397)
==77609== by 0x201978: main (realloczero.c:7)
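What realloc does with a size of 0 differs between C libraries, which is why Memcheck flags it. One way to avoid the ambiguity is to treat a zero size as an explicit free. The wrapper below is a sketch of that approach; resize_buffer and resize_demo are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Resizes *p to new_size bytes. A zero size is handled as an explicit
   free instead of being passed to realloc.
   Returns 0 on success and -1 on allocation failure. */
int resize_buffer(char **p, size_t new_size)
{
    assert(p != NULL);
    if (new_size == 0) {
        free(*p);
        *p = NULL;
        return 0;
    }
    char *q = realloc(*p, new_size);
    if (q == NULL)
        return -1;      /* *p is still valid and unchanged */
    *p = q;
    return 0;
}

/* Allocates 4 bytes, as in the report, then resizes to 0. */
int resize_demo(void)
{
    char *buf = malloc(4);
    if (buf == NULL)
        return 0;
    int rc = resize_buffer(&buf, 0);
    return rc == 0 && buf == NULL;
}
```

Because realloc is never called with size 0, the caller always knows whether it still owns a block after the call.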
Memory leaks