[figure:membug.gif]

1:Hunting memory bugs

Have you ever hunted a memory overwrite bug? If you have, you know how hard they can be to track down. Every programmer's nightmare is when such a bug suddenly creeps up in a large project. The runtime libraries included with most compilers include a debug version of the heap management functions, but they usually detect the overwrite only after it has happened. Would it not be nice if you could detect the overwrite when it happens?

In the following sections I will implement a complete debugging/tracing library which will significantly reduce the time you spend tracing heap management bugs. I will not tell you how to avoid heap errors, only how to detect and find them.

Note: I primarily use Watcom C/C++, but the tools have also been tested with VAC++ 3.0. The sources may contain information on what you have to do and/or change if you use VAC++. All sources should be easy to adapt to other compilers (BC/2, EMX, Metaware...).

2:Catching memory overwrites

OS/2 implements memory protection. For example, writing to an area that contains the program's code will trap:

code:
void foo() {
  ...
}
...
char *p=(char*)foo;
strcpy(p,"hello world");
:ecode

This code will trap due to OS/2's memory protection: the area where the code resides is marked as read-only/execute-only.

OS/2 has an API to allocate and deallocate memory (DosAllocMem()/DosFreeMem()), but also an API to change the protection of a memory block: DosSetMem(). The protection (or "attributes") can be set individually for each page. A page is 4K.
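Because attributes are set per page, any byte range maps onto a whole number of 4K pages. The following is a minimal, portable C++ sketch of that arithmetic. It does not call any OS/2 API, and the names kPageSize, page_base and pages_touched are illustrative assumptions, not taken from the article's sources.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Page size as stated in the article (4K).
const std::uintptr_t kPageSize = 4096;

// Address of the first byte of the page that contains addr.
std::uintptr_t page_base(std::uintptr_t addr) {
    return addr & ~(kPageSize - 1);
}

// Number of pages touched by the byte range [addr, addr+len).
std::size_t pages_touched(std::uintptr_t addr, std::size_t len) {
    if (len == 0) return 0;
    // The last byte of the range must be addressable without wrapping.
    assert(len - 1 <= UINTPTR_MAX - addr);
    std::uintptr_t first = page_base(addr);
    std::uintptr_t last  = page_base(addr + (len - 1));
    return static_cast<std::size_t>((last - first) / kPageSize) + 1;
}
```

For instance, a 2-byte range starting at the last byte of a page touches two pages, so changing its attributes would involve both of them.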
The most common form of memory overwrite is allocating too few bytes, usually due to an off-by-one error, e.g.:

code:
char *p=(char*)malloc(11);
strcpy(p,"hello world");
:ecode

This code allocates 11 bytes but puts a 12-byte string (remember the terminating NUL) into them. The program may work, or worse: it may crash several hours later in a completely unrelated function.

By replacing the standard heap management functions with our own, we can take full control of how the memory is allocated. I will only use three page states from the memory protection features: unknown, committed and uncommitted.

* unknown = If we have never allocated a page in any way, its virtual addresses are unknown. Accessing such a page causes a trap.
* committed = When a page is accessible (read/write) it is committed. Note: whether OS/2 has actually reserved a physical piece of memory doesn't matter.
* uncommitted (or "reserved") = It is also possible to reserve a range of virtual addresses without any physical memory being reserved. Accessing such a page causes a trap.

Now back to catching memory overwrites. If we could allocate exactly 11 bytes and ensure that accessing memory beyond those 11 bytes caused a trap, it would be a lot easier to find where the memory overwrite happens. This can be done without too much trouble. We reserve enough pages to hold the requested number of bytes plus 1 extra page, commit only the pages that hold the requested bytes, and return a pointer positioned so that the requested bytes end exactly where the uncommitted page begins. Any access beyond the requested number of bytes then causes a trap, so we can catch memory overwrites in seconds instead of hunting them for days.

[figure:membug1.gif]

Note that each allocation uses at least 8K of address space and at least 4K of memory. Since OS/2's DosAllocMem() only allocates memory below 512MB, you can make at most 512MB/8K = 65536 allocations.
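The layout and address-space arithmetic described above can be sketched in portable C++. This is only a sketch under stated assumptions: it computes sizes and offsets without calling DosAllocMem() or DosSetMem(), it ignores any bookkeeping header a real implementation would need, and the names guard_layout, plan_allocation and max_allocations are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t kPageSize = 4096;   // page size as stated in the article

// Hypothetical record describing one guarded allocation.
struct guard_layout {
    std::size_t committed_bytes;  // whole pages that are made accessible
    std::size_t reserved_bytes;   // committed pages + one uncommitted guard page
    std::size_t user_offset;      // offset of the returned pointer from the start
};

// Place the requested bytes so that they end exactly at the guard page.
guard_layout plan_allocation(std::size_t requested) {
    assert(requested > 0);
    std::size_t pages = (requested + kPageSize - 1) / kPageSize;  // round up
    guard_layout g;
    g.committed_bytes = pages * kPageSize;
    g.reserved_bytes  = g.committed_bytes + kPageSize;
    g.user_offset     = g.committed_bytes - requested;
    return g;
}

// How many allocations fit if each one costs per_allocation bytes of address space.
std::size_t max_allocations(std::size_t address_space, std::size_t per_allocation) {
    assert(per_allocation > 0);
    return address_space / per_allocation;
}
```

For an 11-byte request this gives one committed page, 8K of reserved address space and a user pointer at offset 4085, so byte 12 falls on the guard page. One consequence of this placement is that the returned pointer is generally not aligned to anything larger than a byte.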
The limit on the number of allocations is even tighter on some versions of OS/2 (the documentation says Warp3 and Warp4 server), where each allocation reserves at least 64K of address space (512M/64K = 8192 allocations). So if your program makes many allocations it may run out of address space. On the other hand, that also lets you examine how well your program handles low-memory conditions.

3:dbgheap1.cpp

The full source is in dbgheap1.cpp. It implements replacements for new/delete/new[]/delete[]/malloc/free/realloc/calloc that use the mechanism described in the previous section. Here are a few highlights:

code:
struct chunk_header {
  unsigned chunk_size;   //# of committed bytes
  unsigned block_size;   //heap block size (requested size)
  unsigned block_offset; //offset from start of chunk to "user" pointer
};
:ecode

Each chunk of memory starts with a chunk_header. The chunk header contains the information we need when the chunk is to be DosFreeMem()'ed. It also contains redundant information so we can check for overwrites of the chunk_header itself. Next comes a (possibly zero) number of unused bytes. These are filled with FILL_BYTE (0xfe) so they can also be checked for overwrites. Then comes the area holding the "user" memory. It is positioned so that its end is aligned with the end of a page. Then comes the important thing: an uncommitted page. If the program tries to use more bytes than were allocated, it will trap.

When the memory is to be freed, freemem() first converts the "user" pointer (which points somewhere into a page) to a page-aligned pointer. Then the chunk_header is checked for overwrites:

code:
if(chp->block_offset != chp->chunk_size-chp->block_size) {
  exit(0);
}
:ecode

Redundancy helps debugging. Then the "user" pointer is checked.
We only accept the exact same pointer that was returned from allocmem():

code:
if(p != pchunk+chp->block_offset) {
  exit(0);
}
:ecode

Then the unused area that (hopefully) contains FILL_BYTE is checked:

code:
for(unsigned char *checkp=((unsigned char*)pchunk)+sizeof(chunk_header); checkp
:ecode

2:Statistics

The log generated by dbgheap2.cpp can be used not only for debugging but also for performance tuning. heapstat.cpp generates simple statistics. "heapstat tracemon.log" gives a result like this:

code:
Heap statistics:
Size<=      Operations  Peak
4           0           0
8           0           0
16          2           2
256         1           1
1024        1           1
4096        3           3
32768       0           0
65536       0           0
4294967295  0           0

Size  Count
10    2
39    1
388   1
4096  3
:ecode

heapstat.cpp tells you, for some preselected size classes (each covering sizes up to and including the listed value), how many allocations fell into that class and the peak number of such allocations that existed at any one time. It also shows the count for each individual size.

This information can be used for performance tuning. Example 1: If there are extremely many allocations of a particular (small) size, it may be worthwhile to write a special-purpose allocator for that size. Example 2: If almost all allocations are larger than 16384 bytes, it might be faster to code your own allocator that uses DosAllocMem() directly.

2:Conclusion

[figure:membug3.gif]

Good knowledge of what OS/2 can do, combined with a few simple tools, can be very powerful. In fact, the tools described in this article can do more than most (all?) commercial heap-debug tools. And they are free!

What is missing? The capability to read debug info to better pinpoint who called malloc/new/delete/..., and automatic starting/stopping of tracemon. ...Oh yes, and a sluggish GUI, support limited to 1 or 2 compilers, a $600 bill and a 3-inch-thick manual. This is left as an exercise to the reader.

That's all folks.