Underlying Technology

This document explains the differences in technology between three competing products (Insure++, Purify, and Sentinel) and how those differences affect the types of errors each product can detect. It does so by answering the following questions:

What are the basic, underlying differences in technology between Insure++, Purify, and Sentinel?
Based on these competing technologies, what errors can be/cannot be detected by each product?
Where do the particular strengths of Insure++ in comparison to its competitors lie, particularly with respect to the types of problems for which each product is best suited?

What are the basic, underlying differences in technology between Insure++, Purify, and Sentinel?

Here is how the different products operate:

Sentinel

Sentinel works by intercepting library calls: when the program calls a system function, Sentinel forces it to call Sentinel's own version of that function instead. To achieve this, Sentinel changes the names of about 400 system library calls. The process is best described with an example of a call to malloc. Consider the following program:

        #include <stdlib.h>

        int main(void)
        {
                char *ptr;

                ptr = malloc(4);        /* request 4 bytes of dynamic memory */
                return 0;
        }

Sentinel will change this program to:

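        /* Conceptual view only: the renaming happens at link time, not in the source. */
        int main(void)
        {
                char *ptr;

                ptr = SENT_malloc(4);   /* Sentinel's wrapper, which calls malloc */
                return 0;
        }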

The change is made at the linking stage, when the name of the malloc call is replaced with SENT_malloc. The SENT_malloc function is supplied by Sentinel; it calls the real malloc and also performs Sentinel's checking.
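The interception idea can be sketched in a few lines of C. This is only an illustration of the general wrapper technique, not Sentinel's actual code: the name SENT_malloc comes from the example above, while SENT_free, SENT_live_blocks, and the live-block counter are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bookkeeping: blocks handed out and not yet freed. */
static size_t live_blocks = 0;

/* Wrapper that the linker would substitute for malloc (sketch only). */
void *SENT_malloc(size_t size)
{
        void *p = malloc(size);         /* forward to the real allocator */

        if (p != NULL)
                live_blocks++;          /* record the new block */
        return p;
}

/* Hypothetical matching wrapper for free. */
void SENT_free(void *p)
{
        if (p != NULL) {
                assert(live_blocks > 0);        /* freeing more than was allocated */
                live_blocks--;
        }
        free(p);
}

/* Number of blocks currently outstanding. */
size_t SENT_live_blocks(void)
{
        return live_blocks;
}
```

Because every allocation passes through the wrapper, the wrapper is the natural place to hang any additional checking.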

Sentinel checks only for errors that occur in dynamic memory. It can detect memory corruption and reads of uninitialized memory. Its vendor also claims that it can detect memory leaks, but it really cannot. I will explain this in detail later.

Memory corruption is checked through direct interception, but only if it happens in the arguments to functions that Sentinel checks. As an example, consider memcpy. If the requested copy is bigger than the space allocated for the destination data, Sentinel will detect the error.
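A check of this kind might look roughly like the following. This is a minimal sketch under stated assumptions: the names chk_malloc, chk_free, and chk_memcpy and the size header stored in front of each block are invented for illustration and are not taken from Sentinel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header stored in front of every block to remember its size. */
typedef union {
        size_t size;
        max_align_t align;      /* keeps the user area suitably aligned */
} chk_header;

void *chk_malloc(size_t size)
{
        chk_header *h = malloc(sizeof *h + size);

        if (h == NULL)
                return NULL;
        h->size = size;
        return h + 1;           /* user data starts right after the header */
}

void chk_free(void *p)
{
        if (p != NULL)
                free((chk_header *)p - 1);
}

/* Returns 0 on success, -1 if the copy would overrun the destination.
   dst must be the start of a block returned by chk_malloc. */
int chk_memcpy(void *dst, const void *src, size_t n)
{
        const chk_header *h;

        assert(dst != NULL && src != NULL);
        h = (const chk_header *)dst - 1;
        if (n > h->size)
                return -1;      /* report the error instead of corrupting memory */
        memcpy(dst, src, n);
        return 0;
}
```

Note that the check fires only because the copy goes through an intercepted function; an ordinary assignment through a pointer would bypass it entirely.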

The other way in which Sentinel detects memory corruption is through fences. Around each allocated memory block, Sentinel puts a fence and stores a specific pattern in that fence. Whenever it traps a function call, Sentinel sweeps through all allocated blocks and checks whether any fence has been overwritten. If one has, Sentinel reports an error. It cannot tell by whom or when the fence was overwritten; it can only state that the fence was overwritten.

Sentinel can also detect reads of uninitialized memory, again through patterns. When memory is allocated, Sentinel writes a pattern to it. When one of the intercepted functions tries to copy memory that still holds this pattern to some other block, Sentinel signals an error. This is also not a very thorough technique, and it detects errors very rarely.
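The fence technique can be illustrated with a short sketch. The fence size, the pattern byte, and all of the names below are assumptions chosen for the example; the text does not say what Sentinel actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define FENCE_SIZE 8
#define FENCE_BYTE 0xA5         /* arbitrary pattern chosen for this sketch */

/* Layout of the raw allocation: [front fence][user data][back fence] */
typedef struct {
        unsigned char *raw;     /* start of the underlying allocation */
        size_t size;            /* size of the user area */
} fenced_block;

int fenced_alloc(fenced_block *b, size_t size)
{
        b->raw = malloc(size + 2 * FENCE_SIZE);
        if (b->raw == NULL)
                return -1;
        b->size = size;
        memset(b->raw, FENCE_BYTE, FENCE_SIZE);                         /* front */
        memset(b->raw + FENCE_SIZE + size, FENCE_BYTE, FENCE_SIZE);     /* back */
        return 0;
}

unsigned char *fenced_data(const fenced_block *b)
{
        return b->raw + FENCE_SIZE;
}

/* Returns 1 if both fences still hold the pattern, 0 if one was overwritten. */
int fenced_check(const fenced_block *b)
{
        size_t i;

        assert(b != NULL && b->raw != NULL);
        for (i = 0; i < FENCE_SIZE; i++) {
                if (b->raw[i] != FENCE_BYTE ||
                    b->raw[FENCE_SIZE + b->size + i] != FENCE_BYTE)
                        return 0;
        }
        return 1;
}

void fenced_free(fenced_block *b)
{
        free(b->raw);
        b->raw = NULL;
}
```

The check says only that a fence is damaged. Nothing in the fence records which statement did the damage or when, which is the limitation described above.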

Purify

Purify works differently. Purify reads object files during the linking phase and instruments them at the assembly level. Every store and read instruction is instrumented. In this way, Purify is better than Sentinel. Purify also has its own malloc and free library and uses it to intercept calls to malloc and free. Purify does detect memory corruption errors, but only in dynamic memory. It uses a technique similar to Sentinel's: it builds a fence around each memory block. If, during execution, the program tries to write to or read from a fence address, Purify reports an error. The error is triggered only when the program touches the fence address itself. This means that Purify does not detect errors which jump the fence. "Fence jumping" occurs when pointers are overwritten and then later used to write to or read from memory.
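Fence jumping can be made concrete with a small simulation. The sketch below models a checker that knows only which addresses are valid, not which block a pointer belongs to. The arena, its layout, and the function names are all hypothetical and are not taken from Purify.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ARENA_SIZE 32

/* Per-byte state of a simulated heap: all an address-only checker knows. */
enum { BYTE_FENCE = 0, BYTE_VALID = 1 };

static unsigned char arena_state[ARENA_SIZE];   /* zero-initialized: all fence */

/* Marks len bytes starting at offset start as belonging to some block. */
void arena_mark_valid(size_t start, size_t len)
{
        assert(start <= ARENA_SIZE && len <= ARENA_SIZE - start);
        memset(arena_state + start, BYTE_VALID, len);
}

/* Address-only check: is the byte at this offset inside any valid block? */
int address_check(size_t offset)
{
        return offset < ARENA_SIZE && arena_state[offset] == BYTE_VALID;
}
```

Suppose block A occupies offsets 4 to 11 and block B occupies offsets 20 to 27, with a fence in between. An access through a pointer to A that lands on offset 12 hits the fence and is caught, but an access that lands on offset 20 passes the check even though it corrupts B.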

To detect uninitialized memory reads, Purify uses a "coloring" technique. At the beginning of the program, it assumes that all memory is uninitialized. As the program executes, every write is logged and the memory location that is written is marked as initialized. When the program tries to read memory which was never initialized, an error is reported. One problem with this technique is that Purify has no understanding of the data structures on which it operates and cannot know whether the writes and reads are of the correct type and form. It can only see that they are in allowable memory locations. This level of checking is quite weak.
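The coloring idea amounts to keeping a shadow flag for every byte of memory. Here is a minimal sketch over a small simulated memory; the array names, the size, and the functions are assumptions made for illustration, not Purify's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MEM_SIZE 16

static unsigned char sim_memory[MEM_SIZE];      /* simulated program memory */
static unsigned char sim_written[MEM_SIZE];     /* shadow color: 0 = never written */

/* At program start, all memory is assumed to be uninitialized. */
void shadow_reset(void)
{
        memset(sim_written, 0, sizeof sim_written);
}

/* Every write marks the target location as initialized. */
void checked_write(size_t addr, unsigned char value)
{
        assert(addr < MEM_SIZE);
        sim_memory[addr] = value;
        sim_written[addr] = 1;
}

/* Returns 0 and stores the byte in *out, or -1 on an uninitialized read. */
int checked_read(size_t addr, unsigned char *out)
{
        assert(addr < MEM_SIZE && out != NULL);
        if (!sim_written[addr])
                return -1;
        *out = sim_memory[addr];
        return 0;
}
```

The shadow flags record only whether a byte was written, not what type of object it belongs to, which is why this kind of check cannot judge whether an access has the correct type and form.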

Insure++

Insure++ is both a source-level and an object-level tool. If the source code of the program is available, Insure++ works by reading the source code, instrumenting it, and storing the resulting source code in a temporary file. This file is passed to the compiler to generate the instrumented program. Every pointer operation, memory write, memory read, and much more is instrumented. Because Insure++ works at the source level, it understands your program much more completely than the other tools. Its greater accuracy is a direct result of this.

During instrumentation and execution, Insure++ builds a database of all memory areas which the program uses. It has a database of dynamic, static, stack, and shared memory blocks. Each block is described by the memory address at which it starts, its size, and its type. In addition to its knowledge of the program's memory blocks, Insure++ also has a database of pointers used in the program. During assignment statements like

        ptr = &a;

Insure++ links pointers to specific data blocks. A pointer is not allowed to access a block to which it has not been assigned; if it does, Insure++ reports an error. Thus, with Insure++, there is no possibility of jumping a fence. When a write or read through a pointer goes beyond the end of its memory block, Insure++ detects it. Checks for reads of uninitialized memory work in the same way. Every block which is generated by the program is first marked in the database as not initialized. When the program tries to read that block of memory, Insure++ checks whether it was marked as written and, if not, reports an error.
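The difference from an address-only check can be sketched as follows. The structures and function names are hypothetical stand-ins for the block and pointer databases described above; the text does not describe Insure++'s real data layout, so treat this as an illustration of the idea only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical database entry describing one memory block. */
typedef struct {
        const char *start;      /* address at which the block starts */
        size_t size;            /* size of the block in bytes */
} block_info;

/* Hypothetical tracked pointer: the raw pointer plus its assigned block. */
typedef struct {
        char *ptr;
        const block_info *block;
} tracked_ptr;

/* Models an assignment such as ptr = &a: record which block ptr belongs to. */
tracked_ptr track_assign(char *p, const block_info *b)
{
        tracked_ptr t = { p, b };
        return t;
}

/* Returns 1 if accessing n bytes at ptr + offset stays inside ptr's own block. */
int access_ok(const tracked_ptr *t, ptrdiff_t offset, size_t n)
{
        ptrdiff_t begin;

        assert(t != NULL && t->block != NULL);
        begin = (t->ptr - t->block->start) + offset;
        if (begin < 0 || (size_t)begin > t->block->size)
                return 0;
        return n <= t->block->size - (size_t)begin;
}
```

Because the check is made against the pointer's own block rather than against the set of all valid addresses, an access that lands inside some other valid block is still rejected.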

The key to Insure++'s success is its ability to instrument source code. This allows it to check every read from and write to memory against the accumulated database of pointers and blocks. The database allows Insure++ to track memory accesses precisely and in all memory segments. This is especially critical for the detection of memory leaks. Because Insure++ monitors all pointers and memory blocks in the program, it can detect the instruction which overwrites the last pointer to a memory block. This is how Insure++ detects the moment at which a leak occurs, something neither of the other two tools can do.
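One way to picture this kind of leak detection is as a count of how many tracked pointers refer to each block. The sketch below is a simplified model under that assumption; the heap_block structure and assign_tracked function are invented for illustration, and the text does not say that Insure++ uses reference counts internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical database entry: a heap block and the number of tracked
   pointers that currently refer to it. */
typedef struct {
        void *addr;     /* NULL once the block has been freed */
        int refs;
} heap_block;

/* Models the assignment "*slot = new_block" for a tracked pointer variable.
   Returns 1 if the assignment overwrote the last pointer to a live block,
   which means the leak happens at exactly this statement; 0 otherwise. */
int assign_tracked(heap_block **slot, heap_block *new_block)
{
        int leaked = 0;
        heap_block *old;

        assert(slot != NULL);
        old = *slot;
        if (new_block != NULL)
                new_block->refs++;      /* count the new reference first */
        if (old != NULL && --old->refs == 0 && old->addr != NULL)
                leaked = 1;             /* no pointer to the block remains */
        *slot = new_block;
        return leaked;
}
```

The important point is that the report is attached to the assignment itself, so the tool can name the statement at which the block became unreachable.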

If the source code of the program is not available, Insure++ works in a very similar fashion to Sentinel. In fact, Sentinel's functionality is included in Insure++, and is actually a small subset of Insure++'s total functionality.

Based on these competing technologies, what errors can be/cannot be detected by each product?

Both Purify and Sentinel can detect memory corruption errors, but only in dynamic memory. They do not detect corruption on the stack or in static memory at all, because the fence technique works only for dynamic memory. This is a big disadvantage. Insure++ detects memory corruption errors in all memory segments, for two reasons: (a) Insure++ works at the source level, where all memory accesses are still typed, so it is relatively easy to track corruption in every segment; and (b) Insure++'s database allows it to map every pointer to its memory block and detect when memory outside that block is overwritten.

The other major difference is in leak detection. Neither Purify nor Sentinel can detect when a leak occurs. At the end of the program, both tools sweep all memory and list all outstanding blocks as possible leaks. They cannot, however, pinpoint which memory blocks were actually leaked or, most importantly, when they were leaked.

Insure++, on the other hand, detects not only that a leak occurred, but also when it occurred. Insure++ does this by detecting the overwriting of the last pointer which was pointing to a particular memory block.

Where do the particular strengths of Insure++ in comparison to its competitors lie, particularly with respect to the types of problems for which each product is best suited?

As I mentioned, Insure++ is superior in memory corruption detection and memory leak detection. In addition, Insure++ has a big advantage in detecting errors in third-party libraries. When an error occurs in a third-party library, neither Sentinel nor Purify can tell whether the error was caused by the user passing wrong arguments to the library or whether the error is within the library function itself. Insure++ includes interfaces to over 2000 different library functions. When the user calls a library function, the interface checks the correctness of the call. If the wrong parameters were passed to the library function, Insure++ catches that error at the interface level. If the call was correct but an error is still detected, Insure++ determines that the error is internal to the library.
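An interface of this kind can be thought of as a thin layer that validates the caller's arguments before the library function runs. The following sketch shows the idea for strcpy. The name iface_strcpy, the result codes, and the explicit dst_size parameter are assumptions made for the example; a real tool would presumably look up the destination size in its own block database rather than take it as an argument.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum iface_result { IFACE_OK, IFACE_BAD_ARGS };

/* Hypothetical interface for strcpy: validate the arguments, then call it.
   dst_size is the size in bytes of the block that dst points to. */
enum iface_result iface_strcpy(char *dst, size_t dst_size, const char *src)
{
        size_t len;

        if (dst == NULL || src == NULL)
                return IFACE_BAD_ARGS;
        len = strlen(src);
        if (len + 1 > dst_size)
                return IFACE_BAD_ARGS;  /* caller error, caught before the call */
        strcpy(dst, src);
        assert(dst[len] == '\0');       /* postcondition: result is terminated */
        return IFACE_OK;
}
```

If the arguments pass this check and an error still shows up during the call, the caller has been ruled out, which is the reasoning that lets the error be attributed to the library itself.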