Underlying Technology
This document explains in some detail how the underlying technology
differs between three competing products (Sentinel, Purify, and
Insure++) and how those differences affect the types of errors each
product can detect.
Here is how the different products operate:
Sentinel
Sentinel works by intercepting library calls. Basically, it tries
to force the program to call Sentinel's functions when the program
calls system functions. In order to achieve that, Sentinel changes the
names of about 400 system library calls. The process is best described
with an example of a call to malloc. Consider a program:
    main()
    {
        char *ptr;
        ptr = malloc(4);
    }
Sentinel will change this program to:
    main()
    {
        char *ptr;
        ptr = SENT_malloc(4);
    }
The change is done at the linking stage when the name of the malloc call
is replaced with SENT_malloc. The SENT_malloc function is
written by Sentinel and calls malloc. SENT_malloc also
performs checking.
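
The interception described above can be sketched roughly as follows.
This is only an illustration under stated assumptions, not Sentinel's
actual code: the bookkeeping (a count of live blocks) and the names
SENT_free and SENT_live_blocks are hypothetical, chosen here to show
where a wrapper would do its checking before and after forwarding to
the real allocator.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bookkeeping kept by the wrapper library. */
static size_t live_blocks = 0;

/* Wrapper reached after the link-time rename: it forwards to the real
   malloc and records the allocation so later checks can use it. */
void *SENT_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        live_blocks++;
    return p;
}

/* Hypothetical matching wrapper for free. */
void SENT_free(void *p)
{
    if (p != NULL) {
        if (live_blocks > 0)
            live_blocks--;
        free(p);
    }
}

/* Number of blocks allocated through the wrapper and not yet freed. */
size_t SENT_live_blocks(void)
{
    return live_blocks;
}
```

Because the wrapper only runs when the program calls an intercepted
function, anything the program does between such calls is invisible to
it.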
Sentinel checks only for errors which occur in dynamic memory. It can
detect memory corruption and reads of uninitialized memory. Its vendor
also claims that it can detect memory leaks, but it really cannot; I
will explain this in detail later.
Memory corruption is checked through direct interception, but only if
it happens in arguments to functions which Sentinel checks. As an
example, consider memcpy. If the requested copy is bigger than the
space which is allocated for the data, Sentinel will detect the error.
The other way in which Sentinel detects memory corruption is through
fences. Around each allocated memory block, Sentinel puts a fence and
stores a specific pattern in that fence. Whenever it traps a function
call, Sentinel sweeps through all allocated blocks and checks whether
any fence was overwritten. If one was, Sentinel reports an error. It
cannot tell by whom or when the fence was overwritten. It can just
state that the fence was overwritten.
Sentinel can also detect reads of uninitialized memory. It does this
through patterns again. When memory is allocated, Sentinel writes a
pattern to it. When one of the intercepted functions tries to copy
memory that still holds this pattern to some other block, Sentinel
signals an error. This is also not a very thorough technique, and it
detects errors very rarely.
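
A minimal sketch of the fence technique, assuming a fixed-size block
table and an arbitrary fence pattern. The names fenced_malloc and
sweep_fences, the 8-byte fence width, and the 0xA5 pattern are all
assumptions made for illustration; they are not taken from any of the
products.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define FENCE_SIZE 8      /* assumed fence width in bytes */
#define FENCE_BYTE 0xA5   /* assumed fence pattern */
#define MAX_BLOCKS 64     /* assumed table capacity */

struct fenced_block {
    unsigned char *raw;   /* start of the leading fence */
    size_t size;          /* size requested by the program */
};

static struct fenced_block blocks[MAX_BLOCKS];
static size_t block_count = 0;

/* Allocate size bytes with a patterned fence on each side.
   (Alignment is only suitable for byte data in this sketch.) */
void *fenced_malloc(size_t size)
{
    unsigned char *raw;

    if (block_count == MAX_BLOCKS)
        return NULL;
    raw = malloc(size + 2 * FENCE_SIZE);
    if (raw == NULL)
        return NULL;
    memset(raw, FENCE_BYTE, FENCE_SIZE);
    memset(raw + FENCE_SIZE + size, FENCE_BYTE, FENCE_SIZE);
    blocks[block_count].raw = raw;
    blocks[block_count].size = size;
    block_count++;
    return raw + FENCE_SIZE;
}

static int fence_intact(const unsigned char *fence)
{
    size_t i;

    for (i = 0; i < FENCE_SIZE; i++)
        if (fence[i] != FENCE_BYTE)
            return 0;
    return 1;
}

/* Sweep every recorded block; return how many have a damaged fence. */
size_t sweep_fences(void)
{
    size_t i, damaged = 0;

    for (i = 0; i < block_count; i++) {
        const unsigned char *raw = blocks[i].raw;
        if (!fence_intact(raw) ||
            !fence_intact(raw + FENCE_SIZE + blocks[i].size))
            damaged++;
    }
    return damaged;
}
```

The sweep can report that a fence was damaged, but nothing in the
table records which statement did the damage or when, which matches
the limitation described above.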
Purify
Purify works differently. Purify reads object files during the
linking phase and instruments them at the assembly level. Every load
and store instruction is instrumented. In this respect, Purify is
better than Sentinel. Purify also has its own malloc and free library
and uses it to intercept calls to malloc and free. Purify does detect
memory corruption errors, but only in dynamic memory. It uses a
technique similar to Sentinel's. It builds a fence for each memory
block. If, during execution, the program tries to write to or read
from a fence address, Purify reports an error. The error is triggered
only when the access touches the fence address itself. This means that
Purify does not detect errors which jump the fence. "Fence jumping"
occurs when pointers are overwritten and then later used to write to
or read from memory beyond the fence.
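
To make "fence jumping" concrete, here is a small simulation. It is a
sketch under assumptions, not a model of Purify's internals: two
blocks and the fence between them live in one array (so the stray
write is well-defined C), and the names arena_reset, write_via_a,
fence_touched, and read_b are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK 8           /* assumed block size */
#define FENCE 4           /* assumed fence width */
#define FENCE_BYTE 0xA5   /* assumed fence pattern */

/* Layout: [block A][fence][block B], all inside one array. */
static unsigned char arena[BLOCK + FENCE + BLOCK];

void arena_reset(void)
{
    memset(arena, 0, sizeof arena);
    memset(arena + BLOCK, FENCE_BYTE, FENCE);
}

/* Write relative to the start of block A, as a corrupted pointer or
   index might. Offsets outside the arena are ignored. */
void write_via_a(size_t offset, unsigned char value)
{
    if (offset < sizeof arena)
        arena[offset] = value;
}

/* A fence-only check: has any fence byte changed? */
int fence_touched(void)
{
    size_t i;

    for (i = 0; i < FENCE; i++)
        if (arena[BLOCK + i] != FENCE_BYTE)
            return 1;
    return 0;
}

/* Read byte i of block B. */
unsigned char read_b(size_t i)
{
    return arena[BLOCK + FENCE + i];
}
```

A write at offset BLOCK + FENCE + 2 lands in block B and leaves every
fence byte unchanged, so a check that looks only at fence addresses
has nothing to report even though block B was corrupted.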
To detect uninitialized memory reads, Purify uses a "coloring"
technique. At the beginning of the program, it assumes that all
memory is not initialized. As the program executes, every write which
occurs in the program is logged and the memory location which is
written is marked as initialized. When the program tries to read
memory which was not initialized, an error is reported. One problem
with this technique is that Purify does not have an understanding of
the data structures on which it operates and cannot know if the writes
and reads are of the correct type and form. It can only see that they
are in allowable memory locations. This checking level is quite weak.
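
The "coloring" idea can be sketched with a shadow array that holds one
flag per byte of a small simulated memory. This is an assumption-laden
illustration: the names checked_store, checked_load, and
uninit_read_count are hypothetical, and a real tool would insert such
checks around every load and store instruction rather than rely on
explicit calls.

```c
#include <stddef.h>
#include <string.h>

#define MEM_SIZE 64   /* assumed size of the simulated memory */

static unsigned char memory[MEM_SIZE];
static unsigned char initialized[MEM_SIZE]; /* 0 = never written */
static size_t uninit_reads = 0;

/* Start of program: everything is assumed uninitialized. */
void shadow_reset(void)
{
    memset(memory, 0, sizeof memory);
    memset(initialized, 0, sizeof initialized);
    uninit_reads = 0;
}

/* Instrumented store: perform the write and color the byte. */
void checked_store(size_t addr, unsigned char value)
{
    if (addr >= MEM_SIZE)
        return;
    memory[addr] = value;
    initialized[addr] = 1;
}

/* Instrumented load: count a read of a byte that was never written. */
unsigned char checked_load(size_t addr)
{
    if (addr >= MEM_SIZE)
        return 0;
    if (!initialized[addr])
        uninit_reads++;
    return memory[addr];
}

size_t uninit_read_count(void)
{
    return uninit_reads;
}
```

Note that the shadow array records only whether a byte was written,
not what kind of object it belongs to, which is the weakness the
paragraph above points out.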
Insure++
Insure++ is a source _and_ object level tool. If the source code of
the program is available, Insure++ works by reading the source code,
instrumenting it, and storing the resulting source code in a temporary
file. This file is passed to the compiler to generate the
instrumented program. Every pointer operation, memory write, memory
read, and much more is instrumented. Because Insure++ works at the
source level, it understands your program much more completely than
the other tools. Its greater accuracy is a direct result of
this. During instrumentation and execution, Insure++ builds a database
of all memory areas which the program uses. It has a database of
dynamic, static, stack, and shared memory blocks. Each block is
described by the memory address at which it starts, its size, and its
type. In addition to its knowledge of the program's memory blocks,
Insure++ also has a database of pointers used in the program. During
assignment statements like
ptr = &a;
Insure++ links pointers to specific data blocks. A pointer is not
allowed to access a block to which it has not been assigned; if it
does, Insure++ reports an error. Thus, with Insure++, there is no
possibility of jumping a fence. When a write or read through a pointer
goes beyond the bounds of its memory block, Insure++ detects it. Checks
for reads of uninitialized memory work in the same way. Every block
which is generated by the program is first marked in the database as
not initialized. When the program tries to read that block of memory,
Insure++ checks whether it was marked as written and, if not, reports
an error.
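
The pointer-to-block binding might look something like the sketch
below. The structures and the names block_info, tracked_ptr,
bind_pointer, and access_ok are hypothetical; the source only says
that each block is described by its start address, size, and type, and
that pointers are linked to blocks on assignment.

```c
#include <stddef.h>
#include <stdint.h>

/* One database entry per memory block (dynamic, static, or stack). */
struct block_info {
    uintptr_t start;        /* address at which the block starts */
    size_t size;            /* size of the block in bytes */
    const char *type_name;  /* assumed textual type description */
};

/* A pointer together with the block it was assigned to. */
struct tracked_ptr {
    void *addr;
    const struct block_info *block;
};

/* Models an assignment such as ptr = &a. */
struct tracked_ptr bind_pointer(void *addr, const struct block_info *block)
{
    struct tracked_ptr p;

    p.addr = addr;
    p.block = block;
    return p;
}

/* Return 1 if an access of len bytes at p + offset stays inside the
   block p was assigned to, 0 otherwise. */
int access_ok(const struct tracked_ptr *p, size_t offset, size_t len)
{
    uintptr_t target;
    size_t into_block;

    if (p->block == NULL)
        return 0;
    target = (uintptr_t)p->addr + offset;
    if (target < p->block->start)
        return 0;
    into_block = (size_t)(target - p->block->start);
    if (into_block > p->block->size)
        return 0;
    return len <= p->block->size - into_block;
}
```

Because the check is made against the block the pointer belongs to,
and not against a fence, an access that lands in some other valid
block is still rejected.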
The key to Insure++'s success is its ability to instrument source
code. This allows for checking every read and write to memory against
the accumulated database of pointers and blocks. The database allows
Insure++ to track memory accesses precisely and in all memory
segments. This is especially critical for detection of memory
leaks. Because Insure++ monitors all pointers and memory blocks in the
program, it can detect the instruction which overwrites the last
pointer to a memory block. This is how Insure++ detects the moment at
which a leak occurs, something neither of the other two tools can do.
If the source code of the program is not available, Insure++ works in
a very similar fashion to Sentinel. In fact, Sentinel's functionality
is included in Insure++, and is actually a small subset of Insure++'s
total functionality.
Both Purify and Sentinel can detect memory corruption errors, but only
in dynamic memory. They do not detect corruption on the stack or in
static memory at all, because the fence technique only works for
dynamic memory. This is a big disadvantage. Insure++ detects memory
corruption errors in all memory segments. There are two reasons for
this: (a) Insure++ works at the source level, where all memory
accesses are still typed, so it is relatively easy to track corruption
in every segment; and (b) Insure++'s database allows it to map every
pointer to its memory block and detect when memory outside that block
is overwritten.
The other major difference is in leak detection. Neither Purify nor
Sentinel can detect when a leak occurs. At the end of the program,
both tools sweep all memory and list all outstanding blocks as
possible leaks. They cannot, however, pinpoint which memory blocks
were actually leaked or, most importantly, when they were leaked.
Insure++, on the other hand, detects not only that a leak occurred,
but also when it occurred. Insure++ does this by detecting the
overwriting of the last pointer which was pointing to a particular
memory block.
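
One way to picture detection at the moment of the leak is a reference
count per block that is updated on every pointer assignment. This is a
simplified sketch under assumptions, not a description of how Insure++
is implemented; tracked_block, tracked_assign, and tracked_free are
hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>

struct tracked_block {
    void *mem;    /* the allocation, or NULL once freed */
    int refs;     /* number of pointers currently referring to it */
    int leaked;   /* set when the last pointer is overwritten */
};

/* Models the assignment *slot = new_block. Returns 1 if this very
   assignment overwrote the last pointer to a block that is still
   allocated, i.e. the point at which the leak happens. */
int tracked_assign(struct tracked_block **slot,
                   struct tracked_block *new_block)
{
    struct tracked_block *old = *slot;
    int leak = 0;

    if (new_block != NULL)
        new_block->refs++;
    if (old != NULL) {
        old->refs--;
        if (old->refs == 0 && old->mem != NULL) {
            old->leaked = 1;
            leak = 1;
        }
    }
    *slot = new_block;
    return leak;
}

/* Release the memory; a freed block can no longer be leaked. */
void tracked_free(struct tracked_block *b)
{
    free(b->mem);
    b->mem = NULL;
}
```

An end-of-run sweep would only see that a block is still outstanding;
the per-assignment check above identifies the assignment that made it
unreachable.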
As I mentioned, Insure++ is superior in memory corruption detection
and memory leak detection. In addition, Insure++ has a big advantage
in detecting errors in third-party libraries. When an error occurs in
a third-party library, neither Sentinel nor Purify can tell if the
error was caused by the user passing wrong arguments to the library or
if the error is within the library function itself. Insure++ includes
interfaces to over 2000 different library functions. When the user
calls a library function, the interface checks the correctness of the
call. If the call was correct, but an error is still detected,
Insure++ determines that the error is an internal library error. If
the wrong parameters were passed to the library function, Insure++
catches that error at the interface level.
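
A library interface of this kind could be sketched as a wrapper that
validates the arguments before calling the real function. The example
below uses memcpy and is only an assumption about the general shape of
such a check: checked_memcpy and call_status are hypothetical names,
and the buffer sizes are passed in explicitly here, whereas a real
tool would look them up in its block database.

```c
#include <stddef.h>
#include <string.h>

enum call_status {
    CALL_OK,            /* arguments were valid; the call was made */
    CALL_BAD_ARGUMENT   /* the caller passed wrong arguments */
};

/* Validate a memcpy call against the known sizes of both buffers and
   perform the copy only if the request fits. */
enum call_status checked_memcpy(void *dst, size_t dst_size,
                                const void *src, size_t src_size,
                                size_t n)
{
    if (dst == NULL || src == NULL)
        return CALL_BAD_ARGUMENT;
    if (n > dst_size || n > src_size)
        return CALL_BAD_ARGUMENT;
    memcpy(dst, src, n);
    return CALL_OK;
}
```

If the wrapper returns CALL_OK and a fault is still observed inside
the call, the arguments have been ruled out as the cause, which is the
distinction the paragraph above describes.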