Insure++ User's Guide - Insure++
Part I
As shown in the Getting Started
manual, using Insure++ is straightforward. You simply recompile
your program using the special insure command instead
of your normal compiler. Running the program normally will then generate
a report whenever an error is detected. The report usually contains enough
detail to track down and correct the problem.
What does this give you?
Obviously, the most important advantage of Insure++ is the fact
that it automatically detects errors that might otherwise go unnoticed
in normal testing. Subtle memory corruption errors and dynamic memory
problems often don't crash the program or cause it to give incorrect
answers until the program is shipped to customers and they run it on
their test cases. Then the problems start.
Even if Insure++ doesn't find any problems in your programs,
running it gives you greater confidence that your program is free of
the kinds of errors it checks for.
Of course, Insure++ can't possibly check everything that your
program does. However, its checking is extensive and covers a wide range
of programming errors. The following sections discuss the types of errors
that Insure++ will detect.
Memory corruption is one of the most unpleasant errors that can occur, especially if
it is well disguised. As an example of what can happen, consider the
program shown in Figure 1, which
concatenates the arguments given on the command line and prints the
resulting string.
1: /*
2: * File: hello.c
3: */
4: int main (int argc, char *argv[])
5: {
6: int i;
7: char str[16];
8:
9: str[0] = '\0';
10: for(i=0; i<argc; i++) {
11: strcat(str, argv[i]);
12: if(i < (argc-1)) strcat(str, " ");
13: }
14: printf("You entered: %s\n", str);
15: return (0);
16:}
Figure 1. "Hello world" with bug
If you compile and run this program with your normal compiler, you'll
probably see nothing interesting, e.g.,
$ cc -g -o hello hello.c
$ hello
You entered: hello
$ hello world
You entered: hello world
$ hello cruel world
You entered: hello cruel world
If this were the extent of your test procedures, you would probably
conclude that this program works correctly, despite the fact that it
has a very serious memory corruption bug.
If you compile with Insure++, the command
"hello cruel world" generates the errors shown in
Figure 2, because the string that is
being concatenated becomes longer than the 16 characters allocated in
the declaration at line 7.
[hello.c:11] **WRITE_OVERFLOW**
>> strcat(str, argv[i]);
Writing overflows memory: str
bbbbbbbbbbbbbbbbbbbbbbbbbbbb
| 16 | 2 |
wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
Writing (w) : 0xf7fff8a8 thru 0xf7fff8b9
(18 bytes)
To block (b) : 0xf7fff8a8 thru 0xf7fff8b7
(16 bytes)
str, declared at hello.c, 7
Stack trace where the error occurred:
strcat() (interface)
main() hello.c, 11
**Memory corrupted. Program may crash!!**
[hello.c:14] **READ_OVERFLOW**
>> printf("You entered: %s\n", str);
String is not null terminated within range: str
Reading : 0xf7fff8a8 thru 0xf7fff8b9 (18 bytes)
From block : 0xf7fff8a8 thru 0xf7fff8b7 (16 bytes)
str, declared at hello.c, 7
Stack trace where the error occurred:
main() hello.c, 14
You entered: hello cruel world
Figure 2. Insure++'s messages from the
"Hello world" program
Insure++ finds all problems related to overwriting memory or
reading past the legal bounds of an object,
regardless of whether it is allocated statically (i.e., a
global variable), locally on the stack, dynamically
(with malloc or
new ), or even as a shared memory block.
It also detects the case in which a pointer crosses from one block of
memory into another and starts to overwrite
memory there, even if the memory blocks are
adjacent.
Problems with pointers are among the most difficult encountered by C
programmers. Insure++ detects pointer-related problems in the
following categories:
- Operations on NULL pointers.
- Operations on uninitialized pointers.
- Operations on pointers that don't actually point to valid
data.
- Operations which try to compare or otherwise
relate pointers that don't point at the same
data object.
- Function calls through function pointers
that don't actually point to functions.
Figure 3 shows the code for a
second attempt at the "Hello world" program that uses dynamic
memory allocation.
1: /*
2: *
3: */
4: #include <stdlib.h>
5:
6: int main (int argc, char *argv[])
7: {
8: char *string, *string_so_far;
9: int i, length;
10:
11: length = 1; /* Include last NULL */
12:
13: for(i=0; i<argc; i++) {
14: length += strlen(argv[i])+1;
15: string = malloc(length);
16: /*
17: * Copy the string built so far.
18: */
19: if(string_so_far != (char *)0)
20: strcpy(string, string_so_far);
21: else *string = '\0';
22:
23: strcat(string, argv[i]);
24: if(i < argc-1) strcat(string, " ");
25: string_so_far = string;
26: }
27: printf("You entered: %s\n", string);
28: return (0);
29:}
Figure 3. "Hello world" with dynamic memory
allocation
The basic idea of this program is that we keep track of the
current string size in the variable length . As
each new argument is processed, we add its length to the
length variable and allocate a block of memory
of the new size. Notice that the code is careful to include the
final NULL character when computing the string
length (line 11) and also the space between strings (line 14). Both
of these would be easy mistakes to make. It's an interesting exercise
to see how quickly Insure++ would find such an error.
The code in lines 19-24 either copies the argument to the buffer or
appends it depending on whether or not this is the first pass round the
loop. Finally, in line 25 we point at the new, longer string by
assigning the pointer string to the variable string_so_far.
If you compile and run this program under Insure++, you'll see
"uninitialized pointer" errors reported for lines 19 and 20.
This is because the variable string_so_far hasn't been
set to anything before the first trip through the argument loop!
A "memory leak" occurs when a piece of dynamically allocated
memory cannot be freed because the program no longer contains any
pointers that point to the block. A simple example of this behavior
can be seen by running the (corrected) "Hello world" program
with the arguments
hello3 this is a test
If we examine the state of the program at line 27, just before the
assignment to string_so_far executes on the second loop iteration,
we observe:
- The variable
string_so_far points to the
string "hello" which it was assigned
as a result of the previous loop iteration.
- The variable string points to the extended string
"hello this" which was assigned on
this loop iteration.
These assignments are shown schematically in
Figure 4 - both variables point
to blocks of dynamically allocated memory.
The next statement
string_so_far = string;
will make both variables point to the longer memory block as shown in
Figure 5.
Once this has happened, however, there is no remaining pointer that
points to the shorter block. Even if you wanted to, there is no way that the
memory that was previously pointed to by string_so_far
can be reclaimed - it is permanently allocated. This is known as a
"memory leak", and is diagnosed by Insure++ as shown in
Figure 6.
[hello3.c:27] **LEAK_ASSIGN**
>> string_so_far = string;
Memory leaked due to reassignment: string
In block: 0x0001fbb0 thru 0x0001fbb6 (7 bytes)
block allocated at:
malloc() (interface)
main() hello3.c, 17
Stack trace where the error occurred:
main() hello3.c, 27
Figure 6. Insure++ error report for the memory
leak
This example is called
LEAK_ASSIGN by Insure++
since it is caused when a pointer is re-assigned. Other types that
Insure++ detects include:
LEAK_FREE
- Occurs when you free a block of memory that contains
pointers to other memory blocks. If there are no other pointers
that point to these secondary blocks then they are permanently
lost and will be reported by Insure++.
LEAK_RETURN
- Occurs when a function returns a pointer to an allocated block
of memory, but the returned value is ignored
in the calling routine.
LEAK_SCOPE
- Occurs when a function contains a local variable that points
to a block of memory, but the function returns without saving
the pointer in a global variable or passing it back to its
caller.
Notice that Insure++ indicates the exact source line on which
the problem occurs, which is a key issue in finding and fixing memory
leaks. This is an extremely important feature, because it's easy to
introduce subtle memory leaks into your applications, but very hard to
find them all. Using Insure++, you can instantly pinpoint the
line of source code which caused the leak.
Whether or not this is a serious problem depends on your application.
To get more information on the seriousness of the problem, make a file
called .psrc in your current directory and add
to it the line(1)
insure++.summarize leaks
Now when you run the program again, you will see the same output as
before, followed by a summary of all the memory leaks in your code.
MEMORY LEAK SUMMARY
===================
4 outstanding memory references for 55 bytes.
Leaks detected during execution
-------------------------------
55 bytes 4 chunks allocated at hello3.c, 17
This shows that even this short program lost four different chunks of
memory. The total of 55 bytes isn't very large and you might well
ignore it in a program this size. If, however, this was a routine in
a larger program, it would be a serious problem, because every
time the routine is called it allocates blocks of memory and loses
some. As a result the program gradually consumes more and more memory
and will finally crash when the memory space on the host machine is
exhausted.
This type of bug can be extremely hard to detect, because it might take
literally days to show up. It is exactly the type of bug that survives
all your in-house testing and only shows up when you ship a product to
a customer who needs to use it for some enormous processing task!
You may be wondering why Insure++ only prints one error
message although the summary indicates that 4 memory leaks
occurred. This is because Insure++ normally shows only the
first error of any given type at each particular source line.
If you wish, you can change this
behavior as described in "Insure++ Reports".
You can obtain additional information about each individual memory
leak with the .psrc option "
insure++.summarize leaks".
For an even higher level of checking, we suggest the following algorithm
for removing all memory leaks from your code. This process is
unique - no other tool can do this. If you complete the following steps,
there will not be any memory leaks left in your code.
1. Compile your program normally, but link with
insure -Zuse and run the program with
Inuse (see "Using Inuse" in the Inuse
manual). If you see an increase in the heap size as you run the
program, you are leaking memory.
2. Compile all source code, but not libraries, with Insure++.
Clean all leaks that are detected by Insure++.
3. Compile everything that makes up your application with
Insure++ - source code and libraries. Clean any leaks
detected by Insure++. If you do not have source for any of
the libraries, skip this step and proceed to Step 4.
4. Repeat Step 1. If memory is increasing, add
insure++.summarize leaks outstanding
to your .psrc file and run your Insure++
checked program again. Any outstanding memory reference shown
is a potential leak.
5. You must now examine each outstanding memory reference to determine
whether or not it is a leak. If the pointer is passed into a
library function, it may be saved. If this is the case, it is not
a leak. Once every outstanding memory reference is understood,
and those that are leaks are cleared, the program is free of memory
leaks.
Using dynamically allocated memory properly is another tricky issue. In
many cases programs continue running well after a programming error causes
serious memory corruption - sometimes they don't crash at all.
One common mistake is to try to reuse a pointer
after it has already been freed.
As an example we could modify the "Hello world" program to
de-allocate memory blocks before allocating the larger ones. Consider
the following piece of code which does just that:
21: if(string_so_far != (char *)0) {
22: free(string_so_far);
23: strcpy(string, string_so_far);
24: }
25: else *string = '\0';
If you run this code (hello4.c)
through Insure++, you'll get another error message about a
"dangling pointer" at line 23. The term
"dangling pointer" is used to
mean a pointer that doesn't point at a valid memory block anymore. In
this case the block is freed at line 22 and then used in the following
line!
This is another common problem that often goes unnoticed, because many
machines and compilers allow this particular behavior.
In addition to this error, Insure++ also detects the following:
- Reading from or writing to "dangling pointers".
- Passing "dangling pointers" as arguments to functions
or returning them from functions.
- Freeing the same memory block multiple times.
- Attempting to free statically allocated memory.
- Freeing stack memory (local variables).
- Passing a pointer to free that doesn't point to the beginning
of a memory block.
- Calls to free with NULL or uninitialized pointers.
- Passing nonsensical arguments or arguments of the wrong data type
to malloc, calloc, realloc or free.
Another way that Insure++ can help you track down dynamic memory
problems is through the RETURN_FAILURE error code.
Normally, Insure++ will not issue an error if
malloc , for example, returns a NULL
pointer because it is out of memory. This behavior is the default, because
it is assumed that the user program is already checking for, and handling,
this case.
If your program appears to be failing due to an unchecked return code,
you can enable the
RETURN_FAILURE
error message class. Insure++ will then print a message whenever
any system call fails.
The standard C library string handling functions are a rich source of
potential errors, since they do very little checking on the bounds of
the objects being manipulated.
Insure++ detects problems such as overwriting the end of a
buffer as described in "Memory corruption". Another common problem is
caused by trying to work with strings that are not null-terminated, as
in the following example.
1: /*
2: * File: readovr2.c
3: */
4: main()
5: {
6: char junk;
7: char b[8];
8: strncpy(b, "This is a test",
9: sizeof(b));
10: printf("%s\n", b);
11: return (0);
12: }
This program attempts to copy the string "This is a
test " into a buffer which is only 8 characters long.
Although it uses strncpy to avoid overwriting
its buffer, the resulting copy doesn't have a NULL
on the end. Insure++ detects this problem in line 10
when the call to printf tries to print the string.
A particularly unpleasant problem to track down occurs when your program
makes use of an uninitialized variable. These problems are often
intermittent and can be particularly difficult
to find using conventional means, since any alteration in the operation of
the program may result in different behavior. It is not unusual for this
type of bug to show up and then immediately disappear whenever you do
something to try to trace it.
Insure++ performs checking for uninitialized data in two
sub-categories, copy and read, which are reported as
READ_UNINIT_MEM(copy) and READ_UNINIT_MEM(read).
To clarify the difference between these categories, consider the following
code:
1: /*
2: * File: readuni1.c
3: */
4: #include <stdio.h>
5:
6: int main()
7: {
8: struct rectangle {
9: int width;
10: int height;
11: };
12:
13: struct rectangle box;
14: int area;
15:
16: box.width = 5;
17: area = box.width*box.height;
18: printf("area = %d\n", area);
19: return (0);
20: }
In line 17 the value of box.height is used to calculate
a value which is invalid, since box.height was never assigned.
Insure++ detects this error in the
READ_UNINIT_MEM(read)
category. This category is enabled by default, so a message will be displayed.
If you changed line 17 to
17: area = box.height;
Insure++ would report errors of type
READ_UNINIT_MEM(copy) for both
lines 17 and 18, but only if you had unsuppressed this error category.
In a significant change from earlier versions, Insure++ now
detects uninitialized memory references using a full flow-analysis
of your application's source code (and can often detect problems at
compile time) by default. In addition to the performance enhancements
made to enable this change, there are several new
.psrc options, which allow greater control over
this portion of Insure++'s checking abilities
(see runtime options).
The default setting is the most comprehensive form of error detection,
but obviously involves some overhead during compilation. If you wish to
track only uninitialized pointers, you can set the following
.psrc option.
insure++.checking_uninit off
Turning off this option does not, however, completely disable
uninitialized variable checking. No errors will be reported in the
READ_UNINIT_MEM
class, but Insure++ will
still check for uninitialized pointer variables and report these
errors in the
READ_UNINIT_PTR
error category.
If
checking_uninit
is disabled, uninitialized
pointer errors will be reported in the
READ_UNINIT_PTR
category, not
READ_UNINIT_MEM .
Insure++ can also detect variables which have no effect on the
behavior of your application, either because they are never used, or
because they are assigned values which are never used. In most cases
these are not serious errors, since the offending statements can simply
be removed, and so they are suppressed by default.
Occasionally, however, an unused variable may be a symptom of a
logical program error, so you may wish to enable this checking
periodically. See "Unused variables" for more details.
A lot of programs make either explicit or implicit assumptions about
the various data types on which they operate. A common assumption made
on workstations is that pointers and integers have the same number of
bytes. While some of these problems can be detected during compilation,
some code goes to great lengths to hide such operations with typecasts, as in
char *p;
int ip;
ip = (int)p;
On many systems this type of operation would be
valid and would cause no problems. When such code is ported to
alternative architectures, however, problems can arise. The code shown
above would fail, for example, when executed on a PC
(16-bit integer, 32-bit pointer)
or a 64-bit architecture such as the
DEC Alpha (32-bit integer, 64-bit pointer).
In cases where such an operation loses information, Insure++
will report an error. On machines for which the data types have the
same number of bits (or more), no error is reported.
Insure++ detects inconsistent declarations of variables between
source files.
A common problem is caused when an object is declared as an array in
one file, e.g.,
int myblock[128];
but as a pointer in another
extern int *myblock;
See the files
baddec11.c
and
baddec12.c for an
example. Insure++ also reports differences in size, so that an
array declared as one size in one file and another in a second will be
detected.
The printf and scanf family of
functions are easy places to make mistakes which show up either as bugs
or portability problems.
Consider, for example, the code
foo()
{
double f;
scanf("%f", &f);
}
This code will not crash, but the value read into the variable
f will not be correct, since its data type
(double) doesn't match the format specified in the
call to scanf (float).
As a result, incorrect data will be transferred to the program.
In a similar way, the example
badform2.c
foo()
{
float f;
scanf("%lf", &f);
}
corrupts memory, since too much data will be
written over the supplied variable. This error can be very difficult
to detect.
Insure++ detects both of these bugs.
A more subtle issue arises when data types used in I/O statements match
"accidentally". The code
foo()
{
long l = 123;
printf("l = %d\n", l);
}
functions correctly on machines where types int and
long have the same number of bits, but fails otherwise.
Insure++ detects this error, but classifies it differently from
the previous cases. You can choose to ignore this type of problem
while still seeing the previous bugs. (See
BAD_FORMAT for details.)
In addition to checking printf and
scanf arguments, Insure++ also detects errors
in other I/O statements. The code
foo(line)
char line[80];
{
gets(line);
}
works as long as the input supplied by the user is shorter than 80
characters, but fails on longer input. Insure++ checks for this
case and reports an error if necessary.
This case is somewhat tricky, since Insure++ can only check for
an overflow after the data has been read. In extreme
cases the act of reading the data will crash the program before
Insure++ gets the chance to report it.
Calling functions with incorrect arguments is a common problem in many
programs, and can often go unnoticed.
Insure++ detects the error in the following program
double foo(dd)
double dd;
{
return dd + 1.0;
}
main()
{
printf("Result = %f\n", foo(1));
}
in which the argument passed to the function foo
in main is an integer rather than a floating point
number.
Converting this program to ANSI style (e.g., with
a function prototype for foo )
makes it correct since the argument passed in main will
be automatically converted to double . Insure++ doesn't
report an error in this case.
Insure++ detects several different categories of errors, which you
can enable or suppress separately depending on which types of bugs you
consider important.
Sign errors
- Arguments agree in type but one is signed and the other
unsigned, e.g., int vs. unsigned int.
Compatible
- The arguments are different data types which happen to occupy
the same amount of memory on the current machine, e.g.,
int vs. long if both
are thirty-two bits. While this error may not cause problems
on your current machine, it is a portability problem.
Incompatible types
- Similar to the example above - data types are fundamentally
different or require different amounts of memory.
int vs. long
would appear in this category on machines where they require
different numbers of bits.
During compilation, Insure++'s parser detects a number of
C++-specific problems and prints warning messages. These messages are
coded by the chapter, section, and paragraphs pertaining to that warning
in the draft ANSI standard. Therefore, if you are uncertain what
a particular warning message means or would like additional information,
you can consult the standard for an explanation.
As an example, when processed by Insure++, the code
void foo(char *str) { }
void func()
{
void *iptr = (char *) 0;
foo(iptr);
}
will produce the warning
insure -c foo.C
[foo.C:5] Warning:13-2: wrong arguments passed to function 'foo'
| declared at: [foo.C:1]
| expected args: (char *)
| passed args: (void *)
>> foo(iptr);
Interfacing to library software is often tricky, because passing an
incorrect argument to a routine may cause it to fail in an unpredictable
manner. Debugging such problems is much harder than correcting your
own code, since you typically have much less information about how
the library routine should work.
Insure++ has built-in knowledge of a large number of system
calls and checks the arguments you pass to ensure
correct data type and, if appropriate, correct range.
For example, the code
void myrewind(FILE *fp)
{
fseek(fp, (long)0, 3);
}
would generate an error since the last argument passed to the
fseek function is outside the legal range.
Checking the return codes from system calls and dealing correctly
with all the error cases that can arise is a very difficult task. It
is a very rare program that deals with all possible cases correctly.
An unfortunate consequence of this is that programs can fail unexpectedly
after they have been shipped to customers because some system call
fails in a way that had not been anticipated. The consequences of this
can range from a nasty "core dump" to a system that performs
erratically at the customer location.
Insure++ has a special error class, RETURN_FAILURE,
that can be used to detect these problems. All the system calls
known to Insure++ contain special error checking code that detects
failures. Normally these errors are suppressed,
since it is assumed that the application is handling them itself, but they
can be enabled at runtime by adding the line
insure++.unsuppress RETURN_FAILURE
to a .psrc file. Any system call that
returns an error code will then print a message indicating the name
of the routine, the arguments supplied, and the reason for the
error.
This capability detects a failure in any system call known to Insure++.
Among the potential benefits are automatic detection of errors in
the following situations
and many others.
In order for Insure++ to be able to correctly track memory in
threaded programs, all calls to pthread_create()
or thr_create() must have been "seen" by
Insure++. In Insure++ 3.1 and earlier versions this
meant that all files that ever call thread creation routines must be
instrumented with Insure++. This is still true in version 4.0
if you are using backward compatibility mode (interface_preference tqi tqs).
Starting
with version 4.0, Insure++ defaults to using "Library
Interpositioning" (referred to as TQL interfaces from now on).
This mode guarantees that the above requirement is met, even if
Insure++ was only used to link the executable and none of the
source files have been instrumented.
The previous sections described the various types of problems detected
by Insure++. As you can see, a very large number of problems can
be detected as simply as recompiling your program and running it under
Insure++. Hopefully, this will eliminate many bugs that you might
otherwise ship to your customers.
It would be naive, however, to expect that Insure++ will remove
all of the bugs in your code. Some will still make it through all the
testing steps. Luckily, Insure++ can still help even after you've
shipped your product.
An important way that Insure++ can help you reach the Total
Quality Software goal is to ship two versions of your product to your
customers:
- The normal version, compiled without Insure++
- A version built with Insure++
This second version can be used at the
customer site to help
track down problems. This will dramatically improve the efficiency of
your support staff at finding bugs in the released software.
Footnotes
- (1)
- If you already have a file called
.psrc
in your directory, simply add this line to it.
For more information, call (888) 305-0041 or send email to:
insure@parasoft.com