The Evolution of Software Debugging
Dr. Adam Kolawa, ParaSoft
Software debugging is the process by which developers attempt to remove
coding defects from a computer program. It is not uncommon for the debugging
phase of software development to take 60-70% of the overall development time.
In fact, debugging is responsible for 80% of all software project overruns.
Ultimately, a great amount of difficulty and uncertainty surround the crucial
process of software debugging.
This is because at each stage of the error detection process, it is
difficult to determine how long it will take to find and fix an error, not
to mention whether or not the defect will actually be fixed. In order to
remove bugs from the software, the developers must first discover that a
problem exists, then classify the error, locate where the problem actually
lies in the code, and finally create a solution that will remedy the
situation (without introducing other problems!). Some problems are so
elusive that it may take programmers many months, or in extreme cases, even
years to find them. Developers are constantly searching for ways to improve
and streamline the process of software debugging. At the same time, they
have been attempting to automate techniques used in error detection.
Over the years, debugging technology has substantially improved, and it
will continue to develop significantly in the near future. This article puts
recent developments in debugging technology in historical perspective and
describes where the technology is headed.
A study of debugging technologies reveals an interesting trend. Most
debugging innovations have centered on reducing the dependency on human
abilities and interaction. Debugging technology has developed through several
distinct stages.
At the dawn of the computer age it was difficult for programmers to coax
computers to produce output about the programs they ran. Programmers were
forced to invent different ways to obtain information about the programs
they wrote. Not only did they have to fix the bugs, they also had to build
the tools to find them. Devices such as oscilloscopes and program-controlled
indicator bulbs served as some of the earliest debugging aids.
Eventually, programmers began to detect bugs by putting print instructions
inside their programs. By doing this, programmers were able to trace the
program path and the values of key variables. The use of print-statements
freed programmers from the time-consuming task of building their own
debugging tools. This technique is still in common use and is actually
well-suited to certain kinds of problems.
Although print-statements were an improvement in debugging techniques,
they still required a considerable amount of programmer time and effort.
What programmers needed was a tool that could execute one instruction of a
program at a time and print the value of any variable in the program. This
would free the programmer from having to decide ahead of time where to put
print-statements, since he could inspect any variable as he stepped through
the program.
Thus, runtime debuggers were born. In principle, a runtime debugger is
nothing more than an automatic print-statement. It allows the programmer
to trace the program path and the variables without having to put print-
statements in the code.
Today, virtually every compiler on the market comes with a runtime
debugger. Support for the debugger is enabled by a switch passed to the compiler
during compilation of the program. Very often this switch is called the
"-g" switch. The switch tells the compiler to build enough
information into the executable to enable it to run under the runtime
debugger.
The runtime debugger was a vast improvement over print statements,
because it allowed the programmer to compile and run with a single
compilation, rather than modifying the source and re-compiling as he tried
to narrow down the error.
Runtime debuggers made it easier to detect errors in the program, but
they failed to find the cause of the errors. The programmer needed a better
tool to locate and correct the software defect.
Software developers discovered that some classes of errors, such as memory
corruption and memory leaks, could be detected automatically. This was a step
forward for debugging techniques, because it automated the process of finding
the bug. The tool would notify the developer of the error, and his job was to
simply fix it.
Automatic debuggers come in several varieties. The simplest ones are just
a library of functions that can be linked into a program. When the program
executes and these functions are called, the debugger checks for memory
corruption. If it finds this condition, it reports it. The weakness of such
a tool is its inability to detect the point in the program where the memory
corruption actually occurs. This happens because the debugger does not watch
every instruction that the program executes, and is only able to detect a
small number of errors.
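A minimal sketch of the library approach in C follows; the dbg_ names and the guard-word scheme are invented for illustration, and real checking libraries are far more thorough:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical replacements for malloc/free that a checking library
   might link into a program. Each block is padded with a trailing
   guard word; dbg_free checks the guard, and a live-allocation
   counter reveals leaks at program exit. */

#define GUARD 0xDEADBEEFu
static int live_allocations = 0;

void *dbg_malloc(size_t size) {
    /* allocate room for the user data plus a trailing guard word */
    unsigned char *block = malloc(size + sizeof(unsigned));
    if (!block) return NULL;
    unsigned guard = GUARD;
    memcpy(block + size, &guard, sizeof guard);
    live_allocations++;
    return block;
}

int dbg_free(void *p, size_t size) {
    /* returns 1 if the trailing guard is intact, 0 if it was overwritten */
    unsigned guard;
    memcpy(&guard, (unsigned char *)p + size, sizeof guard);
    int intact = (guard == GUARD);
    if (!intact)
        fprintf(stderr, "memory corruption detected past %zu-byte block\n", size);
    free(p);
    live_allocations--;
    return intact;
}

int dbg_leaks(void) { return live_allocations; }
```

Note that this sketch exhibits exactly the weakness described above: the overrun is noticed only when dbg_free runs, not at the instruction that actually overwrote the guard.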
The next group of runtime debuggers is based on OCI (object code
insertion) technology. These
tools read the object code generated by compilers, and before programs are
linked, they are instrumented. The basic principle of these tools is that
they look for processor instructions that access memory. In the object code,
any instruction that accesses memory is modified to check for corruption.
These tools are more useful than the ones based on library techniques, but
they are still not perfect. Because these tools are triggered by memory
instructions, they can only detect errors related to memory. These tools can
detect errors in dynamic memory, but they have limited detection ability on
the stack, and they do not work on static memory. They cannot detect other
types of errors because of inherent weaknesses in OCI technology. At the object
level, a lot of significant information about the source code is permanently
lost and cannot be used to help locate errors. Another drawback of these
tools is that they cannot detect when memory leaks occur. Pointers and
integers are not distinguishable at the object level, making the cause of
the leak undetectable.
The third group of runtime debuggers is based on SCI (source code
insertion) technology. The tool reads the source code of the program,
analyzes it, and instruments it so that every program instruction is
sandwiched between the tool's instructions.
Because the tool reads the source code, it can discover errors related to
memory and other large classes of errors. Moreover, for memory corruption
errors, the tool is able to detect errors in all memory segments including
heap, static, and stack memory. The big advantage of this tool is that it
can track pointers inside programs, so leaks can be traced to the point where
they occurred. This generation of tools is constantly evolving. In addition
to looking for memory errors, these tools are able to detect language
specific errors and algorithmic errors. These tools will be the basis for
the next step of technological development.
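What "sandwiching" an instruction looks like can be sketched in C; the names below are invented, and a real SCI tool generates far richer checks (bounds, liveness, leak bookkeeping) along with file and line information:

```c
#include <stddef.h>

/* Hypothetical check an SCI tool might insert: validates a pointer
   before the original statement dereferences it. */
static int checked_deref_ok(const int *p) {
    return p != NULL;   /* a real tool would also check bounds and liveness */
}

/* Original source statement:      *p = v;
   After instrumentation, the statement is sandwiched between the
   tool's own calls, roughly like this: */
int instrumented_store(int *p, int v) {
    if (!checked_deref_ok(p))
        return 0;        /* report the error instead of crashing */
    *p = v;              /* the original program instruction */
    return 1;
}
```

Because the tool sees the source, it knows that p is a pointer to int, which is precisely the type information that is lost at the object-code level.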
All of the present tools have one common drawback. They still require the
programmer to go through the extra step of looking for runtime errors after
the program is compiled. In a sense, the process hasn't changed much since
the Debugging Stone-Age. First, you write code, and then you check for
errors. This two-stage process still exists, only at a higher level. The
procedure needs to be integrated into one stage.
The next logical step in the development of automatic debugging techniques
is to integrate these technologies with compilers. There is no reason why
compiler vendors could not expand the use of the "-g" switch.
Instead of just providing information for runtime debugging, the
"-g" switch, by default, should perform automatic runtime error
detection. Such tight integration would have a tremendous impact on
programmers' productivity. A major problem with automatic runtime debuggers
is that they are not used enough. Programmers typically develop their code,
and once they are at the end of the development process they run the error
detection tool. This is inefficient, because the programmer could use the
runtime debugger to discover errors throughout the whole development process,
catching defects during the very first run of the new program.
This is a logical step in the evolution of debugging: human intervention
will be eliminated. When this technology is integrated with compilers,
runtime debugging will be transparent rather than a distinct process.
Apart from functionality and ease-of-use, the integration of runtime
debugging tools with compilers will lead to a significant technological
advance. SCI-based tools are precursors of this step. Such tools are very
similar to compilers. Tools that are based on this technology parse the
program, generate a parse tree, instrument it, and write out source code that
is passed to the compiler. When tools such as these are integrated with the
compilers, the process is significantly shortened. Instead of using an extra
tool to parse the source code, the code can be parsed using the compiler's
own parser. After the parse tree is generated, the tree can be passed to the
debugging tool for instrumentation. Once instrumented, the parse tree can
be sent back to the compiler for code generation. The SCI debugging tool is ready
for full integration with compilers, and this process has already begun.
As stated earlier, the debugging process will be significantly improved
in the near future. SCI-based tools already contain the technology needed
for integration with compilers, and we expect that integration to happen
with relative ease. Eventually, developers will come to expect compilers to
support runtime debuggers; it will be a required feature.
The question is not if the integration will happen, but rather when
it will happen. The compiler vendors who integrate with SCI-based tools will
have a significant advantage. There are two reasons for this:
- Productivity of developers working with these compilers will be much
higher, thus platforms offering such compilers will be the platforms
of choice for serious code development. At present, most compilers are
relatively similar. This is true for independent compiler vendors as
well as for the compilers provided with the machine. The integration
with the SCI-based automatic debuggers will make the compiler and
platform more attractive.
- The experience developed from integrating SCI tools with compilers
will lead to yet another technological advancement. An SCI-based
automatic runtime debugger is similar in its design to other tools.
Development tools such as syntax checking, coverage analysis, and
others rely on their own ability to analyze and instrument the parse
tree. At present, all of these tools rely on their own parsers to
perform their functions. The development and maintenance of parsers
is a significant task for companies that provide these tools. Tool
development would be easier if, instead of parsing code, these tools
could use the compiler parse tree to perform their functions. We
believe that technology developed during the integration of SCI-based
automatic runtime debuggers will lead to easier ways for tools to be
integrated with compilers. Finally, because of this, more third-party
tools will be available for such compilers.
When one of the compiler vendors integrates SCI-based automatic debugging
with their compiler, other vendors will follow suit. As more and more
integration happens, third-party vendors will require some type of
standard interface that allows them to connect their tools to compiler parse
trees. This will naturally lead to the development of a standard interface.
This will be beneficial to all parties. Compiler vendors will get more tools
that can work with their compilers, and tool vendors will not have to support
their own parsers. Programmers will benefit from sophisticated systems that
assist them in debugging their software.
The future of automatic debugging tools will be very similar to what is
currently happening in consumer computer products. For example, the developers
of basic spreadsheets allow third-party vendors to connect their modules to
the basic application, and the modules are seamlessly integrated into a whole
system. The same thing will happen when compiler vendors and tool vendors
integrate their systems.
The future of development tools is very exciting. We believe that automatic
error detection will expand. The development of OCI-based automatic error
detection tools was a big step forward. At present, this technology is being
replaced by SCI-based tools, which are more accurate, can detect larger
classes of errors, and are extensible. These tools will be the basis of future
compiler-based automatic detection tools which will detect errors that at this
stage we cannot even imagine.