Debugging Software Crashes

Debugging software crashes is one of the most difficult parts of real-time and embedded software development. Software crashes when an application performs an illegal operation and the operating system is forced to abort the execution of the application. Here we will discuss several causes of crashes in a typical embedded application. A good understanding of how C maps to assembly would be helpful in understanding the content described here.
The following software problems lead to crashes:
Invalid Array Indexing
Un-initialized Pointer Operations
Unauthorized Buffer Operations
Illegal Stack Operations
Invalid Processor Operations
Infinite Loop
Invalid array indexing is one of the biggest sources of crashes in C and C++ programs. Neither language supports array bounds checking, so invalid array indexing usually goes undetected during testing. Out-of-bounds array indexing will corrupt data structures that are allocated in memory after the array. Another point often missed when analyzing array indexing problems is that invalid array indexing can also corrupt data structures declared before the array. This happens when the array is indexed with a very large unsigned number that represents a negative number in signed arithmetic. Consider an array b which is accidentally indexed with the number 0xFFFFFFFF. With 32-bit integers, the array index is treated as a signed value, so this access is treated as an access to index -1. Assuming the variables are laid out in memory in declaration order, this access will corrupt variables declared before the array, i.e. the memory allocated to a. If the array is indexed with an index greater than 99, it will corrupt c.
Array Declaration
    Data1 a;     // Corrupted when b is indexed with 0xFFFFFFFF (-1)
    int b[100];  // Declaration of b. Keep in mind that array indexing is a signed operation
    Data2 c;     // Corrupted when index into b is greater than 99
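One common defense against both failure modes is to route array writes through a helper that validates the index first. The following is only a minimal sketch: checked_store and B_LEN are hypothetical names that do not appear in the original example, and the array is assumed to hold 100 ints as in the declaration above.

```c
#include <stdbool.h>

#define B_LEN 100            /* assumed size, matching int b[100] */

static int b[B_LEN];

/* Hypothetical helper: stores value at b[index] only when index is in range.
 * The index is taken as a signed type so that a "negative" index such as -1
 * is rejected explicitly instead of writing before the start of the array. */
bool checked_store(long index, int value)
{
    if (index < 0 || index >= B_LEN) {
        return false;        /* the write would have landed outside b */
    }
    b[index] = value;
    return true;
}
```

A caller can log or assert on a false return, which turns silent corruption of a neighboring variable into an error that is visible at the point of the bad access.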
Un-initialized pointer operations are also a big reason for crashes in C and C++ programs. This problem is so acute that languages like Java and C# restrict raw pointer operations (Java does not expose them, and C# permits them only in code marked unsafe). If a pointer is not initialized before access, this can result in corrupting pretty much any area of the memory. Sometimes this results in hard-to-detect crashes, as the pointer causing the memory corruption might be located in a completely unrelated area of the code. Also, un-initialized pointers can lead to unexpected behavior when the memory map of the application is modified. This happens if an un-initialized pointer operation was corrupting an unused memory block. Shifting the memory map or resizing data structures might cause the corrupting pointer access to modify memory that is in use. This type of problem should be suspected when a developer has just changed the size of some data structure and a stable application starts crashing.
A special case of this problem is an invalid access resulting from an attempt to read or write using a NULL pointer. Here the detection of the problem is very much hardware dependent. On some platforms, accessing memory for read or write using a NULL pointer will result in an exception. On other platforms, a read using a NULL pointer might go undetected but a write operation results in a crash. On yet other architectures, both read and write accesses using NULL pointers might go undetected.
Another special condition is described below. If UpdateTerminalInfo is called with an un-initialized pointer, there is a possibility that the program does not crash when status is updated in the structure but crashes in UpdateAdditionalInfo when the info variable is updated. This can happen if the beginning of the structure maps to a valid address but the following elements map to illegal addresses.
Un-initialized Pointer Crash
    typedef struct
    {
        int status;
        . . .
        int info;
    } TerminalInfo;

    void UpdateTerminalInfo(TerminalInfo *pTermInfo)
    {
        pTermInfo->status = INSERVICE;
        UpdateAdditionalInfo(pTermInfo);
    }

    void UpdateAdditionalInfo(TerminalInfo *pTermInfo)
    {
        pTermInfo->info = TERMINAL_INFO;
    }
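A partial mitigation is to initialize every pointer to NULL at its declaration and to reject NULL at the top of any function that dereferences it. The sketch below illustrates the idea under stated assumptions: the structure is reduced to the two fields shown above, the values given to INSERVICE and TERMINAL_INFO are placeholders, and UpdateTerminalInfoChecked is a hypothetical name. Note that a NULL check cannot detect a pointer holding a garbage value, which is why consistent initialization matters.

```c
#include <stdbool.h>
#include <stddef.h>

/* Placeholder values; the original does not show how these are defined. */
enum { INSERVICE = 1, TERMINAL_INFO = 2 };

/* Reduced stand-in for the TerminalInfo structure (elided fields omitted). */
typedef struct {
    int status;
    int info;
} TerminalInfo;

/* Hypothetical checked variant: refuses to touch memory through NULL. */
bool UpdateTerminalInfoChecked(TerminalInfo *pTermInfo)
{
    if (pTermInfo == NULL) {
        return false;        /* caller forgot to point this at a real object */
    }
    pTermInfo->status = INSERVICE;
    pTermInfo->info = TERMINAL_INFO;
    return true;
}
```

With this convention, a pointer declared as TerminalInfo *p = NULL; and never assigned produces a reported failure rather than a write to an arbitrary address.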
Many times applications free an area of memory but continue to use a pointer to that memory. This can result in hard-to-detect crashes, as the buffer might have been reallocated to some other application. This might lead to unexpected behavior in a different application. Sometimes this might also cause a crash in the memory management subsystem of the operating system, as the unauthorized buffer access might corrupt the heap management data structures.
A special case of unauthorized buffer operations is covered below. Here the buffer is freed in the function and an access to the buffer is attempted after freeing it. This type of problem might go undetected and might even appear harmless on some systems. However, in a multithreaded design, the buffer might have already been allocated to a different thread!
Unauthorized Buffer Operation
    void foo(Data1 *buf)
    {
        // buf is freed in this line
        free(buf);

        // An access is attempted to buf even after it has been freed up.
        // This might cause a problem if the thread got descheduled between
        // the free statement and unauthorized buffer operation. The buffer
        // might have already been allocated to a different thread!
        buf->x = NULL;
    }
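One way to avoid this pattern is to perform the last access before the call to free and then clear the caller's pointer so that it cannot be reused by accident. The following is a sketch only: release_data is a hypothetical helper, and Data1 is a stand-in definition with a single pointer member x, since the original does not show the real type.

```c
#include <stdlib.h>

/* Stand-in for Data1; the real definition is not shown in the original. */
typedef struct {
    void *x;
} Data1;

/* Hypothetical helper: finishes with the buffer, frees it, and clears the
 * caller's pointer. Taking Data1 ** lets the helper overwrite the caller's
 * copy of the address. */
void release_data(Data1 **pbuf)
{
    if (pbuf == NULL || *pbuf == NULL) {
        return;              /* nothing to free, or already released */
    }
    (*pbuf)->x = NULL;       /* final access happens BEFORE free */
    free(*pbuf);
    *pbuf = NULL;            /* later use is now a NULL access, not a stale one */
}
```

Clearing the pointer does not help other copies of the same address held elsewhere, so this is a local safeguard rather than a complete fix.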
Illegal stack operations can lead to hard-to-detect crashes. This typically takes place when a program passes a pointer of the wrong type to a function. The example given below shows a function that expects an integer pointer while the caller passes a pointer to a character.
char pointer/int pointer mixup
    main()
    {
        char count;

        // The routine expects an int pointer but a char pointer has been passed.
        // Older compilers and non-ANSI C compilers do not catch this error.
        GetCount(&count);

        // The called function was expecting an int (say 4 bytes) variable. It was
        // however passed a char pointer with one byte of space. GetCount will still
        // write four bytes, thus corrupting local variables or parameters on the
        // stack.
    }

    bool GetCount(int *pCount)
    {
        . . .
        *pCount = returnValue;
        return true;
    }
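The usual repair is to give the function exactly the type it expects and to narrow the result afterwards, with a range check. The sketch below assumes a trivial stand-in body for GetCount (the value 42 is made up, since the original elides the real logic), and GetCountAsChar is a hypothetical wrapper. Declaring a prototype before the call also lets an ANSI C compiler reject the mismatched pointer at compile time.

```c
#include <stdbool.h>
#include <stddef.h>
#include <limits.h>

/* Prototype visible before any call, so pointer types are checked. */
bool GetCount(int *pCount);

/* Stand-in body; the original elides the real computation. */
bool GetCount(int *pCount)
{
    if (pCount == NULL) {
        return false;
    }
    *pCount = 42;            /* made-up value for illustration */
    return true;
}

/* Hypothetical wrapper: receives the int safely, then narrows to char. */
bool GetCountAsChar(char *pOut)
{
    int temp = 0;            /* full-size int for GetCount to write into */

    if (pOut == NULL || !GetCount(&temp)) {
        return false;
    }
    if (temp < CHAR_MIN || temp > CHAR_MAX) {
        return false;        /* value does not fit in a char */
    }
    *pOut = (char)temp;
    return true;
}
```

Because GetCount now writes into a real int on the wrapper's stack, no neighboring local variable or parameter is overwritten.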
Processors detect various exception conditions and abort program execution when they detect an error condition. A few of these conditions are:
Divide by zero attempted by application
Program running in user mode attempted to execute an instruction that can only be executed in supervisor (kernel) mode.
Program attempted access to an illegal address. The address might be out of range or the program might not have the privilege to perform the access. For example, a program attempting to write to a read-only segment will result in an exception.
Misaligned access to memory can also result in an exception. Many processors, particularly those common in embedded systems, restrict long word reads to addresses divisible by 4. On such processors an exception will be raised if a long word operation is attempted at an address that is not divisible by 4. (See the byte alignment and ordering article for details)
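Two of these conditions, divide by zero and misaligned access, can often be guarded against in application code. The helpers below are a hedged sketch with hypothetical names (safe_divide and read_u32_unaligned), not a complete treatment. The divide helper also rejects INT_MIN / -1, which overflows and raises the same divide exception on some processors, x86 for example. The read helper copies bytes with memcpy so that no long word load is issued at an address that may not be divisible by 4.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <limits.h>

/* Hypothetical helper: performs num / den only when it cannot trap. */
bool safe_divide(int num, int den, int *result)
{
    if (result == NULL || den == 0) {
        return false;        /* divide by zero would raise an exception */
    }
    if (num == INT_MIN && den == -1) {
        return false;        /* quotient does not fit in an int */
    }
    *result = num / den;
    return true;
}

/* Hypothetical helper: reads a 32-bit value from any address, aligned or not,
 * in the processor's native byte order. */
uint32_t read_u32_unaligned(const unsigned char *p)
{
    uint32_t value;
    memcpy(&value, p, sizeof value);   /* byte-wise copy, no alignment needed */
    return value;
}
```

This matters most when parsing packed message buffers, where a 32-bit field can start at any byte offset.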
When a program enters an infinite loop, it might crash due to invalid array indexing when the loop index exceeds the array bounds and corrupts memory. In other scenarios, the program continues to loop until a watchdog kicks in and aborts the program. If watchdog functionality is not supported, the system will "hang" and never recover from the error. Thus embedded systems should be designed to support watchdog reset functionality.
See the article on fault handling techniques for more details about watchdog handling.
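The watchdog idea can be sketched in software as a countdown that healthy code must keep refreshing. Everything in the example below is an assumption made for illustration: wdt_kick, wdt_tick and WDT_TIMEOUT_TICKS are hypothetical names, and a real system would normally use the hardware watchdog timer provided by the processor or board rather than a plain counter.

```c
#include <stdbool.h>

#define WDT_TIMEOUT_TICKS 3  /* assumed timeout, in timer ticks */

static int wdt_remaining = WDT_TIMEOUT_TICKS;

/* Called by the main loop each time it completes a healthy iteration. */
void wdt_kick(void)
{
    wdt_remaining = WDT_TIMEOUT_TICKS;
}

/* Called from a periodic timer tick. Returns true once the main loop has
 * failed to kick the watchdog for WDT_TIMEOUT_TICKS consecutive ticks,
 * which is the point at which the system should be reset. */
bool wdt_tick(void)
{
    if (wdt_remaining > 0) {
        wdt_remaining--;
    }
    return wdt_remaining == 0;
}
```

A program stuck in an infinite loop stops calling wdt_kick, so the tick handler eventually reports expiry and the system can be reset instead of hanging.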