Bounds Errors
Many perfectly valid optimizations uncover bugs that are masked in the
debug version. Yes, sometimes it is a compiler bug, but 99% of the time it is
a genuine logic error that just happens to be harmless in the absence of optimization
and fatal when optimization is in place. For example, suppose you have an off-by-one
array access in code of the following general form:
void func()
   {
    char buffer[10];
    int counter;
    lstrcpy(buffer, "abcdefghik"); // 11-byte copy, including NUL
...
In the debug version, the NUL byte at the end of the string overwrites the
high-order byte of counter, but unless counter exceeds 16M (2^24), this is
harmless even if counter is still in use. In the optimized build, however, counter
is kept in a register and never appears on the stack; no space is allocated
for it. The NUL byte then overwrites whatever data follows buffer, which may
be the return address of the function, causing an access error when the function
returns.
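The arithmetic behind the off-by-one can be checked mechanically. What follows is a minimal sketch in portable C rather than the Win32 code above: memcpy stands in for lstrcpy, and fits_with_nul and checked_copy are hypothetical helper names introduced only for illustration. It shows that a 10-character string needs 11 bytes, and it refuses the copy instead of writing past the end of the destination.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if src, including its terminating NUL,
   fits in a destination of `capacity` bytes, and 0 otherwise. */
static int fits_with_nul(const char *src, size_t capacity)
{
    assert(src != NULL);
    return strlen(src) + 1 <= capacity;
}

/* Hypothetical helper: copies src into dst only when it fits.
   Returns 1 on success, or 0 (leaving dst untouched) if the copy
   would have run past the end of dst. */
static int checked_copy(char *dst, size_t capacity, const char *src)
{
    assert(dst != NULL);
    if (!fits_with_nul(src, capacity))
        return 0;
    memcpy(dst, src, strlen(src) + 1);   /* the +1 is the NUL byte */
    return 1;
}
```

Called as checked_copy(buffer, sizeof buffer, "abcdefghik") with a 10-byte buffer, this sketch reports failure, because the string plus its NUL needs 11 bytes.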
Of course, this is sensitive to all sorts of incidental features of the layout.
If instead the program had been
void func()
   {
    char buffer[10];
    int counter;
    char result[20];
    wsprintf(result, "Result = %d", counter);
    lstrcpy(buffer, "abcdefghik"); // 11-byte copy, including NUL
then the NUL byte, which used to overlap the high-order byte of counter
(harmless in this example, because counter is obviously no longer needed
once it has been formatted into result), now overwrites the first byte
of result. The consequence is that result now appears
to be an empty string, with no explanation of why. If result
had been a char * variable or some other pointer, you would get
an access fault trying to access through it. Yet the program "worked in
the debug version"! Well, it didn't; it was wrong, but the error was masked.
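The "empty string" symptom can be reproduced without relying on any particular compiler's stack layout. The sketch below is an illustration under stated assumptions, not the original program: a single byte array stands in for the stack frame, with buffer assumed to sit at offset 0 and result immediately after it, so the overrun stays inside one object and is well-defined C. The name fake_frame, the function name, and the value 42 are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { BUFFER_SIZE = 10, RESULT_SIZE = 20 };

/* One byte array stands in for the stack frame.
   Assumed layout: buffer at offset 0, result immediately after it. */
struct fake_frame {
    char bytes[BUFFER_SIZE + RESULT_SIZE];
};

/* Replays the two statements of the example against the fake frame and
   returns the length of result as the program would see it afterwards. */
static size_t result_length_after_overrun(struct fake_frame *frame)
{
    char *buffer = frame->bytes;
    char *result = frame->bytes + BUFFER_SIZE;
    const char text[] = "abcdefghik";     /* 10 characters + NUL = 11 bytes */

    assert(frame != NULL);
    strcpy(result, "Result = 42");        /* result holds a 12-byte string */
    memcpy(buffer, text, sizeof text);    /* the 11th byte lands on result[0] */
    return strlen(result);                /* first byte is now NUL */
}
```

The rest of result is still in memory after its first byte; only that first byte was replaced by NUL, which is why the string merely appears to be empty.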
In such cases you will need to build the release version of the executable with
debug information, then use the debugger's break-on-value-changed feature to look for
the bogus overwrite. Sometimes you have to get very creative to trap these
errors.
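One such creative approach, when a debugger watch is not practical, is to plant guard bytes after a suspect buffer and test them at checkpoints. This is a different technique from the break-on-value-changed feature itself, and the sketch below is only an assumption about how it might look. The names guarded, guarded_init, and guarded_intact, and the 0xA5 pattern, are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { PAYLOAD_SIZE = 10, GUARD_SIZE = 4 };
#define GUARD_BYTE 0xA5   /* arbitrary nonzero pattern, so a stray NUL shows up */

/* Hypothetical guarded buffer: PAYLOAD_SIZE usable bytes followed by
   GUARD_SIZE guard bytes, all inside one array. */
struct guarded {
    unsigned char bytes[PAYLOAD_SIZE + GUARD_SIZE];
};

/* Clears the payload and fills the guard region with the pattern. */
static void guarded_init(struct guarded *g)
{
    assert(g != NULL);
    memset(g->bytes, 0, PAYLOAD_SIZE);
    memset(g->bytes + PAYLOAD_SIZE, GUARD_BYTE, GUARD_SIZE);
}

/* Returns 1 while every guard byte still holds the pattern, 0 otherwise. */
static int guarded_intact(const struct guarded *g)
{
    size_t i;
    assert(g != NULL);
    for (i = 0; i < GUARD_SIZE; i++) {
        if (g->bytes[PAYLOAD_SIZE + i] != GUARD_BYTE)
            return 0;
    }
    return 1;
}
```

Calling guarded_intact after each suspicious operation narrows the overwrite down to the interval between the last check that passed and the first that failed.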
Been there, done that. I once got a company award at the monthly company meeting
for finding a fatal memory overwrite error that was a "seventh-level bug",
that is, the pointer that was clobbered by overwriting it with another valid
(but incorrect) pointer caused another pointer to be clobbered which caused an
index to be computed incorrectly which caused...and seven levels of damage later
it finally blew up with a fatal access error. In that system, it was impossible
to generate a release version with symbols, so I spent 17 straight hours single-stepping
instructions, working backward through the link map, and gradually tracking it
down. I had two terminals, one running the debug version and one running the
release version. Once I had found the error, it was obvious in the debug version
what had gone wrong, but in that unoptimized code the phenomenon shown above had
masked the actual error.