Engineering Tradeoffs
What is the best mechanism? Obviously, an n log n sort
is infinitely preferable to an n2 sort. No question. But
sometimes you don't have the option. For example, I once had to keep a sorted
list of objects. These objects had been stored in a file. Version v of
the program stored these in most-recently-created order, although each object
had a relationship to other objects which was later handled during the processing.
I had added some new features to the program, which now required that the list
be kept in sorted order. It was easy. I just read the list in, created a side
vector, sorted the vector, then relinked the list based on the sorted order.
Unfortunately, this was on a small machine (MS-DOS, 640K). For a really large
input file, by the time I read the whole list in, there was no space left to
allocate the side vector for sorting. This was, of course, a fatal error; the
program could not read existing large files. Fortunately, this was discovered
during the beta testing. So what I did was read the file in, and try to malloc the
side vector. If there was insufficient space, I got back a NULL pointer.
At that point, I would do a bubble-sort of the list, swapping the positions
of the elements in the list (it was a two-way linked list, so this was easy).
The difference was that when I could call qsort, it only took a couple
seconds (even on a 6MHz 286) to do the sort. But when I did the bubble sort,
it took several minutes. So all I did was post a message saying "converting
old file to new format, this may take several minutes", then sorted the
file. Once it was sorted, of course, it did not need to be sorted again, so
this happened exactly once for each large file read in.
In another program I wrote many years ago, I was sorting large symbol tables
for various kinds of display (classes, members, etc.). Being in a hurry to
get to the final stages, I tossed a bubble sort into the sort algorithm because
I could write it in under five minutes, about as fast as I could type. After
I'd run all the tests, I ran a "real" example. The program printed
out the message "Starting reports" and ten minutes later it hadn't
finished. Ouch! Obviously, the problem was my poor n2 sort.
So I got out my favorite n log n sort algorithm (this
was in the days before C; my program was not written in C, and qsort did
not exist), dusted it off, translated it to the language I was programming
in, and replugged the sort subroutine.
I then reran the program. It printed out "Starting reports", and
ten minutes later it hadn't finished! A bit of poking with the debugger revealed
that I was still in the sort routine. This didn't make any sense. So
I tossed in my performance measurement tools, and learned something that I
should not have forgotten.
When we talk about complexity, we simplify our discussions. We talk about n log n, n2,
and such, but forget one thing: these designations are not accurate!
The correct designation is C × (n log n)
or C × n2,
where C is the "constant of proportionality". My detailed measurement
showed that I was actually expending all my time in the string-compare routines,
the equivalent of strcmp. So while I had reduced the fundamental complexity,
I had not reduced the string-compare time, so I was doing n log n string
compares of somewhat lengthy strings.
What to do? Well, because of the way I was handling
the reports, I had to re-sort the information many times based on a multifield
structure, where the fields were string names. So what I did was take my
symbol table, build a side vector which embodied all the names in
the symbol table, and for each symbol table entry, I assigned an integer
which was its position in the side vector. Thereafter, when I needed to sort
a set of names, instead of reaching out through its symbol table pointer
and doing a string compare, I reached out and picked up the integer. If name n1 was
less than name n2, then its corresponding integer value k1 was
necessarily less than the integer value k2for name n2,
The result was that the sorting and report generation required less than
one minute (on a machine roughly comparable to a 286 computer, with about
as much memory).