Lock Contention Described
Databases that don't do just row-level locking often use a technique called lock
escalation to achieve better performance. Unless its clear from the outset that
a whole table will be modified, these databases start off using row locks, and
they make plans to trade these locks in for rough-grained locks later if too
many rows are modified.
Unfortunately, lock escalation introduces and amplifies a whole new problem:
deadlock. If two users try to modify semantically-unrelated but physically-near
data in two separate tables in reverse order, both users will start off with
row locks, then try to upgrade them to page locks, and the situation will be
that each user wants something the other user has, so they're stuck. This is
deadlock.
For example:
- UserA modifies some rows in TableA, causing a page lock affecting not just
the rows UserA modified, but many others
- UserB modifies some rows in TableB, causing a page lock affecting not just
the rows UserA modified, but many others
- UserA wants to modify some rows that UserB has locked (but not modified)
in TableB
- UserB wants to modify - or maybe just access - some rows that UserA has locked
(but not modified) in TableA.
Something's gotta give. To deal with this problem, the database occasionally
looks for deadlocks, and kills off one of the transactions so the other can finish.
It usually kills the one that's made the least modifications so that it minimizes
the cost of rolling back changes. Databases that use only row-level locking almost
never have this problem because two users rarely want to modify the exact same
row, and even more rarely do they attain locks in the perfectly poor order needed
to cause deadlock.
Also, databases like this use lock timeouts to prevent users from waiting
too long for a lock. Query timeouts also factor in here. You can write code to
retry queries that timeout, but this only automates database congestion. Any
timeout that is often reached will only serve to worsen the user experience.
Things simply should not take that long.
In practice and under high load, SQL Server's locking system - which is based
on lock escalation - does not perform well. Why? Lock contention. Lock contention
is the problems of deadlock and waiting for locks. In a system in which many
users are modifying the database at once, and many more users are trying to access
the database concurrently, the locks are flying, users spend a lot of time waiting
to attain locks, deadlocks are frequent, and users are far from happy.
Granted, if you've only got a few occasional users you won't have much trouble
with SQL Server's out-of-the-box behavior. You'll be hard pressed to see these
problems with simple in-the-office tests or deployments involving just a few
users. But throw a couple hundred concurrent users at your database and a constant
stream of INSERTS and UPDATES with quite a few DELETEs
sprinkled in, and you'll start reading Oracle literature and eyeing your war
chest. However, I've got a solution for you that will only cost you a code review,
a few minor tweaks, and a system test. You do have a system test procedure in
place, right?