or, why using a sql server database table as an ipc mechanism is not a good idea for a high-performance, high-response, high-availability system (but will be, in the near future)

but, it is a good idea for a low-performance, local use case that involves maybe two or three machines at once, or for testing ipc algorithms

before you chastise me for picking on a specific implementation style, let me just say that i find it quite elegant, and it would work, if only the implementation could match the runtime requirements of how a queue should respond. it's a good idea if we could get rid of some of the fluff a database system carries and keep the good parts, namely, the memory management, storage simplicity, backup simplicity, high availability, single-server integration, the tooling available to manipulate it, that sort of thing. there are a LOT of benefits to using a table, but they are all for naught if the rest of the system suffers because of it. that is my main point.

here is a great presentation describing why it doesn't work, how a competing db system implemented it, and why it never should have become such a huge deal within vpm: PgQ - generic high-performance queue.

and that's what we need to do for vpm's rbq, which, i think, is still possible. but it needs to be done. the rest of this note describes, more or less, the reasons outlined in the presentation.

Here is a discussion of what I've noticed with VPM, in generic form. See if you can draw the same parallels I did! :) : article

And the second part of that article series is here.

This is so popular, there's even a Wikipedia page!

this note is provided as a rough draft, and thus doesn't present its arguments as coherently as a copy editor would.

Queues are an extremely important concept in any computer program that has to deal with multiple inputs and some way to organize those inputs, be they language tokens, passengers boarding an amusement park ride, or cars waiting at a stop light.

However, a queue is only as good as its implementation allows it to function. Take, for example, a system that processes fictional notes representing changes in the ownership of a product, or some abstract notion of a product, perhaps symbolized by a code of some kind, whose owner is also symbolized by a similar code. A "code", in this sense, is a string of language entities, perhaps letters from the Latin alphabet or Devanagari characters.

Naturally, the owners of these symbolized products want to keep track of how many symbols they own, and keep track of whom they've sold them to, or performed transactions with, on certain "markets", let's say. So they decide to hold all this information in a large, high-availability storage program: a database, perhaps. Since most databases these days can be accessed with the SQL language, they could use a "SQL server". In doing so, they want to make sure their information is accurate, so they put a lot of restrictions on what can access the data bank at one time. In computer science, we call these "locks", a way to prevent multiple changes from stacking atop one another and creating inconsistent data. We also get other nice features from using a relational data bank system to store the information, so we do not have to duplicate too many details. This costs us in terms of disk efficiency, which is in high demand, but not high supply (unless we have more than one disk controller).

Getting back to queues, a queue also has a specific set of requirements to follow. First, let's list the outputs:

  1. We define pop() to return AND SIMULTANEOUSLY remove the front element from our queue. If we have a priority queue, this is the one with the highest priority. This operation should execute in constant time (or logarithmic time at worst, for a heap-backed priority queue), and why shouldn't it?
  2. We define peek() to return the front element without removing it. We can extend peek() with a filter that returns elements of the sub-queues specified below. This operation should also execute in constant time.
  3. We further define a set of sub-queues within the general queue, corresponding to different code types, as defined above. These codes, "opcodes", one could say, could be used as input to a separate component that processes messages on this queue.
  4. We define other sorts of fun information to store, like application-specific things (say, the symbolic codes we defined before).

Now, how do we get things into this magical entity? We define the following:

  1. push(), which allocates another entry in the queue and copies the input element into it. If we have a priority queue, it reorganizes the queue so that the copy of the input element is in its proper place, such that peek() or pop() would return it if it were the front element.
  2. repost(), which takes an element that has already been posted (and is visible in the queue's history) and submits it to the reader for reprocessing. Note that we do not have to needlessly copy the element ourselves; the queue can copy it for us, or do some manipulation. The point is, we don't care what it does, and that's important: it reduces complexity on our side. This must run in constant time as well.

And that's it. That's all we need. It might also be useful to keep an archive of messages received by the queue.

".... Hey, I've got a great idea! Lets use a SQL server table! It has everything we need! We can pop by using "SELECT TOP 1 ...", we can push by using INSERT! And look, we don't even need to do anything special to implement priority queues! We can just use ORDER BY to do everything for us!"

And, that's not a bad idea, at first. Apart from breaking our promise about constant-time constraints (SELECT TOP 1 *could* be constrained by the size of the table if we have a funny WHERE clause), everything just fits, until we come to some interesting application requirements:

  1. External applications must be able to manipulate this queue. Everything that uses our system MUST write to this table, otherwise nothing will happen. Even applications we don't control!!!
  2. This thing is supposed to get written to A LOT. Like, 50 times a second!
  3. We have to be able to "re-post" things back onto this queue so they're recirculated by the consumer program! We thus need to keep an "archive" of requests and ensure they don't become outdated or mislabeled! We should also be able to copy them from one machine to another, ideally. (A sketch of re-posting follows this list.)
  4. It'd sure be really nice to use this archive to track failures! Because, you know, then we can "re-post" these things!
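
Requirement 3, to be fair, is where the table shines. A sketch of re-posting, assuming a hypothetical dbo.message_archive table holding everything the queue has ever seen:

    -- repost(): copy a previously processed element back onto the live queue
    -- (the archive table and its columns are hypothetical)
    INSERT INTO dbo.message_queue (priority, opcode, symbol_code, payload)
    SELECT priority, opcode, symbol_code, payload
    FROM dbo.message_archive
    WHERE id = @archived_id;

The caller never touches the element itself, which is exactly what repost() promised above.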

So, some of these requirements aren't bad. The archival stuff we've already got by using the database! But the big problems are the first two requirements. How can we force external applications to use this whenever they do anything? How can we expect them to keep up with changes to the format this queue will take?

The most common solution is to force consumers of the data to use a library that mediates all access to the data bank and to the user's output. But that discards the advantage we had in using an easily accessible data bank, namely a "SQL server". We can't expect the user to know the intricacies of the data schema, can we?

The answer is that users don't give a damn. They will do whatever is easiest for them, including changing the schema for fun and profit (mostly profit).

So, instead, since we're already using a database for communicating between our input and our processing, and since all users must eventually go to the database, let's put it in there! They'll have to use it! But where?

The answer is: at every operation that could potentially affect the state of these symbolic codes, that is, anything that could change the sum total of what an owner holds (their "position", we could say). Fortunately, our "SQL server" has a mechanism to automatically run a program at each and every change in the database. What are our requirements for this program?

  1. Compute and push() a message into our queue. Make sure we don't push a duplicate message into our queue, though. We can just "re-post" it later. (Is this a good idea? It kind of breaks our assumption about a queue.... oh well!) Wait! That's another application requirement that just kinda snuck in there.
  2. Run REALLY FAST! We don't want to slow things down just because of a design choice, do we?

This, actually, is okay. So long as our requirements are met, there's no reason we can't run a program for each change.

So, we go off, hack a little bit, and it works! Whenever our symbolic code quantity changes, we fire off this program, insert an element into our queue, and we can process it later! We can even store it and archive it and everything!
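
The hack might have looked roughly like this: a trigger on a hypothetical dbo.positions table (again, all names invented), with the "no duplicates" rule baked in:

    -- hypothetical trigger: runs inside the SAME transaction as every
    -- change to the positions table, pushing or refreshing a queue element
    CREATE TRIGGER trg_positions_enqueue
    ON dbo.positions
    AFTER INSERT, UPDATE
    AS
    BEGIN
        SET NOCOUNT ON;

        -- refresh any element that is already pending (the dedup rule)...
        UPDATE q
        SET q.payload = CONVERT(VARCHAR(MAX), i.quantity)
        FROM dbo.message_queue AS q
        JOIN inserted AS i ON i.symbol_code = q.symbol_code
        WHERE q.opcode = 'POSITION_CHANGED';

        -- ...otherwise push() a fresh one
        INSERT INTO dbo.message_queue (priority, opcode, symbol_code, payload)
        SELECT 0, 'POSITION_CHANGED', i.symbol_code,
               CONVERT(VARCHAR(MAX), i.quantity)
        FROM inserted AS i
        WHERE NOT EXISTS (
            SELECT 1 FROM dbo.message_queue AS q
            WHERE q.symbol_code = i.symbol_code
              AND q.opcode = 'POSITION_CHANGED'
        );
    END;

Every statement in there runs inside the transaction that made the original change, which is exactly where the trouble below comes from.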

But soon, the problem rears its head once the system is being used by real users. They complain about things like:

  1. "[unrelated operation concerning metadata] is too slow!"
  2. "[program that uses queue data] is too slow! I don't see updates!"
  3. "[operation changing our symbolic data]" is too slow!"

The poor programmer peons toil and toil, and eventually find some inefficiency in the database call. "Well, that's normal," they say, "the database is sometimes slow, maybe it's an index or something." But upon investigating further, something deeper is wrong, and it is what is fundamentally wrong with the initial implementation.

There's too much going on to implement our fundamental operations!

Whenever the program decides it needs to do an operation, simple or complicated, each change requires the following steps:

  1. Lock the table to modify.
  2. Write or change the new data into the container, including checking the symbolic code.
  3. Check to see if there's a message queue element already there. If there is, load it.
  4. Lock the message queue table, since we're pushing into it (or modifying it!).
  5. Write or update the message element.
  6. Ensure the message element data references the right symbolic codes.
  7. Unlock the message queue table.
  8. Process the next element for the operation.
  9. Unlock the table that we've modified.

The above steps could be called a "transaction". We have nice algorithms for dealing with that. But here's the problem:

There is no way that the above steps can be *efficient*!!!
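
Spelled out, one "simple" change becomes something like this (names hypothetical as before; the point is how long the locks live):

    BEGIN TRANSACTION;  -- step 1: the outer lock brackets everything below

        -- step 2: write the new data, checking the symbolic code
        UPDATE dbo.positions
        SET quantity = quantity + @delta
        WHERE symbol_code = @code;
        -- steps 3 through 7 happen implicitly right here: the trigger
        -- sketched above fires inside this same transaction, taking and
        -- holding locks on the queue table while it checks for an
        -- existing element and writes or updates one

        -- step 8: repeat for the next element of the operation
        UPDATE dbo.positions
        SET quantity = quantity - @delta
        WHERE symbol_code = @other_code;  -- ...and the trigger fires again

    COMMIT TRANSACTION;  -- step 9: only now is any of it released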

We have a whole extra lock acquired and released just to do something as simple as adding to the queue! Something as simple as push()! Furthermore, we're also checking the queue each time, something it is not defined to do. And we have the additional overhead of the fact that we're using a database, something designed for long-term storage, preferring lots of reads and few writes, for something that is supposed to absorb a lot of writes:

  1. We're checking to make sure our symbolic codes are correct.
  2. Adding a new index entry to whatever indices we have.
  3. The locking contention as mentioned before.
  4. Shuffling around entries to accommodate the new entry.

And all we're doing is adding to the end! Why do we need to shuffle things around? For our priorities? In a different queue, that could be implemented far more efficiently than what we're forced into as a consequence of using a database table as our underlying mechanism. Not to mention the extra time the first lock is held as a result of constantly grabbing and releasing the queue lock.

Keeping a lock on the queue isn't a database problem; any implementation could have that problem. But since we use a database table, we're forced to keep that lock, just as we are forced to use a bunch of other things we don't need.

In addition, using this table as a queue has given us a bunch of headaches as a result, namely supporting a more complicated program in an area that is normally reserved for data only (our program that executes for each change in our symbolic entities).

The end result is that, once two users try to hit two separate tables that hold disjoint pieces of the overall picture, our system grinds to a halt. And that's not even considering how the other piece of the puzzle (our consumer, our processor, forgot about that?) has to access and remove these things!

The main problem in all this mess is that we're forced to use things the database designers thought were good ideas. And they're right, they are good ideas. Just not for everything.

The problem is, we've become so dependent on the nice features that were provided once we drank the Kool-Aid and decided to use a table. We have a nice way to archive our queue elements, we have a nice, standardized way to write them, we have a way to inspect them (when necessary), we have a standardized way to pull them out. But in that standardization, we lose a great deal of performance, which is critical if we need to continuously write to this thing 50 times a second for more than one user.

So what is the solution? We have to decouple the queue from the database and create its own mini-database, in a sense, just for it. One that can guarantee the specified response times. Otherwise, it'll work, but it'll always be crippled.

On a low-response system, this is okay. We still get our end result. But it can't work fast; it is impossible for it to work fast, because we lack the flexibility to do away with the assumptions we no longer need.

Honestly, I'd love to be wrong. I'd like a message-queue-as-a-table to work. And you know what, it can, if we use a database system that supports plugins. I may also be overstating the issue by grandstanding here, because I have shown that a lot of the bottleneck is due to the constant checks we do and the lock we hold as a result of allowing modifications. If we remove those, who knows? I wanted to test more, but . . .

I invite all criticism. Please, tell me if I'm wrong and where.

There were a few techniques I never got the time to try before I got shot down, like rotating the set of open queue items into a temp table and consistently SELECTing off of that (sketched below), but that's because I thought it wouldn't amount to much, or at least not to as much improvement as moving to a secondary application/server architecture.
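
For the record, the rotation idea would have looked something like this (untested; and note that a #temp table is per-connection in SQL Server, so multiple readers would actually need a shared staging table):

    -- untried sketch: periodically snapshot the open items into a scratch
    -- table so readers poll a small, stable set instead of the hot queue
    IF OBJECT_ID('tempdb..#open_items') IS NOT NULL
        DROP TABLE #open_items;

    SELECT id, priority, opcode, symbol_code, payload
    INTO #open_items
    FROM dbo.message_queue;

    -- readers then peek off the snapshot, away from the write-heavy table
    SELECT TOP (1) *
    FROM #open_items
    ORDER BY priority DESC, id ASC;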

--jim briscoe email me