I’ve spent the past week plugging performance holes in a project at work. I wrote a Rails based service that allows us to manage our large amount of messaging to users. It provides stuff like bounce handling and serves as a central drop off point for other applications to push a message in the queue and let the service take care of the rest.Now, I generally pride myself as being knowledgeable about relational databases. I’ve worked with all the major ones out there and I spent a good part of my career being paid to write one of them. So I should know this stuff right? Well, sometimes I guess you need to be put in your place. I made several mistakes in the approach I took with the messaging service. Here’s a list in no particular order. Don’t do what I did.
Underestimate the Load
I’m sure this one happens a lot. I happily developed this thing on my dev box, put 10k messages into it and all was well. I was off by about a factor of 20x. In less than a week, we’ve already put around 200k+ messages into it. As a result, some queries that I just didn’t think would be an issue are now slamming the db server.
Missing Indexes
I’ve been taught to avoid optimizing too early in development. Wait until something is proven to be slow before optimizing it. In database terms, this is often the choice of indexes. So my tables have a primary clustered key index and nothing more. Yeah, that was stupid. It took all of about 5 minutes to point out a handful of indexes that would immediately be useful. In retrospect, the schema pointed out all the columns you would index very clearly.
Putting the BLOB’s Inline
I did think about this one during development. I left three blob columns in the primary table. I honestly don’t know why I did that. I know better. I’ve been burned by this before. I blame this one more on drinking the Rails koolaid. Leaving the BLOB’s in the model makes for nice clean classes but results in poor db performance.
Assume MySQL’s Query Processor Will Take Care of Things
This is the biggest mistake I made. I now have a much greater understanding why people still pay big money to Oracle and Microsoft. Compared to SQL Server, MySQL’s query processor is vastly inferior. In SQL Server, you can get away with some awful queries. The QP does a fantastic job of turning your slop into fast queries. MySQL is not like that. Even after feeding it indexes that should let it figure out results easily, it seems to not use them. Queries (like select count(*) from table) that I never would have thought as being at risk bring the server to its knees.
Overengineer the Admin Tools
I did some pretty cool work on the admin UI for this thing. It would show real-time rollups of work loads, list recent history, show the last time an application submitted a message, and had a multi-sorting UI for browsing the message queues. In the end, almost all of these are slammed by the load. And much of the data can be represented in lower tech ways like performance monitor graphs and log files. The UI was a great tool for debugging and developing but in production, not a good idea.In the end, I think I spent more effort on getting the Rails part of the service right. I was helping to design an entire REST based API for multiple services and this service was a pilot for how to do the others. Almost all of my mistakes would have been no big deal except #1 - underestimating the load. Know your problem set and put your effort into where it will spend the most time.
Comments 1
In my experience, PostgreSQL does a pretty good job of rearranging queries — it’s quite rare (and the query has to be very complex) for a manual rearrangement to have any effect on the query plan.
Posted 24 Apr 2007 at 15:31 ¶Post a Comment