Minimize Your Database Connection Time
One of the techniques that I learned and used way back when I did mostly ASP development was to:
- connect to the database only when I needed to,
- retrieve only (and all of) what I needed, and then
- disconnect from the database as soon as possible.
In working with ASP and MS SQL Server, following these three steps was easy. I'd open my connection, build my query to return one or more recordsets, and then use GetRows to pull all of the data into a multi-dimensional array. After that, I'd close the connection. From opening the connection to closing it, there were usually only a few lines of code. With all my data now in ASP, I could manipulate it and pass it on to my template.
PHP and MySQL can and should be used in a very similar fashion. There's a little extra grunt work in that you have to make multiple queries and loop through each recordset to get it into arrays, but the benefits are worth it. Jay Pipes' recent blog post on MySQL Connection Management sheds some light on this.
MVC frameworks should be designed to do this well. The request comes in to the controller, which queries the model for any data it needs. The model should then be responsible for either pulling from the cache or connecting to the database and retrieving the data at that time. Once the controller has all the data it needs, it can close the connection to the database and then proceed to send the data on to the view.
Minimizing your connection time is an approach you should take for any external resource, be that reading from files or downloading data from a remote server.
This is a good tip with low-volume sites. However, with high-volume sites the overhead of opening and closing connections becomes significant.
Thus, many frameworks (I won't say MVC frameworks, this is a different issue) use connection pooling: they keep a fixed number of connections more-or-less permanently open and reuse them.
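A toy illustration of the idea (Python; a fixed-size pool built on queue.Queue, with sqlite3 standing in for the database driver -- not production code):

```python
import queue
import sqlite3


class ConnectionPool:
    """Keep a fixed number of connections open and hand them out on demand."""

    def __init__(self, db_path, size=3):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            # check_same_thread=False so pooled connections can cross threads
            self._pool.put(sqlite3.connect(db_path, check_same_thread=False))

    def acquire(self):
        return self._pool.get()  # blocks if all connections are in use

    def release(self, conn):
        self._pool.put(conn)     # return to the pool; never actually close


pool = ConnectionPool(":memory:")  # note: each :memory: conn is its own db
conn = pool.acquire()
one = conn.execute("SELECT 1").fetchone()[0]
pool.release(conn)  # reused by the next caller, with no reconnect cost
```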
One of my favorite tricks for this is in ORMs that let you "select related" -- they'll work out what objects you explicitly asked for, and if you tell them to "select related" they'll also figure out what other objects (through foreign keys or intermediary many-to-many tables) are related to them, and fetch the whole batch over the same connection.
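A rough sketch of what such an ORM generates under the hood (Python/sqlite3 for illustration; the authors/posts schema and the hand-written JOIN stand in for the ORM's query):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT,
                        author_id INTEGER REFERENCES authors(id));
    INSERT INTO authors VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO posts VALUES (10, 'Intro', 1), (11, 'Update', 2);
""")

# The naive approach is one query for the posts, then one more per post
# for its author (the classic N+1 problem).  "Select related" collapses
# that into a single JOIN, so the related objects come back in the same
# round trip:
rows = conn.execute("""
    SELECT posts.title, authors.name
    FROM posts JOIN authors ON authors.id = posts.author_id
    ORDER BY posts.id
""").fetchall()
conn.close()

# Each post arrives with its author already attached -- more data held
# in memory, but only one trip over the connection.
```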
Of course, this is more of a performance tradeoff (because it means more objects in memory) than a pure performance enhancement, but for many common cases it's a worthwhile tradeoff to make.
Thanks for the link to Jay Pipes, it was a good read. While I feel that caching to disk as shown in the article is probably not the most efficient way to go, the example illustrates nicely at a high-level how you can go about implementing DB caching.
Reg: even with connection pooling, you still have the issue of available connections in the pool. The sooner you can release a connection back into the pool, the better. Also, the data provider often handles the connection pooling for you (as is the case with ADO, if I'm not mistaken).
I'm not sure about the older ADO connections, but ADO.NET definitely handles connection pooling for you automatically.
My understanding is that you do close the connections yourself, as soon as you've used them, but .NET will keep the connections alive in the background, so the next connect requires less overhead.
Even though it is not new, I would like to emphasize that these rules, slightly modified, can also be applied to client/server instead of web-server/database. The solution is of course XML+XSL, and it results in a performance gain.
1. connect to the server only when I needed to,
2. retrieve only (and all of) what I needed, and then
3. send it to the client.
With all my data now in XML, I could render my HTML with my XSL template, and if the user re-sorts a list, no new connection to the server is needed. I could also re-render with a completely different XSL template and present the data in a different fashion.
When presentation logic is moved to the clients, the server can serve more clients.