The Case Against Stored Procedures

Introduction

What's this page about? Well, I recently searched the web to try to find information supporting my argument to NOT use stored procedures in a large web project that I'm one of the lead programmers for. Why? Well, I'll get to that in a little bit - for now I just want to say that all I could find on the web were pages about how wonderful stored procedures are and why everyone should use them. And let me say right off the bat that if you're a big fan of stored procedures, and they're working well for you and/or your organization, then more power to you. I'm glad that you're happy with them. Just don't bother emailing me to try to convince me or make me see the light about stored procedures - you'll be wasting your time.

(Update, Feb 2005. I finally found a couple other web pages that argue against using stored procedures. Check out:
http://weblogs.asp.net/fbouma/archive/2003/11/18/38178.aspx and
http://weblogs.asp.net/fbouma/archive/2003/05/14/7008.aspx and
http://www.15seconds.com/issue/040112.htm#StoredProceduresAndCulturalBarriers)

I guess I should start out by giving my credentials. I graduated from Penn State in 1989 with a Bachelor's Degree in computer science (with a math minor). I finished with a 3.53 GPA, and that was with all four years spent at the main campus (I've heard that it's easier to get a good GPA in technical majors if you start at a branch campus your first couple years, because the main campus always has too many students and tries to "weed out" some of them by making the entry-level classes especially difficult). After graduating, I got a job working as a civilian employee of the Navy, initially writing COBOL and ADSO programs on a mainframe. Things change quickly, and over the course of the last 14 and a half years, we've gone from mainframes to client-server to web, and from ADSO to Visual Basic to .NET. During that time, I've risen to the highest position a "techie" can hold in the government (to go any higher, I'd have to become management, which I have no interest in). I've been told regularly by the various supervisors and team leaders that I've worked for that I'm one of the best programmers they've ever worked with (not that I'm bragging - it's not all that difficult when most of the programmers you work with have a computer education that includes just a six-week COBOL class updated every few years with a week-long VB or .NET class). Anyway, the point that I'm trying to make is that I've been programming computers almost half my life (more than half, if you count playing around with a Commodore 64 in high school), so I know what I'm talking about here.

Now, exactly what axe is it I'm trying grind here? Only fair to ask. Despite my desperate pleas to the contrary, management where I work has become convinced (by a couple database designers and an outside contractor) that all database access should be done via stored procedures. I hate using stored procedures. Just absolutely hate it. The reasons are outlined below, but the biggest reason is that it takes all the flexibility and power of a relational database out of the hands of the programmer and replaces it with inflexible, often confusing calls to stored procedures. To make matters worse, our two database people seem to expect to write all the stored procedures themselves. Talk about a bottleneck. I'm willing to bet that, six months from now, we're going to have a couple dozen programmers all sitting around twiddling their thumbs waiting for countless stored procedure changes to be made. And...well, I'm getting ahead of myself. Let's just say that I've given up trying to convince management that they've made a huge mistake, and have to settle for writing this little web rant instead, just to vent a little steam. And maybe save someone else from making the same mistake.

The Case

I guess the best way to do this is to show, point by point, why a lot of the claims about the benefits of stored procedures just don't hold up. Also, I've noticed that the proponents of stored procedures never seem to mention the advantages of dynamic (non-stored-procedure) SQL, so I'll list those too.

Speed

The first claim that is usually made is that stored procedures are a lot faster than dynamic SQL. I've run tests, and it's not true. To be fair, all my tests have been run against Sybase databases, because that's all I have available to me. Maybe other relational databases are faster. But it definitely was not true for the tests I ran.

Several years ago, I wrote a client-server test program that ran a large, very complicated select statement as both dynamic SQL and a stored procedure. The exact same select was used in both cases - it joined together several tables using a complex where clause and returned hundreds of rows of data. And guess what - they ran at about the same speed. The stored procedure would occasionally run a few milliseconds faster, but then the dynamic SQL would also occasionally run a few milliseconds faster.

So when the controversy recently arose again, I wrote a ASP.NET web page that ran both a select statement and an update statement, as dynamic SQL and as a stored procedure. This time I used stored procedures that the database team had written, so one would assume they would run as quickly and efficiently as possible. Then I came up with my own dynamic SQL select and update statements to do the exact same tasks. I wrote the ASP.NET code to try to run both versions of the statements as efficiently as possible (i.e. I defined to the database command objects that the stored procedure was in fact a stored procedure). Then I ran 30 test trials. The results - in 28 trials, the dynamic SQL ran FASTER than the stored procedure. In some cases three to five TIMES faster. In one trial they ran at the same speed, and - I'll give the stored procedure its due - in 1 out of 30 trials, the stored procedure did run faster by a few milliseconds.

But (I can hear some people saying), it's not a fair test. The benefit of stored procedures is that they're compiled and once you run them they stay in memory so subsequent calls will run faster than dynamic SQL. Seems like a weak argument to me, because how often do you need to run the same SQL statement multiple times in a row? But to be fair, I set up loops to run each statement 10 times in a row, and then 100 times in a row. You can guess the results - the dynamic SQL still ran faster the vast majority of the time.

So what's a few milliseconds? Good point. When I say the dynamic SQL sometimes ran three to five times faster than the stored procedures, we're still talking about a fraction of a second here. But turn that around and ask yourself - even if it could be proven that stored procedures run faster, we're still talking about the blink of an eye either way - who's going to notice?

Complexity

Another argument that our database team makes is that you can do more complex database access with a stored procedure vs. dynamic SQL. This claim just proves that they don't know much about programming. The more complex a stored procedure is, the more likely it is to fail, especially if you need to make database changes down the line. I know I have a programmer's mentality, but I really fail to see why complex programming logic should be put on the database and not in the programs where it belongs.

One of our database people tried, as a hypothetical argument, to paint a scenario where a program calls a stored procedure which gets data from one table, then based on those results might hit another table, and based on that might hit another table, etc, then returns the final results. My response to that would be if you need to do all that in a stored procedure, then the database probably wasn't designed correctly in the first place. And if there's decision logic to be made, it should be in the program - there's no reason why the program can't make the first database call, make the necessary decisions, then make subsequent calls. Well, OK, there is a reason - if every database call makes an expensive network trip, then yes, it might save a couple seconds to have that logic in a stored procedure. But cases like that are fairly rare. If management had decided that in such cases a stored procedure would be used and left it at that, I probably wouldn't be writing this little rant. But they want stored procedures for everything, which is just plain overkill. Besides which, there's a good chance our database and web server are going to be sitting on the same physical box, so there won't be any expensive network calls.

Consolidation of Code

One of the selling points of stored procedures is that multiple programs can call them. So if, say, three programs need the same database functionality, all three can call one stored procedure. I see this as a major flaw in the stored procedure argument - what if the needs of one of those three programs changes (and they always do)? You modify the stored procedure for that one program (possibly not even knowing that two other programs are using the same procedure) and suddenly two other programs that you didn't touch are crashing. I've seen it happen, no matter how careful you try to be with keeping track of which programs run which procedures. In a large project such as the one we're undertaking, where there are going to be somewhere in the neighborhood of three to four HUNDRED "programs" (which translates to thousands of web pages) calling hundreds (if not thousands) of stored procedures, it's just insane. No one is going to be able to keep track of all that. It's going to be a maintenance nightmare. Hmmm, could that be why the idea is being championed by a database designer who's approaching retirement and a contractor who may or may not even be around when the system is released? Because they don't have to stick around to feed and care for the monster they're creating? Nah, leave that to the guy who still has 20 years 'til retirement.

Also, consider that if multiple programs need to do the same database call, odds are it's a simple select, so you're not really buying yourself anything with a stored procedure. If it's a complicated update, you should ask yourself why three programs are all making the same update - sounds like the programs weren't designed that well to begin with.

Encapsulation

One web page that I read said that if you put all your data access logic into stored procedures, then you can treat your database as an "object" with "methods" for obtaining and updating information. Now, I've become a big proponent of object oriented programming over the last few years, so at first this sounded like a good argument to me. But after considering it further, I realized that stored procedures don't really make your database "object oriented" - they just give you a disorganized, motley collection of functions that you need to call to retrieve and update data. If you really did have some highly logical person with an understanding of OOP design all your stored procedures in an organized way, then maybe I'd buy it. No, who am I kidding - you'd still lose too much flexibility in your database access. And anyway, it would only be possible with small projects - given the scale of our system and the less-than-stellar programming skills of our database design team, this just isn't going to work. Not that I want to disparage our database people (heck, one of them might find this page some day) - as database designers they do a good job. But they're not programmers, so I can't figure out why they want to take control of a big chunk of the system's logic.

Changing Databases

An argument that makes no sense to me is that if you change from one type of database to another you're in better shape if you put everything in stored procedures. Hunh? Now you have to recreate all those stored procedures in your new database. And hope that the syntax isn't too different. Currently, all of our Sybase stored procedures have arguments that start with the "@" symbol. From what I've heard, Oracle (which we're considering going to) does not allow that symbol in parameter names. Whoops. If you keep your dynamic SQL to ANSI standards, it should be able to run against any database. There should be no reason not to - if you need to do "complex" things with SQL that is specific to the type of database you're using, you should think again about how you're doing those things. Odds are you can change it to an ANSI standard statement and then do the complex processing with program code.

"You Won't Have to Change Your Program"

This one is just funny. One of the claims is that, should the database change, programmers won't have to even open the source code for programs (or web pages) - the database team can just tweak a stored procedure or two, and everything will work just the way the user wants it to. Am I the only one that sees this as nonsense? First of all, if you're changing the database, the odds are very good that it's because the user wants to see more (or different) data on a screen somewhere. So the program is going to have to change anyway, to add or modify screen fields if nothing else. And as long as the program is being updated, it's very simple to modify a dynamic SQL statement - much easier than trying to track down what stored procedure is being called, figure out who wrote it, badger them to find time to update and recompile it, test the new stored procedure and finally work the new procedure into the program. Not to mention hoping that that stored procedure wasn't being used by other programs that are now crashing because of the change.

The Benefits of Dynamic SQL

So, if I haven't soured you on stored procedures yet, consider the benefits of leaving the SQL in the programs:

Flexibility

As pointed out several times above, dynamic SQL is very flexible. If your data access needs change, odds are you can just go in and change a couple lines of SQL. No worrying about affecting other programs or dealing with whoever "owns" the stored procedure that's obtaining or updating your data.

Debugging

I've heard that with some programming environments and databases, programmers in the midst of debugging can step right into the stored procedure code and see how the data access/update is being done. This is not the case with Sybase and .NET (and, I'm betting, dozens of other databases and programming languages). It is immensely frustrating to be in the process of tracking down some bug and suddenly hit a stored procedure in the code. What's it doing? What sort of output does it send back? Who knows? Now you have to stop what you're doing, go find the source code for the stored procedure, hope that you have the RIGHT source code, and someone didn't recompile it with some other source code, maybe run it a couple times to figure out if the output is what you think it is, and finally go back to tracing through your program. Whereas with dynamic SQL, you can see the exact SQL statement that is being generated right there in your program, and know exactly what it's doing and what it's going to return.

Powerful, Fast Development

Dynamic SQL is powerful - with it, programs can instantly tie together whichever database tables they need, pull out any data they want, update wherever necessary. Should you suddenly realize there's one field you need from a table you hadn't previously been reading - about five minutes of programming can take care of that with dynamic SQL. Or you could spend an hour or more writing, compiling and testing a stored procedure to do the same thing. Over the course of my career as a programmer, I would estimate that I have spent MONTHS (at least) struggling with problems caused by stored procedures, where most of those problems would have been a simple fix with dynamic SQL.

Maintenance

Do you want to keep track of thousands of stored procedures and know exactly what each one does, what database tables it uses, which fields on those tables it reads and updates? I know I wouldn't. If the SQL is in the programs, you still need to figure out which programs are affected by a database change, but it's a lot easier to look at a web page or program screen and see what database tables it uses than it is to look at a long list of stored procedure names and know what they're doing.

Releases, etc

Once they're written, stored procedures have to be released to the users. They have to be archived somewhere. They have to be documented and tracked. They're a whole extra layer of hassle that you really just don't need. Do everything with dynamic SQL and think of all the headaches you've saved yourself.

Conclusion

This rant ended up running longer than I had planned, and I still don't think I covered all the arguments against stored procedures. But it felt good to get it off my chest, and hopefully someone out there will read this and actually THINK about it (something management at my place of employment seems sadly unable to do) and maybe I'll have saved some programmer the hell of doing everything with stored procedures.


Copyright © 2004 Bob Eichler
Page created August 27th, 2004
Contact: See my "encrypted" email address towards the bottom of my Home Page (look for the bold word email). I recently got a new email address, and in an effort to reduce the amount of spam in my inbox, I'm taking particular pains not to give direct email links in my web pages any more. Sorry about that.


Update, about three years later in the summer of 2007: Guess what? It turned out that the database team couldn't even get their own work done as we migrated from a huge Sybase database to several huge Oracle databases. And they definitely didn't have time to write stored procedures for all the programmers' needs. And management finally figured out that the project was NEVER going to get done if they stuck with their "all database access must be done with stored procedures" policy. So now not only have most programmers gone back to using dynamic SQL for new code, but as we convert all our programs from Sybase to Oracle, a lot of the "simple select statement" stored procedures in Sybase are not getting converted. Programmers are just changing them to dynamic SQL instead, with management's blessing. I hate to say I told you so, but...nah, who am I kidding? I love saying "I told you so".


Back