Mythbusters: Stored Procedures Edition

72 点作者 F_J_H将近 14 年前

30 条评论

sc68cal将近 14 年前

The article relies on major features of Oracle in order to "bust" the myth of the unmaintainable stored procedure, which leads me to believe that these are big band-aids over what is still a difficult problem: maintaining database changes and getting developers to put database changes into version control. Unlike the author, I am not fortunate enough to have Oracle, and am instead using MSSQL for our database with 500+ stored procedures. Here's my experience:Even the first "myth" that was "busted" had me cringing, putting all the stored procedures into individual files and using a source control system to manage those. The person I replaced had done the same thing with 500+ stored procedures and still had never bothered to actually update the files. He just updated the stored procedures on the development server. Partly because he was a massive fuckup, but also because it was "out of band" management.I ended up moving the code out of SourceAnywhere (an all around terrible version control system and company) and placing the code into Github, and cleaning up the database so that it could actually be dumped to an SQL file and then loaded from that file as an NUnit test, run by the CI server that I set up.Even with these huge strides forward, it has taken me months to clean up and fix these problems, while still adding new features to the existing code and database. The experience has left me feeling that the low quality of code that crap coders can get away with in stored procedures, and the lack of testability still leads me to strongly avoid stored procedures because I ended up having to fix them with sweat, tears, and blood.

评论 #2832765 未加载

评论 #2832506 未加载

评论 #2832500 未加载

评论 #2832842 未加载

评论 #2833224 未加载

评论 #2832812 未加载

评论 #2832483 未加载

msluyter将近 14 年前

I work with PL/SQL daily and I believe several points in the article are... debatable, if not misleading.1. Yes, you can store your packages in SVN or whatnot, but two developers cannot work on the same package on the same database at the same time without stomping on each other's work.2. Managing database changes: yes, you can analyze explicit dependencies for a given object, except that EXECUTE IMMEDIATE statements are not tracked. Nor, of course, are any queries done by your java webapps or whatnot.3. "And PL/SQL has object oriented features too." We've moved away from most of the object oriented features because either we encountered bugs (in 10g) or because performance was abysmal.4. EXECUTE IMMEDIATE -- I love the idea of execute immediate, but in practice, it's has problems. Performance tends to be poor (queries often need to be re-parsed) and as noted above, they break dependency tracking. If over-leveraged, they can also result in highly unreadable code.5. Unit testing -- I've looked at several PL/SQL unit testing packages and they've always seemed to me to be fairly half baked. Unit testing procedures with side effects is also often inherently difficult because they rely on database state. I'm not against it, but it certainly isn't as easy as using jUnit. That said, if anyone has successfully used a PL/SQL unit testing package, I'd be curious to hear it.5. Portability -- Irrelevant if you're using oracle specific features (which, you probably should if you're using Oracle).None of this is to say that I'm against stored procedures. On the contrary, they provide some level of encapsulation, and I've generally found PL/SQL to be fairly transparent to understand. If I'm doing some complicated joins and/or several sequential DML statements with exception handling in a transaction, I'd much rather do it in PL/SQL than from an external language. I just think the article paints a picture that's a little to rosy.

评论 #2833018 未加载

评论 #2832955 未加载

评论 #2835231 未加载

peregrine将近 14 年前

This article reads very much like an article for Oracle, and thats not even my biggest gripe with this article. I have written my fair share of SP's in my very short career and I will say this.Pros: Stored Procedures tend to be fast if you know what you are doing. Stored Procedures are a good way to abstract away the "relational" differences between OO and SQL.Cons: Stored Procedures are a good way to write a bunch of horrible to maintain code. Most developers spend their time writing front-middle end code and don't write SQL that well so they write horrible SPs. Even worse is someone who really understands SQL and writes almost the entire application in a SP and writes awful front-end code. If I had a dollar for every time I opened up a 2k+ line SP with nested queries, queries written as strings and 'eval'd or an awful number of joins I'd probably be arrested for suspicion of dealing drugs.In almost every company there are a small group of people who call themselves "DBA's" and believe in writing unmaintainable code for speed and rally against those using ORM's or things that basically remove them from the equation .Its crazy the amount of pain I've witnessed and felt because all of a sudden I realize I have no clue with a SP is doing or why its written a certain way. And having to sit down next to these DBA's so they can "explain"(mostly fix it because they have no time to explain) it to me.My last job we finally got the OKAY to try out Entity Frame work(.net ORM) but the DBA literally rallied for weeks, showing how much faster SP's are. We finally came to an agreement where we would use entity with autogenerated SP's as a backend and fall back to SP's on slow calls. I'm still not sure why he was given a say but it is what it is and I don't work there anymore.

评论 #2832512 未加载

cameronh90将近 14 年前

I disagree with this article. SQL is an decent (though crusty) language when used specifically for the purposes of interacting with relational databases. The fact that most database vendors have hacked mutually incompatible proprietary extensions on-top of it for doing other tasks doesn't make it a good choice for them. Why would you use SQL over a language that has been specifically designed for code reuse, integration and interacting with multiple systems (rather than just the database)?> Myth #1: Stored procedures can't be version controlledOf course they can, but unlike normal code which runs/builds directly from the file-system, a sproc tends to be written directly on the database server (since you won't get all those fancy refactoring/debugging tools unless you do so). Copying it to the fs before committing your changes is a step that people often forget, or one that requires yet more tools.> Myth #2: Managing the impact of changes in the database is hard > Myth #3: Database tools lack modern IDE featuresHis examples of basic refactoring tools hardly compares to something like ReSharper.> Myth #5: Code in the database can’t be properly encapsulated and reused, you need an object-oriented language for thatThey can be encapsulated, but it's a pain. Even just getting data from one stored procedure to another can involve hackery like opening a connection to the database from inside the calling procedure. See this link: <a href="http://www.sommarskog.se/share_data.html" rel="nofollow">http://www.sommarskog.se/share_data.html</a> - none of these will work in even close to all situations.

评论 #2832888 未加载

评论 #2832720 未加载

Sauce1971将近 14 年前

The fear of SQL, stored procedures, proper database constraints, the love of ORM and the embrace of nosql exists because a large portion of influental programmers just don't bother to understand relational databases and SQL. It's not in fashion, so it becomes a necesseary evil the primadonna rock star will try to avoid at any cost. Just like a real rock star might drop to learn to read notes or sing for that matter. Obviously DB theory should be regarded just as important as understanding the latest greatest language feature and framework.

评论 #2832467 未加载

评论 #2832539 未加载

clutchski将近 14 年前

I work at a company who's entire business was written in stored procedures and it has been a massive source of complexity and pain for us. Here are my issues with it:- Most web languages have a wealth of tools for testing, introspection and instrumentation that don't exist or are hacks in PL/SQL. Unit tests, basic logging, aspects that log function execution time, posting to error tracking & app metrics services, etc.- PL/SQL is not expressive or concise. String processing, in particular, is a nightmare. This means more code, buggier code & thus reduced developer happiness.- At scale, developers need to be parsimonious with joins in the data layer, otherwise partitioning is impossible. This is technically possible in PL/SQL, but runs completely against the grain of SQL, so I can't imagine this actually happening in the real world. This is probably my company's biggest source of technical debt.- SQLPlus doesn't return a non-zero error code when certain errors happen when installing PL/SQL, so release scripts have to install then assert the user_errror table is empty. This is just ugly.

评论 #2833707 未加载

评论 #2833028 未加载

_delirium将近 14 年前

The idea of the database storing the entire business logic, i.e. the relational data and arbitrarily complex computations on it, rather than just storing the data and performing relatively straightforward retrievals, is an interesting indirect victory for the old "deductive database" idea. That was one of Datalog's main conceptual claims over the SQL databases of the day: that databases shouldn't just regurgitate rows and joins/filters of rows, but should perform, within-DB, whatever computational logic needs to be applied to the stored data to produce the information that the application needs. It was also an angle that SQL defenders used to use to attack deductive databases, for putting arbitrary logic in the DB where it allegedly didn't belong. It's interesting to see that the SQL world has adopted the idea after all, though in a bit more manual/procedural way.

pkteison将近 14 年前

It's expensive to scale by adding more DB servers. It's cheap to scale by adding more webservers or 'middle tier' servers. You can't just completely ignore the scaling argument by saying "it's no problem, just spend money."Lots of other things are fairly hand waving. Stored procedures can be version controlled... if you write some code to do it yourself, or buy RedGate's tool for that. Still, I've never worked in a shop that did as good a job at version controlling their SQL as they did their code. It's harder. Stored procedures can be refactored... but I'm not sure what tool is being used, and I don't want to have to evaluate 20 different tools to find the good one. I know I currently have better tools for writing compiled code than I do for writing database sprocs.There are many valid reasons for doing as little as possible in the DB layer, most of which are listed here, and I just don't see this article as successfully debunking them.

kogir将近 14 年前

In my personal experience most of this is true.At Loopt we used MSSQL from the beginning. Things didn't start this way (because I was young and foolish), but in the end:* Each developer had their own local instance of the DB(s) for development.* All schema was kept under source control (now using RedGate SQL Source Control, a fantastic product, and earlier as simple sql scripts in Hg/Svn and database projects in Visual Studio).* Unit tests of the DB were done in code (C#). They set up their own initial state, ran against the local DB instance, and cleaned up after themselves by rolling back transactions.* Commits in source control triggered automatic SQL updates in dev using RedGate SQL Compare (though for a time we used the SQL tooling in Visual Studio, also command line scriptable, for this task).* Debugging performance was easy with performance monitor, set statistics (profile|io|time) on, and built in management views. Right click -> "Show what's expensive" kind of easy.* Debugging functionality was easy with step-in line by line debugging from the C# code calling it.* In many cases 5-10 round trips were avoided by keeping data local. Lots of network bandwidth was saved as well by only returning final results.* Used correctly, stored procedures encourage correct handling of parameters and can increase security -- Web server code didn't have read or write access to the raw data. It was constrained to calling the stored procedures (no way to dump all password hashes, for example). Think of it as another layer of defense.We tried to keep most of the complexity out of the DB because T-SQL is a terrible language. You'd have to be insane to write anything you didn't have to in it. To get optimal performance we occasionally had to do complex optimizations (table hints, indexed views, etc.), but all of that was still far easier than proper cache invalidation. Loopt ran with no cache because the DB worked just fine when tuned correctly.

rch将近 14 年前

Just in the last few weeks, I have started writing PL/SQL for the first time in a long while. Coming straight off a 2 year period of working almost entirely in Python, I expected to begin hating life immediately... but it just hasn't been the case.In a world of light-weight HTTP services, micro-frameworks, and high-quality ORM, the making the 'right' architectural decisions seems downright simple -- or at least, it seems easier to get everyone involved to agree on how things should fit together.

greyfade将近 14 年前

This article unfortunately doesn't address many of the things that concern me the most about stored procedures: Mainly my problem is that writing "business logic" (Can we stop using that phrase? It smacks of managerial buzzwordism.) in SQL is painful at best and downright unmaintainable at worst.I've recently been tasked with rewriting some of our scheduled processing code from what it is now (some very badly-written C#) to a collection of stored procedures and functions. The biggest issue is that the code does a lot of string processing, much of it very unusual. T-SQL seems to lack all of the facilities I require to actually do that job, and my boss won't give me even the leeway to write new .NET code for the database, which would make my job not only easier, but the project more maintainable - I think because a previous programmer poisoned the water with his abysmally bad code. (The company now fears all .NET code. It's that bad.)I think the matter of maintainability outweighs all of the "myths" this article "busts," but the article addresses none of that.

tewolde将近 14 年前

From an architectural standpoint I've always found that separating concerns is "always a good thing". Getting the database right is a difficult enough task without adding in extra complexity of the logic layer.My recent experience has actually pushed into the more extreme conclusion that even the database task itself is too complex for a single tool, so we use different kinds of databases to accomplish different aspects of the same db task.If you add the logic layer to this picture it becomes apparent the most valuable asset on has is flexibility, keep your tools simple and focused.

canadiansaur将近 14 年前

Some of their arguments seem debatable, but my biggest issue with stored procedures with Oracle is purely economic - It is MUCH cheaper to scale the application layer than to scale the database layer. Therefore, if you want to scale cost effectively, and you want to use Oracle, you should put as much logic in the application layer as possible.The article says 'if you have millions of users, you should be able to afford decent hardware.' - that is misleading, because the cost of the hardware is miniscule compared to the cost of the Oracle licenses you would need.

DrJokepu将近 14 年前

I thought this whole stored procedure debate was settled years ago. I had the impression that everyone agrees that stored procedures are useful for queries inherently procedural in nature (so that they are very difficult or impossible to express in the functional style of SQL queries) because they save the roundtrips between the database server and the application server. For CRUD queries, they are just overcomplicating things with minimal, if any, benefits.

darklajid将近 14 年前

I had to do a double take and check the posting date when I saw the 'can be version controlled' part, complete with a _CVS_ screenshot. Wow.My take: Stored procedures are a valid tool that belongs in everyone's toolset. I dislike them for one reason only: I'd rather define as much as possible in _one_ language. Having a codebase that is mixed between languages can be worth the tradeoff, but I'd like to avoid it where possible.

tluyben2将近 14 年前

When and where did Oracle say that this Exadata thing can run Facebook (or what handle Facebook's entire computing load may mean)? Any data about it?

johnwatson11218将近 14 年前

What about the idea that it is still way too easy to mess up 'for' loops in pl/sql? You have to track your loop control variables manually and it is not uncommon to have nested loops that span several printed pages?What about the fact that implementing a hash table is an advanced topic in pl/sql. You get dictionary style data tables that can only be indexed by binary integer?Encapsulation is nice and everything ... but oh yea the entire database is visible to all the code all the time. What does data encapsulation even mean in that environment?What about memory management? What do you do when your code is using too much ram? It seems much less clear than in java.What about the fact that I have never been able to measure a significant runtime difference between pl/sql and raw jdbc code? Usually the cost of inserting rows into a heavily indexed table outweighs all other factors.I'm not opposed to running code in the db but in practice everything feels like it is 10 years behind the state of the art.

ZoFreX将近 14 年前

> If your business logic is not in the database, it is only a recommendation.Absolutely fantastic.This is brilliant timing actually, as I've been wondering lately why stored procedures are so maligned as they do seem to have a lot of advantages. As well as being informative and well-written this post is entertaining, I enjoyed it a lot :)

room606将近 14 年前

If you're lucky enough to be using PostgreSQL with pl/python, you can write your stored procs in Python<a href="https://www.postgresqlconference.org/content/speeding-django-and-other-python-apps-automatic-remoting-database-methods" rel="nofollow">https://www.postgresqlconference.org/content/speeding-django...</a>

romaniv将近 14 年前

I don't agree with a lot of the things said in the article. For one, stored procedures are not just files, they are a part of your DB schema. Your SQL code (whether PL/SQL, or T-SQL, or whatever) is actually executable code that modifies that schema. The schema also includes tables, types, functions and so on.If you want to do meaningful version control for your DB, you need to do version control for all of those things too. It's not impossible, but it is very difficult, because databases also have data. If you want to add a row to your table, for example. you can't just change the code you used to create it. Instead, you need to write new code that modifies the existing schema by adding columns.

thirdstation将近 14 年前

I created a customer service app years ago and decided to put most of the database logic into stored procedures. The front end was Cold Fusion and I figured they would get rid of Cold Fusion way before Oracle (which they had recently adopted company-wide).

o_nate将近 14 年前

Stored procs can actually improve the separation of business and logic layers, because they help to encapsulate the data layer from the logic layer. You don't want the logic layer to know too much about the internals of the data layer. Stored procs give you a clean, abstracted interface.Also, I wouldn't be too much of a purist about the data/logic distinction. Sometimes doing a little processing on the data in a stored proc before returning it makes sense. On the other hand, I would hesitate to put very complex logic in a stored proc. So as usual, there are going to be grey areas. Do what makes sense.

garyrichardson将近 14 年前

so ... much ... venting .. to .. be ... done.Clearly a sales pitch for Oracle tools.SP are just fine if your building Enterprise apps being used by a few hundred people.. you're environment is fixed and probably a handful of developers will ever touch the code.Could you imagine writing a huge open source app using SP? Not every has the same toolset, and pretty much most of them won't have access to Oracle's debuggers/version control/etc.Also, the argument of "If you have a few million users, you can afford to scale your database up" is complete bullshit. If I were the marketing department for Oracle, this would be the sort of seed idea I'd be trying to place in order to sell more licenses.

amaeze将近 14 年前

any tool/technique can be abused/misused. some more easily than others. except you are developing for fun, you need to take ability to misuse (and the ramifications) into consideration.productivity and maintainability IMHO should be the most important decision makers when choosing what to use. what tools allows me to be the most productivity as a developer? who easy is it for others I work with to change what I've developed? optimize later.they are very successful solutions developed in a variety of ways. let's not forget that. arguing over what is best is missing the point.

ck2将近 14 年前

Good article but I wonder how long until a C&D appears for diluting the brand name (along with images).Could argue "fair use" but since lawyers know you cannot afford to go to court...

swanson将近 14 年前

I wish there were some links to some of the tools used in the screenshots, particularly the one showing package dependencies and the autocomplete editor.Anyone happen to know?

评论 #2832473 未加载

评论 #2832776 未加载

评论 #2832497 未加载

评论 #2832567 未加载

br1将近 14 年前

Another take on moving logic to the DB server: thehelsinkideclaration.blogspot.com

babebridou将近 14 年前

As the former lead dev on a rather large Oracle-based architecture, I feel obliged to chime in.Some of this article is extremely true, a couple things are utter bullshit though.I'll go with the bullshit first.1) source controlSource control is mandatory, so you have to do it anyway. But for any Oracle architecture where there is actual PLSQL work beyond funky table functions, meaning triggers, jobs and their ilk, you need to keep around a large schema with enough consistent data, especially with all the issues one can have with the way Oracle optimizes the execution plan of its queries depending on the statistics of each table. You need this information when you develop, so you need amounts of data that approach the order of magnitude of the production system, which is potentially huge.It's simply impossible for every developer to enjoy their own instance, unless each dev has his own server, and even then: there's far too much maintenance to be made on an active Oracle database, and once a certain scale in the schema has been reached it's practically impossible to maintain an acceptable level of performance on a frigging desktop, especially when your devs are doing more than just PL/SQL, but, also, you know, work on the other tiers.So in the end, you have one dev instance, two at most, and you keep versioning of the code just for looks because unless your system is trivial (only reversible transactions), any procedure, job or trigger that is executed on a database will massively change it forever. So in the end if your business logic lies at least partly on PL/SQL code, older versions of sources are only kept as a trace, not for proper versioning.Besides, there are so many encoding issues and your database is so critical that you can't even risk anyone commiting a file rogue-style because if the sql was edited with textpad, notepad, notepad++, created with touch, right click + new, saved as from toad or came from a copy-paste of an e-mail sent from an ipad, the encoding will be different and will mess up tons of stuff. Things like accents in code comments can be interpreted badly and "eat" the following line feed character, which in turns comments the next line, which can lead to ultimate data corruption if it concerns a piece of code that is invoked by a trigger.This is a risk that you don't, ever, want to take. So your PL/SQL SVN server is only for the two/three people that are entitled to actually write the Patch. Don't leave anything to chance here. The rest of the devs follow the chain of command of writing something, testing on the dev server, crashing it and eventually ruining it, send their script by email or dropbox or whatever to the devs in charge of the patch, who then use SVN to version it properly.2) Unit testingUnit testing is cool and all, but if all it does is essentially compile scripts. You can only unit-test functions if they are isolated enough, and isolation is all relative when the code is in the database. Let's unit test a select table function: first you disable triggers on this table, then you insert test data, then you do your test, then you delete the inserted data, then you reactivate the triggers. Happy? No! Why? Because it will take me a week to write all the test case inserts & delete!I'm very caricatural here, but this is what happens in real cases. Cases where the unit test is very hard, much, much harder to design than the code it's supposed to test itself, not because of bad logic or whatever, but because in the end, it's mostly data that's required, and sometimes a simple select unit test requires a metric ton of one-line table inserts and deletes, and there can be a factorial quantity of test cases for tables with more than a couple dozen columns. "Just peer review" is the sensible and agile way to do it. TDD is not.The rest of the article is mostly spot on. I loved the power of PLSQL for this particular system (railway undertaking system). I lack distance still, but I have no idea how we would have handled it had the customer said "no oracle, use [insert NoSQL tech of your choice]". The kind of calculations that were required was: "okay this train here is 5mins late and that train there is 6mins late, we have 6 thousands concurrent wagons circulating in the system that need to be rerouted before any train, anywhere, leaves its station, just warn me asap if this incurs any delay in the deliveries of high priority merchandise within the week, but if possible, fix it automatically yourself".

makay86将近 14 年前

hiiiiiiiiiiiiii

makay86将近 14 年前

gnghhng

30 条评论

sc68cal将近 14 年前

评论 #2832765 未加载

评论 #2832506 未加载

评论 #2832500 未加载

评论 #2832842 未加载

评论 #2833224 未加载

评论 #2832812 未加载

评论 #2832483 未加载

msluyter将近 14 年前

评论 #2833018 未加载

评论 #2832955 未加载

评论 #2835231 未加载

peregrine将近 14 年前

评论 #2832512 未加载

cameronh90将近 14 年前

评论 #2832888 未加载

评论 #2832720 未加载

Sauce1971将近 14 年前

评论 #2832467 未加载

评论 #2832539 未加载

clutchski将近 14 年前

评论 #2833707 未加载

评论 #2833028 未加载

_delirium将近 14 年前

pkteison将近 14 年前

kogir将近 14 年前

rch将近 14 年前

greyfade将近 14 年前

tewolde将近 14 年前

canadiansaur将近 14 年前

DrJokepu将近 14 年前

darklajid将近 14 年前

tluyben2将近 14 年前

When and where did Oracle say that this Exadata thing can run Facebook (or what handle Facebook's entire computing load may mean)? Any data about it?

johnwatson11218将近 14 年前

ZoFreX将近 14 年前

room606将近 14 年前

romaniv将近 14 年前

thirdstation将近 14 年前

o_nate将近 14 年前

garyrichardson将近 14 年前

amaeze将近 14 年前

ck2将近 14 年前

Good article but I wonder how long until a C&D appears for diluting the brand name (along with images).Could argue "fair use" but since lawyers know you cannot afford to go to court...

swanson将近 14 年前

I wish there were some links to some of the tools used in the screenshots, particularly the one showing package dependencies and the autocomplete editor.Anyone happen to know?

评论 #2832473 未加载

评论 #2832776 未加载

评论 #2832497 未加载

评论 #2832567 未加载

br1将近 14 年前

Another take on moving logic to the DB server: thehelsinkideclaration.blogspot.com

babebridou将近 14 年前

makay86将近 14 年前

hiiiiiiiiiiiiii

makay86将近 14 年前

gnghhng