ORMs do much more than "write SQL". This is about 40% of the value they add.<p>As this argument comes up over, and over, and over, and over again, writers of the "bah ORM" club continuously thinking, well I'm not sure, that ORMs are just going to go "poof" one day? I wrote some years back the "SQL is Just As Easy as an ORM Challenge" which demonstrates maybe a few little things that ORMs do for you besides "write SQL", like persisting and loading data between classes and tables that are joined in various very common ways to represent associations between classes:<p><a href="https://gist.github.com/zzzeek/5f58d007698c4a0c372edd95ab8e0267" rel="nofollow noreferrer">https://gist.github.com/zzzeek/5f58d007698c4a0c372edd95ab8e0...</a><p>this is why whenever someone writes one of these "just write SQL" comments, or wow here a whole blog post! wow. I just shake my head. Because this is not at all what the ORM is really getting you. Plenty of ORMs let you write raw SQL or something very close to it. The SQL is not really the point. It's about the rows and objects, moving the data from the objects to the INSERT statement, moving the data from the rows you SELECTed back to the objects. Not to mention abstraction over all the other messy things the database drivers do like dealing with datatypes and stuff like that.<p>It looks like in this blog post, they actually implemented their own nano-ORM that stores one row and queries one table. Well great, now scale that approach up and see how much fun it is to write the same boilerplate XYZRepository / XYZPostgresqlRepository code with the same INSERT / SELECT statement over, and over again. I'd sure want to automate all that tedium. I'd want a one-to-many collection too maybe.<p>You can use SQLAlchemy (which I wrote) and write all the SQL 100% yourself as literal strings, and still use the ORM, and still be using an enormous amount of automation to deal with the database drivers and moving data between your objects and rows. But why would anyone really want to, writing SQL for CRUD is really repetitive and tedious. Computer can do that for you.
If you're going to end up querying all the fields and putting them into a model like this dataclass anyways... Django can do that for you. If you're going to later pick and choose what fields you query on the first load, and defer other data til later.... Django can do that for you. If you're going to have some foreign relations you want to easily query.... Django can do that for you. If you're doing a bunch of joins and are using some custom postgres extensions for certain fields and filtering... Django can help you organize the integration code cleanly.<p>I totally understand people having issues with Django's ORM due to the query laziness making performance tricky to reason about (since an expression might or might not trigger a query depending on the context). In some specialized cases there are some real performance hits from the model creation. But Django is very good at avoiding weird SQL issues, does a lot of things correctly the first time around, and also includes wonderful things like a migration layer.<p>You might have a database that is _really_ unamenable to something like an ORM (like if most of your tables don't have ID rows), but I wonder how much of the wisdom around ORMs is due to people being burned by half-baked ORMs over the years.<p>I am curious as to what a larger codebase with "just SQL queries all over" ends up looking like. I have to imagine they all end up with some (granted, specialized) query builder pattern. But I think my bias is influenced by always working on software where there are just so many columns per table that it would be way too much busywork to not use something.
This is just reimplementing Djangos ORM, but badly.<p>ORM queries <i>compose</i>. That's why [Python] programmers prefers them. You can create a QuerySet in Django, and then later add a filter, and then later another filter, and then later take a slice (pagination). This is hugely important for maintainable and composable code.<p>Another thing that's great about Djangos ORM is that it's THIN. Very thin in fact. The entire implementation is pretty tiny and super easy to read. You can just fork it by copying the entire thing into your DB if you want.
Seems like there's 3 groups of opinions on ORMs:<p>Firstly (1); "I want to use the ORM for everything (table definitions, indexes, and queries)"<p>Then second (2), on the other extreme: "I don't want an ORM, I want to do everything myself, all the SQL and reading the data into objects".<p>Then thirdly (3) the middle ground: "I want the ORM to do the boring reading/writing data between the database and the code's objects".<p>The problem with ORMs is that they are often designed to do number 1, and are used to do number 3. This means there's often 'magic' in the ORM, when really all someone wanted to do was generate the code to read/write data from the database. In my experience this pushes engineers to adopt number 2.<p>I'm a big fan of projects like sqlc[1] which will take SQL that you write, and generate the code for reading/writing that data/objects into and out of the database. It gives you number 3 without any of the magic from number 1.<p>[1] <a href="https://sqlc.dev/" rel="nofollow noreferrer">https://sqlc.dev/</a>
Any serious application beyond the example given in this article will include conditional SQL constructs which go beyond SQL query parameters and will therefore require string formatting to build the SQL.<p>Think a simple UI switch to sort some result either ascending or descending, which will require you format either an `ASC` or a `DESC` in your SQL string.<p>The moment you build SQL with string formatting is the moment you're rewriting the SQL formatter from an ORM, meeting plenty of opportunities to shoot yourself in the foot.
The next logical step after writing the code given in the article is to abstract common boilerplate SQL into a library so you're not spending 50% of your time writing and re-writing basic SQL insert, update, and select statements every time your models need to be updated. At which point all you've done is write your own ORM.<p>If you want to go full-blown SQL you can use something like PostGraphile, which allows you to create all of your business entities as tables or views and then write all your business logic as SQL stored procedures, which get automatically translated into a GraphQL API for your clients to consume, but once you move beyond basic CRUD operations and reporting it becomes incredibly difficult to work with since there aren't really any good IDEs that help you manage and navigate huge pure-SQL code bases.<p>If you're really dead set against using a powerful ORM, it's probably still a good idea to find and use a lightweight one--something that handles the tedious CRUD operations between your tables and objects, but lets you break out and write your own raw queries when you need to do something more complex. I think there's a sweet spot between writing every line of SQL your application executes and having an ORM take care of boilerplate for you that will probably be different in every case but will never be 100% at one end or the other.
The article is missing the code for creating the "users" database table. What about indexes? Migrations? Relations to other tables?<p>I mean you can just write SQL instead of using the ORM if your project consists of a single table with no indexes that will never change, sure.
> ... Python dot not have anything in the standard library that supports database interaction, this has always been a problem for the community to solve.<p>Python has built-in support for SQLite in the standard library: <a href="https://docs.python.org/3/library/sqlite3.html" rel="nofollow noreferrer">https://docs.python.org/3/library/sqlite3.html</a>
It's fine advice... if you don't ever need to build queries programmatically (you will, probably) and don't care about type checks (you should, it's 2023).<p>If you don't know what you're doing on the DB-app interface, you're still better off with an ORM most of the time. If you don't know if you know, you don't know (especially if you <i>think</i> you know but details are fuzzy); please go read sqlalchemy docs, no, skimming doesn't count.<p>If you know what you're doing but are new to Python, use sqlalchemy.core.<p>PS. zzzeek is a low-key god-tier hacker.
> I have spent enough time in tech to see languages and frameworks fall out of grace, libraries and tools coming and going.<p>I feel like Django ORM and SQLAlchemy are the de-facto ORMs for Python and have been around for over a decade. If anything I'd recommend juniors to pick one of these over hand rolling their own solution because it's so ubiquitous in the ecosystem.
Yes, this!<p>I've been burned countless times by Hibernate (and consorts) and now I argue in favour of plain SQL wherever I can. I do not imply that Hibernate is in itself bad, I just have collected many years of observations about projects built upon it, and they all had similar problems regarding tech debt and difficult maintenance, and most of them sooner or later ran into situations where Hibernate had to be worked around in very ugly ways.<p>Yes, I can understand some of the arguments for ORMs, especially when you get a lot of functionality automagically à la Spring Boot repositories.<p>And since nowadays I have more influence, I do advocate for plain SQL or - the middle ground - projects like jOOQ, but without code generation, without magic, just for type safety. We've been quite happy with this approach for a very large rewrite that is now being used productively with success.
If you are just going to "Just Write SQL" then I really don't think you should be coding your own Object and Repository classes.<p>My vision of the "Just Write SQL" paradigm would be a "class" or equivalent that would take a SQL command and return the response from the server. Obviously the response has a few different forms but if you're "just writing SQL" then those responses are database responses and not models or collections of models.<p>(For the record I think simple ORM type functionality is actually quite useful as your use case moves past the scale of small utility scripts.
I'd take it a step further and move all SQL into stored procedures and call those using a functional interface. That's because of PostgreSQL's excellent stored procedure support, it might be harder with, e.g. MySQL.<p>One major benefit of stored procedures, in addition to separation of concerns, is that you can declare them SECURITY DEFINER and give them access to tables the Python process doesn't (in a way reminiscent of setuid), thus improving the security posture dramatically.<p>One example: you could have a procedure authenticate(login, password) that has access to the table of users and (hashed) passwords, but if the Python app server is compromised it doesn't have access to even the hashed passwords or even the ability to enumerate users of the system.
I was sad to find that the author proclaimed "just write SQL" then fell into the trap of modeling his data as objects.<p>If you're going to model your data this way... you might as well use an ORM.<p>A better way is to just just write SQL (or datalog) and model our data, from DB all the way up to the application, as relations rather than objects.<p>Rather than re-hash, this idea has previously been discussed on HN here: <a href="https://news.ycombinator.com/item?id=34948816">https://news.ycombinator.com/item?id=34948816</a>
I'm happy to see someone mention peewee, having used it for numerous startup prototypes since its inception.<p>Coleifer does not get enough credit IMHO:<p><a href="https://github.com/coleifer/">https://github.com/coleifer/</a><p>Peewee has been solid since I began using it a decade ago. Coleifer's stewardship is hard to see at once, but I've interacted with him numerous times back then and the software reflects the mindset of its creator.
You don't need an ORM, but this isn't how you avoid one. If you're thinking of your DB as a mere object store / OOP connector like this article is, you're better off with an ORM or NoSQL than this basically equivalent DIY solution. It's best to instead learn how to use a relational DB like a relational DB, and the rest will follow.<p>Also, I'm not one of those people who dislike easy things (and will often whine about JS or Python existing). I'm all for ease and focusing on the business goals. It's just that ORMs and bad schema design will make things harder.
I've been both ways on this, and ultimately I come down heavily in the camp of using ORMs for as far as it makes sense. Why? Sure, the "just use SQL because it's so simple" crowd use seductively simple examples, and indeed for very static use-cases it can be quite neat and simple.<p>But projects (almost always) grow, and once you need to start conditionally adding filter clauses, conditionally adding joins, things start to get <i>very</i> weird <i>very</i> fast. And no, letting the database connector library do the quoting for you <i>won't</i> save you from SQL injection attacks unless you're just using it to substitute primitive values.<p>And once all your logic is having to spend more space dealing with conditional string formatting, the clarity of what the query is actually trying to do is long gone.<p>I'll refrain from digging up the piece of code where I was having to get the escaping correct for a field that was embedded SQL-in-SQL-in-SQL-in-go. And I could hear the echos of the original author's "YAGNI"s haunting me.
My god, object-relational impedance mismatch seems to be more polarizing than US politics. I've been observing this field for more than 15 years, and it's always "JUST WRITE SQL!!!!!" versus "DRY!!! DRY!!! USE ORMs SO YOU DON'T HAVE TO WRITE SQL!!!! BUSINESS LOGIC!!!" shouting matches. It's like this topic itself takes 30 IQ points away from each participant and the conversation then devolves into complete chaos.<p>Maybe there's some professional trauma at work, as many of us have been traumatized by shitty databases and shitty code working with them alike, ORM or not. But ORMs do come and go, promise bliss, deliver diddly, and I'm reading the same stuff I've been reading in 2008, as if nothing ever changed since.
I've used Rails AR and Django/SQL Alchemy orms, and the more I use it, the more I wish for a fusion of both.<p>Django ORM is amazing for Schema management and migrations, but I dislike their query interfaces (using "filter" instead of "where"). I really like Rails AR way of lightly wrapping SQL, with a almost 1-1, and similar names, but does not have a migration manager - and there is always the chance that your schema and your code will diverge.<p>If I would get a Schema / migration manager, that would allow to do type checks and that would work well with a language server for autocompletes, but use SQL or a very very thin wrapper around SQL, that would be my Goldilock solution.
I’ve been a data engineer for many years and have lots of practice optimizing SQL touching many parts of the language and I still enjoy using SQLAlchemy for its tight, elegant integration with flask/django. Of course some queries make sense to optimize with raw sql but I think here, like with many other things, there’s no black/white conclusion to draw from these situations.
I've been using PugSQL to write SQL in Python [1].<p>With this package, you write the SQL inside SQL files so you can benefit from syntax highlighting, auto formatting, static analysis, etc. At the difference of writing strings of SQL inside Python files. I'm surprised this is not more popular.<p>[1] <a href="https://pugsql.org" rel="nofollow noreferrer">https://pugsql.org</a>
Question to the SQL-only people, how would you handle something dynamic? If I have a database of shoes and want people to be able to find them by brand, size, style, etc., what does that look like?
Alright, let's use the custom approach. And then you need another field. and then you need some slight type checking or (de)serialization, which can change over time. You'll end up writing your own custom, kludgy ORM over time.<p>I have seen people write their own custom crazy version of GraphQL ("I've created a JSONified way of fetching only some fields from an API call) over ego or just ignorance. It's never a good path.<p>Why bother moving away from SQLAlchemy, which will do all of that for you in a simplified, type-checked and portable way? SQLAlchemy is literally one of the best ORMs out there, it's ease of use and maintainability is insane.<p>People that complain about ORMs might have never really used SQLAlchemy. It is <i>that</i> good. I'm a fan and zzzeek is huge force behind why it is so good.<p>And as always, if you need an escape hatch, you can use raw sql for that ONE sql statement that the ORM is giving you grief for.
I've seen many iterations of this type of debate by now, and learned to recognize the patterns. The people arguing for the ostensibly simpler solution are really asking you to trust their ability to architect apps out of simpler building blocks without using an abstraction that they don't like. This can work, but it often leads to situations like someone writing a bespoke system, then leaving the job or otherwise imposing extra complexity on the team. In the immediate term, what often gets overlooked is<p>- The ORM is an externally maintained open-source project with a plurality of contributors; "just write SQL" is not<p>- The ORM is designed to support the full lifecycle of the application including migrations; "just write SQL" is not<p>- The ORM is documented to be <i>legible</i> to newcomers; "just write SQL" is not (for all but the simplest of applications)<p>- The ORM is composable and extensible with opinionated and customizable interfaces for doing so (I've lost track of the number of times I've had my mind blown by how elegant and smart Django and SQLAlchemy's query management tooling is)<p>- The ORM has a security posture that allows you to both reason about your application's security and receive security updates when bugs are found<p>- The ORM is a platform for many other modules responsible for different layers of the application (DRF, OpenAPI, django-admin, testing utilities, etc. etc.) to plug into and allow the application to grow sustainably<p>I now try to guide people to a middle ground. Yes, both Django's and SQLAlchemy's ORMs can be annoying, have performance issues, etc. But for large applications maintained by multiple people over time, their benefits usually outweigh the drawbacks. Both have extensible architectures that allow customization and opinionated restriction of the interface that the ORM presents. If you're unhappy with your organization's ORM, I suggest you try that route first.
If you want to try out something cool, check out<p><a href="https://github.com/sqlc-dev/sqlc">https://github.com/sqlc-dev/sqlc</a><p>It's written in Go and it converts your sql migrations and queries into typesafe code that you use access your database.<p>It currently has a plugin for Python that's in Beta, but what essentially does something similar to what this post is saying.<p><a href="https://github.com/sqlc-dev/sqlc-gen-python">https://github.com/sqlc-dev/sqlc-gen-python</a><p>You write your migrations, and queries and a config file and it does the rest.
A good ORM knows when it needs to fuck off. I just want an easy and boiler plate avoiding way to do crud operations on certain tables and map a custom type against a custom query. The lengths I have to go through to just have a custom query in some ORM's is mind-boggling. I remember fighting Microsoft's Linq to SQL or whatever the incarnation was called so hard. I could do it in the end but it fought me all the way to the end.
It might be worth mentioning LiteralStrings from [PEP 675](<a href="https://peps.python.org/pep-0675/" rel="nofollow noreferrer">https://peps.python.org/pep-0675/</a>) and how you should use them to prevent SQL injections. I'm not sure this blog adds much to the discussion when it comes to when to write SQL and when not to. It does not cover the struggles, the benefits, and the downfalls.
As a data engineer, the pattern the OP shares is very familiar. I find it much preferable to use of ORMs for wide variety of reasons. However, I view implementing with SQL as an antiquated problem rather than a pragmatic feature. The evolution of this pattern would be to integrate database querying into languages more directly and eliminate SQL entirely. While this could be achieved in Python, I find that a language like Clojure, via functional programming (FP) primitives and transducers, is a natural candidate, particularly for JVM implemented databases. Rather than encapsulating SQL via query building or ORM based APIs, an FP core could be integrated into database engines to allow, via transducers, complex native forms to be executed directly across database clusters. Apache Spark is an analog of this. In particular the Clojure project, powderkeg (<a href="https://github.com/HCADatalab/powderkeg">https://github.com/HCADatalab/powderkeg</a>), as an Apache Spark interface, demonstrates the potential of utilizing transducers in a database cluster context.
Recently I was wondering myself whether I should just write SQL as I didn't particularly enjoy working with SQLAlchemy.<p>Then I discovered peewee. I am happy now.
A better title would be "just write your own ORM".<p>I have used several Python ORMs over the years, both for SQL and NoSQL. SQLAlchemy is the most powerful way of interacting with a relational database I have experienced.<p>I also write Go, and when I do, I do not use an ORM. But when it comes to Python I know my solution won't be better than SQLAlchemy, so why bother rolling out my own?
I always ctrl-f to search for the word "composability" when I come across arguments like this. I could take or leave ORMs, but relational <i>query-building</i> libraries are invaluable for composability, compared to proliferating mostly-duplicative raw SQL in format strings all over the place.
So far all my projects have targeted a specific database with no reason to change it.<p>So what I do is write SQL commands, but keep all inside a specific file or module of the project. So that I can decide later to refactor it into an ORM.<p>I think ORMs are great if you write libraries that target more than one database. Or situations, where you have more than one database and need a proper migration part.<p>If you don't need migration, but in the worst case can start with a fresh, empty database, then write SQL.<p>But for production system, the no/manual migration might get old quickly. Writing migration code that just adds fields, indexes or tables is easy. But
writing code that changes fields or table structures? Not do much.<p>Still, you don't need an ORM at the beginning of a project, just don't put SQL everywhere.
I can already see this doesn't have connection pooling which all those ORMs he listed have without you knowing what a connection pool is it just works, and scales, doing that on your own is not easy.
Technologists have a hard time accepting an established standard. Email is a perfect corollary to this conversation. There is a graveyard of companies that have attempted to "Solve email", yet it is still ubiquitous and attempts to 'improve' it continue to fizzle out. I'm not saying that progress, or an attempt at progress, is pointless, but to argue that writing vanilla SQL is somehow antiquated or archaic is false and OP makes several valid points highlighting why it is a perfectly valid approach.
Sentiments on this is that sticking close to native as possible reduces coherency issues between anything. Adding layers of abstraction on top of layers of abstraction often reduces contextual understandings and further dilludes the problem solving technique. If the abstraction is truly needed a thurough way to evaluate executions is needed and a proper way to contextualize which that is not. In-line comments or even very easy to navitage documentation but the former thing or even both is superior to the latter.
One challenge working with SQL from statically typed languages (including Python + Mypy) is that you have to convert the query inputs/outputs to/from types and it's a lot of boilerplate. I started an experiment to generate this from annotated queries. [1] Python support is still incomplete, but I'm using it somewhat successfully for using SQLite from Rust so far.<p>[1]: <a href="https://github.com/ruuda/squiller">https://github.com/ruuda/squiller</a>
For those looking for a rubyish approach to this see: <a href="https://github.com/discourse/mini_sql">https://github.com/discourse/mini_sql</a>
Django isn't just about the ability to programmatically stick together things to make your query or the migrations. It's that, combined with tools that help you debug database issues, and most importantly the patterns that it imposes.<p>As a Django developer it's straightforward to go from one Django project to another, which isn't the case with other stuff as you don't know where everything is going to be.
An ORM makes sense if you need to make very dynamic SQL queries, ie advanced logic at runtime.<p>If your app can work well with static queries then you should not add an ORM.
Yes. Just write SQL. SQL is good.<p>Never use an (active record) ORM. Ever. For any purpose. They are a disastrous idea that should be un-invented.<p>I have this discussion with a lot of people who claim they cannot imagine working without an ORM and that "surely" using SQL is so much more work blah blah blah. Yet they have never tried! And aren't willing to try!<p>You should try it.
If you haven’t yet, check out <a href="https://pugsql.org/" rel="nofollow noreferrer">https://pugsql.org/</a> . all the power of sqla-core, none of the ORM fuss.<p><pre><code> PugSQL is a simple Python interface for using parameterized SQL, in files, with any SQLAlchemy-supported database.</code></pre>
I no longer feel the need for an ORM. Here's what I do instead:<p>- immutable record type for each table in the database (could be a data class)<p>- functions for manipulating the tables that accept or return the immutable record types and directly use SQL<p>- that's it<p>you can generate the functions from the database schema if its too much boilerplate.
Based on my personal experience, I have seen some raw SQL codes about 200 to 1000 lines in some production source codes,<p>not readable at all, not easy to change, which is a terrible development experience.<p>I guess if it is simple CRUD, it does not give too much problem, but it will definitely work in a complex case.
There's a huge module in our python codebase that approaches building queries in roughly raw SQL. Let me tell you, tracking how data moves from one stage to the next in that "ETL pipeline" is an absolute nightmare. Never again.
What happens at big companies is that they will build a custom ORM over time , and it will be way shittier and more vulnerable than if you had just used one in the first place.
The premise is that Go language users only use standard library SQL package.<p>Anecdotally, I haven't seen a place where Go is used without something like gorm or sqlc.