The last time this article came up the consensus was that the author was pretty biased towards Postgres and had little to no experience with actual MS SQL Server use.<p>Also, the lack of author identity was frowned upon.<p>Lastly, the conjecture and attitude towards Microsoft lacks some substance.<p>Conclusion: The author is free to write whatever he likes, but take this resource with a pinch of salt.<p>I use both Postgres and MS SQL Server professionally and whilst philosophically I prefer Postgres, for practical reasons I truly prefer MS SQL Server, if only because of its excellent development tools.
I've been SQL Server user for many years. And for last two years I started also using PostgreSQL. Today I use both of them in my project, and as a developer/dba I see pros and cons:<p>MS SQL:
- The tools included like Management Studio are just great. This is totally next level to the Postgres tools.
- Using multiple CPU cores for a single query is really helpfull in my scenario.
- Easy continous backup to the cloud.
- Included Integration Services, Analysis services are also easy and enugh for my usage.<p>Postgres:
- Is running on linux, to it's cheaper. Even when we use SQL Server as Bizspark (for free now), the Azure Windows VM for it costs us much more, than VM with Linux. And of course when we want to cluster our DB, Postgres is even cheaper.
- Great JSON support. There are parts of our project where it's helpfull.
- Better configurability, like WAL, checkpoints etc. We have much better write performance on postgres than in sql server (probably just our case).<p>The other things really do not much difference. Both DB's can be extended, and extensions may be written in many languages. Both achieve great overall performance, both have strong community and a lot of documentation.
If I was going to pick a relational database system, I'm not sure these would be the criteria I'd use:<p>CSV support - if you do that much CSV extract/transform/load (or indeed, any kind of ETL work), use an ETL tool. SQL Server comes with SQL Server Integration Services for that kind of thing.<p>Ergonomics of dropping and creating tables and stored procedures - the author's example is probably the toughest way I can think of to drop a table. It's easier to check sys.all_objects (which will catch anything - functions, views, procs, etc).<p>Operating system choice - well, in 2015, if you're going to mention that, you should be thinking cloud services anyway. They're both available in the cloud - and at which point, who cares what OS it runs on?<p>Goes on and on. I'm a SQL Server guy, and if I was going to make a list of how to choose a relational database platform, here's what I'd probably list:<p>* Up-front cost (license, maintenance)<p>* Ease of finding staff, and their costs<p>* Ease of finding tooling, and its cost<p>* Ease of finding scalable patterns (proven ways to grow it)<p>I don't think either platform is a loser on those counts - it's hard to go wrong with either one.
<i>The whole thing took about a minute to write and a second to run. It confirmed that some of his folders had a problem and told him which ones they were. How would you do this in Windows?</i><p>I'm so tired of this. Just because you don't already know Powershell and are too lazy to learn doesn't mean it doesn't exist. A know-nothing Windows user might as well say, "If I want to select and move arbitrary files on Windows, I can point and control-click in seconds. How would you do that in the Linux CLI?" In both instances, it comes across as ignorant to anyone who actually knows the ecosystem being derided.
> In MS SQL Server, a CREATE PROCEDURE statement cannot appear halfway through a batch of SQL statements. There's no good reason for this, it's just an arbitrary limitation. It means that extra manual steps are often required to execute a large batch of SQL. Manual steps increase risk and reduce efficiency.<p>It's been a while, but I am pretty sure all you have to do is put GO before/after the CREATE PROCEDURE. I'm absolutely positive there's some way around it, because I've run many, many such scripts on SQL Server without manual intervention.<p>EDIT: Yes, I just fired up a VM and ran this and got the expected results with no errors.<p>CREATE DATABASE HackerNews;<p>CREATE TABLE dbo.Test (id int IDENTITY(1, 1), name varchar(20));<p>GO<p>INSERT INTO dbo.Test (name) VALUES ('Amezarak');<p>GO<p>CREATE PROCEDURE dbo.sp_QueryTest
AS
SELECT * FROM dbo.Test;<p>GO<p>EXEC dbo.sp_QueryTest<p>In my personal opinion, MSSQL (including the tooling around it) is awesome and possibly one of Microsoft's best products. I actually regret not getting to use it anymore since a) my current job doesn't use it and b) I'm not shelling out for a license for my side projects. Postgres is definitely my next pick, though, and both are miles ahead of MySQL. I understand that MySQL is "good enough" for most people, but it's always painful going back to it and inevitably remembering almost all my favorite features don't exist. I'm stuck with it on a side project and it's frustrating.
This website is great. However, this doesn't actually touch on the real issue.<p>I work at a MSSQL shop, and all of us know and are convinced that PG is better. Most of us use PG for our side projects and some of the dev's don't even use windows that much, with some custom MSSQL plugins we've built for linux. However, the problem still exists, of how do you port a ton of Databases over to Postgres? We're a multi-tenancy shop, so close to zero downtime is very important, and it would get really complicated if we ran multiple production versions of our app, one with a PG adapter and one with a MSSQL adapter.<p>A cursory Google search will show that you aren't going to get a ton of help converting them[0][1], not to mention the overhead of switching 20 developers from MSSQL to PG overnight.<p>This website is however excellent at convincing people to use PG over MSSQL. Perhaps, given the direction that Microsoft is going, they'll open source MSSQL overnight and it will become something competitively similar to PG in the long run.<p>[0] <a href="http://www.convert-in.com/mss2pgs.htm" rel="nofollow">http://www.convert-in.com/mss2pgs.htm</a><p>[1] <a href="https://wiki.postgresql.org/wiki/Microsoft_SQL_Server_to_PostgreSQL_Migration_by_Ian_Harding" rel="nofollow">https://wiki.postgresql.org/wiki/Microsoft_SQL_Server_to_Pos...</a>
Can someone explain this one to me?<p>PostgreSQL supports the RETURNING clause, allowing UPDATE, INSERT and DELETE statements to return values from affected rows. This is elegant and useful. MS SQL Server has the OUTPUT clause, which requires a separate table variable definition to function. This is clunky and inconvenient and forces a programmer to create and maintain unnecessary boilerplate code.<p>So I have the equivalent of the following in one of my projects:<p><pre><code> UPDATE sometable SET someField = @parameter
OUTPUT Inserted.field1, Inserted.field2
WHERE ...
</code></pre>
Now, either I don't get the limitation the author describes or SQL Server can do that - returning information from the affected results. Works with DELETE as well. We can argue that 'inserted' is a crappy name here, but..
As somebody working on postgres, I don't find the comparison to be done in a particularly fair, or helpful, way. MSSQL is a pretty good database and deserves to be fairly evaluated.<p>Don't get me wrong, I have 'political' problems with MSSQL, but those shouldn't be disguised as technical ones.
I get the feeling this guy hasn't had to deal with CSV in-the-wild. RFC4180 is a false prophet. I've never had a client give me CSV data that was actual, good CSV, thus I've always had to write a custom parser. Every. Single. Time.<p>Well, not anymore. Now I just refuse to do shit. "We can't change it" turns into "we contacted the original developer and made him fix his broken shit" when I turn into a complete pain in the ass.
Postgres is better than SQL Server because with Postgres I always have the "awesome edition" : <a href="http://blog.codinghorror.com/oh-you-wanted-awesome-edition/" rel="nofollow">http://blog.codinghorror.com/oh-you-wanted-awesome-edition/</a><p>And I don't need a certification to understand the crazy MS licensing (I discovered recently that yes there is a MS certification program for their licenses as it's so byzantine).
>> PostgreSQL supports DROP SCHEMA CASCADE, which drops a schema and all the database objects inside it. This is very, very important for a robust analytics delivery methodology, where tear-down-and-rebuild is the underlying principle of repeatable, auditable, collaborative analytics work.<p>Actually, if you drop the constraints first, you don't have to worry about the table-drop order. I do that on both Postgres and SQL Server, because I use a tool that generates the specific scripts for me.<p>I don't think manually writing DDL scripts is any way to manage an RDBMS in the modern era, especially if you care about "repeatability and auditability" like he claims.<p>Queries, yes, but schema constructs, no. SQL is just a terrible language for it. I'm not saying you have to use a full ORM, but every SQL engine I've ever encountered will let you do a lot of nutty things and won't complain about it, or won't complain until you actually try to query it.
When developing on a Microsoft stack with Visual Studio, I have always found SQL Server to be the most painless DB system. You can't beat the Visual Studio integration and tools. Also, their SQL Server drivers for .NET do a bunch of transparent optimizations like automatic connection pooling behind the scenes so you don't have to worry about this kind of thing.<p>OTOH, when not on a Microsoft stack, it has been a constant source of hard to track down problems and bugs. I remember trying to use Microsoft's JDBC driver a few years ago in a Java web app and running into all kinds of nasty and unbelievable bugs in the actual driver they shipped.
Sigh. It's about as profound as writing an article on whether Linux vs Windows -- which one is a better OS?
I skimmed the article for some reason and it is full of questionable statements. One thing that jumped out at me is the part about using multiple cores to run a query. We are in 2015 ladies and gentlemen. A RDBMS must be able to parallelize queries. Argument about never being CPU bound is laughable. I am, as we speak, running a performance test on a 36 core box and I already know the result: I NEED MORE CORES. Analytics is moving into SSD world, where data processing will be CPU bound once again.
It took a while, but he got there (emphasis mine):<p>>This is an advantage for MS SQL Server whenever you're running a query which is CPU-bound and not IO-bound. In real-life data analytics this happens approximately once every three blue moons. On those very rare, very specific occasions when CPU power is truly the bottleneck, you almost certainly should be using something other than an RDBMS. <i>RDBMSes are not for number crunching.</i><p>As a data analyst, the tools to be comparing shouldn't be RDBMSs.<p>>As I said in the banner and the intro, I am comparing these databases from the point of view of a data analyst, because I'm a data analyst and I use them for data analysis. I know about SSRS, SSAS, in-memory column stores and so on, but I haven't mentioned them because I don't use them (or equivalent features). Yes, this means this is not a comprehensive comparison of the two databases, and I never said it would be. It also means that if you care mostly about OLTP or data warehousing, you might not find this document very helpful.<p>As for this part, data warehousing, OLAP services, and reporting services (lower case on purpose here) are a very large sub-domain within data analytics. I am not saying that these are everything in analytics, but especially from an enterprise standpoint, these make up the bulk of it. From a tooling and full-stack standpoint, Microsoft is quite strong in this segment.
HA and clustering. MSSQL makes this dead easy. Point and click (and it'll display the script for you to learn/reuse) and you're done. Async, sync, HA (with automatic fail over), mirroring, several types of replication - and it just works and is easy. If PG ever ships with an out-of-the-box shared nothing system, yee haw! Maybe they could package up DRDB and heartbeat into one easy script and monitoring system or something.<p>I hate how MSSQL has gone back on their word to let customers benefit from CPU enhancements. They mocked Oracle for charging by type and core... And now they do the same.<p>Also, multiple result sets was a sorely missed feature when porting stuff to PG. But record types made up for it.<p>Of course now, the dominating factor for a lot of people is "Will a hosting provider (Azure, AWS, Google) just run this for me, automatically giving me perfect backups and restore and HA?" SQL Azure, as I understand, not only does backups, but allows you to restore to arbitrary points in time. Sure it's just keeping txlogs, but that sounds hot when sold like that. For many cases, I can see ditching the privacy issues of "cloud" to get those features with zero capex or management overhead.
> Commercial products have support from people who support it because they are paid to. They do the minimum amount necessary to satisfy the terms of the SLA.<p>This is pretty ridiculous and quite a bit insulting to the many people who do work for vendors. I work in a team that has a number of engineers supplied by vendors and they are generally fantastic. Highly qualified, more than happy to assist with tasks that aren't to do with their product and they really care about the overall project outcomes.<p>Open source has forced vendors to make sure that every project that uses their products are a success.<p>> On the other hand, commercial software is often designed by committee, written in cube farms and developed without proper guidance or inspiration<p>Again more nonsense. Not every open source project is some beacon of perfection and neither is every commercial product some poorly designed piece of junk. Anyone that believes otherwise is just being disingenuous.<p>Someone really needs to explain to me why PostgreSQL users in particular seem to always want to bash the competition in order to justify their technology choice. It's been going on for years against MySQL/Oracle first, then MongoDB/NoSQL and now SQL Server. It's odd.
One of the major attractions of MS SQL Server 2014 is memory optimized tables (Hekaton engine) which removes locks completely from the database through snapshot versioning of rows. Does postgresql has anything similar to that?
This is the second time this site has appeared on HN, and both times I've mis-read the title as "PostgreSQL vs MySQL", which I think would be a much more interesting comparison. If you're already working in an MS shop, then MS SQL will work best with what you have. If you're working in an open source environment, then you'll be better off taking advantage of what Postgres or MySQL have to offer. Postgres and MySQL are so different in terms of their features that comparing them would make an interesting article.
> I replied "well there are 1.5 billion Muslims and 1.2 billion Catholics. They can't all be right". Ergo, a billion people most certainly can be wrong. (In this particular case, 2.7 billion people are wrong.)<p>Very smug and condescending statement. Feels like an insecurity on the author's part.
After seeing the title I assumed the verdict would be that MS SQL is way better. If I wrote a comparison that's how the result would end up. I've used MS SQL from 1998 and PostgreSQL from 2001 and have found MS SQL much easier to use. Different strokes for different folks I guess.
I was expecting something more convincing like benchmarks, PostgreSQL has a lot of nice features and that's is a good thing for development, but just for development.
I am inclined to think that you should use an RDBMS for doing what it does best - things like storing and retrieving data efficiently, facilitating the integrity of data while serving multiple users, and enforcing relational constraints. If you're serious about doing analytics and you try to do it in the db you are going to quickly find yourself hampered by the expressive limitations of SQL.<p>I think it's better to skip that and interface with the database using something more powerful - let the database handle getting the data that you want the best way possible, then process it using something better suited to the job.
I expect better from the front page of HN. I like PostgreSQL and MSSQL Server and to a much lesser degree mySql and Oracle. But in my experience I use the tool that exists and makes sense for the project. Since most of my work is enterprise its MSSQL or Oracle. For personal projects I used PostgreSQL and got on quite well. None of the tools prevented me from getting the data I needed. There may be scaling issues or use cases that can favor either platform. If so, pick the tool for the job, move on. If you feel compelled to do so, you should document your use case to help others out.
let me first state i'm a huge postgres fan, i implement whenever i have the option. it's a great database and i'm a firm believer in open source.. however if a company is willing to shell out the coin to run MSSQL, i'm not complaining. MSSQL is very robust, very fast, and very easy to administer. i hate giving MS their props but they did a good job with it..
Huh... a person who compares databases with regards to analytics and doesn't mention OLAP. Doesn't mention ETL. Doesn't mention reporting. What kind of comparison is that? I often read reviews of databases which dismiss MSSQL completely. And that's fine when it is about building applications. HOWEVER, MSSQL comes with very featurefull ETL tools, very featurefull MOLAP and HOLAP and very featurell reporting suite. It also comes with some data mining algorithms for which I am not qualified to talk, but I've heard the functionality is pretty nice for a starting point. It also comes with an IDE for the people who do those things and a DBA tool for people who administer the databases.<p>Out of those things, postgres has a DBA tool out of the box. Yeah... nice comparison.<p>I find it particularly funny that he cites Dunning-Kruger. DK effect is not meant for others. It is meant for self-evaluation.