I've been attempting to preach the PostgreSQL Gospel (<a href="http://www.brightball.com/articles/why-should-you-learn-postgresql" rel="nofollow">http://www.brightball.com/articles/why-should-you-learn-post...</a>) for a few years now about this exact same thing.<p>When you look at your database as a dumb datastore, you're really selling short all of the capabilities that are in your database. PG is basically a stack in a box.<p>When I started getting into Elixir and Phoenix and realized that the entire Elixir/Erlang stack was also basically a full stack on its own...and that by default Phoenix wants to use PG as its database...I may have gone a little overboard with excitement.<p>If you build with Elixir and PostgreSQL, you've addressed almost every need that most projects have, with minimal complexity.
There was a similar talk[1] at FOSDEM, where the speaker describes how, as an experiment, he replaced a full ELK stack plus other monitoring tools with PostgreSQL. He even goes as far as implementing a minimal Logstash equivalent (i.e. log parsing) in the database itself.<p>It wasn't a "we do this at scale" talk, but I'd love to see more experiments like it.<p>For the impatient: skip to 17 minutes into the video, where he describes the previous architecture and which parts are replaced with Postgres.<p>1. <a href="https://fosdem.org/2017/schedule/event/postgresql_infrastructure_monitoring/" rel="nofollow">https://fosdem.org/2017/schedule/event/postgresql_infrastruc...</a>
I do use PostgreSQL wherever possible. Add <a href="http://postgrest.com/" rel="nofollow">http://postgrest.com/</a> and nginx as a URL-rewriting proxy and you have a performant, highly adaptable REST server.
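For anyone curious what that proxy layer might look like, here is a minimal hypothetical sketch; the /api prefix, upstream address, and port are assumptions (3000 is PostgREST's default), not something from the comment above:

```nginx
server {
    listen 80;

    location /api/ {
        # Strip the public /api prefix before handing the request to PostgREST,
        # which expects schema-relative paths like /table_name.
        rewrite ^/api/(.*)$ /$1 break;
        proxy_pass http://localhost:3000;
        proxy_set_header Host $host;
    }
}
```

With something like this in place, a request to /api/films becomes a PostgREST query against the films relation, and nginx remains free to handle TLS, caching, and rate limiting in front.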
Nice writeup, though I would add a few things.<p>Listen/Notify works great for short-term job queues. For longer-term ones, you have some serious difficulties on PostgreSQL that require care and attention to detail to solve. In those cases, of course, you can solve them, but it takes people who know what they are doing.<p>Also, in terms of storing images in the database, this really depends on what you are doing, what your database load is, and what your memory constraints are. At least when working with Perl in the middleware, decoding and presenting the image takes several times the RAM that loading it off the filesystem does. That may not be the end of the world, but it is something to think about.<p>Also, TOAST overhead in retrieved columns doesn't show up in EXPLAIN ANALYZE because the items never get detoasted. Again, by no means a deal breaker, but something to think about.<p>In general, PostgreSQL can be good enough, but having people who know it inside and out is important as you scale. That's probably true with any technology, however.
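One common pattern for the longer-term queues mentioned above is a jobs table drained with FOR UPDATE SKIP LOCKED (available since PostgreSQL 9.5), so competing workers claim different rows without blocking each other. A hedged sketch, with invented table and column names:

```sql
-- Hypothetical job table; names and schema are illustrative only.
CREATE TABLE jobs (
    id      bigserial PRIMARY KEY,
    payload jsonb NOT NULL,
    state   text  NOT NULL DEFAULT 'pending'
);

-- A worker atomically claims the oldest pending job.
-- SKIP LOCKED means rows already claimed by another transaction
-- are passed over instead of waited on.
UPDATE jobs
SET state = 'running'
WHERE id = (
    SELECT id FROM jobs
    WHERE state = 'pending'
    ORDER BY id
    FOR UPDATE SKIP LOCKED
    LIMIT 1
)
RETURNING id, payload;
```

This avoids the classic problem of two workers grabbing the same job, though as the comment notes, retries, timeouts, and dead-job handling still need careful design on top.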
Is anyone actually utilising a recent version of PostgreSQL for full-text searching beyond a hobby project? How do you find the speed and accuracy versus Elasticsearch?
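For context on what the built-in approach looks like, here is a minimal full-text search sketch; the table and query terms are hypothetical:

```sql
-- Illustrative table; 'docs' and 'body' are placeholder names.
CREATE TABLE docs (
    id   serial PRIMARY KEY,
    body text NOT NULL
);

-- GIN expression index so searches don't recompute tsvectors per row.
CREATE INDEX docs_fts_idx ON docs
    USING gin (to_tsvector('english', body));

-- Match and rank documents against a plain-language query.
SELECT id,
       ts_rank(to_tsvector('english', body),
               plainto_tsquery('english', 'postgres search')) AS rank
FROM docs
WHERE to_tsvector('english', body)
      @@ plainto_tsquery('english', 'postgres search')
ORDER BY rank DESC;
```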
Very nice article for those of us who have never used PostgreSQL much. I've been starting to use it with Elixir, and this gives me a good understanding of why someone would use it, especially when starting a new app.<p>Out of curiosity, does anyone have a favorite article saved that does a great comparison of when to use certain databases?
Yes, for all projects and small businesses I start, Postgres and Redis are what I use from the beginning. Then, if it ever gets to the point where I need a different DB for something, I replace components with the new tool. People get fascinated with these fly-by-night data stores and put their operations at serious risk. Start with the tried and tested technologies, then carefully augment your stack as needed.
Now imagine if we understood that the relational model is not just for data storage but can be used for everything.<p>The closest thing(1) was dBase/FoxPro. You could actually build a full app with it. Send email from the database? Yes. Isn't that wrong? It's only wrong because RDBMSs (2) made it wrong, not because it is <i>actually wrong</i>. Why is it better to split an app across separate languages/runtimes/models than to build one integrated whole?<p>(1): Taking into consideration that neither Fox nor any "modern" RDBMS has taken the relational model to its full extent.<p>(2): An RDBMS is a full package with a defined role and limited capabilities. A relational-style language would not be an exact replica of that; it isn't even required to implement a full storage solution.<p>The biggest mistake the relational guys have committed is to always think in terms of full databases instead of micro-databases. Ironically, kdb+ (or Lisp? or Rebol?) could be the closest thing to the idea (where data and code are not enemies but friends).
From my experience with both PostgreSQL and RethinkDB (and other NoSQL stores):<p>For SQL, complex queries, and data warehousing: yes. It's an excellent database and I'm not sure why you'd pick another SQL DB unless it were a lot better on point two.<p>For high availability and scaling: no, absolutely not.<p>The problem with the latter is an arcane deployment process and arcane error messages that provide constant worry that you're doing something wrong. It's a many-week engineering project to deploy HA Postgres, while HA RethinkDB takes hours -- followed by some testing for prudence... our testing revealed that it does "just work," at least at our scale. We were overjoyed.<p>The docs for Postgres HA and clustering are also horrible. There are like five different ways to do it, and they're all in an unknown state of completion or readiness.<p>Of course, if/when we <i>do</i> want complex queries and more compact storage, we will probably offload data from the RethinkDB cluster to... drum roll... a PostgreSQL database. That will probably be for analytics done "offline," in the sense that if the DB goes down for a bit we are fine. HA is not needed there.<p>TL;DR: everything has its limitations.
I love Postgres, but the one thing I think sucks is its COUNT() performance.<p>I've read all sorts of hacks but I would love for someone to solve this for me!
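One widely used workaround, when an exact number isn't required, is to read the planner's estimate instead of scanning the table. A sketch, with a placeholder table name:

```sql
-- reltuples is the planner's row-count estimate for the table.
-- It is only as fresh as the last ANALYZE or VACUUM, so this trades
-- accuracy for speed: no sequential scan is performed.
SELECT reltuples::bigint AS approximate_rows
FROM pg_class
WHERE relname = 'my_table';  -- 'my_table' is a placeholder name
```

Exact COUNT(*) is slow because Postgres's MVCC design means visibility must be checked per row; the estimate sidesteps that entirely.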
We had to learn this the hard way. We use many of the data services the article mentions, and while we still use them, whenever things get massive we actually go back to PostgreSQL.<p>For example, for our internal analytics/logs/metrics we use ELK and Druid, but believe it or not, these tools, despite their purported scaling abilities, are actually damn expensive. This new cloud "elastic" stuff cheats and uses lots and lots of memory. As a bootstrapped, solvent, self-funded startup, we do care about memory usage.<p>For customer analytics we use... yes, PostgreSQL.<p>For counters and stream-like things we don't use Redis; we use PipelineDB (a PostgreSQL fork). For Cassandra-like stuff we use Citus (a PostgreSQL extension).<p>Some of our external search uses Solr (for small fields), but PostgreSQL text search is used for big fields.<p>The only part of our platform where we don't really leverage PostgreSQL is the message queue, and that's because RabbitMQ has done a damn good job so far (that, and the JDBC driver isn't asynchronous, so LISTEN/NOTIFY isn't really useful).
Postgres' awesome extensible type system means it will continue to increase in functionality much more easily than most comparable DBs.
<a href="https://www.postgresql.org/docs/9.6/static/extend-how.html" rel="nofollow">https://www.postgresql.org/docs/9.6/static/extend-how.html</a>
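As a small illustration of the extensibility that link describes, user-defined types and domains compose directly into tables; the names here are invented for the example:

```sql
-- A composite type: a value with internal structure the planner understands.
CREATE TYPE complex AS (
    r double precision,
    i double precision
);

-- A domain: a base type plus a reusable constraint.
CREATE DOMAIN us_zip AS text
    CHECK (VALUE ~ '^\d{5}(-\d{4})?$');

-- Both can then be used anywhere a built-in type can.
CREATE TABLE addresses (
    id  serial PRIMARY KEY,
    zip us_zip
);
```

Extensions like PostGIS take the same mechanism much further, adding whole type families with their own operators and index support.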
It's really amazing how far you can get with a relational database. If you have very minimal constraints, keeping everything in a single place can make life so much easier. Configuration hell is real. I hadn't considered using PG for storing binary data, but I've hacked together a few toy projects where I used Mongo and just shamelessly shoved everything in there.<p>I have some slightly tangential questions, which I'd love to hear people's thoughts on: How do you decide where to draw the line between what's kept and defined in the application versus the database? For example, how strict would you make your type definitions and constraints? Do you just accept that you'll end up duplicating some of it in both places? Also, how do you track and manage changes when you have to deal with multiple environments?
Has anyone played with the threading model within Postgres?<p>I was reading the documents; it looks like Postgres forks a new process for every client connection and uses a shared-memory model.<p>Using threads/coroutines across processors might be useful for scaling it further.
Or just MySQL. Popular choice, unpopular opinion. I trust it more because it gives me a tried and tested path when I need replication (which tends to happen rather early). My understanding is that Postgres replication is not nearly as battle-tested.