TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

Effectively Using Materialized Views in Ruby on Rails

158 pointsby moritzplassnigover 5 years ago

9 comments

code_biologistover 5 years ago
Overall, I love this approach, but the big pain for me is the lack of incremental view refreshes. I end up needing to recreate giant tables each refresh even though the underlying changes are small. Sure I could implement that myself, but that sacrifices the correctness guarantees that materialized views provide!<p>Two things that would help:<p>1. Getting Incremental View Maintenance (IVM) [1] into Postgres. It looks like work is beginning on this but it&#x27;s been 7+ years coming. If there are any Postgres devs looking at this, I&#x27;m cheering for you!<p>2. There&#x27;s a commercial group doing an implementation of this, Materialize [2], but they don&#x27;t have any publicly released product yet. I mention it because their interview on Data Engineering Podcast is really good [3] and I really like their focus on correctness by working from replication logs.<p>[1] <a href="https:&#x2F;&#x2F;wiki.postgresql.org&#x2F;wiki&#x2F;Incremental_View_Maintenance" rel="nofollow">https:&#x2F;&#x2F;wiki.postgresql.org&#x2F;wiki&#x2F;Incremental_View_Maintenanc...</a><p>[2] <a href="https:&#x2F;&#x2F;materialize.io&#x2F;" rel="nofollow">https:&#x2F;&#x2F;materialize.io&#x2F;</a><p>[3] <a href="https:&#x2F;&#x2F;www.dataengineeringpodcast.com&#x2F;materialize-streaming-analytics-episode-112&#x2F;" rel="nofollow">https:&#x2F;&#x2F;www.dataengineeringpodcast.com&#x2F;materialize-streaming...</a>
评论 #22069676 未加载
评论 #22071808 未加载
评论 #22070434 未加载
评论 #22069619 未加载
semiquaverover 5 years ago
I used a technique very much like this in the past. Here are two potential gotchas using matviews with rails, both related to the inevitable reality that DDLs evolve over time:<p>- postgres supports concurrently refreshing the contents of existing materialized views but there&#x27;s no built in way to change the structure of the matview concurrently. Which means any `ALTER MATERIALIZED VIEW` will result in all reads to the view blocking for potentially many minutes or longer, for views over nontrivial data. Adding a column to a table is a cheap and non-blocking migration, but allowing the new column to appear in matviews can require an outage or fancy transactional view name swapping that libraries like scenic don&#x27;t support.<p>- a column that is included in a view of any kind cannot change in any way, even ways that are binary-compatible like VARCHAR-&gt;TEXT. This comes up surprisingly often and getting around it is annoying.
评论 #22068879 未加载
toasterlovinover 5 years ago
View Backed Models are a super power. If you’re a working Rails dev and you’ve never used them before, you really owe it to yourself to check them out. IMO, they should be baked into Rails itself. They’re that useful.
j4mieover 5 years ago
Are there any sensible ways to do this with Django? Ideally with migrations to create and update the views.
评论 #22069090 未加载
评论 #22069499 未加载
评论 #22069746 未加载
JangoSteveover 5 years ago
I once worked on a larger Rails project where we had decided to utilize Postgres views and materialized views to optimize the performance of parts of our application. This was about 7 years ago, so before Scenic existed to help with this (or at least before it was known). We had to manually manage the use of our views while keeping Rails apprised of the underlying schema through custom schema files and migration methods.<p>One of the main issues we had run into, in addition to all the complexity that comes with managing a cache and no longer being able to count on your db queries to return the latest information, was that our views had to be deleted and re-created for nearly every schema change (at least every schema change that affected the views, but we had so many views, this ended up being probably 90% of our schema changes).<p>We ended up overriding the standard Rails &quot;up&quot; and &quot;down&quot; migration methods to prepend the deletion of these views, append their re-creation, clean up after failed migrations, etc. I remember us spending a good amount of time across the team dealing with weird edge cases that cropped up from this. I also remember spending a fair amount of time training all the developers on these issues since almost no one on the team had experience using or dealing with database views. I assume managing migrations and schema of your views within Rails is all included in what Scenic does for you for free, since these hassles weren&#x27;t mentioned in the article.<p>It&#x27;s really nice to see something that handles all that complexity for you. However, that still leaves the standard and age-old caching issues with stale data. Another side effect was that they started being a bandaid for inefficient queries and data structures in the database, which is probably how we ended up with so many db views in the first place.<p>In the end, after probably 8 months of using these views, we decided to excise them from the application entirely and go back to optimizing our actual database queries. We got to delete a lot of custom code, our schema changes and migrations became simpler, our data became fresh again, and we got to reinvest that time into improving our underlying data structures and queries.<p>This probably is not an argument against using db views, but rather an anecdote of what can happen when you resort to them prematurely. If you have queries that are 100ms or more, there are probably more traditional optimizations you can find, such as restructuring your relational data, adding&#x2F;removing indexes, etc. If you&#x27;re actually trying to eliminate those last 10s of milliseconds from your queries though, as is shown in the article, I can see them being a good option.
gigatexalover 5 years ago
Another approach could be to create the materialized view and then use some sort of event system say the pg_notify subsystem to capture an event and then increment some aggregation -- this would eliminate the need for scheduled view refreshes and keep the view almost real-time and quick.
评论 #22067793 未加载
评论 #22070252 未加载
评论 #22067908 未加载
pdkl95over 5 years ago
&quot;There are two hard things in computer science: <i>cache invalidation</i>, naming things, and off-by-one errors.&quot; (Phil Karlton, <i>et al</i>)<p>Views (both regular and materialized) are really useful, but remember that storing data in two places really is one of the of the hard problems in CS. Refreshing the materialized view on a timer might seem easy, but 6 months from now when something important reads from the stale view instead of the real table could become a really frustrating bug. A materialized view might still be a worthwhile optimization; just remember that you&#x27;re also adding cache management, which might be more complicated than you suspect.
评论 #22068163 未加载
评论 #22068294 未加载
drusepthover 5 years ago
Is there an upper limit on the number of materialized views that it makes sense to have in a database?<p>I&#x27;ve been trying to figure out a way to deal with my (growing) join of 13 rather large tables when fetching all of a user&#x27;s data. Are materialized views the kind of thing that I could generate these cached results _per user_ and have, like, hundreds of thousands of them sitting around to query? And then be able to query against all of user 123456&#x27;s content directly instead of filtering N tables for their content and joining them all together every time?
georiover 5 years ago
How often does rails refresh the materialized views? And is it automatic or do you have to explicitly refresh them?
评论 #22066424 未加载
评论 #22066443 未加载
评论 #22071258 未加载