TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

Using PostgreSQL Arrays the Right Way

42 pointsby afshinabout 10 years ago

4 comments

bonesmosesabout 10 years ago
Introducing the caveats of PostgreSQL arrays is informative. However, I can&#x27;t help but note that this app probably would have been better served by a classic event log table. Why have each user forever carrying tens, or hundreds of thousands of events directly in their user record?<p>If data mining is necessary, it&#x27;s <i>right there</i> in the event log and can be manipulated by much simpler (and faster, possibly by orders of magnitude) SQL. Additionally, it&#x27;s not bloating regular access of the user records.<p>I&#x27;ve been seeing these designs more and more, recently. Is it a natural extension of non-DBAs taking DDL design roles? Because this data manipulation style doesn&#x27;t really mix well with any RDBMS I&#x27;m familiar with.
评论 #9329077 未加载
评论 #9328914 未加载
评论 #9328076 未加载
greggybabout 10 years ago
I find it very interesting to see how seemingly minor rewrites can have huge performance impacts.<p>I do have a question about the SQL style. I work in business intelligence, so I write a pretty good amount of SQL, and get to see a lot of other people&#x27;s efforts, ranging from newbie analysts&#x27; first fumblings to experienced DBAs&#x27; queries and procedures. It seems that nested subqueries in the style of this article are used much more often than CTEs, and I&#x27;m curious why this is.<p>* I find CTEs much more readable than subqueries * For naive uses, they are optimized the same * For complex uses (e.g. recursion to traverse a parent-child hierarchy), CTEs are the only option<p>The only good reason I see for subqueries is if you are using an old version of your RDBMS that does not support CTEs.<p>I&#x27;m just wondering if anyone else can weigh in on the topic.
评论 #9329288 未加载
评论 #9329153 未加载
dbenhurabout 10 years ago
Author could lose a layer of subselect with the filter on rank by using first_value over the window frame instead. <a href="https:&#x2F;&#x2F;gist.github.com&#x2F;dbenhur&#x2F;951b7bae4e07d7998268" rel="nofollow">https:&#x2F;&#x2F;gist.github.com&#x2F;dbenhur&#x2F;951b7bae4e07d7998268</a>
评论 #9331133 未加载
mritsabout 10 years ago
I had some pretty great success rewriting some of our longer running queries to use arrays. A couple things:<p>I implemented several helper functions around this. Basically I have a row based function that makes sets unique then I have aggregation functions that give me the cardinality of these sets. I use intset when I can, otherwise I use the &quot;key&quot; part of hstore in substitute of the missing set structure<p>I&#x27;ve been pushing hard to use: <a href="https:&#x2F;&#x2F;github.com&#x2F;aggregateknowledge&#x2F;postgresql-hll" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;aggregateknowledge&#x2F;postgresql-hll</a><p>Being able to switch algorithms based on size IMO makes that one of the most badass thing about postgres right now.
评论 #9328118 未加载