TE
科技回声
首页24小时热榜最新最佳问答展示工作
GitHubTwitter
首页

科技回声

基于 Next.js 构建的科技新闻平台,提供全球科技新闻和讨论内容。

GitHubTwitter

首页

首页最新最佳问答展示工作

资源链接

HackerNews API原版 HackerNewsNext.js

© 2025 科技回声. 版权所有。

Using PostgreSQL Arrays the Right Way

42 点作者 afshin大约 10 年前

4 条评论

bonesmoses大约 10 年前
Introducing the caveats of PostgreSQL arrays is informative. However, I can&#x27;t help but note that this app probably would have been better served by a classic event log table. Why have each user forever carrying tens, or hundreds of thousands of events directly in their user record?<p>If data mining is necessary, it&#x27;s <i>right there</i> in the event log and can be manipulated by much simpler (and faster, possibly by orders of magnitude) SQL. Additionally, it&#x27;s not bloating regular access of the user records.<p>I&#x27;ve been seeing these designs more and more, recently. Is it a natural extension of non-DBAs taking DDL design roles? Because this data manipulation style doesn&#x27;t really mix well with any RDBMS I&#x27;m familiar with.
评论 #9329077 未加载
评论 #9328914 未加载
评论 #9328076 未加载
greggyb大约 10 年前
I find it very interesting to see how seemingly minor rewrites can have huge performance impacts.<p>I do have a question about the SQL style. I work in business intelligence, so I write a pretty good amount of SQL, and get to see a lot of other people&#x27;s efforts, ranging from newbie analysts&#x27; first fumblings to experienced DBAs&#x27; queries and procedures. It seems that nested subqueries in the style of this article are used much more often than CTEs, and I&#x27;m curious why this is.<p>* I find CTEs much more readable than subqueries * For naive uses, they are optimized the same * For complex uses (e.g. recursion to traverse a parent-child hierarchy), CTEs are the only option<p>The only good reason I see for subqueries is if you are using an old version of your RDBMS that does not support CTEs.<p>I&#x27;m just wondering if anyone else can weigh in on the topic.
评论 #9329288 未加载
评论 #9329153 未加载
dbenhur大约 10 年前
Author could lose a layer of subselect with the filter on rank by using first_value over the window frame instead. <a href="https:&#x2F;&#x2F;gist.github.com&#x2F;dbenhur&#x2F;951b7bae4e07d7998268" rel="nofollow">https:&#x2F;&#x2F;gist.github.com&#x2F;dbenhur&#x2F;951b7bae4e07d7998268</a>
评论 #9331133 未加载
mrits大约 10 年前
I had some pretty great success rewriting some of our longer running queries to use arrays. A couple things:<p>I implemented several helper functions around this. Basically I have a row based function that makes sets unique then I have aggregation functions that give me the cardinality of these sets. I use intset when I can, otherwise I use the &quot;key&quot; part of hstore in substitute of the missing set structure<p>I&#x27;ve been pushing hard to use: <a href="https:&#x2F;&#x2F;github.com&#x2F;aggregateknowledge&#x2F;postgresql-hll" rel="nofollow">https:&#x2F;&#x2F;github.com&#x2F;aggregateknowledge&#x2F;postgresql-hll</a><p>Being able to switch algorithms based on size IMO makes that one of the most badass thing about postgres right now.
评论 #9328118 未加载