DuckDB Isn't Just Fast

111 points by calpaterson, 11 months ago

7 comments

LunaSea, 11 months ago
I know I'm repeating myself (it must be my third comment on HN about this topic), but this does not match my experience at all.

DuckDB will error out with an out-of-memory exception in very simple DISTINCT ON / GROUP BY queries.

Even with a temporary file, an on-disk database and not keeping the initial order.

On any version of DuckDB.
felipemesquita, 11 months ago
DuckDB has great ergonomics for moving data between different databases and making copies for local analysis. The one thing that differed in my experience with it from the author's is how much of the Postgres SQL dialect (and extensions) it supports. Attempting to run my Postgres analytics SQL code in DuckDB errors out on most JSON operations. To be fair, the DuckDB JSON functions have cleaner names than jsonb_path_query. Also, DuckDB has no support for handling XML, so all xpath calls fail as well.
dgan, 11 months ago
Really, is this what's getting praised? I mean specifically the first point: the whole "just paste the URL into the DB" thing, plus inferring the column names. That looks like the laziest and shakiest basis, and if I ever saw that in production I'd be both stunned and scared.
uptime, 11 months ago
observablehq.com has built-in support for DuckDB, and I have found it to be very easy to use. Getting windowing, CTEs and derived columns is great, and being able to just refer to SQL query cells as an array of rows makes things much easier for me than breaking out into JS right away.

Someone wrote an export function, so I can make a select into a table and grab that as CSV to use elsewhere.

I wish for Simon Willison to adopt DuckDB as he has with SQLite to see what he would create!
koromak, 11 months ago
I very, very nearly migrated to a full DuckDB solution for customer-facing historical stock data. It would have been magical, and ridiculously, absurdly, ungodly fast. But the cloud costs ended up being close to a managed analytics solution, with significantly more moving parts (on our end). But I think that's just our use case; going forward I'd look at DuckDB as an option for any large-scale datasets.

Using ECS/EKS containers reading from a segmented dataset in EFS is a really solid solution. You can get sub-second performance over 6 billion rows / 10000 columns with proper management and reasonably restrictive queries.

Another option is to just deploy a couple of huge EC2 instances that can fully fit the dataset. Costs here were about the same, but with a little more pain in server management. But the speed, man, it's just unbelievable.
aargh_aargh, 11 months ago
Great, I didn't know about fsspec!
Woshiwuja, 11 months ago
uhm why would you ever use this instead of sqlite