TechEcho
DuckDB Isn't Just Fast

111 points by calpaterson 11 months ago

7 comments

LunaSea 11 months ago
I know I'm repeating myself (it must be my third comment on HN about this topic), but this does not match my experience at all.

DuckDB will error out with an out-of-memory exception in very simple DISTINCT ON / GROUP BY queries.

Even with a temporary file, an on-disk database, and not keeping the initial order.

On any version of DuckDB.
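For readers unfamiliar with the three mitigations mentioned above, here is a minimal sketch of how they are usually set from Python. The `memory_limit`, `temp_directory`, and `preserve_insertion_order` settings are real DuckDB options; the `events` table, the file names, and the 1GB limit are assumptions made up for illustration. As the comment says, these settings do not guarantee that a large aggregation will fit.

```python
import os
import tempfile


def spill_friendly_settings(memory_limit, temp_directory):
    """Build the SET statements for the three mitigations named in the comment."""
    return [
        f"SET memory_limit = '{memory_limit}'",       # cap DuckDB's memory use
        f"SET temp_directory = '{temp_directory}'",   # where data may spill to disk
        "SET preserve_insertion_order = false",       # do not keep the initial row order
    ]


def run_example():
    """Apply the settings to an on-disk database (requires `pip install duckdb`)."""
    import duckdb  # imported here so the helper above works without DuckDB installed

    workdir = tempfile.mkdtemp()
    con = duckdb.connect(os.path.join(workdir, "analysis.duckdb"))  # on-disk database
    for statement in spill_friendly_settings("1GB", os.path.join(workdir, "spill")):
        con.execute(statement)

    # Hypothetical table, only so the GROUP BY below has something to aggregate.
    con.execute(
        "CREATE TABLE events AS SELECT range % 1000 AS user_id FROM range(100000)"
    )
    return con.execute(
        "SELECT user_id, count(*) FROM events GROUP BY user_id"
    ).fetchall()
```

Calling `run_example()` on a machine with DuckDB installed runs a small GROUP BY under those settings; whether a much larger query still runs out of memory is exactly the point under dispute in this thread.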
felipemesquita 11 months ago
DuckDB has great ergonomics for moving data between different databases and making copies for local analysis. The one thing that differed in my experience with it from the author's is how much of the Postgres SQL dialect (and extensions) it supports. Attempting to run my Postgres analytics SQL code in DuckDB errors out on most JSON operations (to be fair, the DuckDB JSON functions have cleaner names than jsonb_path_query). DuckDB also has no support for handling XML, so all xpath calls fail as well.
dgan 11 months ago
Really, is this what's getting praised? I mean specifically the first point: the whole "just paste the URL into the DB" thing, plus inferring the column names. That looks like the laziest and shakiest basis, and if I ever saw that in production I'd be both stunned and scared.
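The production concern here is that inferred column names and types can change silently when the upstream file changes. A minimal sketch of the alternative, pinning the expected schema and failing loudly on a mismatch, might look like the following. It uses only the Python standard library rather than DuckDB, and the column names (`symbol`, `date`, `close`) and the `load_rows` helper are hypothetical.

```python
import csv
import io

# Hypothetical schema: column name -> converter applied to each raw string value.
EXPECTED_COLUMNS = {"symbol": str, "date": str, "close": float}


def load_rows(text, expected=EXPECTED_COLUMNS):
    """Parse CSV text, rejecting any header that differs from the pinned schema."""
    reader = csv.DictReader(io.StringIO(text))
    header = list(reader.fieldnames or [])
    if header != list(expected):
        raise ValueError(
            f"unexpected header {header!r}, expected {list(expected)!r}"
        )

    rows = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, raw in enumerate(reader, start=2):
        try:
            rows.append({name: cast(raw[name]) for name, cast in expected.items()})
        except (TypeError, ValueError) as exc:
            raise ValueError(f"line {line_no}: {exc}") from exc
    return rows
```

With this shape, a renamed or reordered column stops the load instead of flowing through under a guessed name, which is the failure mode the comment is worried about.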
uptime 11 months ago
observablehq.com has built-in support for DuckDB, and I have found it to be very easy to use. Getting windowing, CTEs, and derived columns is great, and being able to just refer to SQL query cells as an array of rows makes things much easier for me than breaking out into JS right away.

Someone wrote an export function, so I can make a select into a table and grab that as CSV to use elsewhere.

I wish Simon Willison would adopt DuckDB as he has with SQLite, to see what he would create!
koromak 11 months ago
I very, very nearly migrated to a full DuckDB solution for customer-facing historical stock data. It would have been magical, and ridiculously, absurdly, ungodly fast. But the cloud costs ended up being close to a managed analytics solution, with significantly more moving parts (on our end). But I think that's just our use case; going forward I'd look at DuckDB as an option for any large-scale datasets.

Using ECS/EKS containers reading from a segmented dataset in EFS is a really solid solution; you can get sub-second performance over 6 billion rows / 10000 columns with proper management and reasonably restrictive queries.

Another option is to just deploy a couple of huge EC2 instances that can fully fit the dataset. Costs here were about the same, but with a little more pain in server management. But the speed, man, it's just unbelievable.
aargh_aargh 11 months ago
Great, I didn't know about fsspec!
Woshiwuja 11 months ago
Uhm, why would you ever use this instead of SQLite?