Pandas extension arrays

95 points by ptype, over 6 years ago

3 comments

lmeyerov, over 6 years ago
Innnnteresting! We've been using Pandas as our slow CPU fallback in the GPU Arrow world b/c of issues like this.

Today, we default to using pydata gpu arrow tools like BlazingSQL or Nvidia RAPIDS directly. They ~guarantee perf, and subtle yet critical for maintaining our < 100ms SLA, the Arrow format stays clean. (Ex: don't want a column schema to get coerced to something less structured.) We'll use Pandas as a fallback for when they lack features or are hard to use.

The ideal would be to use Pandas directly. Today it is a bit of a crapshoot on whether schemas will break across calls, and the above libraries are really replacements, rather than integrated accelerator extensions. So thinking like this project gets us closer to predictable (and GPU-level) performance within pandas, vs fully replacing it. So cool!
mactrey, over 6 years ago
I can't say this is going to make a big difference in how I use pandas but I've run into the bizarre "can't have nans in an int Series" annoyance in almost every pandas project I've worked on, so good on them for fixing that.
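
For readers who have not hit the "can't have nans in an int Series" issue, here is a minimal sketch of it and of the nullable integer extension dtype that addresses it. It assumes pandas 0.24 or later, where the `"Int64"` dtype was introduced; the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# Classic behavior: a missing value forces an integer column to float64,
# because NaN is a floating-point value.
legacy = pd.Series([1, 2, np.nan])
print(legacy.dtype)  # float64

# With the nullable extension dtype "Int64" (capital I), the values stay
# integers and the missing entry is tracked separately from the data.
nullable = pd.Series([1, 2, None], dtype="Int64")
print(nullable.dtype)            # Int64
print(nullable.isna().tolist())  # [False, False, True]
print(nullable.sum())            # 3 (missing values are skipped by default)
```

Note that the extension dtype is opt-in: a Series built without `dtype="Int64"` still falls back to float64 when it contains a missing value.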
bpchaps, over 6 years ago
Has anyone done any perf analysis between this and previous versions?