π0.5: A VLA with open-world generalization

177 points, by lachyg, 22 days ago

11 comments

bytesandbits, 22 days ago
Most of it is open source. Their VLAs are built on Gemma models plus vision encoders, with their own action experts on top. You can download and experiment with or fine-tune their Pi0 VLAs directly from their servers (JAX format) or from the Hugging Face LeRobot safetensors port. Their repo also includes notebooks and code to get started with fine-tuning. Inference runs on a single RTX 4090, streamed over Wi-Fi to the robot.
beklein, 22 days ago
This is amazing! As someone working with industrial robots, normally under strict environmental constraints and control, witnessing this kind of real-world robotics progress truly excites me about the future!

By the way, they've open-sourced their π0 model (code and model weights). More information can be found here: https://github.com/Physical-Intelligence/openpi
djoldman, 22 days ago
I'm genuinely asking (not trying to be snarky)... why are these robots so slow?

Is it a throughput constraint given too much data from the environment sensors? Is it processing the data?

I'm curious about where the bottleneck is.
huydotnet, 22 days ago
Amazing! On a fun note, I believe if a human kid were cleaning up the spill and threw the sponge into the sink like that, the kid would be in trouble. XD
meisel, 22 days ago
These variable-length arrays are getting quite advanced
gs17, 22 days ago
Is the robot platform they're using something they've developed themselves? The paper doesn't seem to mention any details outside of sensors and actuators.
th0ma5, 22 days ago
Do the general laws of demos apply here? That any automation shown is the extent of the capabilities, not the start?
airstrike, 22 days ago
I'm just a layman, but I can't see this design scaling. It's way too slow and "hard" for fine motor tasks like cleaning up a kitchen, or for being anywhere around humans, really.

I think the future is in "softer" types of robots that can sense whether their fingers are pushing a cabinet door (or facing resistance) and adjust accordingly. A quick Google search turns up this example (an animated render), which is closer to what I imagine the ultimate solution will look like: https://compliance-robotics.com/compliance-industry/

Human flesh is way too squishy for us to allow hard tools to interface with it unless a human is in control. The difference between a blunt weapon and the robot from TFA is that the latter is very slow and on wheels.
yencabulator, 22 days ago
VLA = vision-language-action, a kind of machine learning model
desertmonad, 22 days ago
Finally, machines doing the work we *don't* want to do
zx8080, 21 days ago
> Investors
> We are grateful for the support of Bond, Jeff Bezos, Khosla Ventures, Lux Capital, OpenAI, Redpoint Ventures, Sequoia Capital, and Thrive Capital.