TechEcho

ReKep: Spatio-Temporal Reasoning of Relational Keypoint Constraints for Robots

48 points by jasondavies 8 months ago

2 comments

aithrowawaycomm 8 months ago
I didn't think this GitHub Pages write-up was very clear, but the linked paper on arXiv is interesting (haven't finished reading yet!) and this is a cool project.

Ultimately the weaknesses seem to come from "outsourcing" true spatio-mechanical reasoning to a language model which designs the corresponding constraints, but does so with the same kind of brittle reasoning and odd limitations we've come to expect. It's not really "artificial" spatial reasoning so much as "virtual": sometimes quite good, but paper-thin and largely based on memorizing patterns. I think the authors overstated a few conclusions, e.g. the clothes folding doesn't appear to be following any strategy at all, let alone a "novel" strategy - whatever apparent hints of strategy the authors are seeing are probably better explained by the symmetry of human clothing, which the vision model picks up on.

And note they didn't ask the robot to fold *messy* clothing like a human does when it's fresh out of the dryer; I suspect the robot needs shirts and pants to be laid out neatly, otherwise the vision model will misidentify them.

More generally, the authors did not do enough to stress-test the robot in situations that don't line up with the training data. It's cool to pour tea from a pot into a mug, but the vision model presumably has thousands of photos of this for the robot to imitate. What if you ask the robot to pour a mug into an open teapot? Presumably the vision model itself is less adept with this prompt; maybe the robot will still work, it's a simple task.

But experience with ANNs suggests it's likely to falter in these off-the-golden-path cases, and that it'll falter in ways that are bizarre and unpredictable. I would have liked to see more comprehensive stress testing before using fancy terms like "spatio-temporal reasoning." AI does not need more fancy tech demos driving unrealistic hype.

Regardless, the results are very cool, and the underlying machinery is sophisticated without being too mysterious (once you accept the mysterious AI models it's based on...). I think the edge case issue might hinder *industrial* deployment in e.g. a factory, but I think robotics tinkerers and hobbyists would have a blast with these ideas, and people much cleverer than me could even make a real product.
CodeGroyper 8 months ago
I love that Twitter is linked as tl;dr.