TechEcho

2 comments

I didn't think this GitHub Pages write-up was very clear, but the linked paper on arXiv is interesting (haven't finished reading yet!) and this is a cool project.

Ultimately the weaknesses seem to come from "outsourcing" true spatio-mechanical reasoning to a language model which designs the corresponding constraints, but does so with the same kind of brittle reasoning and odd limitations we've come to expect. It's not really "artificial" spatial reasoning so much as "virtual": sometimes quite good, but paper-thin and largely based on memorizing patterns. I think the authors overstated a few conclusions, e.g. the clothes folding doesn't appear to be following any strategy at all, let alone a "novel" strategy - whatever apparent hints of strategy the authors are seeing are probably better explained by the symmetry of human clothing, which the vision model picks up on.

And note they didn't ask the robot to fold messy clothing like a human does when it's fresh out of the dryer; I suspect the robot needs shirts and pants to be laid out neatly, otherwise the vision model will misidentify them.

More generally, the authors did not do enough to stress-test the robot in situations that don't line up with the training data. It's cool to pour tea from a pot into a mug, but the vision model presumably has thousands of photos of this for the robot to imitate. What if you ask the robot to pour from a mug into an open teapot? Presumably the vision model itself is less adept with this prompt; maybe the robot will still work, since it's a simple task.

But experience with ANNs suggests it's likely to falter in these off-the-golden-path cases, and that it'll falter in ways that are bizarre and unpredictable. I would have liked to see more comprehensive stress testing before using fancy terms like "spatio-temporal reasoning." AI does not need more fancy tech demos driving unrealistic hype.

Regardless, the results are very cool, and the underlying machinery is sophisticated without being too mysterious (once you accept the mysterious AI models it's based on...). I think the edge case issue might hinder industrial deployment in e.g. a factory, but I think robotics tinkerers and hobbyists would have a blast with these ideas, and people much cleverer than me could even make a real product.

CodeGroyper 8 months ago

I love that Twitter is linked as tl;dr.

aithrowawaycomm 8 months ago

ReKep: Spatio-Temporal Reasoning of Relational Keypoint Constraints for Robots
