It seems to me there's been an interesting turn in AI recently, toward treating adaptability as a goal in itself. Deep learning has shown there is incredible power in stochastic gradient descent over a space of functions, but so far that power has mostly been applied to rigid, fixed tasks. Work like this turns it toward adaptability itself, and to me that is a step toward "real" intelligence.

The logical extreme of this line of thinking would be agents whose only objective is to maximize the entropy of their future actions, as in [1].

[1] http://paulispace.com/intelligence/2017/07/06/maxent.html
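For the curious, here is a minimal sketch of what that kind of objective could look like: score each candidate action by a Monte Carlo estimate of the entropy of the states reachable after taking it, and pick the action that keeps the most options open. This is just my toy illustration, not the method in [1]; the env interface (copy(), step(), legal_actions(), sample_random_action()) is entirely hypothetical.

```python
import numpy as np
from collections import Counter

def future_entropy(env, action, horizon=10, rollouts=50):
    """Monte Carlo estimate of the entropy over states reachable after
    taking `action`, assuming env.copy() returns an independent
    simulator and states are hashable."""
    outcomes = Counter()
    for _ in range(rollouts):
        sim = env.copy()
        state, done = sim.step(action)
        for _ in range(horizon):
            if done:
                break
            state, done = sim.step(sim.sample_random_action())
        outcomes[state] += 1  # count where this rollout ended up
    probs = np.array(list(outcomes.values()), dtype=float) / rollouts
    return -np.sum(probs * np.log(probs))

def maxent_action(env):
    # The only "objective" is keeping future options open.
    return max(env.legal_actions(), key=lambda a: future_entropy(env, a))
```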
Does this re-optimise the hierarchy as the environment changes? For example, when cooking I unpackage food as needed, but when the packaging starts to clutter the workspace I decide to fit in a 'clean-up cycle' while waiting on some other food to cook.
I was mulling over this idea yesterday in the context of RTS games...
There's no reason to reconsider your overall strategy every frame. Nice to see it works!

It will be interesting to see how it performs with more tiers in the hierarchy and with more structured tasks: controlling a virtual arm to play a board game, for example.
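A rough sketch of that control loop as I read the paper: a master policy picks one of K sub-policies, the choice is held fixed for N low-level steps, and the selected sub-policy emits an action every frame. The master, subpolicies, and gym-style env below are placeholders, not the authors' actual code.

```python
def run_episode(env, master, subpolicies, N=200, max_steps=10_000):
    """Two-level hierarchy: `master` re-decides the "strategy" only
    every N frames; the chosen sub-policy acts at every frame."""
    obs = env.reset()
    total_reward, done, t = 0.0, False, 0
    while not done and t < max_steps:
        k = master.act(obs)  # high-level choice, held for N steps
        for _ in range(N):
            obs, reward, done = env.step(subpolicies[k].act(obs))
            total_reward += reward
            t += 1
            if done or t >= max_steps:
                break
    return total_reward
```

More tiers would presumably just nest this loop, with progressively longer holding periods as you go up.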
Found the paper from the Wired article below:

https://s3-us-west-2.amazonaws.com/openai-assets/MLSH/mlsh_paper.pdf
Next step: transfer learning and sharing among sub-policies in the hierarchy. If an Ant agent learns to "move up" to avoid an obstacle or reach a goal, why can't it infer the same skill for any cardinal or diagonal direction after observing the world around it? It's just a rotation or translation, after all.

Also, for small numbers of sub-policies, would Monte Carlo playouts be faster, where we search over the next step the Ant may encounter? That's presumably a finite set of possible "wall-floor" configurations ;)

In any case, great work! Always love watching OpenAI vids...
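To make the rotation point concrete, a hedged sketch: assuming an egocentric 2D observation, you could rotate the world into the frame the "move up" sub-policy was trained in, then rotate its action back out, reusing one skill for any direction. The move_up_policy interface is hypothetical.

```python
import numpy as np

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def steer(move_up_policy, obs_xy, angle_from_up):
    """Reuse a sub-policy trained to 'move up' for any direction:
    rotate the (egocentric, 2D) observation into the trained frame,
    then rotate the resulting action back into the world frame.
    E.g. angle_from_up = -np.pi/2 makes the agent move right."""
    R = rotation(angle_from_up)
    action = move_up_policy.act(R.T @ obs_xy)  # observe as if the goal were "up"
    return R @ action                          # act in the world frame
```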
I don't understand where the 'hierarchy' comes into play. This reads to me like a standard computer program where you execute code, and some of those lines execute other segments of code which may be much more complex than what I see. If I execute the line printline('Hello World'), I only executed one line, but many other things happened that I did not directly execute. I'm sure I'm missing something and this is somehow different and novel, but I'm just not getting it from the blog post.
Is it just me, or is there something revolting about the character model?

Good work nonetheless, but for god's sake give it six legs and make it black.