Learning a hierarchy

267 points by gdb over 7 years ago

9 comments

canjobear over 7 years ago
It seems to me there's been an interesting turn in AI recently, toward focusing on adaptability as a goal in itself. Deep learning has shown that there is incredible power in stochastic gradient descent over a space of functions, but so far that has mostly been applied to rigid tasks. Now work like this is about turning that power towards adaptability itself as a goal, and it seems to me that this brings us towards "real" intelligence.
The logical extreme of this thinking would be agents that actually maximize entropy of future actions as the only objective function, like in [1].
[1] http://paulispace.com/intelligence/2017/07/06/maxent.html
anon404123 over 7 years ago
super cool that this was done by a high schooler
hacker_9 over 7 years ago
Does this optimise the hierarchy as the environment changes? For example when cooking, I unpackage food as needed, but when it starts to clutter the workspace I make a decision to fit in a 'clean up cycle' while waiting on some other food to cook.
zardo over 7 years ago
I was mulling over this idea yesterday in the context of RTS games... There's no reason to consider changing your overall strategy every frame. Nice to see it works!
It will be interesting to see how it performs with more tiers in the hierarchy, and with more structured tasks.
Controlling a virtual arm to play a board game for example.
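A minimal sketch of the point about not reconsidering strategy every frame, assuming a two-level setup in which a high-level "master" function is consulted only every `master_period` steps and a low-level sub-policy acts on every step. All names here (`run_episode`, `master`, `sub_policies`, `master_period`, the "gather"/"attack" actions) are hypothetical and chosen for illustration; this is not the code from the blog post or the paper.

```python
from typing import Callable, List, Tuple

# Hypothetical type: a sub-policy maps an observation to a low-level action.
SubPolicy = Callable[[int], str]


def run_episode(
    master: Callable[[int], int],
    sub_policies: List[SubPolicy],
    steps: int,
    master_period: int,
) -> List[Tuple[int, int, str]]:
    """Run `steps` low-level steps, consulting `master` only every `master_period` steps."""
    trace = []
    active = 0
    for t in range(steps):
        obs = t  # stand-in observation: just the timestep
        if t % master_period == 0:
            # High-level decision, made rarely.
            active = master(obs)
        # Low-level decision, made on every step by the active sub-policy.
        action = sub_policies[active](obs)
        trace.append((t, active, action))
    return trace


if __name__ == "__main__":
    subs = [lambda obs: "gather", lambda obs: "attack"]
    # Toy master: alternates strategy each time it is consulted.
    toy_master = lambda obs: (obs // 5) % 2
    for row in run_episode(toy_master, subs, steps=12, master_period=5):
        print(row)
```

In this toy run the strategy index only changes at steps 0, 5 and 10, while an action is still produced on every step.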
sharemywin over 7 years ago
Found the paper from the wired article below
https://s3-us-west-2.amazonaws.com/openai-assets/MLSH/mlsh_paper.pdf
indescions_2017 over 7 years ago
Next step: transfer learning and sharing amongst sub-policies in the graph hierarchy. If an Ant agent learns to "move up" to avoid an obstacle or reach a goal, why can't it infer the same for any cardinal or diagonal direction after observing the world around it? It's just a rotation or translation after all.
Also, for small numbers of sub-policies, would Monte Carlo playouts be faster, where we search over the next step the Ant may encounter? That is presumably a finite set of possible "wall-floor" configurations ;)
In any case, great work! Always love watching OpenAI vids...
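A toy sketch of the "it's just a rotation" idea above, assuming a sub-policy whose output is a 2D direction vector. That is a simplifying assumption for illustration: a simulated Ant is typically controlled through joint torques, so a real version would need to rotate the agent's frame of reference and not just its output. The names `rotate`, `move_up` and `rotated_policy` are hypothetical.

```python
import math
from typing import Callable, Tuple

Vec = Tuple[float, float]


def rotate(v: Vec, degrees: float) -> Vec:
    """Rotate a 2D vector counter-clockwise by `degrees`."""
    r = math.radians(degrees)
    c, s = math.cos(r), math.sin(r)
    x, y = v
    return (c * x - s * y, s * x + c * y)


def move_up(_obs: object) -> Vec:
    """Stand-in for a learned sub-policy that always heads 'up'."""
    return (0.0, 1.0)


def rotated_policy(policy: Callable[[object], Vec], degrees: float) -> Callable[[object], Vec]:
    """Derive a new sub-policy by rotating the output of an existing one."""
    def wrapped(obs: object) -> Vec:
        return rotate(policy(obs), degrees)
    return wrapped


# Derive the other cardinal directions from the single 'up' policy.
move_left = rotated_policy(move_up, 90)
move_down = rotated_policy(move_up, 180)
move_right = rotated_policy(move_up, -90)
```

Diagonals follow the same pattern with 45-degree multiples; results carry small floating-point error, so compare with a tolerance.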
sputknick over 7 years ago
I don't understand where the 'hierarchy' comes into play. This reads to me as a standard computer program where you execute code, and some of those lines execute other segments of code which might be much more complex than what I see. If I execute the line 'printline('Hello World')' I only executed one line, but many other things happened that I did not directly execute. I'm sure I'm missing something, and this is somehow different and novel, but I'm just missing it from this blog post.
setr over 7 years ago
Is it just me or is there something revolting about the character model?
Good work nonetheless but for god's sake give it six legs and make it black
gthinkin over 7 years ago
Great work, Kevin!