Evolved Policy Gradients

48 points by gdb about 7 years ago

3 comments

edhu2017 about 7 years ago
"EPG takes a step toward agents that are not blank slates but instead know what it means to make progress on a new task, by having experienced making progress on similar tasks in the past." Can someone explain to me how they take a step? It seems like they just use random search to define a loss function for the sub-policy to optimize against. Is it because the loss function is "learned" over the sequence of actions, making it adaptive?
twtw about 7 years ago
TL;DR:

Parametrize your loss function and wrap a normal policy optimization with a random search to find a better loss function. Don't call it "random search," call it "evolution strategies" to make it sound sophisticated.

Neat idea.
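For anyone who wants to see the two nested loops concretely, here is a toy sketch of that summary. It is not the paper's implementation. The assumptions are that the "policy" is a single scalar `theta`, that the learned loss has only two hypothetical coefficients `phi`, and that the outer loop is a basic evolution-strategies update with normalized fitness. All function names are made up for illustration.

```python
import random

def inner_train(phi, target, steps=5, lr=0.1):
    """Inner loop: plain gradient descent on the *learned* loss
    L_phi(theta) = phi[0] * (theta - target)**2 + phi[1] * theta**2
    (a hypothetical two-coefficient loss family)."""
    theta = 0.0
    for _ in range(steps):
        grad = 2.0 * phi[0] * (theta - target) + 2.0 * phi[1] * theta
        theta -= lr * grad
    return theta

def true_return(theta, target):
    """The real objective. Only the outer loop ever looks at it."""
    return -(theta - target) ** 2

def fitness(phi, targets):
    """Average true return after training against the learned loss."""
    total = sum(true_return(inner_train(phi, t), t) for t in targets)
    return total / len(targets)

def evolve_loss(init_phi, targets, generations=100, pop=20,
                sigma=0.1, alpha=0.05, seed=0):
    """Outer loop: perturb phi with Gaussian noise, score each
    perturbation by the return the inner learner reaches, and move phi
    toward the perturbations that scored above average."""
    rng = random.Random(seed)
    phi = list(init_phi)
    for _ in range(generations):
        noises = [[rng.gauss(0.0, 1.0) for _ in phi] for _ in range(pop)]
        scores = [
            fitness([p + sigma * e for p, e in zip(phi, eps)], targets)
            for eps in noises
        ]
        mean = sum(scores) / pop
        std = (sum((s - mean) ** 2 for s in scores) / pop) ** 0.5 or 1.0
        for j in range(len(phi)):
            step = sum((s - mean) / std * eps[j]
                       for s, eps in zip(scores, noises)) / pop
            phi[j] += alpha * step
    return phi

if __name__ == "__main__":
    targets = [-2.0, -1.0, 1.0, 2.0]   # a small family of similar tasks
    start = [0.1, 0.0]                  # a weak initial loss
    evolved = evolve_loss(start, targets)
    print("return with initial loss:", fitness(start, targets))
    print("return with evolved loss:", fitness(evolved, targets))
```

The inner learner never sees `true_return`. It only follows gradients of the loss it is handed, so any improvement comes from the outer search reshaping that loss across the task family.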
yohann305 about 7 years ago
Would someone here know how to go about recreating a physics sandbox using a virtual robot arm with cubes in a game engine editor like Unity/UE4 where we'd be able to apply ML?

Any suggestion is welcome.