Evolving Stable Strategies

32 points by wei_jok over 7 years ago

1 comment

candiodari over 7 years ago

I wonder what happens when you simply backprop using experience replay in either a CNN or fully connected net. Just run a random neural net, and take "samples" (inputs + outputs) every 1s or so. After 30s get an error, optionally "discount" it over time, and run backprop.
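
One possible reading of the idea above, as a rough NumPy sketch rather than anything from the linked article. Everything named here (`TinyNet`, `discount_error`, `replay_update`, the baseline, the exploration noise, the placeholder error value) is a hypothetical choice made for illustration. The comment does not say how a single scalar error becomes an output gradient, so this sketch assumes a simple scheme: add noise to the net's outputs, record the noisy outputs as the samples, then pull the net toward those recorded outputs if the episode beat a baseline and push it away otherwise, with the error discounted backwards from the end of the episode.

```python
import numpy as np

def discount_error(final_error, n_samples, gamma=0.9):
    """Spread one end-of-episode error over the samples: the last sample
    gets the full error, earlier ones get geometrically less."""
    steps_from_end = np.arange(n_samples - 1, -1, -1)
    return final_error * gamma ** steps_from_end

class TinyNet:
    """Fully connected net with one tanh hidden layer (hypothetical stand-in)."""
    def __init__(self, n_in, n_hidden, n_out, rng):
        self.W1 = rng.normal(0, 0.5, (n_in, n_hidden))
        self.W2 = rng.normal(0, 0.5, (n_hidden, n_out))

    def forward(self, x):
        h = np.tanh(x @ self.W1)
        return h, h @ self.W2

    def backprop(self, x, out_grad, lr=0.01):
        """One gradient step for a single sample, given dLoss/dOutput."""
        h, _ = self.forward(x)
        grad_W2 = np.outer(h, out_grad)
        grad_h = self.W2 @ out_grad
        grad_W1 = np.outer(x, grad_h * (1 - h ** 2))  # tanh'(z) = 1 - tanh(z)^2
        self.W2 -= lr * grad_W2
        self.W1 -= lr * grad_W1

def replay_update(net, samples, final_error, baseline=0.0, gamma=0.9, lr=0.01):
    """Replay stored (input, output) samples with a discounted signal.
    Assumption: a positive weight (error below baseline) pulls the net
    toward the recorded output, a negative weight pushes it away."""
    weights = discount_error(baseline - final_error, len(samples), gamma)
    for (x, action), w in zip(samples, weights):
        _, out = net.forward(x)
        net.backprop(x, w * (out - action), lr)

# One 30-sample episode, standing in for "a sample every 1s for 30s".
rng = np.random.default_rng(0)
net = TinyNet(n_in=4, n_hidden=8, n_out=2, rng=rng)
samples = []
for _ in range(30):
    x = rng.normal(size=4)                        # stand-in observation
    _, out = net.forward(x)
    action = out + rng.normal(scale=0.1, size=2)  # assumed exploration noise
    samples.append((x, action))

final_error = 1.0  # placeholder; a real task would measure this
replay_update(net, samples, final_error, baseline=1.5)
```

With a single scalar per episode, the update is very noisy, which is presumably part of why one would expect this to need many episodes. Swapping `TinyNet` for a CNN would not change the replay logic, only the forward and backward passes.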