Entropy Maximization and intelligent behaviour

118 points by aidanrocke · almost 8 years ago

8 comments

RangerScience · almost 8 years ago
Okay, TL;DR:

"Causal Entropic Forcing" is something like an AI's utility function, where the agent attempts to maximize future possibilities. Since this is meaningless on its own (all possible futures are possible), what you actually want is to make it as easy as possible to *get* to those futures - aka their entropic adjacency, hence the name: causal entropic forcing.

However, CEF requires that the agent can actually *predict* possible future states of the system, which comes with some serious issues. In the original paper this is covered by access to perfect simulators, but those aren't available in real-world situations.

This post discusses how to (possibly) use recurrent neural networks to make such predictions; how to do so effectively, and with consideration of the NN's confidence in its predictions.

It's pretty cool!
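
A minimal sketch of that loop, assuming a known transition model `step(state, action)` and a discrete, hashable state space (both placeholders standing in for the post's RNN predictor), and approximating causal path entropy by the entropy over sampled end-states:

```python
import random
from collections import Counter
from math import log

def path_entropy(state, step, actions, horizon=10, n_samples=200):
    """Monte Carlo estimate of the entropy over end-states reachable
    from `state` by acting randomly for `horizon` steps."""
    ends = Counter()
    for _ in range(n_samples):
        s = state
        for _ in range(horizon):
            s = step(s, random.choice(actions))
        ends[s] += 1  # assumes states are hashable / discretized
    total = sum(ends.values())
    return -sum((c / total) * log(c / total) for c in ends.values())

def cef_action(state, step, actions, horizon=10):
    """Pick the action whose successor state keeps the most futures open."""
    return max(actions,
               key=lambda a: path_entropy(step(state, a), step, actions, horizon))
```
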
RangerScience · almost 8 years ago
This (causal entropic forcing) is one of the coolest ideas I've ever come across; it's one of my go-to stories about machine learning and philosophy.

It combines well with Jeremy England's theory that life is entropically inevitable: https://www.scientificamerican.com/article/a-new-physics-theory-of-life/

...and I wonder sometimes if you could make a religion out of all this; morality and existence based on entropic math. Consider that the goal of CEF is maximized possibilities, smoke a bowl, and think about fractals and holograms.

One of the things I find really interesting about CEF is that it doesn't specifically help with understanding or predicting the world around you; it just gives a very effective way to determine which possible actions you should actually take. Given that (AFAIK) the human brain/mind is itself a combination of many systems, it seems very elegant that a CEF agent is also a combination of systems, each of which has its own limitations and issues.

highd · almost 8 years ago
I've been trying to parse this body of work - there doesn't seem to be a writeup on the exact implementation, just that it's using "Causal Entropic Forces". They have this writeup on the optimization implementation here: https://arxiv.org/pdf/1705.08691.pdf

One red flag for me is that they're simultaneously claiming that there's no training during the OpenAI Gym runs while also claiming that the optimization approach is relevant. In that case, what is being optimized? It seems like they might be optimizing over previous simulations - there's frequent reference to having access to a "simulator". In that case, that should effectively count as training, right? I was under the impression that the OpenAI Gym was supposed to benchmark untrained approaches so they could be compared by learning time. Hence the gradually increasing training curves in the other approaches.
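
A toy version of the concern, as a sketch (assuming the pre-0.26 gym API where step() returns a 4-tuple, and a classic-control env that survives deepcopy; none of this is from the paper): the agent below never updates a single parameter, yet every decision consumes n_samples * horizon simulator transitions, which makes "no training" hard to compare fairly against methods whose learning curves count environment steps.

```python
import copy

import gym  # assumes pre-0.26 gym, where step() returns (obs, reward, done, info)

def rollout_score(env, action, horizon=20, n_samples=30):
    """Score `action` by random rollouts inside cloned simulators.
    Nothing is learned, but many simulator queries are spent."""
    total = 0.0
    for _ in range(n_samples):
        sim = copy.deepcopy(env)  # perfect-simulator assumption
        _, reward, done, _ = sim.step(action)
        total += reward
        for _ in range(horizon - 1):
            if done:
                break
            _, reward, done, _ = sim.step(sim.action_space.sample())
            total += reward
    return total / n_samples

env = gym.make("CartPole-v1")
env.reset()
best_action = max(range(env.action_space.n), key=lambda a: rollout_score(env, a))
```
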
pizza · almost 8 years ago
I remember this -- entropica -- from, well, must have been like 5 or 6 years ago now: http://www.entropica.com/

pzone · almost 8 years ago
This blog post seems to be a comment or response aimed at people who already understand the paper, not an exposition for someone encountering it for the first time. I think I'm moderately well versed in probability and information theory and couldn't make heads or tails of it.

mehwoot · almost 8 years ago
*Maximizing your number of future options is not always a good idea. Sometimes fewer options are better, provided that these are more useful options.*

I guess I'm missing something, because this seems to negate the entire point... isn't the point that the number of future options is a good measure of "more useful options"?
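
One way to make the objection concrete, with made-up numbers: a uniform choice among n options has entropy log(n), so a pure option-counting criterion prefers ten worthless options to two valuable ones, while any value-weighted criterion ranks them the other way around.

```python
from math import log

# State A: ten options, each worth nothing. State B: two options, each worth 10.
entropy_a, entropy_b = log(10), log(2)  # 2.30 vs 0.69 -> option-counting prefers A
value_a, value_b = 0.0, 10.0            # best achievable value -> prefers B
print(entropy_a > entropy_b, value_a < value_b)  # True True
```
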
mrdrozdov · almost 8 years ago
At a quick glance, this work seems related to information maximization, as done in the papers on InfoGAN, VIME, and intrinsic motivation (for automatic goal-setting in RL).
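
For a concrete, if crude, member of that family: a count-based exploration bonus, standing in here for the variational information gain that VIME actually computes (the `beta` coefficient and the hashable-state assumption are mine, for illustration):

```python
from collections import Counter
from math import sqrt

class CountBonus:
    """Toy intrinsic-motivation reward: pay the agent for visiting
    rarely-seen states, a crude stand-in for information gain."""
    def __init__(self, beta=0.1):
        self.counts = Counter()
        self.beta = beta

    def shaped_reward(self, extrinsic, state):
        self.counts[state] += 1  # assumes states are hashable / discretized
        return extrinsic + self.beta / sqrt(self.counts[state])
```
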
canjobear · almost 8 years ago
How does this relate to concepts like AIXI and Solomonoff induction?
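
For reference (quoted from memory, so treat it as a sketch): Hutter's AIXI also ranks actions by aggregating over imagined futures up to a horizon m, but it weights each future by the Solomonoff prior over the programs q (run on a universal machine U) that would generate those observations, rather than by path entropy:

```latex
a_k = \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
      \left( r_k + \cdots + r_m \right)
      \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```

So the family resemblance is that both score actions through simulated futures; they differ in the weighting: simplicity of the generating environment versus diversity of reachable paths.
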