There is some work that explores this from an AI perspective with some success [1]. It turns out that you can create quite intelligent agents if they act to maximize future entropy [2]. It's still quite computationally expensive, but the usual search-reduction tricks apply and you can get something computationally feasible.<p>[1] <a href="https://arxiv.org/abs/1310.1863" rel="nofollow">https://arxiv.org/abs/1310.1863</a><p>[2] <a href="https://www.mdpi.com/1099-4300/16/5/2789" rel="nofollow">https://www.mdpi.com/1099-4300/16/5/2789</a>
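To make the idea concrete, here is a toy sketch (my own illustration, not the cited papers' algorithm): a grid agent greedily picks the action whose successor state has the most distinct reachable states within a fixed horizon. Under a uniform distribution over reachable states, entropy is log of their count, so maximizing the count is a crude proxy for maximizing future entropy. The grid encoding and function names are assumptions for illustration.

```python
# Toy entropy-greedy agent on a grid. States are (x, y) cells; `walls`
# is a set of blocked cells.

MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def reachable(state, steps, walls):
    """Set of cells reachable from `state` in at most `steps` moves."""
    frontier = {state}
    seen = {state}
    for _ in range(steps):
        nxt = set()
        for (x, y) in frontier:
            for dx, dy in MOVES:
                cell = (x + dx, y + dy)
                if cell not in walls and cell not in seen:
                    nxt.add(cell)
        seen |= nxt
        frontier = nxt
    return seen

def entropy_greedy_action(state, steps, walls):
    """Pick the move that keeps the largest number of future states open."""
    def score(move):
        nxt = (state[0] + move[0], state[1] + move[1])
        if nxt in walls:
            return -1  # illegal move: no future states at all
        return len(reachable(nxt, steps, walls))
    return max(MOVES, key=score)
```

Run on a walled 5x5 grid, an agent in a corner moves toward the open interior, which matches the intuition that entropy maximization looks like "keeping options open".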
This topic is extremely interesting, and it's good to see that experiments support it.<p>Through an AI lens: in a way, you can compare this to novelty seeking and intelligent exploration, which is quite an active field in Artificial Life and game AI [1].
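The core of novelty search (the Lehman and Stanley line of work cited below) can be sketched in a few lines: an individual's score is not its fitness but its mean distance to the k nearest behaviours seen so far, so the search rewards being different rather than being better. The 2-D behaviour encoding and names here are my illustrative assumptions.

```python
# Minimal novelty score in the spirit of novelty search: reward
# behaviours that are far from everything in the archive.

def novelty(behavior, archive, k=3):
    """Mean Euclidean distance to the k nearest behaviours in the archive."""
    if not archive:
        return float("inf")  # first individual is maximally novel
    dists = sorted(
        ((behavior[0] - b[0]) ** 2 + (behavior[1] - b[1]) ** 2) ** 0.5
        for b in archive
    )
    return sum(dists[:k]) / min(k, len(dists))
```

A search loop would then select individuals by `novelty(...)` and append sufficiently novel behaviours to the archive.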
If you find this interesting: Jeff Clune, Kenneth Stanley, and Joel Lehman have done interesting related research.<p>Also, isn't this somehow related to Karl Friston's Free Energy Principle, if you look at entropy maximization as a way to minimize surprise?<p>[1] <a href="https://arxiv.org/abs/1901.10995" rel="nofollow">https://arxiv.org/abs/1901.10995</a>
Are we sure the availability of options equals entropy? It doesn't appear as though we all act simply to increase our options. Preferring options over reward may also constitute delayed gratification and sacrifice, which is another interesting can of worms, but can it really be captured as a bare preference for options over reward when those are the only two artificial options on offer?<p>Human behavior appears to point towards maximizing current order as an investment in the power/potential to drive future entropy, as opposed to simply maximizing entropy. This is the difference between building a nuclear bomb and keeping it, as opposed to building the bomb to use it. When one was used, it was meant to end a war, not start one. And success in life may as well be defined by hoarding order, be it technological, financial, social, or just objects. The pyramids were a feat in lowering entropy, not increasing it. And we love our diamonds.<p>This is also an extrapolation from the evidence in biology that energy entering a system increases order and contributes to the orderly structuring of matter, and hence life [1].<p>[1] <a href="https://www.quantamagazine.org/a-new-thermodynamics-theory-of-the-origin-of-life-20140122/" rel="nofollow">https://www.quantamagazine.org/a-new-thermodynamics-theory-o...</a>
When my son has asked me what he should do about courses or employment or even vacation choices, I answer that he ought to choose the path that gives him the most choices.<p>He has thanked me many times for that advice, which has resulted in a high-value path for him.
On an intuitive level, this makes perfect sense. Assuming that entropy is, roughly speaking, novelty, that makes the calculation one about exploring new options for utility gains.<p>I recall seeing a study (although not where) suggesting novelty seeking was a key hallmark of intelligence. Maybe this means the entropy-utility calibration drives their intelligence? (Alongside their actual material circumstances)
Also see Causal Entropic Forces [1] by A. D. Wissner-Gross [2] and C. E. Freer<p>[1] <a href="https://www.alexwg.org/publications/PhysRevLett_110-168702.pdf" rel="nofollow">https://www.alexwg.org/publications/PhysRevLett_110-168702.p...</a><p>[2] <a href="https://www.alexwg.org/" rel="nofollow">https://www.alexwg.org/</a>
aka Mobility heuristics<p>A very strong heuristic that works well in many games (the exceptions are usually very interesting games) and is the root of other heuristic concepts such as piece value, central positioning, and the "protected king" in Chess, and of similar concepts in, e.g., StarCraft.<p>It's also very easy to implement: for discrete turn-based games it's just the number of legal moves in a given state.<p><a href="http://ggp.stanford.edu/lectures/heuristics.pdf" rel="nofollow">http://ggp.stanford.edu/lectures/heuristics.pdf</a>
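The "just count the moves" version can be sketched generically; `legal_moves` and `apply_move` here are stand-ins I've assumed for whatever move generator and transition function a given game provides.

```python
# Mobility heuristic for a generic discrete turn-based game:
# a state's value is simply how many legal moves it offers.

def mobility(state, legal_moves):
    """Count the legal moves available in `state`."""
    return len(list(legal_moves(state)))

def best_move(state, legal_moves, apply_move):
    """Greedy one-ply search: pick the move leading to the most options."""
    return max(
        legal_moves(state),
        key=lambda m: mobility(apply_move(state, m), legal_moves),
    )
```

For example, in a toy game on the number line 0..4 with moves of +/-1, the heuristic steers the player away from the edges, where fewer moves are available.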
Isn't it pretty much the optimal behavior, as evidenced e.g. by multi-armed bandit algorithms and the explore-exploit balance in reinforcement learning?
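One concrete instance of that balance is UCB1, a classic multi-armed-bandit rule: the exploration bonus shrinks as an arm is sampled, so the policy starts out exploratory (keeping options open) and drifts toward exploitation. A minimal sketch, with names of my choosing:

```python
import math

# UCB1 arm selection. `counts[a]` is how often arm a was played,
# `totals[a]` its accumulated reward, `t` the total number of plays so far.

def ucb1_pick(counts, totals, t):
    """Pick the arm maximising mean reward plus an exploration bonus."""
    for arm, n in enumerate(counts):
        if n == 0:  # play every arm once before trusting the formula
            return arm
    return max(
        range(len(counts)),
        key=lambda a: totals[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]),
    )
```

With equal play counts the bonuses cancel and the better-performing arm wins; an unplayed arm is always tried first.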
"several studies have shown that individuals demonstrate a preference for choice, or the availability of multiple options, over and above utilitarian value." -> yes, it is called the need for orientation/control, and "utilitarian value" has nothing to do with it -> Index Funds vs. Actively-Managed Funds -> people prefer the latter even when the returns are consistently lower. [1]<p>"Yet we lack a decision-making framework that integrates preference for choice with traditional utility maximisation in free choice behaviour." -> utility maximisation "has charm for economists, but it rests on the shaky foundation of an implausible and untestable assumption" - Daniel Kahneman [2] -> TL;DR the author of "Thinking, Fast and Slow" argues it is false<p>"We found that participants were biased towards states that kept their options open, even when both states were balanced in the total number of goal locations. This bias was evident not only when both contexts were equally valuable but throughout all value conditions..." AND "Participants were not informed of the precise values ..." -> seeing the utilitarian variable forced onto the conclusions is disheartening<p>[1] <a href="https://www.thebalance.com/index-funds-vs-actively-managed-funds-2466445" rel="nofollow">https://www.thebalance.com/index-funds-vs-actively-managed-f...</a>
[2] <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=870494" rel="nofollow">https://papers.ssrn.com/sol3/papers.cfm?abstract_id=870494</a>