Unless an application is pretty poorly written, anything that is safe to self-tune isn't a configuration option in the first place. The crux of the issue is that the system/application doesn't have enough perspective to understand if it should self-tune one direction or the other.<p>For example, if an app runs out of file descriptors, is that happening because of normal conditions or is something wrong? Increasing the max blindly until the issue goes away is rarely the right answer.<p>Each self-tuning app would have to have logic more complex than the business logic itself to understand how it will interact in the environment it's running in (other apps, hardware, expected traffic bursts, etc).<p>This entire article is pretty shallow on the things it attacks. Take the following: "Why does the JVM still need messing around GC settings to get acceptable performance/memory usage?"<p>The answer to this should be obvious to anyone that has developed high-performance Java apps. There is no possible way for the garbage collector to understand what your application is doing to guess the optimal times to interrupt and collect. Unless your app is a tiny state machine, the garbage collector trying to self-tune based on runtime behavior is going to make your performance worthlessly unpredictable.
A robotic analogue is tuning PID loops. There exist loads of literature on the subject, but it turns out that it's quite hard, despite the relative simplicity of the control law [1]. It's the underlying dynamics that result in no golden rule of PID tuning if you start the system from a random state.<p>The whole computer stack is similarly complex, but higher dimensional, and it's not clear what metric you would tune against (it would be application specific). Not that I think it's impossible, lots of regions of computer science are self-adaptive (TCP, splay tree). It's just that the ensemble is a mega chaotic space. Maybe someone will hook up a DBN and skynet it soon.<p>[1] <a href="http://en.wikipedia.org/wiki/PID_controller#Loop_tuning" rel="nofollow">http://en.wikipedia.org/wiki/PID_controller#Loop_tuning</a>
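To make the "relatively simple control law" above concrete, here is a minimal sketch of a discrete PID step in Python. The class name, the toy first-order plant (dx/dt = -x + u), and the gain values are illustrative assumptions, not taken from the linked article.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook PID controller; kp, ki, kd are the gains you have to tune."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, setpoint: float, measured: float, dt: float) -> float:
        error = setpoint - measured
        self.integral += error * dt                   # accumulated error (I term)
        derivative = (error - self.prev_error) / dt   # error slope (D term)
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(pid: PID, setpoint: float = 1.0, steps: int = 2000, dt: float = 0.01) -> float:
    """Drive a hypothetical plant dx/dt = -x + u with Euler steps; return the final x."""
    x = 0.0
    for _ in range(steps):
        u = pid.step(setpoint, x, dt)
        x += (-x + u) * dt
    return x
```

The law itself is three terms, but the outcome depends entirely on the gains and the plant: on this toy plant, `PID(2.0, 1.0, 0.0)` settles at the setpoint, while `PID(2.0, 0.0, 0.0)` (no integral term) stalls at about two thirds of it.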
There are a few reasons why this is hard:<p>* Lack of information -- in order to pick the right parameters, the system needs to know what you're trying to achieve (e.g., do you want to minimize latency or maximize throughput?). Communicating that more effectively than just specifying parameters is challenging.<p>* Instead of telling the system what you want, it can try to figure it out -- to <i>some</i> extent -- by observing program behavior and dynamically fiddling with values. The problem with this approach is that it creates feedback loops that muddle things beyond repair.<p>* Optimization over a high dimensional space is hard. As an example (even with just one dimension), see this bit from Doug Lea's talk about why dynamic tuning of spinning vs. blocking on concurrency constructs doesn't work: <a href="https://youtu.be/sq0MX3fHkro?t=39m48s" rel="nofollow">https://youtu.be/sq0MX3fHkro?t=39m48s</a><p>* Writing systems that never reboot[1] <i>and</i> are efficient is very hard. Dynamic variables usually require inserting checks at runtime that can't be optimized away by an AOT compiler. This is where a speculatively-optimizing JIT comes in handy, but JITs aren't appropriate for all uses, and even where they are, good optimizing JITs are notoriously hard to write.<p>--<p>> Why does the JVM still need messing around GC settings to get acceptable performance/memory usage?<p>OpenJDK's HotSpot does fairly reasonable auto-tuning now (known as "GC ergonomics"). You can pick either the throughput collector and <i>maybe</i> set your young-gen ratio, or G1 and set a maximum-pause goal. Along with the maximum heap size, those are just three values, one of which is binary, another is often unnecessary and the third can be a very rough estimation. This should be more than enough for the vast majority of systems, certainly with Java 8.
Many of the GC tuning parameters you see in the wild are old remnants from before GC ergonomics that people are afraid to pull out.<p>[1]: <a href="http://steve-yegge.blogspot.co.il/2007/01/pinocchio-problem.html" rel="nofollow">http://steve-yegge.blogspot.co.il/2007/01/pinocchio-problem....</a>
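The "three values" described above can be sketched as a small Python helper that builds the HotSpot command-line flags. The function name, its parameters, and the 200 ms pause goal used as a default are illustrative assumptions; the flags themselves (`-Xmx`, `-XX:+UseParallelGC`, `-XX:NewRatio`, `-XX:+UseG1GC`, `-XX:MaxGCPauseMillis`) are standard HotSpot options.

```python
from typing import List, Optional


def gc_flags(max_heap: str,
             low_pause: bool,
             pause_goal_ms: int = 200,
             new_ratio: Optional[int] = None) -> List[str]:
    """Build a minimal set of HotSpot GC flags (hypothetical helper).

    max_heap      -- rough estimate of the maximum heap size, e.g. "4g"
    low_pause     -- the binary choice: G1 (True) or throughput collector (False)
    pause_goal_ms -- maximum-pause goal, only used with G1
    new_ratio     -- optional old/young generation ratio, only used with
                     the throughput collector and often unnecessary
    """
    flags = [f"-Xmx{max_heap}"]
    if low_pause:
        flags += ["-XX:+UseG1GC", f"-XX:MaxGCPauseMillis={pause_goal_ms}"]
    else:
        flags.append("-XX:+UseParallelGC")
        if new_ratio is not None:
            flags.append(f"-XX:NewRatio={new_ratio}")
    return flags
```

For example, `gc_flags("4g", True, 100)` yields the heap limit, the G1 switch, and a 100 ms pause goal; everything else is left to the collector's ergonomics.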
Wasn't there a Linux kernel contributor with a medical background who used to push unsuccessfully for it to have more self-tuning/homeostatic behaviour? (It wasn't Greg Kroah-Hartman, was it?)
There are lots of self-tuning systems in industrial control now. Some of the theory overlaps with machine learning. The theory is really hard, so hard that control theory PhDs are struggling to decide what math to learn.<p>I get IEEE Control Systems Technology magazine, but I don't understand most of it anymore.
You'd need some way to express preferences, e.g., utility functions.<p>If it's obvious ("avoiding people on the road is better than getting there in time") it will be in the system.<p>If it's not obvious ("want throughput and latency") you're back at square one.
I think the author does a great job at venting frustration at the state of autotuning systems research, though I would disagree that the research interest has dried up. On the contrary, autotuning research is alive and kicking; the problem is that there have been few attempts to unify all the competing systems that exist (with some exceptions [1]). As such, the state of autotuning is fragmented, with no one approach able to achieve the critical mass needed to hit the mainstream.<p>Disclaimer: I'm doing a PhD in autotuning ;-)<p>[1] <a href="http://ctuning.org/" rel="nofollow">http://ctuning.org/</a>
"self-tuning" is alive in activity metering<p><a href="http://www.autoletics.com/posts/managing-performance-analysis-complexity-adaptive-hotspot-measurement" rel="nofollow">http://www.autoletics.com/posts/managing-performance-analysi...</a><p>and execution control of systems using adaptive control valves & QoS<p><a href="https://vimeo.com/groups/sentris" rel="nofollow">https://vimeo.com/groups/sentris</a><p><a href="http://www.autoletics.com/wp-content/uploads/2014/07/UsingSystemDynamicsforEffectiveConcurrencyConsumptionControlofCode.pdf" rel="nofollow">http://www.autoletics.com/wp-content/uploads/2014/07/UsingSy...</a><p><a href="http://www.autoletics.com/wp-content/uploads/2014/07/AdaptivelyControllingApacheCassandraClientRequestProcessing.pdf" rel="nofollow">http://www.autoletics.com/wp-content/uploads/2014/07/Adaptiv...</a><p>self-adaptation (self-tuning being a sub category) needs to be both online & offline in reality<p><a href="http://www.autoletics.com/posts/iterative-application-performance-benchmark-analysis" rel="nofollow">http://www.autoletics.com/posts/iterative-application-perfor...</a>
Self-tuning systems are inherently complex and opaque. Execution tends to be non-deterministic and irreproducible, and so the burden of testing is far greater.<p>It takes some discipline to write elegant, loosely coupled code that is self-tuning. I work on scientific software, where reproducibility and reusability are (…or at least <i>should be</i>) paramount, and where black boxes are evil. It's hard not to build black boxes when you're writing self-tuning software.
> <i>Why does the JVM still need messing around GC settings to get acceptable performance/memory usage?</i><p>But the JIT <i>is a</i> self-tuning system! Here you are.
A tangentially related optimization:<p>When you read a value from a collection, move it towards the front.<p>Requires that order not be important, of course. Ditto, it's more trouble than it's generally worth if you're doing any sort of multithreading, or if you're using something where writes are much slower than reads.<p>It's also handy for most hash tables. (Though generally you should be using a Cuckoo hashtable anyways)
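The read-then-promote idea above can be sketched as a self-organizing list. This is a minimal single-threaded sketch using the move-to-front variant (a hit jumps straight to position 0; moving it only one step forward would be another option); the class and method names are hypothetical.

```python
from typing import Any, Iterable, List, Optional


class MoveToFrontList:
    """Unordered collection where each successful lookup promotes the item."""

    def __init__(self, items: Iterable[Any] = ()) -> None:
        self._items: List[Any] = list(items)

    def find(self, key: Any) -> Optional[Any]:
        """Linear search; on a hit, move the item to the front and return it."""
        for i, item in enumerate(self._items):
            if item == key:
                if i:  # already at the front: skip the write entirely
                    self._items.insert(0, self._items.pop(i))
                return item
        return None  # miss: the order is left untouched

    def items(self) -> List[Any]:
        """Return a copy of the current order, mainly for inspection."""
        return list(self._items)
```

Frequently read values drift to the front, so later scans for them end sooner. Note that every hit away from the front becomes a write, which is exactly why the caveats about multithreading and slow writes apply.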
<a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.107.8842" rel="nofollow">http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.107....</a><p>Self-tuning databases from MS Research
There are optimizers for JVM flags. One of them is called Groningen, and it will optimize your GC parameters for throughput or pause duration or other custom goals that you provide. I believe it is a genetic algorithm.