I was reading about .NET Core's tiered compilation model and the two requirements to go from Tier 0 to Tier 1 caught my eye.

> The method needs to be called at least 30 times, as measured by the call counter, and this gives us a rough notion that the method is 'hot'. The number 30 was derived with a small amount of early empirical testing but there hasn't been a large amount of effort applied in checking if the number is optimal. We assumed that both the policy and the sample benchmarks we were measuring would be in a state of flux for a while to come so there wasn't much reason to spend a lot of time finding the exact maximum of a shifting curve. As best we can tell there is also not a steep response between changes in this value and changes in the performance of many scenarios. An order of magnitude should produce a notable difference but +-5 can vanish into the noise.

> At startup a timer is initiated with a 100ms timeout. If any Tier0 jitting occurs while the timer is running then it is reset. If the timer completes without any Tier0 jitting then, and only then, is call counting allowed to commence. This means a method could be called 1000 times in the first 100ms, but the timer will still need to expire and have the method called 30 more times before it is eligible for Tier1. The reason for the timer is to measure whether or not Tier0 jitting is still occurring, which is a heuristic to measure whether or not the application is still in its startup phase. Before adding the timer we observed that both the call counter and background threads compiling Tier1 code versions were slowing down the foreground threads trying to complete startup, and this could result in losing all the startup performance wins from Tier0 jitting. By delaying until after 'startup' the Tier0 code is left running longer, but that was nearly always a better performing outcome than trying to replace it with Tier1 code too eagerly.

I understand that a JIT can optimize a running program to produce even more performant code, but in this case the two parameters weren't tuned much once a good set of values was found. That makes sense from the programmer's perspective, but it seems to me that these could be tuned dynamically on a per-CPU basis, depending on factors like how good a user's cooling solution is and whether or not they won the silicon lottery. (I've put a rough sketch of how I read the promotion heuristic at the end of this comment.)

So in this case it does not appear that any machine learning was used to tune these parameters, but an "optimizing ML entity" that had access to these knobs could potentially use these and other parameters to tune a given program for your computer in a much more intimate way than is currently possible. Is that something that is done with JITs in general? Or is the performance tuning more rudimentary than that?
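For my own understanding, here is a minimal sketch of the promotion heuristic as I read it from the quoted docs. This is not the actual CoreCLR implementation (which I believe does the counting with generated call-counting stubs and a background Tier1 compile queue); all the names and structure below are invented for illustration.

    // Sketch only -- NOT CoreCLR code. All names here are invented.
    #include <atomic>
    #include <chrono>

    struct MethodEntry {
        std::atomic<int> callCount{0};
        bool promotedToTier1 = false;
    };

    class TieredPolicy {
        static constexpr int kCallCountThreshold = 30;                       // "called at least 30 times"
        static constexpr std::chrono::milliseconds kStartupQuietPeriod{100}; // the 100ms timer

        std::chrono::steady_clock::time_point lastTier0Jit_ = std::chrono::steady_clock::now();
        bool countingEnabled_ = false;

    public:
        // Any Tier0 jitting "resets the timer": remember when it last happened.
        void onTier0Jit() { lastTier0Jit_ = std::chrono::steady_clock::now(); }

        // Counting only starts once 100ms pass with no Tier0 jitting, i.e. the
        // app looks like it has finished its startup phase.
        void tick() {
            if (!countingEnabled_ &&
                std::chrono::steady_clock::now() - lastTier0Jit_ >= kStartupQuietPeriod) {
                countingEnabled_ = true;
            }
        }

        // Called on each invocation of a Tier0 method body; returns true when
        // the method should be queued for an optimized Tier1 recompile. Calls
        // made before counting is enabled are simply not counted, which matches
        // the "1000 calls in the first 100ms still need 30 more" behavior.
        bool onCall(MethodEntry& m) {
            if (!countingEnabled_ || m.promotedToTier1) return false;
            if (++m.callCount >= kCallCountThreshold) {
                m.promotedToTier1 = true;
                return true;
            }
            return false;
        }
    };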
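For what it's worth, if I remember the knob names correctly (they may differ across runtime versions, and older runtimes used the COMPlus_ prefix instead of DOTNET_), both thresholds quoted above are exposed as runtime configuration, so anyone curious can at least experiment with them by hand:

    DOTNET_TC_CallCountThreshold=60 DOTNET_TC_CallCountingDelayMs=250 dotnet run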