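Graal's behavior is a lot more sane: both variants that double inside the loop take about the same time, and the one that doubles once outside the loop is fastest.<p><pre><code> graal: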
[info] SoFlow.square_i_two 10000 avgt 10 5338.492 ± 36.624 ns/op // 2 *\sum i * i
[info] SoFlow.two_i_ 10000 avgt 10 6421.343 ± 34.836 ns/op // \sum 2 * i * i
[info] SoFlow.two_square_i 10000 avgt 10 6367.139 ± 34.575 ns/op // \sum 2 * (i * i)
regular 1.8:
[info] SoFlow.square_i_two 10000 avgt 10 6393.422 ± 27.679 ns/op
[info] SoFlow.two_i_ 10000 avgt 10 8870.908 ± 35.715 ns/op
[info] SoFlow.two_square_i 10000 avgt 10 6221.205 ± 42.408 ns/op
</code></pre>
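For readers without the benchmark source at hand, here is a rough sketch of the three loop shapes, reconstructed only from the benchmark names and the formulas in the comments above. The class name `SoFlowSketch`, the method names, and the loop bounds are assumptions, and the JMH harness is left out.

```java
// Hypothetical reconstruction of the three loop shapes; not the original benchmark source.
final class SoFlowSketch {

    // 2 * \sum i * i: square inside the loop, double once at the end.
    static int squareITwo(int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i * i;
        }
        return 2 * sum;
    }

    // \sum 2 * i * i: Java parses this as (2 * i) * i, so the doubling comes before the multiply.
    static int twoI(int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += 2 * i * i;
        }
        return sum;
    }

    // \sum 2 * (i * i): square first, then double, on every iteration.
    static int twoSquareI(int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += 2 * (i * i);
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 10000; // matches the parameter column in the results above
        System.out.println(squareITwo(n));
        System.out.println(twoI(n));
        System.out.println(twoSquareI(n));
    }
}
```

All three return the same value even when the sum overflows, because Java `int` arithmetic wraps modulo 2^32, so the only difference is the work done per iteration.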
The graal-generated assembly for the two_i_ and two_square_i cases is nearly identical, featuring unrolled repetitions of sequences like<p><pre><code> [info] 0x000000011433ec03: mov %r8d,%ecx
[info] 0x000000011433ec06: shl %ecx ;*imul {reexecute=0 rethrow=0 return_oop=0}
[info] ; - add.SoFlow::test_two_i_@15 (line 41)
[info] 0x000000011433ec08: imul %r8d,%ecx ;*imul {reexecute=0 rethrow=0 return_oop=0}
[info] ; - add.SoFlow::test_two_i_@17 (line 41)
[info] 0x000000011433ec0c: add %ecx,%r9d ;*iadd {reexecute=0 rethrow=0 return_oop=0}
[info] ; - add.SoFlow::test_two_i_@18 (line 41)
[info] 0x000000011433ec0f: lea 0x5(%r11),%r8d ;*iinc {reexecute=0 rethrow=0 return_oop=0}
[info] ; - add.SoFlow::test_two_i_@20 (line 40)
</code></pre>
while the square_i_two case has no shl inside the loop body and does a single shl at the end.<p><pre><code> [info] 0x000000010e2918bb: imul %r8d,%r8d ;*imul {reexecute=0 rethrow=0 return_oop=0}
[info] ; - add.SoFlow::test_square_i_two@15 (line 32)
[info] 0x000000010e2918bf: add %r8d,%ecx ;*iadd {reexecute=0 rethrow=0 return_oop=0}
[info] ; - add.SoFlow::test_square_i_two@16 (line 32)
[info] 0x000000010e2918c2: lea 0x3(%r11),%r8d ;*iinc {reexecute=0 rethrow=0 return_oop=0}
[info] ; - add.SoFlow::test_square_i_two@18 (line 31)
</code></pre>
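To make the "unrolled body, single shl at the end" shape concrete, here is a hand-written Java rendering of roughly what that loop does. It is only an illustration: the unroll factor of 4, the class name `HoistedShiftSketch`, and both method names are assumptions, not something read off the listing.

```java
// Hypothetical hand-unrolled version of the square_i_two loop; the unroll factor is assumed.
final class HoistedShiftSketch {

    static int unrolledSquareITwo(int n) {
        int sum = 0;
        int i = 0;
        // Main loop: four imul/add pairs per trip, no shift inside the body.
        for (; i <= n - 4; i += 4) {
            sum += i * i;
            sum += (i + 1) * (i + 1);
            sum += (i + 2) * (i + 2);
            sum += (i + 3) * (i + 3);
        }
        // Tail loop for the remaining 0 to 3 iterations.
        for (; i < n; i++) {
            sum += i * i;
        }
        // The doubling happens exactly once, as a left shift by one.
        return sum << 1;
    }

    // Reference version that doubles on every iteration.
    static int doublePerIteration(int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += 2 * i * i;
        }
        return sum;
    }
}
```

Moving the doubling out of the loop is safe here because a left shift distributes over wrapping `int` addition, so both methods agree for every `n`.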
Both graal and C2 inline, but as usual the graal output is a lot more comprehensible.