Most of the performance difference between string concatenation and the builder in your example is explained by memory allocation. Every iteration of the concat loop results in a new allocation, while the builder, which uses a []byte internally, only allocates when length equals capacity, and the newly allocated slice is roughly twice the capacity of the old one (see: <a href="https://golang.org/src/strings/builder.go?#L62" rel="nofollow">https://golang.org/src/strings/builder.go?#L62</a>).<p>Therefore, 500,000 rounds of concat means about 500,000 allocations, while 200,000,000 rounds of builder means only about 27.5 allocations (=log2(200000000)).<p>I would suggest a different benchmark to approximate real-world usage:<p><pre><code> func BenchmarkConcatString(b *testing.B) {
    for n := 0; n < b.N; n++ {
        var str string
        str += "x"
        str += "y"
        str += "z"
    }
}
func BenchmarkConcatBuilder(b *testing.B) {
    for n := 0; n < b.N; n++ {
        var builder strings.Builder
        builder.WriteString("x")
        builder.WriteString("y")
        builder.WriteString("z")
        builder.String()
    }
}
</code></pre>
Which still shows a significant performance advantage for using builder (-40% ns/op):<p><pre><code> BenchmarkConcatString-4 20000000 93.5 ns/op
BenchmarkConcatBuilder-4 30000000 54.6 ns/op</code></pre>
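The allocation argument can be checked directly rather than inferred from timings. The following is a minimal sketch, assuming testing.AllocsPerRun is an acceptable way to count allocations outside of go test; the helper names concatAllocs and builderAllocs and the loop sizes are my own choices for illustration, not part of the original benchmark.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// sink keeps results alive so the compiler cannot discard the work.
var sink string

// concatAllocs reports the average number of allocations needed to
// append "x" to a string n times using +=.
func concatAllocs(n int) float64 {
	return testing.AllocsPerRun(100, func() {
		var s string
		for i := 0; i < n; i++ {
			s += "x"
		}
		sink = s
	})
}

// builderAllocs does the same work through strings.Builder and pulls
// the final string out, so both variants produce the same result.
func builderAllocs(n int) float64 {
	return testing.AllocsPerRun(100, func() {
		var sb strings.Builder
		for i := 0; i < n; i++ {
			sb.WriteString("x")
		}
		sink = sb.String()
	})
}

func main() {
	for _, n := range []int{10, 100, 1000} {
		fmt.Printf("n=%d concat=%.0f builder=%.0f allocs\n",
			n, concatAllocs(n), builderAllocs(n))
	}
}
```

The concat count should grow roughly linearly with n, while the builder count grows much more slowly, which is the pattern the explanation above predicts. Exact numbers depend on the Go version.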
I would mention that gc (the official Go compiler) makes a special optimization for the string concatenation operator (+). If the number of strings to be concatenated is known at compile time, using + to concatenate them is the most efficient approach.<p><pre><code> package a
import "testing"
import "strings"
var strA, strB string
var x, y, z = "x", "y", "z"
func BenchmarkConcatString(b *testing.B) {
    for n := 0; n < b.N; n++ {
        strA = x + y + z
    }
}
func BenchmarkConcatBuilder(b *testing.B) {
    for n := 0; n < b.N; n++ {
        var builder strings.Builder
        builder.WriteString(x)
        builder.WriteString(y)
        builder.WriteString(z)
        strB = builder.String()
    }
}
</code></pre>
Result:<p><pre><code> goos: linux
goarch: amd64
BenchmarkConcatString-2 20000000 83.7 ns/op
BenchmarkConcatBuilder-2 20000000 102 ns/op</code></pre>
String benchmarks are so broken.<p>The way he uses b.N is wrong. b.N is different for different loops, so he's e.g. timing 100 iterations of string '+' against 1000 iterations of builder.WriteString().<p>Also, the compiler can completely eliminate no-op functions (those without side effects), so in benchmarks it's a good idea to assign the value being calculated to e.g. a global variable.<p>The corrected code is: <a href="https://gist.github.com/kjk/6a7d7135ae1e5fa6cd1f0db23d2eaf4d" rel="nofollow">https://gist.github.com/kjk/6a7d7135ae1e5fa6cd1f0db23d2eaf4d</a><p>An example of correct benchmarking:<p><pre><code> func BenchmarkConcatString(b *testing.B) {
    for n := 0; n < b.N; n++ {
        var str string
        for i := 0; i < 100; i++ {
            str += "x"
        }
        gStr = str
    }
}
</code></pre>
After the fixes it paints a significantly different picture:<p><pre><code> go test -bench=. -benchmem
goos: darwin
goarch: amd64
BenchmarkConcatString-8 300000 5148 ns/op 5728 B/op 99 allocs/op
BenchmarkConcatBuffer-8 1000000 1046 ns/op 368 B/op 3 allocs/op
BenchmarkConcatBuilder-8 1000000 1177 ns/op 248 B/op 5 allocs/op</code></pre>
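For comparison, here is a sketch of what a builder counterpart to the BenchmarkConcatString example above might look like, following the same two rules: a fixed amount of work per b.N iteration and a global sink. This is not the code from the gist; the helpers concatLoop and buildLoop are names I made up so that both variants can be checked for producing the same string.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// gStr is a package-level sink; assigning to it stops the compiler
// from treating the loop body as dead code.
var gStr string

// concatLoop appends "x" n times with +=.
func concatLoop(n int) string {
	var str string
	for i := 0; i < n; i++ {
		str += "x"
	}
	return str
}

// buildLoop appends "x" n times with strings.Builder.
func buildLoop(n int) string {
	var sb strings.Builder
	for i := 0; i < n; i++ {
		sb.WriteString("x")
	}
	return sb.String()
}

// Both benchmarks do exactly 100 appends per b.N iteration, so the
// ns/op figures describe the same amount of work.
func BenchmarkConcatString(b *testing.B) {
	for n := 0; n < b.N; n++ {
		gStr = concatLoop(100)
	}
}

func BenchmarkConcatBuilder(b *testing.B) {
	for n := 0; n < b.N; n++ {
		gStr = buildLoop(100)
	}
}

func main() {
	testing.Init() // needed when running benchmarks outside "go test"
	benches := []struct {
		name string
		fn   func(*testing.B)
	}{
		{"ConcatString", BenchmarkConcatString},
		{"ConcatBuilder", BenchmarkConcatBuilder},
	}
	for _, bm := range benches {
		r := testing.Benchmark(bm.fn)
		fmt.Printf("%-14s %s %s\n", bm.name, r.String(), r.MemString())
	}
}
```

In a normal project the two Benchmark functions would live in a _test.go file and be run with go test -bench=. -benchmem; the main function here only exists to make the sketch runnable on its own.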
While I don't doubt that strings.Builder is quicker than += concat for many iterations, to make it a fair comparison you probably need to pull out the string at the end rather than just writing to the buffer. It's also not obvious, for example, what the difference is with just two strings, if I need to join two strings together 40 trillion times or whatnot.<p>Nice collection of microbenchmarks though. Interesting to see magnitude differences from e.g. regexp compile.
Fun fact: the crypto rand "number" benchmark depends on the number you pass into it:<p><pre><code> BenchmarkCryptoRand27-8 5000000 388 ns/op
BenchmarkCryptoRand28-8 3000000 356 ns/op
BenchmarkCryptoRand29-8 5000000 335 ns/op
BenchmarkCryptoRand30-8 5000000 327 ns/op
BenchmarkCryptoRand31-8 5000000 331 ns/op
BenchmarkCryptoRand32-8 5000000 322 ns/op
BenchmarkCryptoRand33-8 3000000 480 ns/op
BenchmarkCryptoRand34-8 3000000 474 ns/op
</code></pre>
for benchmarks like<p><pre><code> func BenchmarkCryptoRand32(b *testing.B) {
    for n := 0; n < b.N; n++ {
        _, err := crand.Int(crand.Reader, big.NewInt(32))
        if err != nil {
            panic(err)
        }
    }
}
</code></pre>
This is because the crypto/rand library is very very careful to give you unbiased random numbers.
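The shape of those timings fits a simple cost model. As far as I understand it, crypto/rand.Int draws just enough random bits to cover max-1 and retries whenever the candidate is not below max. The sketch below is that model only, not the library's code; expectedDraws is a hypothetical helper, and it assumes max is greater than zero.

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
)

// expectedDraws estimates how many candidates a mask-and-reject
// sampler needs on average to produce one value in [0, max).
// Assumption: candidates are uniform over [0, 2^k), where k is the
// bit length of max-1, and are rejected when they are >= max.
func expectedDraws(max uint64) float64 {
	k := bits.Len64(max - 1)     // bits needed to represent max-1
	space := math.Ldexp(1, k)    // 2^k possible candidates
	return space / float64(max)  // 1 / P(accept)
}

func main() {
	for max := uint64(27); max <= 34; max++ {
		fmt.Printf("max=%d expected draws=%.3f\n", max, expectedDraws(max))
	}
}
```

Under this model a bound of 32 never rejects, 27 rejects 5 out of 32 candidates, and 33 needs a 6-bit candidate and rejects 31 out of 64, so it takes almost two draws on average. That lines up with the jump between BenchmarkCryptoRand32 and BenchmarkCryptoRand33.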
The string benchmark has the issue that the amount of work done varies with each pass through the loop, since the string just keeps getting appended to. A proper benchmark, like the ones in the comments here, does the same amount of work on every iteration.
Note that you can also get the number of bytes processed per second by calling the SetBytes method. This is very useful for some benchmarks (hashing, base64, ...):<p><pre><code> func benchmarkHash(b *testing.B, h hash.Hash) {
    data := make([]byte, 1024)
    rand.Read(data)
    b.ResetTimer()
    // SetBytes takes an int64, so len(data) must be converted.
    b.SetBytes(int64(len(data)))
    for n := 0; n < b.N; n++ {
        h.Write(data)
        h.Sum(nil)
    }
}</code></pre>
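To make the effect of SetBytes concrete, here is a small self-contained sketch that runs a helper like the one above against SHA-256 and recomputes the MB/s figure by hand. The throughputMBps helper is my own, it treats 1 MB as 10^6 bytes, and the input is an all-zero buffer to keep the example free of extra imports; I also added h.Reset() so every iteration hashes the same 1024 bytes.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"hash"
	"testing"
)

// benchmarkHash hashes a fixed 1 KiB buffer b.N times and tells the
// testing package how many bytes each iteration processes.
func benchmarkHash(b *testing.B, h hash.Hash) {
	data := make([]byte, 1024) // zero-filled; content does not matter here
	b.SetBytes(int64(len(data)))
	b.ResetTimer()
	for n := 0; n < b.N; n++ {
		h.Reset()
		h.Write(data)
		h.Sum(nil)
	}
}

// throughputMBps recomputes throughput from a result:
// bytes per iteration times iterations, divided by elapsed seconds.
func throughputMBps(r testing.BenchmarkResult) float64 {
	if r.T <= 0 {
		return 0
	}
	return float64(r.Bytes) * float64(r.N) / 1e6 / r.T.Seconds()
}

func main() {
	testing.Init() // needed when running benchmarks outside "go test"
	r := testing.Benchmark(func(b *testing.B) {
		benchmarkHash(b, sha256.New())
	})
	fmt.Printf("%s (recomputed: %.2f MB/s)\n", r.String(), throughputMBps(r))
}
```

Without the SetBytes call, r.Bytes stays zero and no MB/s column is reported.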
> The following benchmarks evaluate various functionality with the focus on real-world usage patterns.<p>I can't say I write much code that does one thing many times in a really tight loop. It would be a lot more interesting if the code combined multiple functions into the loop body in a better attempt to simulate "real-world usage patterns."
I've always wanted to ask this. I'm a full-stack developer with good knowledge of Java and JavaScript. I'm currently reading up on Golang, especially for its concurrency idioms. It is good, and it is easy to write concurrent code, but people always bring up actors and say they are much better than channels. I have never used actors before. What are your thoughts on this?
Even though this is clearly a benchmarking game, I don't like that it does not explain how the things benchmarked against each other sometimes have drastically different use cases.<p>I can assure you that someone is going to use these numbers to argue that crypto/rand needs to be replaced by math/rand BECAUSE SPEED, or that MD5 should be preferred over SHA2/3.
It's worth noting that the first number in a benchmark result is the number of loop iterations (b.N in for n := 0; n < b.N; n++) that Go ran to obtain the results.<p>The nanoseconds, bytes, and allocs per operation are the important part.
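The relationship between the columns can be seen through testing.BenchmarkResult, which stores totals and derives the per-operation figures by dividing by N. The sketch below uses totals that I reconstructed by multiplying the BenchmarkConcatString-8 line quoted earlier in the thread back out; they are for illustration only, and describe is a hypothetical helper.

```go
package main

import (
	"fmt"
	"testing"
	"time"
)

// describe spells out what each column of a benchmark line means.
func describe(r testing.BenchmarkResult) string {
	return fmt.Sprintf("iterations=%d ns/op=%d B/op=%d allocs/op=%d",
		r.N, r.NsPerOp(), r.AllocedBytesPerOp(), r.AllocsPerOp())
}

func main() {
	// Illustrative totals: 300000 iterations at 5148 ns, 5728 bytes
	// and 99 allocations each.
	r := testing.BenchmarkResult{
		N:         300000,
		T:         300000 * 5148 * time.Nanosecond,
		MemAllocs: 300000 * 99,
		MemBytes:  300000 * 5728,
	}
	fmt.Println(describe(r))
	// prints "iterations=300000 ns/op=5148 B/op=5728 allocs/op=99"
}
```

A larger N only means the framework needed more iterations to fill its time budget, which is why it says little on its own.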