String.intern()

145 points by dmit about 8 years ago

16 comments

Terr_ about 8 years ago
Every time I see String.intern() my mind leaps to the problem of new Java programmers who are misled into this:

    String a = "hello";
    String b = "world";
    assert a != b;
    b = "hello";
    assert a == b; // OH NICE I'LL USE == FOR STRING COMPARISONS NOW

It works because source-code literals are interned down into identical objects by the compiler, but that's a special case that won't apply to strings created at runtime.
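To make the pitfall concrete, here is a minimal sketch of the difference between literals and strings created at runtime. The class name `StringIdentityDemo` and the helper `sameObject` are made up for illustration; only `==`, `equals`, `new String(...)`, and `intern()` come from the language and the JDK.

```java
// Hypothetical demo class; the names are illustrative only.
final class StringIdentityDemo {

    // True only when both references point at the very same object.
    static boolean sameObject(String x, String y) {
        return x == y;
    }

    public static void main(String[] args) {
        String a = "hello";
        String b = "hello";             // same literal, so the same pooled object
        String c = new String("hello"); // explicitly allocates a new object at runtime

        System.out.println("a == b          : " + sameObject(a, b));
        System.out.println("a == c          : " + sameObject(a, c));
        System.out.println("a.equals(c)     : " + a.equals(c));
        System.out.println("a == c.intern() : " + sameObject(a, c.intern()));
    }
}
```

The comparison against `c` is the one that surprises people: the contents are equal, but the references differ until `c` is interned, which is why `equals` is the safe default for comparing string contents.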
derriz about 8 years ago
This is really an unscientific claim, but I ran a hand-crafted/hacked benchmark just to get a feeling for the numbers. For 5 to 35 character Strings, == is 20 to 40 times faster than String.equals().

Given that s1.equals(s2) if and only if s1.intern() == s2.intern() (assuming you haven't filled the string table), this looks like an opportunity for a significant optimization.

Before doing this, I had hoped that String.equals might check whether both were "interned" and, if so, shortcut the character-by-character comparison by just comparing references. But the results of my rough benchmark suggest this isn't what is happening, which would agree with the source provided for the String.equals method.

Java String comparison is absolutely ubiquitous, so I would have expected that an optimization like this might have been considered?

Having said that, the supplied rt.jar source also suggests that the String.hashCode() computation isn't cached/memoized. This strikes me as odd given that Strings are immutable and Strings are one of the most common key types for Maps.
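The equivalence mentioned above (s1.equals(s2) exactly when s1.intern() == s2.intern()) is also stated in the Javadoc for String.intern(). A small sketch of the two ways to compare, assuming hypothetical names `InternEqualityDemo`, `byEquals`, and `byInternedIdentity`:

```java
// Hypothetical demo class; the names are illustrative only.
final class InternEqualityDemo {

    // Content comparison, the usual way.
    static boolean byEquals(String s1, String s2) {
        return s1.equals(s2);
    }

    // Reference comparison on the canonical (interned) instances.
    // Note that each call pays for two string-table lookups.
    static boolean byInternedIdentity(String s1, String s2) {
        return s1.intern() == s2.intern();
    }

    public static void main(String[] args) {
        String x = new String("alpha");
        String y = new String("alpha");
        String z = new String("beta");

        // Both approaches agree for equal and for different contents.
        System.out.println(byEquals(x, y) == byInternedIdentity(x, y));
        System.out.println(byEquals(x, z) == byInternedIdentity(x, z));
    }
}
```

A cheap `==` only pays off if the strings are interned once, up front, and then compared many times; interning inside every comparison, as `byInternedIdentity` does here, is purely for demonstration.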
filereaper about 8 years ago
Total aside from the main topic: I love shipilev's posts.

If there are other core JVM developers that have similar blogs, I'd love to hear about them here.
lorenzosnap about 8 years ago
We built an in-memory map and we were using String.intern for both keys and values. We could see that we were saving lots of memory, but we had the problems described in the article. We then built our own 'String.intern' by using yet another static HashMap. It worked. It was the simplest alternative and it just did the job. Thanks Aleksey for the nice article.
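A hand-rolled interner along those lines might look like the sketch below. This is not the commenter's actual code: the class name `MapInterner` and its methods are assumptions, and it uses an instance field rather than a static one so that the pool can be discarded when it is no longer needed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical HashMap-backed interner; not thread-safe as written.
final class MapInterner {

    // Maps each distinct string content to its canonical instance.
    private final Map<String, String> pool = new HashMap<>();

    // Returns the canonical instance for s, registering s itself
    // if no equal string has been seen before.
    String intern(String s) {
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    // Number of distinct strings currently held.
    int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        MapInterner interner = new MapInterner();
        String first = interner.intern(new String("key"));
        String second = interner.intern(new String("key"));
        System.out.println(first == second); // both calls return the same canonical object
        System.out.println(interner.size());
    }
}
```

For concurrent use, swapping the `HashMap` for a `java.util.concurrent.ConcurrentHashMap` would be the obvious adjustment, since `putIfAbsent` is atomic there. Either way the pool holds strong references, so entries stay alive until the interner itself is dropped.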
emmelaich about 8 years ago
I'd never seen the @Benchmark annotation before so I looked it up.

The blog author is also one of JMH's authors.

http://openjdk.java.net/projects/code-tools/jmh/
deepsun about 8 years ago
I've been doing Java for 14 years so far and have never needed .intern(). I can imagine its use case, but it does seem like a pretty rare one.
Robotbeat about 8 years ago
Is "Anatomy Park" a Rick and Morty reference? http://rickandmorty.wikia.com/wiki/Anatomy_Park_(episode)
jwilk about 8 years ago
Please consider adding "JVM Anatomy Park" to the title.
TheGuyWhoCodes about 8 years ago
The code interns unique strings, which most likely isn't what would happen in a real-world application (unless, you know... code without thought); you'd usually intern strings with low variance. Not saying that it won't be slower, but the memory usage might be lower.
gravypod about 8 years ago
This, and the few other articles up, are a great series. Having done Java development now for 30% of my life, these are some amazing pointers.

I'd love to buy a hard copy of these if they ever get up to a few dozen articles. Would be good to give to middle-experience devs (like myself) in the future.
relics443 about 8 years ago
"The performance is at the mercy of the native HashTable implementation, which may lag behind what is available in high-performance Java world, especially under concurrent access."

What native HashTable is used? Shouldn't the JVM be using an optimized one?
zde about 8 years ago
String.intern() would suck much less if strings had an "IS_INTERNED" flag which would prevent hashtable lookups for already interned strings. Really sad given the insane overhead Java strings have.
pimlottc about 8 years ago
> in OpenJDK, String.intern() is native, and it actually calls into JVM, to intern the String in the native JVM String pool.

How much of this also applies when using the standard Oracle JDK?
kristianp about 8 years ago
It would be interesting to know where it's used. Is it used in the JDK itself, for example?
kazinator about 8 years ago
Also see:

XInternAtom (XWindow function)

RegisterClass (Windows)
zaroth about 8 years ago
The instrumentation here is impressive. The amount of data inspection done with just a few simple commands is a bit overwhelming. Frankly, I rarely hope to find myself looking at this level of metrics.

There's a lot down there I like to take for granted. But more likely I try to use methods like string.Intern() exactly never.

Use code you know and understand. Frankly, use code you can trust. And who would trust a method called string.Intern() to do... exactly what?

If you are writing a function to *do something*, the name of the function must be the thing being done. What the heck is a 'static internalize'? The explicit HashMap was a few lines of code, and it's the most basic and obvious, and surprisingly performant, approach. So I definitely agree you should use your own HashMap and not a static internalizer.