So, the other comments offer Clojure, Mathematica, Python, Ruby, Bourne Shell, Haskell, and Scala solutions, all of which are simpler than the C++, C#, and JS solutions. All of them are presented here with some minor cleanups:<p><pre><code> (take 10 (reverse (sort-by (comp first rest) (frequencies (string/split ... #"\s+"))))) ; llambda Clojure
// haakon Scala
s.split(' ').groupBy(identity).mapValues(_.size).toList.sortBy(-_._2).take(10).map(_._1)
(->> (string/split s #"\s+") frequencies (sort-by val) reverse (take 10)) ; aphyr Clojure
var top = (from w in text.Split(' ') // louthy C# LINQ
group w by w into g
orderby g.Count() descending
select g.Key).Take(10);
collections.Counter(s1.split()).most_common(10) # shill Python
d = {} # shill Python without collections library
for word in s1.split(): d[word] = d.get(word, 0) + 1
print [(x, d[x]) for x in sorted(d, key=d.get, reverse=True)][:10]
words = s.split() # spenuke and abecedarius probably O(N²) Python
sorted(set(words), key=words.count, reverse=True)[:10]
d3.entries((s.split(" ").reduce(function(p, v){ // 1wheel JS with d3
v in p ? p[v]++ : p[v] = 1;
return p;}, {})))
.sort(function(a, b){ return a.value - b.value; }) // comparator must return a number, not a boolean
.map(function(d){ return d.key;})
.slice(-10)
# kenuke O(N²) Ruby:
str.split.sort_by{|word| str.split.count(word)}.uniq.reverse.take(10)
counts = Hash.new { 0 } # my Ruby
str.split.each { |w| counts[w] += 1; }
counts.keys.sort_by { |w| -counts[w] }.take 10
# aaronbrethorst ruby
str.split(/\W+/).inject(Hash.new(0)) {|acc, w| acc[w] += 1; acc}.sort {|a,b| b.last <=> a.last }[0,10]
Commonest[StringSplit[string], 10] # carlob Mathematica
Reverse[SortBy[Tally[StringSplit[#]], #[[2]] &]][[;; 10, 1]] & # superfx old Mathematica
$a = array_count_values(preg_split('/\b\s+/', $s)); arsort($a); array_slice($a, 0, 10) // Myrth PHP
tr -cs a-zA-Z '\n' | sort | uniq -c | sort -nr | head # mzs and me sh
-- lelf in Haskell
take 10 . map head . reverse . sortBy (comparing length) . group . sort . words
# prakashk Perl6
.say for (bag($text.words) ==> sort {-*.value})[^10]
# navinp1912 C++
string s,f;
map<string,int> M;
set<pair<int,string> > S;
while(cin >> s) {
M[s]++;
int x=M[s];
if(x>1) S.erase(make_pair(x-1,s));
S.insert(make_pair(x,s));
}
set<pair<int,string> >::reverse_iterator it=S.rbegin();
int topK=10;
while(topK-- && (it!=S.rend())) {
cout << it->second<<" "<<it->first<<endl;
it++;
}
</code></pre>
I thought I'd maybe take a look at Afterquery: <a href="http://afterquery.appspot.com/help" rel="nofollow">http://afterquery.appspot.com/help</a><p>Although I haven't tested it, I think the Afterquery program to solve this, assuming you first had something to tokenize your text into one word per row, would be something like<p><pre><code> &group=word;count(*)
&order=-count(*)
&limit=10
</code></pre>
which, though perhaps less readable, is simpler still, except for Mathematica. More details at <a href="http://apenwarr.ca/log/?m=201212" rel="nofollow">http://apenwarr.ca/log/?m=201212</a>.<p>Perl 5, perhaps surprisingly, is not simpler:<p><pre><code> perl -wle 'local $/; $_ = <>; $, = " "; $w{$_}++ for split; print @{[sort {$w{$b} <=> $w{$a}} keys %w]}[0..9]'
</code></pre>
And neither is this, although it uses less code and less RAM:<p><pre><code> perl -wlne '$w{$_}++ for split; END { $, = " "; print @{[sort {$w{$b} <=> $w{$a}} keys %w]}[0..9]}'
</code></pre>
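The difference between the two Perl versions (slurping the whole input versus counting one line at a time) can be sketched in Python, too. This is only an illustration under my own assumptions: the function name `top_words_streaming` and the sample text are made up, and words are simply whitespace-separated, as with Perl's bare `split`.

```python
from collections import Counter
import io

def top_words_streaming(lines, n=10):
    """Count whitespace-separated words one line at a time.

    `lines` is any iterable of strings (for example an open file), so
    only the counts, not the whole text, have to be held in memory.
    """
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    # most_common() sorts by descending count; keep just the words.
    return [word for word, _ in counts.most_common(n)]

# Hypothetical sample input standing in for a real file or stdin.
sample = io.StringIO("a b a\nc a b\n")
print(top_words_streaming(sample, 2))
```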
I was surprised, attempting to solve this in Common Lisp, to find that ANSI Common Lisp has no equivalent of string/split. SPLIT-SEQUENCE is a de facto standard library, but it's not included in SBCL's default install, at least on Debian, and counting the duplicate words involves an explicit loop. So basically in unvarnished CL you end up doing more or less what you'd do in C, but without writing your own hash table. The same goes for Lua and Scheme, I think, except that in Scheme you don't even have hash tables.
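<p>To make "more or less what you'd do in C, but without writing your own hash table" concrete, here is a rough sketch of that shape, written in Python rather than CL so it lines up with the Python versions above. It deliberately avoids `str.split` and `collections.Counter`; the names `split_words` and `top_words_explicit` are my own, not from any of the posted solutions.

```python
def split_words(text):
    """Hand-written whitespace tokenizer, standing in for a missing split."""
    words = []
    start = None  # index where the current word began, or None between words
    for i, ch in enumerate(text):
        if ch.isspace():
            if start is not None:
                words.append(text[start:i])
                start = None
        elif start is None:
            start = i
    if start is not None:  # the text ended in the middle of a word
        words.append(text[start:])
    return words

def top_words_explicit(text, n=10):
    """Count duplicates with an explicit loop over a built-in hash table."""
    counts = {}
    for word in split_words(text):
        if word in counts:
            counts[word] += 1
        else:
            counts[word] = 1
    # Sort the distinct words by descending count and keep the first n.
    return sorted(counts, key=counts.get, reverse=True)[:n]

print(top_words_explicit("b a b c b a", 2))
```

Nearly all of the extra length is the tokenizer and the counting loop, which is the part the one-liners above get from their standard libraries.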