LINQ Ruined My Favorite Interview Question

133 pointsby scottchaalmost 12 years ago

34 comments

llambdaalmost 12 years ago

I hate to be the bearer of bad news, but I think there may be even simpler solutions to this problem:<pre><code> (take 10 (reverse (sort-by (comp first rest) (frequencies (string/split ... #"\+s")))) </code></pre> The above is a Clojure one-liner example that I believe satisfies the original problem. So while LINQ may have simplified from the C-language family solutions he had seen, it's clearly possible to take it one step further with the expressivity of modern languages like Clojure...Edit: remember to sort! (Forgot my coffee this morning...)Edit again: and aphyr's solution is even more concise and idiomatic, where `s` is the first paragraph of the blog post:<pre><code> => (->> (string/split s #"\s+") frequencies (sort-by val) reverse (take 10)) (["I" 7] ["to" 6] ["the" 5] ["a" 5] ["of" 4] ["candidates" 3] ["is" 3] ["question" 3] ["in" 3] ["their" 2])</code></pre>

评论 #6027768 未加载

评论 #6026534 未加载

评论 #6026346 未加载

评论 #6026513 未加载

评论 #6028770 未加载

评论 #6028209 未加载

评论 #6028900 未加载

评论 #6026322 未加载

评论 #6026360 未加载

评论 #6027702 未加载

评论 #6028383 未加载

overgardalmost 12 years ago

LINQ tends to get thought of as "database syntax sugar", but it's way more than that. It's C#'s version of the lazy collection operations you find in most functional languages, just given friendlier sqlish names. (IE, "Select" instead of "map" and "Where" instead of "filter")I almost feel a bit gross when I have to write a "foreach" loop at this point, because there's almost always an equivalent way to do it in LINQ (although it's a tradeoff, as the one downside of LINQ is that given it's deferred nature, it's harder to debug).

评论 #6029204 未加载

评论 #6026465 未加载

评论 #6026250 未加载

评论 #6026324 未加载

评论 #6029410 未加载

joshuaellingeralmost 12 years ago

I am a big fan of LINQ but it gives and takes away complexity at the same time.It makes a 'whole class of things that you would have to do with loops' go away. Once you are comfortable with the syntax, it makes code a lot more readable.But you lose track of when and where things are getting executed.For example, it is easy to make something that you intend to execute inside of SQL server run inside of C# code. And then, all of a sudden, string comparison is case-sensitive.You wind up having to context switch between procedural and set mentality without the same kind of visual cues you used to get.

评论 #6026280 未加载

评论 #6026339 未加载

评论 #6026220 未加载

评论 #6026257 未加载

评论 #6027997 未加载

moominalmost 12 years ago

I hate to be the arrogant know it all on Hacker News, but seriously, if you're writing c# and not using LINQ all the time, you need to catch up.I'm tired of seeing people answer interview questions with anything _other_ than LINQ.

评论 #6026487 未加载

评论 #6026432 未加载

评论 #6026349 未加载

评论 #6027926 未加载

评论 #6028924 未加载

评论 #6026629 未加载

评论 #6031262 未加载

kephraalmost 12 years ago

/me wondersare candidates allowed to chose their favorite language?man bash | tr '[:upper:] ' '[:lower:]\n' | sed '/^$/d' | sort | uniq -c | sort -rn | head | awk '{ print $2 }' | fmtor do you only hire windows coders?

评论 #6028242 未加载

评论 #6028350 未加载

评论 #6026375 未加载

评论 #6028008 未加载

评论 #6028358 未加载

评论 #6027710 未加载

azurelogicalmost 12 years ago

I had a class where we built a DB engine from scratch, and I ported my code to C# just for LINQ and proper list handling. While it's not the most efficient, it reduced many of the complex sections down to a few lines, like reordering the values being inserted to match the order necessary for a record by comparing the schema to the parameter list in the insert statement.If you like LINQ and wish it were available in JS, look at underscore or lo-dash.

mythzalmost 12 years ago

LINQ does make C# more readable but it doesn't add much value over normal functional collections which usually end up more concise, simpler and easier to reason about - visible in my Dart port of C#'s 101 LINQ samples (which are also lazy): <a href="https://github.com/dartist/101LinqSamples" rel="nofollow">https://github.com/dartist/101LinqSamples</a>performance of Linq 2 Objects is not that great either and it doesn't add much value readability-wise over rewriting the same task in other dynamic languages: <a href="https://github.com/dartist/sudoku_solver" rel="nofollow">https://github.com/dartist/sudoku_solver</a>

tolmaskyalmost 12 years ago

My main problem with LINQ is that it seems to perform terribly on mobile. I was looking for map/reduce/etc type functions for C# in Unity, and thought I found it with LINQ. To my dismay, LINQ creates so many crazy intermediate objects to pull off its "laziness" that our GC high water mark was being crossed all the time. I went and just reimplemented everything from underscore.js in C# and got way better performance. I imagine this is probably not an issue on desktops/servers.

评论 #6028133 未加载

评论 #6028337 未加载

T-zexalmost 12 years ago

The interviewer doesn't see the difference between the LINQ and extension methods and he hasn't provided a single line of LINQ in his post.

评论 #6027719 未加载

Shish2kalmost 12 years ago

In the vein of <a href="http://xkcd.com/353/" rel="nofollow">http://xkcd.com/353/</a> :<pre><code> >>> from collections import Counter >>> Counter("here are some words here are".split()).most_common(3) [('are', 2), ('here', 2), ('words', 1)]</code></pre>

gboudriasalmost 12 years ago

What is it with people's obsession over lines of code?As someone who doesn't do C# or LINQ, that second solution seems to me like someone really wanted to have as few lines of code as possible.I don't claim to have an impressive programming pedigree, but while I take simplicity and performance into account, I never take "conciseness" into account. Conciseness usually means "this is opaque as shit but at least it's short". And who really cares about short? What's the purpose of "short"? None that I can find, other than impressing interviewers. Anyone who believes otherwise should probably be writing in Clojure or Haskell (and probably is), but I personally just don't see the point.But that's just my opinion. My favorite language is Python.

评论 #6031256 未加载

jastralmost 12 years ago

The O(n) solutions sound way more interesting!1. Iterate through all key,value pairs once, keeping track of the 10 most common 2. Run quickselect 10 times 3. Coolest (and an interview question in it's own right) - modify quickselect to return the top 10!

ExpiredLinkalmost 12 years ago

> “Return the top 10 most frequently occurring words in a string.”...> var words = s1.Split(' ');Wrong. Yet another example where an interviewer cannot correctly solve his own questions.

评论 #6026601 未加载

A1kmmalmost 12 years ago

I'm not sure that language features making code more succinct really ruin the question.The C# isn't even that succinct compared to doing a similar thing in other popular languages. For example, in Haskell:<pre><code> topTenWords :: String -> [String] topTenWords = take 10 . map fst . sortBy (flip (comparing snd)) . map (\l -> (head l, length l)) . group . sort . words</code></pre>

评论 #6031934 未加载

lubomiralmost 12 years ago

The LINQ really is much more succint, but the original code did not set the bar very high. Why should one write a 12 line comparing function when using 'b.value - a.value' would work pretty much the same (unless C# really requires comparison to return -1/1 instead of any negative/positive integer, which would be fixed by a sign function).

评论 #6028921 未加载

dbauppalmost 12 years ago

I think the analysis may've been skewed by the outlying points. They are almost certainly "high leverage points"[1] and so possibly exert an undue influence on the final trend lines.[1]: <a href="http://en.wikipedia.org/wiki/Partial_leverage" rel="nofollow">http://en.wikipedia.org/wiki/Partial_leverage</a>

telalmost 12 years ago

Here's the Haskell golf<pre><code> import qualified Data.Map as M import Data.List import Data.Ord countWords :: String -> [String] countWords = map fst . take 10 . sortBy (comparing snd) . M.toList . M.fromListWith (+) . map (\w -> (w, 1)) . words</code></pre>

joshkaalmost 12 years ago

The biggest benefit I find from using Linq is readability. It allows the code to express what it is doing rather than how it is doing it. Compare:<pre><code> foreach (var item in list) if (SomeCondition(item)) return item; return null; </code></pre> vs.<pre><code> list.FirstOrDefault(item);</code></pre>

seivanalmost 12 years ago

I would have used a weighted SET.But it's funny, the first time I read the paragraph, I'd assume the words were not separated by space, and you had to find occurrences of combinations than.I instinctively made the test much harder than it would be. I'm damaged.

zwiebackalmost 12 years ago

Good read. It also shows that interviewing can be a great learning tool. I've experienced that myself, while interviewing takes a lot of time investment on the part of the interviewer it's also often a source of new insights.

lelfalmost 12 years ago

Well, welcome to high-level programming<pre><code> count = take 10 . map head . reverse . sortBy (comparing length) . group . sort . words </code></pre> That's ignoring Unicode rules for word splitting of course

mrcozzalmost 12 years ago

Everyone knows "Hadoop is a distributed system for counting words." ;-)<a href="https://github.com/twitter/scalding" rel="nofollow">https://github.com/twitter/scalding</a>

aaronbrethorstalmost 12 years ago

That was a fun exercise :)<pre><code> input_string.split(/\W+/).inject(Hash.new(0)) {|acc, w| acc[w] += 1; acc}.sort {|a,b| b.last <=> a.last }[0,10]</code></pre>

superfxalmost 12 years ago

Since we're comparing notes, here's the Mathematica version:Reverse[SortBy[Tally[StringSplit[#]], #[[2]] &]][[;; 10, 1]] &

ajanuaryalmost 12 years ago

The first projection isn't needed, which would eliminate the creation of 10 objects.

coderguy123almost 12 years ago

text.Split(' ').Where(x=>!string.IsNullOrWhiteSpace(x)).GroupBy(x => x).Select(x => new { word = x.Key, count = x.Count() }).OrderByDescending(x => x.count).Select(x=>x.word).Take(10).ToArray();

prakashkalmost 12 years ago

Perl 6:<pre><code> .say for (bag($text.words) ==> sort {-*.value})[^10]</code></pre>

seogurualmost 12 years ago

here's a verbose ruby version:<pre><code> def topx(str,x) c = Hash.new(0) str.split(/\s+/).each { |s| c[s] += 1 } c.sort_by {|k,v| -v}.take(x) end</code></pre>

coderguy123almost 12 years ago

isn't .ToList() part of linq too. technically the first solution is also using Linq. would be interesting to make them implement sort algorithm too.

pawrvxalmost 12 years ago

LINQ is my favorite API of all times. It rocks.

评论 #6030047 未加载

telalmost 12 years ago

Why does this "ruin" a question?

评论 #6031317 未加载

kragenalmost 12 years ago

So, aside from the Clojure, Mathematica, Python, Ruby, Bourne Shell, Haskell, and Scala solutions posted in the other comments, all of which are simpler than the C++, C#, and JS solutions, presented here with some minor cleanups:<pre><code> (take 10 (reverse (sort-by (comp first rest) (frequencies (string/split ... #"\+s")))) ; llambda Clojure // haakon Scala s.split(' ').groupBy(identity).mapValues(_.size).toList.sortBy(-_._2).take(10).map(_._1) (->> (string/split s #"\s+") frequencies (sort-by val) reverse (take 10)) ; aphyr Clojure var top = (from w in text.Split(' ') // louthy C# LINQ group w by w into g orderby g.Count() descending select g.Key).Take(10); collections.Counter(s1.split()).most_common(10) # shill Python d = {} # shill Python without collections library for word in s1.split(): d[word] = d.get(word, 0) + 1 print [(x, d[x]) for x in sorted(d, key=d.get, reverse=True)][:10] words = s.split() # spenuke and abecedarius probably O(N²) Python sorted(set(words), key=words.count, reverse=True)[:10] d3.entries((s.split(" ").reduce(function(p, v){ // 1wheel JS with d3 v in p ? p[v]++ : p[v] = 1; return p;}, {}))) .sort(function(a, b){ return a.value > b.value; }) .map(function(d){ return d.key;}) .slice(-10) # kenuke O(N²) Ruby: str.split.sort_by{|word| str.split.count(word)}.uniq.reverse.take(10) counts = Hash.new { 0 } # my Ruby str.split.each { |w| counts[w] += 1; } counts.keys.sort_by { |w| -counts[w] }.take 10 # aaronbrethorst ruby str.split(/\W+/).inject(Hash.new(0)) {|acc, w| acc[w] += 1; acc}.sort {|a,b| b.last <=> a.last }[0,10] Commonest[StringSplit[string], 10] # carlob Mathematica Reverse[SortBy[Tally[StringSplit[#]], #[[2]] &]][[;; 10, 1]] & # superfx old Mathematica $a = array_count_values(preg_split('/\b\s+/', $s)); arsort($a); array_slice($a, 0, 10) // Myrth PHP tr -cs a-zA-Z '\n' | sort | uniq -c | sort -nr | head # mzs and me sh -- lelf in Haskell take 10 . map head . reverse . sortBy (comparing length) . group . sort . words # prakashk Perl6 .say for (bag($text.words) ==> sort {-*.value})[^10] # navinp1912 C++ string s,f; map<string,int> M; set<pair<int,string> > S; while(cin >> s) { M[s]++; int x=M[s]; if(x>1) S.erase(make_pair(x-1,s)); S.insert(make_pair(x,s)); } set<pair<int,string> >::reverse_iterator it=S.rbegin(); int topK=10; while(topK-- && (it!=S.rend())) { cout << it->second<<" "<<it->first<<endl; it++; } </code></pre> I thought I'd maybe take a look at Afterquery: <a href="http://afterquery.appspot.com/help" rel="nofollow">http://afterquery.appspot.com/help</a>Although I haven't tested it, I think the Afterquery program to solve this, assuming you first had something to tokenize your text into one word per row, would be something like<pre><code> &group=word;count(*) &order=-count(*) &limit=10 </code></pre> which, though perhaps less readable, is simpler still, except for Mathematica. More details at <a href="http://apenwarr.ca/log/?m=201212" rel="nofollow">http://apenwarr.ca/log/?m=201212</a>.Perl 5, perhaps surprisingly, is not simpler:<pre><code> perl -wle 'local $/; $_ = <>; $, = " "; $w{$_}++ for split; print @{[sort {$w{$b} <=> $w{$a}} keys %w]}[0..9]' </code></pre> And neither is this, although it uses less code and less RAM:<pre><code> perl -wlne '$w{$_}++ for split; END { $, = " "; print @{[sort {$w{$b} <=> $w{$a}} keys %w]}[0..9]}' </code></pre> I was surprised, attempting to solve this in Common Lisp, that there's no equivalent of string/split in ANSI Common Lisp, and although SPLIT-SEQUENCE is standardized, it's not included in SBCL's default install, at least on Debian; and counting the duplicate words involves an explicit loop. So basically in unvarnished CL you end up doing more or less what you'd do in C, but without writing your own hash table. Lua and Scheme too, I think, except that in Scheme you don't even have hash tables.

评论 #6031652 未加载

评论 #6032668 未加载

评论 #6032673 未加载

评论 #6032671 未加载

jahabreweralmost 12 years ago

(his name is Jon Skeet)(sorry)

评论 #6030874 未加载

评论 #6030873 未加载

inzaxalmost 12 years ago

He should try adding AsParrallel() to the expression. I bet there will be a drastic speed up as long as he has a multi core.

34 comments

llambdaalmost 12 years ago

评论 #6027768 未加载

评论 #6026534 未加载

评论 #6026346 未加载

评论 #6026513 未加载

评论 #6028770 未加载

评论 #6028209 未加载

评论 #6028900 未加载

评论 #6026322 未加载

评论 #6026360 未加载

评论 #6027702 未加载

评论 #6028383 未加载

overgardalmost 12 years ago

评论 #6029204 未加载

评论 #6026465 未加载

评论 #6026250 未加载

评论 #6026324 未加载

评论 #6029410 未加载

joshuaellingeralmost 12 years ago

评论 #6026280 未加载

评论 #6026339 未加载

评论 #6026220 未加载

评论 #6026257 未加载

评论 #6027997 未加载

moominalmost 12 years ago

评论 #6026487 未加载

评论 #6026432 未加载

评论 #6026349 未加载

评论 #6027926 未加载

评论 #6028924 未加载

评论 #6026629 未加载

评论 #6031262 未加载

kephraalmost 12 years ago

评论 #6028242 未加载

评论 #6028350 未加载

评论 #6026375 未加载

评论 #6028008 未加载

评论 #6028358 未加载

评论 #6027710 未加载

azurelogicalmost 12 years ago

mythzalmost 12 years ago

tolmaskyalmost 12 years ago

评论 #6028133 未加载

评论 #6028337 未加载

T-zexalmost 12 years ago

The interviewer doesn't see the difference between the LINQ and extension methods and he hasn't provided a single line of LINQ in his post.

评论 #6027719 未加载

Shish2kalmost 12 years ago

gboudriasalmost 12 years ago

评论 #6031256 未加载

jastralmost 12 years ago

ExpiredLinkalmost 12 years ago

评论 #6026601 未加载

A1kmmalmost 12 years ago

评论 #6031934 未加载

lubomiralmost 12 years ago

评论 #6028921 未加载

dbauppalmost 12 years ago

telalmost 12 years ago

joshkaalmost 12 years ago

seivanalmost 12 years ago

zwiebackalmost 12 years ago

lelfalmost 12 years ago

mrcozzalmost 12 years ago

Everyone knows "Hadoop is a distributed system for counting words." ;-)<a href="https://github.com/twitter/scalding" rel="nofollow">https://github.com/twitter/scalding</a>

aaronbrethorstalmost 12 years ago

That was a fun exercise :)<pre><code> input_string.split(/\W+/).inject(Hash.new(0)) {|acc, w| acc[w] += 1; acc}.sort {|a,b| b.last <=> a.last }[0,10]</code></pre>

superfxalmost 12 years ago

Since we're comparing notes, here's the Mathematica version:Reverse[SortBy[Tally[StringSplit[#]], #[[2]] &]][[;; 10, 1]] &

ajanuaryalmost 12 years ago

The first projection isn't needed, which would eliminate the creation of 10 objects.

coderguy123almost 12 years ago

text.Split(' ').Where(x=>!string.IsNullOrWhiteSpace(x)).GroupBy(x => x).Select(x => new { word = x.Key, count = x.Count() }).OrderByDescending(x => x.count).Select(x=>x.word).Take(10).ToArray();

prakashkalmost 12 years ago

Perl 6:<pre><code> .say for (bag($text.words) ==> sort {-*.value})[^10]</code></pre>

seogurualmost 12 years ago

here's a verbose ruby version:<pre><code> def topx(str,x) c = Hash.new(0) str.split(/\s+/).each { |s| c[s] += 1 } c.sort_by {|k,v| -v}.take(x) end</code></pre>

coderguy123almost 12 years ago

isn't .ToList() part of linq too. technically the first solution is also using Linq. would be interesting to make them implement sort algorithm too.

pawrvxalmost 12 years ago

LINQ is my favorite API of all times. It rocks.

评论 #6030047 未加载

telalmost 12 years ago

Why does this "ruin" a question?

评论 #6031317 未加载

kragenalmost 12 years ago

评论 #6031652 未加载

评论 #6032668 未加载

评论 #6032673 未加载

评论 #6032671 未加载

jahabreweralmost 12 years ago

(his name is Jon Skeet)(sorry)

评论 #6030874 未加载

评论 #6030873 未加载

inzaxalmost 12 years ago

He should try adding AsParrallel() to the expression. I bet there will be a drastic speed up as long as he has a multi core.