I'm not sure that the kinds of employees that this article describes will ever exist in large numbers. There could be more of them in the future, but someone who is top-notch at all of statistics, programming, and data-presentation has long been less common than someone who's good at one or two of those. Companies might consider looking at better ways to build teams that combine talent that exists, instead of pining for more superstars.<p>I'm reminded indirectly of an acquaintance of mine who works on repairing industrial machinery, where companies complain of a big skills shortage. They either fail to realize or are in denial about what that means in the 21st century, though. It might've been a one-person job in the 1950s, a skilled-labor type of repairman job. But today they want to find one person who can do the physical work (welding, etc.), EE type work, embedded-systems programming (and possibly reverse engineering), application-level programming to hook things up to their network, etc. Some of these people exist, but it's more common to find boutique consulting firms with 3-person teams of EE/CE/machinist or some such permutation. But companies balk at paying consulting fees equivalent to three professional salaries for something they think "should" be doable by one person with a magical combination of skills, who will work for maybe $80k. So they complain that there is a shortage of people who can repair truck scales (for example).
"claims of severe talent shortage in Big Data <a href="http://online.wsj.com/article/SB10001424052702304723304577365700368073674.html" rel="nofollow">http://online.wsj.com/article/SB1000142405270230472330457736...</a> Ok... where are the high salaries (500k$ a year)? No? No real shortage."<p><a href="https://twitter.com/#!/lemire/status/196245665951649793" rel="nofollow">https://twitter.com/#!/lemire/status/196245665951649793</a><p>Business has a shortage of "big data" folks in much the same way I have a "huge sailboat" shortage. Neither of us want to pay for it. We want it, but not for the going rate. Only one of us has a media platform, though.
As an engineer who's investing in developing "deep expertise in statistics and machine learning" I can only stand to benefit from it, but something about the current wave of Big Data hype makes me instinctively a bit wary.<p>Does this skills shortage really exist to the extent claimed? Are there really enough people out there who would know what to do with a 'data scientist' if they were able to hire one? I see more talk than action, I see vendors circling around looking to flog freshly-buzzword-compliant BI tools, prognosticators trying to push nervous businesses into engaging in an arms race over data.<p>Of course there's real value there too, for some at least. I hope my concerns prove unfounded, but worth retaining a healthy skepticism I feel :-)
Managers frequently wail about skill shortages, but very often it's pure hypocrisy. The real problem is the reluctance to do any training (and I don't mean formal training) combined with the desire to get <i>proven experts</i> in whatever field. Proven experts must have years of experience in applying their expertise. If <i>no one</i> lets people with less experience work in that field, where the hell would those experts appear from? Another dimension?<p>Can I be an 80% developer and 20% "data scientist" in your company to try the new role out? The bigger your company is, the less likely the answer is to be "yes". Since Big Data implies a big company, the resulting "shortage" is not surprising. It's self-made.
How media sees Big Data:<p>BIG database => BIG machine learning algo => BIG MODELS => PREDICTIONs, Insights => $$$<p>How it is actually done:<p>awk -F"\|" '{print $1}' SCRAPED_file_pipe.txt | sort | uniq -c | head -n 10 => $$$
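For anyone who would rather not read awk, here is a rough Python sketch of what that one-liner does: take the first pipe-delimited field of each line, count occurrences, and keep the first ten keys in sorted order. The function name `first_field_counts` is made up for illustration, and plain `sorted()` is assumed to be close enough to what `sort` does in your locale.

```python
from collections import Counter


def first_field_counts(lines, sep="|", limit=10):
    """Count values of the first sep-delimited field (hypothetical helper).

    Mirrors: awk -F'|' '{print $1}' | sort | uniq -c | head -n 10
    """
    # Equivalent of awk '{print $1}': keep only the text before the first separator.
    counts = Counter(line.rstrip("\n").split(sep, 1)[0] for line in lines)
    # sort | uniq -c emits one row per key in sorted key order; head keeps the first `limit`.
    # Assumption: code-point ordering stands in for the shell's locale-aware sort.
    return [(key, counts[key]) for key in sorted(counts)[:limit]]


if __name__ == "__main__":
    sample = ["b|1\n", "a|2\n", "b|3\n"]
    for key, n in first_field_counts(sample):
        print(n, key)
```

Note that, like the shell version, this returns the first ten keys alphabetically, not the ten most frequent ones; for the latter you would use `counts.most_common(limit)`.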
This article is nonsense.<p>Talented developers are talented developers. At Quantcast we used Hadoop in production before it was even called Hadoop and now we process 10PB a day. We forbid our sourcers from using Hadoop as a resume search term because it meant absolutely nothing.<p>Statisticians who can code are scarce, but companies that know how to use them are scarcer.
Actually that is silly -- McKinsey should know that there is not and never will be a talent shortage. There will only be a shortage of talent at a particular wage rate.<p>If the companies paid newly graduated 'data-scientists' (what other kind of scientists are there? The tea-leaf reading kind?) 200k/year then they would have a lot more. It is pretty simple economics.
Until the pay is comparable to finance, good luck?<p>I'd love to work on (arguably) cooler problems, but the combination of lower pay and the constant need to use the "hot new thing" to solve problems doesn't make transitioning look remotely attractive.<p>Really, the second is the HUGE obstacle:
- You don't know anything about aNNs? Sorry, no job.
- Nobody uses aNNs anymore, SVMs are all that matters. Sorry, come back after you catch up.
- SVMs? Man, we need someone who's got expertise in optimizing RFs and Bayesian Trees. We don't want "black box" machine learning. We need to "understand" the results. Sorry, no job.
- Decision trees? GTFO man. We're doing rNNs now.
- I'm pretty impressed with your data mining knowledge, but we're looking for someone with a background in DLMs and GPs. Sorry, no job.
- repeat until vomit/suicide<p>I kind of wonder about the need for "badass" math skills; I'm not terribly convinced that math wizards are extraordinarily high value relative to people with other types of data analysis skills.
It seems the problem is that some companies are looking for a person who is an expert in setting up scalable systems (Hadoop cluster, storage, high availability, etc.) and who also knows statistics and efficient ways of processing and understanding the data. Good luck with that.<p>My observation is that requirements like this come from people who did mainly web programming (which was making a lot of money, and with money they became influential) and who assume that this is the equivalent of writing both Ruby code and JavaScript code.<p>Building a team is hard, and in order to solve a "big data" problem you need to build a balanced team.
Basically a solution looking for a problem.<p>They are right: the complexity that big data caters for requires expertise at both the technical and business level, which would be costly (though it may not be at the infrastructure level). In the current economy it looks even more difficult, since businesses want to squeeze the maximum out of every dollar invested.<p>IMO, it's too early a stage for big data solution adoption. However, the stage could be set for startups that can come up with innovative solutions that bring the cost down and are simple, useful, and easy to grasp.
I've done this kind of thing most of my career, including doing it for NASA and Unilever Research. You can't really train an average graduate to do this. You need someone with a pretty highly developed integration between 1:intuitive/creative abilities, 2:mathematical/analytical skills, and 3:engineering/ability to make things happen. Add to that 4:work experience in the real world, and 5:ability to easily understand how things work in a field you delve into for the first time... And there's very few people in the world who can do this.
At my previous workplace we tried for a whole year to hire someone who would have at least some of these skills and seemed promising enough to develop the rest on the job. We couldn't find anyone, although we interviewed about 30 different people (from about 500 resumes, most of them with a PhD in ML from a good university). And this was in central London, UK.
So I was wondering if any fellow HNer is on a quest to be at least comfortable around these problems. Can you share your plans? Currently I am starting with some linear algebra, and I plan to move on to statistics and then pick up a book on machine learning. I could really use some advice.
Not entirely sure this is true, to be honest. Most of "data science" lies in the work of collecting and cleaning the data to get it into a usable state. A recent story on the "fallacy of the data scientist shortage"[1] goes into more detail, but in reality what we want in this quantity are better data <i>analysts</i>. I love the idea of data science as, essentially, viewing statistical analysis from a computer science perspective, but the breathless predictions of a huge shortage seem a little overblown.<p>[1]: <a href="http://smartdatacollective.com/nraden/48952/fallacy-data-scientist-shortage" rel="nofollow">http://smartdatacollective.com/nraden/48952/fallacy-data-sci...</a>
Can anyone explain what talent refers to in this context? Is it someone who has learned this stuff, someone who is capable of learning it, or someone who was born with an innate understanding?
I just listened to a lecture on this at BU. Emerging Internet Technologies at IBM or something of that nature. He was basically trying to sell us his product that crawled the internet (mainly a firehose at Twitter) and gathered statistics for advertisers and presented it in pretty graphics.<p>The main issue they had was developing language recognition. Deciding if a user 'liked', 'loved', 'hated' or was 'neutral' about a product. Another issue that stood out to me had to do with their reliance on the internet. Just because 200 users tweeted that 'this movie is going to suck' does not really represent the overall opinion.<p>To reiterate, the whole buzz of the lecture was the biggest turn off. He wasn't explaining about how to expand on his product or where to go from here. Just that they had developed a product and we could use it instead of attempting to develop one ourselves.
Surprised Ben Rooney did not mention IBM's acquisition of Vivisimo on Apr. 25, the day before this article (Apr. 26): <a href="http://www-03.ibm.com/press/us/en/pressrelease/37491.wss" rel="nofollow">http://www-03.ibm.com/press/us/en/pressrelease/37491.wss</a> "IBM Advances Big Data Analytics with Acquisition of Vivisimo"
Why don't the people who hire doctors, dentists, and lawyers suffer from the same talent shortage that the people who hire 'big data' computer scientists feel?<p>Because it's better across the board to start your own startup than work your ass off for a 4% raise at a place which recognizes you as top talent. I'm on the verge of starting a startup myself, removing myself from the pool of people on this list. There is a shortage of talent in computer science, but never in the other disciplines. It may take another 30 years for the suits to understand why.