TechEcho

4 comments

brudgers over 8 years ago

Caveat: This is random advice from the internet.

1. If it were me, I'd start by installing Hadoop on a laptop, since Googling indicates it's doable...for some definition of 'doable.' Even if I could not get it to work, reading the documentation and researching whatever problems I encountered would deepen my practical knowledge. Getting Hadoop up and running is also a facet of a practical working definition of 'knowing Hadoop.'

2. Linux wireless driver blobs have been a source of pain for me. My workarounds have been:

a. Purchase well-supported hardware, e.g. a used ThinkPad and cards without obscure Broadcom chips.

b. Use an external wireless router and an Ethernet cable. That's how I connect desktops and laptops around the office.

3. My gut is that the important knowledge for many positions requiring or preferring Hadoop will be more related to data science than technical expertise. On the other hand, looping back to my earlier advice, positions that are Hadoop-first rather than data-science-first will benefit from an operational understanding.

Lastly, what I've been hearing about the industry is that 'embarrassingly parallel' workloads that can take full advantage of Hadoop are not as common as was thought a few years ago. The big useful innovation of Hadoop is looking like the underlying Hadoop Distributed File System (HDFS), and other big data search/query tools are being built over it.

That's not to say Hadoop is dead or not worth exploring, particularly at the technical level of HDFS and in terms of applying data science concepts. Learning Pig or Hive makes sense in service of learning how to apply data science concepts. Because Hive is based on SQL, it is probably the more generalizable skill...and learning SQL is probably more useful than learning either in terms of employment.

Good luck.

praneshp over 8 years ago

I learned Hadoop in grad school in 2013. If you can spend a little bit of cash, get some VMs on AWS and follow one of the many guides out there (for example, Cloudera's) to install Hadoop. That should be enough to build something like: <a href="http://blog.cloudera.com/blog/2012/09/analyzing-twitter-data-with-hadoop/" rel="nofollow">http://blog.cloudera.com/blog/2012/09/analyzing-twitter-data...</a>

I started out trying my VMs on VirtualBox, then a couple of different laptops at home, etc., but AWS was the easiest setup in the end.

mtmail over 8 years ago

There's also <a href="http://www.cloudera.com/training.html" rel="nofollow">http://www.cloudera.com/training.html</a>

You can run Linux in a virtual machine (VirtualBox, VMware, etc.), where you wouldn't have to deal with wifi drivers because it uses the existing network connection from the host operating system.

mtmail over 8 years ago

There are a couple of book recommendations in <a href="https://news.ycombinator.com/item?id=12389595" rel="nofollow">https://news.ycombinator.com/item?id=12389595</a>

Ask HN: Learning Hadoop
