
The HDFS Juggernaut

144 points by josephscott almost 8 years ago

5 comments

pweissbrod almost 8 years ago
Hadoop (and hence HDFS) is a stack of services designed to work together to serve a file system and manage jobs. The Hadoop stack has pluggable authentication/authorization by design. And yes, the default is "no security".

Given its distributed nature, HDFS runs on multiple machines. On Linux, distributed service security fits well with Kerberos. Normally, if you want a "secure" HDFS you must "kerberize" the services so that any Hadoop operation requires a valid/authorized TGT.

To most people, kerberizing a Hadoop cluster is a major barrier to getting Hadoop running. I don't see this changing, but certain vendor Hadoop distros break down some of the barriers.

Sometimes it is OK to run a cluster insecure. Please don't do it if you're handling my financial or medical records though. As Mr. T once said, 'don't write checks that yo ass can't cash'.
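To make the "default is no security" point concrete, here is a minimal sketch that reads a `core-site.xml` and reports whether the two core security switches are turned on. The property names `hadoop.security.authentication` (stock default `simple`) and `hadoop.security.authorization` (stock default `false`) are standard Hadoop settings; the function names and the sample config are made up for illustration. A genuinely kerberized cluster also needs principals, keytabs, and per-service settings that this check does not look at.

```python
import xml.etree.ElementTree as ET

# Stock Hadoop defaults: "simple" authentication trusts whatever user name
# the client claims, i.e. effectively no authentication.
DEFAULTS = {
    "hadoop.security.authentication": "simple",
    "hadoop.security.authorization": "false",
}


def read_security_settings(core_site_xml: str) -> dict:
    """Return the two security properties, falling back to Hadoop's defaults."""
    settings = dict(DEFAULTS)
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name in settings:
            settings[name] = value
    return settings


def is_kerberized(core_site_xml: str) -> bool:
    """Hypothetical helper: True only if both switches are explicitly enabled."""
    s = read_security_settings(core_site_xml)
    return (
        s["hadoop.security.authentication"].lower() == "kerberos"
        and s["hadoop.security.authorization"].lower() == "true"
    )


# Illustrative config only, not taken from any real cluster.
SECURE_EXAMPLE = """
<configuration>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
  </property>
</configuration>
"""
```

An empty `<configuration/>` falls back to the defaults and so reports as not kerberized, which is the out-of-the-box situation described above.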
sbarre almost 8 years ago
Why do all these products have "insecure by default" configurations, anyways?

Didn't we learn anything from register_globals?
iamjochem almost 8 years ago
even if node-to-node communication in a cluster (hadoop or otherwise) itself is not secured, is it not reasonable to secure external access to the cluster itself (i.e. with a firewall)?

from an outsider perspective (I've never used/run hadoop) I cannot see much reason for exposing the cluster to the outside world - either a web-app acts as an intermediary or access can be provided via VPN/ssh-tunnel/etc

... just curious why a fully/publicly exposed cluster would be a "requirement"? or does it come down to the fact that firewalling an AWS environment is as painful (if not more) than "kerberizing" a [hadoop] cluster? (I kind of assumed AWS has firewalling functionality that is fairly plug'n'play ... a quick search does really back that up though)
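One rough way to test the firewall question from the outside is a plain TCP connect check against the ports a NameNode commonly listens on. This is only a sketch: the port numbers below are assumptions based on common defaults (50070 for the Hadoop 2.x NameNode web UI, 9870 for Hadoop 3.x, 8020 as a frequently used NameNode RPC port), the function names are hypothetical, and an open port says nothing about whether Kerberos is enforced behind it. Only run it against hosts you are responsible for.

```python
import socket

# Assumed common defaults; a real deployment may use different ports.
COMMON_NAMENODE_PORTS = {
    "namenode-web-ui-2.x": 50070,
    "namenode-web-ui-3.x": 9870,
    "namenode-rpc": 8020,
}


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def exposed_ports(host: str, ports: dict, timeout: float = 2.0) -> list:
    """Return the labels of the ports in `ports` that accept connections."""
    return [
        label
        for label, port in ports.items()
        if is_port_open(host, port, timeout)
    ]
```

Calling `exposed_ports("namenode.example.internal", COMMON_NAMENODE_PORTS)` from a machine outside the cluster's network (the hostname is a placeholder) should return an empty list if the firewall or security group is doing its job.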
jarym almost 8 years ago
I knew it was a bad idea to post 'getting started' tutorials that skipped all the security steps and replaced them with a 'probably don't wanna do it this way in production' (and usually no documentation on how one should do it)...

Not levelling this comment at HDFS solely, but it's about time people stopped with the 'hello world' style examples.
Danihan almost 8 years ago
This was back in May, I wonder how it has changed / if anyone parsed some of this data..