TechEcho

Ask HN: Why not Java?

52 points by OmegaHN almost 13 years ago
I am building a web crawler to access data to be processed. All the code is fairly high level, so I am drawn to Python, but there are certain bits of it that require data manipulation that is much easier in a C-like language (arrays are a big part of it).

Java seems to fit this role very well. It is statically typed, object-oriented, and doesn't delve into memory. However, it seems to get a lot of hate (or, at least, dismissal) from many programming communities, so I am asking, why not Java? Why is it so horrible as a systems language above C? Is there any other language that fits this role in a better way?

I am asking in particular because I have been banging my head against the Python syntax for a while, but I am trying to expand the set of languages I can program in.

25 comments

strlen almost 13 years ago
It's perfectly fine to use Java for this kind of software.

The hate against Java comes from using Java for application development: this is largely due to the kinds of applications that are typically written in Java (line-of-business software) and (this is the most important reason) the accidental complexity and low quality of APIs like Spring or J2EE.

The recipe for programming happiness is to use the right tool for the job:

* Python (or Ruby) for web application development, development tools, and "devops" scripting.

* C (or C++) for pieces that need deterministic performance[1], provide a "native"-feeling user interface, or require control over memory layout.

Note: performance and efficiency are relative to what your throughput and latency requirements are. Google's crawlers and indexers will remain in C++ for the foreseeable future, but (for example) crawlers for an intranet can get away with being in Java (or Python for that matter).

* Java (or Scala, Haskell, OCaml, Go, Erlang, or one of the many Lisps) for "userland" systems programming. If the majority of the system fits under the previous bullet point, use C++.

* Avoid JNI or SWIG if you can. Use JSON + REST for cross-language RPC. If you need the performance guarantees of a tight binary protocol, use Thrift or Protocol Buffers. If you have to use JNI, consider using JNA first.

* No matter what language you use, stick to high-quality libraries and tools. For Java, you'll absolutely want to use Guava, Guice, and either Netty (or NIO.2 if you are using Java 7) or Jetty + Jersey + Jackson (for REST APIs).

Pick up either emacs and cscope, NetBeans, Eclipse, or IntelliJ for navigating a large Java codebase.

All Java build tools suck. Maven sucks less and is the de facto standard in the open source community. Twitter's "pants" is also worth looking at.

* Don't touch Spring with a 60-foot pole: in the mildest terms it's unequivocal and absolute garbage. Ditto for any other buzzword you may see in a job listing for an "enterprise" Java development job (with 20 years of experience required, naturally).

[1] Java performance can be quite high, but a JIT-ted and garbage-collected runtime implies a lack of determinism.

gojomo almost 13 years ago
Nothing's wrong with Java. Commercial and research-quality crawlers of tens of billions of web resources have been written in Java for over a decade. Its threading/concurrency support and extensive well-optimized libraries make it easier for you to make your code fast over large datasets... if you're good at Java. (If you're not, there are plenty of ways to sabotage yourself.)

But Java's a bit verbose, has gaps in concise support for higher-level constructs, and sometimes the static typing gets in the way. So if you don't find those parts helpful -- some do -- and think your performance targets can be met with other later optimizations/design-choices/selective-reimplementations, stick with whatever more concise language you're good at.

Or use any of the more concise languages available on the JVM that allow intermixing of the occasional Java facility, like Jython, JRuby, Groovy, JavaScript, Scala, Clojure, and others.

(If efficiently handling massive numbers of concurrent net/IO streams is a priority, the recent JVM-based project vert.x may be of interest. I haven't used it for anything but toy tests, but it seems to combine some of the best practices for maximum JVM IO throughput with a somewhat higher-level, language-agnostic top layer well suited for servers/proxies/crawlers.)
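
As a rough illustration of the threading support mentioned above, here is a minimal sketch of fanning crawl work out over a fixed-size thread pool using the standard `java.util.concurrent` classes. The class name `CrawlPoolSketch` and the `fakeFetch` method are hypothetical stand-ins invented for this example (no network I/O is performed), and a real crawler would also need politeness delays, timeouts, and error handling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: CrawlPoolSketch and fakeFetch are illustrative names only.
class CrawlPoolSketch {

    // Stand-in for a real HTTP fetch; it performs no network I/O.
    static String fakeFetch(String url) {
        return "<html>" + url + "</html>";
    }

    // Runs one fetch task per URL on a fixed-size pool.
    // Results are returned in the same order as the input URLs.
    static List<String> fetchAll(List<String> urls, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> pending = new ArrayList<Future<String>>();
            for (final String url : urls) {
                pending.add(pool.submit(new Callable<String>() {
                    @Override
                    public String call() {
                        return fakeFetch(url);
                    }
                }));
            }
            List<String> pages = new ArrayList<String>();
            for (Future<String> f : pending) {
                pages.add(f.get()); // blocks until that task has finished
            }
            return pages;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        } finally {
            pool.shutdown(); // stop accepting work so the JVM can exit
        }
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList("a.example", "b.example");
        System.out.println(fetchAll(urls, 2));
    }
}
```

The pool size caps how many fetches run at once, which is one simple way to keep thread memory use bounded.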

Derbasti almost 13 years ago
In my experience many Java programmers don't really "program" Java. They are more like "expert Eclipse users", and Eclipse happens to output Java. This style of development makes heavy use of wizards and the Eclipse refactoring tools.

This is probably a consequence of the verbosity of Java-the-language, which made heavy tooling support a necessity, combined with Eclipse, which provides one of the tightest language integrations of any IDE ever.

The sad thing is that this is not really the fault of Java-the-language or Eclipse. It did spawn a whole caste of very mediocre programmers and libraries though, which can make for a very unpleasant culture.

Used correctly, Java can be a great tool, though.

slurgfest almost 13 years ago
I am puzzled as to how arrays are hard to use in Python. I cannot understand how you could be 'banging your head' against Python's array syntax unless you are just new to Python.

If you want to use Java (e.g. you know it already and don't like learning other things), who cares? Why is this an issue where you have to challenge other people's opinions of Java? Use it if you want to.

rbanffy almost 13 years ago
You can always do the speed-critical parts in C and link that from your Python code. Or, if your analysis is something already done, use a library already written in C (such as NumPy).

Another approach could be Jython (or any other JVM language closer to the desired level of abstraction) and Java.

I don't have much love for Java the language. It's not much easier to program in than C, isn't faster, and is very verbose. Still, what you are doing looks like a good match for it. And all the respect I don't have for the language, I have for the JVM.

I wouldn't use it for web app development, as there are much more productive options around.

rockyj almost 13 years ago
Java is good, make no mistake about it. It offers you fine-grained control over almost every aspect of programming (e.g. concurrency). However, it is the same freedom that allows developers to make mistakes. For example:

One can write concurrent systems in Java without understanding concurrency. Languages like Scala and Clojure will give you some freedom but will also enforce certain design principles which will save you.

Similarly for web development, there are scores of frameworks in the Java world, and you can mess it up easily. Rails / Django, on the other hand, will provide one good, solid way to do web programming.

Finally, Java is showing its age. The need to write large XML files to configure things and the inability to treat functions as objects put developers off. Some things are being addressed by Oracle, but that will take time.

btilly almost 13 years ago
I am curious about what, specifically, you find easier to do in Java syntax than Python syntax.

Seriously, there is a fairly direct translation from any Java you might want to write to completely equivalent Python. Sure, Python offers more complex techniques such as list comprehensions and iterators. But you don't need to use them. You can just write Java-like Python.

pacala almost 13 years ago
> Why is it so horrible as a systems language above C?

* First-class functions (interfaces with one method) plus a garbage collector eventually encourage a functional programming style, with lots of little objects created on the heap. Alas, the per-object memory overhead of popular Java implementations is horrendous.

* Strong emphasis on using threads for concurrency. Alas, in practice, threads are incredibly large memory hogs.

* Verbosity. While it is possible to write clean composable code in Java, it is also remarkably verbose. After a while, this gets old and people take all the shortcuts they can to limit verbosity, which is a very bad idea. To quote an esteemed colleague, "I never took a shortcut I didn't regret later". Can we have our lambdas yet, pretty please?
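
For readers unfamiliar with the "interfaces with one method" idiom, the sketch below shows it next to the lambda syntax that later arrived in Java 8. The names `FilterSketch` and `UrlFilter` are hypothetical and chosen only for this example; it is meant to show the difference in verbosity, not to recommend a particular design.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch; FilterSketch and UrlFilter are hypothetical names.
class FilterSketch {

    // A single-method interface standing in for a first-class function.
    interface UrlFilter {
        boolean accept(String url);
    }

    // Returns the URLs accepted by the filter, preserving input order.
    static List<String> keep(List<String> urls, UrlFilter filter) {
        List<String> kept = new ArrayList<String>();
        for (String url : urls) {
            if (filter.accept(url)) {
                kept.add(url);
            }
        }
        return kept;
    }

    // Pre-Java-8 style: the function is wrapped in an anonymous inner class.
    static UrlFilter httpsOnlyVerbose() {
        return new UrlFilter() {
            @Override
            public boolean accept(String url) {
                return url.startsWith("https://");
            }
        };
    }

    // Java 8 and later: the same behaviour written as a lambda.
    static UrlFilter httpsOnlyLambda() {
        return url -> url.startsWith("https://");
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList("https://a.example", "http://b.example");
        System.out.println(keep(urls, httpsOnlyVerbose()));
        System.out.println(keep(urls, httpsOnlyLambda()));
    }
}
```

Both filters behave identically; the lambda form simply drops the boilerplate around the one line that matters.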

phao almost 13 years ago
There is hate against Java, and also against C#, C++, PHP (which I hate), C, and pretty much any other mainstream language.

Notice, though, that competent people have done great work using these languages. So you have some choices. Two of them are: wonder why people bash Java, or go do something useful with it. I suggest you do the second.

The key to using programming languages is to try to use the one which will help you the most, or get in your way the least. Sort of "the right tool for the job". I don't know what jobs Java is good at. If you have found that it's good for your project, then use it.

Take a look at this article: http://prog21.dadgum.com/143.html

orangecat almost 13 years ago
> Why is it so horrible as a systems language above C?

It's not "horrible", it just has many slight-to-moderate deficiencies and annoyances that make development more work than it should be.

> Is there any other language that fits this role in a better way?

Scala is strictly superior when used as a "better Java". (If you go deep into its functional capabilities you get a different set of tradeoffs.) C# is better as a language, but then you're tied to .NET.

Really we'd need to know more details of what you're doing and why you believe Python may not work. Are you concerned about performance, or do you need to do things that Python doesn't have convenient APIs for?

nostromo almost 13 years ago
I've written two crawlers in Java and found it quite well-suited.

I think most people on HN who hate Java are talking about creating websites, and for good reason. Back in the bad ol' days, people would use Java frameworks like Struts for web apps, and it was quite painful.

For my latest project I'm using Play Framework for front-end Java, and it's quite delightful.

freeslave almost 13 years ago
Nothing wrong with using Java, and there is something to be said for using the language you are most productive in. But if you are thinking of building a web crawler in Java, I would recommend taking a look at the Heritrix project: https://webarchive.jira.com/wiki/display/Heritrix/Heritrix It's robust, open source and easily extensible. It might be easier to write a custom module for it than to roll your own web crawler.

samspot almost 13 years ago
The best reason to use Java is the enormous ecosystem of libraries and resources.

The best reason to AVOID using Java is the huge demand for Java programmers and the low supply. At my job we can barely find applicants with Java experience, so we end up hiring .NET people and converting them.

bbayer almost 13 years ago
I personally don't like Java because of its API complexity, which is why I ended up with Python. I have implemented many crawlers using the Scrapy framework and I believe it speeds up development. We have crawled millions of pages without any problem.

Python is very powerful in terms of string manipulation because it has very good language constructs (like slice syntax) which make development easy. At the beginning it might be a little confusing, but once you have mastered it you really feel the power.

Twisted-like frameworks also do a good job here. Twisted is well designed and asynchronous, and it is well suited to multi-tier network applications.

jfb almost 13 years ago
It's a lousy language (IMO) with some excellent libraries and very fast compilers. If you're comfortable with the limitations of the language, and you're having trouble with Python, it's worth a shot, I guess.

Mikera almost 13 years ago
Java is a great platform to build on - and the sweet spot is definitely for server-side applications like this.

You can safely ignore the people who bash Java - they are generally clueless. The Java language is perfectly fine: high performance, statically typed, OOP, relatively simple and maintainable. It may not offer the most concise code and it may not have all the "trendy" language syntax features but guess what - that actually doesn't matter much in the real world (i.e. outside the realm of language designers and fanboys). If saving a few characters of typing is your major concern when choosing a language, you have much bigger problems.

But the real strength of Java is not the language but rather the overall platform - the combination of the JVM (which is an amazing high-performance feat of engineering), the library ecosystem (which is the best overall for any language), the tools (great IDEs, Maven, a host of other developer-focused tools), the fact that the OpenJDK itself and most of the libraries are open source, and the portability (compiled JVM code is extremely portable, and importantly doesn't need a recompile, unlike some other so-called "cross-platform" languages).

So overall you can't really go wrong with choosing Java for server-side applications. Although I would also give Clojure or Scala a look - if you are after "powerful" languages then these two are pretty amazing and you still get all the benefits of being on the Java platform.

jvvlimme almost 13 years ago
Java is well suited to this and even powers some powerful crawlers like Heritrix (archive.org) and Nutch (Apache foundation).

That being said, it doesn't really matter what language you write your crawler in: its performance will be limited by other factors (network latency, storage, etc.) long before the language you choose matters.

So pick the language you're most comfortable with for crawling and offload the data processing to a lower-level language that is better suited for that task.

Tichy almost 13 years ago
It's just that Java is very verbose, and actually I found it particularly horrible for data-driven applications (by this I mean apps whose behavior is determined by data/config files, not "Big Data" - I have no experience with the latter). For complex data types you always need to create complex class hierarchies. In other languages you could just write

    webInfo = {url: "bla.bla", title: "bla die blub", links: ["link1", "link2"]}

Notice that webInfo contains two different types, Strings and Arrays. In Java arrays or hashes you cannot easily mix types - you'll end up just putting objects everywhere, then be forced to litter the code with type casts. Or you create the unwieldy class hierarchy. That is my prediction, anyway - I am too lazy to come up with a good example :-(

You also cannot simply write something like the hash above. The nearest you can get is, if you have created that class hierarchy with suitable constructors, to instantiate it in one go. At least that is my memory - I have now avoided it for so long that I am not even sure how to instantiate an Array or a Hash with data on the fly anymore.

I think instantiating an array with data goes something like

    String[] links = new String[]{"bla", "blub"};

and there is nothing like that for Hashes - you are stuck with

    Map<String, Object> info = new HashMap<String, Object>(); // generics are particularly ugly and annoying
    info.put("links", new String[]{"bla", "blub"});
    info.put("title", "some stupid web site");
    info.put("url", "undisclosed");

And so on - a far cry from the example above. (Note the Java syntax is probably wrong, created from memory - but it is something like that.)

Even if you went through the mind-numbing work of creating appropriate classes, you'd be stuck with

    WebInfo info = new WebInfo(title, url, new String[]{link1, link2,...});

And that is just for two different types, and notice that there is no way to see what the names of the parameters of the WebInfo constructor actually are from that snippet of code.

    title: someTitle

is actually much more readable because you can instantly see that someTitle is supposed to be a title.

Also, if you want to use NoSQL, I suspect converting Java classes to JSON could be a pita, too.
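
The trade-off described in this comment can be made concrete with a small, compilable sketch. It builds the same `webInfo` data two ways: as an untyped `Map<String, Object>` that needs a cast on every read, and as a small typed class. The names `WebInfoSketch`, `asMap`, `linksOf`, and `asObject` are hypothetical and exist only for this illustration; the `WebInfo` fields mirror the keys used in the comment.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch; WebInfoSketch and its helper methods are hypothetical names.
class WebInfoSketch {

    // Typed alternative: more code up front, but no casts at the call site.
    static final class WebInfo {
        final String url;
        final String title;
        final List<String> links;

        WebInfo(String url, String title, List<String> links) {
            this.url = url;
            this.title = title;
            this.links = links;
        }
    }

    // Untyped map: quick to build, but values are only known to be Object.
    static Map<String, Object> asMap() {
        Map<String, Object> info = new HashMap<String, Object>();
        info.put("url", "bla.bla");
        info.put("title", "bla die blub");
        info.put("links", Arrays.asList("link1", "link2"));
        return info;
    }

    // Reading from the map requires an unchecked cast the compiler cannot verify.
    @SuppressWarnings("unchecked")
    static List<String> linksOf(Map<String, Object> info) {
        return (List<String>) info.get("links");
    }

    // The typed version; argument order is the only clue to each value's meaning.
    static WebInfo asObject() {
        return new WebInfo("bla.bla", "bla die blub", Arrays.asList("link1", "link2"));
    }

    public static void main(String[] args) {
        System.out.println(linksOf(asMap()));
        System.out.println(asObject().links);
    }
}
```

The map version is closer to the one-line literal in the comment but pushes type errors to run time; the class version catches them at compile time at the cost of the extra declarations the comment complains about.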

ljw1001 almost 13 years ago
There are some good, reasoned comments here. Java suffered through some unfortunate 'best practices' that tarnished its reputation. Building is, uh, suboptimal, as some have pointed out, but if you're working alone just keep it simple and it shouldn't be a problem.

Unless you're building something that needs to be (1) highly dynamic (like a web-based spreadsheet where you don't know the column types until run time), or (2) true real-time software, you're probably better off using Java. Some libraries do suck, as others wrote, but it's the volume of good libraries you care about. In any case, I'd argue that the code you write so quickly in many alternative languages doesn't need to be written at all in Java, because there's a library for it.

Verbosity is a fact in Java, but a decent IDE shields you from that as well. With Java it takes a little longer to get things done, but (in my experience) you spend less time working on performance, fixing problems in the underlying tools or language, or just dealing with your own bugs and keeping things running. Since most development is maintenance, you want to optimize for that.

mseepgood almost 13 years ago
> Is there any other language that fits this role

Go?

NTH almost 13 years ago
My main problem is that it has awful support for functional programming, which I find to be a really helpful way of doing something like a web crawler, where you're essentially describing a computation to parse some input. I would use F#, because it offers powerful functional programming tools, is on .NET / VS 2012 (not sure if that's a pro or con for you), and has type inference (so you get the benefits of static typing without the cost of writing out the types of everything).

You should probably check existing web crawler solutions to see if you can adapt them before rolling your own.
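
To show what "describing a computation to parse some input" can look like, here is a hedged sketch using the Stream API that Java gained in version 8, after this thread was written. The class name `LinkPipelineSketch` and its methods are hypothetical, and the regular expression is a deliberate simplification: a real crawler would use a proper HTML parser rather than a regex.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Illustrative sketch; LinkPipelineSketch is a hypothetical name.
class LinkPipelineSketch {

    // Simplified href matcher; not a substitute for a real HTML parser.
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]+)\"");

    // Extracts every href attribute value found in one page, in document order.
    static List<String> hrefs(String html) {
        List<String> found = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    // Declarative pipeline: pages -> links -> absolute only -> unique -> sorted.
    static List<String> absoluteLinks(List<String> pages) {
        return pages.stream()
                .flatMap(page -> hrefs(page).stream())
                .filter(link -> link.startsWith("http"))
                .distinct()
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> pages = Arrays.asList(
                "<a href=\"http://b.example\">b</a><a href=\"/local\">l</a>",
                "<a href=\"http://a.example\">a</a><a href=\"http://b.example\">b</a>");
        System.out.println(absoluteLinks(pages));
    }
}
```

Each stage of the pipeline names one step of the parse, which is the style the comment is asking for.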

lelele almost 13 years ago
Java's reputation suffers because of its association with corporate drones.

We may say that, with the current crop of languages running on the JVM, Java is a low-level language. It is to the JVM what C is to hardware. You avoid coding in both when you have higher-level languages available which will make you more productive.

But when you want to optimize performance on the JVM for specific chunks of your application - without resorting to JVM bytecode of course - Java is the right choice.

exelib almost 13 years ago
Java is good enough for all types of projects. People hate Java because... they can't write Java properly. Personally, I really like Python and JavaScript and features like multiple inheritance or prototyping, higher-order functions and so on. But Java is better suited to projects more complex than "Hello world", because it is type safe (robust, compile-time feedback), has excellent IDE support, is incredibly fast, and allows fast development (if you know Java, TDD and so on).

dotborg2 almost 13 years ago
You can always use JS/Python/whatever in your Java application as a scripting language.

In a case like a web crawler, the main issue with Java is scalability, or rather the lack of it. You need to code it yourself, but that's no different from other languages and platforms.

spullara almost 13 years ago
DropWizard is a great way to start a Java project.