How fast can a BufferedReader read lines in Java?

91 points by CCs almost 6 years ago

15 comments

elFarto almost 6 years ago

The first issue I can see with that code is that it's not doing what he expects. He does this to read the file into a StringBuffer:

<pre><code>
bf.lines().forEach(s -> sb.append(s));
</code></pre>

However, this ends up reading all the lines into one giant line, since the Strings that lines() produces have the newline character stripped. This leads to the second lines() call reading a single 23MB line (the file produced by gen.py). This is less than optimal.

The fastest version I managed to write was:

<pre><code>
public void readString5(String data) throws IOException {
    int lastIdx = 0;
    // Walk the string from one '\n' to the next, handing each line to parseLine.
    for (int idx = data.indexOf('\n'); idx > -1; idx = data.indexOf('\n', lastIdx)) {
        parseLine(data.substring(lastIdx, idx));
        lastIdx = idx + 1;
    }
    // Whatever follows the last '\n' (possibly an empty string) is parsed too.
    parseLine(data.substring(lastIdx));
}
</code></pre>

Not the prettiest thing, but it went from 0.594 GB/s to 1.047 GB/s. Also, it doesn't quite do the same as the lines() method, but that's easily changed.
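
A minimal sketch of the pitfall described above, for readers who want to reproduce it. The class and method names are hypothetical and the input is a tiny in-memory string rather than the 23MB file from the post; it only shows that appending the output of lines() without re-adding a terminator collapses the text into a single line.

```java
import java.io.BufferedReader;
import java.io.StringReader;

final class LinesPitfall {
    // Appends each line as-is; lines() has already stripped the terminators,
    // so the result is one long line.
    static String appendWithoutNewlines(String text) {
        StringBuilder sb = new StringBuilder();
        new BufferedReader(new StringReader(text)).lines().forEach(sb::append);
        return sb.toString();
    }

    // Re-adds '\n' after each line so the line structure survives.
    static String appendWithNewlines(String text) {
        StringBuilder sb = new StringBuilder();
        new BufferedReader(new StringReader(text)).lines()
                .forEach(s -> sb.append(s).append('\n'));
        return sb.toString();
    }

    // Counts how many lines a second lines() pass would see.
    static long countLines(String text) {
        return new BufferedReader(new StringReader(text)).lines().count();
    }

    public static void main(String[] args) {
        String text = "a\nbb\nccc\n";
        System.out.println(countLines(appendWithoutNewlines(text))); // prints 1
        System.out.println(countLines(appendWithNewlines(text)));    // prints 3
    }
}
```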

derefr almost 6 years ago

I don't know what Java's BufferedReader is doing, but it's probably not the optimal thing in terms of IO throughput. I would blame the algorithm long before blaming anything inherent about the JVM.

Erlang is another language where "naive" IO is kind of slow. https://github.com/bbense/beatwc/ is a project someone did to test various methods of doing IO in Erlang/Elixir, and their performance for a line-counting task, relative to the Unix wc(1) command.

It's interesting to see which approaches are faster. Yes, parallelism gains you a bit, but a much larger win comes from avoiding the stutter-stop effect of cutting the read buffer off whenever you hit a newline. Instead, the read buffer should be the same size as your IO source's optimal read-chunk size (a disk block; a TCP huge packet), and you should grab a whole buffer-ful of lines at a time, do a pattern-matching binary scan to collect all the indices of the newlines, and then use those indices to part the buffer out as slice references.

This achieves quite a dramatic speedup, since most of the time you don't need movable copies of the lines, and can copy the line (or more likely just part of it) yourself when you need to hold onto it.

This approach is probably also already built in to Java's "better" IO libraries, like NIO.
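
The scan-then-slice idea above can be sketched in Java roughly as follows. This is an illustration under assumptions, not what BufferedReader or NIO does internally: the class and method names are hypothetical, the chunk is an in-memory ByteBuffer standing in for one read from the IO source, and only '\n' is treated as a terminator. Scanning raw bytes for '\n' is safe for UTF-8 because that byte never occurs inside a multi-byte sequence.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class ChunkLineSlicer {
    // Scans one buffer-ful for '\n' and returns a read-only slice per complete line.
    // The slices share the chunk's memory; no line contents are copied here.
    static List<ByteBuffer> sliceLines(ByteBuffer chunk) {
        List<ByteBuffer> lines = new ArrayList<>();
        int start = chunk.position();
        for (int i = start; i < chunk.limit(); i++) {
            if (chunk.get(i) == '\n') {
                ByteBuffer view = chunk.asReadOnlyBuffer(); // shared content, own position/limit
                view.limit(i);
                view.position(start);
                lines.add(view.slice());
                start = i + 1;
            }
        }
        // Leave any partial trailing line in the chunk for the next read.
        chunk.position(start);
        return lines;
    }

    // Copies a line out only when the caller needs to hold onto it.
    static String copyOut(ByteBuffer slice) {
        return StandardCharsets.UTF_8.decode(slice.duplicate()).toString();
    }

    public static void main(String[] args) {
        ByteBuffer chunk = ByteBuffer.wrap("alpha\nbeta\ngam".getBytes(StandardCharsets.US_ASCII));
        for (ByteBuffer line : sliceLines(chunk)) {
            System.out.println(copyOut(line));
        }
        System.out.println(chunk.remaining() + " bytes left over for the next chunk");
    }
}
```

In a real read loop, the leftover bytes would typically be moved to the front of the buffer (for example with compact()) before the next read fills the rest.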

Sindisil almost 6 years ago

Honestly, it seems that nearly everyone here is missing his point.

Some of the blame for that probably lies with his headline choice, but he clearly states at the end of this post:

""" This is not the best that Java can do: Java can ingest data much faster. However, my results suggest that on modern systems, Java file parsing might be frequently processor-bound, as opposed to system bound. That is, you can buy much better disks and network cards, and your system won’t go any faster. Unless, of course, you have really good Java engineers.

Many firms probably just throw more hardware at the problem. """

It's not about this piece of code. It's not even about Java. In the previous post, which he mentions at the start of this one, he pointed out:

""" These results suggest that reading a text file in C++ could be CPU bound in that sense that buying an even faster disk would not speed up your single-threaded throughput. """

So, I take his point to be that one shouldn't make assumptions about performance. Rough performance scales -- such as have been posted here many times (e.g. [1]) -- make great rules of thumb for implementation choices or as a guide for where to look first for bottlenecks. To optimize in the real world, though, you're best served using real measurements.

[1] https://www.prowesscorp.com/computer-latency-at-a-human-scale/

znpy almost 6 years ago

The post does nothing to explain how and why; it just throws out a couple of outputs from an unspecified machine and makes no comparison.

It has no baseline and no specs. For all I know, he could have got his 0.5 GB/sec on an old Pentium II processor.

There is no analysis.

I am perplexed.

tawy12345 almost 6 years ago

I'm amazed at how upset some commenters are about a blog post that did a toy experiment and didn't actually make any strong claims.

I'm actually a stickler about good benchmarks - it riles me when people draw sweeping conclusions from poorly-designed experiments. Lemire is actually one of the good ones. If you want something more fully developed than a blog post, read one of his papers.

I personally really enjoy his blog because of this - he's good at picking interesting exploratory experiments that provide some insight, without trying to over-generalize from the results. If you read his conclusion, the point is that there is a good probability that even relatively simple programs are CPU-bound. His experiment supports that point. My experience also matches that - I've seen a lot of data processing code that could be I/O bound in theory (i.e. a perfect implementation could max out disk or network) but is CPU bound in practice. Usually because of string manipulation, regexes, or any number of other things.

> This is not the best that Java can do: Java can ingest data much faster. However, my results suggest that on modern systems, Java file parsing might be frequently processor-bound, as opposed to system bound. That is, you can buy much better disks and network cards, and your system won’t go any faster. Unless, of course, you have really good Java engineers.

ubu7737 almost 6 years ago

This is absurd. The original platform libraries do not account for the fastest use cases in any specialized IO scenario.

A Java NIO channel should have been used for this. It was demonstrated back in the early 2000s with the "Grand Canyon" demo achieving very good throughput for its time, and it's still the gold standard.
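
For reference, reading through an NIO channel looks roughly like the sketch below. It is not the code from the post or from the "Grand Canyon" demo, and it makes no claim about how fast it is compared with BufferedReader; that would need measuring. The class name and the 64 KiB buffer size are assumptions chosen for illustration. It counts '\n' bytes as a stand-in for per-line work.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

final class ChannelLineCount {
    // Reads the channel in fixed-size chunks and counts '\n' bytes.
    // Assumes a blocking channel, so read() returns at least one byte or -1.
    static long countNewlines(ReadableByteChannel channel) {
        ByteBuffer buf = ByteBuffer.allocateDirect(1 << 16); // 64 KiB, an arbitrary choice
        long count = 0;
        try {
            while (channel.read(buf) != -1) {
                buf.flip();                 // switch from filling to draining
                while (buf.hasRemaining()) {
                    if (buf.get() == '\n') {
                        count++;
                    }
                }
                buf.clear();                // ready for the next chunk
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java ChannelLineCount.java <path-to-file>
        try (FileChannel ch = FileChannel.open(Paths.get(args[0]), StandardOpenOption.READ)) {
            System.out.println(countNewlines(ch));
        }
    }
}
```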

adrianN almost 6 years ago

So what's the reason for this? Is it maybe because of some Unicode shenanigans? Java characters are 16 bits iirc, and strings have some forty bytes of constant overhead.

vbezhenar almost 6 years ago

Java has many inefficient parts. For example, there's no immutable array concept (or ownership concept, like in Rust), so a lot of unnecessary array copies happen in the JDK. String is not well designed. There was an attempt to abstract the String concept into CharSequence, but a lot of code still uses Strings.

I made a similar benchmark. The idea is as follows: we have a 2 GB byte array (because arrays in Java have a 32-bit limit, LoL) filled with values 32..126, imitating ASCII text, and values of 13 imitating newlines.

The first test simply XORs the whole array. It's the ideal result, which should correspond to memory bandwidth.

The second test wraps this array in a ByteArrayInputStream, converts it into a Reader using InputStreamReader with UTF-8 encoding, reads lines using BufferedReader and in the end also XORs every char value.

For 2 GB I get 516 ms as the ideal time (3.8 GB/s, which is still almost an order of magnitude less than the theoretical 19.2 GB/s DDR4 speed) and 3566 ms with BufferedReader, so you can have an almost 7x speed improvement with a better implementation.

Benchmark: https://pastebin.com/xMD4W8mn
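
A scaled-down sketch of the two tests described above, for readers who cannot open the pastebin. It is not the linked benchmark: the class name, the 64 MB size, the random seed, and the roughly one-in-80 newline density are all assumptions, and timing with System.nanoTime() in a loop is only a rough measurement.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Random;

final class XorBenchSketch {
    // Fills an array with printable ASCII (32..126) and occasional 13s as line breaks.
    static byte[] makeData(int size, long seed) {
        Random rnd = new Random(seed);
        byte[] data = new byte[size];
        for (int i = 0; i < size; i++) {
            data[i] = rnd.nextInt(80) == 0 ? (byte) 13 : (byte) (32 + rnd.nextInt(95));
        }
        return data;
    }

    // Test 1: XOR every byte directly (the "ideal" pass over memory).
    static int xorRaw(byte[] data) {
        int acc = 0;
        for (byte b : data) {
            acc ^= b;
        }
        return acc;
    }

    // Test 2: decode as UTF-8, split into lines with BufferedReader, XOR every char.
    static int xorViaBufferedReader(byte[] data) {
        int acc = 0;
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                for (int i = 0; i < line.length(); i++) {
                    acc ^= line.charAt(i);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return acc;
    }

    public static void main(String[] args) {
        byte[] data = makeData(64 << 20, 42L);
        for (int round = 0; round < 5; round++) { // repeated so the JIT has time to warm up
            long t0 = System.nanoTime();
            int raw = xorRaw(data);
            long t1 = System.nanoTime();
            int viaReader = xorViaBufferedReader(data);
            long t2 = System.nanoTime();
            System.out.printf("raw: %d ms, BufferedReader: %d ms (checksums %d, %d)%n",
                    (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, raw, viaReader);
        }
    }
}
```

The two checksums differ only by the XOR of the line-break bytes that readLine() strips; printing them also keeps the results in use.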

nullwasamistake almost 6 years ago

Eh, he didn't use NIO. BufferedReader is an ancient Java relic. Like reading from STDIN in C, it's not made to be fast; it's there for convenience and backwards compatibility.

Read a file using something like Vert.x, which is optimized for speed. I'm 100% confident it will be faster than the naive C approach.

tantalor almost 6 years ago

One thing that jumps out at me is that the test code writes to the "volume" variable in the read loop, I assume to count the number of bytes in the file, but never reads it back. A clever compiler will optimize away those writes, the string length check, the loop over the lines, and actually reading the file.

I'm not saying that's happening here, but it's a basic fact when writing benchmarks that you have to actually test something real and not a transient property of the program after the compiler has had its chance to be really smart.
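
One common way to guard against this is to make the measured result observable, for example by returning it and printing a running total. The sketch below assumes a "volume" counter like the one described above; the class name, the method, and the in-memory input are hypothetical stand-ins for the post's code. Benchmark harnesses such as JMH offer a Blackhole for the same purpose.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class ConsumedResultBench {
    // Sums the line lengths and returns the total so the caller can use it.
    static long volumeOf(String text) {
        long volume = 0;
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = br.readLine()) != null) {
                volume += line.length();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return volume;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 100_000; i++) {
            sb.append("some line of text\n");
        }
        String text = sb.toString();

        long sink = 0;
        long t0 = System.nanoTime();
        for (int round = 0; round < 20; round++) {
            sink += volumeOf(text); // the result feeds into something that is printed
        }
        long elapsedMs = (System.nanoTime() - t0) / 1_000_000;
        // Printing the sink means the work cannot be discarded as dead code.
        System.out.println("elapsed " + elapsedMs + " ms, sink = " + sink);
    }
}
```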

barbarbar almost 6 years ago

I am a bit confused. The scanFile reads from a file. But the readLines inside the for loop is reading from a StringReader - and not from a file?

chvid almost 6 years ago

At least two problems with the Java code. Concatenating strings with the plus operator creates a new string and copies the content of the old one, which pushes the complexity of the code from O(n) to O(n^2), where n is the number of lines. Secondly, order is not guaranteed with the forEach operation on streams.

The correct way to do it is using collect(Collectors.joining("\n")) or straightforward imperative style (without streams).

I don't think the general statement holds (that Java, or BufferedReader in particular, is CPU bound).
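
A minimal sketch of the collect(Collectors.joining("\n")) suggestion above. The class and method names are hypothetical, and a StringReader stands in for the file. Note that joining puts the separator between lines only, so the result has no trailing newline.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.stream.Collectors;

final class JoinLines {
    // Joins the reader's lines with '\n' in encounter order, using a single
    // internal builder instead of repeated string concatenation.
    static String joinLines(BufferedReader reader) {
        return reader.lines().collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        BufferedReader reader = new BufferedReader(new StringReader("a\nbb\nccc\n"));
        String joined = joinLines(reader);
        System.out.println(joined.length()); // prints 8: "a\nbb\nccc" without a trailing newline
    }
}
```

If the side-effecting style is kept instead, forEachOrdered is the stream operation that documents encounter-order processing.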

IloveHN84 almost 6 years ago

But the whole stream API is so terrible for performance... you write one line of code and you are already at O(n^5)

pjmlp almost 6 years ago

Completely wrong.

It is like asserting something about C based on GCC-specific behaviour.

Java is not a single language implementation.

nottorp almost 6 years ago

Java is... Java.

I was once working on an Android app on a cheap custom board with 128 MB of RAM (don't ask why Android on a single-function custom board, it wasn't my decision).

Among other things, I had to parse an 80,000-line CSV file. Splitting and the rest of the processing created so many temporary strings that the system ran out of RAM. We eventually gave up.
