I wrote a fast JSON parser in C a few years ago (600+ MB/s), and it was an interesting experience. Validating the JSON roughly halved the performance. The most interesting performance gain came from switching from aligned structs to packed byte offsets accessed with memcpy, which added 20-30%. The cost of giving up alignment was nothing compared to the benefit of fewer cache misses. In the end I found that making a truly fast JSON parser mostly depends on what you parse it into: is the resulting structure read-only, and how fast is it to access?
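
To make the aligned-struct vs. packed-offset point concrete, here is a minimal sketch of the idea. It is not the actual parser code: the node layout (a type byte, a double, and a 32-bit offset to the next node) and every name in it are hypothetical, and it assumes an 8-byte double. Fields are stored back to back in a byte buffer and moved in and out with memcpy, which is the portable way to do unaligned access in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Aligned layout: the compiler inserts padding so each field is
 * naturally aligned. On common 64-bit ABIs this struct is 24 bytes. */
typedef struct {
    uint8_t  type;    /* 1 byte, usually followed by padding   */
    double   number;  /* 8 bytes                               */
    uint32_t next;    /* 4 bytes, usually followed by padding  */
} AlignedNode;

/* Packed layout (hypothetical): the same fields with no padding,
 * addressed by byte offset. Assumes sizeof(double) == 8. */
enum {
    OFF_TYPE         = 0,
    OFF_NUMBER       = 1,
    OFF_NEXT         = 9,
    PACKED_NODE_SIZE = 13
};

/* Write one node at byte position pos; returns the position just past it. */
static size_t packed_write(uint8_t *buf, size_t cap, size_t pos,
                           uint8_t type, double number, uint32_t next)
{
    assert(pos + PACKED_NODE_SIZE <= cap);
    buf[pos + OFF_TYPE] = type;
    /* memcpy avoids undefined behavior from misaligned pointer casts;
     * compilers typically lower it to a single unaligned store. */
    memcpy(buf + pos + OFF_NUMBER, &number, sizeof number);
    memcpy(buf + pos + OFF_NEXT, &next, sizeof next);
    return pos + PACKED_NODE_SIZE;
}

/* Read the number field of the node starting at byte position pos. */
static double packed_number(const uint8_t *buf, size_t pos)
{
    double v;
    memcpy(&v, buf + pos + OFF_NUMBER, sizeof v);
    return v;
}

/* Read the next-node offset of the node starting at byte position pos. */
static uint32_t packed_next(const uint8_t *buf, size_t pos)
{
    uint32_t v;
    memcpy(&v, buf + pos + OFF_NEXT, sizeof v);
    return v;
}
```

With this layout a node takes 13 bytes instead of the 24 that the aligned struct typically occupies, so almost twice as many nodes fit in each cache line. Whether that is a net win depends on the CPU and the access pattern, so it is worth measuring rather than assuming.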