Next, try the taggers in a more realistic setting than the standard corpora, e.g. a product review that compares several products, and you'll instantly see how poor current state-of-the-art NER is.<p>Technology is really going to advance once we have anything that comes close to human level on NER and relation extraction. It's kind of like self-driving cars: the basic ideas have been around for decades, but performance in realistic adverse conditions remains awful almost everywhere it could theoretically be used.
It's always nice to know that your master's programme requires more of you in just one exam (building a relation extraction pipeline, including POS tagging and an NER system).<p>Having said that, it's been shown pretty well that CRFs with simple features outperform the Stanford Parser (and they get even better with better features, particularly for organisations). CRFs also beat out HMMs, but it could be interesting to see how neural networks would do.
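To make "simple features" concrete, here is a rough sketch of the kind of per-token features I mean. Everything in it is my own illustration: the function names, the feature names and the `ORG_SUFFIXES` list are hypothetical, not a published feature set, and it only builds the features rather than training anything.

```python
# Hypothetical hint list for organisation names; purely illustrative.
ORG_SUFFIXES = {"inc", "corp", "ltd", "gmbh", "llc"}


def token_features(tokens, i):
    """Build a dict of simple features for the token at position i."""
    word = tokens[i]
    feats = {
        "bias": 1.0,
        "lower": word.lower(),
        "suffix3": word[-3:],
        "is_title": word.istitle(),
        "is_upper": word.isupper(),
        "has_digit": any(c.isdigit() for c in word),
        # A slightly "better" feature aimed at organisations.
        "org_suffix": word.lower().rstrip(".") in ORG_SUFFIXES,
    }
    # Context features: neighbouring words, or sentence-boundary markers.
    if i > 0:
        feats["prev_lower"] = tokens[i - 1].lower()
    else:
        feats["BOS"] = True
    if i < len(tokens) - 1:
        feats["next_lower"] = tokens[i + 1].lower()
    else:
        feats["EOS"] = True
    return feats


def sentence_features(tokens):
    """Return one feature dict per token in the sentence."""
    return [token_features(tokens, i) for i in range(len(tokens))]


if __name__ == "__main__":
    for feats in sentence_features(["Acme", "Corp", "released", "the", "X200"]):
        print(feats)
```

A list of such dicts per sentence is the input shape that CRF toolkits such as sklearn-crfsuite accept, so this should plug into a trainer with little extra work.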