We've been using spaCy a lot for the past few months.<p>Mostly for non-production use cases, but I can say it is the most robust NLP framework at the moment.<p>V3 added support for transformers: that's a killer feature, as many models from <a href="https://huggingface.co/docs/transformers/index" rel="nofollow">https://huggingface.co/docs/transformers/index</a> work great out of the box.<p>At the same time, I found the NER models that ship with spaCy to have low accuracy on real data: we deal with news articles <a href="https://demo.newscatcherapi.com/" rel="nofollow">https://demo.newscatcherapi.com/</a><p>Also, while ML models get most of the crowd's attention, I think many problems can be solved with a rule-based approach, and spaCy is just amazing for that.<p>Btw, we recently wrote a blog post comparing spaCy to NLTK on a text normalization task: <a href="https://newscatcherapi.com/blog/spacy-vs-nltk-text-normalization-comparison-with-code-examples" rel="nofollow">https://newscatcherapi.com/blog/spacy-vs-nltk-text-normaliza...</a>
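<p>To make the rule-based point concrete, here's a minimal sketch using spaCy's Matcher. The pattern and example text are my own (hypothetical); note that a blank pipeline is enough here, since rule matching only needs the tokenizer, not a trained model:<p><pre><code>import spacy
from spacy.matcher import Matcher

# A blank English pipeline: no trained model needed for pure
# rule-based matching, just the tokenizer.
nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

# Hypothetical pattern: "machine learning" case-insensitively,
# with an optional punctuation token (e.g. a hyphen) in between.
pattern = [
    {"LOWER": "machine"},
    {"IS_PUNCT": True, "OP": "?"},
    {"LOWER": "learning"},
]
matcher.add("ML_TERM", [pattern])

doc = nlp("Machine learning and machine-learning are the same thing.")
matches = [doc[start:end].text for _, start, end in matcher(doc)]
print(matches)  # catches both surface forms
</code></pre><p>One pattern handles casing and hyphenation variants that would take several regexes to cover, which is a big part of why spaCy works so well for this kind of task.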
A relatively underdiscussed quirk of the rise of superlarge language models like GPT-3 for certain NLP tasks is that, since those models have incorporated so much real-world grammar, there's no need for advanced preprocessing: you can just YOLO and work with generated embeddings instead, without going into spaCy's (excellent) parsing/NER features.<p>OpenAI recently released an Embeddings API for GPT-3 with good demos and explanations: <a href="https://beta.openai.com/docs/guides/embeddings" rel="nofollow">https://beta.openai.com/docs/guides/embeddings</a><p>Hugging Face Transformers makes this easier (and for free), as most models can be configured to return a "last_hidden_state", which holds per-token hidden states you can pool into a single embedding. Just use DistilBERT uncased/cased (which is fast enough to run on consumer CPUs) and you're probably good to go.
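<p>A minimal sketch of that DistilBERT route, assuming the standard transformers API. Since last_hidden_state gives one vector per token, some aggregation is needed; mean pooling over non-padding tokens is one common choice (an assumption here, CLS pooling is another option):<p><pre><code>import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModel.from_pretrained("distilbert-base-uncased")

sentences = [
    "spaCy is great for rule-based NLP.",
    "Transformers give you embeddings for free.",
]
batch = tokenizer(sentences, padding=True, truncation=True,
                  return_tensors="pt")

with torch.no_grad():
    # Shape: (batch, seq_len, 768) -- one vector per token.
    hidden = model(**batch).last_hidden_state

# Mean-pool over real tokens only, masking out padding.
mask = batch["attention_mask"].unsqueeze(-1)
embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)  # torch.Size([2, 768])
</code></pre><p>The resulting vectors can then feed cosine-similarity search, clustering, or a downstream classifier without any spaCy-style preprocessing.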