We've been using spaCy a lot for the past few months.<p>Mostly for non-production use cases, but I can say it is the most robust NLP framework at the moment.<p>V3 added support for transformers: that's a killer feature, as many models from <a href="https://huggingface.co/docs/transformers/index" rel="nofollow">https://huggingface.co/docs/transformers/index</a> work great out of the box.<p>At the same time, I found the NER models that ship with spaCy to have low accuracy on real data: we deal with news articles <a href="https://demo.newscatcherapi.com/" rel="nofollow">https://demo.newscatcherapi.com/</a><p>Also, while ML models get most of the crowd's attention, I think many problems can be solved with a rule-based approach, and spaCy is just amazing for that.<p>Btw, we recently wrote a blog post comparing spaCy to NLTK on a text normalization task: <a href="https://newscatcherapi.com/blog/spacy-vs-nltk-text-normalization-comparison-with-code-examples" rel="nofollow">https://newscatcherapi.com/blog/spacy-vs-nltk-text-normaliza...</a>
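<p>To make the rule-based point concrete, here's a minimal sketch using spaCy's Matcher. The pattern and example text are my own (hypothetical); note that a blank pipeline is enough here, since rule matching only needs the tokenizer, not a trained model:<p><pre><code>import spacy
from spacy.matcher import Matcher

# A blank English pipeline: no trained model needed for pure
# rule-based matching, just the tokenizer.
nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

# Hypothetical pattern: "machine learning" case-insensitively,
# with an optional punctuation token (e.g. a hyphen) in between.
pattern = [
    {"LOWER": "machine"},
    {"IS_PUNCT": True, "OP": "?"},
    {"LOWER": "learning"},
]
matcher.add("ML_TERM", [pattern])

doc = nlp("Machine learning and machine-learning are the same thing.")
matches = [doc[start:end].text for _, start, end in matcher(doc)]
print(matches)  # catches both surface forms
</code></pre><p>One pattern handles casing and hyphenation variants that would take several regexes to cover, which is a big part of why spaCy works so well for this kind of task.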
A relatively underdiscussed quirk of the rise of superlarge language models like GPT-3 for certain NLP tasks is that, since those models have incorporated so much real-world grammar, there's no need for advanced preprocessing: you can just YOLO and work with generated embeddings instead, without going into spaCy's (excellent) parsing/NER features.<p>OpenAI recently released an Embeddings API for GPT-3 with good demos and explanations: <a href="https://beta.openai.com/docs/guides/embeddings" rel="nofollow">https://beta.openai.com/docs/guides/embeddings</a><p>Hugging Face Transformers makes this easier (and for free), as most models can be configured to return a "last_hidden_state", which holds per-token hidden states you can pool into a single embedding. Just use DistilBERT uncased/cased (which is fast enough to run on consumer CPUs) and you're probably good to go.
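<p>A minimal sketch of that DistilBERT route, assuming the standard transformers API. Since last_hidden_state gives one vector per token, some aggregation is needed; mean pooling over non-padding tokens is one common choice (an assumption here, CLS pooling is another option):<p><pre><code>import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModel.from_pretrained("distilbert-base-uncased")

sentences = [
    "spaCy is great for rule-based NLP.",
    "Transformers give you embeddings for free.",
]
batch = tokenizer(sentences, padding=True, truncation=True,
                  return_tensors="pt")

with torch.no_grad():
    # Shape: (batch, seq_len, 768) -- one vector per token.
    hidden = model(**batch).last_hidden_state

# Mean-pool over real tokens only, masking out padding.
mask = batch["attention_mask"].unsqueeze(-1)
embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)  # torch.Size([2, 768])
</code></pre><p>The resulting vectors can then feed cosine-similarity search, clustering, or a downstream classifier without any spaCy-style preprocessing.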