Ask HN: What approach would you suggest for Text classification?

1 point by gerenuk, almost 7 years ago
Hey everyone!

We are trying to solve a problem where we need to classify articles into the right categories.

Currently, we are using FastText to train a model on 100,000 articles categorized into 600 categories. The loss seems to be converging, but the precision is not going up. Another thing that needs clarification: can we use pre-trained English Wikipedia embeddings to categorize text?

Would you recommend FastText, or some other algorithm/approach, for this problem?

Any suggestions/ideas would be appreciated.

Thanks.
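For readers following along, here is a minimal sketch of the setup described above: preparing labelled articles in fastText's `__label__` line format and, optionally, initialising training from pre-trained vectors. The helper names, the file paths, and the hyperparameter values are all assumptions for illustration, not the poster's actual configuration. The `train` function needs the `fasttext` Python package; the formatting helpers do not.

```python
import re
from pathlib import Path


def to_fasttext_line(category: str, text: str) -> str:
    """Format one article as a fastText supervised-training line."""
    # fastText expects labels with the "__label__" prefix and no spaces.
    label = "__label__" + re.sub(r"\s+", "_", category.strip())
    # Collapse newlines and repeated whitespace: one example per line.
    body = re.sub(r"\s+", " ", text).strip().lower()
    return f"{label} {body}"


def write_training_file(articles, path: str) -> int:
    """Write (category, text) pairs to `path`; return the number of lines."""
    lines = [to_fasttext_line(c, t) for c, t in articles]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


def train(train_path: str, vectors_path: str):
    """Train a classifier initialised from pre-trained vectors."""
    import fasttext  # imported lazily so the helpers above work without it

    return fasttext.train_supervised(
        input=train_path,
        pretrainedVectors=vectors_path,  # hypothetical path to a .vec file
        dim=300,       # must match the dimension of the pre-trained vectors
        epoch=25,      # illustrative values, not tuned for this dataset
        lr=0.5,
        wordNgrams=2,
    )
```

With 600 categories and 100,000 articles, each category has only about 167 examples on average, so it may also be worth checking per-category counts before blaming the algorithm.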

1 comment

smithmayowa, almost 7 years ago
FastText is state of the art for word embeddings because it can generate embeddings even for words it has not seen, so perhaps your problem lies in your model's architecture. Are you using convolutional neural nets or just basic feed-forward networks? I have had great success using CNNs for text classification. Also, in your preprocessing, are you filtering out stopwords (very common English words that can hurt a model's ability to classify text correctly)?
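The stopword filtering mentioned above can be sketched in a few lines. The stopword set here is a tiny illustrative assumption; real lists (for example those shipped with NLP toolkits) are much longer, and the function name is hypothetical.

```python
import re

# Tiny illustrative stopword list (an assumption); real lists are much longer.
STOPWORDS = {"a", "an", "and", "are", "in", "is", "of", "on", "the", "to"}


def remove_stopwords(text: str, stopwords=STOPWORDS) -> str:
    """Lowercase, tokenize on letters/digits/apostrophes, drop stopwords."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return " ".join(t for t in tokens if t not in stopwords)


print(remove_stopwords("The price of Bitcoin is rising in the US"))
# prints "price bitcoin rising us"
```

Whether this helps depends on the data, so it is worth comparing validation precision with and without the filter rather than assuming it improves results.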