The IIT Bombay English-Hindi Parallel Corpus [pdf]

103 points by aq3cn over 7 years ago

2 comments

mongodude over 7 years ago
I have been waiting a long time for Indian universities, especially the IITs, to invest in building and publishing such corpora. As the founder of an AI/ML startup, I am surprised at the appalling lack of datasets available for working on Indian problems. Contrast this with Chinese universities, which have built some world-class datasets for building NLP solutions in Mandarin. Our sentiment analysis works in 8 different languages, but none of them is an Indian language, despite our being in India!
gumby over 7 years ago
Being it's IIT Bombay, I hope for Marathi some day soon.

I know different people have different priorities :-)