Unsupervised word embeddings capture latent knowledge from scientific literature

135 points by KasianFranks almost 6 years ago

12 comments

gnomewascool almost 6 years ago
For the lazy, the doi is 10.1038/s41586-019-1335-8 if you want to add it to your bibliographies. Obviously, don't use the doi for any illegal purposes, such as getting around the paywall.
solarist almost 6 years ago
An article about the paper by the first author: https://towardsdatascience.com/using-unsupervised-machine-learning-to-uncover-hidden-scientific-knowledge-6a3689e1c78d
iandanforth almost 6 years ago
Summary: Given abstracts of materials science papers, they were able to predict that certain materials would have desirable/interesting properties before these materials were actually examined for those properties. This was confirmed by "holding out" recent years of data and then seeing if predictions from, say, 2009 would have held up today. They have also made predictions which have yet to be confirmed or refuted.

Interesting points on future work:

- This was only using abstracts. Using full papers could yield significant improvements.
- Uses word2vec and not BERT / ELMo, so there's likely to be another jump in performance there.
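
The hold-out check described in the summary above can be sketched roughly as follows. This is a toy illustration, not the authors' code: the three-dimensional vectors, the placeholder material names, and the helper names (`cosine`, `rank_candidates`, `precision_at_k`) are all invented for the example. A real run would presumably use word2vec embeddings trained only on abstracts published up to the cutoff year.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(embeddings, target, candidates):
    """Sort candidate words by similarity to the target word, best first."""
    target_vec = embeddings[target]
    return sorted(
        candidates,
        key=lambda word: cosine(embeddings[word], target_vec),
        reverse=True,
    )


def precision_at_k(ranked, later_confirmed, k):
    """Fraction of the top-k predictions that were confirmed after the cutoff."""
    return sum(1 for word in ranked[:k] if word in later_confirmed) / k


# Hypothetical embeddings "trained" on pre-cutoff text only (made-up numbers).
toy_embeddings = {
    "thermoelectric": [0.9, 0.1, 0.0],
    "material_a": [0.8, 0.2, 0.1],
    "material_b": [0.1, 0.9, 0.2],
    "material_c": [0.7, 0.0, 0.3],
    "material_d": [0.0, 0.2, 0.9],
}
candidates = ["material_a", "material_b", "material_c", "material_d"]

# Materials that the held-out (post-cutoff) literature reports as thermoelectric.
# Also made up for the example.
later_confirmed = {"material_a", "material_d"}

ranked = rank_candidates(toy_embeddings, "thermoelectric", candidates)
score = precision_at_k(ranked, later_confirmed, k=2)
print(ranked, score)
```

With these toy numbers, one of the two top-ranked candidates shows up in the held-out set. The point of the design is that the ranking never sees post-cutoff text, so any overlap with later findings counts as a genuine prediction.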
moconnor almost 6 years ago
You can get an idea of the content from their GitHub: https://github.com/materialsintelligence/mat2vec/blob/master/README.md

The author emails are at the end of README.md if you still want to ask for a preprint.
PeterStuer almost 6 years ago
If you do not have access to the Nature paper, this paper reports on the same study: https://chemrxiv.org/articles/Named_Entity_Recognition_and_Normalization_Applied_to_Large-Scale_Information_Extraction_from_the_Materials_Science_Literature/8226068/1
haddr almost 6 years ago
A five-year-old discovery, nothing spectacular (as of 2019). On the other hand, it is a good example of publishing: code, corpora, and materials are available for everyone to reproduce it.
msamwald almost 6 years ago
This is certainly a nice paper, but it is also a bit puzzling that this was noteworthy enough to be published in Nature.
delton137 almost 6 years ago
We published something similar in spirit recently (although it ended up as a conference paper and not in Nature). Notably, we did our study with much less data: instead of millions of patents, we had the text of a few thousand patents and a few hundred conference papers. We had a specific focus: texts about energetic materials (explosives and propellants).

We showed how chemical-application and chemical-property relations are captured by word2vec and GloVe. For instance, we found that rocket fuels were the chemicals appearing closest to "rocket", while materials used in air bags appeared closest to "air bag". We were able to filter to chemical names using ChemDataExtractor, and further to likely energetic chemicals by obtaining SMILES strings from PubChem and using a classifier to label them as likely energetics or not.

You can find our work here: https://arxiv.org/pdf/1903.00415.pdf
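
The "closest chemicals to an application word" lookup described above might look something like this minimal sketch. It is not the code from the linked paper: the two-dimensional vectors are invented, `nearest_chemicals` is a hypothetical helper, and a plain set lookup stands in for the ChemDataExtractor filtering step (the PubChem/classifier stage is omitted entirely).

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_chemicals(embeddings, query, is_chemical, top_n=3):
    """Return the top_n words closest to `query` that pass the chemical filter."""
    query_vec = embeddings[query]
    scored = [
        (word, cosine_similarity(vec, query_vec))
        for word, vec in embeddings.items()
        if word != query and is_chemical(word)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in scored[:top_n]]


# Made-up 2-D vectors; real embeddings would come from word2vec or GloVe.
toy_vectors = {
    "rocket": [1.0, 0.1],
    "air_bag": [0.1, 1.0],
    "launch": [0.95, 0.2],  # close to "rocket" but not a chemical name
    "hydrazine": [0.9, 0.2],
    "kerosene": [0.8, 0.3],
    "sodium_azide": [0.2, 0.9],
}

# Stand-in for a chemical-name recognizer (assumption: a fixed whitelist).
known_chemicals = {"hydrazine", "kerosene", "sodium_azide"}

rocket_hits = nearest_chemicals(
    toy_vectors, "rocket", known_chemicals.__contains__, top_n=2
)
air_bag_hits = nearest_chemicals(
    toy_vectors, "air_bag", known_chemicals.__contains__, top_n=1
)
print(rocket_hits, air_bag_hits)
```

Filtering before ranking keeps ordinary neighbours such as "launch" out of the results, which is the role the chemical-name extraction plays in the description above.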
tastroder almost 6 years ago
Is the novel part the application to materials science? I can't get to the Nature paper on mobile, but the analysis in the other resources linked here looks pretty thorough.

Is there anything new methodology-wise in the Nature version?
smaddox almost 6 years ago
Do the authors have a draft pdf available?
tshitoyan almost 6 years ago
Hi all, glad to see our paper caught your attention. Here is a link to read the paper: https://rdcu.be/bItqk
Der_Einzige almost 6 years ago
Between this and the UMAP paper on cancer being published in Nature, I'm convinced that my next publication will be in the same place that Isaac Newton published in.