Zero-shot transfer across 93 languages

283 points by moneil971, over 6 years ago

14 comments

minimaxir, over 6 years ago
> The encoder is five-layer bidirectional LSTM (long short-term memory) network. In contrast with neural machine translation, we do not use an attention mechanism but instead have a 1,024-dimension fixed-size vector to represent the input sentence.

5 layers of 1024-cell *bidirectional* LSTMs (edit: actually 512-cells x2?)? Can consumer GPUs even fit that (+ the decoder) into RAM?
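A rough back-of-the-envelope count for the memory question above, as a sketch rather than a statement about the released model. It assumes 512 cells per direction (the "512-cells x2" reading), a hypothetical input embedding size of 320, two bias vectors per LSTM as in PyTorch's `nn.LSTM` layout, and 32-bit weights. It covers encoder weights only; the embedding table, the decoder, and training-time activations are not counted.

```python
def lstm_params(input_size, hidden_size, biases=2):
    # 4 gates, each with input weights, recurrent weights, and bias vector(s).
    return 4 * (hidden_size * (input_size + hidden_size) + biases * hidden_size)


def bilstm_stack_params(embed_dim, hidden_size, num_layers):
    # Each layer has a forward and a backward LSTM; layers after the first
    # read the concatenated outputs of both directions (2 * hidden_size).
    total = 0
    in_size = embed_dim
    for _ in range(num_layers):
        total += 2 * lstm_params(in_size, hidden_size)
        in_size = 2 * hidden_size
    return total


# embed_dim=320 is an assumed value, not taken from the quoted text.
params = bilstm_stack_params(embed_dim=320, hidden_size=512, num_layers=5)
megabytes = params * 4 / 2**20  # 4 bytes per float32 weight
print(f"{params:,} parameters, about {megabytes:.0f} MiB in float32")
```

Under these assumptions the encoder weights come to roughly 29 million parameters, on the order of 100 MiB, which is small next to the memory of a typical consumer GPU. Whether training fits depends on batch size, sequence length, and the parts left out here.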
nograpes, over 6 years ago
I think the English translations of the Hindi and the Bulgarian are mixed up. The Hindi should be "Their destination was secret", and the Bulgarian should be "Nobody knew where they went". Also, the Devanagari script is not rendered properly; the diacritical marks should be directly over (or under) the related character, and conjunct characters are not "squished" together.
raldi, over 6 years ago
Can someone explain what "zero shot" means? The link doesn't explain, and some basic googling doesn't either.
XaspR8d, over 6 years ago
Tangent: I've always wondered if there would be utility in humans gaining expertise in writing "for" translation, i.e. knowing what kinds of semantic and syntactic constructs are the least lossy when localized, or perhaps even learning to write in some intermediate, non-native-human language whose reduced feature set guarantees a certain level of translatability.

I suppose the answer might be that machine translation will improve fast enough that such a field wouldn't have time to emerge. But I always think that using humans to intelligently fill gaps in machine competency is a *neat* solution!
sideral, over 6 years ago
The license is CC non-commercial. Does anyone here know if this means that it cannot be used to train models that will be used commercially?
etiam, over 6 years ago
Wish they'd stayed away from LASER. That acronym's already taken...
nahh, over 6 years ago
If we get babelfish from Facebook, was it worth it?
vladislav, over 6 years ago
"LASER achieves these results by embedding all languages jointly in a single shared space (rather than having a separate model for each)". There could be a good reason why the mutual embedding of several languages works better than individual embeddings, beyond the extra data. If human languages share some minimal representation (universality, so to speak), training on multiple languages may be required to extract it with today's techniques, since training on just one language is bound to overfit to its particulars.
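To make the "single shared space" idea concrete, here is a minimal sketch of cross-language matching by cosine similarity. It does not call LASER: the 3-dimensional vectors are made-up stand-ins for the 1,024-dimension sentence vectors, and `sentence_A` and `sentence_B` are hypothetical placeholders for sentences in another language. The point is only that, if every language lands in the same space, one similarity function serves all language pairs.

```python
import math


def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(query_vec, candidates):
    # Return the key of the candidate vector most similar to query_vec.
    return max(candidates, key=lambda k: cosine(query_vec, candidates[k]))


# Made-up vectors for illustration only (not real model output).
english = {
    "Their destination was secret": [0.9, 0.1, 0.2],
    "My hovercraft is full of eels": [0.1, 0.8, 0.3],
}
other_language = {
    "sentence_A": [0.85, 0.15, 0.25],  # hypothetical translation of the first
    "sentence_B": [0.05, 0.90, 0.20],  # hypothetical translation of the second
}

matches = {name: nearest(vec, english) for name, vec in other_language.items()}
for name, match in matches.items():
    print(name, "->", match)
```

With these toy values each placeholder sentence is matched to its intended English counterpart. Real embeddings would come from the encoder, and match quality would depend on the model.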
stephanimal, over 6 years ago
That graphic of the language families seems to misspell Estonian (as Estinain) and Finnish (as Finish)? Seems like an odd oversight for such a project.
MikusR, over 6 years ago
It even works on Latavian language (as shown on the top animation)
ahurmazda, over 6 years ago
I have been pleasantly surprised by FB's suggested translation even when the messages are written in seemingly (to me) complicated transliteration of Bengali.
vectorEQ, over 6 years ago
Pretty nice release. Just note that I find it a bit silly to refer to the Berber language as if it's one language. It's a group of languages, and moreover they are phonetic, so the text you train on can vary greatly between the languages, and even between writers, in how it is written.
bregma, over 6 years ago
My hovercraft is full of eels.
oska, over 6 years ago
I find it interesting that this page has already been 'snapshotted' on the Internet Archive 24 times [1], less than a day after it appeared. Is this because, like me, people are wary of visiting any Facebook domain? Or is it because people consider it an important research result? (Obviously it can also be both).

[1] https://web.archive.org/web/*/https://code.fb.com/ai-research/laser-multilingual-sentence-embeddings/