For technical HN readers, I think the article[1] that the author linked is better.<p>After reading Google's explanation, I don't think his comment is accurate:<p><i>>Google Translate invented its own language to help it translate more effectively.<p>>What’s more, nobody told it to. It didn’t develop a language (or interlingua, as Google call it) because it was coded to. It developed a new language because the software determined over time that this was the most efficient way to solve the problem of translation.</i><p>That makes it sound like the middle GNMT box (alternating in blue and orange) was automatically fabricated by the algorithm. Instead, what seems to have happened is that the <i>existence</i> of an "intermediate" representation was a deliberate architecture choice by human Google programmers. What got "learned by machine" was the buildup of internal data (filling the vectors with numbers to find mappings of "meaning").<p>Google programmers can chime in on this, but as an outsider, I'm guessing the previous incarnations of Translate were more "point-to-point" than "hub-and-spoke".<p>With 103 languages, point-to-point computed as "n choose k"[2] with k=2 means 5253[3] possible direct mappings. (Although a pair such as <i>Swahili</i> to <i>an Australian Aboriginal language</i> would probably not be filled with translation data.)<p>With the new GNMT (the intermediate hub), you don't need 5253 mappings. Instead of n!/(k!(n-k)!) combinations, it's just n. (However, I'm not saying that reducing the mathematical combinations was the main motivator for the re-architecture.)<p>An analogy would be the LLVM IR intermediate representation. One can target an "intermediate hub" language like LLVM IR. This reduces the combinatorial complexity of every frontend programming language compiler having to understand every backend machine language. Languages like Rust and Julia then don't have to write point-to-point backends for specific machine targets like x86, ARM, and SPARC.
The difference with Google's GNMT is that the "keywords" of the intermediate language were not pre-specified by humans.<p>[1] <a href="https://research.googleblog.com/2016/11/zero-shot-translation-with-googles.html" rel="nofollow">https://research.googleblog.com/2016/11/zero-shot-translatio...</a><p>[2] <a href="https://en.wikipedia.org/wiki/Combination#Number_of_k-combinations" rel="nofollow">https://en.wikipedia.org/wiki/Combination#Number_of_k-combin...</a><p>[3] <a href="https://www.google.com/search?q=(103%5E2-103)%2F2" rel="nofollow">https://www.google.com/search?q=(103%5E2-103)%2F2</a>
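<p>For anyone who wants to check the counts above, here is a minimal Python sketch of the arithmetic. The function names are my own, purely for illustration, and it assumes unordered language pairs (k=2) for the point-to-point case and one mapping per language for the hub case, as described above. It says nothing about how Google actually structures its models.

```python
from math import comb


def point_to_point_pairs(n_languages: int) -> int:
    """Number of unordered language pairs: n choose 2 (hypothetical helper)."""
    return comb(n_languages, 2)


def hub_and_spoke_mappings(n_languages: int) -> int:
    """With a shared intermediate hub, each language needs one mapping (hypothetical helper)."""
    return n_languages


if __name__ == "__main__":
    n = 103
    # Same value as the (103^2 - 103) / 2 search in footnote [3].
    print(point_to_point_pairs(n))    # 5253
    print(hub_and_spoke_mappings(n))  # 103
```

The same counting applies to the LLVM analogy: with a shared IR the work grows roughly as frontends plus backends rather than frontends times backends.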