So glad someone took the time to put up some data about this. Since day one, the subpar results for Asian languages have stuck out to me. It's especially true for Llama-derived models, where the output is just abysmal. My own pet theory is that bad tokenization is an important reason why they suck so much in the first place.<p>It's not just broken grammar; it's a surprising lack of creativity that English doesn't suffer from. Writing in English with ChatGPT -> DeepL and fixing the auto-translation gives vastly better results than prompting ChatGPT to respond in an Asian language directly.
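<p>To give a rough sense of why tokenization might matter here (this is my sketch, not a claim about any specific tokenizer's vocabulary): byte-level BPE tokenizers fall back to raw UTF-8 bytes for text their merges don't cover, and CJK characters are 3 bytes each in UTF-8, so under-covered Asian text can cost several times more tokens per unit of meaning than English:

```python
# Rough illustration of byte-fallback cost (an assumption about worst-case
# BPE behavior, not a measurement of any particular model's tokenizer):
# characters the vocabulary doesn't cover decompose into raw UTF-8 bytes,
# so one CJK character can cost up to 3 tokens.
english = "hello world"
japanese = "こんにちは世界"  # roughly "hello world" in Japanese

# UTF-8 byte counts = worst-case token cost under pure byte fallback
print(len(english), len(english.encode("utf-8")))    # 11 chars, 11 bytes
print(len(japanese), len(japanese.encode("utf-8")))  # 7 chars, 21 bytes
```

Fewer effective tokens per context window and coarser, byte-level units would plausibly hurt both fluency and the model's ability to plan ahead, which lines up with the "lack of creativity" I keep seeing.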