TechEcho

Llasa: Llama-Based Speech Synthesis

168 points by CalmStorm 16 days ago

5 comments

ks2048 16 days ago
Odd that the page doesn't seem to link to either:

paper: https://arxiv.org/abs/2502.04128

github: https://github.com/zhenye234/LLaSA_training
CalmStorm 16 days ago
LLaSA is a simple framework for speech synthesis that employs a single-layer vector quantizer (VQ) codec and a single Transformer architecture to fully align with standard LLMs such as LLaMA.
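To make "single-layer vector quantizer" concrete, here is a minimal toy sketch, not the LLaSA codec itself. It assumes a tiny made-up codebook and 2-dimensional frames, and the helper names (`quantize`, `dequantize`) are hypothetical. The point it illustrates: with one codebook, each audio frame becomes exactly one integer token, which is the kind of flat token sequence a standard Transformer LLM already consumes.

```python
# Toy single-layer vector quantizer: each frame maps to ONE codebook index.
# The codebook and frames are made-up values for illustration only.

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def quantize(frames, codebook):
    """Return, for each frame, the index of its nearest codebook vector."""
    return [
        min(range(len(codebook)), key=lambda i: squared_distance(frame, codebook[i]))
        for frame in frames
    ]

def dequantize(token_ids, codebook):
    """Map token ids back to their codebook vectors (lossy reconstruction)."""
    return [codebook[i] for i in token_ids]

codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical
frames = [[0.1, -0.2], [0.9, 0.8], [0.2, 1.1]]               # hypothetical

token_ids = quantize(frames, codebook)
print(token_ids)                        # one integer per frame
print(dequantize(token_ids, codebook))  # nearest codebook vectors
```

A multi-layer (residual) quantizer would emit several ids per frame instead, which is what a single-layer design avoids.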
dheera 16 days ago
> employs a single-layer vector quantizer (VQ) codec and a single Transformer architecture to fully align

I really wish that when new models were released, they would draw a diagram of all the layers and the tensor input and output sizes at each layer, with zoom in/out capabilities if needed, using D3.js or whatever visualization framework. Every single layer should be on there with its input and output sizes.

These one-sentence descriptions and approximate block diagrams with arrows pointing at each other are never enough to understand how something is actually implemented.
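As a rough sketch of the kind of per-layer table being asked for, the snippet below propagates a tensor shape through a list of layers and records the input and output size of each. Everything here is an assumption for illustration: the layer names, the dimensions, and the helpers (`embedding`, `linear`, `shape_table`) are hypothetical and do not come from the LLaSA paper or code.

```python
# Hypothetical layer list: (name, function mapping an input shape to an output shape).
# None of these sizes come from the LLaSA paper; they are placeholders.

def embedding(d_model):
    """Shape rule for a token embedding: (batch, seq) -> (batch, seq, d_model)."""
    def shape_fn(shape):
        return (*shape, d_model)
    return shape_fn

def linear(d_in, d_out):
    """Shape rule for a linear layer acting on the last dimension."""
    def shape_fn(shape):
        if shape[-1] != d_in:
            raise ValueError(f"expected last dim {d_in}, got {shape[-1]}")
        return (*shape[:-1], d_out)
    return shape_fn

def shape_table(layers, input_shape):
    """Return one (name, input_shape, output_shape) row per layer, in order."""
    rows = []
    shape = input_shape
    for name, fn in layers:
        out = fn(shape)
        rows.append((name, shape, out))
        shape = out
    return rows

layers = [
    ("token_embedding", embedding(d_model=64)),
    ("mlp_up", linear(64, 256)),
    ("mlp_down", linear(256, 64)),
    ("lm_head", linear(64, 1000)),
]

rows = shape_table(layers, input_shape=(2, 16))  # (batch, sequence length)
for name, in_shape, out_shape in rows:
    print(f"{name:16s} {str(in_shape):16s} -> {out_shape}")
```

With a real model, the same rows are usually collected at run time rather than declared by hand, for example with forward hooks in PyTorch, and could then feed whatever visualization front end is preferred.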
StevenNunez 16 days ago
I can't wait to see this integrated into Open WebUI! These sound amazing.
mring33621 16 days ago
The long 'uuuuhhhhhhh' from some of the lesser models is killing me.