The Quite OK Audio Format for Fast, Lossy Compression

140 points by smlckz, almost 2 years ago

13 comments

kragen, almost 2 years ago
has anyone benchmarked qoa to see roughly how many instructions per sample it needs? all i see here is that it's more than adpcm and less than mp3, but those differ by orders of magnitude

like, can you reasonably qoa-compress real-time 16ksps audio on a 16 megahertz atmega328?

hmm, https://phoboslab.org/log/2023/04/qoa-specification has some benchmark results, let's see... seems like he encoded 9807 seconds of 44.1ksps stereo in 25.8 seconds and decoded it in 3.00 seconds on an i7-6700k running singlethreaded. what does that imply for other machines?

it seems to be integer code (because reproducibility between the predictor in encoding and decoding is important), and a significant part of it is 16-bit. https://ark.intel.com/content/www/xl/es/ark/products/88195/intel-core-i76700k-processor-8m-cache-up-to-4-20-ghz.html?ui=BIG says it's a 4.2 gigahertz skylake. agner says skylake can do 4–6 ipc (well, μops/cycle) https://www.agner.org/optimize/blog/read.php?i=628, coincidentally testing on an i7-6700k himself, but let's assume it's 3 ipc, because it's usually hard to reach even that level of ilp in useful code

so that's about 380 μops per sample if i'm doing my math right; that might be on the order of 400 32-bit *integer* instructions per sample on an in-order processor. if (handwaving wildly now!) that's 600 8-bit instructions, the atmega328 should be able to encode somewhere in the range of 16–32 kilosamples per second

so, quite plausibly

for decoding the same math gives 43 μops per sample rather than 380

i'm very interested to hear anyone else's benchmarks or calculations
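
The back-of-envelope estimate in the comment above can be reproduced with a few lines of Python. This is only a sketch of that arithmetic, not a benchmark: the clock speed, the 3 μops/cycle figure, and the 600-instruction guess are the commenter's assumptions, and the one-instruction-per-cycle figure for the ATmega328 is an additional assumption made here. The "about 380" figure comes out when each channel's sample is counted separately.

```python
# Benchmark figures quoted in the comment (from the QOA specification post).
AUDIO_SECONDS = 9807
SAMPLE_RATE = 44_100
CHANNELS = 2
ENCODE_SECONDS = 25.8
DECODE_SECONDS = 3.00

# Assumptions taken from the comment, not measurements.
CLOCK_HZ = 4.2e9            # i7-6700K at its 4.2 GHz turbo clock
UOPS_PER_CYCLE = 3          # deliberately below Skylake's 4-6 peak
AVR_CLOCK_HZ = 16e6         # ATmega328 at 16 MHz
AVR_INSTR_PER_SAMPLE = 600  # the "handwaving wildly" 8-bit guess

# Extra assumption made for this sketch: about one instruction per cycle.
AVR_INSTR_PER_CYCLE = 1.0


def uops_per_sample(wall_seconds: float) -> float:
    """Estimated micro-ops spent per single-channel sample."""
    total_samples = AUDIO_SECONDS * SAMPLE_RATE * CHANNELS
    total_uops = wall_seconds * CLOCK_HZ * UOPS_PER_CYCLE
    return total_uops / total_samples


encode_uops = uops_per_sample(ENCODE_SECONDS)
decode_uops = uops_per_sample(DECODE_SECONDS)
avr_samples_per_second = AVR_CLOCK_HZ * AVR_INSTR_PER_CYCLE / AVR_INSTR_PER_SAMPLE

print(f"encode: {encode_uops:.0f} uops/sample")
print(f"decode: {decode_uops:.0f} uops/sample")
print(f"atmega328 encode rate: {avr_samples_per_second:.0f} samples/s")
```

Under these assumptions the encode and decode estimates land close to the 380 and 43 μops quoted in the comment, and the ATmega328 figure falls inside the 16–32 kilosample range.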
g0xA52A2A, almost 2 years ago
Some previous discussions.

3 months ago - https://news.ycombinator.com/item?id=35738817

6 months ago - https://news.ycombinator.com/item?id=34625573
mips_r4300i, almost 2 years ago
Comparing against 4-bit ADPCM, which is already able to give quite good performance as long as your sample rates are relatively modern, this only improves it to 3.2 bits. It is fast, but ADPCM is also fast.

Would be nice to see joint stereo support. If you were to take ADPCM or this OK format and try to encode any stereo music with it, you will need 2 channels. However, there is an extremely advantageous optimization that can be made here - most music is largely center panned, so both channels are almost the same. With joint stereo you record one channel (either by picking one or mixing to an average) and then you can store the difference for the other channel which will occupy a lot fewer bits, assuming you are able to quantize away the increased entropy.

For example, instead of using two 4-bit ADPCM channels for stereo, which would only be a 50% savings over uncompressed, you could probably use an average of 5 bits per sample.
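
The "mix to an average and store the difference" idea described above can be sketched as a mid/side transform. This is a minimal illustration in Python, assuming integer PCM samples; the function names are hypothetical, it is not part of QOA or any ADPCM codec, and it shows only the reversible channel transform, not the later step of quantizing the difference channel to fewer bits.

```python
def encode_mid_side(left, right):
    """Turn left/right integer samples into mid (average) and side (difference)."""
    mids = [(l + r) >> 1 for l, r in zip(left, right)]   # floor of the average
    sides = [l - r for l, r in zip(left, right)]         # small when channels agree
    return mids, sides


def decode_mid_side(mids, sides):
    """Recover the original left/right samples exactly."""
    left, right = [], []
    for m, s in zip(mids, sides):
        # l + r and l - r share the same parity, so the bit dropped by the
        # shift in the encoder can be restored from the side value.
        total = (m << 1) | (s & 1)
        left.append((total + s) >> 1)
        right.append((total - s) >> 1)
    return left, right


# Mostly center-panned material: the two channels are nearly identical.
left = [1000, -1200, 32767, 7, -32768]
right = [998, -1195, 32767, 8, 32767]

mids, sides = encode_mid_side(left, right)
restored_left, restored_right = decode_mid_side(mids, sides)
print(sides)
```

For center-panned material the side values stay close to zero, which is why a codec could spend fewer bits on them; the last sample pair shows the worst case, where the channels disagree completely and the difference needs more range than either input.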
gaazoh, almost 2 years ago
I like the philosophy of QOA (and other similar projects, including QOI and TinyVG), but unlike others, it seems like it's not ready to use yet, see https://github.com/phoboslab/qoa/issues/25

> I have just pushed a workaround to master. [...]

> This still introduces audible artifacts when the weights reset. It prevents the LMS from exploding, but is far from perfect :/

This, combined with the fact that that issue is still open, means that a breaking change is still to be expected.
codeflo, almost 2 years ago
It's interesting that this works in the time domain (instead of frequency domain), and I wonder what the resulting quality limitations are, if any. The sound samples on the demo page, at least the dozen I clicked on, didn't seem all that challenging. Few, mostly synthesized instruments, low dynamic range. My ears aren't good enough to evaluate audio codecs anyway, however.
Pet_Ant, almost 2 years ago
What is the LFE channel?

It should be spelled out explicitly, but I figured out the rest

L-Left, R-Right, C-Center, FL-Front Left, FR-Front Right, SL-Side Left, SR-Side Right, BL-Back Left, BR-Back Right

---

Edit: LFE-Low Frequency Effects... so subwoofer?

https://www.dolby.com/uploadedFiles/Assets/US/Doc/Professional/38_LFE.pdf
MobiusHorizons, almost 2 years ago
Seems to have similar design criteria to Opus, but I don’t see any comparison.
Turing_Machine, almost 2 years ago
I looked around, but didn't see any mention of potential patent issues. I assume that this has been considered? The Ogg Vorbis people spent a lot of time on that back when they were developing their format.

Other than that, looks great!
marcoc, almost 2 years ago
How can one create a professional-looking PDF like the QOAF specification one?
rockstarflo, almost 2 years ago
What is the tradeoff there?
ape4, almost 2 years ago
What's going to be the next Quite OK thing?
ericls, almost 2 years ago
The sample page preloads all the files before playing, which wastes lots of bandwidth.
Aldipower, almost 2 years ago
An _audio_ format which is _quite_ ok? Not sure, if I need that.