Ask HN: Does this algorithm for increasing audio quality exist?

3 points by E-Reverance over 4 years ago
I've been having this weird thought lately about how to increase the 'definition' of audio using convolutions. The idea is as follows:

1. Convolve a rectangular function of a certain width against the audio.

2. The local maxima of the convolved function show where the average amplitude is highest, so increase the amplitude of those regions a bit.

3. Repeat with smaller rectangular functions to capture finer details.

Does this already exist, and if it doesn't, does this idea even work?
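
For concreteness, the three steps above can be sketched in plain Python. This is only one reading of the idea, not an established algorithm: the function names, the `gain` value, and the list of `widths` are made up for illustration, and the signal is rectified with `abs()` before smoothing on the assumption that "average amplitude" means the average of the magnitude (the plain average of an oscillating signal sits near zero).

```python
def box_smooth(x, width):
    """Convolve x with a rectangle of `width` samples (a moving average).

    The window is clipped at the ends so the output has the same length as x.
    """
    n = len(x)
    out = []
    for i in range(n):
        lo = max(0, i - width // 2)
        hi = min(n, i - width // 2 + width)
        window = x[lo:hi]
        out.append(sum(window) / len(window))
    return out


def local_maxima(y):
    """Indices where y rises into a peak (the left edge of any flat top)."""
    return [i for i in range(1, len(y) - 1) if y[i - 1] < y[i] >= y[i + 1]]


def enhance(x, widths=(64, 16, 4), gain=1.1):
    """Steps 1-3: smooth, boost around the local maxima, repeat with smaller widths."""
    y = list(x)
    for w in widths:
        # Step 1: rectangle convolved with the rectified signal (assumption, see above).
        smoothed = box_smooth([abs(s) for s in y], w)
        # Step 2: mark a region of about w samples around each local maximum.
        half = w // 2
        boosted = set()
        for p in local_maxima(smoothed):
            boosted.update(range(max(0, p - half), min(len(y), p + half + 1)))
        y = [s * gain if i in boosted else s for i, s in enumerate(y)]
        # Step 3: the loop repeats with the next, smaller width.
    return y
```

Note that the boost is applied as an abrupt per-sample gain change, so the region edges are discontinuities in gain. That is a property of this literal reading of the steps, not a design recommendation.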

7 comments

lock-free over 4 years ago
You're going to have to define "definition" a bit because it's unclear what your goals are.

In general, what it sounds like you're talking about is a class of nonlinear processes called "dynamics processing" (common examples are automatic gain control (AGC), compression, expansion, and the compressor-expander (compander)). All have been in production use since at least the 1940s. It's built into your cellphone and also those default sound effects they put in crappy TVs.

Your algorithm as described (convolution with a rect in time == multiplication by sinc in frequency) would be a pretty terrible sounding filter, and would cause some gnarly phasing sound effects to the signal. A linear filter will not solve this problem directly.

What you want to do is extract the *envelope* of the signal, which can be done using the Hilbert transform (1), which is an example of a class of algorithms called envelope followers (2).

After extracting the envelope from the signal you can use it to compute a gain to apply to the signal (this is how dynamics processing works). It's not a magic bullet, and dynamics processing is *undesirable* in high fidelity reproduction. It's used in telephones to compensate for the godawful dynamic range, as an effect in recording or production to add balance within a mix, in conferencing applications to make up for poor mic'ing conditions, and in protection circuitry. You do not want to add more than necessary, as a general rule.

(1) https://en.wikipedia.org/wiki/Hilbert_transform

(2) https://www.dsprelated.com/showarticle/938.php
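
The envelope-then-gain idea described here can be sketched in a few lines of Python. To keep it dependency-free, this sketch swaps in a simpler envelope follower (rectify, then smooth with one-pole attack/release filters) in place of the Hilbert transform mentioned above. The function names, the time constants, and the threshold and ratio values are all illustrative assumptions, and the gain curve is a basic downward compressor, only one of the dynamics processors listed.

```python
import math


def envelope(x, sample_rate, attack_ms=5.0, release_ms=50.0):
    """Rectify-and-smooth envelope follower (not a Hilbert transform)."""
    a_att = math.exp(-1.0 / (sample_rate * attack_ms / 1000.0))
    a_rel = math.exp(-1.0 / (sample_rate * release_ms / 1000.0))
    env = 0.0
    out = []
    for s in x:
        level = abs(s)
        # Track rising levels quickly (attack) and falling levels slowly (release).
        coeff = a_att if level > env else a_rel
        env = coeff * env + (1.0 - coeff) * level
        out.append(env)
    return out


def compressor_gain(env, threshold=0.5, ratio=4.0):
    """Static curve: above the threshold, level grows by 1/ratio dB per input dB."""
    if env <= threshold:
        return 1.0
    return (env / threshold) ** (1.0 / ratio - 1.0)


def compress(x, sample_rate, threshold=0.5, ratio=4.0):
    """Compute a gain from the envelope and apply it sample by sample."""
    env = envelope(x, sample_rate)
    return [s * compressor_gain(e, threshold, ratio) for s, e in zip(x, env)]
```

Because the gain depends on the signal's own level, the process is nonlinear, which is the distinction drawn above between dynamics processing and a linear filter.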
pwg over 4 years ago
You may be looking for "dynamic range compression":

https://en.wikipedia.org/wiki/Dynamic_range_compression
ksaj over 4 years ago
I don't believe you'll capture finer details. What you'll end up with is a softer sound. It will make overly-digital sounding clips sound better, but it won't make previously unheard stuff come out.

Where I get this from is that many moons ago I experimented with turning audio into a very wide image. Visually it looks rather like a frequency analyzer. It was easy to shift pitches around, but the digitization left artifacts at the wrong (unexpected) pitches, which changed the sound of vowels quite considerably.

Before I figured out why that is (and that alone was an amazingly interesting lightbulb over my head), I tried smoothing them out so those artifacts wouldn't be so odd. What I ended up with instead was something akin to talking through cotton, and guitar lines totally didn't sound like guitar anymore - more like a synth, which I thought sounded cool, but was effectively a failed experiment.

I don't want to discourage you from doing this using math, and I'd love to see your progress in it, but my prediction is that you'll find something similar occurs.

Having said that, I later discovered this device that I think contains the secret to making my experiments actually succeed: https://www.behringer.com/product.html?modelCode=P0CD0 because it literally compensates for the problems inherent to what my smoothing experiments were doing.

PS: I actually got the idea from an old television set I used to have that had a function that supposedly made it look higher definition. But even that showed some interesting artifacts (blur) if you looked at it really close up. But in motion, the videos indeed looked higher def, which is why I tried it on sound.
PaulHoule over 4 years ago
The ideal response function for a visual field is a point, so deconvolution works to "sharpen" an image.

The experience of an audio recording is both the experience of the sound source plus the experience of the space that the sound source is in. This is particularly important for multichannel sound in movies and video games, but it is also important for music.

Good sound recordings (say a David Bowie album from the 1970s) carefully record the instruments with a "dry" recording with limited echo and reverb. Then they put in the echo and reverb they want with a convolutional filter or physical realization thereof.

If you see that as the artistic vision and want to reproduce it accurately, you don't want to undo that convolution.

Undoing convolution is an iffy thing to do anyway because it involves a lot of subtracting two big numbers to get a little number and is apt to amplify high frequency noise.

It is different for voice applications: a speech recognition system needs some kind of deconvolution to not be confused by the audio environment.
jschveibinz over 4 years ago
Convolution in time is multiplication in the frequency domain. By convolving with a rectangle, you are multiplying by the Fourier transform of a rectangle in the frequency domain. In EE terminology, this is a sinc (sin(x)/x) low pass filter: it emphasizes low frequencies and attenuates high frequencies.
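
Written out, the statement above looks like this. The symbols are chosen here for illustration: $x$ is the audio, $r_T$ is a rectangle of width $T$ centered at zero, and capital letters denote Fourier transforms.

```latex
\[
y(t) = (x * r_T)(t)
\quad\Longleftrightarrow\quad
Y(f) = X(f)\,R_T(f),
\qquad
r_T(t) =
\begin{cases}
1, & |t| \le T/2,\\
0, & \text{otherwise,}
\end{cases}
\]
\[
R_T(f) = \int_{-T/2}^{T/2} e^{-j 2\pi f t}\,dt
       = \frac{\sin(\pi f T)}{\pi f}
       = T\,\operatorname{sinc}(fT),
\qquad
\operatorname{sinc}(u) = \frac{\sin(\pi u)}{\pi u}.
\]
```

The response is largest at $f = 0$, falls to zero at every multiple of $1/T$, and has sidelobes in between, so a wider rectangle pushes the first null to a lower frequency.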
panda88888 over 4 years ago
Anyway, here are my thoughts.

1) This is simply a low pass filter. It essentially computes the average values (plus a multiplicative constant). So this value would be high for audio signal segments with lots of bass.

2) This is incorrect. The value in 1) is the amplitude of low frequency signals. Increasing the amplitude of this would *probably* result in emphasizing the bass.

3) Smaller rectangular functions are low pass filters with different cutoffs, so the effect would be similar to 1).

My guess is that the audio will end up having its bass region boosted, depending on the size of the rectangular windowing functions.
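
A quick way to check points 1) and 3) numerically is to run sine waves of different frequencies through a moving average and compare the peak level coming out with the peak level going in. The sketch below does that in plain Python; the function names, the 48 kHz sample rate, and the test frequencies are arbitrary choices for illustration.

```python
import math


def moving_average(x, width):
    """Causal boxcar: mean of the current and previous width-1 samples.

    Samples before the start of x are treated as zero.
    """
    out = []
    acc = 0.0
    for i, s in enumerate(x):
        acc += s
        if i >= width:
            acc -= x[i - width]
        out.append(acc / width)
    return out


def sine_gain(freq_hz, width, sample_rate=48000, n=4800):
    """Peak output level over peak input level for a sine at freq_hz."""
    x = [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]
    y = moving_average(x, width)
    # Skip the first `width` samples, where the window is still filling up.
    return max(abs(v) for v in y[width:]) / max(abs(v) for v in x[width:])


if __name__ == "__main__":
    for width in (48, 4):
        for freq in (100, 1000, 5500):
            print(f"width={width:2d}  f={freq:4d} Hz  gain={sine_gain(freq, width):.3f}")
```

With a 48-sample window, a 100 Hz tone passes almost untouched while tones in the kilohertz range are strongly attenuated. Shrinking the window to 4 samples lets much more of the high end through, which matches the "different cutoffs" remark in 3).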
probinso over 4 years ago
this is a white noise machine.