How Google Meet's noise cancellation works

390 points by theanirudh, almost 5 years ago

36 comments

crazygringo, almost 5 years ago
Most people have no idea of the amount of incredibly advanced signal processing that goes into echo cancellation and noise cancellation in videoconferencing.

This post is on noise cancellation specifically, and it actually has the potential to be a *huge* step forward.

One of the big audio problems with group meetings is that the background noise from each participant adds up, to a point where it quickly becomes unbearable. For that reason, videoconferencing generally only plays audio from one or two participants at most, using a fairly simple estimation of whichever audio signal is currently loudest. The problem is that this can make it really hard to interrupt (people will literally not hear you), or tell the difference between two people going "mm-hmm" versus the whole group. If you've ever been in a group meeting where everybody applauds something, this is why you *see* everyone applauding but only hear a smattering.

But if this noise cancellation really succeeds, it could be a huge leap forward because audio cues and overlap will actually work for the first time -- hearing the "mm-hmms", hearing everyone pipe up, and so on. Videoconferencing will feel more like an actual single shared audio environment, rather than the kind of "walkie-talkie" effect it so often feels like now.

I'm really looking forward to this.
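The "whichever audio signal is currently loudest" selection described above can be sketched roughly as follows. This is a minimal illustration in Python, not how Meet or any other product actually does it: `rms`, `pick_active_speakers`, the `max_speakers` limit, and the `floor` threshold are all assumed names and values.

```python
import math


def rms(frame):
    """Root-mean-square level of one audio frame (a sequence of float samples)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def pick_active_speakers(frames, max_speakers=2, floor=0.01):
    """Return the ids of the loudest participants, loudest first.

    frames maps a participant id to that participant's current audio frame.
    max_speakers and floor are illustrative values, not taken from any product.
    """
    levels = {pid: rms(frame) for pid, frame in frames.items()}
    # Ignore participants whose level is below the (assumed) silence floor.
    audible = [pid for pid, level in levels.items() if level >= floor]
    audible.sort(key=lambda pid: levels[pid], reverse=True)
    return audible[:max_speakers]
```

With a rule like this, a quiet "mm-hmm" loses to whoever is already talking, which is the interruption problem the comment describes.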
skybrian, almost 5 years ago
> A musical instrument will probably also get filtered out. “To a pretty large degree, it does,” Lachapelle said. “Especially percussion instruments. Sometimes a guitar can sound very much like a voice -- you’re starting to touch the limits there. But if you have music playing in the background, usually it’ll cut it all out.”

This is a big issue with hearing aids. The whole industry is focused on optimizing for voice intelligibility and as a musician you end up doing trial-and-error with the audiologist to turn all that stuff off.

We need more open source hearing aids - I've read of a few but they're not mainstream.
xeno42, almost 5 years ago
I've been using https://krisp.ai/ to great effect with Zoom while sitting outside on the laptop with road traffic, birds, etc. nearby. My team really had a "wow" moment when I turned it on the first time.
pierrebai, almost 5 years ago
As far as my experience goes, the single best way to deal with background noise is... the mute button.

In every video conf I've been in, you can instantly tell when "one of them" who can't be bothered to mute themselves joins. The audio quality immediately goes down the drain. It's always the same subset of people who do it, too. As soon as they're enjoined to please mute, the audio quality is restored.

No amount of magic signal processing will ever match it.

While it is perhaps misguided to use it that way, the mute button thus acts as a social-cluelessness meter.
jdm2212, almost 5 years ago
I might be unusual, but my experience with videoconferencing has been that ambient noise is rarely a major problem. The big issue is audio cutting out due to a shaky network. When ambient noise is a problem, it's not so much someone typing as their spouse talking in the background or a fire engine going by -- and at that point the solution is for them to hit mute.
bigtones, almost 5 years ago
I was not so impressed with this demo. When he was scrunching his potato chip packet in particular, the degradation in his voice quality made it almost impossible to understand what he was saying, and his voice sounded very synthesized and processed, and that's through a $200 Yeti professional microphone. It seems like some of the other noise cancellation options, from Nvidia RTX and others, are more effective.
ben7799, almost 5 years ago
I've been doing online guitar lessons since Covid-19 started and all these algorithms just suck hard for that. Even in a 1:1 call.

Two repeated notes and the noise cancellation just immediately shuts you down... we've been using Zoom and luckily you can turn all the audio processing off if you go in "Advanced" and enable "Turn on original audio".
graton, almost 5 years ago
This would have been useful for my co-worker. They went on a trip to Europe years ago and had a conference call scheduled when they got there.

Unfortunately for them they decided to lie on the bed in the hotel and the jet lag hit them pretty hard. Next thing you know they were asleep and snoring, I guess fairly loudly, and everyone on the call could hear. So the people on the call spent some time trying to figure out who was snoring, going through all the attendees. Eventually they figured out who it was and started yelling to wake them up, which worked after a while. Needless to say my coworker was very embarrassed about the incident at the time, but it did make a good story to tell people :)
mleonhard, almost 5 years ago
Simple non-AI solution: Require all meeting participants to use push-to-talk. Support foot-pedals, mouse buttons, phone volume up button, and bluetooth play button.

For large meetings, organizers can enable a single-talker mode. Holding the talk button puts you in a queue. Your screen indicates when it's your turn to talk. This prevents folks from talking over each other. This eliminates echo by muting the talker's speakers while recording their voice. Also, attendees see the current talker, not the person whose dog just barked.
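A rough sketch of the single-talker queue proposed above, assuming participants are identified by a string id. `TalkQueue` and its methods are hypothetical names for illustration, not part of any existing conferencing API.

```python
from collections import deque


class TalkQueue:
    """First-come, first-served queue of participants holding the talk button."""

    def __init__(self):
        self._waiting = deque()

    def press(self, pid):
        """Participant starts holding the talk button and joins the queue."""
        if pid not in self._waiting:
            self._waiting.append(pid)

    def release(self, pid):
        """Participant lets go of the button and leaves the queue."""
        if pid in self._waiting:
            self._waiting.remove(pid)

    @property
    def current_talker(self):
        """The participant whose turn it is, or None if nobody is waiting."""
        return self._waiting[0] if self._waiting else None

    def speakers_muted_for(self, pid):
        """Mute the current talker's own speakers so their mic records no echo."""
        return pid == self.current_talker
```

In this sketch the turn passes to the next person in line as soon as the current talker releases the button; a client would poll `current_talker` to show whose turn it is.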
miki123211, almost 5 years ago
To all those here who complain about algorithms messing with their audio when they don't want them to: use an app called TeamTalk. It lets you disable all that processing, so it works great for high-quality music transmission etc. I have no affiliation with them; I have been using it for a few years and I'm very happy.
mwexler, almost 5 years ago
All this work to eliminate background noise is great. But we will also need business-oriented group meets which emulate real life: allow breakouts, with 2-4 people in a group "sidebar" to chat, while the burble of other convos drifts in, providing a gentle low background hum. The sidebar, unless marked private, would also contribute to the hum of other users in a main group or in sidebars of their own. Acoustic effects can even allow directionality of the sounds.

Yes, we can do VR worlds that still look like Second Life, but while we are working on fixing that, we might solve for near term things to improve interactivity.

Well, and solve the "one person speaks, all must listen, lag for response, loop" which I find is similar to how morse code discussions work.
JoeAltmaier, almost 5 years ago
Everybody makes a stab at this, and very little of it works consistently. I applaud Google for attacking this head-on! It is a big issue and deserves attention.

My biggest issue (when I worked in videoconferencing) was echoing, and locking onto the delay window where echoes could occur. Depending on the distance from a conference room speaker to all the walls, echoes could occur at one or more offsets (appear at microphone input with some delay after presenting at the speaker). And ambient noises could masquerade as echoes. The filters tend to be IIR filters, and get wound up easily. It was awful.
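One common way to look for the echo offset described above is to cross-correlate the signal sent to the loudspeaker with what the microphone picks up and take the lag with the highest score. The sketch below assumes a single dominant echo path and plain Python lists of samples; `estimate_echo_delay` and `max_delay` are made-up names for illustration.

```python
def estimate_echo_delay(reference, mic, max_delay):
    """Return the lag, in samples, at which mic best matches reference.

    reference: samples that were played through the loudspeaker.
    mic: samples captured by the microphone over the same period.
    max_delay: largest lag to test (an assumed search window).
    """
    best_lag = 0
    best_score = float("-inf")
    for lag in range(max_delay + 1):
        # Only compare samples that exist in both signals at this lag.
        overlap = min(len(reference), len(mic) - lag)
        score = sum(reference[i] * mic[i + lag] for i in range(overlap))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

A real room, as the comment notes, can produce echoes at several offsets, and unrelated ambient noise can create spurious correlation peaks, so a single best lag is only a starting point.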
PascLeRasc, almost 5 years ago
Microsoft Teams’ noise cancellation has been driving me absolutely crazy the past few weeks. I live on a busy road, and whenever a car drives past Teams will reduce the volume of anyone else talking in a meeting. Even if I’m muted - Teams gives me the “your microphone is muted” notification. And I use closed-back headphones for meetings so I don’t even hear the outside noise. So this results in my having to constantly have the Windows volume slider open, one earpiece off, and listen for a car coming so I can raise the volume in advance. Is anyone else dealing with this?
arielserafini, almost 5 years ago
I'd say this is more like a demo. From the "how it works" in the title I was expecting to see some implementation details.

Edit: I had only watched the video. The article does indeed contain a lot more detail.
The_Amp_Walrus, almost 5 years ago
Anyone know what kind of system they're using to do this? Any papers?

I messed around for a few months with speech enhancement last year and didn't really get anywhere beyond sort-of-reproducing a few existing models: https://github.com/MattSegal/speech-enhancement

All the published "state of the art" examples I could find were pretty crap, whereas Krisp AI were doing much better than what I've seen released publicly.
ruffrey, almost 5 years ago
Serious question - what's the risk that someone with a high-pitched, outside-the-norm voice will get denoised? If it filters out kids in the background, will kids no longer be able to use Google Meet?
zoom6628, almost 5 years ago
I would really like to use a Google Noise for meeting cancellation. We make excessive use of meetings now "because we can", instead of thinking longer and harder and asking a well-structured question.

Meetings should be for review/discussion and decision making, not vocal exercise and grandstanding.

It would also be great if meeting providers had a dial showing current latency for all participants, to make it easier to interject.

Lastly, I do recommend using meeting tools that have features like letting you vote, raise a hand, and chat, all in sync with the main voice. It will make life easier for meeting moderators... And if you don't use moderators then start the practice of doing it - the quality of meetings will improve hugely.
dekhn, almost 5 years ago
What I'd really like to see is effective source separation and nulling. For example, being able to mute the screaming baby in the background of a VC speaker (this has been a fairly common occurrence now that we are WFH and it's hard to get day care).
pier25, almost 5 years ago
As someone with 2 dogs this is going to be a good reason to switch to Meet whenever possible.
newfeatureok, almost 5 years ago
None of this fancy technology is necessary IMO.

Just implement push to talk with mute-by-default. 90% of the audio issues would be resolved. Another 5% could be solved by buying everyone a decent headset which hopefully has a push-to-talk button on it as well.
GhostVII, almost 5 years ago
I wonder how much benefit you would get from targeting specific microphone/speaker setups for noise cancelling rather than treating everything the same. I would imagine that the noise cancellation requirements are far different for someone video conferencing over a laptop mic and speaker versus a good pair of Bose headphones. If you could specify what type of device you are using it could tune the noise cancellation accordingly - if I am using a good pair of headphones, I don't need echo cancellation, but I still need to filter out some amount of background noise.
monkey26, almost 5 years ago
Funny timing. Just got off my first Google Meet call an hour ago and was thinking they need to add noise cancellation. It was awful.
jpalomaki, almost 5 years ago
Might be interesting to train the model using the user’s own voice. Maybe this would help with filtering out co-workers in an open office, or family members.

Maybe you could also use this personal model to hide very short network interruptions. The other party could use this model to constantly predict my next piece of audio and switch to the prediction in case a packet is lost.
imroot, almost 5 years ago
One of the things I picked up from HN a few months ago was bettering my remote setup -- I picked up a HDMI capture card for my mirrorless camera, and bought a few lights to brighten up my office, and then I purchased a cardioid microphone and a pop filter.

The difference is night and day based on some of the recordings I've heard.
The_Amp_Walrus, almost 5 years ago
Anyone know what kind of system they're using to do this? Any papers?

I messed around a little with speech enhancement last year and didn't really get anywhere: https://github.com/MattSegal/speech-enhancement
kemayo, almost 5 years ago
> Google also made a conscious decision to put the machine learning model in the cloud, which wasn’t the immediately obvious choice.

Oh good. Meet is already a *huge* battery-hog on my laptop, so adding fancy signal processing client-side was worrying me.
pkaye, almost 5 years ago
I need to get hearing aids soon and heard about all the advances and limitations. Particularly the problems with noise cancellation. I hope this kind of technology trickles into hearing aids also.
xchaotic, almost 5 years ago
I am still waiting until AI reaches the ultimate in noise cancelling: "that meeting could have been an email". AI will automatically send the meeting cancellation and, most likely, the meeting notes.
taeric, almost 5 years ago
Wouldn't this be a bit more trivial with multiple microphones?
ComplexSpidey, almost 5 years ago
Journey of a data science feature, start to end (still work in progress)

Key takeaway: it's fine to not be 100% accurate; roll it out and learn.

tl;dr

- Approval from execs
- Data -> Learning -> Training -> Variability -> Training -> Tuning
- Privacy matters (for all, digitally educated or uneducated)
- Whats and whys of UX -- ultimately what the user says
- Definitely cloud -- it's the 21st century
- Optimised for speed and cost (a bit irrelevant if I am Google ;) )
- Release (with presentation) -- timing matters
- Feedback (with permission)

A summary by the PM of the "Denoiser" (not "Noise cancellation"; I don't want to get ranted at by data science folks) feature of Google Meet. Applies to any such feature.
neximo64, almost 5 years ago
Any battery life tests of this tech on phones?
Vaslo, almost 5 years ago
Or you could just mute your damn phones
jokoon, almost 5 years ago
Weirdly, it seems the simplest phones already solved those problems a long time ago.

Seems like over-engineering. The issue is either with the microphone, with the hi-def stuff or something else.

Every normal phone never had an inch of a problem, so I'm really confused why computers have this issue.
m0zg, almost 5 years ago
> How Google Meet's noise cancellation works

Very poorly. Of all the available alternatives (Zoom, Skype, FaceTime), Google Meet seems to have the worst audio _and_ video quality. This is inexplicable for a company very easily capable of technological and product leadership in both of those things.
trboyden, almost 5 years ago
Not very well. I watched a Google Meet meeting for the neighbor's Honor Society induction and the quality was horrible. Video kept freezing and audio cut in and out. There were probably only about a dozen attendees in the meeting room. It wasn't the neighbor's connection either; they have a solid Fios 200/200 service.
david_draco, almost 5 years ago
It would be fun if it canceled out screaming.

We somehow have this sexist social expectation that women who show their feelings (crying, screaming) are "hysterical" (really a nasty word) and not taken seriously. If so, men screaming should be equally considered a sign of immaturity and lack of self-control.

Also could help with customers ("Sorry, I can't hear you!").