科技回声

10 comments

asveikau 6 months ago

The idea that "shore" and "sure" are pronounced "almost identically" depends pretty heavily on your accent. The vowel is pretty different to me.

Also, the matches for "sorI" and "sorY" seem to me to misinterpret the words as having a vowel at the end, rather than a silent vowel. If you're using data meant for foreign surnames, whose rules may differ from English and where silent vowels may be very rare depending on the original language, of course you may mispronounce English words like this, saying both shore and sure as "sore-ee".

I'm sure there are much better ways to transcribe orthography to phonetics; people have probably published libraries that do it. From some googling, it seems some people call this type of library a phonemic transcriber or IPA transcriber.


WarOnPrivacy 6 months ago

This short epilogue struck me.<pre><code> This past Yom Kippur, my wife and I drove two hours to spend the afternoon at my aunt’s house, with my cousins. As the night drew on, conversation roamed from television shows and books to politics and philosophy. The circle grew as we touched on increasingly sensitive and challenging topics, drawing us in. We didn’t agree, per se. We were engaging in debate as often as we were engaging in conversation. But we all love each other deeply, and the amount of care and restraint that went into how each person expressed their disagreement was palpable.</code></pre>

cess11 6 months ago

It's about someone who was using Levenshtein distance for phonetic fitting against text, and who then learned about Soundex.

One way to start playing around with it is to put some stuff in a database: <a href="https://dev.mysql.com/doc/refman/8.4/en/string-functions.html#function_soundex" rel="nofollow">https://dev.mysql.com/doc/refman/8.4/en/string-functions.htm...</a> (or this module, <a href="https://www.postgresql.org/docs/current/fuzzystrmatch.html" rel="nofollow">https://www.postgresql.org/docs/current/fuzzystrmatch.html</a>, if you're stuck with PG)
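For anyone who wants to poke at Soundex without a database, here is a minimal Python sketch of the classic American Soundex rules (first letter kept, consonants mapped to digits, adjacent duplicates collapsed, padded to four characters). The function name `soundex` and the `CODES` table are my own; this is not the MySQL or PostgreSQL implementation, and their output may differ in edge cases.

```python
# Consonant groups and their Soundex digits; vowels, y, h and w carry no digit.
CODES = {}
for group, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                     ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in group:
        CODES[ch] = digit


def soundex(word: str) -> str:
    """Return a four-character American Soundex code (a sketch, not a DB port)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    encoded = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate two letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            encoded += code
        prev = code  # a vowel resets prev, so a repeated code after it counts
    return (encoded + "000")[:4]


for name in ("shore", "sure", "Robert", "Rupert"):
    print(name, soundex(name))
```

Under these rules "shore" and "sure" collapse to the same code, which is exactly the kind of coarse match the article was after.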

ajuc 6 months ago

This is one of those cases where inheriting a hacked-together piece of crap (English spelling) makes a lot of additional work higher up.

Another example is poetry. A regex can find rhymes in Polish. Same postfix == it rhymes.

In English it's a feat of engineering.
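Taking the "same postfix == it rhymes" claim at face value, the check is only a few lines. This is a rough sketch: the helper name `rhymes` and the default suffix length of 2 are assumptions made for illustration, and it compares spelling only, which is precisely why it falls over on English.

```python
def rhymes(a: str, b: str, suffix_len: int = 2) -> bool:
    """Naive spelling-based rhyme test: do both words end in the same letters?"""
    a, b = a.lower(), b.lower()
    if len(a) < suffix_len or len(b) < suffix_len:
        return False
    return a[-suffix_len:] == b[-suffix_len:]


# Two Polish words that share an ending in spelling.
print(rhymes("kotek", "płotek"))
# English spelling says these match too, even though the sounds differ.
print(rhymes("cough", "though", suffix_len=4))
```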


smoores 6 months ago

Oh, hello, I didn't realize this was shared here! I guess let me know if anyone has any questions. I mostly wrote this piece as part of processing some hard feelings I've been having and seeing shared among Jewish folks around me, but I also ended up learning quite a bit about phonetic encoding algorithms, and I've spent several years at this point steeped in forced alignment via Storyteller.


arunc 6 months ago

I created this sheet[0] to teach my kid Tamil using Roman letters, and in the process figured it could be useful for kids learning other Indian languages as well.

With a history of reading and speaking (Indian) phonetic languages, I think English would've been much nicer and more uniform if the vowels sounded right, especially the long forms.

Extending the long forms using orthogonal vowels probably made it complex, especially with the lack of ii and uu.

Say, for instance, to extend the long form of "o", "a" was used. E.g.: boat, goat. The correct spelling could've been boot, with the original boot spelled as buut.

With that notion, door is probably the only word that's written and pronounced phonetically correctly, with two oo.

Curious to know how such a correct phonetic translation would aid in the encoding, matching and compression.

[0] <a href="https://docs.google.com/spreadsheets/d/15hdVh-oBUngTyigqDdjgugEAgn4Odaw-FxK2xp_4EjM/edit?gid=589132794#gid=589132794" rel="nofollow">https://docs.google.com/spreadsheets/d/15hdVh-oBUngTyigqDdjg...</a>

msgerbush 6 months ago

I'm using a library, stable-ts, for a similar issue with short audio clips and it works well: <a href="https://github.com/jianfch/stable-ts/tree/main">https://github.com/jianfch/stable-ts/tree/main</a>

Not sure how it will perform on something long like an audiobook.

Der_Einzige 6 months ago

Highly related to my paper on why tokenization in LLMs is the devil: <a href="https://paperswithcode.com/paper/most-language-models-can-be-poets-too-an-ai-1" rel="nofollow">https://paperswithcode.com/paper/most-language-models-can-be...</a>

qrian 6 months ago

I also had to do this in my previous work and I took the phonetic embeddings of reference and transcribed text and ran a dynamic time warping with them.
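For readers who haven't met dynamic time warping, here is a bare-bones sketch of the alignment cost between two sequences of vectors. The function name `dtw_distance` and the one-dimensional toy "embeddings" are made up for illustration; real phonetic embeddings would come from whatever model you use, and this is not necessarily the setup described above.

```python
import math
from typing import Sequence


def dtw_distance(ref: Sequence[Sequence[float]],
                 hyp: Sequence[Sequence[float]]) -> float:
    """Classic O(n*m) DTW: minimum cumulative Euclidean cost of aligning ref to hyp."""
    n, m = len(ref), len(hyp)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(ref[i - 1], hyp[j - 1])
            # Extend the cheapest of: skip in ref, skip in hyp, or match both.
            cost[i][j] = step + min(cost[i - 1][j],
                                    cost[i][j - 1],
                                    cost[i - 1][j - 1])
    return cost[n][m]


# Toy vectors standing in for phonetic embeddings (hypothetical values).
reference = [(0.0,), (1.0,), (2.0,)]
transcribed = [(0.0,), (1.0,), (1.0,), (2.0,)]  # one element is stretched
print(dtw_distance(reference, transcribed))
```

The stretched element costs nothing here because DTW lets one reference step absorb several transcribed steps, which is what makes it tolerant of timing differences.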

willwade 6 months ago

I'm intrigued. Is this not done just with a phonemizer?<pre><code>from phonemizer.phonemize import phonemize

text = "hello world"

# Phonemize the same text with three English accents via the espeak backend.
variations = [
    phonemize(text, backend="espeak", language="en-us", strip=True),
    phonemize(text, backend="espeak", language="en-gb", strip=True),
    phonemize(text, backend="espeak", language="en-au", strip=True),
]

for i, variation in enumerate(variations, 1):
    print(f"Variation {i}: {variation}")
</code></pre>

I mean, espeak isn't the best, but a lot of folks in the ASR/speech world are still using this, right?

(NB: If you are on iOS, check out the built-in one: Settings -> Accessibility -> Spoken Content -> Pronunciations. When you add one, it can phonemize your spoken message to IPA. If someone can tell me which SDK/API they use for that, I'd love to know.)

Phonetic Matching
