科技回声

11 comments

tbran 8 months ago

To run speech-to-text on my laptop, I've been using Justine Tunney's downloadable single-executable Whisper file.

I use it to transcribe audio, then copy the transcript into an LLM to get notes on whatever it is. It helps me decide whether to watch or listen to something and saves a bunch of time.

Her tweet: https://x.com/JustineTunney/status/1825551821857010143

Instructions from Simon Willison: https://simonwillison.net/2024/Aug/19/whisperfile/

Command line options: https://github.com/Mozilla-Ocho/llamafile/issues/544#issuecomment-2297368432

jwr 8 months ago

Amazing work.

I am also impressed by the advances in technology. 20 years ago, I had severe RSI problems and worked on "vx-mode", a package for interfacing XEmacs to Dragon NaturallySpeaking, the best speech-recognition solution available at the time. My goals were similar, although the result was nowhere near what the OP has done. Also, speech recognition tech was nowhere near what we have now: I still remember buying good microphones, worrying about microphone placement relative to mouth, endless training and re-training…

This kind of software can make a huge difference for many people.


submeta 8 months ago

Year 2080: AGIs help you transcribe, structure, and lay out your code/text/thoughts. At the same time, HN posts: „New package for Emacs doing xyz“.


lepisma 8 months ago

Hey, author here. Didn't notice this came up on HN.

I wrote a small follow-up about trying to write and speak at the same time here: https://lepisma.xyz/journal/2024/09/13/can-i-output-two-streams-of-text/index.html


voltaireodactyl 8 months ago

This looks very useful and beautifully presented. Looking forward to being able to use it with a local model.

Jeff_Brown 8 months ago

I would use this for edits that are hard to do otherwise. Like, instead of typing `M-x align-regexp` and then figuring out what regular expression to type, I would just highlight a passage and say to the LLM "Can you align all the library names in this import statement?"
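
For readers unfamiliar with the kind of edit described above, here is a rough Python sketch of regex-based column alignment. It is not how `align-regexp` is implemented in Emacs; the function name `align_on` and the sample import lines are hypothetical and chosen only for illustration.

```python
import re


def align_on(lines, pattern):
    """Pad each line so the first match of `pattern` starts in the same column.

    Hypothetical helper for illustration; lines without a match are returned
    unchanged.
    """
    regex = re.compile(pattern)
    matches = [regex.search(line) for line in lines]

    # Text before each match, with trailing whitespace removed.
    heads = [line[:m.start()].rstrip() for line, m in zip(lines, matches) if m]
    if not heads:
        return list(lines)

    # Align to one column past the longest head.
    target = max(len(head) for head in heads) + 1

    aligned = []
    for line, m in zip(lines, matches):
        if m is None:
            aligned.append(line)
        else:
            head = line[:m.start()].rstrip()
            aligned.append(head.ljust(target) + line[m.start():])
    return aligned


if __name__ == "__main__":
    sample = [
        "import numpy as np",
        "import matplotlib.pyplot as plt",
    ]
    for row in align_on(sample, r"\bas\b"):
        print(row)
```

The hard part the comment points at is choosing the pattern (here `\bas\b`), which is the step a spoken instruction to an LLM would replace.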

BeetleB 8 months ago

I did something similar here: https://blog.nawaz.org/posts/2023/Dec/cleaning-up-speech-recognition-with-gpt/

I now use Whisper with a much expanded prompt and have the flow integrated both in Emacs and my WM.

Prior HN discussion: https://news.ycombinator.com/item?id=40174921

I've since done hours of transcription with it - often transcribing whole emails. The challenge is that my brain thinks very differently while talking compared to while typing. As a result, my output is very verbose, and is very different from what I would have typed. I haven't figured out how to speak as if I'm typing.

ggm 8 months ago

"Emacs: Upgrade to MELPA"

ELPA installed s/w suite: "I'm sorry Dave, I can't do that"


ants_everywhere 8 months ago

nerd-dictation is a decent offline speech dictation tool for Linux that I've used with Emacs: https://github.com/ideasman42/nerd-dictation

namidark 8 months ago

Has anyone gotten whisper.el/.cpp to work on OSX with the microphone permissions and Emacs?

zvmaz 8 months ago

Would the author mind sharing his Emacs configuration? So beautiful!


Speech Dictation Mode for Emacs