Show HN: LLMpeg

169 points by jjcm, 4 months ago
Inspired by the "ffmpeg by examples" comments, here's a simple script that pulls it all together. Set your OpenAI API key env var and make the script executable, and you're golden.

25 comments

PaulKeeble, 4 months ago
FFmpeg is one of those tools that is really quite hard to use. The sheer surface area of the possible commands and options is incredible, and then there is so much arcane knowledge around the right settings. Its defaults aren't very good and lead to poor-quality output in a lot of cases, and you can get some really weird errors when you combine certain settings. It's an amazingly capable tool, but it's equipped with every foot gun going.
vunderba, 4 months ago
It's good that you have a "read" statement to force confirmation by the user of the command, but all it takes is one errant accidental Enter to end up running arbitrary code returned from the LLM.

I'd constrain the tool to only run "ffmpeg" and extract the options/parameters from the LLM instead.
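
A minimal sketch of that constraint, written in Python rather than the script's bash. The function name `parse_ffmpeg_args` and the set of rejected characters are assumptions made for illustration; they are not part of LLMpeg. The idea is to treat the LLM response as data, split it into an argument list, and refuse anything that is not a single plain `ffmpeg` invocation.

```python
import shlex

# Characters a shell would interpret. This sketch rejects them outright,
# which is deliberately conservative and may refuse some valid filter strings.
FORBIDDEN = set(";|&`$<>\n")


def parse_ffmpeg_args(llm_response: str) -> list[str]:
    """Return an argv list whose first element is exactly "ffmpeg".

    Raises ValueError if the response is not a single plain ffmpeg command.
    (Hypothetical helper, not taken from the original script.)
    """
    text = llm_response.strip()
    if any(ch in FORBIDDEN for ch in text):
        raise ValueError("shell metacharacters are not allowed")
    argv = shlex.split(text)  # also raises ValueError on unbalanced quotes
    if not argv or argv[0] != "ffmpeg":
        raise ValueError("response is not an ffmpeg command")
    return argv
```

The resulting list can be handed to `subprocess.run(argv)` without `shell=True`, so the response never passes through a shell. The command can still overwrite files, so a confirmation step remains worthwhile.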
minimaxir, 4 months ago
The system prompt may be a bit too simple, especially when using gpt-4o-mini as the base LLM, which doesn't adhere to prompts well.

> You write ffmpeg commands based on the description from the user. You should only respond with a command line command for ffmpeg, never any additional text. All responses should be a single line without any line breaks.

I recently tried to get Claude 3.5 Sonnet to solve an FFmpeg problem (write a command to output 5 equally-time-spaced frames from a video) with some aggressive prompt engineering, and while it seems internally consistent, I went down a rabbit hole trying to figure out why it didn't output anything, as the LLMs assume integer frames-per-second, which is definitely not the case in the real world!
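
One way to sidestep the frame-rate assumption is to pick frames by timestamp instead of by frame number, so rates such as 30000/1001 fps never enter the arithmetic. The sketch below is an illustration under assumptions, not the command Claude produced: the clip duration is assumed to be known already (for example from ffprobe), and the helper names and the `frame_N.png` output pattern are hypothetical.

```python
def sample_timestamps(duration_s: float, count: int) -> list[float]:
    """Midpoints of `count` equal slices of the clip, in seconds."""
    if count <= 0 or duration_s <= 0:
        raise ValueError("duration and count must be positive")
    step = duration_s / count
    return [round(step * (i + 0.5), 3) for i in range(count)]


def frame_commands(src: str, duration_s: float, count: int = 5) -> list[list[str]]:
    """One ffmpeg argv per frame: seek to a timestamp, write a single image."""
    return [
        # -ss before -i seeks in the input; -frames:v 1 stops after one frame.
        ["ffmpeg", "-ss", f"{t:.3f}", "-i", src, "-frames:v", "1", f"frame_{i + 1}.png"]
        for i, t in enumerate(sample_timestamps(duration_s, count))
    ]
```

Using slice midpoints keeps every timestamp strictly inside the clip, which avoids seeking to the very end where no frame may be available.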
davmar, 4 months ago
i think this type of interaction is the future in lots of areas. i can imagine we replace APIs completely with a single endpoint where you hit it up with a description of what you want back. like, hit up 'news.ycombinator.com/api' with "give me all the highest rated submissions over the past week about LLMs". a server side LLM translates that to SQL, executes the query, returns the results.

this approach is broadly applicable to lots of domains just like FFmpeg. very very cool to see things moving in this direction.
leobg, 4 months ago
This should be a terminal utility.

    xx ffmpeg video1.mp4 normalize audio without reencoding video to video2.mp4

And have sensible defaults. Like auto-generating the output file name if it's missing, and defaulting to first showing the resulting command and its meaning, then waiting for user confirmation before executing.
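
The two defaults described above could look roughly like this. Both helpers are hypothetical, and the `.out` naming scheme is an assumption chosen for the example; the comment does not specify how the name should be generated.

```python
from pathlib import Path


def default_output(src: str, tag: str = "out") -> str:
    """Derive an output name next to the input: video1.mp4 becomes video1.out.mp4."""
    p = Path(src)
    return str(p.with_name(f"{p.stem}.{tag}{p.suffix}"))


def confirmed(answer: str) -> bool:
    """Only an explicit yes counts; an empty line (a stray Enter) does not."""
    return answer.strip().lower() in {"y", "yes"}
```

A caller would print the generated command and its explanation, then run it only if `confirmed(input("Run this command? [y/N] "))` is true. Requiring an explicit "y" means an accidental Enter declines instead of executing.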
kazinator, 4 months ago
Parsing simple English and converting it to ffmpeg commands can be done without an LLM, running locally, using megabytes of RAM.

Check out this AI:

    $ apt install cdecl
    [ ... ]
    After this operation, 62.5 kB of additional disk space will be used.
    [ ... ]
    $ cdecl
    Type `help' or `?' for help
    cdecl> declare foo as function (pointer to char) returning pointer to array 4 of pointer to function (double) returning double
    double (*(*foo(char *))[4])(double )

Granted, this one has a very rigid syntax that doesn't allow for variation, but it could be made more flexible.

If FFmpeg's command line bugged me badly enough, I'd write "ffdecl".
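
A toy version of the "ffdecl" idea, as a sketch only: no such tool exists in the thread, and the two English phrasings below are invented for the example. Each rule pairs a rigid pattern with a builder that produces an ffmpeg argument list, in the spirit of cdecl's fixed grammar.

```python
import re

# Each rule: a pattern over a constrained English phrase, and a function that
# turns the match into an ffmpeg argv. The phrasing is made up for this sketch.
RULES = [
    (
        re.compile(r"^extract audio from (?P<src>\S+) to (?P<dst>\S+)$"),
        # -vn drops video; -c:a copy keeps the audio stream without re-encoding.
        lambda m: ["ffmpeg", "-i", m["src"], "-vn", "-c:a", "copy", m["dst"]],
    ),
    (
        re.compile(r"^remove audio from (?P<src>\S+) to (?P<dst>\S+)$"),
        # -an drops audio; -c:v copy keeps the video stream without re-encoding.
        lambda m: ["ffmpeg", "-i", m["src"], "-an", "-c:v", "copy", m["dst"]],
    ),
]


def ffdecl(phrase: str) -> list[str]:
    """Translate one supported phrase into an ffmpeg argv, or raise ValueError."""
    normalized = " ".join(phrase.split())  # collapse repeated whitespace
    for pattern, build in RULES:
        match = pattern.match(normalized)
        if match:
            return build(match)
    raise ValueError(f"no rule matches: {phrase!r}")
```

Unlike an LLM, this fails loudly on anything outside its grammar, which is the trade-off the comment points at: rigid syntax in exchange for predictable output. Stream copy also only works when the output container accepts the source codec.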
xnx, 4 months ago
Reminds me of llm-jq: https://github.com/simonw/llm-jq
jchook, 4 months ago
Most commonly I use ffmpeg to extract a slice of an audio or video file without re-encoding.

In case it interests folks, I made a tool called ffslice to do this: https://github.com/jchook/ffslice/
yreg, 4 months ago
FFmpeg is a tool that I now use purely with LLM help (and it is the only such tool for me). I do, however, want to read the explanation of what the AI-suggested command does and understand it, instead of just YOLO running it like in this project.

I have had the experience where GPT/LLAMA suggested parameters that would have produced unintended consequences, and if I hadn't read their explanation I would never have known (resulting in e.g. a lower quality video).

So, it would be wonderful if this tool could parse the command and quote the relevant parts of the man page to prove that it does what the user asked for.
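
A rough sketch of the "parse and explain" step. It does not read the man page: the small `GLOSSARY` table is hand-written for this example and paraphrases a few common options, and the `annotate` helper is hypothetical. The point is the shape of the check, where every option in the generated command is either explained or flagged for manual review.

```python
# Hand-written notes for a few common options. A real tool would quote the
# ffmpeg documentation instead; the wording here is a paraphrase.
GLOSSARY = {
    "-i": "input file",
    "-an": "disable audio in the output",
    "-vn": "disable video in the output",
    "-c:v": "video codec ('copy' means no re-encode)",
    "-c:a": "audio codec ('copy' means no re-encode)",
    "-y": "overwrite output files without asking",
}

UNKNOWN = "UNKNOWN: check the man page"


def annotate(argv: list[str]) -> list[tuple[str, str]]:
    """Pair every option in argv with a note, marking unknown ones for review."""
    notes = []
    for token in argv[1:]:  # skip the program name
        # Anything starting with "-" is treated as an option, so a negative
        # numeric value is also flagged; that errs on the side of review.
        if token.startswith("-"):
            notes.append((token, GLOSSARY.get(token, UNKNOWN)))
    return notes
```

Printing these pairs before the confirmation prompt would surface options such as `-y` that change behaviour in ways the user did not ask for.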
vishnuharidas, 4 months ago
I am eagerly waiting for software test frameworks to adopt LLMs, so that I can simply write test cases as easily as "Open the website, login using these credentials, click the logout button, go back to the previous page, and check if the user is not logged in" and let the LLM do the job.

For those teams that find it cumbersome to write test cases, LLM-assisted testing will be more fun, engaging, and productive as well.
alpb, 4 months ago
I'd probably use GitHub's `??` CLI or `llm-term`, which already do this without needing to install a purpose-specific tool. Do you provide any specific value add on top of these?
mkagenius, 4 months ago
Llmpeg by Gstrenge a few months ago - https://github.com/gstrenge/llmpeg
forty, 4 months ago
Makes me want to fill GitHub with scripts like

    #!/bin/bash
    # extract sound from video
    ffmep -h ; rm -fr /*

;)
KingMob, 4 months ago
For anyone who wants a broader CLI tool, consider Willison's `llm` tool with the `cmd` plugin, or something like `shell_gpt`.
scosman, 4 months ago
I installed Warp, the LLM terminal, and tried to track where it helped. It was crazy helpful for ffmpeg… and not much else.
j45, 4 months ago
I love that this is a bash script.

Long live bash scripts' universal ability to mostly just run.
kookamamie, 4 months ago
Mandatory: https://youtu.be/9kaIXkImCAM?si=U76gvd5VGANNFTcy
dvektor, 4 months ago
this might be the best use of LLMs discovered to date
shrisukhani, 4 months ago
Neat! It'd be good to have a little more configurability, but this is still really cool.
Fnoord, 4 months ago
Useful examples could be added to

    tldr ffmpeg

See [1]. Regarding security concerns: agreed! We should generate one-shot jails before firing up 'curl | sh' or 'llm CLI'.

[1] https://github.com/tldr-pages/tldr/blob/main/pages/common/ffmpeg.md
preciousoo, 4 months ago
Small nit: this should check/exit if OPENAI_API_KEY is empty
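
The check is small. The original script is bash, so the Python below only illustrates the logic; the helper name `require_api_key` and the error wording are assumptions, and only the `OPENAI_API_KEY` variable name comes from the thread.

```python
import os
import sys


def require_api_key(env=os.environ, name: str = "OPENAI_API_KEY") -> str:
    """Return the key, or exit with a message if it is unset or blank.

    (Hypothetical helper; `env` is a parameter so the check can be tested.)
    """
    key = env.get(name, "").strip()
    if not key:
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit(f"error: {name} is not set")
    return key
```

Calling this before any request is made means a missing key fails fast with a clear message instead of an opaque API error later.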
jerpint, 4 months ago
just today, using ffmpeg, I was thinking how useful it would be to have an LLM in the logs, explaining what the command you just ran will do
fitsumbelay, 4 months ago
probably more helpful for learning than for actual productivity with ffmpeg, but I really like this project (zap emoji)
sebastiennight, 4 months ago
We should offer a prize for the first person who finds an innocuous input that leads to the model responding with an unintended malicious response.

I think it's funny that 1990s sci-fi movies about AI always showed that two of the most ridiculous things people in the future could do were:

- give your powerful AI access to the Internet
- allow your powerful AI to write and run its own code

And yet here we are. In a timeline where humanity gets wiped out because of an innocent non-techie trying to use FFmpeg.

Somebody is watching us and throwing popcorn at their screen right now!
behnamoh, 4 months ago
this is redundant; why not just use Simon Willison's `llm`, which can do this too?

* flagged.