TE
科技回声
首页24小时热榜最新最佳问答展示工作
GitHubTwitter
首页

科技回声

基于 Next.js 构建的科技新闻平台,提供全球科技新闻和讨论内容。

GitHubTwitter

首页

首页最新最佳问答展示工作

资源链接

HackerNews API原版 HackerNewsNext.js

© 2025 科技回声. 版权所有。

Ask HN: Is anyone doing anything cool with tiny language models?

684 点作者 prettyblocks4 个月前
I mean anything in the 0.5B-3B range that's available on Ollama (for example). Have you built any cool tooling that uses these models as part of your work flow?

65 条评论

kaspermarstal4 个月前
I built an Excel Add-In that allows my girlfriend to quickly filter 7000 paper titles and abstracts for a review paper that she is writing [1]. It uses Gemma 2 2b which is a wonderful little model that can run on her laptop CPU. It works surprisingly well for this kind of binary classification task.<p>The nice thing is that she can copy&#x2F;paste the titles and abstracts in to two columns and write e.g. &quot;=PROMPT(A1:B1, &quot;If the paper studies diabetic neuropathy and stroke, return &#x27;Include&#x27;, otherwise return &#x27;Exclude&#x27;&quot;)&quot; and then drag down the formula across 7000 rows to bulk process the data on her own because it&#x27;s just Excel. There is a gif on the readme on the Github repo that shows it.<p>[1] <a href="https:&#x2F;&#x2F;github.com&#x2F;getcellm&#x2F;cellm">https:&#x2F;&#x2F;github.com&#x2F;getcellm&#x2F;cellm</a>
评论 #42791646 未加载
评论 #42793924 未加载
评论 #42790494 未加载
评论 #42796501 未加载
评论 #42805657 未加载
评论 #42791645 未加载
评论 #42790265 未加载
评论 #42813125 未加载
评论 #42790901 未加载
评论 #42812155 未加载
评论 #42790359 未加载
评论 #42795545 未加载
antonok4 个月前
I&#x27;ve been using Llama models to identify cookie notices on websites, for the purpose of adding filter rules to block them in EasyList Cookie. Otherwise, this is normally done by, essentially, manual volunteer reporting.<p>Most cookie notices turn out to be pretty similar, HTML&#x2F;CSS-wise, and then you can grab their `innerText` and filter out false positives with a small LLM. I&#x27;ve found the 3B models have decent performance on this task, given enough prompt engineering. They do fall apart slightly around edge cases like less common languages or combined cookie notice + age restriction banners. 7B has a negligible false-positive rate without much extra cost. Either way these things are really fast and it&#x27;s amazing to see reports streaming in during a crawl with no human effort required.<p>Code is at <a href="https:&#x2F;&#x2F;github.com&#x2F;brave&#x2F;cookiemonster">https:&#x2F;&#x2F;github.com&#x2F;brave&#x2F;cookiemonster</a>. You can see the prompt at <a href="https:&#x2F;&#x2F;github.com&#x2F;brave&#x2F;cookiemonster&#x2F;blob&#x2F;main&#x2F;src&#x2F;text-classification.mjs#L12">https:&#x2F;&#x2F;github.com&#x2F;brave&#x2F;cookiemonster&#x2F;blob&#x2F;main&#x2F;src&#x2F;text-cl...</a>.
评论 #42793119 未加载
评论 #42786896 未加载
评论 #42793157 未加载
评论 #42786891 未加载
Evidlo4 个月前
I have ollama responding to SMS spam texts. I told it to feign interest in whatever the spammer is selling&#x2F;buying. Each number gets its own persona, like a millennial gymbro or 19th century British gentleman.<p><a href="http:&#x2F;&#x2F;files.widloski.com&#x2F;image10%20(1).png" rel="nofollow">http:&#x2F;&#x2F;files.widloski.com&#x2F;image10%20(1).png</a><p><a href="http:&#x2F;&#x2F;files.widloski.com&#x2F;image11.png" rel="nofollow">http:&#x2F;&#x2F;files.widloski.com&#x2F;image11.png</a>
评论 #42787151 未加载
评论 #42787781 未加载
评论 #42789860 未加载
评论 #42795730 未加载
评论 #42786904 未加载
评论 #42786974 未加载
评论 #42796084 未加载
评论 #42794672 未加载
评论 #42795824 未加载
评论 #42787231 未加载
评论 #42789419 未加载
behohippy4 个月前
I have a mini PC with an n100 CPU connected to a small 7&quot; monitor sitting on my desk, under the regular PC. I have llama 3b (q4) generating endless stories in different genres and styles. It&#x27;s fun to glance over at it and read whatever it&#x27;s in the middle of making. I gave llama.cpp one CPU core and it generates slow enough to just read at a normal pace, and the CPU fans don&#x27;t go nuts. Totally not productive or really useful but I like it.
评论 #42786114 未加载
评论 #42785325 未加载
评论 #42786081 未加载
评论 #42785192 未加载
评论 #42785253 未加载
评论 #42787856 未加载
nozzlegear4 个月前
I have a small fish script I use to prompt a model to generate three commit messages based off of my current git diff. I&#x27;m still playing around with which model comes up with the best messages, but usually I only use it to give me some ideas when my brain isn&#x27;t working. All the models accomplish that task pretty well.<p>Here&#x27;s the script: <a href="https:&#x2F;&#x2F;github.com&#x2F;nozzlegear&#x2F;dotfiles&#x2F;blob&#x2F;master&#x2F;fish-functions&#x2F;gen_commit_msg.fish">https:&#x2F;&#x2F;github.com&#x2F;nozzlegear&#x2F;dotfiles&#x2F;blob&#x2F;master&#x2F;fish-func...</a><p>And for this change [1] it generated these messages:<p><pre><code> 1. `fix: change from printf to echo for handling git diff input` 2. `refactor: update codeblock syntax in commit message generator` 3. `style: improve readability by adjusting prompt formatting` </code></pre> [1] <a href="https:&#x2F;&#x2F;github.com&#x2F;nozzlegear&#x2F;dotfiles&#x2F;commit&#x2F;0db65054524d0d2e706cbcf57e8067b878b3358b">https:&#x2F;&#x2F;github.com&#x2F;nozzlegear&#x2F;dotfiles&#x2F;commit&#x2F;0db65054524d0d...</a>
评论 #42790370 未加载
评论 #42795733 未加载
评论 #42788793 未加载
评论 #42790595 未加载
sidravi14 个月前
We fine-tuned a Gemma 2B to identify urgent messages sent by new and expecting mothers on a government-run maternal health helpline.<p><a href="https:&#x2F;&#x2F;idinsight.github.io&#x2F;tech-blog&#x2F;blog&#x2F;enhancing_maternal_healthcare&#x2F;" rel="nofollow">https:&#x2F;&#x2F;idinsight.github.io&#x2F;tech-blog&#x2F;blog&#x2F;enhancing_materna...</a>
评论 #42788954 未加载
评论 #42793587 未加载
评论 #42790308 未加载
评论 #42801392 未加载
flippyhead4 个月前
I have a tiny device that listens to conversations between two people or more and constantly tries to declare a &quot;winner&quot;
评论 #42787108 未加载
评论 #42786672 未加载
评论 #42785979 未加载
评论 #42785791 未加载
评论 #42890452 未加载
评论 #42785781 未加载
评论 #42785949 未加载
评论 #42786455 未加载
评论 #42785970 未加载
评论 #42807514 未加载
评论 #42788937 未加载
评论 #42788174 未加载
评论 #42789840 未加载
评论 #42791711 未加载
simonjgreen4 个月前
Micro Wake Word is a library and set of on device models for ESPs to wake on a spoken wake word. <a href="https:&#x2F;&#x2F;github.com&#x2F;kahrendt&#x2F;microWakeWord">https:&#x2F;&#x2F;github.com&#x2F;kahrendt&#x2F;microWakeWord</a><p>Recently deployed in Home Assistants fully local capable Alexa replacement. <a href="https:&#x2F;&#x2F;www.home-assistant.io&#x2F;voice_control&#x2F;about_wake_word&#x2F;" rel="nofollow">https:&#x2F;&#x2F;www.home-assistant.io&#x2F;voice_control&#x2F;about_wake_word&#x2F;</a>
评论 #42789733 未加载
RhysU4 个月前
&quot;Comedy Writing With Small Generative Models&quot; by Jamie Brew (Strange Loop 2023)<p><a href="https:&#x2F;&#x2F;m.youtube.com&#x2F;watch?v=M2o4f_2L0No" rel="nofollow">https:&#x2F;&#x2F;m.youtube.com&#x2F;watch?v=M2o4f_2L0No</a><p>Spend the 45 minutes watching this talk. It is a delight. If you are unsure, wait until the speaker picks up the guitar.
评论 #42784951 未加载
评论 #42798484 未加载
azhenley4 个月前
Microsoft published a paper on their FLAME model (60M parameters) for Excel formula repair&#x2F;completion which outperformed much larger models (&gt;100B parameters).<p><a href="https:&#x2F;&#x2F;arxiv.org&#x2F;abs&#x2F;2301.13779" rel="nofollow">https:&#x2F;&#x2F;arxiv.org&#x2F;abs&#x2F;2301.13779</a>
评论 #42785415 未加载
评论 #42788633 未加载
评论 #42785673 未加载
评论 #42785270 未加载
computers33334 个月前
<a href="https:&#x2F;&#x2F;gophersignal.com" rel="nofollow">https:&#x2F;&#x2F;gophersignal.com</a> – I built GopherSignal!<p>It&#x27;s a lightweight tool that summarizes Hacker News articles. For example, here’s what it outputs for this very post, &quot;Ask HN: Is anyone doing anything cool with tiny language models?&quot;:<p>&quot;A user inquires about the use of tiny language models for interesting applications, such as spam filtering and cookie notice detection. A developer shares their experience with using Ollama to respond to SMS spam with unique personas, like a millennial gymbro or a 19th-century British gentleman. Another user highlights the effectiveness of 3B and 7B language models for cookie notice detection, with decent performance achieved through prompt engineering.&quot;<p>I originally used LLaMA 3:Instruct for the backend, which performs much better, but recently started experimenting with the smaller LLaMA 3.2:1B model.<p>It’s been cool seeing other people’s ideas too. Curious—does anyone have suggestions for small models that are good for summaries?<p>Feel free to check it out or make changes: <a href="https:&#x2F;&#x2F;github.com&#x2F;k-zehnder&#x2F;gophersignal">https:&#x2F;&#x2F;github.com&#x2F;k-zehnder&#x2F;gophersignal</a>
评论 #42791453 未加载
评论 #42809880 未加载
评论 #42819063 未加载
deet4 个月前
We (avy.ai) are using models in that range to analyze computer activity on-device, in a privacy sensitive way, to help knowledge workers as they go about their day.<p>The local models do things ranging from cleaning up OCR, to summarizing meetings, to estimating the user&#x27;s current goals and activity, to predicting search terms, to predicting queries and actions that, if run, would help the user accomplish their current task.<p>The capabilities of these tiny models have really surged recently. Even small vision models are becoming useful, especially if fine tuned.
评论 #42791416 未加载
mettamage4 个月前
I simply use it to de-anonymize code that I typed in via Claude<p>Maybe should write a plugin for it (open source):<p>1. Put in all your work related questions in the plugin, an LLM will make it as an abstract question for you to preview and send it<p>2. And then get the answer with all the data back<p>E.g. df[“cookie_company_name”] becomes df[“a”] and back
评论 #42785696 未加载
评论 #42788777 未加载
评论 #42784789 未加载
评论 #42785808 未加载
jwitthuhn4 个月前
I&#x27;ve made a tiny ~1m parameter model that can generate random Magic the Gathering cards that is largely based on Karpathy&#x27;s nanogpt with a few more features added on top.<p>I don&#x27;t have a pre-trained model to share but you can make one yourself from the git repo, assuming you have an apple silicon mac.<p><a href="https:&#x2F;&#x2F;github.com&#x2F;jlwitthuhn&#x2F;TCGGPT">https:&#x2F;&#x2F;github.com&#x2F;jlwitthuhn&#x2F;TCGGPT</a>
deivid4 个月前
Not sure it qualifies, but I&#x27;ve started building an Android app that wraps bergamot[0] (the firefox translation models) to have on-device translation without reliance on google.<p>Bergamot is already used inside firefox, but I wanted translation also outside the browser.<p>[0]: bergamot <a href="https:&#x2F;&#x2F;github.com&#x2F;browsermt&#x2F;bergamot-translator">https:&#x2F;&#x2F;github.com&#x2F;browsermt&#x2F;bergamot-translator</a>
评论 #42789246 未加载
评论 #42786996 未加载
ata_aman4 个月前
I have it running on a Raspberry Pi 5 for offline chat and RAG. I wrote this open-source code for it: <a href="https:&#x2F;&#x2F;github.com&#x2F;persys-ai&#x2F;persys">https:&#x2F;&#x2F;github.com&#x2F;persys-ai&#x2F;persys</a><p>It also does RAG on apps there, like the music player, contacts app and to-do app. I can ask it to recommend similar artists to listen to based on my music library for example or ask it to quiz me on my PDF papers.
评论 #42788228 未加载
bashbjorn4 个月前
I&#x27;m working on a plugin[1] that runs local LLMs from the Godot game engine. The optimal model sizes seem to be 2B-7B ish, since those will run fast enough on most computers. We recommend that people try it out with Gemma 2 2B (but it will work with any model that works with llama.cpp)<p>At those sizes, it&#x27;s great for generating non-repetitive flavortext for NPCs. No more &quot;I took an arrow to the knee&quot;.<p>Models at around the 2B size aren&#x27;t really capable enough to act a competent adversary - but they are great for something like bargaining with a shopkeeper, or some other role where natural language can let players do a bit more immersive roleplay.<p>[1] <a href="https:&#x2F;&#x2F;github.com&#x2F;nobodywho-ooo&#x2F;nobodywho">https:&#x2F;&#x2F;github.com&#x2F;nobodywho-ooo&#x2F;nobodywho</a>
评论 #42794698 未加载
mritchie7124 个月前
I used local LLMs via Ollama for generating H1&#x27;s &#x2F; marketing copy.<p>1. Create several different personas<p>2. Generate a ton of variation using a high temperature<p>3. Compare the variagtions head-to-head using the LLM to get a win &#x2F; loss ratio<p>The best ones can be quite good.<p>0 - <a href="https:&#x2F;&#x2F;www.definite.app&#x2F;blog&#x2F;overkillm" rel="nofollow">https:&#x2F;&#x2F;www.definite.app&#x2F;blog&#x2F;overkillm</a>
评论 #42790343 未加载
评论 #42788292 未加载
psyklic4 个月前
JetBrains&#x27; local single-line autocomplete model is 0.1B (w&#x2F; 1536-token context, ~170 lines of code): <a href="https:&#x2F;&#x2F;blog.jetbrains.com&#x2F;blog&#x2F;2024&#x2F;04&#x2F;04&#x2F;full-line-code-completion-in-jetbrains-ides-all-you-need-to-know&#x2F;#under-the-hood" rel="nofollow">https:&#x2F;&#x2F;blog.jetbrains.com&#x2F;blog&#x2F;2024&#x2F;04&#x2F;04&#x2F;full-line-code-co...</a><p>For context, GPT-2-small is 0.124B params (w&#x2F; 1024-token context).
评论 #42785838 未加载
评论 #42785728 未加载
评论 #42785009 未加载
评论 #42786326 未加载
jbentley14 个月前
Tiny language models can do a lot if they are fine tuned for a specific task, but IMO a few things are holding them back:<p>1. Getting the speed gains is hard unless you are able to pay for dedicated GPUs. Some services offer LoRA as serverless but you don&#x27;t get the same performance for various technical reasons.<p>2. Lack of talent to actually do the finetuning. Regular engineers can do a lot of LLM implementation, but when it comes to actually performing training it is a scarcer skillset. Most small to medium orgs don&#x27;t have people who can do it well.<p>3. Distribution. Sharing finetunes is hard. HuggingFace exists, but discoverability is an issue. It is flooded with random models with no documentation and it isn&#x27;t easy to find a good oen for your task. Plus, with a good finetune you also need the prompt and possibly parsing code to make it work the way it is intended and the bundling hasn&#x27;t been worked out well.
评论 #42793782 未加载
gpm4 个月前
I made a shell alias to translate things from French to English, does that count?<p><pre><code> function trans llm &quot;Translate \&quot;$argv\&quot; from French to English please&quot; end </code></pre> Llama 3.2:3b is a fine French-English dictionary IMHO.
评论 #42790203 未加载
ignoramous4 个月前
We&#x27;re prototyping a text firewall (for Android) with Gemma2 2B (which limits us to English), though DeepSeek&#x27;s R1 variants now look pretty promising [0]: Depending on the content, we rewrite the text or quarantine it from your view. Of course this is easy (for English) in the sense that the core logic is all LLMs [1], but the integration points (on Android) are not so straight forward for anything other than SMS. [2]<p>A more difficult problem we forsee is to turn it into a real-time (online) firewall (for calls, for example).<p>[1] <a href="https:&#x2F;&#x2F;chat.deepseek.com&#x2F;a&#x2F;chat&#x2F;s&#x2F;d5aeeda1-fefe-4fc6-8c90-20effc1dd7a4" rel="nofollow">https:&#x2F;&#x2F;chat.deepseek.com&#x2F;a&#x2F;chat&#x2F;s&#x2F;d5aeeda1-fefe-4fc6-8c90-2...</a><p>[1] MediaPipe in particular makes it simple to prototype around Gemma2 on Android: <a href="https:&#x2F;&#x2F;ai.google.dev&#x2F;edge&#x2F;mediapipe&#x2F;solutions&#x2F;genai&#x2F;llm_inference&#x2F;android" rel="nofollow">https:&#x2F;&#x2F;ai.google.dev&#x2F;edge&#x2F;mediapipe&#x2F;solutions&#x2F;genai&#x2F;llm_inf...</a><p>[2] Intend to open source it once we get it working for anything other than SMSes
评论 #42895850 未加载
JLCarveth4 个月前
I used a small (3b, I think) model plus tesseract.js to perform OCR on an image of a nutritional facts table and output structured JSON.
评论 #42789249 未加载
评论 #42790829 未加载
评论 #42789735 未加载
eb0la4 个月前
We&#x27;re using small language models to detect prompt injection. Not too cool, but at least we can publish some AI-related stuff on the internet without a huge bill.
评论 #42785713 未加载
juancroldan4 个月前
I&#x27;m making an agent that takes decompiled code and tries to understand the methods and replace variables and function names one at a time.
评论 #42791477 未加载
cwmoore4 个月前
I&#x27;m playing with the idea of identifying logical fallacies stated by live broadcasters.
评论 #42787653 未加载
评论 #42798458 未加载
评论 #42791080 未加载
评论 #42788090 未加载
评论 #42788889 未加载
评论 #42787010 未加载
评论 #42793882 未加载
评论 #42798043 未加载
spiritplumber4 个月前
My husband and me made a stock market analysis thing that gets it right about 55% of the time, so better than a coin toss. The problem is that it keeps making unethical suggestions, so we&#x27;re not using it to trade stock. Does anyone have any idea what we can do with that?
评论 #42787375 未加载
评论 #42787659 未加载
评论 #42787185 未加载
评论 #42787033 未加载
评论 #42787294 未加载
评论 #42789438 未加载
iamnotagenius4 个月前
No, but I use llama 3.2 1b and qwen2.5 1.5 as bash oneliner generator, always runnimg in console.
评论 #42785424 未加载
评论 #42786003 未加载
mrmage4 个月前
I am building GitHub-Copilot style AI autocomplete in any text field on your Mac. The point is to have the AI fill in all the redundant words required by human language, while you provide the entropy (i.e. the words that are unique to what you are trying to express). It is kind of a &quot;dance&quot; between accepting the AI&#x27;s suggested words and typing yourself to keep it going in the right direction.<p>Using it, I find myself often writing only the first half of most words, because the second part can usually already be guessed by the AI. In fact, it has a dedicated shortcut for accepting only the first word of the suggestion — that way, it can save you some typing even when later words deviate from your original intent.<p>Completions are generated in real-time locally on your Mac using a variety of models (primarily Qwen 2.5 1.5B).<p>It is currently in open beta: <a href="https:&#x2F;&#x2F;cotypist.app" rel="nofollow">https:&#x2F;&#x2F;cotypist.app</a>
评论 #42839984 未加载
jmward014 个月前
I think I am. At least I think I&#x27;m building things that will enable much smaller models: <a href="https:&#x2F;&#x2F;github.com&#x2F;jmward01&#x2F;lmplay&#x2F;wiki&#x2F;Sacrificial-Training">https:&#x2F;&#x2F;github.com&#x2F;jmward01&#x2F;lmplay&#x2F;wiki&#x2F;Sacrificial-Training</a>
jothflee4 个月前
when i feel like casually listening to something, instead of netflix&#x2F;hulu&#x2F;whatever, i&#x27;ll run a ~3b model (qwen 2.5 or llama 3.2) and generate and audio stream of water cooler office gossip. (when it is up, it runs here: <a href="https:&#x2F;&#x2F;water-cooler.jothflee.com" rel="nofollow">https:&#x2F;&#x2F;water-cooler.jothflee.com</a>).<p>some of the situations get pretty wild, for the office :)
评论 #42796474 未加载
评论 #42788384 未加载
lightning194 个月前
not sure if it is cool but, purely out of spite, I&#x27;m building a LLM summarizer app to compete with a AI startup that I interviewed with. The founders were super egotistical and initially thought I was not worthy of an interview.
ceritium4 个月前
I am doing nothing, but I was wondering if it would make sense to combine a small LLM and SQLITE to parse date time human expressions. For example, given a human input like &quot;last day of this month&quot;, the LLM will generate the following query `SELECT date(&#x27;now&#x27;,&#x27;start of month&#x27;,&#x27;+1 month&#x27;,&#x27;-1 day&#x27;);`<p>It is probably super overengineering, considering that pretty good libraries are already doing that on different languages, but it would be funny. I did some tests with chatGPT, and it worked sometimes. It would probably work with some fine-tuning, but I don&#x27;t have the experience or the time right now.
评论 #42790583 未加载
评论 #42791143 未加载
评论 #42812619 未加载
lormayna4 个月前
I am using smollm2 to extract some useful information (like remote, language, role, location, etc.) from &quot;Who is hiring&quot; monthly thread and create an RSS feed with specific filter. Still not ready for Show HN, but working.
arionhardison4 个月前
I am, in a way by using EHR&#x2F;EMR data for fine tuning so agents can query each other for medical records in a HIPPA compliant manner.
sauravpanda4 个月前
We are building a framework to run this tiny language model in the web so anyone can access private LLMs in their browser: <a href="https:&#x2F;&#x2F;github.com&#x2F;sauravpanda&#x2F;BrowserAI">https:&#x2F;&#x2F;github.com&#x2F;sauravpanda&#x2F;BrowserAI</a>.<p>With just three lines of code, you can run Small LLM models inside the browser. We feel this unlocks a ton of potential for businesses so that they can introduce AI without fear of cost and can personalize the experience using AI.<p>Would love your thoughts and what we can do more or better!
评论 #42792474 未加载
thetrash4 个月前
I programmed my own version of Tic Tac Toe in Godot, using a Llama 3B as the AI opponent. Not for work flow, but figuring out how to beat it is entertaining during moments of boredom.
评论 #42787006 未加载
danbmil994 个月前
Using llama 3.2 as an interface to a robot. If you can get the latency down, it works wonderfully
评论 #42788820 未加载
linsomniac4 个月前
I have this idea that a tiny LM would be good at canonicalizing entered real estate addresses. We currently buy a data set and software from Experian, but it feels like something an LM might be very good at. There are lots of weirdnesses in address entry that regexes have a hard time with. We know the bulk of addresses a user might be entering, unless it&#x27;s a totally new property, so we should be able to train it on that.
评论 #42798829 未加载
kianN4 个月前
I don’t know if this counts as tiny but I use llama 3B in prod for summarization (kinda).<p>Its effective context window is pretty small but I have a much more robust statistical model that handles thematic extraction. The llm is essentially just rewriting ~5-10 sentences into a single paragraph.<p>I’ve found the less you need the language model to actually do, the less the size&#x2F;quality of the model actually matters.
A4ET8a8uTh0_v24 个月前
Kinda? All local so very much personal, non-business use. I made Ollama talk in a specific persona styles with the idea of speaking like Spider Jerusalem, when I feel like retaining some level of privacy by avoiding phrases I would normally use. Uncensored llama just rewrites my post with a specific persona&#x27;s &#x27;voice&#x27;. Works amusingly well for that purpose.
addandsubtract4 个月前
I use a small model to rename my Linux ISOs. I gave it a custom prompt with examples of how I want the output filenames to be structured and then just feed it files to rename. The output only works 90ish percent of the time, so I wrote a little CLI to iterate through the files and accept &#x2F; retry &#x2F; edit the changes the LLM outputs.
dh10114 个月前
I copied all the text from this post and used an LLM to generate a list of all the ideas. I do the same for other similar HN post .
评论 #42788487 未加载
评论 #42793163 未加载
merwijas4 个月前
I put llama 3 on a RBPi 5 and have it running a small droid. I added a TTS engine so it can hear spoken prompts which it replies to in droid speak. It also has a small screen that translates the response to English. I gave it a backstory about being a astromech droid so it usually just talks about the hyperdrive but it&#x27;s fun.
reeeeee4 个月前
I built a platform to monitor LLMs that are given complete freedom in the form of a Docker container bash REPL. Currently the models have been offline for some time because I&#x27;m upgrading from a single DELL to a TinyMiniMicro Proxmox cluster to run multiple small LLMs locally.<p>The bots don&#x27;t do a lot of interesting stuff though, I plan to add the following functionalities:<p>- Instead of just resetting every 100 messages, I&#x27;m going to provide them with a rolling window of context.<p>- Instead of only allowing BASH commands, they will be able to also respond with reasoning messages, hopefully to make them a bit smarter.<p>- Give them a better docker container with more CLI tools such as curl and a working package manager.<p>If you&#x27;re interested in seeing the developments, you can subscribe on the platform!<p><a href="https:&#x2F;&#x2F;lama.garden" rel="nofollow">https:&#x2F;&#x2F;lama.garden</a>
krystofee4 个月前
Has anyone ever tried to do some automatic email workflow autoresponder agents?<p>Lets say, I want some outcome and it will autonomousl handle the process prompt me and the other side for additional requirements if necessary and then based on that handle the process and reach the outcome?
ahrjay4 个月前
I built <a href="https:&#x2F;&#x2F;ffprompt.ryanseddon.com" rel="nofollow">https:&#x2F;&#x2F;ffprompt.ryanseddon.com</a> using the chrome ai (Gemini nano). Allows you to do ffmpeg operations on videos using natural language all client side.
评论 #42793833 未加载
guywithahat4 个月前
I&#x27;ve been working on a self-hosted, low-latency service for small LLM&#x27;s. It&#x27;s basically exactly what I would have wanted when I started my previous startup. The goal is for real time applications, where even the network time to access a fast LLM like groq is an issue.<p>I haven&#x27;t benchmarked it yet but I&#x27;d be happy to hear opinions on it. It&#x27;s written in C++ (specifically not python), and is designed to be a self-contained microservice based around llama.cpp.<p><a href="https:&#x2F;&#x2F;github.com&#x2F;thansen0&#x2F;fastllm.cpp">https:&#x2F;&#x2F;github.com&#x2F;thansen0&#x2F;fastllm.cpp</a>
herol3oy4 个月前
I&#x27;ve created Austen [0] to generate relationships between book characters using Mermaid.<p>[0] <a href="https:&#x2F;&#x2F;github.com&#x2F;herol3oy&#x2F;austen">https:&#x2F;&#x2F;github.com&#x2F;herol3oy&#x2F;austen</a>
Thews4 个月前
Before ollama and the others could do structured JSON output, I hacked together my own loop to correct the output. I used it that for dummy API endpoints to pretend to be online services but available locally, to pair with UI mockups. For my first test I made a recipe generator and then tried to see what it would take to &quot;jailbreak&quot; it. I also used uncensored models to allow it to generate all kinds of funny content.<p>I think the content you can get from the SLMs for fake data is a lot more engaging than say the ruby ffaker library.
codazoda4 个月前
I had an LLM create a playlist for me.<p>I’m tired of the bad playlists I get from algorithms, so I made a specific playlist with an Llama2 based on several songs I like. I started with 50, removed any I didn’t like, and added more to fill in the spaces. The small models were pretty good at this. Now I have a decent fixed playlist. It does get “tired” after a few weeks and I need to add more to it. I’ve never been able to do this myself with more than a dozen songs.
评论 #42794243 未加载
评论 #42792207 未加载
评论 #42788087 未加载
评论 #42790316 未加载
sharnabeel4 个月前
I have tired but chinese to english but it isn&#x27;t good(none of them are), because for Chinese words meaning different depending on context so i am just stuck with large model but sometimes even they leave chinese text in translation(like google gemina 2),<p>I really hope there would be some amazing models this year for translation.
sebazzz4 个月前
I built auto-summarization and grouping in an experimental branch of my hobby-retrospective tool: <a href="https:&#x2F;&#x2F;github.com&#x2F;Sebazzz&#x2F;Return&#x2F;tree&#x2F;experiment&#x2F;ai-integration">https:&#x2F;&#x2F;github.com&#x2F;Sebazzz&#x2F;Return&#x2F;tree&#x2F;experiment&#x2F;ai-integra...</a><p>I’m now just wondering if there is any way to build tests on the input+output of the LLM :D
mogaal4 个月前
I bought a tiny business in Brazil, the database (Excel) I inherited with previous customer data *do not include gender*. I need gender to start my marketing campaigns and learn more about my future customer. I used Gemma-2B and Python to determine gender based on the data and it worked perfect
评论 #42791935 未加载
kolinko4 个月前
Apple’s on device models are around 3B if I’m nit mistaken, and they developed some nice tech around them that they published, if I’m not mistaken - where they have just one model, but have switchable finetunings of that model so that it can perform different functionalities depending on context.
itskarad4 个月前
I&#x27;m using ollama for parsing and categorizing scraped jobs for a local job board dashboard I check everyday.
accrual4 个月前
Although there are better ways to test, I used a 3B model to speed up replies from my local AI server when testing out an application I was developing. Yes I could have mocked up HTTP replies etc., but in this case the small model let me just plug in and go.
HexDecOctBin4 个月前
Is there any experiments in a small models that does paraphrasing? I tried hsing some off-the-shelf models, but it didn&#x27;t go well.<p>I was thinking of hooking them in RPGs with text-based dialogue, so that a character will say something slightly different every time you speak to them.
评论 #42791485 未加载
jftuga4 个月前
I&#x27;m using ollama, llama3.2 3b, and python to shorten news article titles to 10 words or less. I have a 3 column web site with a list of news articles in the middle column. Some of the titles are too long for this format, but the shorter titles appear OK.
ittaboba4 个月前
I am building a private text editor that runs LLMs locally <a href="https:&#x2F;&#x2F;manzoni.app&#x2F;" rel="nofollow">https:&#x2F;&#x2F;manzoni.app&#x2F;</a>
evacchi4 个月前
I&#x27;m interested in finding tiny models to create workflows stringing together several function&#x2F;tools and running them on device using mcp.run servlets on Android (disclaimer: I work on that)
panchicore34 个月前
I am moderating a playlists manager to restrict them to a range of genders so it classifies song requests as accepted&#x2F;rejected.
numba8884 个月前
Many interesting projects, cool. I&#x27;m waiting to LLMs in games. That would make them much more fun. Any time now...
评论 #42796550 未加载
kristopolous4 个月前
I&#x27;m working on using them for agentic voice commands of a limited scope.<p>My needs are narrow and limited but I want a bit of flexibility.
Havoc4 个月前
Pretty sure they are mostly used as fine tuning targets, rather than as-is.
评论 #42785562 未加载