
Phi 4 available on Ollama

291 points by eadz, 4 months ago | 18 comments

sgk284, 4 months ago

Over the holidays, we published a post [1] on using high-precision few-shot examples to get `gpt-4o-mini` to perform similarly to `gpt-4o`. I just re-ran that same experiment, but swapped out `gpt-4o-mini` for `phi-4`.

`phi-4` really blew me away in terms of learning from few-shots. It measured as 97% consistent with `gpt-4o` when using high-precision few-shots! Without the few-shots, it was only 37% consistent. That's a huge improvement!

By contrast, with few-shots it performs as well as `gpt-4o-mini` (though `gpt-4o-mini`'s baseline without few-shots was 59%, quite a bit higher than `phi-4`'s).

[1] https://bits.logic.inc/p/getting-gpt-4o-mini-to-perform-like
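A minimal sketch of the few-shot setup described above: examples are injected as alternating user/assistant turns ahead of the real query. The task and examples here are hypothetical stand-ins, not the linked post's actual data or prompts.

```python
# Hypothetical few-shot examples; illustrative only, not from the linked post.
FEW_SHOTS = [
    ("Ticket: 'App crashes on login'", "bug"),
    ("Ticket: 'Please add dark mode'", "feature_request"),
]

def build_messages(system_prompt, few_shots, query):
    # Each few-shot pair becomes a worked user/assistant exchange,
    # so the model sees high-precision examples before the real question.
    messages = [{"role": "system", "content": system_prompt}]
    for example_input, example_output in few_shots:
        messages.append({"role": "user", "content": example_input})
        messages.append({"role": "assistant", "content": example_output})
    messages.append({"role": "user", "content": query})
    return messages

msgs = build_messages("Classify the support ticket.", FEW_SHOTS,
                      "Ticket: 'Search results are slow'")
```

The same message list can be sent to either model through whatever chat API is in use, which is what makes the model swap described above essentially a one-line change.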
t0lo, 4 months ago

Is anyone else blown away by how fast we got to running something this powerful locally? I know it's easy to get burnt out on LLMs, but this is pretty incredible.

I genuinely think we're only two years away from fully custom, local voice-to-voice LLM assistants that grow with you, like Joi in Blade Runner 2049, and it's going to change how we think about being human, being social, and growing up.
crorella, 4 months ago

It's odd that MS is releasing models that compete with OpenAI's. This reinforces the idea that there is no real strategic advantage in owning a model. I think the strategy now is to offer cheap, performant infrastructure to run the models.
mythz, 4 months ago

I was disappointed by all the Phi models before this, whose benchmark results scored way better than they worked in practice, but I've been really impressed with how good Phi-4 is at just 14B. We've run it against the top 1000 most popular StackOverflow questions and it came in 3rd, beating out GPT-4 and Sonnet 3.5 in our benchmarks, behind only DeepSeek v3 and WizardLM 8x22B [1]. We're using Mixtral 8x7B to grade the quality of the answers, which could explain how WizardLM (based on Mixtral 8x22B) took 2nd place.

Unfortunately I'm only getting 6 tok/s on an NVIDIA A4000, so it's still not great for real-time queries, but luckily, now that it's MIT licensed, it's available on OpenRouter [2] for a great price of $0.07/$0.14M at a fast 78 tok/s.

Because it yields better results and we're able to self-host Phi-4 for free, we've replaced Mistral NeMo with it in our default models for answering new questions [3].

[1] https://pvq.app/leaderboard

[2] https://openrouter.ai/microsoft/phi-4

[3] https://pvq.app/questions/ask
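OpenRouter exposes an OpenAI-compatible chat endpoint, so a request for `microsoft/phi-4` is roughly the payload below. This is a sketch only: the endpoint path and auth scheme are assumptions based on OpenRouter's standard OpenAI-style API, and no request is actually sent here.

```python
import json

def build_chat_request(question, model="microsoft/phi-4"):
    # OpenAI-style chat payload; OpenRouter accepts this shape at its
    # /api/v1/chat/completions endpoint with a Bearer API key (assumed).
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
    }

payload = build_chat_request("How do I reverse a list in Python?")
print(json.dumps(payload, indent=2))
```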
hbcondo714, 4 months ago

FWIW, Phi-4 was converted to Ollama by the community last month:

https://ollama.com/vanilj/Phi-4
raybb, 4 months ago

I was going to ask if this or other Ollama models support structured output (like JSON). Then a quick search revealed you can, as of a few weeks ago:

https://ollama.com/blog/structured-outputs
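Per the blog post above, Ollama's chat endpoint accepts a JSON Schema in a `format` field to constrain the output. A rough sketch of such a request follows; the `phi4` model tag is an assumption (use whatever `ollama list` shows), and nothing is actually sent to a server here.

```python
import json

# JSON Schema constraining the shape of the model's reply.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}

payload = {
    "model": "phi4",  # assumed model tag
    "messages": [{"role": "user", "content": "Describe a fictional person."}],
    "format": schema,  # structured-output constraint per the blog post
    "stream": False,
}

# With a local Ollama server running, this payload would be POSTed to
# http://localhost:11434/api/chat; here we only print it.
print(json.dumps(payload, indent=2))
```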
andhuman, 4 months ago

I've seen on the LocalLLaMA subreddit that some GGUFs have bugs in them. The recommended one was by Unsloth. However, I don't know how the Ollama GGUF holds up.
gnabgib, 4 months ago

Related: "Phi-4: Microsoft's Newest Small Language Model Specializing in Complex Reasoning" (439 points, 24 days ago, 144 comments) https://news.ycombinator.com/item?id=42405323

Also on Hugging Face: https://huggingface.co/microsoft/phi-4
summarity, 4 months ago

Does it include the Unsloth fixes yet?
mettamage, 4 months ago

How come models can be so small now? I don't know a lot about AI, but is there an ELI5 for a software engineer who knows a bit about AI?

For context: I've made some simple neural nets with backprop, and I read [1].

[1] http://neuralnetworksanddeeplearning.com/
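Not an answer from the thread, but a back-of-envelope way to think about the size question: at inference time the weights dominate memory, so the footprint is roughly parameter count times bytes per parameter, and quantization is what makes a 14B model fit on consumer hardware.

```python
# Rough memory estimate for a 14B-parameter model (my own arithmetic,
# ignoring activations and the KV cache).
params = 14e9
gb = 1e9  # decimal gigabytes for simplicity

fp16_gb = params * 2 / gb    # 2 bytes per weight at 16-bit precision
q4_gb = params * 0.5 / gb    # ~4 bits per weight after quantization

print(f"fp16: ~{fp16_gb:.0f} GB, 4-bit: ~{q4_gb:.0f} GB")  # ~28 GB vs ~7 GB
```

That fourfold shrink, plus training recipes aimed at small models (Phi's heavy use of curated synthetic data), is most of the story.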
k__, 4 months ago

"built upon a blend of synthetic datasets, data from filtered public domain websites, and acquired academic books and Q&A datasets"

Does this mean the model was trained without copyright infringement?
dartos, 4 months ago

Does this include some of the config fixes that the Unsloth guys pointed out?
kuatroka, 4 months ago

I've pulled and run it. It launches fine, but when I actually ask it anything, I constantly get just a blank line. Does anyone else experience this?
XCSme, 4 months ago

Does this have an "instruct" version? Or is it already sort of like that, since it was trained more on Q&A scenarios?
ionwake, 4 months ago

Can this run on a MacBook M1? What is the performance like? Or would I need an M3? Thanks.
buyucu, 4 months ago

I have unfortunately been disappointed with the llama.cpp/Ollama ecosystem of late, and I'm thinking about moving my things to vLLM instead.

llama.cpp basically dropped support for multimodal vision models. Ollama still does support them, but only a handful. Also, Ollama still does not support Vulkan, even though llama.cpp has had Vulkan support for a long time now.

This has been very sad to watch. I'm more and more convinced that vLLM is the way to go, not Ollama.
sega_sai, 4 months ago

I've just tried to make it run something, and I could not force it to put the Python code inside plain ``` ``` quotation marks. It always wants to put the word "python" after the three backticks, like this: ```python ...code... ```. I wonder if that's a result of training. (I use the LLM output to then run the resulting code.)
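One workaround (my own sketch, not from the thread) is to tolerate the language tag when extracting code from the model's output, so both fence styles become runnable:

```python
import re

FENCE = "`" * 3  # three backticks, built programmatically to keep this example readable

def extract_code(llm_output):
    # Match a fenced block, allowing an optional language tag
    # (e.g. "python") right after the opening fence, and return the body.
    pattern = FENCE + r"[\w+-]*\n(.*?)" + FENCE
    match = re.search(pattern, llm_output, re.DOTALL)
    return match.group(1) if match else llm_output

sample = FENCE + "python\nprint('hi')\n" + FENCE
print(extract_code(sample))  # -> print('hi')
```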
v3ss0n, 4 months ago

Translation: Phi-4 is available on llama.cpp.