科技回声

11 comments

simonw, about 2 months ago

I've been having some very impressive results from Gemini 2.5 Pro for complex coding tasks in the few hours I've been experimenting with it so far.

I added a section about that to my review last night describing two of the larger examples: https://simonwillison.net/2025/Mar/25/gemini/#update-it-s-very-good-at-code

(It's always risky saying anything like this on a forum like Hacker News because it's inevitable someone will find a way to argue that the examples are trivial/unrealistic/show I don't know what I'm doing/clearly just regurgitated from StackOverflow/etc, but I'll take the risk anyway.)

tkgally, about 2 months ago

Very cool. Inspired by the post, I gave the following prompt to Gemini 2.5 Pro with the temperature set to 0.35:

“Create a music synthesizer that I can run in my browser. Give it original functionality, with buttons and other controls that create pleasing but unexpected sounds, melodies, rhythmic patterns, etc. as well as various sliders and dials that can be used to modify the sounds and patterns.”

Here is the result:

https://www.gally.net/temp/202540326geminisynthesizer/index.html

Here’s an excerpt from the reasoning trace:

> Brainstorm "Original" and "Unexpected" Functionality:
> Standard Synths: Oscillators (sine, square, saw, triangle), filters (lowpass, highpass), envelopes (ADSR), LFOs. These are common. How to make them unexpected?
> Generative Aspects: Instead of just playing notes the user clicks, what if the synth generates patterns?
> Melody Generation: Use algorithms like random walks, cellular automata, or simple probability rules.
> Rhythm Generation: Euclidean rhythms, probabilistic triggers, evolving patterns. ...
> Interface Concept: How can the controls themselves be part of the "unexpected" experience? Buttons that trigger complex events, sliders with non-linear responses, dials that control abstract concepts.
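
For readers unfamiliar with the "Euclidean rhythms" mentioned in the reasoning trace, here is a minimal Python sketch of the idea: spread a number of hits as evenly as possible over a number of steps. It is not taken from the generated synthesizer (which runs in the browser), and the function names `euclidean_rhythm` and `render` are made up for illustration. It uses a simple modular-arithmetic formulation rather than Bjorklund's algorithm, so for some inputs the pattern may be a rotation of the one Bjorklund's algorithm would produce.

```python
def euclidean_rhythm(pulses: int, steps: int) -> list[bool]:
    """Distribute `pulses` hits as evenly as possible across `steps` slots.

    Illustrative helper (not from the thread). Step i is a hit whenever
    (i * pulses) mod steps wraps around, which spaces the hits evenly.
    """
    if steps <= 0 or not 0 <= pulses <= steps:
        raise ValueError("need steps > 0 and 0 <= pulses <= steps")
    return [(i * pulses) % steps < pulses for i in range(steps)]


def render(pattern: list[bool]) -> str:
    """Show a pattern as text: 'x' for a hit, '.' for a rest."""
    return "".join("x" if hit else "." for hit in pattern)


print(render(euclidean_rhythm(3, 8)))  # prints "x..x..x."
```

Three hits over eight steps gives the familiar 3-3-2 grouping; a generative synth could feed a pattern like this to a step sequencer and let a slider change `pulses` on the fly.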

jrvarela56, about 2 months ago

Would be cool if the LLM could break up the request into sub-requests processable by LLMs. Current talk about agents mentions some sort of router/orchestrator that delegates to other agents. But these can be another LLM, another agent, another router itself, a simple tool call, etc.: all function calls that wrap other LLM-enabled subcomponents.

My feeling is that we have the pieces to build AGI. Like humans, we don't need a 400-IQ person to solve all problems ('AGI'). What we have is coordination problems, and in LLM land it's 'the glue' that's missing. Hopefully it's a matter of patterns/best practices emerging.
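
The "everything is a function call" framing in this comment can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `make_router`, `fake_llm`, and the `kind: payload; kind: payload` request format are all invented here, and the string splitting stands in for what would really be a model deciding how to decompose the work. The point it shows is that a router has the same signature as a leaf handler, so routers can be nested inside other routers.

```python
from typing import Callable

# A handler could wrap an LLM, an agent, a tool, or another router.
Handler = Callable[[str], str]


def make_router(routes: dict[str, Handler]) -> Handler:
    """Build a handler that splits a request and delegates each part."""

    def route(request: str) -> str:
        results = []
        for part in request.split(";"):
            # Stand-in for an LLM labelling each sub-request (assumption).
            kind, _, payload = part.strip().partition(":")
            handler = routes.get(kind.strip())
            if handler is None:
                results.append(f"[no handler for {kind.strip()!r}]")
            else:
                results.append(handler(payload.strip()))
        return " | ".join(results)

    return route


def fake_llm(prompt: str) -> str:
    """Hypothetical leaf: a real system would call a model here."""
    return f"llm({prompt})"


# A router is itself a Handler, so it can be registered as a route.
inner = make_router({"shout": str.upper, "ask": fake_llm})
outer = make_router({"ask": fake_llm, "sub": inner})

print(outer("ask: plan the work; sub: shout: hello"))
# prints "llm(plan the work) | HELLO"
```

The missing "glue" the comment describes would live in the decomposition step, which this sketch deliberately fakes.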

wiz21c, about 2 months ago

FTA:

> I have never seen an LLM do this

Interestingly, many of the programs we use provide a finite set of functionality that we can discover over time. But LLMs are different: you can't explore them because the input space is too big. Therefore, they can surprise us for a long time. That's cool!

coolgoose, about 2 months ago

Next step in LLM evolution is teaching them to procrastinate.

ofirtwo, about 2 months ago

I'm curious about how the model is going to face intellectual tasks it can't resolve by referring back to the user. Today most LLMs will give multiple answers to "what's the meaning of life?" and immediately hand the question back to the user. It could be interesting if they stayed with the question longer, dove deeper into the contradictions and eventually said that they don't know.

retrofuturism, about 2 months ago

That's interesting, but I wonder if it's _just_ the system prompt dictating that a request that would likely consume too many resources and likely fail should be rejected with such an answer.

menzoic, about 2 months ago

"During its thinking session it reached the conclusion that this task is not feasible in one shot. It then stopped and explained that to me."

I've seen this happen with GPT-4 with zero-shot prompts. As with the author, "negotiating" allowed it to continue with an iterative approach.

cadamsdotcom, about 2 months ago

It’s a new type of refusal.

The model is unlikely to know its own limits. Hopefully these refusals are amenable to prompt engineering: “even if the task seems infeasible, try anyway.”

And hopefully next-gen models are trained to have more faith in themselves :)

vladmdgolam, about 2 months ago

I’ve encountered something similar when prompting o1-pro to make palindromes with some words: it actually answered that it’s impossible with some of them because they are gibberish when reversed, and then gave an example.

trash_cat, about 2 months ago

Would be interesting to see the input prompts.

Gemini 2.5 Pro reasons about task feasibility
