科技回声

Is there a good reference for which models should be preferred for which tasks? As I understand it, there's a decent amount of overlap in the capabilities of the top models right now (Claude 3, Claude 3.5, GPT-4o, OpenAI o1, Gemini 1.5 Pro, Llama 3.1, etc.), but each has its own strengths and weaknesses, as well as differences in API pricing, context windows, usage privacy policies, etc. It'd be nice to know how they stack up against each other for specific use cases. E.g. writing -> use Claude 3, coding -> use Claude 3.5, STEM / logic questions -> OpenAI o1, huge context -> Gemini 1.5 Pro, self-hosting / running locally -> Llama 3.1.

I do see a lot of articles on Google from people just personally testing and comparing models based on their own criteria. I haven't found anything yet, though, that seems to be both thorough and well maintained (given how often new models are released or updated).

1 comment

deichrenner 8 months ago

I have been wondering the same and found https://www.leewayhertz.com/comparison-of-llms/ and https://artificialanalysis.ai/

Ask HN: Good up-to-date resource for finding best current LLM for a given task?
