Is there a good reference for which models should be preferred for which tasks? As I understand it, there's a decent amount of overlap in the capabilities of the top models right now (Claude 3, Claude 3.5, GPT-4o, OpenAI o1, Gemini 1.5 Pro, Llama 3.1, etc.), but each has its own strengths and weaknesses, as well as differences in API pricing, context windows, usage privacy policies, etc. It'd be nice to know how they stack up against each other for specific use cases. E.g. writing -> use Claude 3, coding -> use Claude 3.5, STEM / logic questions -> OpenAI o1, huge context -> Gemini 1.5 Pro, self-hosting / running locally -> Llama 3.1.<p>I do see a lot of articles on Google from people just personally testing and comparing models based on their own criteria. I haven't found anything yet, though, that seems to be both thorough and well maintained (given how often new models are released or updated).