InternLM2

136 points by milliondreams about 1 year ago

9 comments

zone411 about 1 year ago
We really need better long context benchmarks than needle-in-a-haystack. There is LV-Eval (https://arxiv.org/abs/2402.05136) with multi-hop QA that's better but still pretty basic.
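To make that concrete, here is a rough sketch of a needle-in-a-haystack probe; the filler text, needle wording, and depth sweep below are illustrative assumptions, not any benchmark's actual setup:

    # Minimal needle-in-a-haystack prompt builder (illustrative only).
    # A single "needle" fact is buried at varying depths in filler text,
    # and the model is asked to retrieve it.
    FILLER = "The sky was clear and the market was quiet that day. " * 2000
    NEEDLE = "The secret passphrase is 'blue-falcon-42'. "
    QUESTION = "What is the secret passphrase mentioned in the text above?"

    def build_prompt(depth: float) -> str:
        """Insert the needle at a fractional depth (0.0 = start, 1.0 = end)."""
        cut = int(len(FILLER) * depth)
        haystack = FILLER[:cut] + NEEDLE + FILLER[cut:]
        return haystack + "\n\n" + QUESTION

    # Sweep the needle through the document; each prompt is sent to the model
    # and the reply is checked for the passphrase.
    prompts = [build_prompt(d / 10) for d in range(11)]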
esha_manideep about 1 year ago
Pretty amazing to see training data being discussed more openly
milliondreams about 1 year ago
TL;DR:
1. InternLM2 is an open-source Large Language Model that has shown improvements over previous models, particularly in long-context modeling.
2. The model uses a unique approach, combining traditional training with Supervised Fine-Tuning and Conditional Online Reinforcement Learning from Human Feedback.
3. It offers a variety of model sizes and training stages to the community, demonstrating significant advancements in AI research and application.
barsonme about 1 year ago
Is it normal for papers to have that many authors?
ilaksh about 1 year ago
Does anyone know how the free commercial license works? Do they usually grant it? https://wj.qq.com/s2/12727483/5dba/ looks like a form there.
Apache 2 code, free commercial license with application form for weights.
Kwpolska about 1 year ago
The name suggests this is interns posing as a chatbot, especially considering today’s date.
pilotneko about 1 year ago
I experimented with this model and vLLM around a month ago. The long context length is attractive, but it was incredibly slow on a g5.12xlarge (4 NVIDIA A10G GPUs). I actually could not get it to respond for single examples longer than 50K tokens.
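For reference, a minimal vLLM setup for this kind of multi-GPU, long-context run might look like the sketch below; the model ID, context length, and sampling settings are assumptions, not the exact configuration described above:

    from vllm import LLM, SamplingParams

    # Sketch: serve InternLM2 across 4 GPUs (e.g. a g5.12xlarge with 4x A10G).
    llm = LLM(
        model="internlm/internlm2-chat-7b",  # assumed model ID
        trust_remote_code=True,              # InternLM2 ships custom modeling code
        tensor_parallel_size=4,              # split the weights across the 4 GPUs
        max_model_len=65536,                 # long-context inference is the costly part
    )

    params = SamplingParams(temperature=0.7, max_tokens=256)
    outputs = llm.generate(["Summarize the following report: ..."], params)
    print(outputs[0].outputs[0].text)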
viraptor about 1 year ago
The repo is here: https://github.com/InternLM/InternLM
dannyw about 1 year ago
How good is the base (non-instruction-tuned) model? Everyone is trying to make chat bots, but for my use cases, I find base models more suitable.
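For illustration, plain completion with the base (non-chat) weights via transformers might look roughly like this; the model ID, prompt, and generation settings are assumptions:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Sketch: raw text completion with the base model, no chat template applied.
    model_id = "internlm/internlm2-7b"  # assumed base-model ID (not the -chat variant)
    tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, trust_remote_code=True, device_map="auto"
    )

    prompt = "The key differences between a base model and a chat model are"
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=64, do_sample=False)
    print(tok.decode(out[0], skip_special_tokens=True))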