We really need better long-context benchmarks than needle-in-a-haystack. There is LV-Eval (<a href="https://arxiv.org/abs/2402.05136" rel="nofollow">https://arxiv.org/abs/2402.05136</a>) with multi-hop QA, which is better but still pretty basic.
TL;DR:
1. InternLM2 is an open-source large language model that improves on prior open models, particularly in long-context modeling.
2. It combines pretraining with Supervised Fine-Tuning (SFT) and Conditional Online Reinforcement Learning from Human Feedback (COOL RLHF).
3. The release offers multiple model sizes and intermediate training-stage checkpoints to the community.
Does anyone know how the free commercial license works? Do they usually grant it? There appears to be an application form at <a href="https://wj.qq.com/s2/12727483/5dba/" rel="nofollow">https://wj.qq.com/s2/12727483/5dba/</a>.<p>So: Apache 2.0 for the code, and a free commercial license for the weights via that application form.
I experimented with this model and vLLM about a month ago. The long context window is attractive, but inference was incredibly slow on a g5.12xlarge (4× NVIDIA A10G GPUs), and I could not get a response at all for single prompts longer than 50K tokens.
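Part of why long prompts are so heavy: the KV cache grows linearly with sequence length and has to fit in GPU memory alongside the weights. A rough back-of-the-envelope sketch (the layer/head counts below are illustrative assumptions for a 20B-class model with grouped-query attention, not InternLM2's published config):

```python
def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Memory for one sequence's KV cache: K and V tensors per layer, fp16 by default."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

# Illustrative config (assumed values, not taken from the paper)
gib = kv_cache_bytes(seq_len=50_000, n_layers=48, n_kv_heads=8, head_dim=128) / 2**30
print(f"~{gib:.1f} GiB of KV cache for a single 50K-token sequence")
```

Even with GQA keeping the KV-head count small, a single 50K-token request needs several GiB of cache on top of the model weights, so on 4× 24 GB A10Gs there is little headroom and throughput collapses.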