
Zoology 1: Measuring and Improving Recall in Efficient Language Models

2 points by convexstrictly over 1 year ago

1 comment

convexstrictly over 1 year ago
Research suggesting that much of the power of the transformer architecture comes from associative recall over long sequences, and that this capability does not require scaling the model dimension. The authors design state space models that narrow the gap.

Overview: https://hazyresearch.stanford.edu/blog/2023-12-11-zoology0-intro

Zoology 2: https://hazyresearch.stanford.edu/blog/2023-12-11-zoology2-based

Monarchs and Butterflies: Towards Sub-Quadratic Scaling in Model Dimension. This post covers scaling in the model dimension, as opposed to the sequence-dimension scaling in the previous posts: https://hazyresearch.stanford.edu/blog/2023-12-11-truly-subquadratic
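
For concreteness, here is a minimal sketch of the kind of synthetic associative-recall probe this line of work studies: the model sees key-value pairs in context and must later emit the value bound to a queried key. This is illustrative only; the function name and parameters are mine, not taken from the linked posts.

```python
# Sketch of a synthetic associative-recall example (illustrative, not the
# authors' code): context is interleaved key/value tokens, followed by a
# query key; the target is the value that key was bound to earlier.
import random

def make_recall_example(num_pairs=32, vocab=256, seed=0):
    rng = random.Random(seed)
    keys = rng.sample(range(vocab), num_pairs)          # distinct keys
    values = [rng.randrange(vocab) for _ in keys]       # arbitrary values
    context = [tok for kv in zip(keys, values) for tok in kv]  # k1 v1 k2 v2 ...
    idx = rng.randrange(num_pairs)                       # pick a key to query
    return context + [keys[idx]], values[idx]

tokens, target = make_recall_example()
print(len(tokens), "context tokens; expected next token:", target)
```

A model with strong recall answers such queries regardless of how far back the queried pair appeared, which is the behavior the posts measure when comparing attention against sub-quadratic alternatives.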