TechEcho
Writing an LLM from scratch, part 8 – trainable self-attention

380 points by gpjt 2 months ago

4 comments

andsoitis 2 months ago
> For me, it took eight re-reads of Raschka's (eminently clear and readable) explanation to get to a level where I felt I understood it.

It’s interesting to observe in oneself how repetition can result in internalizing new concepts. It is less about rote memorization and more about becoming aware of nuance and letting our minds “see” things from different angles, integrating them with existing world models through augmentation, replacement, or adjustment. The same goes for practicing activities that require some form of motor ability.

Some concepts are internalized less explicitly, like when we “learn” through role-modeling behaviors or through feedback loops in our interactions with people, objects, and ideas (like how to fit into a society).
penguin_booze 2 months ago
One sees "from scratch", and then one also sees

    from fancy_module import magic_functions

I'm semi-serious here, of course. To me, for something to be called 'from scratch', the requisite knowledge should be built from the ground up. To wit, I'd want to write the tokenizer myself, but I don't want to derive the laws of quantum physics that make the computation happen.
kureikain 2 months ago
If anyone wants to read this book and lives in the Bay Area, you can get online access to O'Reilly Media through your local library. This book is available there.
ForOldHack 2 months ago
Part 8? Wait... Is this a story that wrote itself?

1) I am kidding. 2) At what point does it become self-replicating? 3) Skynet. 4) Kidding - not kidding.