LLM Visualization

1592 points by plibither8 over 1 year ago

48 comments

warkanlock over 1 year ago
This is an excellent tool for seeing how an LLM actually works from the ground up!

For those reading it and going through each step: if you get stuck on why there are 48 elements in the first array, please refer to model.py in minGPT [1].

It's an architectural decision that would be great to mention in the article, since people without much context might miss it.

[1] https://github.com/karpathy/minGPT/blob/master/mingpt/model.py
holtkam2 over 1 year ago
The visualization I've been looking for for months. I would have happily paid serious money for this... the fact that it's free is such a gift and I don't take it for granted.
wills_forward over 1 year ago
My jaw dropped to see algorithmic complexity laid out so clearly in a 3D space like that. I wish I were smart enough to know if it's accurate or not.
gryfft over 1 year ago
Damn, this looks phenomenal. I've been wanting to do a deep dive like this for a while -- the 3D model is a spectacular pedagogic device.
baq over 1 year ago
Could as well be titled 'dissecting magic into matmuls and dot products for dummies'. Great stuff. Went away even more amazed that LLMs work as well as they do.
mark_l_watson over 1 year ago
I am looking at Brendan's GitHub repo: https://github.com/bbycroft/llm-viz

Really nice stuff.
flockonus over 1 year ago
Twitter thread by the author sharing some extra context on this work: https://twitter.com/BrendanBycroft/status/1731042957149827140
tysam_and over 1 year ago
Another visualization I would really love would be a clickable circular set of possible prediction branches, projected onto a Poincaré disk (to handle the exponential branching component of it all). It would take forever to calculate except on smaller models, but being able to visualize branch probabilities angularly for the top n values or whatever, and to go forwards and backwards, up and down different branches, would likely yield some important insights into how they work.

Good visualization precedes good discoveries in many branches of science, I think.

(see my profile for a longer, potentially more silly description ;) )
29athrowaway over 1 year ago
Big kudos to the author of this.

Not only is there the visualization, but it's interactive, has explanations for each item, has excellent performance, and is open source: https://github.com/bbycroft/llm-viz/blob/main/src/llm

Another interesting visualization-related thing: https://github.com/shap/shap
8f2ab37a-ed6c over 1 year ago
Expecting someone to implement an LLM in Factorio any day now, we're half-way there already with this blueprint.
Exuma over 1 year ago
This is really awesome, but I wish there were at least a few added sentences on how I'm supposed to think intuitively about why it's like that. For example, I see a T x C matrix of 6 x 48... but at this step, before it's fed into the net, what is this supposed to represent?
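One way to read the 6 x 48 matrix asked about above, sketched in plain Python: in GPT-style models each of the T rows is the sum of a token embedding and a position embedding, so T = 6 is the number of input tokens and C = 48 is the embedding width (the `n_embd` value of the `gpt-nano` config in minGPT's model.py). The vocabulary size, the token ids, and the random tables below are assumptions made up for illustration; real tables are learned during training.

```python
import random

T, C = 6, 48          # sequence length and embedding width from the comment above
VOCAB_SIZE = 3        # assumption: a tiny vocabulary, chosen only for illustration

rng = random.Random(0)


def random_table(rows, cols):
    """Stand-in for a learned embedding table (real weights come from training)."""
    return [[rng.gauss(0.0, 0.02) for _ in range(cols)] for _ in range(rows)]


token_table = random_table(VOCAB_SIZE, C)   # one C-vector per vocabulary entry
position_table = random_table(T, C)         # one C-vector per position


def embed(token_ids):
    """Return the T x C input matrix: token embedding + position embedding per row."""
    return [
        [tok + pos for tok, pos in zip(token_table[tid], position_table[t])]
        for t, tid in enumerate(token_ids)
    ]


tokens = [2, 0, 1, 1, 0, 2]   # hypothetical token ids, one per position
x = embed(tokens)
print(len(x), len(x[0]))      # 6 48
```

Rows 0 and 5 hold the same token id here but still differ, because the position embedding is different; that is how the matrix carries both "which token" and "where in the sequence" before any transformer layer runs.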
atgctg over 1 year ago
A lot of transformer explanations fail to mention what makes self-attention so powerful.

Unlike traditional neural networks with fixed weights, self-attention layers adaptively weight connections between inputs based on context. This allows transformers to accomplish in a single layer what would take traditional networks multiple layers.
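The "adaptive weighting" point above can be sketched with a minimal single-head scaled dot-product self-attention in plain Python. This is an illustration under assumptions, not the code of any particular model: the width `D`, the random matrices, and the helper names are made up, and the causal mask that GPT-style models apply is left out for brevity. The projection matrices stay fixed, while the T x T mixing weights are recomputed from each input.

```python
import math
import random

rng = random.Random(0)
D = 4  # assumption: a tiny model width, for readability


def rand_matrix(rows, cols):
    return [[rng.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


# Fixed ("learned") projections: these do not change between inputs.
W_q, W_k, W_v = rand_matrix(D, D), rand_matrix(D, D), rand_matrix(D, D)


def self_attention(x):
    """Single-head scaled dot-product self-attention over a T x D input."""
    q, k, v = matmul(x, W_q), matmul(x, W_k), matmul(x, W_v)
    k_transposed = [list(col) for col in zip(*k)]
    scores = matmul(q, k_transposed)                      # T x T similarities
    weights = [softmax([s / math.sqrt(D) for s in row]) for row in scores]
    return weights, matmul(weights, v)                    # weights mix the values


x1 = rand_matrix(3, D)   # two hypothetical 3-token inputs
x2 = rand_matrix(3, D)
w1, out1 = self_attention(x1)
w2, out2 = self_attention(x2)
```

Each row of `w1` and `w2` sums to 1, and the two weight matrices differ even though `W_q`, `W_k`, and `W_v` are identical for both calls: the mixing pattern depends on the input itself.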
skadamat over 1 year ago
If folks want a lower-dimensional version of this for their own models, I'm a big fan of the Netron library for model architecture visualization.

Wrote about it here: https://about.xethub.com/blog/visualizing-ml-models-github-netron
airesQ over 1 year ago
Incredible work.

So much depth; initially I thought it's "just" a 3D model. The animations are amazing.
shaburn over 1 year ago
Visualization never seems to get the credit due in software development. This is amazing.
SiempreViernes over 1 year ago
Anyone know if there is a name for this 3D control schema? This feels like one of the most intuitive setups I've ever used.
arikrak over 1 year ago
This looks pretty cool! Anyone know of visualizations for simpler neural networks? I'm aware of TensorFlow Playground, but that's just for a toy example; is there anything for visualizing a real example (e.g. handwriting recognition)?
rvz over 1 year ago
Rather than looking at the visuals of this network, it is better to focus on the actual problem with these LLMs, which the author has already shown.

Within the transformer section:

> As is common in deep learning, it's hard to say exactly what each of these layers is doing, but we have some general ideas: the earlier layers tend to focus on learning lower-level features and patterns, while the later layers learn to recognize and understand higher-level abstractions and relationships.

That is the problem, and yet these black boxes are just as explainable as a magic scroll.
codedokode over 1 year ago
This is a great visualization, because the original paper on transformers is not very clear and understandable; I tried to read it first and didn't understand it, so I had to look for other explanations (for example, it was unclear to me how multiple tokens are handled).

Also, speaking of transformers: they usually append their output tokens to the input and process them again. Can we optimize this so that we don't need to redo the same calculations on the same input tokens?
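On the optimization question above: the usual answer is key/value caching. With causal attention, the keys and values of earlier tokens do not depend on later ones, so they can be stored and only the newest token needs projecting at each step. The sketch below is a single-layer, single-head illustration in plain Python; `KVCache`, `full_recompute`, the width `D`, and the random matrices are all hypothetical names and values chosen for the example, not taken from any library.

```python
import math
import random

rng = random.Random(0)
D = 4  # assumption: tiny width for illustration


def rand_matrix(rows, cols):
    return [[rng.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]


def vecmat(v, m):
    """Row vector times matrix."""
    return [sum(x * y for x, y in zip(v, col)) for col in zip(*m)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


W_q, W_k, W_v = rand_matrix(D, D), rand_matrix(D, D), rand_matrix(D, D)


def attend(q, keys, values):
    """Attention output for one query over the given keys and values."""
    weights = softmax([dot(q, k) / math.sqrt(D) for k in keys])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(D)]


def full_recompute(xs):
    """Re-project every token, then attend from the last one (the slow way)."""
    keys = [vecmat(x, W_k) for x in xs]
    values = [vecmat(x, W_v) for x in xs]
    return attend(vecmat(xs[-1], W_q), keys, values)


class KVCache:
    """Stores keys/values of earlier tokens so each step projects only the new token."""

    def __init__(self):
        self.keys, self.values = [], []

    def step(self, x):
        self.keys.append(vecmat(x, W_k))
        self.values.append(vecmat(x, W_v))
        return attend(vecmat(x, W_q), self.keys, self.values)


xs = rand_matrix(5, D)   # hypothetical embeddings for 5 tokens
cache = KVCache()
cached_outputs = [cache.step(x) for x in xs]
```

At every step, `cached_outputs[t]` matches `full_recompute(xs[:t + 1])`, but the cached version performs one key and one value projection per new token instead of reprojecting the whole prefix. The trade-off is memory: the cache grows with sequence length.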
hmate9 over 1 year ago
This is a phenomenal visualisation. I wish I had seen this when I was trying to wrap my head around transformers a while ago. It would have made it so much easier.
johnklos over 1 year ago
Am I the only one getting "Application error: a client-side exception has occurred (see the browser console for more information)." messages?
tsunamifury over 1 year ago
This shows how the individual weights and vectors work but, unless I'm missing something, doesn't quite illustrate yet how higher-order vectors are created at the sentence and paragraph level. This might be an emergent property within this system, though, so it's hard to "illustrate". How all of this ends up with a world simulation needs to be better understood, and I hope this advances further.
russellbeattie over 1 year ago
I've wondered for a while if, as LLM usage matures, there will be an effort to optimize hotspots, like what happened with VMs, or auto-indexing, as in relational DBs. I'm sure there are common data paths which get more usage and could somehow be prioritized, either through pre-processing or dynamically, helping speed up inference.
nbzso over 1 year ago
Beautiful. This should be the new educational standard for visualization of complex topics and systemic thinking.
abrookewood over 1 year ago
This does an amazing job of showing the difference in complexity between the different models. Click on GPT-3 and you should be able to see all 4 models side-by-side. GPT-3 is a monster compared to nano-gpt.
tikkun over 1 year ago
What happened to this thread? When I saw it before it had 700+ upvotes.
drdg over 1 year ago
Very cool. The explanations of what each part is doing are really insightful. And I especially like how the scale jumps when you move from e.g. Nano all the way to GPT-3...
thistoowontpass over 1 year ago
Thank you. I'd just completed doing this manually (much uglier and less accurate) and so can really appreciate the effort behind this.
thefourthchime over 1 year ago
First off, this is fabulous work. I went through it for the Nano, but is there a way to do the step-by-step for the other LLMs?
cod1r over 1 year ago
Really cool stuff. Looks like an entire computer but with software. Definitely need to dig into more AI/ML things.
nikhil896 over 1 year ago
This is by far the best resource I've seen to understand LLMs. Incredibly well done! Thanks for this awesome tool
Simon_ORourke over 1 year ago
This is brilliant work, thanks for sharing.
crotchfire over 1 year ago
Application error: a client-side exception has occurred (see the browser console for more information).
sva_ over 1 year ago
The score on this post just went down by a factor of 10 and the time went to "1 hour" ago?!
gbertasius over 1 year ago
This is AMAZING! I'm about to go into Uni and this will be useful for my ML classes.
meeby over 1 year ago
This is easily the best visualization I've seen for a long time. Fantastic work!
RecycledEle over 1 year ago
This is excellent!

This is why I love Hacker News!
Arctic_fly over 1 year ago
Curse you for being interesting enough to make me get on my desktop.
valdect over 1 year ago
Super cool! It's always nice to look at something concrete
athulsuresh123 over 1 year ago
This should be in college textbooks
physPop over 1 year ago
Honestly, reading the PyTorch implementation of minGPT is a lot more informative than an inscrutable 3D rendering. It's a well-commented and pedagogical implementation. I applaud the intention, and it looks slick, but I'm not sure it really conveys information in an efficient way.
smy20011 over 1 year ago
Really cool!
Solvency over 1 year ago
Wish it were mobile friendly.
Workaccount2 over 1 year ago
Such an amazing tool
haltist over 1 year ago
Very cool.
BSTRhino over 1 year ago
bbycroft is the GOAT!
wdiamond over 1 year ago
amazing router
reqo over 1 year ago
I feel like visualizations like this are what is missing from university curricula. Now imagine a professor going through each animation describing exactly what is happening; I am pretty sure students would get a much more in-depth understanding!