Does stuff like this help anyone?

I still haven’t forgiven CGP Grey for changing the title of his 2017 ML video to “How AIs, like ChatGPT, learn”. The video is about genetic algorithms and has nothing to do with ChatGPT (or with anything else in modern AI).
I have met people who think they understand a particular topic I am versed in, but actually don't. Similarly, I am often wary that I'm only getting superficial knowledge about a topic I don't know much about through "layman" resources, and I doubt one can reach an appropriate level of understanding mainly through analogies and metaphors. It's a kind of "epistemic anxiety". Of course, there are "layman" books I've stumbled upon which I think go to appropriate depth and do not dumb the topics down to shallow levels, yet remain accessible, like Gödel's Proof by Ernest Nagel. I'd be glad to read about similar books on all topics, including the one discussed in this thread.

Knowledge is hard to attain...
> This is how neural networks work: they see many examples and get rewarded or punished based on whether their guesses are correct.

This description more closely matches reinforcement learning than gradient-based optimization.

In fact, the entire metaphor of a confused individual being slapped or rewarded without understanding what's going on doesn't really make sense for gradient optimization, because the gradient with respect to the loss function tells the network *exactly* how to change its behavior to improve its performance.

This last point is incredibly important to understand correctly, since it contains one of the biggest assumptions about network behavior: that the optimal solution, or at least a good-enough solution for our purposes, *can be found* by slowly taking small steps in the right direction.

Neural networks are great at *refining* their beliefs but have a difficult time radically changing them. A better analogy might be trying to very slowly convince your uncle that climate change is real, and not a liberal conspiracy.

edit: it also does a poor job of explaining layers; the description reads much more like how ensemble methods work (lots of little classifiers voting) than how deep networks work.
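A minimal sketch of that point, for the curious (toy one-parameter "network", nothing from a real framework): the gradient tells the weight exactly which way to move, but each step is deliberately tiny.

```python
# Minimal gradient descent on a one-parameter "network": y = w * x.
# The gradient of the loss tells w *exactly* which direction to move,
# but each step is small -- refinement, not radical change.

def loss(w, x, y):
    return (w * x - y) ** 2          # squared error for one example

def grad(w, x, y):
    return 2 * (w * x - y) * x       # d(loss)/dw, computed analytically

w = 0.0                              # start with a "wrong belief"
lr = 0.01                            # learning rate: how small each step is
for step in range(1000):
    w -= lr * grad(w, x=3.0, y=6.0)  # follow the slope downhill

print(w)  # converges toward 2.0, since 2.0 * 3.0 == 6.0
```

No slaps involved: at every step the network knows precisely which way "better" is, just not how to leap there all at once.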
If anyone is looking for a quick overview of how LLMs are built, I highly recommend this video by Steve Seitz: https://www.youtube.com/watch?v=lnA9DMvHtfI

It does an excellent job of taking you from 0 to a decent understanding without dumbing down the content or abusing analogies.
I have a few funny analogies that I think kind of work.

1. "gradient descent" is like tuning a guitar by ear: listening to the beat frequencies (the "loss") and decreasing them by tuning a string up or down.

2. the best I can come up with for "backpropagation" is to imagine a clever device that can tirelessly optimize a Rube Goldberg machine for you, as a science rather than an art.
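To stretch the second analogy into code (a toy sketch, not a real framework): backpropagation is just the chain rule applied to a chain of little machines, so each stage learns how much its knob contributed to the final error.

```python
# Backprop through a two-stage "Rube Goldberg machine": z = g(f(x)).
# The forward pass runs the machine; the backward pass uses the chain
# rule to tell each stage how the final output responds to its knob.

def f(x, a):       # stage 1: scale the input by knob `a`
    return a * x

def g(h, b):       # stage 2: shift the result by knob `b`
    return h + b

x, target = 2.0, 10.0
a, b = 1.0, 0.0
lr = 0.05

for _ in range(200):
    h = f(x, a)                  # forward through stage 1
    z = g(h, b)                  # forward through stage 2
    dloss_dz = 2 * (z - target)  # d/dz of (z - target)^2
    # chain rule, flowing backward through the machine:
    dloss_db = dloss_dz * 1.0    # dz/db = 1
    dloss_da = dloss_dz * x      # dz/dh = 1, then dh/da = x
    a -= lr * dloss_da
    b -= lr * dloss_db

print(a * x + b)  # ~10.0: the machine has been tuned to hit the target
```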
I love this, but I'm always confused in these kinds of analogies about what the reward/punishment system really equates to...

Also reminds me of Ted Chiang warning us that we will torture innumerable AI entities long before we start having real conversations about treating them with compassion.
Totally aware that this isn't a fully formal definition of deep learning, but one interesting takeaway for me is realizing that corporations, with their formal and informal reporting structures, are organized in a way similar to neural networks too.

It seems like these sorts of structures regularly arise to help regulate the flow of information through a system.
The problem with deep learning is the opposite. You can understand most of it with just high school math. Advanced math is mostly useless because of the dimensionality of neural nets.
I asked ChatGPT:

Sure, I can explain deep learning using the analogy of mixing paint colors.

When you mix different colors of paint, you can create new colors that are a combination of the original colors. Similarly, in deep learning, you can combine different types of data, such as images, text, and audio, to create a more comprehensive understanding of the information.

In a neural network, which is the basic building block of deep learning, each layer is like a different color of paint. The input layer represents the initial data, which is like a blank canvas. As the data passes through each layer, the network extracts increasingly complex features, similar to how mixing different colors creates more complex hues.

Each layer in a neural network can be thought of as a different color that is mixed with the previous layer's color. As the colors mix, the resulting color becomes more complex and nuanced, just as the data becomes more sophisticated and detailed as it passes through the layers of the neural network.

At the end of the process, the output layer represents the final result, which is like a finished painting. The deep learning algorithm can then use this final output to make predictions or classifications based on the input data.

So, in essence, deep learning is like mixing paint colors to create new and more complex hues, but instead of colors, the mixtures are made up of data and the neural network layers are the different colors being mixed.
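For what it's worth, the "mixing" picture can be made literal with a few lines of numpy (a toy sketch of my own, not part of ChatGPT's answer): each layer linearly mixes the previous layer's features, and a nonlinearity lets "hues" emerge that no single input contained on its own.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=4)               # input "pigments"

h = x
for _ in range(3):                   # three mixing layers
    W = rng.normal(size=(4, 4))      # mixing proportions (the weights)
    h = np.maximum(0, W @ h)         # ReLU: mix, then post-process

print(h)                             # a "color" built from mixtures of mixtures
```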
As someone who knows barely enough to be dangerous, I like this. I'm sure it leaves enough out to make most experts angry, but it makes a lot of sense to me.
> they see 3 spreadsheets of numbers representing the RGB values of the picture.

This needs expanding: it's the sort of thing that's easy for a programmer to say, but few people who aren't programmers or mathematically trained are going to see that an RGB value has 3 parts, and so a collection of RGB values could be sliced into 3 sheets.
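Spelled out with numpy (random stand-in data rather than a real photo), a picture really is three grids of numbers, one per color channel:

```python
import numpy as np

# A 4x4 "photo": height x width x 3 (red, green, blue), values 0-255.
image = np.random.randint(0, 256, size=(4, 4, 3))

red   = image[:, :, 0]   # spreadsheet 1: how red each pixel is
green = image[:, :, 1]   # spreadsheet 2: how green each pixel is
blue  = image[:, :, 2]   # spreadsheet 3: how blue each pixel is

print(red)               # one 4x4 grid of numbers -- one "spreadsheet"
```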
This is the funniest refutation of the Chinese Room argument that I’ve seen. Note that at the end, it’s still the case that none of these people can recognize a cat.
The most concise and intuitive line of explanation I've been given goes along the lines of this:

1 - We want to model data, representative of some system, through functions.

2 - Virtually any function can be approximated by an n-th order polynomial.

3 - We wish to learn the parameters, the coefficients, of such polynomials.

4 - Neural networks let us search candidate values of such parameters, finding optimal candidates such that the error between expected and actual values over our dataset is minimized (see the sketch after this list).

Whereas prior methods (e.g. PCA) could only model linear relationships, neural networks allowed us to begin modeling non-linear ones.
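Here is a sketch of steps 1-4 (toy data, with gradient descent standing in for the parameter search rather than literal brute force): we learn the coefficients of a cubic by repeatedly stepping them toward lower error.

```python
import numpy as np

# Fit the coefficients of y = c0 + c1*x + c2*x^2 + c3*x^3
# by minimizing squared error with gradient descent.

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 100)
y = np.sin(3 * x) + 0.1 * rng.normal(size=x.shape)  # nonlinear system to model

X = np.stack([x**0, x, x**2, x**3], axis=1)  # polynomial feature matrix
c = np.zeros(4)                              # coefficients to learn
lr = 0.1

for _ in range(5000):
    err = X @ c - y                  # predicted minus actual
    grad = 2 * X.T @ err / len(x)    # gradient of mean squared error
    c -= lr * grad                   # step toward lower error

print(c)  # learned coefficients of the approximating cubic
```

A real network swaps the fixed polynomial features for learned nonlinear ones, but the fit-parameters-to-minimize-error loop is the same.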