
Any Deep ReLU Network Is Shallow

18 points by fofoz almost 2 years ago

3 comments

PaulHoule almost 2 years ago
It doesn’t surprise me. It’s been known for a long time that you can model arbitrary functions with a 3-layer network with logistic activation.
Comment #36435799 not loaded
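A minimal sketch of the classic single-hidden-layer idea mentioned above: random logistic (sigmoid) hidden units combined by a linear output layer can fit an arbitrary continuous function on an interval. The target function, layer width, and weight scales here are illustrative assumptions, not anything from the linked paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Target: an arbitrary smooth function on [-3, 3].
x = np.linspace(-3, 3, 400)[:, None]
y = np.sin(2 * x) + 0.3 * x**2

# One hidden layer of logistic units with random weights and biases.
n_hidden = 200
W = rng.normal(scale=3.0, size=(1, n_hidden))
b = rng.normal(scale=3.0, size=n_hidden)
H = sigmoid(x @ W + b)                     # hidden activations, shape (400, n_hidden)

# Fit only the linear output layer by least squares.
coef, *_ = np.linalg.lstsq(H, y, rcond=None)
y_hat = H @ coef

print("max abs error:", float(np.max(np.abs(y - y_hat))))
```

With enough hidden units the residual shrinks, which is the approximation-in-the-limit story; the paper, as the last comment notes, is about something stronger.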
alexfromapex almost 2 years ago
It seems intuitive since ReLU is just a type of implicit regularization. Why would subsequent gradient descent help once you've achieved the benefit of throwing away the "outliers" or data beyond the threshold you want?
cheekyfibonacci almost 2 years ago
Don't confuse this with universal approximation: yes, shallow ReLU networks are dense in function space, so in the limit you should be able to get any function you want, but they are talking about exact representation with finitely many neurons here.
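A toy instance of what "exact representation with finitely many neurons" means, as opposed to approximation in the limit: the hand-picked depth-2 composition relu(relu(x) - 1) coincides with the single-neuron shallow network relu(x - 1) identically, not just approximately on some training set. This is only an illustrative special case, not the general construction from the paper.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

x = np.linspace(-5, 5, 10001)

deep = relu(relu(x) - 1.0)      # two stacked ReLU layers
shallow = relu(x - 1.0)         # one ReLU neuron

# The identity holds for all real x (check the cases x < 0 and x >= 0);
# here we just confirm exact equality on a dense grid of test points.
assert np.array_equal(deep, shallow)
print("exactly equal on all", x.size, "test points")
```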