Neural Network Architecture Beyond Width and Depth

70 points by StrauXX almost 2 years ago

4 comments

fwlr almost 2 years ago
They introduce “height” as another architectural dimension, alongside the usual width and depth. If you imagine the usual diagram of a neural network, the difference when a neural net is of height 2 is that in the middle layers, each individual node contains another network inside it, and that inner network has the same structure as the top-level network. For height 3, each node has an inner network, and each of those inner networks is composed of nodes that have their own inner networks as well. And so on, recursively, for greater heights. There’s a diagram on page 3.
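A minimal sketch of that recursive picture in PyTorch (the class name NestNet and all sizes below are illustrative, not taken from the paper): at height 1 a hidden node applies an ordinary activation, while at height h &gt; 1 each node's "activation" is a shared sub-network of height h - 1, applied elementwise.

    # Hypothetical sketch of a height-h nested network; parameter
    # sharing comes from reusing one inner network at every node.
    import torch
    import torch.nn as nn

    class NestNet(nn.Module):
        def __init__(self, dim, hidden, height):
            super().__init__()
            self.fc_in = nn.Linear(dim, hidden)
            self.fc_out = nn.Linear(hidden, dim)
            # Base case (height 1): plain activation at each node.
            # Otherwise: one shared height-(h-1) network serves as
            # the activation of every hidden node.
            self.inner = None if height <= 1 else NestNet(1, hidden, height - 1)

        def forward(self, x):
            h = self.fc_in(x)
            if self.inner is None:
                h = torch.relu(h)
            else:
                # Feed each scalar hidden unit through the shared
                # inner network, then restore the original shape.
                shape = h.shape
                h = self.inner(h.reshape(-1, 1)).reshape(shape)
            return self.fc_out(h)

    net = NestNet(dim=8, hidden=16, height=3)  # nodes within nodes within nodes
    y = net(torch.randn(4, 8))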
neodypsis almost 2 years ago
> We propose the nested network architecture since it shares the parameters via repetitions of sub-network activation functions. In other words, a NestNet can provide a special parameter-sharing scheme. This is the key reason why the NestNet has much better approximation power than the standard network.

It would be interesting to see an experiment that compares their CNN2 model with other parameter-sharing schemes such as networks using hyper-convolutions [0][1][2].

[0] Ma, T., Wang, A. Q., Dalca, A. V., & Sabuncu, M. R. (2022). Hyper-Convolutions via Implicit Kernels for Medical Imaging. arXiv preprint arXiv:2202.02701.

[1] Chang, O., Flokas, L., & Lipson, H. (2019, September). Principled weight initialization for hypernetworks. In International Conference on Learning Representations.

[2] Ukai, K., Matsubara, T., & Uehara, K. (2018, November). Hypernetwork-based implicit posterior estimation and model averaging of CNN. In Asian Conference on Machine Learning (pp. 176-191). PMLR.
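For comparison, here is a minimal sketch of the implicit-kernel idea behind hyper-convolutions [0], as I read it (HyperConv2d and coord_mlp are made-up names, not the authors' API): a small MLP maps each kernel coordinate to that position's weights, so the parameter count is decoupled from the kernel size.

    # Hypothetical hyper-convolution: the k x k kernel is generated
    # from coordinates by an MLP instead of being stored directly.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class HyperConv2d(nn.Module):
        def __init__(self, in_ch, out_ch, k=3):
            super().__init__()
            self.in_ch, self.out_ch, self.k = in_ch, out_ch, k
            # MLP from an (x, y) kernel coordinate to in_ch*out_ch weights.
            self.coord_mlp = nn.Sequential(
                nn.Linear(2, 32), nn.ReLU(),
                nn.Linear(32, in_ch * out_ch),
            )
            # Fixed grid of normalized (x, y) kernel coordinates, (k*k, 2).
            ys, xs = torch.meshgrid(torch.linspace(-1, 1, k),
                                    torch.linspace(-1, 1, k), indexing="ij")
            self.register_buffer("coords",
                                 torch.stack([xs, ys], dim=-1).reshape(-1, 2))

        def forward(self, x):
            # Generate the kernel on the fly, then convolve as usual.
            w = self.coord_mlp(self.coords)               # (k*k, in*out)
            w = w.reshape(self.k, self.k, self.in_ch, self.out_ch)
            w = w.permute(3, 2, 0, 1).contiguous()        # (out, in, k, k)
            return F.conv2d(x, w, padding=self.k // 2)

    conv = HyperConv2d(3, 8, k=5)
    out = conv(torch.randn(1, 3, 32, 32))  # 8-channel map, same spatial size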
moosedev almost 2 years ago
Paper was submitted almost exactly 1 year ago, and last revised in Jan 2023.

Not sure if the title needs a (2022); just pointing out the above in case anyone else, like me, read "19 May" and mistakenly thought it was a 2-day-old paper :)
revskill almost 2 years ago
So we could have an n-dimensional neural network to process training data. In theory it should work.