
Scalable Bayesian Optimization Using Deep Neural Networks

63 points by groar over 9 years ago

3 comments

cs702 over 9 years ago
In short, these guys are using deep neural nets to find good hyperparameters for training other deep neural nets, and this works as well as a Gaussian process [1] but is more scalable and can be parallelized, allowing for faster optimization of hyperparameters.

--

[1] For example, like Spearmint: https://github.com/JasperSnoek/spearmint
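To make the idea in this comment concrete, here is a minimal sketch of Bayesian optimization of a single hyperparameter with a parametric surrogate. Bayesian linear regression over fixed random basis functions stands in for the paper's learned neural-net features; the toy objective `validation_loss`, the candidate grid, and all constants are invented for illustration and are not from the paper.

```python
# Sketch: hyperparameter search with a Bayesian optimization loop.
# Surrogate = Bayesian linear regression over random basis functions
# (a stand-in for learned DNN features); objective is a toy function.
import numpy as np

rng = np.random.default_rng(0)

def validation_loss(log_lr):
    # Toy stand-in for "train a network, return validation loss".
    return (log_lr + 3.0) ** 2 + 0.1 * rng.standard_normal()

def features(x, W, b):
    # Fixed random basis; the paper instead learns features with a DNN.
    return np.tanh(np.outer(x, W) + b)

def posterior(X, y, W, b, alpha=1.0, noise=0.1):
    # Bayesian linear regression on the basis: refitting is linear in the
    # number of observations N, vs. the O(N^3) factorization of a full GP.
    Phi = features(X, W, b)
    A = alpha * np.eye(Phi.shape[1]) + Phi.T @ Phi / noise**2
    A_inv = np.linalg.inv(A)
    mean_w = A_inv @ Phi.T @ y / noise**2
    def predict(x_new):
        phi = features(x_new, W, b)
        mu = phi @ mean_w
        var = noise**2 + np.sum((phi @ A_inv) * phi, axis=1)
        return mu, var
    return predict

# Optimization loop: evaluate, refit surrogate, pick the candidate that
# minimizes a lower confidence bound (a simple acquisition function).
D = 50
W, b = rng.standard_normal(D), rng.uniform(0, 2 * np.pi, D)
X = np.array([-6.0, -1.0])                       # initial log-learning-rates
y = np.array([validation_loss(x) for x in X])
candidates = np.linspace(-7, 0, 200)

for _ in range(10):
    predict = posterior(X, y, W, b)
    mu, var = predict(candidates)
    x_next = candidates[np.argmin(mu - 2.0 * np.sqrt(var))]
    X = np.append(X, x_next)
    y = np.append(y, validation_loss(x_next))

print("best log learning rate found:", X[np.argmin(y)])
```

The point of the substitution is the cost profile: refitting this surrogate is linear in the number of completed evaluations, whereas a full GP surrogate pays a cubic factorization each round.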
abeppu over 9 years ago
I haven't read the paper, just skimmed through it, but isn't this a sort of unreasonable comparison? Full GPs are O(N^3) because they invert a covariance matrix that basically allows every datapoint to be related to every other datapoint. There is a bunch of literature on sparse approximate Gaussian processes, which iirc are basically O(N*M), where N is the data and M is basically the size of some set of active points (which is tunable). That seems like the natural point of comparison to their neural net approach. Broadly, in their deep net, they've chosen some architecture which determines the number of parameters. It seems like either the neural net or the sparse GP approach can claim to have runtime which is linear with the data size, and is doing less work than the full GP for basically the same reasons.
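As a rough illustration of the complexity point (not the specific sparse-GP methods the commenter has in mind), the sketch below contrasts the full GP regression mean, which solves an N x N system, with a subset-of-regressors style approximation over M inducing points, where the expensive solves are only M x M. The kernel, data, and choice of inducing points are all invented for the example.

```python
# Sketch: full GP mean vs. an inducing-point (subset-of-regressors) mean.
# Full GP factorizes an N x N matrix (O(N^3)); the approximation only
# factorizes M x M and touches N x M matrices, roughly O(N * M^2).
import numpy as np

def rbf(A, B, lengthscale=1.0):
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-0.5 * d2 / lengthscale**2)

rng = np.random.default_rng(0)
N, M, noise = 2000, 50, 0.1
X = rng.uniform(-3, 3, (N, 1))
y = np.sin(X[:, 0]) + noise * rng.standard_normal(N)
Xtest = np.linspace(-3, 3, 5)[:, None]

# Full GP: solve against the N x N kernel matrix -- the O(N^3) step.
K = rbf(X, X) + noise**2 * np.eye(N)
full_mean = rbf(Xtest, X) @ np.linalg.solve(K, y)

# Sparse approximation over M inducing inputs: only M x M systems appear.
Z = X[rng.choice(N, M, replace=False)]
Kmm = rbf(Z, Z) + 1e-6 * np.eye(M)
Knm = rbf(X, Z)
A = Kmm + Knm.T @ Knm / noise**2                 # M x M system
w = np.linalg.solve(A, Knm.T @ y) / noise**2
sparse_mean = rbf(Xtest, Z) @ w

print(full_mean)
print(sparse_mean)
```

The dominant cost drops from factorizing the N x N matrix to factorizing the M x M one, which is the trade the comment describes; the paper's neural-net surrogate makes essentially the same linear-in-N claim.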
cschmidt over 9 years ago
As a completely trivial point, one of the co-authors only has one name, Prabhat. Google, to its credit, has his work page as the first result.