
You probably want to compress your figures; I think your line plots are stored in a vector format. The paper is 30MB, and rendering chokes on those ultra-dense figures (the data resolution is not buying you any information). If the figures are in a vector format, you should convert them to a raster format such as PNG or JPEG.

Hello HackerNews! Author here :)

TL;DR: We devise a linear SDE/ODE model to imitate the per-class feature (think logits) dynamics of neural network training, based on local elasticity (LE) [1]. We found that the emergence of LE implies linear separation of features from different classes as training progresses.

The drift matrix of our model has a relatively simple structure; once it is estimated, we can simulate the SDE using the forward Euler method, and the results align reasonably well with the genuine dynamics.

Local elasticity models a phenomenon observed in DNN training: the effect of training on a sample is greater for samples from the same class and smaller for samples from different classes. For example, training on an image of a cat helps the model learn images of other cats, but not so much images of, say, dogs.

Any comments/thoughts/questions are most welcome!

[0] https://arxiv.org/abs/2110.05960
[1] https://arxiv.org/abs/1910.06943
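For readers who have not simulated an SDE before, here is a minimal sketch of the forward Euler (Euler-Maruyama) scheme for a generic linear SDE dX = A X dt + Sigma dW. It is not the paper's code: the function name, the drift matrix `A`, the noise matrix `sigma`, and all numeric values are hypothetical placeholders, not the estimated quantities from the paper.

```python
import numpy as np

def euler_maruyama_linear(A, sigma, x0, dt, n_steps, seed=0):
    """Simulate dX = A X dt + sigma dW with the forward Euler scheme.

    A, sigma: (d, d) arrays (hypothetical drift and noise matrices).
    x0: (d,) initial state. Returns an (n_steps + 1, d) trajectory.
    """
    rng = np.random.default_rng(seed)
    d = x0.shape[0]
    x = np.empty((n_steps + 1, d))
    x[0] = x0
    for k in range(n_steps):
        # Brownian increment over one step: N(0, dt) per coordinate
        dw = rng.standard_normal(d) * np.sqrt(dt)
        x[k + 1] = x[k] + (A @ x[k]) * dt + sigma @ dw
    return x

# Illustrative values only: a stable drift with small isotropic noise
A = -np.eye(2)
sigma = 0.1 * np.eye(2)
path = euler_maruyama_linear(A, sigma, np.ones(2), dt=0.01, n_steps=100)
```

Setting `sigma` to zero reduces this to the plain forward Euler method for the ODE dX = A X dt, which is a handy sanity check.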

Imitating Deep Learning Dynamics via Stochastic Differential Equations

2 comments
