I was expecting an introduction to some novel, never-before-seen method near the end, but this was a nice and well-structured summary regardless! Wanted to tack on a couple of questions/thoughts:<p>1. Is the author familiar with the recent work on neural differential equations? Latent (Neural) ODEs[1] and Neural Controlled Differential Equations[2] can already match or exceed the performance of GRU-D on certain sparse and irregularly sampled time series.<p>2. More of a nitpick, but most of the models covered as working with EHR data use clinical data rather than raw physiological signals. For example, measurements from a high-frequency physiological waveform such as an ECG are usually summarized as something like a per-minute heart rate in a medical record. Most research that works with physiological signals directly uses either traditional signal processing approaches or some form of CNN (this includes 1-D ResNets and WaveNet-like architectures). RNNs do pop up occasionally when dealing with heavily downsampled signals, but seem to suffer pretty badly from forgetting long-range context and other issues when run on longer, higher-frequency data.<p>[1] <a href="http://papers.nips.cc/paper/8773-latent-ordinary-differential-equations-for-irregularly-sampled-time-series" rel="nofollow">http://papers.nips.cc/paper/8773-latent-ordinary-differentia...</a>
[2] <a href="https://arxiv.org/abs/2005.08926" rel="nofollow">https://arxiv.org/abs/2005.08926</a>
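<p>To make point 2 a bit more concrete, here is a rough sketch of the kind of summarization I mean: turning beat times detected on an ECG into one heart-rate value per minute. The function name, the millisecond timestamps, and the simple beat-counting rule are all my own assumptions for illustration; real monitors and EHR pipelines will do something more careful (artifact rejection, averaging over RR intervals, handling the partial last minute, and so on).

```python
from collections import Counter


def per_minute_heart_rate(r_peak_times_ms):
    """Summarize beat timestamps as beats counted in each 60-second window.

    r_peak_times_ms: hypothetical list of integer timestamps (milliseconds
    from the start of the recording) at which a beat was detected.
    Returns one count per minute, from minute 0 up to the minute holding the
    last beat. A partial final minute is counted as-is, so it may read low.
    """
    if not r_peak_times_ms:
        return []
    # Integer division assigns each beat to its minute index.
    counts = Counter(t // 60_000 for t in r_peak_times_ms)
    last_minute = max(r_peak_times_ms) // 60_000
    return [counts.get(minute, 0) for minute in range(last_minute + 1)]


# Synthetic example: one beat every 800 ms for two minutes.
beats = [i * 800 for i in range(150)]
print(per_minute_heart_rate(beats))  # [75, 75]
```

The point is the scale of the reduction: at an assumed sampling rate of 250 Hz, each minute of waveform is 15,000 samples, and the record keeps a single number for it.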