-- vs Kalman filters:

"In both models, there's an unobserved state that changes over time according to relatively simple rules, and you get indirect information about that state every so often.
In Kalman filters, you assume the unobserved state is Gaussian-ish and it moves continuously according to linear-ish dynamics (depending on which flavor of Kalman filter is being used).
In HMMs, you assume the hidden state is one of a few classes, and the movement among these states uses a discrete Markov chain.
In my experience, the algorithms are often pretty different for these two cases, but the underlying idea is very similar." - THISISDAVE

-- vs LSTM/RNN:

"Some state-of-the-art industrial speech recognition [0] is transitioning from HMM-DNN systems to "CTC" (connectionist temporal classification), i.e., basically LSTMs. Kaldi is working on "nnet3", which also moves to CTC.
Speech was one of the places where HMMs were _huge_, so that's kind of a big deal." - PRACCU

"HMMs are only a small subset of generative models, and they offer quite little expressiveness in exchange for efficient learning and inference." - NEXTOS

"IMO, anything that can be done with an HMM can now be done with an RNN. The only advantage an HMM might have is that training it can be faster on cheaper computational resources. But if you have the $$$ to get yourself a GPU or two, this computational advantage disappears for HMMs." - SHERJILOZAIR
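THISISDAVE's comparison above can be made concrete with a small sketch: a Kalman-style model updates a continuous state with linear-ish dynamics plus Gaussian noise, while an HMM moves among a few discrete states via a transition matrix. The state names, coefficients, and probabilities below are made up purely for illustration:

```python
import random

# Kalman-style update: continuous state, linear dynamics plus Gaussian noise.
def kalman_step(x, a=0.9, process_noise=0.1):
    return a * x + random.gauss(0.0, process_noise)

# HMM-style update: discrete state, one row of a transition matrix per state.
TRANSITIONS = {
    "rainy": {"rainy": 0.7, "sunny": 0.3},
    "sunny": {"rainy": 0.4, "sunny": 0.6},
}

def hmm_step(state):
    r, total = random.random(), 0.0
    for nxt, p in TRANSITIONS[state].items():
        total += p
        if r < total:
            return nxt
    return nxt  # guard against floating-point rounding

random.seed(0)
x, s = 0.0, "rainy"
for _ in range(5):
    x, s = kalman_step(x), hmm_step(s)
print(x, s)  # x is some real number; s is one of the two discrete states
```

In both cases the loop is the same "state evolves, step by step" idea; only the state space (continuous vector vs. finite set) and the update rule differ, which is the similarity THISISDAVE is pointing at.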
A coworker of mine used to ask job candidates with HMMs on their CV (usually folks with PhDs), "what's hidden in a hidden Markov model?" Lots of people couldn't answer that question.
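One concrete way to answer the interview question: the *state sequence* is hidden; only the emissions are observed, and inference (here, the forward algorithm) recovers a posterior over the hidden state from the observations. All names and probabilities below are invented for illustration:

```python
# Toy HMM: we never see the weather (hidden state), only the activities (emissions).
STATES = ["rainy", "sunny"]
START = {"rainy": 0.6, "sunny": 0.4}
TRANS = {"rainy": {"rainy": 0.7, "sunny": 0.3},
         "sunny": {"rainy": 0.4, "sunny": 0.6}}
EMIT = {"rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
        "sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

def forward(observations):
    """Filtered P(state_t | obs_1..t): infer the hidden part from what we see."""
    alpha = {s: START[s] * EMIT[s][observations[0]] for s in STATES}
    for obs in observations[1:]:
        alpha = {s: EMIT[s][obs] * sum(alpha[p] * TRANS[p][s] for p in STATES)
                 for s in STATES}
    z = sum(alpha.values())
    return {s: a / z for s, a in alpha.items()}

posterior = forward(["walk", "shop", "clean"])
print(posterior)  # "rainy" comes out more likely after observing "clean"
```

The point of the exercise: nothing in the observation sequence names a state directly; "hidden" means the model's Markov chain runs over states you never observe.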
Are there any open tools for solving HMMs on large datasets? I.e., if I have millions of observations from millions of users and want to learn an HMM from the data, what are my options?
> A Markov chain is a sequence of random variables X1, X2, X3, . . . , Xt, . . . , such that the probability distribution of Xt+1 depends only on t and xt (Markov property), in other words:

No. In a Markov process, the future can depend on the past, even on all of the past. What is special is that the past and the future are conditionally independent given the present. If we are not given the present, then all of the past can be relevant in predicting the future.
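A quick numerical check of this point, using a made-up two-state chain: once the present state is known, additionally conditioning on the previous state does not change the estimated probability of the next state.

```python
import random

# Toy two-state Markov chain (transition numbers invented for illustration).
TRANS = {"A": {"A": 0.8, "B": 0.2}, "B": {"A": 0.3, "B": 0.7}}

def step(state):
    return "A" if random.random() < TRANS[state]["A"] else "B"

random.seed(42)
chain = ["A"]
for _ in range(200_000):
    chain.append(step(chain[-1]))

def cond_prob(past=None):
    """Estimate P(X_{t+1}='A' | X_t='A', and optionally X_{t-1}=past)."""
    num = den = 0
    for t in range(1, len(chain) - 1):
        if chain[t] == "A" and (past is None or chain[t - 1] == past):
            den += 1
            num += chain[t + 1] == "A"
    return num / den

# Extra conditioning on the past changes nothing once the present is known:
print(cond_prob(), cond_prob("A"), cond_prob("B"))  # all close to 0.8
```

If instead we dropped the conditioning on the present, the previous state would become informative again, which is exactly the "conditionally independent given the present" distinction the comment is making.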
link to MIT course:

<a href="http://ocw.mit.edu/courses/aeronautics-and-astronautics/16-410-principles-of-autonomy-and-decision-making-fall-2010/" rel="nofollow">http://ocw.mit.edu/courses/aeronautics-and-astronautics/16-4...</a>