
Ask HN: Best place to start learning about Markov Chains?

237 points by chrisherd about 6 years ago
A progressive reading list or process to follow would be awesome

37 comments

dcwca about 6 years ago
Just pick a random place to start, read some stuff, and then take a guess as to which direction to go in next, based on what's probably a good next thing to read. Then keep repeating the process over and over again.
gtycomb about 6 years ago

There are so many. Starting with basic probability, this lecture series is a good first intro:

https://www.dartmouth.edu/~chance/teaching_aids/books_articles/probability_book/Chapter11.pdf

Or, starting from the basics and learning how to actually do the number crunching, this is unusually good (Stewart, Introduction to the Numerical Solution of Markov Chains):

https://press.princeton.edu/titles/5640.html

Robert Gallager's MIT lecture series, very well presented, titled Principles of Digital Communications, takes you down another track built on Markov chains (Kalman filters, etc.):

https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-450-principles-of-digital-communications-i-fall-2006/
activatedgeek about 6 years ago

Markov chains are, in essence, simple. Instead of digressing into all the theory, I'd recommend doing it on a need basis. Learn as you go. So pick up a problem and move ahead. I don't think it is fruitful to learn everything about Markov chains just for the sake of it.

Markov Chain Monte Carlo to sample from probability distributions is a good start, if you are into sampling: https://arxiv.org/abs/1206.1901
thedevindevops about 6 years ago

Tough one, I'd have to say:

45% http://setosa.io/ev/markov-chains/

30% https://en.wikipedia.org/wiki/Markov_chain

25% YouTube
usgroup about 6 years ago

1. Elementary probability theory.

2. Poisson processes.

3. The Markov property.

4. Stochastic processes.

5. Realise that you're missing a background in analysis, therefore you don't know sh?t about measure theory, but you actually need it to know anything deeper. Wonder to yourself if you really want to spend the next 3 years getting a maths background you don't have.

6. Convince yourself that it's all just engineering and muddle through by picking a project involving a non-trivial Markov chain.

7. Go back and spend 3 years doing foundational maths, then repeat points 1-5.
YorkshireSeason about 6 years ago

If you are not already intimately familiar with them, learn about FSAs (finite state automata), aka FSMs (finite state machines).

Most interesting facts about Markov chains (e.g. the Stationary Distribution Theorem) really are probabilistic generalisations of simpler facts about FSAs (e.g. FSAs cannot be used to "count"). In my experience, understanding them first for FSAs and then seeing how they generalise to the probabilistic case is a good way of approaching this subject.
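To make that parallel concrete, here is a minimal Python sketch (illustrative names and numbers, not from the comment): a deterministic FSA transition table next to the same state space with probabilistic transitions, which is exactly a Markov chain.

```python
import random

# Deterministic FSA: each state maps to exactly one next state.
fsa = {"sunny": "cloudy", "cloudy": "rainy", "rainy": "sunny"}

# Markov chain: each state maps to a probability distribution over next states.
chain = {
    "sunny":  {"sunny": 0.6, "cloudy": 0.3, "rainy": 0.1},
    "cloudy": {"sunny": 0.3, "cloudy": 0.4, "rainy": 0.3},
    "rainy":  {"sunny": 0.2, "cloudy": 0.4, "rainy": 0.4},
}

def step(state):
    """Sample the next state from the chain's transition distribution."""
    nexts, probs = zip(*chain[state].items())
    return random.choices(nexts, weights=probs)[0]

state = "sunny"
path = [state]
for _ in range(10):
    state = step(state)
    path.append(state)
print(path)
```

Replacing each probability row with a single arrow recovers the FSA, which is the sense in which the chain generalises it.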
Vaslo about 6 years ago

Here is an excellent place to start:

http://setosa.io/ev/markov-chains/
notinventedhear about 6 years ago

For a broad introduction to Bayesian analysis, MCMC, and PyMC, I'd suggest Bayesian Methods for Hackers [1].

[1] http://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/
localhostdotdev about 6 years ago

markov chains are very simple at their core (e.g. a simple version could be: pick the next word given the known probabilities of words that follow the previous word)

it can be implemented in a few lines of code, that's the beauty of it: https://github.com/justindomingue/markov_chains/blob/master/lib/markov_chains/dictionary.rb

obviously then you could take the previous n words into account, tweak the starting word, add randomness, etc.

now replace "word" with "state" and "probability(next state | previous state)" with the edges of a graph: https://static1.squarespace.com/static/54e50c15e4b058fc6806d068/t/5650d16ee4b033f56d20ae6b/1459882428797/markov+chain+graph+all.png?format=1500w

and you've got a generic markov chain :)

footnote: p(A | B) is the probability of A given B, e.g. p(rain | clouds) > p(rain | sun) :)
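The "few lines of code" claim can be sketched directly in Python (an illustrative version of the same idea, not the linked Ruby gist):

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words observed to follow it."""
    words = text.split()
    chain = defaultdict(list)
    for prev, nxt in zip(words, words[1:]):
        chain[prev].append(nxt)  # duplicates encode the empirical probabilities
    return chain

def generate(chain, start, length=8):
    """Walk the chain: sample each next word from the followers of the current one."""
    out = [start]
    for _ in range(length - 1):
        followers = chain.get(out[-1])
        if not followers:
            break  # dead end: the last word of the corpus
        out.append(random.choice(followers))
    return " ".join(out)

chain = build_chain("the cat sat on the mat and the cat ran")
print(generate(chain, "the"))
```

Storing followers as a list with repeats, rather than explicit probabilities, makes uniform sampling over the list equivalent to sampling from the empirical transition distribution.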
crshults about 6 years ago

I thought this recent post, "Generating More of My Favorite Aphex Twin Track" [1], had a good beginner-level write-up on Markov chains.

[1] https://news.ycombinator.com/item?id=19490832
nrjames about 6 years ago

What I would do is use the Markovify Python library and feed it several texts from Project Gutenberg... try to generate some Lovecraftian prose or something...

https://github.com/jsvine/markovify
YeGoblynQueenne about 6 years ago

Personally, I started with Eugene Charniak's Statistical Language Learning [1], then continued with Manning and Schütze's Foundations of Statistical Natural Language Processing [2] and Speech and Language Processing by Jurafsky and Martin [3].

The Charniak book is primarily about HMMs and quite short, so it's the best introduction to the subject. Manning and Schütze and Jurafsky and Martin are much more extensive and cover pretty much all of statistical NLP up to their publication date (so no LSTMs, if I remember correctly), but they are required reading for an in-depth approach.

You will definitely want to go beyond HMMs at some point, so you will probably want the other two books. But if you really just want to know about HMMs, then start with the Charniak.

______________

[1] https://mitpress.mit.edu/books/statistical-language-learning

[2] https://nlp.stanford.edu/fsnlp/

[3] https://web.stanford.edu/~jurafsky/slp3/
evmar about 6 years ago

For hidden Markov models (which you should only look into after you get the basics), I recall that this widely-cited paper (perhaps the original?) is pretty readable. From the title it looks like it's about speech, but ignore the speech parts and read the math:

https://www.robots.ox.ac.uk/~vgg/rg/papers/hmm.pdf
danaugrs about 6 years ago

I really like this short, relaxed video: "Information Theory part 10: What is a Markov chain?" by Art of the Problem: https://www.youtube.com/watch?v=o-jdJxXL_W4

If you like it, I recommend watching the whole series.
jotaf about 6 years ago

These are my favorite lecture notes; they assume almost no a priori knowledge (with an awesome review of basic probability), and yet they don't shy away from explaining all the rigorous math.

If you have time to read step-by-step derivations and want to understand the fundamentals, I think this is an excellent self-contained resource.

https://ermongroup.github.io/cs228-notes/
twiecki about 6 years ago

If you are looking for an explanation of MCMC that focuses on intuitive understanding to complement more mathematical introductions, I wrote a blog post trying to explain things in simple terms here: https://twiecki.io/blog/2015/11/10/mcmc-sampling/
ivan_ah about 6 years ago

If you're interested in a basic math intro (starting from linear algebra concepts), check out Section 8.2 in this excerpt from the book "No Bullshit Guide to Linear Algebra": https://minireference.com/static/excerpts/probability_chapter.pdf#page=12 This excerpt contains some exercises (with answers in the back) as well as an example application (PageRank).

Technically, linear algebra is not "required" to understand Markov chains, but it's a very neat way to think about them: each "step" in the chain is equivalent to multiplication of the state vector by the transition matrix.
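That vector-matrix view fits in a few lines of plain Python (a toy two-state chain of my own, not taken from the book): one step of the chain is the state row vector times the transition matrix, and iterating converges to the stationary distribution for a regular chain.

```python
# Row-stochastic transition matrix: P[i][j] = probability of moving
# from state i to state j.
P = [[0.9, 0.1],
     [0.5, 0.5]]

def step(p, P):
    """Multiply the state (row) vector p by the transition matrix P."""
    n = len(P)
    return [sum(p[i] * P[i][j] for i in range(n)) for j in range(n)]

p = [1.0, 0.0]          # start in state 0 with certainty
for _ in range(100):    # power iteration toward the stationary distribution
    p = step(p, P)
print(p)                # approaches [5/6, 1/6]
```

Solving pi = pi P by hand gives pi = [5/6, 1/6] for this matrix, which the iteration reproduces; PageRank is the same computation on a web-sized transition matrix.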
maurits about 6 years ago

My personal favorite introduction to MC(MC) is lecture 1 of Statistical Mechanics and Computations [1].

[1]: https://www.coursera.org/learn/statistical-mechanics
melling about 6 years ago

I've got a couple of links here:

https://github.com/melling/MathAndScienceNotes/tree/master/statistics
jerednel about 6 years ago

I learned quite a bit by exploring attribution modeling with them. There is an R package where you can just faceroll a model without really understanding anything, so I tried recreating it in Python: https://github.com/jerednel/markov-chain-attribution - it's messy for sure, but it was a learning exercise and it helped me understand the concept quite a bit. It currently only supports the simplest use case of a first-order Markov chain.
jamesb93 about 6 years ago

Make one with a direct application. I did one to model melody from Bach in a stupid way. It was made in Max, so I can't quantify the size of the code in any meaningful way, but it's basically just a text file with an index and a number of possibilities related to that index.

https://soundcloud.com/jamesbradbury/9th-order-markov-chain-of-bach
sublimino about 6 years ago

Markov chains can be quite amusing when applied to a corpus of similar texts, and the results are often stunningly human-like. I maintain a list of humourous applications: https://github.com/sublimino/awesome-funny-markov

Some favourites:

- Erowid trip reports and tech recruiter emails - https://twitter.com/erowidrecruiter

- Calvin and Markov - Calvin and Hobbes strips reimagined - http://joshmillard.com/markov/calvin/

- Generate your future tweets based on the DNA of your existing messages - http://yes.thatcan.be/my/next/tweet/

- Fake headlines created by smashing up real headlines - https://www.headlinesmasher.com/best/all

- The most confusing subreddit (often on the front page) - https://www.reddit.com/r/subredditsimulator

The original Markov-generated content prank: "I Spent an Interesting Evening Recently with a Grain of Salt" - https://web.archive.org/web/20011101013348/http://www.sincity.com/penn-n-teller/pcc/shaney.html

And of course (un-amusingly!):

- Google's PageRank algorithm is built on Markov chains - https://en.wikipedia.org/wiki/PageRank#Damping_factor

n.b. there used to be parodies of Hacker News, but both are down: https://news.ycombniator.com/ and https://lou.wtf/phaker-news
maxmouchet about 6 years ago

For an introduction to discrete and continuous-time Markov chains, as well as an application to queuing theory, you can check the MOOC "Queuing Theory: from Markov Chains to Multi-Server Systems" on edX [1].

[1] https://www.classcentral.com/course/edx-queuing-theory-from-markov-chains-to-multi-server-systems-10079
thepill about 6 years ago

http://setosa.io/ev/markov-chains/
DanBC about 6 years ago

Not sure it's introductory, but A Mathematical Theory of Communication, page 5 onwards, is useful: http://www.math.harvard.edu/~ctm/home/text/others/shannon/entropy/entropy.pdf
segmondy about 6 years ago

The Wikipedia page is actually good, and it's how I learned about it: https://en.wikipedia.org/wiki/Markov_chain Follow through with some random googling, read, then implement it. It's really simple for something that sounds so fancy. :)
mindcrime about 6 years ago

David Silver's course on Reinforcement Learning contains some good information on Markov processes. See Lecture #2 in particular.

https://www.youtube.com/playlist?list=PL7-jPKtc4r78-wCZcQn5IqyuWhBZ8fOxT
platz about 6 years ago

-- Markov Decision Processes

There is a lot of info out there about Markov chains, but very little about Markov decision processes (MDPs). How popular are MDPs? What are their strengths? Weaknesses?

-- Kalman Filters vs HMMs (Hidden Markov Models):

"In both models, there's an unobserved state that changes over time according to relatively simple rules, and you get indirect information about that state every so often. In Kalman filters, you assume the unobserved state is Gaussian-ish and it moves continuously according to linear-ish dynamics (depending on which flavor of Kalman filter is being used). In HMMs, you assume the hidden state is one of a few classes, and the movement among these states uses a discrete Markov chain. In my experience, the algorithms are often pretty different for these two cases, but the underlying idea is very similar." - THISISDAVE

-- HMMs vs LSTMs/RNNs:

"Some state-of-the-art industrial speech recognition [0] is transitioning from HMM-DNN systems to "CTC" (connectionist temporal classification), i.e., basically LSTMs. Kaldi is working on "nnet3", which moves to CTC as well. Speech was one of the places where HMMs were _huge_, so that's kind of a big deal." - PRACCU

"HMMs are only a small subset of generative models that offers quite little expressiveness in exchange for efficient learning and inference." - NEXTOS

"IMO, anything that can be done with an HMM can now be done with an RNN. The only advantage that an HMM might have is that training it might be faster using cheaper computational resources. But if you have the $$$ to get yourself a GPU or two, this computational advantage disappears for HMMs." - SHERJILOZAIR
micheda about 6 years ago

The hmm_filter project implements Viterbi-inspired algorithms and transition matrices in Python; it might also be a useful learning resource: https://github.com/minodes/hmm_filter
orasis about 6 years ago

The most important thing is to realize just how damn simple they are. As you get mired in the literature, everything will seem overwhelmingly complex. Just grok the very, very basic idea of them and it will come easier.

Also, they're just a convenient model (for some problems), not a holy truth.
AlexCoventry about 6 years ago

You could try Gelman et al.'s Bayesian Data Analysis. It has a good overview of MCMC.

If you want an overview of Markov chains as statistical models in their own right, Durbin et al.'s Biological Sequence Analysis is a well-motivated overview.
ggggtez about 6 years ago

There isn't really very much to learn. Just start on Wikipedia, and branch out if you think there is something more. Markov chains are very simple in practice.
i_am_proteus about 6 years ago

If the "motivation-theorem-proof" style appeals to you, find a copy of Finite Markov Chains by Kemeny and Snell (ISBN 0442043287).
ackbar03 about 6 years ago

How about a textbook, maybe? There aren't always easy alternatives out there; sometimes you have to bite the bullet and do the work.
tnecniv about 6 years ago

Do you have an application in mind to help guide suggestions?

As others have said, if you don't know probability, start there.
currymj about 6 years ago

You can find a copy of "Markov Chains and Mixing Times" online, which is good and relatively accessible.
graycat about 6 years ago

E. Cinlar, Introduction to Stochastic Processes.

Covers limit theorems and continuous time.