Hmm. I'm not an expert, but some of this definitely doesn't seem accurate, and some of the "Bullshit" may turn out to be quite important.<p>Take the statement:<p>> Markov Chain is essentially a fancy name for a random walk on a graph<p>Is that really true? I don't think so. To my understanding, a Markov process is a stochastic process with the additional (aka "Markov") property that it is "memoryless": to estimate the next state you only need to know the current state, not any of the history. It becomes a Markov chain if it is a discrete, rather than continuous, process.<p>Lots of random walks on graphs satisfy this definition. Say you have a graph, pick a starting node, and ask "walk 5 nodes at random from this starting node; what is the probability that you end up at a specific end node?" This is a Markov process: at any point, to estimate the next state you only need to know the state now.<p>But lots of random walks on graphs do not have the Markov property. For example, say I do the same thing as in the previous example, with a graph and a start and target node, but ask "walk n nodes at random from the starting node; what is the probability that at some point you visit the target node?" Now I have introduced a dependence on the history and my process is no longer memoryless. It is a discrete stochastic process and a random walk on a graph, but it is not a Markov chain.<p>A real-life example of Markov versus non-Markov processes: with a European option on a stock I only care about the stock price at the expiry date. But if I have a barrier option, or my option has knock-in/knock-out/autocallable features, then it has a path dependence, because I care about whether the price hit the barrier level at any point in its trajectory, not just the price at the end. So the price process for the barrier option is non-Markov.
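To make the parent's distinction concrete, here is a minimal Python sketch on a made-up toy graph (the graph, the start node, and the target node are all invented for illustration). The walk itself only looks at the current node; the "did we ever visit the target?" question is a property of the whole path.

    import random

    # Hypothetical toy graph as an adjacency list (purely illustrative).
    graph = {
        "A": ["B", "C"],
        "B": ["A", "C", "D"],
        "C": ["A", "B", "D"],
        "D": ["B", "C"],
    }

    def walk(start, n_steps, rng):
        """Simple random walk: the next node depends only on the current node."""
        path = [start]
        node = start
        for _ in range(n_steps):
            node = rng.choice(graph[node])
            path.append(node)
        return path

    rng = random.Random(0)
    n_walks = 10_000

    # "Where am I after 5 steps?" -- answerable from the current state alone.
    end_at_D = sum(walk("A", 5, rng)[-1] == "D" for _ in range(n_walks)) / n_walks

    # "Did I hit D at any point?" -- a function of the whole path, not just the
    # current node (unless you enlarge the state with a "visited yet" flag).
    ever_hit_D = sum("D" in walk("A", 5, rng) for _ in range(n_walks)) / n_walks

    print("P(end at D after 5 steps) ~", round(end_at_D, 3))
    print("P(visit D within 5 steps) ~", round(ever_hit_D, 3))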
Science communication is so important. I write scientific papers and I always write a blog post about the paper later, because nobody understands the scientific paper -- not even the scientists. The scientists regularly read my blog instead. The "scientific style" has become so obtuse and useless that even the professionals read the blog instead. True insanity.
I had to learn Bayesian econometrics mostly on my own[1]. Fortunately, Jeff Miller[2] created a series of fantastic YouTube videos to explain Markov Chain Monte Carlo in detail. Personally, I prefer to learn mathematical concepts with equations and develop the intuition on my own. If you have the same preference, you will find his videos really helpful: <a href="https://www.youtube.com/watch?v=12eZWG0Z5gY" rel="nofollow">https://www.youtube.com/watch?v=12eZWG0Z5gY</a><p>[1] I am lucky to know people who are fantastic Bayesian modelers and they helped me polish my concepts.<p>[2] <a href="https://jwmi.github.io/index.html" rel="nofollow">https://jwmi.github.io/index.html</a>
So he sets up a toy problem (drawing from a baby-names distribution), then never explains how to solve it.<p>The intuition is: you set up a graph where the vertices are names and the edges are based on name similarity -- two names are neighbors if, e.g., their edit distance is within some limit. You start at a random name, then repeatedly pick a neighbor and flip a biased coin that comes up heads with probability P(neighbor)/P(current), capped at 1; if heads, you move to the neighbor.<p>I'm sure this is wrong in many and subtle ways, but when I read an article like this I expect some intuition like this to be imparted.
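For what it's worth, here is a rough Python sketch of the walk described above, with a made-up name distribution and a neighbor rule based on edit distance (both invented for illustration; the article's actual data isn't used). It glosses over details like graph connectivity and the correction needed when names have different numbers of neighbors.

    import random

    # Hypothetical unnormalized name weights (invented for illustration).
    P = {"anna": 40, "ana": 15, "hannah": 30, "hanna": 10, "jon": 25, "john": 50}

    def edit_distance(a, b):
        """Classic dynamic-programming Levenshtein distance."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
            prev = cur
        return prev[-1]

    def neighbors(name):
        return [n for n in P if n != name and edit_distance(n, name) <= 2]

    def sample_names(n_steps, rng):
        x = rng.choice(list(P))
        out = []
        for _ in range(n_steps):
            y = rng.choice(neighbors(x))              # propose a nearby name
            if rng.random() < min(1.0, P[y] / P[x]):  # biased coin on the ratio
                x = y                                 # heads: move to the neighbor
            out.append(x)
        return out

    rng = random.Random(0)
    samples = sample_names(50_000, rng)
    for name in P:
        print(name, round(samples.count(name) / len(samples), 3))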
> The “bullshit” here is the implicit claim of an author that such jargon is needed. Maybe it is to explain advanced applications (like attempts to do “inference in Bayesian networks”), but it is certainly not needed to define or analyze the basic ideas.<p>"The bullshit here is the implicit claim of an author that the German language is needed. Maybe it is for advanced applications (like Goethe's poetry), but it is certainly not needed to describe basic ideas."<p>(proceeds to explain the same basic concepts 10x more verbosely than any math textbook on the subject)<p>Math/statistics nomenclature is certainly not perfect (think of it as a general utilities library that has been in active development for 200+ years), but it is widely used for a reason: once you learn the language it becomes second nature (much like knowing all the details of the standard library API in your language of choice), allowing you to communicate arbitrarily complex abstract ideas with speed and precision.
Common pattern where a bright spark asks, 'why you all so complicated?'
Proceeds to assume we're dealing with a finite graph / set.<p>All the complication is needed to handle the fact that the state can be a random vector of real numbers in a possibly varying-dimensional space. It's not jerking off on jargon for its own sake.<p>Sure, there are simple cases -- that doesn't make the general case 'bullshit'.
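As a small illustration of that point, here is a sketch where the chain's state is a vector in R^2 rather than a node in a finite graph (the target density, a standard 2D Gaussian, is just a convenient made-up example):

    import math
    import random

    def log_target(x):
        """Unnormalized log-density of a standard 2D Gaussian (illustrative)."""
        return -0.5 * (x[0] ** 2 + x[1] ** 2)

    def random_walk_metropolis(n_steps, step=0.8, seed=0):
        rng = random.Random(seed)
        x = [0.0, 0.0]
        out = []
        for _ in range(n_steps):
            prop = [xi + rng.gauss(0.0, step) for xi in x]  # propose a nearby point
            accept = math.exp(min(0.0, log_target(prop) - log_target(x)))
            if rng.random() < accept:
                x = prop
            out.append(list(x))
        return out

    samples = random_walk_metropolis(20_000)
    mean = [sum(s[i] for s in samples) / len(samples) for i in (0, 1)]
    print([round(m, 2) for m in mean])   # should be near [0.0, 0.0]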
Interesting! I also keep extensive notes on mathematical and computational concepts (and “buzzwords”) with a “no bullshit” title, and it works well. They’re great for quick refreshers once in a while.
If I ever write about "Markov chains without the Math/Jargon/BS", I'll use the "Ten Second Tom" clip from 50 First Dates[1] and a host of other sci-fi movies about time loops[2,3] to illustrate the memoryless[4] Markov property[5].<p>---<p>1. <a href="https://www.youtube.com/watch?v=iN_BDcKhtWk" rel="nofollow">https://www.youtube.com/watch?v=iN_BDcKhtWk</a><p>2. <a href="https://en.wikipedia.org/wiki/Time_loop" rel="nofollow">https://en.wikipedia.org/wiki/Time_loop</a><p>3. <a href="https://en.wikipedia.org/wiki/List_of_films_featuring_time_loops" rel="nofollow">https://en.wikipedia.org/wiki/List_of_films_featuring_time_l...</a><p>4. <a href="https://en.wikipedia.org/wiki/Memorylessness" rel="nofollow">https://en.wikipedia.org/wiki/Memorylessness</a><p>5. <a href="https://en.wikipedia.org/wiki/Markov_property" rel="nofollow">https://en.wikipedia.org/wiki/Markov_property</a>
There is definite truth to the idea that stats people have a habit of writing deliberately impenetrable prose (probably due to proximity to economists and political scientists).<p>I happened to need to implement a Markov chain for playing rock, paper, scissors in <a href="https://luduxia.com/showdown" rel="nofollow">https://luduxia.com/showdown</a>. Once you actually understand it, the code is short; it took no more than 20 minutes to write and was by far the easiest part of the entire effort, which was surprising. Vertically centering a div is harder, and involves dealing with even more jargon.
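Not the code behind the linked game (I haven't seen it), but a guess at the general shape of such a bot: a first-order model that counts the opponent's move-to-move transitions and plays the counter to their most likely next move.

    import random
    from collections import defaultdict

    MOVES = ["rock", "paper", "scissors"]
    BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}  # value beats key

    class MarkovRPS:
        def __init__(self):
            # counts[prev][nxt] = how often the opponent played nxt right after prev
            self.counts = defaultdict(lambda: defaultdict(int))
            self.prev = None

        def observe(self, move):
            if self.prev is not None:
                self.counts[self.prev][move] += 1
            self.prev = move

        def play(self, rng=random):
            row = self.counts[self.prev]
            if not row:                        # no history yet: play at random
                return rng.choice(MOVES)
            predicted = max(row, key=row.get)  # opponent's most likely next move
            return BEATS[predicted]            # play whatever beats it

    bot = MarkovRPS()
    for opp_move in ["rock", "rock", "paper", "rock", "rock"]:
        print("bot plays:", bot.play(), "| opponent played:", opp_move)
        bot.observe(opp_move)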
MCMC is bullshit to begin with. It's too slow, and alternatives are better for many reasons. It should not be the standard method taught to students for Bayesian inference.