Something to note about this formulation is the explicit assumption that in p(y|do(x)), the 'do' operation is completely independent of prior observed variables, i.e. the doers are 'unmoved movers' [1].<p>That fits the model where you randomly 'do' one thing or another (e.g. blinded testing); however, this is <i>not</i> the same thing as p(y|do'(x)), where do' is your empirical observation of when you yourself have set X=x in a more natural context.<p>E.g. let's say you will always turn on the heat when it's cold outside. P(cold outside | do(turn on heat)) = P(cold outside), because turning on the heat does not affect the temperature outdoors.<p>However, P(cold outside | do'(turned on heat)) > P(cold outside), because empirically you only <i>choose</i> to turn on the heat when it's cold outdoors.<p>These two are also different from P(cold outside | heat was turned on) (since <i>someone else</i> might have access to the thermostat).<p>In reality our choices and actions are themselves products of the initial state (including our own beliefs, and our own knowledge of what would happen if we did x). Our actions move the world, but we are also moved by the world.<p>Does do-calculus have a careful treatment of 'mixed' scenarios where actions are both causes <i>and</i> effects of other causes?<p>[1] <a href="https://en.wikipedia.org/wiki/Unmoved_mover" rel="nofollow">https://en.wikipedia.org/wiki/Unmoved_mover</a>
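To make the heat example concrete, here's a minimal simulation sketch (Python; the 30% chance of cold and 90% chance of turning on the heat are made-up numbers for illustration). Conditioning on <i>observing</i> the heat being on tells you it is almost certainly cold, while <i>forcing</i> the heat on tells you nothing about the weather:
<pre><code>import random

# Hypothetical numbers: it's cold 30% of the time, and when it's cold
# you turn on the heat 90% of the time (and never otherwise).
def sample(do_heat=None):
    cold = random.random() < 0.3
    if do_heat is None:
        heat = cold and random.random() < 0.9   # natural behaviour
    else:
        heat = do_heat                          # intervention: force the switch
    return cold, heat

N = 100_000
observed = [cold for cold, heat in (sample() for _ in range(N)) if heat]
forced   = [cold for cold, heat in (sample(do_heat=True) for _ in range(N))]

print(sum(observed) / len(observed))  # P(cold | heat on)      ~ 1.0
print(sum(forced) / len(forced))      # P(cold | do(heat on))  ~ 0.3
</code></pre>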
For those trying to understand the difference between action and observation, here's a good example from a friend:<p>Every bug you fix in your code increases your chances of shipping on time, but provides evidence that you won't.
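A toy simulation of that intuition (Python, with made-up numbers): the latent bug count drives both how many fixes you observe and whether you ship on time, so each fix helps causally while being bad news evidentially:
<pre><code>import random

def project(do_extra_fix=False):
    bugs = random.choice([2, 10])       # latent bug load (made-up numbers)
    fixes = bugs // 2                   # you find and fix about half of them
    if do_extra_fix:
        fixes += 1                      # intervention: fix one more bug
    remaining = bugs - fixes
    ship = random.random() < 0.9 ** remaining   # each leftover bug risks a slip
    return fixes, ship

N = 100_000
runs = [project() for _ in range(N)]

# Observation: seeing more fixes predicts a *lower* shipping rate
few  = [ship for fixes, ship in runs if fixes == 1]
many = [ship for fixes, ship in runs if fixes == 5]
print(sum(few) / len(few), sum(many) / len(many))    # ~0.90 vs ~0.59

# Intervention: forcing one extra fix *raises* the shipping rate
forced = [ship for _, ship in (project(do_extra_fix=True) for _ in range(N))]
print(sum(ship for _, ship in runs) / N, sum(forced) / N)  # ~0.75 vs ~0.83
</code></pre>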
I really enjoyed the author's humility in the introduction to this piece. He paused, took a hard look at criticism of his field that seemed harsh or arrogant, and found insight in it.
Here is a paper explaining the essentials of how 45+ years of Causal Inference applies to ML:
<a href="http://www.nber.org/chapters/c14009.pdf" rel="nofollow">http://www.nber.org/chapters/c14009.pdf</a><p>In this podcast by the same author, it explains the potential of sharing lessons from both worlds, if you're not in the mood for an academic paper:
<a href="http://www.econtalk.org/archives/2016/09/susan_athey_on.html" rel="nofollow">http://www.econtalk.org/archives/2016/09/susan_athey_on.html</a>
How does someone <i>use</i> do-calculus?
It's a nice mathematization of Goodhart's law, <a href="https://en.wikipedia.org/wiki/Goodhart%27s_law" rel="nofollow">https://en.wikipedia.org/wiki/Goodhart%27s_law</a><p>but how would it help an algorithm make better predictions?<p>Sure, the reason a person turns on the heat affects our belief about the outside weather (were they feeling cold, or were they just trolling?), but how do you <i>know</i> the reason a person turned on the heat, and couldn't you learn which reasons are predictive by measuring correlations with other observables? If you <i>know</i> the reason directly ("I'm just playing with the dial because I'm 4 years old"), that's a data point you could throw into your ML model <i>without</i> explicitly knowing it's a <i>reason</i>.
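For what it's worth, "throw the reason into the model" is close to what do-calculus formalizes as the backdoor adjustment: if the reason Z (the confounder) is observed, P(y|do(x)) = sum_z P(y|x,z) P(z), which is computable from purely observational data. A toy sketch with made-up numbers:
<pre><code># Z = the "reason" (confounder), X = the action, Y = the outcome.
# All probabilities below are made up for illustration.
p_z = {0: 0.5, 1: 0.5}                      # P(z)
p_x_given_z = {0: 0.1, 1: 0.9}              # P(X=1 | z)
p_y_given_xz = {(0, 0): 0.2, (0, 1): 0.3,   # P(Y=1 | x, z)
                (1, 0): 0.6, (1, 1): 0.7}

# Observational P(Y=1 | X=1): z is weighted by P(z | X=1)
p_x1 = sum(p_x_given_z[z] * p_z[z] for z in p_z)
p_obs = sum(p_y_given_xz[(1, z)] * p_x_given_z[z] * p_z[z] / p_x1 for z in p_z)

# Interventional P(Y=1 | do(X=1)): z is weighted by its marginal P(z)
p_do = sum(p_y_given_xz[(1, z)] * p_z[z] for z in p_z)

print(p_obs, p_do)   # ~0.69 vs 0.65 -- they differ because Z confounds X
</code></pre>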
I am interested in a companion phenomenon to the recent interest in causal models in machine learning: namely, that at least in computer vision the idea is not new at all, and has been important for many decades.<p>One of the original sources that took this approach is James Gibson's "The Ecological Approach to Visual Perception" (1979) [0], which discussed at length the idea of "affordances", similar in some respects to topics in reinforcement learning. Affordances represent the information about outcomes you gain by varying your degrees of observational freedom (i.e. you learn to generalize beyond occluded objects by moving your head a little to the left or right and seeing how the visual input varies; this lets you get food, or hide from a predator that's partially blocked by a tree, so over time generalizing past occlusions becomes better and better -- much more interesting than a naive approach like using data augmentation to pad a labeled data set with synthetically occluded variations, as is often done to improve rotational invariance).<p>This idea was then extended with a lot of formality in the mid-to-late 00s by Stefano Soatto in his papers on "Actionable Information" [1].<p>I wish more effort had been made by e.g. Pearl to look into this and unify his approach with what had already been thought of. It turns me off when someone tries to create a "whole new paradigm" and it starts to feel like they want to generate sexy marketing hype about it, rather than saying: hey, this is an extension of, or connection to, an older idea <i>already present in machine learning</i>. Instead it comes across as, "We over here in causal-inference world already know so much more about what to do, so now let's apply it to your domain where you never thought of this." Pearl has a history of this, as in his previous debates with Gelman about Bayesian models. It almost feels like he is shopping around for a sexy application area where his one-upmanship approach will catch on, to give him a chance at the hype gravy train.<p>[0]: <a href="https://en.wikipedia.org/wiki/James_J._Gibson#Major_works" rel="nofollow">https://en.wikipedia.org/wiki/James_J._Gibson#Major_works</a><p>[1]: <a href="http://www.vision.cs.ucla.edu/papers/soatto09.pdf" rel="nofollow">http://www.vision.cs.ucla.edu/papers/soatto09.pdf</a>
Worth mentioning, perhaps, that Cybernetics originated from the study of "circular loops of causality", systems where e.g. A causes B, B causes C, and in turn C causes A, etc...
Nothing to see here. The do-calculus is just fancy notation for what reinforcement learning is already doing: trying different possible actions to maximize reward. If you know the possible actions in advance, this is basically minimizing the regret of suboptimal policy choices.
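To spell out the RL framing being claimed here: in a bandit loop the agent sets the action itself, so every exploratory choice is effectively a do() -- a minimal epsilon-greedy sketch with made-up reward rates:
<pre><code>import random

# Made-up true mean rewards for two actions
true_reward = {"a": 0.3, "b": 0.6}
totals = {a: 0.0 for a in true_reward}
counts = {a: 0 for a in true_reward}

for t in range(10_000):
    if random.random() < 0.1:   # explore: the agent *sets* the action, i.e. do(X=a)
        action = random.choice(list(true_reward))
    else:                       # exploit the current empirical estimate
        action = max(true_reward,
                     key=lambda a: totals[a] / counts[a] if counts[a] else 0.0)
    reward = float(random.random() < true_reward[action])
    totals[action] += reward
    counts[action] += 1

print({a: round(totals[a] / counts[a], 2) for a in true_reward})  # ~ true means
</code></pre>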