Same content in prose rather than slides: https://xgboost.readthedocs.org/en/latest/model.html
One interesting thing about Boosted Trees is that the author's software (XGBoost [1]) reliably outperforms other implementations in terms of accuracy of results [2]. I'm not entirely sure why this is - I know there is an open ticket in the Spark GBT implementation to investigate it.

[1] https://github.com/tqchen/xgboost

[2] It's also very fast in terms of absolute speed.
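If you want to sanity-check that kind of claim yourself, here's a minimal sketch of the sort of side-by-side run behind it: fit XGBoost and scikit-learn's GBT on the same split and compare held-out accuracy. The dataset, split, and hyperparameters below are placeholders I picked for illustration, not a rigorous benchmark.

```python
# Rough side-by-side comparison sketch: XGBoost vs scikit-learn's GBT
# on the same synthetic data and the same hyperparameters.
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Placeholder dataset; swap in whatever problem you actually care about.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "xgboost": xgb.XGBClassifier(n_estimators=200, max_depth=3, learning_rate=0.1),
    "sklearn GBT": GradientBoostingClassifier(n_estimators=200, max_depth=3, learning_rate=0.1),
}

for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(name, "test accuracy:", accuracy_score(y_te, model.predict(X_te)))
```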
It's worth checking out Friedman's "Gradient Boosting Machine" paper (as mentioned here in the references) from 1999 -- this has a good description of "boosting" from the general perspective of function optimisation.

Here's a copy [pdf]: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.31.869&rep=rep1&type=pdf
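The core idea from that paper is stage-wise function optimisation: start from a constant model and repeatedly add a small tree fit to the negative gradient of the loss at the current predictions (for squared error, that's just the residuals). Here's a toy sketch of that idea, not XGBoost itself; the data, depth, and shrinkage are placeholder values.

```python
# Toy gradient boosting in the spirit of Friedman's GBM paper:
# each stage fits a shallow tree to the current residuals, i.e. the
# negative gradient of 1/2 * (y - F(x))^2 with respect to F(x).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate = 0.1              # shrinkage (placeholder value)
F = np.full_like(y, y.mean())    # initial constant model F_0
trees = []

for m in range(100):
    residuals = y - F                        # negative gradient of the loss
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
    F += learning_rate * tree.predict(X)     # F_m = F_{m-1} + nu * h_m
    trees.append(tree)

def predict(X_new):
    """Additive prediction: constant term plus the sum of shrunken tree outputs."""
    pred = np.full(X_new.shape[0], y.mean())
    for tree in trees:
        pred += learning_rate * tree.predict(X_new)
    return pred

print("train MSE:", np.mean((predict(X) - y) ** 2))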
I haven't read much about XGBoost's boosted trees. Are the trees fit independently and simply combined additively? Is an ensemble of two trees actually better than a single tree?

It seems like additive training that drops the constant terms, together with regularization of model complexity, would shape the tree ensemble into a baseline model with minimal assumptions. So, how well does it do at predicting favorable outcomes compared with tree learning driven purely by heuristic split criteria (impurity)?
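As I understand the linked docs, the trees are not independent: the final score is the sum of the trees' outputs, but each new tree is fit to the gradient of the loss given the trees already in the ensemble, so the second tree specifically corrects the first. A small sketch of comparing one boosting round against two with the XGBoost Python API (the dataset and parameters here are placeholders, not a tuned setup):

```python
# Compare training log-loss with 1 vs 2 boosting rounds to see the
# additive, sequential nature of the ensemble: round 2 is fit to the
# loss gradients left over after round 1.
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.metrics import log_loss

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
dtrain = xgb.DMatrix(X, label=y)
params = {"max_depth": 3, "eta": 0.3, "objective": "binary:logistic"}

for rounds in (1, 2):
    bst = xgb.train(params, dtrain, num_boost_round=rounds)
    preds = bst.predict(dtrain)  # predicted probabilities
    print(rounds, "round(s): train log-loss =", log_loss(y, preds))
```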
In the first example, being male is one of two features that predict playing video games, and (surprise!) only the boy and the old man are classified as gamers. Talk about casual sexism! Can you imagine taking this class as a woman (who maybe, just maybe, happens to enjoy video games) and having to forgive/ignore the instructor's cluelessness in order to get through the material? So incredibly tone-deaf and lazy, ugh.