Isn't this a flawed approach? It seems like Khan Academy is trying to reconstruct a record of behaviours across their business by stitching together:

1. Parsing web logs for page views and API accesses

2. Exporting "some client-side events" from Mixpanel

3. Mining their transactional databases for state changes

On #1 - web caching and client-side interactions that never reach the server have long invalidated web-log-based analytics. How is Khan different?

On #3 - this is reverse-engineering your user behaviours by mining state changes in your transactional systems. This is typically a ton of work, it breaks whenever you change your data models, and your operational systems aren't designed to reveal user behaviours anyway.

Have Khan explored alternative approaches? Typically: defining with the analyst team the set of events you want to monitor, making sure all of your systems (client-side, mobile, server-side, whatever) emit immutable streams of those events, and then collecting, storing, enriching and analyzing at your leisure - something like the sketch below.
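To make that alternative concrete, here's a minimal sketch of what "emit immutable streams of events" can look like. It assumes an append-only JSON-lines file as the collection sink; the event names and fields are hypothetical, not Khan Academy's actual schema, and in practice the sink would be a message queue or collector endpoint rather than a local file.

```python
# Minimal sketch of emitting immutable, self-describing analytics events.
# Assumes an append-only JSON-lines file as the sink; event names and
# fields are hypothetical examples, not Khan Academy's schema.
import json
import time
import uuid


def emit_event(event_name, properties, sink_path="events.jsonl"):
    """Append one immutable event to the stream; never update or delete it later."""
    event = {
        "event_id": str(uuid.uuid4()),   # unique id so downstream jobs can de-duplicate
        "event_name": event_name,        # drawn from the catalogue agreed with the analyst team
        "timestamp": time.time(),        # when the behaviour happened, not when it was processed
        "properties": properties,        # context captured at the source, enriched downstream
    }
    with open(sink_path, "a") as sink:
        sink.write(json.dumps(event) + "\n")
    return event


# The same event shape works whether the emitter is client-side, mobile, or server-side.
emit_event("video_played", {"user_id": "u-123", "video_id": "intro-to-fractions", "position_s": 0})
emit_event("exercise_completed", {"user_id": "u-123", "exercise_id": "adding-fractions", "correct": True})
```

The point of the pattern is that the event stream is the source of truth for behaviour: collection, storage, enrichment and analysis all happen downstream, so nothing breaks when the transactional data models change.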