
Causal Analytics

69 points by bmahmood · almost 6 years ago

9 comments

otterk10 · almost 6 years ago
Scott here from ClearBrain - the ML engineer who built the underlying model behind our causal analytics platform.

We're really excited to release this feature after months of R&D. Many of our customers want to understand the causal impact of their products, but are unable to iterate quickly enough running A/B tests. Rather than taking the easy path and serving correlation-based insights, we took the harder approach of automating causal inference through what's known as an observational study, which can simulate A/B experiments on historical data and eliminate spurious effects. This involved a mix of linear regression, PCA, and large-scale custom Spark infrastructure. Happy to share more about what we did behind the scenes!
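The regression adjustment described here can be sketched on simulated data. This is a toy NumPy illustration of the general idea (a confounder biases the naive difference in means, while including it as a regressor recovers the true effect), not ClearBrain's actual pipeline; the data-generating numbers are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# A confounder (say, baseline engagement) drives both treatment and outcome.
z = rng.normal(size=n)
treatment = (z + rng.normal(size=n) > 0).astype(float)
outcome = 2.0 * treatment + 3.0 * z + rng.normal(size=n)  # true effect = 2.0

# Naive estimate: raw difference in means, inflated by the confounder.
naive = outcome[treatment == 1].mean() - outcome[treatment == 0].mean()

# Observational adjustment: regress the outcome on treatment plus the
# confounder, and read the treatment effect off the fitted coefficient.
X = np.column_stack([np.ones(n), treatment, z])
beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
adjusted = beta[1]  # close to 2.0, while naive lands far above it
```

The naive estimate here comes out around 5.4 because the confounder is positively correlated with both treatment and outcome; the adjusted coefficient recovers roughly 2.0.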
cuchoi · almost 6 years ago
Very exciting to see causal theory being productionized!

From the article, this seems like a normal regression to me. It would be interesting to know what makes it causal (or at least better) compared to an OLS. PCA has been used for a long time to select the features to use in a regression. Would it be accurate to say that the innovation is in how the regression is calculated rather than in the statistical methodology?

Either way, it would be interesting to test this approach against an A/B test and check how much the observational study differs from the A/B estimates, and how sensitive this approach is to including (or omitting) a set of features. It would also be interesting to compare it to other quasi-experimental methodologies, such as propensity score matching.

Is there a more extended document explaining the approach?

Good luck!
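A minimal version of the kind of quasi-experimental baseline mentioned here is stratification on a confounder (a crude stand-in for propensity-score strata): estimate the effect within each stratum, then average. This sketch is purely illustrative, on simulated data, and is not from the article.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000
z = rng.normal(size=n)                       # observed confounder
p = 1 / (1 + np.exp(-1.5 * z))               # true propensity to be treated
t = (rng.uniform(size=n) < p).astype(float)
y = 2.0 * t + 3.0 * z + rng.normal(size=n)   # true effect = 2.0

# Naive difference in means, badly biased by the confounder.
naive = y[t == 1].mean() - y[t == 0].mean()

# Stratify on the confounder, estimate the effect inside each stratum
# (where treated and control units are comparable), then average with
# stratum sizes as weights.
edges = np.quantile(z, np.linspace(0, 1, 21))
effects, weights = [], []
for lo, hi in zip(edges[:-1], edges[1:]):
    m = (z >= lo) & (z <= hi)
    if t[m].sum() > 0 and (1 - t[m]).sum() > 0:
        effects.append(y[m][t[m] == 1].mean() - y[m][t[m] == 0].mean())
        weights.append(m.sum())
stratified = np.average(effects, weights=weights)
```

The stratified estimate lands near the true 2.0, while the naive difference is several times larger; comparing such a baseline against A/B estimates is exactly the kind of validation the comment asks for.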
6gvONxR4sf7o · almost 6 years ago
I only skimmed it, so forgive me if I got this wrong. The causal model used here makes some incredibly strong assumptions (unlikely to be close enough to accurate). Are these results valid if there are unobserved confounders or selection bias?
mrbonner · almost 6 years ago
I have been involved in causal inference analysis since 2015. We use a mixed model of decision trees and fixed-effect regressions. I read your paper but could not find a reference explaining why, when one cannot run an A/B test to verify a relationship, an observational analysis can be used to do so. Could you share a reference, please? Thank you for this insightful article!
whoisnnamdi · almost 6 years ago
Cool stuff, thanks for sharing publicly.

Did you all consider using Double Selection [1] or Double Machine Learning [2]?

The reason I ask is that your approach is very reminiscent of a Lasso-style regression where you first run Lasso for feature selection and then re-run a normal OLS with only those controls included (Post-Lasso). This is somewhat problematic because Lasso has a tendency to drop too many controls if they are highly correlated with one another, introducing omitted variable bias. Compounding the issue, some of those variables may be correlated with the treatment variable, which increases the chance they will be dropped.

The proposed solution is to run two separate Lasso regressions, one with the original dependent variable and another with the treatment variable as the dependent variable, recovering two sets of potential controls, and then using the union of those sets as the final set of controls. This is explained in simple language at [3].

Now, you all are using PCA, not Lasso, so I don't know if these concerns apply or not. My sense is that you may still be omitting variables if the right variables are not included at the start, which is not a problem any particular methodology can completely avoid. Would love to hear your thoughts.

Also, you don't show any examples or performance testing of your method. An example would be demonstrating, in a situation where you "know" (via an A/B test, perhaps) what the "true" causal effect is, that your method is able to recover a similar point estimate. As presented, how do we / you know that this is generating reasonable results?

[1] http://home.uchicago.edu/ourminsky/Variable_Selection.pdf
[2] https://arxiv.org/abs/1608.00060
[3] https://medium.com/teconomics-blog/using-ml-to-resolve-experiments-faster-bd8053ff602e
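The double-selection recipe described here can be sketched end to end on simulated data. This is a toy NumPy implementation with a hand-rolled coordinate-descent Lasso; the penalty value and the data-generating process are invented for illustration and are not from the linked papers.

```python
import numpy as np

def lasso(X, y, lam, iters=200):
    """Plain coordinate-descent Lasso with soft-thresholding updates."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(iters):
        for j in range(p):
            # Partial residual excluding feature j, then soft-threshold.
            r = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return beta

rng = np.random.default_rng(2)
n, p = 2000, 20
Z = rng.normal(size=(n, p))
# Z[:, 0] confounds both the treatment d and the outcome y.
d = Z[:, 0] + rng.normal(size=n)
y = 2.0 * d + 4.0 * Z[:, 0] + rng.normal(size=n)  # true effect = 2.0

# Double selection: Lasso of y on the candidate controls, Lasso of d on
# the candidate controls, then take the union of the selected sets.
sel_y = np.flatnonzero(np.abs(lasso(Z, y, lam=300.0)) > 1e-6)
sel_d = np.flatnonzero(np.abs(lasso(Z, d, lam=300.0)) > 1e-6)
controls = sorted(set(sel_y) | set(sel_d))

# Final step: plain OLS of y on the treatment plus the union of controls.
X = np.column_stack([np.ones(n), d, Z[:, controls]])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
effect = beta_hat[1]
```

Because the confounder predicts the treatment, it survives the second Lasso even if the first one were to drop it; that union step is what guards the final OLS against the omitted-variable bias described above.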
kk58 · almost 6 years ago
Did you guys look into partial mutual information for confounding-variable selection?

Or Granger causality for estimating Granger causes?
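For reference, a basic Granger-causality check reduces to asking whether lagged values of x improve an autoregressive fit of y. A toy sketch on simulated data (one lag, illustrative only, not a substitute for a proper test with chosen lag order):

```python
import numpy as np

rng = np.random.default_rng(3)
T = 3000
x = rng.normal(size=T)
y = np.zeros(T)
for s in range(1, T):
    # y depends on its own past and on x's past, so x Granger-causes y.
    y[s] = 0.5 * y[s - 1] + 0.8 * x[s - 1] + 0.3 * rng.normal()

Y = y[1:]
# Restricted model: y_t regressed on its own lag only.
A_r = np.column_stack([np.ones(T - 1), y[:-1]])
# Unrestricted model: y_t regressed on its own lag and on x's lag.
A_u = np.column_stack([np.ones(T - 1), y[:-1], x[:-1]])
rss_r = np.sum((Y - A_r @ np.linalg.lstsq(A_r, Y, rcond=None)[0]) ** 2)
rss_u = np.sum((Y - A_u @ np.linalg.lstsq(A_u, Y, rcond=None)[0]) ** 2)

# F-style statistic for the single restriction: a large value means the
# lag of x sharply reduces the residual error, i.e. Granger causality.
f_stat = (rss_r - rss_u) / (rss_u / (T - 1 - 3))
```

Here the statistic is enormous because x's lag carries most of y's variance; running the same comparison with x and y swapped would yield a small value, since y does not feed back into x.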
whirlofpearl · almost 6 years ago
Looks like you lifted this straight out of Judea Pearl's seminal research.

Congratulations! Just remember to patent it :)
move-on-by · almost 6 years ago
An analytics platform without a privacy policy? :(

404: https://www.clearbrain.com/privacy

404: https://www.clearbrain.com/terms
Rainymood · almost 6 years ago
Interesting to note that ClearBrain is in YC.