科技回声

3 comments

palae, more than 3 years ago

It's probably a good idea to remind (or inform) people that, at least in scientific research, null hypothesis statistical testing, and "statistical significance" in particular, have come under fire [1,2]. From the American Statistical Association (ASA) in 2019 [2]:

"We conclude, based on our review of the articles in this special issue and the broader literature, that it is time to stop using the term “statistically significant” entirely. Nor should variants such as “significantly different,” “p < 0.05,” and “nonsignificant” survive, whether expressed in words, by asterisks in a table, or in some other way. Regardless of whether it was ever useful, a declaration of “statistical significance” has today become meaningless."

[1] The ASA Statement on p-Values: Context, Process, and Purpose - https://www.tandfonline.com/doi/full/10.1080/00031305.2016.1154108

[2] Moving to a World Beyond “p < 0.05” - https://www.tandfonline.com/doi/full/10.1080/00031305.2019.1583913

dmitriid, more than 3 years ago

Before interpreting A/B results, the main question that needs to be asked is: "What is it that you're A/B testing?"

For too many companies, it's testing "engagement", which leads to hiding functionality (more clicks is more engagement), reducing info density (more time spent is more engagement), etc.

And coming from Netflix... I don't think there's a single person who likes that when you browse Netflix it autoplays random videos (not even trailers) with audio at full volume. But yeah, A/B tests something something. So I wish Netflix learned from their own teachings.

jonathanbentz, more than 3 years ago

I am interested to see what they will be testing in some of the upcoming posts in this series. It would be fun to be scrolling Netflix and have the transparency to know that I'm seeing the 'B' test.

Interpreting A/B test results: false positives and statistical significance
