TechEcho

3 comments

data_dan_ over 3 years ago

(I wrote this article)

We recently wrote an article (https://l.bit.io/o-cop26) about methane emissions and the COP26 commitment to cut them. While writing it, we found serious inconsistencies in some of the data sources.

Discussions of data quality and validation in data science tend to end with recommendations for a few validation checks, such as making sure data come from trusted sources, handling missing values, and investigating outliers. These checks are important, but they won't save an analysis from perfectly formatted data from a trusted source that happens to be wrong for reasons that can't be found in the dataset itself. Even data of apparently good quality can lead to faulty conclusions.

This article explores that problem through a case study. The U.N. publishes greenhouse gas emissions data supplied each year by parties to the UNFCCC (United Nations Framework Convention on Climate Change). The data are consistent, up to date, and well formatted, and the U.N. is a reliable source of official data. However, there is good reason to believe the data submitted by some countries are not accurate. Other trusted data sources show startlingly large differences from the U.N. data. In particular, we found that Russia's methane emissions data were highly inconsistent with the World Resources Institute (WRI) Climate Analysis Indicators Tool (CAIT) data, even though the CAIT data were quite similar to the U.N. data for other countries.
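For anyone who wants to try this kind of cross-source check, here is a minimal sketch, not the method used in the article. It compares two sources country by country and flags large relative differences. The function name `flag_discrepancies`, the 25% threshold, and every number in it are hypothetical placeholders, not real emissions figures.

```python
def flag_discrepancies(source_a, source_b, rel_threshold=0.25):
    """Return {country: relative_difference} where two sources disagree.

    The relative difference is |a - b| divided by the mean of a and b.
    Only countries present in both sources are compared.
    """
    flagged = {}
    for country in sorted(source_a.keys() & source_b.keys()):
        a, b = source_a[country], source_b[country]
        mean = (a + b) / 2
        if mean == 0:
            continue  # both values are zero, so there is nothing to compare
        rel_diff = abs(a - b) / mean
        if rel_diff > rel_threshold:
            flagged[country] = round(rel_diff, 3)
    return flagged


# Made-up placeholder values (arbitrary units), NOT real emissions data.
un_reported = {"Country A": 100.0, "Country B": 50.0, "Country C": 20.0}
independent = {"Country A": 104.0, "Country B": 95.0, "Country C": 19.0}

print(flag_discrepancies(un_reported, independent))  # {'Country B': 0.621}
```

A check like this only tells you that two sources disagree, not which one is right. Deciding that still takes outside context, such as reporting on how the estimates were produced.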

otacust over 3 years ago

The Washington Post article referenced in this post is really interesting: https://www.washingtonpost.com/climate-environment/interactive/2021/russia-greenhouse-gas-emissions/

The yearly revisions of the GHG emissions estimates are really striking.

rurban over 3 years ago

Change the label to unreliable source then.


What Do You Do When Reliable Sources Publish Unreliable Data?
