科技回声

Ask HN: In your data models, how do you trust your labels?

1 point by lostsoul8282, about 4 years ago

Hi HN!

In my previous life in tech at BigCo, I was always given data from providers (Bloomberg, Reuters, etc.) and processed it using my models.

I was recently asked: how do you trust data that may not be audited and is self-reported? For example, say a company reports the number of women it employs, or its environmental metrics.

I feel this is a general problem for any self-reported data. How would you handle it?

1 comment

PaulHoule, about 4 years ago

In financial accounting, a public company (say General Motors) creates aggregated financial statistics that it publishes in its quarterly reports.

It hires an accounting firm (say Deloitte) to check its work by examining a sample of its documentation. This is a lot like Deming-style quality control: the auditors look at some fraction of the checks that came from car dealers, or that were cut to parts suppliers, and confirm that the story makes sense.

In fields like insurance, where fraud is particularly dangerous, they do things like check that there is a real person behind some of the policies.

I would look to the same model for other kinds of accounting too.

For instance, if the company said that 42% of its employees are women, it could let a third party look at a sample of 1000 employees that the third party chooses, going so far as letting the third party see employment records filed with the state, contact those employees, etc.

Like a public opinion poll, it does not give an exact answer. Maybe they will find that 40% or 44% of the employees are women, which is close enough.

That is just one trick in the toolbox that accountants have. Sometimes they will see a bunch of deposits with round numbers ($7700), and then one comes in for $345.34, and that is the one they ask you about.

So that's the subject you should be looking up, and those are the kind of people you should be talking to.
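
To make the sampling idea above concrete, here is a minimal Python sketch of how a third party might check a self-reported proportion against a random sample. Everything in it is an assumption for illustration: the function names `audit_sample` and `consistent_with_claim`, the workforce of 50,000, and the choice of a normal-approximation 95% interval with a finite population correction. Real audit sampling standards are more involved.

```python
import math
import random


def audit_sample(records, sample_size, seed=0):
    """Estimate a proportion from a random sample chosen by the auditor.

    `records` is a list of booleans (True = employee is a woman).
    Returns (estimated proportion, 95% margin of error).
    """
    rng = random.Random(seed)
    # The auditor, not the company, picks which records to inspect.
    sample = rng.sample(records, sample_size)
    p_hat = sum(sample) / sample_size

    # Finite population correction: sampling without replacement from a
    # limited workforce shrinks the uncertainty slightly.
    n_pop = len(records)
    fpc = math.sqrt((n_pop - sample_size) / (n_pop - 1)) if n_pop > 1 else 0.0

    # Normal-approximation standard error for a proportion.
    std_err = math.sqrt(p_hat * (1 - p_hat) / sample_size) * fpc
    return p_hat, 1.96 * std_err


def consistent_with_claim(claimed, p_hat, margin):
    """True if the claimed figure lies within the sample's margin of error."""
    return abs(claimed - p_hat) <= margin


if __name__ == "__main__":
    # Hypothetical workforce: 50,000 employees, 42% of them women.
    workforce = [True] * 21_000 + [False] * 29_000
    p_hat, margin = audit_sample(workforce, sample_size=1000, seed=1)
    print(f"sample estimate: {p_hat:.1%} +/- {margin:.1%}")
    print("claim of 42% plausible:", consistent_with_claim(0.42, p_hat, margin))
```

With a sample of 1000 and a true share near 42%, the margin of error comes out at roughly three percentage points, which matches the intuition that finding 40% or 44% is "close enough". A result far outside that band would be the signal to dig deeper.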