Data diffs: Algorithms for explaining what changed in a dataset

21 points by marcua over 3 years ago

1 comment

rgavuliak over 3 years ago
This is a long-known issue in BI. You can report either as is (i.e. the latest version of the data) or as was (if you store the history, e.g. as deltas). A good example is subscriptions: someone pays for an annual subscription on the 1st of January 2021, but cancels half-way through and gets a refund. That alone already produces different outcomes depending on which view you report.

Meta, Twitter and other advertising networks overwrite data. In a past analysis I oversaw, the numbers could differ by anywhere from 25% to 300%. This can happen for a variety of reasons, e.g. the report you pull tells you that you had 1000 impressions, but two weeks later when you download the same report you find out it was actually 900 (100 were removed after the fact by bot filtering).

Perhaps the biggest culprit is Google with conversions. For some reason they attribute conversions to the date the converting user saw the ad. This means that if that person purchases your product up to 28 days after first seeing the ad, Google goes back and updates the conversion count for the day they saw the ad.
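
To make the as is / as was distinction concrete, here is a minimal Python sketch using the impressions example above. The table layout, the dates, and the function names `as_was` and `as_is` are my own assumptions for illustration, not any ad network's API. The idea is just that if you keep every correction as a delta with the date you recorded it, you can answer both questions from the same history.

```python
from datetime import date

# Hypothetical append-only history of deltas (illustrative dates and values):
# (metric_date, recorded_on, impressions_delta)
HISTORY = [
    (date(2021, 1, 1), date(2021, 1, 2), 1000),   # first report pulled
    (date(2021, 1, 1), date(2021, 1, 16), -100),  # correction two weeks later (bot filtering)
]


def as_was(history, metric_date, as_of):
    """Value for metric_date as it was known on the day as_of."""
    return sum(
        delta
        for m_date, recorded_on, delta in history
        if m_date == metric_date and recorded_on <= as_of
    )


def as_is(history, metric_date):
    """Value for metric_date using every correction recorded so far."""
    return sum(delta for m_date, _, delta in history if m_date == metric_date)


if __name__ == "__main__":
    day = date(2021, 1, 1)
    print(as_was(HISTORY, day, as_of=date(2021, 1, 2)))  # what the first report said
    print(as_is(HISTORY, day))                           # what the report says today
```

If the source overwrites data and you only ever keep the latest pull, only the as is number is recoverable. The as was number needs the history stored on your side.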