Event streams pick up duplicates for other reasons too, not just the dupes introduced by at-least-once processing in Kinesis or Kafka workers. We've done a lot of thinking about this at Snowplow (all of it open source); this is a good starting point:<p><a href="http://snowplowanalytics.com/blog/2015/08/19/dealing-with-duplicate-event-ids/" rel="nofollow">http://snowplowanalytics.com/blog/2015/08/19/dealing-with-du...</a><p>Our last release started to tackle dupes caused by bots, spiders and dodgy UUID algos:<p><a href="http://snowplowanalytics.com/blog/2016/12/20/snowplow-r86-petra-released/#synthetic-dedupe" rel="nofollow">http://snowplowanalytics.com/blog/2016/12/20/snowplow-r86-pe...</a>
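<p>To make the two kinds of dupe concrete, here is a rough Python sketch. It is not Snowplow's actual code. The field names (event_id, original_event_id), the fingerprint helper and the choice of SHA-256 are all assumptions made for illustration. The idea: an event with the same ID and the same content is a plain redelivery and can be dropped, while an event that reuses an ID with different content is a distinct event and gets a fresh ID.

```python
import hashlib
import json
import uuid


def fingerprint(event, ignore=("event_id",)):
    """Hash the event's content, skipping the fields listed in `ignore` (hypothetical helper)."""
    payload = {k: v for k, v in event.items() if k not in ignore}
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def dedupe(events):
    """Drop exact repeats; re-ID events that reuse an ID with different content."""
    seen = {}  # event_id -> set of content fingerprints already emitted
    out = []
    for event in events:
        eid = event["event_id"]
        fp = fingerprint(event)
        if eid not in seen:
            # First time this ID shows up: pass the event through unchanged.
            seen[eid] = {fp}
            out.append(event)
        elif fp in seen[eid]:
            # Same ID, same content: a redelivery, so drop it.
            continue
        else:
            # Same ID, different content: keep the event under a new ID.
            # "original_event_id" is an assumed field name for the old ID.
            seen[eid].add(fp)
            out.append({**event, "event_id": str(uuid.uuid4()), "original_event_id": eid})
    return out


events = [
    {"event_id": "abc", "page": "/home"},
    {"event_id": "abc", "page": "/home"},     # redelivered copy
    {"event_id": "abc", "page": "/pricing"},  # ID collision, different content
]
result = dedupe(events)
```

<p>The in-memory dict is only there to keep the sketch short. In a real stream the set of seen IDs would need to be bounded in some way, for example by a time window or an external store.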