Hi, Ben from DigitalOcean here - just to give you guys an update. This method will no longer work on a newly created droplet.

We've now defaulted scrub_data to ON for both the web interface and the API while we look at making this behavior permanent. Additionally, we've re-engineered the way we provision disks, so access to previously written data is no longer possible.

For now we've taken every step in favor of security, and we will build a permanent solution that favors security and caution moving forward.
So, this is going to overwrite lots of data on the block device you're trying to recover data from, resulting in a lot of repeated information and erasure of recoverable data. The correct answer is to redirect the output and make find.sh emit the gzipped data, so you can pipe it straight to your local disk without ever writing to the remote end.

Edit: Here's the code. https://github.com/gregimba/DigitalHax/pull/1
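The idea, roughly (this is a minimal sketch of the approach, not the contents of that PR; the device name /dev/vda and the host are assumptions):

    # Stream the extracted strings, gzipped, over SSH to the local machine,
    # so nothing is ever written back onto the droplet's own block device.
    ssh root@droplet 'dd if=/dev/vda bs=1M | strings | gzip -c' > recovered-strings.gz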
This is a cute PoC of how easy it is, but with freely available forensic tools like, say, PhotoRec, it is possible to extract much more meaningful and diverse data (entire files, images, database files...) than by simply running strings.

So, don't take it as the maximum damage one can get.
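A rough sketch of what that looks like, assuming GNU dd and the testdisk/photorec package; the device name and paths here are placeholders:

    # Image the block device first, then carve whole files out of the image
    # with PhotoRec instead of just grepping for printable strings.
    dd if=/dev/vda of=droplet.img bs=1M
    photorec /log /d recovered_files droplet.img   # interactive prompts follow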
How long does dd take? Could use an estimate. I ran dd for around 10 minutes this morning, got 500,000 lines, and it was still running.

Update: it finished in around 12 minutes. out.txt is around 10 GB.

Update: out.txt is around 54 million lines according to wc -l out.txt. I'm using less with the [line number]G command to poke around. I have an NYC1 droplet, and there's a lot of junk that isn't mine: text in other languages, and Python, which I don't use.
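If you want an estimate while it runs, a couple of options (a sketch assuming GNU dd on Linux; device and file names are placeholders):

    # Print transfer statistics on stderr as it goes (GNU coreutils 8.24+)
    dd if=/dev/vda bs=1M status=progress | strings > out.txt

    # Or ask an already-running GNU dd for its current totals.
    # (On non-GNU dd, SIGUSR1 may terminate the process instead.)
    kill -USR1 $(pgrep -x dd)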
As a user of Digital Ocean (amongst others) I find it hard to get too excited about this. When I destroy a droplet (VM) I already have the option to scrub the disks before deletion.

If I choose not to use that (and I never have, on any of the hundreds of machines I've created and later torn down) it's because there is nothing of any sensitivity on them. If someone wants to resurrect gigabytes of entirely boring and transient log data from what I was last doing, they're welcome to!

I can only really see this being a concern for people who were storing sensitive information on a cloud instance which they then removed and chose NOT to scrub. In which case, they already have larger issues than this one. "Problem with user, not with cloud."
This is a case where only "aggressive full disclosure" got a company to respond. Which is why I'm generally only willing to go through "responsible disclosure" for companies which have shown themselves to be reasonable in the past, or in exceptional cases where the vulnerability is impossible for end users to mitigate, and/or causes exceptionally grave harm.
In what circumstances will this work? Are you recovering data from other customers? If so, will this work even if the other customer has deleted their VM using the recommended procedure?