"Aurora will try to retain enough log information to support that window of time."<p>It's good to know that Aurora will try. It's not like it needs to be reliable or anything.
Aurora keeps coming along in leaps and bounds, congratulations to the team, this is a fantastic achievement!<p>I only wish that every new feature didn't inevitably come with the caveat that it's only for the MySQL flavour of Aurora.<p>I understand both the engineering and product development reasons for doing so (different stack and MySQL is undoubtedly a much larger customer base), but it always makes these announcements a little underwhelming as an Aurora Postgres user.
OK, Oracle has had this feature for at least a decade; it’s called a “flashback query”. Obviously Aurora costs 10% of Oracle, but still, I thought this was going to be a huge feature-add considering the HN comment count.<p>That being said, I love AWS, am Pro-Certified, and work with it every day.<p>I know Oracle is a giant mean bully company, but at least their arrogance was never of the “world-destabilizing” kind like Facebook.<p>EDIT: changed rollback query to flashback query (flashback query can be used both to view or to actually change the DB)
Reading the paper[1] linked from Jeff's post:<p>> In Aurora, we have chosen a design point of tolerating (a) losing an entire AZ and one additional node (AZ+1) without losing data, and (b) losing an entire AZ without impacting the ability to write data. We achieve this by replicating each data item 6 ways across 3 AZs with 2 copies of each item in each AZ. We use a quorum model with 6 votes (V = 6), a write quorum of 4/6 (V_w = 4), and a read quorum of 3/6 (V_r = 3). With such a model, we can (a) lose a single AZ and one additional node (a failure of 3 nodes) without losing read availability, and (b) lose any two nodes, including a single AZ failure and maintain write availability. Ensuring read quorum enables us to rebuild write quorum by adding additional replica copies.<p>There are many 2-AZ regions in AWS, of course. I don't think you can stripe 3 copies per AZ, an AZ failure drops you to potentially 2/6, and if you allow for 2/6 and 3/6 writing you could have a split brain. Any thoughts how they manage that?<p>[1] <a href="https://www.allthingsdistributed.com/files/p1041-verbitski.pdf" rel="nofollow">https://www.allthingsdistributed.com/files/p1041-verbitski.p...</a>
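To make the quorum arithmetic in that quote concrete, here is a rough sketch that just counts surviving copies. The V, V_w and V_r values come from the quoted passage. The `copies_left` helper and both layouts are made up for illustration, and the two-AZ layout with 3 copies each is purely hypothetical, not something the paper describes.

```python
# Quorum sizes taken from the quoted paper.
V, V_W, V_R = 6, 4, 3

# The usual quorum rules: every read overlaps the latest write,
# and any two writes overlap each other.
assert V_R + V_W > V
assert 2 * V_W > V


def copies_left(layout, lost_az, extra_nodes_lost=0):
    """Copies still reachable after losing one whole AZ plus some extra nodes."""
    remaining = sum(n for az, n in layout.items() if az != lost_az)
    return remaining - extra_nodes_lost


def can_write(copies):
    return copies >= V_W


def can_read(copies):
    return copies >= V_R


# Layout described in the paper: 3 AZs with 2 copies each.
three_az = {"az1": 2, "az2": 2, "az3": 2}
# Hypothetical layout for a 2-AZ region: 3 copies each (an assumption).
two_az = {"az1": 3, "az2": 3}

after_az = copies_left(three_az, "az1")               # lose one AZ
after_az_plus_one = copies_left(three_az, "az1", 1)   # lose one AZ and one more node
two_az_after = copies_left(two_az, "az1")             # lose one of only two AZs

print(after_az, can_write(after_az), can_read(after_az))
print(after_az_plus_one, can_write(after_az_plus_one), can_read(after_az_plus_one))
print(two_az_after, can_write(two_az_after), can_read(two_az_after))
```

Under these assumptions the 3-AZ layout keeps write quorum after an AZ loss and read quorum after AZ+1, while the hypothetical 3+3 layout would keep reads but lose writes as soon as one AZ goes away, which is the crux of the question above.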
This is nice but it appears that the entire database instance gets rolled back to that point. It'd be a lot nicer if it could be done at a per-db or per-table granularity.<p>Realistically I'd never use this feature because of the risk of data loss. I'd restore a new instance from backups and copy the lost data back over manually.
Very interesting. They describe it as a rewind. Does anybody know if it's really a rewind, where each log record is reversible? Or do they do the easier thing of saving snapshots and then replaying the log from snapshot to desired point?
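For anyone unsure what the two options in that question look like, here is a toy sketch over an in-memory dict. It says nothing about how Aurora actually does it; the log record shape (key, old value, new value) and the function names are invented for illustration.

```python
# Toy log: each record is (key, old_value, new_value); None means "absent".
log = [
    ("a", None, 1),
    ("b", None, 2),
    ("a", 1, 5),
    ("b", 2, None),  # delete b
]


def redo(state, records):
    """Apply records forward (snapshot + replay style)."""
    for key, _old, new in records:
        if new is None:
            state.pop(key, None)
        else:
            state[key] = new
    return state


def undo(state, records):
    """Apply records backward using the stored old values (true rewind)."""
    for key, old, _new in reversed(records):
        if old is None:
            state.pop(key, None)
        else:
            state[key] = old
    return state


target = 2  # we want the state as of the first two log records

# Option 1: rewind the current state by undoing everything after the target.
current = redo({}, log)
rewound = undo(dict(current), log[target:])

# Option 2: start from an older snapshot (here: empty) and replay up to the target.
snapshot = {}
replayed = redo(dict(snapshot), log[:target])

print(current, rewound, replayed)
```

Both routes land on the same state. The rewind only works if every record carries enough information to be reversed, which is exactly what the question is asking about.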
Amazon is the new IBM. Knock yourself out and jump into the AWS ecosystem. A few years down the line, you'll understand that you've lost the leverage you had to potentially take your public cloud business somewhere else when you have so many dependencies on Amazon tech. Basic principles from my view: don't adopt anything but standard EC2/S3 services and create diversity not only in your teams but in your infrastructure policies.
TiDB has supported a similar feature for about 2 years, and it has been adopted by gaming users:
<a href="https://www.pingcap.com/blog/2016-11-15-Travelling-Back-in-Time-and-Reclaiming-the-Lost-Treasures/" rel="nofollow">https://www.pingcap.com/blog/2016-11-15-Travelling-Back-in-T...</a>
I'm slightly confused, is this the same as the existing point-in-time restore that's available for other RDS instances?<p>Edit: Main difference seems to be new cluster vs. in place.
It's a relatively classic invention: take something that exists and repackage it. A snapshot and a log replay accomplish something pretty similar. AWS slapped a UI and some orchestration around it. The cloud lock stuff makes sense (although if having an easy "undo button" on your db layer is mission critical to your business you might have other interesting challenges).
I don't know anything about Aurora and maybe I'm missing something. But why not just wrap everything in a TRANSACTION and then do a ROLLBACK if there's an issue?
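A quick sketch of what that suggestion covers and where it stops, using Python's built-in `sqlite3` as a stand-in (not Aurora or MySQL; the `accounts` table is made up). A ROLLBACK undoes work in a transaction that is still open, but it does nothing for a mistake that has already been committed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 200)])
conn.commit()


def balances():
    return [row[0] for row in conn.execute("SELECT balance FROM accounts ORDER BY id")]


# Case 1: the mistake is noticed while the transaction is still open.
conn.execute("UPDATE accounts SET balance = 0")  # forgot the WHERE clause
conn.rollback()                                  # undoes the UPDATE
after_rollback = balances()

# Case 2: the mistake is committed before anyone notices.
conn.execute("UPDATE accounts SET balance = 0")
conn.commit()
conn.rollback()                                  # nothing left to roll back
after_commit = balances()

print(after_rollback, after_commit)
```

So a transaction helps only if you catch the problem before COMMIT. Anything noticed afterwards needs a restore or a rewind-style feature.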
The seamlessness of this feature is quite amazing. Backups are usually a huge pain to deal with (I've recently been dealing with Postgres/Barman quite a bit). And disaster scenarios aside (for which AWS already does replications across regions), I think a frequent purpose of backups is really to do this "Undo", go back in time and pretend something didn't happen.<p>All this makes me really really wanna use Aurora. :)
How is this better than their already existing PITR?<p>Why would someone want to roll back their own production database instead of doing a PITR to a new database and switching over to it? Surely you would end up losing data because you wouldn't be able to reconcile the new data written to it.
CockroachDB also supports this.
<a href="https://news.ycombinator.com/item?id=11958660" rel="nofollow">https://news.ycombinator.com/item?id=11958660</a>
> <i>We’ve all been there! You need to make a quick, seemingly simple fix to an important production database.</i><p>Have we though? This could be one of those safety nets that makes me worse not better.