Persistent storage remains a complicated problem. Attaching volumes on the fly through Docker's volume abstraction works well enough for most cloud workloads, whether on-demand or spot, but it's still easy to run into problems.<p>This is driving rapid progress in clustered/distributed filesystems, and support is even built into the Linux kernel now with OrangeFS [1]. There are also commercial companies like Avere [2] who make filers that run on object storage with sophisticated caching to provide a fast networked but durable filesystem.<p>Kubernetes is also changing the game with container-native storage. This seems to be the most promising model for the future, as K8S can take care of orchestrating all the complexities of replicas and stateful containers while storage is just another container-based service using whatever volumes are available to the nodes underneath. Portworx [3] is the leading commercial option today, with Rook and OpenEBS [4] catching up quickly.<p>1. <a href="http://www.orangefs.org" rel="nofollow">http://www.orangefs.org</a><p>2. <a href="http://www.averesystems.com/products/products-overview" rel="nofollow">http://www.averesystems.com/products/products-overview</a><p>3. <a href="https://portworx.com" rel="nofollow">https://portworx.com</a><p>4. <a href="https://github.com/openebs/openebs" rel="nofollow">https://github.com/openebs/openebs</a>
OP is offering some very dangerous advice.<p>Twenty years ago, software was hosted on fragile single-node servers with fragile, physical hard disks. Programmers would read and write files directly from and to the disk, and learn the hard way that this left their systems susceptible to corruption in case things crashed in the middle of a write. So behold! People began to use relational databases which offered ACID guarantees and were designed from the ground up to solve that problem.<p>Now we have a resource (spot instances) whose unreliability is a <i>featured design constraint</i> and OP's advice is to just mount the block storage over the network and everything will be fine?<p>Here's hoping OP is taking frequent snapshots of their volumes because it sure sounds like data corruption is practically a statistical guarantee if you take OP's advice without considering exactly how state is being saved on that EBS volume.
Spot instances can now "stop" instead of "terminate" when you get priced out, persisting the attached EBS volumes:<p><a href="https://aws.amazon.com/about-aws/whats-new/2017/09/amazon-ec2-spot-can-now-stop-and-start-your-spot-instances/" rel="nofollow">https://aws.amazon.com/about-aws/whats-new/2017/09/amazon-ec...</a>
Even if you don't use spot instances, the technique of using separate EBS volumes to hold state is useful (and well-known). Ordinary on-demand instances can also be terminated prematurely due to hardware failure or other issues, so storing state on a non-root volume should be considered a best current practice for any instance type.
There's a mechanism in Linux for exactly this purpose: pivot_root. It's used in the standard boot process to switch from the initrd (initial ramdisk) environment to the real system root.<p>ec2-spotter classic uses this, but you can also make a pivoting AMI of your favourite Linux distribution.<p>One thing to watch out for is keeping the OS's automatic kernel updates working. AMIs are rarely updated, and you're going to have a "damn vulnerable Linux" if you don't pull the updates right after booting a new image.
When you use Kubernetes, you won't have to deal with this yourself. The cluster will move pods off nodes that are stopped because the spot price was exceeded. Ideally, place nodes at different bid prices: there will be a performance hit but no outage. With the new AWS stop/start feature [1], nodes will come back up when the spot price drops.<p>1) <a href="https://aws.amazon.com/about-aws/whats-new/2017/09/amazon-ec2-spot-can-now-stop-and-start-your-spot-instances/" rel="nofollow">https://aws.amazon.com/about-aws/whats-new/2017/09/amazon-ec...</a>
To make this even more streamlined, you'd tag the volumes, discover them with `aws ec2 describe-volumes`, and filter for unattached volumes carrying the magic tag.
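A minimal sketch of that filtering step, operating on the JSON shape that `aws ec2 describe-volumes` returns (the tag key `spot-state` and the sample volume IDs are made up for illustration):

```python
import json

def find_unattached_state_volumes(describe_volumes_json, tag_key="spot-state"):
    """Return IDs of volumes that carry the magic tag and have no attachments."""
    volumes = json.loads(describe_volumes_json)["Volumes"]
    return [
        v["VolumeId"]
        for v in volumes
        if not v.get("Attachments")
        and any(t["Key"] == tag_key for t in v.get("Tags", []))
    ]

# Sample payload shaped like the `aws ec2 describe-volumes` output:
sample = json.dumps({
    "Volumes": [
        {"VolumeId": "vol-0aaa", "Attachments": [],
         "Tags": [{"Key": "spot-state", "Value": "web"}]},
        {"VolumeId": "vol-0bbb", "Attachments": [{"InstanceId": "i-123"}],
         "Tags": [{"Key": "spot-state", "Value": "db"}]},
        {"VolumeId": "vol-0ccc", "Attachments": [], "Tags": []},
    ]
})
print(find_unattached_state_volumes(sample))  # → ['vol-0aaa']
```

In practice you'd also pass `--filters` to the CLI so AWS does most of the narrowing server-side, and keep only the "unattached" check locally.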
We normally run spots with Spotinst + Elastic Beanstalk. Our bill has looked great ever since.<p>This solution looks good, yet only applies to single-instance scenarios. I presume this kind of thinking might move toward EFS + chroot for an actually scalable solution that can't be run on Elastic Beanstalk.
So I was pleasantly surprised to discover that for the last several years, spot instances have provided a mechanism that give you 2 minutes notice prior to shutdown:<p><a href="http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-interruptions.html" rel="nofollow">http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-inte...</a><p>Learn something new everyday. :)<p><a href="https://aws.amazon.com/blogs/aws/new-ec2-spot-instance-termination-notices/" rel="nofollow">https://aws.amazon.com/blogs/aws/new-ec2-spot-instance-termi...</a>
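The notice shows up on the instance-metadata termination-time endpoint, which returns a timestamp once the instance is marked for reclamation (and nothing before that). A rough sketch of the polling loop, with the HTTP fetch injected as a callable so the logic is shown without a real metadata service; the stub responses are invented:

```python
import time

TERMINATION_URL = "http://169.254.169.254/latest/meta-data/spot/termination-time"

def wait_for_termination_notice(fetch, poll_seconds=5, max_polls=None):
    """Poll the termination-time endpoint; return the termination timestamp
    once AWS marks the instance, or None if polling is exhausted.

    `fetch` is any callable taking a URL and returning the response body,
    or None when there is no notice yet (the endpoint 404s)."""
    polls = 0
    while max_polls is None or polls < max_polls:
        body = fetch(TERMINATION_URL)
        if body:  # notice present: ~2 minutes to checkpoint and detach
            return body
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(poll_seconds)
    return None

# Stub fetcher: no notice on the first two polls, then a timestamp.
responses = iter([None, None, "2017-10-04T17:11:00Z"])
print(wait_for_termination_notice(lambda url: next(responses),
                                  poll_seconds=0, max_polls=5))
```

A real daemon would react to the returned timestamp by flushing state and unmounting the EBS volume cleanly before the instance disappears.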
The author goes to great lengths to come up with a way for the software that was running on a terminated spot instance to be relaunched using the same root filesystem on a new spot instance, but they never explain <i>why</i> they need to do <i>exactly</i> this. Maybe they already ran everything in Docker containers on CoreOS, so their solution isn't a big shift, but I strongly suspect they could find a simpler way to save and restore state if they got over this obsession with preserving the root filesystem their software sees.
If you don't care about reliability, why not just get a cheap and powerful VPS? Paying $90/month for that machine is madness. I pay $6/month for 6GB RAM, 4 cores, 50GB disk.
Well, one easy way when using Ubuntu-like distributions is to simply place your `/home` folder on a separate (persistent) EBS volume [1].<p>With a few on-boot scripts to attach-volumes / start-containers, it should be fairly easy to get going as well.<p>[1] <a href="https://engineering.semantics3.com/the-instance-is-dead-long-live-the-instance-8b159f25f70a" rel="nofollow">https://engineering.semantics3.com/the-instance-is-dead-long...</a>
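For the `/home`-on-EBS part, the fstab entry might look something like this sketch (the device name `/dev/xvdf` is an assumption and varies by instance type; `nofail` keeps a boot from hanging if the attach script hasn't run yet):

```
# /etc/fstab — mount the persistent EBS volume at /home.
# /dev/xvdf is an assumed device name; on newer NVMe-based
# instances the kernel assigns different names.
/dev/xvdf  /home  ext4  defaults,nofail  0  2
```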
I don't know why all the comments are saying this is a bad idea. For me, one of the things I use EC2 for is deep learning. I just use a spot GPU instance, attach an overlayroot volume, and launch a Jupyter notebook in it. Alternatives like Google Dataflow aren't useful to me because of the price and the process of installing packages. I can also think of many other use cases where a persistent volume helps with manual tasks.
Wouldn't it be simpler to have the smallest possible instance run an NFS server? This would also have an additional bonus of scalability.<p>Edit: or use AWS EFS
Is it just me, or should spot instances deal with work rather than storage? Your (stateful) units of work should live in a queue/DB on a non-spot instance.<p>Attaching and detaching volumes is a good idea, but I wouldn't use it to keep state.
We use k8s at work. I just create a PVC, and when a spot instance is terminated along with the container, a new container is created and mounts the PVC again automatically.
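For anyone unfamiliar with PVCs, a minimal sketch of the pattern (all names, the storage class, and the size are illustrative, not from the comment above):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-state          # hypothetical claim name
spec:
  accessModes:
    - ReadWriteOnce        # one node at a time, like an EBS volume
  resources:
    requests:
      storage: 10Gi        # assumed size
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: nginx         # stand-in image
      volumeMounts:
        - name: state
          mountPath: /data
  volumes:
    - name: state
      persistentVolumeClaim:
        claimName: app-state
```

When the node is reclaimed, the replacement pod (typically managed by a Deployment or StatefulSet rather than a bare Pod) re-binds the same claim, so the data survives the spot interruption.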
It sounds wrong to try to keep state across two EC2 instances. If you find yourself in that situation, try a bit harder to push your state outside the EC2 instance (DynamoDB, S3, etc.).<p>You will get a <i>lot</i> of benefit out of it, though you may lose some performance, which is fine in 99% of cases.