We've had several i3s (2xls and 4xls) die within a few days of each other on critical pathways, out of about 18 we currently have in service. Our AWS rep has given us nothing but non-answers and nervous comments.<p>We love the i3 for its cost/compute value, but we need to pull them out of the fleet if they're running on fragile hardware.<p>HN SREs/devops/infra-eng folks: have you experienced anything similar?
Anecdata:<p>We've had a mix of ~40 t2.u & c4.l instances running for a year with no downtime. Our i3.4xl has fully borked twice (memorably so, since we lose the ephemeral drives and have to reconstitute the analytics data).<p>Though it will be more expensive and less performant, we're moving the system to an RDB-backed c4 soon for reliability; the people-time spent on recovery is too expensive.