This would be deployed on Kubernetes on the major cloud providers, and it needs near-zero downtime. I'm mainly wondering what kinds of catastrophic failures to watch out for.
If you have a very large Redis instance and it starts using all the memory on the box, then eventually you will not have enough memory to replicate to your other Redis server. A full sync forks the Redis process to produce a snapshot that gets sent to the replica, and although the forked child shares memory pages with the parent copy-on-write, heavy writes during the sync can force a large share of those pages to be duplicated. Be sure there is enough free memory for that fork. Also practice using the Sentinel commands, so that when you need to do emergency maintenance it isn’t something you need to Google to remember...
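
To make the memory advice concrete, here is a rough sketch of a headroom check you could run from a monitoring job. It parses the text returned by `INFO memory` and compares resident memory against what is available. The function names, the `safety_factor` of 2.0, and the `limit_bytes` parameter are my own assumptions for illustration and not anything Redis prescribes. The sample values in the demo are made up.

```python
from typing import Optional


def parse_info(text: str) -> dict[str, str]:
    """Parse the 'key:value' lines returned by the Redis INFO command."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Memory".
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def fork_headroom_ok(
    info_text: str,
    safety_factor: float = 2.0,          # assumed worst case, tune for your workload
    limit_bytes: Optional[int] = None,   # hypothetical: pass the pod memory limit here
) -> bool:
    """Return True if resident memory times safety_factor fits in the budget.

    The budget is limit_bytes when given, otherwise the
    total_system_memory field reported by INFO.
    """
    info = parse_info(info_text)
    rss = int(info["used_memory_rss"])
    budget = limit_bytes if limit_bytes is not None else int(info["total_system_memory"])
    return rss * safety_factor <= budget


if __name__ == "__main__":
    # Made-up sample: 6 GiB resident on a box with 8 GiB of memory.
    sample = "# Memory\nused_memory_rss:6442450944\ntotal_system_memory:8589934592\n"
    print(fork_headroom_ok(sample))
```

On Kubernetes the container's memory limit is probably the number that matters, and it may be lower than what `total_system_memory` reports, so I would pass it in as `limit_bytes` and alert well before the check starts failing.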