We have a bunch of Python services written in Flask and Falcon, which we run using gunicorn with sync workers. Most of our services are I/O bound. We want to be able to handle bursty loads of up to 50-60 requests per second in each service.<p>We have been using the 2x + 1 rule of thumb (twice the number of CPU cores, plus one) to decide the number of gunicorn workers, but to achieve the throughput we want with sync workers we'd need to add more CPU, which leads to inefficient CPU utilization because the workers spend most of their time waiting on I/O.<p>One option we are trying is gevent, but that has its own issues with gRPC, which we are looking into. What other approaches do people use for this? asyncio? I know a bunch of new ASGI frameworks have come up. What has been your experience with these in production?
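<p>To make the worker math concrete, here is a rough back-of-the-envelope sketch. It assumes Little's law (requests in flight = arrival rate x time per request) and that each sync worker handles one request at a time. The 0.5 s average latency is an assumed figure for illustration only, not a measurement from our services, and the function names are hypothetical.

```python
import math


def workers_by_rule_of_thumb(cpu_cores: int) -> int:
    """The 2x + 1 rule: twice the number of CPU cores, plus one."""
    return 2 * cpu_cores + 1


def sync_workers_needed(requests_per_second: float, avg_latency_seconds: float) -> int:
    """Estimate concurrent sync workers via Little's law.

    Each sync worker serves one request at a time, so the number of
    requests in flight is a lower bound on the workers needed.
    """
    return math.ceil(requests_per_second * avg_latency_seconds)


def cores_implied_by_rule(workers: int) -> int:
    """Invert 2x + 1 to see how many cores the rule would ask for."""
    return math.ceil((workers - 1) / 2)


if __name__ == "__main__":
    peak_rps = 60          # upper end of the 50-60 req/s burst target
    assumed_latency = 0.5  # ASSUMPTION: seconds per request, mostly I/O wait

    workers = sync_workers_needed(peak_rps, assumed_latency)
    print("sync workers needed:", workers)                              # 30
    print("cores implied by 2x + 1:", cores_implied_by_rule(workers))   # 15
    print("workers on a 4-core box:", workers_by_rule_of_thumb(4))      # 9
```

Under those assumptions the rule would ask for roughly 15 cores just to hold 30 mostly idle workers, which is the inefficiency described above.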