If I understand this correctly, the huge improvement in latency (from 200ms to 3ms) comes from no longer having to deal with slow clients directly. Traffic to your front-end servers now comes only from ELB, and ELB is "spoon-feeding" the web clients. This is true if you are using ELB in "http-mode".
This also explains why you can cut the front-end servers by 20%: each request is handled more efficiently (with a fixed number of workers per server, lower latency means higher throughput). Also, connection reuse is more efficient, as the set of servers in the ELB pool is much smaller than the set of web clients.
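
To put a rough number on "lower latency equals higher throughput", here is a minimal sketch using Little's Law (concurrency = throughput x latency). The 200ms and 3ms figures are the ones quoted above; `WORKERS` and `max_throughput` are hypothetical, made up purely for illustration and not taken from your setup.

```python
def max_throughput(workers: int, latency_s: float) -> float:
    """Upper bound on requests/second for a fixed pool of workers.

    Little's Law: concurrency = throughput * latency,
    so throughput = concurrency / latency.
    """
    return workers / latency_s


# Assumption: each front-end server has 100 worker slots (hypothetical value).
WORKERS = 100

slow = max_throughput(WORKERS, 0.200)  # workers held ~200ms by slow clients
fast = max_throughput(WORKERS, 0.003)  # workers held ~3ms when ELB buffers

print(f"{slow:.0f} req/s per server at 200ms")
print(f"{fast:.0f} req/s per server at 3ms")
```

This is only the ceiling imposed by worker occupancy. My guess is that CPU, memory or the backend become the bottleneck long before that ceiling is reached, which would explain why the saving is 20% of the servers rather than something much larger.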