I'm trying to identify the bottleneck in my RabbitMQ consumer written in Node.js. For each job, the consumer performs two simple MySQL SELECT queries using TypeORM and then makes an API request using Axios.

My Setup
--
*The consumer, the RabbitMQ server, and the MySQL server all run in Docker containers on a single t3.large EC2 instance.* The tables are very small (fewer than 2,000 rows).

The issue
--

* The consumer can process only around 200 jobs per second. If I push 1,000 jobs into the queue, it takes at least 5 seconds to complete the 1,000th job.

Observations
---
* I measured the execution time of TypeORM's find() inside Node. It degrades as the number of jobs in the queue increases: with a single job in the queue, the query completes in around 10 ms, but with 1,000 jobs queued, the same call takes over 1,000 ms by the 1,000th job.

* To check whether the MySQL server itself was slowing the queries, I pulled query performance metrics from the MySQL server. They show a consistent query time of around 2-3 ms, with a recorded maximum of about 40 ms.

* The consumer container's CPU usage is near 100% while it processes the 1,000 jobs.

* If I set the RabbitMQ prefetch count to 10 or 20, the consumer behaves normally: the find() execution time stays stable and the overall job completion time also drops. But if I add more workers with the same prefetch count, the same problem reappears (find() execution time starts to degrade again).

Any idea where the real bottleneck is? Is the consumer simply unable to keep up under high load, and if so, why? Is it MySQL connection initiation? (I even increased max_connections on the MySQL server, but it made little difference.) Can anyone help?