So at one point we were doing scale testing for our product, where we needed to simulate systems running our software connected back to a central point. The idea was to run as many docker containers as we could on a server with 2x24 cores and 512GB of RAM. The RAM needed for each container was very small. No matter what, the system would start to break at around ~1000 containers (this was 4 years ago). After many hours of the normal debugging we did not see anything on the network stack or Linux limits side that we had not already tweaked (so we thought). So out comes strace! Bingo! We found out that the system could not handle the ARP cache with so many endpoints. Playing with net.ipv4.neigh.default.gc_interval and the settings associated with it got us up to 2500+ containers.
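For reference, those knobs live under /proc/sys/net/ipv4/neigh/default: gc_thresh1/2/3 cap the neighbour (ARP) table and gc_interval controls how often the garbage collector runs. A minimal sketch of raising them at runtime follows; the values are illustrative assumptions, not tuned recommendations, and writing these files requires root.

    #!/usr/bin/env python3
    """Sketch: raise the IPv4 neighbour (ARP) table limits so thousands of
    container endpoints don't overflow the cache. Values are illustrative."""
    import pathlib

    NEIGH = pathlib.Path("/proc/sys/net/ipv4/neigh/default")

    SETTINGS = {
        "gc_thresh1": 4096,   # entries below this are never garbage-collected
        "gc_thresh2": 8192,   # soft limit; GC gets more aggressive above it
        "gc_thresh3": 16384,  # hard cap on neighbour table entries
        "gc_interval": 60,    # seconds between GC runs
    }

    for name, value in SETTINGS.items():
        path = NEIGH / name
        old = path.read_text().strip()
        path.write_text(f"{value}\n")
        print(f"{name}: {old} -> {value}")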
Is there any talk of increasing these defaults on higher-memory systems? The low defaults feel like footguns that people stumble into rather than something needed for optimal performance.
The big bottleneck we had with docker containers per host was not sustained peak but simultaneous start. This was with Docker 1.6-1.8, but we’d see containers failing to start if more than 10 or so (sometimes as low as 2!) were started at the same time.

Hopefully rootless docker completely eliminates the races by removing the kernel resource contention.
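One way to work around that kind of race is to stagger the launches instead of firing them all at once. A rough sketch, with the batch size, delay, image, and container names all being placeholder assumptions:

    #!/usr/bin/env python3
    """Sketch: start containers in small batches to avoid a thundering herd
    of simultaneous starts. All values here are placeholders."""
    import subprocess
    import time

    IMAGE = "alpine"   # placeholder image
    TOTAL = 100        # containers to launch overall
    BATCH = 5          # keep concurrent starts low
    DELAY = 2.0        # seconds to wait between batches

    for start in range(0, TOTAL, BATCH):
        for i in range(start, min(start + BATCH, TOTAL)):
            # Detached, long-running no-op so the container stays up.
            subprocess.run(
                ["docker", "run", "-d", "--name", f"scale-test-{i}",
                 IMAGE, "sleep", "infinity"],
                check=True,
            )
        time.sleep(DELAY)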
"Access was initially fronted by nginx with consul-template generating the config. When it did not scale anymore nginx was replaced by Traefik."<p>Wonder why Nginx didn't scale.
"With /proc/sys/kernel/pid_max defaulting to 32768 we actually ran out of PIDs. We increased that limit vastly, probably way beyond what we currently need, to 500000. Actuall limit on 64bit systems is 222"<p>Time to start thinking about 128bit systems!