For a recent reference on AI and machine learning for network engineers, please check this book by Javier Antich [1].<p>Please also check the review here [2]. For what it's worth, the book is listed in "10 Books Every Network Engineer Should Read" [3].<p>[1] Machine Learning for Network and Cloud Engineers: Get ready for the next Era of Network Automation:<p><a href="https://www.goodreads.com/book/show/101180344-machine-learning-for-network-and-cloud-engineers" rel="nofollow">https://www.goodreads.com/book/show/101180344-machine-learni...</a><p>[2] MUST READ: Machine Learning for Network and Cloud Engineers:<p><a href="https://blog.ipspace.net/2023/02/machine-learning-network-cloud/" rel="nofollow">https://blog.ipspace.net/2023/02/machine-learning-network-cl...</a><p>[3] 10 Books Every Network Engineer Should Read:<p><a href="https://networkphil.com/2024/05/21/10-books-every-network-en...</a>
This is not "AI for network engineers" but rather "Network engineering for AI datacenters". I was expecting to read that a small neural network could be used to direct traffic.
"Though BGP supports the traditional Flow-based Layer 3 Equal Cost Multi-Pathing (ECMP) traffic load balancing method, it is not the best fit for a RoCEv2-based AI backend network. This is because GPU-to-GPU communication creates massive elephant flows, which RDMA-capable NICs transmit at line rate. These flows can easily cause congestion in the backend network."
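To make the quoted point concrete, here is a rough sketch of why per-flow ECMP hashing can struggle with a handful of elephant flows. Everything in it is an assumption for illustration: the hash (SHA-256 over the 5-tuple), the addresses, the source ports, the 400 Gbps per-flow rate, and the helper names `ecmp_pick` and `link_load` are all hypothetical, and real switches use vendor-specific hash functions. The only fixed detail is that RoCEv2 runs over UDP with destination port 4791.

```python
import hashlib
from collections import Counter

ROCEV2_UDP_PORT = 4791  # well-known UDP destination port for RoCEv2


def ecmp_pick(five_tuple, n_links):
    """Map a flow's 5-tuple to one of n_links uplinks.

    Hypothetical stand-in for a switch's ECMP hash: every packet of a
    flow hashes to the same uplink, regardless of the flow's size.
    """
    key = "|".join(str(field) for field in five_tuple).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_links


def link_load(flows, n_links):
    """Sum offered load per uplink; flows is a list of (five_tuple, gbps)."""
    load = Counter({link: 0 for link in range(n_links)})
    for five_tuple, gbps in flows:
        load[ecmp_pick(five_tuple, n_links)] += gbps
    return dict(load)


# Four line-rate GPU-to-GPU flows (assumed 400 Gbps each) over four uplinks.
# Fields: src IP, dst IP, IP protocol (17 = UDP), src port, dst port.
flows = [
    ((f"10.0.0.{i}", f"10.0.1.{i}", 17, 49152 + i, ROCEV2_UDP_PORT), 400)
    for i in range(1, 5)
]

load = link_load(flows, n_links=4)
for link, gbps in sorted(load.items()):
    print(f"uplink {link}: {gbps} Gbps offered")
```

With thousands of small flows the hash tends to even out, but with only a few flows there is a good chance that two of them land on the same uplink while another uplink sits idle. Since each flow already runs at line rate, any such collision oversubscribes that link, which is the congestion the quote describes.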