If I read it correctly, the agent only controls one player at a time. On offense it controls the player with the ball, and on defense it probably controls the player closest to the ball. The other players are controlled by the built-in AI. Controlling a single player takes away much of the appeal of deep RL here: the idea that an entire team can learn to coordinate in novel and optimal ways.
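
To make my reading concrete, here is a rough Python sketch of that switching rule. The `Player` class and `controlled_player` function are names I made up for illustration, not the environment's actual API, and the closest-to-the-ball rule on defense is only my guess at what it does.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Player:
    """Hypothetical stand-in for a player on the agent's team."""
    name: str
    pos: Tuple[float, float]  # (x, y) position on the pitch


def controlled_player(
    team: List[Player],
    ball_pos: Tuple[float, float],
    ball_owner: Optional[Player] = None,
) -> Player:
    """Pick the single player the agent controls on this step.

    ball_owner is the teammate in possession, or None when the team
    does not have the ball (assumed here to mean "on defense").
    """
    if ball_owner is not None:
        # Offense: control whoever has the ball.
        return ball_owner
    # Defense (assumed rule): control the teammate nearest the ball.
    return min(team, key=lambda p: math.dist(p.pos, ball_pos))
```

Everyone not returned by this function would be left to the built-in AI, so the learned policy never gets to decide what the other teammates do.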