I don't really understand why inference speed for tiny YOLO models is still reported on giant datacenter GPUs. I guess it helps to show the improvement over the previous iteration, but under what circumstances would I realistically want to run a tiny model on a P100 at 1600 fps? I'd much rather know how fast it runs on, say, a Jetson or even a Raspberry Pi CPU, since that's where these models actually get deployed.