A high-RAM server is (at least) an order of magnitude cheaper than a GPU compute server. Why aren't we seeing RAM-heavy servers running inference? Is it just that the RAM bandwidth isn't high enough, or is there another bottleneck that makes them unsuitable despite the cost savings?
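
For context, here's the rough back-of-envelope I'm working from: autoregressive decoding is memory-bound, since every model weight has to be streamed through the memory subsystem once per generated token, so single-stream throughput is capped near bandwidth divided by model size. The model size and bandwidth figures below are illustrative assumptions on my part, not benchmarks.

```python
# Back-of-envelope decode throughput for a memory-bound model:
# tokens/sec is capped at roughly bandwidth / model_size_in_bytes,
# because each token requires reading every weight once.
# All figures below are assumed for illustration, not measured.

GIB = 1024**3

def max_tokens_per_sec(model_bytes: float, bandwidth_bytes_per_sec: float) -> float:
    """Upper bound on single-stream decode throughput when memory-bound."""
    return bandwidth_bytes_per_sec / model_bytes

# Assumed: ~70B-parameter model quantized to 8 bits -> ~70 GiB of weights.
model_bytes = 70 * GIB

# Assumed peak bandwidths: a 12-channel DDR5 server vs. HBM3 on a GPU.
configs = {
    "DDR5 server, 12 channels (~460 GB/s, assumed)": 460 * GIB,
    "GPU with HBM3 (~3350 GB/s, assumed)": 3350 * GIB,
}

for name, bandwidth in configs.items():
    print(f"{name}: ~{max_tokens_per_sec(model_bytes, bandwidth):.1f} tokens/sec")
```

By that math the DDR5 box lands around single-digit tokens/sec on a 70B model while HBM is roughly an order of magnitude faster, which is what makes me suspect bandwidth is the whole story, but I'd like to know if there's something else.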