It's interesting how big a difference it makes when you allow local disk I/O and then have to schedule around it. At Google they disintermediate storage, and all (not really all) I/O goes to their cluster FS (Colossus, sort of). They don't have to schedule around I/O resources, because every process has access to the full I/O resources of the entire cluster at any time from any node. By contrast, as soon as you let some open source or off-the-shelf commercial thing leak into your operations, it will demand ordinary POSIX disk I/O, and then you've got big problems: local disk throughput becomes one more per-node resource that placement has to respect. I propose that some companies would actually be better off concentrating more on disintermediated storage and less on I/O workload scheduling.
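
To make the scheduling point concrete, here is a toy sketch of the placement check in both worlds. Everything in it is made up for illustration (the `Node` and `Task` fields, the numbers, the `feasible_nodes` helper); it is not how any real scheduler works, and it assumes the cluster FS has bandwidth to spare so that remote I/O never constrains placement.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_cpu: int
    free_disk_iops: int  # hypothetical: only relevant when tasks use local disk


@dataclass
class Task:
    name: str
    cpu: int
    disk_iops: int


def feasible_nodes(task, nodes, local_io):
    """Return the names of nodes that could host the task.

    With local_io=True the node's own disk is a placement constraint.
    With local_io=False we assume all I/O goes to cluster storage, so
    only CPU is checked (an assumption of this sketch).
    """
    return [
        n.name
        for n in nodes
        if n.free_cpu >= task.cpu
        and (not local_io or n.free_disk_iops >= task.disk_iops)
    ]


nodes = [Node("a", 8, 100), Node("b", 8, 5000), Node("c", 2, 5000)]
task = Task("indexer", cpu=4, disk_iops=2000)

local = feasible_nodes(task, nodes, local_io=True)
remote = feasible_nodes(task, nodes, local_io=False)
print("local disk I/O:", local)
print("cluster FS I/O:", remote)
```

With local disk in play, node "a" has the CPU but not the disk headroom, so the task has exactly one home. Take the disk out of the picture and the scheduler gets that node back without doing anything clever.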