Hey HN, I was wondering if anyone has come across a good paper on scaling horizontally (sharding), ideally a survey of the strategies used in the field. I am a beginner trying to build a scalable image store where retrieval has to be fast, writes can be slow, and data integrity matters.
It is pretty straightforward to do ... the only suggestion I can offer is to pick the shard with a random hash of the key instead of something silly like a key prefix. The random hash gives you better balancing between the shards, because keys that share a prefix no longer pile up on the same shard. Also check out the recent issue of Communications of the ACM. A dude from Joyent wrote about the issues in scaling an image transcoding system.
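To make the hash-vs-prefix point concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the shard count, the `img_NNNNNN.jpg` key format, the choice of SHA-256, and the function names `shard_by_hash` and `shard_by_prefix` are all made up, not taken from any particular system.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # hypothetical shard count, chosen only for the demo


def shard_by_hash(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Pick a shard from a uniform hash of the whole key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then reduce it
    # to a shard index. The same key always maps to the same shard.
    return int.from_bytes(digest[:8], "big") % num_shards


def shard_by_prefix(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Pick a shard from the first character of the key (the 'silly' way)."""
    return ord(key[0]) % num_shards


# Hypothetical image keys that all share the same prefix.
keys = [f"img_{i:06d}.jpg" for i in range(10_000)]

hash_counts = Counter(shard_by_hash(k) for k in keys)
prefix_counts = Counter(shard_by_prefix(k) for k in keys)

print("hash:  ", sorted(hash_counts.items()))
print("prefix:", sorted(prefix_counts.items()))
```

With these keys the prefix scheme sends every image to a single shard, since they all start with the same character, while the hash scheme spreads them roughly evenly. One thing to keep in mind: with a plain modulo like this, changing the number of shards remaps most keys, so it is worth deciding early how you will handle adding shards.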