I find it a bit disturbing how this post reaches the top of HN. But I suppose I shouldn't be surprised.<p>I probably live in my own little bubble, but only lately have I realized that NoSQL has two audiences: (1) People for whom normalization can't work because of their application's characteristics and the limitations of current hardware. (2) People who just don't understand basic relational concepts, or why they were invented in the first place.<p>It's kinda sad. I've consulted on projects where people implemented sharding before adding any indices to MySQL.<p>The thing about being in group (1) is that you can also recognize when the ground shifts beneath your feet. Artur Bergman is one of those guys.<p><a href="http://www.youtube.com/watch?v=H7PJ1oeEyGg&feature=youtube_gdata_player" rel="nofollow">http://www.youtube.com/watch?v=H7PJ1oeEyGg&feature=youtu...</a>
This is one of the cleanest explanations of normalization I have come across. I will use this with a few high school students I am working with, and I expect it will be pretty easy for them to understand.
So are we supposed to be designing all our databases to conform to third-normal form? I am not very adept at DB stuff but doesn't that increase the number of lookups needed to retrieve a row of useful data? Is the performance hit from that less painful than storing redundant (perhaps simply for caching) fields in one table?
Very nice article, but I did not like the example data he was using.<p>The problem is that both population and tournament city could be dependent on year.<p>Population (obviously) changes from year to year, so either his original data is incorrect or he's recording a city's latest population along with all historic tennis championship winners -- neither interpretation makes a lot of sense. Why not use something unlikely to change, such as elevation or country?<p>Likewise, tournament city is not a fixed value for a tournament. For example, the Australian Open, which he uses as one of his examples, has also been held in Sydney, Adelaide, Brisbane, Perth, Christchurch (NZ), and Hastings (NZ).<p>To resolve this, he would either have to introduce a two-column primary key (tournament, year), or pick some simpler data. I suggest the latter.<p>Even with all of this criticism, I think it's one of the cleanest introductions to the normal forms for beginners.
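For anyone wondering what the two-column primary key option might look like, here is a minimal sketch using Python's built-in sqlite3 module. The table name `tournament_edition`, the helper names, and the sample rows ("Example Open", "City A", "Player X") are all made up for illustration and are not taken from the article.

```python
import sqlite3


def make_db():
    """Create an in-memory database keyed on (tournament, year)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE tournament_edition (
            tournament TEXT    NOT NULL,
            year       INTEGER NOT NULL,
            city       TEXT    NOT NULL,
            winner     TEXT    NOT NULL,
            PRIMARY KEY (tournament, year)
        )
        """
    )
    return conn


def add_edition(conn, tournament, year, city, winner):
    """Insert one edition; return False if that (tournament, year) already exists."""
    try:
        conn.execute(
            "INSERT INTO tournament_edition VALUES (?, ?, ?, ?)",
            (tournament, year, city, winner),
        )
        return True
    except sqlite3.IntegrityError:
        # The composite primary key rejects a second row for the same edition.
        return False


conn = make_db()
add_edition(conn, "Example Open", 2001, "City A", "Player X")
# Same tournament, different year: the host city is free to change.
add_edition(conn, "Example Open", 2002, "City B", "Player Y")
```

With this key, city depends on the whole (tournament, year) pair rather than on the tournament alone, which is the dependency the comment above is pointing at.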
Very clear. I found that the most comprehensive explanation of normalization was in an Apress book, <i>Applied Mathematics for Database Professionals</i>.<p>Incidentally, did you cover 4NF and 5NF in the course? IMO, you'll almost never need 6NF.
This is a good summary/introduction. Would it be improved with examples of insertion/deletion/modification anomalies using the context of your data example? This seems like a loose end to me at the moment.
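As a rough illustration of the three anomalies, here is a sketch against a deliberately unnormalized table, again with Python's sqlite3. The flat `winners` layout, the function names, and every value in it are hypothetical stand-ins, not the article's data.

```python
import sqlite3


def make_flat_table():
    """One flat table that repeats each city's population on every row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE winners (
            tournament      TEXT    NOT NULL,
            year            INTEGER NOT NULL,
            winner          TEXT,
            city            TEXT,
            city_population INTEGER,
            PRIMARY KEY (tournament, year)
        )
        """
    )
    conn.executemany(
        "INSERT INTO winners VALUES (?, ?, ?, ?, ?)",
        [
            ("Open A", 2001, "Player X", "City A", 1000),
            ("Open A", 2002, "Player Y", "City A", 1000),
            ("Open B", 2001, "Player X", "City B", 500),
        ],
    )
    return conn


def populations(conn, city):
    """Return the sorted distinct populations recorded for a city."""
    cur = conn.execute(
        "SELECT DISTINCT city_population FROM winners WHERE city = ?", (city,)
    )
    return sorted(row[0] for row in cur)


def show_anomalies():
    conn = make_flat_table()
    result = {}

    # Modification anomaly: updating one copy leaves the other copy stale.
    conn.execute(
        "UPDATE winners SET city_population = 1100 "
        "WHERE tournament = 'Open A' AND year = 2002"
    )
    result["modification"] = populations(conn, "City A")

    # Deletion anomaly: removing City B's only row also loses its population.
    conn.execute("DELETE FROM winners WHERE tournament = 'Open B' AND year = 2001")
    result["deletion"] = populations(conn, "City B")

    # Insertion anomaly: a city with no tournament yet cannot be stored,
    # because the key columns may not be NULL.
    try:
        conn.execute("INSERT INTO winners VALUES (NULL, NULL, NULL, 'City C', 300)")
        result["insertion"] = "accepted"
    except sqlite3.IntegrityError:
        result["insertion"] = "rejected"
    return result
```

Moving city and population into their own table keyed on city would store each population once, so all three problems go away in this toy case.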