Gosh, I must say there seems to be some misunderstanding of RDBMS concepts in some posts in this thread!<p>I was writing database systems professionally, back in the days before the RDBMS concept was even a thing. So here's my (enormously long and convoluted) 2 cents worth. Please be sure to pack a sandwich and get hydrated before you continue.<p>Say you were dealing with doctors and patients, and needed to store that information in a database. Back in the day, you'd typically use a so-called hierarchical database. To design one of those, you need to decide, what is the most common access method expected to be: getting the patients for a given doctor, or the doctors for a given patient? You'd design the schema accordingly. Then, the preferred access method was easy to code, and efficient to run. The "other" access method was still possible, but harder to code, and slower to run. The database schema depended on how you thought the users would access the data.<p>But that is absolutely what <i>NOT</i> what to do with an RDBMS. Certainly you look at the users' screens, reports, and so on - but that's just to determine what unique entities the system must handle - in this case, doctors and patients. Then you <i>ignore</i> how the users will access the data, and work out what are the inherent logical relationships between all the entities.<p>Your initial answer might be this. A doctor can have many patients, and a patient can have many doctors. As any competent relational designer will instantly know, this means you need a resolving table whose primary key is a composite key comprising the primary keys of the other two tables. So if Mary was a patient of Tom, you'd add Mary to the patients table (if not already there), Tom to the doctors table (ditto), then add a Mary/Tom record to the resolving table. By that means, a doctor could have any number of patients, a patient could have any number of doctors, and it's trivially easy to write simple, performant SQL to access that data however you want.<p>But then you'd have a ghastly thought: patients can also be doctors, and doctors can also be patients! Say Tom was also a patient of Mary! Now you need a Tom record in the patient's table, but that would inappropriately duplicate all his data from the doctors table! Something's clearly wrong. You'd soon see that from a data modelling viewpoint, you don't want doctors and patients as separate entities - you want a single entity Person, and a resolving table to relate arbitrary people in specified ways.<p>So what?!!<p>So this. IMHO, many developers using relational databases have absolutely no idea about any of that. They design hopelessly unnormalised schemas, which then need reams of ghastly SQL to get anything out. The query planner can barely parse all that crap, let along optimise it. The database has to stop every five minutes to wet its brow and take a drink.<p>So here's my advice to any inexperienced relational database designers who have actually managed to get this far!! If you can answer the following questions off the top of your head, you're probably on the right track. If you <i>can't,</i> you're lacking basic knowledge that you need in order to use an RDBMS properly:<p>- what is a primary key?
- what is a foreign key?
- what is an important difference between primary keys and foreign keys?
- what is a composite key? When would you use one?
- what are the three main relations?
- what is normalisation?
- what is denormalization?
- what is a normal form?
and so on.<p>Just my 2c! :-)