Social media are typically quite heavy on tree traversals. That kind of pattern is very similar to trying to resolve a deep ORM query or a deep GraphQL query which also doesn't map very well on 'traditional' relational databases <a href="https://en.wikipedia.org/wiki/Object%E2%80%93relational_impedance_mismatch" rel="nofollow">https://en.wikipedia.org/wiki/Object%E2%80%93relational_impe...</a>. I believe this 'issue' depends on:
A) the type of join
B) whether your relational databases flattens between consecutive joins.
C) is there easy/efficient pagination on multiple levels<p>The type of join shouldn't be a problem, SQL engines should in most cases be able to determine the best join. In the cases it can't you can go start tweaking (although tricky to get right, especially if your data evolves, it's possible, you probably want to fix your query plan). B is however tricky and a performance loss since it's really a bit silly that data is flattened into a set each time to be then (probably) put into a nested (Object-Oriented or JSON) format to provide the data to the client. This is closely related to C, in a social graph you might have nodes (popular people or tweets) who have a much higher amount of links than others. That means if you do a regular join on tweets and comments and sort it, on the tweet you might not get beyond the first person. Instead, you probably only want the first x comments. That query might result in an amount of nested groups. So it might look more like the following SQL (wrote it by heart, probably not correct):<p>SELECT
tweet.*,
jsonb_agg(to_jsonb(comment)) ->> 0 as comments,
FROM tweet
JOIN comment ON tweet.id = comment.tweet_id<p>GROUP BY tweet.id
HAVING COUNT(comment.tweet_id) < 64
LIMIT 64<p>That obviously becomes increasingly complex if you want a feed with comments, likes, retweets, people, etc.. all in one.
There are reasons why two engineers that helped to scale twitter create a new database (<a href="https://fauna.com/" rel="nofollow">https://fauna.com/</a>) where I work. Although relational, the relations are done very differently. Instead of flattening sets, you would essentially walk the tree and on each level join. I did an attempt to explain that here for the GraphQL case: <a href="https://www.infoworld.com/article/3575530/understanding-graphql-engine-implementations.html?page=2" rel="nofollow">https://www.infoworld.com/article/3575530/understanding-grap...</a><p>TLDR, in my opinion you can definitely use a traditional relational database. But it might not be the most efficient choice due to the impedance mismatch. Relational applies to more than traditional SQL databases though, graph database or something like fauna is also relational and would be a better match (Fauna is similar in the sense that joins are very similar to how a graph database does these). Obviously I'm biased though since I work for Fauna.