Best Practices for Data Modeling

44 points by mjirv over 5 years ago

4 comments

zackmorris over 5 years ago
From the article:

> In general, when building a data model for end users you're going to want to materialize as much as possible. This often means denormalizing as much as possible so that, instead of having a star schema where joins are performed on the fly, you have a few really wide tables (many many columns) with all of the relevant information for a given object available.

Where "materialization" means whether a relation is created as a view (rather than a table).

I'm not sure I agree with this. Personally, I think it's better to primarily normalize and write all queries as close to functional programming as possible. Which means opting for computation on the fly during development and relying on the underlying database infrastructure to optimize internally. Then denormalize or dupe data via views and caching after bottlenecks have been identified. Otherwise you're fighting premature optimization during development.

That said, there could be a place for materialization in log-structured storage (LSS):

https://jvns.ca/blog/2017/06/11/log-structured-storage/

I'm thinking that the future of all of this could be LSS on distributed consensus algorithms like RAFT, then devoting some amount of cache to building out materialized relationships that can always be derived from the log. Then we could have our cake and eat it too with fast durable writes and low-overhead reads without joins.

If anyone has an example of this, I'd love to see it.
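To make the trade-off in this comment concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The schema (`users`, `orders`), the view name `user_totals_view`, and the wide table `user_totals_wide` are all hypothetical names chosen for illustration; they do not come from the article. The view is computed on the fly from normalized tables, while the wide table is a snapshot that has to be rebuilt to pick up new rows.

```python
import sqlite3


def build_demo() -> sqlite3.Connection:
    """Create a small normalized schema plus a view computed on the fly."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical normalized tables (names are assumptions).
        CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            order_id INTEGER PRIMARY KEY,
            user_id  INTEGER REFERENCES users(user_id),
            amount   REAL
        );
        INSERT INTO users VALUES (1, 'ada'), (2, 'bob');
        INSERT INTO orders VALUES (10, 1, 5.0), (11, 1, 7.5), (12, 2, 3.0);

        -- The join and aggregation run every time the view is queried.
        CREATE VIEW user_totals_view AS
            SELECT u.user_id,
                   u.name,
                   COUNT(o.order_id) AS order_count,
                   SUM(o.amount)     AS total_amount
            FROM users u
            LEFT JOIN orders o ON o.user_id = u.user_id
            GROUP BY u.user_id, u.name;
        """
    )
    return conn


def materialize(conn: sqlite3.Connection) -> None:
    """Snapshot the view into a denormalized table (a manual 'materialization')."""
    conn.executescript(
        """
        DROP TABLE IF EXISTS user_totals_wide;
        CREATE TABLE user_totals_wide AS SELECT * FROM user_totals_view;
        """
    )


if __name__ == "__main__":
    conn = build_demo()
    materialize(conn)
    conn.execute("INSERT INTO orders VALUES (13, 2, 4.0)")
    query = "SELECT * FROM {} ORDER BY user_id"
    print(conn.execute(query.format("user_totals_view")).fetchall())  # reflects the new order
    print(conn.execute(query.format("user_totals_wide")).fetchall())  # stale until rebuilt
```

Starting from the view and only calling something like `materialize` once a bottleneck shows up is one way to follow the "normalize first, denormalize later" approach described above; the wide table can always be dropped and rederived from the normalized data.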
coward12345678 over 5 years ago
This article has more fluff in it than Ron Jeremy's condo circa 1995.
rumanator over 5 years ago
Does anyone know if there are any resources on how to design data warehouses for non-relational data such as images, both unprocessed and processed?
throwaway35784 over 5 years ago
> IDs should get an _id suffix, and primary keys should be called $OBJECT_id (e.g., order_id, user_id, subscription_id, order_item_name_id).

Wrong. Be consistent. All IDs are named ID or id. Consistency is key.

Most of the article is filler and style choices.
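For readers weighing the two naming conventions, the sketch below shows how each one looks in a join, again using Python's `sqlite3` module. The tables and column names are hypothetical and exist only for illustration. With `$OBJECT_id` primary keys the join column has the same name on both sides, so `USING (user_id)` works; with a bare `id` primary key the join needs an explicit `ON` clause and table aliases.

```python
import sqlite3


def join_with_prefixed_ids() -> list:
    """Primary keys named $OBJECT_id: the join column matches on both sides."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users  (user_id  INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL);
        INSERT INTO users  VALUES (1, 'ada');
        INSERT INTO orders VALUES (10, 1, 5.0);
        """
    )
    return conn.execute(
        "SELECT order_id, name FROM orders JOIN users USING (user_id)"
    ).fetchall()


def join_with_bare_ids() -> list:
    """Primary keys all named id: the join must qualify each column."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL);
        INSERT INTO users  VALUES (1, 'ada');
        INSERT INTO orders VALUES (10, 1, 5.0);
        """
    )
    return conn.execute(
        "SELECT o.id, u.name FROM orders o JOIN users u ON o.user_id = u.id"
    ).fetchall()


if __name__ == "__main__":
    print(join_with_prefixed_ids())
    print(join_with_bare_ids())
```

Both queries return the same rows, so the choice is largely about readability and consistency, which is the point being argued in this comment.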