> For example, loading data from an SSD is more than 3 orders of magnitude (more than 1000x) slower than referencing main memory, and a disk seek on a spinning disk is 5 orders of magnitude (100,000x) slower than referencing data that is in memory. The above latencies should make it clear that there is a huge performance advantage to minimizing disk access.<p>This is the first explanation I've seen that directly links disk latency and data architecture decisions. I always felt it intuitively but never did the math.
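The math is straightforward to check against the commonly cited rough latency figures (the "latency numbers every programmer should know" list); exact values vary by hardware, so treat these as order-of-magnitude estimates:

```python
# Rough, commonly cited latencies in nanoseconds (hardware-dependent).
main_memory_ref_ns = 100        # ~100 ns: main memory reference
ssd_random_read_ns = 150_000    # ~150 us: SSD random read
disk_seek_ns = 10_000_000       # ~10 ms: spinning-disk seek

# SSD vs. memory: more than 3 orders of magnitude (>1000x).
print(ssd_random_read_ns / main_memory_ref_ns)   # 1500.0

# Disk seek vs. memory: 5 orders of magnitude (100,000x).
print(disk_seek_ns / main_memory_ref_ns)         # 100000.0
```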
The advantage of column-oriented OLAP systems over row-oriented OLTP systems on analytical queries is clearly visible in this benchmark:
<a href="https://benchmark.clickhouse.com/" rel="nofollow">https://benchmark.clickhouse.com/</a><p>If we take an extreme case and compare ClickHouse, one of the most thoroughly optimized column-oriented DBMSs, with Postgres, the difference is more than 100x on average.
> The column-oriented storage format used by data warehouses allows them to efficiently leverage modern SIMD computer architectures for columnar-vectorized processing.<p>I find it interesting how vectorized processing engines such as DuckDB and Databricks' Photon try to combine row-oriented and column-oriented strengths.
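The layout difference the quote is getting at can be sketched in plain Python (a toy illustration, not how any real engine is implemented): in a row-oriented layout, scanning one column drags every field of every record through memory, while a columnar layout keeps each column in one contiguous buffer, which is exactly what lets a real engine apply SIMD instructions to it.

```python
from array import array

# Row-oriented layout: one tuple per record. Summing column "b"
# still touches the whole record, including fields we don't need.
rows = [(i, i * 2.0, "padding") for i in range(1000)]
row_sum = sum(r[1] for r in rows)

# Column-oriented layout: column "b" is a single contiguous buffer
# of doubles. A scan reads only these bytes, and the contiguity is
# what makes columnar-vectorized (SIMD) processing possible.
col_b = array("d", (i * 2.0 for i in range(1000)))
col_sum = sum(col_b)

assert row_sum == col_sum  # same answer, very different memory traffic
print(col_sum)  # 999000.0
```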