Show HN: Time Series Benchmark TurboPFor, TurboFloat, TurboFloat LzX, TurboGorilla

3 points by powturbo almost 2 years ago

1 comment

powturbo almost 2 years ago

Gorilla [2] and Gorilla-based algorithms [3] are simply overrated.

Store these values as 32-bit floats instead of 64-bit and you get an instant 50% reduction without any compression.

This is valid for almost all time series data.

Most time series databases (e.g. DuckDB) store floating-point data as 64 bits.

They report extraordinary compression ratios by using a Gorilla/Chimp-like algorithm.

However, as shown in this benchmark, a lot of time series data (e.g. temperature, climate data, stocks, ...) doesn't have more than 1 or 2 fixed decimal digits and can be stored losslessly in 16/32-bit integers.

Integer compression [1] algorithms can then be used, which gives a significant compression ratio and is several times faster than the Gorilla-like algorithms.

TurboGorilla, the fastest Gorilla (or Chimp) based algorithm in C, cannot exceed 1 GB/s in decompression, whereas TurboPFor is on the order of 10 GB/s and TurboBitByte is >100 GB/s.

[1] https://github.com/powturbo/TurboPFor-Integer-Compression

[2] https://www.vldb.org/pvldb/vol8/p1816-teller.pdf

[3] https://www.vldb.org/pvldb/vol15/p3058-liakos.pdf
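To make the fixed-decimal point concrete, here is a minimal Python sketch of the idea: scale values with one decimal digit to integers, check that the round trip is lossless, and delta-encode them so an integer codec would see small numbers. The helper names (`to_scaled_ints`, `from_scaled_ints`, `delta_encode`) and the sample temperatures are made up for illustration; this is not the TurboPFor API and not the benchmark's code, just the general technique under the assumption that the data really has a fixed number of decimals.

```python
import struct

def to_scaled_ints(values, decimals):
    """Scale each float by 10**decimals and round to the nearest integer."""
    scale = 10 ** decimals
    return [round(v * scale) for v in values]

def from_scaled_ints(ints, decimals):
    """Invert to_scaled_ints by dividing by the same power of ten."""
    scale = 10 ** decimals
    return [i / scale for i in ints]

def delta_encode(ints):
    """Replace each value by its difference from the previous one (first vs 0)."""
    out, prev = [], 0
    for i in ints:
        out.append(i - prev)
        prev = i
    return out

# Hypothetical temperature readings with one decimal digit.
temps = [21.5, 21.6, 21.6, 21.4, 21.3, 21.7]

scaled = to_scaled_ints(temps, decimals=1)
deltas = delta_encode(scaled)

# The round trip must reproduce the original doubles exactly,
# otherwise the integer representation would not be lossless.
assert from_scaled_ints(scaled, decimals=1) == temps

# Raw storage cost before any compression: 8 bytes per float64,
# 2 bytes per int16 (the scaled values here fit in 16 bits).
as_float64 = struct.pack(f"<{len(temps)}d", *temps)
as_int16 = struct.pack(f"<{len(scaled)}h", *scaled)

print(scaled)
print(deltas)
print(len(as_float64), len(as_int16))
```

The deltas are what a bit-packing or PFor-style integer codec would then compress; that step is left out here because it depends on the library used. If the round-trip assertion fails on real data, the values are not fixed-decimal at the chosen precision and this conversion would be lossy.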