SQLite Archive Files (2018)

166 points by alfonsodev over 3 years ago

11 comments

lifthrasiir over 3 years ago
I'm generally supportive of SQLite's NIH syndrome (normally a bad thing, but it can work when the trade-off is well researched and the resulting product is of high quality), but this one is not. Specifically, sqlar is a *worse* replacement for ZIP.

It lacks pretty much every feature of modern compressed archive formats: filesystem and custom metadata beyond a simple st_mode, solid compression, metadata compression, encryption, integrity checks, and so on. Therefore it can only be legitimately compared with ZIP, which does support custom metadata, very bad encryption, and a partial integrity check (via zlib), and only lacks a guaranteed encoding for file names. Even ignoring other formats, it is not without problems: for example, the compression mode (DEFLATE vs. uncompressed) is implicitly indicated by `sz = length(data)`, and I don't think that is a good idea. If I were designing sqlar and didn't want to spend an additional field, I would instead have set sz to something negative so that it never collides with the compressed case (of course, given the chance, I would just add a separate field). Pretty disappointing given the other tools in the SQLite ecosystem.
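To make the `sz = length(data)` convention discussed above concrete, here is a minimal Python sketch. It assumes the five-column `sqlar` table layout described in the SQLite Archive documentation and zlib-wrapped DEFLATE for compressed entries; `add_file` and `storage_modes` are hypothetical helper names used only for illustration, not part of any SQLite tool.

```python
import sqlite3
import zlib

# Table layout as described in the SQLite Archive documentation.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sqlar(
  name TEXT PRIMARY KEY,  -- file name
  mode INT,               -- access permissions
  mtime INT,              -- last modification time
  sz INT,                 -- original (uncompressed) size
  data BLOB               -- content, compressed or not
);
"""

def add_file(db, name, content, mode=0o100644, mtime=0):
    """Hypothetical helper: keep the deflated form only if it is strictly smaller."""
    packed = zlib.compress(content)
    blob = packed if len(packed) < len(content) else content
    db.execute(
        "INSERT OR REPLACE INTO sqlar VALUES (?, ?, ?, ?, ?)",
        (name, mode, mtime, len(content), blob),
    )

def storage_modes(db):
    """Hypothetical helper: classify entries using only the sz/length(data) test."""
    rows = db.execute("SELECT name, sz = length(data) FROM sqlar ORDER BY name")
    return {name: ("stored" if same else "deflated") for name, same in rows}

db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
add_file(db, "zeros.bin", b"\x00" * 4096)  # highly compressible
add_file(db, "tiny.txt", b"hi")            # too short to benefit from DEFLATE
modes = storage_modes(db)
```

In this sketch the mode is never written anywhere; a reader has to infer it from the size comparison, which is the implicit signalling the comment objects to.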
kybernetikos over 3 years ago
Given how crazy the zip file format is, and the claim that SQLite is faster than the filesystem for small files (https://www.sqlite.org/fasterthanfs.html), this seems pretty reasonable to me.

In particular, development repositories with many small source files often have horrendously slow copying/deleting behaviour (particularly on Windows), even on fast disks. I wonder if SQLite archive files would be a better way to store them.
ComputerGuru over 3 years ago
I think the title could use a (2018) appended to it, just so no one thinks this is a new thing that will be pushed on them or something.
m_ke over 3 years ago
Would love something like that for storing large image datasets for computer vision. Storing embeddings, predictions and metadata in a contiguous format with compression support, ANN indexing support and SQL would be amazing.
parhamn over 3 years ago
Very cool idea! I'm a bit torn on whether the format should concern itself with compression. It seems like a useful general blob container strategy; might it be prudent to leave compression to the consumer?
cxr over 3 years ago
For an honest assessment, difficulty of implementation (and, accordingly, lack of diversity in implementations) should be listed in the "Disadvantages" section.

(It's interesting that applications against censorship are brought up. Difficulty of implementation has consequences here, too. In order to effectively use SQLite as an archive format, the receiving end will need the SQLite software. By comparison, it's pretty trivial to craft a polyglot file that is both plain text and HTML, is self-extracting, and assumes no software on the other end beyond the ubiquity of commodity web browsers. Always bet on text.)
noxer over 3 years ago
I wish they would build compression directly into SQLite. I use SQLite as a log store mostly dumping JSON data in it. Due to the lack of compression the DB is probably 10 times the size it could be.
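Until something like that exists, compression can be done in the application before the data reaches SQLite. The following is only a rough Python sketch of that workaround, assuming a hypothetical `log` table with a single BLOB column; `append` and `fetch` are illustrative names, and zlib stands in for whatever codec one would actually pick.

```python
import json
import sqlite3
import zlib

db = sqlite3.connect(":memory:")
# Hypothetical log table: one compressed JSON document per row.
db.execute("CREATE TABLE log(id INTEGER PRIMARY KEY, body BLOB)")

def append(db, record):
    """Serialize a record to compact JSON, deflate it, and store it as a BLOB."""
    raw = json.dumps(record, separators=(",", ":")).encode("utf-8")
    cur = db.execute("INSERT INTO log(body) VALUES (?)", (zlib.compress(raw),))
    return cur.lastrowid

def fetch(db, row_id):
    """Load one row and reverse the steps taken in append()."""
    (blob,) = db.execute(
        "SELECT body FROM log WHERE id = ?", (row_id,)
    ).fetchone()
    return json.loads(zlib.decompress(blob))

record = {"level": "info", "msg": "request served", "tags": ["web"] * 50}
row_id = append(db, record)
restored = fetch(db, row_id)
```

The trade-off is that the compressed column is opaque to SQL, so any field that needs to be queried or indexed would have to be stored in its own column as well.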
edwintorok over 3 years ago
It would be interesting if this were expanded to support more compression formats, e.g. zstd. Gzip is quite an old format, and zstd is a lot quicker to decompress.
say_it_as_it_is over 3 years ago
S3Lite?
PostThisTooFast over 3 years ago
I don't see why this is better than simply implementing a table like this yourself.
reacweb over 3 years ago
When I read "If the input X is incompressible, then a copy of X is returned", I worry that this is broken. If I archive a file and then extract it from the archive, I cannot be sure of obtaining the same file. If the file is already compressed at the beginning, it will be decompressed at the end.

Maybe I am wrong. I didn't know this tool. My brief review of the documentation leads me to believe that it has an obvious problem.
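One way to check this worry is to model the two documented SQL functions, `sqlar_compress(X)` and `sqlar_uncompress(Y, SZ)`, in a few lines. The Python functions below are a re-implementation written for illustration under the assumption that the extension uses zlib; they are not the extension's actual code. The key point is that the original size is stored alongside the data, so extraction only inflates when the stored blob is shorter than that size.

```python
import gzip
import zlib

def sqlar_compress(x: bytes) -> bytes:
    """Illustrative model: return the deflated form only if it is smaller."""
    packed = zlib.compress(x)
    return packed if len(packed) < len(x) else x

def sqlar_uncompress(y: bytes, sz: int) -> bytes:
    """Illustrative model: a blob as long as the original size was stored as-is."""
    return y if sz == len(y) else zlib.decompress(y)

text = b"hello " * 1000
already_gz = gzip.compress(text)  # a file that was compressed before archiving

# Archive, then extract, passing the recorded original size each time.
roundtrip_text = sqlar_uncompress(sqlar_compress(text), len(text))
roundtrip_gz = sqlar_uncompress(sqlar_compress(already_gz), len(already_gz))
```

Under these assumptions the pre-compressed file comes back byte for byte as the same gzip stream; nothing decompresses it, because the archive only ever undoes its own compression step.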