Python – Writing large ZIP archives without memory inflation

107 points by sandes about 5 years ago

9 comments

mehrdadn about 5 years ago
It'd be better if they didn't require knowing the paths a priori. One of the fundamental strengths of ZIP is that the file list is at the end of the archive rather than the beginning, letting you dynamically discover and send contents in a streaming fashion. Forcing the list of files to be known a priori works against that.
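A minimal sketch of the point above about the file list living at the end of the archive, using Python's standard `zipfile` module rather than the library under discussion. The `discover_entries()` generator and its file names are hypothetical stand-ins for contents that only become known while the archive is being written.

```python
import io
import zipfile


def discover_entries():
    """Hypothetical source that yields (name, data) pairs one at a time."""
    for i in range(3):
        yield f"file_{i}.txt", f"contents of file {i}\n".encode()


buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:  # ZIP_STORED by default
    for name, data in discover_entries():
        # Each entry's local header and data are written as soon as it is known.
        zf.writestr(name, data)
# Leaving the `with` block writes the central directory (the file list).

raw = buf.getvalue()
first_local_header = raw.find(b"PK\x03\x04")         # local file header signature
central_directory = raw.find(b"PK\x01\x02")          # central directory entry signature
end_of_central_directory = raw.rfind(b"PK\x05\x06")  # end-of-central-directory signature
```

The three offsets show the ordering: entry data first, then the central directory, then the end record. Nothing in the format itself forces the writer to know the full list of paths before the first byte goes out.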
kummappp about 5 years ago
Please test your library with the c10-archive test suite: https://www.ee.oulu.fi/research/ouspg/PROTOS_Test-Suite_c10-archive
2bluesc about 5 years ago
I did something similar to stream large amounts of data off an embedded device as a ZIP file, using Falcon and a WSGI server, with no external dependencies for the actual ZIP stream.

Proof of concept: https://gist.github.com/1e22bbf31b7e5ae84bbdfa32c68e03a9
jonatron about 5 years ago
I remember needing this 5+ years ago; I used a branch of a fork of python-zipstream: https://github.com/longaccess/python-zipstream/tree/streaminput
m4rtink about 5 years ago
Yeah, I looked into this a while ago for generating photo gallery zip files on the fly. Currently I'm using lazygal to generate static HTML galleries, and if I want to give users the option to easily download the full gallery, it basically doubles the size I need to store, as lazygal simply creates a zip file and puts it next to the photos and HTML files in the static gallery folder it generates.

So I looked into creating the zip files on the fly when requested and streaming them to the client on the other end, without having to create temporary files and/or use a lot of memory on the server. I got it working in a prototype and even found some articles from others who got it working. It is not that hard, really.
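One way the on-the-fly approach described above could look with only the standard library. This is a sketch under assumptions, not the prototype the comment mentions: `ChunkSink` and `stream_zip` are hypothetical names, and the caller is assumed to pass (archive name, readable binary file) pairs. Because the sink has no `tell()` or `seek()`, `zipfile` treats it as a non-seekable stream and writes each entry's CRC and sizes after the data instead of seeking back.

```python
import io
import zipfile


class ChunkSink:
    """Write-only, non-seekable sink that holds bytes until they are drained."""

    def __init__(self):
        self._chunks = []

    def write(self, data):
        self._chunks.append(bytes(data))
        return len(data)

    def flush(self):
        pass

    def drain(self):
        chunks, self._chunks = self._chunks, []
        return chunks


def stream_zip(files, chunk_size=64 * 1024):
    """Yield the bytes of a ZIP archive while it is being built.

    `files` is an iterable of (archive_name, readable_binary_file) pairs.
    """
    sink = ChunkSink()
    with zipfile.ZipFile(sink, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, fileobj in files:
            with zf.open(name, "w") as entry:
                while True:
                    block = fileobj.read(chunk_size)
                    if not block:
                        break
                    entry.write(block)
                    yield from sink.drain()  # hand out whatever is ready
            yield from sink.drain()          # trailing data for this entry
    yield from sink.drain()                  # central directory, written on close
```

The generator can be returned as the body of a streaming HTTP response. Memory use stays near `chunk_size` plus the compressor's internal buffers, since no temporary file or full in-memory archive is created. For entries that may exceed 2 GiB, `zf.open(name, "w", force_zip64=True)` would be needed.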
tyingq about 5 years ago
The zip file format itself, and the compression algorithms like deflate, all seem set up to encourage low memory usage. The directory, for example, is at the end of the file.

Also, most places in a zipfile where you might not have enough context to write in a streaming fashion are predictable in size/position, so you can throw in a placeholder and seek() back to it later.

Kind of a shame someone has to specifically implement a low memory usage library. That implies other implementations went the lazy route.
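The placeholder-and-seek() idea from the comment above can be illustrated with a deliberately simplified record format. The 8-byte header below (CRC-32 followed by payload size) is a hypothetical layout chosen for brevity, not the actual ZIP local file header, which keeps its CRC-32 and size fields at fixed offsets in a similar spirit. `write_record` is likewise an illustrative name.

```python
import io
import struct
import zlib

# Hypothetical header layout: little-endian CRC-32, then payload size.
HEADER = struct.Struct("<II")


def write_record(out, chunks):
    """Write header + payload to a seekable stream without buffering the payload."""
    header_pos = out.tell()
    out.write(HEADER.pack(0, 0))  # placeholder, patched once the values are known
    crc, size = 0, 0
    for chunk in chunks:
        crc = zlib.crc32(chunk, crc)  # running checksum over the payload
        size += len(chunk)
        out.write(chunk)
    end_pos = out.tell()
    out.seek(header_pos)
    out.write(HEADER.pack(crc, size))  # overwrite the placeholder in place
    out.seek(end_pos)                  # restore position for the next record
    return crc, size
```

Only one chunk is held in memory at a time. The cost is that the output must be seekable, which rules out sockets and pipes; for those, the values have to be written after the data instead.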
mrbonner about 5 years ago
I wrote a servlet that streamed gigabytes of zipped data to HTTP clients in 2005. Something like this was baked into the JDK a while ago. Why is it an achievement for Python, though?
nerdbaggy about 5 years ago
I wish Windows supported more than just zip natively. I would love something like tar. So much easier to generate on the fly than compressed formats.
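For comparison, a small sketch of generating a tar on the fly with Python's standard `tarfile` module. The `"w|"` mode writes an uncompressed stream and never seeks on the output. `write_tar_stream` is a hypothetical helper, and the sketch assumes each member's size is known before it is written, because a tar header precedes its data and records the size.

```python
import io
import tarfile


def write_tar_stream(out, members):
    """Write (name, bytes) pairs to `out` as an uncompressed tar stream."""
    with tarfile.open(fileobj=out, mode="w|") as tar:
        for name, data in members:
            info = tarfile.TarInfo(name=name)
            info.size = len(data)  # must be set before the header is written
            tar.addfile(info, io.BytesIO(data))
```

Here `out` can be any object with a `write()` method, such as a socket file or an HTTP response body.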
eggsnbacon1 about 5 years ago
Is this an issue in new versions of vanilla Python? I mostly bang on crusty old Java stuff, and zip streaming has been built into the JDK for decades.