Compression efficiency with shared dictionaries in Chrome

155 points by chamoda, about 1 year ago

19 comments

jgrahamc, about 1 year ago
The very first project I worked on at Cloudflare, back in 2012, was a delta compression-based service called Railgun. We installed software both on the customer's web server and on our end and thus were able to automatically manage shared dictionaries (in this case, versions of pages sent over Railgun were used as dictionaries automatically). You definitely get incredible compression results.

https://blog.cloudflare.com/cacheing-the-uncacheable-cloudflares-railgun-73454

I am glad to see that things have moved on from SDCH. It will be interesting to see how this measures up in the real world.
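Editor's sketch: the idea in this comment, using an earlier version of a page as the dictionary for the next version, can be approximated with Python's standard `zlib` module, whose `zdict` parameter presets the deflate window. This is not Railgun's actual protocol, and `render_page` plus the two version strings are hypothetical stand-ins for real page content.

```python
import zlib


def compress_with_dictionary(data: bytes, dictionary: bytes) -> bytes:
    """Deflate `data` with the compressor primed by `dictionary`."""
    compressor = zlib.compressobj(level=9, zdict=dictionary)
    return compressor.compress(data) + compressor.flush()


def decompress_with_dictionary(blob: bytes, dictionary: bytes) -> bytes:
    """Inverse of compress_with_dictionary; needs the same dictionary."""
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(blob) + decompressor.flush()


def render_page(version: str) -> bytes:
    """Hypothetical page: mostly stable markup plus a version marker."""
    rows = "".join(
        f"<li id='item-{i}'>Product {i * 7919 % 1000} costs ${i * 31 % 97}.99</li>\n"
        for i in range(200)
    )
    return f"<html><!-- build {version} --><ul>\n{rows}</ul></html>".encode()


old_page = render_page("1.3.4")  # both sides already hold this version
new_page = render_page("1.3.6")  # the version that must be transferred

plain = zlib.compress(new_page, 9)
delta = compress_with_dictionary(new_page, old_page)
restored = decompress_with_dictionary(delta, old_page)

print(len(new_page), len(plain), len(delta))
```

Note that deflate can only reference the last 32 KiB of a preset dictionary, so this sketch suits small pages; Brotli and Zstandard, the formats discussed in the article, allow much larger dictionaries.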
saagarjha, about 1 year ago
Even putting aside CORS because I don’t even want to think about how this plays well with requests to another (tracking?) domain, this still doesn’t seem worth it. The explicit use case seems to be that it basically tells the server when you last visited the site based on which dictionary you have and then it gives you the moral equivalent of a delta update. Except, most browsers are working hard to expire data of this kind for privacy reasons. What’s the lifetime of these dictionaries going to be? I can see it being ok if it’s like 1 day but if this outlives how long cookies are stored it’s a significant privacy problem. The user visits the site again and essentially a cookie gets sent to the server? The page says “don’t put user-specific data in the request” but like nobody is stopping a website from doing this.
jauntywundrkind, about 1 year ago
The Request For Position on Mozilla Zstd Support (2018) has a ton of interesting discussion on dictionaries. https://github.com/mozilla/standards-positions/issues/105

The original proposal for Zstd was to use a predefined, statistically generated dictionary. Mozilla rejected that proposal.

But there's a lot of great discussion on what Zstd can do, which is astoundingly flexible & powerful. There's discussion on dynamic adjustment of compression ratios, and discussion around shared dictionaries and their privacy implications. That Mozilla turned around & started supporting Zstd & has stamped a positive "worth prototyping" indicator on shared dictionaries is a good initial stamp of approval to see! https://github.com/mozilla/standards-positions/issues/771

One of my main questions after reading this promising update is: how do you pick what to include when generating custom dictionaries? Another comment mentions that brotli has a standard dictionary it uses, and that's some kind of possible starting place. But it feels like tools to build one's custom dictionary would be ideal.
eyelidlessness, about 1 year ago
I agree with other comments concerned with fingerprinting, and it was my second thought reading through the article. But my first thought was how beneficial this could be for return visitors of a web app, and how it could similarly benefit related concerns, such as managing local caches for offline service workers.

True, for *documents* (as is another comment’s focus) this is perhaps overkill. Although even there, a benefit could be imagined for a large body of documents. It’s unclear whether this case is addressed, but it certainly could be with appropriate support across, say, preload links[0]. But if “the web is for documents, not apps” isn’t the proverbial hill you’re prepared to die on, this is a very compelling story for web apps.

I don’t know if it’s so compelling that it outweighs privacy implications, but I expect the other browser engines will have some good insights on that.

0: https://developer.mozilla.org/en-US/docs/Web/HTML/Attributes/rel/preload
lukevp, about 1 year ago
This seems so ludicrous to me when all we really need is a way to share a resource reference across sites. Like “I need React 18.1 on this page, and the SHA should be abcdefghi”. If you don’t have it, I can give it to you from my server, or you can follow this link to a CDN, but the resource itself can be deduplicated based on the hashed contents instead of the URI. Why isn’t this a thing when basically everything uses frameworks nowadays? This shared dictionary seems like a more obtuse and roundabout way to solve this. If there was caching by hashes, browsers could even preload the latest versions of new libraries before any sites even referenced them.
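Editor's sketch: the hash-keyed deduplication proposed in this comment could look roughly like the following. It illustrates the commenter's idea only and is not an existing browser feature; `ContentAddressedCache`, `fetch_from`, and the example origins are all hypothetical names.

```python
import hashlib
from typing import Callable, Dict, List


class ContentAddressedCache:
    """Cache keyed by the SHA-256 of the content, not by URL."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def get(self, expected_sha256: str, fetch: Callable[[], bytes]) -> bytes:
        cached = self._store.get(expected_sha256)
        if cached is not None:
            return cached  # already held, whichever site asked first
        body = fetch()
        actual = hashlib.sha256(body).hexdigest()
        if actual != expected_sha256:
            # Never store or return content that does not match its hash.
            raise ValueError("content does not match the declared hash")
        self._store[expected_sha256] = body
        return body


# Two hypothetical origins serve byte-identical library code.
library_bytes = b"/* pretend this is the library build */"
digest = hashlib.sha256(library_bytes).hexdigest()
fetch_log: List[str] = []


def fetch_from(origin: str) -> Callable[[], bytes]:
    def fetch() -> bytes:
        fetch_log.append(origin)  # record which origin was actually hit
        return library_bytes
    return fetch


cache = ContentAddressedCache()
first = cache.get(digest, fetch_from("https://site-a.example"))
second = cache.get(digest, fetch_from("https://cdn.example"))
```

After both calls, `fetch_log` holds only the first origin, because the second request is satisfied from the cache by hash alone.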
matsemann, about 1 year ago
How would a dictionary that ships in the browser, pre-made with JS in mind, fare? I.e., instead of making a custom dictionary per resource I send to the user, I could say that "my scripts.js file uses the browser's built-in js-es2023-abc dictionary". So browsers would have some dictionaries others could reuse.

What are the savings with that approach vs a gzipped file without any dictionary?
ComputerGuru, about 1 year ago
This seems like a possibly huge user/browser fingerprint. Yes, CORS has been taken into account, but for massive touch surface origins (Google, Facebook, doubleclick, etc) this certainly has concerning ramifications.

It’s also insanely complicated. All this effort, so many possible tuples of (shared dictionary, requested resource), none of which make sense to compress on-the-fly per-request, mean it’s specifically for the benefit of a select few sites.

When I saw the headline I thought that Chrome would ship with specific dictionaries (say one for js, one for css, etc) and advertise them and you could use the same server-side. But this is really convoluted.
falsandtru, about 1 year ago
Doesn't the fact that resources send different data mean that SRI (Subresource Integrity) checks cannot be performed? As for fingerprinting, it would not be a problem since it is the same as with ETag.

https://developer.mozilla.org/en-US/docs/Web/Security/Subresource_Integrity
TacticalCoder, about 1 year ago
> Available-Dictionary: :pZGm1Av0IEBKARczz7exkNYsZb8LzaMrV7J32a2fFG4=:

The savings are nice in the best case (like in TFA: switching from version 1.3.4 to 1.3.6 of a lib or whatever) but that Base64 encoded hash is not compressible and so this line basically adds 60+ bytes to the request.

Kinda ouch for when it's going to be a miss?
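Editor's sketch: the "60+ bytes" figure can be checked by building the header line by hand. This assumes the value is the SHA-256 hash of the dictionary's bytes, Base64-encoded and wrapped in colons, which matches the 44-character value quoted above; the dictionary contents here are a hypothetical placeholder.

```python
import base64
import hashlib


def available_dictionary_header(dictionary: bytes) -> str:
    """Build the header line for a dictionary the client already holds."""
    digest = hashlib.sha256(dictionary).digest()      # 32 raw bytes (assumed SHA-256)
    token = base64.b64encode(digest).decode("ascii")  # 44 Base64 characters
    return f"Available-Dictionary: :{token}:"


# Hypothetical dictionary: any previously fetched resource would do.
header_line = available_dictionary_header(b"contents of an earlier script version")

# Uncompressed HTTP/1.1 framing adds a trailing CRLF to each header line.
overhead_bytes = len(header_line) + 2
print(header_line)
print(overhead_bytes)
```

The length is independent of the dictionary's size, since a SHA-256 digest is always 32 bytes: 68 characters for the line, 70 bytes with the CRLF.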
ramses0, about 1 year ago
This plus native web-components is an incredible advance for "the web".

Fingerprinting concerns aside (compression == timing attacks in the general case), the fact that it's nearly network-transparent and framework/webserver compatible is incredible!
raggi, about 1 year ago
What I really want: dictionaries derived from the standards and standard libraries (perhaps once a year or somesuch), which I'd use independently of build system gunk, and while it wouldn't be the tightest squeeze you can get, it would make my non-built assets get very close to built asset size for small to medium sized deployments.
IshKebab, about 1 year ago
Ah damn I thought this was going to be available to JavaScript. Would be amazing for one use case I have (an HTML page containing inline logs from a load of commands, many of which are substantially similar).
netol, about 1 year ago
The part I'm missing is how these dictionaries are created. Can I use the homepage to create my dictionary, so all other pages that share HTML are compressed more efficiently? How?
Sigliotio, about 1 year ago
That should be used together with ML models.

Image compression, for example, or voice and video compression like what NVIDIA does.

But I do like this implementation focusing on libs, why not?
jwally, about 1 year ago
Dumb question, but with respect to fingerprinting - how is this any worse than cookies, service workers, or localStorage?
skybrian, about 1 year ago
I wonder if this would be a good alternative to minimizing JavaScript and having separate sourcemaps?
tsss, about 1 year ago
This _screams_ side-channel attack.
kazinator, about 1 year ago
With shared dictionaries you can compress everything down to under a byte.

Just put the to-be-compressed item into the shared dictionary, somehow distribute that to everyone, and then the compressed artifact consists of a reference to that item.

If the shared dictionary contains nothing else, it can just be a one-bit message whose meaning is "extract the one and only item out of the dictionary".
cuckatoo, about 1 year ago
What stands out to me is that this creates another 'key' that the browser sends on every request which can be fingerprinted or tracked by the server.

I do not want my browser sending anything that looks like it could be used to uniquely identify me. Ever.

I want every request my browser makes to look like any other request made by another user's browser. I understand that this is what Google doesn't want but why can't they just be honest about it? Why come up with these elaborate lies?

Now to limit tracking exposure, in addition to running the AutoCookieDelete extension I'll have to go find some AutoDictionaryDelete extension to go with it. Boy am I glad the internet is getting better every day.