Amazon S3 Adds Put-If-Match (Compare-and-Swap)

524 points by Sirupsen 6 months ago

29 comments

torginus 6 months ago
Ah, so it's not only me who uses AWS primitives to hackily implement all sorts of synchronization primitives.

My other favorite pattern is implementing a pool of workers by querying EC2 instances with a certain tag in a stopped state and starting them. Starting the instance can succeed only once, which means I managed to snatch the machine. If it fails, I try again, grabbing another one.

This is one of those things that I never advertised out of professional shame, but it works, it's bulletproof and dead simple, and it does not require additional infra.
JoshTriplett 6 months ago
It's also possible to enforce the use of conditional writes: https://aws.amazon.com/about-aws/whats-new/2024/11/amazon-s3-enforcement-conditional-write-operations-general-purpose-buckets/

My biggest wishlist item for S3 is the ability to enforce that an object is named with a name that matches its hash. (With a modern hash considered secure, not MD5 or SHA1, though it isn't supported for those either.) That would make it much easier to build content-addressable storage.
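
Since the comment says the store does not enforce hash-based names, a client would have to do it itself. The following is a minimal sketch of that idea using an in-memory dictionary as a stand-in for a bucket; `ContentStore` and its methods are hypothetical names, not part of any AWS SDK.

```python
import hashlib


class ContentStore:
    """Toy content-addressable store: the key is always the SHA-256 of the body.

    Hypothetical in-memory stand-in for a bucket; no network calls are made.
    """

    def __init__(self):
        self._blobs = {}  # hex digest -> bytes

    def put(self, body: bytes) -> str:
        # The caller never chooses the key, so name and content cannot disagree.
        key = hashlib.sha256(body).hexdigest()
        # Writing the same content twice is a no-op (put-if-absent semantics).
        self._blobs.setdefault(key, body)
        return key

    def get(self, key: str) -> bytes:
        body = self._blobs[key]
        # Re-verify on read, because nothing server-side enforces the naming rule.
        if hashlib.sha256(body).hexdigest() != key:
            raise ValueError(f"object {key} does not match its name")
        return body


cas = ContentStore()
key_a = cas.put(b"hello world")
key_b = cas.put(b"hello world")  # identical content maps to the identical key
```

Against a real bucket, the `setdefault` step would correspond to a put-if-absent style conditional write, so that two uploaders of the same content do not overwrite each other.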
Sirupsen 6 months ago
To avoid any dependencies other than object storage, we've been making use of this in our database (turbopuffer.com) for consensus and concurrency control since day one. Been waiting for this since the day we launched on Google Cloud Storage ~1 year ago. Our bet that S3 would get it in a reasonable time-frame worked out!

https://turbopuffer.com/blog/turbopuffer
1a527dd5 6 months ago
Be still my beating heart. I have lived to see this day.

Genuinely, we've wanted this for ages and we got halfway there with strong consistency.
CubsFan1060 6 months ago
I feel dumb for asking this, but can someone explain why this is such a big deal? I’m not quite sure I am grokking it yet.
maglite77 6 months ago
Noting that Azure Blob Storage supports ETag / optimistic concurrency controls as well (via If-Match conditions) [1], how does this differ? Or is it the same feature?

[1]: https://learn.microsoft.com/en-us/azure/storage/blobs/concurrency-manage
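
For readers unfamiliar with If-Match style optimistic concurrency, here is a rough sketch of the read-modify-write loop it enables. The `ToyObjectStore` class is a hypothetical in-memory model, not an S3 or Azure client; a real store would signal the failed precondition with an HTTP 412 response instead of a Python exception.

```python
import threading
import uuid


class PreconditionFailed(Exception):
    """The supplied ETag no longer matches the stored object."""


class ToyObjectStore:
    """In-memory model of a bucket with conditional PUT (assumption, not a real API)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._objects = {}  # key -> (etag, body)

    def get(self, key):
        with self._lock:
            return self._objects[key]  # returns (etag, body)

    def put(self, key, body, if_match=None):
        with self._lock:
            current = self._objects.get(key)
            # With if_match set, the write applies only if the ETag is unchanged.
            if if_match is not None and (current is None or current[0] != if_match):
                raise PreconditionFailed(key)
            etag = uuid.uuid4().hex  # opaque token that changes on every write
            self._objects[key] = (etag, body)
            return etag


def increment(store, key):
    """Retry the read-modify-write until no other writer got in between."""
    while True:
        etag, body = store.get(key)
        try:
            return store.put(key, str(int(body) + 1).encode(), if_match=etag)
        except PreconditionFailed:
            continue  # someone else updated the object; re-read and try again


def worker(store, key, times):
    for _ in range(times):
        increment(store, key)


store = ToyObjectStore()
store.put("counter", b"0")
threads = [threading.Thread(target=worker, args=(store, "counter", 50)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
final_value = int(store.get("counter")[1])
print(final_value)  # 200: no increment is lost
```

Without the `if_match` check, two writers that read the same value would both write back the same successor and one increment would silently disappear.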
koolba 6 months ago
This combined with the read-after-write consistency guarantee is a perfect building block (pun intended) for incremental append-only storage atop an object store. It solves the biggest problem with coordinating multiple writers to a WAL.
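
One way such a multi-writer log could be arranged is sketched below: each writer uploads its record under a unique key, then uses a compare-and-swap on a small manifest object to publish it. This is only an illustration under assumptions; `ToyBucket`, the `wal/manifest` key, and the integer version used in place of an ETag are all hypothetical.

```python
import json
import threading


class PreconditionFailed(Exception):
    pass


class ToyBucket:
    """In-memory bucket; an integer version stands in for the ETag (assumption)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}  # key -> (version, body)

    def get(self, key):
        with self._lock:
            return self._data.get(key, (0, None))  # version 0 means "absent"

    def put(self, key, body, if_version=None):
        with self._lock:
            version, _ = self._data.get(key, (0, None))
            if if_version is not None and version != if_version:
                raise PreconditionFailed(key)
            self._data[key] = (version + 1, body)
            return version + 1


def append_record(bucket, writer_id, record):
    """Append one record to the log; returns the segment key that was published."""
    while True:
        version, raw = bucket.get("wal/manifest")
        segments = json.loads(raw) if raw else []
        # The segment key is unique per slot and writer, so this PUT needs no condition.
        seg_key = f"wal/segment-{len(segments):06d}-{writer_id}"
        bucket.put(seg_key, record)
        try:
            # Publish by swapping the manifest, but only if nobody else changed it.
            bucket.put("wal/manifest", json.dumps(segments + [seg_key]), if_version=version)
            return seg_key
        except PreconditionFailed:
            continue  # lost the race; the unpublished segment is left as garbage


def writer(bucket, writer_id, count):
    for i in range(count):
        append_record(bucket, writer_id, f"{writer_id}:{i}")


bucket = ToyBucket()
threads = [threading.Thread(target=writer, args=(bucket, f"w{n}", 10)) for n in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
log = json.loads(bucket.get("wal/manifest")[1])
print(len(log))  # 30: every append was published exactly once
```

The manifest gives all readers a single agreed order. Segments written by a writer that lost the race are never referenced and would need separate cleanup.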
offmycloud 6 months ago
If the default ETag algorithm for non-encrypted, non-multipart uploads in AWS is a plain MD5 hash, is this subject to failure for object data with MD5 collisions?

I'm thinking of a situation in which an application assumes that different (possibly adversarial) user-provided data will always generate a different ETag.
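
To make the premise concrete, this sketch computes the value the comment describes: the hex MD5 of the body, wrapped in double quotes as ETag headers are. It runs locally and makes no AWS calls; `simple_upload_etag` and `strong_fingerprint` are hypothetical helper names, and the second one only illustrates an application-side alternative that does not rely on MD5.

```python
import hashlib


def simple_upload_etag(body: bytes) -> str:
    """ETag as described in the comment: quoted hex MD5 of a single-part, unencrypted body."""
    return '"' + hashlib.md5(body).hexdigest() + '"'


def strong_fingerprint(body: bytes) -> str:
    """Application-side fingerprint that does not depend on MD5 collision resistance."""
    return hashlib.sha256(body).hexdigest()


etag = simple_upload_etag(b"hello")
print(etag)  # "5d41402abc4b2a76b9719d911017c592"
```

If two different bodies were crafted to share an MD5 digest, they would share this ETag too, which is exactly the case the question is worried about.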
ipython 6 months ago
I can't wait to see what abomination Corey Quinn can come up with now given this new primitive! (See previous work abusing Route 53 as a database: https://www.lastweekinaws.com/blog/route-53-amazons-premier-database/)
amazingamazing 6 months ago
Ironically, with this and Lambda you could make a serverless SQLite by mapping pages to objects, using HTTP range reads to read the DB and Lambda to translate queries into writes to the appropriate pages via CAS. Prior to this it would have required a server to handle concurrent writers, making the whole thing a nonstarter for “serverless”.

Too bad performance would be terrible without a caching layer (EBS).
sillysaurusx 6 months ago
Finally. GCP has had this for a long time. Years ago I was surprised S3 didn’t.
m_d_ 6 months ago
s3fs's https://github.com/fsspec/s3fs/pull/917 was in response to the IfNoneMatch feature from the summer. How would people imagine this new feature being surfaced in a filesystem abstraction?
spprashant 6 months ago
I had no idea people rely on S3 for anything beyond dumb storage. It almost feels like people are trying to build out a distributed OLAP database in the reverse direction.
vytautask 6 months ago
MinIO, an open-source implementation of Amazon S3, has had it for almost two years (relevant post: https://blog.min.io/leading-the-way-minios-conditional-write-feature-for-modern-data-workloads/). Strangely, Amazon is catching up just now.
tonymet 6 months ago
A good example of how a feature that looks simple on the surface (a header comparison) requires tremendous complexity and capacity on the backend.
wanderingmind 6 months ago
Does this mean that, in theory, we will be able to manage multiple concurrent writes/updates to S3 without having to use new solutions like Regatta [1], which was recently launched?

[1]: https://news.ycombinator.com/item?id=42174204
gravitronic 6 months ago
First thing I thought when I saw the headline was "oh! I should tell Sirupsen"
lttlrck 6 months ago
Isn't this compare-and-set rather than compare-and-swap?
rrr_oh_man 6 months ago
Could anybody explain for the uninitiated?
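
The two names are often used interchangeably, but one common way to draw the distinction is by what the operation returns. The sketch below shows both flavors on a hypothetical in-memory `Cell`; it is a terminology illustration, not a description of how S3 implements anything.

```python
import threading


class Cell:
    """A single shared value guarded by a lock (hypothetical illustration)."""

    def __init__(self, value):
        self._lock = threading.Lock()
        self._value = value

    def compare_and_set(self, expected, new) -> bool:
        """Write `new` if the value equals `expected`; report only success or failure."""
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False

    def compare_and_swap(self, expected, new):
        """Write `new` if the value equals `expected`; always return the value seen."""
        with self._lock:
            old = self._value
            if old == expected:
                self._value = new
            return old


cell = Cell("v1")
ok = cell.compare_and_set("v1", "v2")       # True, value is now "v2"
stale = cell.compare_and_set("v1", "v3")    # False, value stays "v2"
seen = cell.compare_and_swap("v1", "v3")    # returns "v2", telling the caller what is there
```

A conditional PUT that either succeeds or is rejected with a precondition failure resembles the first flavor: the caller learns that it lost the race and must issue a separate read to find the current state.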
stevefan1999 6 months ago
So...are we closer to getting to use S3 as a...you guessed it...a database? With CAS, we can probably get a basic level of atomicity, and S3 itself is pretty durable. Now we have to deal with consistency and isolation...although S3 branded itself as "eventually consistent"...
vlovich123 6 months ago
I implemented that extension in R2 at launch IIRC. Thanks for catching up & helping move distributed storage applications a meaningful step forward. Intended sincerely. I'm sure adding this was non-trivial for a complex legacy codebase like that.
anonymousDan 6 months ago
Would be interesting to understand how they've implemented it and whether there is any perf impact on other API calls.
dvektor 6 months ago
[rejected] error: failed to push some refs to remote repository

Finally we can have this with S3 :)
paulsutter 6 months ago
What’s amazing is that it took them so long to add these functions
thayne 6 months ago
Now if only you had more control over the ETag, so you could use a SHA-256 of the total file (even for multi-part uploads), or a version counter, or a global counter from an external system, or a logical hash of the content as opposed to a hash of the bytes.
londons_explore 6 months ago
So we can now implement S3-as-RAM for a worldwide million-core Linux VM?
juggli 6 months ago
finally
grahamj 6 months ago
bender_neat.gif
serbrech 6 months ago
Why is standard ETag support making the front page?