Show HN: s3-lambda – Lambda functions over S3 objects: each, map, reduce, filter

177 points by wellsjohnston, over 8 years ago

12 comments

hoodoof, over 8 years ago
It's weird how S3 seems to be the unwanted stepchild of AWS.

So many obvious innovations just aren't turning up.

For example, strangely, AWS introduced tagging for S3 resources, but you can't search/filter by tag, nor is the tag even returned when you get a list of objects; you can only get the tag with an object request. The word "pointless" springs to mind.

In fact it's strange that there is NO useful filtering at all apart from the very useful folder/hierarchy/prefix filtering. But apart from that you can't do wildcard searches or filters or date filters or tag filters.

I'm building an application right now that needs to get a list of all the jpg files - the only way to do that is get every single object in the bucket and manually filter out the unwanted ones - feels like it's 1988 again.

It seems like it would also be valuable for there to be alternate interfaces to S3 such as the ability to send data via FTP or SMTP or SFTP or whatever, but there are no such interfaces.

Hopefully Google will goad AWS into action on S3 innovation by implementing such features.
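
The "list everything, then filter by hand" workaround described in this comment can be sketched roughly as follows. This is only an illustration under stated assumptions: `keys_with_suffix` and `fake_pages` are hypothetical names, and the pages are hard-coded dictionaries shaped like the responses of S3's ListObjectsV2 call (a `Contents` list of objects with a `Key`). With boto3, `client.get_paginator("list_objects_v2").paginate(Bucket=...)` would be one way to supply real pages of that shape.

```python
from typing import Iterable, Iterator, Tuple


def keys_with_suffix(pages: Iterable[dict], suffixes: Tuple[str, ...]) -> Iterator[str]:
    """Yield keys from ListObjectsV2-style pages that end in one of `suffixes`."""
    for page in pages:
        # An empty listing page carries no "Contents" entry at all.
        for obj in page.get("Contents", []):
            key = obj["Key"]
            # Compare case-insensitively so "b.JPG" counts as a jpg.
            if key.lower().endswith(suffixes):
                yield key


# Hypothetical stand-in for three pages of a bucket listing.
fake_pages = [
    {"Contents": [{"Key": "photos/a.jpg"}, {"Key": "photos/notes.txt"}]},
    {"Contents": [{"Key": "photos/b.JPG"}, {"Key": "logs/app.log"}]},
    {},
]

jpgs = list(keys_with_suffix(fake_pages, (".jpg", ".jpeg")))
print(jpgs)
```

Every key still has to be transferred and inspected on the client, which is the cost the comment objects to; only a key prefix can be pushed down to the listing request itself.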
dschnurr, over 8 years ago
Might make sense to rename this to avoid confusion with AWS Lambda (I immediately thought it was related). Otherwise, looks like an awesome library!
simonw, over 8 years ago
First impression: this is a brilliant piece of software design.

The ability to compose a map/filter chain and execute it in parallel against every object in an S3 bucket that matches a specific prefix - wow.

The set of problems that can be quickly and cheaply solved with this thing is enormous. My biggest problem with lambda functions is that they are a bit of a pain to actually write - for transforming data in S3 this looks like my ideal abstraction.
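
The idea praised here, a map/filter chain run in parallel over every object under a prefix, can be illustrated with a small sketch. It is not s3-lambda's actual API: `run_chain`, the `("filter", fn)` / `("map", fn)` step tuples, and the in-memory `store` dictionary standing in for a bucket are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# Each step is ("filter", predicate) or ("map", transform). Hypothetical format.
Step = Tuple[str, Callable]

_DROPPED = object()  # sentinel marking an object rejected by a filter step


def run_chain(store: Dict[str, str], prefix: str, steps: List[Step],
              workers: int = 4) -> Dict[str, str]:
    """Apply `steps` in order to every object under `prefix`, one task per object."""

    def process(key: str):
        value = store[key]
        for kind, fn in steps:
            if kind == "filter":
                if not fn(value):
                    return key, _DROPPED
            else:
                value = fn(value)
        return key, value

    keys = [k for k in store if k.startswith(prefix)]
    # Objects are independent of each other, so they can be processed concurrently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(process, keys))
    return {k: v for k, v in results if v is not _DROPPED}


# In-memory stand-in for a bucket.
store = {"logs/a": "error: disk", "logs/b": "ok", "img/c": "error: x"}
out = run_chain(
    store,
    "logs/",
    [("filter", lambda s: s.startswith("error")), ("map", str.upper)],
)
print(out)
```

In this sketch the chain is applied per object, so the parallelism is across objects rather than across steps; against a real bucket each task would be dominated by the network round trip to fetch and store the object.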
hayd, over 8 years ago
See also AWS Athena? https://aws.amazon.com/athena/
DenisM, over 8 years ago
So... the client-side code iterates S3 objects matching a certain filter, and then schedules a lambda for each one of those objects. Is that right? Or is the iteration procedure itself a lambda? Also, when you chain several operators together, where does the chaining happen?

I'd like to understand where different parts of the code are being executed.
avip, over 8 years ago
This is a nice project. For real-world use cases, we have good alternatives:

1. Migrate from S3 to Google Cloud and use BigQuery, which does support UDFs
2. Register with Databricks (I'm not affiliated)
3. (for the brave) Poke AWS support to implement UDFs on Athena
_Marak_, over 8 years ago
If anyone is interested in this same kind of architecture for multi-cloud file-system providers (no cloud lock-in), please check out this project: https://github.com/bigcompany/hook.io-vfs

Used in production, but it could use some contributors.
kvz, over 8 years ago
Getting an index of (millions of) files on S3 is very slow for us; it takes days. Is there anything you do to work around this? Since this is not an AWS Lambda project, it seems the client first has to acquire an index from S3 before the concurrency benefits set in?
cle, over 8 years ago
Is this susceptible to any of S3's eventual consistency constraints?
wcdolphin, over 8 years ago
I think having the default be destructive for mapping is a strange design decision. That is going to bite someone one day soon.
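
The concern about a destructive default can be made concrete with a small sketch. Again this is not s3-lambda's API: `map_objects` and its `output_prefix` parameter are hypothetical, and a plain dictionary stands in for the bucket. When no output prefix is given, the mapped value overwrites the source object, which is the behaviour the comment warns about.

```python
from typing import Callable, Dict, Optional


def map_objects(store: Dict[str, str], prefix: str, fn: Callable[[str], str],
                output_prefix: Optional[str] = None) -> None:
    """Apply `fn` to each object under `prefix`.

    With output_prefix=None the result overwrites the source object
    (destructive). Otherwise it is written under `output_prefix` and the
    source is left untouched.
    """
    # Snapshot the keys first so writes do not disturb the iteration.
    for key in [k for k in store if k.startswith(prefix)]:
        new_value = fn(store[key])
        if output_prefix is None:
            store[key] = new_value  # original content is lost
        else:
            store[output_prefix + key[len(prefix):]] = new_value


# Non-destructive: originals survive alongside the mapped copies.
bucket = {"raw/a.txt": "hello", "raw/b.txt": "world"}
map_objects(bucket, "raw/", str.upper, output_prefix="upper/")

# Destructive: the only copy of the data is replaced.
in_place = {"raw/a.txt": "hello"}
map_objects(in_place, "raw/", str.upper)
```

A mistake in `fn` is recoverable in the first call and not in the second, unless the bucket has versioning enabled.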
dhpe, over 8 years ago
Really nice to have a generic functional interface to S3. Thanks.
stolendog, over 8 years ago
Where can you actually use it? In which cases? Can you provide examples?