Ask HN: Best document storage solution in 2023

15 points by ID1452319, about 2 years ago
We have an application that requires a document store. It needs to hold up to 10 million documents and be accessible via APIs, so that we can retrieve documents and display them in our application.

We are considering everything from Dropbox-type solutions to blob storage in GCP.

What kind of document storage solutions are people using in 2023 to meet this use case?

7 comments

tothrowaway, about 2 years ago
I use B2 and Wasabi because I don't like relying on a single cloud provider. Files are uploaded to both. OpenResty (Nginx+Lua) sits in front to provide caching and the logic for deciding which provider to pull from.

Wasabi gives you a free bandwidth allowance equal to the number of bytes stored per month. When I use up most of that, I start pulling from B2. And of course, if one of them is down, I pull from the other.

It takes more time up front to build than relying completely on GCP/Azure/AWS. But I don't have to worry as much about spontaneous account terminations destroying my business.
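The selection rule described in this comment could be sketched roughly as follows. This is not the commenter's OpenResty/Lua code; it is a minimal Python illustration, and the function name `choose_provider`, its parameters, and the 90% threshold standing in for "most of that" are all assumptions made for the example.

```python
from typing import Optional


def choose_provider(
    wasabi_up: bool,
    b2_up: bool,
    egress_used_bytes: int,
    stored_bytes: int,
    threshold: float = 0.9,  # hypothetical cutoff for "most of" the allowance
) -> Optional[str]:
    """Pick which provider to read from, or None if both are down."""
    # Failover first: if one provider is down, use the other.
    if not wasabi_up and not b2_up:
        return None
    if not wasabi_up:
        return "b2"
    if not b2_up:
        return "wasabi"

    # Per the comment, the free monthly egress allowance on Wasabi
    # equals the number of bytes stored there.
    allowance = stored_bytes

    # Once most of the allowance is used up, switch reads to B2.
    if egress_used_bytes >= threshold * allowance:
        return "b2"
    return "wasabi"
```

In a real deployment the health flags and the running egress counter would have to come from monitoring and request logs, which this sketch leaves out.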
s1k3s, about 2 years ago
The question is too generic to answer properly. Do you need global availability? Do you need high-speed downloads? Are you worried about bandwidth costs? And so on.

We use S3 + CloudFront for documents that our customers need to access quickly. We use SFTP for our internal docs, where we don't care that much about availability and speed.
speedgoose, about 2 years ago
I would go with an S3-compatible object store by default.

In open source, Ceph and MinIO are common. Garage is newer, has good potential, and has a simpler design.

https://ceph.com/en/
https://min.io/
https://garagehq.deuxfleurs.fr/
fpdavis, about 2 years ago
The file system was designed to hold documents and does a pretty good job of it. There are several to choose from, depending on what OS you run. Backing documents up and restoring them is easy. An API to retrieve documents is trivial to write and customize, and there are a few tools and APIs already available.
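As a rough illustration of how small such a retrieval function can be, here is a minimal Python sketch using only the standard library. The name `read_document`, the idea of using a relative path as the document ID, and the path-traversal check are assumptions for the example, not something the comment specifies.

```python
from pathlib import Path


def read_document(root: Path, doc_id: str) -> bytes:
    """Return the bytes of the document stored under root at doc_id."""
    root = root.resolve()
    path = (root / doc_id).resolve()

    # Refuse IDs that escape the storage root (e.g. "../secret" or an
    # absolute path), since doc_id would normally come from a request.
    if root not in path.parents:
        raise ValueError(f"invalid document id: {doc_id!r}")

    # Raises FileNotFoundError if the document does not exist.
    return path.read_bytes()
```

An HTTP layer, access control, and a directory layout suited to millions of files would still need to be added on top of this.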
giaour, about 2 years ago
There are a number of fine options for blob storage (S3, R2, Ceph, Azure Storage, etc.), but with that many documents it's likely that access control and audit logging will be important. If that's the case, something heavyweight like SharePoint may be a better choice.
locustmostest, about 2 years ago
One possibility is to use our open-core document management API, built to deploy in your AWS account: https://github.com/formkiq/formkiq-core

The files are stored in S3, with customizable metadata storage in DynamoDB. As the system is designed to run on AWS serverless and managed services, the majority of the cost will come from S3 storage fees.
LLcolD, about 2 years ago
Do the documents need to be indexed?