I have a small website/personal project and very limited money.<p>I'm trying to work out a way to keep it alive and well for as long as possible while spending as little money as possible.<p>It basically gets data from an external API, transforms that data using some Node/Python scripts and saves it as JSON files. Then I use the data in the JSON files to generate static HTML pages. Since I save the JSON files with the specific data I need to generate the pages (kinda like a view), there are no querying needs other than key/value.<p>I'm divided on whether to keep going with the file system or use a database/document store/whatever.<p>I'm running it on a small 512 MB DO VPS and I want to keep the costs as low as possible, so RAM and general system resource usage should be kept to a minimum.<p>I have a lot of data and most of it is kinda redundant, so it's not a big deal if some of it goes bad or is lost. So I don't really see an incentive to use a proper database/document store/etc. But it feels kinda wrong to deal with GBs of data all sorted in lil folders saved as JSON files. It may just be prejudice I got from reading too much stuff about big data, trendy DBs and such. But it may also be some technical debt I'm failing to see right now that could become a giant pain in the near future.
I'm doing this kind of thing for hundreds of billions of records stored across hundreds of terabytes (our business is to generate fingerprints based on binary files).<p>The added advantages:
-- fewer failure points in case of data corruption
-- any tool or any language can be used for parsing data
-- easy to partition and move data elsewhere as needed
-- grow as you see needed (only need disk space)
-- dependable disk storage, with no caches or anything else needed<p>So it will depend on your context, but you are not alone if you decide to follow that route. On our side we are happy with the approach; SQLite gets slow when adding billions of records.
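For illustration, here is a minimal sketch of how such a flat-file key/value layout could look in Python. This is not the setup described above; the function names (`key_to_path`, `put_json`, `get_json`), the SHA-1 based two-level directory scheme and the temp-file-plus-rename write are all assumptions chosen to keep directories small and avoid half-written files.

```python
import hashlib
import json
import os
import tempfile
from pathlib import Path


def key_to_path(root: Path, key: str) -> Path:
    """Map a key to root/ab/cd/<sha1>.json so no single directory grows huge."""
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    return root / digest[:2] / digest[2:4] / f"{digest}.json"


def put_json(root: Path, key: str, value) -> Path:
    """Write value as JSON to a temp file, then rename it into place."""
    path = key_to_path(root, key)
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(value, fh)
        # os.replace is atomic when source and target are on the same filesystem,
        # so a reader sees either the old file or the new one, never a partial write.
        os.replace(tmp_name, path)
    except BaseException:
        os.unlink(tmp_name)  # clean up the temp file if anything failed
        raise
    return path


def get_json(root: Path, key: str, default=None):
    """Return the stored value for key, or default if no file exists."""
    try:
        with open(key_to_path(root, key), encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        return default
```

Usage would be something like `put_json(Path("data"), "page:home", {"title": "Home"})` followed by `get_json(Path("data"), "page:home")`. Because the location is derived only from the key, moving a partition elsewhere is just moving one of the top-level directories.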
DBMSs generally bring a couple of things:<p>ACID<p>Queries<p>Unless you need either of these things, dumping onto the file system is fine.<p>To ease your anxiety you could always use SQLite (it has JSON extensions), which simply writes to a file on the file system, so there is no DBMS server to run. This works best when a single process does the writing, since SQLite allows only one writer at a time.
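A rough sketch of the SQLite option using Python's built-in `sqlite3` module, storing each JSON document as text under a key. The table name `kv` and the helpers `open_store`, `kv_put` and `kv_get` are made up for this example, and it does not use the JSON extensions at all; it only covers the plain key/value case.

```python
import json
import sqlite3


def open_store(path: str) -> sqlite3.Connection:
    """Open (or create) a single-file SQLite database with one key/value table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    return conn


def kv_put(conn: sqlite3.Connection, key: str, value) -> None:
    """Insert or overwrite the JSON-serialized value for key in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
            (key, json.dumps(value)),
        )


def kv_get(conn: sqlite3.Connection, key: str, default=None):
    """Return the decoded value for key, or default if the key is absent."""
    row = conn.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
    return default if row is None else json.loads(row[0])
```

For example, `conn = open_store("site.db")`, then `kv_put(conn, "page:home", {"title": "Home"})` and `kv_get(conn, "page:home")`. The whole store is one file, so backing it up or moving it is a file copy while nothing is writing to it.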
As long as there is no pain point where you wish you had a database, and you are not starting to reimplement parts of one, you should be fine with flat files. If you generate static files, then parallel access, speed advantages, etc. are likely not an issue either.