Hosting SQLite Databases on GitHub Pages

567 points by isnotchicago almost 4 years ago

14 comments

arrmn almost 4 years ago

Here's the original post: https://news.ycombinator.com/item?id=27016630

This guy is a genius.

papanoah almost 4 years ago

I get this error on Firefox 90.0.2 on Debian 10. It works in Chrome, though.

[error: RuntimeError: abort(Error: Couldn't load https://phiresky.netlify.app/world-development-indicators-sqlite/split-db/db.sqlite3.000. Status: 0). Build with -s ASSERTIONS=1 for more info.]

Other than that, it's pretty awesome and exactly what I was hoping for.

cafxx almost 4 years ago

That's one lovely trick.

If I may suggest one thing: instead of range requests on a single huge file, how about splitting the file into 1-page fragments stored as separate files and fetching them individually? This buys you caching (e.g. on CDNs) and compression (which you could also perform ahead of time), both of which are somewhat tricky with a single giant file and range requests.

With the reduction in size you get from compression, you can also use larger pages almost for free, potentially further decreasing the number of roundtrips.

There's also a bunch of other things that could be tried later, like using a custom dictionary for compressing the individual pages.
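A minimal sketch of the splitting step suggested above, not taken from the linked project. The function names, the `page-00000000.gz` naming scheme, and the 4096-byte default are assumptions for illustration; the fragment size would need to match the database's actual page size.

```python
import gzip
from pathlib import Path


def split_into_pages(db_path, out_dir, page_size=4096):
    """Write each page_size-byte slice of db_path to its own gzip file.

    Returns the number of fragment files written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    with open(db_path, "rb") as f:
        while True:
            page = f.read(page_size)
            if not page:
                break
            # Hypothetical naming scheme: zero-padded page index.
            fragment = out / f"page-{count:08d}.gz"
            fragment.write_bytes(gzip.compress(page))
            count += 1
    return count


def read_page(out_dir, index):
    """Load and decompress one fragment written by split_into_pages."""
    fragment = Path(out_dir) / f"page-{index:08d}.gz"
    return gzip.decompress(fragment.read_bytes())
```

Because each fragment is an ordinary static file, a CDN can cache it by URL, and the compression happens once at build time rather than per request.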

goeiedaggoeie almost 4 years ago

It still saddens me today that browsers have IndexedDB instead of SQLite.

I designed a system 15 years ago that released dimensional star schemas for specific reports as SQLite databases into Adobe AIR (or whatever the prerelease name was) for a large retailer in the UK. We would query the data warehouse and build the SQLite DB file (I can't remember the exact DB sizes, but they weren't too big, 15 MB or so). The computational performance we got from using SQLite as our crunching layer when building the dimensional reports with drilldown was just astounding.

devwastaken almost 4 years ago

Huh, this is a funny one. I had this idea a long time ago when doing some napkin design of a "static wiki". The problem was that the querying didn't fit how software optimizes content delivery, so millions of people requesting from a single database would most likely be difficult to accomplish in a performant manner. Secondly, writing to said database would of course be impossible because of locking, and you'd need a server anyway to do any sort of session-based submission of data.

Very nice for read-only static data sets for small sites, though. In fact, this may be very useful for county mapping systems, converting the GIS data to tables in SQLite.

If at all possible, it would be better if this could be in ES5 JavaScript (no async/await); as it stands, only very modern browsers are going to be able to access it. People with older phones (of which there are many) wouldn't be able to use it at all.

manmal almost 4 years ago

I can’t fully put my finger on why exactly, but I feel that this is a transformative idea. What’s to stop me from emulating a private SQLite DB for every user of a web app, and using that instead of GraphQL?

Hackbraten almost 4 years ago

Good writeup, thanks!

All the code snippets, when run, give me the following error message:

[error: RuntimeError: abort(Error: server uses gzip or doesn't have length). Build with -s ASSERTIONS=1 for more info. (evaluating 'new WebAssembly.RuntimeError(e)')]

Could that be a Mobile Safari thing?

makmanalp almost 4 years ago

FWIW the scientific computing community (who often deal with petabytes of geodata) has been thinking about ideas like this for a while, e.g. file formats that are easy to parse lazily and partially, (ab)using FUSE to do partial reads via HTTP Range requests, some combination thereof, etc.:

http://matthewrocklin.com/blog/work/2018/02/06/hdf-in-the-cloud
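As a rough illustration of a partial read over HTTP, here is a sketch using only the Python standard library. The helper names (`range_header`, `parse_content_range`, `read_span`) are made up for this example and do not come from the linked post; the header formats themselves are the standard `Range` and `Content-Range` ones.

```python
import re
import urllib.request


def range_header(offset, length):
    """Build a Range header value for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be positive")
    # Range end positions are inclusive.
    return f"bytes={offset}-{offset + length - 1}"


def parse_content_range(value):
    """Parse 'bytes start-end/total' into (start, end, total or None)."""
    m = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if m is None:
        raise ValueError(f"unrecognized Content-Range: {value!r}")
    start, end, total = m.groups()
    return int(start), int(end), None if total == "*" else int(total)


def read_span(url, offset, length):
    """Fetch only the requested byte span of the resource at `url`."""
    req = urllib.request.Request(
        url, headers={"Range": range_header(offset, length)}
    )
    with urllib.request.urlopen(req) as resp:
        # 206 Partial Content means the server honored the Range header;
        # a plain 200 would mean it sent the whole file instead.
        if resp.status != 206:
            raise RuntimeError(f"range not honored (status {resp.status})")
        return resp.read()
```

A lazily parsed file format then only needs to call something like `read_span` for the header and for whichever chunks a query touches.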

zubairq almost 4 years ago

On yazz.com we have been embedding and running SQLite in web pages for over 2 years now. It is definitely something that works well.

mfbx9da4 almost 4 years ago

Wow, okay, so is this like an HTTP-based buffer pool manager? Instead of reading pages from disk, it reads them via HTTP?
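To make the "buffer pool over HTTP" picture concrete, here is a small sketch of a page cache with least-recently-used eviction. It is an illustration of the general idea, not the linked project's code: `PageCache`, its parameters, and the injected `fetch_range(start, end_inclusive)` callable are all assumptions, with the callable standing in for whatever performs the actual HTTP Range request.

```python
from collections import OrderedDict


class PageCache:
    """Keep recently used pages in memory; fetch missing ones on demand."""

    def __init__(self, fetch_range, page_size=4096, capacity=64):
        # fetch_range(start, end_inclusive) -> bytes is supplied by the
        # caller, e.g. a function that issues an HTTP Range request.
        self.fetch_range = fetch_range
        self.page_size = page_size
        self.capacity = capacity
        self.pages = OrderedDict()  # page number -> bytes, oldest first

    def get_page(self, n):
        if n in self.pages:
            self.pages.move_to_end(n)  # mark as most recently used
            return self.pages[n]
        start = n * self.page_size
        data = self.fetch_range(start, start + self.page_size - 1)
        self.pages[n] = data
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict least recently used
        return data

    def read(self, offset, length):
        """Read an arbitrary byte span by stitching cached pages together."""
        out = bytearray()
        end = offset + length
        while offset < end:
            n, within = divmod(offset, self.page_size)
            chunk = self.get_page(n)[within:within + (end - offset)]
            if not chunk:
                break  # ran past the end of the remote file
            out += chunk
            offset += len(chunk)
        return bytes(out)
```

The database engine above this layer only ever asks for byte spans; whether a span is served from memory or costs a network round trip is decided here.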

makmanalp almost 4 years ago

This is a great example of how changing technology changes use cases, which can prompt a revisiting of what was once considered a good idea. You'll often see the pendulum of consensus swing in one direction, and then swing back in the exact opposite direction less than a decade later.

The 2010s saw REST-conforming APIs with JSON in the body, largely as an (appropriate) reaction to what came before, and also in accordance with changes in what browsers were able to do, and thus in how much of a web app moved from the backend to the frontend.

But then, that brought even more momentum, and web apps started doing /even more/. There was a time when downloading a few megabytes per page, generating an SVG chart or drawing an image, and responding to live user interaction were all unthinkable. But interactive charting is now the de facto standard. So now we need ways to access ranges and pieces of bulk data. And it looks a lot more like block storage access than REST.

---

These are core database ideas: you maintain a fast and easy-to-access local cache of key bits of data (called a buffer pool, stored in memory, in e.g. MySQL). In this local cache you keep information on how to access the remaining bulk of the data (called an index). You minimize dipping into "remote" storage that takes 10-100x as long to access.

Database people refer to the "memory wall" as a big gap in the cache hierarchy (CPU registers, L1-L3, main memory, disk / network) where the second you dip beyond it, latency balloons (cue the "latency numbers every programmer should know" chart). And so you have to treat this specially and build your query plan to work around it. As storage techniques changed (e.g. SSDs, then NVMe and 3D XPoint etc.), database research shifted to adapt techniques to leverage the new tools.

In this new case, the "wall" is just before the WAN internet, instead of being before the disk subsystem.

---

This new environment might call for a new database (and application) architectural style, where executing large and complex code quickly on the client side is no problem at all in an era of 8-core CPUs, Emscripten, and JavaScript JITs. So the query engine can move to the client, the main indexes can be loaded and cached within the app, and the function of the backend is suddenly reduced to simply storing and fetching blocks of data, something "static" file hosts can do with no problem.

The fundamental question is: where do I keep my data stored, where do I keep my business logic, and where do I handle presentation? The answer is what varies. Variations on this thought:

We've already had products that completely separate the "query engine" from the "storage" and provide it as a separate service, e.g. Presto / Athena, which you set up to use anything from flat files to RDBMSs as "data stores" across which it can do fairly complicated query plans, joins, predicate pushdown, etc. Slightly differently, Snowflake is an example of a database that's architected around storing the main data in large, cheap cloud storage like S3, with no need to copy and keep entire files on the EC2 node, only the block ranges you know you need. Yet another example of leveraging the boundary between the execution and the data.

People have already questioned the wisdom of having a mostly dumb CRUD backend layer with minimal business logic between the web client and the database. The answer is that databases just suck at catering to this niche, and nothing vastly more complicated than that. They certainly could do granular auth, serde, validation, vastly better performance isolation, HTTP instead of a special protocol, a JavaScript client, etc. Some tried.

Stored procedures are also generally considered bad (bad tooling, bad performance characteristics and isolation, large element of surprise), but they needn't be. They're vastly better in some products that are generally inaccessible to or unpopular with large chunks of the public. But they're a half-baked attempt to keep business logic and data close together. And some companies decided at a certain time that the benefits outweighed the faults, and had large portions of critical applications written this way not too long ago.

---

Part of advancing as an engineer is being able to weigh when it's appropriate to free yourself from the yoke of "best practices" and "how it's done". You might recognize that something about what you're trying to do is different, or that times and conditions have changed since a thing was decided.

And also to know when it's not appropriate: existing, excellent tooling probably works okay for many use cases, and the cost of invention is unnecessary.

We see this often when companies and products that are pushing boundaries or are up against certain limitations do something that seems silly or goes against the grain of what's obviously good. That's okay: they're not you, and you're not them, and we all have our own reasons, and that's the point.

ridaj almost 4 years ago

Neat. Just not to be used for authentication!

Mooty almost 4 years ago

I wonder if you could combine this with a Google Spreadsheet to turn it into a real DB system with write access and a small obfuscated security wrapper.

chovybizzass almost 4 years ago

Bummer about not being able to write to SQLite. I am using Neocities and was wondering how I could get a DB into play.
