A Static Future: The magic of compile-time workflows

115 points by joshwcomeau about 5 years ago

13 comments

"It's exactly the kind of intensely-dynamic application that seems inconceivable to build statically"Statically is to some degree how these things used to work. Before hosting with databases was widely available, internet forum software (such as UBB) used to work by regenerating all the affected pages when posting new threads and replies. They relied on simple files as the data source. They worked really well to a point, until each update took so long as to be intolerable. The practice was then to prune older content to fix the performance issues.The biggest problem area for SSGs is sites with user-specific content where dynamic "pop-in" after load isn't a tolerable solution, or where there's a combinatorial explosion of pages. The ecommerce example is interesting because it's not the 50m product pages that's the problem, it's all the faceted navigation/search pages where it _may_ be desirable to support search indexing, the number of unique pages here may as well be considered unlimited. All of these problems are solvable with SSGs to some degree, but there's always going to be a tipping point where the amount of engineering contortion you're having to perform just to avoid using servers isn't worth it. Especially when we have a little thing called cache headers which work really well if you set them up correctly.

chrismorgan about 5 years ago

This is not a new concept by any means; it’s reviving a concept that has been around since at least the very early days of the web.

It’s known as push-based caching, as distinct from pull-based caching.

Push caching: when something changes, you immediately propagate the change to every location, to be valid until replaced. That used to be the standard model. It’s precise, but its eagerness can lead to prohibitive storage requirements (combinatorial explosion is very easy to achieve, and now you must store every computed state, not just the wrapping and the data) and cost of editing if the change has to propagate far (e.g. if you put a list of the five most recent blog posts in a sidebar of a site, making a new blog post now requires that every single page be regenerated). Also it’s conceptually more difficult to get right; doing no caching is far, far easier.

Pull caching: responses are generated on demand and cached for typically a limited time, meaning that you may serve stale data for up to your cache lifetime. But the laziness (that it doesn’t generate everything possible) works in its favour for many situations.

Hybrids are possible. Your cache may have the ability to programmatically invalidate entries, so that if you can calculate which pages would need to be regenerated, you can tell the cache, supplanting time-based caching and potentially yielding the most database-efficient result (no extraneous content generated, but pages cached exactly as long as they will be valid).

Build pipelines like Make are similar: you pull by specifying the resource you require, but its validity is determined by the dependency tree, and the tool essentially pushes things, materialising the necessary resources, until the requested resource is complete.

Push caching has an elegance that pull caching lacks (and I much prefer it, when feasible), but there are reasons why it’s not the standard model of the web any more. It’s tougher to implement well, and has scaling limits that can constrain your design.

----

As a practical application of this: in the article itself, it speaks of the reduced costs and better scaling of the static approach. But it doesn’t take into account the possibility that you generated a bunch of pages that weren’t ever requested (e.g. everyone read the blog post about widgets, but no one opened the list of posts tagged “widgets”, even though you had gone to the trouble of generating it), and so could easily have made more requests than you would have if you had instead had regular pull-based caching in place.
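
To make the sidebar example concrete, here is a rough sketch of how a push-style generator might work out which pages to rebuild after a change. The page paths, the source names (`post:widgets`, `sidebar:recent`) and both helper functions are hypothetical, made up for illustration; no particular static site generator is implied.

```python
from collections import defaultdict

# Hypothetical site: each page lists the data sources it is rendered from.
PAGE_DEPENDENCIES = {
    "/posts/widgets": {"post:widgets", "sidebar:recent"},
    "/posts/gadgets": {"post:gadgets", "sidebar:recent"},
    "/tags/widgets": {"post:widgets", "sidebar:recent"},
    "/about": {"page:about"},
}


def build_reverse_index(page_deps):
    """Map each data source to the set of pages that depend on it."""
    index = defaultdict(set)
    for page, sources in page_deps.items():
        for source in sources:
            index[source].add(page)
    return index


def pages_to_regenerate(changed_sources, reverse_index):
    """Push model: every page that touches a changed source is rebuilt eagerly."""
    stale = set()
    for source in changed_sources:
        stale |= reverse_index.get(source, set())
    return sorted(stale)


reverse_index = build_reverse_index(PAGE_DEPENDENCIES)

# Editing a single post only touches the page rendered from that post.
edit_one_post = pages_to_regenerate({"post:gadgets"}, reverse_index)

# Publishing a new post changes the "recent posts" sidebar, so every page
# that embeds the sidebar has to be regenerated.
new_post_published = pages_to_regenerate({"sidebar:recent"}, reverse_index)
```

The same reverse index could drive the hybrid approach: instead of rebuilding the stale pages immediately, the list could be handed to a pull cache as entries to invalidate.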

ealexhudson about 5 years ago

To be honest, if you dynamically generate a site and include the relevant headers, a web cache in front of the site will do all the "compilation" without any of the headache of a specific ahead-of-time process.

If you get a really large site (large in terms of page count), it's often easier to invalidate an entire cache and allow it to rebuild lazily than to go through and linearly reprocess every page, and you don't get pages showing a mix of "old" and "new" content as you browse the site while republishing is taking place (although you can also accomplish this with content-switching once the full rebuild is done).
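
A minimal sketch of the "invalidate everything, rebuild lazily" idea, assuming an in-process cache rather than a real web cache in front of the site. The `LazyPageCache` class and its generation counter are illustrative inventions, not the API of any particular cache product.

```python
class LazyPageCache:
    """Pull-style cache: pages are rendered on first request, then reused."""

    def __init__(self, render):
        self._render = render      # callable: path -> html
        self._generation = 0       # bumped on every full invalidation
        self._entries = {}         # path -> (generation, html)
        self.render_count = 0      # how many times we actually rendered

    def get(self, path):
        entry = self._entries.get(path)
        if entry is not None and entry[0] == self._generation:
            return entry[1]        # still valid for the current generation
        html = self._render(path)
        self.render_count += 1
        self._entries[path] = (self._generation, html)
        return html

    def invalidate_all(self):
        # O(1): old entries are not deleted, they simply stop matching.
        self._generation += 1


content = {"title": "old"}
cache = LazyPageCache(lambda path: f"<h1>{content['title']}</h1><p>{path}</p>")

first = cache.get("/a")
again = cache.get("/a")            # served from the cache, no second render

content["title"] = "new"           # the site is "republished"
cache.invalidate_all()
rebuilt = cache.get("/a")          # re-rendered on demand with the new content
```

Pages nobody requests after the invalidation are never re-rendered, which is the saving over linearly reprocessing every page.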

siscia about 5 years ago

I am quite deep into this idea of static websites. I have a little experience building one (now defunct), and I provide tools targeted at static website developers (https://simplesql.redbeardlab.com).

The experience of creating one with Python and Netlify was superb: you just push out all the pages that you need and you call it done. Simple, fast, cheap, nothing to complain about.

However, I need to nitpick the author: what he calls compile time is "compile time" only in the world of frontend development (and in most cases it is not a compilation step). Moreover, Gatsby also calls the step "build" and not compile (and for good reason), so I don't see the reason to introduce another source of confusion. Indeed, I was running a Python script that creates the correct webpages at runtime.

SimpleSQL provides an API to interact directly with databases from JS or by making API calls. I believe that works amazingly for static websites, provided that people want to write SQL.

One problem that I see is how to do authentication, and I am working on fixing it. The idea is different credentials: one only for logging in, and if you are able to log in, you receive a credential for accessing your own data. The login credential can be stored in a cookie or wherever makes sense for the application.

arkanciscan about 5 years ago

If your definition of "static site" is simply that the server doesn't generate any HTML at request-time, then any SPA is "static" (just one empty index.html and a webpacked bundle.js generated at compile-time, and an API call does the rest). But if that's the case the static future started about 7-8 years ago!

Personally I think that if your client script is making AJAX requests that patch the content of the document after the initial HTML is loaded, and there's no other way to see that content, then it's not a static site. In other words, a static site to me is simply one that works with JavaScript turned off.

The author seems to be talking about JamStack, wherein every change to the content of the site constitutes a recompile. I have serious reservations about using that technique on a site with many users making changes, like a social network or a comment section.

Also, since you're all probably gonna try to fight me: there's nothing wrong with SPAs, and Gatsby and Next.js both do SSR, which ticks my "works without JS" box.

kasey_junk about 5 years ago

The static movement needs to figure out a clever way to do Authn/Authz before it will really take off.

If I have to hit a db for that on every page, I lose a lot of the value and might as well not bother.

baybal2 about 5 years ago

The biggest thing about that is CDNs.

The biggest growing pain for sites passing the 1 million per hour mark is that the application side does not scale well.

The moment you go for a multi-DC setup, all your code from then on has to be written with that in mind. Very often that is overkill just to show some pictures, user profiles, and articles.

Third point: having one or more application servers go down becomes nowhere near as critical as when your site was served from a single nodejs instance, which you have to debug under load to see why it crashes.

xg15 about 5 years ago

Seems to me, you could achieve similar benefits by generating your HTML server-side on request (the classic pre-ajax way to do things) but also allowing your generated pages to be cached.

I guess the challenge would then be cache management, as you'd have to return Last-Modified or ETag headers for dynamically generated pages, meaning you'd have to hit the database to populate those headers.

However, seems to me, this could still be less work than statically generating everything up-front.
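
A small sketch of the ETag side of this, assuming each record carries a cheap-to-read version number so that the validator can be computed without rendering the page. The `make_etag` and `respond` helpers and the `post-42` identifier are hypothetical, and the `If-None-Match` handling is simplified to a single exact comparison (real headers can carry lists and weak validators).

```python
from typing import Callable, Dict, Optional, Tuple


def make_etag(resource_id: str, version: int) -> str:
    """Build a quoted validator from an id and a version (an assumed scheme)."""
    return f'"{resource_id}-v{version}"'


def respond(
    resource_id: str,
    version: int,
    if_none_match: Optional[str],
    render: Callable[[], str],
) -> Tuple[int, Dict[str, str], str]:
    """Return (status, headers, body); render() runs only on a cache miss."""
    etag = make_etag(resource_id, version)
    # "no-cache" lets caches store the page but forces revalidation before reuse.
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match == etag:
        return 304, headers, ""    # client copy is current, skip rendering
    return 200, headers, render()


render_calls = []


def render_post() -> str:
    render_calls.append(1)
    return "<html>post 42</html>"


# First visit: no validator yet, so the page is rendered and sent in full.
status_first, headers_first, body_first = respond("post-42", 3, None, render_post)

# Revisit with the stored ETag: only the version lookup is needed.
status_again, _, body_again = respond("post-42", 3, headers_first["ETag"], render_post)

# After an edit bumps the version, the old validator no longer matches.
status_edited, _, _ = respond("post-42", 4, headers_first["ETag"], render_post)
```

The version lookup still hits the database, as the comment says, but it replaces the full page render on every revalidated request.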

jacobr about 5 years ago

> If you change the header component, and that header component is shown on every page in the site, you will have to give it some time.

To me this invalidates SSG for any busy site with a somewhat large menu. Any time you change the order of some options or rename an item, everything needs to be rebuilt.

If you combine it with something like Edge Side Includes it could work I guess, only rebuilding the part of the page that’s needed.
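
A toy illustration of the Edge Side Includes idea: pages are stored with an `<esi:include src="..."/>` placeholder (the tag comes from the ESI language) and the shared header is spliced in when the page is served. The `assemble` function is a stand-in for what an ESI-capable edge cache would do, and the paths and markup are made up for the example.

```python
import re

# Matches self-closing include tags such as <esi:include src="/fragments/header"/>.
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


def assemble(page_html, fragments):
    """Replace each include tag with the current content of its fragment."""
    return ESI_INCLUDE.sub(lambda match: fragments[match.group(1)], page_html)


# Statically built pages: generated once, never touched by a menu change.
pages = {
    "/": '<esi:include src="/fragments/header"/><main>Home</main>',
    "/about": '<esi:include src="/fragments/header"/><main>About</main>',
}

# The header is built and stored separately from the pages.
fragments = {"/fragments/header": "<nav>Home | About</nav>"}

before = assemble(pages["/about"], fragments)

# Reordering the menu rebuilds one fragment; the stored pages are unchanged.
fragments["/fragments/header"] = "<nav>About | Home</nav>"
after = assemble(pages["/about"], fragments)
```

The trade-off is that the site is no longer purely static files: something at the edge has to perform the assembly on each request or cache miss.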

fxtentacle about 5 years ago

I remember some years ago, generating your data files directly into a web-enabled S3 bucket was all the rage. Basically, that's what is now being called "incremental builds".

It appears that the sad story of web development frameworks is that a new team will reinvent the wheel every year so that nothing will ever be fashionable and production-grade at the same time.

I have some internal C++ servers running for 10+ years now. I wouldn't be able to imagine that with Rails. When we set up our current website, AngularJS 1.4 was new and considered very fashionable. By now, everyone treats the entire AngularJS project as archaic.

I predict that in a year, there'll be a new hip web framework and its acolytes will then have their own eureka moment with static files.

Arkdy about 5 years ago

It's in Clojure, not Gatsby, but I've been working on statically serving citations to covid-19 data (https://devpost.com/software/coronavirus-charts-org)

thelastbender12 about 5 years ago

Building a component one time and using a CDN as a cache seems really efficient. The article also mentions plausibly extending this to support an application like Spectrum. Really curious how that would work, since Spectrum has real-time, chat-like features.