The rich snippet inspection tool will give you an idea of how Googlebot renders JS.

Although Google will happily crawl and render JS-heavy content, I strongly suspect bloat negatively impacts the "crawl budget", though in 2024 this part of the metric probably matters much less than overall request latency. If Googlebot can process several orders of magnitude more sanely built pages with the same memory requirement as a single React page, it isn't unreasonable to assume it would economize.

Another consideration is that, "properly" used, a JS-heavy page would most likely be an application of some kind living on a single URL, whereas purely informative pages, such as blog articles or tables of data, would exist across a larger number of URLs. Of course there are always exceptions.

Overall, bloated pages are a bad practice. If you can produce your content as classic "prerendered" HTML and use JS only for interactive content, both bots and users will appreciate it.

HN has already debated the merits of React and other frameworks. Let's not rehash this classic.
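To make the earlier point about prerendered HTML concrete, here's a minimal progressive-enhancement sketch (element IDs are made up for the example): the article text ships in the server-rendered HTML, and the script only wires up the interactive part.

    // The content is already in the static HTML; this script only adds behavior.
    document.addEventListener("DOMContentLoaded", () => {
      const toggle = document.getElementById("comments-toggle");
      const comments = document.getElementById("comments-section");
      if (!toggle || !comments) return; // the page still reads fine without JS
      toggle.addEventListener("click", () => comments.toggleAttribute("hidden"));
    });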
I work for a company that enables businesses to drop eCommerce into their websites. When I started, this was done via a script that embedded an iframe. This wasn't great for SEO, and some competitors started popping up with SEO-optimized products.

Since our core technology is a React app, I realized that we could just mount the React app directly on any path at the customer's domain. I won't get into the exact implementation, but it worked, and our customers' product pages started being indexed just fine. We even ranked competitively with the upstarts who used server-side rendering. We had a prototype in a few months, and a few months after that we had the version that scaled to hundreds of customers.

We then decided to build a new version of our product on Remix (an SSR framework similar to Next.js). It required us to basically start over from scratch, since most of our technologies weren't compatible with Remix. Two years later, we still aren't quite done. When all is said and done, I'm really curious to see how this new product SEOs compared to the existing one.
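On the "mount the React app directly on a path" approach: the poster explicitly doesn't share their implementation, but a hedged sketch of the general pattern might look like this (the component, path, and prop names are made up for illustration).

    // Loader script the customer adds to their site. If the visitor (or
    // Googlebot) is on the store path, mount the React storefront directly
    // into the customer's own DOM at the customer's own URL -- no iframe.
    import React from "react";
    import { createRoot } from "react-dom/client";
    import StorefrontApp from "./StorefrontApp"; // hypothetical component

    if (window.location.pathname.startsWith("/shop")) {
      const mountPoint = document.createElement("div");
      document.body.appendChild(mountPoint);
      createRoot(mountPoint).render(<StorefrontApp basePath="/shop" />);
    }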
I actually worked on this part of the Google Search infrastructure a long time ago. It's just JSC with a bunch of customizations and heuristics tuned for performance to run at a gigantic scale. There's a lot of heuristics to penalize bad sites, and I spent a ton of time debugging engine crashes on ridiculous sites.
I really think it would be cool if Google started being more open about their SEO policies. Projects like this use 100,000 sites to try to discover what Google does, when Google could just come right out and say it, and it would save everyone a lot of time and energy.

The same outcome is gonna happen either way: either Google says what its policy is, or people spend time and bandwidth figuring it out. Either way, Google's policy becomes public.

Google could even publish guidance about how to have good SEO and end all those scammy SEO help sites. Even better, they could actively promote good things like less JS when possible and fewer ads and less junk. It would help their brand image and make things better for end users. Win-win.
I did experiments like this in 2018 when I worked at Zillow. This tracks with our findings then, with a big caveat: it gets weird at scale. If you have a very large number of pages (hundreds of thousands or millions), Google doesn't just give you limitless crawling and indexing. We had JS content waiting days after being crawled to make it into the index.

Also, competition. In a highly competitive SEO environment like US real estate, we were constantly competing with three or four other well-funded and motivated companies. A couple of times we tried going dynamic-first with a page and lost rankings. Maybe it's because FCP was later? I don't know. So we ripped it all out and did it server side. We did use Next.js when rebuilding Trulia, but it's self-hosted and only uses SSR.
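For context on "only uses SSR": in a pages-router Next.js app that roughly means every page opts into per-request server rendering, something like the hypothetical page below (the route and data fetch are made up, not Trulia's code).

    // pages/listing/[id].js -- hypothetical page for illustration only.
    // getServerSideProps forces Next.js to render the HTML on every request,
    // so crawlers get full markup without relying on client-side rendering.

    // Hypothetical data fetch, stubbed so the sketch is self-contained.
    async function fetchListing(id) {
      const res = await fetch(`https://api.example.com/listings/${id}`);
      return res.json();
    }

    export async function getServerSideProps({ params }) {
      const listing = await fetchListing(params.id);
      return { props: { listing } };
    }

    export default function ListingPage({ listing }) {
      return <h1>{listing.address}</h1>;
    }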
I actually think intentionally downranking sites that require JavaScript to render static content is not a bad idea. Requiring JS also impedes accessibility-related plugins trying to extract the content and present it to the user in whatever way is compatible with their needs.

Please only use JavaScript for dynamic stuff.
Strange article: it seems to imply that Google has no problem indexing JS-rendered pages, and then the final conclusion is "Client-Side Rendering (CSR), support: Poor / Problematic / Slow".
A really great article. However, they tested on nextjs.org only, so it's still possible Google doesn't waste rendering resources on smaller domains.
Would be interested to know how well Google copes with web components, especially those using Shadow DOM to encapsulate styles. Anyone have an insight there?
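For anyone wanting to test this themselves, the question boils down to whether text like the snippet below, which only exists inside a shadow root, shows up in the rendered HTML that the URL Inspection tool reports (the element is made up for illustration).

    // Minimal custom element with an open shadow root encapsulating its styles.
    class ProductCard extends HTMLElement {
      connectedCallback() {
        const shadow = this.attachShadow({ mode: "open" });
        shadow.innerHTML = `
          <style>p { font-weight: bold; }</style>
          <p>This text exists only inside the shadow DOM.</p>
        `;
      }
    }
    customElements.define("product-card", ProductCard);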
If Google handles front-end JS so well, and the world is basically a customer-centric SEO game to make money, why do we even bother using server components in Next.js?
They tested Google's ability to index and render JS, but not how well those sites ranked. As an SEO, I know those results would look completely different. When you're creating content to monetize, the thought process with JS is "why risk it?"
It's not a coincidence Google developed Chrome. They needed to understand what the fuck they were looking at, so they were developing a JS + DOM parser anyways.
Here's the Google Search team talking about this in a podcast: https://search-off-the-record.libsyn.com/rendering-javascript-for-google-search
Headless Blink is like the new bots: headless Blink with mouse and keyboard driven by an AI (I don't think Google has click farms with real humans like hackers do).
I have noticed that more and more blog posts from yesteryear are not appearing in Google's search results lately. Is there an imbalance between content ratings and website ratings?
This article is great self-promotion, but everyone knows Googlebot is busy; give it content generated immediately on the server or don't bother Googlebot at all.