We built our site, <a href="https://appapp.io" rel="nofollow">https://appapp.io</a> (a search engine for the App Store), as a single-page app. It serves no dynamic content in HTML from the server, so we were unsure to what extent Google would spider/index it.<p>As far as we can tell, it makes no difference compared to generating the content server side: <a href="https://www.google.com/search?q=site%3Aappapp.io" rel="nofollow">https://www.google.com/search?q=site%3Aappapp.io</a><p>So yes, Google definitely does index dynamic content. I would love to know if it ranks it equivalently.<p>Also, Bing does not: <a href="http://www.bing.com/search?q=site%3aappapp.io" rel="nofollow">http://www.bing.com/search?q=site%3aappapp.io</a><p>(apologies for the minor self-promotion)
I suppose this explains all the times I've seen a promising search result with the words I was searching for prominently highlighted, then visited the page to find that what I was looking for is no longer there. Sometimes the cached, text-only version has it, and sometimes not. Alternatively, I'll see search results with <i>none</i> of the words I was searching for, though perhaps the pages contained them sometime in the past. Rather annoying.
I have modified Wikipedia pages, then googled them, and seen the search results updated "instantly".<p>Also, sneaky web sites often give different results to the Googlebot user agent than to a non-Google Firefox user agent<p><a href="https://en.wikipedia.org/wiki/User_agent" rel="nofollow">https://en.wikipedia.org/wiki/User_agent</a><p><a href="https://addons.mozilla.org/en-GB/firefox/search/?q=user+agent&cat=all" rel="nofollow">https://addons.mozilla.org/en-GB/firefox/search/?q=user+agen...</a>
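A rough way to check the user-agent point above for a given site is to fetch the same URL twice with different User-Agent headers and compare the bodies. This is only a sketch: the two User-Agent strings, the 0.9 threshold, and the helper names are illustrative assumptions, and a site that verifies Googlebot by IP address rather than by header would not be caught this way.

```python
import difflib
import urllib.request

# Illustrative User-Agent strings (assumptions); real crawlers and browsers vary.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:40.0) Gecko/20100101 Firefox/40.0"


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build a GET request that announces the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url: str, user_agent: str, timeout: float = 10.0) -> str:
    """Download the page body as text (performs a real network request)."""
    with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means the two bodies are identical."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def looks_cloaked(bot_html: str, browser_html: str, threshold: float = 0.9) -> bool:
    """Flag a page whose bot and browser versions differ by more than the threshold allows."""
    return similarity(bot_html, browser_html) < threshold
```

Usage would be something like `looks_cloaked(fetch(url, GOOGLEBOT_UA), fetch(url, BROWSER_UA))`. Pages with rotating ads or timestamps will differ a little between any two requests, so the threshold needs tuning per site.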
I'd <i>really</i> love it if you repeated the same tests for Bing, just to get coverage. (Yahoo/Baidu would be the other big two.) Historically, Bing hasn't used fully functional headless browsers to crawl, which has limited its ability to index dynamic content like this.<p>Google has "only" 70% market share, so it seems irresponsible to make engineering decisions without testing the others. Google+Bing+Yahoo+Baidu get you to 98%.
The post author writes:<p>> So, very soon, the days of pre-rendering PhantomJs snapshots and serving shadow content to spiders will be over.<p>To be clear: webmasters of sites with dynamic content should not celebrate yet. There are still influential spiders other than Google's that do not parse JavaScript (for example, Facebook[1] and Twitter[2]).<p>[1] <a href="https://developers.facebook.com/docs/sharing/webmasters/crawler" rel="nofollow">https://developers.facebook.com/docs/sharing/webmasters/craw...</a><p>[2] Can't find an official statement on this, but <a href="https://twittercommunity.com/search?q=javascript%20crawl" rel="nofollow">https://twittercommunity.com/search?q=javascript%20crawl</a>
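For the crawlers mentioned above that do not run JavaScript, the usual workaround is to keep serving a pre-rendered snapshot based on the User-Agent header. A minimal sketch of that decision, assuming the substrings `facebookexternalhit` and `twitterbot` identify those crawlers and that the snapshot HTML has already been produced elsewhere (the function names here are hypothetical):

```python
# Substrings assumed to identify crawlers that do not execute JavaScript.
# Extend this list for whatever other bots matter to your site.
NON_JS_CRAWLER_TOKENS = ("facebookexternalhit", "twitterbot")


def wants_snapshot(user_agent: str) -> bool:
    """Return True if the User-Agent looks like a crawler that needs static HTML."""
    ua = user_agent.lower()
    return any(token in ua for token in NON_JS_CRAWLER_TOKENS)


def choose_response(user_agent: str, snapshot_html: str, app_shell_html: str) -> str:
    """Serve the pre-rendered snapshot to non-JS crawlers, the JS app shell to everyone else."""
    return snapshot_html if wants_snapshot(user_agent) else app_shell_html
```

The snapshot should contain the same content a regular visitor ends up seeing, otherwise this turns into the cloaking discussed elsewhere in the thread.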
I'm curious how strongly Google penalizes SPAs for being slow to load.<p>The content may be indexed, but if your visitors are on a mobile network, that initial visit (or a visit with stale cache) is going to be crappy. It's great that they can read in the content (though Bing cannot), but if it's buried on page two, does it even matter?<p>As someone who is a proponent of web perf, these kinds of articles make me worried that server side rendering will be ignored because "SEO works now for Javascript", even if it's slow and Google is only 70% of desktop & 80% of mobile search.
I wonder if you could use this to find information about the Google crawler. Inject system and browser info into the page. Then you can find out what kind of browser engine it runs, with which settings, etc. If you wanted, you could use this information to do undetectable masking (I don't think it would work in the long run, though)<p>It would also be interesting to see what timeouts it still allows. I wouldn't be surprised if the modified browser "virtualizes" time and runs window.setTimeout immediately. Maybe you could make a busy loop and find out what the real timeouts are. It seems there have to be some, otherwise this would open a way to DoS the crawler (not that I'd do that).
Google may be indexing dynamic content now, but the question I'm curious about is how it affects crawl efficiency. I can't imagine indexing JS content is as efficient as indexing content returned from the original HTTP request.
Related comment from an HNer who worked on this at Google (from 2006 to 2010): <a href="https://news.ycombinator.com/item?id=9531344" rel="nofollow">https://news.ycombinator.com/item?id=9531344</a>
Regarding SPA-based websites, as long as your site has only a few pages, these results are relevant. I would like to see the same kind of test on a site with 1000+ pages, for example. I already did this kind of test in the past and it failed miserably (i.e. only a dozen pages were correctly indexed).
The next test could be: Does google crawl hidden text (display:none, very small, very transparent colored text)?
My guess is they do crawl it because it can have legitimate uses, but if there is too much of it on a page then they give it a lower ranking.
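One way to set up the hidden-text test proposed above is to publish a page where each hiding technique wraps its own unique nonsense word, then search for each word later to see which ones got indexed. A sketch under assumptions: the CSS values, the `zq` prefix, and the function names are made up for illustration, and this says nothing about how ranking is affected.

```python
import uuid
from typing import Dict, Tuple

# Hiding techniques named in the comment above; the exact CSS values are assumptions.
HIDDEN_STYLES = {
    "displaynone": "display:none",
    "tinyfont": "font-size:1px",
    "transparent": "opacity:0.01",
}


def make_token(label: str) -> str:
    """Return a made-up word that should not already exist in any index."""
    return f"zq{label}{uuid.uuid4().hex[:10]}"


def build_test_page() -> Tuple[str, Dict[str, str]]:
    """Return (html, tokens), where tokens maps each technique to its unique word."""
    # One control token in plain visible text, plus one per hiding technique.
    tokens = {"visible": make_token("visible")}
    paragraphs = [f"<p>{tokens['visible']}</p>"]
    for label, style in HIDDEN_STYLES.items():
        tokens[label] = make_token(label)
        paragraphs.append(f'<p style="{style}">{tokens[label]}</p>')
    body = "\n".join(paragraphs)
    html = (
        "<!DOCTYPE html>\n<html><head><title>Hidden text test</title></head>\n"
        f"<body>\n{body}\n</body></html>"
    )
    return html, tokens
```

If the `visible` control word shows up in search results but a hidden one does not, that technique is probably being dropped; if none show up, the page likely has not been crawled yet.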
This article is from May.<p><a href="http://searchengineland.com/tested-googlebot-crawls-javascript-heres-learned-220157" rel="nofollow">http://searchengineland.com/tested-googlebot-crawls-javascri...</a>
I am sure that Google "discovers" JavaScript/AJAX content. They also mention this in their guides several times.<p>But are there any experiments/results related to SEO impact/crawl frequency etc?
Offtopic: "Google search results on tablet"<p>Recently Google changed their search result page for tablets. At first it looked fine and useful.<p>But many times the first result page is now completely full of advertisements, and only the second page shows the usual links to websites like Github, Wikipedia, Youtube, etc. for a common search term. Very annoying! And the Youtube link is broken on iPad (it tries to link to a non-HTTP address). Am I just unlucky to be part of an A/B test?<p>A news article about the changes: <a href="http://searchengineland.com/google-launches-new-search-results-interface-for-tablets-235340" rel="nofollow">http://searchengineland.com/google-launches-new-search-resul...</a>