I feel this site shows why websites need more than 512KB these days... Most of these are just unaesthetic, simple personal sites with a few blog posts. Nothing wrong with that, but nothing to glorify either.
I agree with the general sentiment about size, but page size isn't everything; what matters more is what that size contributes. If a multi-megabyte page consists of many high-resolution images, that clearly gives far more value to the reader than if the same amount of data were spent on tons of JS, next to none of which actually gets executed. Likewise, the <i>huge</i> amount of video data on a YouTube page is of definite value to the viewer (perhaps excepting the ads).
The Guardian is 4MB because the front page has a dozen+ hi-res images. Turn them off if you want to optimize for size. Not every site has to be a static wall of text.<p>And why is 512KB acceptable but 4MB a problem? Yes, the web is a bloated mess, but there's no point drawing arbitrary lines on a metric that says little by itself. Just focus on providing a good user experience. A few bytes of bad JavaScript can ruin your site's performance, while several well-optimized MBs can make it accessible and useful.
It would be better to evaluate the sites based on the size of non-image resources. E.g., load the site but discard all image files when computing the size, so only text, CSS, JS, etc. are counted. Then report the images separately. That would allow sites with useful images, like a photo gallery, to still be listed on this page, while still endorsing well-designed sites that load the text and CSS instantly so that the reader doesn't need to wait for images to load.
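A rough sketch of how that split could be computed, assuming you already have a list of the resources a page loaded along with their MIME types and transferred sizes (e.g. exported from browser dev tools). The `Resource` type, the function names, and the sample numbers are made up for illustration; the 512KB limit is the one discussed in this thread, and nothing here reflects how the site actually measures pages.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class Resource:
    """One fetched resource of a page (hypothetical structure)."""
    url: str
    content_type: str  # MIME type, e.g. "text/css" or "image/jpeg"
    size_bytes: int    # transferred size in bytes


LIMIT_BYTES = 512 * 1024  # the 512KB threshold, applied to non-image data only


def split_page_weight(resources: Iterable[Resource]) -> Tuple[int, int]:
    """Return (non_image_bytes, image_bytes) for the given resources."""
    non_image = 0
    image = 0
    for r in resources:
        # Anything served with an image/* MIME type counts as an image.
        if r.content_type.strip().lower().startswith("image/"):
            image += r.size_bytes
        else:
            non_image += r.size_bytes
    return non_image, image


def qualifies(resources: Iterable[Resource], limit: int = LIMIT_BYTES) -> bool:
    """A page qualifies if its text, CSS, JS, etc. fit within the limit."""
    non_image, _ = split_page_weight(resources)
    return non_image <= limit


# Made-up example: a photo gallery with a light shell and heavy images.
gallery = [
    Resource("https://example.com/", "text/html", 30_000),
    Resource("https://example.com/site.css", "text/css", 20_000),
    Resource("https://example.com/app.js", "application/javascript", 100_000),
    Resource("https://example.com/photo1.jpg", "image/jpeg", 3_000_000),
]

non_image, image = split_page_weight(gallery)
print(non_image, image)    # 150000 3000000
print(qualifies(gallery))  # True
```

With this split, the gallery above would still be listed despite its 3MB of photos, while a page that ships more than 512KB of scripts would not. Whether fonts or video should also get their own bucket is a separate question this sketch doesn't try to answer.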
I think there also needs to be another metric, like Alexa rank (or something more reliable), that shows the traffic/popularity of the website with respect to its size.<p>That would be more interesting to me.