I had a similar idea recently. Originally I intended to identify "fashionable" color palettes, which worked pretty well:
<a href="https://idle.nprescott.com/2016/scraping-color-palettes.html" rel="nofollow">https://idle.nprescott.com/2016/scraping-color-palettes.html</a><p>Only later did I try branching out to the 100 most popular sites:
<a href="https://idle.nprescott.com/2016/image-capture-crawler.html" rel="nofollow">https://idle.nprescott.com/2016/image-capture-crawler.html</a><p>I tried ImageMagick's histogram functionality and k-means clustering but wasn't happy with the results. I never finished a meaningful color extraction from the most popular sites because it was hard to get what I'd consider representative samples from more complex images (like the linked GitHub homepage screenshot). I still intend to circle back at some point and try color quantization.<p>I hadn't thought to scrape the CSS colors directly; that's an interesting approach. I'd like to see the colors sorted in some way, to get a better sense of the range for each site.
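<p>For the sorting, here is a rough sketch of what I have in mind, assuming the scraped colors arrive as hex strings like "#rrggbb" or "#rgb". The function names and the 0.1 saturation cutoff for "grey" are my own arbitrary choices, not anything from the original project. It uses only Python's standard colorsys module: greys go first ordered by lightness, then everything else ordered by hue.

```python
import colorsys


def hex_to_rgb(value):
    """Convert '#rrggbb' or '#rgb' into an (r, g, b) tuple of floats in [0, 1]."""
    value = value.lstrip("#")
    if len(value) == 3:
        # Expand CSS shorthand, e.g. 'f0a' -> 'ff00aa'
        value = "".join(ch * 2 for ch in value)
    return tuple(int(value[i:i + 2], 16) / 255 for i in (0, 2, 4))


def sort_key(value):
    """Sort key: greys first (by lightness), then chromatic colors by hue."""
    h, l, s = colorsys.rgb_to_hls(*hex_to_rgb(value))
    # Assumption: treat very low saturation as grey, since its hue is meaningless.
    is_chromatic = s > 0.1
    return (is_chromatic, h if is_chromatic else 0.0, l)


def sort_colors(colors):
    """Return the hex colors ordered for display as a palette strip."""
    return sorted(colors, key=sort_key)


if __name__ == "__main__":
    palette = ["#0000ff", "#ffffff", "#ff0000", "#000000", "#00ff00"]
    print(sort_colors(palette))
```

Sorting by hue alone tends to look noisy once a site mixes many light and dark tints, so a step-wise or perceptual ordering might work better; this is just the simplest thing that would show each site's range at a glance.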