I'd love to see more transparency in how CDNs decide whether to cache your content. For example, Cloudflare publishes crawl frequencies in its pricing table, but what does it actually do with that content? Push it to all of its edges? I doubt that. I guess it depends on website traffic, your pricing plan, ... but it seems quite arbitrary to me.
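For dynamic HTTP content, the origin can send a Cache-Control response header that tells the CDN whether to cache the response and for how long.<p>For static content, the CDN will try to keep it cached for as long as it can, typically until the TTL expires or the object gets evicted.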
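To make the Cache-Control part concrete, here is a rough sketch of how a shared cache such as a CDN edge might read that header. It is not any particular CDN's logic: the function names `parse_cache_control` and `shared_cache_ttl` are made up for illustration, and the rules are a simplified reading of the standard directives (`no-store`, `private`, `no-cache`, `s-maxage`, `max-age`). Real CDNs layer their own defaults, plan limits and eviction on top of this.

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, str]:
    """Split a Cache-Control header into {directive: value} (hypothetical helper).

    Simplification: splits on commas, so quoted values that themselves
    contain commas are not handled.
    """
    directives: Dict[str, str] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"')
    return directives


def shared_cache_ttl(header: str) -> Optional[int]:
    """Seconds a shared cache may serve the response without revalidating.

    Returns 0 when the response must not be stored or must be revalidated
    on every use, and None when the header gives no explicit lifetime
    (the cache would then fall back to its own defaults).
    """
    d = parse_cache_control(header)
    # Shared caches must not store these responses at all.
    if "no-store" in d or "private" in d:
        return 0
    # Storable, but must be revalidated with the origin before each reuse.
    if "no-cache" in d:
        return 0
    # For shared caches, s-maxage takes precedence over max-age.
    for name in ("s-maxage", "max-age"):
        if name in d:
            try:
                return max(0, int(d[name]))
            except ValueError:
                return 0  # malformed value: treat as already stale
    return None


if __name__ == "__main__":
    for h in ("public, max-age=60, s-maxage=3600", "private, max-age=600", ""):
        print(repr(h), "->", shared_cache_ttl(h))
```

The `None` case is where the arbitrariness comes in: when the origin says nothing explicit, what happens next is up to the CDN.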