<p><pre><code> On an average, this kind of metadata occupies 16% of size
of the JPEG file.
</code></pre>
Ho ho. You think that's bad? Back in 2011, Tumblr didn't strip metadata from <i>avatar</i> images. That resulted in some funny files, like this one: <a href="http://28.media.tumblr.com/avatar_c5ee131b70d0_40.png" rel="nofollow">http://28.media.tumblr.com/avatar_c5ee131b70d0_40.png</a><p>That PNG has a 3325-byte IDAT chunk and a 106022-byte iCCP chunk. The metadata is roughly 32 times the size of the image data itself.<p>Personally, I think websites <i>should</i> strip metadata from thumbnails and resized images, but should <i>also</i> let you download the original, unmodified image, complete with its original filename. Why?<p>Instagram and others always recompress and strip metadata when you submit an image. This results in shitpics: images so mangled by repeated recompression that they look like visual gravel: <a href="https://theawl.com/the-triumphant-rise-of-the-shitpic-e25d8e5af9bc#.bkxh5tln3" rel="nofollow">https://theawl.com/the-triumphant-rise-of-the-shitpic-e25d8e...</a> This is a complete own goal; there's no technical reason this has to happen. Digital files aren't supposed to decay!<p>And, of course, stripping authorship tags would make the dream of automated attribution impossible: <a href="https://eev.ee/blog/2016/08/15/attribution-on-the-web/" rel="nofollow">https://eev.ee/blog/2016/08/15/attribution-on-the-web/</a>
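<p>For anyone who wants to check chunk sizes like the ones quoted for that avatar, here is a minimal sketch that walks a PNG's chunk list using only the Python standard library. The helper names (<code>iter_chunks</code>, <code>chunk_sizes</code>, <code>make_chunk</code>) are made up for illustration, and the demo file is synthetic, not the Tumblr avatar.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_chunks(data):
    """Yield (chunk_type, payload_length) for each chunk in a PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos + 8 <= len(data):
        # Each chunk is: 4-byte big-endian length, 4-byte type, payload, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        yield ctype.decode("ascii"), length
        pos += 12 + length


def chunk_sizes(data):
    """Return total payload bytes per chunk type (IDAT may occur several times)."""
    sizes = {}
    for ctype, length in iter_chunks(data):
        sizes[ctype] = sizes.get(ctype, 0) + length
    return sizes


def make_chunk(ctype, payload):
    """Build one PNG chunk with a correct CRC (hypothetical helper for the demo)."""
    crc = zlib.crc32(ctype + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + ctype + payload + struct.pack(">I", crc)


if __name__ == "__main__":
    # Synthetic chunk stream: a small IDAT next to a much larger iCCP profile.
    # The payloads are placeholder bytes, so this is not a decodable image.
    demo = (
        PNG_SIGNATURE
        + make_chunk(b"IHDR", b"\x00" * 13)
        + make_chunk(b"iCCP", b"\x00" * 320)
        + make_chunk(b"IDAT", b"\x00" * 10)
        + make_chunk(b"IEND", b"")
    )
    sizes = chunk_sizes(demo)
    print(sizes)
    print("iCCP / IDAT ratio:", sizes["iCCP"] / sizes["IDAT"])
```

To inspect a real file, pass the bytes from <code>open(path, "rb").read()</code> to <code>chunk_sizes</code> and compare the iCCP total against the IDAT total.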
From my experience hosting a bunch of user-provided images:<p>1. Strip all metadata, but provide downloads of the originals somewhere.<p>2. Keep it simple: just use ImageMagick's convert to remove profiles (but don't use ImageMagick for file type detection).<p>3. If the image has EXIF orientation tags, rotate the image to the right orientation (-auto-orient) before removing the EXIF profile.<p>4. Don't remove the color profile data, or else convert to sRGB first.
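<p>The steps above could be sketched roughly as follows. This is an illustration under assumptions, not the commenter's actual setup: the function names are hypothetical, it assumes an ImageMagick install where the binary is called <code>convert</code> (ImageMagick 7 prefers <code>magick</code>), and the <code>+profile "!icc,*"</code> form follows the pattern in ImageMagick's documentation for removing every profile except a named one. Note that <code>-strip</code> also drops the ICC color profile, which is why point 4 needs the other branch.

```python
import subprocess


def build_convert_command(src, dst, keep_icc=False):
    """Build an ImageMagick argument list that orients first, then strips metadata.

    Assumption: the ImageMagick binary is available as "convert".
    """
    # Apply the EXIF orientation to the pixels *before* the EXIF data goes away.
    cmd = ["convert", src, "-auto-orient"]
    if keep_icc:
        # Remove every profile except the ICC color profile (EXIF, XMP, ... go).
        cmd += ["+profile", "!icc,*"]
    else:
        # Remove all profiles and comments, including the ICC color profile.
        cmd.append("-strip")
    cmd.append(dst)
    return cmd


def strip_metadata(src, dst, keep_icc=False):
    """Run the command; raises CalledProcessError if ImageMagick fails."""
    subprocess.run(build_convert_command(src, dst, keep_icc), check=True)


if __name__ == "__main__":
    # Only prints the commands, so no files or ImageMagick install are needed.
    print(build_convert_command("upload.jpg", "thumb.jpg"))
    print(build_convert_command("upload.jpg", "thumb.jpg", keep_icc=True))
```

Passing the arguments as a list rather than a shell string means user-supplied filenames are never interpreted by a shell. The original upload is left untouched, so it can still be offered for download as in point 1.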
ImageOptim is a handy little tool for stripping all the metadata: <a href="https://imageoptim.com/mac" rel="nofollow">https://imageoptim.com/mac</a>
There's some use to this metadata: for example, GPS coordinates to locate where the photo was taken, author info, camera parameters, etc. It might not be needed all the time, but it probably shouldn't be stripped all the time either.