The bottom line is this: if users can upload something to your site, and your site will show it to other users before you have a chance to figure out whether it's a problem, then your site will be exploited by bad actors.<p>For a long time, an out-of-the-box server installation would include anonymous FTP access. Of course, nothing is quite so attractive as a 'free' place to dump and retrieve stuff. It was kind of like setting up a warez/malware camera trap.
Should clarify: "The pitfalls of hosting user-uploaded files on your website"<p>Hosting user-uploaded files on a separate domain would probably solve this problem.
Hold on, doesn't using a<p><pre><code> Content-Disposition: attachment; filename="image.jpg"
</code></pre>
header mean you can no longer display the image in your service? Won't browsers treat it as a file download? Most services that allow image uploads do so because the images will be displayed on a page (that's what I do).<p>Most services seem to be moving file uploads to S3 (or similar services) these days, so I'm not sure this advice is really helpful. To take that a step further, my preference now is to upload <i>directly</i> to S3 and bypass my app server altogether. At least in Rails, it's fairly easy to set up.
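One possible compromise for the inline-versus-download question, sketched in Python under stated assumptions: serve a small allowlist of image types inline, and force everything else to download as an opaque binary. The function name `download_headers` and the allowlist `INLINE_IMAGE_TYPES` are hypothetical, and the sketch assumes `content_type` was determined server-side rather than taken from the uploader.

```python
from urllib.parse import quote

# Hypothetical allowlist: types this sketch treats as safe to render inline.
INLINE_IMAGE_TYPES = {"image/png", "image/jpeg", "image/gif"}


def download_headers(content_type: str, filename: str) -> dict:
    """Build response headers for a user-uploaded file.

    Assumes content_type was decided by the server, not by the uploader.
    """
    inline = content_type in INLINE_IMAGE_TYPES
    disposition = "inline" if inline else "attachment"
    # Percent-encode the user-supplied name so quotes or newlines in it
    # cannot break out of the header value.
    safe_name = quote(filename, safe="")
    return {
        # Anything off the allowlist is served as an opaque binary.
        "Content-Type": content_type if inline else "application/octet-stream",
        "Content-Disposition": f"{disposition}; filename*=UTF-8''{safe_name}",
        # Ask the browser not to second-guess the declared type.
        "X-Content-Type-Options": "nosniff",
    }
```

With this, `download_headers("image/jpeg", "image.jpg")` yields an `inline` disposition, while an uploaded `.html` file is served as an `attachment` with a generic content type. It does nothing about a file whose contents do not match its declared type.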
A nice way to achieve this with Rails is to upload straight to S3 and then use Paperclip to fetch, verify, and process the file.<p>By uploading straight to S3 you also get a faster upload (than through, say, Heroku) and server separation.
> <i>So if you allow file uploads or printing arbitrary user data in your service, you should always verify the contents as well as sending a Content-Disposition header where applicable.</i><p>The idea that you can "verify the contents" is pretty much just wrong. You actually have to parse the files and write out your own known-safe version. It's a real pain in the butt to do that correctly and securely across a wide variety of file types.<p>Even parsing arbitrary user uploads with something like ImageMagick is probably exploitable, simply because those libraries weren't designed to handle hostile input.