Good advice for these problems, but one should focus more on the underlying problem mentioned:<p><i>chances are, vital files can be accessed with the rights your application has.</i><p>When you deploy your webapp, NEVER run your web server or application container server as root. I deploy Python using nginx and uWSGI, where nginx runs as the "nginx" user and group, and uWSGI runs as the "uwsgi" user and group.<p>Since uWSGI is spawning all the Python instances, if I want to be able to access files on disk, such as for storing and reading sessions, I must explicitly change the folder owner to the uwsgi group for the webapp to even launch.<p>If you take only one security precaution for your production Python app, make sure none of the instantiating servers run as root and that they must be given explicit permission to access folders.
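To make the "don't run as root" point concrete, here is a minimal sketch of a startup check an app could run before serving requests. The function names and the idea of passing in a session directory are my own assumptions for illustration, not something nginx or uWSGI provides.

```python
import os


def running_as_root() -> bool:
    """Return True if this process has an effective UID of 0 (POSIX only)."""
    # os.geteuid does not exist on Windows; treat its absence as "not root".
    geteuid = getattr(os, "geteuid", None)
    return geteuid is not None and geteuid() == 0


def can_read_write(directory: str) -> bool:
    """Return True if the current user may read, write and enter the directory."""
    return os.access(directory, os.R_OK | os.W_OK | os.X_OK)


def check_startup(session_dir: str) -> list:
    """Collect human-readable problems; an empty list means it is OK to start."""
    problems = []
    if running_as_root():
        problems.append("refusing to run as root")
    if not can_read_write(session_dir):
        problems.append("no access to " + session_dir)
    return problems
```

Calling `check_startup()` with the session folder and aborting on a non-empty result makes a misconfigured deployment fail loudly instead of quietly running with too many rights.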
The author also makes a common mistake: treating blacklisting as safe string escaping. My standard is to only accept alphabetic characters in any file-based user input system (I find very few filesystem problems require more than this). For example, one of my applications allows people to upload files. They never access the file names themselves, but are allowed to input a "file name." This file name is all alphabetic, is checked for length restrictions, and is encoded in my own file naming system. The translations are maintained in the application database. This prevents a wide variety of attacks on the filesystem itself.<p>Moreover, this whole article embodies the statement NEVER, EVER, IN ANY CIRCUMSTANCES TRUST USER INPUT.
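A rough sketch of that whitelist approach, under some assumptions of mine: the length limit of 64, the use of a random hex string as the on-disk name, and a plain dict standing in for the application database are all hypothetical choices, not the commenter's actual scheme.

```python
import re
import uuid

MAX_NAME_LENGTH = 64  # assumed limit, pick whatever fits your application
_ALPHA_ONLY = re.compile(r"[A-Za-z]+")


def is_valid_display_name(name: str) -> bool:
    """Accept only non-empty, ASCII-alphabetic names within the length limit."""
    return len(name) <= MAX_NAME_LENGTH and _ALPHA_ONLY.fullmatch(name) is not None


def register_upload(name: str, name_map: dict) -> str:
    """Validate the user-supplied name and map a generated storage name to it.

    name_map is a stand-in for a database table: storage name -> display name.
    """
    if not is_valid_display_name(name):
        raise ValueError("file name must be 1-%d ASCII letters" % MAX_NAME_LENGTH)
    stored_name = uuid.uuid4().hex  # never derived from user input
    name_map[stored_name] = name
    return stored_name
```

The point of the design is that the user-supplied string never reaches the filesystem at all; only the generated name does.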
A common mistake as an experienced developer: over-complicating your system with a complex topology of multiple, independent services with separate maintenance, deployment and configuration concerns when a simple, monolithic script would do.
Hmm, startswith('/', '../') in is_secure_path doesn't check for paths like this: './../'<p>Although it would prohibit some legitimate uses, you could just prohibit any usage of '../' anywhere in the path.
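Here is a small sketch of why a prefix check is not enough, plus one alternative. `naive_check` is only my hypothetical stand-in for the kind of prefix test described above, not the article's actual code, and `is_inside` is one possible containment check rather than the definitive fix.

```python
import os


def naive_check(path: str) -> bool:
    """Hypothetical prefix test: reject paths starting with '/' or '../'."""
    return not path.startswith(("/", "../"))


def is_inside(base_dir: str, user_path: str) -> bool:
    """Return True if user_path, resolved against base_dir, stays inside it."""
    base = os.path.realpath(base_dir)
    # realpath collapses '.' and '..' and resolves symlinks before comparing.
    target = os.path.realpath(os.path.join(base, user_path))
    try:
        return os.path.commonpath([base, target]) == base
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False
```

With this, `naive_check("./../secret")` passes the prefix test, while `is_inside(base, "./../secret")` rejects it because the resolved path lands outside `base`. Unlike banning '../' everywhere, this still allows harmless paths such as `a/../b.txt`.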
Shouldn't sanitization tools of the kind he is talking about be in the Python standard library? The os package is supposed to let you deal with OS-specific things in an OS-independent way. Isn't filename sanitization one of those things?<p>More generally, I think we should have a reasonable expectation that any library/API that gives one access to resources that are dealt with using text (file paths, HTTP requests, SQL stuff, interfaces to a command line, etc.) should include sanitization functionality.<p>Wouldn't that make a nice general design guideline?
It's been a while since I tried it, but doesn't this also work to get you up a directory?<p>ExistingDir/../../parentRestrictedDir/passwords<p>His code seems to only check for the case where the path starts with ../. Of course, this requires knowing of an existing directory in the current folder, but that is not insurmountable for an enterprising hacker.<p>I'm on an iPad so I cannot check right now.<p>Edit: Just tried it out on my Mac and it works like I remember.
Parsing paths properly is certainly a good idea. But perhaps a more important issue is why you're not running your web application in a secure sandbox in the first place.<p>I think SELinux is now a standard part of the Linux kernel[1], and if you're using Ubuntu there's AppArmor[2].<p>[1] <a href="http://en.wikipedia.org/wiki/Selinux" rel="nofollow">http://en.wikipedia.org/wiki/Selinux</a><p>[2] <a href="https://help.ubuntu.com/10.04/serverguide/C/apparmor.html" rel="nofollow">https://help.ubuntu.com/10.04/serverguide/C/apparmor.html</a>
> A few weeks ago I had a <i>hatred</i> discussion with a bunch of Python and Open Source people at a local meet-up about the way Python's path joining works. (emphasis mine - "heated"?)<p>Freudian slip? Or something halfway between a typo and a misheard word?
I have a question about the "Mixing up Data with Markup" problem. If you don't have markup in your database, how do you structure your content? What is a header, what is a paragraph?
I recommend Unipath for dealing with the os.path.join issue:<p><a href="http://sluggo.scrapping.cc/python/unipath/" rel="nofollow">http://sluggo.scrapping.cc/python/unipath/</a>
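For anyone who hasn't run into the os.path.join issue: an absolute second argument silently discards everything before it. A quick sketch using `posixpath` (the POSIX flavour of `os.path`) so the result is the same on any platform; `UPLOAD_ROOT` and `join_relative` are hypothetical names of mine, and this does not show Unipath's API.

```python
import posixpath

UPLOAD_ROOT = "/srv/app/uploads"  # hypothetical base directory

# An absolute component makes join() throw away the base entirely.
joined = posixpath.join(UPLOAD_ROOT, "/etc/passwd")
print(joined)  # /etc/passwd


def join_relative(base: str, user_path: str) -> str:
    """Join only relative paths; refuse absolute ones instead of dropping base."""
    if posixpath.isabs(user_path):
        raise ValueError("absolute paths are not allowed: %r" % user_path)
    return posixpath.join(base, user_path)
```

Note that this guard only covers the absolute-path surprise. It does nothing about '..' components, which still need a separate containment check.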