The money quote: "I don’t care what your politics are. If you’re building a data-driven system and you’re not actively seeking to combat prejudice, you’re building a discriminatory system."

This is extremely important. The weight of existing prejudice drives and reinforces future prejudice. Garbage in, garbage out. If you start with biased, flawed data and then look for patterns in that biased, flawed data, you'll just add to the biases and flaws. You need to account for the poor data quality explicitly.
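Here's a toy sketch of that feedback loop, with entirely made-up numbers (nothing here is real data): two neighborhoods with identical underlying incident rates, but the historical record over-counts one of them. If patrols are allocated from the record and the record grows with the patrols, the initial skew never washes out.

    # Toy feedback-loop simulation; all numbers are invented for illustration.
    # Two neighborhoods share the same true incident rate, but the starting
    # records over-represent "A". Patrols follow the records, and recorded
    # incidents scale with patrol presence, so the skew is self-reinforcing.
    import random

    random.seed(0)
    true_rate = 0.05                  # identical underlying rate for both
    recorded = {"A": 120, "B": 60}    # biased starting data: A is over-policed

    for year in range(10):
        total = sum(recorded.values())
        patrols = {k: 100 * v / total for k, v in recorded.items()}
        for hood in recorded:
            checks = int(patrols[hood] * 20)  # more patrols -> more checks
            observed = sum(random.random() < true_rate for _ in range(checks))
            recorded[hood] += observed

    print(recorded)  # the 2:1 skew persists even though the true rates are equal

The model never gets to see what it isn't measuring, so the bias in the inputs just gets laundered into "objective" outputs.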
I feel like this is a politically-correct rant thinly disguised as programming-related.

> I don’t care what your politics are. If you’re building a data-driven system and you’re not actively seeking to combat prejudice, you’re building a discriminatory system.

Fuck that. Build _correct_ systems first. If you’re making a system that intentionally distorts data for the sake of “combating prejudice”, you’re _lying_. That doesn’t help anyone.
I don't care what your politics are. If you're building a data-driven system and you forget that empirically observed covariation is a necessary but not sufficient condition for causality, you're building a flawed system, which is inaccurate at best and discriminatory at worst.
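A minimal numerical illustration of that point (synthetic data; the variable names are just labels chosen for the example): a hidden common cause can make two quantities covary strongly even though neither causes the other, and the correlation vanishes once you condition on the confounder.

    # Confounding in a few lines; the data is synthetic and purely illustrative.
    # Z drives both X and Y. X has no causal effect on Y, yet X and Y covary.
    import numpy as np

    rng = np.random.default_rng(42)
    n = 100_000
    z = rng.normal(size=n)            # hidden common cause
    x = 2 * z + rng.normal(size=n)    # observed feature
    y = 3 * z + rng.normal(size=n)    # observed outcome

    print(np.corrcoef(x, y)[0, 1])          # strong correlation, roughly 0.85
    print(np.corrcoef(x - 2 * z, y)[0, 1])  # roughly 0 once Z is subtracted out

A model trained on X and Y alone will happily exploit that correlation whether or not it means anything causally.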
It seems perfectly reasonable to station police in places which are indisputable hotspots for crime.

I mean, is it discriminatory to say that Compton has a lot of crime and police should patrol it?

Or is this just loosely targeting the tech community as a demographic to drum up clicks, scraping the bottom of the barrel for topics to write about?
There are those funny/sad stories of how face recognition software has categorised black people as either invisible[1] or gorillas:

http://www.businessinsider.com/google-tags-black-people-as-gorillas-2015-7

http://edition.cnn.com/2009/TECH/12/22/hp.webcams/index.html

The prejudice of the programmer (unintentional, as all prejudice really is) can show up in the code just because they forgot to include enough data in their training set.

---

[1] In light of which Ralph Ellison's novel is particularly prescient:

https://en.wikipedia.org/wiki/Invisible_Man
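To make the training-set point concrete, here's a toy sketch (synthetic 2-D features standing in for whatever a real pipeline extracts; no real imagery or real library involved): a nearest-centroid "face vs. background" detector fit on data where one subgroup is nearly absent detects the well-represented group reliably and misses the other far more often.

    # Toy coverage experiment with synthetic 2-D features (not real faces):
    # fit a nearest-centroid detector on a training set where subgroup B is
    # nearly absent, then test both subgroups separately.
    import numpy as np

    rng = np.random.default_rng(1)

    def sample(center, n):
        return rng.normal(loc=center, scale=1.0, size=(n, 2))

    group_a = sample([4.0, 4.0], 950)     # well represented in training
    group_b = sample([1.5, 1.5], 50)      # barely represented in training
    background = sample([0.0, 0.0], 1000)

    face_centroid = np.vstack([group_a, group_b]).mean(axis=0)  # dominated by A
    bg_centroid = background.mean(axis=0)

    def detected(points):
        d_face = np.linalg.norm(points - face_centroid, axis=1)
        d_bg = np.linalg.norm(points - bg_centroid, axis=1)
        return d_face < d_bg

    print("group A detection rate:", detected(sample([4.0, 4.0], 1000)).mean())  # near 1.0
    print("group B detection rate:", detected(sample([1.5, 1.5], 1000)).mean())  # much lower

Nothing in the code is malicious; the gap comes entirely from what the training set did and didn't contain.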
"ignore the data and do what's good for the feels."<p>If you are writing a system that is designed to predict arrests for any reason other than telling police officers where to go, then arrest records are precisely the data you want to use without these political correctness coefficients. So advertising legal help for criminal arrests to someone that matches arrest demographics is the correct thing to do. Anything else is stupidity.
I thought this was going to be something interesting until I hit the "Environmental Consequences" part. I mean, really? We have environmental problems as a result of agriculture and manufacturing and transportation and construction and literally every other industry, and you're going to complain about the cost of running some servers? We are *orders of magnitude* away in impact. Actually, I suspect it's even worse: the more computing power we make available to those industries, the more efficient they become. Please, tell me how making digital designs is environmentally worse than physical prototypes, or simulations are less power-efficient than real-world tests, or data-driven irrigation is more wasteful than spraying water into the air for hours.