Do algorithms reveal sexual orientation or just expose our stereotypes?

6 points by ALee over 7 years ago

1 comment

YeGoblynQueenne over 7 years ago
Something that I found truly shocking is that, in the paper's data, the number of straight men is exactly the same as the number of gay men, and the same for women (for individuals with at least one picture, i.e. everyone; numbers for those with more than one picture are different).

The paper itself cites a 7% distribution of gay men in the general population. Yet they trained with a 50/50, uniform distribution. But why?

Well, because a problem involving unbalanced classes, like gay/straight individuals, where your target class (gay men/women) is less than a tenth of your entire data is a bitch to train a classifier for. Now, if you artificially equalise the data, by just adding more of that class, you can get a pretty good "accuracy" score (precision over recall, which they used).

Except of course, that score is completely useless as an estimate of the true accuracy of your classifier in the real world, against the true distribution of your data, "in the wild". It's also completely useless as evidence for whatever hare-brained theory you want to posit that involves, oh, say, the distribution of feminine and masculine features in gay and straight individuals' faces - you know, the point the paper was making.

This should be a cautionary tale: you can't just force a classifier to give you the results you want and then claim that those results prove your theory. That's just bad machine learning. Like bad statistics, but with more assumptions.
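
To make the base-rate point concrete, here is a minimal back-of-the-envelope sketch in Python. The 0.91 per-class hit rate is a hypothetical number chosen for illustration, not a figure from the paper; only the ~7% prevalence comes from the comment above.

    # Minimal sketch: what a "good" score on a 50/50-balanced test set
    # implies at the true base rate. The 0.91 per-class hit rate below is
    # a hypothetical illustration, not a number taken from the paper.

    def precision_at_prevalence(sensitivity, specificity, prevalence):
        """Bayes' rule: P(actually positive | classifier says positive)."""
        true_pos = sensitivity * prevalence
        false_pos = (1 - specificity) * (1 - prevalence)
        return true_pos / (true_pos + false_pos)

    sens = spec = 0.91  # right 91% of the time on each class (hypothetical)

    # On the artificially balanced 50/50 set the classifier looks great...
    print(precision_at_prevalence(sens, spec, prevalence=0.50))  # ~0.91

    # ...but at the ~7% prevalence the paper itself cites, over half of
    # its "positive" calls would be wrong.
    print(precision_at_prevalence(sens, spec, prevalence=0.07))  # ~0.43

The classifier itself has not changed between the two lines; only the class mix has, which is why a score obtained on an equalised dataset says little about performance against the real-world distribution.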