This happens because of what's in the training corpus. But the corpus was drawn from a large chunk of the internet. Why is the text of the internet like this?<p>When someone walks into a school and starts shooting, we don't think it's relevant that they're Christian, Hindu, or atheist. But we <i>do</i> care about their motives. If they're shooting up a school because they're a Christian and think the school is teaching atheism, <i>now</i> it's relevant.<p>Well, in the parts of the world where most of the English text comes from, the people who are committing atrocities <i>because of their religion or philosophy</i> are most frequently Muslims. The corpus is biased because people commit violence that (in their own logic) flows out of following Islam, and they do so in disproportionate numbers.