Hey everyone!<p>I built ModerateHatespeech, an initiative working on better understanding hate speech on online platforms and building solutions to combat it. Our flagship API is completely free and ML-powered, and gives a lot more actionable results than pretty much any other similar platform out there.<p>We've been able to leverage our partnerships with many communities to get a lot of good data and feedback, and to bring our system to a lot of users (we process ~200k comments a day right now).<p>We've also done a lot of work to better understand the biases and potential abuse cases of our API, which you can read about on our site (I'm trying to avoid posting too many links and getting caught in spam filters).<p>I would love to hear any thoughts or feedback! Here's a link to information about our project/API: https://moderatehatespeech.com/
"Republicans are bigots"<p>{
"class": normal
"confidence": 0.955
}<p>"Democrats are bigots"<p>{
"class": flag
"confidence": 0.575
}
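<p>A comparison like this can be made repeatable with a small paired-probe harness: feed in sentence pairs that differ only in the group named, and report the gap in flag score. This is a minimal sketch, not the project's own tooling. `classify` is a hypothetical callable assumed to return a dict with `class` and `confidence` keys like the outputs quoted above, and the stub below only replays those two quoted results rather than calling the real API. It also assumes `confidence` refers to the predicted class.

```python
from typing import Callable, Dict, List, Tuple

Result = Dict[str, object]


def flag_score(result: Result) -> float:
    """Convert a result into a probability-like score for the "flag" class.

    Assumption: `confidence` describes the predicted class, so a "normal"
    result with confidence 0.955 maps to a flag score of 0.045.
    """
    confidence = float(result["confidence"])
    return confidence if result["class"] == "flag" else 1.0 - confidence


def compare_pairs(
    classify: Callable[[str], Result],
    pairs: List[Tuple[str, str]],
) -> List[Tuple[str, str, float]]:
    """Return (text_a, text_b, absolute flag-score gap) for each pair."""
    rows = []
    for text_a, text_b in pairs:
        gap = abs(flag_score(classify(text_a)) - flag_score(classify(text_b)))
        rows.append((text_a, text_b, round(gap, 3)))
    return rows


# Hypothetical stub: replays the two results quoted in this comment.
QUOTED: Dict[str, Result] = {
    "Republicans are bigots": {"class": "normal", "confidence": 0.955},
    "Democrats are bigots": {"class": "flag", "confidence": 0.575},
}


def stub_classify(text: str) -> Result:
    return QUOTED[text]


gaps = compare_pairs(
    stub_classify,
    [("Republicans are bigots", "Democrats are bigots")],
)
for text_a, text_b, gap in gaps:
    print(f"{text_a!r} vs {text_b!r}: flag-score gap {gap}")
```

Swapping `stub_classify` for a real client call and adding more pairs would show whether the gap is a one-off or a pattern.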
It doesn't seem to deal with misgendering hate or transphobia very well (I saw lots of models failing on this so I checked that right away), and I mean obvious ones regardless of your stance, e.g.:<p>> "She is a he" => { "class": normal, "confidence": 1 }<p>> "He will never be a woman" => { "class": normal, "confidence": 0.999 }<p>But it does seem to identify 2nd-person targeted transphobia:<p>> "You will never be a woman" => { "class": flag, "confidence": 0.996 }
Interesting. It seems to be able to detect the subtle nuances of meaning fairly well. Maybe not perfect, but I would give it at least 8 stars on a scale from 1 to 10.
What are the max QPS and latency this supports? What city is it hosted in?<p>Some moderation platforms I've worked with are too slow for messaging, especially if users are in different countries.<p>Do you have plans for image moderation?
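<p>Until there are official numbers, latency from a given region can be estimated by timing repeated calls and looking at the median and the tail. This is a rough sketch under stated assumptions: `call` stands in for whatever client function you use to hit the API (hypothetical here), and the stub just sleeps for about a millisecond. It measures sequential latency only, so it says nothing about the maximum QPS the service allows.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_latency_ms(call: Callable[[str], object], texts: List[str]) -> List[float]:
    """Time one sequential call per text and return durations in milliseconds."""
    samples = []
    for text in texts:
        start = time.perf_counter()
        call(text)
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples: List[float]) -> Dict[str, float]:
    """Median, nearest-rank 95th percentile, and max of the samples."""
    ordered = sorted(samples)
    # Nearest-rank p95 with integer math: ceil(0.95 * n) - 1 as a 0-based index.
    p95_index = (95 * len(ordered) + 99) // 100 - 1
    return {
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[p95_index],
        "max_ms": ordered[-1],
    }


def fake_call(text: str) -> None:
    """Hypothetical stand-in for a real API request."""
    time.sleep(0.001)


samples = measure_latency_ms(fake_call, ["hello"] * 5)
print(summarize(samples))
```

For messaging use cases the p95 figure is usually the one to watch, since a slow tail is what users notice.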