I wanted to read your article, but the design of your blog really sucks. I can hardly read the text; it's quite small. Monospace fonts are great for programming but bad for reading. Animated GIFs every 5 lines are really annoying.<p>Sorry!
We (the PLASMA research group at UMass, <a href="http://plasma.cs.umass.edu" rel="nofollow">http://plasma.cs.umass.edu</a>) developed a system called AutoMan, specifically designed to automatically manage quality (as well as to automatically compute pay and time) for a wide variety of tasks. You basically invoke people <i>as functions</i> and it just works, with statistical guarantees (it also handles payment, etc. without any additional effort). It makes dealing with MTurk <i>much</i> nicer. Best used in Scala, but it can also be used from Java.<p><a href="http://automan-lang.com" rel="nofollow">http://automan-lang.com</a>
<a href="https://github.com/plasma-umass/AutoMan" rel="nofollow">https://github.com/plasma-umass/AutoMan</a><p>Paper here on AutoMan, round one:
* <a href="http://cacm.acm.org/magazines/2016/6/202648-automan/abstract" rel="nofollow">http://cacm.acm.org/magazines/2016/6/202648-automan/abstract</a> (CACM Research Highlight, 2016)<p>Original paper, not behind a paywall:
* <a href="https://people.cs.umass.edu/~emery/pubs/res0007-barowy.pdf" rel="nofollow">https://people.cs.umass.edu/~emery/pubs/res0007-barowy.pdf</a> (OOPSLA '12)<p>New features described here have been rolled into AutoMan:
"VoxPL: Programming with the Wisdom of the Crowd" (CHI '17, to appear): <a href="https://people.cs.umass.edu/~emery/pubs/voxpl-chi.pdf" rel="nofollow">https://people.cs.umass.edu/~emery/pubs/voxpl-chi.pdf</a>
For US-based people, I suggest using Mechanical Turk as a worker before creating HITs as a Requester. See <a href="https://www.reddit.com/r/HITsWorthTurkingFor/" rel="nofollow">https://www.reddit.com/r/HITsWorthTurkingFor/</a> for decent examples.<p>It makes me sad that so much of the work available on Mechanical Turk is poorly designed, and that workers have little recourse to bad Requesters.
I recently tested MTurk for my startup. We set up about 500 HITs to collect the website URL and email address of various businesses. We set the price at $0.05 (Amazon takes an additional $0.01). Jobs got started quickly, and within 24 hours we had all of our data collected.<p>I'm not sure I would do it again, though. A lot of the businesses we were targeting don't have a web presence, so "No URL/No Email" was a valid answer. However, when I went through the list and saw 150 "No URL/No Email" answers, I couldn't tell whether each one was true or whether the Turker had realized they could just copy/paste it and make a quick buck. Amazon does report the amount of time spent on each task, so I rejected any that took less than 10 seconds, as I felt they hadn't given it a good enough try. Above that, I just accepted the answer, realizing that it may be false.<p>In the end I think I spent more time going through the results and correcting them than it actually saved me. I'm excited to use MTurk again in the future, but only for appropriate projects.
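For anyone who wants to automate the time-based filter described above, here is a minimal sketch in Python. It assumes the 10-second rule applies only to "No URL/No Email" answers, and that you have already exported your results into simple records. The `Assignment` class, its field names, and the `triage` function are hypothetical and are not part of any MTurk API; `seconds_spent` stands in for whatever work-time figure your export gives you.

```python
from dataclasses import dataclass

MIN_SECONDS = 10                    # threshold taken from the comment above
EMPTY_ANSWER = "No URL/No Email"    # the "nothing found" answer


@dataclass
class Assignment:
    """Hypothetical record for one submitted answer (not an MTurk type)."""
    worker_id: str
    business: str
    answer: str
    seconds_spent: float


def triage(assignments, min_seconds=MIN_SECONDS):
    """Split answers into (rejected, accepted).

    An answer is rejected only when it is the "nothing found" answer
    AND the worker spent less than min_seconds on it. Everything else
    is accepted, which means slow-but-wrong answers still get through.
    """
    rejected, accepted = [], []
    for a in assignments:
        is_empty = a.answer.strip().lower() == EMPTY_ANSWER.lower()
        if is_empty and a.seconds_spent < min_seconds:
            rejected.append(a)
        else:
            accepted.append(a)
    return rejected, accepted


if __name__ == "__main__":
    sample = [
        Assignment("w1", "Acme Plumbing", "No URL/No Email", 4.0),
        Assignment("w2", "Bolt Hardware", "No URL/No Email", 42.0),
        Assignment("w3", "Cafe Uno", "http://cafe.example hi@cafe.example", 6.0),
    ]
    rejected, accepted = triage(sample)
    print("rejected:", [a.worker_id for a in rejected])
    print("accepted:", [a.worker_id for a in accepted])
```

As the comment notes, a time threshold only catches the laziest submissions. It says nothing about whether a slower "No URL/No Email" answer is actually true.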
<i>So why use Mechanical Turk in the first place? Turkers will work for a single penny in many cases.</i><p>Exactly what I was looking for: the cold brutal logic of capitalism. It's all good if it's low-cost.<p><i>Even after all of this you will still get bad answers.</i><p>...yes, of course, you're not paying for quality, you're paying for quantity and to reduce your costs. If you were paying for quality you would put up a few posters on college campuses and pay more.
Hey guys! I'm cofounder of Scale API (www.scaleapi.com), a YC S16 company building an API for human intelligence. We've been working to obviate the need to tune your system to work on products like MTurk, and instead have a really simple API that <i>just works</i>. We've worked to build technology to guarantee quality to our customers and build a simple developer experience.<p>I really respect your ability to work with MTurk and have it work for you guys. In our experience, it often takes significant effort to get anything remotely functional and reliable on MTurk. That's why we're building Scale :)
There was a great talk by some AWS guys at AWS re:Invent on ways to improve accuracy using ideas similar to cross-validation...<p>It's on iTunes - title is:
"Getting to Ground Truth with Amazon Web Services Mechanical Turk"<p>Video also available on YouTube:
<a href="https://m.youtube.com/watch?v=vRtLdeNl7Tg" rel="nofollow">https://m.youtube.com/watch?v=vRtLdeNl7Tg</a>
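I won't try to reproduce the talk's exact method here, but one common redundancy approach in the same spirit is to give each question to several workers and keep an answer only when enough of them agree. A rough Python sketch follows, assuming answers are short strings that can be compared after trimming and lowercasing; the `majority_answer` function and its `min_agreement` parameter are made up for illustration.

```python
from collections import Counter


def majority_answer(answers, min_agreement=0.5):
    """Return the most common answer if more than min_agreement of the
    workers gave it, otherwise None (meaning: needs another look).

    Answers are compared after trimming whitespace and lowercasing,
    which is an assumption that suits short free-text answers only.
    """
    if not answers:
        return None
    normalized = [a.strip().lower() for a in answers]
    top, count = Counter(normalized).most_common(1)[0]
    if count / len(normalized) > min_agreement:
        return top
    return None


if __name__ == "__main__":
    print(majority_answer(["Yes", "yes ", "no"]))   # two of three agree
    print(majority_answer(["red", "blue"]))         # split vote
```

Questions that come back as `None` can be sent out to more workers or checked by hand, which trades extra cost per question for fewer bad answers.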
I tested out Mechanical Turk back in 2008. I think I was trying to get a YouTube video promoted on some site.<p>I still have some credits on the system.<p>I am not even sure I could come up with a good use for them. For those with current usage experience: could I, in theory, create a task where people would look up the best sights to see in the top 10 travel destinations?<p>Would this be a valid use case, and how would you deal with duplicates?