So, for this to be even a feasible solution, EdX needs to show more proof of concept...like, here's a sample question, here are the 100 sample answers that were used for the machine learning, and here's how the auto-grader scored these 10 different answers (both good and bad).<p>Why should anyone have faith that EdX has cracked the perfect mix of machine learning, NLP, and the other associated technologies needed to provide accurate assessments of essays? Even Google has trouble guessing intent from time to time; Wolfram Alpha even more so. If the engineers at those companies can't always get it right -- and it's not just engineering talent, but data and data analysis -- why should a school entrust one of its most important functions to EdX?<p>Grading is something <i>critical</i> to get right, not just "almost there." Think of how much time you spent arguing with a professor that your essay deserved an 8 instead of a 6 -- enough points to bring you from a B to a B+ for the semester...and think of the incentive to do so (career prospects). If the machine is ever wildly off in even one case, would you ever take its assessments as gospel? Multiply yourself by the 20 or 50 students in a typical professor's lecture, and now a ton of lag has been introduced into the grading workflow.<p>Obviously, there are ways to mitigate this. One would be to write questions so narrowly focused that there are very clearly right and wrong answers...which of course raises the question: why not just give everyone multiple-choice tests, then?<p>The sad thing is that even if these machine graders were empirically better than human graders, they can't just be better; they have to be almost perfect.
If a plane's autopilot failed, on the whole, 1 out of 1,000 times compared to 5 out of 1,000 times for human pilots...how much more pissed do you think the victims' families are going to be when they find out their loved ones died because of an algorithmic malfunction rather than plain pilot error? People, for obvious reasons, don't like thinking of their work or their destinies as being decided by deterministic machines...so if those machines aren't 99.99% right, then the fallout and pushback may cost schools more than the savings in professor grading time.