His OCR errors (Del. Jennifer L. McClellan -> Del. Jennifer L i\1cCie1ian) look like something that could easily be fixed in the right place: the dictionaries and language models used by Tesseract.<p>While a spellchecker might fix Jenn1fer -> Jennifer, the OCR stage has much more information available to do it properly. But Tesseract evidently doesn't know that McClellan is a valid word, and thus a much more likely alternative than i\1cCie1ian; it needs to be told that. The list of speakers in those videos is limited, so their surnames can be added to the appropriate dictionaries to improve recognition.
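
A rough sketch of what "telling it" could look like, assuming the Tesseract command-line tool and its `--user-words` option, which takes a plain-text file with one word per line. The speaker list, the file names, and both helper functions are made up for illustration; only McClellan comes from the example above. How much the extra words actually help likely depends on the Tesseract version and the recognition engine in use, so treat this as a starting point, not a confirmed fix.

```python
import tempfile
from pathlib import Path

# Hypothetical speaker list; only "McClellan" comes from the example above.
SPEAKER_SURNAMES = ["McClellan"]


def write_user_words(words, path):
    """Write a user-words file: one word per line, blanks and duplicates dropped."""
    unique = sorted({w.strip() for w in words if w.strip()})
    Path(path).write_text("\n".join(unique) + "\n", encoding="utf-8")
    return unique


def tesseract_command(image_path, words_path):
    """Build (but do not run) a Tesseract CLI call that loads the extra words."""
    return ["tesseract", str(image_path), "stdout", "--user-words", str(words_path)]


if __name__ == "__main__":
    # "frame.png" and the word-list file name are placeholders.
    words_file = Path(tempfile.gettempdir()) / "speakers.user-words"
    write_user_words(SPEAKER_SURNAMES, words_file)
    print(" ".join(tesseract_command("frame.png", words_file)))
```

The printed command can be run by hand or passed to `subprocess.run` once Tesseract is installed; the sketch deliberately stops short of invoking it.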