Here are two papers that describe the techniques the FDA system uses (or at least used in the mid-2000s) to find these confusable names.<p>"Automatic identification of confusable drug names" (2006, <a href="http://goo.gl/W5DK0f" rel="nofollow">http://goo.gl/W5DK0f</a> PDF)<p>and<p>"Identification of Confusable Drug Names: A New Approach and Evaluation Methodology" (2004, <a href="http://goo.gl/RziUgf" rel="nofollow">http://goo.gl/RziUgf</a> PDF)<p>Both are by Grzegorz Kondrak and Bonnie Dorr.<p>I've used BI-SIM in a medical-informatics system and it does quite well. I'm also a big fan of EDITEX, which is better for some uses.
This submission was reposted by request. The POCA software is mentioned
in the "How FDA Reviews Proposed Drug Names" PDF, which is somewhat
interesting reading:<p><a href="http://www.fda.gov/downloads/Drugs/DrugSafety/MedicationErrors/ucm080867.pdf" rel="nofollow">http://www.fda.gov/downloads/Drugs/DrugSafety/MedicationErro...</a><p>The PDF was previously submitted by 'aclimatt' here:<p><a href="https://news.ycombinator.com/item?id=10079659" rel="nofollow">https://news.ycombinator.com/item?id=10079659</a>