OK, there is a pretty "simple" way to do it, but I can't divulge all the details because we are using it for my semantic news startup. However, I can give you a hint about what you need to do. You need to find a way to reduce different words to their "base" semantics and then compute some similarity measure between the different event descriptions.

Traditionally, people have broken the text down into word vectors and then calculated a Pearson, Tanimoto, or similar similarity measure between the vectors. The problem with this approach is not the similarity measure but the fact that (as aristus points out) different people use different words to describe the same thing. So computing the Tanimoto score of two sentences like "Bush visits Iraq to encourage soldiers" and "U.S. President arrives at Baghdad to raise troops' morale" will give a very low similarity score if you use just the literal words of both sentences, even though in reality the two sentences are almost identical in meaning.

Now suppose there were a way to reduce the semantics of both sentences down to their "base": for sentence one you would have something like "US President present in Middle Eastern country to increase spirit of soldiers", and for sentence two you would have "US President present in capital of Middle Eastern country to increase spirit of soldiers". If you take the Tanimoto measure (or any other decent similarity measure) between the semantic "bases" of the two sentences, you now get a very high score, which is what you need. Like I said, I can't divulge the method for reducing text to its semantic "base", but there is definitely at least one way, which is what we are using.
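Just to make the word-vector part concrete, here is a toy Python sketch of the Tanimoto comparison. The "base" forms are the hand-written ones from above; producing them automatically is the part I can't share.

```python
# Toy illustration: Tanimoto (Jaccard) similarity on literal words vs.
# on hand-written "semantic base" forms of the same two sentences.

def tanimoto(a, b):
    """Tanimoto/Jaccard similarity between two sets of words."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def words(sentence):
    return sentence.lower().replace(".", "").split()

s1 = "Bush visits Iraq to encourage soldiers"
s2 = "U.S. President arrives at Baghdad to raise troops morale"

# Literal words: only "to" is shared, so the score is close to 0.
print(tanimoto(words(s1), words(s2)))   # ~0.07

b1 = "US President present in Middle Eastern country to increase spirit of soldiers"
b2 = "US President present in capital of Middle Eastern country to increase spirit of soldiers"

# "Base" forms: almost every word is shared, so the score is close to 1.
print(tanimoto(words(b1), words(b2)))   # ~0.92
```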
If you are interested, send me an email at haidut@gmail.com, and when I launch my startup within the next couple of weeks I can hopefully divulge more. Sorry for the vagueness, but my lawyers will make mincemeat out of me if I say more :-)
Here is something you can do that I am allowed to divulge. You can use Pointwise Mutual Information (PMI) to measure the semantic relatedness of the words in the two event descriptions, and if the overall cross-word similarity is above a certain threshold, you can consider the two event descriptions similar. To find out more about how to calculate PMI, and for more on how useful it is, read this paper:
<a href="http://cogprints.org/1796/0/ECML2001.pdf" rel="nofollow">http://cogprints.org/1796/0/ECML2001.pdf</a>
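To give a flavor of how that can look in code, here is a rough Python sketch of PMI computed over a tiny made-up corpus, with a best-match-then-average score across the two descriptions. Note that the paper's PMI-IR approach estimates the probabilities from search-engine hit counts rather than a local corpus, and the corpus, the sentence-sized co-occurrence window, and the threshold below are all placeholders.

```python
# Rough sketch of PMI-based relatedness between two event descriptions.
import math
from collections import Counter
from itertools import combinations

corpus = [
    "president bush visits iraq",
    "bush arrives in baghdad iraq",
    "troops morale rises as soldiers greet bush",
    "stock markets fall on weak earnings",
    "earnings season begins with weak results",
    "soldiers and troops in iraq",
]

# Count how many sentences each word and each unordered word pair appear in.
word_counts, pair_counts = Counter(), Counter()
for sentence in corpus:
    tokens = set(sentence.split())
    word_counts.update(tokens)
    pair_counts.update(frozenset(p) for p in combinations(sorted(tokens), 2))

total = len(corpus)

def pmi(x, y):
    """PMI(x, y) = log2( p(x, y) / (p(x) * p(y)) ), with a crude 0.0 fallback."""
    if x == y:
        p_x = word_counts[x] / total
        return -math.log2(p_x) if p_x else 0.0
    p_xy = pair_counts[frozenset((x, y))] / total
    if p_xy == 0:
        return 0.0  # never co-occur (or unseen words): treat as unrelated
    return math.log2(p_xy / ((word_counts[x] / total) * (word_counts[y] / total)))

def relatedness(desc_a, desc_b):
    """For each word in desc_a, take its best PMI against desc_b, then average."""
    a, b = desc_a.split(), desc_b.split()
    return sum(max(pmi(x, y) for y in b) for x in a) / len(a)

THRESHOLD = 0.5  # arbitrary; you'd tune this on real data
same = relatedness("bush visits iraq soldiers", "president baghdad troops")
diff = relatedness("bush visits iraq soldiers", "weak earnings fall")
print(same, same > THRESHOLD)   # ~1.5, True  -> treat as the same event
print(diff, diff > THRESHOLD)   # 0.0, False  -> different events
```

In practice you would estimate the counts from something much bigger than six sentences (or from hit counts, as in the paper); otherwise the PMI values are far too noisy to threshold reliably.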
Overall, what you are trying to do is simple in concept but not easy to implement. Basically, for semantic similarity to be computed automatically by a machine, you'd need a mapping of EVERY word in EVERY language to its semantic "base", so you could build a table and compute this efficiently. Such a mapping does not currently exist, and one of the reasons is the constantly changing nature of the English language itself: new words appear all the time, and even if you could track them all, some of them don't have a semantic "base". Using the example I gave above: the fact that Bush is currently President won't be true in a couple of months, so that huge word-to-meaning mapping table would have to account for that. Mapping the English language to a semantic base is a (very fast) moving target.

That's all I have for now. Good luck in your endeavor.