Sort of off topic, but I wonder if you can copyright, or otherwise do something legal-ish with, a specific word frequency distribution. It's nowhere near enough information to, say, reconstruct a corpus, but it's often enough to identify the author with decent precision (so I've heard, anyway).
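
To make that a bit more concrete, here's a rough Python sketch of what I mean by a word frequency distribution and how two of them might be compared. The function names (`word_freq`, `cosine`), the tokenizing regex, and the choice of cosine similarity are all my own assumptions for illustration, not anyone's actual attribution method.

```python
from collections import Counter
import math
import re

def word_freq(text):
    """Relative frequency of each lowercased word in text (word order is discarded)."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    return {w: c / total for w, c in Counter(words).items()}

def cosine(p, q):
    """Cosine similarity between two sparse frequency dicts (0.0 if either is empty)."""
    dot = sum(v * q.get(w, 0.0) for w, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)

# Two different texts, one identical distribution: the original can't be recovered.
a = "the cat sat on the mat"
b = "on the mat the cat sat"
assert word_freq(a) == word_freq(b)
```

The last three lines are the "can't reconstruct a corpus" part: reordering the words changes the text but not the distribution. The identification part would be something like computing `cosine(word_freq(unknown), word_freq(candidate))` for each candidate author and picking the highest score, though I'd expect that to need a lot more text and a smarter choice of features than this toy version.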