Hey,
I need to hire someone who is really good at text parsing, specifically SEC forms.<p>Obviously this is a paid project with potential for a full-time job. If anyone is interested, please IM me at bkmrkr314.
If you end up working with raw but semi-structured text, you might look into something like ELIE. I've never used it myself, but I stumbled upon it today in a different context:
<a href="http://www.aidanf.net/software/elie-an-adaptive-information-extraction-system" rel="nofollow">http://www.aidanf.net/software/elie-an-adaptive-information-...</a>
Which IM network are you using? I worked on an SEC crawler at PubSub from 2004 to 2006. As I remember, the SEC has an SGML version of all filings, so it may be better to use an SGML parser to process the documents and take advantage of their structure.<p>Let me know, because I think any project that makes use of SEC data will be interesting.
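To make the SGML suggestion a bit more concrete, here is a rough sketch of splitting a filing into its parts. It assumes the layout I recall from the full-submission text files, where each part sits in a `<DOCUMENT>` block with unclosed `<TYPE>`, `<SEQUENCE>` and `<FILENAME>` lines followed by a `<TEXT>` body. The sample string and the `split_documents` helper are made up for illustration, and it uses plain regexes rather than a real SGML parser, so treat it as a starting point only.

```python
import re

# Hypothetical sample imitating the SGML-style wrapper; not a real filing.
SAMPLE = """<DOCUMENT>
<TYPE>10-K
<SEQUENCE>1
<FILENAME>example10k.txt
<TEXT>
Annual report body goes here.
</TEXT>
</DOCUMENT>
<DOCUMENT>
<TYPE>EX-99
<SEQUENCE>2
<FILENAME>exhibit99.txt
<TEXT>
Exhibit body goes here.
</TEXT>
</DOCUMENT>
"""

# Each part of the filing is assumed to be wrapped in <DOCUMENT>...</DOCUMENT>.
DOC_RE = re.compile(r"<DOCUMENT>(.*?)</DOCUMENT>", re.DOTALL)
# Header fields are assumed to be one per line, with no closing tag.
FIELD_RE = re.compile(r"^<(TYPE|SEQUENCE|FILENAME)>(.*)$", re.MULTILINE)


def split_documents(raw):
    """Return one dict per <DOCUMENT> block with its header fields and text."""
    docs = []
    for match in DOC_RE.finditer(raw):
        block = match.group(1)
        # Only look for header fields before <TEXT>, so that tags inside
        # the body are not mistaken for metadata.
        header, _, rest = block.partition("<TEXT>")
        fields = {name.lower(): value.strip()
                  for name, value in FIELD_RE.findall(header)}
        # Keep everything up to the last </TEXT> as the body.
        fields["text"] = rest.rpartition("</TEXT>")[0].strip()
        docs.append(fields)
    return docs


if __name__ == "__main__":
    for doc in split_documents(SAMPLE):
        print(doc["sequence"], doc["type"], doc["filename"], len(doc["text"]))
```

Real filings will have more header fields and messier bodies (embedded HTML, tables laid out with spaces), so a proper SGML-aware parser is probably still the safer choice once you go beyond splitting.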