I'm building a small app that reads my Gmail inbox and collects data from my bank's transaction emails: purchases, withdrawals and credits. The goal is to have all my transactions in one place. To pull the data out of those emails, I currently split the entire text on spaces and pick out each field by a hardcoded index. This works fine for now, but if the template of those emails changes, I will have to update the indexes again. I found out that NLP can help with this, but I would like to know whether there are other ways to do it.
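To make the problem concrete, here is a minimal sketch of one alternative to positional indexes: anchoring each field on nearby keywords with a regular expression, so extra or reworded text around the fields does not shift anything. The sample email wording, the `PATTERN` regex, and the `parse_transaction` helper are all assumptions made up for illustration; a real bank's emails will need their own pattern.

```python
import re
from decimal import Decimal

# Hypothetical sample text; real bank emails will be worded differently.
SAMPLE = (
    "Dear customer, INR 1,250.00 was debited from your account XX1234 "
    "on 05-03-2021 at AMAZON."
)

# Each field is located by the keywords around it, not by its token position.
PATTERN = re.compile(
    r"(?P<currency>[A-Z]{3})\s+(?P<amount>[\d,]+\.\d{2})\s+was\s+"
    r"(?P<kind>debited|credited|withdrawn)\b"
    r".*?account\s+(?P<account>\w+)"
    r".*?on\s+(?P<date>\d{2}-\d{2}-\d{4})",
    re.DOTALL,
)


def parse_transaction(text):
    """Return a dict of extracted fields, or None if the text does not match."""
    match = PATTERN.search(text)
    if match is None:
        # An unmatched email is reported explicitly instead of yielding wrong fields.
        return None
    fields = match.groupdict()
    # Strip thousands separators before converting to an exact decimal amount.
    fields["amount"] = Decimal(fields["amount"].replace(",", ""))
    return fields


if __name__ == "__main__":
    print(parse_transaction(SAMPLE))
```

This still depends on the wording of the template, but a changed greeting or added footer no longer breaks it, and a template change that does break it shows up as `None` rather than as silently wrong data.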
I'm pretty sure Apache Tika [0] will do what you need (and lots more).<p>[0] <a href="https://tika.apache.org/" rel="nofollow">https://tika.apache.org/</a>