LLMs currently seem to perform best on summarization and question-answering tasks when relevant chunks of source text have been strategically placed within their input context via a semantic chunking and semantic search preprocessing step. With EdgarGPT we are exploring the limits of this strategy against dense financial disclosure documents, with an eye towards enabling more customized processing of these filings.
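For anyone unfamiliar with the pattern, the preprocessing step looks roughly like the sketch below. This is a minimal illustration, not EdgarGPT's actual code: the function names are made up, chunking on blank lines stands in for real semantic chunking, and the term-frequency "embedding" stands in for a real embedding model.

    from collections import Counter
    import math

    def chunk_filing(text, max_chars=1000):
        # Rough chunking on paragraph breaks; proper semantic chunking
        # would split on section/topic boundaries instead.
        paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
        chunks, current = [], ""
        for p in paragraphs:
            if current and len(current) + len(p) > max_chars:
                chunks.append(current)
                current = ""
            current = (current + "\n\n" + p).strip()
        if current:
            chunks.append(current)
        return chunks

    def embed(text):
        # Stand-in embedding: term-frequency vector. Swap in a real model.
        return Counter(text.lower().split())

    def cosine(a, b):
        dot = sum(a[t] * b[t] for t in a if t in b)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    def build_prompt(question, filing_text, k=3):
        # Retrieve the k chunks most similar to the question and
        # place them ahead of the question in the model's context.
        chunks = chunk_filing(filing_text)
        q_vec = embed(question)
        ranked = sorted(chunks, key=lambda c: cosine(embed(c), q_vec), reverse=True)
        context = "\n---\n".join(ranked[:k])
        return f"Context from the filing:\n{context}\n\nQuestion: {question}\nAnswer:"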
This is really cool - would be awesome to connect at some point. I co-founded Marqo, which also enables this by providing the semantic search component of this workflow. We've found some interesting results combining it with LLMs. https://www.marqo.ai/blog/from-iron-manual-to-ironman-augmenting-gpt-with-marqo-for-fast-editable-memory-to-enable-context-aware-question-answering
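For reference, the retrieval side of that workflow with the Marqo Python client looks roughly like the sketch below. The index name and document fields are invented for illustration, and exact client arguments can differ between Marqo versions; the hits returned by the search are what get passed into the LLM prompt as context.

    import marqo

    # Assumes a Marqo instance running locally on the default port.
    mq = marqo.Client(url="http://localhost:8882")

    mq.create_index("filings")
    mq.index("filings").add_documents(
        [{"_id": "10k-chunk-1", "text": "Risk factors include ..."}],
        tensor_fields=["text"],
    )

    # Semantic search over the indexed chunks; results["hits"] holds
    # the most relevant chunks to feed into the LLM's context.
    results = mq.index("filings").search(q="What are the main risk factors?", limit=3)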