I've seen a lot of papers recently tackle the needle-in-a-haystack problem with LLMs, and I think this approach (and more generally, any in-context solution) is a mistake.

IMO the best way to handle this is RAG + multi-shot prompting (+ symbolic mapping to an actual data structure). For example: a pre-processing step that partitions the context into "records," another step that inserts the records (potentially splitting them up) into a RAG database, and another step that makes fuzzy queries. So if you ask for record 1234, you get an exact match on that line (or set of lines, or record, or whatever) of the original context. And if you ask for "elephant" but there's no "elephant" in the context, you might get the "hippo" record because of the RAG reranking.

This is a lot of work, and is essentially a data pipeline, but the results are much better curated than just fine-tuning and hoping that generalized needle-in-a-haystack search will work reliably as part of a language model.
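To make the three steps concrete, here's a minimal sketch of that pipeline. Everything in it is hypothetical: the one-record-per-line `ID: body` format, the `RecordIndex` class, and especially the toy bag-of-words `embed` function, which only catches lexical overlap. A real pipeline would swap in a semantic embedding model and a vector store with a reranker, which is what would let an "elephant" query surface the "hippo" record.

```python
import math
import re
from collections import Counter

def partition_records(context: str) -> dict[str, str]:
    """Step 1: partition the raw context into records keyed by ID.
    Assumes one 'ID: body' record per line, purely for illustration."""
    records = {}
    for line in context.splitlines():
        m = re.match(r"(\w+):\s*(.+)", line)
        if m:
            records[m.group(1)] = m.group(2)
    return records

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' stand-in. A real pipeline would call
    a semantic embedding model here, so near-synonyms also match."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class RecordIndex:
    """Step 2: index the records for exact and fuzzy lookup."""
    def __init__(self, records: dict[str, str]):
        self.records = records
        self.vectors = {rid: embed(body) for rid, body in records.items()}

    def query(self, q: str, k: int = 3) -> list[tuple[str, str]]:
        """Step 3: an exact match on a record ID wins outright; otherwise
        fall back to fuzzy similarity over record bodies (a stand-in for
        RAG retrieval + reranking)."""
        if q in self.records:
            return [(q, self.records[q])]
        qv = embed(q)
        ranked = sorted(self.records,
                        key=lambda rid: cosine(qv, self.vectors[rid]),
                        reverse=True)
        return [(rid, self.records[rid]) for rid in ranked[:k]]

context = "1234: hippo sighted near the river\n5678: delivery truck arrived late"
index = RecordIndex(partition_records(context))
print(index.query("1234"))   # exact match on the record ID
print(index.query("hippo"))  # fuzzy match over record bodies
```

The point of the split is that exact lookups never pass through the fuzzy path at all, so record 1234 always comes back verbatim from the original context; only queries with no exact hit pay the cost (and take on the risk) of similarity-based retrieval.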