Both Elastic and OpenSearch also have S3-based stateless versions of their search engines in the works. The Elastic one is currently available in early access. It would be interesting to see how this one improves on both approaches.<p>With all the licensing complexities around Elastic, more choice is not necessarily bad.<p>The tradeoff with using S3 is indexing latency (the time between a write being accepted and it becoming visible via search) vs. easy scaling. The default refresh interval (the time the search engine waits before making recent changes to an index visible to search) is 1 second. That means it takes up to 1 second before indices get updated with recently added data. A common performance tweak is to increase this to 5 or more seconds. That reduces the number of writes and can improve write throughput, which is helpful when you are writing lots of data.<p>If you need low latency (anything where users might want to "read" their own writes), clustered approaches are more flexible. If you can afford to wait a few seconds, using S3 for storage becomes more feasible.<p>Lucene internally stores documents in segments. Segments are write-once (the index is effectively append only), so there tend to be cleanup activities that rewrite and merge segments to e.g. get rid of deleted documents or deal with fragmentation. Once segments are written, having some jobs merge them in the background isn't that hard. My guess is that with S3, the trick is to batch up some amount of writes, store them as one segment, and put that in S3.<p>S3 is not a proper file system and file operations are relatively expensive (compared to a file system) because they are essentially REST API calls. So, this favors use cases where you write segments in bulk and never/rarely update or delete individual things that you write.
An update or delete would require changing a segment in S3, which means deleting and rewriting it and then somehow notifying other nodes that they need to re-read that segment.<p>For both Elasticsearch and OpenSearch, log data and other time series data fit this very well because you typically don't have to deal with deletes or updates.
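<p>To make the batching idea concrete, here is a rough sketch of what I'm guessing at. It is not how Elastic or OpenSearch actually implement this. `SegmentBuffer`, the `segments/` key layout, and the JSON encoding are all made up for illustration, and a plain dict stands in for an S3 bucket. Writes are buffered, then flushed as one object per refresh interval, and only flushed segments are searchable.

```python
import json
import time


class SegmentBuffer:
    """Hypothetical write buffer: one immutable 'segment' object per flush."""

    def __init__(self, store, refresh_interval=5.0, clock=time.monotonic):
        self.store = store                    # dict standing in for an S3 bucket
        self.refresh_interval = refresh_interval
        self.clock = clock                    # injectable so tests can fake time
        self.pending = []                     # accepted but not yet searchable
        self.last_flush = clock()
        self.segment_no = 0

    def index(self, doc):
        """Accept a write; it stays invisible to search until the next flush."""
        self.pending.append(doc)
        if self.clock() - self.last_flush >= self.refresh_interval:
            self.flush()

    def flush(self):
        """Store all pending docs as a single object (think: one PUT)."""
        if self.pending:
            key = f"segments/{self.segment_no:08d}.json"
            self.store[key] = json.dumps(self.pending)
            self.segment_no += 1
            self.pending = []
        self.last_flush = self.clock()

    def search(self, field, value):
        """Scan flushed segments only; buffered docs are not visible."""
        hits = []
        for key in sorted(self.store):
            for doc in json.loads(self.store[key]):
                if doc.get(field) == value:
                    hits.append(doc)
        return hits
```

A longer `refresh_interval` means fewer, larger objects and fewer API calls, at the cost of writes taking longer to show up in search. Updating or deleting a single document would mean rewriting a whole segment object, which is why append-mostly data like logs is the easy case.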