This article doesn't even get into SQLite's full-text search feature, which really is surprisingly good. It's also very widely used - desktop and mobile apps that offer a search feature often build that using SQLite, so it's a very robust and well-trodden path at this point.<p>I've written a few tools to help work with FTS in SQLite:<p>- <a href="https://sqlite-utils.datasette.io/en/stable/cli.html#configuring-full-text-search" rel="nofollow">https://sqlite-utils.datasette.io/en/stable/cli.html#configu...</a> - command-line utilities for enabling full-text search against an existing SQLite table<p>- <a href="https://sqlite-utils.datasette.io/en/stable/python-api.html#enabling-full-text-search" rel="nofollow">https://sqlite-utils.datasette.io/en/stable/python-api.html#...</a> - that same functionality as a Python library API<p>- <a href="https://docs.datasette.io/en/stable/full_text_search.html" rel="nofollow">https://docs.datasette.io/en/stable/full_text_search.html</a> - Datasette spots full-text search enabled tables and adds a search interface to them<p>I also put together this article exploring classic search relevance algorithms with SQLite - SQLite FTS5 has relevance built in, but FTS4 leaves it as an exercise for the developer which makes it a really fun tool for understanding how algorithms like BM25 actually work: <a href="https://simonwillison.net/2019/Jan/7/exploring-search-relevance-algorithms-sqlite/" rel="nofollow">https://simonwillison.net/2019/Jan/7/exploring-search-releva...</a>
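For anyone who hasn't tried it, here is a minimal sketch of FTS5 with built-in relevance ranking, using Python's standard sqlite3 module. It assumes the SQLite bundled with your Python was compiled with FTS5 (usually the case, but not guaranteed). The table name, sample rows and the search() helper are made up for illustration.

```python
import sqlite3

# Assumption: the underlying SQLite library has the FTS5 extension enabled.
conn = sqlite3.connect(":memory:")

# An FTS5 virtual table; every column is indexed for full-text search.
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(title, body)")
conn.executemany(
    "INSERT INTO docs (title, body) VALUES (?, ?)",
    [
        ("nginx logs", "searching nginx access logs with sqlite"),
        ("elasticsearch", "a distributed search engine built on lucene"),
        ("cooking", "how to bake bread at home"),
    ],
)


def search(query):
    """Return titles of matching rows, best match first (hypothetical helper)."""
    # bm25() gives better matches a lower (more negative) score,
    # so plain ascending order puts the most relevant rows first.
    rows = conn.execute(
        "SELECT title FROM docs WHERE docs MATCH ? ORDER BY bm25(docs)",
        (query,),
    )
    return [title for (title,) in rows]


if __name__ == "__main__":
    print(search("nginx"))
```

The default tokenizer does no stemming, so a query for "search" will not match "searching"; FTS5's porter tokenizer is the usual way to change that.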
I’m always surprised when people haven’t heard of Xapian which is also in this space.<p>I tell people Xapian is the SQLite of search.<p><a href="https://xapian.org/" rel="nofollow">https://xapian.org/</a>
I think replacing Elasticsearch with SQLite is a great idea, even more so if you use the full-text search functions that SQLite includes.<p>I recommend checking out scout[0], which, I think, can be a good replacement for Elasticsearch in some cases. I'm also working on an Elasticsearch replacement built on top of SQLite for my litements[1] project, but it will still take a few weeks to have a working version.<p>[0] <a href="https://github.com/coleifer/scout" rel="nofollow">https://github.com/coleifer/scout</a>
[1] <a href="https://github.com/litements/" rel="nofollow">https://github.com/litements/</a>
I wonder how effective this would be combined with the SQLite setup someone posted last month, which uses a new virtual file system to fetch pages via XHR. Then you would be able to do full-text search on a huge database while only sending a few KB per search.<p><a href="https://news.ycombinator.com/item?id=27016630" rel="nofollow">https://news.ycombinator.com/item?id=27016630</a>
> This query can’t be represented in Diesel’s DSL (a sample of the DSL is demonstrated in Diesel’s getting started).<p>I feel like most of the time, SQL DSLs end up being bad. SQL is already a very high level language. There are a ton of examples of how to do stuff with SQL queries. DSLs, even if they have a way to express the SQL, are not nearly as widespread as SQL and then you will have to spend time figuring out how to translate the SQL into the DSL.<p>In addition, with libraries like sqlx for Rust, you can also get the type safety of a DSL using regular SQL.<p>I would say that as a developer, the time spent getting familiar with SQL is a good investment that will likely pay off across many projects and programming languages.
It's a low-traffic website. Simple non-distributed systems usually outperform complex distributed systems when the scale is small. Right tool for the right job and all.
There are also two good Elasticsearch alternatives in Rust - Sonic[1] and Toshi[2].<p>[1] <a href="https://github.com/valeriansaliou/sonic" rel="nofollow">https://github.com/valeriansaliou/sonic</a><p>[2] <a href="https://github.com/toshi-search/Toshi" rel="nofollow">https://github.com/toshi-search/Toshi</a>
Great work.<p>This is a really excellent exercise for understanding the material need for refactoring.<p>I don't think I share the author's conclusions but it makes a really good test case.<p>"At idle, Elasticsearch uses:<p><pre><code> 1GB of RAM (out of 8GB total)
30% utilization of a CPU
1GB of Disk space after a week
</code></pre>
This doesn’t sound bad, but considering this server houses a dozen containers and the next CPU intensive container uses only 2% of a CPU – Elasticsearch is too heavy. For a server that gets 15 requests a minute, the resource consumption of Elasticsearch doesn’t justify it’s use. What would happen if I received a spike of traffic? I feel like the server would fall over not because the app couldn’t handle it, but log management couldn’t handle the load."<p>The thing is, 'uses more resources than the thing next to it' is maybe a nice reference point to start from, but it doesn't mean much.<p>What does the actual cost/risk/benefit work out to be? Cost in terms of development, maintenance, hosting, opportunity for future expansion?<p>Are we going to need to 'do more soon'? Are the hosting costs even material? Can Rust be easily supported?<p>The bit about 'what if there are more requests' does give pause for thought, but it could be that there's a minimum threshold of resources the service needs to operate, above which there's only marginal, incremental resource consumption. I don't know, I'm only pointing out the possibility.<p>And the choice of Rust, which the author noted has a few nice attributes ... is this a personal choice ... or an optimal choice? Would a Python or Java solution be cleaner? Elasticsearch is based on Lucene (Java), so it might be possible to do something very fast, powerful and extensible there as well.<p>Thanks to the author for both 'doing it' and 'writing about it' - but I think the meta issue here is the technical product case.
See also <a href="https://whoosh.readthedocs.io/en/latest/intro.html" rel="nofollow">https://whoosh.readthedocs.io/en/latest/intro.html</a>, a fast, pure Python search engine library.<p><a href="https://appliedmachinelearning.blog/2018/07/31/developing-a-fast-indexing-and-full-text-search-engine-with-whoosh-a-pure-python-library/" rel="nofollow">https://appliedmachinelearning.blog/2018/07/31/developing-a-...</a><p>?
For full-text search in Rust (and replacing Elastic) have a look at: <a href="https://www.meilisearch.com" rel="nofollow">https://www.meilisearch.com</a>
I built full-text search with SQLite in Go for searching a SQL database of Magic cards, and I was impressed with how well SQLite handled it. I'm not surprised by this finding at all; I've found that even fairly complex queries on different cards are still plenty fast.
> for my nginx access logs<p>Interesting how we can refine use-cases and come up with simpler solutions that work as well or better. It's not a general replacement for Elasticsearch, notably may not scale for searching all the nginx logs for a distributed multi-tenant system.
Didn't Java get play-nice-with-Docker support only quite recently, around 2019 (JDK 8u191, JDK-8146115)?<p>My gut feeling says it won't make too much of a difference.
For nginx, a Prometheus exporter module can be used: <a href="https://github.com/vozlt/nginx-module-vts" rel="nofollow">https://github.com/vozlt/nginx-module-vts</a>