For the uninitiated, a geocoder is maps-tech jargon for a search engine for addresses and points of interest.<p>Geocoders are expensive to run. Like, really expensive. Like, $100+/month per instance expensive unless you go for a budget provider. I've been poking at this problem for about a month now and I think I've come up with something kind of cool. I'm calling it Airmail. Airmail's unique feature is that it can query against a remote index, e.g. on object storage or on a static site somewhere. This, along with low memory requirements, means it's about 10x cheaper to run an Airmail instance than anything else in this space that I'm aware of. It does great on 512MB of RAM and doesn't require any storage other than the root disk and remote index, so storage costs stay fixed as you scale horizontally. Pretty neat.<p>Demo here: <a href="https://airmail.rs/#demo-section" rel="nofollow">https://airmail.rs/#demo-section</a><p>Writeup: <a href="https://blog.ellenhp.me/host-a-planet-scale-geocoder-for-10-month" rel="nofollow">https://blog.ellenhp.me/host-a-planet-scale-geocoder-for-10-...</a><p>Repository: <a href="https://github.com/ellenhp/airmail">https://github.com/ellenhp/airmail</a>
I don't know if it fits your specific use case, but for pure search take a look at sonic (<a href="https://github.com/valeriansaliou/sonic">https://github.com/valeriansaliou/sonic</a>). It's blazing fast and requires very few resources.
Congrats on the achievement.<p>One tricky part when working at "planet scale" is parsing and matching the results for multiple countries. I tried some addresses in Brazil without success. Queries like "Starbucks Sao Paulo" return some results but addresses like "Avenida Paulista 100" (or its variations) don't.<p>Last time I looked (~2018) pelias-parser used some ML training and the results weren't very good for Brazil. I'm guessing in 2024, an open-source fine-tuned LLM would do a good job?
Is there any way to zoom in on the demo map on mobile?<p>I’ve enjoyed my brief dalliances with digital cartography. I’m grateful for a stack like this that I can explore.
I’m a Range request fan-boy, so thanks for sharing. Byte-indexing static objects is basically a brilliant point of light in the dark universe of code. I have fixed dozens of systems that read entire zip files over the network and into memory just to get the list of files inside (totally unnecessary if the host supports Range headers).
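For anyone curious what the zip case looks like in practice, here is a rough Python sketch of listing a zip's contents from byte ranges alone. The names `list_zip_names`, `http_range_fetcher`, and the `fetch(start, end)` callable are made up for illustration, and it assumes a plain (non-Zip64) archive on a host that honors `Range` headers. It reads the end-of-central-directory record from the tail of the file, then fetches only the central directory it points to.

```python
import io
import struct
import urllib.request
import zipfile
from typing import Callable, List

# Hypothetical interface: fetch(start, end) returns the inclusive byte range,
# mirroring an HTTP "Range: bytes=start-end" request.
RangeFetcher = Callable[[int, int], bytes]

EOCD_SIG = b"PK\x05\x06"  # end-of-central-directory signature
CDIR_SIG = b"PK\x01\x02"  # central directory file header signature


def http_range_fetcher(url: str) -> RangeFetcher:
    """Build a fetcher backed by HTTP Range requests (assumes the host supports them)."""
    def fetch(start: int, end: int) -> bytes:
        req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
        with urllib.request.urlopen(req) as resp:
            if resp.status != 206:
                raise RuntimeError("server ignored the Range header")
            return resp.read()
    return fetch


def list_zip_names(fetch: RangeFetcher, total_size: int) -> List[str]:
    """List file names in a zip using two range reads. No Zip64 support."""
    # The EOCD record is 22 bytes plus an optional comment of up to 65535 bytes.
    tail_len = min(total_size, 22 + 65535)
    tail = fetch(total_size - tail_len, total_size - 1)
    pos = tail.rfind(EOCD_SIG)
    if pos < 0:
        raise ValueError("end-of-central-directory record not found")
    # sig, disk no., cd start disk, entries on disk, total entries,
    # cd size, cd offset, comment length
    (_, _, _, _, total_entries, cd_size, cd_offset, _) = struct.unpack(
        "<4sHHHHIIH", tail[pos:pos + 22]
    )
    cdir = fetch(cd_offset, cd_offset + cd_size - 1) if cd_size else b""
    names = []
    off = 0
    for _ in range(total_entries):
        if cdir[off:off + 4] != CDIR_SIG:
            raise ValueError("corrupt central directory")
        # Name, extra, and comment lengths sit at offsets 28, 30, and 32 of
        # the 46-byte fixed header.
        name_len, extra_len, comment_len = struct.unpack(
            "<HHH", cdir[off + 28:off + 34]
        )
        # Simplification: decode as UTF-8 (older archives may use CP437).
        names.append(cdir[off + 46:off + 46 + name_len].decode("utf-8", "replace"))
        off += 46 + name_len + extra_len + comment_len
    return names


if __name__ == "__main__":
    # Local demo: an in-memory zip stands in for the remote object.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        zf.writestr("a.txt", b"x" * 200_000)
        zf.writestr("dir/b.bin", b"y" * 10)
    data = buf.getvalue()
    print(list_zip_names(lambda s, e: data[s:e + 1], len(data)))
```

Against a real host you would get `total_size` from a `HEAD` request's `Content-Length` and pass `http_range_fetcher(url)` as `fetch`. The worst case is roughly 64 KiB for the tail plus the central directory itself, regardless of how large the archive is.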
Woah this is ridiculously good. You've done a good job here working off Pelias. I did find managing the Elasticsearch cluster for a production instance of Pelias hopelessly annoying.<p>Remarkable.
<i>Geocoders are expensive to run. Like, really expensive. Like, $100+/month per instance expensive unless you go for a budget provider.</i><p>When you write like this it sounds very unprofessional. Also you are basically saying "this is really expensive, unless it isn't".<p>Why is there any difficulty in this at all? Why would this even need to be something someone subscribes to? It is basically a nearest neighbor search.