Open Source Search with Lucene & Solr

58 points by igrigorik over 14 years ago

10 comments

fizx over 14 years ago
For anyone who would like to take Solr for a spin, I invite you to check out nzadrozny's and my startup: http://websolr.com/

We are a bootstrapped startup providing managed Solr hosting in the cloud (currently EC2). We're all about making the operational side of high-performance Solr hosting as one-click easy as possible, so developers can focus their time on doing cool stuff with it.

We love HN and are frequent commenters/lurkers around here, so we made a "HN10" coupon which you can use on signup to get a month of our Silver plan for free.
evilhackerdude over 14 years ago
Riak Search has been released recently. It’s got Lucene and part of the Solr HTTP API built-in.

Basically you push json/xml/whatever documents into buckets. The docs will be indexed, i.e., by field names (json & xml) or simply fulltext. It is pretty cool because it’s based on Riak Core and thus has the same benefits as Riak K/V. Lucene runs transparently in the background - afaik you never even have to touch it.

Read more in their wiki: https://wiki.basho.com/display/RIAK/Riak+Search

Especially: https://wiki.basho.com/display/RIAK/Riak+Search+-+Indexing+and+Querying+Riak+KV+Data
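For readers unfamiliar with what "indexed by field names" means in practice, here is a toy Python sketch of a field-aware inverted index over JSON documents. It is only an illustration of the idea, not how Riak Search or Lucene is implemented; the names `tokenize`, `index_documents`, and `search`, and the sample documents, are all made up for this example.

```python
import json
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase a string and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def index_documents(docs):
    """Build a toy inverted index: (field, term) -> set of document keys.

    `docs` maps a document key to a JSON string. Only top-level string
    values are indexed; everything else is skipped in this sketch.
    """
    index = defaultdict(set)
    for key, raw in docs.items():
        for field, value in json.loads(raw).items():
            if isinstance(value, str):
                for term in tokenize(value):
                    index[(field, term)].add(key)
    return index


def search(index, field, term):
    """Return the sorted keys of documents whose `field` contains `term`."""
    return sorted(index.get((field, term.lower()), set()))


# Hypothetical sample documents, keyed the way a bucket might key them.
docs = {
    "doc1": json.dumps({"title": "Open source search", "body": "Lucene and Solr"}),
    "doc2": json.dumps({"title": "Riak notes", "body": "Search built on Riak Core"}),
}
index = index_documents(docs)
```

Searching with `search(index, "title", "search")` only looks at titles, while the same term under `"body"` matches a different document, which is the practical difference between field-level and plain fulltext indexing.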
ankimal over 14 years ago
We use an Enterprise Search Platform (our biggest software acquisition) minus the support (another dumb idea). The entire thing is like a black box. It takes days to figure out what "Error: FS error" actually means. For a new project, we used Solr to maintain a smaller index and have never looked back since. For anybody about to start building a search index, Lucene/Solr is the way to go.
dangrover over 14 years ago
Haystack for Django is a really nice way to integrate with these systems. You can use Lucene, Solr, or Whoosh as backends for your search.
akozak over 14 years ago
At Creative Commons we use Lucene/Nutch for our educational search prototype DiscoverEd: http://wiki.creativecommons.org/DiscoverEd

It was easy enough to add in our special sauce like a triple-store for consuming and displaying semantic data (I guess I can say easy since I didn't do it myself).
nkurz over 14 years ago
I'm a fan and contributor to Lucy, which is mentioned briefly in the header: http://incubator.apache.org/lucy/

While Lucy did start out as a C port of Lucene (hence the name), it's since broken any attempts at Lucene compatibility. Instead, it's aiming to be a fast and flexible standalone C core with bindings to higher-level languages. Since it's growing out of KinoSearch, its best-developed bindings are in Perl, but support for all the usual suspects (Python, Ruby, etc.) is planned.

Technically, the main difference from Lucene is that it gets cozier with the machine: the OS is our VM. It's mostly mmap() IO, and we're very conscious of paging and cache issues. While we're trying to maintain 32-bit backward compatibility, we take full advantage of 64-bit solutions when they offer themselves. The scripted bindings are also very cool: you can do things like make callbacks to scoring methods in your script language to truly customize your results.

If for some reason you're not finding what you need in Lucene and Solr, check it out. We just became a full Apache incubator project, and are eager to get more developers involved. You'll find clean C code, decent documentation, and a low-traffic but very responsive list. If you're using Perl, C or C++, you'll get a great product from the start. If you're using anything else, you'll have to help a lot on the bindings, but I think you'll be quite pleased with the end result.
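As a rough illustration of the mmap()-style IO described above, here is a minimal Python sketch that reads a byte range from a file through a memory map and leaves paging to the OS. This is not Lucy's code (Lucy is written in C); the helper `read_slice` and the file contents are hypothetical and exist only for this example.

```python
import mmap
import os
import tempfile


def read_slice(path, offset, length):
    """Return `length` bytes starting at `offset`, read through a memory map."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ keeps the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # Slicing copies only the requested range out of the mapping.
            return mapped[offset:offset + length]


# Write a small stand-in "index file" so the example is self-contained.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"term:lucy;term:lucene;")

segment = read_slice(path, 5, 4)
os.remove(path)
```

The point of the design is that the program never issues explicit reads for the whole file; the OS pages data in as the mapping is touched, which is why paging and cache behaviour matter so much for this approach.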
spoondan over 14 years ago
Lucene is great but I wish schemas were an optional part of Solr. They add complexity and take away flexibility. If you have a photo database where you want searchable metadata describing the subject of the photographs, you can do this easily and naturally in Lucene. But Solr requires you to either (1) define the available metadata in advance or (2) expose field typing details to your users (so a field for birthday is actually "birthday_d", with the "_d" indicating it's a date). Both of these are very unattractive to me.

The worst part is that I have no idea what benefits schemas are supposed to bring me. The documentation vaguely promises that schemas "can drive more intelligent processing", but I have a feeling I could get that more easily without schemas. It also tells me that "explicit types eliminate the need for guessing of types," but only, apparently, by requiring users to *understand and remember* them.
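One possible workaround for option (2), sketched here in Python, is a thin translation layer in the application that adds the type suffix when indexing and strips it before showing field names to users. This is only a sketch under assumptions: the "_d" suffix for dates follows the comment above, while the "_i" and "_s" suffixes and the helper names `to_index_fields` and `to_user_field` are hypothetical and not taken from any particular Solr schema.

```python
from datetime import date

# Hypothetical suffix table for this sketch. "_d" for dates follows the
# comment above; the other suffixes are assumptions, not Solr defaults.
SUFFIX_BY_TYPE = {date: "_d", int: "_i", str: "_s"}


def to_index_fields(metadata):
    """Map user-facing field names to suffixed index field names."""
    indexed = {}
    for name, value in metadata.items():
        # Exact type lookup, so a bool is not silently treated as an int.
        suffix = SUFFIX_BY_TYPE.get(type(value))
        if suffix is None:
            raise TypeError(
                f"no suffix configured for {name!r} ({type(value).__name__})"
            )
        indexed[name + suffix] = value
    return indexed


def to_user_field(index_field):
    """Strip a known type suffix so users never see it."""
    for suffix in SUFFIX_BY_TYPE.values():
        if index_field.endswith(suffix):
            return index_field[: -len(suffix)]
    return index_field


photo = {"birthday": date(1980, 1, 2), "subject": "Ada"}
indexed = to_index_fields(photo)
```

This keeps the typing detail inside the application, though it does not remove the underlying need to decide on a type for each value, and a user field that itself ends in one of the suffixes would need extra care.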
cowmixtoo over 14 years ago
So has anyone used this combination for realtime and historical log searching (like what Splunk offers)?
reinhardt over 14 years ago
Any experience on how Lucene/Solr stacks up against other search tools such as Sphinx or Xapian?
known over 14 years ago
I prefer http://aspseek.org