Logstash, Elasticsearch and Kibana are just fantastic. After being unsatisfied with a whole bunch of Logging As A Service providers (I tried loggly.com, logentries.com and splunkstorm.com) I spent an afternoon setting up Logstash and co and couldn't be happier.<p>There's a neat demo of Kibana here: <a href="http://demo.kibana.org/#/dashboard/elasticsearch/Logstash%20Search" rel="nofollow">http://demo.kibana.org/#/dashboard/elasticsearch/Logstash%20...</a><p>The only thing that isn't fully baked in with this stack is alerts (e.g. sending an email if a certain error log message comes in), but you can do that using Logstash filters and outputs, although there's no pretty UI.<p>There are some excellent Chef cookbooks for setting up Logstash and friends too:<p>- Logstash: <a href="https://github.com/lusis/chef-logstash" rel="nofollow">https://github.com/lusis/chef-logstash</a><p>- Elasticsearch: <a href="https://github.com/elasticsearch/cookbook-elasticsearch" rel="nofollow">https://github.com/elasticsearch/cookbook-elasticsearch</a><p>- Kibana: <a href="https://github.com/lusis/chef-kibana" rel="nofollow">https://github.com/lusis/chef-kibana</a>
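To make the alerting idea above concrete, here is a minimal sketch of the "filter, then output" pattern in plain Python. It is not Logstash configuration; in Logstash the same shape would be a conditional or filter feeding an email output. The rule `ALERT_PATTERN`, the event fields `message` and `host`, and the function name `build_alert` are assumptions made up for illustration.

```python
import re
from email.message import EmailMessage
from typing import Optional

# Hypothetical rule: alert on any message that mentions a payment failure.
ALERT_PATTERN = re.compile(r"ERROR .*payment failed", re.IGNORECASE)


def build_alert(event: dict, to_addr: str) -> Optional[EmailMessage]:
    """Return an email for events whose message matches the rule, else None."""
    message = event.get("message", "")
    if not ALERT_PATTERN.search(message):
        return None  # the "filter" step: most events produce no alert
    email = EmailMessage()
    email["To"] = to_addr
    email["Subject"] = "Log alert from " + event.get("host", "unknown host")
    email.set_content(message)  # the "output" step would hand this to an SMTP client
    return email
```

Actually sending the message (for example with `smtplib`) is left out, so the sketch runs without a mail server.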
For anyone who can't immediately see the significance: this is Elasticsearch's entry into real-time log analytics. There is plenty of room for innovation and financial opportunity in this area, given the success of Splunk (valued at $5 billion) along with companies like SumoLogic and LogLogic.<p>What's most interesting is that Elasticsearch looks like a completely open source (and widely used) alternative to a product that Splunk charges close to Oracle pricing for.<p>Shameless plug: If you're looking for an opportunity at a well-funded true real-time analytics company in Silicon Valley, feel free to ping me. There's lots of exciting and fun work to do in this area.
logstash + elasticsearch are pretty amazing. however, if you are generating a high rate of log entries you may want to consider using mozilla hekad instead (<a href="http://hekad.readthedocs.org/en/latest/" rel="nofollow">http://hekad.readthedocs.org/en/latest/</a>). on our servers logstash was running around 20% CPU during quiet periods while hekad was running around 1-2% CPU. during busy periods i think logstash was going up to 100% CPU while hekad was sitting around 20-30% CPU.<p>hekad is written in go, which compiles down to native code, while logstash is written in jruby, which is not the most performant runtime.
I'm confused. Can someone explain to me why this is so obviously interesting, yet not worth discussing, that it stands - as of 2 hours after submission - at 75 points with zero comments?<p>Honestly, I've never heard of either company, although I obviously wish them the best of luck. Am I just out of touch?
This is great news. Our centralized logging system at Semantics3 (<a href="https://semantics3.com" rel="nofollow">https://semantics3.com</a>) is built using Logstash+Kibana+Rsyslog+ElasticSearch. Running off a single EC2 large instance, it has been able to seamlessly aggregate and process logs from about 200-300 instances, processing an average of about 15 GB of log data. We hit some performance bottlenecks (particularly with ElasticSearch) when our number of instances went beyond the 300 mark. But that should get fixed once we shard and distribute ElasticSearch.<p>Looking forward to some really tight integration between Logstash, ES and Kibana.
Logstash is awesome. We use it at Swiftype to index all our logs and it's super helpful for nailing down support requests and bugs (using Kibana).<p>Since you can access the logs via the Elasticsearch API, we made users' recent logs available to them in our dashboard: <a href="https://swiftype.com/blog/api-logs.html" rel="nofollow">https://swiftype.com/blog/api-logs.html</a>
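As a rough illustration of pulling one user's recent logs through the Elasticsearch API, here is a minimal Python sketch. It is not Swiftype's implementation. The `user_id` field, the function names and the base URL passed to `search` are assumptions; `@timestamp` and the daily `logstash-YYYY.MM.DD` index name are Logstash defaults. The `bool`/`filter` query shape works on current Elasticsearch versions, while older releases used a different form, so check it against the version you run.

```python
import json
import urllib.request
from datetime import date, datetime, timezone


def logstash_index(day: date) -> str:
    """Logstash's default daily index name, e.g. logstash-2013.08.27."""
    return "logstash-" + day.strftime("%Y.%m.%d")


def recent_logs_query(user_id: str, since: datetime, limit: int = 50) -> dict:
    """Build a search body for one user's logs since a given time, newest first.

    `user_id` is a hypothetical field; `@timestamp` is set by Logstash.
    """
    return {
        "size": limit,
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {
            "bool": {
                "filter": [
                    {"term": {"user_id": user_id}},
                    {"range": {"@timestamp": {"gte": since.isoformat()}}},
                ]
            }
        },
    }


def search(base_url: str, index: str, body: dict) -> dict:
    """POST the query to the _search endpoint and return the parsed response."""
    request = urllib.request.Request(
        f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A dashboard backend could call `search("http://localhost:9200", logstash_index(date.today()), recent_logs_query(...))` and render the returned hits, taking care to set `user_id` on the server side so users only ever see their own entries.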
I wonder how all this compares to Graylog2? (<a href="http://graylog2.org/" rel="nofollow">http://graylog2.org/</a>)<p>Those guys are meant to be releasing a new, revamped version at the end of October. From the screenshots and videocasts, it looks pretty good:<p><a href="https://www.facebook.com/graylog2" rel="nofollow">https://www.facebook.com/graylog2</a>
For people using this, I'd be interested to know what kind of throughput you're seeing and your cluster size - I'm trying to find something that can handle upwards of 100k small messages per second for a near-realtime analytics platform, and although this is a bit left-field (compared to Cassandra, HBase etc...) it could be a fit.
Logstash is really great and Jordan is approachable and very helpful. To all interested, I recommend joining their IRC channel (#logstash on Freenode) and talking to the people there a bit.<p>Congrats :)
I'm currently evaluating elasticsearch and riak for real-time analytics of large amounts of data. Does anyone have similar experience? Maybe even with Cassandra; I haven't touched it seriously yet.
Both Logstash and elasticsearch are great - but they both suffer from the same flaw: they're a pain to deploy and it's a pain to manage their packages.
This space is heating up. Cloudera is building a similar stack with Solr - <a href="http://www.cloudera.com/content/cloudera/en/campaign/introducing-search.html" rel="nofollow">http://www.cloudera.com/content/cloudera/en/campaign/introdu...</a>
This is great news as well. At Wildbit we have a dedicated logging server consisting of Rsyslog, ES, LogStash and Kibana3. It's been improving considerably each month.