Call me maybe: Elasticsearch

392 points by itamarhaber almost 11 years ago

17 comments

keypusher almost 11 years ago

As someone who is currently building out a distributed storage cluster system, this series has been amazing. Not only are the results informative, but I have learnt so much about clustering, consistency, testing methodology and what to look for when evaluating reliability on these systems. Very nicely done.


Radim almost 11 years ago

Indeed. ElasticSearch is super useful, but its docs used to be of the frustrating variety: "Explain minute API details, as if a passing note from the lead dev to himself, assuming all context and concepts are understood and obvious. Don't bother with why's & what's & high-level nonsense." It's been improving lately though; "going big" helped ES here. My personal favourite: an issue from 3 years ago, where ElasticSearch returns incorrect facet counts (as in, fundamentally BROKEN faceting). Still unresolved: https://github.com/elasticsearch/elasticsearch/issues/1305


syntern almost 11 years ago

TL;DR: "Some people actually advocate using Elasticsearch as a primary data store; I think this is somewhat less than advisable at present." "The good news is that Elasticsearch is a search engine, and you can often afford the loss of search results for a while." My personal favorite solution would reliably channel data from a Riak cluster to an ES cluster. Does anyone know if there is something like that out there?


lbarrow almost 11 years ago

Aphyr is truly amazing. This series of blog posts has introduced me to a rigorous, careful way of thinking about distributed systems. I'm a much, much better developer for having read his blog. How many people can you really say that about?


silenteh almost 11 years ago

I am currently writing a Golang client for Elasticsearch which uses the native binary protocol, and I have to say the lack of documentation about it is making the process really painful! I tried to use the Elasticsearch thrift plugin, but unfortunately it does not work with versions 1.1 and 1.2. So basically I have to inspect each and every byte of each and every request and response in order to be able to send or parse data. While developing the client I managed several times to crash the Elasticsearch server by sending malformed packets. This also led me to review the networking part of the Elasticsearch code, and I think it needs a refactoring and a better, deeper and cleaner usage of Netty. I hope they will soon sort out this and the problems mentioned in the article, since I think that Elasticsearch is really an amazing product!


Torn almost 11 years ago

From the article comments: > seems like the ES team is moving in the right direction with testing this stuff https://github.com/elasticsearch/elasticsearch/commit/ef759322231b21aa3c8b160f86b895483cff1ebf

bzelip almost 11 years ago

I'm not much of a programmer, but it's great to know about this guy. The diversity link someone posted here [0] is inspiring. Off topic question about Aphyr's website: the stylesheet is linked to only as `<link rel="stylesheet" type="text/css" href="/css" />`. How and why does this work? [0] http://aphyr.com/diversity


nsxwolf almost 11 years ago

Can someone explain the "Call me maybe" theme/meme? I know it is a song, but what's the relevance here? Edit: I looked at the archive and found the original post where it is ... erm... explained?


room271 almost 11 years ago

The 'Nic' he quotes in the article is me :) This is seriously the highlight of my career, folks! Although to be fair, he quotes me generously. Later on in that discussion I give up my wisdom and get confused again.


cjbprime almost 11 years ago

Aphyr is amazing and we're very lucky to have him doing this!

sjaaktrekhaak almost 11 years ago

Shay says they've already "fixed this" [1] and the aim is to get the fix in 1.3. I'm interested in what the fix actually is. To quote Shay: "FYI, the improved_zen branch already contains a fix for this issue, we are letting it bake as this is a delicate change, and we are working on adding more test scenarios (aside from the one detailed in this issue) to make sure. The plan is to aim at getting this into 1.3. We have not yet ran Jespen (which simulates the same scenario we already simulate in our test), but we will do it as well." [1]: https://github.com/elasticsearch/elasticsearch/issues/2488#issuecomment-46135721

programminggeek almost 11 years ago

Elasticsearch is great for um... search. Like, you use it as an index to point to the real system of record and it is not expected to be perfect (or shouldn't be). I thought it worked out really well when used to alleviate pressure on the database which was being used for search results (which were sometimes ajax live search style). The big benefit was that our database usage went down and search was better/more reliable. The other big benefit is that if Elasticsearch goes down, only search stops working. That is FAR better than the whole database and site going down with it. At a big enough scale, with thousands of dollars in transactions every day, the database can't go down. Search can break gracefully, but the spice must flow (so to speak).
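
A minimal sketch of the "search can break gracefully" pattern described above, assuming the index only returns record IDs and the records themselves live in a separate system of record. The names `search_products`, `search_index` and `load_records` are hypothetical placeholders, not part of any Elasticsearch client API.

```python
from typing import Callable, List, Optional


def search_products(
    query: str,
    search_index: Callable[[str], List[str]],
    load_records: Callable[[List[str]], List[dict]],
) -> Optional[List[dict]]:
    """Return matching records, or None when search is unavailable."""
    try:
        # Hypothetical call into the search cluster; may raise if it is down.
        ids = search_index(query)
    except Exception:
        # Degrade gracefully: the caller can show "search is unavailable"
        # while the rest of the site keeps working.
        return None
    # The index only supplies IDs; the data comes from the system of record.
    # A database failure here is deliberately not swallowed.
    return load_records(ids)
```

Returning `None` rather than an empty list lets the caller distinguish "no results" from "search is down".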


AznHisoka almost 11 years ago

These problems are a major reason why I decided not to go with the typical one-cluster, multiple-replica infrastructure. Instead I have multiple clusters, 0 replicas, and load balance against those clusters. No split brain problem, but the same data availability benefits as having replicas. Granted, the logic is all in my app now. When a cluster is down, I have retry logic to keep reindexing that data to that cluster until it succeeds.
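
One way the app-side retry logic described above could look, as a rough sketch rather than the commenter's actual code. It assumes each cluster is reachable through a hypothetical indexer callable that raises on failure; `MultiClusterWriter` and its methods are illustrative names, and the backlog is kept in memory only, so it would be lost on restart.

```python
from collections import deque
from typing import Callable, Deque, Dict, Tuple

# Hypothetical signature: an indexer takes (doc_id, doc) and raises on failure.
Indexer = Callable[[str, dict], None]


class MultiClusterWriter:
    """Write each document to every cluster; keep failures queued for retry."""

    def __init__(self, clusters: Dict[str, Indexer]) -> None:
        self.clusters = clusters
        # One backlog per cluster, so a down cluster never blocks the others.
        self.pending: Dict[str, Deque[Tuple[str, dict]]] = {
            name: deque() for name in clusters
        }

    def index(self, doc_id: str, doc: dict) -> None:
        for name in self.clusters:
            self.pending[name].append((doc_id, doc))
        self.flush()

    def flush(self) -> None:
        """Drain each backlog in order; stop at the first failure per cluster."""
        for name, indexer in self.clusters.items():
            queue = self.pending[name]
            while queue:
                doc_id, doc = queue[0]
                try:
                    indexer(doc_id, doc)
                except Exception:
                    break  # cluster still down; keep the backlog for later
                queue.popleft()
```

Calling `flush()` periodically (for example from a timer) retries the backlog for any cluster that was down, while healthy clusters receive writes immediately.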


linux_devil almost 11 years ago

Informative. I started looking into ES this past week for indexing and really like its ease of use, but there are definitely a few points to keep in mind.

fasteo almost 11 years ago

I am a happy user of ElasticSearch. I always thought that it would pass the "Call me maybe" test with flying colors... Well, not.

johnnymonster almost 11 years ago

What's up with the OP and Barbie memes?


eli almost 11 years ago

I am probably just an aging fuddy duddy, but the animated GIFs make me much less likely to share this article with my team.
