科技回声

7 comments

justizin, nearly 10 years ago

"If this happened our cluster would become unavailable and may have trouble re-clustering."

This was basically the repeated experience I had which caused me to abandon etcd for the time being.

If it can barely ever heal, what the fuck good is it? And I found that it could barely ever heal. A 3-node CoreOS cluster I ran _always_ crashed when it attempted a coordinated update, and could rarely be repaired, even with hours of help from #CoreOS.

Because CoreOS pushes out updates with versions of etcd incompatible with recent versions, the etcd cluster could never survive the upgrade.

Add this to the fact that the CEO of CoreOS told me in person that he expected them to be the _only_ operating system on the internet, and I'm generally not along for the ride with CoreOS any longer.

Consul, Mesos, and Docker are looking good.

Anyone interested in this space should check out:

https://github.com/CiscoCloud/microservices-infrastructure

jefe78, nearly 10 years ago

... thanks Monsanto?

In all seriousness, this is really interesting. They solved some of the problems associated with persisting a cluster and we're likely going to use that. Feels weird thanking them for anything though.

Edit: Is anyone using CoreOS in a physical DC? We're using AWS with ~1.5k VMs but have another 5-6k hosts in physical DCs. Trying to move us towards containers but struggling.

yeukhon, nearly 10 years ago

I think they fixed the etcd cluster problem in the 2.0 release (previously the 0.5 branch).

For example, we use CF (an old version), and we hit https://github.com/coreos/etcd/issues/863.

KnownSubset, nearly 10 years ago

In my experience etcd is pretty rock solid, until you start using it across availability zones. If you then add SSL into the mix, the reliability drops even further if you are using the default configuration. At that point you need to start tweaking the heartbeat and timeout parameters for the cluster to stay stable.
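
A rough sketch of the kind of tuning described above, in Python. It assumes the rule of thumb from the etcd tuning guide: a heartbeat interval around the round-trip time between peers, and an election timeout of at least ten times that, capped at 50 s. The function name `suggest_etcd_timeouts` and the 100 ms floor are illustrative assumptions, not something taken from this thread. `ETCD_HEARTBEAT_INTERVAL` and `ETCD_ELECTION_TIMEOUT` are the environment-variable forms of etcd's `--heartbeat-interval` and `--election-timeout` flags, both in milliseconds.

```python
import math


def suggest_etcd_timeouts(rtt_ms):
    """Suggest heartbeat and election timeouts (in ms) from the peer round-trip time.

    Rule of thumb only: measure the RTT between your own availability zones
    (including TLS overhead) and validate the result under real load.
    """
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")

    # Heartbeat roughly equal to the RTT. The 100 ms floor (etcd's default
    # heartbeat) is an assumption made for this sketch.
    heartbeat = max(100, math.ceil(rtt_ms))

    # Election timeout: at least 10x the heartbeat and 10x the RTT,
    # capped at the 50 s upper limit mentioned in the tuning guide.
    election = min(50_000, max(10 * heartbeat, math.ceil(10 * rtt_ms)))

    return {
        "ETCD_HEARTBEAT_INTERVAL": heartbeat,
        "ETCD_ELECTION_TIMEOUT": election,
    }


if __name__ == "__main__":
    # Hypothetical 250 ms worst-case RTT between members.
    for name, value in suggest_etcd_timeouts(250).items():
        print(f"{name}={value}")
```

Every member should run with the same values; mixing different timeouts within one cluster is a likely source of instability.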

narsil, nearly 10 years ago

We solve the bootstrapping problem with an internal ELB instead.

Autoscaling Groups can be configured to have instances join multiple ELBs. One is the regular ELB used to access the instances; the other is an internal ELB that only allows connections from instances in the cluster to other instances in the cluster on the etcd port (controlled via security groups).

When an instance comes up, it adds itself to the cluster via the internal ELB's hostname. The hostname is set in Route 53.

The biggest issues we've been having with etcd continue to be simultaneous reboots and/or joins to the cluster. It would also be great if the membership timeout feature that used to exist in 0.4 made its way back in. Right now, each member has to be explicitly removed rather than eventually timing out if it hasn't joined back in.

Looking forward to hearing any other approaches folks have taken.
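
As an illustration of the explicit member removal mentioned above, here is a minimal Python sketch. It is not narsil's setup. It assumes an etcd 2.x cluster exposing the v2 members API (`GET /v2/members`, `DELETE /v2/members/<id>`) and assumes you already have the set of live instance IPs from somewhere, for example the Autoscaling Group. The function names and the `endpoint` parameter are hypothetical.

```python
import json
import urllib.request
from urllib.parse import urlparse


def stale_member_ids(members, live_hosts):
    """Return IDs of members whose peer URLs match none of the live hosts.

    `members` is the list found under "members" in a GET /v2/members
    response. Note that a member with no peer URLs is also reported as stale.
    """
    live = set(live_hosts)
    stale = []
    for member in members:
        hosts = {urlparse(url).hostname for url in member.get("peerURLs", [])}
        if not hosts & live:
            stale.append(member["id"])
    return stale


def list_members(endpoint):
    """Fetch the current member list from an etcd 2.x client endpoint."""
    with urllib.request.urlopen(f"{endpoint}/v2/members", timeout=5) as resp:
        return json.load(resp)["members"]


def remove_member(endpoint, member_id):
    """Remove one member explicitly; returns the HTTP status code."""
    req = urllib.request.Request(
        f"{endpoint}/v2/members/{member_id}", method="DELETE"
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status
```

A cleanup job would call `list_members`, pass the result to `stale_member_ids`, and then call `remove_member` one ID at a time. Removing members changes the quorum size, so a real job should check that a majority of the remaining members is healthy before each removal.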

codewithcheese, nearly 10 years ago

Running Docker clusters on AWS seems a little foolish to me unless you're trying to save money. Instead of managing containers, why not just manage instances?

gct, nearly 10 years ago

I can't get over how bad a name etcd is. Every time I see it I think it's some sort of daemon for /etc files.

Etcd Clustering in AWS
