Avoiding elasticsearch split-brain

Loggly recently held an elasticsearch meetup, which was a great success. One question that was repeatedly asked was how to ensure elasticsearch does not suffer a partition — known as a split-brain.

This can be a particular problem in AWS EC2, where the network is subject to interruptions. It can also happen if the elasticsearch master node performs long garbage collection cycles.

One configuration that is very effective at preventing this problem is described in this post.

Example elasticsearch cluster

Say I have 8 data nodes (esdata0-7), 3 master-eligible nodes (esmaster0-2), and 2 data-less nodes (esclient0-1).

The data nodes then run with this configuration:
node.master: false discovery.zen.minimum_master_nodes: 2 node.data: true discovery.zen.ping.unicast.hosts: ["esmaster0", "esmaster1", "esmaster2"]

The master-eligible nodes run with this configuration:
node.master: true discovery.zen.minimum_master_nodes: 2 node.data: false discovery.zen.ping.unicast.hosts: ["esmaster0", "esmaster1", "esmaster2"]

The data-less nodes run with this configuration:
node.master: false discovery.zen.minimum_master_nodes: 2 node.data: false discovery.zen.ping.unicast.hosts: ["esmaster0", "esmaster1", "esmaster2"]

Minimum master nodes

The correct configuration for this value is 1 more than half the total number of master-eligible nodes in the cluster. With this setting, any node that can only see 1 master refuses to form a cluster — even with itself. This means only one cluster can ever result from this elasticsearch deployment. Some nodes may be disconnected from the cluster but by preventing them from forming their own mini-cluster, they will rejoin with ease when the connection is restored.

Let masters be masters

While only one master-eligible node becomes the actual master, all master-eligible only perform master duties. This means that all queries and indexing requests should go through data-less nodes. This allows master-eligible nodes to remain responsive at all times.

Philip O'Toole says:

April 16, 2014 at 7:32 am

Yes, because discovery.zen.minimum_master_nodes is 2, the cluster will still have sufficient master-eligible nodes to remain up, and one of the other master-eligible nodes will be elected master, within a matter of seconds.
However, if the cluster suffers a second master failure, the cluster will fail, so it’s definitely a situation that would need immediate attention in a production environment.

Reply

Thank you Phillip for such a wonderful and helpful description of split-brain solution.
But I have a doubt here, what if there is a network failure between 2 locations?
Location 1 consist of : 1 master, 1 master eligible and a data node.
Location 2 consist of : rest of the 5 nodes (includes 1 master eligible node).
Case : Network goes down, and the master eligible node in Location 2 has been promoted to master. And then, network connectivity gets restored. Will there be 2 master nodes in the cluster then?

Gaurav — assuming I understand your question, what you describe shouldn’t be a problem. What you describe is a network-partition, and it would a serious failure in a production system. However the master-eligible node in Location 2 can never become a master due to the network partition you describe because it can’t see any other master or master-eligible nodes, since it’s configured with:
discovery.zen.minimum_master_nodes: 2
This prevents it becoming master. (Strictly speaking, it prevents it from forming a cluster).
It is important to realise that the configuration I have outlined above is applied to every master-eligible node in the cluster. In otherwords, a majority of master-eligible nodes can only ever be on 1-side of a network partition.
That said, there have been some reports of some corner cases, where an elasticsearch cluster can violate this, but this is not by design. I have never actually seen this occur in practise. See this issue on github for some details.

nice post!
What’s the number of shards should I set for different nodes? e.g. data nodes and less data nodes?

Philip O'Toole says:

September 2, 2014 at 11:52 am

EG — setting shard counts is a completely different issue, and nothing to do with preventing a split-brain. I suggest you take a look at the elasticsearch documentation to learn more about when and how shards are created and managed.

Reply

thnx u for this helpful description
i m working to collect logs from 3000 server
I installed 2 servers with ElasticSearch and logstash indexer
2 with redis , logstash shipper ans elasticsearch
So i have 4 instance for ElasticSearch
I tried to find example of cluster configuraton
node.master: true
discovery.zen.minimum_master_nodes: 3
node.data: true
discovery.zen.ping.unicast.hosts: [“es01”, “es02”, “es03″,”es04”]
node.master: true
discovery.zen.minimum_master_nodes: 3
node.data: true
discovery.zen.ping.unicast.hosts: [“es01”, “es02”, “es03″,”es04”]
node.master: true
discovery.zen.minimum_master_nodes: 3
node.data: true
discovery.zen.ping.unicast.hosts: [“es01”, “es02”, “es03″,”es04”]
node.master: false
discovery.zen.minimum_master_nodes: 3
node.data: true
discovery.zen.ping.unicast.hosts: [“es01”, “es02”, “es03″,”es04”]
its correct ?

Philip O'Toole says:

June 11, 2015 at 6:25 pm

stef — that is not exactly what I described since you are using some nodes as both masters and data nodes. That may not be ideal.

Reply

Vallified

8 thoughts on “Avoiding elasticsearch split-brain”

Leave a Reply Cancel reply

Philip O'Toole