Elasticsearch Upgrade

Procedure

The official Elasticsearch upgrade procedure is documented at: https://www.elastic.co/guide/en/elasticsearch/reference/current/setup-upgrade.html

The first problem we ran into in practice was https://github.com/elastic/elasticsearch/pull/7210 . In short, older and newer versions of Elasticsearch handle compression differently, so shards from the older version fail during allocation in a cluster running the newer version. The problem was fixed in elasticsearch-1.3.2, and the official documentation notes: "The sequence of events needed to trigger this bug occurs rarely. It may be responsible for the occasional case of corruption that we have seen reported, but which has remained unexplained. We advise users to upgrade, but in the meantime you can avoid this bug completely by disabling compression with the following request:"

The request is as follows:

curl -XPUT "http://localhost:9200/_cluster/settings" -d'
{
  "persistent": {
    "indices.recovery.compress": false
  }
}'
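To confirm that the setting took effect, the cluster settings can be read back. The following is a minimal sketch and not part of the original procedure: `ES_URL` and the helper name `compress_disabled` are assumptions introduced here, and the check is a naive text match on the JSON response rather than a real parser.

```shell
#!/bin/sh
# Assumed address of any node in the cluster; adjust as needed.
ES_URL="${ES_URL:-http://localhost:9200}"

# Hypothetical helper: reads cluster-settings JSON on stdin and succeeds if
# recovery compression is reported as false. Accepts the nested form
# ("compress":"false") and the flat form ("indices.recovery.compress":false).
compress_disabled() {
  tr -d ' \n\t' | grep -Eq '"(indices\.recovery\.)?compress":"?false"?'
}

# Usage against a running cluster:
#   curl -s "$ES_URL/_cluster/settings" | compress_disabled && echo "recovery compression is off"
```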

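Only after working around the cross-version compression problem did we start upgrading from elasticsearch-1.2.2 to elasticsearch-1.5.2.

All tests below were carried out on two servers in the test network, 18.33 and 18.36. Initially, two master nodes and the data nodes were deployed across these two servers.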

Upgrading Elasticsearch on the Cluster Nodes

  • Procedure

1. First, back up the index data of the elasticsearch-1.2.2 cluster
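The source does not say how the backup was taken. One option available in Elasticsearch 1.x is the snapshot API; the sketch below assumes a shared-filesystem repository. The repository name `pre_upgrade`, the snapshot name `snapshot_1`, the path `/mount/backups/pre_upgrade`, and both helper names are placeholders, not values from the original setup.

```shell
#!/bin/sh
ES_URL="${ES_URL:-http://localhost:9200}"   # assumed node address

# Hypothetical helper: prints the JSON body that registers an "fs" repository
# at the given location (the path must be reachable from every node).
fs_repo_body() {
  printf '{"type":"fs","settings":{"location":"%s"}}' "$1"
}

# Hypothetical helper: registers the repository, then snapshots all indices
# and blocks until the snapshot has finished.
backup_all_indices() {
  repo="$1"; location="$2"; snapshot="$3"
  curl -s -XPUT "$ES_URL/_snapshot/$repo" -d "$(fs_repo_body "$location")" &&
  curl -s -XPUT "$ES_URL/_snapshot/$repo/$snapshot?wait_for_completion=true"
}

# Usage against a running cluster:
#   backup_all_indices pre_upgrade /mount/backups/pre_upgrade snapshot_1
```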

2. Disable recovery.compress on the elasticsearch-1.2.2 cluster

curl -XPUT "http://localhost:9200/_cluster/settings" -d'
        {
              "persistent": {
               "indices.recovery.compress": false
              }
        }'

3. Disable shard reallocation

curl -XPUT localhost:9200/_cluster/settings -d '{
        "transient" : {
            "cluster.routing.allocation.enable" : "none"
            }
        }'

4. Shut down the node that is to be upgraded

curl -XPOST 'http://localhost:9200/_cluster/nodes/_local/_shutdown'

5. Confirm that the shards from the stopped node have been correctly reallocated to the nodes still running in the cluster
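One way to carry out this check is the `_cat/shards` API, which lists the state of every shard copy. A minimal sketch, assuming the default column order (index, shard, prirep, state, ...) and no header row; `count_not_started` is a name made up for this example.

```shell
#!/bin/sh
ES_URL="${ES_URL:-http://localhost:9200}"   # assumed node address

# Hypothetical helper: reads `_cat/shards` output on stdin and prints how many
# shard copies are in a state other than STARTED (column 4).
count_not_started() {
  awk 'NF > 0 && $4 != "STARTED" { n++ } END { print n + 0 }'
}

# Usage against a running cluster:
#   curl -s "$ES_URL/_cat/shards" | count_not_started
```

Since allocation was set to "none" in step 3, replica copies that lived on the stopped node may stay UNASSIGNED until allocation is re-enabled, so a non-zero count at this point is not necessarily an error.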

6. Install an elasticsearch-1.5.2 instance on the server and overwrite its configuration files with those of the elasticsearch-1.2.2 instance. Also make the data directory of the elasticsearch-1.5.2 node a symbolic link to the elasticsearch-1.2.2 data directory:

cp /app/IDC/KT-ES/elasticsearch-1.2.2-data/config/* ./config/
cp -r /app/IDC/KT-ES/elasticsearch-1.2.2-master/bin/service* ./bin/service
ln -s /app/IDC/KT-ES/elasticsearch-1.2.2-data/data/ ./data
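If `./data` already exists as a directory, `ln -s` places the link inside it rather than replacing it, so it can be worth checking the result before starting the node. A small sketch; `is_dir_symlink` is a hypothetical helper, not something from the original setup.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the path is a symbolic link that
# resolves to an existing directory.
is_dir_symlink() {
  [ -L "$1" ] && [ -d "$1" ]
}

# Usage from the elasticsearch-1.5.2 directory:
#   is_dir_symlink ./data && readlink ./data
```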

7. Start the newly upgraded elasticsearch-1.5.2 node and confirm that it joins the cluster normally

8. Re-enable shard reallocation

curl -XPUT localhost:9200/_cluster/settings -d '{
            "transient" : {
                   "cluster.routing.allocation.enable" : "all"
            }
        }'

9. Watch until all shards are allocated across all nodes. Rebalancing the shards can take some time
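Instead of polling by hand, the cluster health API can block until the cluster reaches a given status. A sketch, assuming a 60-second timeout chosen arbitrarily; `health_is_green` is a made-up helper that does a plain text match on the response.

```shell
#!/bin/sh
ES_URL="${ES_URL:-http://localhost:9200}"   # assumed node address

# Hypothetical helper: reads a cluster-health JSON response on stdin and
# succeeds only if the status is green and the request did not time out.
health_is_green() {
  body=$(tr -d ' \n\t')
  case "$body" in
    *'"timed_out":true'*) return 1 ;;
  esac
  case "$body" in
    *'"status":"green"'*) return 0 ;;
  esac
  return 1
}

# Usage against a running cluster:
#   curl -s "$ES_URL/_cluster/health?wait_for_status=green&timeout=60s" | health_is_green
```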

10. Repeat the steps above for each of the remaining nodes.
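Before moving on, it may help to confirm that no node is still running the old version. The `_cat/nodes` API can print the version column for every node; the sketch below assumes that output format and uses a hypothetical helper, `all_on_version`.

```shell
#!/bin/sh
ES_URL="${ES_URL:-http://localhost:9200}"   # assumed node address

# Hypothetical helper: reads one version string per line on stdin and
# succeeds only if there is at least one line and every line equals $1.
all_on_version() {
  awk -v want="$1" '
    NF > 0 { seen++; if ($1 != want) bad++ }
    END    { exit (seen > 0 && bad == 0) ? 0 : 1 }
  '
}

# Usage against a running cluster:
#   curl -s "$ES_URL/_cat/nodes?h=version" | all_on_version 1.5.2
```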

11. Once all nodes have been processed, run the following command:

curl -XPUT localhost:9200/_cluster/settings -d '{
        "persistent" : {
            "cluster.routing.allocation.disable_allocation" : true
            }
        }'

Then restart the entire cluster;

12. Once all nodes have been processed, turn indices.recovery.compress back on and restart the entire cluster.
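The last step gives no command. Mirroring the request used earlier to disable compression, it could look like the sketch below; `persistent_setting_body` and `put_cluster_setting` are hypothetical helpers. Note also that a persistent setting such as the `disable_allocation` flag set earlier survives a full cluster restart, so it presumably has to be set back to `false` once the nodes are up; the source does not show that step, and it appears here only as a commented assumption.

```shell
#!/bin/sh
ES_URL="${ES_URL:-http://localhost:9200}"   # assumed node address

# Hypothetical helper: prints a persistent cluster-settings body for one
# key and one unquoted JSON value (true, false, or a number).
persistent_setting_body() {
  printf '{"persistent":{"%s":%s}}' "$1" "$2"
}

# Hypothetical helper: sends that body to the cluster settings API.
put_cluster_setting() {
  curl -s -XPUT "$ES_URL/_cluster/settings" -d "$(persistent_setting_body "$1" "$2")"
}

# Usage against a running cluster:
#   put_cluster_setting indices.recovery.compress true
#   # Assumption, not in the source: undo the persistent allocation block.
#   put_cluster_setting cluster.routing.allocation.disable_allocation false
```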