User Guide

Cluster Upgrade from 4.21 to 4.22

This guide leads you through the steps specific to upgrading a NetEye Cluster installation from version 4.21 to 4.22.

Warning

Remember that you must upgrade sequentially without skipping versions; an upgrade to 4.22 is therefore possible only from 4.21. For example, if you have version 4.14, you must first upgrade to 4.15, then to 4.16, and so on.

Before starting an upgrade, read the latest release notes on NetEye’s blog very carefully and check the feature changes and deprecations specific to the version being upgraded. You should also read the whole Breaking Changes section below.
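The sequential upgrade rule can be illustrated with a short shell sketch that lists every hop between two minor versions. The function name upgrade_path is hypothetical and not part of NetEye; the sketch also assumes that minor versions are consecutive integers, as in the 4.14, 4.15, 4.16 example above.

```shell
# Hypothetical helper (not a NetEye command): print each hop required to
# upgrade from 4.<from_minor> to 4.<to_minor>, one hop per line.
# Assumes minor versions are consecutive integers.
upgrade_path() {
    from_minor=$1
    to_minor=$2
    if [ "$from_minor" -ge "$to_minor" ]; then
        echo "nothing to do: 4.$from_minor is not older than 4.$to_minor" >&2
        return 1
    fi
    minor=$((from_minor + 1))
    while [ "$minor" -le "$to_minor" ]; do
        echo "4.$((minor - 1)) -> 4.$minor"
        minor=$((minor + 1))
    done
}
```

For instance, upgrade_path 14 22 lists eight hops, the last of which is the 4.21 to 4.22 upgrade covered by this guide.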

The remainder of this section is organised as follows. Section Breaking Changes introduces substantial changes that users must be aware of, and that may require some tasks to be carried out, before starting the upgrade procedure; section Prerequisites provides information to be known before starting the upgrade procedure; section Conventions Used defines the notation used in this procedure; sections Running the Upgrade and Special Nodes present the actual procedure, including directions for special nodes; section Additional Tasks shows which tasks must be executed after the upgrade procedure has been successfully executed; and finally section Cluster Reactivation explains how to bring the cluster back to complete functionality.

Breaking Changes

NATS telegraf user

The NATS telegraf user has been deprecated due to security issues and will be removed in the NetEye 4.22 release. This means that the telegraf user must be migrated before the upgrade to NetEye 4.22. It has been replaced by two new users:

  1. telegraf_wo with write-only privileges on NATS

  2. telegraf_ro with read-only privileges on NATS

Please change your telegraf collectors and consumers to use the two new users as described in Section Write Data to influxDB through NATS master of the User Guide. Once you have removed all occurrences of the telegraf user, go to Configuration / Modules / neteye / Configuration, click Remove NATS telegraf user, and then click Save Changes.

Elastic Stack

Starting with NetEye 4.22, the server certificates required to expose Elasticsearch through NGINX are regenerated. The Elasticsearch REST certificate will be replaced, but the existing files are backed up with a timestamp suffix as follows:

  • /neteye/local/elasticsearch/conf/certs/es-rest.crt.pem.<timestamp>

  • /neteye/local/elasticsearch/conf/certs/es-rest.csr.<timestamp>

  • /neteye/local/elasticsearch/conf/certs/private/es-rest.key.<timestamp>

If you want to recreate the certificates for any reason, the new certificates must include at least the following information:

  • cluster FQDN

  • elasticsearch.neteyelocal

  • internal node IP and cluster IP

Note

These parameters should be given as arguments to the script /usr/share/neteye/scripts/security/generate_server_certs.sh.
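After recreating a certificate, it can be useful to verify that the required names actually appear in it. The following is a rough sketch only: the function missing_sans is hypothetical, the example FQDN and IP addresses are placeholders, and the certificate path in the comment assumes the live file carries the same name as the backups without the timestamp suffix. The exact argument syntax of generate_server_certs.sh is not covered here.

```shell
# Hypothetical helper: report which required names are absent from the
# Subject Alternative Name text passed as the first argument.
# Returns 0 when every name is present, 1 otherwise.
# The match is word-based on fixed strings, so it is only a rough check.
missing_sans() {
    san_text=$1
    shift
    missing=0
    for name in "$@"; do
        if ! printf '%s\n' "$san_text" | grep -q -w -F -- "$name"; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}

# Possible usage (path, FQDN and IPs are assumptions; adapt to your cluster):
#   sans=$(openssl x509 -in /neteye/local/elasticsearch/conf/certs/es-rest.crt.pem \
#          -noout -text | grep -A1 'Subject Alternative Name')
#   missing_sans "$sans" neteye.example.com elasticsearch.neteyelocal 192.168.47.1 10.0.0.10
```

The openssl x509 -text output lists the alternative names on the line following the "X509v3 Subject Alternative Name" header, which is why the usage example keeps one line after the match.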

Prerequisites

Upgrading a cluster will take a nontrivial amount of time. During the cluster upgrade, individual nodes will be put into standby mode and so overall cluster performance will be degraded until the upgrade procedure is completed and all nodes are removed from standby mode.

When the cluster is healthy, no additional NetEye modules are installed, and the procedure is successful, a full upgrade (update + upgrade) is estimated to take approximately 30 minutes, plus 15 minutes per node.

So, for instance, on a 3-node cluster it may take approximately 1 hour and 15 minutes (30 + 15*3 = 75 minutes).

Warning

This estimate does not include the time required to download the packages or the time needed for manual intervention (migration of configurations due to breaking changes, handling of tasks that fail during the execution of the neteye update and neteye upgrade commands).
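The rule of thumb for the upgrade time can be written as a one-line calculation. The function name estimate_upgrade_minutes is hypothetical; like the estimate itself, the result excludes download time and manual intervention.

```shell
# Hypothetical helper: estimated upgrade duration in minutes for a healthy
# cluster with the given number of nodes (30 minutes plus 15 per node).
estimate_upgrade_minutes() {
    nodes=$1
    echo $((30 + 15 * nodes))
}
```

For a 3-node cluster, estimate_upgrade_minutes 3 prints 75, that is, 1 hour and 15 minutes.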

Conventions Used

A NetEye cluster can be composed of different types of nodes, including Elastic-only and Voting-only nodes, which require a different upgrade procedure. The following notation is therefore used to identify nodes in the cluster.

  • (ALL) is the set of all cluster nodes

  • (N) indicates the NetEye master node of the Cluster

  • (E) is an Elastic-only node

  • (V) is a Voting-only node

  • (OTHER) is the set of all nodes excluding (N), (E), and (V)

For example, if we take the sample cluster defined in The Elected NetEye Master, (ALL) is my-neteye-01, my-neteye-02, my-neteye-03, my-neteye-04, and my-neteye-05.

  • (N) is my-neteye-01

  • (OTHER) is composed of my-neteye-02 and my-neteye-03

  • (E) is my-neteye-04

  • (V) is my-neteye-05

Note

Please see The Elected NetEye Master for a discussion about the Cluster Master Node.

Running the Upgrade

The Cluster Upgrade is carried out by running the command:

# nohup neteye upgrade

All the tasks carried out by the command are listed in section neteye upgrade; a dedicated section provides directions in case the command fails.

Warning

The neteye upgrade command can be run on a standard NetEye node, but it must never be issued on an Elastic-only (E) or a Voting-only (V) Node, because it would turn these nodes into NetEye Nodes.

Special Nodes

In the context of the Upgrade procedure, special nodes are Elastic-only (E) and Voting-only (V) Nodes. They do not need to be upgraded manually, because the neteye upgrade command will automatically take care of upgrading them.

Additional Tasks

In this upgrade, no additional manual step is required.

Cluster Reactivation

You can now restore the cluster to high availability operation.

  • Bring all cluster nodes back out of standby with this command on the last node (N):

    # pcs node unstandby --all --wait=300
    # echo $?
    
    0
    

    If the exit code is different from 0, some nodes have not been reactivated, so please make sure that all nodes are active before proceeding.

  • Run the checks in the section Checking that the Cluster Status is Normal. If any of these checks fail, please call our service and support team before proceeding.

  • Re-enable fencing on the last node (N), if it was enabled prior to the upgrade:

    # pcs property set stonith-enabled=true
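The first reactivation step and its exit-code check can be combined in a small wrapper, sketched below. The function name reactivate_cluster is hypothetical; it only wraps the pcs command shown above. The cluster status checks and the re-enabling of fencing are deliberately left as manual steps.

```shell
# Hypothetical wrapper around the unstandby step: bring all nodes out of
# standby (waiting up to 300 seconds) and stop if pcs reports a failure.
reactivate_cluster() {
    if ! pcs node unstandby --all --wait=300; then
        echo "Some nodes were not reactivated; check the cluster before proceeding." >&2
        return 1
    fi
    echo "All nodes are out of standby."
}
```

Run reactivate_cluster as root on the last node (N); a non-zero return value means that the remaining steps should not be carried out yet.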