Cluster Upgrade from 4.17 to 4.18¶
This guide leads you through the steps specific to upgrading a NetEye Cluster installation from version 4.17 to 4.18.
Warning
Remember that you must upgrade sequentially without skipping versions: an upgrade to 4.yy is possible only from the immediately preceding version 4.xx. For example, if you have version 4.14, you must first upgrade to 4.15, then to 4.16, and so on.
Before starting an upgrade, read the latest release notes on NetEye’s blog very carefully and check the feature changes and deprecations specific to the version you are upgrading to. You can also review the whole Post Upgrade Steps section below to verify whether there are changes or specific steps that might significantly impact your NetEye Cluster installation.
Cluster Upgrade Prerequisites¶
Upgrading a cluster will take a nontrivial amount of time. During the cluster upgrade, individual nodes will be put into standby mode and so overall cluster performance will be degraded until the upgrade procedure is completed and all nodes are removed from standby mode.
When the cluster is healthy and there are no problems, a full upgrade (update + upgrade) takes approximately 30 minutes, plus 15 minutes per node. For instance, on a 3-node cluster it may take approximately 1 hour and 15 minutes (30 + 15*3 = 75 minutes). This estimate is a lower bound: it does not include the additional time needed if there is a kernel update or if you have additional modules installed.
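The estimate above can be sketched as a small shell helper. The function name estimate_minutes is hypothetical and is not part of NetEye; it only applies the 30 + 15 per node rule stated in the text, so the result remains a lower bound.

```shell
# Lower-bound upgrade time from the text: 30 minutes plus 15 minutes per node.
# estimate_minutes is a hypothetical helper name, not a NetEye command.
estimate_minutes() {
    nodes=$1
    echo $((30 + 15 * $nodes))
}

# Example for a 3-node cluster, printed as hours and minutes.
minutes=$(estimate_minutes 3)
printf '%d nodes: about %d h %d min\n' 3 $((minutes / 60)) $((minutes % 60))
```

For a 3-node cluster this yields 75 minutes, matching the example in the text.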
This user guide uses the following conventions to indicate on which nodes each step must be executed:
(ALL) is the set of all cluster nodes
(N) indicates the last node
(OTHER) is the set of all nodes excluding (N)
For example, if (ALL) is neteye01.wp, neteye02.wp, and neteye03.wp, then:
(N) is neteye03.wp
(OTHER) is neteye01.wp and neteye02.wp
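If you script parts of the procedure, the same convention can be derived from a single node list. The following is a minimal sketch that assumes a space-separated list with at least two nodes; the variable names ALL, N and OTHER simply mirror the labels used in this guide, and the host names are the example ones from above.

```shell
# Example node list taken from the text; replace it with your own host names.
ALL="neteye01.wp neteye02.wp neteye03.wp"

# (N) is the last entry of the list; (OTHER) is everything before it.
N=${ALL##* }
OTHER=${ALL% *}

echo "(N):     $N"
echo "(OTHER): $OTHER"
```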
The order in which (OTHER) nodes are upgraded is not important. However, you should note that the last node (N) to be upgraded will require a slightly different process than the other nodes (see Post-upgrade steps for the last node (N) for details).
Cluster Upgrade Preparation¶
The Cluster Upgrade Preparation is carried out by running the command:
# nohup neteye upgrade
Warning
The neteye upgrade command can be run on a standard NetEye node, but never on an Elastic-only or a Voting-only node.
The neteye upgrade command will run a number of checks to make sure that:
The installed NetEye version is eligible for upgrade: the command checks which version is installed (i.e., 4.xx) and that the last upgrade was finalized, i.e., that the neteye_finalize_installation script completed successfully.
NetEye is fully updated and there are no minor (bugfix) updates to be installed.
Warning
The neteye upgrade command may take a long time before it completes successfully, so please do not interrupt it until it exits.
If any of these tasks is unsuccessful, a message will explain where the command failed, allowing you to manually fix the corresponding step. For example, if the exit message is similar to the following one, you need to manually install the latest updates.
"Found updates not installed"
"Example: icingacli, version 2.8.2_neteye1.82.1"
Then, if needed, the command will:
Update all the NetEye repositories to the newer version (i.e., 4.yy, which is the next version to which it is possible to upgrade)
Install all the RPMs of the newer version (i.e., 4.yy)
Upgrade NetEye’s yum groups
Warning
The neteye upgrade command will not restore stonith, so remember to re-enable it when necessary, that is, in the step Cluster Reactivation (N) below.
If the command is successful, a message will inform you that it is possible to continue the cluster upgrade procedure.
Standby of Cluster Nodes (OTHER)¶
Put all (OTHER) nodes in standby by running the following command on each of them, so that the node on which it is run is no longer able to host cluster resources.
Note
Make sure that the last node (N) remains active (i.e., in non-standby mode).
# pcs node standby --wait=300
# echo $?
The output of the last command (which is the exit code of the previous command) must be 0.
0
If the exit code is different from 0, the current node is not yet in standby, so please be sure that the current node is in standby before proceeding.
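The standby command and its exit code check can be combined in one step. This is only a sketch: standby_current_node is a hypothetical helper name, and the PCS variable is an assumption added so that the function can be tried without a real cluster. The pcs arguments are exactly the ones shown above.

```shell
# PCS defaults to the real pcs binary; override it only for dry runs.
PCS=${PCS:-pcs}

# Put the current node in standby and report failure instead of continuing.
standby_current_node() {
    if "$PCS" node standby --wait=300; then
        echo "node is in standby"
    else
        rc=$?
        echo "standby failed with exit code $rc; do not proceed" >&2
        return "$rc"
    fi
}
```

Calling standby_current_node on each (OTHER) node returns a non-zero status whenever pcs does, so a wrapping script can stop before the upgrade continues.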
Upgrade All Cluster Nodes (ALL)¶
Repeat these upgrade steps for all nodes (ALL).
Check cluster status
Run the following cluster command:
# pcs status
and please ensure that:
Only the last node (N) is active
All cluster resources are marked “Started” on the last node (N)
All cluster services under “Daemon Status” are marked active/enabled on the last node (N)
Check DRBD status
Check if the DRBD status is ok by using the drbdmon command, which updates the DRBD status in real time.
See also
Section 4.2 of the official documentation <https://linbit.com/drbd-user-guide/drbd-guide-9_0-en/#s-check-status> contains information and details about the possible statuses.
Migrate configuration of RPMs
Each upgraded package can potentially create .rpmsave and/or .rpmnew files. You will need to verify and migrate all such files.
You can find more detailed information about what those files are and why they are generated in the official RPM documentation.
Briefly, if a configuration file has changed since the last version, and the configuration file was edited since the last version, then the package manager will do one of these two things:
If the new system configuration file should replace the edited version, it will save the old edited version as an .rpmsave file and install the new system configuration file.
If the new system configuration file should not replace the edited version, it will leave the edited version alone and save the new system configuration file as an .rpmnew file.
Note
You can use the following commands to locate .rpmsave and .rpmnew files:
# updatedb
# locate '*.rpmsave*'
# locate '*.rpmnew*'
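If the locate database is not available, a similar search can be sketched with find. The helper name find_rpm_leftovers and the default directory /etc are assumptions made for illustration; configuration files may also live elsewhere on your system.

```shell
# List leftover .rpmnew and .rpmsave files below a directory (default: /etc).
# find_rpm_leftovers is a hypothetical helper name, not a NetEye command.
find_rpm_leftovers() {
    find "${1:-/etc}" -type f \( -name '*.rpmnew*' -o -name '*.rpmsave*' \) 2>/dev/null | sort
}
```

For example, find_rpm_leftovers /etc prints one path per line, which you can then work through with the procedures below.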
The instructions below will show you how to keep your customized operating system configurations.
How to Migrate an .rpmnew Configuration File
The update process creates an .rpmnew file if a configuration file has changed since the last version so that customized settings are not replaced automatically. Those customizations need to be migrated into the new .rpmnew configuration file in order to activate the new configuration settings from the new package, while maintaining the previous customized settings. The following procedure uses Elasticsearch as an example.
First, run a diff between the original file and the .rpmnew file:
# diff -uN /etc/sysconfig/elasticsearch /etc/sysconfig/elasticsearch.rpmnew
OR
# vimdiff /etc/sysconfig/elasticsearch /etc/sysconfig/elasticsearch.rpmnew
Copy all custom settings from the original into the .rpmnew file. Then create a backup of the original file:
# cp /etc/sysconfig/elasticsearch /etc/sysconfig/elasticsearch.01012018.bak
And then substitute the original file with the .rpmnew:
# mv /etc/sysconfig/elasticsearch.rpmnew /etc/sysconfig/elasticsearch
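The backup and replace steps above can be sketched as one helper for any configuration file. The name promote_rpmnew is hypothetical, and the day-month-year backup suffix is an assumption based on the 01012018 example. Run it only after you have merged your custom settings into the .rpmnew file.

```shell
# Back up the current file, then put the merged .rpmnew file in its place.
# promote_rpmnew is a hypothetical helper name, not a NetEye command.
promote_rpmnew() {
    conf=$1
    [ -f "$conf.rpmnew" ] || { echo "no $conf.rpmnew found" >&2; return 1; }
    # Assumed suffix format: day, month, year (as in 01012018).
    backup="$conf.$(date +%d%m%Y).bak"
    cp -p "$conf" "$backup" &&
        mv "$conf.rpmnew" "$conf" &&
        echo "backup saved as $backup"
}
```

For the Elasticsearch example, the call would be promote_rpmnew /etc/sysconfig/elasticsearch.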
How to Migrate an .rpmsave Configuration File
The update process creates an .rpmsave file if a configuration file has been changed in the past and the updater has automatically replaced customized settings to activate new configurations immediately. In order to preserve your customizations from the previous version, you will need to migrate those from the original .rpmsave into the new configuration file.
Run a diff between the new file and the .rpmsave file:
# diff -uN /etc/sysconfig/elasticsearch.rpmsave /etc/sysconfig/elasticsearch
OR
# vimdiff /etc/sysconfig/elasticsearch.rpmsave /etc/sysconfig/elasticsearch
Copy all custom settings from the .rpmsave into the new configuration file, and preserve the original .rpmsave file under a different name:
# mv /etc/sysconfig/elasticsearch.rpmsave /etc/sysconfig/elasticsearch.01012018.bak
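To get a quick list of candidate customizations before opening a diff, you can print the lines that exist only in the .rpmsave file. This is a rough sketch: custom_lines is a hypothetical helper name, and the comparison is by exact whole lines, so it complements diff or vimdiff rather than replacing them.

```shell
# Print lines found in <file>.rpmsave that do not appear verbatim in <file>.
# custom_lines is a hypothetical helper name, not a NetEye command.
custom_lines() {
    conf=$1
    # -F fixed strings, -x whole-line match, -v invert, -f read patterns from file
    grep -vxFf "$conf" "$conf.rpmsave"
}
```

For the Elasticsearch example, custom_lines /etc/sysconfig/elasticsearch lists the settings you may still need to carry over into the new file.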
Post Upgrade Steps¶
This section describes the steps that must be carried out on specific nodes after the upgrade.
Post-upgrade Scripts On (OTHER) nodes¶
Run the NetEye Secure Install on the (OTHER) nodes one at a time, waiting for it to complete successfully on one node before running it on the next:
# nohup neteye_secure_install
Warning
If a new cluster resource was installed during the upgrade procedure, an error may be thrown. This error can be disregarded, because it will be fixed automatically during the step Post-upgrade steps for the last node (N).
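The one-node-at-a-time rule can be sketched as a loop that stops at the first failure. Running the command over ssh from a single node is an assumption that does not come from this guide, and secure_install_sequentially and RUN_REMOTE are hypothetical names. Stopping on any error is a conservative choice: given the warning above, you may decide to inspect the error and then continue manually.

```shell
# How a command reaches a node; ssh is an assumption, not part of the guide.
RUN_REMOTE=${RUN_REMOTE:-ssh}

# Run neteye_secure_install on each given node, one after the other,
# and stop as soon as one run fails.
secure_install_sequentially() {
    for node in "$@"; do
        echo "running neteye_secure_install on $node"
        if ! "$RUN_REMOTE" "$node" neteye_secure_install; then
            echo "neteye_secure_install failed on $node; stopping" >&2
            return 1
        fi
    done
}
```

With the example cluster, the call would be secure_install_sequentially neteye01.wp neteye02.wp.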
Post-upgrade scripts on the Elastic-only nodes¶
Run the NetEye Secure Install on the Elastic-only nodes:
# nohup neteye_secure_install
Post-upgrade steps for the last node (N)¶
Run the NetEye Secure Install on the last node (N):
# nohup neteye_secure_install
Cluster Reactivation (N)¶
You can now restore the cluster to high availability operation.
Bring all cluster nodes back out of standby with this command on the last node (N):
# pcs node unstandby --all --wait=300
# echo $?
0
If the exit code is different from 0, some nodes have not been reactivated, so please make sure that all nodes are active before proceeding.
Run the checks in the section Checking that the Cluster Status is Normal. If any of the above checks fail, please call our service and support team before proceeding.
Re-enable fencing on the last node (N):
# pcs property set stonith-enabled=true
Finalize the Cluster Upgrade (ALL)¶
You can now finalize the upgrade process by launching the following script on every node (ALL) one by one:
# neteye_finalize_installation
In this upgrade, no additional manual step is required.
Troubleshooting: Failing health checks, migration of modules¶
After the finalization procedure has ended successfully, you might notice in the Problems View that some health checks fail and are in state WARNING. The reason is that you are using a module that needs to be migrated because a breaking change was introduced in the release.
Go to the Problems View and check which health check is failing. There you will also find instructions for the correct migration of the module, which in almost all cases amounts to enabling an option; the actual migration will then be executed manually.