Cluster Upgrade from 4.19 to 4.20¶
This guide leads you through the steps specific to upgrading a NetEye Cluster installation from version 4.19 to 4.20.
Warning
Remember that you must upgrade sequentially without skipping versions; therefore an upgrade to 4.20 is possible only from 4.19. For example, if you have version 4.14, you must first upgrade to 4.15, then 4.16, and so on.
Before starting an upgrade, you should very carefully read the latest release notes on NetEye’s blog and check the feature changes and deprecations specific to the version being upgraded. You should also check the whole Post Upgrade Steps section below to verify whether there are changes or specific steps that might significantly impact your NetEye Cluster installation.
Breaking Changes¶
- Icinga2 Satellites
NetEye now supports Icinga2 Satellites. You have to migrate your current installation following the procedure described in Additional steps after successfully completing the upgrade.
- NATS telegraf user
The NATS telegraf user has been deprecated due to security issues and will be removed in future releases. It has been replaced by two new users:
telegraf_wo with write-only privileges on NATS
telegraf_ro with read-only privileges on NATS
Please change your telegraf collectors and consumers to use the two new users as described in Section Write Data to influxDB through NATS master of the User Guide. Once you have removed all occurrences of the telegraf user, please go to , click Remove NATS telegraf user and Save Changes.
Cluster Upgrade Prerequisites¶
Upgrading a cluster will take a nontrivial amount of time. During the cluster upgrade, individual nodes will be put into standby mode and so overall cluster performance will be degraded until the upgrade procedure is completed and all nodes are removed from standby mode.
An estimate of the time needed for a full upgrade (update + upgrade) when the cluster is healthy and there are no problems is approximately 30 minutes, plus 15 minutes per node. So for instance, on a 3-node cluster it may take approximately 1 hour and 15 minutes (30 + 15*3). This estimate is a lower bound that does not include additional time should there be a kernel update or if you have additional modules installed.
This user guide uses the following conventions to highlight in which node you should execute the process:
(ALL) is the set of all cluster nodes
(N) indicates the last node
(OTHER) is the set of all nodes excluding (N)
For example, if (ALL) is neteye01.wp, neteye02.wp, and neteye03.wp, then:
(N) is neteye03.wp
(OTHER) is neteye01.wp and neteye02.wp
The order in which (OTHER) nodes are upgraded is not important. However, you should note that the last node (N) to be upgraded will require a slightly different process than the other nodes (see Post Upgrade Steps For The Last Node (N) for details).
Cluster Upgrade Preparation¶
The Cluster Upgrade Preparation is carried out by running the command:
# nohup neteye upgrade
Warning
The neteye upgrade command can be run on a standard NetEye node, but never on an Elastic-only or a Voting-only Node.
Like neteye update, the neteye upgrade command will run a number of checks to make sure that:
The NetEye installation is healthy
The installed version of NetEye is eligible for upgrade, that is, it checks which version is installed (i.e., 4.xx) and that the last upgrade was finalized, i.e., the neteye_finalize_installation script was carried out successfully
NetEye is fully updated and there are no minor (bugfix) updates to be installed
Moreover, if these checks are all successful, neteye upgrade will also perform these additional tasks:
Disable fencing (NetEye Clusters only)
Put all nodes into standby except the one on which the command is executed (NetEye Clusters only) so that they are no longer able to host cluster resources
Warning
The neteye upgrade command may take a long time before it completes successfully, so please do not interrupt it until it exits.
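Since the command is launched with nohup, you can follow its progress from another terminal. This is only a sketch, assuming the default nohup.out file in the directory where the command was started:
# tail -f nohup.out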
If any of these tasks is unsuccessful, a message will explain where the command failed, allowing you to manually fix the corresponding step. For example, if the exit message is similar to the following one, you need to manually install the latest updates.
"Found updates not installed"
"Example: icingacli, version 2.8.2_neteye1.82.1"
Then, if needed, the command will:
Update all the NetEye repositories to the newer version (i.e., 4.yy, which is the next version to which it is possible to upgrade)
Install all the RPMs of the newer version (i.e., 4.yy)
Upgrade the NetEye yum groups
If the neteye upgrade command is successful, a message will inform you that it is possible to continue the upgrade procedure by checking whether there are any manual migrations to carry out: if there are, they are listed in the next section.
Warning
When executed on a cluster, neteye upgrade will neither bring the nodes back from standby, nor restore stonith: these steps need to be carried out manually after the upgrade has been successfully completed.
Upgrade All Cluster Nodes (ALL)¶
Repeat these upgrade steps for all nodes (ALL).
#1 Check cluster status
Run the following cluster command:
# pcs status
and please ensure that:
Only the last node (N) MUST be active
All cluster resources are marked “Started” on the last node (N)
All cluster services under “Daemon Status” are marked active/enabled on the last node (N)
#2 Check DRBD status
Check if the DRBD status is ok by using the drbdmon command, which updates the DRBD status in real time.
See also
Section 4.2 of DRBD’s official documentation contains information and details about the possible statuses.
https://linbit.com/drbd-user-guide/drbd-guide-9_0-en/#s-check-status
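If you prefer a one-shot, non-interactive check, the following is an alternative sketch, assuming the DRBD 9 utilities referenced above are installed:
# drbdadm status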
#3 Migrate configuration of RPMs
Each upgraded package can potentially create .rpmsave and/or .rpmnew files. You will need to verify and migrate all such files.
You can find more detailed information about what those files are and why they are generated in the official RPM documentation.
Briefly, if a package's configuration file has changed since the last version, and the local copy of that file has been edited in the meantime, then the package manager will do one of these two things:
If the new system configuration file should replace the edited version, it will save the old edited version as an .rpmsave file and install the new system configuration file.
If the new system configuration file should not replace the edited version, it will leave the edited version alone and save the new system configuration file as an .rpmnew file.
Note
You can use the following commands to locate .rpmsave and .rpmnew files:
# updatedb
# locate *.rpmsave*
# locate *.rpmnew*
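If the locate database cannot be used on your system, a find-based alternative is sketched below; the search is restricted to /etc purely as an example, since such files may also appear elsewhere:
# find /etc \( -name "*.rpmnew" -o -name "*.rpmsave" \) -print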
The instructions below will show you how to keep your customized operating system configurations.
How to Migrate an .rpmnew Configuration File
The update process creates an .rpmnew file if a configuration file has changed since the last version so that customized settings are not replaced automatically. Those customizations need to be migrated into the new .rpmnew configuration file in order to activate the new configuration settings from the new package, while maintaining the previous customized settings. The following procedure uses Elasticsearch as an example.
First, run a diff between the original file and the .rpmnew file:
# diff -uN /etc/sysconfig/elasticsearch /etc/sysconfig/elasticsearch.rpmnew
OR
# vimdiff /etc/sysconfig/elasticsearch /etc/sysconfig/elasticsearch.rpmnew
Copy all custom settings from the original into the .rpmnew file. Then create a backup of the original file:
# cp /etc/sysconfig/elasticsearch /etc/sysconfig/elasticsearch.01012018.bak
And then substitute the original file with the .rpmnew:
# mv /etc/sysconfig/elasticsearch.rpmnew /etc/sysconfig/elasticsearch
How to Migrate an .rpmsave Configuration File
The update process creates an .rpmsave file if a configuration file has been changed in the past and the updater has automatically replaced customized settings to activate new configurations immediately. In order to preserve your customizations from the previous version, you will need to migrate those from the original .rpmsave into the new configuration file.
Run a diff between the new file and the .rpmsave file:
# diff -uN /etc/sysconfig/elasticsearch.rpmsave /etc/sysconfig/elasticsearch
OR
# vimdiff /etc/sysconfig/elasticsearch.rpmsave /etc/sysconfig/elasticsearch
Copy all custom settings from the .rpmsave into the new configuration file, and preserve the original .rpmsave file under a different name:
# mv /etc/sysconfig/elasticsearch.rpmsave /etc/sysconfig/elasticsearch.01012018.bak
Post Upgrade Steps¶
This section describes all the steps that must be carried out on specific nodes after the upgrade.
Post Upgrade Steps for the creation of new Cluster resources¶
In this section you find directions to manually create new cluster resources if needed, e.g., when installing new applications.
Warning
From NetEye 4.20 onwards, the Tornado module is automatically installed on all NetEye instances. For this reason, if Tornado was not already present and running in NetEye, you need to create the cluster resources needed by Tornado with the steps explained below.
Cluster resources required by Tornado
If the Tornado module was not previously installed on your NetEye, please follow the procedure below. You can safely skip to the next section if the Tornado module was already installed before the upgrade to NetEye 4.20.
Connect to the terminal of any of the NetEye Cluster Nodes (excluding Elastic-only and Voting-only Nodes).
Create the main cluster resources for Tornado:
Adapt the template /usr/share/neteye/cluster/templates/Services-tornado.conf.tpl to the settings of your cluster
Save it to a file with the same name without the .tpl suffix
Execute the following command:
# /usr/share/neteye/scripts/cluster/cluster_service_setup.pl -c /usr/share/neteye/cluster/templates/Services-tornado.conf
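The same pattern applies to all the templates in this section. The following is a rough sketch only, since the exact values to adapt depend on the contents of the template and on your cluster settings: copy the template, edit the copy (for example with vim), then register the service.
# cp /usr/share/neteye/cluster/templates/Services-tornado.conf.tpl /usr/share/neteye/cluster/templates/Services-tornado.conf
# vim /usr/share/neteye/cluster/templates/Services-tornado.conf
# /usr/share/neteye/scripts/cluster/cluster_service_setup.pl -c /usr/share/neteye/cluster/templates/Services-tornado.conf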
Create the cluster resources for Tornado NATS JSON Collector:
Adapt the template /usr/share/neteye/cluster/templates/Services-tornado-nats-json-collector.conf.tpl to the settings of your cluster
Save it to a file with the same name without the .tpl suffix
Execute the following command:
# /usr/share/neteye/scripts/cluster/cluster_service_setup.pl -c /usr/share/neteye/cluster/templates/Services-tornado-nats-json-collector.conf
If the SIEM module is installed, create the cluster resources for Tornado rsyslog Collector:
Adapt the template /usr/share/neteye/cluster/templates/Services-tornado-rsyslog-collector-logmanager.conf.tpl to the settings of your cluster
Save it to a file with the same name without the .tpl suffix
Execute the following command:
# /usr/share/neteye/scripts/cluster/cluster_service_setup.pl -c /usr/share/neteye/cluster/templates/Services-tornado-rsyslog-collector-logmanager.conf
Post Upgrade Steps On (OTHER) Nodes¶
Run the NetEye Secure Install on the (OTHER) nodes, waiting for its successful execution on one node before running it on the next:
# nohup neteye_secure_install
Warning
If a new cluster resource was installed during the upgrade procedure, an error may be thrown. This error can be disregarded, because it will be automatically fixed during the Post Upgrade Steps For The Last Node (N).
Post Upgrade Steps on the Elastic-only, Voting-only Nodes¶
Run the NetEye Secure Install on the Elastic-only and/or the Voting-only nodes:
# nohup neteye_secure_install
Post Upgrade Steps For The Last Node (N)¶
Run the NetEye Secure Install on the last node (N):
# nohup neteye_secure_install
Cluster Reactivation (N)¶
You can now restore the cluster to high availability operation.
Bring all cluster nodes back out of standby with this command on the last node (N):
# pcs node unstandby --all --wait=300
# echo $?
0
If the exit code is different from 0, some nodes have not been reactivated, so please make sure that all nodes are active before proceeding.
Run the checks in the section Checking that the Cluster Status is Normal. If any of the above checks fail, please call our service and support team before proceeding.
Re-enable fencing on the last node (N):
# pcs property set stonith-enabled=true
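To verify that fencing is active again, you can query the property you just set; this is a sketch, assuming your pcs version supports the property show subcommand:
# pcs property show stonith-enabled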
Finalize the Cluster Upgrade (ALL)¶
You can now finalize the upgrade process by launching the following script on every node (ALL) one by one:
# neteye_finalize_installation
In this upgrade, no additional manual step is required.
Troubleshooting: Failing health checks, migration of modules¶
After the finalization procedure has successfully ended, you might notice in the Problems View that some health check fails and is in state WARNING. The reason is that you are using some module that needs to be migrated, because a breaking change has been introduced in the release.
Hence, you should go to the Problems View and check which health check is failing. There you will also find instructions for the correct migration of the module, which in almost all cases amounts to enabling an option: the actual migration will then be executed manually.
Upgrade NetEye Satellites¶
Icinga2 Satellites¶
Migrate Icinga2 Satellites to Standard¶
Warning
Before starting to upgrade your Satellites, you must carefully check your configuration in /etc/neteye-satellite.d/. The procedure below describes several common scenarios for NetEye users.
With the introduction of support for Icinga2 Satellites in NetEye, existing custom configurations must be migrated to the new standard.
Warning
The following procedures are meant to be executed on the NetEye Master. Once the migration is completed on the Master side, please refer to the Satellite Upgrade Procedure to conclude the Satellite migration.
Prerequisites¶
If you are on a cluster you need to put all your nodes in standby except one and execute the migration on the active node.
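For example, using the hypothetical node names introduced in the conventions section above and keeping neteye01.wp active, the nodes could be put into standby as follows:
# pcs node standby neteye02.wp
# pcs node standby neteye03.wp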
Migration procedure¶
Suppose you have an initial configuration file satellites.conf, located in /neteye/shared/icinga2/conf/icinga2/zones.d/, containing the following definitions:
object Endpoint "satellite3.example.com" {
host = "satellite3.example.com"
}
object Zone "zone B" {
endpoints = [ "satellite3.example.com" ]
parent = "master"
}
Migrating the Icinga2 Zone to the new Schema with Tenants¶
Since Satellites can be arranged in tenants, it is recommended to migrate your Satellite satellite3 under a new zone that clearly indicates the Satellite’s tenant in its name. For example, if the tenant of satellite3 is called tenant_A, the new zone must contain this information. The existing zone zone B must then be renamed to tenant_A_zone B.
The migration can be achieved via the following procedure:
Step 1. Satellite configuration
Create a configuration file for the Satellite in /etc/neteye-satellite.d/tenant_A/satellite3.conf:
{
"fqdn": "satellite3.example.com",
"name": "satellite3",
"icinga2_zone": "zone B",
"ssh_port": "22",
"ssh_enabled": true
}
Step 2. Zone configuration
Create a placeholder for the new zone, without any Endpoint configured, as follows:
object Zone "tenant_A_zone B" {
endpoints = [ ]
parent = "master"
}
Save the placeholder in a file called zone_tenant_A_zone_B.conf located in /neteye/shared/icinga2/conf/icinga2/zones.d/.
Note
Please note that the Zone name must contain only alphanumeric characters, underscores, dashes and whitespaces. The filename that defines the placeholder for the new zone, instead, must not contain whitespaces.
Step 3. Zone deploy and objects migration
Check the current configuration with icinga2-master daemon --validate
Now you have to restart icinga2-master and execute the following commands to run the kickstart and deploy the new configuration:
icingacli director kickstart run
icingacli director config deploy run
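Put together, this step might look like the sketch below. Note that the systemctl call is an assumption: on a NetEye Cluster, icinga2-master may be managed as a cluster resource, in which case it should be restarted through the cluster rather than via systemd.
# icinga2-master daemon --validate
# systemctl restart icinga2-master
# icingacli director kickstart run
# icingacli director config deploy run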
At this point you have to change all your hosts, services, templates and apply rules which belonged to the zone zone B to the new zone tenant_A_zone B.
Step 4. Satellite deploy
To deploy the Satellite you have to:
Delete your old configuration file
/neteye/shared/icinga2/conf/icinga2/zones.d/satellites.conf
Sync satellites configuration across all the cluster nodes with neteye config cluster sync (NetEye Clusters only)
Generate the new configuration on the master with neteye satellite config create satellite3; this will also run the kickstart and deploy the new configuration
Send the new configuration for the satellite with neteye satellite config send satellite3
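As a consolidated sketch of this step, assuming a NetEye Cluster and the Satellite satellite3 from the example above:
# rm /neteye/shared/icinga2/conf/icinga2/zones.d/satellites.conf
# neteye config cluster sync
# neteye satellite config create satellite3
# neteye satellite config send satellite3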
Keeping the Existing Icinga2 Zone in Single Tenant Environments¶
In single tenant environments, where multi tenancy is not a requirement, the special tenant master can be used. This allows migrating the Satellite satellite3 while keeping the same Icinga2 zone, namely zone B.
The migration, in this case, can be achieved by the following procedure:
Step 1. Satellite configuration
Create a configuration file for the satellite in /etc/neteye-satellite.d/master/satellite3.conf:
{
"fqdn": "satellite3.example.com",
"name": "satellite3",
"icinga2_zone": "zone B",
"ssh_port": "22",
"ssh_enabled": true
}
Step 2. Remove existing Satellite objects
Remove the Endpoint and Zone definitions referring to Satellite satellite3 from the file /neteye/shared/icinga2/conf/icinga2/zones.d/satellites.conf, in order to avoid any duplication of Icinga2 objects in the configuration.
Step 3. Satellite configuration and deploy
Synchronize satellites configuration across all the cluster nodes with neteye config cluster sync (NetEye Clusters only). Generate the new configuration, for Satellite satellite3, on the Master with neteye satellite config create satellite3. The command will also run the Icinga2 kickstart and deploy the new Icinga2 configuration files.
Send the new configuration to the Satellite with neteye satellite config send satellite3
Keeping the Existing Icinga2 Zone in Existing Multi Tenant Environments¶
In very large multi tenant environments, it can be the case that changing the name of an Icinga2 zone is detrimental to the monitoring capabilities of the whole NetEye ecosystem.
To avoid disrupting the monitoring, we allow, for existing multi tenant environments only, keeping the Satellite’s existing Icinga2 zone, namely zone B.
Please refer to the following procedure:
Step 1. Satellite configuration
Create a configuration file for the satellite in /etc/neteye-satellite.d/tenant_A/satellite3.conf:
{
"fqdn": "satellite3.example.com",
"name": "satellite3",
"icinga2_zone": "zone B",
"ssh_port": "22",
"ssh_enabled": true,
"icinga2_tenant_in_zone_name": false
}
Note
Please note that the special tenant master must not be used in multi tenant environments
Note
Please note that the attribute icinga2_tenant_in_zone_name must be used only in already existing multi tenant installations
Step 2. Remove existing Satellite objects
Remove the Endpoint and Zone definitions referring to Satellite satellite3 from the file /neteye/shared/icinga2/conf/icinga2/zones.d/satellites.conf, in order to avoid any duplication of Icinga2 objects in the configuration.
Step 3. Satellite configuration and deploy
Synchronize satellites configuration across all the cluster nodes with neteye config cluster sync (NetEye Clusters only). Generate the new configuration, for Satellite satellite3, on the Master with neteye satellite config create satellite3. The command will also run the Icinga2 kickstart and deploy the new Icinga2 configuration files.
Note
Please note that the Zone name must be unique across all tenants, and must contain only alphanumeric characters, underscores, dashes and whitespaces.
Send the new configuration for the Satellite with neteye satellite config send satellite3
Upgrade Satellites¶
To upgrade a Satellite, it is required to have the latest configuration archive, located in /root/satellite-setup/config/<neteye_release>/satellite-config.tar.gz. The archive is generated by the upgraded Master (see the Satellite Configuration section for how to generate it).
To automatically download the latest upgrade, you can run the following command on the Satellite:
neteye satellite upgrade
The command updates the NetEye repositories to the newer NetEye version, installs all the RPMs of the newer version, and upgrades the NetEye yum groups.
Please check for any .rpmnew and .rpmsave files (see the Migrate RPM Configuration section for further information).
If the command is successful, a message will inform you that it is possible to continue the upgrade procedure.
Execute the command below to set up the Satellite with the new upgrade:
neteye satellite setup
Complete the satellite upgrade process by launching the following script:
neteye_finalize_installation
Note
You should launch the finalize command only if all previous steps have been completed successfully. If you encounter any errors or problems during the upgrade process, please contact our service and support team to evaluate the best way forward for upgrading your NetEye System.