
Troubleshooting

The Update and Upgrade procedures can stop for a variety of reasons. This section collects the most frequent cases and provides guidelines to resolve the issue and continue the procedure.

In some cases you might want to check the logs of the various commands that have been executed. All logs are stored in log files under /neteye/local/os/log/neteye_command/
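
For example, to list the available log files and follow the most recent one (the actual file names depend on which commands were executed):

# ls -lt /neteye/local/os/log/neteye_command/
# tail -f /neteye/local/os/log/neteye_command/<most-recent-log-file>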

If you find a problem that is not covered on this page, please refer to the official channels (sales, consultants, or the support portal) for help and directions on how to proceed.

A check fails

In this case, an informative message will point out the check that failed, allowing you to inspect and fix the problem.

For example, if the exit message is similar to the following one, you need to manually install the latest updates.

"Found updates not installed"
"Example: icingacli, version 2.8.2_neteye1.82.1"

Then, after the updates are installed, you can run the command again and it will start its tasks over.
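
For instance, the pending icingacli update from the message above could be installed with the system package manager. This is only an illustration assuming a dnf-based system (as on NetEye's RHEL-derived operating system); follow the exact directions given by the procedure if they differ:

# dnf upgrade icingacli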

An .rpmnew and/or .rpmsave file is found

This can happen when some of the installed packages have been customised. Check section Migrate .rpmsave and .rpmnew Files for directions on how to proceed. Once done, remember to run neteye update again.
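
To locate such files, a standard filesystem search can help; the paths to scan below are only an assumption and may be adapted to your installation:

# find /etc /neteye -name '*.rpmnew' -o -name '*.rpmsave' 2>/dev/null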

A cluster resource has not been created

During a NetEye Cluster upgrade, it may be necessary to create new cluster resources before running the neteye install script. Resources must be created manually; directions can be found in section 4. Additional Tasks of the Cluster Upgrade from 4.44 to 4.45.
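
Purely as an illustration of the pcs syntax involved (resource name, agent, and group below are hypothetical; the authoritative resource definitions are those listed in the upgrade section referenced above), a missing systemd-based resource would be created along these lines:

# pcs resource create my_new_resource systemd:my_new_service --group my_cluster_group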

A health check is failing

…during the update/upgrade procedure

The NetEye update and upgrade commands run all the deep health checks to ensure that the NetEye installation is healthy before starting the update or upgrade procedure. It might happen, however, that one of the checks fails, preventing the procedure from completing successfully.

To solve the problem manually, follow the directions in section The NetEye Health Check.

Once the issue is solved, the NetEye update/upgrade commands can be run again.

…after the finalization procedure

After the finalization procedure has successfully ended, you might notice in the Problems View (see Menu / Problems) that some health check fails and is in state WARNING. The reason is that a module you are using needs to be migrated, because a breaking change was introduced in the release.

Hence, go to the Problems View and check which health check is failing. There you will also find instructions for the correct migration of the module, which in almost all cases amounts to enabling an option; the actual migration will then be executed manually.

How to check the NetEye Cluster status

Run the following cluster command:

# pcs status

and please ensure that:

  1. Only the last (N) node MUST be active

  2. All cluster resources are marked “Started” on the last (N) node

  3. All cluster services under “Daemon Status” are marked active/enabled on the last (N) node

How to check DRBD status

Check if the DRBD status is ok by using the drbdmon command, which updates the DRBD status in real time.
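
If you prefer a one-shot, non-interactive snapshot instead of the live view, the DRBD 9 utilities also provide:

# drbdadm status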

See also

Section 4.2 of DRBD’s official documentation contains information and details about the possible statuses.

https://linbit.com/drbd-user-guide/drbd-guide-9_0-en/#s-check-status

Elasticsearch Cluster upgrade

To guarantee an efficient yet reliable upgrade of the nodes in the Elasticsearch cluster, NetEye upgrades nodes in parallel whenever possible in order to save time. To troubleshoot potential issues during the upgrade, it is important to understand how the procedure works.

Parallel Upgrade of Elasticsearch Nodes

Step 1: Group Nodes by Role

Nodes are organized into logical groups based on their roles (a simplified code sketch follows this list):

  1. Master Nodes Group

    • Nodes with the master role.

  2. Data Nodes Group

    • Nodes with the data role (excluding master).

  3. Data Tier Groups

    • Nodes with tier roles: hot, warm, cold, frozen.

    • If a node has multiple tier roles, a combined group is created.

      • Example: a node with cold and frozen roles is placed in a group named cold+frozen. All nodes with either cold or frozen roles are included in this group.

    • Each node belongs to only one group.
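
As an aid to understanding, the grouping step can be modeled with a short Python sketch. This is a simplified illustration, not the actual NetEye implementation; the node names, the plain tier labels, and the handling of nodes with other roles are assumptions:

# Tier roles considered by the grouping (simplified labels as used above).
TIERS = ["hot", "warm", "cold", "frozen"]

def group_nodes(nodes):
    """Toy model of Step 1: map group name -> list of node names.

    nodes: dict mapping node name -> set of roles,
           e.g. {"es1": {"master"}, "es2": {"cold", "frozen"}}
    """
    # Tiers that appear together on any node are merged into one group,
    # so a node with only "cold" joins the "cold+frozen" group if some
    # other node holds both roles.
    merged = {t: {t} for t in TIERS}
    for roles in nodes.values():
        on_node = [t for t in TIERS if t in roles]
        if len(on_node) > 1:
            union = set().union(*(merged[t] for t in on_node))
            for t in union:
                merged[t] = union

    groups = {}
    for name, roles in nodes.items():
        if "master" in roles:
            key = "master"                              # Master Nodes Group
        elif any(t in roles for t in TIERS):
            first = next(t for t in TIERS if t in roles)
            key = "+".join(sorted(merged[first]))       # e.g. "cold+frozen"
        elif "data" in roles:
            key = "data"                                # Data Nodes Group
        else:
            continue  # nodes with other roles are out of scope here
        groups.setdefault(key, []).append(name)
    return groups

# es2 and es3 both end up in the combined "cold+frozen" group:
print(group_nodes({"es1": {"master"}, "es2": {"cold", "frozen"},
                   "es3": {"cold"}, "es4": {"data"}}))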

Step 2: Upgrade Sequence

We upgrade the groups in the following order to maintain cluster health (a sketch of the ordering follows the list):

  1. Data Tier Groups

    • Nodes are upgraded in parallel, but only one node per group at a time.

      For example:
      • One node from hot

      • One node from warm

      • One node from cold+frozen

  2. Data Nodes

    • Nodes are upgraded sequentially, one node at a time.

  3. Master Nodes

    • Nodes are upgraded sequentially, one node at a time.
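
Continuing the sketch from Step 1, the upgrade order can be modeled as follows. Again, this is a simplified illustration built on Python's standard itertools.zip_longest; the real procedure also performs the per-node health and relocation checks described below:

from itertools import zip_longest

def upgrade_plan(groups):
    """Toy model of Step 2: yield rounds of nodes upgraded in parallel."""
    tier_groups = [g for k, g in groups.items() if k not in ("master", "data")]
    # 1. Data tier groups: one node per group per round, groups in parallel.
    for round_nodes in zip_longest(*tier_groups):
        yield [n for n in round_nodes if n is not None]
    # 2. Data nodes, then 3. master nodes: strictly one node at a time.
    for key in ("data", "master"):
        for node in groups.get(key, []):
            yield [node]

for i, batch in enumerate(upgrade_plan(
        {"master": ["es1"], "cold+frozen": ["es2", "es3"], "hot": ["es5"]}), 1):
    print(f"round {i}: upgrade in parallel ->", batch)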

Waiting for Shard Relocation

Updating or upgrading Elasticsearch requires restarting the service for the changes to take effect. During this process, the shards allocated on the node being restarted are temporarily unassigned until the node is back online.

To ensure that upgrading node X does not cause any shard to become completely unavailable, the procedure by default waits until there are no unassigned shards whose replica is allocated on node X before proceeding with the upgrade of that node.

Note

If a shard has no replicas (i.e., it is a primary shard without any replicas), it will become unavailable during the upgrade of the node hosting it.

By default, each node waits up to one hour for shard relocation to complete before continuing with the upgrade. If relocation is not completed within this time frame, the procedure fails with an error, allowing you to investigate the issue.
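
If the wait times out, a reasonable first step is to check which shards are still unassigned and why, for example via Elasticsearch's standard cat shards and allocation explain APIs (host, port, and authentication below are assumptions and must be adapted to your installation):

# curl -s 'https://localhost:9200/_cat/shards?v' | grep UNASSIGNED
# curl -s 'https://localhost:9200/_cluster/allocation/explain?pretty'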

In installations with large volumes of data, relocation may take longer. In such cases, you can increase the waiting time or even skip the relocation check entirely. Refer to the following sections for instructions.

Customize maximum relocation waiting time

You can customize the maximum waiting time for shard relocation by specifying two parameters when launching the update or upgrade command: the number of retries and the seconds between each retry.

For example, to set a maximum waiting time of two hours:

neteye# (nohup neteye update --extra-vars '{"es_status_wait_retries":120,"es_status_wait_seconds_between_retries":60}' &) && tail --retry -f nohup.out
neteye# (nohup neteye upgrade --extra-vars '{"es_status_wait_retries":120,"es_status_wait_seconds_between_retries":60}' &) && tail --retry -f nohup.out
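
With these values the procedure waits at most es_status_wait_retries * es_status_wait_seconds_between_retries = 120 * 60 = 7200 seconds, that is, two hours.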

Skipping relocation wait

If shard availability during the upgrade is not required in your installation, you can skip the relocation wait using the skip_es_status_to_wait parameter:

neteye# (nohup neteye update --extra-vars '{"skip_es_status_to_wait":true}' &) && tail --retry -f nohup.out
neteye# (nohup neteye upgrade --extra-vars '{"skip_es_status_to_wait":true}' &) && tail --retry -f nohup.out

Waiting for a particular cluster status

If the default behavior of waiting for shard relocation is not suitable for your installation, you can configure the procedure to wait for a specific cluster status before upgrading each node.

For example, to wait until the cluster reaches green status:

neteye# (nohup neteye update --extra-vars '{"es_status_to_wait": "green"}' &) && tail --retry -f nohup.out
neteye# (nohup neteye upgrade --extra-vars '{"es_status_to_wait": "green"}' &) && tail --retry -f nohup.out

Note

Supported values are: green and yellow.
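
The status being waited for is the standard Elasticsearch cluster health status, which you can also inspect manually at any time (host, port, and authentication below are assumptions and must be adapted to your installation):

# curl -s 'https://localhost:9200/_cluster/health?pretty'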