Service Level Management

Overview

The Service Level Management (SLM) module allows you to set up contracts between an SLM customer and the service provider, with the purpose of sending a periodic report to the customer. Thanks to functionality integrated from the SLM module, reports covering a specified period of time can be created and scheduled for generation in the Reporting module.

There are two types of contracts available:

  • The Availability contract measures the availability of hosts and services within a given period of time, to verify whether the level of availability required by the customer has been met. An availability contract is defined by a customer, an SLA type and a set of monitored objects.

  • The Resource contract provides a dashboard with diagrams that show the load of the monitored objects within a given time range. Dashboards are Grafana-based and need to be created for each contract. A resource contract is defined by a customer and a set of diagrams.

While setting up a contract is quite straightforward and is described in dedicated sections, a few points need to be understood correctly in order to avoid possible sources of problems. They are described in the next section, which you can use as a reference.

Important concepts

  • Users, customers, and their permissions

    When configuring a new object in the SLM module, keep in mind the part that permissions play in the management of customers: NetEye users with access to the SLM module can see, and assign to a customer, only the roles that they themselves hold.

    For example, if a user Jake has roles B and C, then he can only see and assign roles B or C to a customer. The only exception is users with Administrative access in NetEye, who can assign every role.

    This affects both Availability and Resource contracts: if the SLM module contains contracts involving a role that is not assigned to a user, that user will not see them.

    It is therefore important to assign appropriate roles to users of the SLM module, so that they can create and manage the contracts of their customers.
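The role rule described above can be pictured as a simple set filter. The following is only an illustrative sketch, not NetEye's implementation: the function names, the `is_admin` flag, and the assumption that each contract carries a single role are all hypothetical.

```python
def assignable_roles(user_roles, all_roles, is_admin=False):
    """Roles a user may see and assign to a customer (hypothetical helper)."""
    if is_admin:
        # Assumption: administrative users can assign every role.
        return set(all_roles)
    return set(user_roles) & set(all_roles)


def visible_contracts(contracts, user_roles, is_admin=False):
    """Keep only the contracts whose role the user holds.

    Assumption: each contract is a dict with a single "role" key.
    """
    if is_admin:
        return list(contracts)
    return [c for c in contracts if c["role"] in user_roles]


all_roles = {"A", "B", "C"}
jake_roles = {"B", "C"}
contracts = [{"name": "c1", "role": "A"}, {"name": "c2", "role": "B"}]

print(sorted(assignable_roles(jake_roles, all_roles)))
print([c["name"] for c in visible_contracts(contracts, jake_roles)])
```

In this sketch Jake can assign only roles B and C, and the contract tied to role A is hidden from him.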

  • Module permissions

    The roles of users and customers determine their access to contracts, but the module permissions are equally relevant and need to be clearly understood.

    • Full Module Access. This is essentially a shortcut that enables all the permissions below. A user with Full Module Access can see all the content of the module related to their role.

    • General Module Access. This permission only allows loading the module configuration and provides View-only access to Event Adjustment. It is a prerequisite for the permissions below, and it is also required to enable the SLM extension of the Reporting module. To grant Add/Edit/Delete permissions, you need to enable slm/admin.

    • slm/admin. With this permission it is possible to view and edit everything that the user’s role allows them to see.

    • slm/report-adjustment-override. This permission allows the user to modify the Consider Event Adjustments field in the Reporting module, provided that the SLM extensions of the Reporting module have been enabled.

Note

Users with the slm/admin or slm/report-adjustment-override permission but without General Module Access can neither see nor access the SLM module (or the SLM extensions of the Reporting module). This is the expected behaviour: General Module Access is required to load the configuration of the SLM module and activate the extensions.
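The dependency between these permissions can be summarised in a few lines. This is a hedged sketch of the rules as described here, not the actual permission engine: the function `effective_permissions` is hypothetical, and the strings for General and Full Module Access are the labels used in this guide rather than confirmed permission identifiers.

```python
# Labels as used in this guide; the first two are NOT confirmed identifiers.
GENERAL = "General Module Access"
FULL = "Full Module Access"
ADMIN = "slm/admin"
OVERRIDE = "slm/report-adjustment-override"


def effective_permissions(granted):
    """Return the permissions that actually take effect (hypothetical helper)."""
    granted = set(granted)
    if FULL in granted:
        # Full Module Access is a shortcut enabling everything else.
        return {GENERAL, ADMIN, OVERRIDE}
    if GENERAL not in granted:
        # Without General Module Access the module is not loaded at all.
        return set()
    return granted & {GENERAL, ADMIN, OVERRIDE}


print(effective_permissions({ADMIN, OVERRIDE}))      # nothing takes effect
print(sorted(effective_permissions({GENERAL, ADMIN})))
```

The first call illustrates the case from the note: slm/admin and slm/report-adjustment-override alone have no effect.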

  • Operational Time explained

    The operational time does not need to indicate a single contiguous extent of time. For instance, it may be defined as “Business Hours” (i.e., “Monday through Friday, 9:00AM to 5:00PM”), which would exclude evening and early morning hours. You would construct such a Time Period in Director by specifying each individual contiguous Time Range separately, e.g. first “monday 9:00AM to 5:00PM”, then “tuesday 9:00AM to 5:00PM”, etc.

    When calculating availability, the monitored object’s initial state is valid until the first state change event (if one exists) during that Time Range, and the state set by the last state change event in the Time Range remains valid until the end of the Time Range. Thus, given the “Business Hours” example above, if the initial state of a service is OK on Monday at 9:00AM and a single state change event of type CRITICAL occurs at 4:30PM, the result for that day will be 7 and a half hours of OK and 30 minutes of CRITICAL.
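The rule above can be sketched as a small function that walks through the state change events of a single Time Range. This is an illustration under stated assumptions, not the SLM module's actual code: `state_durations` is a hypothetical name, events are assumed to be `(timestamp, new_state)` tuples that all fall inside the Time Range, and the date is an arbitrary Monday.

```python
from datetime import datetime, timedelta


def state_durations(start, end, initial_state, events):
    """Sum the time spent in each state within one contiguous Time Range."""
    totals = {}
    current_state, current_time = initial_state, start
    for timestamp, new_state in sorted(events):
        # The current state is valid until the next state change event.
        elapsed = timestamp - current_time
        totals[current_state] = totals.get(current_state, timedelta()) + elapsed
        current_state, current_time = new_state, timestamp
    # The last state is valid until the end of the Time Range.
    elapsed = end - current_time
    totals[current_state] = totals.get(current_state, timedelta()) + elapsed
    return totals


start = datetime(2024, 1, 1, 9, 0)    # Monday, 9:00AM
end = datetime(2024, 1, 1, 17, 0)     # Monday, 5:00PM
events = [(datetime(2024, 1, 1, 16, 30), "CRITICAL")]

durations = state_durations(start, end, "OK", events)
for state, duration in durations.items():
    print(state, duration)
```

For an operational time made of several Time Ranges, the same function would be applied to each range and the results summed.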

  • How Downtime affects calculation

    A downtime is a scheduled period in which a host or service is not available. Suppose we have an overall operational time of 10 seconds where a series of state change events result in:

    • 1 second where the host is DOWN

    • 7 seconds with the host in an OK state

    • 2 more seconds where the host is DOWN

    Let’s also assume that the first second of DOWN was unexpected, while the final 2 seconds were scheduled downtime.

    • If the Downtime box is checked, the availability will be calculated as (OK + DOWNTIME)/OPERATIONAL TIME = (7s + 2s)/10s, therefore 9/10 or 90%.

    • If instead the box is not checked, the availability will be: OK/OPERATIONAL TIME = 7s / 10s, therefore 7/10 or 70%.
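The two formulas above can be written as one small function. This is a minimal sketch of the arithmetic only; the function name and parameters are hypothetical and do not correspond to any SLM API.

```python
def availability(ok, scheduled_downtime, operational_time, count_downtime):
    """Availability as a fraction of the operational time.

    count_downtime mirrors the Downtime box: when True, scheduled
    downtime is counted as available time.
    """
    if operational_time <= 0:
        raise ValueError("operational_time must be positive")
    available = ok + (scheduled_downtime if count_downtime else 0)
    return available / operational_time


# Values from the example: 7s OK, 2s scheduled downtime, 10s operational time.
with_downtime = availability(7, 2, 10, count_downtime=True)
without_downtime = availability(7, 2, 10, count_downtime=False)
print(f"{with_downtime:.0%} {without_downtime:.0%}")
```

Note that the unexpected first second of DOWN is never counted as available, regardless of the Downtime box.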