User Guide

Configuration

Multitenancy Roles Configuration

If your NetEye installation is tenant aware, the roles associated with each user must be configured to limit their access to only the Tornado configuration they are allowed to work with.

In the NetEye roles (Configuration / Authentication / Roles), add or edit the role related to the tenant limited users. In the detail of the role configuration you can find the tornado module section. You can set the tenant ID in the tornado/tenant_id restriction.

Hint

You can find the list of available Tenant IDs by reading the directory names in /etc/neteye-satellite.d/. You can use this command:

neteye# basename -a $(ls -d /etc/neteye-satellite.d/*/)
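
The same lookup can be scripted. The following is a minimal sketch, assuming the tenant directories live under the path used by the command above; the function name list_tenant_ids is hypothetical and not part of NetEye.

```python
from pathlib import Path


def list_tenant_ids(base_dir: str = "/etc/neteye-satellite.d") -> list[str]:
    """Return the names of the sub-directories of base_dir, sorted.

    Each sub-directory name is treated as one Tenant ID, mirroring the
    'basename -a $(ls -d .../*/)' command. Plain files are ignored.
    """
    base = Path(base_dir)
    if not base.is_dir():
        # Not a tenant-aware installation, or a different path is in use.
        return []
    return sorted(entry.name for entry in base.iterdir() if entry.is_dir())


if __name__ == "__main__":
    for tenant_id in list_tenant_ids():
        print(tenant_id)
```

Any of the printed names can then be entered in the tornado/tenant_id restriction of the role.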

Tornado Interface Overview

In order to continuously improve UX and usability of the Tornado Instance, NetEye provides a GUI based on the Carbon Design System’s best practices.

While the new GUI is developed to completely replace the current UI, it is currently in preview.

To start, select the tenant that you would like to work with in the toolbar. Here you can also enable Edit mode in order to modify Tornado configurations with the help of the Processing Tree Editor.

The Processing Tree presents all filters and rulesets within a tenant. The order of rules within a ruleset defines the sequence of their execution. To change the order, drag and drop a rule using the button to the left of its name.

../../_images/processing_tree.png

Fig. 152 Example Processing Tree.

Test Events Panel

To check that the Processing Tree is configured properly, i.e. that the filters’ structure and the rules’ conditions work as expected, you can send test events from the Test Events panel. Test events are created by providing data in a dedicated form.

../../_images/test_window.png

Fig. 153 The test event window.

Warning

If the ‘Enable execution of actions’ setting is set to ‘On’, actions will be executed in the production environment.

When a test is executed by clicking the “Run Test” button, the linked Event is sent to Tornado and the outcome of the operation will be reported in the Processing Tree.

Following the yellow line it is possible to see the path that the event has taken. The nodes that have matched the event are distinguishable by a full yellow lightning bolt while those partially matched have an empty bolt.

../../_images/processing_tree_with_event.png

Fig. 154 A Processing Tree with an event result

At this point, a rule can be in one of the following states:

  • matched: If a rule matched the Event.

  • stopped: If a rule matched the Event and then stopped the execution flow. This happens if the continue flag of the rule is set to false.

  • partially matched: If the WHERE condition of the Rule was matched but it was not possible to process the required extracted variables.

  • not matched: If the rule did not match the Event.
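
The four states can be summarised as a small decision function. This is only an illustrative sketch of the descriptions above, not Tornado's implementation; the function and parameter names are hypothetical.

```python
def rule_state(where_matched: bool, variables_extracted: bool, continue_flag: bool) -> str:
    """Classify a rule after a test event, following the state list above.

    where_matched:       the WHERE condition evaluated to true
    variables_extracted: all required extracted variables could be processed
    continue_flag:       the value of the rule's 'continue' property
    """
    if not where_matched:
        return "not matched"
    if not variables_extracted:
        # WHERE matched, but the extracted variables could not be processed.
        return "partially matched"
    if not continue_flag:
        # The rule matched and then stopped the execution flow.
        return "stopped"
    return "matched"
```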

../../_images/ruleset_matched.png

Fig. 155 Example of processed rules

  • Matched rules: Extract_sender, Extract_subject, Archive_all

  • Partially matched: Extract_message

  • Not matched: Block_invalid_senders

For each rule in the table, the extracted variables and the generated Action payloads are shown. In addition, all these extracted variables are also shown in the Event Test form.

../../_images/extracted_variables.png

Fig. 156 Sample of extracted variables

Two other buttons are visible: one for clearing all the fields of the form and one for clearing the outcome of the test.

Processing Tree Editor

The Tornado GUI provides an edit mode that allows you to modify the configuration of the Tornado rules’ processing tree directly from NetEye’s web interface. Two important principles have been used for the development of the edit mode and must be understood and taken into account when modifying Tornado’s configuration:

  • Implicit Lock Mode. Only one user at a time can modify the processing tree configuration. This prevents multiple users from changing the configuration simultaneously, which might lead to unwanted results and possibly to Tornado not working correctly due to incomplete or wrong configuration. When a user is editing the configuration, the actual, running configuration is left untouched: it continues to be operative and accepts incoming data to be processed.

  • Edit Mode. When you start modifying the configuration, Tornado continues to work with the existing configuration, thanks to the implicit lock mode, while the new changes are saved in a separate draft configuration. The new configuration must then be deployed to become operative.

    This mode has other positive side effects: you do not need to complete the changes in one session, but can stop and continue at a later point; another user can pick up the draft and complete it; and in case of a disaster (e.g., the abrupt end of the HTTPS connection to the GUI) it is possible to resume the draft from the point where it was left.

Warning

Only one draft at a time is allowed; that is, editing multiple drafts is not supported!

When a user enters the edit mode, a new draft is created on the fly if none is present, which will be an exact copy of the running Tornado configuration. If not present in the draft, a root node of type Filter will be automatically added to the draft.

To check for the correctness of a Draft, without impacting the deployed configuration, it is possible to open the test window also while in Edit Mode. The event will be processed using the Draft and the result will be displayed, while keeping the existing configuration running.

You can add a new node in two ways:

  • by clicking on the “Add” button in the top right corner and then selecting the parent node to which you want to add the new node.

  • by clicking on the icon with the three dots on a node, referred to from now on as the overflow menu.

../../_images/new-filter-node.png

Fig. 157 Adding a node

All nodes at the same level are ordered alphabetically.

A Filter node can alternatively be managed via a JSON-based syntax; examples can be found in the various How-tos in the Tornado section of the User Guide.

A node can also be deleted from the overflow menu when in Edit mode.

Rule Editor

All the rules presented in a Processing Tree belong to a particular Ruleset.

Clicking on a Rule item or on the “Add rule” button opens a form to edit or add a rule, respectively. It is organized in tabs:

../../_images/rule_editor.png

Basic Properties of a Rule include:

  • rule name: A string value representing a unique rule identifier. It can be composed only of alphabetical characters, numbers and the “_” (underscore) character.

  • description

  • continue: A boolean value indicating whether to proceed with the event matching process if the current rule matches.

  • active: A boolean value; if false, the rule is ignored.

Constraints

Here you will find tests that determine whether or not an event matches the rule. There are two types of constraints:

  • WHERE: A set of operators that allows you to specify the condition under which the rule should be matched; when applied to an event, it returns true or false

  • WITH: A set of regular expressions that extract values from an Event and associate them with named variables

An event matches a rule only if the WHERE clause evaluates to true and all regular expressions in the WITH clause return non-empty values.

The following operators are available in the WHERE clause. Check also the examples in the dedicated section to see how to use them, including example rules.

  • ‘AND’: Receives an array of operator clauses and returns true if and only if all of them evaluate to true.

  • ‘OR’: Receives an array of operator clauses and returns true if at least one of the operators evaluates to true.

  • ‘NOT’: Receives one operator clause and returns true if the operator clause evaluates to false, while it returns false if the operator clause evaluates to true.

  • ‘contains’: Evaluates whether the first argument contains the second one. It can be applied to strings, arrays, and maps. The operator can also be called with the alias ‘contain’.

  • ‘containsIgnoreCase’: Evaluates whether the first argument contains, in a case-insensitive way, the string passed as second argument. This operator can also be called with the alias ‘containIgnoreCase’.

  • ‘equals’: Compares any two values (including, but not limited to, arrays, maps) and returns whether or not they are equal. An alias for this operator is ‘equal’.

  • ‘equalsIgnoreCase’: Compares two strings and returns whether or not they are equal in a case-insensitive way. The operator can also be called with the alias ‘equalIgnoreCase’.

  • ‘ge’: Compares two values and returns whether the first value is greater than or equal to the second one. If one or both of the values do not exist, it returns false.

  • ‘gt’: Compares two values and returns whether the first value is greater than the second one. If one or both of the values do not exist, it returns false.

  • ‘le’: Compares two values and returns whether the first value is less than or equal to the second one. If one or both of the values do not exist, it returns false.

  • ‘lt’: Compares two values and returns whether the first value is less than the second one. If one or both of the values do not exist, it returns false.

  • ‘ne’: This is the negation of the ‘equals’ operator. Compares two values and returns whether or not they are different. It can also be called with the aliases ‘notEquals’ and ‘notEqual’.

  • ‘regex’: Evaluates whether a field of an event matches a given regular expression.

Note

We use the Rust Regex library (see its GitHub project home page) to evaluate regular expressions provided by the WITH clause and by the regex operator. Refer to its dedicated documentation for details about its features and limitations. You can also visit this site <https://regex101.com> to test your regex input and get interactive feedback on its syntax.
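
To make the operator semantics above more concrete, here is a minimal sketch of an evaluator working on already-resolved values. It is not Tornado's code. The "type", "first" and "second" keys follow the Filter example in this guide; the "operators", "operator", "regex" and "target" keys are assumptions made for this illustration, missing values are represented by None, and Python's re module stands in for the Rust Regex library, whose syntax differs in some details.

```python
import re
from typing import Any, Callable

# Aliases listed in the operator descriptions above.
ALIASES = {
    "contain": "contains",
    "containIgnoreCase": "containsIgnoreCase",
    "equal": "equals",
    "equalIgnoreCase": "equalsIgnoreCase",
    "notEquals": "ne",
    "notEqual": "ne",
}


def _test(first: Any, second: Any, check: Callable[[Any, Any], bool]) -> bool:
    """Apply check; a missing value or incompatible types yield False."""
    if first is None or second is None:
        return False
    try:
        return bool(check(first, second))
    except (TypeError, AttributeError):
        return False


BINARY = {
    "contains": lambda a, b: b in a,  # strings, arrays, and map keys
    "containsIgnoreCase": lambda a, b: b.lower() in a.lower(),
    "equalsIgnoreCase": lambda a, b: a.lower() == b.lower(),
    "ge": lambda a, b: a >= b,
    "gt": lambda a, b: a > b,
    "le": lambda a, b: a <= b,
    "lt": lambda a, b: a < b,
}


def evaluate(op: dict) -> bool:
    """Evaluate one WHERE operator clause (sketch, see assumptions above)."""
    kind = ALIASES.get(op["type"], op["type"])
    if kind == "AND":
        return all(evaluate(child) for child in op["operators"])
    if kind == "OR":
        return any(evaluate(child) for child in op["operators"])
    if kind == "NOT":
        return not evaluate(op["operator"])
    if kind == "regex":
        target = op["target"]
        return isinstance(target, str) and re.search(op["regex"], target) is not None
    first, second = op.get("first"), op.get("second")
    if kind == "equals":
        return first == second
    if kind == "ne":
        return first != second
    if kind in BINARY:
        return _test(first, second, BINARY[kind])
    raise ValueError(f"unknown operator type: {op['type']}")
```

In a real rule the first and second values would normally be accessors such as ${event.type}; resolving them is left out here to keep the sketch short.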

Actions

The Action editor allows you to specify the actions to be executed when an Event matches a Rule.

Reading Event Fields

A Rule can access Event fields through the “${” and “}” delimiters. To do so, the following conventions are defined:

  • The ‘.’ (dot) char is used to access inner fields.

  • Keys containing dots are escaped with leading and trailing double quotes.

  • Double quote chars are not accepted inside a key.

For example, given the incoming event:

{
    "type": "trap",
    "created_ms": 1554130814854,
    "payload":{
        "protocol": "UDP",
        "oids": {
            "key.with.dots": "38:10:38:30.98"
        }
    }
}

The rule can access the event’s fields as follows:

  • ${event.type}: Returns trap

  • ${event.payload.protocol}: Returns UDP

  • ${event.payload.oids."key.with.dots"}: Returns 38:10:38:30.98

  • ${event.payload}: Returns the entire payload

  • ${event}: Returns the entire event

  • ${event.metadata.key}: Returns the value of the entry named key from the metadata. The metadata is a special field of an event created by Tornado to store additional information where needed (e.g. the tenant_id, etc.)
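
The conventions above can be sketched as a small path resolver. This is an illustration only, not Tornado's parser; the names split_path and read_field are hypothetical, and the event is assumed to be available as a plain dictionary.

```python
import re
from typing import Any

# A segment is either a double-quoted key (which may contain dots)
# or a run of characters without dots and double quotes.
_SEGMENT = re.compile(r'"([^"]*)"|([^."]+)')


def split_path(path: str) -> list[str]:
    """Split a dotted path into its keys, honouring double-quoted keys."""
    return [quoted or plain for quoted, plain in _SEGMENT.findall(path)]


def read_field(event: dict, accessor: str) -> Any:
    """Resolve an accessor such as ${event.payload.protocol} against an event."""
    if not (accessor.startswith("${") and accessor.endswith("}")):
        raise ValueError("accessor must be delimited by ${ and }")
    keys = split_path(accessor[2:-1])
    if not keys or keys[0] != "event":
        raise ValueError("accessor must start with 'event'")
    value: Any = event
    for key in keys[1:]:
        if not isinstance(value, dict) or key not in value:
            raise KeyError(key)
        value = value[key]
    return value


if __name__ == "__main__":
    event = {
        "type": "trap",
        "created_ms": 1554130814854,
        "payload": {"protocol": "UDP", "oids": {"key.with.dots": "38:10:38:30.98"}},
    }
    print(read_field(event, '${event.payload.oids."key.with.dots"}'))
```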

String interpolation

An action payload can also contain text with placeholders that Tornado will replace at runtime. The values to be used for the substitution are extracted from the incoming Events following the conventions mentioned in the previous section; for example, using that Event definition, this string in the action payload:

Received a ${event.type} with protocol ${event.payload.protocol}

produces:

Received a trap with protocol UDP

Note

Only values of type String, Number, Boolean and null are valid. Consequently, the interpolation will fail, and the action will not be executed, if the value associated with the placeholder extracted from the Event is an Array, a Map, or undefined.
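
The interpolation rule and its failure cases can be sketched as follows. This is an illustration under stated assumptions, not Tornado's implementation: the names are hypothetical, and the textual rendering of Boolean and null values ("true", "false", "null") is an assumption.

```python
import re

_PLACEHOLDER = re.compile(r"\$\{([^}]*)\}")
_SEGMENT = re.compile(r'"([^"]*)"|([^."]+)')


class InterpolationError(Exception):
    """Raised when a placeholder cannot be rendered; the action is then skipped."""


def _lookup(event: dict, path: str):
    keys = [quoted or plain for quoted, plain in _SEGMENT.findall(path)]
    if not keys or keys[0] != "event":
        raise InterpolationError(f"unsupported placeholder: {path}")
    value = event
    for key in keys[1:]:
        if not isinstance(value, dict) or key not in value:
            raise InterpolationError(f"{path} is undefined")
        value = value[key]
    return value


def interpolate(template: str, event: dict) -> str:
    """Replace every ${...} placeholder in template with a value from event."""

    def render(match: re.Match) -> str:
        value = _lookup(event, match.group(1))
        if isinstance(value, (list, dict)):
            # Arrays and Maps are not valid interpolation values.
            raise InterpolationError(f"{match.group(1)} is not a scalar value")
        if value is None:
            return "null"  # assumed rendering
        if isinstance(value, bool):
            return "true" if value else "false"  # assumed rendering
        return str(value)

    return _PLACEHOLDER.sub(render, template)
```

Note that a payload entry consisting only of an accessor, such as ${event.payload}, is a plain field read and not a string interpolation, so it may return a Map or an Array.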

Filter Editor

A Filter node contains the following properties:

  • filter name: A unique string value that should be composed only of letters, numbers and the “_” (underscore) character; it corresponds to the filename, stripped of its .json extension.

  • description

  • active: A boolean value; if false, the Filter’s children will be ignored.

  • filter: A boolean operator that, when applied to an event, returns true or false. This operator determines whether an Event matches the Filter; consequently, it determines whether an Event will be processed by the Filter’s inner nodes.

../../_images/filter-properties.png

A Filter node uses the same set of Constraints in its ‘Where’ tab as described for a Ruleset node in the Rule Editor.

Filters available by default

The Tornado Processing Tree provides some out-of-the-box Filters, each of which matches all, and only, the Events originating from a given tenant. For more information on tenants in NetEye visit the dedicated page.

These Filters are created at the top level of the Processing Tree, in such a way that it is possible to set up tenant-specific Tornado pipelines.

Given for example a tenant named acme, the matching condition of the Filter for the acme tenant will be defined as:

{
    "type": "equals",
    "first": "${event.metadata.tenant_id}",
    "second": "acme"
}

Keep in mind that these Filters must never be deleted or modified, because they are automatically re-created.

Note

NetEye generates one Filter for each tenant, including the default master tenant.

Import and Export Configuration

The Tornado GUI provides multiple ways to import and export the whole configuration or just a subset of it.

Export Configuration

You have three possibilities to export the Tornado configuration or part of it:

  1. entire configuration: select the root node from the Processing Tree View and click on the export button to download the entire configuration

  2. a node (either a ruleset or a filter): select the node from the Processing Tree View and click on the export button to download the node and its sub-nodes

  3. a single rule: navigate to the rules table, select a rule, and click on the export button

Hint

You can back up and download the Tornado configuration by exporting the entire configuration.

Import Configuration

You can use the import feature to upload to NetEye a previously downloaded configuration, new custom rules, or even the configuration from another NetEye instance.

When clicking on the import button a popup will appear with the following fields:

  • Node File: the file containing the configuration

    Note

    When importing a single rule the field will be labeled as Rule File.

  • Replace whole configuration?: If selected, the imported configuration will replace the root node and all of its sub-nodes.

    Hint

    You can restore a previous Tornado configuration by selecting this option.

  • Parent Node: The parent node to which the imported configuration is added. By default it is set to the currently selected node.

Note

When a node or a rule with the same name as an already existing one is imported, the name of the new node/rule will be suffixed with _imported.

Tornado Collectors

Tornado provides a number of preconfigured Collectors that handle inputs from various data sources:

  1. Email Collector

  2. Rsyslog Collector

  3. Webhook Collector

  4. Nats JSON Collector

  5. Icinga 2 Collector

  6. SNMP Trap Daemon Collector

Most of the Tornado Collectors work out of the box and do not require manual configuration; see Tornado Collectors for more details. Some of them, however, can be configured to suit your needs.

Tornado Webhook Collector

The Webhook Collector is a standalone HTTP server built on actix-web that listens for REST calls from a generic webhook, generates Tornado Events from the webhook JSON body, and sends them to the Tornado Engine.

On startup, it creates a dedicated REST endpoint for each configured webhook. Each call received by an endpoint is processed by the embedded JMESPath Collector, which uses it to produce a Tornado Event. In the final step, the Events are forwarded to the Tornado Engine through the configured connection type.

You must configure a JSON file for each webhook in the /neteye/shared/tornado_webhook_collector/conf/webhooks/ folder.

For each webhook, you must provide three values in order to successfully create an endpoint:

  • id: The webhook identifier. This will determine the path of the endpoint; it must be unique per webhook.

  • token: A security token that the webhook issuer has to include in the URL as part of the query string (see the example at the bottom of this page for details). If the token provided by the issuer is missing or does not match the one owned by the Collector, then the call will be rejected and an HTTP 401 code (UNAUTHORIZED) will be returned.

  • collector_config: The transformation logic that converts a webhook JSON object into a Tornado Event. It consists of a JMESPath Collector configuration as described in its specific documentation.

{
  "id": "<webhook_id>",
  "token": "<webhook_token>",
  "collector_config": {
    "event_type": "<webhook_custom_event_type>",
    "payload": {
      "source": "${@}"
    }
  }
}
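
The token check described above can be sketched as follows. This is not the Collector's actual code: the query parameter name token, the example URL layout, and the 200 status returned on success are assumptions made for this illustration.

```python
import hmac
from urllib.parse import parse_qs, urlsplit


def authorize(request_url: str, expected_token: str) -> int:
    """Return the HTTP status the token check would produce for a webhook call.

    401 (UNAUTHORIZED) is returned when the token is missing from the query
    string or does not match the one configured for the webhook.
    """
    query = parse_qs(urlsplit(request_url).query)
    supplied = query.get("token", [""])[0]
    # compare_digest avoids leaking the token through timing differences.
    if not supplied or not hmac.compare_digest(supplied.encode(), expected_token.encode()):
        return 401
    return 200


if __name__ == "__main__":
    # Hypothetical host, port and path, used only to show the query string.
    print(authorize("http://localhost:8080/event/my_hook?token=my_secret", "my_secret"))
```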

Tornado Icinga 2 Collector

The Icinga 2 Collector subscribes to the Icinga 2 API event streams, generates Tornado Events from the Icinga 2 Events, and publishes them on the Tornado Engine TCP address.

The Icinga 2 Collector executable is built on actix.

On startup, it connects to an existing Icinga 2 Server API and subscribes to user-defined Event Streams. Each Icinga 2 Event published on a stream is processed by the embedded jmespath Collector, which uses it to produce a Tornado Event that is finally forwarded to the Tornado Engine’s TCP address.

Each stream must be configured as a JSON file in /neteye/shared/tornado_icinga2_collector/conf/streams/.

More than one stream subscription can be defined. For each stream, you must provide two values in order to successfully create a subscription:

  • stream: the stream configuration composed of:

    • types: An array of Icinga 2 Event types;

    • queue: A unique queue name used by Icinga 2 to identify the stream;

    • filter: An optional Event Stream filter. Additional information about the filter can be found in the official documentation.

  • collector_config: The transformation logic that converts an Icinga 2 Event into a Tornado Event. It consists of a JMESPath Collector configuration as described in its specific documentation.

For all Icinga 2 events

{
  "stream": {
    "types": ["CheckResult",
              "StateChange",
              "Notification",
              "AcknowledgementSet",
              "AcknowledgementCleared",
              "CommentAdded",
              "CommentRemoved",
              "DowntimeAdded",
              "DowntimeRemoved",
              "DowntimeStarted",
              "DowntimeTriggered"],
    "queue": "icinga2_AllEvents_all"
  },
  "collector_config": {
    "event_type": "icinga2_AllEvents_all",
    "payload": {
      "response": "${@}"
    }
  }
}

For check result events

{
  "stream": {
    "types": ["CheckResult"],
    "queue": "icinga2_CheckResult_all"
  },
  "collector_config": {
    "event_type": "icinga2_CheckResult_all",
    "payload": {
      "response": "${@}"
    }
  }
}

For notification events

{
  "stream": {
    "types": ["Notification"],
    "queue": "icinga2_Notification_all"
  },
  "collector_config": {
    "event_type": "icinga2_Notification_all",
    "payload": {
      "response": "${@}"
    }
  }
}

For state change events

{
  "stream": {
    "types": ["StateChange"],
    "queue": "icinga2_StateChange_all"
  },
  "collector_config": {
    "event_type": "icinga2_StateChange_all",
    "payload": {
      "response": "${@}"
    }
  }
}

Note

Based on the Icinga 2 Event Streams documentation, multiple HTTP clients can use the same queue name as long as they use the same event types and filter.
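
Since the stream definitions above differ only in their event types and names, they can be generated. The sketch below reproduces the naming pattern used in the examples (icinga2_<name>_all for both the queue and the event type); the function name is hypothetical, and where the resulting JSON is saved is left to the reader.

```python
import json
from typing import Optional


def stream_config(event_types: list[str], name: str, filter_expr: Optional[str] = None) -> dict:
    """Build one stream subscription following the examples above.

    event_types: the Icinga 2 Event types to subscribe to
    name:        the middle part of the queue name, e.g. "CheckResult"
    filter_expr: optional Event Stream filter, omitted when None
    """
    queue = f"icinga2_{name}_all"
    stream = {"types": list(event_types), "queue": queue}
    if filter_expr is not None:
        stream["filter"] = filter_expr
    return {
        "stream": stream,
        "collector_config": {
            "event_type": queue,
            # ${@} passes the whole Icinga 2 Event through as the payload.
            "payload": {"response": "${@}"},
        },
    }


if __name__ == "__main__":
    print(json.dumps(stream_config(["CheckResult"], "CheckResult"), indent=2))
```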

Email Collector

When the Email Collector receives a valid MIME email message as input, it parses it and produces a Tornado Event with the extracted data.

Attachments are included in the Event: those that are text files are included in plain text, while all others are encoded in base64.

For example, passing this email with attachments:

From: "Francesco" <francesco@example.com>
Subject: Test for Mail Collector - with attachments
To: "Benjamin" <benjamin@example.com>,
 francesco <francesco@example.com>
Cc: thomas@example.com, francesco@example.com
Date: Sun, 02 Oct 2016 07:06:22 -0700 (PDT)
MIME-Version: 1.0
Content-Type: multipart/mixed;
 boundary="------------E5401F4DD68F2F7A872C2A83"
Content-Language: en-US

This is a multi-part message in MIME format.
--------------E5401F4DD68F2F7A872C2A83
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: 7bit

<html>Test for Mail Collector with attachments</html>

--------------E5401F4DD68F2F7A872C2A83
Content-Type: application/pdf;
 name="sample.pdf"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename="sample.pdf"

JVBERi0xLjMNCiXi48/TDQoNCjEgMCBvYmoNCjw8DQovVHlwZSAvQ2F0YWxvZw0KT0YNCg==

--------------E5401F4DD68F2F7A872C2A83
Content-Type: text/plain; charset=UTF-8;
 name="sample.txt"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename="sample.txt"

dHh0IGZpbGUgY29udGV4dCBmb3IgZW1haWwgY29sbGVjdG9yCjEyMzQ1Njc4OTA5ODc2NTQz
MjEK
--------------E5401F4DD68F2F7A872C2A83--

will generate this Event:

{
  "type": "email",
  "created_ms": 1554130814854,
  "payload": {
    "date": 1475417182,
    "subject": "Test for Mail Collector - with attachments",
    "to": "\"Benjamin\" <benjamin@example.com>, francesco <francesco@example.com>",
    "from": "\"Francesco\" <francesco@example.com>",
    "cc": "thomas@example.com, francesco@example.com",
    "body": "<html>Test for Mail Collector with attachments</html>",
    "attachments": [
      {
        "filename": "sample.pdf",
        "mime_type": "application/pdf",
        "encoding": "base64",
        "content": "JVBERi0xLjMNCiXi48/TDQoNCjEgMCBvYmoNCjw8DQovVHlwZSAvQ2F0YWxvZw0KT0YNCg=="
      },
      {
        "filename": "sample.txt",
        "mime_type": "text/plain",
        "encoding": "plaintext",
        "content": "txt file context for email collector\n1234567890987654321\n"
      }
    ]
  }
}

Within the Tornado Event, the filename and mime_type properties of each attachment are the values extracted from the incoming email.

Instead, the encoding property refers to the content encoding in the Event itself, which is one of two types:

  • plaintext: The content is included in plain text

  • base64: The content is encoded in base64
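
The attachment handling described above can be approximated with Python's standard email package. This is a sketch of the behaviour, not the Collector's implementation; the function name is hypothetical, and deciding "text file" by the text/* MIME type is an assumption.

```python
import base64
from email import message_from_bytes, policy


def attachments_to_entries(raw_email: bytes) -> list[dict]:
    """Return one dictionary per attachment, shaped like the Event above."""
    message = message_from_bytes(raw_email, policy=policy.default)
    entries = []
    for part in message.iter_attachments():
        if part.get_content_maintype() == "text":
            # Text attachments are decoded and included as plain text.
            encoding, content = "plaintext", part.get_content()
        else:
            # Everything else is (re-)encoded in base64.
            data = part.get_payload(decode=True) or b""
            encoding, content = "base64", base64.b64encode(data).decode("ascii")
        entries.append(
            {
                "filename": part.get_filename(),
                "mime_type": part.get_content_type(),
                "encoding": encoding,
                "content": content,
            }
        )
    return entries
```

Note that the transfer encoding used inside the email does not matter: in the example above sample.txt travels as base64 but ends up as plaintext in the Event.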

Tornado Executors

Director Executor

The Director Executor is an application that extracts data from a Tornado Action and prepares it to be sent to the Icinga Director REST API; it expects a Tornado Action to include the following elements in its payload:

  1. An action_name: The Director action to perform.

  2. An action_payload (optional): The payload of the Director action.

  3. An icinga2_live_creation (optional): Boolean value, which determines whether to create the specified Icinga Object also in Icinga 2.

Valid values for action_name are:

  • create_host: creates an object of type host in the Director

  • create_service: creates an object of type service in the Director

The action_payload should contain at least all mandatory parameters expected by the Icinga Director REST API for the type of object you want to create.

An example of a valid Tornado Action is:

{
    "id": "director",
    "payload": {
        "action_name": "create_host",
        "action_payload": {
          "object_type": "object",
          "object_name": "my_host_name",
          "address": "127.0.0.1",
          "check_command": "hostalive",
          "vars": {
            "location": "Bolzano"
          }
        },
        "icinga2_live_creation": true
    }
}
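
Before deploying a rule, the payload of a Director action can be checked against the elements listed above. The following is a minimal sketch; the function name is hypothetical, and it deliberately does not check the content of action_payload, which depends on the Icinga Director REST API.

```python
VALID_ACTION_NAMES = {"create_host", "create_service"}


def validate_director_payload(payload: dict) -> list[str]:
    """Return a list of problems found in a Director action payload (empty if none)."""
    problems = []
    if payload.get("action_name") not in VALID_ACTION_NAMES:
        problems.append("action_name must be one of: " + ", ".join(sorted(VALID_ACTION_NAMES)))
    # action_payload is optional, but when present it is expected to be a map.
    if "action_payload" in payload and not isinstance(payload["action_payload"], dict):
        problems.append("action_payload must be a map")
    # icinga2_live_creation is optional, but when present it must be a Boolean.
    if "icinga2_live_creation" in payload and not isinstance(payload["icinga2_live_creation"], bool):
        problems.append("icinga2_live_creation must be a Boolean")
    return problems
```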

Logger Executor

An Executor that logs received Actions: it simply outputs the whole Action body to the standard log at the info level.

Script Executor

An Executor that runs custom shell scripts on a Unix-like system. To be correctly processed by this Executor, an Action should provide two entries in its payload: the path to a script on the local filesystem of the Executor process, and all the arguments to be passed to the script itself.

The script path is identified by the payload key script. It is important to verify that the Executor has both read and execute rights at that path.

The script arguments are identified by the payload key args; if present, they are passed as command line arguments when the script is executed.

An example of a valid Action is:

{
    "id": "script",
    "payload" : {
        "script": "./usr/share/scripts/my_script.sh",
        "args": [
            "tornado",
            "rust"
        ]
    }
}

In this case the Executor will launch the script my_script.sh with the arguments “tornado” and “rust”. Consequently, the resulting command will be:

neteye# ./usr/share/scripts/my_script.sh tornado rust

Other Ways of Passing Arguments

There are different ways to pass the arguments for a script:

  • Passing arguments as a String:

    {
      "id": "script",
      "payload" : {
          "script": "./usr/share/scripts/my_script.sh",
          "args": "arg_one arg_two -a --something else"
      }
    }
    

    If args is a String, the entire String is appended as a single argument to the script. In this case the resulting command will be:

    neteye# ./usr/share/scripts/my_script.sh "arg_one arg_two -a --something else"
    
  • Passing arguments in an array:

    {
      "id": "script",
      "payload" : {
          "script": "./usr/share/scripts/my_script.sh",
          "args": [
              "--arg_one tornado",
              "arg_two",
              true,
              100
          ]
      }
    }
    

    Here the argument’s array elements are passed as four arguments to the script in the exact order they are declared. In this case the resulting command will be:

    neteye# ./usr/share/scripts/my_script.sh "--arg_one tornado" arg_two true 100
    
  • Passing arguments in a map:

    {
      "id": "script",
      "payload" : {
          "script": "./usr/share/scripts/my_script.sh",
          "args": {
            "arg_one": "tornado",
            "arg_two": "rust"
        }
      }
    }
    

    When arguments are passed in a map, each entry in the map is considered to be an (option key, option value) pair. Each pair is passed to the script using the default style for passing options to a Unix executable, which is --key followed by the value. Consequently, the resulting command will be:

    neteye# ./usr/share/scripts/my_script.sh --arg_one tornado --arg_two rust
    

    Please note that ordering is not guaranteed to be preserved in this case, so the resulting command line could also be:

    neteye# ./usr/share/scripts/my_script.sh --arg_two rust --arg_one tornado
    

    Thus if the order of the arguments matters, you should pass them using either the string- or the array-based approach.

  • Passing no arguments:

    {
      "id": "script",
      "payload" : {
          "script": "./usr/share/scripts/my_script.sh"
      }
    }
    

    Since arguments are not mandatory, they can be omitted. In this case the resulting command will simply be:

    neteye# ./usr/share/scripts/my_script.sh
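
The four cases above can be summarised in a short sketch that builds the argument vector from an Action payload. It is an illustration of the documented behaviour, not the Executor's code; the function names are hypothetical, and rendering non-string values through their JSON form (true, 100) is an assumption based on the array example.

```python
import json


def _render(value) -> str:
    """Strings are passed as-is; other values use their JSON text (true, 100)."""
    return value if isinstance(value, str) else json.dumps(value)


def build_command(payload: dict) -> list[str]:
    """Return the argument vector for a Script Executor payload."""
    command = [payload["script"]]
    args = payload.get("args")
    if args is None:
        # Arguments are optional.
        return command
    if isinstance(args, str):
        # The entire String is one single argument.
        command.append(args)
    elif isinstance(args, list):
        # One argument per element, in declaration order.
        command.extend(_render(item) for item in args)
    elif isinstance(args, dict):
        # Each entry becomes "--key value". Python keeps insertion order,
        # whereas the Executor does not guarantee any ordering.
        for key, value in args.items():
            command.extend([f"--{key}", _render(value)])
    else:
        raise TypeError("args must be a String, an array, or a map")
    return command
```

Working with an argument vector instead of a single command string makes the difference between the String case (one argument) and the array case (several arguments) explicit.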
    

Monitoring Executor

Warning

The Monitoring Executor is deprecated. Please use the Smart Monitoring Check Result Executor instead, which is equivalent but easier to configure.

The Monitoring Executor allows you to perform an Icinga process-check-result even when the Icinga object it refers to does not yet exist.

To do so, it executes the action process-check-result with the Icinga Executor and, if the underlying Icinga objects do not yet exist in Icinga, the actions create_host or create_service with the Director Executor.

Warning

The Monitoring Executor requires the live-creation feature of the Icinga Director to be exposed in the REST API. If this is not the case, the actions of this Executor will always fail if the Icinga Objects are not already present in Icinga 2.

This Executor expects a Tornado Action to include the following elements in its payload:

  1. An action_name: The Monitoring action to perform.

  2. A process_check_result_payload: The payload for the Icinga 2 process-check-result action.

  3. A host_creation_payload: The payload which will be sent to the Icinga Director REST API for the host creation.

  4. A service_creation_payload: The payload which will be sent to the Icinga Director REST API for the service creation (mandatory only in case action_name is create_and_or_process_service_passive_check_result).

Valid values for action_name are:

  • create_and_or_process_host_passive_check_result: sets the passive check result for a host, and, if necessary, it also creates the host.

  • create_and_or_process_service_passive_check_result: sets the passive check result for a service, and, if necessary, it also creates the service.

The process_check_result_payload should contain at least all mandatory parameters expected by the Icinga API to perform the action. The object on which you want to set the passive check result must be specified with the field host in case of action create_and_or_process_host_passive_check_result, and with the field service in case of action create_and_or_process_service_passive_check_result. Specifying a set of objects with the parameter filter is not valid.

The host_creation_payload should contain at least all mandatory parameters expected by the Icinga Director REST API to perform the creation of a host.

The service_creation_payload should contain at least all mandatory parameters expected by the Icinga Director REST API to perform the creation of a service.

An example of a valid Tornado Action is:

{
  "id": "monitoring",
  "payload": {
    "action_name": "create_and_or_process_service_passive_check_result",
    "process_check_result_payload": {
      "exit_status": "2",
      "plugin_output": "Output message",
      "service": "myhost!myservice",
      "type": "Service"
    },
    "host_creation_payload": {
      "object_type": "Object",
      "object_name": "myhost",
      "address": "127.0.0.1",
      "check_command": "hostalive",
      "vars": {
        "location": "Rome"
      }
    },
    "service_creation_payload": {
      "object_type": "Object",
      "host": "myhost",
      "object_name": "myservice",
      "check_command": "ping"
    }
  }
}

The flowchart shown in the figure “Flowchart of Monitoring Executor” helps to understand the behaviour of the Monitoring Executor in relation to the Icinga 2 and Icinga Director REST APIs.

../../_images/monitoring-executor-flowchart.png

Fig. 158 Flowchart of Monitoring Executor.

Foreach Executor

An Executor that loops through a set of data and executes a list of actions for each entry; it extracts all values from an array of elements and injects each value into a list of actions under the item key.

There are two mandatory configuration entries in its payload:

  • target: the array of elements

  • actions: the array of actions to execute

For example, given this rule definition:

{
  "name": "do_something_foreach_value",
  "description": "This uses a foreach loop",
  "continue": true,
  "active": true,
  "constraint": {
    "WITH": {}
  },
  "actions": [
    {
      "id": "foreach",
      "payload": {
        "target": "${event.payload.values}",
        "actions": [
          {
            "id": "logger",
            "payload": {
              "source": "${event.payload.source}",
              "value": "the value is ${item}"
            }
          },
          {
            "id": "archive",
            "payload": {
              "event": "${event}",
              "item_value": "${item}"
            }
          }
        ]
      }
    }
  ]
}

When an event with this payload is received:

{
  "type": "some_event",
  "created_ms": 123456,
  "payload":{
    "values": ["ONE", "TWO", "THREE"],
    "source": "host_01"
  }
}

Then the target of the foreach action is the array ["ONE", "TWO", "THREE"]; consequently, each one of the two inner actions is executed three times; the first time with item = “ONE”, then with item = “TWO” and, finally, with item = “THREE”.
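The expansion described above can be sketched in a few lines of Python. This is only an illustration, not Tornado's implementation: the helper names inject_item and expand_foreach are hypothetical, only the ${item} placeholder is handled, and the iteration order (all inner actions for one item before moving to the next item) is an assumption.

```python
def inject_item(value, item):
    """Return a copy of value with every ${item} placeholder replaced by item."""
    if isinstance(value, str):
        if value == "${item}":
            # A string that is exactly the placeholder keeps the item's type.
            return item
        return value.replace("${item}", str(item))
    if isinstance(value, list):
        return [inject_item(v, item) for v in value]
    if isinstance(value, dict):
        return {k: inject_item(v, item) for k, v in value.items()}
    return value


def expand_foreach(target, actions):
    """Yield one concrete action per (item, inner action) pair.

    Assumption: every inner action runs for an item before the next item.
    """
    for item in target:
        for action in actions:
            yield inject_item(action, item)


inner_actions = [
    {"id": "logger", "payload": {"value": "the value is ${item}"}},
    {"id": "archive", "payload": {"item_value": "${item}"}},
]
expanded = list(expand_foreach(["ONE", "TWO", "THREE"], inner_actions))
```

With the three-element target from the example, expanded contains six actions: two for each of the three items.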

Archive Executor

The Archive Executor writes the Events from the received Actions to a file.

Requirements and Limitations

The archive Executor can only write to locally mounted file systems. In addition, it needs read and write permissions on the folders and files specified in its configuration.

Configuration

The archive Executor has the following configuration options:

  • file_cache_size: The number of file descriptors to be cached. You can improve overall performance by keeping files from being continuously opened and closed at each write.

  • file_cache_ttl_secs: The Time To Live of a file descriptor. When this time reaches 0, the descriptor will be removed from the cache.

  • base_path: A directory on the file system where all logs are written. Based on their type, rule Actions received from the Matcher can be logged in subdirectories of the base_path. However, the archive Executor will only allow files to be written inside this folder.

  • default_path: A default path where all Actions that do not specify an archive_type in the payload are logged.

  • paths: A set of mappings from an archive_type to an archive_path, which is a subpath relative to the base_path. The archive_path can contain variables, specified by the syntax ${parameter_name}, which are replaced at runtime by the values in the Action’s payload.

The archive path serves to decouple the type from the actual subpath, allowing you to write Action rules without worrying about having to modify them if you later change the directory structure or destination paths.

As an example of how an archive_path is computed, suppose we have the following configuration:

base_path =  "/tmp"
default_path = "/default/out.log"
file_cache_size = 10
file_cache_ttl_secs = 1

[paths]
"type_one" = "/dir_one/file.log"
"type_two" = "/dir_two/${hostname}/file.log"

and these three incoming actions:

  1. action_one:

    {
        "id": "archive",
        "payload": {
            "archive_type": "type_one",
            "event": "__the_incoming_event__"
        }
    }
    
  2. action_two:

    {
        "id": "archive",
        "payload": {
            "archive_type": "type_two",
            "hostname": "net-test",
            "event": "__the_incoming_event__"
        }
    }
    
  3. action_three:

    {
        "id": "archive",
        "payload": {
            "event": "__the_incoming_event__"
        }
    }
    

then:

  • action_one will be archived in /tmp/dir_one/file.log

  • action_two will be archived in /tmp/dir_two/net-test/file.log

  • action_three will be archived in /tmp/default/out.log

The archive Executor expects an Action to include the following elements in the payload:

  1. An event: The Event to be archived should be included in the payload under the key event

  2. An archive type (optional): The archive type is specified in the payload under the key archive_type

When an archive_type is not specified, the default_path is used (as in action_three). Otherwise, the Executor will use the archive_path in the paths configuration corresponding to the archive_type key (action_one and action_two).

When an archive_type is specified but there is no corresponding key in the mappings under the paths configuration, or it is not possible to resolve all path parameters, then the Event will not be archived. Instead, the archiver will return an error.
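The path resolution rules above can be summarised with a short Python sketch. It is a simplified model under stated assumptions, not the Executor's actual code: the names resolve_archive_path and ArchiveError are hypothetical, and the configuration values are copied from the example above.

```python
import re

# Values taken from the example configuration above.
BASE_PATH = "/tmp"
DEFAULT_PATH = "/default/out.log"
PATHS = {
    "type_one": "/dir_one/file.log",
    "type_two": "/dir_two/${hostname}/file.log",
}


class ArchiveError(Exception):
    """Hypothetical error raised when an Event cannot be archived."""


def resolve_archive_path(payload):
    """Compute the absolute file path for an archive Action payload."""
    archive_type = payload.get("archive_type")
    if archive_type is None:
        relative = DEFAULT_PATH
    elif archive_type in PATHS:
        relative = PATHS[archive_type]
    else:
        raise ArchiveError(f"no path mapping for archive_type {archive_type!r}")

    def substitute(match):
        # Replace ${name} with the value found in the Action payload.
        name = match.group(1)
        if name not in payload:
            raise ArchiveError(f"cannot resolve path parameter {name!r}")
        return str(payload[name])

    return BASE_PATH + re.sub(r"\$\{([^}]+)\}", substitute, relative)
```

Applied to the payloads of action_one, action_two and action_three, the function returns the three paths listed above; an unknown archive_type or a missing path parameter raises the error instead of returning a path.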

The Event from the payload is written into the log file in JSON format, one event per line.

Elasticsearch Executor

The Elasticsearch Executor extracts data from a Tornado Action and sends it to Elasticsearch.

The Executor expects a Tornado Action that includes the following elements in its payload:

  1. An endpoint: The Elasticsearch endpoint which Tornado will call to create the Elasticsearch document

  2. An index: The name of the Elasticsearch index in which the document will be created

  3. A data: The content of the document that will be sent to Elasticsearch

  4. (optional) An auth: a method of authentication, see below

An example of a valid Tornado Action is a JSON document like this:

{
    "id": "elasticsearch",
    "payload": {
        "endpoint": "http://localhost:9200",
        "index": "tornado-example",
        "data": {
            "user" : "kimchy",
            "post_date" : "2009-11-15T14:12:12",
            "message" : "trying out Elasticsearch"
        }
    }
}

The Elasticsearch Executor will create a new document in the specified Elasticsearch index for each action executed; also the specified index will be created if it does not already exist.

In the above JSON document, no authentication is specified, therefore the default authentication method defined when the Executor was created is used. This method is saved in a Tornado configuration file (elasticsearch_executor.toml) and can be overridden for each Tornado Action, as described in the next section.

Elasticsearch authentication

When the Elasticsearch Executor is created, a default authentication method can be specified; it is used to authenticate to Elasticsearch unless the action specifies a different one. Conversely, if no default method is defined at creation time, each action that does not specify an authentication method will fail.

To use a specific authentication method, the action should include the auth field with one of the following authentication types: None or PemCertificatePath, as shown in the following examples.

  • None: the client connects to Elasticsearch without authentication

    Example:

    {
        "id": "elasticsearch",
        "payload": {
            "index": "tornado-example",
            "endpoint": "http://localhost:9200",
            "data": {
                "user": "myuser"
            },
            "auth": {
                "type": "None"
            }
        }
    }
    
  • PemCertificatePath: the client connects to Elasticsearch using the PEM certificates read from the local file system. When this method is used, the following information must be provided:

    • certificate_path: path to the public certificate accepted by Elasticsearch

    • private_key_path: path to the corresponding private key

    • ca_certificate_path: path to CA certificate needed to verify the identity of the Elasticsearch server

    Example:

    {
        "id": "elasticsearch",
        "payload": {
            "index": "tornado-example",
            "endpoint": "http://localhost:9200",
            "data": {
                "user": "myuser"
            },
            "auth": {
                "type": "PemCertificatePath",
                "certificate_path": "/path/to/tornado/conf/certs/tornado.crt.pem",
                "private_key_path": "/path/to/tornado/conf/certs/private/tornado.key.pem",
                "ca_certificate_path": "/path/to/tornado/conf/certs/root-ca.crt"
            }
        }
    }
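The precedence between the default authentication method and the per-action auth field can be sketched as follows. This is a minimal Python model of the rule described in this section; the function name effective_auth is hypothetical and does not correspond to a Tornado API.

```python
def effective_auth(action_payload, default_auth=None):
    """Pick the authentication settings for one Elasticsearch action.

    The auth field of the action wins; otherwise the Executor default is
    used; if neither is available the action fails.
    """
    auth = action_payload.get("auth", default_auth)
    if auth is None:
        raise ValueError("no auth in the action and no default method configured")
    return auth


default = {"type": "None"}
per_action = {
    "type": "PemCertificatePath",
    "certificate_path": "/path/to/tornado/conf/certs/tornado.crt.pem",
    "private_key_path": "/path/to/tornado/conf/certs/private/tornado.key.pem",
    "ca_certificate_path": "/path/to/tornado/conf/certs/root-ca.crt",
}
chosen = effective_auth({"index": "tornado-example", "auth": per_action}, default)
```

Note that {"type": "None"} is itself an explicit method (connect without authentication), which is different from not specifying any method at all.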
    

Icinga 2 Executor

The Icinga 2 Executor extracts data from a Tornado Action and prepares it to be sent to the Icinga 2 API.

In more detail, this Executor expects a Tornado Action to include the following elements in its payload:

  1. An icinga2_action_name: The Icinga 2 action to perform

  2. An icinga2_action_payload (optional): The parameters of the Icinga 2 action

The icinga2_action_name should match one of the existing Icinga 2 actions.

The icinga2_action_payload should contain at least all mandatory parameters expected by the specific Icinga 2 action.

An example of a valid Tornado Action is:

{
    "id": "icinga2",
    "payload": {
        "icinga2_action_name": "process-check-result",
        "icinga2_action_payload": {
            "exit_status": "${event.payload.exit_status}",
            "plugin_output": "${event.payload.plugin_output}",
            "filter": "host.name==\"example.localdomain\"",
            "type": "Host"
        }
    }
}
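As a rough illustration of how such an Action relates to an Icinga 2 API call, the sketch below maps the payload to an HTTP request description. It assumes that actions are exposed under /v1/actions/<action name> and invoked with POST, which matches the URL and method shown in the log excerpt of the Discarded Check Results section; the function name build_icinga2_request and the base URL are hypothetical, and no request is actually sent.

```python
import json


def build_icinga2_request(base_url, action_payload):
    """Return (method, url, body) for the Icinga 2 action in the payload."""
    name = action_payload["icinga2_action_name"]
    # icinga2_action_payload is optional, so default to an empty object.
    body = action_payload.get("icinga2_action_payload", {})
    url = f"{base_url.rstrip('/')}/v1/actions/{name}"
    return "POST", url, json.dumps(body)


method, url, body = build_icinga2_request(
    "https://icinga2-master.neteyelocal:5665",  # hypothetical base URL
    {
        "icinga2_action_name": "process-check-result",
        "icinga2_action_payload": {
            "exit_status": "2",
            "plugin_output": "Output message",
            "filter": 'host.name=="example.localdomain"',
            "type": "Host",
        },
    },
)
```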

Smart Monitoring Check Result Executor

The Smart Monitoring Check Result Executor allows you to perform an Icinga 2 process-check-result even when the Icinga 2 object for which you want to carry out that action does not exist. Moreover, this Executor ensures that no outdated process-check-result will overwrite newer check results already present in Icinga 2.

In case the underlying Icinga 2 objects do not exist in Icinga 2, the actions create_host or create_service are performed via the Director Executor.

Warning

The Smart Monitoring Check Result Executor requires the live-creation feature of the Icinga Director to be exposed in the REST API. If this is not the case, the actions of this Executor will always fail in case the Icinga Objects are not already present in Icinga 2.

Note, however, that an Icinga agent cannot be created live by the Smart Monitoring Check Result Executor: an agent always requires an endpoint defined in the configuration, and the Icinga API does not support live-creation of endpoints.

To ensure that outdated check results are not processed, the action process-check-result is carried out by the Icinga 2 Executor with the parameters execution_start and execution_end, which are either inherited from the Action definition or set equal to the value of the created_ms property of the originating Tornado Event. Section Discarded Check Results explains how the Executor handles these cases.

The Tornado Action sent to this Executor shall include the following elements in its payload:

  1. A check_result: The basic data to build the Icinga 2 process-check-result action payload

  2. A host: The data to build the payload which will be sent to the Icinga Director REST API for the host creation

  3. A service: The data to build the payload which will be sent to the Icinga Director REST API for the service creation (optional)

The check_result should contain all mandatory parameters expected by the Icinga API except the following ones that are automatically filled by the Executor:

  • host

  • service

  • type

The host and service should contain all mandatory parameters expected by the Icinga Director REST API to perform the creation of a host and/or a service, except object_type.

The service key is optional. When it is included in the action payload, the Executor will invoke the process-check-result call to set the status of a service; otherwise, it will set the status of a host.

The following valid Tornado Action sets the status of the service myhost|myservice:

{
  "id": "smart_monitoring_check_result",
  "payload": {
    "check_result": {
      "exit_status": "2",
      "plugin_output": "Output message"
    },
    "host": {
      "object_name": "myhost",
      "address": "127.0.0.1",
      "check_command": "hostalive",
      "vars": {
        "location": "Rome"
      }
    },
    "service": {
       "object_name": "myservice",
       "check_command": "ping"
    }
  }
}

By simply removing the service key, the same action will set the status of the host myhost:

{
  "id": "smart_monitoring_check_result",
  "payload": {
    "check_result": {
      "exit_status": "2",
      "plugin_output": "Output message"
    },
    "host": {
      "object_name": "myhost",
      "address": "127.0.0.1",
      "check_command": "hostalive",
      "vars": {
        "location": "Rome"
      }
    }
  }
}
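The way the Executor completes the check_result with host, service and type can be approximated with the following Python sketch. Several details are assumptions made for illustration only: the function name build_check_result is hypothetical, the host!service naming is modelled on the log excerpt in the Discarded Check Results section, and the conversion of created_ms from milliseconds to seconds is inferred from the values shown in that excerpt.

```python
def build_check_result(payload, created_ms):
    """Build a process-check-result payload from a smart monitoring Action."""
    host_name = payload["host"]["object_name"]
    result = dict(payload["check_result"])
    if "service" in payload:
        # With a service key, the status of the service is set.
        result["type"] = "Service"
        result["service"] = f"{host_name}!{payload['service']['object_name']}"
    else:
        # Without it, the status of the host is set.
        result["type"] = "Host"
        result["host"] = host_name
    # Assumption: created_ms is converted to seconds and used only when the
    # Action does not already define the execution times.
    seconds = created_ms / 1000
    result.setdefault("execution_start", seconds)
    result.setdefault("execution_end", seconds)
    return result


action_payload = {
    "check_result": {"exit_status": "2", "plugin_output": "Output message"},
    "host": {"object_name": "myhost", "address": "127.0.0.1"},
    "service": {"object_name": "myservice", "check_command": "ping"},
}
service_result = build_check_result(action_payload, created_ms=1651054222000)
```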

The flowchart in Fig. 158 (Flowchart of Monitoring Executor) helps to understand the behaviour of the Monitoring Executor in relation to the Icinga 2 and Icinga Director REST APIs.

Discarded Check Results

Some process-check-result actions may be discarded by Icinga 2 if more recent check results already exist for the target object. In this situation the Executor does not retry the Action, but simply logs an error containing the tag DISCARDED_PROCESS_CHECK_RESULT in the configured Tornado Logger.

The log message showing a discarded process-check-result will be similar to the following excerpt, enclosed in an ActionExecutionError:

SmartMonitoringExecutor - Process check result action failed with error ActionExecutionError {
  message: "Icinga2Executor - Icinga2 API returned an unrecoverable error. Response status: 500 Internal Server Error.
    Response body: {\"results\":[{\"code\":409.0,\"status\":\"Newer check result already present. Check result for 'my-host!my-service' was discarded.\"}]}",
  can_retry: false,
  code: None,
  data: {
    "payload":{"execution_end":1651054222.0,"execution_start":1651054222.0,"exit_status":0,"plugin_output":"Some process check result","service":"my-host!my-service","type":"Service"},
    "tags":["DISCARDED_PROCESS_CHECK_RESULT"],
    "url":"https://icinga2-master.neteyelocal:5665/v1/actions/process-check-result",
    "method":"POST"
  }
}.
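To find such entries in a log, you can search for the tag or for the message text. The Python sketch below extracts the names of the affected objects; it assumes the message wording shown in the excerpt above, and the function name discarded_objects is hypothetical.

```python
import re

TAG = "DISCARDED_PROCESS_CHECK_RESULT"
# Matches the status text returned by Icinga 2 in the excerpt above.
DISCARDED = re.compile(r"Check result for '([^']+)' was discarded")


def discarded_objects(log_text):
    """Return the object names whose check results were discarded."""
    if TAG not in log_text:
        return []
    return DISCARDED.findall(log_text)


sample = (
    "Newer check result already present. "
    "Check result for 'my-host!my-service' was discarded. "
    '"tags":["DISCARDED_PROCESS_CHECK_RESULT"]'
)
names = discarded_objects(sample)
```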

Common Logger

The tornado_common_logger crate contains the logger configuration for the Tornado components.

The configuration is based on three entries:

  • level: A list of comma separated logger verbosity levels. Valid values for a level are: trace, debug, info, warn, and error. If only one level is provided, this is used as global logger level. Otherwise, a list of per package levels can be used. E.g.:

    • level=info: the global logger level is set to info

    • level=warn,tornado=debug: the global logger level is set to warn, the tornado package logger level is set to debug

  • stdout-output: A boolean value that determines whether the Logger should print to standard output. Valid values are true and false.

  • file-output-path: An optional string that defines a file path in the file system. If provided, the Logger will append any output to that file.
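The format of the level entry can be illustrated with a small parser. This Python sketch only models the syntax described above (one global level plus optional package=level pairs); it is not the parser used by tornado_common_logger, and the name parse_level is hypothetical.

```python
VALID_LEVELS = {"trace", "debug", "info", "warn", "error"}


def parse_level(spec):
    """Split a level string into (global_level, per_package_levels)."""
    global_level = None
    per_package = {}
    for entry in spec.split(","):
        entry = entry.strip()
        if "=" in entry:
            package, level = entry.split("=", 1)
        else:
            package, level = None, entry
        if level not in VALID_LEVELS:
            raise ValueError(f"invalid logger level: {level!r}")
        if package is None:
            global_level = level
        else:
            per_package[package] = level
    return global_level, per_package


parsed = parse_level("warn,tornado=debug")
```

For the second example above, parsed holds warn as the global level and debug for the tornado package.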

The configuration subsection logger.tracing_elastic_apm allows you to configure the connection to Elastic APM for the tracing functionality. The following entries can be configured:

  • apm_output: Whether the Logger data should be sent to the Elastic APM Server. Valid values are true and false.

  • apm_server_url: The URL of the Elastic APM Server.

  • apm_server_api_credentials.id: (Optional) the ID of the API Key for authenticating to the Elastic APM server.

  • apm_server_api_credentials.key: (Optional) the key of the API Key for authenticating to the Elastic APM server. If apm_server_api_credentials.id and apm_server_api_credentials.key are not provided, they will be read from the file <config_dir>/apm_server_api_credentials.json.

  • exporter.max_queue_size: (Optional) The maximum queue size of the tracing batch exporter to buffer spans for delayed processing. Defaults to 65536.

  • exporter.scheduled_delay_ms: The delay interval in milliseconds between two consecutive exports of batches. Defaults to 5000 (5 seconds).

  • exporter.max_export_batch_size: The maximum number of spans to export in a single batch. Defaults to 512.

  • exporter.max_export_timeout_ms: The time (in milliseconds) for which the export can run before it is cancelled. Defaults to 30000 (30 seconds).

In Tornado executables, the Logger configuration is usually defined with command line parameters managed by structopt. In that case, the default level is set to warn, stdout-output is disabled and the file-output-path is empty.

For example:

./tornado --level=info --stdout-output --file-output-path=/tornado/log

Advanced Configuration

The following configuration cases build on the basic Tornado configuration and allow you to customize how Tornado behaves within your NetEye installation.

Thread Pool Configuration

Even if the default configuration should suit most of the use cases, in some particular situations it could be useful to customise the size of the internal queues used by Tornado. Tornado utilizes these queues to process incoming events and to dispatch triggered actions.

Tornado uses a dedicated thread pool per queue; the size of each queue is by default equal to the number of available logical CPUs. Consequently, in the case of an action of type script, for example, Tornado will be able to run in parallel at most as many scripts as the number of CPUs.

This default behaviour can be overridden by providing a custom configuration for the thread pool size. This is achieved through the optional thread_pool_config entry in the tornado.daemon section of the Tornado.toml configuration file.

Example of Thread Pool’s Dynamic Configuration

[tornado.daemon]
thread_pool_config = {type = "CPU", factor = 1.0}

In this case, the size of the thread pool is the number of available logical CPUs multiplied by factor, rounded up to the nearest integer. If the resulting value is less than 1, then 1 will be used by default.

For example, if there are 16 available CPUs, then:

  • {type = "CPU", factor = 0.5} => thread pool size is 8

  • {type = "CPU", factor = 2.0} => thread pool size is 32

Example of Thread Pool’s Static Configuration

[tornado.daemon]
thread_pool_config = {type = "Fixed", size = 20}

In this case, the size of the thread pool is statically fixed at 20. If the provided size is less than 1, then 1 will be used by default.
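The two sizing rules can be checked with a short Python sketch. It mirrors the description above and is not Tornado's code: the function name thread_pool_size is hypothetical, and the configuration is passed as a plain dictionary with the same keys as the TOML entry.

```python
import math
import os


def thread_pool_size(config, cpus=None):
    """Compute the pool size for a thread_pool_config-like dictionary."""
    if cpus is None:
        # Number of logical CPUs; fall back to 1 if it cannot be determined.
        cpus = os.cpu_count() or 1
    if config["type"] == "CPU":
        # Round up to the nearest integer.
        size = math.ceil(cpus * config["factor"])
    elif config["type"] == "Fixed":
        size = config["size"]
    else:
        raise ValueError(f"unknown thread pool type: {config['type']!r}")
    # A value below 1 is replaced by 1.
    return max(1, size)


half = thread_pool_size({"type": "CPU", "factor": 0.5}, cpus=16)    # 8
double = thread_pool_size({"type": "CPU", "factor": 2.0}, cpus=16)  # 32
fixed = thread_pool_size({"type": "Fixed", "size": 20})             # 20
```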

Retry Strategy Configuration

Tornado allows the configuration of a global retry strategy to be applied when the execution of an Action fails.

A retry strategy is composed of:

  • retry policy: the policy that defines whether an action execution should be retried after an execution failure;

  • backoff policy: the policy that defines the sleep time between retries.

Valid values for the retry policy are:

  • {type = "MaxRetries", retries = 5} => A predefined maximum number of retry attempts. This is the default policy, with retries set to 20.

  • {type = "None"} => No retries are performed.

  • {type = "Infinite"} => The operation will be retried an infinite number of times. This setting must be used with extreme caution as it could fill the entire memory buffer preventing Tornado from processing incoming events.

Valid values for the backoff policy are:

  • {type = "Exponential", ms = 1000, multiplier = 2 }: It increases the back off period for each retry attempt in a given set using the exponential function. The period to sleep on the first backoff is the ms; the multiplier is instead used to calculate the next backoff interval from the last. This is the default configuration.

  • {type = "None"}: No sleep time between retries.

  • {type = "Fixed", ms = 1000 }: A fixed amount of milliseconds to sleep between each retry attempt.

  • {type = "Variable", ms = [1000, 5000, 10000]}: The amount of milliseconds between two consecutive retry attempts.

    The time to wait after ‘i’ retries is specified in the vector at position ‘i’.

    If the number of retries is bigger than the vector length, then the last value in the vector is used. For example:

    ms = [111,222,333] -> It waits 111 ms after the first failure, 222 ms after the second failure and then 333 ms for all following failures.
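The retry and backoff policies above can be modelled with the following Python sketch. It is a simplified illustration, not Tornado's implementation: the function names should_retry and backoff_ms are hypothetical, retries are counted from zero, and the Exponential formula (ms multiplied by the multiplier once per previous retry) is an interpretation of the description above.

```python
def should_retry(policy, retries_done):
    """Decide whether another retry is allowed after a failure."""
    kind = policy["type"]
    if kind == "None":
        return False
    if kind == "Infinite":
        return True
    if kind == "MaxRetries":
        return retries_done < policy["retries"]
    raise ValueError(f"unknown retry policy: {kind!r}")


def backoff_ms(policy, retry_index):
    """Milliseconds to sleep before the retry with the given 0-based index."""
    kind = policy["type"]
    if kind == "None":
        return 0
    if kind == "Fixed":
        return policy["ms"]
    if kind == "Variable":
        values = policy["ms"]
        # Past the end of the vector, the last value is reused.
        return values[min(retry_index, len(values) - 1)]
    if kind == "Exponential":
        return policy["ms"] * policy["multiplier"] ** retry_index
    raise ValueError(f"unknown backoff policy: {kind!r}")


variable = {"type": "Variable", "ms": [111, 222, 333]}
waits = [backoff_ms(variable, i) for i in range(4)]  # [111, 222, 333, 333]
```

Under the same assumptions, an Exponential policy with ms = 1000 and multiplier = 2 waits 1000, 2000 and 4000 ms before the first three retries.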

Example of a complete Retry Strategy configuration

[tornado.daemon]
retry_strategy.retry_policy = {type = "Infinite"}
retry_strategy.backoff_policy = {type = "Variable", ms = [1000, 5000, 10000]}

When not provided explicitly, the following default Retry Strategy is used:

[tornado.daemon]
retry_strategy.retry_policy = {type = "MaxRetries", retries = 20}
retry_strategy.backoff_policy = {type = "Exponential", ms = 1000, multiplier = 2 }