User Guide

Concepts

Architecture

General Elasticsearch Cluster Information

To avoid the excessive, unnecessary network traffic generated when the cluster reallocates shards across cluster nodes after an Elasticsearch instance is restarted, NetEye employs systemd post-start and pre-stop scripts that automatically enable and disable shard allocation on the current node whenever the Elasticsearch service is started or stopped by systemctl.

Details on how shard allocation works can be found here.
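
For illustration, the following Python sketch shows the kind of cluster-settings calls such hook scripts can perform. This is a simplified assumption for clarity only: the actual NetEye scripts are shipped with the product and may work differently, and the certificate path below is a placeholder.

# Illustrative sketch only: the real NetEye post-start/pre-stop scripts may differ.
import requests

ES_URL = "https://elasticsearch.neteyelocal:9200"   # endpoint exposed by NetEye
CA_BUNDLE = "root-ca.crt"                           # placeholder path to the CA certificate

def set_shard_allocation(enable: bool) -> None:
    # Setting the value to None (null) restores the default ("all");
    # "primaries" limits allocation before a node is stopped for maintenance.
    value = None if enable else "primaries"
    body = {"persistent": {"cluster.routing.allocation.enable": value}}
    resp = requests.put(f"{ES_URL}/_cluster/settings", json=body, verify=CA_BUNDLE)
    resp.raise_for_status()

# pre-stop hook:   set_shard_allocation(False)
# post-start hook: set_shard_allocation(True)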

Note

When a stopped Elasticsearch instance is started, shard allocation is re-enabled globally for the entire cluster. Therefore, if you have more than one Elasticsearch instance down, shards will be reallocated in order to prevent data loss.

Therefore best practice is to:
  • Never keep an Elasticsearch instance stopped on purpose. Stop it only for maintenance reasons (e.g. for restarting the server) and start it up again as soon as possible.

  • Restart or stop/start one Elasticsearch node at a time. If something bad happens and multiple Elasticsearch nodes go down, then start them all up again together.

Elastic Only Nodes

Starting from NetEye 4.9, it is possible to install Elastic-only nodes in order to improve Elasticsearch performance without a full NetEye installation.

To create an Elastic-only node, add an entry of type ElasticOnlyNodes to the file /etc/neteye-cluster, as in the following example. The syntax is the same as the one used for a standard Node:

{ "ElasticOnlyNodes": [
             {
          "addr" : "192.168.47.3",
          "hostname" : "neteye03.neteyelocal",
          "hostname_ext" : "rdneteye03.si.wp.lan"
       }
    ]
}

Voting Only Nodes

Starting from NetEye 4.16, it is possible to install a Voting-only node, that is, a node dedicated exclusively to providing quorum. If the SIEM module is installed, this node also provides voting-only functionality to the Elasticsearch cluster.

This functionality is achieved by configuring the node as a voting-only master-eligible node, specifying the variable ES_NODE_ROLES="master, voting_only" in the sysconfig file /neteye/local/elasticsearch/conf/sysconfig/elasticsearch-voting-only.

The Voting-only node is defined in /etc/neteye-cluster as in the following example:

{ "VotingOnlyNode": {
         "addr" : "192.168.47.3",
         "hostname" : "neteye03.neteyelocal",
         "hostname_ext" : "rdneteye03.si.wp.lan",
         "id" : 3
      }
}

Please note that VotingOnlyNode is a JSON object and not an array, because a NetEye cluster can have only a single Voting-only node.

Elasticsearch Clusters

Design and Configuration

With NetEye 4 we recommend that you use at least 3 nodes to form an Elasticsearch cluster. If nevertheless you decide you need a 2-node cluster, our next recommendation is that you talk with a Würth Phoenix NetEye Solution Architect who can fully explain the risks in your specific environment and help you develop strategies that can mitigate those risks.

The Elasticsearch coordination subsystem is in charge of choosing which nodes can form a quorum (note that all NetEye cluster nodes are master-eligible by default). If Log Manager is installed, the neteye_secure_install script will properly set seed_hosts and initial_master_nodes according to Elasticsearch’s recommendations, and no manual intervention is required.

neteye_secure_install will set two options to configure cluster discovery:

discovery.seed_hosts: ["host1", "host2", "host3"]
cluster.initial_master_nodes: ["node1"]

Please note that the value for initial_master_nodes will be set only on the first installed node of the cluster (it is optional on the other nodes and, if set, it must be the same for all nodes in the cluster). The seed_hosts option will be set on all cluster nodes, including Elastic Only nodes, and will have the same value on all nodes.

Elasticsearch reverse proxy

Starting with NetEye 4.13, NGINX has been added to NetEye. NGINX acts as a reverse proxy, exposing a single endpoint and acting as a load balancer that distributes incoming requests across all nodes and, in this case, across all Elasticsearch instances. This solution improves the overall performance and reliability of the cluster.

The Elasticsearch endpoint is reachable at the URI https://elasticsearch.neteyelocal:9200/. Please note that this is the same port used before, so no additional change is required; the old certificates used for Elasticsearch are still valid with the new configuration.

All connected Elastic Stack services, such as Kibana, Logstash and Filebeat, have been updated to reflect this improvement and to take advantage of the new load-balancing feature.
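
For illustration, a minimal Python sketch that checks the cluster health through this single endpoint; the certificate paths are placeholders and must be adapted to your installation.

# Minimal sketch: any Elasticsearch client keeps pointing at the single
# NGINX-exposed endpoint; certificate paths below are placeholders.
import requests

resp = requests.get(
    "https://elasticsearch.neteyelocal:9200/_cluster/health",
    cert=("client.crt.pem", "client.key.pem"),  # placeholder client certificate/key
    verify="root-ca.crt",                       # placeholder CA bundle
)
resp.raise_for_status()
print(resp.json()["status"])   # e.g. "green"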

Elasticsearch Only Nodes

Large NetEye installations, especially those running the SIEM feature module, will often have multi-node clusters where some nodes run only Elasticsearch. An Elasticsearch Only node can be set up with a procedure similar to that used for NetEye nodes.

An Elasticsearch Only node does not belong to the RedHat cluster; it runs a local master-eligible Elasticsearch service, connected to all other Elasticsearch nodes.

Elastic Blockchain Proxy

The elastic-blockchain-proxy allows secure live signing of log streams sent from Logstash to Elasticsearch.

It provides protection against data tampering by transforming an input stream of plain logs into a secured blockchain where each log is cryptographically signed.

Architecture

From a high-level point of view, the architecture has three main components:

  1. The first component is Logstash, which collects logs from various sources and sends them to the elastic-blockchain-proxy using the json_batch format of Elastic’s http-output plugin.

    Note

    Since the elastic-blockchain-proxy does not provide persistence, Logstash should always be configured to take care of the persistence of the involved log pipelines.

  2. The second component is elastic-blockchain-proxy itself, which receives batches of logs from Logstash, signs every log with a cryptographic key used only once, and, finally, forwards the signed logs to the Elasticsearch Bulk API;

  3. The third component is Elasticsearch, which acquires the signed logs from elastic-blockchain-proxy and persists them on the dedicated index.

How the Elastic Blockchain Proxy works

REST Endpoints

The elastic-blockchain-proxy receives logs from two REST endpoints:

  • log endpoint:

    • description: Receives and processes a single log message in JSON format

    • path : /api/log

    • request type: JSON

    • request example:

      {
         "entry": "A log message"
      }
      
    • response: The HTTP status code 200 signals successful processing. Other HTTP status codes indicate a failure.

  • log_batch endpoint:

    • description: Receives and processes an array of log messages in JSON format

    • path : /api/log_batch

    • request type: JSON

    • request example:

      [
         {
            "entry": "A log message",
            "other": "Additional values...",
            "EBP_METADATA": {
               "agent": {
                  "type": "auditbeat",
                  "version": "7.10.1"
               },
               "customer": "neteye",
               "event": {
                  "module": "elproxysigned"
               }
            }
         },
         {
            "entry1": "Another log message",
            "entry2": "Another log message",
            "EBP_METADATA": {
               "agent": {
                  "type": "auditbeat",
                  "version": "7.10.1"
               },
               "customer": "neteye",
               "event": {
                  "module": "elproxysigned"
               }
            }
         },
         {
            "entry": "Again, another message",
            "EBP_METADATA": {
               "agent": {
                  "type": "auditbeat",
                  "version": "7.10.1"
               },
               "customer": "neteye",
               "event": {
                  "module": "elproxysigned"
               }
            }
         }
      ]
      
    • response: The HTTP status code 200 signals successful processing. Other HTTP status codes indicate a failure.
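
For illustration, a minimal Python sketch that submits a single log to the log endpoint. In a standard NetEye setup Logstash is the component calling these endpoints, and the host, port and TLS settings below are assumptions.

# Minimal sketch: submitting one log to the /api/log endpoint.
# Host, port and TLS settings are assumptions; in the standard setup
# Logstash (http output, json_batch format) talks to these endpoints.
import requests

EBP_URL = "https://localhost:8080"   # assumption: adjust to your installation

resp = requests.post(f"{EBP_URL}/api/log", json={"entry": "A log message"}, verify=False)
if resp.status_code == 200:
    print("log accepted and signed")
else:
    print(f"processing failed: HTTP {resp.status_code}")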

Log Signature Flow - Generation of Signature Keys

The elastic-blockchain-proxy achieves secure logging by authentically encrypting each log record with an individual cryptographic key used only once, and it protects the integrity of the whole log archive with a cryptographic authentication code.

The encryption key is saved on the filesystem in the {data_dir}/key.json file. This file has the form:

{
  "key": "initial_key",
  "iteration": 0
}

Where:

  • key: the encryption key to be used to sign the next incoming log

  • iteration: the iteration number for which the signature key is valid

Every time a log is signed, a new key/iteration pair is generated from the previous one. The new pair has the following values:

  • key equal to the SHA256 hash of the previous key

  • iteration equal to the previous iteration incremented by one

For example, if the key at iteration 10 is:

{
  "key": "abcdefghilmno",
  "iteration": 10
}

the next key will be:

{
  "key": "d1bf0c925ec44e073f18df0d70857be56578f43f6c150f119e931e85a3ae5cb4",
  "iteration": 11
}

This mechanism creates a blockchain of keys that cannot be altered without breaking the validity of the entire chain.
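
A minimal Python sketch of this key-rotation step follows. It is illustrative only: the exact encoding of the key that is hashed is an assumption, so the resulting digest may differ from the example above.

# Illustrative key rotation: the new key is the SHA256 hash of the previous
# key and the iteration counter grows by one. The exact key encoding used by
# elastic-blockchain-proxy is an assumption here.
import hashlib

def next_key(current: dict) -> dict:
    return {
        "key": hashlib.sha256(current["key"].encode()).hexdigest(),
        "iteration": current["iteration"] + 1,
    }

print(next_key({"key": "abcdefghilmno", "iteration": 10}))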

In addition, every time a set of logs is successfully sent to Elasticsearch, the {data_dir}/key.json file is updated with the new value. Consequently, every reference to the previous keys is removed from the system, making it impossible to recover and reuse old keys.

However, in case of Elasticsearch errors, the elastic-blockchain-proxy will reply with an error message to Logstash and will reuse the keys of the failed call for the next incoming logs.

Note

To be valid, the iteration values of signed logs in Elasticsearch should be incremental with no missing or duplicated values.

When the first log is received after startup, the elastic-blockchain-proxy queries Elasticsearch for the last indexed log iteration value in order to determine the correct iteration number for the next log. If the last log iteration value returned by Elasticsearch is greater than the value stored in the {data_dir}/key.json, the elastic-blockchain-proxy will fail to start.
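
Conceptually, this startup check can be sketched as follows. The index pattern, field name and query shape used here are assumptions for illustration; the real check is performed internally by the elastic-blockchain-proxy.

# Conceptual sketch of the startup consistency check: the highest indexed
# iteration must not be greater than the iteration stored in key.json.
# Index pattern, field name and query shape are assumptions.
import json
import requests

def check_iteration(es_url: str, key_file: str) -> None:
    with open(key_file) as f:
        local_iteration = json.load(f)["iteration"]

    query = {"size": 0, "aggs": {"last": {"max": {"field": "ES_BLOCKCHAIN.iteration"}}}}
    resp = requests.get(f"{es_url}/*-elproxysigned-*/_search", json=query, verify=False)
    last_indexed = resp.json()["aggregations"]["last"]["value"] or 0

    if last_indexed > local_iteration:
        raise RuntimeError("Indexed iteration is ahead of key.json: refusing to start")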

Log Signature Flow - How signature keys are used

For each incoming log, the elastic-blockchain-proxy retrieves the first available encryption key, as described in the previous section, and then uses it to calculate the HMAC-SHA256 hash of the log.

The calculation of the HMAC hash takes into account:

  • the log itself as received from Logstash

  • the iteration number

  • the timestamp

  • the hash of the previous log

At this point, the signed log is a simple JSON object composed of the following fields:

  • All fields of the original log : all fields from the original log message

  • ES_BLOCKCHAIN: an object containing all the values calculated by the elastic-blockchain-proxy. They are:
    • fields: the fields of the original log used by the signature process

    • hash: the HMAC hash calculated as described above

    • previous_hash: the HMAC hash of the previous log message

    • iteration: the iteration number of the signature key

    • timestamp_ms: the signature epoch timestamp in milliseconds

For example, given this key:

{
  "key": "d1bf0c925ec44e073f18df0d70857be56578f43f6c150f119e931e85a3ae5cb4",
  "iteration": 11
}

when this log is received:

{
   "value": "A log message",
   "origin": "linux-apache2",
   "EBP_METADATA": {
      "agent": {
         "type": "auditbeat",
         "version": "7.10.1"
      },
      "customer": "neteye",
      "event": {
         "module": "elproxysigned"
      }
   }
}

then this signed log will be generated:

{
   "value": "A log message",
   "origin": "linux-apache2",
   "EBP_METADATA": {
      "agent": {
         "type": "auditbeat",
         "version": "7.10.1"
      },
      "customer": "neteye",
      "event": {
         "module": "elproxysigned"
      }
   },
   "ES_BLOCKCHAIN": {
      "fields": {
         "value": "A log message",
         "origin": "linux-apache2"
      },
      "hash": "HASH_OF_THE_CURRENT_LOG",
      "previous_hash": "HASH_OF_THE_PREVIOUS_LOG",
      "iteration": 11,
      "timestamp_ms": 123456789
    }
}
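
A simplified Python sketch of this signing step follows. The exact serialization of the fields, timestamp and previous hash that are fed into the HMAC is an assumption and may differ from the real implementation.

# Simplified signing sketch: HMAC-SHA256 over the original log fields, the
# iteration number, the timestamp and the previous log's hash. The exact
# serialization used by elastic-blockchain-proxy is an assumption.
import hashlib
import hmac
import json
import time

def sign_log(log: dict, key: dict, previous_hash: str) -> dict:
    fields = {k: v for k, v in log.items() if k != "EBP_METADATA"}
    timestamp_ms = int(time.time() * 1000)
    payload = json.dumps(
        {"fields": fields, "iteration": key["iteration"],
         "timestamp_ms": timestamp_ms, "previous_hash": previous_hash},
        sort_keys=True,
    ).encode()
    digest = hmac.new(key["key"].encode(), payload, hashlib.sha256).hexdigest()

    signed = dict(log)
    signed["ES_BLOCKCHAIN"] = {
        "fields": fields,
        "hash": digest,
        "previous_hash": previous_hash,
        "iteration": key["iteration"],
        "timestamp_ms": timestamp_ms,
    }
    return signed
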
Log Signature Flow - How the index name is determined

The name of the Elasticsearch index for the signed logs is determined by the content of the EBP_METADATA field of the incoming log.

The index name has the following structure:

{EBP_METADATA.agent.type}-{EBP_METADATA.agent.version}-{EBP_METADATA.event.module}-{EBP_METADATA.customer}-YYYY.MM.DD

The following rules and constraints apply:

  • The EBP_METADATA.agent.type and the EBP_METADATA.customer fields are mandatory

  • The YYYY.MM.DD part of the index name is based on the epoch timestamp of the signature

  • If the {EBP_METADATA.event.module} is not present, EBP will use elproxysigned by default

For example, given that this log is received on 23 March 2021:

{
   "value": "A log message",
   "origin": "linux-apache2",
   "EBP_METADATA": {
      "agent": {
         "type": "auditbeat",
         "version": "7.10.1"
      },
      "customer": "neteye",
      "event": {
         "module": "elproxysigned"
      }
   }
}

Then the inferred index name is: auditbeat-7.10.1-elproxysigned-neteye-2021.03.23
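
A small Python sketch of this naming rule follows; it is illustrative only, and validation of the mandatory fields is omitted.

# Illustrative sketch of the index-name rule; mandatory-field validation
# (agent.type and customer) is omitted for brevity.
from datetime import datetime, timezone

def index_name(log: dict, signature_epoch_ms: int) -> str:
    meta = log["EBP_METADATA"]
    module = meta.get("event", {}).get("module", "elproxysigned")
    day = datetime.fromtimestamp(signature_epoch_ms / 1000, tz=timezone.utc).strftime("%Y.%m.%d")
    return f'{meta["agent"]["type"]}-{meta["agent"]["version"]}-{module}-{meta["customer"]}-{day}'

# For the example log above, signed on 2021-03-23, this returns:
#   auditbeat-7.10.1-elproxysigned-neteye-2021.03.23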

As a consequence of the default values and of the default Logstash configuration, most of the indices created by EBP will have elproxysigned in their name. Consequently, special care should be taken when manipulating those indices and documents; in particular, the user must not manually delete or rename *-elproxysigned-* indices, nor alter the content of the ES_BLOCKCHAIN or EBP_METADATA fields, as any change could lead to a broken blockchain.

Sequential logs processing

An important aspect to bear in mind is that log requests for the same blockchain are always processed sequentially by the elastic-blockchain-proxy. This means that, when a batch of logs is received from Logstash, it is queued in an in-memory queue and processed only when all the previously received requests have been completed.

This behavior is required to ensure that the blockchain remains coherent, with no holes in the iteration sequence.

Nevertheless, since no parallel processing is possible for a single blockchain, this puts a hard limit on the maximum reachable throughput.
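
Conceptually, this can be pictured as a single worker draining an in-memory queue one batch at a time, as in the following Python sketch; the real proxy is a different implementation, but the ordering guarantee is the same.

# Conceptual sketch: one worker per blockchain consumes batches strictly in
# arrival order, so a batch is signed and indexed only after all previously
# queued batches have been completed.
import queue
import threading

batches: "queue.Queue[list]" = queue.Queue()

def sign_and_forward(batch: list) -> None:
    """Placeholder for the sign-and-bulk-index step."""

def worker() -> None:
    while True:
        batch = batches.get()     # blocks until the next batch is available
        sign_and_forward(batch)   # processed sequentially, never in parallel
        batches.task_done()

threading.Thread(target=worker, daemon=True).start()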

Error Handling

The elastic-blockchain-proxy implements an optional retry strategy to handle communication errors with Elasticsearch; when enabled (see the Configuration section), whenever an error is returned by Elasticsearch, the elastic-blockchain-proxy retries a fixed number of times to resubmit the same signed_logs to Elasticsearch.

This strategy makes it possible to deal, for example, with temporary networking issues without forcing Logstash to resubmit the logs for a new from-scratch processing.

Nevertheless, while this can be useful in a large set of use cases, it should be used very carefully. In fact, due to the completely sequential nature of the blockchain processing, too high a number of retries could lead to an ever-growing queue of logs waiting to be processed while the elastic-blockchain-proxy is busy processing the same failed logs over and over again.

In conclusion, whether it is better to let the elastic-blockchain-proxy fail fast or retry more times is a decision that requires a careful, case-by-case analysis.
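
As an illustration of this trade-off, a bounded retry loop looks roughly like the following sketch; the actual retry behaviour and its settings are described in the Configuration section.

# Rough sketch of a bounded retry: resubmit the same signed batch a fixed
# number of times before giving up and reporting the error back to Logstash.
import time

def submit_with_retries(send_bulk, signed_logs: list, max_retries: int, delay_s: float) -> bool:
    for attempt in range(max_retries + 1):
        try:
            send_bulk(signed_logs)   # e.g. a call to the Elasticsearch Bulk API
            return True
        except Exception:
            if attempt == max_retries:
                return False         # caller replies with an error to Logstash
            time.sleep(delay_s)      # wait before retrying; the queue keeps growing meanwhile
    return False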

Agents

The Elastic Beat feature

A Beat is a small, self-contained agent installed on devices within an infrastructure (mostly servers and workstations) that acts as a client to send data to a centralised server where they are processed in a suitable way.

Beats are part of the Elastic Stack; they gather data and can send them to either Logstash or Elasticsearch, to be visualised on a Kibana dashboard.

There are different types of Beat agents available, each tailored for a different type of data. NetEye currently uses the Filebeat NetFlow Module for internal use. Additional information about the Beat feature can be found in the official documentation.

NetEye can receive data from Beats installed on monitored hosts (i.e., on the clients). The remainder of this section shows first how NetEye is configured to receive data from Beats, i.e., as a receiving point for data sent by Beats, then explains how to install and configure Beats on clients, using SSL certificates to protect the communication.

Overview of NetEye’s Beat infrastructure setup

Beats are part of the SIEM module, an additional module that can be installed following the directions in the Feature Modules Installation section, provided you have the subscription.

Warning

Beats are intended as a replacement for Safed, even though the two can coexist. However, since both Beats and Safed might process the same data, running both would double the time and resources required; it is therefore suggested to activate only one of them.

The NetEye implementation allows Logstash to listen for incoming data on a secured TCP port (5044). Logstash then sends the data into two flows:

  • to a file on disk, in the /neteye/shared/rsyslog/data folder, with the following name: %{[agent][hostname]}/%{+YYYY}/%{+MM}/%{+dd}/[LS]%{[host][hostname]}.log. The format of the file is the same as the one used for Safed files. As with Safed, this file is encrypted, its integrity is validated, and it is written to disk to preserve its inalterability.

  • to Elastic, to be displayed in preconfigured Kibana dashboards.

Communication is SSL-protected, and certificates need to be installed on clients together with the agents; see the next section for more information.

Note

When the module is installed, there is no data flow until agents are installed on the clients to be monitored. Indeed, the deployment on NetEye consists only of setting up the listening infrastructure.

The Beat feature is currently a CLI-only feature: no GUI is available and the configuration should be done by editing configuration files.