Concepts¶
Architecture¶
General Elasticsearch Cluster Information¶
To avoid the unnecessary network traffic generated when the cluster reallocates shards across nodes after an Elasticsearch instance is restarted, NetEye employs systemd post-start and pre-stop scripts that automatically enable and disable shard allocation on the current node whenever the Elasticsearch service is started or stopped via systemctl.
Details on how shard allocation works can be found here.
Note
Starting a stopped Elasticsearch instance re-enables shard allocation globally for the entire cluster. Therefore, if more than one Elasticsearch instance is down, shards will be reallocated in order to prevent data loss.
Therefore, the best practice is to:
- Never keep an Elasticsearch instance stopped on purpose. Stop it only for maintenance reasons (e.g. to restart the server) and start it again as soon as possible.
- Restart or stop/start one Elasticsearch node at a time. If something goes wrong and multiple Elasticsearch nodes go down, start them all up again together.
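For reference, the following is a minimal Python sketch, not taken from the NetEye scripts, of the kind of cluster settings call this mechanism relies on; the credentials and certificate path are placeholders.

# Illustrative only: toggle cluster-wide shard allocation via the
# Elasticsearch cluster settings API, the same setting that the NetEye
# post-start/pre-stop scripts manage automatically.
import requests

ES_URL = "https://elasticsearch.neteyelocal:9200"

def set_shard_allocation(enabled: bool) -> None:
    """Allow all allocations, or restrict allocation to primary shards only."""
    value = "all" if enabled else "primaries"
    resp = requests.put(
        f"{ES_URL}/_cluster/settings",
        json={"persistent": {"cluster.routing.allocation.enable": value}},
        auth=("admin", "secret"),        # placeholder credentials
        verify="/path/to/root-ca.crt",   # placeholder CA certificate
    )
    resp.raise_for_status()

# Before stopping a node for maintenance:
#   set_shard_allocation(False)
# Once the node has rejoined the cluster:
#   set_shard_allocation(True)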
Elastic Only Nodes¶
Starting with NetEye 4.9 it is possible to install Elastic-only nodes in order to improve Elasticsearch performance without a full NetEye installation.
To create an Elastic-only node you have to add an entry of type ElasticOnlyNodes to the file /etc/neteye-cluster, as in the following example. The syntax is the same as the one used for a standard Node.
{ "ElasticOnlyNodes": [
{
"addr" : "192.168.47.3",
"hostname" : "neteye03.neteyelocal",
"hostname_ext" : "rdneteye03.si.wp.lan"
}
]
}
Voting Only Nodes¶
Starting with NetEye 4.16 it is possible to install Voting-only nodes, i.e. nodes dedicated exclusively to providing quorum. If the SIEM module is installed, this node also provides voting-only functionality to the Elasticsearch cluster.
This functionality is achieved by configuring the node as a voting-only master-eligible node, i.e. by setting the variable ES_NODE_ROLES="master, voting_only" in the sysconfig file /neteye/local/elasticsearch/conf/sysconfig/elasticsearch-voting-only.
The Voting-only node is defined in /etc/neteye-cluster as in the following example:
{ "VotingOnlyNode": {
"addr" : "192.168.47.3",
"hostname" : "neteye03.neteyelocal",
"hostname_ext" : "rdneteye03.si.wp.lan",
"id" : 3
}
}
Please note that VotingOnlyNode is a JSON object and not an array, because a NetEye cluster can have only a single Voting-only node.
Elasticsearch Clusters¶
Design and Configuration¶
With NetEye 4 we recommend that you use at least 3 nodes to form an Elasticsearch cluster. If nevertheless you decide you need a 2-node cluster, our next recommendation is that you talk with a Würth Phoenix NetEye Solution Architect who can fully explain the risks in your specific environment and help you develop strategies that can mitigate those risks.
The Elasticsearch coordination subsystem is in charge of choosing which nodes can form a quorum (note that all NetEye cluster nodes are master-eligible by default). If Log Manager is installed, the neteye_secure_install script will properly set seed_hosts and initial_master_nodes according to Elasticsearch's recommendations, and no manual intervention is required.
neteye_secure_install will set two options to configure cluster discovery:
discovery.seed_hosts: ["host1", "host2", "host3"]
cluster.initial_master_nodes: ["node1"]
Please note that the value for initial_master_nodes will be set only on the first installed node of the cluster (it is optional on the other nodes and, if set, it must be the same for all nodes in the cluster). The option seed_hosts will be set on all cluster nodes, including Elastic-only nodes, and will have the same value on all of them.
Elasticsearch reverse proxy¶
Starting with NetEye 4.13, NGINX has been added to NetEye. NGINX acts as a reverse proxy: it exposes a single endpoint and acts as a load balancer, distributing incoming requests across all nodes and, in this case, across all Elasticsearch instances. This solution improves the overall performance and reliability of the cluster.
The Elasticsearch endpoint is reachable at the URI https://elasticsearch.neteyelocal:9200/. Please note that this is the same port used before, so no additional change is required; the old certificates used for Elasticsearch are still valid with the new configuration.
All connected Elastic Stack services, such as Kibana, Logstash and Filebeat, have been updated to reflect this improvement and to take advantage of the new load-balancing feature.
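As an illustration, a client can reach the whole cluster through the single proxied endpoint, as in the following Python sketch; the credentials and certificate path are placeholders.

import requests

# Query the cluster health through the NGINX reverse proxy: the client sees
# a single endpoint, while NGINX balances requests across the instances.
resp = requests.get(
    "https://elasticsearch.neteyelocal:9200/_cluster/health",
    auth=("admin", "secret"),        # placeholder credentials
    verify="/path/to/root-ca.crt",   # placeholder CA certificate
)
print(resp.json()["status"])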
Elasticsearch Only Nodes¶
Large NetEye installations, especially those running the SIEM feature module, will often have multi-node clusters where some nodes run only Elasticsearch. An Elasticsearch Only node can be set up with a procedure similar to that of NetEye nodes.
An Elasticsearch Only node does not belong to the RedHat cluster; it runs a local master-eligible Elasticsearch service, connected to all the other Elasticsearch nodes.
Elastic Blockchain Proxy¶
The elastic-blockchain-proxy allows secure live signing of the log streams flowing from Logstash to Elasticsearch.
It provides protection against data tampering by transforming an input stream of plain logs into a secured blockchain where each log is cryptographically signed.
Architecture¶
From a high-level point of view, the architecture has three main components:
The first component is Logstash, which collects logs from various sources and sends them to the elastic-blockchain-proxy using the json_batch format of Elastic's http output plugin.
Note
Since the elastic-blockchain-proxy does not provide persistence, Logstash should always be configured to take care of the persistence of the involved log pipelines.
The second component is elastic-blockchain-proxy itself, which receives batches of logs from Logstash, signs every log with a cryptographic key used only once, and, finally, forwards the signed logs to the Elasticsearch Bulk API;
The third component is Elasticsearch, which acquires the signed logs from elastic-blockchain-proxy and persists them on the dedicated index.
How the Elastic Blockchain Proxy works¶
REST Endpoints¶
The elastic-blockchain-proxy receives logs from two REST endpoints:
log endpoint:
description: Receives and processes a single log message in JSON format
path : /api/log
request type: JSON
request example:
{ "entry": "A log message" }
response: The HTTP Status code 200 is used to signal a successful processing. Other HTTP status codes indicate a failure.
log_batch endpoint:
description: Receives and processes an array of log messages in JSON format
path : /api/log_batch
request type: JSON
request example:
[ { "entry": "A log message", "other": "Additional values...", "EBP_METADATA": { "agent": { "type": "auditbeat", "version": "7.10.1" }, "customer": "neteye", "event": { "module": "elproxysigned" } } }, { "entry1": "Another log message", "entry2": "Another log message", "EBP_METADATA": { "agent": { "type": "auditbeat", "version": "7.10.1" }, "customer": "neteye", "event": { "module": "elproxysigned" } } }, { "entry": "Again, another message", "EBP_METADATA": { "agent": { "type": "auditbeat", "version": "7.10.1" }, "customer": "neteye", "event": { "module": "elproxysigned" } } } ]
response: The HTTP Status code 200 is used to signal a successful processing. Other HTTP status codes indicate a failure.
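For illustration, the following Python sketch posts a small batch to the log_batch endpoint; the host and port are assumptions, since in NetEye the batches are normally produced by Logstash's http output in json_batch format.

import requests

batch = [
    {
        "entry": "A log message",
        "EBP_METADATA": {
            "agent": {"type": "auditbeat", "version": "7.10.1"},
            "customer": "neteye",
            "event": {"module": "elproxysigned"},
        },
    }
]

# HTTP 200 signals successful processing; any other status indicates a failure.
resp = requests.post("https://ebp.example.local:19000/api/log_batch", json=batch)  # assumed host/port
print(resp.status_code)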
Log Signature Flow - Generation of Signature Keys¶
The elastic-blockchain-proxy achieves secure logging by authentically encrypting each log record with an individual cryptographic key that is used only once, and by protecting the integrity of the whole log archive with a cryptographic authentication code.
The encryption key is saved on the filesystem in the {data_dir}/key.json file. This file has the form:
{
"key": "initial_key",
"iteration": 0
}
Where:
- key: the encryption key to be used to sign the next incoming log
- iteration: the iteration number for which the signature key is valid
Every time a log is signed, a new key/iteration pair is generated starting from the last one. The new pair will have the following values:
- key: equal to the SHA256 hash of the previous key
- iteration: equal to the previous iteration incremented by one
For example, if the key at iteration 10 is:
{
"key": "abcdefghilmno",
"iteration": 10
}
the next key will be:
{
"key": "d1bf0c925ec44e073f18df0d70857be56578f43f6c150f119e931e85a3ae5cb4",
"iteration": 11
}
This mechanism creates a blockchain of keys that cannot be altered without breaking the validity of the entire chain.
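A minimal Python sketch of this derivation, assuming the key is hashed as a UTF-8 string and hex-encoded (the exact encoding used by elastic-blockchain-proxy is an implementation detail):

import hashlib

def next_key(key: str, iteration: int):
    """Derive the next signature key: SHA256 of the previous key, iteration + 1."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest(), iteration + 1

key, iteration = "abcdefghilmno", 10
key, iteration = next_key(key, iteration)   # iteration is now 11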
In addition, every time a set of logs is successfully sent to Elasticsearch, the {data_dir}/key.json file is updated with the new value. Consequently, every reference to the previous keys is removed from the system, making it impossible to recover and reuse old keys.
However, in case of Elasticsearch errors, the elastic-blockchain-proxy will reply with an error message to Logstash and will reuse the keys of the failed call for the next incoming logs.
Note
To be valid, the iteration values of the signed logs in Elasticsearch should be incremental, with no missing or duplicated values.
When the first log is received after startup, the elastic-blockchain-proxy queries Elasticsearch for the last indexed log iteration value in order to determine the correct iteration number for the next log. If the last log iteration value returned by Elasticsearch is greater than the value stored in {data_dir}/key.json, the elastic-blockchain-proxy will fail to start.
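A hedged sketch of such a startup check is shown below; it assumes the iteration can be read from the ES_BLOCKCHAIN.iteration field of the indexed documents (see the signed-log example in the next section) and uses placeholder connection details.

import json
import requests

def last_indexed_iteration(index_pattern: str) -> int:
    """Return the highest ES_BLOCKCHAIN.iteration found in the matching indices."""
    resp = requests.post(
        "https://elasticsearch.neteyelocal:9200/" + index_pattern + "/_search",
        json={"size": 1, "sort": [{"ES_BLOCKCHAIN.iteration": "desc"}]},
        auth=("admin", "secret"),        # placeholder credentials
        verify="/path/to/root-ca.crt",   # placeholder CA certificate
    )
    hits = resp.json()["hits"]["hits"]
    return hits[0]["_source"]["ES_BLOCKCHAIN"]["iteration"] if hits else 0

with open("key.json") as f:              # i.e. {data_dir}/key.json
    local_iteration = json.load(f)["iteration"]

if last_indexed_iteration("*-elproxysigned-*") > local_iteration:
    raise SystemExit("Indexed iteration is ahead of key.json: refusing to start")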
Log Signature Flow - How signature keys are used¶
For each incoming log, the elastic-blockchain-proxy retrieves the first available encryption key, as described in the previous section, and then uses it to calculate the HMAC-SHA256 hash of the log.
The calculation of the HMAC hash takes into account:
the log itself as received from Logstash
the iteration number
the timestamp
the hash of the previous log
At this point, the signed log is a simple JSON object composed of the following fields:
- All fields of the original log message
- ES_BLOCKCHAIN: an object containing all the values calculated by the elastic-blockchain-proxy, namely:
  - fields: the fields of the original log used by the signature process
  - hash: the HMAC hash calculated as described above
  - previous_hash: the HMAC hash of the previous log message
  - iteration: the iteration number of the signature key
  - timestamp_ms: the signature epoch timestamp in milliseconds
For example, given this key:
{
"key": "d1bf0c925ec44e073f18df0d70857be56578f43f6c150f119e931e85a3ae5cb4",
"iteration": 11
}
when this log is received:
{
"value": "A log message",
"origin": "linux-apache2",
"EBP_METADATA": {
"agent": {
"type": "auditbeat",
"version": "7.10.1"
},
"customer": "neteye",
"event": {
"module": "elproxysigned"
}
}
}
then this signed log will be generated:
{
"value": "A log message",
"origin": "linux-apache2",
"EBP_METADATA": {
"agent": {
"type": "auditbeat",
"version": "7.10.1"
},
"customer": "neteye",
"event": {
"module": "elproxysigned"
}
},
"ES_BLOCKCHAIN": {
"fields": {
"value": "A log message",
"origin": "linux-apache2"
},
"hash": "HASH_OF_THE_CURRENT_LOG",
"previous_hash": "HASH_OF_THE_PREVIOUS_LOG",
"iteration": 11,
"timestamp_ms": 123456789
}
}
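The following Python sketch illustrates the signature step conceptually; the exact serialization that elastic-blockchain-proxy feeds into the HMAC is not documented here, so the message layout below is an assumption.

import hashlib
import hmac
import json

def sign_log(key: str, fields: dict, iteration: int,
             timestamp_ms: int, previous_hash: str) -> str:
    # The HMAC-SHA256 covers the original log fields, the iteration number,
    # the signature timestamp and the hash of the previous log, keyed with
    # the one-time signature key.
    message = json.dumps(
        {
            "fields": fields,
            "iteration": iteration,
            "timestamp_ms": timestamp_ms,
            "previous_hash": previous_hash,
        },
        sort_keys=True,
    ).encode("utf-8")
    return hmac.new(key.encode("utf-8"), message, hashlib.sha256).hexdigest()

signature = sign_log(
    key="d1bf0c925ec44e073f18df0d70857be56578f43f6c150f119e931e85a3ae5cb4",
    fields={"value": "A log message", "origin": "linux-apache2"},
    iteration=11,
    timestamp_ms=123456789,
    previous_hash="HASH_OF_THE_PREVIOUS_LOG",
)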
Log Signature Flow - How the index name is determined¶
The name of the Elasticsearch index for the signed logs is determined by the content of the EBP_METADATA field of the incoming log.
The index name has the following structure:
{EBP_METADATA.agent.type}-{EBP_METADATA.agent.version}-{EBP_METADATA.event.module}-{EBP_METADATA.customer}-YYYY.MM.DD
The following rules and constraints apply:
- The EBP_METADATA.agent.type and EBP_METADATA.customer fields are mandatory
- The YYYY.MM.DD part of the index name is based on the epoch timestamp of the signature
- If EBP_METADATA.event.module is not present, EBP will use elproxysigned by default
For example, if the following log is received on 23 March 2021:
{
"value": "A log message",
"origin": "linux-apache2",
"EBP_METADATA": {
"agent": {
"type": "auditbeat",
"version": "7.10.1"
},
"customer": "neteye",
"event": {
"module": "elproxysigned"
}
}
}
Then the inferred index name is: auditbeat-7.10.1-elproxysigned-neteye-2021.03.23
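A hedged Python sketch of this naming rule (the use of UTC for the date part is an assumption):

from datetime import datetime, timezone

def index_name(ebp_metadata: dict, signature_ts_ms: int) -> str:
    date = datetime.fromtimestamp(signature_ts_ms / 1000, tz=timezone.utc)
    module = ebp_metadata.get("event", {}).get("module", "elproxysigned")  # default module
    return "-".join([
        ebp_metadata["agent"]["type"],     # mandatory
        ebp_metadata["agent"]["version"],
        module,
        ebp_metadata["customer"],          # mandatory
        date.strftime("%Y.%m.%d"),
    ])

# index_name({"agent": {"type": "auditbeat", "version": "7.10.1"},
#             "customer": "neteye",
#             "event": {"module": "elproxysigned"}}, 1616457600000)
# -> "auditbeat-7.10.1-elproxysigned-neteye-2021.03.23"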
As a consequence of the default values and of the default Logstash configuration, most of the indexes created by EBP will have elproxysigned in their name. Special care should therefore be taken when manipulating those indexes and documents; in particular, the user must not delete or rename *-elproxysigned-* indices manually, nor alter the content of the ES_BLOCKCHAIN or EBP_METADATA fields, as any change could lead to a broken blockchain.
Sequential logs processing¶
An important aspect to bear in mind is that log requests for the same blockchain are always processed sequentially by the elastic-blockchain-proxy. This means that, when a batch of logs is received from Logstash, it is placed in an in-memory queue and will be processed only when all the previously received requests are completed.
This behavior is required to ensure that the blockchain is kept coherent, with no holes in the iteration sequence.
Nevertheless, since no parallel processing is possible for a single blockchain, this puts a hard limit on the maximum reachable throughput.
Error Handling¶
The elastic-blockchain-proxy implements an optional retry strategy to handle communication errors with Elasticsearch; when enabled (see the Configuration section), whenever an error is returned by Elasticsearch, the elastic-blockchain-proxy will retry a fixed number of times to resubmit the same signed_logs to Elasticsearch.
This strategy makes it possible to deal, for example, with temporary networking issues without forcing Logstash to resubmit the logs for a new from-scratch processing.
Nevertheless, while this can be useful in a large set of use cases, it should also be used very carefully. Due to the completely sequential nature of the blockchain processing, too high a number of retries could lead to an ever-growing queue of logs waiting to be processed while the elastic-blockchain-proxy is busy reprocessing the same failed logs over and over again.
In conclusion, whether it is better to let elastic-blockchain-proxy fail fast or retry more times is a decision that needs a careful, case-by-case analysis.
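As a rough illustration of this trade-off, a bounded retry loop of the kind described above might look like the following sketch; the retry count and backoff are purely illustrative, not the proxy's actual configuration.

import time

def submit_with_retries(send, signed_logs, max_retries: int = 3):
    """Call send(signed_logs) until it succeeds or the retries are exhausted."""
    for attempt in range(1, max_retries + 1):
        try:
            return send(signed_logs)
        except Exception:
            if attempt == max_retries:
                raise                      # give up: Logstash must resubmit the batch
            time.sleep(2 ** attempt)       # simple backoff; meanwhile the queue keeps growing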
Agents¶
The Elastic Beat feature¶
A Beat is a small, self-contained agent installed on devices within an infrastructure (mostly servers and workstations) that acts as a client, sending data to a centralised server where it is processed in a suitable way.
Beats are part of the Elastic Stack; they gather data and can send it to either Logstash or Elasticsearch, to be visualised on a Kibana dashboard.
There are different types of Beat agents available, each tailored for a different type of data. NetEye currently uses the Filebeat NetFlow Module for internal use. Additional information about the Beat feature can be found in the official documentation.
NetEye can receive data from Beats installed on monitored hosts (i.e., on the clients). The remainder of this section shows first how NetEye is configured to receive data from Beats, i.e., as a receiving point for data sent by Beats, then explains how to install and configure Beats on clients, using SSL certificates to protect the communication.
Overview of NetEye’s Beat infrastructure setup¶
Beats are part of the SIEM module, an additional module that can be installed by following the directions in the Feature Modules Installation section if you have the corresponding subscription.
Warning
Beats are intended as a replacement for Safed, even though the two can coexist. However, since both Beats and Safed might process the same data, running both would double the time and resources required; it is therefore suggested to activate only one of them.
The NetEye implementation allows Logstash to listen for incoming data on a secured TCP port (5044). Logstash then sends the data into two flows:
- to a file on disk, in the /neteye/shared/rsyslog/data folder, with the following name: %{[agent][hostname]}/%{+YYYY}/%{+MM}/%{+dd}/[LS]%{[host][hostname]}.log. The format of the file is the same as the one used for safed files. This file is encrypted and its integrity validated, as happens for Safed, and it is written to disk to preserve its inalterability.
- to Elastic, to be displayed in preconfigured Kibana dashboards.
Communication is SSL-protected, and certificates need to be installed on the clients together with the agents; see the next section for more information.
Note
When the module is installed, there is no data flow until agents are installed on the clients to be monitored. Indeed, deployment on NetEye consists only of setting up the listening infrastructure.
The Beat feature is currently a CLI-only feature: no GUI is available and the configuration should be done by editing configuration files.