Resources Tuning¶
This section contains a collection of suggested settings for various services running on NetEye.
MariaDB¶
MariaDB is started with default upstream settings. If the size of an installation requires it, the resource usage of MariaDB can be adjusted to meet higher performance requirements. The following settings can be added to the file /neteye/shared/mysql/conf/my.cnf.d/custom.conf:
[mysqld]
innodb_buffer_pool_size=16G
tmp_table_size=512M
max_heap_table_size=512M
innodb_sort_buffer_size=16000000
sort_buffer_size=32M
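After restarting the database service, you can check that the new values are active. A minimal verification sketch, assuming the mysql client can authenticate from the root shell (for example via /root/.my.cnf):

mysql -e "SHOW GLOBAL VARIABLES LIKE 'innodb_buffer_pool_size';"
mysql -e "SHOW GLOBAL VARIABLES LIKE 'tmp_table_size';"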
Icingaweb2 GUI¶
Performance of the Icingaweb2 Graphical User Interface can be significantly improved in high-load environments by adding INDEXes and updating the COLUMN definition of the hostgroup- and history-related tables. To do this, execute the queries below manually:
ALTER TABLE icinga_hostgroups MODIFY hostgroup_object_id bigint(20) unsigned NOT NULL;
ALTER TABLE icinga_hostgroups ADD UNIQUE INDEX idx_hostgroups_hostgroup_object_id (hostgroup_object_id);
ALTER TABLE icinga_commenthistory ADD INDEX idx_icinga_commenthistory_entry_time (entry_time);
ALTER TABLE icinga_downtimehistory ADD INDEX idx_icinga_downtimehistory_entry_time (entry_time);
ALTER TABLE icinga_notifications ADD INDEX idx_icinga_notifications_start_time (start_time);
ALTER TABLE icinga_statehistory ADD INDEX idx_icinga_statehistory_state_time (state_time);
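These statements must be applied to the Icinga IDO database. A minimal sketch of one way to run them, assuming the IDO database is named icinga, the mysql client can authenticate from the root shell, and the file name below is arbitrary:

# Save the ALTER TABLE statements above to a file, then apply them to the IDO database
mysql icinga < /root/icinga_tuning_indexes.sql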
InfluxDB¶
InfluxDB is a time series database designed to handle high write and query loads in NetEye. If you want to learn more about InfluxDB, you can refer to the official InfluxDB documentation.
Migration of inmem (in-memory) indices to TSI (time-series)
From NetEye 4.14, InfluxDB uses the Time Series Index (TSI).
However, an existing setup will keep using the in-memory (inmem) index for writing and fetching data until you perform the migration procedure, which consists of the following steps.
Build TSI by running the influx_inspect buildtsi command. In a cluster environment, the command below must be executed on the node on which the InfluxDB resource is running:

sudo -u influxdb influx_inspect buildtsi -datadir /neteye/shared/influxdb/data/data -waldir /neteye/shared/influxdb/data/wal -v
Upon execution, the above command will build TSI for all existing databases.
Note

If you want to build TSI only for a specific database, add the -database <database_name> parameter to the above command.

Restart the influxdb service:
Single node:
systemctl restart influxdb
Cluster environment:
pcs resource restart influxdb
The official InfluxDB upgrade documentation contains more information about the inmem (in-memory) to TSI (time-series) migration process.
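You may also want to verify that the TSI index is enabled in the InfluxDB configuration. A sketch of the relevant setting, assuming the configuration file resides at /neteye/shared/influxdb/conf/influxdb.conf (the path is an assumption here):

[data]
  index-version = "tsi1"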
Log Analytics (Elastic Stack) Performance Tuning¶
This guide summarises the configuration optimisations that help to tune the Elastic Stack and boost its performance on NetEye 4. Applying these suggestions is recommended, especially on larger Elastic deployments.
Elasticsearch JVM Optimization¶
In Elasticsearch, the default options for the JVM are specified in the /neteye/local/elasticsearch/conf/jvm.options file. Please note that this file must not be modified, since it will be overwritten at each update.

If you would like to specify or override some options, create a new .options file in the /neteye/local/elasticsearch/conf/jvm.options.d/ folder, containing the desired options, one per line. Please note that the JVM processes the options files in lexicographic order.
For example, we can set the encoding used by Java for reading and saving files to UTF-8 by creating a file /neteye/local/elasticsearch/conf/jvm.options.d/01_custom_jvm.options with the following content:
-Dfile.encoding=UTF-8
For more information about the available JVM options and their syntax, please refer to the official documentation.
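Another common adjustment on larger installations is to set the JVM heap size explicitly. A minimal sketch, assuming the heap is not already managed elsewhere on your installation; the file name and the 8 GB value are only examples, and the heap should generally not exceed half of the available RAM:

# /neteye/local/elasticsearch/conf/jvm.options.d/02_heap.options
-Xms8g
-Xmx8g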
Elasticsearch Database Tuning¶
Swapping
Swapping is very bad for performance and for node stability, and should be avoided at all costs: it can cause garbage collections to last for minutes instead of milliseconds, and it can cause nodes to respond slowly or even to disconnect from the cluster. In a resilient distributed system, it proves more effective to let the operating system kill the node than to allow swapping.
Moreover, Elasticsearch performs poorly when the system is swapping memory to disk. Therefore, it is vitally important to the health of your node that none of the JVM memory is ever swapped out to disk. The following steps allow you to achieve this goal.
Configure swappiness. Ensure that the sysctl value vm.swappiness is set to 1. This reduces the kernel's tendency to swap and should not lead to swapping under normal circumstances, while still allowing the whole system to swap in emergency conditions. Execute the following commands on each Elastic node and make the change persistent:

sysctl vm.swappiness=1
echo "vm.swappiness=1" > /etc/sysctl.d/zzz-swappiness.conf
sysctl -p
Memory locking. Another best practice on Elastic nodes is to use the mlockall option, which tries to lock the process address space into RAM and thus prevents any Elasticsearch memory from being swapped out. To enable it, set bootstrap.memory_lock to true.

Uncomment or add this line to the /neteye/local/elasticsearch/conf/elasticsearch.yml file:

bootstrap.memory_lock: true
Raise the limits of system resources in the Service section by creating the new file /etc/systemd/system/elasticsearch.service.d/neteye-limits.conf with the following content:

[Service]
LimitMEMLOCK=infinity
Restart resources:

systemctl daemon-reload
systemctl restart elasticsearch
After starting Elasticsearch, you can see whether this setting was applied successfully by checking the value of mlockall in the output of this request:

sh /usr/share/neteye/elasticsearch/scripts/es_curl.sh -XGET 'https://elasticsearch.neteyelocal:9200/_nodes?filter_path=**.mlockall&pretty'
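If memory locking is active, the response should report mlockall as true for each node; the output will look similar to the following (the node ID is a placeholder):

{
  "nodes" : {
    "aBcDeFgHiJkLmNoPqRsTuV" : {
      "process" : {
        "mlockall" : true
      }
    }
  }
}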
Increase file descriptors

Check whether the number of file descriptors suffices by running the command lsof -p <elastic-pid> | wc -l on each node. By default, the setting on NetEye is 65,535.

To increase the default value, create the file /etc/systemd/system/elasticsearch.service.d/neteye-open-file-limit.conf with content such as:
[Service]
LimitNOFILE=100000
For more information, see the official documentation.
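You can also verify the limit actually seen by Elasticsearch through its API, for example with the same helper script used above:

sh /usr/share/neteye/elasticsearch/scripts/es_curl.sh -XGET 'https://elasticsearch.neteyelocal:9200/_nodes/stats/process?filter_path=**.max_file_descriptors&pretty'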
DNS cache settings
By default, Elasticsearch runs with a security manager in place, which means that the JVM caches positive hostname resolutions indefinitely and negative hostname resolutions for ten seconds. Elasticsearch overrides this behaviour with default values that cache positive lookups for 60 seconds and negative lookups for 10 seconds.
These values should be suitable for most environments, including environments where DNS resolutions vary with time. If not, you can edit the values es.networkaddress.cache.ttl and es.networkaddress.cache.negative.ttl in the JVM options drop-in folder /neteye/local/elasticsearch/conf/jvm.options.d/.
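For example, to set both values explicitly you can create a dedicated options file; the file name below is only an example, and the values shown are the Elasticsearch defaults:

# /neteye/local/elasticsearch/conf/jvm.options.d/03_dns_cache.options
-Des.networkaddress.cache.ttl=60
-Des.networkaddress.cache.negative.ttl=10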
SIEM Additional Tuning (X-Pack)¶
Encrypt sensitive data check
If you use Watcher and have chosen to encrypt sensitive data (by setting xpack.watcher.encrypt_sensitive_data to true), you must also place a key in the secure settings store.

To pass this bootstrap check, you must set the xpack.watcher.encryption_key on each node in the cluster. For more information, see the official documentation.
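A sketch of the usual procedure, assuming the Elasticsearch binaries are installed under /usr/share/elasticsearch: generate a system key file once with elasticsearch-syskeygen, copy it to every node, and add it to the secure settings store on each node:

/usr/share/elasticsearch/bin/elasticsearch-syskeygen
/usr/share/elasticsearch/bin/elasticsearch-keystore add-file xpack.watcher.encryption_key <path-to-generated-system-key>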
Kibana Configuration tuning¶
There are some interesting tunings that can also be applied to the Kibana settings to improve performance in production.
For more information, see the official documentation.
Require Content Security Policy (CSP)
Kibana uses a Content Security Policy to help prevent the browser from allowing unsafe scripting, but older browsers will silently ignore this policy. If your organization does not need to support Internet Explorer 11 or much older versions of our other supported browsers, we recommend that you enable Kibana’s strict mode for content security policy, which will block access to Kibana for any browser that does not enforce even a rudimentary set of CSP protections.
To do this, set csp.strict to true in the file /neteye/shared/kibana/conf/kibana.yml.
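For example, the relevant line in the file would be:

# /neteye/shared/kibana/conf/kibana.yml
csp.strict: true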
Memory
Kibana has a default maximum memory limit of 1.4 GB, and in most cases we recommend leaving this setting at its default value. However, in some scenarios, such as large reporting jobs, it may make sense to tweak the limit to meet more specific requirements.
You can modify this limit by setting --max-old-space-size in the NODE_OPTIONS environment variable. In NetEye this can be configured by creating a file /etc/systemd/system/kibana-logmanager.service.d/memory.conf containing a limit in MB, such as:
[Service]
Environment="NODE_OPTIONS=--max-old-space-size=2048"
For more information, see the official documentation.
User Customization
The Kibana environment file /neteye/shared/kibana/conf/sysconfig/kibana contains some options used by the Kibana service. Please note that this file must not be modified, since it will be overwritten at each update.
The dedicated file /neteye/shared/kibana/conf/sysconfig/kibana-user-customization
can be used to specify or override one or more Kibana environment variables.
How to Enable Load Balancing For Logstash¶
Warning
This functionality is in beta stage and may be subject to changes. Beta features may break during minor upgrades and their quality is not ensured by regression testing.
The load balancing feature for logstash exploits NGINX's ability to act as a reverse proxy and distribute incoming (logstash) connections among all nodes in the cluster. In this way, logstash will no longer be a cluster resource, but a standalone service running on each node of the cluster.
Note, however, that if you enable this feature, you will lose the ability to sign the log files. This happens because, with this setup, logmanager only has access to the log files that are present on file systems mounted on the node where it is running.
Indeed, rsyslog cannot take advantage of the load balancing feature, therefore only the logs on the node on which logmanager is running will be signed.
In the case of Beats, log files will be sent through the load balancer and therefore they will not be signed.
This how-to will guide you through setting up load balancing for logstash. In a nutshell, you first need to disable the logstash cluster resource, then modify or add the logstash and NGINX configurations, and finally keep the logstash configuration in sync on all nodes.

In more detail, these are the steps:
Permanently disable the cluster resource for logstash: run
pcs resource disable logstash
Create a local service of logstash on each node in the cluster, by following these steps:
The configuration files will be stored in /neteye/local/logstash/conf, so copy them over from /neteye/shared/logstash/conf.

Fix all the paths in the conf files:

find /neteye/local/logstash/conf -type f -exec sed -i 's/shared/local/g' "{}" \;
Edit both the /neteye/shared/logstash/conf/sysconfig/logstash and /neteye/local/logstash/conf/sysconfig/logstash files and add the following lines to them:

LS_SETTINGS_DIR="/neteye/local/logstash/conf/"
OPTIONS="--config.reload.automatic"
Add the host directive in /neteye/local/logstash/conf/conf.d/0_i03_agent_beats.input (use the cluster internal network IP):

host => "192.168.xxx.xxx"

Create a new logstash service (call it e.g. logstash-local.service) with the following content:

[Unit]
Description=logstash local

[Service]
Type=simple
User=logstash
Group=logstash
EnvironmentFile=-/etc/default/logstash
EnvironmentFile=-/neteye/local/logstash/conf/sysconfig/logstash
ExecStartPre=/usr/share/logstash/bin/generate-config.sh
ExecStart=/usr/share/logstash/bin/logstash "--path.settings" "/neteye/local/logstash/conf/" $OPTIONS
Restart=always
WorkingDirectory=/
Nice=19
LimitNOFILE=16384

[Install]
WantedBy=multi-user.target
Add the service into the neteye cluster local systemd targets. You can refer to the Cluster Technology and Architecture chapter of the user guide for more information.
Edit the file /etc/hosts to point the host logstash.neteyelocal to the cluster IP, as in the example below.
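For illustration, assuming 192.0.2.10 is a placeholder for your cluster IP:

192.0.2.10    logstash.neteyelocal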
Add the NGINX load-balancing configuration in a file called logstash-loadbalanced.j2. The name is very important because it will be used by neteye install to set up the correct mapping between the logstash service and NGINX. The file needs to have the following content. Please pay special attention to copying the whole snippet AS-IS, especially the three lines of the for cycle, because it is essential for configuring NGINX on all the cluster nodes:

upstream logstash_ingest {
{% for node in nodes %}
    server {{ hostvars[node].internal_node_addr }}:5044;
{% endfor %}
}

server {
    listen logstash.neteyelocal:5044;
    proxy_pass logstash_ingest;
}
Remember that the logstash standalone configuration must be kept in sync on all nodes, therefore the /neteye/local/logstash/conf/ directory must have the same content on all nodes. To achieve this you can, for example, set up a cron job that uses rsync to maintain the synchronisation, as sketched below.
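A minimal sketch of such a cron job, assuming neteye02.neteyelocal is a placeholder for a peer node reachable via SSH with key-based authentication; the entry synchronises the directory every 5 minutes:

# Example crontab entry on the node holding the reference copy of the configuration
*/5 * * * * rsync -a --delete /neteye/local/logstash/conf/ neteye02.neteyelocal:/neteye/local/logstash/conf/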
Run neteye install only once on any cluster node.
Start the local logstash service on every node:

systemctl start logstash-local