User Guide

SIEM - Log Management

Elasticsearch is not functioning properly

When Elasticsearch or Kibana is not working, there can be a number of potential causes. The following checks, performed with Elasticsearch’s REST API (mostly via NetEye’s es_curl.sh script), can help you diagnose the problem.

  1. The first thing to do is to make sure that the cluster as a whole is running properly. You can do this by checking that the output of this command contains status: green:

    # /usr/share/neteye/elasticsearch/scripts/es_curl.sh -X GET "https://elasticsearch.neteyelocal:9200/_cluster/health" | jq .
    
  2. If there is a problem with the connection between cluster nodes (e.g., network or certificate issues), they will not be able to carry out tasks that require communication. The following command lists the nodes that have joined the Elasticsearch cluster; all of the cluster nodes should appear in the list.

    # /usr/share/neteye/elasticsearch/scripts/es_curl.sh -X GET "https://elasticsearch.neteyelocal:9200/_cat/nodes?v"

  3. When a cluster is not working well, the cause is often an index in poor health (i.e., one marked with a “yellow” or “red” status, derived from the worst status of the index’s shards). In this case, there is a real risk of losing data if something goes wrong. You can find exactly which indices are problematic with this command:

    # /usr/share/neteye/elasticsearch/scripts/es_curl.sh -s -X GET "https://elasticsearch.neteyelocal:9200/_cat/indices?v"
    
  4. Shards contain the actual data in an Elasticsearch cluster, and can be relocated to or replicated on different cluster nodes. As with indices, problematic shards can be an important reason why a cluster is not working properly. You can check their status using the following command. Since there may be a large number of shards, the Elasticsearch documentation describes filtering and sorting options; a simple filtered variant is also shown after this list.

    # /usr/share/neteye/elasticsearch/scripts/es_curl.sh -X GET "https://elasticsearch.neteyelocal:9200/_cat/shards?v"

  5. Finally, the cluster may be in perfect health, but you may not be able to visualize any data because the Kibana module is down. You can verify that Kibana is functioning properly by checking that the entry with id “plugin:kibana” has state: green in the output of this command:

    # curl -X GET "http://kibana.neteyelocal:5601/api/status" | jq -r '.status.statuses[]'
    

Managing an Elasticsearch Cluster with a Full Disk

If a cluster node’s disk begins to fill up (i.e., its disk usage rises above the low disk watermark, whose value cluster.routing.allocation.disk.watermark.low is set by default to 85%), Elasticsearch will no longer assign replicas to that node; if the disk keeps filling and reaches the flood-stage watermark, the flag read_only_allow_delete is set on the affected indices. This may cause unexpected behavior because new indices and shards may not be allocated or replicated correctly.
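
To see how full each data node actually is, you can check the per-node disk usage with the _cat allocation API; the disk.percent column shows the percentage of disk currently in use on each node:

# /usr/share/neteye/elasticsearch/scripts/es_curl.sh -X GET "https://elasticsearch.neteyelocal:9200/_cat/allocation?v"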

Once more space is available for the Elasticsearch data (e.g., you have increased the space available to the logical volume on the cluster), you should remove the block so that the affected indices return to normal operation, using the following command:

# /usr/share/neteye/elasticsearch/scripts/es_curl.sh 'https://elasticsearch.neteyelocal:9200/*/_settings' -X PUT -H 'Content-Type: application/json' -d'
     {
       "index.blocks.read_only_allow_delete": null
     }
     ' | jq .
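
To verify that the block has been removed, you can read the setting back; indices where the block is no longer set should not report an index.blocks.read_only_allow_delete entry (filtering the get settings API by setting name is a standard Elasticsearch feature):

# /usr/share/neteye/elasticsearch/scripts/es_curl.sh -X GET "https://elasticsearch.neteyelocal:9200/_all/_settings/index.blocks.read_only_allow_delete" | jq .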

Some logs are not indexed in Elasticsearch

It might happen that some log files collected by the NetEye Log Manager module are indexed incorrectly, or not indexed at all, in Elasticsearch. These logs can be manually reindexed in Elasticsearch via the script elasticsearch-reindex-logs, which can be found under /usr/share/neteye/backup/elasticsearch/.

The script can be run by typing:

sh elasticsearch-reindex-logs -f /full/path/to/logfile.log.gz

The input must be a log file that has been previously gzipped by the Log Manager module. The full set of options is displayed by running the script as follows:

sh elasticsearch-reindex-logs --help

Be careful not to use the script to reindex a log that is already indexed in Elasticsearch, as this would duplicate the same data in Elasticsearch.
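
If you are unsure whether a log file has already been indexed, you can count the documents in the corresponding time range before reindexing. The sketch below is only an example: the index pattern logstash-*, the @timestamp field and the dates are placeholders that you should adapt to your own index naming and to the day covered by the log file:

# /usr/share/neteye/elasticsearch/scripts/es_curl.sh -X GET "https://elasticsearch.neteyelocal:9200/logstash-*/_count" -H 'Content-Type: application/json' -d'
{
  "query": {
    "range": {
      "@timestamp": { "gte": "2024-01-01", "lt": "2024-01-02" }
    }
  }
}' | jq .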

Debugging Logstash file input filter

Checking whether or not a file is parsed by Logstash is useful when you are not sure that the syntax you are using in the Logstash file input filters is correct. This can happen, for example, with the ‘exclude’ option of the file input filter, for which the Logstash documentation is not very clear. The ‘exclude’ option is crucial, for example, for excluding the Beats log files, which must be present on the filesystem in order to be signed, but must not be reindexed in Elasticsearch. A minimal configuration sketch is shown below.
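
The following is only an illustrative sketch of a file input using ‘exclude’; the path and the patterns are examples, not the configuration shipped with NetEye. Note that ‘exclude’ patterns are matched against the file name, not against the full path:

    input {
      file {
        # hypothetical location of the collected log files
        path => "/var/log/remote/**/*.log"
        # do not parse gzipped archives or Beats log files (example patterns)
        exclude => [ "*.gz", "filebeat*" ]
      }
    }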

To check that your Logstash file input filter is correctly not parsing a file, you can create a file X that you expect Logstash to parse and a file Y that Logstash should not parse. Then, as soon as you see that Logstash is reading file X, check whether file Y is being read as well.

This procedure is suggested because Logstash takes a while to start parsing files, but once it parses one file, it parses all of them. So if you see that file X is being read while file Y is not, you can be sure that Logstash is indeed not parsing file Y.

So, you can do the following:

  1. Create a file X that should always be parsed by Logstash and a file Y that should not be parsed by Logstash, and set the permissions of both files, and of the paths leading to them, so that the logstash system user can read them

  2. Restart Logstash

  3. Check which files the logstash process is reading with the ‘lsof’ command (see the note after this list for one way to obtain the PID):

    lsof -p <logstash_pid>
    
  4. If Logstash is not yet reading file X, repeat the lsof command; once file X appears, you are already seeing all the files Logstash is parsing, so if file Y is not listed, it is being correctly excluded
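
If you need the PID of the Logstash process for the lsof command above, one way to obtain it (assuming Logstash runs as the systemd service named logstash) is:

    lsof -p "$(systemctl show -p MainPID --value logstash)"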