User Guide

Shutdown Manager

The ability to execute a scheduled and ordered shutdown of all hosts in a datacenter may prove useful in a number of scenarios. This section describes how this task works on NetEye and then shows the proper configuration required.

Concepts

Shutting down large numbers of hosts can be very useful in the event of emergencies like fire or extended power loss. Shutting down servers in an orderly manner can prevent data loss and speed up recovery time.

The Shutdown Manager lets you define one or more shutdown groups, which can be shut down either manually or automatically when a condition based on monitoring results is met. Groups have a declared order in which they are turned off. Hosts within the same group are turned off concurrently, while each subsequent group is shut down only after a specified amount of time has elapsed since the shutdown of the previous group started.

Architecture of the Shutdown Module

The Shutdown Manager module is organised on three levels: a Shutdown Definition, the top level, which defines the actions to be taken; a Shutdown Group, a set of hosts that will be acted upon in the same time span; and a Shutdown Host, a single host on which the shutdown action will be carried out.

  • Each Shutdown Definition contains three main pieces of information: a name, a condition on either a host or a service, and a list of groups. It must then be deployed in Tornado by writing a rule that verifies the shutdown condition; afterwards, whenever the condition is met, a shutdown is scheduled for the groups. As an example, consider a host connected to a sensor which monitors the temperature in the server room. The condition can be a high temperature, and the groups to be shut down contain the servers in the room, arranged in the order in which they must be turned off.

  • The Shutdown Group contains the list of hosts that will be processed in parallel, some information about them, and the timeout, i.e., the amount of time after which the next group will be processed.

  • The Shutdown Host consists of a host and the commands that will be executed on it when a shutdown is invoked. The command definition may include variables, i.e., placeholders that are replaced with suitable values when the command is executed on each host.
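The three levels can be pictured as nested records. The following Python sketch only illustrates that hierarchy and is not the module's internal data model: the class and field names, the example host names, the timeouts and the status value "critical" are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ShutdownHost:
    """A single host plus the command template run on it (hypothetical shape)."""
    name: str
    command: List[str]  # may contain placeholders such as "$hostAddress$"


@dataclass
class ShutdownGroup:
    """Hosts processed in parallel, plus the delay before the next group."""
    name: str
    timeout: int  # seconds after which the next group is processed
    hosts: List[ShutdownHost] = field(default_factory=list)


@dataclass
class ShutdownDefinition:
    """Top level: a name, a condition on a host or service, and a list of groups."""
    name: str
    condition_host: str
    condition_status: str  # "critical" below is an assumed status label
    condition_service: Optional[str] = None  # None: the condition is on the host
    groups: List[ShutdownGroup] = field(default_factory=list)

    def host_names(self) -> List[str]:
        """All host names covered by the definition, group by group."""
        return [host.name for group in self.groups for host in group.hosts]


# Example mirroring the server-room scenario (all names are invented).
poweroff = ["/usr/bin/systemctl", "poweroff", "-i"]
definition = ShutdownDefinition(
    name="server-room-overheat",
    condition_host="temperature-sensor",
    condition_status="critical",
    groups=[
        ShutdownGroup("app-servers", timeout=120, hosts=[
            ShutdownHost("app01", poweroff),
            ShutdownHost("app02", poweroff),
        ]),
        ShutdownGroup("db-servers", timeout=300, hosts=[
            ShutdownHost("db01", poweroff),
        ]),
    ],
)
print(definition.host_names())
```

Keeping the timeout on the group rather than on the host reflects the description above: hosts in a group are handled together, and the timeout only governs when the next group starts.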

The following sections describe the components comprising the Shutdown Manager and how to use the CLI to configure automatic shutdown scenarios and manually shut down groups and individual hosts.

Configuration

The Orchestrated Datacenter Shutdown lets you schedule a controlled shutdown of a large number of hosts.

A so-called shutdown command must be associated with each monitored Host that should be shut down by the module. This is the exact command which will be executed by the Shutdown Manager module on the host to shut it down. The link between monitored host and shutdown command is created by assigning a dedicated Custom Property in the host’s configuration form in Icinga Director.

To actually trigger the shutdown of a host, you will need to go to the Shutdown Manager module and invoke the command from there.

Shutdown Manager User

The Shutdown Manager uses a dedicated user called shutdownmanager to access the Icinga2 API and initiate the shutdown command. The configuration is generated automatically during neteye_secure_install with a random password and must not be modified by the user.

The credentials can be found in the file /neteye/shared/icinga2/conf/icinga2/conf.d/shutdownmanager-users.conf, for example:

object ApiUser "shutdownmanager" {
  password = "c0tYWiowD9HIjkRcFRsp6hjhI1iZ"
  permissions = [ "objects/query/host","objects/query/service","actions/execute-command","objects/query/endpoint","objects/query/checkcommand"]
}

The shutdownmanager user only has permission to query hosts, services, endpoints and check commands, and to execute commands.

The Shutdown Command

Each shutdown command is run using icingacli.

Shutdown commands return the result of the operation as a JSON structure. For list commands, this structure will be an array of the returned objects. For the other commands (create, delete, etc.) it will contain:

  • result: Whether the command succeeded or failed (‘ok’ or ‘failed’)

  • message: A text-based confirmation message

  • info: Detailed information on relevant parameters
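A script that wraps icingacli can branch on this structure. The Python sketch below assumes the output is exactly the JSON described above; the helper names, the sample payload and its message text are made up for illustration and are not real Shutdown Manager output.

```python
import json
from typing import Any, Dict, List


def parse_action_output(raw: str) -> Dict[str, Any]:
    """Parse the result of a create/edit/delete command."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return {
        "succeeded": data.get("result") == "ok",
        "message": data.get("message", ""),
        "info": data.get("info"),
    }


def parse_list_output(raw: str) -> List[Any]:
    """Parse the result of a list command, which is a JSON array."""
    data = json.loads(raw)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array")
    return data


# Hypothetical payload, shaped after the three documented fields.
sample = '{"result": "failed", "message": "example message", "info": {"id": 7}}'
parsed = parse_action_output(sample)
print(parsed["succeeded"], parsed["message"])
```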

Warning

Once you finish configuring the Shutdown Command objects, you need to deploy the Director configuration in order to execute the Commands.

Create

The create command lets you add a new shutdown command which you can then apply to the Shutdown command custom property of the host or host group in Director.

Usage:

# icingacli shutdownmanager shutdowncommand create [parameters]

Available Parameters:

--name

(mandatory) Name of the command.

--command

(mandatory) The command field must consist of a JSON Array that represents the command and its associated parameters needed to correctly shut down the host. If the command requires host-related parameters, the user can specify them using the pattern $<property>$, and they will be replaced by the Shutdown Manager (if correctly specified in the host definition). Currently the supported parameters are:

  • $host$

  • $hostAddress$

  • $hostAddress6$

  • $hostAlias$

  • $hostIpv4$

  • $objectType$

  • $hostName$

--run-on-agent

(mandatory) Determines whether the command should be executed on the master, or on an Icinga agent. This will only succeed if the Icinga agent is running on the host, and is connected to either the master or a satellite when the shutdown command is triggered. The value to pass is either 0 (false, i.e. run on the master) or 1 (true, i.e. run on the agent).
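The Python sketch below shows one way to assemble the arguments for shutdowncommand create and to preview what a command looks like once placeholders are filled in. It assumes the double-dash parameter form used elsewhere in this guide (for example --id). The function names are hypothetical, and expand_macros only imitates the substitution for illustration: the real replacement is performed by the Shutdown Manager, and leaving unknown or unset placeholders untouched is a choice made here, not documented behaviour.

```python
import json
import re
from typing import Dict, List

# Placeholder names listed in this guide.
SUPPORTED_MACROS = {
    "host", "hostAddress", "hostAddress6", "hostAlias",
    "hostIpv4", "objectType", "hostName",
}

_MACRO = re.compile(r"\$(\w+)\$")


def expand_macros(command: List[str], values: Dict[str, str]) -> List[str]:
    """Replace $name$ placeholders; unknown or unset ones are left as-is."""
    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name in SUPPORTED_MACROS and name in values:
            return values[name]
        return match.group(0)

    return [_MACRO.sub(replace, argument) for argument in command]


def build_create_argv(name: str, command: List[str], run_on_agent: bool) -> List[str]:
    """Build the argument list; the command is passed as a JSON array."""
    return [
        "icingacli", "shutdownmanager", "shutdowncommand", "create",
        "--name", name,
        "--command", json.dumps(command),
        "--run-on-agent", "1" if run_on_agent else "0",
    ]


# Invented example: a command run from the master against the host address.
template = ["ssh", "root@$hostAddress$", "poweroff"]
argv = build_create_argv("ssh-poweroff", template, run_on_agent=False)
preview = expand_macros(template, {"hostAddress": "192.0.2.10"})
print(preview)
```

Passing the list through json.dumps avoids hand-written quoting mistakes in the JSON array, and handing argv to subprocess.run as a list avoids shell quoting altogether.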

Edit

The edit command lets you change one or more values on an existing shutdown command using the same parameters as the shutdowncommand create command above.

Usage:

# icingacli shutdownmanager shutdowncommand edit [parameters]

List

The list command lets you see a list of all existing shutdown commands in JSON format.

Usage:

# icingacli shutdownmanager shutdowncommand list

Available Parameters:

--id

(mandatory) The ID of the definition to list

Delete

The delete command lets you remove a shutdown command you have created, provided that it is not in use on any hosts. It requires the shutdown command’s ID, which you can obtain from the list command.

Usage:

# icingacli shutdownmanager shutdowncommand delete [parameters]

Available Parameters:

--id

(mandatory) The ID of the shutdown command to delete

Triggering the Shutdown of All Hosts in a Shutdown Group

All hosts in a Shutdown Group can be shut down directly using this dedicated icingacli command:

icingacli shutdownmanager shutdown shutdowngroup --id <ID>

The shutdown shutdowngroup command requires the group’s ID, which you can obtain by using the group’s list command. During command execution, all hosts in that group will be sent their individual shutdown command.

The shutdown shutdowngroup icingacli command will return one of the following values:

  • 0 : Ok

  • 1 : The mandatory ID parameter is missing

  • 2 : The group with the given ID does not exist

  • 3 : At least one error occurred while sending the shutdown command to the hosts in the group

Triggering the Shutdown of a Single Host

A single host can be shut down directly using the dedicated icingacli command:

icingacli shutdownmanager shutdown host --host-name <host name>

The shutdown host command requires a parameter for the host name. During command execution, a check is performed which validates that the specified host exists. In addition, the command itself verifies whether a shutdown command has been assigned to the given host.

The Shutdown Manager will then substitute the supported macro parameters and pass the resulting shutdown command to Icinga, which takes care of executing it.

The shutdown host icingacli command will return one of the following values:

  • 0 : Ok

  • 1 : The mandatory host-name parameter is missing

  • 2 : No hosts or more than one host was passed as the host-name parameter

  • 3 : An error occurred while sending the shutdown command to Icinga
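Automation that calls the two shutdown commands can translate these values into readable messages. The Python sketch below assumes that the listed values are the process exit status of icingacli; the function names are hypothetical, and the runner parameter exists only so the wrapper can be exercised without actually shutting anything down.

```python
import subprocess
from typing import Callable, Dict, List

# Values documented for "shutdown shutdowngroup".
GROUP_EXIT_CODES: Dict[int, str] = {
    0: "Ok",
    1: "The mandatory ID parameter is missing",
    2: "The group with the given ID does not exist",
    3: "At least one error occurred while sending the shutdown command "
       "to the hosts in the group",
}

# Values documented for "shutdown host".
HOST_EXIT_CODES: Dict[int, str] = {
    0: "Ok",
    1: "The mandatory host-name parameter is missing",
    2: "No hosts or more than one host was passed as the host-name parameter",
    3: "An error occurred while sending the shutdown command to Icinga",
}


def describe_exit(code: int, table: Dict[int, str]) -> str:
    """Map a return value to its documented meaning."""
    return table.get(code, f"Unexpected return value {code}")


def shutdown_host_argv(host_name: str) -> List[str]:
    """Argument list for shutting down a single host."""
    return ["icingacli", "shutdownmanager", "shutdown", "host",
            "--host-name", host_name]


def run_and_describe(argv: List[str], table: Dict[int, str],
                     runner: Callable = subprocess.run) -> str:
    """Run the command and describe its return value."""
    completed = runner(argv)
    return describe_exit(completed.returncode, table)


print(describe_exit(2, GROUP_EXIT_CODES))
```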

Shutdown Management Configuration CLI Commands

You can use the Shutdown Manager directly from the shell with the icingacli command.

Using the Shutdown Manager’s CLI commands, you can perform create, edit, delete and list actions on the following shutdownmanager objects:

  1. Shutdown Definition: Specifies how to shut down multiple computers when a condition is met.

  2. Shutdown Group: A set of hosts which should all be shut down at the same time.

Below you can find detailed descriptions of the available commands and their parameters.

Shutdown Definition Commands

Create

The create command lets you construct a new shutdown definition. It requires a name and a Shutdown Condition, expressed as a Host State or a Service State depending on the command’s parameters.

Usage:

# icingacli shutdownmanager shutdowndefinition create [parameters]

Available Parameters:

--name

(mandatory) The name of the shutdown definition to be created.

--host-name

(mandatory) The name of the host whose status will trigger the shutdown, or which is running the service that will trigger the shutdown.

--service-description

(optional) If this parameter is empty, then the host’s status will be compared to the target status. If instead it is set to the name of a valid service on the host, then the service’s status will be used.

--status

(mandatory) The monitoring status that will trigger the shutdown procedure when it matches the host or service status.
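The interaction between the host name, the optional service description and the status can be summarised in a few lines of Python. This is only a sketch of the selection logic described above: the actual check is performed by the rule deployed in Tornado, and the function name and the status labels "down" and "critical" are assumptions.

```python
from typing import Dict, Optional


def condition_met(
    target_status: str,
    host_status: str,
    service_statuses: Dict[str, str],
    service_description: Optional[str] = None,
) -> bool:
    """Compare the target status with the host or with one of its services."""
    if not service_description:
        # No service given: the host's own status is compared.
        return host_status == target_status
    # A service is given: only that service's status is compared.
    return service_statuses.get(service_description) == target_status


# Invented monitoring data for a sensor host with one service.
services = {"room-temperature": "critical"}
print(condition_met("critical", "up", services, "room-temperature"))
print(condition_met("down", "up", services))
```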

Edit

The edit command lets you change one or more of the values for the fields in an existing shutdown definition using the same parameters as the shutdowndefinition create command above.

Usage:

# icingacli shutdownmanager shutdowndefinition edit [parameters]

List

The list command lets you see a list of all existing shutdown definitions in JSON format.

Usage:

# icingacli shutdownmanager shutdowndefinition list

Delete

The delete command lets you remove an existing shutdown definition given that definition’s ID, which you can obtain from the list command.

Usage:

# icingacli shutdownmanager shutdowndefinition delete [parameters]

Available Parameters:

--id

(mandatory) The ID of the definition to delete

Shutdown Group Commands

Create

The create command lets you construct a new shutdown group. It requires a name, a set of monitored objects, a shutdown definition to belong to, and an ordering relative to other groups.

Usage:

# icingacli shutdownmanager shutdowngroup create [parameters]

Available Parameters:

--name

(mandatory) The name of the shutdown group to be created.

--filter

(mandatory) A monitoring filter that will return a set of monitored objects to be included in this new group.

--shutdown-definition-id

(mandatory) The ID of a shutdown definition (obtainable via the list command) which this new group will then belong to.

--timeout

(mandatory) The number of seconds after which the shutdown process for the next group will be initiated.

--group-order

(optional) A number representing the order of this group relative to other groups in the shutdown definition. Lower-numbered groups are shut down before higher-numbered groups, and groups that have the same order number in the shutdown definition are shut down in a random order relative to each other. If no group order parameter is specified, the value defaults to the next number after the highest one in use (for instance, if the definition contains only one group, with order 3, the new group will have order 4). If no groups exist yet in the shutdown definition, it is set to 1.
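The defaulting and ordering rules can be expressed compactly. The Python sketch below is an illustration under the rules stated above, with hypothetical function names; groups sharing an order number are returned together in one batch, since their relative order is not defined.

```python
from itertools import groupby
from typing import Iterable, List, Tuple


def next_group_order(existing_orders: Iterable[int]) -> int:
    """Default order for a new group: highest existing order plus one, or 1."""
    orders = list(existing_orders)
    return max(orders) + 1 if orders else 1


def shutdown_batches(groups: List[Tuple[str, int]]) -> List[List[str]]:
    """Group names batched by ascending order number.

    Each item in `groups` is a (name, order) pair.
    """
    ordered = sorted(groups, key=lambda group: group[1])
    return [
        [name for name, _ in batch]
        for _, batch in groupby(ordered, key=lambda group: group[1])
    ]


print(next_group_order([3]))
print(shutdown_batches([("db", 2), ("app", 1), ("cache", 2)]))
```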

Edit

The edit command lets you change one or more values for the fields in an existing shutdown group using the same parameters as the shutdowngroup create command above.

Usage:

# icingacli shutdownmanager shutdowngroup edit [parameters]

List

The list command lets you see all existing shutdown groups in JSON format.

Usage:

# icingacli shutdownmanager shutdowngroup list

Available Parameters: None

Delete

The delete command lets you remove an existing shutdown group given that group’s ID, which you can obtain from the list command.

Usage:

# icingacli shutdownmanager shutdowngroup delete [parameters]

Available Parameters:

--id

(mandatory) The ID of the group to delete

Listhosts

The listhosts command lets you see all hosts that belong to a shutdown group in JSON format. Each entry in the output list represents a host and contains:

  • The host name

  • The host address

  • The ID of the shutdown command associated with the host (null if the host is not associated with any shutdown command)

Usage:

# icingacli shutdownmanager shutdowngroup listhosts [parameters]

Available Parameters:

--id

(mandatory) The ID of the shutdown group whose hosts are to be displayed

Shutdown Commands

Shutdowndefinition

The shutdowndefinition command lets you trigger the shutdown of a pre-configured shutdown definition.

Once a shutdown group is triggered, a timer will count down while the hosts in that shutdown group are powering down. Once the timeout period has expired, the subsequent shutdown group will be triggered and its timer set, until no more shutdown groups remain.
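Because each group's timer starts when that group is triggered, the start time of every group is the sum of the timeouts of the groups before it. The Python sketch below computes these offsets for planning purposes; the function name and the example groups and timeouts are invented, and the groups are assumed to be already sorted by their order.

```python
from typing import List, Tuple


def start_offsets(groups: List[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Seconds after the trigger at which each group starts shutting down.

    Each item in `groups` is a (name, timeout_in_seconds) pair.
    """
    offsets = []
    elapsed = 0
    for name, timeout in groups:
        offsets.append((name, elapsed))
        elapsed += timeout  # the next group starts once this timeout expires
    return offsets


plan = start_offsets([("app-servers", 120), ("db-servers", 300), ("storage", 60)])
for name, offset in plan:
    print(f"{name}: starts at +{offset}s")
```

Note that the timeout of the last group does not delay any further group, so it does not contribute to any start offset.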

Usage:

# icingacli shutdownmanager shutdown shutdowndefinition [parameters]

Available Parameters:

--id

(mandatory) The ID of the shutdown definition to be triggered

The Shutdown Manager GUI

The Shutdown Manager GUI lets you configure the Shutdown Manager and manage all its components without accessing the CLI.

The Shutdown Definition

A shutdown definition specifies a condition on a host or a service that, when met, starts the shutdown process on a host or a group of hosts; it is built using the same parameters as the shutdowndefinition create CLI command, with the Status as the shutdown condition.

This page contains a list of all the existing shutdown definitions, along with the Shutdown groups on which each definition will operate. A click on the groups’ number will open the Shutdown Groups tab, where it is possible to see and edit each group.

A Shutdown button appears in the Action column: users who have the necessary permissions to start and execute the shutdown process can click the red Shutdown button; otherwise the button is grey and not clickable.

Clicking the Shutdown button in the Action column opens a confirmation panel on the right-hand side; clicking the red Shutdown button in that panel executes the shutdown on the hosts in the first group, followed by any other groups in the Shutdown Definition.

To allow users to execute the shutdown, the Shutdown Manager module provides a permission called shutdownmanager/api/trigger-shutdown-definition. In Configuration ‣ Authentication ‣ Roles (see Authentication Roles), assign the permission to a new or existing role, then add the user to that role.

New Shutdown definitions can be created by clicking on Add. In the form, provide the name of the new definition, the hosts or groups affected by the definition, and the condition that must be met to invoke the shutdown.

The Shutdown Group

The Shutdown Group tab contains all the groups of hosts that have been created. For each group the following information is shown:

  • the associated shutdown definition ID, which is clickable and opens the corresponding Shutdown Definition tab

  • the order in which the group is processed within the Shutdown Definition

  • the timeout, i.e., the time after which the next group is processed

  • the monitoring view: when clicked, the right-hand side panel shows the detailed list of the hosts in the group along with a number of details.

New Shutdown groups can be created by clicking on Add.

The Shutdown Command

A Shutdown Command is the actual command that will be issued on the hosts when the Shutdown Manager is invoked on a group of hosts. For each command, this page shows its name and whether it should be run on the master (0) or on the agent (1). The same information must be provided when adding a new command.

Warning

Once you finish configuring the Shutdown Command objects, you need to deploy the Director configuration in order to execute the Commands.

The Shutdown Output

This tab lets you check, in real time, the log files produced during the various shutdown processes. The output includes the timestamp of the shutdown process, the result of the Shutdown Command, and error messages if the shutdown process failed on some hosts.

Icinga2 Integration

The Shutdown Manager takes advantage of the existing Icinga2 trust infrastructure. Commands executed by the Shutdown Manager can either run directly on the master node (e.g., for shutting down VMs via the vSphere API), or on agents (e.g., calling /usr/bin/halt on physical machines).

Note

Shutdown Commands are executed with the same permissions as the icinga2 daemon itself. This could mean that by default your Icinga2 Agent does not have enough permissions to shut down its own host.

The shutdown of a host is triggered by calling the dedicated API endpoint “shutdown-host”. It takes as parameters:

  • A host with an agent, in the form of a monitoring filter (e.g. "filter": "host.name==\"myagent.example.com\"")

  • A shutdown_command with arguments to be performed on the destination machine

The shutdown_command parameter MUST contain the ordered list of arguments to execute, and the first argument must be the command itself (e.g. ["/usr/bin/systemctl", "poweroff", "-i"]).

Full example:

curl -k -u $USER:$PW \
    -H 'X-HTTP-Method-Override: POST' \
    -H 'Accept: application/json' \
    -X POST 'https://192.0.2.14:5665/v1/actions/shutdown-host' \
    -d '{
           "type": "Host",
           "filter": "host.name==\"myagent.example.com\"",
           "shutdown_command": ["/usr/bin/systemctl", "poweroff", "-i"]
        }'

Depending on the run-on-agent setting of the associated shutdown command, it will either be executed directly on the master node, or passed to the specified agent and executed there.
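The same request can be prepared from Python with the standard library. The sketch below only builds the request object and does not send it; the function name is hypothetical, the address and host name are the example values from the curl call, and the user name and password are placeholders. The host name is inserted into the filter verbatim, so a name containing double quotes would need escaping first.

```python
import base64
import json
import urllib.request
from typing import List


def build_shutdown_request(
    base_url: str,
    user: str,
    password: str,
    host_name: str,
    shutdown_command: List[str],
) -> urllib.request.Request:
    """Prepare a POST to the shutdown-host action with Basic authentication."""
    payload = {
        "type": "Host",
        "filter": f'host.name=="{host_name}"',
        "shutdown_command": shutdown_command,
    }
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url}/v1/actions/shutdown-host",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


request = build_shutdown_request(
    "https://192.0.2.14:5665",
    "example-user",          # placeholder credentials
    "example-password",
    "myagent.example.com",
    ["/usr/bin/systemctl", "poweroff", "-i"],
)
print(request.get_method(), request.full_url)
```

Sending the request (for example with urllib.request.urlopen) is deliberately left out. The curl example disables certificate verification with -k, whereas a script would normally pass an ssl context that trusts the Icinga2 certificate authority.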

Shutdown Host Permissions

The shutdown of a given host using the configuration described in the previous sections can also be initiated by calling the /v1/actions/shutdown-host endpoint of an Icinga2 master or satellite node.

If you want to add your own automation for shutting down hosts, you will need to configure a valid Icinga2 API user and grant the actions/shutdown-host permissions as in this example:

object ApiUser "shutdown-automation" {
  password = "secret"
  permissions = [ "actions/shutdown-host" ]
}

Note

Users with all permissions (“*”) will also be able to initiate shutdown procedures for eligible hosts!

You can then authenticate yourself to the Icinga2 API via BasicAuth.