System Supervisor (supSupervisor)

The System Supervisor provides the functionality for supervision and monitoring of a configurable set of subsystems.

Figure: System Supervisor.

The main components of the System Supervisor server are:

  • A state machine engine based on SCXML and implemented in RAD. It contains a set of action and activity classes.

  • A Subsystem Factory class that creates the instances of all subsystem classes at start-up, based on the server configuration.

  • A Facade class that manages the interface between the state machine engine and the subsystem classes.
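The role of the Subsystem Factory can be pictured with the minimal sketch below. It is not the supSupervisor implementation: the classes `Subsystem`, `GenericSubsystem` and `SubsystemFactory`, and the function `MakeDefaultFactory`, are hypothetical names chosen for illustration. Only the type string `ifw::sup::syssup::common::Generic` is taken from the example server configuration in this document.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical base class for all monitored subsystems.
class Subsystem {
public:
    explicit Subsystem(std::string name) : m_name(std::move(name)) {}
    virtual ~Subsystem() = default;
    const std::string& Name() const { return m_name; }

private:
    std::string m_name;
};

// Hypothetical concrete class selected through the "type" configuration field.
class GenericSubsystem : public Subsystem {
public:
    using Subsystem::Subsystem;
};

// Hypothetical factory: maps a configured type string to a creator function.
class SubsystemFactory {
public:
    using Creator =
        std::function<std::unique_ptr<Subsystem>(const std::string&)>;

    void Register(const std::string& type, Creator creator) {
        m_creators[type] = std::move(creator);
    }

    // Returns nullptr when the configured type is not registered.
    std::unique_ptr<Subsystem> Create(const std::string& type,
                                      const std::string& name) const {
        const auto it = m_creators.find(type);
        if (it == m_creators.end()) return nullptr;
        return it->second(name);
    }

private:
    std::map<std::string, Creator> m_creators;
};

// Builds a factory that knows the type used in the example configuration.
inline SubsystemFactory MakeDefaultFactory() {
    SubsystemFactory factory;
    factory.Register("ifw::sup::syssup::common::Generic",
                     [](const std::string& name) {
                         return std::make_unique<GenericSubsystem>(name);
                     });
    return factory;
}
```

At start-up, such a factory would be called once per entry of the `subsystems` configuration list, passing the `type` and `name` fields of the entry.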

The System Supervisor uses the OLDB to store run-time information about itself and about the subsystems it monitors. It subscribes to the status information published by the subsystems and, like any other subsystem, publishes its own status.

Command Line Arguments

Command line argument help is available under the option --help.

--server-id ARG| -i ARG (string)

Server id. If not specified, the one in the configuration file is used.

--config ARG| -c ARG (string)

Application configuration file.

--log-level ARG| -l ARG (enum) [default: ERROR]

Log level to use. One of ERROR, INFO, DEBUG, TRACE.

--log-prop-file ARG| -l ARG (string)

Log property file.

--req-endpoint ARG| -l ARG (string)

Server MAL Req/Rep endpoint (zpb.rr://<ipaddr>:<port>/).

Environment Variables

$CFGPATH

Used to resolve configuration file paths.

$DATAROOT

Specifies the default root path used as output directory for FITS metadata. Metadata files are stored under $DATAROOT/fcf/<fcs instance>.

System Supervisor State Machine

The System Supervisor uses a state machine described in SCXML and interpreted by the state machine engine of the RAD application framework (see the SCXML specification).

Figure: System Supervisor State Machine Diagram.

Configuration

System Supervisor Configuration

In version 4.0.0, the SysSup was ported to the CII config-ng library. Unlike yaml-cpp, this library allows type information to be defined for the configuration parameters. The System Supervisor includes a predefined set of configuration definitions. These files can be found in the syssup/server/resources/config directory.

For more information about CII config-ng, see the Config-ng manual.

Note

The entry point for the System Supervisor configuration is the file that contains the server configuration.

Each subsystem has its own set of configuration parameters.

An example of a server configuration is provided below.

!cfg.include config/sup/syssup/server/definitions.yaml:
server: !cfg.type:SysSup
    server_id       : 'sup'
    req_endpoint    : "zpb.rr://127.0.0.1:13082/"
    pub_endpoint    : "zpb.ps://127.0.0.1:13345/"
    db_timeout      : 2000
    scxml           : "sup/syssup/server/sm.xml"
    log_properties  : "config/sup/syssup/server/log_properties.cfg"
    oldb_prefix     : "ins1"
    req_timeout     : 60000
    ob_modes        : [
        {
            name: Engineering,
            subsystems: ['fcs1','dummy1']
        },
        {
            name: Imaging,
            subsystems: ['dummy2']
        }
    ]
    subsystems      : [
        {
            name: 'fcs1',
            scope: internal,
            type: ifw::sup::syssup::common::Generic,
            rr_endpoint: "zpb.rr://127.0.0.1:15085/StdCmds",
            ps_endpoint: "zpb.ps://127.0.0.1:15045/",
            access: true
        },
        {
            name: 'dummy1',
            scope: internal,
            type: ifw::sup::syssup::common::Generic,
            rr_endpoint: "zpb.rr://127.0.0.1:15086/StdCmds",
            ps_endpoint: "zpb.ps://127.0.0.1:15046/",
            access: false
        },
        {
            name: 'dummy2',
            scope: internal,
            type: ifw::sup::syssup::common::Generic,
            rr_endpoint: "zpb.rr://127.0.0.1:15087/StdCmds",
            ps_endpoint: "zpb.ps://127.0.0.1:15047/",
            access: true
        }
    ]

Supervisor OLDB

The supervisor stores the current values of the server configuration parameters in the OLDB. This helps to verify that the configuration has been loaded correctly. For details of the server configuration parameters, see the System Supervisor Configuration section.

Supervisor configuration OLDB keys

OLDB Key

<instrument id>/<server id>/cfg/db_timeout

<instrument id>/<server id>/cfg/db_task_period

<instrument id>/<server id>/cfg/dictionaries

<instrument id>/<server id>/cfg/req_timeout

<instrument id>/<server id>/cfg/mon_timeout

<instrument id>/<server id>/cfg/filename

<instrument id>/<server id>/cfg/fits_prefix

<instrument id>/<server id>/cfg/pub_endpoint

<instrument id>/<server id>/cfg/req_endpoint

<instrument id>/<server id>/cfg/scxml

<instrument id>/<server id>/cfg/oldb_prefix

<instrument id>/<server id>/cfg/log_properties

<instrument id>/<server id>/cfg/server_id

<instrument id>/<server id>/cfg/subsystems/<subsystem id>/scope

<instrument id>/<server id>/cfg/subsystems/<subsystem id>/type

<instrument id>/<server id>/cfg/subsystems/<subsystem id>/rr_endpoint

<instrument id>/<server id>/cfg/subsystems/<subsystem id>/ps_endpoint

<instrument id>/<server id>/cfg/subsystems/<subsystem id>/access

Server Status

The server stores the string representation of its state and substate in the OLDB.

Server status OLDB keys

OLDB Key

<instrument id>/<server id>/stat/states/state

<instrument id>/<server id>/stat/states/substate

<instrument id>/<server id>/stat/subsystems/<subsystem id>/states/state

<instrument id>/<server id>/stat/subsystems/<subsystem id>/states/substate

<instrument id>/<server id>/stat/subsystems/<subsystem id>/ob_mode
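The key layouts listed in the two OLDB tables can be composed mechanically. The sketch below shows one way to do this. The `OldbKeys` helper and its member functions are hypothetical and do not exist in supSupervisor; they only reproduce the path patterns shown above.

```cpp
#include <string>

// Hypothetical helper that composes OLDB keys following the documented
// patterns. It performs no OLDB access; it only builds the key strings.
struct OldbKeys {
    std::string instrument_id;  // <instrument id>
    std::string server_id;      // <server id>

    // <instrument id>/<server id>
    std::string Base() const { return instrument_id + "/" + server_id; }

    // <instrument id>/<server id>/cfg/<parameter>
    std::string Cfg(const std::string& parameter) const {
        return Base() + "/cfg/" + parameter;
    }

    // <instrument id>/<server id>/cfg/subsystems/<subsystem id>/<parameter>
    std::string SubsystemCfg(const std::string& subsystem_id,
                             const std::string& parameter) const {
        return Base() + "/cfg/subsystems/" + subsystem_id + "/" + parameter;
    }

    // <instrument id>/<server id>/stat/states/state
    std::string State() const { return Base() + "/stat/states/state"; }

    // <instrument id>/<server id>/stat/states/substate
    std::string Substate() const { return Base() + "/stat/states/substate"; }

    // <instrument id>/<server id>/stat/subsystems/<subsystem id>/states/state
    std::string SubsystemState(const std::string& subsystem_id) const {
        return Base() + "/stat/subsystems/" + subsystem_id + "/states/state";
    }

    // <instrument id>/<server id>/stat/subsystems/<subsystem id>/ob_mode
    std::string SubsystemObMode(const std::string& subsystem_id) const {
        return Base() + "/stat/subsystems/" + subsystem_id + "/ob_mode";
    }
};
```

For example, `OldbKeys{"ins1", "sup"}.Cfg("req_timeout")` yields `ins1/sup/cfg/req_timeout`. Using `ins1` as the instrument id here is an assumption made for the example.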

Status Estimation

The estimated state/substate of the overall system is derived from the individual subsystem states/substates according to the following criteria.

Each known state/substate string is assigned a code to simplify the estimation. For the state, the estimate is simply the minimum state among all managed subsystems. Normally there are only three possible cases: Undetermined, NotOperational and Operational.

For the substate, the estimation is similar: the overall substate is the minimum substate, with one exception. If at least one subsystem is in a transient substate such as SettingUp or Recording, the estimated substate is the minimum transient substate. This makes the ongoing activities of the managed subsystems visible.
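The two rules above can be sketched as follows. This is only an illustration of the described logic, not the Supervisor Facade code. The numeric codes are assumptions, as are the substates `Undetermined`, `NotReady` and `Ready`; the document only names the states Undetermined, NotOperational and Operational and the substates Idle, SettingUp and Recording.

```cpp
#include <algorithm>
#include <vector>

// Assumed coding: a lower code means a "lower" state.
enum class State { Undetermined = 0, NotOperational = 1, Operational = 2 };

// Assumed coding of substates; SettingUp and Recording are transient.
enum class Substate {
    Undetermined = 0,
    NotReady = 1,
    Ready = 2,
    Idle = 3,
    SettingUp = 4,
    Recording = 5
};

struct Status {
    State state;
    Substate substate;
};

inline bool IsTransient(Substate s) {
    return s == Substate::SettingUp || s == Substate::Recording;
}

// Overall state: the minimum state among all subsystems.
inline State EstimateState(const std::vector<Status>& subsystems) {
    if (subsystems.empty()) return State::Undetermined;
    State result = subsystems.front().state;
    for (const Status& s : subsystems) result = std::min(result, s.state);
    return result;
}

// Overall substate: the minimum substate, unless at least one subsystem is
// in a transient substate, in which case the minimum transient substate wins.
inline Substate EstimateSubstate(const std::vector<Status>& subsystems) {
    if (subsystems.empty()) return Substate::Undetermined;
    bool any_transient = false;
    Substate min_all = subsystems.front().substate;
    Substate min_transient = Substate::Recording;
    for (const Status& s : subsystems) {
        min_all = std::min(min_all, s.substate);
        if (IsTransient(s.substate)) {
            min_transient = any_transient
                                ? std::min(min_transient, s.substate)
                                : s.substate;
            any_transient = true;
        }
    }
    return any_transient ? min_transient : min_all;
}
```

With these assumed codes, three subsystems in Idle, Recording and SettingUp give an overall substate of SettingUp, while two subsystems in Ready and Idle give Ready.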

Note

The estimation is performed by a virtual method of the Supervisor Facade, which applications can override if needed.

Warning

The estimation relies on the subsystems publishing their status in the defined format.

Commands

The commands currently supported by the server are listed here: List of Commands.

Error Handling

Supervisor commands throw an exception in case of errors or timeouts. Client applications can catch the exception and obtain the associated error message with the function getDesc(). The message contains neither the history nor the error stack, but it normally indicates precisely where the error occurred. The Supervisor does not use the CII Error service because it is not yet available.

Note

The specific exceptions depend on the command used.

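try {
    // Send the request; the call throws on errors or timeouts.
    auto reply = client->GetState();
} catch (const stdif::ExceptionErr& e) {
    // getDesc() returns the error message attached to the exception.
    RAD_LOG_ERROR() << "Error reply (" << e.getDesc() << ").";
}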

Serialization

The System Supervisor uses CII MAL ZPB (ZeroMQ + Google Protocol Buffers) to serialise commands.

Note

Each command has two parts: a payload and its corresponding reply; see the supif module for details. Normal replies are plain strings.

Setup Command

The Setup command is intended to produce a change in the run-time configuration.

Since no long-running operations are associated with the Setup command, it is blocking: the Supervisor executes the action and then sends the reply back to the originator.

The interface definition of the Setup command can be found in module supif.

Warning

The payload array does not have a fixed size, but it is limited to 100 elements. The CII XML ICD requires a limit.

<method name="Setup" returnType="string" throws="ExceptionErr">
    <argument name="payload" type="nonBasic" nonBasicTypeName="SetupElem" arrayDimensions="(100)"/>
</method>
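A client may want to check the payload size against this limit before sending a Setup command. The sketch below shows such a check. The constant `kMaxSetupElems` and the function `WithinSetupLimit` are hypothetical; only the value 100 comes from the interface definition above. The function is a template so that no assumption is made about the fields of `SetupElem`.

```cpp
#include <cstddef>
#include <vector>

// Limit taken from arrayDimensions="(100)" in the interface definition.
constexpr std::size_t kMaxSetupElems = 100;

// Returns true when the payload fits into the Setup command array.
// Elem stands for the generated SetupElem type defined in supif.
template <typename Elem>
bool WithinSetupLimit(const std::vector<Elem>& payload) {
    return payload.size() <= kMaxSetupElems;
}
```

A payload that exceeds the limit could, for instance, be split into several Setup commands by the caller. Whether that is acceptable depends on the application.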

SubsysNames Command

The SubsysNames command reports the subsystems managed by the System Supervisor as a comma-separated list. An example of the output generated by the SubsysNames command is shown below. Adapt the URI to your deployment.

$ supClient zpb.rr://134.171.3.48:30519 SubsysNames ""
subsim2, subsim3
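A client that needs the individual names can split the reply string. The sketch below assumes the reply has exactly the format shown in the example above (names separated by commas, optionally followed by spaces). The functions `TrimCopy` and `ParseSubsysNames` are hypothetical helpers, not part of the supif interface.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Returns a copy of the input without leading and trailing whitespace.
inline std::string TrimCopy(const std::string& text) {
    const auto first = text.find_first_not_of(" \t\r\n");
    if (first == std::string::npos) return "";
    const auto last = text.find_last_not_of(" \t\r\n");
    return text.substr(first, last - first + 1);
}

// Splits a SubsysNames reply such as "subsim2, subsim3" into its names.
// Empty items are skipped.
inline std::vector<std::string> ParseSubsysNames(const std::string& reply) {
    std::vector<std::string> names;
    std::istringstream stream(reply);
    std::string item;
    while (std::getline(stream, item, ',')) {
        const std::string name = TrimCopy(item);
        if (!name.empty()) names.push_back(name);
    }
    return names;
}
```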

SubsysStatus Command

The SubsysStatus command provides information about each subsystem managed by the System Supervisor. An example of the output generated by the SubsysStatus command is shown below.

$ supClient zpb.rr://134.171.3.48:30519 SubsysStatus ""
subsim2.access = true
subsim2.scope = internal
subsim2.connection_status = Connected
subsim2.state = Operational
subsim2.substate = Idle
subsim3.access = true
subsim3.scope = internal
subsim3.connection_status = Connected
subsim3.state = Operational
subsim3.substate = Idle
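The reply can be turned into a nested map for programmatic use. The sketch below assumes that every line has the form `<subsystem>.<field> = <value>`, as in the example above, and that subsystem names contain no dot. The names `SubsysStatusMap`, `StripSpaces` and `ParseSubsysStatus` are hypothetical.

```cpp
#include <map>
#include <sstream>
#include <string>

// subsystem name -> (field name -> value)
using SubsysStatusMap =
    std::map<std::string, std::map<std::string, std::string>>;

// Returns a copy of the input without leading and trailing whitespace.
inline std::string StripSpaces(const std::string& text) {
    const auto first = text.find_first_not_of(" \t\r\n");
    if (first == std::string::npos) return "";
    const auto last = text.find_last_not_of(" \t\r\n");
    return text.substr(first, last - first + 1);
}

// Parses lines of the form "<subsystem>.<field> = <value>".
// Lines that do not match this form are ignored.
inline SubsysStatusMap ParseSubsysStatus(const std::string& reply) {
    SubsysStatusMap result;
    std::istringstream stream(reply);
    std::string line;
    while (std::getline(stream, line)) {
        const auto eq = line.find('=');
        if (eq == std::string::npos) continue;
        const std::string key = StripSpaces(line.substr(0, eq));
        const std::string value = StripSpaces(line.substr(eq + 1));
        const auto dot = key.find('.');
        if (dot == std::string::npos) continue;
        result[key.substr(0, dot)][key.substr(dot + 1)] = value;
    }
    return result;
}
```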

Subscriptions

Each subsystem instance created by the factory subscribes to the status of its subsystem. The subscription uses the naming convention below, and the System Supervisor relies on this convention to monitor the status of the subsystems.

Subsystem     Parameter   Endpoint
<subsystem>   status      <ps endpoint>/std/status
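The status endpoint can be derived from the `ps_endpoint` value of a subsystem. The sketch below appends the `std/status` suffix while avoiding a doubled slash, since the `ps_endpoint` values in the example configuration end with `/`. The function `StatusTopic` is a hypothetical helper.

```cpp
#include <string>

// Builds "<ps endpoint>/std/status" from a configured ps_endpoint.
// Trailing slashes of the endpoint are dropped first so that the result
// contains exactly one separator before "std/status".
inline std::string StatusTopic(std::string ps_endpoint) {
    while (!ps_endpoint.empty() && ps_endpoint.back() == '/') {
        ps_endpoint.pop_back();
    }
    return ps_endpoint + "/std/status";
}
```

For the subsystem `fcs1` of the example configuration, this gives `zpb.ps://127.0.0.1:15045/std/status`.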

Publishing

Like any other subsystem, the System Supervisor publishes its estimated state/substate. This can be used to build a hierarchy of subsystems.

Parameter   Endpoint
status      <ps endpoint>/std/status

Signal Handling

The supervisor handles the SIGUSR1 signal that Nomad emits when the template configuration file changes at run-time. When the Supervisor receives this signal, it reloads the configuration and reconnects to the subsystems if needed.

Figure: Supervisor Handling of Nomad Signals.
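The general pattern of reacting to SIGUSR1 is sketched below with the plain C++ signal API. This is not how the Supervisor or RAD implements it; the names `g_reload_requested`, `OnSigusr1`, `InstallReloadHandler` and `ConsumeReloadRequest` are hypothetical. The handler only sets a flag, and the actual reload is left to the normal program flow, because very little is safe to do inside a signal handler.

```cpp
#include <csignal>

// Set by the signal handler, consumed by the main loop.
volatile std::sig_atomic_t g_reload_requested = 0;

// Signal handler: only records that a reload was requested.
extern "C" void OnSigusr1(int) { g_reload_requested = 1; }

// Installs the handler for SIGUSR1. Returns false on failure.
inline bool InstallReloadHandler() {
    return std::signal(SIGUSR1, OnSigusr1) != SIG_ERR;
}

// Returns true once for each pending reload request and clears the flag.
// A main loop would call this periodically and, when it returns true,
// reload the configuration and reconnect to the subsystems as needed.
inline bool ConsumeReloadRequest() {
    if (g_reload_requested == 0) return false;
    g_reload_requested = 0;
    return true;
}
```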

Troubleshooting

Logging

The System Supervisor implements logging levels according to the log4cplus package, where the levels are ordered as follows:

ALL < TRACE < DEBUG < INFO < WARN < ERROR < FATAL < OFF

The basic log levels supported by the SysSup for troubleshooting are listed in the table below.

Name    Verbosity   Description
ERROR   very low    Provide logging only in case of errors.
INFO    low         Provide information for the most important actions.
DEBUG   medium      Provide additional information for the developer.
TRACE   very high   Includes all the function tracing.
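The effect of the level ordering can be illustrated with the sketch below: a message is emitted only if its level is at or above the configured level. The enumeration values and the functions `ParseLogLevel` and `IsEnabled` are hypothetical and do not correspond to the log4cplus constants or API; only the ordering and the four level names come from this document.

```cpp
#include <string>

// Ordering as documented: ALL < TRACE < DEBUG < INFO < WARN < ERROR < FATAL < OFF.
// The numeric values are illustrative and are not the log4cplus constants.
enum class LogLevel { All = 0, Trace, Debug, Info, Warn, Error, Fatal, Off };

// Converts one of the four level names supported by the SysSup.
// Returns false and leaves 'level' untouched for any other string.
inline bool ParseLogLevel(const std::string& name, LogLevel& level) {
    if (name == "ERROR") { level = LogLevel::Error; return true; }
    if (name == "INFO")  { level = LogLevel::Info;  return true; }
    if (name == "DEBUG") { level = LogLevel::Debug; return true; }
    if (name == "TRACE") { level = LogLevel::Trace; return true; }
    return false;
}

// A message is emitted when its level is at or above the configured level.
inline bool IsEnabled(LogLevel configured, LogLevel message) {
    return configured != LogLevel::Off && message >= configured;
}
```

With the level set to INFO, for example, ERROR messages are emitted and DEBUG messages are suppressed.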

To change the log level, use the SetLogLevel command. See the example below.

$ supClient zpb.rr://134.171.3.48:30519 SetLogLevel "TRACE"

Loggers

The System Supervisor provides a default configuration (log_cii_properties.cfg) for the logging with the CII logging service. This configuration defines one general logger (app).

Logger   Description
app      General logger for common server classes.

Log File

The default log configuration provides two appenders: one for the console and one for a file. The file is stored in the CII logging directory (CII_LOGS) and is named supSupervisor.log.

Logging Viewer

Since version 5.0.0, the logs can be visualised using the CII Logging Viewer.