The Data Acquisition Process

Introduction

This section provides an overview of the Data Acquisition process, both in terms of what is currently implemented in OCM and what is planned for DPM.

For an overview of the conceptual model, the processes and the relationships between components, see the section Overview.

Stylistic Conventions

The following visual convention is used for states and transitions.

digraph G {
    # Config
    background = transparent;
    node [shape=Mrecord,fontname=helvetica,margin="0.22,0.09",fontsize=11];
    graph [fontname = "helvetica",nodesep=0.45,bgcolor=transparent];
    edge [fontname = "Lucida Console", fontsize=10, margin="0.23,0.09"];

    # States
    Normal [label="{Normal|\l}"];
    Transitional [label="{Transitional|\l}", style=dashed];
}
Normal

Normal state.

Transitional

Transitional states are states in which the Data Acquisition remains while a command is being handled, if the command cannot be completed immediately.

digraph Daq {
    # Config
    background = transparent;
    node [shape=Mrecord,fontname=helvetica,margin="0.22,0.09",fontsize=11];
    graph [fontname = "helvetica",nodesep=0.45,bgcolor=transparent];
    edge [fontname = "Lucida Console", fontsize=10, margin="0.23,0.09"];

    A [label="{A|\l}"];
    B [label="{B|\l}"];

    # Transitions
    A:se -> B:ne [label="Command()"];
    A:sw -> B:nw [label="Event", style=dashed];
}
Command()

Transition that occurs when the handling of a command is initiated. A request/command can be identified by the suffix “()”.

Event

Transition that occurs due to an event, whether generated internally or externally. This can also occur when the handling of a command completes, which is where the reply is sent.

Process Overview

Each successful Data Acquisition passes through the following states (represented in the MAL API with DaqState), which span the following high-level activities:

StateAcquiring

From the initial point it is created in OCM through the phase where data actually has been acquired by data sources.

StateMerging

Through the phase where a Data Product is created from the acquired data and released to the Online Archive System (OLAS) or other configured recipients.

StateCompleted

To the final state for a Data Acquisition. The two possible substates are Completed and Aborted, which indicate a completed or user-aborted Data Acquisition respectively.

These states, each with their own substates, are visualized together with an overview of the transitions in Fig. 10.

../_images/DataAcquisition.png

Fig. 10 SysML diagram of Data Acquisition states and transitions. Transitions in bold indicate the transition path for a successful Data Acquisition. Purple states are transitional states used for the duration of an operation. The orthogonal region with states NotError and Error is exclusively a means to model the error flag used in the implementation. Expect further refinement in StateMerging in following releases.

Important

The possible final states for a Data Acquisition are Completed and Aborted. OCM and DPM will never abort a successfully started Data Acquisition of their own volition; only at the request of a client (normally the instrument operator).

The only scenario where OCM aborts is if StartDaq() fails, in which case the potentially partially started Data Acquisition is forcibly aborted.

Note

An error is possible in any state and is indicated by an error flag in the Data Acquisition status structure DaqStatus (this can be considered an orthogonal error state as shown in Fig. 10). This means, for example, that a Data Acquisition may be Aborted with or without error.

In some cases errors prevent forward progress; in other cases they can be ignored by forcing forward progress with commands such as ForceStopDaq(), accepting a degraded Data Product as a result.

The enumeration DaqState defines the top-level states, and all the sub-states are enumerated together in DaqSubState. See section OCM Data Acquisition Control for a description of the states.
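
As a rough illustration of the model described above, the following Python sketch pairs a top-level state with a sub-state and a separate error flag. The class layout, field names and the `is_final` helper are assumptions made for this example and are not the MAL API definitions; only the state names are taken from this section.

```python
from dataclasses import dataclass
from enum import Enum, auto


class DaqState(Enum):
    """Top-level states named in this section."""
    StateAcquiring = auto()
    StateMerging = auto()
    StateCompleted = auto()


class DaqSubState(Enum):
    """Illustrative subset of the sub-states (not the full enumeration)."""
    Acquiring = auto()
    Merging = auto()
    Completed = auto()
    Aborted = auto()


@dataclass
class DaqStatus:
    """Hypothetical status record: the error flag is independent of the state."""
    state: DaqState
    sub_state: DaqSubState
    error: bool = False

    def is_final(self) -> bool:
        # Completed and Aborted are the only final sub-states.
        return self.sub_state in (DaqSubState.Completed, DaqSubState.Aborted)


# A Data Acquisition may be Aborted with or without error.
aborted_clean = DaqStatus(DaqState.StateCompleted, DaqSubState.Aborted)
aborted_with_error = DaqStatus(DaqState.StateCompleted, DaqSubState.Aborted, error=True)
```

Keeping the error flag outside the state enumerations mirrors the orthogonal region in Fig. 10: any state can be combined with either value of the flag.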

The next sections provide further details of the Data Acquisition life-cycle and how to interact with OCM for control.

Data Validation

Although there is no general facility for validating all input data to detect problems early, there is support for using the Data Interface Dictionaries from Data Interface Tools [RD10] in daqOcmServer to validate FITS keywords as they are received in JSON format from the following sources:

Table 4 Places where keyword validation is made.

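+-----------------------------+-----------+----------------------------------------------------------+
| Command                     | Direction | Description                                              |
+=============================+===========+==========================================================+
| StartDaq() and StartDaqV2() | Request   | Request is rejected and Data Acquisition is not started. |
+-----------------------------+-----------+----------------------------------------------------------+
| UpdateKeywords()            | Request   | Request is rejected and Data Acquisition is not updated  |
|                             |           | with any keywords.                                       |
+-----------------------------+-----------+----------------------------------------------------------+
| metadaqif.MetaDaq.StopDaq() | Reply     | If invalid keywords are provided as part of the reply    |
|                             |           | structure DaqStopReply, from the metadaqif StopDaq()     |
|                             |           | request, this is treated as an error.                    |
+-----------------------------+-----------+----------------------------------------------------------+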

Keywords are validated and formatted against the configured dictionaries. If this fails, the command is rejected as a whole, without starting or modifying the Data Acquisition. Formatting uses the format provided in the dictionary; if no format is provided, the built-in standard format is used. Similarly, if a keyword is provided without a comment, the comment from the dictionary is used.

For configuration of dictionaries see dictionaries.

Note

If no dictionaries are configured, no validation or formatting is performed.
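
The validation and formatting rules above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the daqOcmServer implementation: the dictionary is modelled as a plain mapping from keyword name to a hypothetical `DictEntry`, the format is a Python format string, `DEFAULT_FORMAT` stands in for the built-in standard format, and a keyword missing from the dictionary is assumed to count as invalid.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_FORMAT = "{}"  # stand-in for the built-in standard format (assumption)


@dataclass
class DictEntry:
    """Hypothetical dictionary record for one keyword."""
    fmt: Optional[str] = None
    comment: Optional[str] = None


def validate_and_format(keywords, dictionary):
    """Return formatted keywords, or raise ValueError to reject the whole request."""
    if not dictionary:
        # No dictionaries configured: no validation or formatting.
        return list(keywords)
    result = []
    for kw in keywords:
        entry = dictionary.get(kw["name"])
        if entry is None:
            # Assumption: an unknown keyword makes the request invalid.
            raise ValueError(f"unknown keyword: {kw['name']}")
        fmt = entry.fmt or DEFAULT_FORMAT
        result.append({
            "name": kw["name"],
            "value": fmt.format(kw["value"]),
            # The dictionary comment is used only if none was provided.
            "comment": kw.get("comment") or entry.comment,
        })
    return result
```

Because the function raises before returning anything, a single invalid keyword rejects the request as a whole, which matches the all-or-nothing behaviour described for StartDaq() and UpdateKeywords().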

StateAcquiring

This section documents the details of the top-level StateAcquiring state of a Data Acquisition: the observable states, the transitions and how it relates to the OCM command interface described here.

Warning

Although the OCM API is unambiguous about which Data Acquisition to operate on (by requiring the user to provide the Data Acquisition identifier), the interface to detector data sources, recif, is not: it always operates on its current recording, whether that is the correct one or not. Users must therefore take extra care, e.g. by making sure that only one active Data Acquisition at a time uses the same primary data source.

Overview

The following diagram shows an overview for a nominal Data Acquisition where commands succeed:

Note

For brevity ForceAbortDaq() is not shown in the diagram, as it performs the same transitions as AbortDaq() except that it still performs the transition when an error occurs.

digraph Daq {
    # Config
    background = transparent;
    node [shape=Mrecord,fontname=helvetica,margin="0.22,0.09",fontsize=11];
    graph [fontname = "helvetica",nodesep=0.45,bgcolor=transparent];
    edge [fontname = "Lucida Console", fontsize=10, margin="0.23,0.09"];

    # States
    NotStarted [label="{NotStarted|\l}"];
    Aborting [label="{Aborting|\l}", style=dashed];

    # States
    Stopping [label="{Stopping|\l}", style=dashed];
    subgraph cluster_g {
        rankdir="LR";
        style=invis;
        Aborted [label="{Aborted|\l}", rank=max];
        Stopped [label="{Stopped|\l}", rank=min];
    }
    Starting [label="{Starting|\l}", style=dashed];
    Acquiring [label="{Acquiring|\l}"];

    # Transitions
    NotStarted -> Starting [label="StartDaq()"];
    NotStarted -> Aborted [label="AbortDaq()"];

    Starting -> Acquiring [label="Acquiring", style=dashed];
    Starting -> Aborting [label="AbortDaq()"];

    Acquiring -> Stopping [label="StopDaq()"];
    Acquiring -> Stopping [label="Stopping", style=dashed];
    Acquiring -> Aborting [label="AbortDaq()"];

    Stopping -> Stopping [label="StopDaq()"];
    Stopping -> Stopped [label="Stopped", style=dashed];
    Stopping -> Aborting [label="AbortDaq()"];

    Aborting -> Aborted [label="Aborted", style=dashed];
    Aborting -> Aborting [label="AbortDaq()"];
}

Fig. 11 Data Acquisition state and transition overview

The transition for the event Stopping is performed automatically by OCM when all primary data sources stop or if there are no stateful data sources[1]. If, for example, a detector is configured to integrate for 10 seconds, it stops automatically. OCM observes this and triggers the transition to the Stopping state, in which the metadata sources are stopped.
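
The transitions in Fig. 11 can also be read as a lookup table. The Python sketch below transcribes the diagram edges into a dictionary; the names `TRANSITIONS`, `next_state` and `run` are introduced for this illustration only and do not correspond to OCM code. Error handling and ForceAbortDaq() are left out, as in the diagram.

```python
# (current state, trigger) -> next state, transcribed from Fig. 11.
# Triggers ending in "()" are commands; the others are events.
TRANSITIONS = {
    ("NotStarted", "StartDaq()"): "Starting",
    ("NotStarted", "AbortDaq()"): "Aborted",
    ("Starting", "Acquiring"): "Acquiring",
    ("Starting", "AbortDaq()"): "Aborting",
    ("Acquiring", "StopDaq()"): "Stopping",
    ("Acquiring", "Stopping"): "Stopping",
    ("Acquiring", "AbortDaq()"): "Aborting",
    ("Stopping", "StopDaq()"): "Stopping",
    ("Stopping", "Stopped"): "Stopped",
    ("Stopping", "AbortDaq()"): "Aborting",
    ("Aborting", "Aborted"): "Aborted",
    ("Aborting", "AbortDaq()"): "Aborting",
}


def next_state(state: str, trigger: str) -> str:
    """Look up the next state; raise ValueError for transitions not in the diagram."""
    try:
        return TRANSITIONS[(state, trigger)]
    except KeyError:
        raise ValueError(f"{trigger} is not valid in state {state}") from None


def run(triggers, state: str = "NotStarted") -> str:
    """Apply a sequence of triggers and return the resulting state."""
    for trigger in triggers:
        state = next_state(state, trigger)
    return state
```

For example, `run(["StartDaq()", "Acquiring", "Stopping", "Stopped"])` follows the path of a detector that stops by itself, where the Stopping event replaces an explicit StopDaq() command.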

Starting

The following diagram shows the initial states of a Data Acquisition created with StartDaq(). The initial state NotStarted is the point where the Data Acquisition has been created and registered internally in OCM but has not yet initiated any actions. When ready, OCM transitions to the state Starting, in which all data sources are requested to start their data acquisition. When all sources have acknowledged successfully, the Data Acquisition transitions to Acquiring.

Note

StartDaq() creates a new data acquisition and starts it in one command, so the state NotStarted is never observed from outside. If a use-case requires it this can be changed to a two-step command, one that creates the Data Acquisition and one command that starts it.

digraph Daq {
    # Config
    node [shape=Mrecord,fontname=helvetica,margin="0.22,0.09",fontsize=11];
    graph [fontname = "helvetica",nodesep=0.45,bgcolor=transparent];
    edge [fontname = "Lucida Console", fontsize=10, margin="0.23,0.09"];

    # States
    NotStarted [label="{NotStarted|\l}"];
    Starting [label="{Starting|\l}", style=dashed];
    Acquiring [label="{Acquiring|\l}"];

    # Transitions
    NotStarted -> Starting [label="StartDaq()"];
    Starting -> Acquiring [label="Acquiring", style=dashed];
}

Important

If StartDaq() does not succeed, the user should clean up the failed Data Acquisition by aborting it. This aborts any partially started acquisitions for the configured sources. If a data source is not responding or otherwise reports an error, AbortDaq() will fail. In these cases ForceAbortDaq() can be used to force the transition to Aborted.
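
The clean-up sequence described above could look roughly like the following Python sketch. The `client` object with `start_daq`, `abort_daq` and `force_abort_daq` methods, and the `DaqError` exception, are hypothetical stand-ins for whatever client binding is in use; they are not the actual OCM interface.

```python
class DaqError(Exception):
    """Hypothetical error type raised when a command fails."""


def start_or_clean_up(client, daq_id: str) -> bool:
    """Start a Data Acquisition; on failure abort it, forcing if necessary.

    Returns True if the acquisition was started, False if it was cleaned up.
    """
    try:
        client.start_daq(daq_id)
        return True
    except DaqError:
        pass  # Fall through to clean up the partially started acquisition.
    try:
        client.abort_daq(daq_id)
    except DaqError:
        # A data source did not respond or reported an error.
        client.force_abort_daq(daq_id)
    return False
```

Whether forcing should happen automatically, as in this sketch, or only after operator confirmation is a design decision for the client, since a forced abort may leave data sources in need of manual recovery.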

Stopping

The following diagram shows the states from which StopDaq() and ForceStopDaq() are valid. The diagram shows only StopDaq() but applies to ForceStopDaq() as well.

The only difference with the ForceStopDaq() command is that the transition to Stopped is performed forcefully, even in the presence of errors from e.g. data sources.

Note

If StopDaq() fails the Data Acquisition remains in Stopping state. At this point it is possible to retry StopDaq() or force it with ForceStopDaq().

Important

Since ForceStopDaq() stops a Data Acquisition even if data sources fail to stop, the user might have to perform manual error recovery on the faulty components.

digraph Daq {
    # Config
    node [shape=Mrecord,fontname=helvetica,margin="0.22,0.09",fontsize=11];
    graph [fontname = "helvetica", bgcolor=transparent, nodesep=1];
    edge [fontname = "Lucida Console", fontsize=10, margin="0.53,0.09"];

    # States
    Acquiring [label="{Acquiring|\l}"];
    Stopping [label="{Stopping|\l}", style=dashed];
    Stopped [label="{Stopped|\l}"];

    # Transitions
    Acquiring -> Stopping [label="StopDaq()"];
    Stopping -> Stopping [label="StopDaq()"];
    Stopping -> Stopped [label="Stopped", style=dashed];
}
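
One way a client might combine retrying StopDaq() with a final ForceStopDaq() is sketched below in Python. The `client` methods `stop_daq` and `force_stop_daq`, the `DaqError` exception and the retry count are assumptions for this illustration and are not part of the documented interface.

```python
class DaqError(Exception):
    """Hypothetical error type raised when a command fails."""


def stop_with_fallback(client, daq_id: str, retries: int = 2) -> str:
    """Try StopDaq up to `retries` times, then fall back to ForceStopDaq.

    Returns "stopped" if a normal stop succeeded and "forced" otherwise.
    """
    for _ in range(retries):
        try:
            client.stop_daq(daq_id)
            return "stopped"
        except DaqError:
            # The Data Acquisition remains in Stopping, so a retry is allowed.
            continue
    # Forcing accepts that some data sources may not have stopped cleanly.
    client.force_stop_daq(daq_id)
    return "forced"
```

A return value of "forced" is a signal that manual error recovery on the faulty components may be needed and that the resulting Data Product may be degraded.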

Aborting

The following diagram shows the states from which AbortDaq() and ForceAbortDaq() are valid. The diagram shows only AbortDaq() but applies to ForceAbortDaq() as well.

The only difference with the ForceAbortDaq() command is that the transition to Aborted is performed forcefully, even in the presence of errors from e.g. data sources.

Note

If AbortDaq() fails the Data Acquisition remains in Aborting state. At this point it is possible to retry AbortDaq() or force it with ForceAbortDaq().

Important

Since ForceAbortDaq() aborts a Data Acquisition even if data sources fail to abort, the user might have to perform manual error recovery on the faulty components.

digraph Daq {
    # Config
    background = transparent;
    node [shape=Mrecord,fontname=helvetica,margin="0.22,0.09",fontsize=11];
    graph [fontname = "helvetica",nodesep=0.45,bgcolor=transparent];
    edge [fontname = "Lucida Console", fontsize=10, margin="0.23,0.09"];

    # States
    NotStarted [label="{NotStarted|\l}"];
    Aborting [label="{Aborting|\l}", style=dashed];

    # States
    Stopping [label="{Stopping|\l}", style=dashed];
    subgraph cluster_g {
        rankdir="LR";
        style=invis;
        Aborted [label="{Aborted|\l}"];
    }
    Starting [label="{Starting|\l}", style=dashed];
    Acquiring [label="{Acquiring|\l}"];

    # Transitions
    NotStarted -> Starting [ style=invisible, arrowhead=none];
    NotStarted -> Aborted [label="AbortDaq()"];

    Starting -> Acquiring [style=invisible, arrowhead=none];
    Starting -> Aborting [label="AbortDaq()"];

    Acquiring -> Stopping [style=invisible, arrowhead=none];
    Acquiring -> Stopping [style=invisible, arrowhead=none];
    Acquiring -> Aborting [label="AbortDaq()"];

    Stopping -> Aborting [label="AbortDaq()"];

    Aborting -> Aborted [label="Aborted", style=dashed];
    Aborting -> Aborting [label="AbortDaq()"];
}

StateMerging

This section documents the details of the top-level StateMerging state of a Data Acquisition. Compared to StateAcquiring, the successful sequence is autonomous and can complete unattended. The exception is when the Data Acquisition should be aborted, in which case the request AbortDaq() can be sent to OCM, which delegates to DPM if required. If a failure occurs, e.g. because a misconfiguration causes a file transfer to fail, DPM will try again the next time it is started, as well as when requested via the command RetryMergeDaq().

Note

It is worth clarifying that clients are not required and are never expected to have to interact directly with DPM.

Overview

digraph Daq {
    # Config
    background = transparent;
    node [shape=Mrecord,fontname=helvetica,margin="0.22,0.09",fontsize=11];
    graph [fontname = "helvetica",nodesep=0.45,bgcolor=transparent, ordering=out];
    edge [fontname = "Lucida Console", fontsize=10, margin="0.23,0.09"];

    # States
    NotScheduled [label="{NotScheduled|\l}"];
    Collecting [label="{Collecting|\l}"];
    Scheduled [label="{Scheduled|\l}"];
    Merging [label="{Merging|\l}"];
    Releasing [label="{Releasing|\l}"];

    Aborting [label="{Aborting|\l}", style=dashed];

    subgraph cluster_g {
        rankdir="LR";
        style=invis;
        subgraph cluster_g {
            rankdir="TB";
            Aborted [label="{Aborted|\l}"];
        }
        Completed [label="{Completed|\l}", rank=min];
    }

    # Transitions
    NotScheduled -> Scheduled [label="Unspecified()", fontname="Lucida Console Italic"];
    NotScheduled -> Aborting [label="AbortDaq()"];

    Scheduled -> Collecting [label="Initiate"];
    Scheduled -> Aborting [label="AbortDaq()"];

    Collecting -> Merging [label="Collecting complete"];
    Collecting -> Aborting [label="AbortDaq()"];

    Merging -> Releasing [label="Merge complete"];
    Merging -> Aborting [label="AbortDaq()"];

    Releasing -> Aborting [label="AbortDaq()"];
    Releasing -> Completed [label="Release complete"];
    Aborting -> Aborted;
}

Fig. 12 Data Acquisition state and transition overview for StateMerging

NotScheduled

OCM will attempt to schedule the Data Acquisition for merging. If DPM is offline or otherwise unreachable, the Data Acquisition remains in this state.

As the Data Acquisition has not yet been scheduled, it can be aborted without a connection to daqDpmServer.

Scheduled

From this point on, DPM is responsible for completing the Data Acquisition. The authoritative Data Acquisition status originates from DPM, but is still published by OCM.

If a request to abort the Data Acquisition is made, the normal behaviour is to forward the request to DPM. If DPM is offline, the Data Acquisition can only be aborted with ForceAbortDaq(), but this will be unknown to DPM:

Warning

There is a risk of Data Acquisition state inconsistency if the Data Acquisition is forcibly aborted. Because DPM is offline or unreachable, it may complete the merge process independently of OCM. The Data Acquisition status may therefore be inconsistent, or may change once new information from DPM is available again.
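
The abort handling for a scheduled Data Acquisition can be summarised as a small decision function. The Python sketch below is illustrative only: the function name, its boolean parameters and the returned strings are invented for this example, and it assumes that a forced abort with DPM online is forwarded like a normal one.

```python
def abort_scheduled(dpm_online: bool, force: bool) -> str:
    """Decide how an abort request is handled once a Data Acquisition is Scheduled.

    Returns a short description of the action; raises RuntimeError if the
    request cannot be honoured.
    """
    if dpm_online:
        # Normal behaviour: the request is forwarded to DPM.
        return "forward to DPM"
    if force:
        # DPM is not informed; status may change when DPM is reachable again.
        return "abort in OCM only"
    raise RuntimeError("DPM is offline: only ForceAbortDaq() can abort")
```

The "abort in OCM only" branch is the one that carries the inconsistency risk described in the warning above.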

Collecting

Files are collected from where they were created into the local daqDpmServer workspace. At this time no optimization is implemented for the case where a file is already available from a local file system mount.

Merging

This is the state where the final Data Product is created from all the previously acquired data. An overview of that process is provided in Data Product Creation.

Releasing

The completed Data Product is released to the configured receivers (cf. receivers in StartDaqV2Specification {JSON}). If a transfer fails, RetryMergeDaq() can be issued to retry the failed transfers, presumably after correcting the underlying cause.

Changed in version 3.2.0: Failure to release the Data Product is treated as an error and the Data Acquisition will remain in state DaqSubState.Releasing.

Note

If daqDpmServer is deployed on the same host as OLAS and host is empty, daqDpmServer will try to create a hard link of the Data Product to the OlasReceiver {JSON} path. If this fails, a symlink is created instead. If this also fails, a copy is attempted.
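
The fallback order described in this note (hard link, then symlink, then copy) can be sketched with the Python standard library as follows. This illustrates the general technique and is not the daqDpmServer implementation; the function name `release_locally` and its return values are assumptions made for the example.

```python
import os
import shutil


def release_locally(src: str, dst: str) -> str:
    """Place `src` at `dst`, preferring a hard link, then a symlink, then a copy.

    Returns the method that succeeded: "hardlink", "symlink" or "copy".
    """
    try:
        os.link(src, dst)
        return "hardlink"
    except OSError:
        pass  # e.g. source and destination are on different file systems
    try:
        # Use an absolute target so the link does not depend on dst's directory.
        os.symlink(os.path.abspath(src), dst)
        return "symlink"
    except OSError:
        pass  # e.g. symlinks are not permitted on the destination
    shutil.copy2(src, dst)
    return "copy"
```

Hard links are tried first because they avoid duplicating data and stay valid if the source path is later removed, whereas a symlink breaks when its target disappears.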

Completed

This is the end of the Data Acquisition life-cycle; no activities are performed in this state.