MONTHLY-2004-06 Information

Integration Release R3.0

The software has been tagged with MONTHLY-2005-09-ITER-21

These are the patches we received and integrated subsystem per subsystem:

SUBSYSTEM NAME	TAG (corresponding to the last bug fixing)
ACS	MONTHLY-2005-09-2
ARCHIVE	MONTHLY-2005-09-5
CONTROL	MONTHLY-2005-09-4
CORR	MONTHLY-2005-09-12
EXEC	MONTHLY-2005-09-3
OFFLINE	MONTHLY-2005-09-3
PIPELINE	MONTHLY-2005-09-2
TELCAL	MONTHLY-2005-09-10
ICD/CONTROL	MONTHLY-2005-09-2
ICD/CORR	MONTHLY-2005-09-2
ICD/EXEC	MONTHLY-2005-09-1
ICD/HLA	MONTHLY-2005-09-1
ICD/OFFLINE	MONTHLY-2005-09-2

0. Comments

PSI wrote:

“Here are some problems we encountered during the latest release integration:

- Archive: a couple of the usual problems are addressed in SPR ALMASW2005103. Another problem is that any archive operation can fail because of "java.lang.out.of.memory", which appears in the file $ACSDATA/tmp/catalina.out. Simon tried to solve it by increasing the JVM heap settings in the archiveStart command, which changed from: export JAVA_OPTS="-DACS.data=$ACSDATA" to: export JAVA_OPTS="-DACS.data=$ACSDATA -Xms128m -Xmx512m", but I think the "java.lang.out.of.memory" error still appears even after that modification.

- Another problem, which may be due to ACS and/or to the way components are implemented within the different subsystems, is addressed in SPR ALMASW2005104.

- A problem related to CORR (RTAI part) is the crash of the CorrSimulator container, which happens when one tries to bring the CORR subsystem from shutdown to init. It is enough to restart the CorrSimulator container, after which the initialization normally works. The CORR people are aware of this (in the sense that they issued a warning about it in their release notes for R3).

- One of the major difficulties during integration tests is that, if something goes wrong after the startup of the system and we have to shut down one or more containers (or the containers crash by themselves), it is then almost impossible to bring the system back to a perfectly working state. Normally, to be sure, we have to shut everything down and restart everything again. In other words, it seems that we are obliged to always follow the entire (long) startup sequence; otherwise we cannot be sure of the results.

Starting from R3, there are still many possible areas of investigation, for example how the communication works among QL, DC and TELCAL (how IDs are passed around, such as execblock IDs, scan and subscan numbers, number of rows in tables, which tables..., and where they are retrieved from). (In principle, CORR is involved too, but we never saw it during R3 because of missing code from CONTROL.) Another possibility: prepare many different types of SBs (very simple, very complicated, containing something wrong or not) and see how the system (basically control and scheduling) reacts.”
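To check whether the archive out-of-memory problem described in the comments above is still occurring after the JAVA_OPTS change, one could scan catalina.out for the error. The following is only a minimal sketch: it assumes the message in the log is the standard JVM text java.lang.OutOfMemoryError, and the helper names (count_oom_lines, catalina_log_path) are hypothetical, not part of ACS or ARCHIVE.

```python
import os
import tempfile

# Assumption: the JVM reports heap exhaustion as java.lang.OutOfMemoryError.
OOM_MARKER = "java.lang.OutOfMemoryError"


def catalina_log_path(acsdata):
    """Build the log location quoted in the comments: $ACSDATA/tmp/catalina.out."""
    return os.path.join(acsdata, "tmp", "catalina.out")


def count_oom_lines(log_path):
    """Return the number of lines in log_path that mention OOM_MARKER."""
    count = 0
    with open(log_path, "r", errors="replace") as log:
        for line in log:
            if OOM_MARKER in line:
                count += 1
    return count


if __name__ == "__main__":
    # Demo on a throwaway directory; the log lines below are invented.
    with tempfile.TemporaryDirectory() as acsdata:
        os.makedirs(os.path.join(acsdata, "tmp"))
        path = catalina_log_path(acsdata)
        with open(path, "w") as fh:
            fh.write("INFO: archive started\n")
            fh.write("java.lang.OutOfMemoryError: Java heap space\n")
        print("OutOfMemoryError lines:", count_oom_lines(path))
```

On a real installation one would pass the value of the ACSDATA environment variable to catalina_log_path instead of the temporary directory.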

1. Subsystems SLOC:

The data reported here are compared with the previous release (March 2005 – RELEASE R2.1)

- on all the directories

- on src only

- on test only

- percentage test/src
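The test/src percentage in the last item is simply the test SLOC expressed as a fraction of the src SLOC. A minimal sketch, with a hypothetical function name and invented figures (not taken from any subsystem):

```python
def percent_test_over_src(src_sloc, test_sloc):
    """Return the test SLOC as a percentage of the src SLOC."""
    if src_sloc <= 0:
        raise ValueError("src_sloc must be positive")
    return 100.0 * test_sloc / src_sloc


# Invented figures, for illustration only.
example_src = 20000
example_test = 5000
print(f"{percent_test_over_src(example_src, example_test):.1f}%")  # prints 25.0%
```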

SLOC detailed figures:

ACS

2. In-line Documentation

Assessment of the sufficiency of Doxygen-like in-line documentation: → Graph

3. Unit Tests

See the log at this link.

4. Test Coverage

There is a results page for each subsystem (see the list below). Each results page contains a table with three columns:

- the first one lists the modules belonging to the subsystem,

- the second gives the results of the tests (PASSED, FAILED, or UNDETERMINED when the result is neither PASSED nor FAILED); sometimes the test directory is missing, in which case you will just see the message: "No test directory, nothing to do here".

- the third one gives a summary produced by Purify/Coverage reporting the analysis results on:

Functions

Functions and exits

Statement blocks

Implicit blocks

Decisions

Loops

The values reported for every item in the list above give the number of hits for that item.

In the same cell as the summary there is a link to the "Complete Report" produced by Purify. The Complete Report shows the lines where each hit happened. For a loop, it also gives values for three cases: 0 (the loop is reached but its body is never entered), 1 (the loop body is executed once), and 2+ (the loop body is executed two or more times).
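The three loop cases can be illustrated with a small sketch. This is only an illustration of the 0 / 1 / 2+ classification as read from the description above, not the tool's own algorithm or output format; the function names are hypothetical.

```python
def loop_bucket(iterations):
    """Map the number of times a loop body ran to the 0 / 1 / 2+ classes."""
    if iterations < 0:
        raise ValueError("iterations cannot be negative")
    if iterations == 0:
        return "0"
    if iterations == 1:
        return "1"
    return "2+"


def count_body_runs(items):
    """Run a loop over items and count how many times its body executes."""
    runs = 0
    for _ in items:
        runs += 1
    return runs


# One input per class: empty, single element, several elements.
for sample in ([], ["a"], ["a", "b", "c"]):
    print(len(sample), "->", loop_bucket(count_body_runs(sample)))
```

A test suite that exercises a loop with only one kind of input would hit just one of these classes, which is why the report lists them separately.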

Sometimes, instead of the summary, you will see a message like:

ERROR: No coverage info available
No atlout.spt

This happens when the modular test of that module is only a placeholder (for example, the test is actually just an "echo something"), so there is no code (Java, Python or C++) that can be instrumented.

ACS

5. Java Duplicated Classes

See the list at this link.

6. Channels and Events

See details at this link.