Common DFOS tools:
|dfos = Data Flow Operations System, the common tool set for DFO|
Rollback of the change in v2.2 (only RAWFILEs from ABs with status != CREATED are counted): executed AND created ABs are counted, to include the science associations.
|Note: before v3.0, the 'ALL' content in daily_stats was calculated from the modes. If the mode configuration was incomplete, the unconfigured modes were never counted. This systematic error is eliminated with v3.0 where the mode distinction has been terminated and ALL files are properly counted.|
This tool is used to extract statistics for the daily DFO workflow and to create monthly overviews of the statistics. In its standard mode (extractStat -d), it is usually called from finishNight. There is also an interactive mode (extractStat -i) which can be invoked for quick queries, and a mode -I (extractStat -I) which provides a statistics overview over three-month intervals.
The tool extracts statistical data for the following metrics:
With version 3.0, file-related parameters are provided only for the whole instrument, and no longer per user-specified instrument mode (e.g. DET, IMG, LSS, IFU) as before.
The tool is set up such that it overwrites existing entries for a given date, so it can be invoked multiple times for the same date. Entries are usually filled in the background by finishNight, which wraps extractStat.
All entries are written into the QC1 database tables daily_stat and monthly_stat. The QC1 database tables are visible under http://www.eso.org/qc/WISQ/QC1_DB_wisq.html .
The report mode is called interactively by 'extractStat -i'. There are 2 different kinds of reports:
|1||monthly report||all data aggregated for the specified month|
|2||trimonthly report||selected parameters for three months|
The results are read from the database and are identical to those you would see through the DFO database access interface, http://www.eso.org/qc/WISQ/QC1_DB_wisq.html. These options are offered for convenience only.
File size and numbers.
Raw files: The number of raw frames is extracted from executed ABs, which effectively means that only raw frames that have been processed are counted. Their size is extracted from the ngas database. This mechanism does not require raw fits files to be physically stored under $DFO_RAW_DIR.
Product files: If products have already been ingested (the tool evaluates $DFO_MON_DIR/DFO_STATUS file for flags cal_Ingested and sci_Ingested), their number is read from ABs, and their size is read from the ngas database. In that case, no fits file is required to be physically stored under $DFO_CAL_DIR.
If products have not been ingested, their number and size are extracted from the content of $DFO_CAL_DIR. In that case, any file not present there is not counted.
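The two counting paths for products can be sketched as follows. This is an illustration only, not the extractStat implementation: the function names, the way the ingestion flag is passed in, and the `lookup_ingested` callback (standing in for the AB and ngas database queries) are hypothetical. The conversion of 1 MB = 1024 x 1024 bytes is also an assumption.

```python
import os

def count_local_products(cal_dir):
    """Fallback path: count every file physically present below cal_dir.

    Returns (number_of_files, total_size_in_MB).
    Assumption: 1 MB = 1024 * 1024 bytes.
    """
    n_files = 0
    n_bytes = 0
    for root, _dirs, files in os.walk(cal_dir):
        for name in files:
            n_files += 1
            n_bytes += os.path.getsize(os.path.join(root, name))
    return n_files, n_bytes / 1024.0 / 1024.0

def product_stats(ingested, cal_dir, lookup_ingested):
    """Choose the counting path depending on the ingestion flag.

    ingested        -- True if the products were already ingested
    lookup_ingested -- hypothetical callable returning (count, size_MB)
                       from the ABs and the archive database
    """
    if ingested:
        return lookup_ingested()
    return count_local_products(cal_dir)
```

In the fallback path nothing is known about files that have already been removed from the directory, which is why such files are missing from the statistics.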
ABs: AB numbers are counted in the $DFO_LOG_DIR tree (the final storage, after executing moveProducts). Two parameters are measured: the EXECUTED ABs (the ones successfully processed), and the created ABs. Their difference effectively counts the number of science ABs (plus a small bias introduced by unsuccessful ABs). Although not currently processed, the science ABs have a value of their own since they are stored in the calSelector database. (This was different before period 88, 2011-10, when CALIB and SCIENCE ABs were both executed and accounted for.)
Note that we effectively count AB_detector jobs, i.e. one AB per detector. This is trivially the case for instruments like FORS2 or VIMOS (having one raw file per detector and therefore always one AB per detector). For instruments like CRIRES, HAWKI or VIRCAM, there is the configuration key MEF_FACTOR. It either takes into account that the AB is split during execution time into MEF_FACTOR jobs (this is the case for VIRCAM, MEF_FACTOR=16, and OMEGACAM, 32), or artificially accounts for a proper normalization of the N_AB parameter if the AB is executed sequentially for all detectors (like for CRIRES, MEF_FACTOR=3, and HAWKI, MEF_FACTOR=4).
The execution time for ABs is measured by adding up the TEXEC values from the ABs. The execution time for QC reports is calculated in a similar way, using TQCEXEC.
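As a rough sketch of this summation, the snippet below adds up one key over a set of AB contents. The AB layout assumed here (one "KEY value" pair per line, separated by whitespace) and the assumption that the values are in seconds are not confirmed by this page; the function name is hypothetical.

```python
def total_exec_minutes(ab_texts, key="TEXEC"):
    """Sum the values of `key` over several AB contents, returned in minutes.

    Assumptions (not confirmed by the tool documentation):
    - each AB is plain text with whitespace-separated "KEY value" lines
    - the stored values are in seconds
    """
    total_seconds = 0.0
    for text in ab_texts:
        for line in text.splitlines():
            parts = line.split()
            if len(parts) >= 2 and parts[0] == key:
                try:
                    total_seconds += float(parts[1])
                except ValueError:
                    # Skip entries whose value is not numeric
                    pass
    return total_seconds / 60.0
```

Under the same assumptions, calling the function with `key="TQCEXEC"` would give the QC report execution time.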
Statistics are written into the QC1 database tables daily_stat and monthly_stat. The local table $DFO_MON_DIR/STATISTICS_DAILY acts as a backup repository.
Type extractStat -h for on-line help, extractStat -v for the version number,
extractStat -d <DATE>
to extract statistics for <DATE> (with update of graphical reports),
to generate reports for the monthly overview per instrument,
for the three-month report for ALL instruments.
|Section 1: general|
|TOOL_MODE||AUTO | INTER||INTER: ask for confirmation before writing statistics; AUTO: do not ask for confirmation|
|MEF_FACTOR||e.g. 1 or 4||for MEF instruments: multiplex factor, to provide correct N_AB; default: 1|
|Section 2: instrument modes|
Obsolete with v3.0
The following parameters are derived by extractStat and inserted into daily_stat, per instrument mode:
|Column name||Description||Format||Example entry|
|civil_date||DFO date (year-month-day)||YYYY-MM-DD||2005-02-09|
|instr_mode||always ALL (was: instrument mode)||char||DET|
|N_ACQ_RAW||Number of (raw) acquisition frames processed||integer||23|
|MB_ACQ_RAW||Total size (in Mbytes) of acquisition frames processed||float||0.21|
|N_CAL_RAW||Number of raw calibration frames processed||integer||103|
|MB_CAL_RAW||Total size (in Mbytes) of raw calibration frames processed||float||660.1|
|N_CAL_PRO||Number of calibration products created||integer||55|
|MB_CAL_PRO||Total size (in Mbytes) of calibration products||float||105.3|
|N_SCI_RAW||Number of raw science frames processed (= 0 with QC XXLight)||integer||23|
|MB_SCI_RAW||Total size (in Mbytes) of raw science frames processed (= 0 with QC XXLight)||float||276.4|
|N_SCI_PRO||Number of science products created (= 0 with QC XXLight)||integer||46|
|MB_SCI_PRO||Total size (in Mbytes) of science products created (= 0 with QC XXLight)||float||55.9|
|N_AB||Number of det.ABs created (CAL and SCI); det.ABs are detector ABs, i.e. N_AB times MEF_FACTOR as configured||integer||26|
|N_EXEC||Number of successfully (pipeline-) processed det.ABs||integer||22|
|T_AB_EXE||Total pipeline execution time (in minutes) for processed det.ABs||float||8.5|
|T_QC_EXE||Total QC report execution time (in minutes) for processed det.ABs||float||5.6|
The database table daily_stat has one row per instrument mode (only ALL after 2013-03) and day. One row with mode = ALL_INS_SUM is added with sums over all modes.
The table monthly_stat has the same column names and formats as daily_stat, plus summary values N_ALL_RAW (=ACQ+CAL+SCI), N_ALL_PRO (=CAL+SCI), GB_ALL_RAW (=CAL+SCI), GB_ALL_PRO (=CAL+SCI). Total file sizes are in GBytes (instead of MBytes), AB and QC execution times are in hours.
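The relation between daily rows and the monthly summary values can be illustrated with the following sketch. It uses the column names and the sums as listed above; the function name, the representation of rows as dictionaries, and the conversion factor of 1024 MB per GB are assumptions made for the example.

```python
def monthly_summary(daily_rows):
    """Aggregate daily_stat-like rows (dicts) into monthly summary values.

    Assumptions: 1 GB = 1024 MB; missing columns count as 0.
    """
    def total(col):
        return sum(row.get(col, 0) for row in daily_rows)

    return {
        # File numbers: raw = ACQ + CAL + SCI, products = CAL + SCI
        "N_ALL_RAW": total("N_ACQ_RAW") + total("N_CAL_RAW") + total("N_SCI_RAW"),
        "N_ALL_PRO": total("N_CAL_PRO") + total("N_SCI_PRO"),
        # File sizes: daily values are in MB, monthly values in GB
        "GB_ALL_RAW": (total("MB_CAL_RAW") + total("MB_SCI_RAW")) / 1024.0,
        "GB_ALL_PRO": (total("MB_CAL_PRO") + total("MB_SCI_PRO")) / 1024.0,
        # Execution times: daily values are in minutes, monthly values in hours
        "T_AB_EXE": total("T_AB_EXE") / 60.0,
        "T_QC_EXE": total("T_QC_EXE") / 60.0,
    }
```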
The primary source of statistics is the set of ABs within the $DFO_LOG_DIR/<DATE> directories. As a general rule, each raw or product file is counted only once. The following table summarises the implemented rules for counting files and ABs:
|Acquisition data||was counted before v2.2, not anymore|
|Raw calibrations||Every raw CALIB frame from RAWFILE section in an executed calibration AB is counted (before 2011-10-01: every associated and packed raw file)|
|Calibration products||Every ingested CALIB product is counted; if not yet ingested, every product file under $DFO_CAL_DIR (no matter if fits or hdr) is counted|
|Raw science data||Every raw SCIENCE frame from RAWFILE section in an executed AB is counted (before 2011-10-01: every associated and packed raw file)|
|Science data products||none (if existing, every ingested SCIENCE product would be counted)|
|Association blocks||All ABs in $DFO_LOG_DIR/<DATE> are counted and multiplied by MEF_FACTOR; ABs with PROCESS_STATUS!=CREATED are counted as pipeline executed; execution times are collected from the TEXEC key in the ABs|
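The AB counting rule in the last row can be written down as a short sketch. It assumes that the PROCESS_STATUS values of all ABs of one date have already been collected into a list, and that N_EXEC is scaled by MEF_FACTOR in the same way as N_AB (both are described as det.AB counts). The function name and the status value "SUCCESSFUL" used in the usage example are hypothetical.

```python
def count_abs(statuses, mef_factor=1):
    """Derive N_AB and N_EXEC from the PROCESS_STATUS values of one date.

    statuses   -- one status string per AB found for the date
    mef_factor -- multiplex factor for MEF instruments (default 1)

    Assumption: N_EXEC is multiplied by mef_factor like N_AB.
    """
    n_all = len(statuses)
    # Every AB whose status differs from CREATED counts as pipeline executed
    n_executed = sum(1 for status in statuses if status != "CREATED")
    return {"N_AB": n_all * mef_factor, "N_EXEC": n_executed * mef_factor}
```

For example, `count_abs(["CREATED", "SUCCESSFUL", "SUCCESSFUL"], mef_factor=4)` gives an N_AB of 12 and an N_EXEC of 8 under these assumptions.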
|Last update: April 26, 2021 by rhanusch|