Early Science Packaging Script

The latest version of the packaging script has been committed to CVS here: AIV/science/qa2/QA2_Packaging_module.py

The contents of the package should be consistent with the guidelines described here: http://almasw.hq.eso.org/almasw/bin/view/Archive/Cycle01database

A pdf document with a more detailed description of the packaging script: documentation.pdf

Running the script

The script should be run from within CASA, from any directory. Import it with

from QA2_Packaging_module import *

Then run the script with

QA_Packager(origpath='/pathToMySB',readme='./README.header.txt',packpath='./2011.0.00XXX.S',
append='',mode='fake')

  • origpath should be the path to the reduction directory for a particular SB. Several subdirectories for EBs could exist within this directory.
  • packpath: the path to the destination folder (which should have the project code as name)
  • paths should not end in /
  • readme: The path to an ascii file with the text of the README header (for a template see this link at ESO or find it in the CVS repository here: AIV/science/qa2/README.header.txt, or use the text at the bottom of this page)
  • mode: The copying mode:
    • mode = 'fake' : This is the default. The script creates empty (i.e., dummy) files in the destination folder.
    • mode = 'copy' : The files are copied in the normal way.
    • mode = 'hard' : The script generates hard links in the destination folder. This way, the file names in both the origin and the destination folders refer to the same physical locations (i.e., inodes) on the disk. This is the recommended way of packaging.
    • BEFORE PACKAGING WITH 'HARD', type:
    • execfile('/diska/work/software/jao-mirror/AIV/science/qa2/QA2_Packaging_module.py')

      Then run the script with

      QA_Packager(origpath='/pathToMySB',readme='./README.header.txt',packpath='./2011.0.00XXX.S',
      append='',mode='hard')
    • mode = 'move' : The files are moved from the origin to the destination. Then, symbolic links are created at the origin. The symbolic links are never made to whole folders, but only to files (to avoid, for instance, an accidental deletion of whole folders at the destination path by removing the content of a linked folder).
    • mode = 'ticket' : Similar to 'fake', but the files to be added to the JIRA ticket are hard-linked. This way, the packager will generate valid ticket tar/zip files, but the measurement sets and tables will not be copied. Recommended for creating the JIRA ticket file.
  • append: The appending mode (see below).
    • append = '' : This is the default. It removes any previous data at the destination folder before the packaging.
    • append = 'group' : The script appends a new group id to the destination folder (so the other groups, if any, are not removed).
    • append = 'member' : The script appends new member id(s) to the group with highest id (so the other groups, if any, and the other member ids, if any, are not removed).
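
The difference between the copying modes can be illustrated with a minimal sketch. This is not the code of QA2_Packaging_module.py: the helper name place_file is hypothetical, it handles a single file only, and it assumes a filesystem that supports hard and symbolic links (origin and destination on the same disk for 'hard').

```python
import os
import shutil

def place_file(src, dst, mode):
    """Hypothetical helper: put one file at dst following the described mode."""
    if mode == 'fake':
        open(dst, 'w').close()                  # empty dummy file
    elif mode == 'copy':
        shutil.copy2(src, dst)                  # independent copy of the data
    elif mode == 'hard':
        os.link(src, dst)                       # second name for the same inode
    elif mode == 'move':
        shutil.move(src, dst)                   # data now lives at the destination
        os.symlink(os.path.abspath(dst), src)   # file-level link left at the origin
    else:
        raise ValueError('unknown mode: %r' % mode)
```

After a 'hard' placement, os.stat(src).st_ino and os.stat(dst).st_ino are equal, which is why this mode needs no extra disk space for the linked files.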

Examples

For example, project 229 contains four SBs, one of which is called 'IRC10216_setup3_b6_run_x2'. To package it, you can run:

QA_Packager(origpath='/scratch04/arcproc/project229/IRC10216_setup3_b6_run_x2',readme='./README.header.txt',
packpath='./2011.0.00229.S',append='',mode='fake')

This will create a dummy directory structure 2011.0.00229.S. It is recommended to run it in this mode first. If everything looks OK, run it again with mode='hard', which is the recommended mode for packaging. The fake mode is very fast, but the 'hard' mode may take a while because split will have to be run several times to split out the relevant data columns from .split and .split.cal files.
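
Before launching either run, the arguments can be sanity-checked against the rules listed under "Running the script". The sketch below is only an illustration: check_packager_args is a hypothetical helper, not part of the packaging module, and the regular expression for the project code is an assumption based on the example '2011.0.00229.S'.

```python
import re

VALID_MODES = ('fake', 'copy', 'hard', 'move', 'ticket')
VALID_APPEND = ('', 'group', 'member')
# Assumed shape of a project code such as 2011.0.00229.S
PROJECT_CODE = re.compile(r'^\d{4}\.\d\.\d{5}\.[A-Z]$')

def check_packager_args(origpath, packpath, mode='fake', append=''):
    """Return a list of problems found in the arguments (empty if none)."""
    problems = []
    for name, path in (('origpath', origpath), ('packpath', packpath)):
        if path.endswith('/'):
            problems.append('%s should not end in /' % name)
    if mode not in VALID_MODES:
        problems.append('unknown mode %r' % mode)
    if append not in VALID_APPEND:
        problems.append('unknown append value %r' % append)
    # The destination folder should be named after the project code.
    code = packpath.rstrip('/').split('/')[-1]
    if not PROJECT_CODE.match(code):
        problems.append('packpath folder %r does not look like a project code' % code)
    return problems
```

An empty list means the arguments follow the documented conventions; it says nothing about whether the paths exist.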

The script will also produce two files with extension 'ticket.tar' and 'ticket.zip'. These are the files that can be attached to the data reduction ticket. They contain the README, the qa2 files, the scripts, and the expected size of the package. Attaching the zip file to the JIRA ticket is convenient because it can be opened directly from the ticket, without having to download the whole file.

If the script is run in fake mode, note that the .ticket files are also dummy files. In order to make real ticket files, run the script in 'hard' mode, or use mode='ticket' (recommended). The latter mode will create a valid ticket file, but will only create a dummy packaging directory:

QA_Packager(origpath='/scratch04/arcproc/project229/IRC10216_setup3_b6_run_x2',readme='./README.header.txt',
packpath='./2011.0.00229.S',append='',mode='ticket')
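
To tell a dummy ticket file from a valid one before attaching it, a quick check along the following lines may help. It is a sketch that relies only on the Python standard library; inspect_ticket_zip is a hypothetical name, and the path of the ticket zip file has to be supplied by the user, since the exact file name is not spelled out here.

```python
import os
import zipfile

def inspect_ticket_zip(path):
    """Return the member names of a ticket zip, or None if it is a dummy file."""
    # A dummy file produced in 'fake' mode is empty and not a zip archive.
    if os.path.getsize(path) == 0 or not zipfile.is_zipfile(path):
        return None
    with zipfile.ZipFile(path) as zf:
        return zf.namelist()
```

A return value of None suggests that the packager was run in 'fake' mode and should be rerun with mode='ticket' or mode='hard'.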

Appending a member

To append another SB to the package, for example IRC10216_setup4_b6_run_x2, run with option append='member':
QA_Packager(origpath='/scratch04/arcproc/project229/IRC10216_setup4_b6_run_x2',readme='./README.header.txt',
packpath='./2011.0.00229.S',append='member',mode='fake')

Example data reduction directory structure

The packaging script follows the conventions described in the "How to reduce ALMA science data?" document. In case there are multiple executions for each SB, the recommended structure is the following:

project123 - parent directory for the data reduction of this project
   SB1 - directory for SB number 1
      Xab1 - separate directories for each of the asdm, named after the last few characters for the asdm
      Xabc
      calibrated.ms - the combined  ms of this SB
      the imaging and flux calibration scripts
      fits files      
      optional qa2 notes for the PI
   SB2 - directory for SB number 2
      Xbb1
      Xbbc

For single asdms per SB, the structure would look like this:

project123 - parent directory for the data reduction of this project
   SB1 - directory for SB number 1
      Xab1 - a directory for the asdm, named after the last few characters for the asdm
         including also the imaging script and fits files etc      
   SB2 - directory for SB number 2
      Xbb1
 

or, alternatively:

project123 - parent directory for the data reduction of this project
   Xab1 - a directory for the asdm, named after the last few characters for the asdm
      including also the imaging script and fits files etc      
   Xbb1
 

The directories named Xab1 etc. should contain uid___xxx.scriptForCalibration.py, uid___xxx.split (and uid___xxx.split.cal), all calibration tables, and the checklist. The only subdirectory here should be qa2. Please do not create a separate 'Calibration' subdirectory here. Note that there is no requirement for these directory names to start with 'X'. You can call them 'Calibration_Xab1', if you prefer.

Some data reducers may find it convenient to put results (images, cubes etc) in separate directories at the same level as the X directories. The packaging script will find the data products in those directories.
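
The expectations for an execution directory can be checked before packaging with a small sketch like the one below. check_execution_dir is a hypothetical helper, not part of the packaging script, and it tests only three of the conventions described above: the calibration script, the .split data, and the absence of a 'Calibration' subdirectory.

```python
import fnmatch
import os

def check_execution_dir(path):
    """Return a list of warnings for one execution (asdm) directory."""
    names = os.listdir(path)
    warnings = []
    if not fnmatch.filter(names, '*scriptForCalibration.py'):
        warnings.append('no *scriptForCalibration.py found')
    if not fnmatch.filter(names, '*.split'):
        warnings.append('no *.split measurement set found')
    if 'Calibration' in names and os.path.isdir(os.path.join(path, 'Calibration')):
        warnings.append("separate 'Calibration' subdirectory present")
    return warnings
```

For instance, check_execution_dir('/pathToMySB/X8c') would return an empty list for a directory laid out as recommended (the path is a placeholder).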

For example, for project 229, which contains four SBs, the data reduction directory looks like this:
project229
|-- IRC10216_setup3_b6_run_x2 - first SB
| |-- IRC10216_setup3_b6_runx2.imaging.txt - notes for the PI (optional)
| |-- X8c - first execution of this SB
| | |-- checklist.txt - checklist (does not need to be prepended by asdm name)
| | |-- qa2 - qa2 directory
| | | |-- qa2_part1.png
| | | |-- textfile.txt
| | | |-- etc
| | |-- uid___A002_X3c9295_X8c.ms
| | |-- uid___A002_X3c9295_X8c.ms.antpos
| | |-- uid___A002_X3c9295_X8c.ms.scriptForCalibration.py
| | |-- uid___A002_X3c9295_X8c.ms.split
| | |-- uid___A002_X3c9295_X8c.ms.split.cal
| | |-- uid___A002_X3c9295_X8c.ms.split.ap_pre_bandpass
| | |-- uid___A002_X3c9295_X8c.ms.split.ap_pre_bandpass.plots
| | |-- etc
| |-- X528
| | |-- uid___A002_X3dc0e7_X528.ms
| | |-- etc
| |-- Combined - Optional directory to put the combined products in
| |-- calibrated.ms - combined ms. Will not be packaged!
| |-- calibrated_all_setup3_b6.clean.fits - a fits file
| |-- scriptForFluxCalibration.py
| |-- scriptForImaging.py
|-- IRC10216_setup4_b6_run_x2
| |-- X3c9295_X1e0
| | |-- uid___A002_X3c9295_X1e0.ms
| | |-- etc

Notes

  • File names that are not unique (i.e., that do not start with the uid name or with X...) will have the uid prepended automatically. So 'qa2_part1.png' will be renamed 'uid___A002_X3dc0e7_X528__qa2_part1.png'. But 'X8c.checklist.txt' will not be renamed.
  • QA2 files will be found at the SB level, at the asdm level, and in the qa2 directory. They should match one of: *.jpg, *qa2.txt, *qa2-summ.txt, *.qa2.pdf, *textfile.txt, *qa2_part?.png
  • Checklists should match one of: *checklist, *checklist.txt, *Checklist, *Checklist.txt [script needs updating]
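
The naming rules in these notes can be tried out with a short sketch. It uses the patterns listed above with Python's fnmatch module. The function names are hypothetical, and the renaming rule is a simplified reading of the first note, not the script's actual implementation.

```python
import fnmatch

QA2_PATTERNS = ['*.jpg', '*qa2.txt', '*qa2-summ.txt', '*.qa2.pdf',
                '*textfile.txt', '*qa2_part?.png']
CHECKLIST_PATTERNS = ['*checklist', '*checklist.txt',
                      '*Checklist', '*Checklist.txt']

def matches_any(filename, patterns):
    """True if filename matches at least one shell-style pattern (case-sensitive)."""
    return any(fnmatch.fnmatchcase(filename, p) for p in patterns)

def packaged_name(filename, uid):
    """Prepend the uid unless the name already starts with 'uid___' or 'X'."""
    if filename.startswith('uid___') or filename.startswith('X'):
        return filename
    return uid + '__' + filename
```

With these definitions, packaged_name('qa2_part1.png', 'uid___A002_X3dc0e7_X528') gives 'uid___A002_X3dc0e7_X528__qa2_part1.png', while 'X8c.checklist.txt' is returned unchanged.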

Example output directory structure

After running
QA_Packager(origpath='/scratch04/arcproc/project229/IRC10216_setup3_b6_run_x2',readme='./README.header.txt',
packpath='./2011.0.00229.S',append='',mode='fake')
the following package was created:
|-- 2011.0.00229.S/
|   |-- sg_ouss_id/
|   |   |-- group_ouss_id/
|   |   |   |-- member_ouss_id/
|   |   |   |   |-- raw/
|   |   |   |   |   |-- uid___A002_X3dc0e7_X528.ms.split/
|   |   |   |   |   |-- uid___A002_X3c9295_X8c.ms.split/
|   |   |   |   |-- qa/
|   |   |   |   |   |-- X528.checklist.txt
|   |   |   |   |   |-- uid___A002_X3dc0e7_X528__qa2_part1.png
|   |   |   |   |   |-- uid___A002_X3dc0e7_X528__textfile.txt
|   |   |   |   |   |-- uid___A002_X3c9295_X8c__qa2_part1.png
|   |   |   |   |   |-- X8c.checklist.txt
|   |   |   |   |   |-- uid___A002_X3dc0e7_X528__qa2_part3.png
|   |   |   |   |   |-- uid___A002_X3dc0e7_X528__qa2_part2.png
|   |   |   |   |   |-- uid___A002_X3c9295_X8c__qa2_part2.png
|   |   |   |   |   |-- uid___A002_X3c9295_X8c__qa2_part3.png
|   |   |   |   |   |-- uid___A002_X3c9295_X8c__textfile.txt
|   |   |   |   |-- script/
|   |   |   |   |   |-- scriptForImaging.py
|   |   |   |   |   |-- uid___A002_X3dc0e7_X528.ms.scriptForCalibration.py
|   |   |   |   |   |-- uid___A002_X3c9295_X8c.ms.scriptForCalibration.py
|   |   |   |   |   |-- scriptForCalibration.py
|   |   |   |   |   |-- scriptForAprioriCalibration.py
|   |   |   |   |   |-- scriptForFluxCalibration.py
|   |   |   |   |-- log/
|   |   |   |   |   |-- ipython.log
|   |   |   |   |   |-- casapy-2012-05-03T115433.log
|   |   |   |   |   |-- uid___A002_X3c9295_X8c__casapy-20120502-102144.log
|   |   |   |   |   |-- uid___A002_X3c9295_X8c__ipython.log
|   |   |   |   |   |-- uid___A002_X3dc0e7_X528__ipython-20120430-142604.log
|   |   |   |   |   |-- uid___A002_X3c9295_X8c__ipython-20120502-102151.log
|   |   |   |   |-- product/
|   |   |   |   |   |-- calibrated_all_setup3_b6.clean.fits
|   |   |   |   |-- calibrated/
|   |   |   |   |   |-- uid___A002_X3dc0e7_X528.ms.split.cal/
|   |   |   |   |   |-- uid___A002_X3c9295_X8c.ms.split.cal/
|   |   |   |   |-- calibration/
|   |   |   |   |   |-- uid___A002_X3c9295_X8c.calibration.plots/
|   |   |   |   |   |-- uid___A002_X3c9295_X8c.calibration/
|   |   |   |   |   |-- uid___A002_X3dc0e7_X528.calibration/
|   |   |   |   |   |-- uid___A002_X3dc0e7_X528.calibration.plots/

Then

QA_Packager(origpath='/scratch04/arcproc/project229/IRC10216_setup4_b6_run_x2',readme='./README.header.txt',
packpath='./2011.0.00229.S',append='member',mode='fake')
produces a second member:
|-- 2011.0.00229.S/
|   |-- sg_ouss_id/
|   |   |-- group_ouss_id/
|   |   |   |-- member_ouss_id/
|   |   |   |-- member_ouss_id2/

Tarring up the package

Once a new project is packaged and approved for delivery, the data have to be tarred up, delivered to the PI, and then sent to JAO for ingestion into the Archive (from where they get mirrored to the ARCs). This is the current procedure, which might change as we go along.

To make future data delivery to archival researchers easy, please use tarsplit.py from CVS (AIV/science/DSO) to create the tarfiles, e.g.:

tarsplit.py -n 2012-06-20 2011.01234.S

This splits the data into chunks that allow for efficient network transfer to and from the ARCs. It also makes sure that the tarfiles obey the naming convention, which helps with the tracking as well as with the mini-datapacker that will run in the Request Handler.
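
The idea of splitting a package into transfer-friendly chunks can be illustrated as follows. The options and algorithm of tarsplit.py are not described on this page, so the function below is purely an illustrative assumption: group_into_chunks is a hypothetical name, and the greedy grouping by size is one possible strategy, not necessarily the one tarsplit.py uses.

```python
def group_into_chunks(files, max_bytes):
    """Greedily group (name, size) pairs, in order, into chunks of at most max_bytes.

    A single file larger than max_bytes ends up in a chunk of its own.
    """
    chunks, current, current_size = [], [], 0
    for name, size in files:
        # Start a new chunk when adding this file would exceed the limit.
        if current and current_size + size > max_bytes:
            chunks.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        chunks.append(current)
    return chunks
```

Each resulting list of names would then go into its own tarfile.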

README

Atacama Large Millimeter/submillimeter Array (ALMA)

#####

Cycle: 0 (Early science)
Project code:
SB name:
PI name:
Project title:
Configuration:
Proposed rms:
CASA version used for reduction: 3.4
Comments from Reducer:

#####

This file describes the content of the tar file you have received. The
full data structure is inserted below.

At this stage, we are releasing data after completion of one SB (executed
multiple times if required), so you will find only one member_ouss_id
directory.  This directory contains this README file and the following
directories: raw, calibrated, calibration, script, qa, log, product.

- 'raw' contains the apriori calibrated ms for each execution block,
after being split into the science spectral windows.  This calibration
includes: WVR, Tsys and antenna position corrections and apriori
flagging.
- 'calibrated' contains the fully calibrated ms for each execution block.
- 'calibration' contains the files needed for calibration starting from
the initial ms to the fully calibrated data.   Plots are included.
- 'script' contains the reduction scripts used to process the initial ms
to calibrated data, but also to obtain concatenated data (if more than
one execution) and imaging products.  There are usually several scripts
dealing with different parts of the processing.
- 'product' contains the fits files of the selected image products. 
These will not include all images of scientific value, but will indicate
the quality of the calibration and images.
- 'qa' contains the qa2 reports that show plots and text information
needed to assess the quality of the processing.  The resultant image
rms, compared with that proposed, is given.
- 'log' contains the casa log files.

#####

Acknowledgements
Thanks very much to Ivan Marti-Vidal (Onsala) and Anita Richards (Manchester) for producing the script.


-- MartinZwaan - 20 May 2012

Topic attachments
documentation.pdf (61.4 K, 20 May 2012 - 09:20, MartinZwaan)
Topic revision: r9 - 04 Dec 2012 - 15:03:34 - ElizabethHumphreys
 