NGAS UPGRADE PLAN FEB2003 (GAR + LS + PAR):
===========================================
Jens Knudstrup, 14.02.2003.

This document describes in detail the various actions to be performed for the NGAS upgrade of the GAR + LS (+ PAR) NGAS installations.

=> The period of upgrade for GAR is: 17.02.2003->20.02.2003.
=> The period of upgrade for LS is: 22.02.2003->28.02.2003.

During these periods the following main tasks will be carried out:

1. The NGAS HW will be upgraded (disk controllers (partly), new firmware for the 3ware disk controllers).
2. RedHat Linux 7.3 will be installed on all NGAS machines.
3. New version of the NGAS SW will be installed.
4. The NGAS DBs in GAR and LS will be upgraded + the replication modified in accordance with the new DB structure.
5. NGAS operators will receive instructions about new operational procedure.
6. An NGAS review will be carried out to improve existing features + improve documentation.
7. Planning of delivery of NGAS to PAR will be initiated.

The following people will be involved (as far as I know); for each name I have indicated where the person will be active. Should I have left somebody out of the list, please let me know:

ASC:  Anton Scherml/LS.
AWI:  Andreas Wicenec/GAR/LS.
BPI:  Benoit Pirenne/GAR.
CGU:  Carlos Guirao/GAR.
DSU:  Dieter Suchar/GAR.
EDAT: ESO DB Administration Team/GAR.
JKN:  Jens Knudstrup/GAR/LS.
JPA:  Jose Parra/LS.
JPH:  Jonathan Phillips/GAR.
LSNG: People involved in NGAS at LS.
LSSW: LS SW people.
TIOS: Telescope Operators.

In the step-by-step plan below, the following convention is used to describe the status of each task:

o: To be done.
-: Initiated.
*: Partly finished (should be revised/completed).
=: Done.

==============================================================================
==17.02.2003, Morning:
= JKN: Switch off VIMOS/PAR ngamsIngest.
- CGU/SZA: Deliver/install new ngamsIngest (for NG/AMS V2.0) to VIMOS/PAR.
= DSU/JKN: Make NGAS 'unavailable'/switch off NGAS nodes (server shutdown on jewel68).
= DSU: Install all NGAS nodes with RedHat + new NG/AMS (install new firmware).
= EDAT: Switch off replication (OLASLS->ESOECF).
= EDAT: Update NGAS DB to the new structure, following the DB Upgrade Plan provided earlier by JKN (http://www.eso.org/~jknudstr/NGAS/NGAS_DB_UPDATE_PLAN_FEB_2003).
= JKN: Deliver new NG/AMS C-API to BPI.
= JKN: Check the upgraded DB structure.
= JKN: Check DB contents + change wrong mime-types + add entries for new NGAS machines in ngas_hosts (ngasp).
= BPI: Update/re-compile Request Handler with new C-API.
= AWI: Update NGAS Zope WEB interface to reflect new NGAS DB structure.

==17.02.2003, Afternoon:
= JKN: Configure all NGAS hosts to use proper NG/AMS Configuration.
= JKN: Test all NGAS hosts, i.e., that the NG/AMS Server is running properly (+ started automatically @ reboot).
= JKN: Prepare jewel68 for 1) Archiving of VIMOS/PAR pre-imaging data, 2) Archiving of Calibration Data (install Cal. Plug-In@jewel68).

GOAL:
=> System should be back in operation; data can be delivered for the Archive operations.
=> Note: From this point on, the NGAS WEB Interfaces will be unavailable. I.e., ESOECF will not be updated with the newest status of the NGAS at LS. Hopefully, this situation will not interfere with LS/WFI operations.

------------------------------------------------------------------------------
==18.02.2003, Morning/Afternoon:
= BPI: Enable updated Request Handler + test data retrieval via this.
= JKN/DSU: Prepare GAR AHU for JPH (install label printer on ngasp).
= JKN: Introduction of new GAR Arch-Ops to JPH/BPI (using acngast1).
= JKN: Decide with SEG/KHA about special patch version of NG/AMS for the GAR AHU.
= AWI: Fix problem with CTRL-ALT-DEL (reboots rather than shuts down).
= CGU/SZA: Enable (subscribe) VIMOS/PAR ngamsIngest + test.
= JKN: Implement patch in NG/AMS Online Plug-In (mount rw @AHU, DFS01223) -> V2.0.1 + install on GAR AHU.
- JKN: Set up GAR Arch-Ops NGAS WEB Management Pages (ngasmgr, ESOECF).
= JKN: Instruct/assist JPH in starting to migrate data on old 80GB disks to 200GB disks (Disk Recycling).

GOAL:
=> Migration of data from 80GB->200GB disks should be in progress and should run smoothly.
=> Archiving of Calibration Data should be in progress and should run smoothly.

------------------------------------------------------------------------------
==19.02.2003, Morning/Afternoon:
= JKN: Assist SZA in getting started with the archiving of the 400GB Calibration Data.
- SZA: Test retrieval of Calibration Data.
= JKN: Ongoing migration of data 80GB->200GB disks.
- SZA: Ongoing archiving of Calibration Data.
o BPI/JKN: Go through NGAS Commissioning Plan.
= JKN: Stand-by to handle possible problems with operations of new system.
= JKN: Script within ngasUtils to verify that cloning was successful.

GOAL:
=> Have Calibration Data archived into NGAS (the degree of completeness of this task to be defined by SZA).
=> Have the NGAS system officially delivered to NGAS Archive Operations.

------------------------------------------------------------------------------
==20.02.2003, Morning:
= JKN: Verify 80GB->200GB migrated data.
= JKN: Remove disks that have been cloned from NGAS.
= JKN: Test retrieval via Request Handler of 80GB->200GB migrated data.
o JKN: Release NG/AMS UM (+ upgrade in DFS doc. rep. + NGAS WEB site - if time permits).
= JKN: Stand-by to handle possible problems with operations of new system.
o JKN: Clean up NGAS Disks according to Data Check Reports.

==20.02.2003, Afternoon:
= JKN: Stop 80GB->200GB migration.
= JKN: Prepare recycled 80GB disks for usage at LS.
= AWI/JKN: Departure -> Chile.

GOAL:
=> Have 8-10 80GB disks migrated + prepared for usage at LS.

==============================================================================
==============================================================================
==21.02.2003, Morning/Afternoon:
= AWI/JKN: Trip to LS.
------------------------------------------------------------------------------
==22.02.2003, Morning:
= AWI/JKN: Trip to LS.

==22.02.2003, Afternoon:
= JKN/ASC: Unsubscribe ngamsIngest@w2p2nau.
= JKN/AWI: Upgrade w2p2nau + w2p2nau with new 3ware disk controllers + install new firmware.
= JKN/AWI: Move existing WFI/NAU (w2p2nau) to main building (will become the "Archive Handling Unit", "wlsahu"?).
= ASC/JKN: Upgrade NGAS DB in OLASLS + verify contents.
= JKN: Install new NAU (sent from GAR) in rack in 2P2 computer room.
= JKN: Install RedHat Linux/NGAS on new NAU.
= JKN: Verify proper operation of new NAU.
= ASC/JKN: Install new ngamsIngest on WFI DHS machine + re-configure to ingest data into NG/AMS directly from the DHS machine.
= ASC/JKN: Test complete chain WFI->DHS->NGAS.

GOAL:
=> Make w2p2nau operational with RedHat Linux + new NG/AMS SW (complete chain WFI->DHS->NGAS).

------------------------------------------------------------------------------
==23.02.2003, Morning:
= JKN: Install w2p2nbu with RedHat + new NG/AMS + configure.
= JKN: Test proper functioning of NBU (insert completed disks, verify that they are online after boot).

==23.02.2003, Afternoon:
= JKN: Install AHU with RedHat + new NG/AMS + configure.
= JKN: Test basic functioning for LS AHU (Disk Checking).
= JKN: Inform EDAT to switch on replication (if everything ready).

------------------------------------------------------------------------------
==24.02.2003, Morning:
= EDAT: Update replication to reflect the new DB structure; replication should not yet be switched on.
= JKN/EDAT: Switch on replication OLASLS->ESOECF + check if ESOECF is properly synchronized with OLASLS.
= JKN/AWI: Create Mindi/Mondo rescue CDs of LS WFI NAU, NBU and AHU NGAS machines.

==24.02.2003, Afternoon:
= Buffer - finish up pending issues.

GOAL:
=> Have NAU, NBU + AHU up and operating.
=> Have Mondo rescue CDs prepared.
------------------------------------------------------------------------------
==25.02.2003, Morning:
= JKN: Provide Acceptance Test/Training Check List.
= JKN: Prepare small talk about NG/AMS and troubleshooting (if time permits).
o AWI/JKN: Install NGAS Zope site at LS.
o AWI/JKN: Support in NGAS WEB interfaces to retrieve NG/AMS Log File + NG/AMS Cfg. via the WEB?
= JKN/ASH: Establish procedure for re-starting ngamsIngest@DHS machine.
= ASH: Implement automatic start-up of ngamsIngest when the DHS machine is rebooted.
= ASH/JKN: Test proper start-up of ngamsIngest at reboot of the DHS machine.

==25.02.2003, Afternoon:
= JKN/AWI/TIOS: Training: 1) Swap NAU<->NBU, 2)
= JKN/AWI/TIOS: Training: Create Mindi/Mondo CD.
= JKN/AWI/ASH: Training: Carry out disk checking with the AHU.

GOAL:
=> Have all LS NGAS people trained to handle all the standard operations needed to operate, maintain and troubleshoot NGAS.
=> Collect ideas on how to improve the documentation for NGAS and the system as such.

------------------------------------------------------------------------------
==26.02.2003, Morning:
= JKN/AWI/LSSW: Go through Acceptance Test Plan.
= JKN/AWI/TIOS: NGAS Review: Questions/doubts concerning NGAS operation, maintenance, troubleshooting + comments on improvements.

==26.02.2003, Afternoon:
= JKN/AWI/TIOS: Go through Acceptance Test Plan.

GOAL:
=> Finish training of LS NGAS people.
=> Deliver the new NGAS 'officially' to LS.
=> Collect ideas on how to improve the documentation for NGAS and the system as such.

------------------------------------------------------------------------------
==27.02.2003, Morning:
= JKN/AWI/JPA: NGAS introduction to JPA.

==27.02.2003, Afternoon:
= JKN/AWI/JPA: Walk through Acceptance Test Plan (as training).
= JKN/JPA/AWI: Planning of delivery of NGAS to PAR.

GOAL:
=> Make PAR (JPA) acquainted with NGAS.
=> Define the NGAS PAR infrastructure + define operations + identify possible special requirements from PAR towards NGAS + other practical aspects in this context.

------------------------------------------------------------------------------
==28.02.2003, Morning:
o JKN/TIOS2: Walk through Acceptance Test Plan/training.
o AWI/JKN: Create Mondo Rescue CDs for NAU + NBU (2 copies).

==28.02.2003, Afternoon:
o JKN/JPA/AWI: Leave LS.

GOAL:
=> Train TIOS of other turno.

==============================================================================

--- oOo ---