Dealing with Science Data

Martino Romaniello takes us into ESO’s Science Archive

15 June 2018
ESO is proud of being the most productive ground-based observatory in the world, making observations that led to over one thousand scientific papers in 2017 alone. But to produce such a huge number of papers, ESO’s telescopes must churn out mind-boggling amounts of data. So where is all this data stored and how do astronomers get their hands on it? We spoke to Martino Romaniello, Head of the Back-end Operations Department at ESO Headquarters, to find out.

Q: Martino, tell us a bit about your role at ESO.

A: I joined ESO in late 1998 as a postdoctoral fellow, fresh out of my PhD at the Scuola Normale Superiore in Pisa, Italy, and the Space Telescope Science Institute in Baltimore, USA, the “home” of the Hubble Space Telescope. Those were the early days of science operations with ESO’s Very Large Telescope, and I was immediately transfixed by the scale and ambition of the project and by how much it aimed to change the paradigm of ground-based astronomy. I became an ESO staff member in 2000, sharing my time between functional duties for ESO and research as a member of the Science Faculty.

For functional duties, I served as a Support Astronomer until 2006. Since then, I have led different organisational units that dealt with the handling of science data. In our latest incarnation as the Back-end Operations Department, we are responsible for the “last mile” of the long journey of science data, namely ensuring that the science content is there in the data, that it can be extracted and calibrated, and that it is made available to our community in a scientifically meaningful way through the ESO Science Archive.

In my own research, I am interested in the formation and evolution of stars, both as individual objects and in stellar systems. Specifically, I use a particular type of star called a Cepheid to gather hints on what might be driving the accelerated expansion of the Universe.

Q: What data does ESO make available to the community?

A: ESO makes all of the data generated by our telescopes that has scientific relevance openly available to the science community around the world. Preventing data from being accessible is the absolute exception and is reserved for cases such as the very early phases of commissioning of new instruments or other similar test phases in which the data has no scientific content to speak of.

In order to be useful for scientific measurements, the raw data acquired at the telescopes has to be processed to remove the signatures of the measurement process (from the telescope, instrument or Earth’s atmosphere) and to extract and calibrate the science signal. In addition to the raw data, we also provide processed data directly through the archive. The availability of processed data for science analysis is especially important for making the archive useful to the general community and increasing ESO’s overall scientific return.

Q: How much data is currently stored and where?

A: The data from the La Silla and Paranal Observatories amounts to a bit more than one petabyte (equal to one million gigabytes), and we store three copies for redundancy and safety reasons. Two of these copies are in different locations at ESO Headquarters in Garching, Germany. The third one is hosted by the Max Planck Computing and Data Facility at the Garching Research Campus. The homepage of the ESO Science Archive is at archive.eso.org. ESO also hosts the European copy of the ALMA Science Archive at https://almascience.eso.org/alma-data/archive.

Data storage and exchange technologies are rapidly evolving, pushed by the increasing demands of scientific and commercial endeavours. We actively partner with the likes of ALMA, CERN and the Square Kilometre Array (SKA) to cater for our needs in the most efficient way possible.

Q: Why is ESO’s data available as open access for anyone to use? Has it always been that way?

A: Open access to data is a staple of scientific research and serves several purposes. The first is that it enables any scientific claim to be independently verified and challenged, which is a founding principle of the scientific method. Secondly, it allows for genuinely new science and knowledge to come from the data, either in conjunction with other data or by using the archive as a primary source. In addition, archival data is used to design better experiments that require new data to be obtained. In fact, in order to apply for observing time with any of ESO’s telescopes, astronomers need to show that their proposed science goals cannot be achieved with data already available in the archives. Using archival data is much quicker, as applying for and receiving new data can take as long as one to two years.

ESO’s data open access policy can be traced back to 1988 with the introduction of “Key Programmes” on La Silla. Open access to data has been ingrained in the science operations policy of the Very Large Telescope and its interferometer since the very beginning of science operations in 1999. Initially, access was limited to ESO Member States. Following a decision by the ESO Council in December 2004, the archive was opened to the whole world on 1 April 2005.

A recent science paper described the ESO Science Archive as the “largest telescope facility ever.” While that may be a bit of hyperbole, it does convey the power of reusing data collected over decades from some of the most powerful telescopes and instruments ever built.

Q: By sharing its data, how does ESO benefit?

A: What ESO gets in return is that more science is done with our data.

Enabling major scientific discoveries by the astronomical community is core to ESO’s mission. The ESO Science Archive plays a very significant role in this: 30% of the refereed publications that use ESO data make use of archival data. In addition, the Science Archive broadens the user base of ESO data: about 30% of the users of the archive do not use ESO in any other way. And again, open access to data is a staple of research. Astronomy as a discipline, and ESO in particular, have long been pioneers in this area. In order to further increase the use of archival data, we have recently developed the Archive Science Portal to provide more intuitive, enhanced data discovery tools to our users.

Open access to science data is also a pivotal policy point for governments and funding agencies around the world. Most notably, the European Commission has launched and is shaping the European Open Science Cloud (EOSC). ESO has endorsed the EOSC Declaration in recognition of the vital need for open access to trusted and reliable data in today’s world of scientific research. We also actively collaborate with other observatories and data centres worldwide, most notably with ESA and the Strasbourg astronomical Data Centre (CDS), to foster the open exchange of science data.

Q: Who uses ESO data?

A: Scientists are the main users of ESO data. Since 2011, more than 7000 professional astronomers have accessed the ESO Science Archive. For reference, this is between a half and two-thirds of astronomers worldwide, as gauged by the number of IAU members. As mentioned earlier, they use the data in a variety of ways that ultimately lead to more science being done and more knowledge being extracted from the data.

Scientists other than astronomers also use the ESO Science Archive, for example people who study the Earth’s atmosphere. In contrast to astronomers, they are not interested in the celestial objects. Rather, they study the composition of the atmosphere above the observatory and how it changes over time, which relates to climate studies. Such cross-disciplinary science is growing in importance.

There are also amateur astronomers and teachers among the visitors to the ESO archive. Unfortunately, we do not have a good handle on what they do with it. Perhaps it would be worth trying to learn more about this, as it could open the way to experimenting with citizen science.

Q: Are there any restrictions — are some data off limits?

A: The basic policy is that access to data is initially restricted to the scientists who triggered its creation, after which it becomes publicly available.

Observing with ESO telescopes is a competitive process. Teams of astronomers submit their ideas for new observations to ESO, which organises a peer-review process within the astronomical community itself. The proposals that are approved through this process are executed and generate new data, which is stored in the ESO Science Archive. Access to this data is initially limited to the original proposers of the observations, typically for a period of one year, after which the data itself becomes available without restrictions. The purpose of the policy is to recognise the effort that went into new data being generated while preserving the principle of open data access.

There are, of course, exceptions to the general policies and they can go both ways. In some cases, most notably with the Public Surveys, raw data is public immediately. Also, the Principal Investigator of such surveys has to return processed data to the archive for the community at large to benefit. The same applies to Principal Investigators of Large Programmes. In both cases, this is in recognition that the large investment of telescope time needed to carry out these large, coordinated observational campaigns has to have a large return for the whole community.

In other cases, at the discretion of the Director General, the proprietary rights can be extended if a valid justification exists. This can be applied to individual observations or groups of observations by extending the proprietary protection period, or even to the knowledge that certain data was acquired in the first place. An example of this is the follow-up with ESO telescopes of gravitational wave signals, in which the potential detections themselves were not immediately made public. In these situations, the very fact of pointing a telescope at a given patch of the sky would give away confidential information, hence the special treatment.

Q: Do you expect ESO to continue to make this data widely available in the future, particularly in the era of the ELT?

A: Most definitely! The science return of doing so is evident, as are the wider cultural implications. Plus, ultimately the data generated by ESO is the result of a large investment of public money and it is only fair that it is accessible for everyone to benefit. The ELT will be a unique science machine that will generate preciously unique data and science opportunities. Open access to this data will be fundamental to fully exploit its amazing potential.

Interview with:
Martino Romaniello

Numbers in this article

30 Percentage of refereed publications using ESO data that make use of archival data
30 Percentage of archive users who have never applied for their own time at the Very Large Telescope
1988 The year the ESO Science Archive was established
2005 The year the ESO Science Archive was made available to the whole world
7000 The number of professional astronomers who have accessed ESO archival data since 2011
1 000 000 Gigabytes (GB) of data stored in the ESO archive

Biography Martino Romaniello

Martino Romaniello is currently the Head of the Back-end Operations Department at ESO Headquarters. He completed his PhD at the Scuola Normale Superiore in Pisa, Italy, and the Space Telescope Science Institute in Baltimore, USA, before becoming an ESO Fellow in 1998. He later became a staff member, sharing his time between functional duties and science research. For the former, he initially served as a Support Astronomer before moving to science data management in 2006. In his science time, Romaniello studies the formation and evolution of stars, both as individual objects and in stellar systems. He focuses on Cepheid stars in order to understand what might be driving the accelerated expansion of the Universe.