VO Crawler: a crawling system for Virtual Observatory services

Yutaka Komiya (National Astronomical Observatory of Japan, NAOJ), Y. Shirasaki (NAOJ), M. Ohishi (NAOJ), S. Eguchi (NAOJ),
Y. Mizumoto (NAOJ), Y. Ishihara (Fujitsu), J. Tsutsumi (Fujitsu),
T. Hiyama (Fujitsu), H. Nakamoto (SEC), M. Sakamoto (SEC)


We report on the development of VO Crawler under the Japanese Virtual Observatory (JVO) project. VO Crawler accesses Virtual Observatory (VO) services around the world and caches their data over the whole sky. Because all the data are managed in a single system, users can quickly access and search very large data sets and locate VO data on a sky map.
We are also developing a Whole Sky Search system, which searches the data compiled by VO Crawler for objects with specified characteristics, without requiring the user to specify a sky region.
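
As an illustration of the idea only, a search by characteristics with no positional constraint can be sketched as below. The record fields, the service names, and the function `whole_sky_search` are hypothetical and are not taken from the JVO implementation; a small in-memory list stands in for the data compiled by VO Crawler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogRecord:
    """One cached catalog row (hypothetical schema, for illustration only)."""
    service: str      # identifier of the VO service the row came from
    ra_deg: float     # right ascension in degrees
    dec_deg: float    # declination in degrees
    band: str         # observation band label
    magnitude: float  # brightness in that band


def whole_sky_search(records, band, mag_min, mag_max):
    """Select records by characteristics only; position is never constrained."""
    return [
        r for r in records
        if r.band == band and mag_min <= r.magnitude <= mag_max
    ]


# Tiny stand-in for the cached data; the values are made up.
RECORDS = [
    CatalogRecord("svc-a", 10.0, -5.0, "optical", 17.5),
    CatalogRecord("svc-b", 250.0, 60.0, "optical", 21.0),
    CatalogRecord("svc-c", 180.0, 0.0, "infrared", 17.0),
    CatalogRecord("svc-a", 300.0, -70.0, "optical", 18.2),
]

hits = whole_sky_search(RECORDS, band="optical", mag_min=17.0, mag_max=19.0)
```

The two matching records lie in widely separated parts of the sky, which is the point of the sketch: the query names properties of the objects, not a region.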

VO Crawler compiles all catalog data, together with image and spectrum metadata, from all active VO services. We employ Hadoop, the distributed processing framework developed by the Apache Software Foundation, and manage the retrieved data with HBase, a database built on Hadoop. Basic statistical analysis will be performed after all the data have been retrieved, and the results will be stored in a PostgreSQL database.
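
The abstract does not state which statistics are computed. As a minimal sketch, assuming simple per-column summaries (count, minimum, maximum, mean), the post-retrieval step might look like the following. The function `summarize` and the column names are hypothetical; the list of dictionaries stands in for rows scanned from HBase, and the returned dictionary for a row that could be written to PostgreSQL.

```python
from statistics import mean


def summarize(rows, column):
    """Summarize one numeric column, skipping rows where it is missing or None."""
    values = [row[column] for row in rows if row.get(column) is not None]
    if not values:
        return {"column": column, "count": 0, "min": None, "max": None, "mean": None}
    return {
        "column": column,
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }


# Made-up rows; real catalogs differ in which columns they provide.
rows = [{"mag": 18.0}, {"mag": 20.0}, {"mag": None}, {"flux": 1.0}]
summary = summarize(rows, "mag")
```

Missing values are skipped rather than treated as zero, since catalogs from different services will not share the same set of columns.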

We have been developing JVO Sky, a graphical user interface that displays observed areas on a sky map using the Google Sky API. So far, data obtained by the Subaru telescope and the Suzaku satellite have been registered in JVO Sky. Data compiled by VO Crawler, such as observed areas and points with links to image and spectrum files, and the number density of data in each observation band, are displayed and searchable on the JVO Sky map.
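
One simple way to obtain per-band counts for a map overlay is to bin positions into a grid of sky cells. The sketch below assumes a plain equal-angle grid in right ascension and declination; the gridding scheme, the cell size, and the names `cell_index` and `counts_per_band` are assumptions for illustration, not the scheme used by JVO Sky.

```python
from collections import Counter


def cell_index(ra_deg, dec_deg, cell_deg=1.0):
    """Map a position to (i, j) grid indices; assumes 0 <= ra < 360, -90 <= dec <= 90."""
    i = int(ra_deg // cell_deg)
    j = int((dec_deg + 90.0) // cell_deg)
    return i, j


def counts_per_band(points, cell_deg=1.0):
    """Count (ra, dec, band) points per band and per sky cell."""
    counts = Counter()
    for ra_deg, dec_deg, band in points:
        i, j = cell_index(ra_deg, dec_deg, cell_deg)
        counts[(band, i, j)] += 1
    return counts


# Made-up positions and band labels.
points = [
    (10.2, -5.7, "optical"),
    (10.9, -5.1, "optical"),
    (10.5, -5.5, "x-ray"),
    (200.0, 45.0, "optical"),
]
counts = counts_per_band(points)
```

Equal-angle cells shrink in area toward the poles, so these counts become true number densities only after division by the cell area. An equal-area pixelization such as HEALPix would avoid that correction.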

Our goal is to let users find data on objects with user-specified characteristics anywhere on the sky, without having to deal with the wide variety of data sources or the sheer volume of data.

