vpguess



 
This is the vpguess information page. Please note that it is updated from time to time. vpguess is free software, but please see the License / Disclaimer section at the end of this page.
 

Introduction

vpguess is a program to facilitate the fitting of multiple Voigt profiles to spectroscopic data. It is a graphical interface to VPFIT by R.F. Carswell, J.K. Webb and others. Although VPFIT has been outfitted with some graphics capabilities of its own I hope that vpguess may be a useful alternative. It was originally meant to simplify the process of setting up first guesses for a subsequent fit with VPFIT. However, it has since been developed into a full interface to VPFIT. It may also be used independently of VPFIT for displaying data, playing around with data and models, "chi-by-eye" fits, displaying the result of a proper fit, pretty plots, etc. vpguess is written in C. The graphics are based on PGPLOT by T. Pearson.
 

Download

You can get vpguess from this link: http://www.eso.org/~jliske/pub/vpguess-1.5.tar.gz. The latest version is 1.5 and its release date is 16/04/2008. See the README file that comes with the distribution for a list of changes from earlier versions. Please send me an email to let me know that you are using vpguess.
 

Installation

After unzipping and unpacking the file you will find a directory called vpguess-[version_number]. It contains the document you are reading, the README file, an atomic data file and a subdirectory src/ in which you'll find all the source code, header files and a Makefile. First have a look at the Makefile. At the top there are several sets of definitions. Edit one of the sets to suit your own needs and wants and comment out the other ones.

vpguess uses the cpgplot library for all graphics. Its FORTRAN base, PGPLOT, is pretty much standard but if you don't have it, you can get it here. cpgplot is the C extension of PGPLOT. Although it is included in the standard PGPLOT distribution, it is for some reason often not installed. If PGPLOT is installed on your computer, but cpgplot is not, then you need to get your system administrator to type make cpg in the PGPLOT directory (often /usr/local/pgplot). This will only work if the PGPLOT source or distribution directory has not been deleted. If the make was successful then the files libcpgplot.a (library) and cpgplot.h (header file) or links to them need to be put in the appropriate system directories so that your compiler can find them. These would be something like /usr/lib for the library and /usr/include for the header file.

vpguess also needs the cfitsio library. A reasonably newish version is required (> 2.4 or so) which can be downloaded here.

Once these two libraries are installed and the Makefile is edited appropriately, you can type make in the src/ directory. This will produce the executable vpguess. This should be put in your bin directory (or somewhere else in your PATH). You can then type make clean to clean up.

Although vpguess can be used by itself it is primarily meant to help facilitate the interaction with VPFIT. If you plan to use vpguess in conjunction with VPFIT (see below) then VPFIT must be installed on your system. vpguess currently works with VPFIT version 9.5. There is no guarantee that it will still work with earlier VPFIT versions. You can get the VPFIT code and some installation hints here. Before compiling VPFIT you should make sure that it is set up to work with ASCII data files. Read this (following "IMPORTANT") for a detailed explanation. From VPFIT version 7 onwards things seem to be set up by default to work with ASCII files, so you don't have to do anything. However, it's probably a good idea to check.
 

Environment variables

PGPLOT needs at least the environment variable PGPLOT_DIR so it can find various files. There are a number of other PGPLOT environment variables that you can set if you like and you can read about them here.

If you plan to call VPFIT from within vpguess you will have to let vpguess know where your version of VPFIT lives and what it is called. This is done by setting the environment variable VPG_VPFIT to the full path and name (something like /home/jliske/bin/vpfit) of the VPFIT executable that you wish to use with vpguess. If you don't have VPFIT then don't set the environment variable. In this case the command to fit ('F') will just fail harmlessly (but not quietly).

You can also set the variable VPG_ATOM_DATA to point to the atomic data file. If this is set you don't need the file atom.dat in your working directory. VPFIT uses its own environment variable (ATOMDIR) for the same purpose. If VPG_ATOM_DATA is not defined then vpguess will attempt to use ATOMDIR instead.
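The lookup order described above can be sketched in a few lines of Python. This is only an illustration, not the vpguess source (which is C): the function name find_atom_data is hypothetical, both variables are assumed to hold the full path of the file, and checking the working directory last is an assumption, since the text does not say in which order vpguess looks.

```python
import os

def find_atom_data(env=None, cwd="."):
    """Return the atomic data file location that applies, or None."""
    env = os.environ if env is None else env
    # 1. vpguess's own variable is consulted first.
    if env.get("VPG_ATOM_DATA"):
        return env["VPG_ATOM_DATA"]
    # 2. Otherwise fall back to the variable used by VPFIT.
    if env.get("ATOMDIR"):
        return env["ATOMDIR"]
    # 3. Assumed last resort: atom.dat (or a link to it) in the working directory.
    candidate = os.path.join(cwd, "atom.dat")
    return candidate if os.path.isfile(candidate) else None
```

Passing a plain dictionary as env makes it easy to try out the precedence without touching the real environment.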
 

Input files

Obviously, vpguess needs to read in the spectral data that you want to fit. For convenience the data are usually split into several files, each covering a different spectral region of interest. These are referred to as "data chunks". On startup, vpguess will read all the data chunks given on the command line or those specified in a save file.

From version 1.2 vpguess can read the spectral data and "associated arrays" either from ASCII files or from FITS files. In general, the "associated arrays" are the error array, the continuum and the sky (or background). However, the last is never used in vpguess and is included below only for completeness.

In addition to the spectral data itself, vpguess also needs corresponding error and continuum arrays. All error arrays are assumed to be 1 sigma unless variance is indicated (see below). Note that reasonable error estimates are important for meaningful fitting. Hence, if vpguess cannot find an associated error array (see below) then it will try to provide reasonable values by setting the error array to sqrt(data * (N_S^2 - SKY^2) + SKY^2), where N_S and SKY are parameters defined in spec.h. Once vpguess is up and running you can also use 'y' to create a new error array or to improve an existing one. If no continuum is found it is set to 1, i.e. the spectrum is assumed to be normalised.
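To make the fallback error estimate concrete, here is a minimal Python sketch of the expression quoted above. The numerical values of N_S and SKY are placeholders chosen for illustration (the real ones live in spec.h and are not quoted here), and clamping negative variances to zero is a choice made for this sketch, not something vpguess is known to do.

```python
import math

# Placeholder values: the real N_S and SKY are defined in spec.h.
N_S = 0.05   # assumed 1-sigma error where data = 1
SKY = 0.02   # assumed 1-sigma error where data = 0

def default_error(data, n_s=N_S, sky=SKY):
    """Evaluate sqrt(data * (N_S^2 - SKY^2) + SKY^2) for each data value."""
    errors = []
    for d in data:
        variance = d * (n_s**2 - sky**2) + sky**2
        # Clamping at zero is specific to this sketch; it avoids a math
        # domain error for strongly negative data values.
        errors.append(math.sqrt(max(variance, 0.0)))
    return errors
```

Note that the expression reduces to N_S where data = 1 and to SKY where data = 0, so the two parameters set the error at the two ends of a normalised spectrum.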

ASCII files

An ASCII data file is assumed to consist of one or more space or tab-separated columns, where each line corresponds to a different wavelength and each column is either the wavelength, the spectrum or one of the associated arrays.

Let's assume your data file is called foo. First, vpguess will look for a header file called foo.hdr. If the data file is called foo.txt (or .dat or whatever) then vpguess will check for foo.hdr, foo.hdr.txt and foo.txt.hdr. If found, the header file is assumed to be an ASCII file in FITS header style, i.e. each line is assumed to consist of a keyword name (maximum of 8 letters), followed by a '=', followed by a keyword value.
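As an illustration of the header file layout, the following Python sketch parses "KEYWORD = value" lines into a dictionary. It is not the parser used by vpguess; the function name parse_header is hypothetical, and the handling of quoted strings and of trailing "/ comment" fields follows the general FITS header convention, which vpguess may or may not honour in .hdr files.

```python
def parse_header(text):
    """Parse 'KEYWORD = value' lines into a dict (keywords upper-cased)."""
    header = {}
    for line in text.splitlines():
        if "=" not in line:
            continue                      # skip blank lines and lines without '='
        key, _, rest = line.partition("=")
        key = key.strip().upper()
        if not key or len(key) > 8:
            continue                      # keyword names have at most 8 letters
        rest = rest.strip()
        if rest.startswith("'"):
            # Quoted string: keep what lies between the first pair of quotes.
            value = rest[1:].split("'", 1)[0].strip()
        else:
            # Unquoted value: drop an optional FITS-style '/ comment'.
            value = rest.split("/", 1)[0].strip()
        header[key] = value
    return header
```

Values are returned as strings; converting CRVAL1 and friends to numbers is left to the caller.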

If a header file is present, then vpguess will search it for wavelength information (just as in the case of FITS data files, see below). So far, only linear and log-linear coordinate systems are supported. (The latter is used by the SDSS.) IRAF's MULTISPEC format is not (yet) supported. The wavelength info is simply derived from the header keywords CTYPE1, CRPIX1, CRVAL1 and CD1_1 or CDELT1. If the wavelength info is found in the header file then the data file does not need to contain a wavelength array.

If no header file is found or if it contains none of the keywords that define the wavelength scale, then it is assumed that the first column of the data file lists the wavelengths.

If a header file is present, then vpguess will also use it to try to figure out which of the columns in the data file contain the spectrum and associated arrays. The keyword names identifying the columns can be either BANDIDn, ROWn or ARRAYn, where n is a number between 1 and the number of columns in the data file. The values of these keywords should contain the words (case insensitive):

If this fails for the spectrum, i.e. if no header file is found or if it contains no identifier for the spectrum, then the spectrum is assumed to be stored in either the first or second column of the data file, depending on whether the wavelength information is specified in the header or not (see above).

If the above fails for one of the associated arrays vpguess attempts to use the appropriate default column defined in spec.h (taking into account the possible presence of a wavelength array).

If this also fails, i.e. if the specified column doesn't exist in the data file, then the associated array is sought in a separate ASCII file. For example, if the continuum cannot be identified/found and if the data file is called foo, then vpguess will look for a file foo.cont. If the data file is called foo.txt (or .dat or whatever) then vpguess will check for foo.cont, foo.cont.txt and foo.txt.cont. Similarly, if the error array cannot be identified in foo, then vpguess will try to find an error array in foo.sig, foo.err, foo.var and all the various combinations involving the data file's extension (if present). If the sky is missing, then the file extensions used for trying to find a sky (or background) array are .sky or .bkg.
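The naming scheme for header and companion files can be summarised with a short Python sketch. The helper candidate_names and the TAGS table are hypothetical names introduced here; the candidates and their order simply follow the examples given in the text, and the order in which vpguess actually tries them is not guaranteed to be the same.

```python
import os

# Tags taken from the text; "hdr" is the header file described earlier.
TAGS = {
    "header": ["hdr"],
    "error": ["sig", "err", "var"],
    "continuum": ["cont"],
    "sky": ["sky", "bkg"],
}

def candidate_names(data_file, tag):
    """List the companion file names for data_file and a tag such as 'cont'."""
    root, ext = os.path.splitext(data_file)
    if not ext:
        return [f"{data_file}.{tag}"]       # foo -> foo.cont
    return [
        f"{root}.{tag}",                    # foo.txt -> foo.cont
        f"{root}.{tag}{ext}",               # foo.txt -> foo.cont.txt
        f"{data_file}.{tag}",               # foo.txt -> foo.txt.cont
    ]
```

For example, candidate_names("foo.txt", "cont") yields foo.cont, foo.cont.txt and foo.txt.cont, matching the list above.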

The format of the .sig/.cont/.sky files is

[Wavelength (A)] Sigma / Continuum / Sky
i.e. either one or two columns. If there are two columns the error/continuum/sky value is assumed to be the second column. The first column is ignored, i.e. no check is performed whether the wavelength values in these files correspond to the wavelength array of the spectrum (which was either created from the wavelength info in the header file or read in from the first column of the data file).

To summarise: if you do not wish to use a header (.hdr) file then your ASCII data file should look like this:

Wavelength (A) Flux [Sigma] [Continuum] [Sky]
i.e. between 2 and 5 space or tab-separated columns, where you have the option of storing the associated arrays in separate files. Specifying the wavelength scale in a header file would allow you to omit the first column while the use of column identifiers allows for non-standard arrangements in the order of the spectrum and associated arrays.
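For the header-less case, a minimal Python sketch of reading such a file might look as follows. The function read_columns is a hypothetical helper, and the column order is taken from the summary line above; the actual default column positions are defined in spec.h and may differ.

```python
# Column order as in the summary line: wavelength, flux, then optional arrays.
COLUMN_NAMES = ["wavelength", "flux", "sigma", "continuum", "sky"]

def read_columns(text):
    """Split space- or tab-separated rows into named columns of floats."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if not rows:
        raise ValueError("no data rows found")
    ncol = len(rows[0])
    if not 2 <= ncol <= len(COLUMN_NAMES):
        raise ValueError("expected between 2 and 5 columns")
    if any(len(row) != ncol for row in rows):
        raise ValueError("rows have differing numbers of columns")
    return {
        name: [float(row[i]) for row in rows]
        for i, name in enumerate(COLUMN_NAMES[:ncol])
    }
```

A three-column file therefore yields wavelength, flux and sigma, and the continuum would then have to come from a separate file or default to 1.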

FITS files

vpguess can handle 1, 2 and 3-dimensional multi-extension spectral FITS files (which may be gzipped). In all cases, the first axis is assumed to be the spectral axis. If present, the second axis may either contain associated arrays (error, continuum, sky) or multiple spectra. If the third axis is present it is assumed to contain the associated arrays, while the second axis is assumed to contain multiple spectra. If the third axis is present, its length must be < 6.

1D case:
In this case it is obvious where to get the spectrum. The associated arrays (error, continuum, sky) are first sought in other FITS extensions of the same FITS file. vpguess looks for the following extension names:

If that fails vpguess looks for separate FITS files. If the data file is called foo.fit, then vpguess checks for foo.sig.fit, foo.err.fit, foo.var.fit, foo.cont.fit, foo.sky.fit, foo.bkg.fit.

As for ASCII files, vpguess never checks that the wavelength information in the various FITS extensions or files is the same.

2D case:
vpguess will first attempt to determine the positions of the spectrum and associated arrays along the second axis from FITS header keywords. The keyword names can be either BANDIDn, ROWn or ARRAYn, where n runs from 1 to the length of the second axis. The values of these keywords should contain either of the words SPECTRUM or SPEC for the data or the extension names listed above for the associated arrays (all case insensitive). If this fails for the spectrum, i.e. if the identifying keywords (BANDIDn, etc.) do not exist in the header or if their values don't contain the 'magic' words, then vpguess reads the spectrum from the first row – unless the length of the second axis is > 9. In this case vpguess assumes that the file contains multiple spectra (as opposed to a single spectrum + associated arrays) and the user is queried for the position of the spectrum along the second axis.

If the identification process fails for the associated arrays (or if the length of the second axis is > 9), vpguess goes through the same process as in the 1D case (assuming that the separate FITS extensions or files are also 2D and using the same position along the second axis as for the spectrum). If this also fails, vpguess uses the default positions defined in spec.h (row 2 = error array, row 3 = continuum).

3D case:
First the user is queried for a position along the second axis. Then vpguess tries to determine the positions of the spectrum and the associated arrays along the third axis using FITS header keywords as above. If that fails the first position is used for the spectrum while the default positions defined in spec.h are used for the associated arrays (standard IRAF settings: band 4 = error array, band 5 = continuum).

Wavelength information:
So far, only linear and log-linear coordinate systems are supported. (The latter is used by the SDSS.) IRAF's MULTISPEC format is not (yet) supported. The wavelength info is simply derived from the FITS header keywords CTYPE1, CRPIX1, CRVAL1 and CD1_1 or CDELT1.
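The two supported coordinate systems can be illustrated with a small Python sketch. It uses the usual FITS convention that pixels are numbered from 1 and that CRVAL1 is the coordinate at pixel CRPIX1. How vpguess decides between the linear and the log-linear case is not described here, so the sketch takes an explicit log_linear flag, and it assumes a base-10 logarithm as used for SDSS spectra. The function name wavelength_array is hypothetical.

```python
def wavelength_array(npix, crval1, step, crpix1=1.0, log_linear=False):
    """Wavelength of each pixel from CRVAL1, CRPIX1 and CD1_1 (or CDELT1).

    step is the value of CD1_1 or CDELT1; pixels are numbered from 1.
    """
    values = [crval1 + (pix - crpix1) * step for pix in range(1, npix + 1)]
    if log_linear:
        # Assumed convention: the keywords describe log10(wavelength).
        values = [10.0 ** v for v in values]
    return values
```

With crval1 = 4000, step = 0.5 and the default crpix1 = 1, the first three pixels fall at 4000, 4000.5 and 4001 in whatever unit the header uses.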
 

atom.dat

The file atom.dat, which comes with both vpguess and VPFIT, contains the atomic data for the various transitions of various ions that you are likely to encounter in QSO absorption spectra. Bob Carswell (the maintainer of VPFIT) does a good job of keeping this file up to date.

You need to have a copy of this file, or a link to it, in your current directory in order to be able to run vpguess. Alternatively, you can set one of the environment variables VPG_ATOM_DATA or ATOMDIR (or both) to point to it.

vpguess versions earlier than 1.2 used a different format for the atom.dat file than VPFIT. This issue is now resolved: vpguess version 1.2 and greater and VPFIT version 5 and greater can (and should) use the same atom.dat.
 

Getting started

Type vpguess data_files to run vpguess. vpguess will then ask you to supply the FWHM of the instrumental line spread function (including units, either km/s or Angstrom). If you already have a save file (= fort.13 in VPFIT-speak) from a previous session or from use with VPFIT you can use the -f command line switch: vpguess -f save_file. You can also call vpguess without any arguments. In this case vpguess will prompt you for a save file. Finally, if a setup file (called vpguessrc) is found in the current working directory you will be asked whether you want to use it.

Once vpguess is up and running you interact with it via single key strokes. Type '?' to get a list of all the available commands. Type all commands in the graphics window, not the terminal from which you started vpguess. The only exceptions are when you are in command line mode and when vpguess prompts you for input in the terminal. Here is a screen shot of what vpguess looks like in its simplest incarnation,

[Figure: vpguess screenshot]

and here is a more complicated (and bigger) example. If the vpguess window is too large or too small for your screen, go back to the vpguess/src distribution directory and change the definitions of WW and/or WH (and/or LEWW) in the file vpguess.h to something more suitable for your screen. You then have to recompile vpguess by typing make.
 

Getting help

Type vpguess -help to find out how to invoke it. Once vpguess is up and running you can type '?' (in the graphics window) to get help. This works in pretty much every mode or window and gives a list of all the currently available commands. The short descriptions of each command are meant to be helpful but I realise that it won't be entirely obvious to everyone what each command does. I guess I need to write a help page with detailed descriptions. Until then you'll just have to try things out yourselves... Also have a look here. Feel free to email me if you're stuck because something is just too obscure.
 

Interaction with VPFIT

Although vpguess is very useful by itself (at least I think so) it is primarily meant to help facilitate the interaction with VPFIT. For more or less straightforward Voigt profile fitting you won't have to know anything about the intricacies of VPFIT anymore: vpguess takes that off your hands. However, there are two caveats: 1. VPFIT has a number of capabilities that cannot be accessed via vpguess at the moment. This may change in the future. 2. The VPFIT code is not part of the vpguess distribution. That's because vpguess does not incorporate VPFIT as some sort of subroutine. Rather, VPFIT is accessed simply via a system call. As pointed out above, this means that you need to have a working copy of VPFIT installed on your system, if you want to be able to use the 'F'it command in vpguess.

Note that although VPFIT only accepts ASCII files in a particular multicolumn format you can still make use of the various formats and mechanisms for ASCII file input described above if you interact with VPFIT through vpguess.

As explained above already you should set the environment variable VPG_VPFIT to let vpguess know where VPFIT lives on your machine. If it is not set then vpguess will alternatively look for $HOME/bin/vpfit and ./vpfit. However, I recommend the use of VPG_VPFIT. If nothing else it allows you to have several versions of VPFIT without any danger of confusion. For example, if you have modified VPFIT in some way which makes it incompatible with vpguess, but you would still like to be able to use standard VPFIT through vpguess, then you would need to have two versions of VPFIT: your own version (probably called "vpfit") and one standard version compiled for use with ASCII files as explained above (called, e.g., vpg_vpfit). Simply point vpguess at vpg_vpfit by setting the VPG_VPFIT environment variable appropriately (full path name!) and there will never be any confusion.
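The search order for the VPFIT executable described above can be written down as a short Python sketch. It is illustrative only: find_vpfit is a hypothetical name, the executable test is an assumption, and the sketch returns VPG_VPFIT unchecked, whereas vpguess may well verify that the file exists.

```python
import os

def find_vpfit(env=None, is_executable=None):
    """Return the VPFIT executable that would be used, or None."""
    env = os.environ if env is None else env
    if is_executable is None:
        is_executable = lambda p: os.path.isfile(p) and os.access(p, os.X_OK)
    # 1. VPG_VPFIT, if set, is taken as the full path of the executable.
    if env.get("VPG_VPFIT"):
        return env["VPG_VPFIT"]
    # 2. Otherwise try $HOME/bin/vpfit, then ./vpfit.
    candidates = []
    if env.get("HOME"):
        candidates.append(os.path.join(env["HOME"], "bin", "vpfit"))
    candidates.append(os.path.join(".", "vpfit"))
    for path in candidates:
        if is_executable(path):
            return path
    return None
```

Because VPG_VPFIT always wins, pointing it at a dedicated copy such as vpg_vpfit keeps a modified vpfit in $HOME/bin from being picked up by accident.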

For those of you who know anything about VPFIT and wonder about the details of how vpguess interacts with it: when you hit the 'F' key vpguess first writes out a fort.13 (a "save file" in vpguess lingo). It then writes all the data chunks to temporary files (in the multicolumn format expected by VPFIT) to ensure that VPFIT will work on the same data (= spectra, error arrays and continua) that you see on the screen. Thus if you have made any modifications to the data after it was read in (e.g. with the 'n' key or other features that may be added in future) these will be passed on to VPFIT. This procedure also allows the use of .sig and .cont files which cannot be read in by VPFIT. Using a system call vpguess then fires up VPFIT in "file mode", giving it appropriate answers where they are expected. (If you ever want to change the answers supplied to VPFIT you need to edit the file vpgvpfitif.c and recompile vpguess.) Finally, when the fit is done vpguess reads in the results for the fitted parameters and displays the corresponding model spectrum.

VPFIT produces various outputs. First there is the screen output. Although this rather voluminous output can be quite annoying sometimes, vpguess does not just pipe it to /dev/null but simply displays it in the terminal from which vpguess was called. I am told that some people are actually interested in the fit statistics, parameter errors and actual parameter values. Also, it may be useful to check that no error occurred during the execution of VPFIT. (In addition I guess it wouldn't be right to suppress VPFIT's copyright notice...) Secondly, VPFIT produces various files: fort.18 and fort.26 (strange names, I know...). Since there is nothing in fort.26 which is not also in fort.18, vpguess deletes fort.26 and leaves fort.18. The latter basically contains a copy of the screen output (results for parameter values, errors and various fit statistics).

If you like the idea of using vpguess to set up first guesses but would rather invoke VPFIT independently of vpguess (e.g. because you want to use some feature of VPFIT not currently supported by vpguess) then you can do so. Simply write out the results of your "guess work" to a save file (hit the 's' key). The resulting file is in the style of a VPFIT fort.13 and can therefore be used to fire up VPFIT. This procedure allows you to manually edit your guess work or add things which are currently not (fully) supported by vpguess.

VPFIT comes with its own atom.dat file. It is identical to the one included in the vpguess distribution (at least at the time of the current release). Read about it here.

VPFIT uses a Voigt profile generation routine that is less accurate (but faster) than that used by vpguess (see below). So the model spectrum displayed by vpguess is not exactly that fit by VPFIT.
 

Voigt function

The Voigt function H(a,u) is non-analytical and there are many different algorithms for calculating H in the astronomy and atmospheric physics literature. Each finds a different balance between accuracy and speed in various regions of parameter space. See Bob Wells' page or the appendix of Michael Murphy's PhD thesis for some examples. The latter includes a description of the method used in VPFIT.

To compute H(a,u) vpguess uses a Taylor expansion in a:
H(a,u) = \sum_{i=0} a^i H_i(u)
going to fourth order. High-density, high-accuracy lookup tables are used for the non-analytical functions H_1(u) and H_3(u) and an asymptotic approximation is used for u > 20. (Note: a given absorption line's contribution to the optical depth, tau, is only included for those parts of the spectrum where the line's tau is > 10^{-7}.) In those parts of the (a,u) parameter space probed by QSO absorption lines the above is more accurate than the method used by VPFIT by at least 2 orders of magnitude and is comparable to or better than the popular algorithm advanced by Humlicek (see Murphy's thesis for a direct comparison).
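For readers who want to experiment with the expansion, here is a minimal Python sketch of the series truncated after the a^4 term. It is not the vpguess implementation: instead of lookup tables it evaluates the Dawson function F(u), on which the odd-order terms depend, by direct Simpson quadrature, and it has no asymptotic branch for large u. The coefficient functions are the standard ones obtained by expanding the Voigt function in powers of a; the names dawson and voigt_taylor are hypothetical.

```python
import math

SQRT_PI = math.sqrt(math.pi)

def dawson(u):
    """Dawson function F(u) = exp(-u^2) * int_0^u exp(t^2) dt (Simpson's rule)."""
    if u == 0.0:
        return 0.0
    steps = 2 * (100 + int(20 * u * u))   # even, and finer for larger |u|
    h = u / steps
    total = math.exp(-u * u) + 1.0        # integrand exp(t^2 - u^2) at t = 0 and t = u
    for k in range(1, steps):
        t = k * h
        total += (4 if k % 2 else 2) * math.exp(t * t - u * u)
    return total * h / 3.0

def voigt_taylor(a, u):
    """H(a,u) from the Taylor series in a, truncated after the a^4 term."""
    f = dawson(u)
    g = math.exp(-u * u)
    h0 = g
    h1 = -2.0 / SQRT_PI * (1.0 - 2.0 * u * f)
    h2 = (1.0 - 2.0 * u * u) * g
    h3 = -2.0 / SQRT_PI * (2.0 / 3.0 * (1.0 - u * u)
                           - 2.0 * u * (1.0 - 2.0 / 3.0 * u * u) * f)
    h4 = (0.5 - 2.0 * u * u + 2.0 / 3.0 * u ** 4) * g
    # Horner form of h0 + a*h1 + a^2*h2 + a^3*h3 + a^4*h4.
    return h0 + a * (h1 + a * (h2 + a * (h3 + a * h4)))
```

A convenient check is the line centre, where H(a,0) = exp(a^2) erfc(a) exactly; for small a the truncated series should agree with this closely. Only H_1 and H_3 need the Dawson function, which is why those are the two terms for which tables are useful.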
 

Some tips

The following is a fairly random list of tips and hints that came to my mind while I was writing the code or things that have been pointed out to me by other people.


 

Known bugs

If you find a bug that is not listed here please let me know. See the README file that comes with the distribution for a list of bug fixes and changes.
 

Things to do and notes to myself


 

License / Disclaimer

Copyright (C) 2001,2003,2005,2008 by Jochen Liske.

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.


Joe Liske