The same lecture is now in the Notebook format, see there: http://python-astro.blogspot.mx/2014/09/interacting-with-files-reading-writing.html
What is the FITS format?
The FITS format is the most popular way to save and interchange astronomical data. The files are organized in units each of which contains a human readable header and a data. This structure is refereed as HDUs (Header/DATA Unit).A FITS file can contain one or more HDUs, the first of which is called "primary" and the rest are called "extensions". The primary HDU usually contains 1D spectrum, 2D image or 3D data cube, although any dimension from 0 to 999 are possible. The data are 1, 2 or 4 bytes integers or 4 or 8 bytes real numbers.
The extensions can contain or arrays as in the primary HDU or ascii tables or binary tables.
If a FITS file contains only tables, it primary HDU does not contain data, but only header.
Both headers and data in a FITS file are organized in blocs of 2880 bytes. The header contain 80 bytes lines each of which consists of a keyword of 8 bytes followed in most of the cases by '= ' in the position 9 and 10 and then the value of the keyword. The rest of the line is a comment string beginning with '/'. Each header begins with the following lines
SIMPLE = T / file conforms to FITS standard
BITPIX = 16 / number of bits per data pixel
NAXIS = 2 / number of data axes
NAXIS1 = 440 / length of data axis 1
NAXIS2 = 300 / length of data axis 2
which defines the format of the file as standard FITS, the data format and the dimensions of the stored data.
One block of 2880 bytes contains 36 lines of 80 characters per line. The header can have several blocks of 36 lines. The last block is identified by the presence of the keyword 'END' The next 2880 bytes block contains the first part of the data. The empty lines after 'END' keyword are filled with blanks and the unused bytes from the end of the data to the end of the 2880 bytes block are filled with NULLs.
Full description of the FITS format can be found at http://fits.gsfc.nasa.gov/fits_primer.html
PyFITS
As one can see, reading FITS files is not as simple as reading columns of data. Fortunately, there is a python package called PyFITS which makes the task much easier.The package can be downloaded from
http://www.stsci.edu/institute/software_hardware/pyfits
The documentation can be found on the same web page.
Although it is possible to download and install the package it is much easier to install it with
easy_install pyfits
or
pip pyfits
Information on easy_install: HERE and on pip: HERE.
I did that on both Mac and Linux machines and it working nicely.
How to read FITS file
First, as usual, one has to import the pyfits packageimport pyfits
then the file is read with
hdulist = pyfits.open('n10017o.fits')
where I used one of my FITS files from San Pedro Martir echelle spectrograph. The file can be downloaded from HERE.
The result hdulist is a list of HDU objects. In the case of a simple file, there is only one primary HDU so the list contains only one element
len(hdulist)
1
The information on what the file contains can be obtained by calling the info() method:
hdulist.info()
Filename: n10017o.fits
No. Name Type Cards Dimensions Format
0 PRIMARY PrimaryHDU 62 (2154, 2048) int16
The table said that there is only a primary HDU which contains 2154 X 2048 image with data stored in 2 bytes (16 bits) integers.
As described above, the HDU (header/data unit) contains header and data. The header is a dictionary. To see what keywords were used in the header one can do:
hdulist[0].header.keys()
['SIMPLE',
'BITPIX',
'NAXIS',
'NAXIS1',
'NAXIS2',
'EXTEND',
'COMMENT',
'BZERO',
'BSCALE',
'EXPTIME',
'DETECTOR',
'ORIGIN',
'OBSERVAT',
'TELESCOP',
'LATITUDE',
'LONGITUD',
'ALTITUD',
'SECONDAR',
'TIMEZONE',
'OBSERVER',
'OBJECT',
'INSTRUME',
'GAINMODE',
'FILTER',
'IMGTYPE',
'EQUINOX',
'ST',
'UT',
'JD',
'DATE-OBS',
'CCDSUM',
'RA',
'DEC',
'AH',
'AIRMASS',
'TMMIRROR',
'TSMIRROR',
'TAIR',
'XTEMP',
'HUMIDITY',
'ATMOSBAR',
'WIND',
'WDATE',
'DATE',
'NAMPS',
'CCDNAMPS',
'AMPNAME',
'CREATOR',
'VERSION',
'HISTORY']
and to get the value of a given keyword
hdulist[0].header['OBJECT']
'BD +59 363'
The header can be printed as it apears in the file by
print hdulist[0].header.ascardlist()
SIMPLE = T / conforms to FITS standard
BITPIX = 16 / array data type
NAXIS = 2 / number of array dimensions
NAXIS1 = 2154 / length of data axis 1
NAXIS2 = 2048 / length of data axis 2
EXTEND = T
COMMENT FITS (Flexible Image Transport System) format is defined in 'Astronomy
COMMENT and Astrophysics', volume 376, page 359; bibcode: 2001A&A...376..359H
BZERO = 32768 / BZERO
BSCALE = 1 / BSCALE
EXPTIME = 0.0 / Integration Time, sec.
DETECTOR= 'e2vm2 E2V-4240' / CCD Type
ORIGIN = 'UNAM ' / OAN SPM, IA-UNAM
OBSERVAT= 'SPM ' / Observatory
TELESCOP= '2.12m ' / Telescope
LATITUDE= '+31:02:39' / Latitude
LONGITUD= '115:27:49' / Longitud
ALTITUD = 2800 / altitud
SECONDAR= -1 / F/ Secondary type
TIMEZONE= 8 / Time Zone
OBSERVER= 'Leonid ' / Observer's Name
OBJECT = 'BD +59 363' / Object
INSTRUME= 'Echelle ' / Instrument
GAINMODE= 1 / Gain factor in the CCD
FILTER = 'None ' / Filter
IMGTYPE = 'zero ' / Image Type
EQUINOX = 2011.7 / Equinox
ST = '02:19:51.2' / Sideral Time
UT = '11:28:28' / Universal Time
JD = 2455803.5 / Julian Date
DATE-OBS= '2011-08-30' / Observation Date UTM
CCDSUM = '1 1 ' / Binning [ Cols:Rows ]
RA = ' 02:13:22.2' / Right Ascension
DEC = ' 52''14''44.0' / Declination
AH = ' 00:06:28.0' / Hour Angle
AIRMASS = 1.073 / Airmass
TMMIRROR= 0 / Primary Mirror Temperature (celsius degree)
TSMIRROR= 0 / Secundary Mirror Temperature (celsius degree)
TAIR = 0 / Internal Telescope Air Temperature (celsius deg
XTEMP = 14.6 / Exterior Temperature (celsius degree)
HUMIDITY= 44.0 / % external Humidity
ATMOSBAR= 731.7 / Atmosferic Presure in mb
WIND = 'S at 24.1 km/h' / Wind Direction
WDATE = '11:28:10, 08/30/11' / Weather Acquisition Date (Local time)
DATE = '2011-08-30T11:28:29' / file creation date (YYYY-MM-DDThh:mm:ss UT)
NAMPS = 1 / Number of Amplifiers
CCDNAMPS= 1 / Number of amplifiers used
AMPNAME = '1 Channel' / Amplifier name
CREATOR = 'Python Oan ccds' / Name of the software task that created the file
VERSION = '4.12D ' / Application Software Version
COMMENT Visit our weather site http://www.astrossp.unam.mx/weather15
COMMENT for complete meteorological data of your observation night
HISTORY bin2fits V1.0
HISTORY Programmer: Enrique Colorado [ colorado@astrosen.unam.mx ]
HISTORY Observatorio Astronomico Nacional -UNAM
HISTORY V1.00 By Arturo Nunez and Colorado >Ported to Python using pyfits
HISTORY V0.50 By E. Colorado >Added interior mirrors temperatures
HISTORY V0.49 By E. Colorado >Added BIASSEC parameter
HISTORY V0.48 By E. Colorado >Aditional info for autofocus calculations
HISTORY V0.4 By E. Colorado >Now we include timezone, and remove lat. sign
HISTORY V0.3 By E. Colorado >Now we include weather data
HISTORY V0.2 By E. Colorado >General OAN Working Release
The data in the file are accessible with
data = hdulist[0].data
and can be seen with [don't forget to import matplotlib.pyplot as plt before running this]:
plt.imshow(data)
A column from the data can be plotted with
plt.plot(data[:,1000])
where I am plotting the column number 1000. In the same way a line from the data is plotted with:
plt.plot(data[1000,:])
The data are numpy object so all manipulations are available.
Some more on displaying images
The main matplotlib function to display images is imshow. It has several parameters of which te most important are the one which controls the color scheme and the ones which control the dynamic range.plt.imshow(data, cmap=cm.gray, vmin=1000, vmax=10000)
cmap controls the pseudocolor map (same as color= in IDL). The predefined color maps can be seen on http://matplotlib.sourceforge.net/examples/pylab_examples/show_colormaps.html
vmin and vmax controls the range of values which are mapped in [0,1] range where 0 is black (the darkest color) and 1 is white (the lightest color).
origin = 'upper' | 'lower', place the pixel with coordinates 0,0 at the upper or lower corner of the plot
extent = (xmin,xmax,ymin,ymax) - This parameter defines the numbers written on the axes. No changes on the image.
Using FITS tables
For this example I'll use a spectrum obtain with the high dispersion camera on board of IUE.The file is opened as usual:
hdulist = pyfits.open('swp04345.mxhi')
Download the file from THERE.
but now hdulist has 2 elements (2 header/data units):
len(hdulist)
2
We can see that the primary header has dimension (), son does not contain any data. The data are in the extension.
hdulist.info()
Filename: swp04345.mxhi
No. Name Type Cards Dimensions Format
0 PRIMARY PrimaryHDU 421 () uint8
1 MEHI BinTableHDU 61 60R x 17C [1B, 1I, 1D, 1I, 1D, 1E, 1E, 768E, 768E, 768E, 768I, 768E, 768E, 1I, 1I, 1E, 7E]
The first header contains the minimal infirmation:
print hdulist[0].header.ascardlist()[:5]
SIMPLE = T / Standard FITS Format
BITPIX = 8 / Binary data
NAXIS = 0 / Two-dimensional image
EXTEND = T / Extensions are present
TELESCOP= 'IUE ' / International Ultraviolet Explorer
The number of axis is 0 which means there is no data block in the primary HDU. The header of the second HDU begins with the keyword XTENSION and with the specification of the data
print hdulist[1].header.ascardlist()[:5]
XTENSION= 'BINTABLE' / Binary table extension
BITPIX = 8 / Binary data
NAXIS = 2 / Two-dimensional table array
NAXIS1 = 16961 / Width of row in bytes
NAXIS2 = 60 / Number of orders
To progress further we need to know what is in the table. As usual, the columns have names and type of the stored data. These information can be obtained using the column attribute of hdulist:
cols = hdulist[1].columns
cols.info
<bound method ColDefs.info of ColDefs(
name = 'ORDER'; format = '1B'; unit = ' '
name = 'NPOINTS'; format = '1I'; unit = ' '
name = 'WAVELENGTH'; format = '1D'; unit = 'ANGSTROM'
name = 'STARTPIX'; format = '1I'; unit = 'PIXEL'
name = 'DELTAW'; format = '1D'; unit = 'ANGSTROM'
name = 'SLIT HEIGHT'; format = '1E'; unit = 'PIXEL'
name = 'LINE_FOUND'; format = '1E'; unit = 'PIXEL'
name = 'NET'; format = '768E'; unit = 'FN'
name = 'BACKGROUND'; format = '768E'; unit = 'FN'
name = 'NOISE'; format = '768E'; unit = 'FN'
name = 'QUALITY'; format = '768I'; unit = ' '
name = 'RIPPLE'; format = '768E'; unit = 'FN'
name = 'ABS_CAL'; format = '768E'; unit = 'ERG/CM2/S/A'
name = 'START-BKG'; format = '1I'; unit = 'PIXEL'
name = 'END-BKG'; format = '1I'; unit = 'PIXEL'
name = 'SCALE_BKG'; format = '1E'; unit = ' '
name = 'COEFF'; format = '7E'; unit = ' '
)>
the cols.info returns the names of the columns and the information of their format and units.
The data are available using (this example is NOT the right way of plotting the data, it's just an example) [and don't forget to import numpy as np to have np.arange working]:
data1 = hdulist[1].data
DTs = data1.ABS_CAL
WLs = data1.WAVELENGTH
DWs = data1.DELTAW
for WL, DW, DT in zip(WLs, DWs, DTs):
plot(WL + np.arange(len(DT)) * DW, DT)
Writing FITS files.
The creation of a FITS file pass through 4 steps.1) Creation of numpy array with the data.
x = np.arange(100)
2) Creation of the HDU from the data.
hdu = pyfits.PrimaryHDU(x)
thus created, the hdu has its basic header and the data.
print hdu.header.ascardlist()
SIMPLE = T / conforms to FITS standard
BITPIX = 64 / array data type
NAXIS = 1 / number of array dimensions
NAXIS1 = 100
EXTEND = T
3) Adding additional keywords to the header. The automatically created header contains only the required minimum of keywords. If additional keywords are needed they are added with
hdu.header.update('testkey',0.001,'some test value')
print hdu.header.ascardlist()
SIMPLE = T / conforms to FITS standard
BITPIX = 64 / array data type
NAXIS = 1 / number of array dimensions
NAXIS1 = 100
EXTEND = T
TESTKEY = 0.001 / some test value
4) Once all the keywords are ready, the final HDU list have to be created and written to the file:
hdulist = pyfits.HDUList([hdu])
hdulist.writeto('new.fits')
hdulist.close()
Alternative way
Another way to deal with FITS tables is to use the ATpy library, look there:http://atpy.github.com/index.html
References
To learn more on PyFITS: http://packages.python.org/pyfits/