Reducing Raw SDSS Spectroscopic Data

These are the instructions for running the spectroscopic pipeline on the raw SDSS data. You must first install the IDL code, which consists of three products: idlspec2d, idlutils, and specflat.


The Directory Structure

The directory structure for spectroscopic data is described by environment variables as follows:

$BOSS_SPECTRO_DATA/mmmmm - The raw image, "sdR-cc-eeeeeeee.fit"
$SPECLOG_DIR/mmmmm - The plug-map files, "plPlugMapM-pppp-mmmmm-rr.par"

where mmmmm refers to an MJD (such as 51690), cc refers to a camera name (such as b1 for blue-1), pppp refers to a plate number (such as 0306), eeeeeeee refers to an exposure number (such as 00003974), and rr refers to a fiber-mapper re-run number (such as 01).
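For example, with those components filled in, the full paths can be composed with IDL's built-in FILEPATH and GETENV functions. This is only an illustrative sketch; the specific MJD, camera, exposure, plate, and re-run values below are the example values quoted above.

```idl
; Illustrative sketch: compose the full paths of a raw frame and its
; plug-map file for MJD 51690, camera b1, exposure 00003974,
; plate 0306, fiber-mapper re-run 01.
rawfile = filepath('sdR-b1-00003974.fit', $
    root_dir=getenv('BOSS_SPECTRO_DATA'), subdirectory='51690')
plugfile = filepath('plPlugMapM-0306-51690-01.par', $
    root_dir=getenv('SPECLOG_DIR'), subdirectory='51690')
print, rawfile
print, plugfile
```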

At Princeton, these paths would be set as follows:

BOSS_SPECTRO_DATA=/u/dss/rawdata
SPECLOG_DIR=/u/dss/astrolog

To re-run a plate through the spectro pipeline, you need all of these files. Both paths must be set appropriately.

Obtaining the Raw Data

The raw images for a particular night ($MJD) can be found at:

sdssdata.astro.princeton.edu:/u/dss/rawdata/$MJD/sdR*.fit

The plug-map files for a particular night ($MJD) can be found at:

sdssdata.astro.princeton.edu:/u/dss/astrolog/$MJD/plPlugMapM*.par

These are also available from the CVS product "speclog".

Generating Plan Files

My convention is to put the reduced data in a directory tree separate from the raw data, with the top-level directory given by $BOSS_SPECTRO_REDUX. At Princeton, this would be set to

BOSS_SPECTRO_REDUX=/u/dss/spectro

Each plate is put in its own subdirectory, so the reductions of plate 306 would be in "/u/dss/spectro/0306".

Before running the spectro pipeline, you need to build plan files for each plate. Create the output directory $BOSS_SPECTRO_REDUX, and from that directory build the plan files:

IDL> spplan2d
IDL> spplan1d

This could take an hour to build plan files for all the data taken to date. However, you can limit this to particular nights of data by setting keywords to these procedures (see the full documentation).
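For example, to build plan files for a single night you might restrict the procedures to one MJD. The keyword name below is an assumption based on the sentence above; check the full documentation for the exact calling sequence.

```idl
; Sketch: restrict plan-file generation to the single night MJD 51690.
; The MJD keyword is assumed; see the full documentation.
spplan2d, mjd=51690
spplan1d, mjd=51690
```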

The spplan2d command builds the files "spPlan2d-pppp-mmmmm.par". There is one such file for each night a plate is observed.

The spplan1d command builds the files "spPlancomb-pppp-mmmmm.par". This file merges exposures from multiple nights of observations of the same plate if those observations were taken without re-plugging the plate. If the plate was re-plugged between nights, then a given fiber will correspond to different objects in each night, and those nights' data shouldn't be combined with "spcombine".

Note that these plan files are ASCII files (in the Yanny parameter format) that can be hand-edited. That way, you can exclude particular exposures from a reduction by commenting out the corresponding lines with hash marks (#).
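For illustration, an excluded exposure in a plan file might look like the following. The line layout here is a hypothetical sketch, not the actual structure definition; your own plan files show the real fields.

```
# Hypothetical excerpt from a spPlan2d file; the second exposure
# has been excluded by hand with a leading hash mark.
SPEXP ... sdR-b1-00003974.fit ...
#SPEXP ... sdR-b1-00003975.fit ...
```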

Running from the IDL Prompt

It takes approximately 3.5 hours to run one plate through Spectro-2D on a 1-GHz Pentium-III, and another 8 hours to run Princeton-1D.

In each output plate directory, you can run the following three commands from the IDL prompt:

IDL> spreduce2d
IDL> spcombine
IDL> spreduce1d

The spreduce2d command reduces individual exposures to "spFrame-cc-eeeeeeee.fits" files.

The spcombine command combines those exposures into the reduced plate file, "spPlate-pppp-mmmmm.fits".

The spreduce1d command finds the redshifts, and generates the file "spZbest-pppp-mmmmm.fits".
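Putting the three steps together for plate 306 observed on MJD 51690, the calls might look as follows. The file-name arguments are inferred from the plan-file and output-file conventions above; check each routine's documentation for its exact calling sequence.

```idl
; Sketch of a full reduction of plate 0306, MJD 51690
; (argument conventions assumed from the file names above).
spreduce2d, 'spPlan2d-0306-51690.par'    ; 2D reductions -> spFrame files
spcombine, 'spPlancomb-0306-51690.par'   ; combine exposures -> spPlate file
spreduce1d, 'spPlate-0306-51690.fits'    ; find redshifts -> spZbest file
```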

A number of other supplementary files are also produced. The history of the reductions is written to log files named "spDiag*.log", and some PostScript plots are written to "spDiag*.ps".

Running in the Background

For example, to reduce plate 306 from the command line,

echo "spreduce2d, 'spPlan2d-0306-51690.par'" | idl >& /dev/null &

Reducing Data Automatically with the Spectro Robot

We use an IDL script, BATCH2D, for batch processing many plates at once; it in turn calls DJS_BATCH. This script runs jobs across local or remote networks using the rsh or ssh protocols. For a remote machine, the raw data files are shipped across the network, the plate is reduced, and the reductions are shipped back. Presumably, this would work just fine on the Fermi farms. The plan files must be built before running this script. Also make certain that the remote machines have their UPS environment and the idlspec2d product set up in the ".bashrc" file, since the remote commands are launched from the bash shell.

There is a Spectro-Robot that automatically fetches data, builds plan files, and reduces the data on a day-by-day basis. The command "sprobot_start" loads the cron job. The raw data is copied to the first disk with space listed in the SPROBOT_LOCALDISKS environment variable, and a link is then built from $BOSS_SPECTRO_DATA/$MJD to that directory. At Princeton, the disk list is something like:

SPROBOT_LOCALDISKS='/scr/spectro1/data/rawdata /scr/spectro2/data/rawdata'

Other environment variables that need to be set for the Spectro-Robot:

If any of the above variables are not set, then "sprobot_start" will issue an error message and fail to load. A log file is written to the file "$BOSS_SPECTRO_DATA/sprobot.log".

There are two Yanny parameter files that list the computer names and protocols to use. The default files are "$IDLSPEC2D_DIR/examples/batch2d.par" and "$IDLSPEC2D_DIR/examples/batch1d.par", for use with Spectro-2D and Princeton-1D respectively. You can override these defaults by putting files with the same names in the directory $BOSS_SPECTRO_REDUX.

The Spectro-Robot commands:

sprobot_start   -- Start the Spectro-Robot.
sprobot_status  -- See if the Spectro-Robot is running.
sprobot_stop    -- Stop the Spectro-Robot.

Finally, if you wish *not* to run Princeton-1D, remove the line containing "sprobot1d.sh" from the file "sprobot.sh".


Maintained by David Schlegel at Princeton University, Dept. of Astrophysics, Peyton Hall, Princeton NJ 08544