Monitoring ADLink Data Output

Data Collation

The ADLink raw data file directories on the ten ADC card host Windows computers, 'paf1' through 'paf10', are mounted as directories /mnt/paf1, /mnt/paf2, etc. on the Linux computer 'paf0'. Hence, 'paf0' is the machine that you must log into to collate the ten data files from each scan into one file.

After logging into 'paf0' change directories with

cd /home/paf/PAF0
where you will find the Python file
paf_directories.py
In this file you will find the names of the base directory (ADC_MNT) where the data files on the ten ADC host computers are accessed, the directory (ADC_DATA) where the same files are copied to the 'paf0' disk, the directory (AGG_DIR) where the collated files are stored, and the directory (FITS_DIR) containing the FITS files associated with the data scans. Assuming that these directories are properly specified, the original ADC data files are first copied from the ten data acquisition computers to a local drive on 'paf0' with the command
/opt/local/bin/python data_copy.py [program_name] [file_name]
For example, the data from one scan is copied with
/opt/local/bin/python data_copy.py TINT_072313 2013_07_25_18:50:48
If the directory TINT_072313 doesn't exist on the 'paf0' drive it will be created. To copy a sequence of files use an incomplete file name. For example, to copy all files after 18 hours but before 19 hours use the command
/opt/local/bin/python data_copy.py TINT_072313 2013_07_25_18
The advantage of using the data_copy.py script is that files already on the 'paf0' disk will not be recopied. The script uses the Linux command rsync to accomplish this. The copied files are quite large, and there are ten files per scan, so the copy operation will take some time: roughly 12 seconds per scan, depending on network load.
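
The way an incomplete file name selects scans can be pictured as a prefix match on the date-time name. The sketch below is illustrative only: select_scans is a hypothetical helper, not a function in data_copy.py, and the real script may match names differently. Two of the sample names are made up for the example.

```python
def select_scans(scan_names, partial_name):
    """Return the scan names that begin with partial_name, sorted."""
    return sorted(name for name in scan_names if name.startswith(partial_name))


# Sample names; the 17h and 18:57 entries are invented for illustration.
scans = [
    "2013_07_25_17:58:02",
    "2013_07_25_18:50:48",
    "2013_07_25_18:57:10",
    "2013_07_25_19:14:22",
]

# Selects every scan taken during hour 18 on 2013_07_25.
print(select_scans(scans, "2013_07_25_18"))
```

A complete name is simply the longest possible prefix, so it selects a single scan.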

Now, to collate the data from the ten ADC computers for one scan into one file, use the collate.py script, as in

/opt/local/bin/python collate.py TINT_072313 2012_07_15_14:59:20
where the last argument is the date/time name of the scan for which data are to be collated. Note that the file names on the 'paf1'-'paf10' Windows disks will have 'h' and 'm' substituted for the colons. These will be translated by the Python code.
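
The colon to 'h'/'m' substitution can be sketched as follows. This is a minimal illustration of the naming rule described above, not the code in collate.py; both function names are hypothetical, and any trailing fields on the Windows file names (such as '_01.dat') are left out.

```python
def to_windows_name(scan_name):
    """'2013_07_25_18:50:48' -> '2013_07_25_18h50m48'."""
    date_hour, minutes, seconds = scan_name.rsplit(":", 2)
    return "{0}h{1}m{2}".format(date_hour, minutes, seconds)


def from_windows_name(win_name):
    """'2013_07_25_18h50m48' -> '2013_07_25_18:50:48'."""
    head, seconds = win_name.rsplit("m", 1)
    date_hour, minutes = head.rsplit("h", 1)
    return "{0}:{1}:{2}".format(date_hour, minutes, seconds)


print(to_windows_name("2013_07_25_18:50:48"))
print(from_windows_name("2013_07_25_18h50m48"))
```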

The collation process will produce either one or two files for each scan, depending on whether the two array polarizations are to be correlated separately or the full cross correlation matrix is produced. This is controlled by the FULL_CORRELATION parameter, set to either True or False in the paf_directories.py file. The only disadvantages to producing the full correlation matrix are that it takes a little more computing time and the resulting files are larger. In general, full correlation is the preferred option. In the (AGG_DIR) directory mentioned above the full correlation matrix will be stored in a file with the same date-time file name plus the extension '.agg'. If the polarizations are correlated separately, two files will be produced with file names ending in '_X.agg' and '_Y.agg'. If, for some reason, only one polarization was recorded from the ADC samplers, just one correlation output file will be written with a name ending in '_X.agg'.
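
The naming rules above can be summarized in a few lines. This is a sketch of one reading of those rules, not the logic in collate.py: agg_file_names and its n_polarizations argument are hypothetical, and the sketch assumes the single-polarization case yields '_X.agg' regardless of FULL_CORRELATION.

```python
def agg_file_names(scan_name, full_correlation, n_polarizations=2):
    """Return the collated file names expected for one scan."""
    if n_polarizations == 1:
        # Only one polarization recorded: a single '_X.agg' file.
        return [scan_name + "_X.agg"]
    if full_correlation:
        # Full cross correlation matrix: one file.
        return [scan_name + ".agg"]
    # Polarizations correlated separately: one file per polarization.
    return [scan_name + "_X.agg", scan_name + "_Y.agg"]


print(agg_file_names("2013_07_25_18:50:48", True))
print(agg_file_names("2013_07_25_18:50:48", False))
```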

Note that the FULL_CORRELATION parameter in paf_directories.py is read by later data processing and display routines so it should not be changed in the middle of a data processing session.

To save the time and effort of looking up which new scans require collation you can use the script, collate.py, without the file name. This scans the files created by the data_copy.py script and collates those which don't have corresponding '.agg' files.

/opt/local/bin/python collate.py TINT_072313
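
Finding the scans that still need collation amounts to comparing the copied scan names against the '.agg' files already present. The sketch below shows that comparison on plain lists of names; uncollated_scans is a hypothetical helper and does not reproduce how collate.py reads the directories.

```python
def uncollated_scans(copied_scans, agg_files):
    """Return copied scan names with no .agg, _X.agg or _Y.agg file."""
    collated = set()
    for name in agg_files:
        if not name.endswith(".agg"):
            continue
        stem = name[:-len(".agg")]
        # Strip a polarization suffix, if there is one.
        for suffix in ("_X", "_Y"):
            if stem.endswith(suffix):
                stem = stem[:-len(suffix)]
                break
        collated.add(stem)
    return sorted(set(copied_scans) - collated)


copied = ["2013_07_25_18:50:48", "2013_07_25_19:14:22", "2013_07_25_19:14:36"]
existing = [
    "2013_07_25_18:50:48.agg",
    "2013_07_25_19:14:22_X.agg",
    "2013_07_25_19:14:22_Y.agg",
]
print(uncollated_scans(copied, existing))
```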

Once the data files have been collated their contents can be examined as explained in the next section or processed by the Software Correlator and the accumulated cross-product spectra archived in FITS files.

Displaying Uncollated Data

The primary purpose of the ADLink Data Monitors is to verify the basic integrity of the data acquired by the ADCs. Each of the ten ADC host computers produces a data file on its local hard drive for each data scan. A sample of the contents of any of these files may be displayed on the PAF0 computer without performing the collation described above by going to the directory /users/rfisher/Applications/PAF0nGBT/PAF0 and executing

/opt/local/bin/python paf_check.py [data_dir fits_dir]
where the optional parameters are the root data directory and the directory where the corresponding FITS files for these data are stored, as described in the section above. For example,
/opt/local/bin/python paf_check.py /export/raid1 /export/raid1/gbtdata
When the PAF is used on the GBT the receiver FITS files may go to the directory /home/gbtdata. One could either copy the files to the default directory /export/raid1/gbtdata or change this directory with the second argument in the command above.

The subdirectory and file name can be selected within the GUI using the 'File' menu. The files will be found via the directory structure

/mnt/paf?/Data/(program name)/

The plot type ('Sample Values' or 'Spectra') can be selected with the 'Plot' menu.

Under the 'Help' menu is a 'Data Header' selection that displays the receiver and ADC setup parameters for this scan as taken from the receiver FITS file. See Figure 3.

This Python script must be run on paf0 because the Windows hard drives on paf1 through paf10 are cross-mounted as Linux directories under /mnt on paf0. For example,

paf0> ls /mnt
paf1  paf10  paf2  paf3  paf4  paf5  paf6  paf7  paf8  paf9
paf0> ls /mnt/paf2/Data/TGBT12C_000_00/
2013_03_23_00h35m20_01.dat  2013_03_28_14h39m04_01.dat
2013_03_23_01h23m25_01.dat  2013_03_28_14h40m58_01.dat
2013_03_23_01h27m31_01.dat  2013_03_28_15h47m14_01.dat
2013_03_26_15h26m49_01.dat  2013_03_28_15h54m21_01.dat
2013_03_26_19h16m09_01.dat  2013_03_29_19h40m18_01.dat
2013_03_26_19h35m34_01.dat  2013_03_29_19h40m42_01.dat
2013_03_27_19h42m45_01.dat  2013_03_29_20h03m17_01.dat
2013_03_28_14h23m47_01.dat  2013_04_01_15h30m05_01.dat
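
File names like those in the listing above can be turned into timestamps, for example to find the most recent scan. The sketch below is illustrative: parse_raw_name is a hypothetical helper, and the trailing '_01' field is returned as-is because its meaning is not stated here.

```python
from datetime import datetime


def parse_raw_name(file_name):
    """Split '2013_03_23_00h35m20_01.dat' into (datetime, '01')."""
    stem, ext = file_name.rsplit(".", 1)
    if ext != "dat":
        raise ValueError("not a raw data file: " + file_name)
    stamp, trailer = stem.rsplit("_", 1)
    # 'h' and 'm' are matched literally by strptime.
    return datetime.strptime(stamp, "%Y_%m_%d_%Hh%Mm%S"), trailer


listing = [
    "2013_03_28_14h23m47_01.dat",
    "2013_04_01_15h30m05_01.dat",
    "2013_03_23_00h35m20_01.dat",
]
newest = max(listing, key=lambda name: parse_raw_name(name)[0])
print(newest)
```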

Displaying Collated Data

The Python source code file for displaying collated data is in the Green Bank network directory

/home/paf/PAF0
The main source code file is paf_monitor.py, and two other source files, RcvrFitsV1p1.py and wxmpl.py, in the same directory are also required. This code uses the following modules: wx, wxmpl, matplotlib, pyfits, scipy, and numpy. These are all available by using the following python executable:
/opt/local/bin/python paf_monitor.py
This will open the data from a default program name and file name, which can be changed from the 'File' menu on the display. To open the data monitor for specific data and FITS directories, the command can be used with parameters as follows:
/opt/local/bin/python paf_monitor.py [data_dir fits_dir]
where the first parameter is the main directory for the collated data files, and the second parameter is the directory where the corresponding receiver FITS files are stored. For example, the defaults are
/opt/local/bin/python paf_monitor.py /export/raid1 /export/raid1/gbtdata
When the PAF is used on the GBT the receiver FITS files may go to the directory /home/gbtdata. One could either copy the files to the default directory /export/raid1/gbtdata or change this directory with the second argument in the command above.

There are two primary displays for collated data, as shown in Figures 1 and 2. The first shows about a tenth of a second of raw data values from twenty ADC channels. The short burst near the beginning is a sinusoidal timing calibration signal in one channel of each ADC card. It allows the relative channel timing to be corrected for a half-clock offset, which is due to a variable ambiguity between the start trigger pulse and the sample clock that runs at twice the sample rate.

Figure 2 shows the power spectra for the same data channels integrated over the first one second of data following the timing monitor burst. The spectrum intensities for all channels are plotted on the same vertical scales so a weak signal will show up as a weak spectrum. Note that it takes a few seconds to compute a new power spectrum display.
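
The kind of time-averaged power spectrum shown in Figure 2 can be sketched as an average of squared DFT magnitudes over consecutive blocks of samples. The code below is a generic illustration under that assumption, not the computation in paf_monitor.py. The function name and block length are hypothetical, and a plain DFT is used only to keep the example dependency-free; numpy.fft.fft would normally be used instead.

```python
import cmath
import math


def averaged_power_spectrum(samples, n_chan):
    """Average |DFT|^2 over consecutive blocks of n_chan samples."""
    n_blocks = len(samples) // n_chan
    if n_blocks == 0:
        raise ValueError("need at least n_chan samples")
    power = [0.0] * n_chan
    for b in range(n_blocks):
        block = samples[b * n_chan:(b + 1) * n_chan]
        for k in range(n_chan):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / n_chan)
                      for n, x in enumerate(block))
            power[k] += abs(acc) ** 2
    return [p / n_blocks for p in power]


# Test tone centred on channel 3 of a 16-channel spectrum, 4 blocks long.
tone = [cmath.exp(2j * math.pi * 3 * n / 16) for n in range(64)]
spectrum = averaged_power_spectrum(tone, 16)
print(spectrum.index(max(spectrum)))
```

Averaging more blocks reduces the noise on each channel, which is why a longer integration takes longer to compute.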

Use the 'Plot' menu to switch between data sample and spectrum displays. The left mouse button provides zoom by surrounding the area of interest with a box. Each spectrum panel zooms only within its subpanel, but the raw data values for one channel can occupy the full screen. Unzoom can be accomplished either with the right mouse button or the 'Redraw' selection in the 'Plot' menu.

The 'Polarization' menu selects which of the two polarization data (X or Y) are displayed from the full correlation data aggregation. If the aggregated data file contains only one polarization, this menu option will have no effect.

Under the 'Help' menu is a 'Data Header' selection that displays the receiver and ADC setup parameters for this scan as taken from the receiver FITS file. See Figure 3.

To display data from a different collated data file use the 'File' menu 'Open' selection to select the directory and file name. If activated in the Python code, the 'Auto' selection in the 'File' menu will watch for new collated files in the selected directory and update the display to the latest data file. The Auto update feature may be toggled on or off by selecting it again in the 'File' menu.
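
One way to picture the 'Auto' behaviour is a periodic check that compares the current directory listing with the files already seen. The sketch below is a hypothetical illustration, not the code in paf_monitor.py. It assumes that collated files end in '.agg' and that their zero-padded date-time names sort chronologically as strings.

```python
def newest_unseen(current_names, already_seen):
    """Return the latest new '.agg' name, or None if nothing new appeared.

    already_seen is a set that is updated in place with the new names.
    """
    new = sorted(name for name in current_names
                 if name.endswith(".agg") and name not in already_seen)
    already_seen.update(new)
    return new[-1] if new else None


seen = set()
# In a real monitor the listing would come from os.listdir(directory).
print(newest_unseen(["2013_07_25_18:50:48.agg"], seen))
print(newest_unseen(["2013_07_25_18:50:48.agg"], seen))
```

The second call returns None because no new file has appeared since the first check.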

Figure 1 - Raw data samples at the beginning of the collated data file.

Figure 2 - Averaged power spectra for data near the beginning of the collated data file.

Figure 3 - Receiver and ADC setup parameters from the 'Help | Header Info' menu selection.

Displaying Receiver Temps from Hot/Cold Measurements

The Python source code for computing and displaying receiver temperatures is in the Green Bank network directory

/home/paf/PAF0
The main source code file is get_trx.py and the code that actually does the computation and display is compute_trx.py. The syntax is
/opt/local/bin/python get_trx.py [program_name] [hot_load_file_name] [cold_load_file_name] -w [hot_load_temp] -c [cold_load_temp] -r [baseband freq ranges]
For example,
/opt/local/bin/python get_trx.py TINT_072313 2013_07_25_19:14:22 2013_07_25_19:14:36 -w 295 -c 15 -r 0.05-0.3,0.35-0.6
The hot and cold load temperatures default to 300 K and 10 K and may be omitted if these values are acceptable. The baseband frequency ranges are in MHz, where the total bandwidth is 0.625 MHz as set by the sampling clock rate. The '-r' parameter argument specifies a frequency mask applied to the spectra before computing mean and median receiver temperatures from the spectra. The argument of this parameter must be a string without spaces, made up of pairs of floating point numbers separated by a hyphen '-'. There may be any number of these pairs separated by commas. The syntax of the get_trx.py command may be shown with the help parameter, e.g.,
/opt/local/bin/python get_trx.py -h
If the data files have not yet been copied to the PAF0 raid drive and aggregated, get_trx.py does this first and then passes execution to compute_trx.py.
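
The '-r' argument format and the resulting frequency mask can be sketched as follows. This is an illustration, not the parsing code in get_trx.py: both function names are hypothetical, and the sketch assumes 128 equal-width channels spanning 0 to 0.625 MHz, with a channel kept when its centre frequency falls inside one of the ranges.

```python
def parse_ranges(arg):
    """'0.05-0.3,0.35-0.6' -> [(0.05, 0.3), (0.35, 0.6)]."""
    ranges = []
    for pair in arg.split(","):
        low, high = (float(value) for value in pair.split("-"))
        if low >= high:
            raise ValueError("range must be low-high: " + pair)
        ranges.append((low, high))
    return ranges


def channel_mask(ranges, n_chan=128, bandwidth_mhz=0.625):
    """Return one boolean per channel; True means the channel is used."""
    width = bandwidth_mhz / n_chan
    mask = []
    for i in range(n_chan):
        centre = (i + 0.5) * width  # assumed channel centre in MHz
        mask.append(any(low <= centre <= high for low, high in ranges))
    return mask


ranges = parse_ranges("0.05-0.3,0.35-0.6")
mask = channel_mask(ranges)
print(ranges)
print(sum(mask), "of", len(mask), "channels kept")
```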

The algorithm used by this script is

T_rx = (T_cold - R * T_hot) / (R - 1)
where T_hot and T_cold are the values given with the '-w' and '-c' options (or their defaults), and R is the ratio of powers computed from the two data scans.
R = P_cold / P_hot
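
As a numerical check of the formula, the sketch below applies it to made-up powers. It assumes the measured power is proportional to the sum of the load and receiver temperatures, so a 35 K receiver on 295 K and 15 K loads gives powers in the ratio 330 to 50. The function name is hypothetical, and this is not the code in compute_trx.py.

```python
def receiver_temp(p_hot, p_cold, t_hot=300.0, t_cold=10.0):
    """T_rx = (T_cold - R * T_hot) / (R - 1), with R = P_cold / P_hot."""
    r = p_cold / p_hot
    return (t_cold - r * t_hot) / (r - 1.0)


# Made-up powers in arbitrary units: P is proportional to T_load + T_rx.
t_rx = receiver_temp(p_hot=295.0 + 35.0, p_cold=15.0 + 35.0,
                     t_hot=295.0, t_cold=15.0)
print(round(t_rx, 2))
```

Only the ratio of the two powers matters, so the arbitrary power units cancel out.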

To mitigate the effects of narrow-band interference the values of P_cold, P_hot, and R are derived from 128-channel spectra integrated for 2 seconds each. This integration takes about 30 seconds. The displayed values for R are the mean and median of channel-by-channel power ratios, omitting the spectral channels that fall outside of the specified frequency ranges. The command line and output text display look like the table below, although the values are not from a real hot-cold load measurement.

paf0> /opt/local/bin/python get_trx.py TINT_072313 2013_07_25_19:14:22 2013_07_25_19:14:36 -w 295 -c 15 -r 0.05-0.3,0.35-0.6

hot_temp: 295.00  cold_temp: 15.00
                                  RcvrTemp
ADC ADCChan Element MixerCard  mean    median
  1     1      1X       1      36.08    36.01
  1     2     11X       3      34.73    34.68
  1     3      1Y       6      31.64    31.62
  1     4     11Y       8      30.02    30.07
  2     1      2X       1      36.25    36.29
  2     2     12X       3      40.45    40.47
  2     3      2Y       6      29.10    29.15
  2     4     12Y       8      36.24    36.17
  3     1      3X       1      34.55    34.63
  3     2     13X       4      23.35    23.20
  3     3      3Y       6      35.03    35.05
  3     4     13Y       9      32.09    32.06
  4     1      4X       1      39.09    39.12
  4     2     14X       4      31.93    32.01
  4     3      4Y       6      33.12    33.13
  4     4     14Y       9      32.13    32.20
  5     1      5X       2      26.85    26.86
  5     2     15X       4      42.85    42.83
  5     3      5Y       7      27.39    27.32
  5     4     15Y       9      35.55    35.53
  6     1      6X       2      34.01    33.94
  6     2     16X       4      31.88    31.83
  6     3      6Y       7      30.93    31.02
  6     4     16Y       9      32.48    32.51
  7     1      7X       2      37.66    37.79
  7     2     17X       5      30.94    30.97
  7     3      7Y       7      30.43    30.41
  7     4     17Y      10      32.46    32.42
  8     1      8X       2      35.18    35.16
  8     2     18X       5      32.45    32.35
  8     3      8Y       7      35.68    35.86
  8     4     18Y      10      25.35    25.33
  9     1      9X       3      36.30    36.30
  9     2     19X       5      32.62    32.67
  9     3      9Y       8      35.72    35.69
  9     4     19Y      10      31.77    31.90
 10     1     10X       3      36.00    36.08
 10     2      NC       5      32.02    31.89
 10     3     10Y       8      39.85    39.86
 10     4      NC      10      31.75    31.68

 

where the channel and element mapping is read from the file

/export/raid1/PafParam/element_adc_map.txt