Notes on the GUPPI Raw Data Format S. Ellingson Nov 1, 2013 This format consists of blocks, with each block consisting of a text header and a raw binary data segment. An example of a header is shown in the file "header_example.txt". The header ends with the word "END". Some important fields are: OBSFREQ: [MHz] center of the RF passband OBSBW: [MHz] width of passband; negative sign indicates spectral flip OBSNCHAN: Number of channels (subbands) NPOL: Number of polarizations times 2. For example, NPOL=4 means 2 polarizations. NBITS: Number of bits per I or Q value. So, one complex-valued sample has 2*NBITS bits TBIN: [s] sample period within a channel CHAN_BW [MSPS] sample rate for a channel. Negative sign indicates spectral flip. OVERLAP: This many samples per subband from the previous data block are repeated at the beginning of this data block. BLOCSIZE: The size of the raw data segment in bytes. The center frequency of channel i (where i is in [1..OBSNCHAN]) is OBSFREQ - OBSBW/2 + (i-0.5)*CHAN_BW [MHz]. Pseudocode describing the structure of the raw data block is as follows: --- begin ---- for channel=1..OBSNCHAN, for nsamples=1..NDIM, for polarization=1..(NPOL/2) write I, Q --- end --- Above, NDIM is the number of samples per channel in the block; i.e., BLOCSIZE/(OBSNCHAN*NPOL*(NBITS/8)). This *includes* overlap bits. For NBITS=8, the samples are "signed char". For the one and only dataset I've worked with so far (identified below): ---- OBSFREQ = 1378.125 OBSBW = -200 OBSNCHAN = 32 NPOL = 4 NBITS = 8 TBIN = 1.6E-07 CHAN_BW = -6.25 OVERLAP = 512 BLOCKSIZE=1073545216 ---- In this case, NDIM = 8387072 and the time span covered by a raw data block is NDIM*TBIN = 1.3419 s. Keep in mind, however, that this 1.3419 s span overlaps with the next block by 512 samples. As an example, src/rg.c is C source code which reads a single header + raw data block from a GUPPI raw data file, extracts one channel, and writes it back out as time and spectra. (See the source code for compiling instructions and usage.) The script src/a.sh runs "rg" repeatedly to obtain the output for all channels. src/a.gp is a Gnuplot script which reads these files and plots the entire bandpass, including all channels. "guppi.png" is the output when the above code is applied to the file "guppi_56465_J1713+0747_0006.0000.raw" (NRAO folks: /lustre/pulsar/scratch/1713+0747_global/raw). For this particular dataset there is a large DC offset in each channel, which accounts for the spike in the center of each channel bandpass. In this output, channel 1 is on the right and channel 32 is on the left. Thanks to Paul Demorest for helping me figure this out.