Interferometers I

The largest fully steerable dishes have diameters $D \approx 100$ m.  The angular resolution of a diffraction-limited telescope is $\theta \approx \lambda/D$ radians, so impossibly large diameters are needed to achieve sub-arcsecond resolution at radio wavelengths.  Tracking accuracy is also a problem for a large single dish.  The telescope beam should be able to follow a radio source on the sky within $\sigma \approx \theta/10$ for reasonably accurate photometry or imaging.  The accuracy with which the actual beam direction during an observation can be recovered by data analysis determines the accuracy with which the sky position of a radio source can be measured.  Gravitational sagging, deformations caused by differential solar heating, and torques caused by wind gusts combine to limit the mechanical tracking and pointing accuracies of the best radio telescopes to $\sigma \sim 1"$.  The geometric area of a single dish is only $\pi D^2/4$, while the geometric area $N \pi D^2/4$ of an interferometer with $N$ dishes can be arbitrarily large.  Finally, the continuum sensitivity of a single dish is strongly limited by confusion at frequencies below about 10 GHz.
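To put the resolution limit $\theta \approx \lambda/D$ in numbers, here is a rough worked example. The wavelength $\lambda = 21$ cm and the target resolution $\theta = 1$ arcsec $\approx 4.85 \times 10^{-6}$ rad are illustrative choices, not values taken from the text above:
$$D \approx {\lambda \over \theta} = {0.21 {\rm ~m} \over 4.85 \times 10^{-6} {\rm ~rad}} \approx 4.3 \times 10^4 {\rm ~m}~,$$
or roughly 43 km, several hundred times the diameter of the largest steerable dishes.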

Interferometers comprising $N \geq 2$ moderately small dishes mitigate these problems and many other practical problems associated with single dishes, such as vulnerability to fluctuations in atmospheric emission and receiver gain, radio-frequency interference, and pointing shifts caused by atmospheric refraction.  Historically, the total bandwidths and numbers of simultaneous frequency channels of interferometers were lower than those of single dishes.  Recent advances in correlator electronics have largely overcome these practical limitations, so interferometers are playing an increasingly dominant role in observational radio astronomy.

photo of Westerbork

The Westerbork Synthesis Radio Telescope (WSRT) consists of fourteen 25-meter telescopes on an east-west baseline 3 km in length. It has the effective collecting area of a single dish with $D \approx 92$ m, the wide field-of-view of a 25 m telescope, the high angular resolution of a telescope 3000 m in diameter, and it can measure positions of radio sources with sub-arcsecond accuracy despite larger pointing errors of the individual telescopes.

The Two-Element Narrow-band Interferometer

The basic interferometer is a pair of radio telescopes whose voltage outputs are correlated (multiplied and averaged). Even the most elaborate interferometers with $N \gg 2$ elements can be treated as $N(N-1)/2$ independent interferometer pairs, so we begin by analyzing the simplest case, a two-element narrowband interferometer.

This block diagram shows the components of the simplest two-element interferometer observing in a very narrow frequency range centered on $\nu = \omega / (2 \pi)$.  The correlator multiplies and averages the voltage outputs $V_1$ and $V_2$ of the two dishes.  $\hat{s}$ is the unit vector in the direction of a distant point source and $\vec{b}$ is the vector baseline from antenna 1 to antenna 2. The output voltage $V_1$ of antenna 1 is the same as the output voltage $V_2$ of antenna 2, but it is retarded by the geometric delay $\tau_{\rm g} = \vec{b} \cdot \hat{s} / c$.  These voltages are multiplied and time averaged by the correlator to yield an output response whose amplitude $R$ is proportional to the point-source flux density and whose phase depends on the delay and the frequency.  The quasi-sinusoidal output fringe shown occurs if the source direction in the interferometer frame is changing at a constant rate $d \theta / dt$.  The broad Gaussian envelope of the fringe is caused by primary-beam attenuation if the individual dishes do not track the source.

The figure shows two identical dishes separated by a baseline vector of length $b$.  Both dishes are pointing in the direction specified by ${\hat{s}}$.  Plane waves from a distant point source in this direction must travel an extra distance $\vec{b} \cdot {\hat{s}} = b \cos\theta$ to reach antenna 1, so the output of antenna 1 is the same as that of antenna 2, but it lags by the geometric delay $\tau_{\rm g} = {\vec b} \cdot {\hat{s}} / c$.  For simplicity, we first consider a quasi-monochromatic interferometer, one that responds only to radiation in a very narrow band centered on frequency  $\nu =  \omega / (2 \pi)$.  Then the output voltages of antennas 1 and 2 can be written as
$$V_1 = V \cos [\omega (t - \tau_{\rm g})] \quad {\rm and} \quad V_2 = V \cos (\omega t) ~,$$ where $t$ is time.

The correlator first multiplies these two voltages to yield the product
$$V_1 V_2 = V^2 \cos(\omega t) \cos[\omega (t - \tau_{\rm g})] = \biggl({V^2 \over 2}\biggr) [\cos (2 \omega t - \omega \tau_{\rm g}) + \cos (\omega \tau_{\rm g})]$$ and then takes a time average long enough [$\Delta t \gg (2\omega)^{-1}$] to remove the high-frequency term $\cos(2\omega t - \omega\tau_{\rm g})$ from the final output $R$:
$$R = \langle V_1 V_2 \rangle = \biggl({V^2 \over 2}\biggr) \cos(\omega \tau_{\rm g})~.$$ The amplitudes $V_1$ and $V_2$ are proportional to the electric field produced by the source multiplied by the voltage gains of antennas 1 and 2.  Thus the output amplitude $V^2/2$ is proportional to the point-source flux density $S$ multiplied by $(A_1 A_2)^{1/2}$, where $A_1$ and $A_2$ are the effective collecting areas of the two antennas.  Uncorrelated noise from the receivers and the atmosphere over the two telescopes does not appear in the correlator output, so fluctuations in receiver gain or atmospheric emission are much less significant than for a total-power observation with a single dish.  Pulsed interference with duration $t \ll \vert \vec{b} \vert / c$ is also suppressed because it usually does not reach both telescopes simultaneously.
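The multiply-and-average step can be checked numerically. The following is a minimal sketch, not a model of real correlator hardware: the function name `correlate`, the sampling scheme, and the values of $V$, $\nu$, and $\tau_{\rm g}$ are all illustrative assumptions.

```python
import math

def correlate(V, omega, tau_g, n_periods=100, samples_per_period=64):
    """Multiply V1 and V2 sample by sample, then average over whole periods."""
    dt = (2.0 * math.pi / omega) / samples_per_period
    n = n_periods * samples_per_period
    total = 0.0
    for k in range(n):
        t = k * dt
        v1 = V * math.cos(omega * (t - tau_g))  # antenna 1 lags by tau_g
        v2 = V * math.cos(omega * t)
        total += v1 * v2
    return total / n

V = 2.0                          # illustrative voltage amplitude
omega = 2.0 * math.pi * 1.0e9    # illustrative: nu = 1 GHz
tau_g = 0.3e-9                   # illustrative geometric delay of 0.3 ns

R_numeric = correlate(V, omega, tau_g)
R_formula = (V**2 / 2.0) * math.cos(omega * tau_g)
print(R_numeric, R_formula)
```

Averaging over an integer number of periods removes the $\cos(2\omega t - \omega\tau_{\rm g})$ term, so the two printed numbers should agree to within rounding error.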

The correlator output voltage $R = (V^2 / 2) \cos(\omega \tau_{\rm g})$ varies sinusoidally with the change of source direction in the interferometer frame.  These sinusoids are called fringes, and the fringe phase
$$\phi = \omega \tau_{\rm g} = {\omega \over c} b \cos\theta$$ depends on $\theta$ as follows:
$$ \biggl\vert {d \phi \over d \theta} \biggr\vert = {\omega \over c} b \sin \theta
= 2 \pi \biggl( {b \sin \theta \over \lambda}\biggr)~.$$ The fringe period ($\Delta \phi = 2 \pi$) corresponds to an angular change $\Delta \theta = \lambda / (b \sin \theta)$.  The fringe phase is an exquisitely sensitive measure of source position if the projected baseline $b \sin \theta$ is many wavelengths long.  Note that fringe phase, and hence measured source position, are not affected by small tracking errors of the individual telescopes.  Fringe phase depends instead on time delays, and times can be measured by clocks with much higher accuracy than angles (ratios of lengths of moving telescope parts) can be measured by rulers.  Also, an interferometer whose baseline is horizontal is not affected by the plane-parallel component of atmospheric refraction, which delays the signals reaching both telescopes equally.  Consequently, interferometers can determine the positions of compact radio sources with unmatched accuracy.  Absolute positions with errors as small as $\sigma_\theta \approx 10^{-3}$ arcsec and differential positions with errors down to  $\sigma \approx 10^{-5}$ arcsec $<10^{-10}$ rad have frequently been measured.
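As a quick numerical illustration of the fringe period $\Delta\theta = \lambda/(b \sin\theta)$, the sketch below converts it to arcseconds. The helper name `fringe_period_arcsec` and the example values (a 10 km baseline observed at 20 cm, with the source perpendicular to the baseline) are assumptions chosen for illustration.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def fringe_period_arcsec(wavelength_m, baseline_m, theta_deg):
    """Angular fringe period lambda / (b sin theta), in arcseconds."""
    projected = baseline_m * math.sin(math.radians(theta_deg))
    return (wavelength_m / projected) * ARCSEC_PER_RAD

# Illustrative case: lambda = 0.2 m, b = 10 km, theta = 90 degrees
period = fringe_period_arcsec(0.2, 1.0e4, 90.0)
print(period)
```

For these values the fringe period is about 4 arcsec, so a phase measurement good to a small fraction of a turn locates the source to a small fraction of that angle.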


The VLBA was used to measure the parallax and proper motion of the radio star T Tau Sb.  The measured parallax $\Pi = 6.82 \pm 0.03$ milli-arcsec implies a distance $D = 146.7 \pm 0.6$ pc (Loinard et al. 2007, ApJ, 671, 546), an improvement by nearly two orders of magnitude over the distance $D = 177^{+68}_{-39}$ pc obtained by the Hipparcos astrometry satellite.

If the individual antennas comprising an interferometer were isotropic, the interferometer point-source response would be a sinusoid spanning the sky.  Such an interferometer is sensitive to only one Fourier component of the sky brightness distribution, the component with angular period $\lambda / (b \sin \theta)$.  The response $R$ of a two-element interferometer with directive antennas is that sinusoid multiplied by the product of the voltage patterns of the individual antennas.  Normally the two antennas are identical, so this product is the power pattern of the individual antennas and is called the primary beam.  The primary beam is usually a Gaussian much wider than a fringe period, as indicated in the block diagram above.  The Fourier transform of the product of two functions is the convolution of their Fourier transforms, so the interferometer with directive antennas responds to a finite range of angular frequencies centered on $b \sin \theta / \lambda$.  Since the antenna diameters $D$ must be smaller than the baseline $b$ (else the antennas would overlap), the angular frequency response cannot extend to zero and the interferometer cannot detect an isotropic source such as the bulk of the 3 K cosmic microwave background.

Improving the instantaneous point-source response of an interferometer requires more Fourier components; that is, more baselines.  An interferometer with $N$ antennas contains $N(N-1)/2$ pairs of antennas, each of which can be connected as an interferometer, so the instantaneous synthesized beam (the point-source response obtained by averaging the outputs of all pairs) rapidly approaches a Gaussian as $N$ increases.  The point-source responses of a two-element interferometer with projected baseline length $b$, a three-element interferometer with three baselines (projected lengths $b/3$, $2b/3$, and $b$), and a four-element interferometer with six baselines (projected lengths $b/6$, $2b/6$, $3b/6$, $4b/6$, $5b/6$, and $b$) are shown below.

The instantaneous point-source responses of interferometers with overall projected length $b$ and two, three, or four antennas distributed as shown are indicated by the thick curves.  The synthesized main beam of the four-element interferometer is nearly Gaussian with angular resolution $\approx \lambda/b$, but the sidelobes are still significant and there is a broad negative "bowl" caused by the lack of spacings shorter than the diameter of an individual antenna.  The individual responses of the three two-element interferometers comprising the three-element interferometer and the six two-element interferometers comprising the four-element interferometer are plotted as thin curves.
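The averaging of baseline responses described above can be sketched in a few lines. This is a simplified one-dimensional model under stated assumptions: isotropic antennas (no primary beam), small angular offsets from the direction perpendicular to the baseline, and baseline lengths expressed in wavelengths. The function name `synthesized_beam` and the value of `b` are illustrative.

```python
import math

def synthesized_beam(offset_rad, baselines_wavelengths):
    """Average the cosine fringes of all baselines at one angular offset."""
    terms = [math.cos(2.0 * math.pi * u * offset_rad)
             for u in baselines_wavelengths]
    return sum(terms) / len(terms)

b = 1000.0  # assumed longest projected baseline, in wavelengths
two = [b]
three = [b / 3, 2 * b / 3, b]
four = [k * b / 6 for k in range(1, 7)]

for name, baselines in (("two", two), ("three", three), ("four", four)):
    peak = synthesized_beam(0.0, baselines)
    one_period_off = synthesized_beam(1.0 / b, baselines)
    print(name, peak, one_period_off)
```

At an offset of one fringe period of the longest baseline ($1/b$ radians), the single-baseline response returns to its peak value, while the three- and four-element responses average to approximately zero there. This is the sense in which extra baselines suppress the repeated fringes and leave a single main beam.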

Most radio sources are stationary; that is, their brightness distributions do not change significantly on the time scales of astronomical observations.  For stationary sources, a two-element interferometer with moveable antennas can make $N(N-1)/2$ observations to duplicate one observation with an $N$-element interferometer.

Extended Sources and the Complex Correlator

The response of the two-element interferometer with "cosine" correlator output
$R_{\rm c} = (V^2 / 2) \cos(\omega \tau_{\rm g})$ to a spatially incoherent extended source with sky brightness distribution $I_\nu(\hat{s})$ near frequency $\nu = \omega/(2 \pi)$ is obtained by treating the extended source as the sum of independent point sources:
$$ R_{\rm c} = \int  I_\nu(\hat{s}) \cos(2 \pi \nu \vec{b}\cdot \hat{s} / c) d \Omega = \int I_\nu(\hat{s}) \cos(2 \pi \vec{b} \cdot \hat{s} /\lambda) d \Omega~.$$ Notice that the even cosine function in this response is sensitive only to the even (inversion-symmetric) part $I_{\rm E}$ of an arbitrary source brightness distribution, which can be written as the sum of even and odd (antisymmetric) parts: $I = I_{\rm E} + I_{\rm O}$.  To detect the odd part $I_{\rm O}$ we need a "sine" correlator whose output is odd,
$R_{\rm s} = (V^2 / 2) \sin(\omega \tau_{\rm g})$.  This is implemented by a second correlator that follows a $90^\circ$ phase delay inserted into the output of one antenna.  Then
$$R_{\rm s} = \int I_\nu(\hat{s}) \sin(2 \pi \vec{b}\cdot \hat{s} / \lambda) d \Omega~.$$ The combination of "cosine" and "sine" correlators is called a "complex" correlator because it is convenient to write the cosines and sines as complex exponentials using the identity
$$ e^{i \phi} = \cos(\phi) + i\sin(\phi)~,$$ which you can verify by expanding each term as a Taylor series:
$$ 1 + i\phi - {\phi^2 \over 2} - i{\phi^3 \over 6} + {\phi^4 \over 24} + \cdots = \biggl(1 - {\phi^2 \over 2} + {\phi^4 \over 24} - \cdots\biggr)
+ i \biggl(\phi - {\phi^3 \over 6} + \cdots\biggr)~.$$ We define the complex visibility $V \equiv R_{\rm c} -i R_{\rm s}$ and write it in the form
$$V = A e^{-i\phi}$$ where
$$A = (R_{\rm c}^2 + R_{\rm s}^2)^{1/2}$$ is the visibility amplitude and
$$ \phi = {\rm tan}^{-1} (R_{\rm s} / R_{\rm c})$$ is the visibility phase.  The response to an extended source with brightness distribution $I_\nu(\hat{s})$ of the two-element interferometer with a complex correlator is the complex visibility
$$\bbox[border:3px blue solid,7pt]{V_\nu = \int I_\nu(\hat{s})
\exp(-i2\pi \vec{b} \cdot\hat{s} / \lambda) \, d \Omega}\rlap{\quad \rm {(3F1)}}$$
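For a handful of point sources, the integral in Equation 3F1 reduces to a sum, which makes the roles of $R_{\rm c}$, $R_{\rm s}$, $A$, and $\phi$ easy to check. The sketch below is one-dimensional and assumes that $\vec{b}\cdot\hat{s}/\lambda$ can be written as $u\,l$, where $u$ is the projected baseline in wavelengths and $l$ is a small angular offset in radians. The names `visibility`, `u`, and `l`, and the numerical values, are illustrative.

```python
import cmath
import math

def visibility(sources, u):
    """Complex visibility of point sources on a single baseline.

    sources: list of (flux, offset) pairs, offset l in radians
    u: projected baseline length in wavelengths
    """
    return sum(S * cmath.exp(-2j * math.pi * u * l) for S, l in sources)

u = 1000.0                               # illustrative baseline, wavelengths
single = [(1.0, 1.0e-4)]                 # one source, offset from the center
pair = [(1.0, -1.0e-4), (1.0, 1.0e-4)]   # even (symmetric) distribution

V_single = visibility(single, u)
R_c, R_s = V_single.real, -V_single.imag  # since V = R_c - i R_s
A = math.hypot(R_c, R_s)                  # visibility amplitude
phi = math.atan2(R_s, R_c)                # visibility phase

V_pair = visibility(pair, u)
print(A, phi, V_pair)
```

The single offset source gives an amplitude equal to its flux and a phase $\phi = 2\pi u l$, while the symmetric pair gives a purely real visibility, consistent with the statement that the sine correlator responds only to the odd part of the brightness distribution.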

Effects of Finite Bandwidths and Averaging Times

Equation 3F1 for quasi-monochromatic interferometers may be generalized to interferometers with finite bandwidths and integration times, which are necessary for high sensitivity. If the source brightness and the response of the interferometer are constant in the small but finite frequency range $\Delta \nu$ centered on frequency $\nu_{\rm c}$, Equation 3F1 becomes
$$V = \int \biggl[ (\Delta \nu)^{-1} \int_{{\nu_{\rm c}} - \Delta \nu/2}^{{\nu_{\rm c}}+\Delta\nu/2}  I_\nu(\hat{s})
\exp(-i2\pi \vec{b} \cdot\hat{s} / \lambda) \,d\nu \biggr] d \Omega$$
$$V = \int \biggl[ (\Delta \nu)^{-1} \int_{{\nu_{\rm c}} - \Delta \nu/2}^{{\nu_{\rm c}}+\Delta\nu/2}  I_\nu(\hat{s})
\exp(-i2\pi \nu \tau_{\rm g}) \,d\nu \biggr] d \Omega$$ The integral in brackets is just the Fourier transform of a rectangle function, so
$$V = \int I_\nu(\hat{s}) {\rm sinc}(\Delta\nu\tau_{\rm g})
\exp(-i 2 \pi \nu_{\rm c}\tau_{\rm g}) d \Omega~.$$ For a finite bandwidth and delay, the fringe amplitude is attenuated by the factor ${\rm sinc}(\Delta\nu\tau_{\rm g})$.  This attenuation can be eliminated in any one direction $\hat{s}_0$ called the delay center by introducing a compensating delay $\tau_0 \approx \tau_{\rm g}$ in the signal path of the "leading" antenna, as shown below.  As the Earth turns, $\tau_0$ must be continuously adjusted to track $\tau_{\rm g}$ within a tolerance $\vert \tau_0 - \tau_{\rm g} \vert \ll (\Delta\nu)^{-1}$.  This is usually done with digital electronics.

The compensating delay $\tau_0$, shown here as an extra loop of cable between antenna 2 and the correlator, must track the geometric delay $\tau_{\rm g}$ in the direction $\hat{s}_0$ of the delay center accurately enough to keep $\vert \tau_0 - \tau_{\rm g} \vert \ll (\Delta\nu)^{-1}$ in order to minimize attenuation.
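The sinc attenuation derived above can be verified by averaging the fringe term over the band numerically. This is a sketch under stated assumptions: a perfectly rectangular bandpass, a source spectrum that is flat across the band, and the normalized convention ${\rm sinc}(x) = \sin(\pi x)/(\pi x)$ that the rectangle-function transform yields. The function names and the numerical values are illustrative, and `tau` stands for the residual delay $\tau_{\rm g} - \tau_0$ left after compensation.

```python
import cmath
import math

def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x)."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def band_averaged_fringe(nu_c, delta_nu, tau, n=2000):
    """Average exp(-i 2 pi nu tau) over a rectangular band (midpoint rule)."""
    total = 0j
    for k in range(n):
        nu = nu_c - delta_nu / 2.0 + (k + 0.5) * delta_nu / n
        total += cmath.exp(-2j * math.pi * nu * tau)
    return total / n

nu_c = 1.5e9        # illustrative center frequency, Hz
delta_nu = 50.0e6   # illustrative bandwidth, Hz
tau = 10.0e-9       # illustrative residual delay, s

numeric = band_averaged_fringe(nu_c, delta_nu, tau)
analytic = sinc(delta_nu * tau) * cmath.exp(-2j * math.pi * nu_c * tau)
print(abs(numeric), abs(analytic))
```

With $\Delta\nu\,\tau = 0.5$ the fringe amplitude falls to about 64% of its unattenuated value, which illustrates why the residual delay must be kept much smaller than $(\Delta\nu)^{-1}$.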

Since the geometric delay varies with direction, delay compensation can be exact in only one direction.  The angular radius $\Delta\theta$ of the usable field-of-view is determined by the variation of $\tau_{\rm g}$ with offset $\Delta\theta$ from the direction $\hat{s}_0$. Since
$c \tau_{\rm g} = \vec{b} \cdot \hat{s} = b \cos\theta$, $\vert c \Delta\tau_{\rm g} \vert = b \sin\theta \, \Delta \theta$.  Requiring $$\Delta \nu \Delta\tau_{\rm g} \ll 1$$ implies
$$\Delta \nu (b \sin\theta) \Delta\theta / c \ll 1~.$$ Substituting  $\lambda\nu = c $ and using $\theta_{\rm s} \approx \lambda / (b \sin\theta)$ for the synthesized beamwidth, we get the requirement
$$\bbox[border:3px blue solid,7pt]{\Delta \theta \Delta \nu \ll \theta_{\rm s} \nu }\rlap{\quad \rm {(3F2)}}$$ At larger offsets $\Delta \theta$, bandwidth smearing will radially broaden the synthesized beam by convolving it with a rectangle of angular width $\Delta \theta \Delta \nu / \nu$. 

Satisfactory wide-field images can be made with a larger total bandwidth only by dividing that bandwidth into a number of narrower frequency channels each satisfying Equation 3F2.  For example, the synthesized beamwidth of the VLA "B" configuration (maximum baseline length $ b \approx 10 {\rm ~km}$) at $\lambda = 20 {\rm ~cm}$ ($\nu = 1.5 {\rm ~GHz}$) is $\theta_{\rm s} \approx [(0.2 {\rm ~m})/(10^4 {\rm ~m})] {\rm ~rad} \approx 4 {\rm ~arcsec}$.  To image out to an angular radius $\Delta \theta = 15{\rm ~arcmin} = 900{\rm ~arcsec}$ equal to the half-power radius of the VLA primary beam requires channel bandwidths $\Delta \nu$ small enough that
$$\Delta \nu \ll {\nu \theta_{\rm s} \over \Delta\theta}
= {1.5 \times 10^9{\rm ~Hz} \times 4 {\rm ~arcsec}
\over 900 {\rm ~arcsec}} \approx 7 {\rm ~MHz}~.$$

The correlator averaging time $\Delta t$ must be kept short enough that the Earth's rotation will not move the source position in the frame of the interferometer by as much as the synthesized beamwidth $\theta_{\rm s} \approx \lambda / b$.  For example, if the delay is set to track the north celestial pole, a source $\Delta \theta$ away from the north pole will appear to move at an angular rate $2 \pi \Delta \theta / P$, where $P \approx 23^{\rm h} 56^{\rm m} 04^{\rm s} \approx 86164 {\rm ~s}$ is the Earth's sidereal rotation period. Excessive correlator averaging times will cause time smearing that tangentially broadens the synthesized beam.  To minimize time smearing in an image of angular radius $\Delta \theta$, we require
$$\bbox[border:3px blue solid,7pt]{\Delta\theta \Delta t \ll {\theta_{\rm s} P \over 2 \pi} \approx \theta_{\rm s} \times 1.37\times10^4 {\rm ~s} }\rlap{\quad \rm {(3F3)}}$$ Continuing with the previous example, to image out to an angular radius $\Delta\theta = 900 {\rm ~arcsec}$ when $\theta_{\rm s} = 4 {\rm ~arcsec}$ requires averaging times $\Delta t$ short enough that
$$\Delta t \ll {\theta_{\rm s} \over \Delta \theta} \times 1.37\times10^4 {\rm ~s} = {4 {\rm ~arcsec} \over 900 {\rm ~arcsec}} \times 1.37 \times 10^4 {\rm ~s} \approx 60 {\rm ~s}$$
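The two smearing limits, Equations 3F2 and 3F3, are easy to evaluate together for other combinations of beamwidth and field radius. The sketch below reproduces the VLA numbers used above. The function names are illustrative, and the returned values are only the right-hand sides of the inequalities, so practical channel widths and averaging times should be well below them.

```python
import math

SIDEREAL_DAY_S = 86164.0  # Earth's sidereal rotation period P, in seconds

def max_channel_bandwidth_hz(nu_hz, theta_s, delta_theta):
    """Right-hand side of Equation 3F2, solved for the bandwidth."""
    return nu_hz * theta_s / delta_theta

def max_averaging_time_s(theta_s, delta_theta):
    """Right-hand side of Equation 3F3, solved for the averaging time."""
    return (theta_s / delta_theta) * SIDEREAL_DAY_S / (2.0 * math.pi)

# Example from the text: nu = 1.5 GHz, theta_s = 4 arcsec, field radius 900 arcsec.
# The two angles only need to share the same unit.
bw_limit = max_channel_bandwidth_hz(1.5e9, 4.0, 900.0)
dt_limit = max_averaging_time_s(4.0, 900.0)
print(bw_limit / 1.0e6, "MHz;", dt_limit, "s")
```

Both limits scale as $\theta_{\rm s}/\Delta\theta$, so doubling the baseline length at fixed field radius halves both the allowed channel bandwidth and the allowed averaging time.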

Earth-rotation Aperture Synthesis

We can use the Earth's rotation to vary the projected baseline coverage of an interferometer whose elements are fixed on the ground.  In particular, all baselines of an interferometer whose antennas are confined to an east-west line will remain in a single plane perpendicular to the Earth's north-south rotation axis as the Earth turns daily.  Confining all baselines to two dimensions has the computational advantage that the brightness distribution of a source is simply the two-dimensional Fourier transform of the measured visibilities.

The figure below illustrates Earth-rotation aperture synthesis by an east-west two-element interferometer at latitude $+40^\circ$ as viewed from a source at declination $\delta = +30^\circ$.  We define $u$ as the east-west component of the projected baseline in wavelengths and $v$ as the north-south component of the projected baseline in wavelengths.

Viewed from a distant radio source, at declination  $\delta = +30^\circ$ for this drawing, the Earth rotates counterclockwise with a period of one sidereal day about the north-south axis indicated by the arrow emerging from the north pole.  The antennas of a two-element east-west interferometer at latitude $+40^\circ$ are shown, from left to right, as they would appear at hour angles $-6^{\rm h}$, $-3^{\rm h}$, $0^{\rm h}$, $+3^{\rm h}$, and $+6^{\rm h}$.  Projected onto the plane of the page, which is normal to the line of sight, the interferometer baseline rotates continuously from purely north-south at $-6^{\rm h}$ through east-west at $0^{\rm h}$ and back to north-south at $+6^{\rm h}$.  The projected antenna separation also changes. During this 12-hour period, the projected baseline traces an ellipse in the $(u,v)$ plane as shown by the dashed curve, with points on the $(u,v)$ ellipse highlighting the instantaneous coverage at $-6^{\rm h}$, $-3^{\rm h}$, $0^{\rm h}$, $+3^{\rm h}$, and $+6^{\rm h}$.  The $v$ axis of the ellipse is smaller by a factor $\sin\delta$ than the $u$ axis.

During the 12-hour period centered on source transit, the interferometer traces out a complete ellipse on the $(u,v)$ plane.  The maximum value of $u$ equals the actual antenna separation in wavelengths, and the maximum value of $v$ is smaller by the projection factor $\sin\delta$, where $\delta$ is the source declination.  If the interferometer has more than two elements, or if the spacing of the two elements is changed daily, the $(u,v)$ coverage will become a number of concentric ellipses having the same shape.  Thus the synthesized beam obtained by east-west Earth-rotation aperture synthesis can approach an elliptical Gaussian.  The synthesized beamwidth is $\approx u^{-1}$ radians east-west and $\approx u^{-1} \csc\delta$ radians in the north-south direction.  The synthesized beam is circular for a source near the celestial pole, but the north-south resolution is very poor for a source near the celestial equator.
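The $(u,v)$ track of an east-west baseline can be tabulated with a short sketch. It assumes the commonly used expressions $u = b_\lambda \cos H$ and $v = b_\lambda \sin H \sin\delta$ for a purely east-west baseline of length $b_\lambda$ wavelengths at hour angle $H$. The sign convention, the function name `uv_east_west`, and the baseline length are assumptions made for illustration. With these expressions the track is a circle for a source at the celestial pole and collapses to a line for a source on the celestial equator, matching the limiting cases described above.

```python
import math

def uv_east_west(baseline_wavelengths, hour_angle_h, dec_deg):
    """Projected (u, v) of an east-west baseline, in wavelengths."""
    h = math.radians(15.0 * hour_angle_h)   # 1 hour of hour angle = 15 degrees
    dec = math.radians(dec_deg)
    u = baseline_wavelengths * math.cos(h)
    v = baseline_wavelengths * math.sin(h) * math.sin(dec)
    return u, v

b = 1000.0   # illustrative antenna separation, in wavelengths
dec = 30.0   # source declination used in the figure, in degrees

track = [(H, uv_east_west(b, H, dec)) for H in (-6, -3, 0, 3, 6)]
for H, (u, v) in track:
    print(H, round(u, 1), round(v, 1))
```

At $H = 0^{\rm h}$ the projected baseline is purely east-west with $u = b_\lambda$, and at $H = \pm 6^{\rm h}$ it is purely north-south with $\vert v \vert = b_\lambda \sin\delta$, which is half the antenna separation for $\delta = +30^\circ$.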

ATCA photo 6 x 22 m dishes

The Australia Telescope Compact Array (ATCA) consists of six 22 m telescopes on an east-west baseline and is located about 500 km northwest of Sydney, Australia.