Essential Radio Astronomy

Chapter 3 Radio Telescopes and Radiometers

3.1 Antenna Fundamentals

An antenna is a passive device that converts electromagnetic radiation in space into electrical currents in conductors or vice versa, depending on whether it is being used for receiving or for transmitting, respectively. Radio telescopes are receiving antennas, and radar telescopes are also transmitting antennas. It is often easier to calculate the properties of transmitting antennas and to measure the properties of receiving antennas. Fortunately, most characteristics of a transmitting antenna (e.g., its radiation pattern) are unchanged when that antenna is used for receiving, so any analysis of a transmitting antenna can be applied to a receiving antenna used in radio astronomy, and any measurement of a receiving antenna can be applied to that antenna when used for transmitting.

3.1.1 Radiation from a Short Dipole Antenna (Hertz Dipole)

Figure 3.1: The coordinate system used to describe the radiation from a short (total length $l \ll \lambda$) dipole driven by a current source of frequency $\nu$.

The simplest antenna is a short (total length $l$ much smaller than one wavelength $\lambda$) dipole antenna, which is shown in Figure 3.1 as two collinear conductors (e.g., wires or conducting rods). When they are driven at the small gap between them by an oscillating current source (a transmitter), the current going into the bottom conductor is 180 degrees out of phase with the current going into the top conductor. The radiation from a dipole depends on the transmitter frequency, so consider a sinusoidal driving current $I$ with angular frequency $\omega \equiv 2\pi\nu$:

$$I = I_0\cos(\omega t), \tag{3.1}$$

where $I_0$ is the peak current going into each half of the dipole. It is computationally convenient to replace the trigonometric function $\cos(\omega t)$ with its complex exponential equivalent (Appendix B.3), the real part of

$$e^{-i\omega t} = \cos(\omega t) - i\sin(\omega t), \tag{3.2}$$

so the driving current can be rewritten as

$$I = I_0 e^{-i\omega t} \tag{3.3}$$

with the implicit understanding that only the real part of I represents the actual current. The driving current accelerates charges in the antenna wires, so Larmor’s formula can be used to calculate the radiation from the antenna by converting from charges and accelerations to time-varying currents.

The electric current in a wire is defined as the flow rate of electric charge along the wire:

$$I \equiv \frac{dq}{dt}. \tag{3.4}$$

For a wire on the $z$-axis,

$$I = \frac{dq}{dt} = \frac{dq}{dz}\frac{dz}{dt} = \frac{dq}{dz}\,v, \tag{3.5}$$

where v is the instantaneous flow velocity of the charges.

Many people incorrectly believe that the velocity $v$ of individual electrons in a wire is comparable with the speed of light $c$ because electrical signals do travel down wires at nearly the speed of light. However, a wire filled with electrons is like a garden hose already filled with water, a nearly incompressible fluid. When the faucet is turned on, water flows from the other end of a full hose almost immediately, even though individual water molecules have moved only a short distance along the hose. The example below shows that electrons move so slowly in a wire that Larmor’s nonrelativistic equation accurately predicts the radiation from antennas.

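Example. Estimate the speed of electrons flowing through a copper wire of cross section $\sigma = 1~\mathrm{mm^2} = 10^{-6}~\mathrm{m^2}$ and carrying a current of 1 ampere. The number density of free electrons is about equal to the number density of copper atoms in the wire, $n \approx 10^{29}~\mathrm{m^{-3}}$. In MKS units, the magnitude of the electron charge is

$$|e| \approx 4.80\times10^{-10}~\mathrm{statcoul} \times \frac{1~\mathrm{coul}}{3\times10^{9}~\mathrm{statcoul}} \approx 1.60\times10^{-19}~\mathrm{coul}.$$

One ampere is defined as one coulomb per second, so the number of electrons flowing past any point along the wire in one second is

$$\dot{N} = \frac{I}{|e|} \approx \frac{1~\mathrm{coul\,s^{-1}}}{1.60\times10^{-19}~\mathrm{coul}} \approx 6.25\times10^{18}~\mathrm{s^{-1}}.$$

The average electron velocity is only

$$v \approx \frac{\dot{N}}{\sigma n} \approx \frac{6.25\times10^{18}~\mathrm{s^{-1}}}{10^{-6}~\mathrm{m^2}\times10^{29}~\mathrm{m^{-3}}} \approx 6\times10^{-5}~\mathrm{m\,s^{-1}} \ll c.$$

Thus the nonrelativistic Larmor equation may safely be used to calculate the radiation from a wire.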
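The arithmetic in this example is easy to reproduce numerically. The following Python sketch is only an illustration of the estimate above; the function and constant names are hypothetical, and the constants are rounded values.

```python
# Drift speed of conduction electrons in a wire (MKS units).
E_CHARGE = 1.602e-19   # magnitude of the electron charge (coulomb), rounded
C_LIGHT = 2.998e8      # speed of light (m/s), rounded


def drift_speed(current_a, area_m2, n_per_m3):
    """Average electron speed v = I / (|e| * sigma * n), in m/s."""
    electrons_per_second = current_a / E_CHARGE   # N-dot in the text
    return electrons_per_second / (area_m2 * n_per_m3)


# Values from the example: 1 A through 1 mm^2 of copper, n ~ 1e29 m^-3.
v = drift_speed(current_a=1.0, area_m2=1e-6, n_per_m3=1e29)
print(f"v = {v:.2e} m/s, v/c = {v / C_LIGHT:.1e}")
```

The ratio v/c comes out many orders of magnitude below unity, which is the point of the example.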

Equation 2.136 from the derivation of Larmor’s formula

$$E = \frac{q\dot{v}\sin\theta}{rc^2}$$

can be applied to yield the $dE$ contributed by each infinitesimal dipole segment of length $dz$. If the dipole is short ($l \ll \lambda$), all of these electric fields are in phase and add directly to give the total $E$ produced by the dipole:

$$E = \int_{z=-l/2}^{+l/2} \frac{dq}{dz}\,dz\,\frac{\dot{v}\sin\theta}{rc^2}. \tag{3.6}$$

At distances $r \gg l$, $(1/r)$ is nearly constant over the whole antenna and can be taken outside the integral. For a sinusoidal driving current, $\dot{v} = -i\omega v$ and

$$E = \frac{-i\omega\sin\theta}{rc^2}\int_{-l/2}^{+l/2}\frac{dq}{dz}\,v\,dz = \frac{-i\omega\sin\theta}{rc^2}\int_{-l/2}^{+l/2} I\,dz. \tag{3.7}$$

That is, the radiated electric field strength $E$ is proportional to the integral of the current distribution along the antenna. The current at the center is the driving current $I = I_0 e^{-i\omega t}$, and the current must drop to zero at the ends of the antenna, where the conductivity goes to zero. The current distribution along a short dipole is the tail end of a standing-wave sinusoid, which declines almost linearly from the driving current at the center to zero at the ends:

$$I(z) \approx I_0 e^{-i\omega t}\left[1 - \frac{|z|}{(l/2)}\right]. \tag{3.8}$$

Then

$$\int_{-l/2}^{+l/2} I\,dz \approx \frac{I_0 l}{2}e^{-i\omega t} \tag{3.9}$$

and

$$E \approx \frac{-i\omega\sin\theta}{rc^2}\,\frac{I_0 l}{2}\,e^{-i\omega t}. \tag{3.10}$$

Substituting $\omega = 2\pi c/\lambda$ gives

$$E \approx \frac{-i\,2\pi c\sin\theta}{\lambda rc^2}\,\frac{I_0 l}{2}\,e^{-i\omega t} = \frac{-i\pi\sin\theta}{c}\,\frac{I_0 l}{\lambda}\,\frac{e^{-i\omega t}}{r}. \tag{3.11}$$

The time-averaged Poynting flux (power per unit area) follows from Equation 2.139; it is

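$$\langle S \rangle = \frac{c}{4\pi}\langle E^2 \rangle. \tag{3.12}$$

Thus

$$\langle S \rangle = \frac{c}{4\pi}\left(\frac{1}{2}\right)\left(\frac{I_0 l}{\lambda}\frac{\pi}{c}\right)^2\frac{\sin^2\theta}{r^2}, \tag{3.13}$$

where the factor $(1/2)$ reflects the fact that $\langle\sin^2(\omega t)\rangle = \langle\cos^2(\omega t)\rangle = 1/2$. (This is a good relation to remember and an easy one to derive because $\sin^2(\omega t) + \cos^2(\omega t) = 1$ and $\langle\sin^2(\omega t)\rangle = \langle\cos^2(\omega t)\rangle$.)

The power pattern of a transmitting antenna is the angular distribution of its radiated power, often normalized to unity at the peak. From Equation 3.13 the normalized power pattern of a short dipole is

$$P \propto \sin^2\theta. \tag{3.14}$$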

The radiation from a short dipole has the same polarization and the same doughnut-shaped power pattern as Larmor radiation from an accelerated charge because all of the charges in the short dipole are being accelerated along one line much shorter than one wavelength. From the observer’s point of view, the power received depends only on the projected (perpendicular to the line of sight) length $l\sin\theta$ of the dipole. The electric field strength received is proportional to the apparent length of the dipole, and the radiation from the dipole is linearly polarized parallel to the projected dipole. The time-averaged total power emitted is obtained by integrating the Poynting flux over the surface area of a sphere of any radius $r \gg l$ centered on the antenna:

$$\langle P \rangle = \int \langle S \rangle\,dA = \frac{c}{4\pi}\left(\frac{1}{2}\right)\left(\frac{I_0 l}{\lambda}\frac{\pi}{c}\right)^2\int_{\phi=0}^{2\pi}\int_{\theta=0}^{\pi}\frac{\sin^2\theta}{r^2}\,r\sin\theta\,d\phi\,r\,d\theta \tag{3.15}$$

$$= \frac{c}{4\pi}\left(\frac{1}{2}\right)\left(\frac{I_0 l}{\lambda}\frac{\pi}{c}\right)^2 2\pi\int_{\theta=0}^{\pi}\sin^3\theta\,d\theta. \tag{3.16}$$

Recall that $\int_0^\pi \sin^3\theta\,d\theta = 4/3$, so the time-averaged power radiated by a short dipole is

$$\langle P \rangle = \frac{\pi^2}{3c}\left(\frac{I_0 l}{\lambda}\right)^2, \tag{3.17}$$

where $I_0\cos(\omega t)$ is the driving current and $l/\lambda$ is the total length of the dipole in wavelengths.
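The step from Equation 3.13 to Equation 3.17 can be checked by integrating the flux over a sphere numerically. The sketch below uses Gaussian CGS units, as in the derivation, and a simple midpoint rule; the function names, the sample current, and the sample radius are illustrative assumptions rather than anything from the text.

```python
import math

C_CGS = 2.998e10  # speed of light (cm/s), rounded


def poynting_flux(i0, l_over_lambda, theta, r):
    """Time-averaged flux of Equation 3.13 (Gaussian CGS units)."""
    amp = i0 * l_over_lambda * math.pi / C_CGS
    return C_CGS / (4 * math.pi) * 0.5 * amp**2 * math.sin(theta)**2 / r**2


def total_power_numeric(i0, l_over_lambda, r=1.0e5, n=2000):
    """Midpoint-rule integral of the flux over a sphere of radius r."""
    dtheta = math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * dtheta
        # Area of the ring between theta and theta + dtheta.
        ring_area = 2 * math.pi * r**2 * math.sin(theta) * dtheta
        total += poynting_flux(i0, l_over_lambda, theta, r) * ring_area
    return total


def total_power_analytic(i0, l_over_lambda):
    """Equation 3.17."""
    return math.pi**2 / (3 * C_CGS) * (i0 * l_over_lambda)**2


# Illustrative inputs: I0 ~ 3e9 statamp (about 1 A), l = 0.01 wavelengths.
ratio = total_power_numeric(3e9, 0.01) / total_power_analytic(3e9, 0.01)
print(f"numeric / analytic = {ratio:.6f}")
```

The ratio should be very close to unity and should not depend on the radius chosen, because the $1/r^2$ in the flux cancels the $r^2$ in the area element.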

Most practical dipoles are half-wave dipoles ($l \approx \lambda/2$) because half-wave dipoles are resonant, meaning that they provide a nearly resistive load to the transmitter. When each half of the dipole is $\lambda/4$ long, the standing-wave current is highest at the center and naturally falls as $I = I_0\cos(2\pi z/\lambda)$ to zero at the ends of the conductors.

Figure 3.2: A ground-plane vertical antenna is just half of a dipole above a conducting plane. The lower half of the dipole is the reflection of the vertical in the mirror provided by the conducting “ground plane.” The image vertical is 180 degrees out of phase with the real vertical. Above the ground plane, the radiation from the ground-plane vertical is exactly the same as the radiation from the dipole.

The ground-plane vertical antenna shown in Figure 3.2 is very similar to the dipole. A ground-plane vertical is one half of a dipole above a conducting plane, which is called a “ground plane” because historically the conducting plane for vertical antennas was the surface of the Earth. The transmitter is connected between the base of the vertical, which is insulated from the ground, and the ground plane near the base. Many AM broadcast transmitting antennas are tall, insulated towers acting as quarter-wave verticals (at $\nu \approx 1$ MHz, $\lambda \approx 300$ m and a $\lambda/4$ vertical antenna is about 75 m high). The conducting ground plane is a mirror that creates the lower half of the dipole as the mirror image of the upper half. Electric fields produced by the vertical antenna induce currents in the conducting plane to make the horizontal component of the electric field go to zero on the conductor. The virtual electric fields from the image vertical have the same amplitude but are 180 degrees out of phase, exactly as in a half-wave dipole. Consequently the radiation field from a ground-plane vertical is identical to that of a dipole in the half space above the ground plane and zero below the ground plane.

Figure 3.3: Most high-frequency feeds are quarter-wave ground-plane verticals inside waveguide horns. The only true antenna in this figure is the λ/4 ground-plane vertical, which converts electromagnetic waves in the waveguide to currents in the coaxial cable extending down from the waveguide.

According to the strict definition of an antenna as a device for converting between electromagnetic waves in space and currents in conductors, the only antennas in most radio telescopes are half-wave dipoles and their relatives, quarter-wave ground-plane verticals. The large parabolic reflector of a radio telescope serves only to focus plane waves onto the feed antenna. (The term “feed” comes from radar antennas used for transmitting; the “feed” antenna feeds transmitter power to the main reflector. Receiving antennas used in radio astronomy work the other way around, and the “feed” actually collects radiation from the reflector.)

Actual half-wave dipoles, backed by small reflectors about λ/4 behind them to focus the dipole pattern in the direction of the main dish, are normally used as feeds at low frequencies (ν<1 GHz) or long wavelengths (λ>0.3 m) because of their relatively small size. However, the radiation patterns of half-wave dipoles backed by small reflectors are not well matched to most parabolic dishes, so their performance is less than optimum.

For shorter wavelengths, almost all radio-telescope feeds are quarter-wave ground-plane verticals inside waveguide horns. Radiation entering the relatively large (size >λ) rectangular or circular aperture of the tapered horn is concentrated into a rectangular or circular waveguide with parallel conducting walls. In the case of the rectangular waveguide whose cross section is shown in Figure 3.3, the side walls are separated by slightly over λ/2 so that vertical electric fields can travel down the waveguide with low loss. The top and bottom walls are separated by somewhat less than λ/2 so only the mode with vertical electric fields can propagate (Section 3.4). The λ/4 vertical antenna inserted through a small hole in the bottom wall collects most of this vertically polarized radiation and converts it into an electric current that travels down the coaxial cable to the receiver. The backshort wall about 1/4 of the guide wavelength λw (Equation 3.144) behind the dipole ensures that the dipole sees only radiation coming from the direction of the horn opening.

Both dipoles and quarter-wave verticals are linearly polarized feeds. The voltage response of a linearly polarized feed to a linearly polarized source is proportional to $\cos\Delta$, where $\Delta$ is the angle between the feed and the source electric field, and the power response is proportional to $\cos^2\Delta = [\cos(2\Delta)+1]/2$. Consequently the degree of polarization and the polarization position angle of a partially linearly polarized radio source can be measured by rotating the linearly polarized feed of a radio telescope while tracking the source. The degree of polarization $p$ of a partially linearly polarized source defined by Equation 2.58 is

$$p \equiv \frac{I_p}{I} = \frac{I_p}{I_p + I_u}, \tag{3.18}$$

where $I_p$ is the polarized flux density and $I = I_p + I_u$ is the total flux density of the source. The power response of the feed $R(\Delta) \propto I_p\cos^2\Delta + I_u/2$ will be $R_\parallel \propto I_p + I_u/2$ when the feed and source polarizations are parallel ($\Delta = 0$) and $R_\perp \propto I_u/2$ when they are perpendicular ($\Delta = \pi/2$). In terms of the observables $R_\parallel$ and $R_\perp$, the degree of polarization is

$$p = \frac{I_p}{I_p + I_u} = \frac{I_p + I_u/2 - I_u/2}{I_p + I_u/2 + I_u/2} = \frac{R_\parallel - R_\perp}{R_\parallel + R_\perp}. \tag{3.19}$$

Figure 3.4 shows how the relative power output R(Δ) of a linearly polarized feed varies as it is rotated through Δ=π radians relative to the polarization position angle of sources with fractional polarizations p=1.0, 0.1, and 0.0.

Figure 3.4: The relative power output from a linearly polarized antenna as a function of the polarization position angle difference between the source and the antenna for sources with fractional polarizations p=1.0 (dashed curve), p=0.1 (solid curve), and p=0.0 (dotted line). Abscissa: Position angle difference Δ (rad). Ordinate: Relative power output R.
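The measurement described by Equation 3.19 can be sketched in a few lines of Python. This is a minimal illustration, assuming a noiseless feed and a proportionality constant of unity in the power response; the function names are hypothetical.

```python
import math


def feed_response(delta, i_p, i_u):
    """Relative power response R(delta) = I_p cos^2(delta) + I_u / 2.

    The overall proportionality constant is set to 1 (an assumption);
    it cancels in the ratio used to compute p.
    """
    return i_p * math.cos(delta)**2 + i_u / 2


def degree_of_polarization(r_par, r_perp):
    """Equation 3.19: p = (R_par - R_perp) / (R_par + R_perp)."""
    return (r_par - r_perp) / (r_par + r_perp)


# A source with 10% linear polarization, as in one curve of Figure 3.4.
i_p, i_u = 0.1, 0.9
r_par = feed_response(0.0, i_p, i_u)             # feed parallel to source
r_perp = feed_response(math.pi / 2, i_p, i_u)    # feed perpendicular
p = degree_of_polarization(r_par, r_perp)
print(f"recovered p = {p:.3f}")
```

In practice the feed would be rotated through many angles and a sinusoid fitted to the response, but the two extreme positions are enough to recover $p$ in this idealized case.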

To measure all four Stokes parameters of an arbitrarily polarized source, it is necessary to combine the voltage outputs of two orthogonally polarized feeds. For example, two orthogonal quarter-wave verticals can be inserted into a square waveguide to receive both the horizontally and the vertically polarized components simultaneously. If their output voltages are added in phase (phase difference $\delta = \phi_x - \phi_y = 0$ in Figure 2.15), the feed combination will respond to radiation linearly polarized in position angle $\pi/4 = 45^\circ$. If a phase difference of $\pi/2$ is inserted either mechanically (by moving one feed $\lambda/4$ behind the other) or electrically (by inserting a $\lambda/4$ longer cable between one feed and the point where the two outputs are added), then $\delta = \pi/2$ and the feed combination will respond to circular polarization.

3.1.2 Radiation Resistance

The power flowing through a circuit is

P=VI, (3.20)

where $V$ is the voltage (defined as energy per unit charge) and $I$ is the current (defined as the charge flowing through the circuit per unit time), so $P$ has dimensions of energy per unit time. The physicist Georg Simon Ohm observed that the current flowing through most (but not all) materials is proportional to the applied voltage, so most objects have a well-defined resistance $R$ defined by Ohm’s law,

$$R \equiv \frac{V}{I}. \tag{3.21}$$

When Ohm’s law holds,

$$P = I^2 R = \frac{V^2}{R}. \tag{3.22}$$

The average power in a resistive circuit with time-varying currents is

$$\langle P \rangle = \langle I^2 \rangle R. \tag{3.23}$$

In the particular case of sinusoidal currents $I = I_0\cos(\omega t)$ and $\langle I^2 \rangle = I_0^2/2$, so

$$\langle P \rangle = \frac{I_0^2 R}{2}. \tag{3.24}$$

Thus the (frequency-dependent) radiation resistance of an antenna is defined by

$$R \equiv \frac{2\langle P \rangle}{I_0^2}. \tag{3.25}$$

For a short dipole, the power emitted is given by Equation 3.17 and the radiation resistance is

$$R = \frac{2\pi^2}{3c}\left(\frac{l}{\lambda}\right)^2. \tag{3.26}$$
Example. A “half-wave” (length $l = \lambda/2$) dipole is a resonant antenna. Resonant antennas are used in most real applications because the impedance of a resonant antenna is resistive; nonresonant antennas have large capacitive or inductive reactances as well. Most of the antenna current in a half-wave dipole is a standing wave with a current distribution $I = I_0 e^{-i\omega t}\cos(2\pi z/\lambda)$ that has a maximum $I = I_0 e^{-i\omega t}$ at the $z = 0$ feed point and declines co-sinusoidally to zero at the endpoints $z = \pm\lambda/4$. With the (no longer so accurate) assumption that all parts of the dipole radiate in phase, Equation 3.7 still holds:

$$E = \frac{-i\omega\sin\theta}{rc^2}\int_{-l/2}^{+l/2} I\,dz.$$

The current distribution of the half-wave dipole is $I = I_0 e^{-i\omega t}\cos(2\pi z/\lambda)$, so

$$\int_{-l/2}^{+l/2} I\,dz \approx I_0 e^{-i\omega t}\int_{-\lambda/4}^{+\lambda/4}\cos\left(\frac{2\pi z}{\lambda}\right)dz = \frac{I_0\lambda}{\pi}e^{-i\omega t},$$

which is a factor of $2\lambda/(\pi l)$ larger than the comparable integral for a short dipole (Equation 3.9):

$$\int_{-l/2}^{+l/2} I\,dz \approx \frac{I_0 l}{2}e^{-i\omega t}.$$

The average power $\langle P \rangle$ radiated by a given $I_0$ is proportional to the square of this factor (see Equations 3.10 through 3.17), or $[2\lambda/(\pi l)]^2$. Thus the radiation resistance $R$ of a half-wave dipole is the radiation resistance of a short dipole (Equation 3.26) multiplied by the factor squared:

$$R = \left[\frac{2\pi^2}{3c}\left(\frac{l}{\lambda}\right)^2\right]\left(\frac{2\lambda}{\pi l}\right)^2 = \frac{8}{3c} = \frac{8}{3\times 3\times10^{10}~\mathrm{cm\,s^{-1}}} \approx \frac{8}{9}\times10^{-10}~\mathrm{s\,cm^{-1}}.$$

Engineers and real test instruments use the MKS “ohm” (symbol $\Omega$) as the unit of resistance. The conversion factor is $1~\Omega = (1/9)\times10^{-11}~\mathrm{s\,cm^{-1}}$, so

$$R = \frac{(8/9)\times10^{-10}~\mathrm{s\,cm^{-1}}}{(1/9)\times10^{-11}~\mathrm{s\,cm^{-1}\,\Omega^{-1}}} = 80~\Omega.$$

This is pretty close to the $R \approx 73~\Omega$ result from an exact calculation that doesn’t use the $l \ll \lambda$ approximation.

A ground-plane vertical of height l/2 emits exactly like a dipole of length l above the ground plane and nothing below the ground plane. Thus the total power emitted by the vertical is half the power emitted by the dipole, and the radiation resistance of the vertical is half the radiation resistance of the dipole.
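The unit conversion in this example, and the factor of two for the ground-plane vertical, can be reproduced with the short Python sketch below. It assumes the rounded value $c = 3\times10^{10}~\mathrm{cm\,s^{-1}}$ used in the text; the function names are illustrative only.

```python
import math

C_CGS = 3.0e10            # speed of light (cm/s), rounded as in the text
OHM_IN_CGS = 1e-11 / 9    # 1 ohm expressed in s/cm


def short_dipole_resistance_ohm(l_over_lambda):
    """Equation 3.26 converted from s/cm to ohms."""
    r_cgs = 2 * math.pi**2 / (3 * C_CGS) * l_over_lambda**2
    return r_cgs / OHM_IN_CGS


def half_wave_dipole_resistance_ohm():
    """Short-dipole result scaled by [2*lambda/(pi*l)]^2 with l = lambda/2."""
    l_over_lambda = 0.5
    factor = 2 / (math.pi * l_over_lambda)
    return short_dipole_resistance_ohm(l_over_lambda) * factor**2


r_dipole = half_wave_dipole_resistance_ohm()
r_vertical = r_dipole / 2   # quarter-wave ground-plane vertical
print(f"half-wave dipole: {r_dipole:.1f} ohm, "
      f"quarter-wave vertical: {r_vertical:.1f} ohm")
```

These are the approximate values that follow from the in-phase assumption, not the exact results.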

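The radiation resistance $R_0$ of free space (sometimes called the impedance $Z_0$ of free space) can be obtained from the relations

$$|S| = \frac{c}{4\pi}E^2 \quad\text{and}\quad P = \frac{V^2}{R}. \tag{3.27}$$

The electric field $E$ is just the voltage per unit length $V/l$ and the flux is the power per unit area $P/l^2$, so

$$|S| = \frac{cV^2}{4\pi l^2} = \frac{V^2}{R_0 l^2} \tag{3.28}$$

and

$$R_0 = \frac{4\pi}{c} = \frac{4\pi}{3\times10^{10}~\mathrm{cm\,s^{-1}}} \approx 4.19\times10^{-10}~\mathrm{s\,cm^{-1}}. \tag{3.29}$$

Converting from CGS to MKS units yields the radiation resistance of space in ohms:

$$R_0 = \frac{4\pi}{3\times10^{10}~\mathrm{cm\,s^{-1}}}\cdot\frac{1}{(1/9)\times10^{-11}~\mathrm{s\,cm^{-1}\,\Omega^{-1}}} = 120\pi~\Omega \approx 377~\Omega. \tag{3.30}$$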
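As a cross-check, the CGS result can be compared with the MKS expression $\sqrt{\mu_0/\epsilon_0}$ for the impedance of free space. The sketch below assumes rounded CODATA values for $\mu_0$ and $\epsilon_0$, which do not appear in the text; the small difference between the two numbers comes from rounding $c$ to $3\times10^{10}~\mathrm{cm\,s^{-1}}$.

```python
import math

C_CGS = 3.0e10            # speed of light (cm/s), rounded as in the text
OHM_IN_CGS = 1e-11 / 9    # 1 ohm expressed in s/cm

# MKS vacuum constants (assumed values, not taken from the text).
MU_0 = 1.25663706212e-6   # H/m
EPS_0 = 8.8541878128e-12  # F/m

r0_ohm = (4 * math.pi / C_CGS) / OHM_IN_CGS   # Equation 3.30
z0_mks = math.sqrt(MU_0 / EPS_0)              # MKS impedance of free space

print(f"4*pi/c converted: {r0_ohm:.2f} ohm")
print(f"sqrt(mu0/eps0):   {z0_mks:.2f} ohm")
```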

The tapered opening of a waveguide horn feed (Figure 3.3) acts as an impedance transformer to match the impedance of the waveguide to the impedance of free space to minimize standing waves and couple power efficiently between the waveguide and space, just as the bell of a trombone is an acoustic transformer matching sound vibrations of air in the trombone to the outside environment.

A black hole is a perfect absorber of radiation, so its resistance must also be 120πΩ to match that of free space. A black hole spinning in an external magnetic field can generate electrical power with a voltage/current ratio of 120πΩ, and this process may be important in powering quasar jets [12].

3.1.3 The Power Gain of a Transmitting Antenna

The power gain G(θ,ϕ) of a transmitting antenna is defined as the power transmitted per unit solid angle in direction (θ,ϕ) relative to an isotropic antenna, which has the same gain in all directions. Frequently, the value of G is expressed logarithmically in units of decibels (dB):

$$G(\mathrm{dB}) \equiv 10\log_{10}(G). \tag{3.31}$$

For any lossless antenna, energy conservation requires that the gain averaged over all directions be

$$\langle G \rangle = 1. \tag{3.32}$$

Consequently, all lossless antennas obey

$$\int_{\rm sphere} G\,d\Omega = 4\pi. \tag{3.33}$$

Different lossless antennas may radiate with different directional patterns, but they do not alter the total amount of power radiated. Consequently, the gain of a lossless antenna depends only on the angular distribution of radiation from that antenna. In general, an antenna having peak gain $G_0$ must beam most of its power into a solid angle $\Delta\Omega$ such that $\Delta\Omega \approx 4\pi/G_0$. This motivates the definition of the beam solid angle $\Omega_A$:

$$\Omega_A \equiv \frac{4\pi}{G_0}. \tag{3.34}$$

Thus the higher the gain, the smaller the beam solid angle.

The antenna efficiency $\eta$ is defined as the ratio of radiated power to input power. If ohmic losses reduce $\eta$, then the gain $G$ in Equations 3.31 through 3.34 should be replaced by the directivity defined by $D \equiv G/\eta$.

Example. What is the power gain of a lossless short dipole? It is sufficient to recall only the angular dependence of the short-dipole power pattern (Equation 3.14) $P \propto \sin^2\theta$, where $\theta$ is the angle from the dipole axis. Thus $G \propto \sin^2\theta = G_0\sin^2\theta$. The maximum gain $G_0$ is determined by energy conservation:

$$\int_{\rm sphere} G\,d\Omega = \int_{\phi=0}^{2\pi}\int_{\theta=0}^{\pi} G_0\sin^2\theta\,\sin\theta\,d\theta\,d\phi = 4\pi,$$

$$2\pi G_0\int_0^\pi \sin^3\theta\,d\theta = 4\pi.$$

Recall that $\int_0^\pi \sin^3\theta\,d\theta = 4/3$, so

$$G_0 = \frac{4\pi}{2\pi}\cdot\frac{3}{4} = \frac{3}{2}$$

and

$$G(\theta,\phi) = \frac{3\sin^2\theta}{2}.$$

Expressed in dB, the maximum gain $G_0$ of a short dipole is $G_0 = 10\log_{10}(3/2) \approx 1.76$ dB. Note that $G(\theta,\phi)$ is nearly independent of the antenna length so long as $l \ll \lambda$ because the power pattern of a short dipole is nearly independent of $l$. Varying $l \ll \lambda$ affects only the radiation resistance.
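The same normalization works for any azimuthally symmetric power pattern, which makes it a natural candidate for a small numerical routine. The sketch below is one possible way to do it with a midpoint rule; the function name and the number of integration steps are arbitrary choices for illustration.

```python
import math


def peak_gain_from_pattern(pattern, n=2000):
    """Peak gain G0 = 4*pi / integral(P_n dOmega) for a lossless antenna.

    `pattern(theta)` is a power pattern normalized to 1 at its peak and
    assumed independent of azimuth phi.
    """
    dtheta = math.pi / n
    integral = 0.0
    for k in range(n):
        theta = (k + 0.5) * dtheta
        integral += pattern(theta) * 2 * math.pi * math.sin(theta) * dtheta
    return 4 * math.pi / integral


g0 = peak_gain_from_pattern(lambda theta: math.sin(theta)**2)
g0_db = 10 * math.log10(g0)                      # Equation 3.31
g0_iso = peak_gain_from_pattern(lambda theta: 1.0)

print(f"short dipole: G0 = {g0:.4f} ({g0_db:.2f} dB)")
print(f"isotropic:    G0 = {g0_iso:.4f}")
```

The isotropic case is included only as a sanity check: a pattern that is the same in all directions must give unit gain.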

3.1.4 The Effective Area of a Receiving Antenna

The receiving counterpart of transmitting power gain is the effective area or effective collecting area of a receiving antenna. Imagine an ideal antenna of geometric area A that could collect all of the radiation falling on it from a distant point source and convert it to electrical power—a “rain gauge” for collecting photons. The total spectral power incident on the antenna is the product of its geometric area and the incident spectral power per unit area, or flux density Sν. However, any single antenna can respond to only one polarization, so its output Pν can equal all of the input spectral power (Pν=ASν) from a fully polarized source whose polarization matches that of the antenna, but only half of the incident power (Pν=ASν/2) from an unpolarized source and nothing at all from an orthogonally polarized source. The output of a real antenna is always smaller than this and most radio sources are nearly unpolarized, so radio astronomers find it useful to define the effective collecting area Ae of an antenna whose output spectral power is Pν in response to an unpolarized point source of total flux density Sν by

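$$A_e \equiv \frac{2P_\nu}{S_\nu}. \tag{3.35}$$

The average collecting area

$$\langle A_e \rangle \equiv \frac{\int_{4\pi} A_e\,d\Omega}{\int_{4\pi} d\Omega} = \frac{1}{4\pi}\int_{4\pi} A_e\,d\Omega \tag{3.36}$$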

of any lossless antenna can be calculated via another thermodynamic thought experiment.

Figure 3.5: A cavity in thermodynamic equilibrium at temperature T containing a resistor R is coupled to an antenna, also at temperature T, through a filter blocking electromagnetic radiation but passing currents having frequencies in the range ν to ν+dν.

Imagine an antenna inside a cavity in full thermodynamic equilibrium at temperature T connected through a transmission line to a matched resistor (whose resistance equals the radiation resistance of the antenna) in a second cavity at the same temperature (Figure 3.5). A filter between the cavities passes only currents in a narrow range of frequencies between ν and ν+dν. Because this entire system is in thermodynamic equilibrium, no net power can flow through the wires connecting the antenna and the resistor. Otherwise, one cavity would heat up and the other would cool down, in violation of the second law of thermodynamics. The total spectral power Pν from all directions collected in one polarization is half the total spectral power in the unpolarized blackbody radiation, so

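$$P_\nu = \frac{1}{2}\int_{4\pi} A_e(\theta,\phi)\,B_\nu\,d\Omega \tag{3.37}$$

must equal the Nyquist spectral power $P_\nu$ produced by the resistor. Inserting the Nyquist formula from Equation 2.119 and Planck’s law from Equation 2.85,

$$P_\nu = kT\left[\frac{h\nu/(kT)}{\exp[h\nu/(kT)] - 1}\right] \quad\text{and}\quad B_\nu = \frac{2kT}{\lambda^2}\left[\frac{h\nu/(kT)}{\exp[h\nu/(kT)] - 1}\right], \tag{3.38}$$

leads to

$$kT = \frac{2kT}{2\lambda^2}\int_{4\pi} A_e(\theta,\phi)\,d\Omega, \tag{3.39}$$

and finally,

$$\int_{4\pi} A_e(\theta,\phi)\,d\Omega = 4\pi\langle A_e \rangle = \lambda^2. \tag{3.40}$$

Without using Maxwell’s equations we have obtained the remarkable result

$$\langle A_e \rangle = \frac{\lambda^2}{4\pi}, \tag{3.41}$$

which implies that all lossless antennas, from tiny dipoles to the 100-m diameter Green Bank Telescope (GBT), have the same average collecting area. $\langle A_e \rangle$ is proportional to $\lambda^2$ because space has two more dimensions than a transmission line has.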

The collecting area of an isotropic receiving antenna is proportional to $\lambda^2$, so most satellite broadcast services, GPS (Global Positioning System) or satellite FM radio for example, operate at relatively long wavelengths (10 to 20 cm). Likewise, practical radio telescopes constructed from arrays of dipoles are reasonably sensitive only at long wavelengths.

By analogy with Equation 3.34, the beam solid angle of a lossless receiving antenna is defined as

$$\Omega_A \equiv \int_{4\pi}\frac{A_e(\theta,\phi)}{A_0}\,d\Omega, \tag{3.42}$$

where $A_0$ is the maximum effective collecting area, so

$$A_0\Omega_A = \lambda^2. \tag{3.43}$$

The much larger peak collecting area of the GBT implies it has a much smaller beam solid angle ΩA.

3.1.5 Reciprocity Theorems

Many antenna properties are the same for both transmitting and receiving. It is often easier to calculate the gain of a transmitting antenna than the collecting area of a receiving antenna, and it is often easier to measure the receiving power pattern of a large radio telescope than to measure its transmitting power pattern. Thus this receiving/transmitting “reciprocity” greatly simplifies antenna calculations and measurements. Reciprocity can be understood via Maxwell’s equations or by thermodynamic arguments.

Burke and Graham-Smith [20] state the electromagnetic case for reciprocity clearly: “An antenna can be treated either as a receiving device, gathering the incoming radiation field and conducting electrical signals to the output terminals, or as a transmitting system, launching electromagnetic waves outward. These two cases are equivalent because of time reversibility: the solutions of Maxwell’s equations are valid when time is reversed.”

The strong reciprocity theorem states,

If a voltage is applied to the terminals of an antenna A and the current is measured at the terminals of another antenna B, then an equal current (in both amplitude and phase) will appear at the terminals of A if the same voltage is applied to B. (Figure 3.6)

It can be formally derived from Maxwell’s equations (see a partial derivation in Wilson et al. [116, Appendix D]) or by network analysis (see Kraus et al. [63, “Antennas”, p. 252]).

Figure 3.6: The strong reciprocity theorem implies that the transmitter voltages $V_A$ and $V_B$ are related to the receiver currents $I_A$ and $I_B$ by $I_B^{-1}V_A = I_A^{-1}V_B$ for any pair of antennas A and B.

Most radio astronomical applications do not depend on the detailed phase relationships of voltages and currents, so it is sufficient to use a weak reciprocity theorem that relates the angular dependences of the transmitting power pattern and the receiving collecting area of any antenna: “The power pattern of an antenna is the same for transmitting and receiving”; that is,

$$G(\theta,\phi) \propto A_e(\theta,\phi). \tag{3.44}$$

The weak reciprocity theorem can be proven by another simple thermodynamic thought experiment: An antenna is connected to a matched load inside a cavity initially in equilibrium at temperature T. The antenna simultaneously receives power from the cavity walls and transmits power generated by the resistor. The total power transmitted in all directions must equal the total power received from all directions because no net power can be transferred between the antenna and the resistor; otherwise the resistor would not remain at temperature T. Moreover, in any direction, the power received and transmitted by the antenna must be the same, else the cavity wall in directions where the transmitted power was greater than the received power would rise in temperature and the cavity wall in directions of lower transmitted/received power ratio would cool, leading to a violation of the second law of thermodynamics.

The constant of proportionality relating G and Ae can be derived from Equations 3.32 and 3.41:

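$$\langle A_e \rangle = \frac{\lambda^2}{4\pi} \quad\text{and}\quad \langle G \rangle = 1. \tag{3.45}$$

Thus energy conservation and the weak reciprocity theorem imply

$$A_e(\theta,\phi) = \frac{\lambda^2 G(\theta,\phi)}{4\pi} \tag{3.46}$$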

for any antenna. This extremely useful equation shows how to compute the receiving power pattern from the transmitting power pattern and vice versa.

Example. Use the transmitting power pattern of a short dipole (Equation 3.14) to calculate the effective collecting area of a short dipole used as a receiving antenna:

$$A_e(\theta,\phi) = \frac{\lambda^2 G(\theta,\phi)}{4\pi} = \frac{\lambda^2}{4\pi}\,\frac{3\sin^2\theta}{2},$$

$$A_e = \frac{3\lambda^2\sin^2\theta}{8\pi}.$$

The effective collecting area of a short receiving dipole does not depend on the length $l$ of the dipole itself because the transmitting power pattern of a short ($l \ll \lambda$) dipole is independent of $l$.
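Equations 3.43 and 3.46 together give a quick way to compare the beam solid angles of very different antennas. The sketch below evaluates them for a short dipole and for a hypothetical 100 m dish; the wavelength of 0.21 m and the aperture efficiency of 0.7 are assumed values chosen only for illustration, and the function names are not from the text.

```python
import math


def short_dipole_gain(theta):
    """Power gain of a lossless short dipole, G = (3/2) sin^2(theta)."""
    return 1.5 * math.sin(theta)**2


def effective_area(gain, wavelength):
    """Equation 3.46: A_e = lambda^2 * G / (4*pi)."""
    return wavelength**2 * gain / (4 * math.pi)


def beam_solid_angle(peak_area, wavelength):
    """Equation 3.43: Omega_A = lambda^2 / A_0, in steradians."""
    return wavelength**2 / peak_area


wavelength = 0.21  # meters (illustrative choice)

# Short dipole: peak response is broadside, theta = pi/2.
a0_dipole = effective_area(short_dipole_gain(math.pi / 2), wavelength)
omega_dipole = beam_solid_angle(a0_dipole, wavelength)

# Hypothetical 100 m dish with an ASSUMED aperture efficiency of 0.7.
a0_dish = 0.7 * math.pi * 100.0**2 / 4
omega_dish = beam_solid_angle(a0_dish, wavelength)

print(f"dipole: A0 = {a0_dipole:.4f} m^2, Omega_A = {omega_dipole:.3f} sr")
print(f"dish:   A0 = {a0_dish:.0f} m^2, Omega_A = {omega_dish:.2e} sr")
```

The dipole's beam solid angle is $8\pi/3$ sr at any wavelength, while the dish's shrinks as $\lambda^2$.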

3.1.6 Antenna Temperature

A convenient practical unit for the power output per unit frequency from a receiving antenna is the antenna temperature TA. Antenna temperature has nothing to do with the physical temperature of the antenna as measured by a thermometer; it is only the temperature of a matched resistor whose thermally generated power per unit frequency in the low-frequency Nyquist approximation (Equation 2.117) equals that produced by the antenna:

$$T_A \equiv \frac{P_\nu}{k}. \tag{3.47}$$

It is widely used for the following reasons:

  1. 1 K of antenna temperature is a conveniently small power per unit bandwidth. $T_A = 1$ K corresponds to $P_\nu = kT_A = 1.38\times10^{-23}~\mathrm{J\,K^{-1}} \times 1~\mathrm{K} = 1.38\times10^{-23}~\mathrm{W\,Hz^{-1}}$.

  2. It can be calibrated by a direct comparison with hot and cold loads (another word for matched resistors) connected to the receiver input.

  3. The units of receiver noise are also K, so comparing the signal in K with the receiver noise in K makes it easy to compare the signal and noise powers.

Combining Equations 3.35 and 3.47 shows that an unpolarized point source of flux density $S_\nu$ increases the antenna temperature by

$$T_A = \frac{P_\nu}{k} = \frac{A_e S_\nu}{2k}, \tag{3.48}$$

where $A_e$ is the effective collecting area. It is often convenient to express the point-source sensitivity of a radio telescope in units of “kelvins per jansky” rather than in units of effective collecting area ($\mathrm{m^2}$). The effective collecting area corresponding to a sensitivity of $1~\mathrm{K\,Jy^{-1}}$ is

$$A_e = \frac{2kT_A}{S_\nu} = \frac{2\times1.38065\times10^{-23}~\mathrm{J\,K^{-1}}\times1~\mathrm{K}}{10^{-26}~\mathrm{W\,m^{-2}\,Hz^{-1}}} = 2761~\mathrm{m^2}. \tag{3.49}$$
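The conversion between effective area and point-source sensitivity is used often enough that it is worth writing down as a pair of functions. This is a minimal sketch; the function names are hypothetical, and the 100 m dish with an aperture efficiency of 0.7 is an assumed example, not a measured value.

```python
import math

K_BOLTZMANN = 1.380649e-23   # J/K
JANSKY = 1e-26               # W m^-2 Hz^-1


def area_for_sensitivity(kelvin_per_jansky):
    """Effective area (m^2) giving the stated sensitivity (Equation 3.49)."""
    return 2 * K_BOLTZMANN * kelvin_per_jansky / JANSKY


def sensitivity_for_area(effective_area_m2):
    """Point-source sensitivity (K/Jy) for an effective area (Equation 3.48)."""
    return effective_area_m2 * JANSKY / (2 * K_BOLTZMANN)


area_1k_per_jy = area_for_sensitivity(1.0)

# Hypothetical 100 m dish with an ASSUMED aperture efficiency of 0.7.
dish_area = 0.7 * math.pi * 100.0**2 / 4
dish_sensitivity = sensitivity_for_area(dish_area)

print(f"1 K/Jy corresponds to {area_1k_per_jy:.0f} m^2")
print(f"assumed dish: {dish_sensitivity:.2f} K/Jy")
```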

In an arbitrary radiation field Iν(θ,ϕ), Equation 3.37 becomes

$$P_\nu = \frac{1}{2}\int_{4\pi} A_e(\theta,\phi)\,I_\nu(\theta,\phi)\,d\Omega. \tag{3.50}$$

Replacing $P_\nu$ by antenna temperature $T_A$ using Equation 3.47 and inserting $I_\nu(\theta,\phi) = 2kT_b(\theta,\phi)/\lambda^2$ (Equation 2.33) gives

$$kT_A = \frac{1}{2}\int_{4\pi} A_e(\theta,\phi)\,\frac{2k}{\lambda^2}\,T_b(\theta,\phi)\,d\Omega, \tag{3.51}$$

$$T_A = \frac{1}{\lambda^2}\int_{4\pi} A_e(\theta,\phi)\,T_b(\theta,\phi)\,d\Omega. \tag{3.52}$$

In the limit of a very extended source having nearly constant $T_b$ across the entire beam,

$$T_A = \frac{T_b}{\lambda^2}\int_{4\pi} A_e(\theta,\phi)\,d\Omega \tag{3.53}$$

so

$$T_A = T_b. \tag{3.54}$$

In words, the antenna temperature produced by a smooth source much larger than the antenna beam equals the source brightness temperature.

If a lossless antenna is pointed at a compact source covering a solid angle $\Omega_s$ much smaller than the beam and having uniform brightness temperature $T_b$, then

$$T_A = \frac{A_0 T_b\Omega_s}{\lambda^2}, \tag{3.55}$$

where $A_0$ is the on-axis effective collecting area. Substituting $A_0\Omega_A = \lambda^2$ (Equation 3.43) gives the result:

$$\frac{T_A}{T_b} = \frac{\Omega_s}{\Omega_A}. \tag{3.56}$$

Stated in words, the antenna temperature equals the source brightness temperature multiplied by the fraction of the beam solid angle filled by the source. A $T_b = 10^4$ K source covering 1% of the beam solid angle will add 100 K to the antenna temperature. The ratio $(\Omega_s/\Omega_A)$ is called the beam filling factor.
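The compact-source relation of Equation 3.56 is simple enough to encode directly. The sketch below is only valid in the limit the text describes (a uniform source much smaller than the beam); the function name and the sample solid angles are illustrative assumptions.

```python
def compact_source_antenna_temperature(t_b, omega_s, omega_a):
    """Equation 3.56: T_A = T_b * (Omega_s / Omega_A).

    Applies only to a uniform source much smaller than the beam, so a
    source solid angle larger than the beam is rejected.
    """
    if not 0 <= omega_s <= omega_a:
        raise ValueError("source solid angle must lie between 0 and Omega_A")
    return t_b * omega_s / omega_a


# The case quoted in the text: T_b = 1e4 K filling 1% of the beam.
t_a = compact_source_antenna_temperature(t_b=1e4, omega_s=1e-8, omega_a=1e-6)
print(f"T_A = {t_a:.1f} K")
```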

The main beam of an antenna is defined as the region containing the principal response out to the first zero; responses outside this region are called sidelobes or, very far from the main beam, stray radiation. The main beam solid angle ΩMB is defined as

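$$\Omega_{\rm MB} \equiv \frac{1}{G_0}\int_{\rm MB} G(\theta,\phi)\,d\Omega. \tag{3.57}$$

The fraction of the total beam solid angle lying inside the main beam is called the main beam efficiency or, loosely, the beam efficiency:

$$\eta_B \equiv \frac{\Omega_{\rm MB}}{\Omega_A}. \tag{3.58}$$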

3.2 Reflector Antennas

3.2.1 Paraboloidal Reflectors

Antennas useful for radio astronomy at short wavelengths must have collecting areas much larger than the $\lambda^2/(4\pi)$ collecting area of an isotropic antenna and much higher angular resolution than a short dipole provides. Because arrays of dipoles are impractical at wavelengths $\lambda < 1$ m or so, most radio telescopes use large reflectors to collect and focus power onto their small feed antennas, such as waveguide horns or dipoles backed by small reflectors, that are connected to receivers. The most common reflector shape is a paraboloid of revolution because it can focus the plane wave from a distant point source onto a single focal point.

To focus plane waves onto a single point, the reflector must keep all parts of an on-axis plane wavefront in phase at its focal point. Thus the total path lengths to the focus must all be the same, and this requirement is sufficient to determine the shape of the desired reflecting surface. Clearly the surface must be rotationally symmetric about its axis. In any plane containing the axis, the surface looks like the curve in Figure 3.7.

Figure 3.7: A plane containing the axis of a paraboloidal reflector with focal length f. Plane wave fronts from a distant point source are shown as dotted lines perpendicular to the z-axis. From a wavefront at height h above the vertex (the point r=0, z=0) of the paraboloid, the ray path (dashed line) lengths at all radial offsets r down to the reflector and up to the prime focus at z=f must be equal.

The requirement of constant path length can be written by equating the on-axis path length (f+h) from any height h to the reflector and then back to the prime focus at height f with the off-axis path length:

$$(f + h) = \sqrt{r^2 + (f - z)^2} + (h - z). \tag{3.59}$$

This yields the reflector height $z$ as a function of radius $r$:

$$\sqrt{r^2 + (f - z)^2} = f + z,$$

$$r^2 + f^2 + z^2 - 2fz = f^2 + z^2 + 2fz;$$

the result is

$$z = \frac{r^2}{4f}. \tag{3.60}$$

This is the equation of a paraboloid with focal length f.
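The equal-path-length property is easy to confirm numerically. The sketch below evaluates the right-hand side of Equation 3.59 on the surface $z = r^2/(4f)$ for several radii; the focal length, wavefront height, and radii are arbitrary illustrative numbers, and the function name is hypothetical.

```python
import math


def path_length(r, f, h):
    """Path from a wavefront at height h down to the reflector at radius r
    and back up to the prime focus at height f (right side of Eq. 3.59)."""
    z = r**2 / (4 * f)                       # Equation 3.60
    return (h - z) + math.hypot(r, f - z)


f, h = 40.0, 60.0                            # meters (illustrative values)
radii = [0.0, 10.0, 25.0, 50.0]
lengths = [path_length(r, f, h) for r in radii]

for r, length in zip(radii, lengths):
    print(f"r = {r:5.1f} m: path = {length:.6f} m (f + h = {f + h:.6f} m)")
```

Every path length should equal $f + h$ to within floating-point rounding, independent of $r$.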

The ratio of the focal length $f$ to the diameter $D$ of the reflector is called the $f/D$ ratio or focal ratio. Note that the gain, collecting area, and beamwidth of a reflector antenna depend only weakly and indirectly on $f/D$, via the effect of $f/D$ on illumination taper. In principle, $f/D$ is a free parameter for the telescope designer, but in practice it is constrained. If the reflector $f/D$ is too high, the support structure needed to hold the feed or subreflector at the focus of a large radio telescope becomes very long and unwieldy. Consequently large radio telescopes usually have $f/D \approx 0.4$, an order of magnitude lower than the typical focal ratio of an optical telescope. The drawback of a low $f/D$ is a small field of view. The focal ellipsoid is the volume around the exact focal point that remains in reasonably good focus, and the focal circle is defined by the intersection of the focal ellipsoid and the transverse plane at $z = f$. Ruze [95] showed that the angular radius of the focal circle is proportional to $(f/D)^2$, and only a small number (about seven) of discrete feeds can fit inside the focal circle of an $f/D \approx 0.4$ paraboloid. Large arrays of feeds or imaging cameras require larger $f/D$ ratios, obtained either by using a shallower paraboloid or by using magnifying subreflectors to increase the effective focal length.

The primary mirrors of most radio telescopes are circular paraboloids or sections thereof for the following reasons:

  1. The effective collecting area Ae of a reflector antenna can approach its projected geometric area A = πD²/4.

  2. They are electrically simple (compared with a phased array of dipoles, for example).

  3. A single reflector can work over a wide range of frequencies. Changing frequencies only requires changing the feed antenna and receiver located at the focal point, not building a whole new radio telescope.

3.2.2 The Far-Field Distance

How far away must a point source be for the received waves to satisfy the assumption that they are nearly planar across the reflector? The answer depends on both the wavelength λ and the reflector diameter D. Figure 3.8 shows the spherical wave emitted by a point source a finite distance R from a flat aperture, an imaginary circular hole that covers the reflector. It could be located at the plane z=h shown in Figure 3.7, for example.

Figure 3.8: The spherical wavefront (dashed circle) emitted by a point source at distance R deviates from a plane by Δ at the edge of an aperture of diameter D.

The maximum departure Δ from a plane wave occurs at the edge of the aperture. The far-field distance Rff is somewhat arbitrarily defined by requiring that Δ<λ/16. At the aperture edge, the Pythagorean theorem gives

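$$R^2=(R-\Delta)^2+\left(\frac{D}{2}\right)^2. \tag{3.61}$$

Thus

$$R=\frac{\Delta}{2}+\frac{D^2}{8\Delta}. \tag{3.62}$$

In the limit Δ ≪ D, we have Δ/2 ≪ D²/(8Δ) and

$$R\approx\frac{D^2}{8\Delta}. \tag{3.63}$$

Given the Δ = λ/16 criterion, the far-field distance is

$$R_{\rm ff}\approx\frac{2D^2}{\lambda}. \tag{3.64}$$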

If R<Rff, the path-length errors will introduce significant phase errors in the waves coming from the off-axis portions of the reflector, reducing the effective collecting area and degrading the antenna pattern.

Example. What is the far-field distance of the Green Bank Telescope (D = 100 m) observing at λ = 1 cm? Equation 3.64 gives Rff = 2(100 m)²/(1 cm) = (2×10⁴ m²)/(0.01 m) = 2×10⁶ m = 2000 km. Such a large far-field distance makes ground-based measurements of the GBT antenna pattern impractical. To measure small errors in the GBT reflector surface using radio holography, it is necessary to observe a geostationary satellite having an orbital altitude R ≈ 36,000 km ≫ 2000 km. Similarly, the easiest way to determine the transmitting power pattern for a large radar antenna such as the D = 305-m Arecibo reflector is to scan across a celestial point source in the far field and use the reciprocity theorem to equate the transmitting and receiving patterns.
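The arithmetic in this example is easy to reproduce. The sketch below simply evaluates Equation 3.64; the function name `far_field_distance` is an illustrative choice, not something defined in the text.

```python
def far_field_distance(diameter_m, wavelength_m):
    """Far-field distance R_ff ~ 2 D^2 / lambda (Equation 3.64), in meters."""
    return 2.0 * diameter_m ** 2 / wavelength_m

# GBT example from the text: D = 100 m, lambda = 1 cm.
r_ff_gbt = far_field_distance(100.0, 0.01)
print(r_ff_gbt / 1e3, "km")  # about 2000 km, far below geostationary altitude
```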

3.2.3 Patterns of Aperture Antennas

In optics, the term aperture refers to the opening through which all rays pass. For example, the aperture of a paraboloidal reflector antenna would be the plane circle, normal to the rays from a distant point source, that just covers the paraboloid (Figure 3.9). The phase of the plane wave from a distant point source would be constant across the aperture plane when the aperture is perpendicular to the line of sight.

Figure 3.9: The aperture plane associated with a paraboloidal dish of diameter D.

Another example of an aperture is the mouth of a waveguide horn antenna (Figure 3.10).

Figure 3.10: “Doc” Ewen looking into the rectangular aperture of the horn antenna used to discover the λ = 21-cm line of neutral hydrogen. Image credit: NRAO/AUI/NSF.

How can the beam pattern, or power gain as a function of direction, of an aperture antenna be calculated? For simplicity, first consider a one-dimensional aperture of width D (Figure 3.11) and calculate the electric field pattern at a distant (R ≫ Rff) point.

Figure 3.11: Coordinate system for a one-dimensional linear aperture spanning -D/2 < x < +D/2. For a source distant from the center of the aperture (R ≫ D), the fractional change in the distance r between the source and any aperture element at a distance |x| from the aperture center is small, so the variable r(x) in the denominator of Equation 3.66 can be replaced by the constant R. However, the variation of r(x) across the aperture can be much larger than the wavelength λ, so the oscillating numerator of Equation 3.66 cannot be replaced by a constant.

When used in a transmitting antenna, the feed can illuminate the aperture antenna with a sine wave of fixed frequency ν=ω/(2π) and electric field strength g(x) that varies across the aperture. The illumination induces currents in the reflector. The currents will vary with both position and time:

$$I\propto g(x)\exp(-i\omega t). \tag{3.65}$$

The constant of proportionality doesn’t matter yet; it can be calculated later from energy conservation. Huygens’s principle asserts that the aperture can be treated as a collection of small elements which act individually as small antennas. Huygens’s principle actually applies to waves of any type, sound waves for example. The electric field produced by the whole aperture at large distances is just the vector sum of the elemental electric fields from these small antennas. The field from each element extending from x to x+dx is

$$df\propto g(x)\,\frac{\exp[-i2\pi r(x)/\lambda]}{r(x)}\,dx, \tag{3.66}$$

where r(x) is the distance between the source and the aperture element at position x (Figure 3.11). In the far field (Equation 3.64), the Fraunhofer approximation

$$r\approx R+x\sin\theta \tag{3.67}$$

is valid. This equation is usually written in the form

$$r\approx R+xl, \tag{3.68}$$

where

$$l\equiv\sin\theta. \tag{3.69}$$

For the small angles θ ≪ 1 rad relevant to large (D ≫ λ) apertures, l = sin θ ≈ θ.

At large distances, the quantity

$$\frac{1}{r}\approx\frac{1}{R} \tag{3.70}$$

is nearly constant across the aperture and can be absorbed by the constant of proportionality in Equation 3.66. Although exp(-i2πR/λ) is a constant because R is fixed, the variable part xl of r=R+xl in the numerator of Equation 3.66 cannot be ignored at any distance:

$$df\propto g(x)\exp(-i2\pi xl/\lambda)\,dx. \tag{3.71}$$

When θ ≠ 0 the phase 2πxl/λ = 2πx sin θ/λ varies linearly across the aperture, and different parts of the aperture add constructively or destructively to the total electric field f(l). Defining

$$u\equiv\frac{x}{\lambda} \tag{3.72}$$

to express position along the aperture in units of wavelength yields

$$f(l)=\int_{\rm aperture}g(u)\,e^{-i2\pi lu}\,du. \tag{3.73}$$

In words, this very important equation says that in the far field, the electric-field pattern f(l) of an aperture antenna is the Fourier transform (Appendix A.1) of the electric field distribution g(u) illuminating that aperture.
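Equation 3.73 can be evaluated numerically for any illumination. The sketch below is a rough illustration using a simple midpoint sum rather than an FFT; `field_pattern` and the constant test illumination are hypothetical choices made here. For g(u) = 1 on a unit aperture, the integral at l = 1/2 can be done by hand and equals 2/π, which gives a convenient check.

```python
import cmath
import math

def field_pattern(g, l, u_min, u_max, n=2000):
    """Midpoint-rule estimate of f(l) = integral of g(u) exp(-i 2 pi l u) du
    over the aperture u_min < u < u_max (Equation 3.73)."""
    du = (u_max - u_min) / n
    total = 0j
    for k in range(n):
        u = u_min + (k + 0.5) * du
        total += g(u) * cmath.exp(-2j * math.pi * l * u)
    return total * du

def uniform(u):
    """Constant illumination across the aperture (an assumed test case)."""
    return 1.0

f_on_axis = field_pattern(uniform, 0.0, -0.5, 0.5)
f_half = field_pattern(uniform, 0.5, -0.5, 0.5)
print(f_on_axis.real, f_half.real, 2.0 / math.pi)
```

Because the illumination is symmetric about u = 0, the imaginary part cancels and the field pattern is real.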

3.2.4 The Electric-Field and Power Patterns of a Uniformly Illuminated Aperture

What is the electric-field pattern of a uniformly illuminated one-dimensional aperture of width D at wavelength λ? Uniform illumination means that the strength of the illumination is constant over the aperture:

$$g(u)=\text{constant},\qquad -\frac{D}{2\lambda}<u<+\frac{D}{2\lambda}.$$

This question is best answered in two steps: first find the far-field pattern of a unit aperture (D=λ) and then use the similarity theorem (Equation A.11) for Fourier transforms to scale the first result to an aperture of any size.

The unit rectangle function is defined as

$$\Pi(u)\equiv 1,\qquad -1/2<u<+1/2, \tag{3.74}$$

and Π(u)=0 otherwise. The function symbol (an uppercase pi) is easy to remember because it looks like the function graph shown in the top panel of Figure 3.12.

Figure 3.12: The symbol Π is shaped like the unit rectangle function it represents. The function sinc(l) ≡ sin(πl)/(πl) is the Fourier transform of the unit rectangle function and is the electric-field pattern of a uniformly illuminated unit aperture. The power pattern of a uniformly illuminated unit aperture is shown in the bottom panel. For large (D ≫ λ) apertures, the zeros at l = ±1, ±2, … appear at the angles θ ≈ ±λ/D, ±2λ/D, ….

Inserting Π(u) into Equation 3.73 gives the field pattern f(l) of the uniformly illuminated unit aperture:

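$$f(l)=\int_{-\infty}^{\infty}\Pi(u)\,e^{-i2\pi lu}\,du. \tag{3.75}$$

Thus

$$f(l)=\int_{-1/2}^{+1/2}e^{-i2\pi lu}\,du=\left.\frac{e^{-i2\pi lu}}{-i2\pi l}\right|_{-1/2}^{+1/2}=\frac{e^{-i\pi l}-e^{i\pi l}}{-i2\pi l}. \tag{3.76}$$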

Next, difference the mathematical identities (Appendix B.3)

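$$e^{i\pi l}=\cos(\pi l)+i\sin(\pi l),$$
$$e^{-i\pi l}=\cos(\pi l)-i\sin(\pi l)$$

to derive

$$e^{i\pi l}-e^{-i\pi l}=2i\sin(\pi l).$$

Inserting this result into Equation 3.76 gives

$$f(l)=\frac{-2i\sin(\pi l)}{-2i\pi l}=\frac{\sin(\pi l)}{\pi l}\equiv{\rm sinc}(l). \tag{3.77}$$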

The useful sinc function defined in Equation 3.77 is plotted in the middle panel of Figure 3.12.

The power pattern p(l) is the square of the field pattern f(l). The power pattern p(l) = sinc²(l) of a uniformly illuminated unit aperture is graphed in the bottom panel of Figure 3.12. The central peak of the power pattern between the first zeros at l = ±1 is called the main beam. The smaller peaks are called sidelobes. They are separated by zeros or nulls in the power pattern at l = ±1, ±2, ….

Next apply the powerful similarity theorem for Fourier transforms: if f(l) is the Fourier transform of g(u), then

$$\frac{1}{|a|}\,f\!\left(\frac{l}{a}\right)$$

is the Fourier transform of g(au), where a ≠ 0 is a dimensionless scaling factor. According to the similarity theorem, making a function g wider (0 < a < 1) or narrower (a > 1) makes its Fourier transform f narrower and taller, or wider and shorter, respectively, always conserving the area under the transform. Consequently the beamwidth of an aperture antenna is inversely proportional to the aperture size in wavelengths and the on-axis field strength is directly proportional to the aperture size in wavelengths.

The scale factor for a uniformly illuminated one-dimensional aperture of width D operating at wavelength λ is a=λ/D, so the electric field pattern becomes

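$$f(l)=\left(\frac{D}{\lambda}\right)\frac{\sin(\pi lD/\lambda)}{(\pi lD/\lambda)}\equiv\frac{D}{\lambda}\,{\rm sinc}\!\left(\frac{lD}{\lambda}\right).$$

If the aperture is large (D/λ ≫ 1), the relevant angles θ are so small (θ ≪ 1 radian) that l = sin θ ≈ θ and

$$f(\theta)=\frac{D}{\lambda}\,{\rm sinc}\!\left(\frac{\theta D}{\lambda}\right). \tag{3.78}$$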

The power pattern is proportional to the square of the electric field pattern, so

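$$P(l)\propto\left(\frac{D}{\lambda}\right)^2{\rm sinc}^2\!\left(\frac{lD}{\lambda}\right).$$

If θ ≪ 1 radian, then

$$P(\theta)=\left(\frac{D}{\lambda}\right)^2{\rm sinc}^2\!\left(\frac{\theta D}{\lambda}\right). \tag{3.79}$$

Radio astronomers use the angle between the half-power points to specify the angular width of the main beam, calling it the half-power beamwidth (HPBW) or the full width between half-maximum points (FWHM). The narrow beamwidth θHPBW ≪ 1 rad of a large (D ≫ λ) one-dimensional uniformly illuminated aperture satisfies

$$P(\theta_{\rm HPBW}/2)=\frac{1}{2}={\rm sinc}^2\!\left(\frac{\theta_{\rm HPBW}D}{2\lambda}\right), \tag{3.80}$$
$$0.443\approx\frac{\theta_{\rm HPBW}D}{2\lambda}, \tag{3.81}$$
$$\theta_{\rm HPBW}\approx 0.89\,\frac{\lambda}{D}. \tag{3.82}$$

The similarity theorem implies the general scaling relation

$$\theta_{\rm HPBW}\propto\frac{\lambda}{D}. \tag{3.83}$$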

The constant of proportionality varies slightly with the illumination taper. Even an ideal aperture antenna of finite size has a finite resolving power that is limited by diffraction, the spreading of rays passing through a finite aperture, and Equation 3.82 specifies the diffraction-limited resolution of a uniformly illuminated aperture antenna.
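The numerical coefficient in Equation 3.82 can be recovered by solving sinc²(x) = 1/2 for x = θHPBW D/(2λ). The sketch below uses plain bisection on the main lobe; the helper names are illustrative choices and not part of the original text.

```python
import math

def sinc(x):
    """Normalized sinc function sin(pi x)/(pi x) of Equation 3.77."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def half_power_point(power, lo, hi, tol=1e-12):
    """Bisection for power(x) = 1/2, assuming power falls
    monotonically from above 1/2 at lo to below 1/2 at hi."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if power(mid) > 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Search between the beam peak (x = 0) and the first null (x = 1).
x_half = half_power_point(lambda x: sinc(x) ** 2, 0.0, 1.0)
uniform_coefficient = 2.0 * x_half   # theta_HPBW = coefficient * lambda / D
print(x_half, uniform_coefficient)
```

The half-power point lies near x ≈ 0.443, so the beamwidth coefficient is close to the 0.89 quoted in Equation 3.82.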

The weak reciprocity theorem (Section 3.1.5) says that the preceding analysis of the transmitting power pattern of an aperture antenna also yields its receiving power pattern, or the variation of Ae with orientation. In receiving terms, the analog of the power pattern is called the point-source response. For a uniformly illuminated aperture, scanning a radio telescope beam in angle θ across a point source will cause the antenna temperature to vary as sinc²(θD/λ), and the width of the half-power response will equal the transmitting HPBW. The receiving HPBW is sometimes called the resolving power of a telescope because two equal point sources separated by the HPBW are just resolved by the Rayleigh criterion that the total response has a slight minimum midway between the point sources.

3.2.5 The Electric-Field and Power Patterns with Tapered Illumination

Practical feeds such as small waveguide horns or half-wave dipoles backed by small subreflectors cannot illuminate a large aperture uniformly. A better approximation to their illumination is the cosine-tapered field pattern (cosine-squared tapered power pattern)

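$$g(u)=\frac{\pi}{2}\cos(\pi u),\qquad -1/2<u<+1/2, \tag{3.84}$$

and g(u) = 0 otherwise (Figure 3.13). The (π/2) normalization factor in Equation 3.84 ensures that

$$\int_{-1/2}^{+1/2}g(u)\,du=1. \tag{3.85}$$

The corresponding field pattern of a one-dimensional unit aperture is given by

$$f(l)=\int_{-1/2}^{+1/2}\frac{\pi}{2}\cos(\pi u)\,e^{-i2\pi lu}\,du. \tag{3.86}$$

This Fourier transform can be evaluated as follows:

$$f(l)=\frac{\pi}{4}\int_{-1/2}^{+1/2}\left(e^{i\pi u}+e^{-i\pi u}\right)e^{-i2\pi lu}\,du \tag{3.87}$$
$$=\frac{\pi}{4}\left[\left.\frac{e^{i\pi(1-2l)u}}{i\pi(1-2l)}\right|_{-1/2}^{+1/2}+\left.\frac{e^{-i\pi(1+2l)u}}{-i\pi(1+2l)}\right|_{-1/2}^{+1/2}\right] \tag{3.88}$$
$$=\frac{\pi}{4}\left[\frac{e^{i\pi(1/2-l)}-e^{-i\pi(1/2-l)}}{i2\pi(1/2-l)}+\frac{e^{i\pi(1/2+l)}-e^{-i\pi(1/2+l)}}{i2\pi(1/2+l)}\right] \tag{3.89}$$
$$=\frac{\pi}{4}\left[\frac{2i\sin[\pi(1/2-l)]}{i2\pi(1/2-l)}+\frac{2i\sin[\pi(1/2+l)]}{i2\pi(1/2+l)}\right] \tag{3.90}$$
$$=\frac{\pi}{4}\left[\frac{\cos(\pi l)}{\pi(1/2-l)}+\frac{\cos(\pi l)}{\pi(1/2+l)}\right]=\frac{\pi}{4}\cos(\pi l)\left(\frac{\pi}{\pi^2/4-\pi^2l^2}\right) \tag{3.91}$$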
Figure 3.13: The cosine-tapered field illumination g(u) (top) yields the field pattern f(l) and power pattern P(l) on a unit aperture. The low sidelobes of P(l) are clearly visible only on a plot of P(dB) (bottom).

to yield the field pattern

$$f(l)=\frac{\cos(\pi l)}{1-4l^2} \tag{3.92}$$

of a one-dimensional unit aperture with cosine-tapered illumination given by Equation 3.84. Both the field pattern and the power pattern

$$P(l)=[f(l)]^2=\left[\frac{\cos(\pi l)}{1-4l^2}\right]^2 \tag{3.93}$$

are shown in Figure 3.13. The sidelobes are so weak that a plot of P(dB) = 10 log₁₀(P) is needed to show them clearly (bottom panel of Figure 3.13).

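Tapering increases the half-power beamwidth. If D ≫ λ, the normalized power pattern is

$$P(\theta)=\left[\frac{\cos(\pi\theta D/\lambda)}{1-4(\theta D/\lambda)^2}\right]^2, \tag{3.94}$$

and

$$P\!\left(\frac{\theta_{\rm HPBW}}{2}\right)=\frac{1}{2}=\left[\frac{\cos[\pi\theta_{\rm HPBW}D/(2\lambda)]}{1-4[\theta_{\rm HPBW}D/(2\lambda)]^2}\right]^2 \tag{3.95}$$

can be solved numerically to yield

$$\theta_{\rm HPBW}\approx 1.2\,\frac{\lambda}{D}. \tag{3.96}$$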

This beamwidth is typical of most radio telescopes.
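One way to carry out the numerical solution of Equation 3.95 is bisection on the main lobe of the cosine-tapered power pattern. This is only a sketch under the assumption that the pattern decreases monotonically between the peak and the first null at l = 3/2; the function names are hypothetical.

```python
import math

def tapered_power(x):
    """Power pattern [cos(pi x) / (1 - 4 x^2)]^2 of Equation 3.93."""
    if abs(abs(x) - 0.5) < 1e-9:
        # Numerator and denominator both vanish at |x| = 1/2; the limit is (pi/4)^2.
        return (math.pi / 4.0) ** 2
    return (math.cos(math.pi * x) / (1.0 - 4.0 * x * x)) ** 2

def half_power_point(power, lo, hi, tol=1e-12):
    """Bisection for power(x) = 1/2 on an interval where power decreases."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if power(mid) > 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# x = theta_HPBW * D / (2 lambda); search inside the main lobe (first null at x = 1.5).
x_half = half_power_point(tapered_power, 0.0, 1.4)
taper_coefficient = 2.0 * x_half   # theta_HPBW = coefficient * lambda / D
print(taper_coefficient)
```

The result is close to 1.19, consistent with the rounded coefficient 1.2 in Equation 3.96 and noticeably larger than the 0.89 of a uniformly illuminated aperture.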

The perfectly sharp cutoff of illumination at the edge of the aperture shown in the top panel of Figure 3.13 cannot be achieved in practice. Any illumination extending beyond the reflector is called spillover. In the case of a receiving antenna, a prime-focus feed looking down at an aperture also sees spillover radiation from the surrounding ground. Most soils are good absorbers, which emit blackbody radiation at the ambient temperature T ≈ 300 K, and ground radiation can add significantly to the system noise temperature of a radio telescope. The purpose of the 15-m high annular ground screen surrounding the Arecibo reflector (Figure 8.2) is to intercept most of the spillover radiation and redirect it to the cold sky in order to minimize the system temperature.

3.3 Two-Dimensional Aperture Antennas

3.3.1 The Field Pattern of a Two-Dimensional Aperture

The method used to show that the field pattern of a one-dimensional aperture is the one-dimensional Fourier transform of the aperture field illumination (Equation 3.73) can easily be generalized to the more realistic case of a two-dimensional aperture:

$$f(l,m)\propto\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}g(u,v)\,e^{-i2\pi(lu+mv)}\,du\,dv, \tag{3.97}$$

where m is the y-axis analog of l on the x-axis, and

$$v\equiv\frac{y}{\lambda}. \tag{3.98}$$

In words, Equation 3.97 states that the electric field pattern of a two-dimensional aperture is the two-dimensional Fourier transform of the aperture field illumination.

3.3.2 The Uniformly Illuminated Rectangular Aperture

Figure 3.14: A two-dimensional rectangular aperture with side lengths Dx and Dy. Dividing lengths in the aperture plane by the wavelength λ yields the normalized coordinates u ≡ x/λ and v ≡ y/λ. The direction from the origin to any distant point can be specified by l ≡ sin θx and m ≡ sin θy, where θx is the angle from the (y,z) plane and θy is the angle from the (x,z) plane.

The two-dimensional counterpart of a uniformly illuminated one-dimensional aperture is a uniformly illuminated rectangular aperture with side lengths Dx and Dy. Dividing lengths in the aperture plane by the wavelength λ yields the normalized coordinates u ≡ x/λ and v ≡ y/λ. The direction from the origin of the (u,v) plane to any distant point can be specified by l ≡ sin θx and m ≡ sin θy, where θx is the angle from the (y,z) plane and θy is the angle from the (x,z) plane (Figure 3.14). If the illumination g(x,y) is constant over the aperture, the integrals over u and v in the Fourier transform are separable and

$$f(l,m)\propto{\rm sinc}\!\left(\frac{lD_x}{\lambda}\right){\rm sinc}\!\left(\frac{mD_y}{\lambda}\right). \tag{3.99}$$

Squaring the electric field pattern gives the relative (normalized to unity at the peak) power pattern

$$P_{\rm n}(l,m)={\rm sinc}^2\!\left(\frac{lD_x}{\lambda}\right){\rm sinc}^2\!\left(\frac{mD_y}{\lambda}\right). \tag{3.100}$$

The absolute power gain G in any direction can be calculated from the relative power pattern by invoking energy conservation:

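$$\int G\,d\Omega=4\pi=G_0\int_{-1}^{+1}\int_{-1}^{+1}P_{\rm n}(l,m)\,dl\,dm, \tag{3.101}$$
$$4\pi=G_0\int_{-1}^{+1}\left[\frac{\sin(\pi lD_x/\lambda)}{\pi lD_x/\lambda}\right]^2dl\int_{-1}^{+1}\left[\frac{\sin(\pi mD_y/\lambda)}{\pi mD_y/\lambda}\right]^2dm. \tag{3.102}$$

Defining the temporary variable a as

$$a\equiv\frac{\pi lD_x}{\lambda},\quad\text{so}\quad da=\frac{\pi D_x}{\lambda}\,dl, \tag{3.103}$$

gives, for Dx ≫ λ,

$$\int_{-1}^{+1}\left[\frac{\sin(\pi lD_x/\lambda)}{\pi lD_x/\lambda}\right]^2dl\approx\left[\int_{-\infty}^{\infty}\frac{\sin^2a}{a^2}\,da\right]\frac{\lambda}{\pi D_x}=\frac{\lambda}{D_x} \tag{3.104}$$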

because the value of the definite integral in square brackets is π. [To prove this, simply apply Rayleigh’s theorem (Equation A.7) to the Fourier transform pair sinc(l) (Equation 3.77) and Π(u) (Equation 3.74).] Then

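$$4\pi=G_0\,\frac{\lambda^2}{D_xD_y}. \tag{3.105}$$

Thus the peak power gain is

$$G_0=\frac{4\pi D_xD_y}{\lambda^2}, \tag{3.106}$$

and the power pattern of a uniformly illuminated rectangular aperture with side lengths Dx and Dy is

$$G=\frac{4\pi D_xD_y}{\lambda^2}\,{\rm sinc}^2\!\left(\frac{lD_x}{\lambda}\right){\rm sinc}^2\!\left(\frac{mD_y}{\lambda}\right), \tag{3.107}$$
and
$$G\approx\frac{4\pi D_xD_y}{\lambda^2}\,{\rm sinc}^2\!\left(\frac{\theta_xD_x}{\lambda}\right){\rm sinc}^2\!\left(\frac{\theta_yD_y}{\lambda}\right) \tag{3.108}$$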

when θx and θy are much smaller than 1 radian.

In general, the peak power gain of an aperture antenna is proportional to the geometric area Ageom (Ageom=DxDy in this case) of the aperture. The constant of proportionality is 4π/λ2 for a uniformly illuminated aperture and somewhat less for any other illumination pattern.
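The approximation in Equation 3.104 and the resulting peak gain can be checked with a short numerical integration. This is a sketch only: the ratio Dx/λ = 50 and the horn dimensions below are assumed illustrative values, and the function names are hypothetical.

```python
import math

def sinc(x):
    """Normalized sinc function sin(pi x)/(pi x)."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def sinc2_integral(d_over_lambda, n=100_000):
    """Midpoint-rule estimate of the integral of sinc^2(l D/lambda) over -1 < l < +1."""
    dl = 2.0 / n
    total = 0.0
    for k in range(n):
        l = -1.0 + (k + 0.5) * dl
        total += sinc(l * d_over_lambda) ** 2
    return total * dl

def peak_gain(dx, dy, wavelength):
    """Peak power gain G0 = 4 pi Dx Dy / lambda^2 (Equation 3.106)."""
    return 4.0 * math.pi * dx * dy / wavelength ** 2

ratio = 50.0                         # assumed Dx / lambda
integral = sinc2_integral(ratio)
print(integral, 1.0 / ratio)         # the two agree to a fraction of a percent

# Hypothetical 1 m x 0.5 m uniformly illuminated aperture at lambda = 2 cm.
wavelength = 0.02
g0 = peak_gain(1.0, 0.5, wavelength)
a0 = wavelength ** 2 * g0 / (4.0 * math.pi)   # on-axis effective area
print(g0, a0)
```

The small residual difference between the integral and λ/Dx comes from the sidelobes beyond |l| = 1 that the infinite limits in Equation 3.104 include.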

Using Equation 3.46

$$A_{\rm e}=\frac{\lambda^2G}{4\pi}, \tag{3.109}$$

we find that the on-axis effective collecting area is

$$A_0=\frac{\lambda^2G_0}{4\pi}=\frac{4\pi\lambda^2D_xD_y}{4\pi\lambda^2}=D_xD_y=A_{\rm geom}. \tag{3.110}$$

The peak effective area of an ideal uniformly illuminated aperture equals its geometric area, independent of wavelength. With any other illumination taper, the effective area is smaller than but proportional to the geometric area. It is useful to define the aperture efficiency ηA as the ratio of the effective area to geometric area:

$$\eta_{\rm A}\equiv\frac{A_0}{A_{\rm geom}}. \tag{3.111}$$

Thus ηA = 1 for an ideal uniformly illuminated aperture and ηA < 1 otherwise. The aperture efficiencies of most radio telescopes are ηA ≈ 70%, although phased-array feeds control the illumination well enough to let ASKAP (Figure 8.6) reach ηA ≈ 80%.

Large (D ≫ λ) rectangular waveguide horns are nearly uniformly illuminated unblocked apertures, so their actual gains and effective collecting areas can be calculated accurately. This makes them useful for measuring the absolute flux densities of strong sources such as Cas A and Cyg A and defining the practical flux-density scales used by radio astronomers [6].

Most apertures associated with reflectors and lenses are circular. The power pattern of a uniformly illuminated circular aperture is known as the Airy pattern. (Footnote 6: See http://www.olympusfluoview.com/java/resolution3d/index.html for an interactive plot showing how the Airy pattern behaves as a function of wavelength and aperture size.)

3.3.3 Gaussian Beam Solid Angle and Beamwidth

Figure 3.15: The beams of most radio telescopes are nearly Gaussian, and their beamwidths are usually specified by the angle θHPBW between the half-power points. Abscissa: offset θ from the beam center in units of the HPBW. Ordinate: Effective aperture Ae normalized by the peak effective aperture A0.

For any realistic illumination taper, the beam solid angle (Equation 3.42)

$$\Omega_{\rm A}\equiv\int_{4\pi}\frac{A_{\rm e}(\theta,\phi)}{A_0}\,d\Omega$$

of a radio telescope is about equal to the square of the half-power beamwidth θHPBW. In fact, the beams of most radio telescopes are nearly Gaussian and can be written as

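$$\frac{A_{\rm e}}{A_0}=\exp(-x\theta^2), \tag{3.112}$$

where θ is the angle from the beam center and x is a scaling factor such that Ae/A0 = 1/2 when θ = ±θHPBW/2 (Figure 3.15):

$$\frac{1}{2}=\exp\left[-x\left(\frac{\theta_{\rm HPBW}}{2}\right)^2\right]. \tag{3.113}$$

Thus

$$x=\frac{4\ln 2}{\theta_{\rm HPBW}^2}, \tag{3.114}$$
$$\frac{A_{\rm e}}{A_0}=\exp\left[-4\ln 2\left(\frac{\theta}{\theta_{\rm HPBW}}\right)^2\right], \tag{3.115}$$

and

$$\Omega_{\rm A}=\int_{\theta=0}^{\infty}\int_{\phi=0}^{2\pi}\exp\left[-4\ln 2\left(\frac{\theta}{\theta_{\rm HPBW}}\right)^2\right]\theta\,d\phi\,d\theta. \tag{3.116}$$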

Integrating over ϕ and substituting the dummy variable y=4ln2(θ/θHPBW)2 yields

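$$\Omega_{\rm A}=2\pi\left(\frac{\theta_{\rm HPBW}^2}{8\ln 2}\right)\int_{y=0}^{\infty}\exp(-y)\,dy, \tag{3.117}$$

so the beam solid angle of a Gaussian beam is

$$\Omega_{\rm A}=\left(\frac{\pi}{4\ln 2}\right)\theta_{\rm HPBW}^2\approx 1.133\,\theta_{\rm HPBW}^2. \tag{3.118}$$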
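As a cross-check on the analytic result, the double integral of Equation 3.116 can be evaluated numerically. The sketch below uses a midpoint sum in the small-angle form of the integral; the beamwidth value and the truncation at six beamwidths are assumptions made for illustration.

```python
import math

def gaussian_beam_solid_angle(hpbw, n=100_000, extent=6.0):
    """Midpoint-rule estimate of the Gaussian beam solid angle (Equation 3.116),
    integrating theta from 0 to extent * hpbw with d(Omega) = theta d(theta) d(phi)."""
    d_theta = extent * hpbw / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * d_theta
        total += math.exp(-4.0 * math.log(2.0) * (theta / hpbw) ** 2) * theta
    return 2.0 * math.pi * total * d_theta   # the phi integral contributes 2 pi

hpbw = 1.0e-4                                 # radians, an arbitrary narrow beam
omega_a = gaussian_beam_solid_angle(hpbw)
coefficient = omega_a / hpbw ** 2
print(coefficient, math.pi / (4.0 * math.log(2.0)))
```

Both numbers are about 1.133, independent of the beamwidth chosen.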

3.3.4 Reflector Accuracy Requirements

Real radio telescopes don’t have perfectly smooth paraboloidal reflectors. Small deviations from the best-fit paraboloid may be caused by permanent manufacturing errors, changing gravitational deformations as the reflector is tilted, thermal distortions resulting from solar heating, and bending by strong winds. There will be some shortest wavelength λmin below which these surface errors degrade the reflector performance so severely that the telescope becomes unusable. The reflector surface efficiency ηs is defined as the power gain of the actual reflector divided by the power gain of a perfect paraboloidal reflector with the same size and illumination. The following calculation of how ηs varies with the rms (root mean square) surface error in wavelengths (ϵ/λ) is based on the classic method of Ruze [96].

Figure 3.16: Deviations ϵ of the actual reflector surface (thick curve) from the best-fit paraboloid (thin curve) degrade short-wavelength performance (top). Vector sums of the electric fields E produced by N elements of perfect and imperfect apertures are shown bottom left. Bumps in the imperfect aperture produce phase shifts ±δ ≈ ±4πϵ/λ which lower the vector sum of electric fields from NE to NE cos δ (bottom right).

Where the actual reflector surface deviates from the best-fit paraboloid by a distance ϵ (Figure 3.16), the path length of the reflected wave will be in error by almost 2ϵ and the phase error δ (radians) of the reflected wave will be

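$$\delta\approx\frac{2\pi}{\lambda}(2\epsilon)=\frac{4\pi\epsilon}{\lambda}. \tag{3.119}$$

An oversimplified example would be a bumpy surface, half covered with small bumps of height ϵ ≪ λ and half covered with small dips of the same depth ϵ. Then the contribution of each area element to the far (electric) field (Figure 3.16) is reduced by a factor cos δ. In the limit δ ≪ 1 rad, cos δ ≈ 1 - δ²/2 + ⋯ and

$$\frac{E(\delta)}{E(0)}\approx 1-\frac{\delta^2}{2}+\cdots, \tag{3.120}$$

so the relative power gain is

$$\frac{G(\delta)}{G(0)}\approx\left[\frac{E(\delta)}{E(0)}\right]^2\approx 1-\delta^2\approx 1-\left(\frac{4\pi\epsilon}{\lambda}\right)^2. \tag{3.121}$$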

This rough estimate shows that the surface errors must be an order-of-magnitude smaller than the shortest usable wavelength, a severe requirement indeed.

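A more realistic calculation makes use of the fact that most errors have roughly Gaussian amplitude distributions. Suppose that the surface errors have a Gaussian probability distribution P(ϵ) with rms σ:

$$P(\epsilon)=\frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{\epsilon^2}{2\sigma^2}\right). \tag{3.122}$$

Then the relative field strength is obtained as the weighted sum over all possible ϵ:

$$E/E(0)\approx\int_{-\infty}^{\infty}\cos\left(\frac{4\pi\epsilon}{\lambda}\right)\frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{\epsilon^2}{2\sigma^2}\right)d\epsilon. \tag{3.123}$$

Substituting e^{iz} = cos z + i sin z turns this integral into a more familiar one, the Fourier transform of a Gaussian:

$$E/E(0)\approx\int_{-\infty}^{\infty}\exp\left(-i\frac{4\pi\epsilon}{\lambda}\right)\frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{\pi\epsilon^2}{2\pi\sigma^2}\right)d\epsilon. \tag{3.124}$$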

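Note that the i sin z part drops out immediately because it is antisymmetric in an otherwise symmetric integral. To make this look even more familiar, let s ≡ 2/λ, x ≡ ϵ, and a ≡ (√(2π) σ)⁻¹. Then

$$E/E(0)\approx\frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty}\exp(-i2\pi sx)\exp[-\pi(ax)^2]\,dx. \tag{3.125}$$

Recall that the Fourier transform of f(x) = exp(-πx²) is F(s) = exp(-πs²) (Appendix B.4) and apply the similarity theorem (Equation A.11) to get

$$E/E(0)=\frac{1}{|a|\sqrt{2\pi}\,\sigma}\exp\left[-\pi\left(\frac{s}{a}\right)^2\right] \tag{3.126}$$
$$=\exp\left[-2\pi^2\sigma^2s^2\right] \tag{3.127}$$
$$=\exp\left(-\frac{8\pi^2\sigma^2}{\lambda^2}\right). \tag{3.128}$$

Power is proportional to E², so the reflector surface efficiency is simply

$$\eta_{\rm s}=\exp\left[-\left(\frac{4\pi\sigma}{\lambda}\right)^2\right]. \tag{3.129}$$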

Equation 3.129 is often called the Ruze equation; it is plotted in Figure 3.17.

Figure 3.17: The surface efficiency ηs declines rapidly as the rms surface error in wavelengths σ/λ exceeds 1/16 ≈ 0.06.

The surface efficiency ηs is closely related to the Strehl ratio S used by optical astronomers to specify the peak intensity loss caused by optical aberrations or atmospheric turbulence. The Strehl ratio is normally expressed in terms of the rms wavefront error in wavelengths ω, which is about twice the rms surface error in wavelengths σ/λ, so Equation 3.129 implies

$$S=\exp\left[-(2\pi\omega)^2\right]. \tag{3.130}$$

A traditional rule-of-thumb for the shortest wavelength λmin at which a radio telescope works reasonably well is

$$\sigma\approx\frac{\lambda_{\rm min}}{16} \tag{3.131}$$

because the surface efficiency at λ = λmin is only

$$\eta_{\rm s}\approx\exp\left[-\left(\frac{\pi}{4}\right)^2\right]\approx 0.54 \tag{3.132}$$

and falls exponentially at shorter wavelengths. For example, the 100-m diameter GBT is intended to operate at frequencies as high as ν ≈ 100 GHz, or λmin ≈ 3 mm. To meet this specification, the rms deviation from a perfect paraboloid must not exceed σ ≈ 3 mm/16 ≈ 200 μm, the thickness of two sheets of paper. The power gain of a perfect paraboloidal reflector is proportional to ν². If the reflector surface has a Gaussian error distribution with rms σ, then its gain increases as ν² at low frequencies, reaches a maximum at

$$\lambda=4\pi\sigma, \tag{3.133}$$

and decreases quickly at higher frequencies.
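The Ruze equation and the gain maximum of Equation 3.133 can be explored with a few lines of code. This is an illustrative sketch: the 200 μm surface error echoes the GBT figure quoted above, while `relative_gain` is a hypothetical helper that assumes the perfect-reflector gain scales as ν², that is, as 1/λ².

```python
import math

def surface_efficiency(sigma, wavelength):
    """Ruze surface efficiency exp[-(4 pi sigma / lambda)^2] (Equation 3.129)."""
    return math.exp(-(4.0 * math.pi * sigma / wavelength) ** 2)

def relative_gain(sigma, wavelength):
    """Gain in arbitrary units: the 1/lambda^2 gain of a perfect reflector
    multiplied by the surface efficiency."""
    return surface_efficiency(sigma, wavelength) / wavelength ** 2

# Rule of thumb: sigma = lambda_min / 16.
eta_at_rule = surface_efficiency(1.0, 16.0)
print(eta_at_rule)                      # close to 0.54

sigma = 200e-6                          # 200 micron rms surface error, in meters
lam_peak = 4.0 * math.pi * sigma        # wavelength of maximum gain (Equation 3.133)
print(lam_peak * 1e3, "mm")
print(relative_gain(sigma, 0.9 * lam_peak),
      relative_gain(sigma, lam_peak),
      relative_gain(sigma, 1.1 * lam_peak))
```

The gain at λ = 4πσ exceeds the gain at slightly shorter and slightly longer wavelengths, as expected for a maximum.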

3.3.5 Pointing-Accuracy Requirements

Real radio telescopes don’t have perfectly accurate pointing. Small errors in tracking a target source reduce the gain in the source direction and contribute to the uncertainty in flux-density measurements of compact sources. Tracking errors are just as important as surface errors in limiting the short-wavelength performance of large radio telescopes.

The power patterns of most radio telescopes are nearly Gaussian near the peak. In terms of the beamwidth between half-power points θHPBW, the relative gain at a point offset by angle ρ from the beam axis is

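$$\frac{G}{G_0}=\exp\left[-4\ln 2\left(\frac{\rho}{\theta_{\rm HPBW}}\right)^2\right]. \tag{3.134}$$

If the one-dimensional tracking error in each coordinate (e.g., azimuth or elevation angle) has a Gaussian distribution with rms σ1, the tracking error ρ in two dimensions has a Rayleigh distribution

$$P(\rho)=\frac{\rho}{\sigma_1^2}\exp\left(-\frac{\rho^2}{2\sigma_1^2}\right). \tag{3.135}$$

The mean squared tracking error is

$$\langle\rho^2\rangle=\int_0^{\infty}\rho^2P(\rho)\,d\rho=2\sigma_1^2. \tag{3.136}$$

The rms value of the two-dimensional tracking error is σ2 = 2^(1/2) σ1, so small tracking errors reduce the average on-source gain by the factor

$$\langle G/G_0\rangle=\left[1+4\ln 2\left(\frac{\sigma_2}{\theta_{\rm HPBW}}\right)^2\right]^{-1}. \tag{3.137}$$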

More importantly, the fluctuating on-source gain caused by tracking errors contributes a fractional uncertainty (footnote 7: http://wwwlocal.gb.nrao.edu/ptcs/ptcssn/ptcssn3.pdf)

$$\frac{\sigma_S}{S}=\frac{z}{(1+2z)^{1/2}}, \tag{3.138}$$

where

$$z\equiv 4\ln 2\left(\frac{\sigma_2}{\theta_{\rm HPBW}}\right)^2, \tag{3.139}$$

to a measurement of source flux density S. Thus an rms tracking error of 0.2 θHPBW will contribute a 10% rms flux-density uncertainty. For 5% accuracy, σ2/θHPBW ≈ 0.14 (or σ1/θHPBW ≈ 0.10 in each coordinate) is needed.

For example, we can calculate the largest tracking error in arcsec compatible with making flux-density measurements with 5% rms errors using the GBT 100-m telescope at ν = 33 GHz. From Equation 3.138, σS/S = 0.05 when σ2/θHPBW ≈ 0.14. The half-power beamwidth of the GBT at ν = 33 GHz (λ ≈ 9.1 mm) is

$$\theta_{\rm HPBW}\approx 1.2\,\frac{\lambda}{D}=1.2\,\frac{0.0091\,{\rm m}}{100\,{\rm m}}\approx 1.09\times10^{-4}\,{\rm rad}\approx 23\,{\rm arcsec}. \tag{3.140}$$

Thus the total tracking error must be smaller than σ2 = 0.14 × 23 arcsec = 3.2 arcsec, or σ1 ≈ 2^(-1/2) σ2 ≈ 2.2 arcsec ≈ 10⁻⁵ rad in azimuth and in elevation angle.
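The numbers in this example follow directly from Equations 3.138 through 3.140. The sketch below repeats the calculation; the function names and the use of the exact speed of light (rather than the rounded λ ≈ 9.1 mm) are choices made here, so the last digits differ slightly from the rounded values in the text.

```python
import math

C = 299_792_458.0                          # speed of light, m/s
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def hpbw_rad(diameter_m, freq_hz, coefficient=1.2):
    """Half-power beamwidth ~ 1.2 lambda / D for a typical tapered illumination."""
    return coefficient * (C / freq_hz) / diameter_m

def flux_uncertainty(sigma2_over_hpbw):
    """Fractional flux-density uncertainty z / (1 + 2 z)^(1/2) (Equations 3.138, 3.139)."""
    z = 4.0 * math.log(2.0) * sigma2_over_hpbw ** 2
    return z / math.sqrt(1.0 + 2.0 * z)

hpbw_arcsec = hpbw_rad(100.0, 33e9) * ARCSEC_PER_RAD
sigma2_arcsec = 0.14 * hpbw_arcsec          # total two-dimensional rms error
sigma1_arcsec = sigma2_arcsec / math.sqrt(2.0)

print(hpbw_arcsec, sigma2_arcsec, sigma1_arcsec)
print(flux_uncertainty(0.2), flux_uncertainty(0.14))
```

An rms error of 0.2 beamwidths gives about a 10% flux uncertainty and 0.14 beamwidths gives about 5%, matching the values quoted above.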

The thermal expansion coefficient of steel is about 10⁻⁵ °C⁻¹, so changing the temperature differential across the steel GBT support structure by only 1 °C could produce a 10⁻⁵ rad ≈ 2 arcsec pointing shift. For this reason, high-frequency observers must monitor pointing calibration sources and correct the GBT pointing every hour or so, particularly just after sunrise on sunny days. Wind gusts also degrade pointing accuracy, but they fluctuate on much shorter timescales.

3.4 Waveguides

Figure 3.18: The upper drawing shows the cross section of a rectangular waveguide having interior width a in the x-direction and height b ≈ a/2 in the y-direction. The component of any electric field E parallel to a conducting wall must go to zero at the wall, just as with radiation in a cavity (Figure 2.16). The curves indicate three allowed x distributions of electric field strength (called modes). The lower drawing is a plan view of the waveguide. It is analogous to Figure 2.18 showing waves in a conducting cavity. Radiation of wavelength λ must travel through the waveguide in the direction indicated by the large arrow to satisfy the boundary condition nx = 2a/λx = (2a/λ) cos α, where nx = 1, 2, 3, … (Equation 2.66). Only the TE10 mode shown, with nx = 1 (a = λx/2) and no variation of E = Ey with height (ny = 0), is normally used.

Waveguides are low-loss shielded “pipes” used to transport electromagnetic waves between antennas and receivers or between sections of a receiver. The simplest waveguide is a hollow rectangular tube with conducting walls (Figure 3.18, top) separated by distance a in the horizontal (x) direction and b ≈ a/2 in the vertical (y) direction. At the conducting walls, the parallel component of any electric field inside the waveguide must be zero. Three permitted distributions of the electric field strength E along the horizontal axis are shown as curves in the top panel of Figure 3.18, which is similar to Figure 2.16 for standing waves in a cavity. However, only the longest-wavelength dominant mode with nx = 1 is normally used, and higher-order modes with nx = 2, 3, … are deliberately suppressed because they travel down the waveguide with different group velocities.

The bottom panel of Figure 3.18 presents the plan view of the dominant radiation mode traveling through the waveguide with a wave normal in the direction of the large arrow and a wave node (|E| = 0) indicated by the dashed line. It is analogous to Figure 2.18 showing waves in a conducting cavity. Radiation of wavelength λ traveling through the waveguide in the direction indicated by the arrow must satisfy the boundary condition nx = 2a/λx = (2a/λ) cos α, where nx = 1, 2, 3, … (Equation 2.66). When nx = 1, λx/2 = a = λ/(2 cos α).

The maximum wavelength (cos α = 1) that can propagate (α → 0) in the waveguide is the cutoff wavelength

$$\lambda_{\rm c}=2a, \tag{3.141}$$

and the corresponding minimum frequency

$$\nu_{\rm c}=c/\lambda_{\rm c} \tag{3.142}$$

is called the cutoff frequency. Waveguides are extremely effective high-pass filters.

The group velocity of propagation down the waveguide is

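$$v_{\rm g}=c\sin\alpha=c\,(1-\cos^2\alpha)^{1/2}=c\left[1-\left(\frac{\nu_{\rm c}}{\nu}\right)^2\right]^{1/2}, \tag{3.143}$$

which varies quite rapidly with frequency as ν approaches νc (sin α approaches 0). The waveguide phase velocity vp = c²/vg ≥ c, so the guide wavelength

$$\lambda_{\rm w}=\frac{c}{\nu}\left[1-\left(\frac{\nu_{\rm c}}{\nu}\right)^2\right]^{-1/2} \tag{3.144}$$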

is somewhat greater than the free-space wavelength.

To minimize dispersion (the variation of vg with frequency), waveguides are rarely used at frequencies below ν ≈ 1.25 νc. Higher-order modes with nx = 2, 3, … have frequencies ν > 2νc, 3νc, … and propagate with different group velocities. To suppress them, waveguides are not used for frequencies ν > 2νc, the cutoff frequency of the nx = 2 mode. Practical waveguides are usually limited to frequencies ν < 1.9 νc. The requirement b ≈ a/2 ensures that λ/2 > b for all ν < 2νc so ny = 0, so only this TE10 (Transverse Electric field with nx = 1, ny = 0) mode can propagate. The TE10 mode electric field is vertically polarized and its strength is independent of y.

The combination of these upper and lower frequency limits restricts most waveguide applications to octave bandwidths, and waveguides of different sizes cover different octaves. Many of the waveguide band names in use today originated as deliberately confusing code names for World War II radar bands. They and their frequency ranges are listed in Appendix F.5. For example, the standard X-band waveguide has interior dimensions a = 0.9 inches = 2.286 cm, b = 0.4 inches = 1.016 cm. Its cutoff wavelength is λc = 2a = 4.572 cm and its cutoff frequency is νc = c/λc ≈ 6.557 GHz. Its nominal frequency range extends from 1.25 νc ≈ 8.2 GHz to 1.9 νc ≈ 12.4 GHz. Unfortunately, the waveguide band names are so deeply embedded in radio-astronomy jargon that radio observers cannot avoid them any more than optical astronomers can avoid “magnitudes.”
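The X-band numbers above, together with the group velocity and guide wavelength of Equations 3.143 and 3.144, can be reproduced as follows. This is a minimal sketch; the function names are illustrative and the 10 GHz evaluation frequency is an arbitrary point inside the band.

```python
import math

C = 299_792_458.0        # speed of light, m/s
INCH = 0.0254            # meters per inch

def cutoff_frequency(a_m):
    """Cutoff frequency nu_c = c / (2 a) of the dominant mode (Equations 3.141, 3.142)."""
    return C / (2.0 * a_m)

def group_velocity_fraction(freq_hz, nu_c):
    """v_g / c = [1 - (nu_c / nu)^2]^(1/2) (Equation 3.143)."""
    return math.sqrt(1.0 - (nu_c / freq_hz) ** 2)

def guide_wavelength(freq_hz, nu_c):
    """Guide wavelength (c / nu) [1 - (nu_c / nu)^2]^(-1/2) (Equation 3.144)."""
    return (C / freq_hz) / group_velocity_fraction(freq_hz, nu_c)

a = 0.9 * INCH                                   # X-band interior width
nu_c = cutoff_frequency(a)
band_low, band_high = 1.25 * nu_c, 1.9 * nu_c    # nominal usable range

print(nu_c / 1e9, band_low / 1e9, band_high / 1e9)        # GHz
print(group_velocity_fraction(10e9, nu_c))                # v_g / c at 10 GHz
print(guide_wavelength(10e9, nu_c) * 100, "cm vs free-space", C / 10e9 * 100, "cm")
```

At 10 GHz the group velocity is roughly three quarters of c, and the guide wavelength is correspondingly longer than the free-space wavelength.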

Each feed and receiver on a radio telescope covers only one waveguide band, so several feeds and receivers are needed to span the much wider useful frequency range of the telescope itself. At the VLA, the frequency range from 1 to 50 GHz is covered by eight sets of feeds and receivers in eight waveguide bands: L (1–2 GHz), S (2–4 GHz), C (4–8 GHz), X (8–12 GHz), Ku (12–18 GHz), K (18–26.5 GHz), Ka (26.5–40 GHz), and Q (40–50 GHz).

3.5 Radio Telescopes

The radio band is too wide (five decades in wavelength) to be covered effectively by a single telescope design. The surface brightnesses and angular sizes of radio sources span an even wider range, so a combination of single telescopes and aperture-synthesis interferometers is needed to detect and image them. It is not practical to build a single radio telescope that is even close to optimum for all of radio astronomy.

The ideal radio telescope should have a large collecting area to detect faint sources. The effective collecting area Ae(θ,ϕ) of any antenna averaged over all directions (θ,ϕ) is (Equation 3.41)

$$\langle A_{\rm e}\rangle=\frac{\lambda^2}{4\pi}, \tag{3.145}$$

so large peak collecting areas imply extremely directive antennas at short wavelengths. Only at long wavelengths (λ > 1 m) is it feasible to construct sensitive antennas from reasonable numbers of small, nearly isotropic elements such as dipoles. Jansky’s λ ≈ 15-m “wire” antenna (Figure 1.7) is an array of phased dipoles. It produces a wide fan beam near the horizon but has a large collecting area because λ² is so large. Directive aperture antennas are needed for adequate sensitivity at higher frequencies.

The simplest aperture antenna is a waveguide horn. Radiation incident on the opening is guided by a tapered waveguide. At the narrow end of the tapered horn is a waveguide with parallel walls, and inside this waveguide is a quarter-wave ground-plane vertical antenna that converts the electromagnetic wave into an electrical current that is sent to the receiver via a cable.

Horn antennas pick up very little ground radiation because, unlike most paraboloidal dishes, their apertures are not partially blocked by external feeds and feed-support structures, which scatter ground radiation into the receiver. This freedom from ground pickup allowed Penzias and Wilson [80] to show that the zenith antenna temperature of the Bell Labs horn (Figure 3.19) was 3.5 K higher at $\nu \approx 4$ GHz than expected, which was the first detection of the cosmic microwave background radiation.

Figure 3.19: The horn antenna at Bell Labs, Holmdel, NJ used by Penzias and Wilson to discover the 3 K cosmic microwave background radiation in 1965. Reprinted with permission of Alcatel-Lucent USA Inc.

The aperture of a waveguide horn is not blocked by any feed-support structure, so it is also easier to calculate the gain of a horn antenna from first principles than to calculate the gain of a partially blocked reflecting antenna. Thus small horn antennas have been used by radio astronomers to measure the absolute flux densities of very strong sources such as Cas A. Radio astronomers observing with large dishes typically do not measure the absolute flux densities of sources, only their relative flux densities by comparison with secondary calibration sources whose flux densities relative to that of Cas A are known in advance. The painstaking process of measuring the absolute flux densities of Cas A and comparing them with the flux densities of weaker point sources suitable for calibrating observations made with large radio telescopes was described in detail by Baars et al. [6].

Figure 3.20: The 140-foot (43-m) telescope in Green Bank, WV is the largest telescope with an equatorial mount. Image credit: NRAO/AUI/NSF.

Most radio telescopes use circular paraboloidal reflectors to obtain large collecting areas and high angular resolution over a wide frequency range. Because the feed is on the reflector axis, the feed and legs supporting it partially block the path of radiation falling onto the reflector. This aperture blockage has a number of undesirable consequences:

  1. The effective collecting area is reduced because some of the incoming radiation is blocked.

  2. The beam pattern is degraded by increased sidelobe levels.

  3. Radiation from the ground that is scattered off the feed and its support structure increases the system noise.

  4. Radiation from the Sun and artificial sources of radio frequency interference (RFI) far from the main beam will be mixed with the desired signal.

Radio telescopes are so large that paraboloids with high $f/D$ ratios are impractical; typically $f/D \approx 0.4$. Thus radio “dishes” are relatively deep, as shown in Figure 3.20. Another consequence of a low $f/D$ ratio is a tiny field of view at the prime focus. The instantaneous imaging capability of a large single dish is severely limited by the small number of feeds that can fit into the tiny focal circle.

Nearly all radio telescopes have alt-az mounts consisting of a horizontal azimuth track on which the telescope turns in azimuth (the angle measured clockwise from north in the horizontal plane) and a horizontal elevation axle about which the telescope tips in altitude or elevation angle (two names for the angle above the horizon). The 140-foot telescope in Green Bank is unique among large radio telescopes in having an equatorial mount (Figure 3.20). The advantage of an equatorial mount is tracking simplicity: the declination axis is fixed and the hour-angle axis turns at a constant rate while tracking a distant celestial source. (The hour angle is the angle past the meridian, measured in hours. The meridian is the great circle passing through the north pole, south pole, and zenith.) In contrast, both the altitude and the azimuth of a celestial source change nonlinearly with time. When the 140-foot telescope was being designed, the ability of computers to perform the real-time calculations needed for an alt-az telescope to track a source accurately was in doubt. The disadvantage of an equatorial mount is mechanical: the sloped hour-angle yoke and polar axle with its huge tail bearing are very difficult to build and support.

Figure 3.21: Cross section of a radio telescope rotationally symmetric around the z-axis and having a Cassegrain subreflector. Parallel rays from a distant radio source are reflected by a circular paraboloid whose prime focus is at the point marked f1. The convex Cassegrain subreflector is a circular hyperboloid located below the prime focus. It reflects these rays to the feed located at the secondary focus f2 just above the vertex of the paraboloid. The angle 2θ1 subtended by the main reflector viewed from the prime focus is much larger than the angle 2θ2 subtended by the subreflector viewed from the secondary focus, so Cassegrain feeds have to be much larger than primary feeds.

Figure 3.20 clearly shows the Cassegrain optical system of the 140-foot telescope. Radiation reflected from the main dish is reflected a second time from the convex Cassegrain subreflector located just below the focal point down to feed horns and receivers near the vertex of the paraboloid. A subreflector system has some advantages over a prime-focus system:

  1. The magnifying subreflector can multiply the effective $f/D$ ratio; values of $f/D \approx 2$ are typical. This greatly increases the size of the focal ellipsoid. Multiple feeds can be located within the focal ellipsoid to produce multiple simultaneous beams for faster imaging.

  2. The subreflector is many wavelengths in diameter so it can be used to tailor the illumination taper to optimize the trade-off between high aperture efficiency and low sidelobes.

  3. Receivers can be located near the vertex, not the focal point, where they are easier to access.

  4. Feed spillover radiation is directed toward the cold sky instead of the warm ground, lowering overall system temperatures.

  5. The subreflector can be nutated (rocked back and forth) rapidly to switch the beam between two adjacent positions on the sky. Such differential observations in time and space can be used to remove receiver baseline drift in time and large-scale spatial fluctuations of atmospheric noise.

  6. The subreflector can be tilted to select one of several feeds at the secondary focus, so that the observing frequency band can be changed rapidly.

A subreflector system has some disadvantages:

  1. Relatively large feeds are required to produce the narrow beams needed to illuminate the subreflector, which typically subtends only a small angle as viewed from the vertex.

  2. Standing waves in the leaky cavity formed by the reflector and subreflector cause sinusoidal ripples with frequency period $\Delta\nu \approx c/(2f)$ in the observed spectra of strong continuum radio sources. These ripples can be minimized by alternately defocusing the subreflector radially by $\pm\lambda/8$ and averaging the data from both subreflector positions.

  3. A Cassegrain subreflector blocks the prime-focus position, so prime-focus feeds cannot be used when the Cassegrain subreflector is in position.

The geometry of a symmetrical radio telescope with a Cassegrain subreflector is shown in Figure 3.21. The paraboloidal shape of the primary reflector was determined by the requirement that all incoming rays parallel to the z-axis travel the same distance to reach the prime focus at f1. Likewise, the secondary reflector shape is determined by the requirement that these rays travel the same distance to reach the secondary focus at f2. For a subreflector located below the prime focus, the required shape is a hyperboloid whose major axis coincides with the major axis of the paraboloid. The equation

$$\frac{z^2}{a^2} - \frac{r^2}{b^2} = 1 \tag{3.146}$$

with $a > b$ defines such a hyperboloid. From any point on the hyperboloid, the difference between the distance to $f_2$ and the distance to $f_1$ is $2a$. The distance between the foci is $2(a^2 + b^2)^{1/2}$. The two free parameters $a$ and $b$ can be adjusted to set both the diameter of the subreflector as needed to intercept rays from the edge of the primary and the height of the secondary focus on the $z$-axis. The magnification provided by the subreflector is

$$M = \frac{\tan(\theta_1/2)}{\tan(\theta_2/2)}, \tag{3.147}$$

where θ1 is the half angle subtended by the primary viewed from f1 and θ2 is the half angle subtended by the secondary viewed from f2. A small subreflector is light, easy to tilt, and reduces standing waves, but it subtends a small angle 2θ2 at f2 so a feed horn several wavelengths in diameter is required to illuminate it properly.
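
The hyperboloid properties and the magnification formula can be checked numerically. The sketch below is illustrative only: the values of $a$, $b$, and the test radius are arbitrary, the function names are hypothetical, and `half_angle` assumes the standard paraboloid relation $\tan(\theta/2) = D/(4f)$, which is not derived in this section.

```python
import math

def hyperboloid_z(r, a, b):
    """Height z of the upper branch of z^2/a^2 - r^2/b^2 = 1 at radius r."""
    return a * math.sqrt(1.0 + (r / b) ** 2)

def focus_distance_difference(r, a, b):
    """Distance to the far focus f2 minus distance to the near focus f1."""
    c = math.sqrt(a * a + b * b)          # foci at z = +c (f1) and z = -c (f2)
    z = hyperboloid_z(r, a, b)
    return math.hypot(r, z + c) - math.hypot(r, z - c)

def magnification(theta1, theta2):
    """Equation 3.147, with half angles in radians."""
    return math.tan(theta1 / 2.0) / math.tan(theta2 / 2.0)

def half_angle(f_over_d):
    """Half angle of the reflector edge seen from the focus (assumed relation)."""
    return 2.0 * math.atan(1.0 / (4.0 * f_over_d))

a, b = 2.0, 1.0                           # arbitrary illustrative values with a > b
diff = focus_distance_difference(0.7, a, b)
foci_sep = 2.0 * math.sqrt(a * a + b * b)
m = magnification(half_angle(0.4), half_angle(2.0))
print(f"distance difference = {diff:.6f} (expected 2a = {2 * a})")
print(f"distance between foci = {foci_sep:.4f}")
print(f"magnification from f/D = 0.4 to f/D = 2: M = {m:.2f}")
```

Under these assumptions, raising the effective focal ratio from 0.4 to 2 corresponds to a magnification of 5.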

The Parkes 210-foot (since renamed to 64-m) telescope (Figure 3.22) in Australia was built about the same time as the 140-foot telescope, but its alt-az mount and centrally concentrated reflector backup structure pointed the way to the design of modern radio telescopes.

Figure 3.22: The Parkes 64-m telescope. Photo © Shaun Amy.

Elevation-dependent gravitational deformations degrade the short-wavelength performance of tilting reflectors. The deformations can be controlled by designing the backup structure so that the deformed surface remains paraboloidal. The deformations cause the focal point to shift slightly in elevation, but this shift can be accommodated by moving the feed slightly to track the focus. The first large homologous telescope deliberately designed to deform this way is the 100-m telescope (Figure 3.23) of the Max Planck Institut für Radioastronomie (MPIfR) near Effelsberg, Germany. Despite its huge size, its passive surface remains accurate enough to work at wavelengths as short as λ=7 mm over a range of elevations.

Figure 3.23: The 100-m telescope near Effelsberg, Germany. The first deliberately homologous telescope, it works to $\lambda \approx 7$ mm. Note the large Gregorian subreflector above the prime focus. Photo by Matthias Kadler.

The 100-m telescope has a concave Gregorian subreflector above the prime focus. The geometry of a symmetric Gregorian system is shown in Figure 3.24. As with the Cassegrain subreflector, the Gregorian reflector shape is determined by the requirement that all parallel axial rays travel the same distance to reach the secondary focus at f2. For a subreflector located above the prime focus, the required shape is an ellipsoid whose major axis coincides with the major axis of the paraboloid. The equation

$$\frac{z^2}{a^2} + \frac{r^2}{b^2} = 1 \tag{3.148}$$

with $a > b$ defines such an ellipsoid. From any point on the ellipsoid, the sum of the distance to $f_2$ and the distance to $f_1$ is $2a$. The distance between the foci is $2(a^2 - b^2)^{1/2}$.

Figure 3.24: Cross section of a radio telescope rotationally symmetric around the z-axis and having a Gregorian subreflector. Parallel rays from a distant radio source are reflected by the circular paraboloid whose prime focus is at the point marked f1. The Gregorian subreflector is a circular ellipsoid located above the prime focus. It reflects these rays to the feed located at the secondary focus f2 just above the vertex of the paraboloid.

The Arecibo radio telescope (Figures 8.2 and 3.25) was originally designed as a radar facility to study the ionosphere via Thomson scattering of 430 MHz ($\lambda = 70$ cm) radio waves by free electrons. Thermal motions of truly free electrons would greatly Doppler broaden the bandwidth of the radar echo and lower the received signal-to-noise ratio, so a very large antenna was built for sensitivity. However, ionospheric electrons are coupled to the much heavier ions on scales larger than the ionospheric Debye length, which is only a few mm. This is much smaller than the 70 cm wavelength, so the actual bandwidth is determined by thermal motions of the much heavier ions and is lower by two orders of magnitude. Thus a far smaller dish would have sufficed! Astronomers have benefited from this oversight and use Arecibo’s huge collecting area at frequencies up to about 10 GHz for Solar-System radar (planets, moons, asteroids), pulsar studies, H I 21-cm line observations of galaxies, and other observations that need high sensitivity.

Figure 3.25: The Arecibo feed-support platform can steer the beam anywhere up to 20 degrees from the zenith even though the spherical reflector is fixed. The curved azimuth arm rotates about the vertical under a circular ring at the base of the fixed triangular structure. The carriage house under the left side of the azimuth arm carries a waveguide line feed that corrects for spherical aberration. The dome under the carriage house on the right side contains the Gregorian secondary mirror and tertiary correcting mirror, illuminated by waveguide horn feeds. The carriage houses can move along tracks at the bottom of the azimuth arm to change the zenith angle of the beam.

The spherical reflector can be very large because it does not move. A sphere is symmetric about any axis passing through its center, so the Arecibo beam can be steered by moving the feed instead of the reflector. The curved feed-support arm visible in Figure 3.25 is 300 feet long and rotates in azimuth below the fixed triangular structure. The feeds are mounted under two carriage houses that move along tracks on the bottom of the feed arm and permit tracking at zenith angles up to 20 degrees. The feed illumination spills over the edge of the fixed reflector at high zenith angles, so a large ground screen surrounds the spherical reflector to reflect the spillover onto the cold sky and keep it away from the warm and noisy ground.

A spherical reflector focuses a distant point source onto a radial line segment, so a radial line feed (see Figure 3.25) up to 96 feet long is needed to illuminate the entire aperture efficiently from the prime focus. The line feed is a slotted waveguide tapered to control the group velocity (Equation 3.143) and phase up radiation arriving from all over the reflector. However, long slotted-waveguide line feeds are inherently narrowband, and ohmic losses in the long slotted waveguide increase the system temperature significantly at short wavelengths. The “golf ball” under the feed arm at Arecibo (Figure 3.25) houses an enormous Gregorian subreflector and a tertiary reflector that allow low-noise wideband point feeds to illuminate an ellipse about 200 m by 225 m in size on the main reflector.

Figure 3.26: Vertical cross section showing the symmetry plane of the GBT. The actual dish shown by the continuous curve is an asymmetric section of the symmetric parent paraboloid (dotted curve) whose diameter is 208 m. The inner edge of the GBT reflector is 4 m to the right of the z-axis of symmetry so the foci and feed-support structure to the left of the z-axis never block the incoming radiation. The primary focal length is f1 = 60 m, and the distance from f1 to the secondary focus f2 is 11 m. The secondary focus is offset by 1.068 m from the symmetry axis to minimize instrumental polarization. The diameter of the Gregorian subreflector is 8 m. The secondary focus is far above the vertex of the parent paraboloid, but the off-axis feed support arm of the GBT is strong enough to support a large feed/receiver cabin (Figure 3.27) at this height.

The 100-m Robert C. Byrd Green Bank Telescope (GBT) (Figure 8.1) is the successor to the collapsed 300-foot telescope in Green Bank, and it incorporates a number of new design features to optimize its sensitivity and short-wavelength performance.

The actual reflector is a 110 m × 100 m off-axis section of an imaginary symmetric paraboloid 208 m in diameter. Projected onto a plane normal to the beam, it is a 100-m diameter circle. Because the projected edge of the actual reflector is 4 m away from the axis of the 208-m paraboloid, the focal point does not block the aperture. The GBT enjoys the same clear-aperture benefits as waveguide horns (a very clean beam and low spillover noise) but is much larger than any practical horn antenna. The clean beam is especially valuable for suppressing radio-frequency interference (RFI) and stray radiation from very extended sources, such as H I emission from the Galaxy.

Figure 3.27: The concave Gregorian subreflector just above the prime focus of the GBT images sources onto conical horn feeds extending through the top of the rectangular receiver cabin. The prime-focus feed arm is shown stowed out of the way of the subreflector. None of these offset structures block radiation reflected from the main aperture. Image credit: NRAO/AUI/NSF.

The vertical cross section of the GBT plotted in Figure 3.26 shows how the offset Gregorian subreflector does not block any radiation falling onto the primary reflector. The Gregorian subreflector is above the prime focus at f1, so prime-focus operation is possible by raising a swinging boom carrying the prime-focus feeds into position below the subreflector, although this temporarily blocks the Gregorian subreflector. The huge feed-support arm is over 60 m long, the focal length of the 208-m paraboloid. The feed-support arm has a much larger cross section than the feed-support structures of symmetrical telescopes, which must be kept as thin as possible to minimize blockage. This GBT arm is very strong and can support heavy subreflectors, feeds, equipment rooms, and an elevator. At the top of the arm and above the prime focus is the concave Gregorian subreflector. This subreflector illuminates feeds emerging through the roof of a large receiver cabin attached to the feed arm a short distance below (Figure 3.27). Because these feeds are relatively close to the subreflector, even a moderately small subreflector subtends a large angle as viewed from the feeds, which can then be moderately small themselves. Most of the receivers and feeds needed to cover the frequency range 1<ν(GHz)<100 can fit into the receiver cabin simultaneously and are available for use on short notice.

The main reflector is supported by a backup structure that deforms homologously to ensure good efficiency at wavelengths as short as $\lambda = 2$ cm. The active reflecting surface consists of approximately two thousand panels, each about 2 m on a side. The corners of individual panels are mounted on computer-controlled actuators that can move the panels up or down as needed to continuously correct the overall shape of the surface. Photogrammetry was used to measure the surface at the rigging elevation (the elevation at which the surface was originally set). The gravitational deformations at other elevation angles predicted by the finite-element computer model of the GBT are continuously removed by the actuators as the telescope moves. As a result, the rms surface error is only $\sigma \approx 0.2$ mm and the GBT has a high surface efficiency at wavelengths as short as $\lambda \approx 3$ mm.
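
As a rough illustration of what an rms surface error of 0.2 mm implies, the sketch below evaluates the surface efficiency at two wavelengths. It assumes the standard Ruze formula $\eta_{\rm s} = \exp[-(4\pi\sigma/\lambda)^2]$, which is not stated in this section, and the function name is hypothetical.

```python
import math

def surface_efficiency(sigma_mm, wavelength_mm):
    """Ruze-type surface efficiency exp[-(4 pi sigma / lambda)^2] (assumed formula)."""
    return math.exp(-(4.0 * math.pi * sigma_mm / wavelength_mm) ** 2)

eta_3mm = surface_efficiency(0.2, 3.0)    # sigma = 0.2 mm at lambda = 3 mm
eta_2cm = surface_efficiency(0.2, 20.0)   # same surface at lambda = 2 cm
print(f"lambda = 3 mm: eta_s = {eta_3mm:.2f}")
print(f"lambda = 2 cm: eta_s = {eta_2cm:.3f}")
```

Under this assumption the surface is nearly perfect at 2 cm and retains roughly half of its efficiency at 3 mm.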

The 30-m IRAM (Institut de Radioastronomie Millimétrique) telescope (Figure 3.28) is the largest telescope operating at 3, 2, 1, and 0.8 mm. Its rms surface error is only 55 μm, and its pointing accuracy is about 1 arcsec.

Figure 3.28: The 30-m IRAM telescope on Pico Veleta in Spain. Image credit: IRAM.

3.6 Radiometers

Natural radio emission from the cosmic microwave background, discrete astronomical sources, the Earth’s atmosphere, and the ground is random broadband noise that is nearly indistinguishable from the noise generated by a warm resistor (Section 2.5) or by receiver electronics. A radio receiver used to measure the average power of the noise coming from a radio telescope in a well-defined frequency range is called a radiometer. The noise voltage has a Gaussian amplitude distribution with zero mean, and it fluctuates on very short timescales (nanoseconds) comparable with the inverse of the radiometer bandwidth $\Delta\nu$. A square-law detector in the radiometer squares the input noise voltage to produce an output voltage proportional to the input noise power. Noise power is always greater than zero, and the noise from most astronomical sources is stationary, meaning that its mean power is steady when averaged over much longer timescales $\tau$ (seconds to hours). The Nyquist–Shannon sampling theorem (Appendix A.3) states that any function having finite bandwidth $\Delta\nu$ and duration $\tau$ can be represented by $2\Delta\nu\,\tau$ independent samples spaced in time by $(2\Delta\nu)^{-1}$. By averaging a large number $N = (2\Delta\nu\,\tau)$ of independent noise samples, an ideal radiometer can determine the average noise power with a fractional uncertainty as small as $(N/2)^{-1/2} = (\Delta\nu\,\tau)^{-1/2} \ll 1$ and detect faint sources that increase the antenna temperature by only a tiny fraction of the total noise power. The ideal radiometer equation expresses this result in terms of the radiometer bandwidth and the averaging time. Gain variations in practical radiometers, fluctuations in atmospheric emission, and confusion by unresolved radio sources may significantly degrade the actual sensitivity compared with that predicted by the ideal radiometer equation.

3.6.1 Band-Limited Noise

Figure 3.29: The output voltage V of a radio telescope varies rapidly on short timescales, as indicated by the upper plot showing 100 independent samples of band-limited noise drawn from a Gaussian probability distribution P(V/Vrms) (lower plot) having zero mean and fixed rms Vrms. See Appendix B.5 for a mathematical description of the Gaussian distribution.

The voltage at the output of a radio telescope is the sum of noise voltages from many independent random contributions. The central limit theorem [15] states that the amplitude distribution of such noise is nearly Gaussian. Figure 3.29 (lower panel) shows the histogram of about 20,000 independent voltage samples randomly drawn from a Gaussian parent distribution having rms $V_{\rm rms}$ and mean $\langle V \rangle = 0$. Figure 3.29 (upper panel) shows $N = 100$ successive samples drawn from the Gaussian noise distribution. This sequence of voltages is representative of band-limited noise in the frequency range from 0 to $\Delta\nu$ during a time interval $\tau$ such that $(\Delta\nu\,\tau) = N/2 = 50$, e.g., noise with all frequencies up to $\Delta\nu = 1$ MHz sampled every $(2\Delta\nu)^{-1} = 0.5\,\mu$s for $\tau = 50\,\mu$s. This is what the band-limited noise output voltage of a radio telescope looks like.
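
A sequence like the one in the upper panel of Figure 3.29 can be generated with a few lines of code. This is a minimal sketch using Python's standard pseudorandom generator; the function name, the seed, and the unit rms are arbitrary choices, and it is not the procedure used to make the figure.

```python
import random

def nyquist_sampling(bandwidth_hz, duration_s):
    """Number of independent samples N = 2 * bandwidth * duration and their spacing in s."""
    n = round(2.0 * bandwidth_hz * duration_s)
    spacing = 1.0 / (2.0 * bandwidth_hz)
    return n, spacing

n_samples, dt = nyquist_sampling(1e6, 50e-6)   # 1 MHz bandwidth for 50 microseconds
rng = random.Random(42)                        # fixed seed so the run is repeatable
v_rms = 1.0                                    # voltages in units of V_rms
voltages = [rng.gauss(0.0, v_rms) for _ in range(n_samples)]

print(f"N = {n_samples} samples spaced by {dt * 1e6:.1f} microseconds")
print(f"sample mean = {sum(voltages) / n_samples:+.3f} V_rms")
```

The sample mean of only 100 draws scatters around zero by about $V_{\rm rms}/10$, so it will not be exactly zero.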

It is convenient to describe noise power in units of temperature. The noise power per unit bandwidth generated by a resistor of temperature T is Pν=kT in the low-frequency limit, so we can define the noise temperature of any noiselike source in terms of its power per unit bandwidth Pν:

$$T_{\rm N} \equiv \frac{P_\nu}{k}, \tag{3.149}$$

where $k \approx 1.38 \times 10^{-23}$ joule K$^{-1}$ is Boltzmann’s constant.

The temperature equivalent to the total noise power from all sources referenced to the input of a radiometer connected to the output of a radio telescope is called the system noise temperature Ts. It is the sum of many contributors to the antenna temperature plus the radiometer noise temperature Tr:

$$T_{\rm s} = T_{\rm cmb} + T_{\rm rsb} + \Delta T_{\rm source} + [1 - \exp(-\tau_{\rm A})]\,T_{\rm atm} + T_{\rm spill} + T_{\rm r} + \cdots. \tag{3.150}$$

There are seven antenna-temperature contributions listed explicitly in Equation 3.150:

  1. $T_{\rm cmb} \approx 2.73$ K is from the nearly isotropic cosmic microwave background.

  2. $T_{\rm rsb}$ is the average sky brightness temperature contributed by all “background” radio sources. Extragalactic sources add [31]

    $$\left(\frac{T_{\rm rsb}}{0.1\,{\rm K}}\right) \approx \left(\frac{\nu}{1.4\,{\rm GHz}}\right)^{-2.7} \tag{3.151}$$

    in all directions, and the Galactic plane is a bright diffuse source at low ($\nu < 0.5$ GHz) frequencies [43].

  3. $\Delta T_{\rm source}$ is from the astronomical source being observed, written with a $\Delta$ to emphasize that it is usually much smaller than the total system noise: $\Delta T_{\rm source} \ll T_{\rm s}$. For example, in the $\nu_{\rm RF} \approx 4.85$ GHz sky survey made with the 300-foot telescope, the system noise was $T_{\rm s} \approx 60$ K, but the faintest detected sources added only $\Delta T_{\rm source} \approx 0.01$ K.

  4. $[1 - \exp(-\tau_{\rm A})]\,T_{\rm atm}$ is the brightness of atmospheric emission in the telescope beam (Section 2.2.3).

  5. $T_{\rm spill}$ accounts for spillover radiation that the feed picks up in directions beyond the edge of the reflector, primarily from the ground.

  6. $T_{\rm r}$ is the radiometer noise temperature attributable to noise generated by the radiometer itself, referenced to the radiometer input. All radiometers generate noise, and any radiometer can be represented by an equivalent circuit consisting of a noiseless radiometer whose input is connected to a resistor of temperature $T_{\rm r}$. Radiometer noise is usually minimized by cooling the radiometer to cryogenic temperatures. However, radiometers are not just matched resistors, so $T_{\rm r}$ may be either lower or higher than the physical temperature of the radiometer itself.

  7. “$\cdots$” represents any other noise sources that might be important. An example is emission resulting from ohmic losses in the long slotted waveguide feed at Arecibo (Figure 3.25).
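
The sum in Equation 3.150 can be sketched as a simple noise budget. Only the 2.73 K background and the scaling of Equation 3.151 come from the text; the atmospheric opacity, atmospheric temperature, spillover, and radiometer temperatures below are hypothetical values chosen purely for illustration, as are the function names.

```python
import math

def t_rsb(nu_ghz):
    """Extragalactic background contribution from Equation 3.151, in K."""
    return 0.1 * (nu_ghz / 1.4) ** -2.7

def system_temperature(nu_ghz, dt_source, tau_a, t_atm, t_spill, t_r, t_other=0.0):
    """Sum the explicit terms of Equation 3.150; t_other stands in for the remaining terms."""
    t_cmb = 2.73
    t_atm_emission = (1.0 - math.exp(-tau_a)) * t_atm
    return t_cmb + t_rsb(nu_ghz) + dt_source + t_atm_emission + t_spill + t_r + t_other

# Hypothetical inputs: zenith opacity 0.01, a 270 K atmosphere, 5 K spillover, 20 K radiometer.
t_s = system_temperature(nu_ghz=1.4, dt_source=0.01, tau_a=0.01,
                         t_atm=270.0, t_spill=5.0, t_r=20.0)
print(f"T_s = {t_s:.2f} K")
```

With these made-up inputs the radiometer and spillover terms dominate, and the source contributes a negligible fraction of the total.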

3.6.2 Radiometers

The purpose of the simplest total-power radiometer is to measure the time-averaged power of the input noise in some well-defined radio frequency (RF) range

$$\nu_{\rm RF} - \frac{\Delta\nu}{2} \quad {\rm to} \quad \nu_{\rm RF} + \frac{\Delta\nu}{2}, \tag{3.152}$$

where $\Delta\nu$ is the receiver bandwidth. For example, the receivers used on the 300-foot telescope to make the $\lambda \approx 6$ cm continuum survey of the northern sky had a center radio frequency $\nu_{\rm RF} \approx 4.85 \times 10^9$ Hz and a bandwidth $\Delta\nu \approx 6 \times 10^8$ Hz.

The simplest radiometer (Figure 3.30) consists of four stages in series: (1) a low-loss bandpass filter that passes input noise only in the desired frequency range; (2) a square-law detector whose output voltage Vo is proportional to the square of its input voltage; that is, Vo is proportional to its input power; (3) a signal averager or integrator that smooths the rapidly fluctuating detector output; and (4) a voltmeter or other device to measure and record the smoothed voltage.

Figure 3.30: The simplest radiometer filters the broadband noise coming from the telescope, multiplies the filtered voltage by itself (square-law detection), smooths the detected voltage, and measures the smoothed voltage. The function of the detector is to convert the noise voltage, which has zero mean, to noise power, which is proportional to the square of voltage.

After passing through an input filter of width $\Delta\nu < \nu_{\rm RF}$, the noise voltage is no longer completely random; it looks more like a sine wave of frequency $\nu_{\rm RF}$ whose amplitude envelope (dashed curve in Figure 3.31) varies randomly on timescales $\Delta t \approx (\Delta\nu)^{-1} > \nu_{\rm RF}^{-1}$. The positive and negative envelopes are similar so long as $\Delta\nu \ll \nu_{\rm RF}$.

Figure 3.31: The voltage output $V(t)$ of the filter with center frequency $\nu_{\rm RF}$ and bandwidth $\Delta\nu < \nu_{\rm RF}$ is a sinusoid with frequency $\nu_{\rm RF}$ whose envelope (dashed curves) fluctuates on timescales $(\Delta\nu)^{-1} > (\nu_{\rm RF})^{-1}$.

The filtered output is sent to a square-law detector whose output voltage $V_{\rm o}$ is proportional to its input power. For a narrowband (quasi-sinusoidal) input voltage $V_{\rm i} \propto \cos(2\pi\nu_{\rm RF}t)$ at frequency $\nu_{\rm RF}$, the detector output voltage would be $V_{\rm o} \propto \cos^2(2\pi\nu_{\rm RF}t)$. This can be rewritten as $[1 + \cos(4\pi\nu_{\rm RF}t)]/2$, a function whose mean value is proportional to the average power of the input signal. In addition to the DC (zero-frequency) component there is an oscillating component at twice the input frequency $\nu_{\rm RF}$. The detector output spectrum for a finite bandwidth $\Delta\nu$ and a typical waveform is shown in Figure 3.32.

Figure 3.32: The output voltage $V_{\rm o}$ of a square-law detector (Figure 3.33) is proportional to the square of the input voltage. It is always positive, so its mean (DC, or zero-frequency component) is positive and proportional to the input power. The high frequency ($\nu \approx 2\nu_{\rm RF}$) fluctuations add no information about the source and are filtered out in the next stage.

The oscillations under the envelope approach zero every $\Delta t \approx (2\nu_{\rm RF})^{-1}$. Thus the oscillating component of the detector output is centered near the frequency $2\nu_{\rm RF}$. The detector output also has frequency components near zero (DC) because the mean output voltage is greater than zero.

Figure 3.33: The upper plot shows the output voltage $V_{\rm o}$ of a square-law detector whose input is the Gaussian noise shown by the upper plot in Figure 3.29. The output voltage histogram (lower plot) is peaked sharply near zero and has a long positive tail. The mean detected voltage $\langle V_{\rm o} \rangle$ equals the mean squared input voltage, and the rms of the detected voltage distribution is $2^{1/2}\langle V_{\rm o} \rangle$. For a full derivation of the detector output distribution and its rms, see Appendix B.6.
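
The two statistics quoted in the caption of Figure 3.33 can be checked with a small Monte Carlo experiment. This is a sketch under simple assumptions (zero-mean Gaussian input with unit rms, a fixed seed, and a hypothetical function name); it is a numerical check, not a substitute for the derivation in Appendix B.6.

```python
import math
import random

def square_law_stats(n_samples, v_rms=1.0, seed=0):
    """Square Gaussian noise voltages and return (mean, rms scatter) of the detected output."""
    rng = random.Random(seed)
    detected = [rng.gauss(0.0, v_rms) ** 2 for _ in range(n_samples)]
    mean = sum(detected) / n_samples
    variance = sum((x - mean) ** 2 for x in detected) / n_samples
    return mean, math.sqrt(variance)

mean_vo, rms_vo = square_law_stats(200_000)
print(f"mean detected output     = {mean_vo:.3f} (input mean square is 1)")
print(f"rms / mean of the output = {rms_vo / mean_vo:.3f} (sqrt(2) = {math.sqrt(2):.3f})")
```

Because the scatter of a single detected sample exceeds its mean, many samples must be averaged before a faint source can be seen.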

Both the rapidly varying component at frequencies near $2\nu_{\rm RF}$ and its envelope vary on timescales that are normally much shorter than the timescales on which the average signal power $\Delta T$ varies. The unwanted rapid variations can be suppressed by taking the arithmetic mean of the detected envelope over some timescale $\tau \gg (\Delta\nu)^{-1}$ by integrating or averaging the detector output. This integration might be done electronically by smoothing with an RC (resistance plus capacitance) filter or numerically by sampling and digitizing the detector output voltage and then computing its running mean.

Integration greatly reduces the receiver output fluctuations. In the time interval $\tau$ there are $N = (2\Delta\nu\,\tau)$ independent samples of the total noise power $T_{\rm s}$, each of which has an rms error $\sigma_T \approx 2^{1/2}\,T_{\rm s}$. The rms error in the average of $N \gg 1$ independent samples is reduced by the factor $\sqrt{N}$, so the rms receiver output fluctuation $\sigma_T$ (see Appendix B.6 for a formal derivation of this result) is only

$$\sigma_T = \frac{2^{1/2}\,T_{\rm s}}{N^{1/2}}. \tag{3.153}$$

In terms of bandwidth $\Delta\nu$ and integration time $\tau$,

$$\sigma_T \approx \frac{T_{\rm s}}{\sqrt{\Delta\nu\,\tau}} \tag{3.154}$$

after smoothing. This important equation is called the ideal radiometer equation for a total-power receiver. The central limit theorem of statistics implies that heavily smoothed ($\Delta\nu\,\tau \gg 1$) output voltages also have a nearly Gaussian amplitude distribution. The weakest detectable signals $\Delta T$ only have to be several (typically five) times the output rms $\sigma_T$ given by the radiometer equation, not several times the total system noise $T_{\rm s}$. The product $(\Delta\nu\,\tau)$ may be quite large in practice ($10^8$ is not unusual), so signals as faint as $\Delta T \approx 5 \times 10^{-4}\,T_{\rm s}$ would be detectable. Figures 3.34 and 3.35 illustrate the effects of smoothing the detector output by taking running means of lengths $N = 50$ and $N = 200$ samples.
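
The ideal radiometer equation is simple enough to evaluate directly. The sketch below uses the 300-foot survey parameters quoted in this section ($T_{\rm s} \approx 60$ K, $\Delta\nu \approx 6 \times 10^8$ Hz, $\tau \approx 0.1$ s); the function name and the five-sigma helper value are illustrative.

```python
import math

def ideal_radiometer_rms(t_sys_k, bandwidth_hz, tau_s):
    """Ideal radiometer equation: sigma_T = T_s / sqrt(bandwidth * tau)."""
    return t_sys_k / math.sqrt(bandwidth_hz * tau_s)

sigma_t = ideal_radiometer_rms(60.0, 6e8, 0.1)
# Faintest detectable signal as a fraction of T_s for bandwidth * tau = 1e8 at five sigma.
five_sigma_fraction = 5.0 * ideal_radiometer_rms(1.0, 1e8, 1.0)

print(f"sigma_T = {sigma_t * 1e3:.2f} mK for T_s = 60 K, 600 MHz bandwidth, 0.1 s")
print(f"5-sigma detection limit = {five_sigma_fraction:.1e} T_s")
```

Doubling the bandwidth or the integration time lowers $\sigma_T$ by only $\sqrt{2}$, which is why wide bandwidths and long integrations are both pursued.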

Figure 3.34: The smoothed output voltage from the integrator varies on timescale $\tau$ with small amplitude $\sigma_T$ given by the ideal radiometer equation. The top part of this figure shows the detected voltage smoothed by an $N = 50$ sample running mean, and the bottom part shows the amplitude distribution of the smoothed voltage. This amplitude distribution has mean $\langle V_{\rm o} \rangle$ and rms $(2/N)^{1/2}\langle V_{\rm o} \rangle = \langle V_{\rm o} \rangle/5$. As $N$ grows, the smoothed amplitude distribution approaches a Gaussian. The sampling theorem (Appendix A.3) states that $N = (2\Delta\nu\,\tau)$ so $(\Delta\nu\,\tau) = 25$ for this example.

Figure 3.35: When the same detector output is smoothed over $N = 200$ samples instead of $N = 50$ samples (Figure 3.34), the mean remains the same but the rms falls by a factor of $4^{1/2} = 2$ to $\langle V_{\rm o} \rangle/10$. In this example $(\Delta\nu\,\tau) = 100$.
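
The rms values quoted in Figures 3.34 and 3.35 can be reproduced approximately by simulation. For simplicity this sketch averages non-overlapping blocks of $N$ detected samples rather than computing a running mean as in the figures; the function name, seed, and number of blocks are arbitrary assumptions.

```python
import math
import random

def smoothed_rms_ratio(n_avg, n_blocks=4000, seed=1):
    """rms / mean of square-law detector output averaged over blocks of n_avg samples."""
    rng = random.Random(seed)
    block_means = []
    for _ in range(n_blocks):
        total = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n_avg))
        block_means.append(total / n_avg)
    mean = sum(block_means) / n_blocks
    variance = sum((m - mean) ** 2 for m in block_means) / n_blocks
    return math.sqrt(variance) / mean

ratio_50 = smoothed_rms_ratio(50)
ratio_200 = smoothed_rms_ratio(200)
print(f"N = 50 : rms/mean = {ratio_50:.3f}, expected (2/N)^0.5 = {math.sqrt(2 / 50):.3f}")
print(f"N = 200: rms/mean = {ratio_200:.3f}, expected (2/N)^0.5 = {math.sqrt(2 / 200):.3f}")
```

Quadrupling $N$ halves the scatter, as the radiometer equation predicts.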

3.6.3 Some Caveats

The ideal radiometer equation suggests that the sensitivity of a radio observation improves as $\tau^{1/2}$ forever. In practice, systematic errors set a floor to the noise level that can be reached. Receiver gain changes, erratic fluctuations in atmospheric emission, or “confusion” by the unresolved background of continuum radio sources usually limit the sensitivity of single-dish continuum observations.

Receiver Gain and Atmospheric Fluctuations

Radiometers contain a series of amplifiers that multiply the weak input powers $P_{\rm in} = kT_{\rm s}\Delta\nu \sim 10^{-14}$ W to milliwatt levels. The output voltage of a total-power receiver is directly proportional to the overall power gain $G$ of the receiver. If $G$ isn’t perfectly constant, the change in output voltage caused by a gain fluctuation $\Delta G$ in a practical radiometer produces a false signal whose apparent temperature

$$\sigma_G = T_{\rm s}\left(\frac{\Delta G}{G}\right) \tag{3.155}$$

is indistinguishable from a comparable change $\sigma_T$ caused by noise in an ideal radiometer. Receiver gain fluctuations and noise fluctuations are independent random processes, so their variances (the variance is the square of the rms) add, and the total receiver output fluctuation becomes

σT2 =σnoise2+σG2 (3.156)
=Ts2[1Δντ+(ΔGG)2]. (3.157)

The practical total-power radiometer equation is thus

σTTs[1Δντ+(ΔGG)2]1/2. (3.158)

Clearly, radiometer gain fluctuations will degrade the sensitivity of an observation unless

(ΔGG)1Δντ. (3.159)

For example, the 5 GHz receiver used to make the sky survey with the 300-foot telescope had $\Delta\nu \approx 6\times10^{8}$ Hz and $\tau \approx 0.1$ s, so the fractional gain fluctuations on timescales up to a few seconds (the time to scan one baseline length) had to satisfy

$$\frac{\Delta G}{G} \ll \frac{1}{\sqrt{6\times10^{8}~{\rm Hz}\cdot 0.1~{\rm s}}} = 1.3\times10^{-4}.$$ (3.160)

This is difficult to achieve in practice. Gain fluctuations typically have “1/f” power spectra, where $f$ is the postdetection frequency, so they are larger on longer timescales and increasing $\tau$ eventually results in a higher output noise level. The gain stability of a receiver is often specified by the “1/f knee” frequency $f_{\rm k}$, the postdetection frequency at which $\sigma_{\rm noise} = \sigma_G$. Integrations longer than $\tau \approx 1/(2\pi f_{\rm k})$ will likely increase the receiver output fluctuations. Depending on the stability and bandwidth of the radiometer, $1~{\rm Hz} < f_{\rm k} < 1~{\rm kHz}$.
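The practical total-power radiometer equation (Equation 3.158) and the gain-stability requirement (Equation 3.159) are easy to evaluate numerically. The sketch below uses hypothetical function names and reproduces the 300-foot telescope example from the text.

```python
import math

def practical_sigma_t(t_sys, bandwidth_hz, tau_s, frac_gain=0.0):
    """Equation 3.158: rms output fluctuation, in the same units as t_sys."""
    return t_sys * math.sqrt(1.0 / (bandwidth_hz * tau_s) + frac_gain ** 2)

def gain_stability_limit(bandwidth_hz, tau_s):
    """Right-hand side of Equation 3.159; Delta G / G must be much smaller than this."""
    return 1.0 / math.sqrt(bandwidth_hz * tau_s)

# Receiver values quoted in the text for the 5 GHz survey.
limit = gain_stability_limit(6e8, 0.1)
print(f"Delta G / G must be << {limit:.1e}")
```

Setting `frac_gain` equal to `limit` raises the output rms by a factor of 2^{1/2}, which shows why the inequality has to be a strong one.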

Figure 3.36: Block diagram of a beam-switching differential radiometer. The total-power receiver is switched between two feeds, one pointing at the source and one displaced by a few beamwidths to avoid the source but measure emission from nearly the same sample of atmosphere. The output of the total-power receiver is multiplied by +1 when the receiver is connected to the on-source feed and by -1 when it is connected to the reference feed. Fluctuations in atmospheric emission and in receiver gain are effectively suppressed for frequencies below the switching rate, which is typically in the range 10 to 1000 Hz.

Fluctuations in atmospheric emission also add to the noise in the output of a simple total-power receiver. Water vapor is the main culprit because it is not well mixed in the atmosphere, and noise from water-vapor fluctuations can be a significant problem at frequencies of 5 GHz and up.

One way to minimize the effects of fluctuations in both receiver gain and atmospheric emission is to make a differential measurement by comparing signals from two adjacent feeds. The method of switching rapidly between beams or loads is called Dicke switching after Robert Dicke, its inventor. Figure 3.36 shows the block diagram of a beam-switching Dicke radiometer. If the system temperatures are $T_1$ and $T_2$ in the two positions of the switch, then the receiver output is proportional to $T_1 - T_2 \ll T_1$ and the effect of gain fluctuations is only

$$\sigma_G \approx (T_1 - T_2)\frac{\Delta G}{G} \ll T_1\frac{\Delta G}{G}.$$ (3.161)

Likewise, the atmospheric emission in two nearly overlapping beams through the troposphere is nearly the same, so most of the tropospheric fluctuations cancel out. The main drawback with Dicke switching is that the receiver output fluctuations, relative to the source signal in a single beam, are doubled because the source signal is being received only half the time while the noise power is present all the time. The ideal radiometer equation for a Dicke switching receiver is

$$\sigma_T = \frac{2T_{\rm s}}{\sqrt{\Delta\nu\,\tau}}.$$ (3.162)
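To see when the factor-of-two penalty of Dicke switching is worth paying, one can compare the two kinds of receiver numerically. The sketch below is illustrative only: combining the noise term of Equation 3.162 with the residual gain term of Equation 3.161 in quadrature is an assumption made here by analogy with Equation 3.156, and the system temperature, temperature difference, and gain fluctuation are made-up values.

```python
import math

def total_power_sigma(t_sys, bandwidth_hz, tau_s, frac_gain):
    """Equation 3.158 for a simple total-power receiver."""
    return t_sys * math.sqrt(1.0 / (bandwidth_hz * tau_s) + frac_gain ** 2)

def dicke_sigma(t1, t2, bandwidth_hz, tau_s, frac_gain):
    """Dicke-switched receiver: doubled noise term (Eq. 3.162) plus the
    residual gain term (Eq. 3.161), added in quadrature (an assumption)."""
    noise = 2.0 * t1 / math.sqrt(bandwidth_hz * tau_s)
    gain = (t1 - t2) * frac_gain
    return math.sqrt(noise ** 2 + gain ** 2)

# Hypothetical values: T_s = 30 K, 0.1 K difference between switch positions,
# and fractional gain fluctuations of 1e-3.
print(total_power_sigma(30.0, 6e8, 0.1, 1e-3))
print(dicke_sigma(30.0, 29.9, 6e8, 0.1, 1e-3))
```

With these assumed numbers the total-power output is dominated by gain fluctuations, while the switched output is close to twice the ideal noise level and several times smaller overall.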

Confusion

Figure 3.37: A profile plot covering 45 deg$^2$ of sky imaged by the 300-foot telescope at 1.4 GHz with $\theta = 12$ arcmin resolution [25]. The ubiquitous fluctuations with rms $\sigma \approx 20$ mJy beam$^{-1}$ are caused by the superposition of numerous faint sources, not receiver noise.
Figure 3.38: The contour image [25] is a 4 deg$^2$ subset of the area shown in Figure 3.37. The contours start at 45 mJy beam$^{-1}$ $\approx 2\sigma_{\rm c}$ and are spaced by factors of $2^{1/2}$, so sources with fewer than four contours are below the $5\sigma_{\rm c}$ confusion limit. The gray-scale plot is a 1.4 GHz VLA image made with $\theta = 45$ arcsec resolution. Some of the faint “sources” seen by the 300-foot telescope are blends of two or more fainter sources resolved by the VLA.

Single-dish radio telescopes have large collecting areas but relatively broad beams at long wavelengths. Nearly all discrete continuum sources are extragalactic and extremely distant, so they are distributed randomly and isotropically on the sky. The sky-brightness fluctuations caused by numerous faint sources in every telescope beam are called confusion, and confusion usually limits the sensitivity of single-dish continuum observations at frequencies below $\nu \sim 10$ GHz. Figure 3.37 is a profile plot of confusion fluctuations in a low-resolution image. Figure 3.38 shows contours from a portion of that low-resolution image superimposed on an overlapping high-resolution gray-scale image.

Although the amplitude distribution of confusion is distinctly non-Gaussian, the “rms” confusion $\sigma_{\rm c}$ calculated by ignoring the long positive tail is widely quoted. At cm wavelengths, the rms confusion in a Gaussian telescope beam with FWHM $\theta$ is

$$\left(\frac{\sigma_{\rm c}}{\rm mJy\,beam^{-1}}\right) \approx \begin{cases} 0.2\left(\dfrac{\nu}{\rm GHz}\right)^{-0.7}\left(\dfrac{\theta}{\rm arcmin}\right)^{2} & (\theta > 0.17~{\rm arcmin}),\\[1ex] 2.2\left(\dfrac{\nu}{\rm GHz}\right)^{-0.7}\left(\dfrac{\theta}{\rm arcmin}\right)^{10/3} & (\theta < 0.17~{\rm arcmin}). \end{cases}$$ (3.163)

Individual sources fainter than the confusion limit $5\sigma_{\rm c}$ cannot be detected reliably, no matter how low the receiver noise. Most continuum observations of faint sources at frequencies below $\nu \sim 10$ GHz are made with interferometers instead of single dishes because interferometers can synthesize much smaller beamwidths $\theta$ and hence have significantly lower confusion limits.
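Equation 3.163 is straightforward to code. The sketch below (with a hypothetical function name) evaluates it for the two images described in Figures 3.37 and 3.38: the 300-foot telescope beam and the VLA beam, both at 1.4 GHz.

```python
def rms_confusion_mjy(freq_ghz: float, fwhm_arcmin: float) -> float:
    """Equation 3.163: rms confusion in mJy/beam for a Gaussian beam."""
    scale = freq_ghz ** -0.7
    if fwhm_arcmin > 0.17:
        return 0.2 * scale * fwhm_arcmin ** 2
    return 2.2 * scale * fwhm_arcmin ** (10.0 / 3.0)

single_dish = rms_confusion_mjy(1.4, 12.0)        # 12 arcmin beam
interferometer = rms_confusion_mjy(1.4, 45 / 60)  # 45 arcsec beam
print(f"{single_dish:.1f} mJy/beam, {interferometer:.3f} mJy/beam")
```

The single-dish value comes out near 23 mJy per beam, consistent with the rms of about 20 mJy per beam quoted for Figure 3.37, while the smaller interferometer beam lowers the confusion by more than two orders of magnitude.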

Confusion by steady continuum sources has a much smaller effect on observations of spectral lines or rapidly varying sources such as pulsars.

3.6.4 Superheterodyne Receivers

Few actual radiometers are as simple as those described above. Nearly all practical radiometers are superheterodyne receivers (Figure 3.39), in which the RF amplifier is followed by a mixer that multiplies the RF signal by a sine wave of frequency νLO generated by a local oscillator (LO). The product of two sine waves contains the sum and difference frequency components,

$$2\sin(2\pi\nu_{\rm LO}t)\,\sin(2\pi\nu_{\rm RF}t) = \cos[2\pi(\nu_{\rm LO}-\nu_{\rm RF})t] - \cos[2\pi(\nu_{\rm LO}+\nu_{\rm RF})t],$$ (3.164)

so the mixer acts as a frequency shifter. For example, if νLO=12GHz and νRF=9GHz, the mixer output frequency, called the intermediate frequency (IF), will be νLO-νRF=3GHz.
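A short numerical check of Equation 3.164 and of the example above is sketched here. The helper names are illustrative; frequencies are in GHz and times in ns so that the products are of order unity.

```python
import math

def mixer_products(nu_lo, nu_rf):
    """Return the (difference, sum) frequencies produced by an ideal multiplying mixer."""
    return abs(nu_lo - nu_rf), nu_lo + nu_rf

def identity_residual(nu_lo, nu_rf, t):
    """Left side minus right side of Equation 3.164 at time t; should be ~0."""
    lhs = 2.0 * math.sin(2.0 * math.pi * nu_lo * t) * math.sin(2.0 * math.pi * nu_rf * t)
    rhs = (math.cos(2.0 * math.pi * (nu_lo - nu_rf) * t)
           - math.cos(2.0 * math.pi * (nu_lo + nu_rf) * t))
    return lhs - rhs

print(mixer_products(12.0, 9.0))             # IF and sum frequency in GHz
print(identity_residual(12.0, 9.0, 0.0137))  # rounding-level residual
```

In a real receiver a filter after the mixer would pass the 3 GHz difference term and reject the 21 GHz sum term.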

The advantages of superheterodyne receivers include

  1. shifting the signals to lower frequencies $\nu_{\rm IF} < \nu_{\rm RF}$ where they are easier to amplify, transmit over long distances, filter, and digitize;

  2. tunability over a wide range of $\nu_{\rm RF}$;

  3. tuning by adjusting only the local oscillator frequency, so that the IF amplifier and back-end devices such as multichannel filter banks or digital spectrometers can all operate over fixed frequency ranges.

Figure 3.39: Block diagram of a simple superheterodyne receiver. Only the local oscillator is tuned to change the observing frequency range.

3.6.5 Spectrometers

Figure 3.40: An analog filter bank splits the broadband output of an IF amplifier into N contiguous frequency channels of width δν each. In effect, each channel is a narrowband IF amplifier whose output voltage is detected (multiplied by itself), smoothed, and recorded.

The simplest superheterodyne radiometer measures the total power in its normally broad IF passband of width $\Delta\nu$. A spectrometer is a backend that divides that passband into $N$ adjacent narrow frequency ranges of width $\delta\nu \approx \Delta\nu/N$ and simultaneously measures the power in all $N$ channels to quickly locate and resolve spectral features such as atomic and molecular lines (Section 7.1).

The most straightforward spectrometer is a filter bank of narrowband analog filters connected in parallel and with center frequencies uniformly spaced by $\delta\nu$ (Figure 3.40). Each channel acts as a separate IF and has its own detector. However, the channel gains, bandpasses, and detector responses of an analog filter bank must be very closely matched and stable to yield smooth spectral baselines, so analog filter banks with more than $N \sim 10^{2}$ channels are difficult to build and tune. Analog filter banks are also inflexible because their channel bandwidths $\delta\nu$ and numbers $N$ cannot be changed easily. Flexible spectrometers with $N \sim 10^{3}$ or even $N \sim 10^{4}$ frequency channels require digital signal processing (DSP) techniques.

For many years, most digital spectrometers were autocorrelation spectrometers using the Wiener–Khinchin theorem (Equation A.18) to compute power spectra from digitally sampled time series (see Appendix A.3) of the band-limited IF output. A sampled copy of a portion of the input radio signal is delayed by a series of progressively longer time delays, the delayed signals are multiplied with the original signal, and their products are integrated. This series of operations is an autocorrelation (Appendix A.7). If the digital samples contain only one or two bits (two or three levels) each, autocorrelation can be performed in hardware and often in a single chip, with relatively simple digital logic. These autocorrelation functions (ACFs) can be integrated to build up signal-to-noise and then finally converted into a power spectrum via a discrete Fourier transform (usually an FFT; see Appendix A.2) of the ACF via the Wiener–Khinchin theorem. Autocorrelation spectrometers allow the integration of very deep (i.e., long-duration) spectra using relatively simple digital hardware and without computing many “costly” FFTs directly on incoming Nyquist-sampled data; only one FFT is computed at the very end of the integration. Similar techniques, but using cross-correlation of the signals from different antennas, are often used to calculate spectra from radio interferometers.

With the continuing improvements in the speeds and capabilities of DSP systems, spectra are increasingly being computed directly via FFTs of a Nyquist-sampled band. The Fourier amplitudes are squared to make power spectra, and the power spectra are accumulated for deep spectral integrations. Such systems are known as Fourier transform spectrometers, and the FFTs can be computed in a variety of ways. Many recent spectrometers use Field Programmable Gate Arrays (FPGAs) to compute the FFTs, integrate, and compute polarization products, all on a single chip. Other hybrid designs use FPGAs to divide the band into coarse channels and pass those effectively Nyquist-sampled subbands off to CPUs, other FPGAs, or Graphics Processing Units (GPUs) for further processing, such as coherent dedispersion and folding of pulsar data in a pulsar back-end, or much finer frequency resolution and perhaps even active interference removal for high-sensitivity spectroscopy applications. The new VErsatile GBT Astronomical Spectrometer (VEGAS) is a hybrid Fourier transform spectrometer. The capabilities of such systems, especially given the fidelity provided by sampling with eight or more bits of precision, are making them the new standard back-end technology for radio astronomy.
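The equivalence between the two approaches can be illustrated with a toy calculation. The sketch below is not a model of any real spectrometer: it uses a 16-sample test signal, a slow pure-Python DFT in place of an FFT, and a circular autocorrelation with full-precision samples (real autocorrelators use a limited number of lags and coarsely quantized samples). All names are illustrative.

```python
import cmath
import math

def dft(x):
    """Direct O(n^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * j * k / n) for k in range(n))
            for j in range(n)]

def circular_acf(x):
    """Circular autocorrelation: sum over k of x[k] * x[k + lag]."""
    n = len(x)
    return [sum(x[k] * x[(k + lag) % n] for k in range(n)) for lag in range(n)]

def power_spectrum_direct(x):
    """Fourier transform spectrometer: square the Fourier amplitudes."""
    return [abs(c) ** 2 for c in dft(x)]

def power_spectrum_from_acf(x):
    """Autocorrelation spectrometer: transform the ACF (Wiener-Khinchin)."""
    return [c.real for c in dft(circular_acf(x))]

N = 16
signal = [math.cos(2 * math.pi * 3 * k / N) + 0.5 * math.sin(2 * math.pi * 5 * k / N)
          for k in range(N)]
direct = power_spectrum_direct(signal)
via_acf = power_spectrum_from_acf(signal)
print(max(abs(a - b) for a, b in zip(direct, via_acf)))  # rounding-level difference
```

Both routes give the same power in every channel, with the two test tones appearing in channels 3 and 5 (and their mirror images).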

3.6.6 Measuring Radiometer Noise

The radiometer itself usually contributes significantly to the total system noise temperature Tsys. Any radiometer can be modeled by an equivalent circuit consisting of an ideal noiseless radiometer plus an input matched load resistor at temperature Tr, where Tr is called the radiometer input noise temperature.

The simplest way to measure $T_{\rm r}$ is to connect a matched “hot” load resistor whose physical temperature is $T_{\rm h}$ to the radiometer input and record the detector output voltage $V_{\rm h}$, and then replace it with a “cold” load whose physical temperature is $T_{\rm c}$ and record the output voltage $V_{\rm c}$. Often the hot load is just a resistor at room temperature $T_{\rm h} \approx 290$ K and the cold load is a resistor immersed in liquid nitrogen at its boiling temperature $T_{\rm c} \approx 77$ K.

For each measurement, the square-law detector output voltage is proportional to the total input noise power generated by the actual load plus the imaginary resistor whose temperature is Tr. In the low-frequency Nyquist approximation Pν=kT, so

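$$V_{\rm h} = P_\nu\,\Delta\nu\,G = k(T_{\rm h}+T_{\rm r})\,\Delta\nu\,G,$$ (3.165)
$$V_{\rm c} = P_\nu\,\Delta\nu\,G = k(T_{\rm c}+T_{\rm r})\,\Delta\nu\,G,$$ (3.166)

where $\Delta\nu$ is the bandwidth and $G$ is the overall gain of the radiometer. Both $G$ and $\Delta\nu$ cancel out in the Y factor defined by

$$Y \equiv \frac{V_{\rm h}}{V_{\rm c}} = \frac{T_{\rm h}+T_{\rm r}}{T_{\rm c}+T_{\rm r}},$$ (3.167)

so they do not have to be measured. Equation 3.167 can be solved for the radiometer noise temperature

$$T_{\rm r} = \frac{T_{\rm h} - Y\,T_{\rm c}}{Y-1}.$$ (3.168)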

This technique for measuring Tr is called the Y-factor method.

Communications engineers often specify the radiometer noise factor Fn defined by

$$F_{\rm n} \equiv \frac{T_{\rm r}+T_0}{T_0},$$ (3.169)

where the standard temperature defined as $T_0 \equiv 290$ K is close to room temperature. The numerator in Equation 3.169 is proportional to the detected output voltage of the radiometer connected to an ambient-temperature load and the denominator is the output of a noiseless radiometer connected to an ambient-temperature load. In terms of $F_{\rm n}$, the radiometer noise temperature is

$$T_{\rm r} = (F_{\rm n}-1)\,T_0.$$ (3.170)

The related radiometer noise figure NF used by many commercial manufacturers of amplifiers and radiometers is just the noise factor $F_{\rm n}$ expressed in dB:

$${\rm NF} \equiv 10\log_{10}(F_{\rm n}).$$ (3.171)
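The Y-factor method and the noise-factor conversions reduce to a few lines of arithmetic. In the sketch below the function names and the example detector voltages are hypothetical; the default load temperatures are the room-temperature and liquid-nitrogen values mentioned in the text.

```python
import math

T0_K = 290.0  # standard temperature of Equation 3.169

def receiver_temperature(v_hot, v_cold, t_hot=290.0, t_cold=77.0):
    """Equations 3.167 and 3.168: radiometer noise temperature from the Y factor."""
    y = v_hot / v_cold
    return (t_hot - y * t_cold) / (y - 1.0)

def noise_factor(t_r):
    """Equation 3.169."""
    return (t_r + T0_K) / T0_K

def noise_figure_db(t_r):
    """Equation 3.171: the noise factor expressed in dB."""
    return 10.0 * math.log10(noise_factor(t_r))

# Made-up detector outputs in arbitrary units, chosen to correspond to T_r = 30 K.
t_r = receiver_temperature(v_hot=320.0, v_cold=107.0)
print(t_r, noise_factor(t_r), noise_figure_db(t_r))
```

Only the ratio of the two voltages matters, so the detector does not need an absolute calibration.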

3.7 Interferometers

Every practical single-dish radio telescope (Section 3.5) has relatively low angular resolution and pointing accuracy, small field-of-view, and limited sensitivity. The largest fully steerable dish has diameter $D \approx 100$ m and its angular resolution is diffraction limited to $\theta \approx \lambda/D$ radians, so impossibly large diameters would be needed to achieve sub-arcsecond resolution at radio wavelengths. Pointing and source-tracking accuracy is also a problem for a large single dish. The telescope beam should be able to follow a radio source on the sky within $\sigma \approx \theta/10$ for reasonably accurate photometry or imaging. The accuracy with which the actual beam direction during an observation can be recovered by later data analysis determines the accuracy with which the sky position of a radio source can be measured. Gravitational sagging, telescope deformations caused by differential solar heating, and torques caused by wind gusts combine to limit the mechanical tracking and pointing accuracies of the best radio telescopes to $\sigma \sim 1$ arcsec. Most optical telescopes can make high-resolution images covering large areas of sky rapidly because their large fields-of-view $\Omega_{\rm FoV} \gg \theta^2$ cover millions or billions of pixels. In contrast, most single-dish radio telescopes have only one or several beams. The geometric area of a single dish is just $\pi D^2/4$, while the geometric area $N\pi D^2/4$ of an interferometer with $N$ dishes can be arbitrarily large. The continuum sensitivity of a single dish is strongly limited by confusion at frequencies below about 10 GHz.

Aperture-synthesis interferometers comprising $N \geq 2$ moderately small dishes have mitigated these and many other practical problems associated with single dishes, such as vulnerability to fluctuations in atmospheric emission and receiver gain, radio-frequency interference, and pointing shifts caused by atmospheric refraction. For example, the Westerbork Synthesis Radio Telescope (Figure 8.3) consists of $N=14$ telescopes of diameter $D=25$ m on east–west baselines up to $b \approx 3$ km in length. Its total collecting area is that of a single dish with diameter $D_{\rm tot} = N^{1/2}D \approx 94$ m. It has the high angular resolution of a diffraction-limited telescope 3 km in diameter. It has the large instantaneous field-of-view of a 25-m telescope, so it can image $(b/D)^2 \sim 10^{4}$ pixels at once with only one receiver on each telescope. It can measure positions of radio sources with subarcsecond accuracy despite the much larger source-tracking errors of the individual telescopes.

Historically, the total bandwidths and numbers of simultaneous frequency channels of aperture-synthesis interferometers with many dishes were lower than those of single dishes. Recent advances in correlator electronics and computing have largely overcome these practical limitations, so new or updated interferometers such as ALMA (Figure 8.5) and the JVLA (Figure 8.4) are playing an increasingly dominant role in observational radio astronomy. The primary uses of single dishes today are

  (a) observing pulsars, which are time variable so they are easy to separate from confusion by time-independent continuum sources;

  (b) spectroscopic observations of extended low-brightness sources, again largely immune to confusion;

  (c) complementing interferometers by providing “zero-spacing” data on very extended sources or by serving as elements of very long baseline arrays.

3.7.1 The Two-Element Quasi-Monochromatic Interferometer

The simplest radio interferometer is a pair of radio telescopes whose voltage outputs are correlated (multiplied and averaged), and even the most elaborate interferometers with $N \gg 2$ antennas, often called elements, can be treated as $N(N-1)/2$ independent two-element interferometers.

Figure 3.41: This block diagram shows the components of a two-element quasi-monochromatic multiplying interferometer observing in a very narrow radio frequency range centered on $\nu = \omega/(2\pi)$. The unit vector $\hat{s}$ points in the direction of a distant point source, and $\vec{b}$ is the baseline vector pointing from antenna 1 to antenna 2. The output voltage $V_1$ of antenna 1 is the same as the output voltage $V_2$ of antenna 2, but it is retarded by the geometric delay $\tau_{\rm g} = \vec{b}\cdot\hat{s}/c$ representing the additional light-travel delay to antenna 1 for a plane wavefront from a source at angle $\theta$ from the baseline vector. These voltages are amplified, multiplied ($\times$), and time averaged ($\langle\,\rangle$) by the correlator to yield an output response whose amplitude $R$ is proportional to the flux density of the point source and whose phase ($\omega\tau_{\rm g}$) depends on the delay and the frequency. The quasi-sinusoidal output fringe shown occurs if the source direction in the interferometer frame is changing at a constant rate $d\theta/dt$. The broad Gaussian envelope of the fringe shows the primary-beam attenuation as the source passes through the beam of the dishes.

Figure 3.41 shows two identical dishes separated by the baseline vector $\vec{b}$ of length $b$ that points from antenna 1 to antenna 2. Both dishes point in the same direction specified by the unit vector $\hat{s}$, and $\theta$ is the angle between $\vec{b}$ and $\hat{s}$. Plane waves from a distant point source in this direction must travel an extra distance $\vec{b}\cdot\hat{s} = b\cos\theta$ to reach antenna 1, so the output of antenna 1 is the same as that of antenna 2, but it lags in time by the geometric delay

$$\tau_{\rm g} = \frac{\vec{b}\cdot\hat{s}}{c}.$$ (3.172)

For simplicity, we first consider a quasi-monochromatic interferometer, one that responds only to radiation in a very narrow band $\Delta\nu \ll 2\pi/\tau_{\rm g}$ centered on frequency $\nu = \omega/(2\pi)$. Then the output voltages of antennas 1 and 2 at time $t$ can be written as

$$V_1 = V\cos[\omega(t-\tau_{\rm g})] \quad{\rm and}\quad V_2 = V\cos(\omega t).$$ (3.173)

These output voltages are amplified versions of the antenna input voltages; they have not passed through square-law detectors. Instead, a correlator multiplies these two voltages to yield the product

$$V_1V_2 = V^2\cos[\omega(t-\tau_{\rm g})]\cos(\omega t) = \left(\frac{V^2}{2}\right)[\cos(2\omega t-\omega\tau_{\rm g}) + \cos(\omega\tau_{\rm g})]$$ (3.174)

that follows directly from the trigonometric identity $\cos x\cos y = [\cos(x+y)+\cos(x-y)]/2$. The correlator also takes a time average long enough ($\Delta t \gg (2\omega)^{-1}$) to remove the high-frequency term $\cos(2\omega t-\omega\tau_{\rm g})$ from the correlator response (output voltage) $R$ and keep only the slowly varying term

$$R = \langle V_1V_2\rangle = \left(\frac{V^2}{2}\right)\cos(\omega\tau_{\rm g}).$$ (3.175)

The voltages V1 and V2 are proportional to the electric field produced by the source multiplied by the voltage gains of the two antennas and receivers. Thus the correlator output amplitude V2/2 is proportional to the flux density S of the point source multiplied by (A1A2)1/2, where A1 and A2 are the effective collecting areas of the two antennas.
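The multiply-and-average operation of Equations 3.173 to 3.175 can be imitated directly. The sketch below is a toy model with illustrative names: it samples two noiseless cosines over a whole number of periods, so the high-frequency term of Equation 3.174 averages away almost exactly.

```python
import math

def correlator_response(v_amp, omega, tau_g, n_periods=100, samples_per_period=64):
    """Time average of V1 * V2 over an integer number of periods of omega."""
    n_samples = n_periods * samples_per_period
    dt = (2.0 * math.pi / omega) / samples_per_period
    total = 0.0
    for k in range(n_samples):
        t = k * dt
        v1 = v_amp * math.cos(omega * (t - tau_g))  # delayed antenna-1 voltage
        v2 = v_amp * math.cos(omega * t)            # antenna-2 voltage
        total += v1 * v2
    return total / n_samples

def expected_response(v_amp, omega, tau_g):
    """Equation 3.175: R = (V^2 / 2) cos(omega * tau_g)."""
    return 0.5 * v_amp ** 2 * math.cos(omega * tau_g)

omega = 2.0 * math.pi * 5.0  # arbitrary units
print(correlator_response(2.0, omega, 0.0125), expected_response(2.0, omega, 0.0125))
```

Real antenna voltages are noise-like rather than pure sinusoids, so this only illustrates the algebra, not the statistics of a real correlator.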

Notice that the time-averaged response $R$ of a multiplying interferometer is zero. There is no DC output, so fluctuations in receiver gain do not act on the whole system temperature $T_{\rm s}$ as for a total-power observation with a single dish (Equation 3.155). Uncorrelated noise power from very extended radio sources such as the cosmic microwave background and the atmosphere over the telescopes also averages to zero in the correlator response. Short interference pulses with duration $t \ll |\vec{b}|/c$ are also suppressed because each pulse does not reach both telescopes simultaneously. In these respects, a multiplying radio interferometer differs from a classical adding interferometer, such as the optical Michelson interferometer, that adds the uncorrelated noise power contributions.

The correlator output voltage R=(V2/2)cos(ωτg) varies sinusoidally as the Earth’s rotation changes the source direction relative to the baseline vector. These sinusoids are called fringes, and the fringe phase

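$$\phi = \omega\tau_{\rm g} = \frac{\omega}{c}\,b\cos\theta$$ (3.176)

depends on $\theta$ as follows:

$$\frac{d\phi}{d\theta} = \frac{\omega}{c}\,b\sin\theta$$ (3.177)
$$= 2\pi\left(\frac{b\sin\theta}{\lambda}\right).$$ (3.178)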

The fringe period $\Delta\phi = 2\pi$ corresponds to an angular shift $\Delta\theta = \lambda/(b\sin\theta)$. The fringe phase is an exquisitely sensitive measure of source position if the projected baseline $b\sin\theta$ is many wavelengths long. Note that fringe phase and hence measured source position is not affected by small tracking errors of the individual telescopes. It depends on time, and times can be measured by clocks with much higher accuracy than angles (ratios of lengths of moving telescope parts) can be measured by rulers. Also, an interferometer whose baseline is horizontal is not affected by the plane-parallel component of atmospheric refraction, which delays the signals reaching both telescopes equally. Consequently, interferometers can determine the positions of compact radio sources with unmatched accuracy, as shown in Figure 1.6. Absolute positions with errors as small as $\sigma_\theta \approx 10^{-3}$ arcsec and differential positions with errors down to $\sigma_\theta \approx 10^{-5}$ arcsec $< 10^{-10}$ rad have frequently been measured.
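To get a feeling for the numbers, the geometric delay and the fringe spacing can be evaluated for a specific case. The sketch below uses illustrative function names and an assumed example: a 3 km baseline observing at a wavelength of 21 cm, with the source perpendicular to the baseline.

```python
import math

C_M_PER_S = 299_792_458.0
ARCSEC_PER_RAD = 3600.0 * 180.0 / math.pi

def geometric_delay_s(baseline_m, theta_rad):
    """Equation 3.172 with b.s = b cos(theta)."""
    return baseline_m * math.cos(theta_rad) / C_M_PER_S

def fringe_spacing_arcsec(baseline_m, wavelength_m, theta_rad):
    """Angular shift lambda / (b sin(theta)) for one fringe period, in arcsec."""
    return wavelength_m / (baseline_m * math.sin(theta_rad)) * ARCSEC_PER_RAD

print(fringe_spacing_arcsec(3000.0, 0.21, math.pi / 2))  # about 14 arcsec
print(geometric_delay_s(3000.0, 0.0))                    # about 10 microseconds
```

A fringe phase measured to a few degrees therefore locates the source to a small fraction of this roughly 14 arcsec fringe spacing.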

If the individual antennas comprising an interferometer were isotropic, the interferometer point-source response would be a sinusoid spanning the sky. Such an interferometer is sensitive to only one Fourier component of the sky brightness distribution: the component with angular period λ/(bsinθ). The response R of a two-element interferometer with directive antennas is that sinusoid multiplied by the product of the voltage patterns of the individual antennas. Normally the two antennas are identical, so this product is the power pattern of the individual antennas and is called the primary beam of the interferometer. The primary beam is usually a Gaussian much wider than a fringe period, as indicated in Figure 3.41. The convolution theorem (Equation A.15) states that the Fourier transform of the product of two functions is the convolution of their Fourier transforms, so the interferometer with directive antennas responds to a finite range of angular frequencies centered on (bsinθ/λ). Because the antenna diameters D must be smaller than the baseline b (else the antennas would overlap), the angular frequency response cannot extend to zero and the interferometer cannot detect an isotropic source—the bulk of the 3 K cosmic microwave background for example. The missing short spacings (b<D) can be provided by a single-dish telescope with diameter D>b. Thus the D = 100 m GBT can fill in the missing baselines b<25m that the D = 25 m VLA dishes cannot obtain.

Improving the instantaneous point-source response pattern of an interferometer requires more Fourier components; that is, more baselines. An interferometer with N antennas contains N(N-1)/2 pairs of antennas, each of which is a two-element interferometer, so the instantaneous synthesized beam (the point-source response obtained by averaging the outputs of all of the two-element interferometers) rapidly approaches a Gaussian as N increases. The instantaneous point-source responses of a two-element interferometer with projected baseline length b, a three-element interferometer with three baselines (projected lengths b/3, 2b/3, and b), and a four-element interferometer with six baselines (projected lengths b/6, 2b/6, 3b/6, 4b/6, 5b/6, and b) are shown in Figure 3.42.

Figure 3.42: The instantaneous point-source responses of interferometers with overall projected length $b$ and two, three, or four antennas distributed as shown are indicated by the thick curves. The synthesized main beam of the four-element interferometer is nearly Gaussian with angular resolution $\theta \approx \lambda/b$, but the sidelobes are still significant and there is a broad negative “bowl” caused by the lack of spacings shorter than the diameter of an individual antenna. Thus the synthesized beam is sometimes called the dirty beam. The instantaneous dirty beam of the multielement interferometer is the arithmetic mean of the individual responses of its component two-element interferometers. The individual responses of the three two-element interferometers comprising the three-element interferometer and of the six two-element interferometers comprising the four-element interferometer are plotted as thin curves.

Most radio sources are stationary; that is, their brightness distributions do not change significantly on the timescales of astronomical observations. For stationary sources, a two-element interferometer with movable antennas could make N(N-1)/2 observations to duplicate one observation with an N-element interferometer.

3.7.2 Slightly Extended Sources and the Complex Correlator

The response Rc=(V2/2)cos(ωτg) of the quasi-monochromatic two-element interferometer with a “cosine” correlator (Figure 3.41 and Equation 3.175) to a spatially incoherent slightly extended (much smaller than the primary beamwidth) source with sky brightness distribution Iν(s^) near frequency ν=ω/(2π) is obtained by treating the extended source as the sum of independent point sources:

$$R_{\rm c} = \int I(\hat{s})\cos(2\pi\nu\,\vec{b}\cdot\hat{s}/c)\,d\Omega = \int I(\hat{s})\cos(2\pi\,\vec{b}\cdot\hat{s}/\lambda)\,d\Omega.$$ (3.179)

Notice that the even cosine function in this response is sensitive only to the even (inversion-symmetric) part $I_{\rm E}$ of an arbitrary source brightness distribution, which can be written as the sum of even and odd (antisymmetric) parts: $I = I_{\rm E} + I_{\rm O}$. To detect the odd part $I_{\rm O}$ we need a “sine” correlator whose output is odd, $R_{\rm s} = (V^2/2)\sin(\omega\tau_{\rm g})$. This can be implemented by a second correlator that follows a $\pi/2~{\rm rad} = 90^\circ$ phase delay inserted into the output of one antenna because $\sin(\omega\tau_{\rm g}) = \cos(\omega\tau_{\rm g}-\pi/2)$. Then

$$R_{\rm s} = \int I(\hat{s})\sin(2\pi\,\vec{b}\cdot\hat{s}/\lambda)\,d\Omega.$$ (3.180)

The combination of cosine and sine correlators is called a complex correlator because it is mathematically convenient to treat the cosines and sines as complex exponentials using Euler’s formula (Appendix B.3)

$$e^{i\phi} = \cos\phi + i\sin\phi.$$ (3.181)

The complex visibility is defined by

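$$\mathcal{V} \equiv R_{\rm c} - iR_{\rm s},$$ (3.182)

which can be written in the form

$$\mathcal{V} = Ae^{-i\phi},$$ (3.183)

where

$$A = (R_{\rm c}^2 + R_{\rm s}^2)^{1/2}$$ (3.184)

is the visibility amplitude and

$$\phi = \tan^{-1}(R_{\rm s}/R_{\rm c})$$ (3.185)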

is the visibility phase. The response to an extended source with brightness distribution I(s^) of the two-element quasi-monochromatic interferometer with a complex correlator is the complex visibility

$$\mathcal{V} = \int I(\hat{s})\exp(-i2\pi\,\vec{b}\cdot\hat{s}/\lambda)\,d\Omega.$$ (3.186)
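For a handful of point sources the integrals in Equations 3.179, 3.180, and 3.186 become sums, which makes the roles of the cosine and sine correlators easy to see. The sketch below is one dimensional and purely illustrative: each source is a hypothetical (flux, offset) pair, where the offset is the small-angle component of $\hat{s}$ along the baseline in radians, and `u` is the baseline length in wavelengths.

```python
import math

def complex_visibility(sources, u):
    """Discrete, 1-D version of Equations 3.179, 3.180, and 3.182."""
    r_c = sum(flux * math.cos(2.0 * math.pi * u * l) for flux, l in sources)
    r_s = sum(flux * math.sin(2.0 * math.pi * u * l) for flux, l in sources)
    return complex(r_c, -r_s)  # V = R_c - i R_s

def amplitude_and_phase(vis):
    """Equations 3.184 and 3.185, using atan2 to keep the correct quadrant."""
    return abs(vis), math.atan2(-vis.imag, vis.real)

# One offset point source: amplitude equals its flux, phase equals 2*pi*u*l.
print(amplitude_and_phase(complex_visibility([(2.0, 1e-4)], 1000.0)))
# A symmetric (even) pair of sources: the sine correlator output vanishes.
print(complex_visibility([(1.0, 1e-4), (1.0, -1e-4)], 1000.0))
```

The symmetric pair produces a purely real visibility, as expected for an even brightness distribution.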

3.7.3 Effects of Finite Bandwidths and Averaging Times

Equation 3.186 for quasi-monochromatic interferometers may be generalized to interferometers with finite bandwidths and integration times, which are necessary for high sensitivity. In the small but finite frequency range Δν centered on frequency νc, Equation 3.186 becomes

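$$\mathcal{V} = \int\left[\int_{\nu_{\rm c}-\Delta\nu/2}^{\nu_{\rm c}+\Delta\nu/2} I_\nu(\hat{s})\exp(-i2\pi\,\vec{b}\cdot\hat{s}/\lambda)\,d\nu\right]d\Omega$$ (3.187)
$$= \int\left[\int_{\nu_{\rm c}-\Delta\nu/2}^{\nu_{\rm c}+\Delta\nu/2} I_\nu(\hat{s})\exp(-i2\pi\nu\tau_{\rm g})\,d\nu\right]d\Omega.$$ (3.188)

If the source brightness and the response of the interferometer are nearly constant over $\Delta\nu$, the integral over frequency is just the Fourier transform of a rectangle function, so

$$\mathcal{V} \approx \int I_\nu(\hat{s})\,{\rm sinc}(\Delta\nu\,\tau_{\rm g})\exp(-i2\pi\nu_{\rm c}\tau_{\rm g})\,d\Omega.$$ (3.189)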

For a finite bandwidth $\Delta\nu$ and delay $\tau_{\rm g}$, the fringe amplitude is attenuated by the factor ${\rm sinc}(\Delta\nu\,\tau_{\rm g})$. This attenuation can be eliminated in any one direction $\hat{s}_0$ called the delay center or the phase reference position by introducing a compensating delay $\tau_0 \approx \tau_{\rm g}$ in the signal path of the “leading” antenna, as shown in Figure 3.43. As the Earth turns, $\tau_0$ must be continuously adjusted to track $\tau_{\rm g}$ within a tolerance $|\tau_0-\tau_{\rm g}| \ll (\Delta\nu)^{-1}$. This is usually done with digital electronics.

Figure 3.43: The compensating delay $\tau_0$, shown here as an extra loop of cable between antenna 2 and the correlator, must track the geometric delay $\tau_{\rm g}$ in the direction $\hat{s}_0$ of the delay center accurately enough to keep $|\tau_0-\tau_{\rm g}| \ll (\Delta\nu)^{-1}$ in order to minimize attenuation.

The geometric delay varies with direction, so delay compensation can be exact in only one direction. The angular radius $\Delta\theta$ of the usable field-of-view is determined by the variation of $\tau_{\rm g}$ with offset $\Delta\theta$ from the direction $\hat{s}_0$. Because $c\tau_{\rm g} = \vec{b}\cdot\hat{s} = b\cos\theta$, $|c\,\Delta\tau_{\rm g}| = b\sin\theta\,\Delta\theta$. Requiring

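$$\Delta\nu\,\Delta\tau_{\rm g} \ll 1$$ (3.190)

implies

$$\Delta\nu\,(b\sin\theta)\,\Delta\theta/c \ll 1.$$ (3.191)

Substituting $\lambda\nu = c$ and using $\theta_{\rm s} \approx \lambda/(b\sin\theta)$ for the synthesized beamwidth, we get the requirement

$$\Delta\theta \ll \theta_{\rm s}\frac{\nu}{\Delta\nu}.$$ (3.192)

At larger angular offsets $\Delta\theta$ from the phase reference position, bandwidth smearing will radially broaden the synthesized beam by convolving it with a rectangle of angular width $\Delta\theta\,\Delta\nu/\nu$.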

Satisfactory wide-field images can be made with a larger total bandwidth only by dividing that bandwidth into a number of narrower frequency channels each satisfying Equation 3.192. For example, the synthesized beamwidth of the VLA “B” configuration (maximum baseline length $b \approx 10$ km) at $\lambda = 20$ cm ($\nu = 1.5$ GHz) is $\theta_{\rm s} \approx [(0.2~{\rm m})/(10^{4}~{\rm m})]$ rad $\approx 4$ arcsec. To image out to an angular radius $\Delta\theta = 15~{\rm arcmin} = 900~{\rm arcsec}$ equal to the half-power radius of the VLA primary beam requires channel bandwidths

$$\Delta\nu \ll \frac{\nu\,\theta_{\rm s}}{\Delta\theta} = 1.5\times10^{9}~{\rm Hz}\times\frac{4~{\rm arcsec}}{900~{\rm arcsec}} \approx 7~{\rm MHz}.$$ (3.193)
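The bandwidth-smearing limit and the sinc attenuation of Equation 3.189 can be sketched as follows. The function names are illustrative, and the code assumes the normalized convention ${\rm sinc}(x) = \sin(\pi x)/(\pi x)$, which is what the Fourier transform of a unit-area rectangle function gives.

```python
import math

def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1 (assumed convention)."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def bandwidth_attenuation(bandwidth_hz, delay_error_s):
    """Fringe-amplitude attenuation factor sinc(delta_nu * delay)."""
    return sinc(bandwidth_hz * delay_error_s)

def channel_bandwidth_limit_hz(freq_hz, beam_arcsec, radius_arcsec):
    """Right-hand side of Equation 3.193; channels must be much narrower than this."""
    return freq_hz * beam_arcsec / radius_arcsec

print(channel_bandwidth_limit_hz(1.5e9, 4.0, 900.0))  # about 6.7e6 Hz
print(bandwidth_attenuation(7e6, 1.0e-8))             # nearly 1 for a 10 ns delay error
```

The attenuation falls to zero when the uncompensated delay equals $1/\Delta\nu$, which is why the delay tracking tolerance scales inversely with bandwidth.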

Likewise, the correlator averaging time $\Delta t$ must be kept short enough that the Earth’s rotation will not move the source position in the frame of the interferometer by as much as the synthesized beamwidth $\theta_{\rm s} \approx \lambda/b$. For example, if the delay is set to track the north celestial pole, a source $\Delta\theta$ away from the north pole will appear to move at an angular rate $2\pi\Delta\theta/P$, where $P \approx 23^{\rm h}56^{\rm m}04^{\rm s} \approx 86164$ s is the Earth’s sidereal rotation period. Excessive correlator averaging times will cause time smearing that tangentially broadens the synthesized beam. To minimize time smearing in an image of angular radius $\Delta\theta$, we require

$$\frac{2\pi\,\Delta t}{P} \approx \frac{\Delta t}{1.37\times10^{4}~{\rm s}} \ll \frac{\theta_{\rm s}}{\Delta\theta}.$$ (3.194)

Continuing with the previous example, to image out to an angular radius $\Delta\theta = 900$ arcsec when $\theta_{\rm s} = 4$ arcsec requires averaging times $\Delta t$ short enough that

$$\Delta t \ll \frac{\theta_{\rm s}}{\Delta\theta}\,1.37\times10^{4}~{\rm s} = \frac{4~{\rm arcsec}}{900~{\rm arcsec}}\,1.37\times10^{4}~{\rm s} \approx 60~{\rm s}.$$ (3.195)
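The time-smearing limit of Equations 3.194 and 3.195 follows the same pattern as the bandwidth limit. A minimal sketch with illustrative names:

```python
import math

SIDEREAL_DAY_S = 86164.0  # Earth's sidereal rotation period P from the text

def averaging_time_limit_s(beam_arcsec, radius_arcsec):
    """Right-hand side of Equation 3.195: (theta_s / delta_theta) * P / (2 pi)."""
    return (beam_arcsec / radius_arcsec) * SIDEREAL_DAY_S / (2.0 * math.pi)

print(SIDEREAL_DAY_S / (2.0 * math.pi))      # about 1.37e4 s
print(averaging_time_limit_s(4.0, 900.0))    # about 60 s
```

Because the limit is a strong inequality, practical averaging times would be chosen well below the roughly 60 s returned here.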

3.7.4 Earth-Rotation Aperture Synthesis

The Earth’s rotation varies the projected baseline coverage of an interferometer whose elements are fixed on the ground. In particular, all baselines of an interferometer whose baselines are confined to an east–west line will remain in a single plane perpendicular to the Earth’s north–south rotation axis as the Earth turns daily. Confining all baselines to two dimensions has the computational advantage that the brightness distribution of a source is simply the two-dimensional Fourier transform of the measured visibilities.

Figure 3.44 illustrates Earth-rotation aperture synthesis by an east–west two-element interferometer at latitude $+40^\circ$ as viewed from a source at declination $\delta = +30^\circ$. Let $u$ be the east–west component of the projected baseline in wavelengths and $v$ be the north–south component of the projected baseline in wavelengths.

Figure 3.44: Viewed from a distant radio source, at declination $\delta = +30^\circ$ for this drawing, the Earth rotates counterclockwise with a period of one sidereal day about the north–south axis indicated by the arrow emerging from the north pole. The antennas of a two-element east–west interferometer at latitude $+40^\circ$ are shown, from left to right, as they would appear at hour angles $-6^{\rm h}$, $-3^{\rm h}$, $0^{\rm h}$, $+3^{\rm h}$, and $+6^{\rm h}$. Projected onto the plane of the page, which is normal to the line of sight, the interferometer baseline rotates continuously from purely north–south at $-6^{\rm h}$ through east–west at $0^{\rm h}$ and back to north–south at $+6^{\rm h}$. The projected antenna separation also changes. During this 12-hour period, the projected baseline traces an ellipse in the $(u,v)$ plane as shown by the dashed curve, with points on the $(u,v)$ ellipse highlighting the instantaneous coverage at $-6^{\rm h}$, $-3^{\rm h}$, $0^{\rm h}$, $+3^{\rm h}$, and $+6^{\rm h}$. The $v$-axis of the ellipse is smaller by a factor $\sin\delta$ than the $u$-axis.

During the 12-hour period centered on source transit, the interferometer traces out a complete ellipse on the $(u,v)$ plane. The maximum value of $u$ equals the actual antenna separation in wavelengths, and the maximum value of $v$ is smaller by the projection factor $\sin\delta$, where $\delta$ is the source declination. If the interferometer has more than two elements, or if the spacing of the two elements is changed daily, the $(u,v)$ coverage will become a number of concentric ellipses having the same shape. Thus the synthesized beam obtained by east–west Earth-rotation aperture synthesis can approach an elliptical Gaussian. The synthesized beamwidth is $u^{-1}$ radians east–west and $u^{-1}\csc\delta$ radians in the north–south direction. The synthesized beam is circular for a source near the celestial pole, but the north–south beamwidth is very large for a source near the celestial equator.
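The $(u,v)$ ellipse traced by an east–west baseline can be generated from the standard projection $u = L\cos H$, $v = L\sin\delta\,\sin H$, where $L$ is the antenna separation in wavelengths and $H$ is the hour angle. This parametrization is not written out in the text above; it is assumed here because it reproduces the behavior described (purely east–west at $H = 0$, purely north–south at $H = \pm 6^{\rm h}$, and a $v$-axis shorter by $\sin\delta$). The sign convention for $v$ and the function name are also assumptions.

```python
import math

def ew_uv_track(baseline_wavelengths, dec_deg, hour_angles_h):
    """(u, v) points for an east-west baseline of length L (in wavelengths)."""
    sin_dec = math.sin(math.radians(dec_deg))
    track = []
    for h in hour_angles_h:
        ha = math.radians(15.0 * h)  # 1 hour of hour angle = 15 degrees
        u = baseline_wavelengths * math.cos(ha)
        v = baseline_wavelengths * sin_dec * math.sin(ha)
        track.append((u, v))
    return track

# Hypothetical 1000-wavelength baseline, source at declination +30 degrees.
for u, v in ew_uv_track(1000.0, 30.0, [-6, -3, 0, 3, 6]):
    print(f"u = {u:8.1f}, v = {v:8.1f}")
```

Because the visibility of a real source at $(-u,-v)$ is the complex conjugate of that at $(u,v)$, the half ellipse sampled in 12 hours supplies the complete ellipse.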

3.7.5 Interferometers in Three Dimensions

The VLA (Very Large Array) shown in Figure 8.4 is Y-shaped and is instantaneously a nearly coplanar two-dimensional array of 27 25-m telescopes on the high Plains of San Augustin in New Mexico. Its baselines are not confined to an east–west line, but the array is nearly coplanar, so “snapshot” observations much shorter than a sidereal day can be treated as two dimensional. On longer timescales, Earth rotation causes the VLA baselines to fill a three-dimensional volume. The north–south baselines allow imaging with a nearly circular synthesized beam even near the celestial equator. Figure 8.4 shows the “D” configuration spanning about 1 km. The telescopes can be moved along railroad tracks to form the “C”, “B”, and “A” configurations spanning 3.4, 11, and 36 km, respectively, for higher angular resolution. The VLA recently underwent a major upgrade to become the JVLA (the “J” stands for “Jansky”), with new wideband receivers completely covering the frequency range 1 to 50 GHz and a far more powerful and versatile correlator. It is up to an order of magnitude more sensitive than the original narrow-band VLA.

The (u,v,w) coordinate system used to describe any baseline vector b in three dimensions is shown in Figure 3.45. The w-axis is in the reference direction $\hat{s}_0$, usually chosen to contain the target radio source. The u- and v-axes point east and north in the (u,v) plane normal to the w-axis. u, v, and w are the components of b/λ, the baseline vector in wavelength units. An arbitrary unit vector $\hat{s}$ has components (l,m,n) as drawn, where $n = \cos\theta = (1-l^2-m^2)^{1/2}$. The components (l,m,n) are called direction cosines.

Figure 3.45: The (u,v,w) coordinate system for interferometers. The w-axis points in the reference direction $\hat{s}_0$ usually containing the source to be imaged. Projected onto the plane normal to the w-axis, u is the east–west baseline in wavelengths and v is the north–south baseline in wavelengths. l, m, and n are projections of the unit vector $\hat{s}$ onto the u-, v-, and w-axes, respectively.

Because

$$d\Omega = \frac{dl\,dm}{(1-l^2-m^2)^{1/2}},$$ (3.196)

the three-dimensional generalization of Equation 3.186 is

$$\mathcal{V}(u,v,w) = \int\!\!\int \frac{I_\nu(l,m)}{(1-l^2-m^2)^{1/2}} \exp[-i2\pi(ul+vm+wn)]\,dl\,dm.$$ (3.197)

This is not a three-dimensional Fourier transform.

However, if w=0, Equation 3.197 becomes a two-dimensional Fourier transform, which can be inverted to give the source brightness distribution in terms of the measured visibilities:

$$\frac{I_\nu(l,m)}{(1-l^2-m^2)^{1/2}} = \int\!\!\int \mathcal{V}(u,v,0) \exp[+i2\pi(ul+vm)]\,du\,dv.$$ (3.198)

That is the case for Earth-rotation aperture synthesis by an east–west interferometer if we choose $\hat{s}_0$ to coincide with the Earth’s rotation axis, in which case $(1-l^2-m^2)^{1/2} = \cos\theta = \sin\delta$, where δ is the declination of the reference position.

For any interferometer, if we consider only directions close to $\hat{s}_0$, then $n = \cos\theta \approx 1 - \theta^2/2$ and

$$\mathcal{V}(u,v,w) \approx \exp(-i2\pi w) \int\!\!\int \frac{I_\nu(l,m)}{(1-l^2-m^2)^{1/2}} \exp[-i2\pi(ul+vm-w\theta^2/2)]\,dl\,dm.$$ (3.199)

The factor $\exp(+i2\pi w\theta^2/2)$ can be kept close to unity by keeping $w\theta^2 \ll 1$; that is, by imaging only a small field of view whose radius is $\theta \ll w^{-1/2} \sim (\lambda/b)^{1/2}$. For example, $\theta \ll 0.01$ radians is sufficiently small for an interferometer baseline $10^4$ wavelengths long. Then

$$\mathcal{V}\exp(i2\pi w) = \int\!\!\int \frac{I_\nu(l,m)}{(1-l^2-m^2)^{1/2}} \exp[-i2\pi(ul+vm)]\,dl\,dm.$$ (3.200)

A field wider than $\theta \sim w^{-1/2}$ can be imaged with two-dimensional Fourier transforms by breaking it up into smaller facets, much like a fly’s eye, and merging the facets to make the final image.
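The small-field condition can be checked with a few lines of arithmetic. This sketch evaluates the magnitude of the residual phase π w θ² left over from the w term in Equation 3.199, and inverts it to find the field radius allowed by a chosen phase tolerance. The function names and the 0.1 rad tolerance are assumptions made for illustration; the text only requires w θ² ≪ 1.

```python
import math

def w_term_phase(w, theta):
    """Magnitude (radians) of the residual phase 2*pi*w*theta**2/2 in Equation 3.199."""
    return math.pi * w * theta ** 2

def max_field_radius(w, max_phase):
    """Field radius (radians) at which the residual phase reaches max_phase."""
    return math.sqrt(max_phase / (math.pi * w))

w = 1.0e4  # baseline component along the line of sight, in wavelengths
for theta in (0.01, 0.003, 0.001):
    print(f"theta = {theta:.3f} rad: phase = {w_term_phase(w, theta):.3f} rad")

# Assumed tolerance of 0.1 rad, chosen only for illustration.
theta_max = max_field_radius(w, 0.1)
print(f"radius for 0.1 rad phase error: {theta_max:.2e} rad")
```

At θ = 0.01 rad the residual phase for w = 10⁴ is π rad, which is why the field radius must be much smaller than 0.01 rad rather than merely comparable to it.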

3.7.6 Sensitivity

The point-source sensitivity of a two-element interferometer can be derived from the radiometer equation for a total-power receiver on a single antenna because a square-law detector is equivalent to a correlator multiplying two identical input voltages supplied by one antenna. Consider an interferometer with two identical elements, each of which also has a square-law detector, observing a point source. The correlator multiplies the voltages from the two antennas, while each square-law detector multiplies the voltage from one antenna by itself, so the correlated/detected output voltages of the interferometer and each single dish are equal in strength. Thus the effective collecting area $A_e$ of the two-element interferometer equals the effective collecting area of each element. However, the noise voltages from the two interferometer elements are almost completely uncorrelated (only the point source contributes correlated noise), while the noise voltages going into the square-law detectors are completely correlated (identical). The correlator output voltage distribution before smoothing is shown in Figure 3.46, and Figure 3.47 shows the correlator output voltage distribution after smoothing over N=50 samples. In the limit where the antenna temperature ΔT contributed by the point source is much smaller than the system noise temperature $T_s$, the correlator output noise is a factor of $2^{1/2}$ lower than the square-law detector noise from each antenna. For an unpolarized point source of flux density S, $k\,\Delta T = S A_e/2$, so for a single antenna,

$$\sigma_S = \frac{2kT_s}{A_e(\Delta\nu\,\tau)^{1/2}}$$ (3.201)

and for a two-element interferometer,

$$\sigma_S = \frac{2^{1/2}kT_s}{A_e(\Delta\nu\,\tau)^{1/2}}.$$ (3.202)

The point-source sensitivity of a two-element interferometer is therefore $2^{1/2}$ times better than the sensitivity of each antenna, but $2^{1/2}$ times worse than that of a single dish whose area is that of two antennas. The reason the two-element interferometer is less sensitive than a single dish having the same total collecting area is that the information contained in the two independent square-law detector outputs has been discarded. Together they have $2^{1/2}$ times the sensitivity of a single dish. Combined with the independent correlator output, the total sensitivity is $(2+2)^{1/2} = 2$ times the sensitivity of a single dish, or exactly the sensitivity of a single dish whose area equals the total area of the two-element interferometer.

Figure 3.46: The unsmoothed output voltage of a correlator whose inputs are uncorrelated Gaussian noise has a symmetric distribution with zero mean, and the rms fluctuation is a factor $2^{1/2}$ times smaller than that of a square-law detector (Figure 3.33).

Figure 3.47: The smoothed output voltage of a correlator approaches a Gaussian with zero mean, and the rms noise is reduced by the square root of the number of independent samples averaged together. This figure shows noise from an N=50 sample running mean. The rms fluctuation is a factor $2^{1/2}$ times smaller than that of a square-law detector (Figure 3.34).
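The noise statistics described in these two captions can be reproduced with a small Monte Carlo experiment. The sketch below assumes unit-variance Gaussian noise voltages and, for simplicity, averages non-overlapping blocks of 50 samples rather than forming the running mean used for Figure 3.47. The sample count and random seed are arbitrary choices made for this example.

```python
import math
import random

def rms(values):
    """Root-mean-square deviation of the values about their mean."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

random.seed(1)       # arbitrary seed, for reproducibility only
n_samples = 200_000  # arbitrary number of voltage samples
n_smooth = 50        # samples averaged together, as in Figure 3.47

# Two independent (uncorrelated) unit-variance Gaussian noise voltages.
x = [random.gauss(0.0, 1.0) for _ in range(n_samples)]
y = [random.gauss(0.0, 1.0) for _ in range(n_samples)]

square_law = [a * a for a in x]             # one antenna multiplied by itself
correlator = [a * b for a, b in zip(x, y)]  # two antennas multiplied together

ratio = rms(square_law) / rms(correlator)

# Averages of non-overlapping blocks (an assumption; the figure uses a running mean).
smoothed = [sum(correlator[i:i + n_smooth]) / n_smooth
            for i in range(0, n_samples, n_smooth)]
smoothing_gain = rms(correlator) / rms(smoothed)

print(f"rms(square law) / rms(correlator) = {ratio:.3f}")
print(f"rms reduction after averaging {n_smooth} samples = {smoothing_gain:.2f}")
```

Within the Monte Carlo scatter, the first ratio should be close to 2^(1/2) and the second close to 50^(1/2), since the product of two independent unit-variance Gaussians has unit variance while the square of one has variance 2.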

An interferometer with N dishes contains N(N-1)/2 independent two-element interferometers. So long as the signal from each dish can be amplified coherently before it is split up to be multiplied by the signals from the N-1 other antennas, its point-source rms noise is

$$\sigma_S = \frac{2kT_s}{A_e[N(N-1)\,\Delta\nu\,\tau]^{1/2}}.$$ (3.203)

In the limit of large N, $[N(N-1)]^{1/2} \approx N$ and the point-source sensitivity of an interferometer approaches that of a single antenna whose area equals the total effective area $NA_e$ of the N interferometer antennas. For example, the VLA with N=27 dishes each d=25 m in diameter has the point-source sensitivity of a single dish whose diameter is $D = [N(N-1)]^{1/4}d = [27(26)]^{1/4} \times 25\,\mathrm{m} \approx 129\,\mathrm{m}$. Had the square-law detector outputs been used as well, the point-source sensitivity of the N-element interferometer would be exactly the same as the sensitivity of a single dish having the same total collecting area.
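As a numerical check, Equation 3.203 and the equivalent single-dish diameter can be evaluated directly. In this sketch the function names are illustrative, and the system temperature, effective area, bandwidth, and integration time are placeholder values chosen only to exercise the formula. They are not actual VLA specifications.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K

def array_point_source_rms(t_sys, a_eff, n_dishes, bandwidth_hz, tau_s):
    """Point-source rms noise from Equation 3.203, in W m^-2 Hz^-1."""
    pairs_term = n_dishes * (n_dishes - 1)
    return 2.0 * K_BOLTZMANN * t_sys / (a_eff * math.sqrt(pairs_term * bandwidth_hz * tau_s))

def equivalent_dish_diameter(n_dishes, dish_diameter_m):
    """Diameter of a single dish with the same point-source sensitivity."""
    return (n_dishes * (n_dishes - 1)) ** 0.25 * dish_diameter_m

# Equivalent diameter for the VLA example in the text (N = 27, d = 25 m).
d_equiv = equivalent_dish_diameter(27, 25.0)
print(f"equivalent diameter = {d_equiv:.1f} m")

# Placeholder receiver parameters (assumptions, not VLA values).
sigma_s = array_point_source_rms(t_sys=30.0, a_eff=300.0, n_dishes=27,
                                 bandwidth_hz=1.0e8, tau_s=60.0)
print(f"sigma_S = {sigma_s / 1e-26 * 1e6:.1f} microJy")
```

Setting `n_dishes=2` reduces the expression to Equation 3.202, which is a convenient consistency check on the N(N-1) factor.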

Practical interferometers are slightly less sensitive than this because their correlators use digital multipliers that sample and quantize the input voltage, not perfect analog multipliers. For example, a digital multiplier that samples at twice the Nyquist rate with three quantization levels (-1,0,+1) is only 0.89 times as sensitive as a perfect analog multiplier. The chapter “Digital Signal Processing” in Thompson et al. [106] covers this and other consequences of quantization in detail.

Although the point-source sensitivity of an interferometer is comparable with the point-source sensitivity of a single dish having the same total area, beware that the brightness sensitivity of an interferometer is much worse because the synthesized beam solid angle of an interferometer is much smaller than the beam solid angle of a single dish of the same total effective area. The angular resolution of an interferometer with maximum baseline b is approximately λ/b and the angular resolution of the single dish with diameter D is approximately λ/D, so the beam solid angle of the interferometer is smaller by a factor $(D/b)^2$. This is roughly the area filling factor of the interferometer, defined as the ratio of the area covered by all of the antennas to the area spanned by the interferometer array. For example, the VLA in its b ≈ 11 km “B” configuration has a filling factor $(129\,\mathrm{m}/1.1\times10^4\,\mathrm{m})^2 \approx 1.4\times10^{-4}$. A high-resolution interferometer cannot detect a source of low surface brightness, no matter how high its total flux density.
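The filling-factor estimate is simple enough to evaluate directly. This short sketch computes (D/b)² for the 129 m equivalent diameter and the 11 km "B" configuration quoted in the text; the function name is an illustrative assumption.

```python
def filling_factor(equivalent_diameter_m, max_baseline_m):
    """Approximate area filling factor (D/b)^2 of an interferometer array."""
    return (equivalent_diameter_m / max_baseline_m) ** 2

factor = filling_factor(129.0, 1.1e4)
print(f"filling factor = {factor:.2e}")
```

Because the beam solid angle scales with this factor, the same ratio gives the approximate loss in brightness sensitivity relative to a filled aperture of equal collecting area.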

The intensity axis of any astronomical image has dimensions of spectral brightness or specific intensity (e.g., units of Jy per beam solid angle or MJy sr$^{-1}$ or K), not flux density (e.g., Jy). The point-source rms $\sigma_S$ in Equation 3.203 corresponds to image flux density per beam solid angle, e.g., Jy beam$^{-1}$. Published radio images usually have intensity axes in units of Jy beam$^{-1}$ because the flux density of a point source equals its brightness in those units and because $\sigma_S$ is independent of beam solid angle. However, a proper spectral brightness depends only on the source. The “spectral brightness” specified in Jy beam$^{-1}$ has the dimensions of spectral brightness, but beware that this is not a proper spectral brightness because it depends on the synthesized beam solid angle and not just on the radio source. Infrared astronomers frequently specify image intensity in MJy sr$^{-1}$, which is a proper brightness. The brightness temperature T is a convenient proper brightness for radio images. The rms brightness-temperature sensitivity $\sigma_T$ of an image made with beam solid angle $\Omega_A$ follows directly from $\sigma_S$ and the Rayleigh–Jeans approximation:

$$\sigma_T = \left(\frac{\sigma_S}{\Omega_A}\right)\frac{\lambda^2}{2k}.$$ (3.204)

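Most interferometer images are restored with Gaussian beams. The beam solid angle (Equation 3.34) of a Gaussian beam with HPBW $\theta_{\rm HPBW}$ is (Equation 3.118)

$$\Omega_A = \frac{\pi\,\theta_{\rm HPBW}^2}{4\ln 2},$$

so

$$\sigma_T = \left(\frac{2\ln 2\;c^2}{\pi k\nu^2}\right)\frac{\sigma_S}{\theta_{\rm HPBW}^2}.$$ (3.205)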

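For example, all of the 1.4 GHz NRAO VLA Sky Survey (NVSS) images have rms noise $\sigma_S \approx 0.45$ mJy beam$^{-1}$ and were restored with a circular Gaussian beam whose half-power beamwidth is $\theta_{\rm HPBW} = 45\,\mathrm{arcsec} \approx 2.18\times10^{-4}$ rad. Consequently, NVSS rms brightness temperature noise is

$$\sigma_T \approx \left[\frac{2\ln 2\,(3\times10^{8}\,\mathrm{m\,s^{-1}})^2}{\pi \cdot 1.38\times10^{-23}\,\mathrm{J\,K^{-1}}\,(1.4\times10^{9}\,\mathrm{Hz})^2}\right]\frac{0.45\times10^{-29}\,\mathrm{W\,m^{-2}\,Hz^{-1}}}{(2.18\times10^{-4}\,\mathrm{rad})^2} \approx 0.14\,\mathrm{K}.$$

This is good enough to detect ($5\sigma_T \approx 0.7$ K) normal spiral galaxies with median $T_b \sim 1$ K at 1.4 GHz. Beware that a high-resolution (low $\Omega_A$) image with a good point-source sensitivity (low $\sigma_S$) may still have a poor brightness-temperature sensitivity (high $\sigma_T$).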
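The NVSS example can be reproduced by coding Equation 3.205 directly. This is a minimal sketch in which the function name and the choice of input units (Jy per beam, arcseconds, Hz) are assumptions made for convenience.

```python
import math

C_LIGHT = 2.99792458e8      # m/s
K_BOLTZMANN = 1.380649e-23  # J/K
JANSKY = 1.0e-26            # W m^-2 Hz^-1
ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arcsecond

def brightness_temperature_rms(sigma_s_jy, theta_hpbw_arcsec, freq_hz):
    """Rms brightness temperature (K) from Equation 3.205 for a Gaussian beam."""
    sigma_s = sigma_s_jy * JANSKY
    theta = theta_hpbw_arcsec * ARCSEC
    coefficient = 2.0 * math.log(2.0) * C_LIGHT ** 2 / (math.pi * K_BOLTZMANN * freq_hz ** 2)
    return coefficient * sigma_s / theta ** 2

# NVSS values from the text: 0.45 mJy/beam, 45 arcsec beam, 1.4 GHz.
sigma_t = brightness_temperature_rms(0.45e-3, 45.0, 1.4e9)
print(f"NVSS sigma_T = {sigma_t:.2f} K")
```

Because $\sigma_T$ scales as the inverse square of the beamwidth, halving the beamwidth at fixed $\sigma_S$ quadruples the brightness-temperature noise, which is the caution expressed in the closing sentence above.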