HERA Memo #49: Mapping HERA's Primary Beam

Tyler Cox, Arizona State University, 8/10/18

One of the challenges of observing highly redshifted 21 cm emission is separating the signal from bright foreground sources. Simulations have found that when these bright foreground sources are far from HERA's center, their flux is significantly reduced due to beam attenuation and they become difficult to separate from the 21 cm signal. Knowledge of the shape of the beam at different frequencies will allow for precise calibration of the array, which will help to isolate the 21 cm cosmological signal from bright foregrounds. The goal of this notebook is to observe HERA's beam response at different frequency ranges by using IDR2.1 data to observe foreground point sources as they drift across the beam.

1. Imaging IDR2.1 Data

To map HERA's primary beam by tracking sources as they move across the beam, we need to image a set of HERA data that is sensitive enough that we can accurately measure of flux of sources as they move across the beam. The data set that we have decided to image to make this measurement is the IDR2.1 data release. H1C IDR 2.1 contains pre-calibrated, LST-binned data from observations made from 18 nearly-consecutive nights of data. Since this data has been binned over many days of observation, it should provide the sensitivity needed to accurately measure the flux of the sources we will choose to track across the beam.

One change that needed to be made to this data before making beam measurements from the images was splitting each data file from one 20 minute time integration into 10 two-minute time integrations. The reason for doing this was to create more images with less time passage between subsequent images, thus creating a smoother beam measurement for each source. Splitting the images into two-minute time integrations does reduce the sensitivity of the measurements, but they should still allow us to make accurate beam measurements. The package pyuvdata was used to split these 20 minute time integrations into 10 two-minute time integrations.

Once we've created a set of two-minute time integration data, we image by using CASA to flag all of the autocorrelations and run the CLEAN command below on all two time integrations:

clean(infile, outfile, niter = 6000, weighting = 'briggs', robust = '-0.5', 
      imsize = [512,512], cell = ['250arcsec'], mode = 'mfs', nterms = 1)

We also image the data in bandwidths of 10 MHz (110-120 MHz, 120-130 MHz, ...) and 20 MHz (110-130 MHz, 130-150 MHz, ...), so that we can measure the beam response at different frequencies. This is done by changing the spectral window in the CLEAN command:


# Example CLEAN command for imaging data using 180-190 MHz frequencies

clean(infile, outfile, niter = 6000, weighting = 'briggs', robust = '-0.5', imsize = [512,512], 
      cell = ['250arcsec'], mode = 'mfs', nterms = 1, spw = '0:819~912')

Now that we've imaged all of the time split IDR2.1 data at different frequency ranges, we use these images to make beam measurements of sources as they track across the beam.

2. Tracking Foreground Sources

With the images generated for each set of frequencies, we can start tracking bright foreground sources as they drift across the beam. To find these sources, the HERA_calibration_sources code is used. This package uses the TGSS ADR catalogue to find radio sources within a right ascension and declination range and some minimum flux. Once the desired values are entered, the name of the source, its right ascension, declination, and total flux are returned in a pandas dataframe.

Using this package, we can isolate bright sources within a narrow declination range over the entire right ascension range that was observed in IDR 2.1. We keep the declination range narrow so that the final observed beam map can be compared to a cut from the beam model centered at HERA's zenith declination.

In [1]:
from glob import glob
import numpy as np
import matplotlib.pyplot as plt
from astropy.io import fits
from astropy import units as u
from astropy.wcs import WCS
from HERA_calibration_sources import add_fluxes
from pyuvdata import UVBeam
import healpy as hp

import warnings
from astropy.utils.exceptions import AstropyWarning
warnings.simplefilter('ignore', category=AstropyWarning)

plt.rc('font', family='serif')
plt.rc('xtick', labelsize='small')
plt.rc('ytick', labelsize='small')
In [2]:
tb = add_fluxes(RA_range=('00:00:00','13:00:00'),dec_range=1, min_flux=6)
tb
Out[2]:
Name of Center RA Dec Total flux in region
0 TGSSADR J020012.1-305327 30.050440 -30.891060 25311.699249
1 TGSSADR J043300.3-295615 68.251488 -29.937559 10452.800014
2 TGSSADR J043736.5-295406 69.402321 -29.901720 9057.500076
3 TGSSADR J045826.4-300720 74.610413 -30.122379 19984.100246
4 TGSSADR J045514.2-300650 73.809464 -30.113991 23645.400387
5 TGSSADR J051330.6-303042 78.377602 -30.511909 18364.499821
6 TGSSADR J073819.7-302505 114.582474 -30.418230 14347.499702
7 TGSSADR J082126.4-301105 125.360031 -30.184980 20358.700781

The function below is where we track the flux sources found in the HERA_calibration_sources code as they move across the beam. For each image, the function loops through the list of sources to find the flux of the source if it exists in the image. We find whether or not the source is in the image by converting the source's right ascension and declination into a pixel value using astropy. The flux found at this pixel value is then divided by its true total flux, taken from the TGSS ADR catalog, to obtain a beam measurement at a particular angle from HERA's zenith. These values are then plotted as a function of angle from the telescope's zenith and compared to the beam model.

In [3]:
def track_sources(fits_files, sources):
    '''
    Creates a beam measurement for each source by measuring their flux in fits images generated from HERA data 
    and dividing by their true total flux.
    
    Parameters
    ----------
    fits_files : array-like
        Fits files generated from the imaging pipeline to take flux measurements from
    
    sources : dict
        Dictionary of sources containing information about their positions and total flux
        generated by HERA_calibration_sources. Used to track sources to make beam measurements
    
    Returns
    -------
    plot_data : dict
        Dictionary containing the zenith angle and corresponding beam measurement for each source. 
        The key is the source name.
    
    '''
    
    plot_data = {v['Name of Center']: {'RA_diff': [], 'Flux': []} for _,v in tb.iterrows()}
    
    for f in fits_files:
        with fits.open(f) as HDU:
            
            
            # Read in file information

            fits_info = HDU[0].header
            ax1        = fits_info['NAXIS1']
            ax2        = fits_info['NAXIS2']
            c_ra       = fits_info['OBSRA']
            c_dec      = fits_info['OBSDEC']
            pix_size   = abs(fits_info['CDELT1'])

            # Flip to data to correct the axes
            data = HDU[0].data
            data = np.flip(data[0][0],axis=0)

            w = WCS(f)

            for _,src in sources.iterrows():

                # Convert a position to pixel values
                idx_1,idx_2 = w.all_world2pix(src['RA'],src['Dec'],0,0,0)[:2]

                if not np.isnan(idx_1) and not np.isnan(idx_2):
                    idx_1 = int(idx_1)
                    idx_2 = int(ax2-idx_2)


                # ax1 is the right-most pixel, ax2 is the bottom-most pixel
                # we want to make sure that the both values are within the
                # correct range

                if (0 < idx_1 < ax1) and (0 < idx_2 < ax2):

                    flux = np.abs(data[idx_2,idx_1])
                    plot_data[src['Name of Center']]['Flux'].append(flux/src['Total flux in region'])
                    
                    ra_diff = c_ra-src['RA']

                    # Correct for the difference in c_ra error
                    if ra_diff > pix_size*ax1:
                        ra_diff -= 360
                    if ra_diff < -pix_size*ax1:
                        ra_diff += 360

                    plot_data[src['Name of Center']]['RA_diff'].append(ra_diff)
                    
    return plot_data

With the function below, we combine the beam measurements made from the functions above into a single beam measurement. This is done by gridding the measurements by zenith angle and averaging overlapping measurements made at each angle.

In [4]:
def observed_beam(cleaned_set):
    '''
    
    Creates a single beam cut from beam measurements made for each source
    
    Parameters
    ----------
    cleaned_set : dict
        Dictionary of normal flux values and their corresponding angle from zenith for each source.
        Obtained from the track_sources function.
    
    Returns
    -------
    theta : array-like
        Zenith angles for the final beam measurement
        
    obs_beam : array-like
        Single beam measurement compiled from individual measurements
        
    beam_std : array-like
        Standard deviation of the beam for each zenith angle measure
        
    '''
    n = 0
    x = []
    y = []

    min_x = 30
    max_x = -30

    for _,src in plot_data.iteritems():
        x.append(src['RA_diff'])
        flux = np.array(src['Flux'])
        y.append(flux)
        if len(src['Flux']) > 0:
            if len(src['Flux']) > n:
                n = len(src['Flux'])
            if min(src['RA_diff']) < min_x:
                min_x = min(src['RA_diff'])
            if max(src['RA_diff']) > max_x:
                max_x = max(src['RA_diff'])

    bins = np.linspace(min_x, max_x, n)
    vals = np.zeros(n)
    counts = np.ones(n)
    std_arr = [[] for _ in range(n)]

    for i, d in enumerate(x):
        idx = np.digitize(d, bins) - 1
        vals[idx] += y[i]
        counts[idx] += 1
        for m,n in enumerate(idx):
            std_arr[n].append(y[i][m])

    obs_beam = vals/counts
    obs_beam /= obs_beam.max()
    obs_beam = 10*np.log10(obs_beam)
    beam_std = np.array([np.std(10*np.log10(std)) for std in std_arr])
    return bins, obs_beam, beam_std

Using the two functions above with the source finding code, we can create a single beam measurement. We use images generated using the frequencies 110-130 MHz to make our first beam measurement. If this is done correctly, we hope to see a beam measurement that appears to be roughly gaussian looking at the zenith angles we will be measuring out to.

In [5]:
freq_range = '110-130MHz'
freq_ends = [110,130]

files = np.sort(glob('/data6/HERA/HERA_imaging/IDR2/'+freq_range+'/imgs/*fits'))
plot_data = track_sources(files,tb)

theta, obs_beam, obs_std = observed_beam(plot_data)
In [6]:
plt.figure(figsize=(15,8))

for key,item in plot_data.items():
    flux = np.array(item['Flux'])
    plt.plot(item['RA_diff'],10*np.log10(flux),label=key)
    
plt.xlabel(r'Beam Angle $(^\circ)$')
plt.ylabel(r'Amplitude (dBi)')
plt.legend()
plt.grid()
plt.show()
In [7]:
plt.figure(figsize=(12,7))
plt.xlabel(r'Beam Angle $(^\circ)$')
plt.ylabel('Amplitude (dB)')
plt.title('Measured Beam Sensitivity: ')
plt.errorbar(theta[4:-4], obs_beam[4:-4], yerr=obs_std[4:-4])
plt.grid()
plt.show()

The beam measurement above looks roughly like what we were expecting. The real test will comparing the beam measurement to a simulation of HERA's beam at these frequencies. We can now look into modeling HERA's beam at the range of frequencies used in this measurement, 110-130 MHz.

3. Modeling HERA's Beam

To model HERA's beam, which we will use to compare to our beam measurement, we will use a electromagnetic simulation of HERA's antenna created by Nicolas Fagnoni. This simulation contains the beam response at every integer frequency that HERA is currently capable of observing. We'll start by importing this simulation using the UVBeam object in pyuvdata.

In [8]:
hera_beam = UVBeam() #Instantiating HERA's beam as a UVBeam object
hera_beam.read_beamfits("NF_HERA_power_beam_healpix.fits") #Reading in the beam model

Now, we define a function to find HERA's beam at a specific frequency. We can use this function to find the shape of the beam over a range of frequencies by averaging beam models over a particular frequency range.

In [9]:
def hera_beam_freq_slice(hera_beam, freq, beam_center = 90, ang_from_center = 15, n = 100):
    '''
    Returns a beam cut at a defined frequency
    
    Parameters
    ----------
    hera_beam : UVBeam
        HERA beam model simulation read into a UVBeam object
    freq: int
        Frequency to pull from simulation
    beam_center: float, optional
        Zenith declination, set to 90 degrees in the simulation
    ang_from_center: int
        Angle from zenith to map the simulated beam out to
    n: int
        Number of points in the plot 
    
    Returns
    -------
    ang : array-like
        Angle from zenith for each point in the beam model
    beam_slice: array-like
        Beam model at the given frequency and angle range
    
    '''
    ang = np.linspace(beam_center-ang_from_center,beam_center+ang_from_center,n)
    beam_slice     = []
    beam_map       = hera_beam.data_array[0][0][0][freq-100]
    beam_map_norm  = beam_map/np.max(beam_map) 
    
    for i in ang:
        beam_slice.append(hp.get_interp_val(beam_map_norm,0, i, lonlat=True))  
    
    return ang, np.array(beam_slice)

To make sure that this function works, we can plot a few beam models to see they look similar to the plots made in beam simulation memo.

In [10]:
freqs = np.array([100,120,140,160,180,200])
model_beam = np.zeros(100)

fig, axs = plt.subplots(2,3, figsize=(15, 8), facecolor='w', edgecolor='k',sharex='col')
fig.subplots_adjust(hspace = .3)

axs = axs.ravel()

for i,freq in enumerate(freqs):
    ang, beam = hera_beam_freq_slice(hera_beam, freq, ang_from_center=45, n=100)
    model_beam = 10*np.log10(beam)
    axs[i].plot(ang-90,model_beam,'k--')
    axs[i].set_title('Beam Model: ' + str(freq) + ' MHz')
    axs[i].grid()
    if i > 2:
        axs[i].set_xlabel('Beam Angle $(^\circ)$')
    if i == 0 or i == 3:
        axs[i].set_ylabel('Amplitude (dBi)')


plt.show()

We can also plot a model averaged over a frequency range against models at a single frequency to make sure the averaged model is consistent with the single frequency models.

In [11]:
freqs = np.arange(110,131)
model_beam = np.zeros(200)

plt.figure(figsize=(15,8))

for freq in freqs:
    ang, beam = hera_beam_freq_slice(hera_beam, freq, ang_from_center=90, n=200)
    model_beam += beam
    if freq%5 == 0:
        plt.plot(ang-90,10*np.log10(beam),label=(str(freq) + ' MHz'))
        
model_beam /= freqs.shape[0]
model_beam = 10*np.log10(model_beam)

plt.xlabel(r'Beam Angle $(^\circ)$')
plt.ylabel(r'Amplitude (dBi)')
plt.title(r'Beam Model: 110-130 MHz')
plt.plot(ang-90,model_beam,'k--', label = '110-130 MHz')
plt.grid()
plt.legend()
plt.show()

4. Comparing the Beam Measurement and the Model

With the measurement and simulation both made, we can now plot the simulation against the model for the frequency range 110-130 MHz. We can also plot all of the 20 MHz and 10 MHz bandwith beam measurements to see how they compare to the models.

In [12]:
freqs = np.arange(freq_ends[0],freq_ends[1])
model_beam = []

for freq in freqs:
    ang, beam = hera_beam_freq_slice(hera_beam, freq, ang_from_center=15, n=100)
    model_beam.append(beam)

model_beam = np.mean(model_beam, axis = 0)
model_beam /= model_beam.max()
model_beam = 10*np.log10(model_beam)
In [13]:
plt.figure(figsize=(12,7))

plt.plot(ang-90,model_beam,'k--',label=(r'Model: ' + freq_range))
plt.errorbar(theta[5:-3],obs_beam[4:-4],yerr=obs_std[4:-4],label='Data: ' + freq_range)
plt.legend()
plt.title('Model vs. Observed Extragalactic Sources: ' + freq_range)
plt.xlabel(r'Beam Angle ($^\circ$)')
plt.ylabel('Amplitude (dB)')
plt.grid()
plt.show()

As we can see above, the beam map created from measuring the flux of the sources matches the model well out to a zenith angle of ~$12^\circ$. It's possible that the noise level is being hit and that is what's causing the beam measurement to flatten out past ~$12^\circ$ from zenith, but this is something that will require more work.

Now that we've successfully produced a beam measurement, we can use the same procedure as above for the rest of the 20 MHz bandwidth images.

In [14]:
freq_range = ['110-130MHz','130-150MHz','150-170MHz','170-190MHz']
freq_ends = [[110,130],[130,150],[150,170],[170,190]]

fig, axs = plt.subplots(2,2, figsize=(15, 10), facecolor='w', edgecolor='k')
fig.subplots_adjust(hspace = .3)

axs = axs.ravel()

for i,freq_set in enumerate(freq_range):
    
    
    # Find the files and track the sources
    
    files = np.sort(glob('/data6/HERA/HERA_imaging/IDR2/' + freq_set + '/imgs/*fits'))
    plot_data = track_sources(files,tb)
        
    # Model the beam for a range of frequencies
    
    freqs = np.arange(freq_ends[i][0],freq_ends[i][1]+1)
    model_beam = []
    
    for freq in freqs:
        ang, beam = hera_beam_freq_slice(hera_beam, freq, ang_from_center=15, n=100)
        model_beam.append(beam)
        
    model_beam = np.mean(model_beam, axis = 0)
    model_beam /= model_beam.max()
    model_beam = 10*np.log10(model_beam)
    
    
    # Find the measured beam and standard deviation for error bars
    
    theta, obs_beam, obs_std = observed_beam(plot_data)
    
    
    # Plot the beam and model
    
    axs[i].plot(ang-90,model_beam,'k--',label=(r'Model: ' + freq_set))
    axs[i].errorbar(theta[5:-3],obs_beam[4:-4],yerr=obs_std[4:-4],label='Data: ' + freq_set)
    axs[i].set_title(freq_set)
    axs[i].grid()
    axs[i].set_ylim([-25,5])
    axs[i].set_xlabel(r'Beam Angle ($^\circ$)')
    axs[i].set_ylabel(r'Amplitude (dB)')
    
plt.show()

The beam measurements in the plots above also seem to match well out to ~$12^\circ$. We can also recreate the plots above for 10 MHz bandwidth images beginning at 110 MHz and ending at 190 MHz.

In [15]:
freq_range = ['110-120MHz','120-130MHz',
              '130-140MHz','140-150MHz',
              '150-160MHz','160-170MHz',
              '170-180MHz','180-190MHz']

freq_ends = [[110,120], [120,130],
             [130,140], [140,150],
             [150,160], [160,170],
             [170,180], [180,190]]

fig, axs = plt.subplots(4,2, figsize=(15, 20), facecolor='w', edgecolor='k')
fig.subplots_adjust(hspace = .3)

axs = axs.ravel()

for i,freq_set in enumerate(freq_range):
    
    # Find the files and track the sources
    
    files = np.sort(glob('/data6/HERA/HERA_imaging/IDR2/' + freq_set + '/imgs/*fits'))
    plot_data = track_sources(files,tb)
    
    # Model the beam for a range of frequencies
    
    freqs = np.arange(freq_ends[i][0],freq_ends[i][1]+1)
    model_beam = []
    
    for freq in freqs:
        ang, beam = hera_beam_freq_slice(hera_beam, freq, ang_from_center=15, n=100)
        model_beam.append(beam)
        
    model_beam = np.mean(model_beam, axis = 0)
    model_beam /= model_beam.max()
    model_beam = 10*np.log10(model_beam)
    
    
    # Find the observed beam and standard deviation for error bars
    
    theta, obs_beam, obs_std = observed_beam(plot_data)
    
    
    # Shift the higher frequency plots down to account for the center dip 
    
    if i == 5:
        obs_beam -= 0.5
    
    if i == 6:
        obs_beam -= 0.75
        
    if i == 7:
        obs_beam -= 1.25
    
    
    # Plot the beam and model
    
    axs[i].plot(ang-90,model_beam,'k--',label=(r'Model: ' + freq_set))
    axs[i].errorbar(theta[5:-3],obs_beam[4:-4],yerr=obs_std[4:-4],label='Data: ' + freq_set)
    axs[i].set_title(freq_set)
    axs[i].grid()
    axs[i].set_ylim([-25,5])
    axs[i].set_xlabel(r'Beam Angle ($^\circ$)')
    
    if i % 2 == 0:
        axs[i].set_ylabel(r'Amplitude (dBi)')

plt.show()  

We can see in this set of plots that the beam measurements match the model well, but not quite as well as 20 MHz bandwidth images. One interesting observation that can be made from these plots is that at higher frequencies there are dips in the beam measurements near zenith that doesn't appear in the 20 MHz bandwidth measurements. This may be a result of the weighting that was used for imaging the data, which is something that will require further investigation.

4. Future Work

  • Investigate the cause of the dips near the beam center in the higher frequency 10 MHz bandwidth beam measurements
    • Why don't the same dips appear in 20 MHz bandwidth beam measurements
  • Current IDR2.1 images are using a briggs weighting parameter in CASA's CLEAN command that is shifted more closely to uniform
    • What might the effect of imaging with uniform and natural weighting have on the shape of the observed beam.
  • Model the noise level for certain frequencies to test whether or not we are actually hitting the noise level
    • Simple comparison might be using the rms of noise on the edge of the image to see if the value is consistent with the "noise" level we see in the beam measurement