The Wide-field VLBA Calibrator Survey: WFCS

Published 2020 December 7 © 2020. The American Astronomical Society. All rights reserved.
, , Citation Leonid Petrov 2021 AJ 161 14 DOI 10.3847/1538-3881/abc4e1

Download Article PDF
DownloadArticle ePub

You need an eReader or compatible software to experience the benefits of the ePub3 file format.

1538-3881/161/1/14

Abstract

This paper presents the results of the largest very long baseline interferometry (VLBI) absolute astrometry campaign to date of 13,645 radio source observations with the Very Long Baseline Array. Of these, 7220 have been detected, including 6755 target sources that have never been observed with VLBI before. This makes the present VLBI catalog the largest ever published. The positions of the target sources have been determined with the median uncertainty of 1.7 mas, and 15,542 images of 7171 sources have been generated. Unlike previous absolute radio astrometry campaigns, observations were made at 4.3 and 7.6 GHz simultaneously using a single wide-band receiver. Because of the fine spectral and time resolutions, the field of view was 4'–8'—much greater than the 10''–20'' in previous surveys. This made possible the use of input catalogs with low position accuracy and the detection of a compact component in extended sources. Unlike previous absolute astrometry campaigns, both steep- and flat-spectrum sources were observed. The observations were scheduled in the so-called filler mode to fill the gaps between other high-priority programs. This was achieved by the development of the totally automatic scheduling procedure.

Export citation and abstract BibTeX RIS

1. Introduction

The method of very long baseline interferometry (VLBI) first proposed by Matveenko et al. (1965) provides very high angular resolution. It was quickly realized that VLBI is a powerful tool for geodesy and astronomy. The first catalog of source coordinates determined with the VLBI contained 35 objects (Cohen & Shaffer 1971). Since then, continuous efforts have been put into extending the source list and improvement of accuracy. Absolute astrometry and geodesy programs in the 20th century at 8.6 and 2.3 GHz (X and S bands) using the Mark3 recording system resulted in the ICRF1 catalog of 608 sources (Ma et al. 1998). Later, thousands of sources were observed with the Very Long Baseline Array (VLBA), the Long Baseline Array (LBA) in the Southern Hemisphere, and the Chinese VLBI Network in a number of dedicated absolute astrometry programs: the VLBI Calibrator Survey (VCS; Beasley et al. 2002; Fomalont et al. 2003; Petrov et al. 2005, 2006, 2008; Kovalev et al. 2007; Gordon et al. 2016), the VLBA Imaging and Polarimetry Survey (Helmboldt et al. 2007; Petrov & Taylor 2011), the VLBA Galactic plane Survey (VGaPS; Petrov et al. 2011a), the Long Baseline Array Calibrator Survey (Petrov et al. 2011b, 2019a), the Ecliptic Plane Survey (Shu et al. 2017), the VLBA regular geodesy RDV program (Petrov et al. 2009), and several other programs (Immer et al. 2011; Petrov 2011, 2012, 2013; Popkov et al. 2020).

The goal of these programs was to build a catalog of positions of the most compact components in active galactic nuclei (AGNs) with a nanoradian level of accuracy (1 nrad ≈0.2 mas). Such a catalog is used for imaging of weak sources with phase referencing, as targets for geodetic VLBI observations, for space navigation, for associations of γ-ray sources (Petrov et al. 2013; Schinzel et al. 2015, 2017), and for fundamental physics (Lambert & Le Poncin-Lafitte 2011). Until recently, the method of VLBI was the most accurate. Gaia Data release 2 (Lindegren et al. 2018) demonstrated the accuracy on par with or better than VLBI. However, a detailed analysis of the differences (Petrov & Kovalev 2017a; Petrov et al. 2019b) showed that a fraction of matching sources has statistically significant position offsets along the AGN jet directions. Petrov & Kovalev (2017a), Kovalev et al. (2017), and Plavin et al. (2019) presented convincing argumentation in support of a claim that such offsets are not due to errors in VLBI or optical Gaia catalog, but is a manifestation of milliarcsecond-scale optical jets that shift the centroid position. As a result, Petrov & Kovalev (2017b) concluded that the high accuracy of optical catalogs cannot be transferred to the radio range beyond the 1–10 mas level, and positions derived from the analysis of dedicated VLBI observations are necessary for applications that require higher position accuracy.

Despite the total number of compact radio sources with positions derived from VLBI observations surpassing 7000 by 2013 January, the density of calibrator sources was not high enough to ensure that there is a good calibrator within 2°–3° of any direction. Therefore, a program for the densification of the grid of compact radio sources with precisely determined coordinates was proposed. The goals of the program were

  • 1.  
    To increase the density of calibrator sources in the areas at δ > −40° where their density was lower than that on average, in particular to have at least one calibrator within any field of view of PanSTARRS (Chambers et al. 2016; disk of 1fdg5 radius) or LSST (disk of 1fdg75 radius). The program was formulated and observed before the Gaia data release. It was not known at that time how useful Gaia astrometry of faint sources of 15–20 mag can be. In the absence of Gaia astrometry, the presence of several objects with positions known with the accuracy better than 1 mas could be used as calibrators and boost the accuracy of the PanSTARRS source position catalogs.
  • 2.  
    To reach completeness over the 95% level at the 150 mJy level to perform a study of the population of compact radio sources.
  • 3.  
    To study the population of steep-spectrum sources. The population of steep-spectrum sources is poorly studied due to a heavy selection bias in prior surveys toward the so-called flat-spectrum sources, i.e., the sources with spectral index α > −0.5 defined as S ∼ f+α, where S is the flux density and f is the frequency.
  • 4.  
    To reobserve the sources with large radio-to-optical position offsets.

We consider a source to be a calibrator if it can be detected with signal-to-noise ratio (S/N) > 10 at baselines longer 5000 km for 30 s integration. The S/N is defined as the ratio of the fringe amplitude to the mean amplitude of the noise. Sources brighter than 15–20 mJy satisfy this criteria, provided they are observed at the 4 Gbps recording mode at the network with sensitivity similar to that of the VLBA, i.e., with the system effective flux density (SEFD) in the 250–400 Jy range.

The objectives of the program were (1) to determine the coordinates of target sources with a milliarcsecond level of accuracy and (2) to synthesize images of all detected sources. The catalog has been available online since 2013 April 19 even before the campaign was completed. Because the campaign used a number of novel techniques, it is necessary to describe them in depth. The technology of the VLBI surveys is the main focus of this article. The design of the campaign, the results of the pilot programs, the source selection, and the scheduling algorithm are discussed in Section 2. The postcorrelator, astrometric, and imaging data analyses are described in Sections 3 and 4. The catalog is presented in Section 5 followed by the summary.

2. Observations

2.1. Selection of Frequencies

The objectives of the program determine the choice of receiver and recording mode. To get a high position accuracy, observations should be done at two bands simultaneously. The combination of group-delay observables at upper and lower frequencies,

Equation (1)

is not affected by the ionosphere, as the ionospheric path delay is reciprocal to the square of frequency. Here, τu and τl are group delays at the upper and lower frequencies, respectively, and fu and fl are the effective frequencies that are close to the central frequencies of the recorded bands. In 2013–2016, when the experiments were observed, the VLBA supported two options of dual-band single-polarization observations at a 2 Gbps recording rate: (1) simultaneous 2.2–2.4 GHz (S-band) and 8.4–8.9 GHz (X-band) observations using a dichroic plate that sends the signal to two receivers and (2) recording at two remote wings of the broadband 4–8 GHz receiver.

In 2013, when the program started, there was no technical capability to record the entire band. The band is split into 16 subbands, 32 MHz wide, hereafter called intermediate frequencies (IF). The hardware imposes certain restrictions on frequency selection. In particular, the subbands should be assigned to two groups, and each group should have the spanned bandwidth not exceeding 480 MHz. The placement of IFs within subgroups affects the accuracy of group-delay computation and the probability of picking up a secondary maximum during the fringe fitting processes. The frequency setup used in this campaign is presented in Table 1. The highest secondary maximum of the delay resolution function for this sequence is 0.678 at 2.7 ns, and the uncertainty of the group delay is 90.9 ps at S/N = 10.

Table 1.  The Frequency Sequence of the Low Edges of the 32 MHz-wide IFs Used in the Campaign

Low Band Upper Band
(GHz) (GHz)
4.128 7.392
4.160 7.424
4.192 7.456
4.224 7.552
4.416 7.744
4.512 7.776
4.544 7.808
4.576 7.840

Download table as:  ASCIITypeset image

I ran several pilot projects. In the first one, I examined the VLBA performance at different frequency setups using the 4–8 GHz receiver, and in the second, I examined the differences in the ionosphere total electron contents (TEC) derived from quasi-simultaneous 4.3/7.6 and 2.2/8.4 GHz observations. Test observations have confirmed no noticeable degradation of sensitivity with respect to the more frequently used 4.9–6.6 GHz part of the band.

The use of dual-band observations increases the uncertainty of the ionosphere-free combination of lower- and upper-band with respect to single-band obserables:

Equation (2)

The wider the frequency separation, the lower the increase in the uncertainty with respect to a single-band observation at the upper band. Therefore, at a given S/N, the group-delay uncertainty from the data collected with the broadband C-band receiver is worse than the group-delay uncertainty from the data collected with the S/X band receiver. However, the sensitivity of the C-band receiver is higher than the sensitivity of the S/X receiver. According to the National Radio Astronomy Observatory (NRAO) gain monitoring program,1 the SEFD in the zenith direction of the pietown antenna in 2015 was 190, 250, and 280 Jy at 4.9, 6.6, and 8.4 GHz, respectively. If this is taken into account, the group-delay uncertainty from the data collected with the broadband C band is worse than that from the S/X receiver by a factor of 1.26 for flat-spectrum sources and by a factor 1.22 for sources with the spectral index of −0.5, which is typical for the program sources. However, the sensitivity at 4.3 GHz is a factor of 1.47 better than that at 8.4 GHz for a flat-spectrum source and is a factor of 2.06 better for a source with spectral index −0.5. If a source is weak, it may not be detected at all if the antenna sensitivity is not high enough.

Another factor that affects the choice of frequencies used is the presence of radio interference (RFI). The RFI is worst at 2.2–2.4 GHz. It reduces the usable band to less than 140 MHz, requires considerable effort to edit the data, and poses the risk of losing some observations. At the same time, no serious RFIs were reported within 4–8 GHz in 2013, except for the presence of narrowband signals (bandwidth less than 100 kHz) at 4.2 and 7.8 GHz frequencies that are due to a leakage from synthesizers.

I consider the improvement of the detection limit by a factor of 1.5–2.1 and the RFI alleviation to be more important than a 22%–26% improvement in source position accuracy for strong objects.

2.2. The Field of View of the Survey

The FWHM of VLBA antennas is 10' at 4.3 GHz and 5farcm8 at 7.6 GHz (see Figure 1). However, for traditional observations with accumulation period lengths 2 s and spectral resolutions of 128 channels per IF that are very often used as the default for processing VLBI observations, the field of view of a radio interferometer is significantly narrower. The field of view of the VLBA with these settings is limited to 10''–20''. I define the field of view as the area of the sensitivity reduction at a level not exceeding 50% with respect to the pointing direction.

Figure 1.

Figure 1. The primary beam attenuation of a 25 m VLBA antenna at 4.3 (upper green curve) and 7.6 GHz (lower blue curve).

Standard image High-resolution image

There are four factors that affect the field of view:

  • 1.  
    Antenna primary beam. All VLA and VLBA antennas are identical 25 m dishes. The primary beam power diagram of an ideal antenna is described by the Airy pattern (e.g., Born & Wolf 1999),
    Equation (3)
    where J1(x) is the Bessel function of the first order, and x is the normalized distance from the center of the field πD/λθ, where D is the antenna diameter, λ is the wavelength, and θ is the offset with respect to the pointing direction.The presence of the obstructing secondary mirror, the quadrapod, and the deviation of the antenna surface from the parabaloid cause a departure of the beam pattern from expression (3). Even when the beam power diagram is known precisely, pointing errors cause errors in the computation of the primary beam attenuation.This effect cannot be mitigated for an antenna of a given size, but the amplitude loss can be calibrated and taken into account during data analysis.
  • 2.  
    Tapering. The DiFX correlator (Deller et al. 2007, 2011) used for data analysis is of the FX type. Input data stream segments are shifted according to the a priori station-based delays, Fourier-transformed, cross-multiplied, and accumulated. If the a posteriori delay is the same as the a priori delay, all data in the input segments are used for cross-multiplication and accumulation. If the a posteriori delay differs from the a priori delay, one input data stream is shifted with respect to another, and therefore, only a fraction of the data is cross-multiplied and accumulated. If the shift exceeds the segment length, no data can be accumulated at all. Because the data were recorded at the Nyquist frequency without overlapping, the share of the accumulated data is
    Equation (4)
    where Δτ is the residual delay, N is the segment length (1024 for this survey), and Bsr is the sampling rate (64 megasamples s−1).The antennas are pointed to the direction where a source is expected, and the correlator uses the a priori path delay computed for these directions. If the source is located at $\alpha +{\rm{\Delta }}\alpha ,\delta +{\rm{\Delta }}\delta $ position, i.e, ${\rm{\Delta }}\alpha ,{\rm{\Delta }}\delta $ off the a priori position $\alpha ,\delta $, the path delay is incremented by ∂τ/∂αΔα + ∂τ/∂δΔδ, and therefore, the fringe amplitude is reduced according to Equation (4). The array loses its ability to detect a source even if Lt > 0, when the S/N is reduced by Lt to a level below the detection threshold. An obvious way to mitigate tapering is to increase N and therefore, to increase of the spectral resolution of visibilities, $2{B}_{\mathrm{IF}}/N$, where BIF is the IF bandwidth. This results in the growth of the correlator output size that was considered undesirable in the past and made the wide-field VLBI unpopular. Advances in computer hardware have made wide-field VLBI affordable.
  • 3.  
    Time smearing. Although the correlator used 15.625 μs long segments for this campaign, the visibilities are averaged over longer accumulation periods. Averaging visibilities over a finite time causes decorrelation at the edges of the time intervals. It can be easily shown (e.g., Thompson et al. 2017) that the fringe amplitude-loss factor due to time smearing is
    Equation (5)
    where Δt is the accumulation period during correlation and f0 is the reference frequency. However, amplitude losses due to time smearing can be mitigated by reducing accumulation period lengths. This also increases the output data set size.
  • 4.  
    Nonlinearity of the fringe phase. The fringe search procedure assumes the fringe phase varies linearly over a scan with both time and frequency (Petrov et al. 2011a). The fringe phase at the accumulation period i and the frequency channel j is expressed as
    Equation (6)
    where ω is an angular frequency, and τp and τg are the phase delay and group delay. The contribution of the third mixed delay derivatives, namely ${\partial }^{3}\tau /{\partial }^{2}t\partial \alpha $ and ${\partial }^{3}\tau /{\partial }^{2}t\partial \delta $, causes a quadratic term in the dependence of the fringe phase on time. It may become significant if the position offset is large and a scan is long. In the first approximation, the time delay is expressed via the baseline vector ${{\boldsymbol{r}}}_{1}-{{\boldsymbol{r}}}_{2}$ and the unit source position vector $\underline{{\boldsymbol{S}}}$ up to terms O(1/c2) as
    Equation (7)
    where $\widehat{{ \mathcal E }}$ is the Earth rotation matrix. Differentiating it over time twice, we get an expression for $\ddot{\tau }$:
    Equation (8)

The maximum value of $\ddot{\tau }$ for a baseline with Earth's radius is $1/c\,{R}_{\oplus }{{\rm{\Omega }}}_{\oplus }^{2}$, or about 10−10 s−1. The maximum values of ${\partial }^{3}\tau /{\partial }^{2}t\partial \delta $ and ${\partial }^{3}\tau /{\partial }^{2}t\partial \alpha $ are close to $\ddot{\tau }$. Let us consider a source observed at 7.6 GHz in a 60 s long scan that is 10−3 rad (3farcm3) off the pointing direction at a baseline with the length of Earth's radius. Then, the maximum magnitude of the phase curvature over the scan will be $2\pi f/c\,{R}_{\oplus }\,{{\rm{\Omega }}}_{\oplus }^{2}\,{(t/2)}^{2}/2\approx 2$ radians. The contribution for a 2 minute long scan will be 8 radians, i.e., over one phase turn.

The fringe amplitude loss due to the nonlinear phase variation can be mitigated by iterations when a preliminary source position is determined at the first iteration, and then that position is used for computation of the nonlinear phase variation that is subtracted before the next iteration of fringe fitting. As the nonlinear phase variation due to the error in the a priori source position is proportional to the baseline length, a source can still be detected at short baselines. The accuracy of its positions derived from these observations is sufficient to compute precisely the quadratic term in fringe phases.

There are two approaches for an increase of the field of view: (1) to increase the spectral and time resolutions during correlation and (2) to perform multiple correlation passes with different phase centers at the expected source positions (Middelberg et al. 2013). The first approach is more straightforward but generates a large amount of data that requires significant computing resources after correlation. The second approach requires more resources during correlation. The DiFX correlator implements this approach very efficiently. This approach has an advantage if we have the a priori knowledge that one or several sources are located within 10''–20'' of the specific positions within the primary beam of an antenna and other areas are excluded from the search. However, if the primary beam is being covered with a mosaic of many phase centers as Morgan et al. (2011) and Middelberg et al. (2013) did, the output data set size and the amount of required resources for postprocessing are not decreasing, and data analysis becomes more complex. The data set size can be decreased substantially if correlating at phase centers of the sources known from low-resolution connected element interferometers, such as NVSS (Condon et al. 1998). Deller & Middelberg (2014) used this approach to detect weak sources within the primary beam at 1.4 GHz. However, the downside of this approach is that we can find only those sources that are known before and lose others. There are instances when a compact component is located far away from the brightest extended component that dominates at low-resolution images (see Figures 1–5 in Petrov 2013). In contrast, the first approach relaxes the requirement to the accuracy of a priori positions: a source will be detected anywhere in the field of view. I used the first approach in this campaign.

2.3. Design of the Wide-field VLBI Campaign

We see in the previous section that finite spectral and time resolutions reduce the field of view with respect to the individual antenna beam size. The old hardware correlators limited the output rate and restricted the choice of time and spectral resolutions. Advances in computing hardware made it feasible to correlate VLBI experiments at general purpose computers using flexible software correlators, such as DiFX or SFXC (Keimpema et al. 2015), and these limitations were lifted. However, postprocessing of large data sets was considered impractical until recently.

I evaluated the size of the field of view at 4.3 GHz band achievable with the wide-field correlation setup (0.1 s time resolution and 62,500 Hz frequency resolution). Tapering and time smearing depend on the baseline vector. I ran a simulated 24 hr schedule of observing three sources at decl. −30°, 20°, and 70° every 5 minutes. Observations at elevations below 5° were discarded. I averaged the amplitude losses due to the four factors discussed in the previous section for three subarrays: (1) 10 short baselines in the inner part of the array with lengths shorter 1000 km, (2) 13 medium baselines with lengths in a range of 2000–4000 km, and (3) 6 baselines with lengths longer 5000 km. Figure 2 shows the simulation results.

Figure 2.

Figure 2. Typical fields of view of the VLBA network at 4.3 GHz when correlated with accumulation periods of 100 ms and with the spectral resolutions of 62,500 Hz. The color shows a reduction of the fringe amplitude as a function of the source position offset with respect to the pointing direction in the range of 0–1. The first row provides the averaged field of view for the inner part of the VLBA at baselines shorter than 1000 km, the second row provides the field of view at baseline lengths in the range of 2000–4000 km, and the third row provides the field of view at baselines longer 5000 km. Three columns correspond to observations of a source at decl. 70°, 20°, and −30° respectively.

Standard image High-resolution image

We see that tapering and time smearing almost do not affect the field of view at 4.3 GHz for short baselines: it is determined by the size of the primary beam. The field of view shrinks to 7'–8' at medium-size baselines and to 3farcm5 at the longest baselines. The spectral resolution should have been increased by a factor of 4 to avoid it. The field of view is smaller by the factor of 1.76 at short baselines at 7.6 GHz and by a factor of 1.1 at the longest baselines.

2.4. Pilot Observations to Test Elimination of the Ionosphere Contribution

Because astrometry at 4–8 GHz with the broadband C-band receiver was new in 2013, I ran a pilot project (NRAO code BP175) in order to quantify possible systematic differences of 4.3/7.6 GHz astrometry with respect to the traditional absolute astrometry at 2.3/8.6 GHz. Ten sessions 3–8 hr long each were observed with the VLBA in 2013 October–December during the pilot project. Sources with correlated flux density brighter than 200 mJy were observed in scans 180 s long. The array observed at 2.3/8.6 GHz for 50 s, then switched receivers within 20 s, observed at 4.3/7.6 GHz for 50 s, then switched back to 2.3/8.6 GHz, and the array observed the same source for 50 s more. The same recording rate, 2 Gbps, was used for all observations. Although the 256 MHz-wide band was recorded within [2.188, 2.444] GHz, only its 178 MHz-wide fraction was used, with the remaining part masked out due to the interference caused by the satellite radio and due to the front-end filters at some stations. The frequency setup at 8.6 GHz differed from the 7.6 GHz setup only by shifting frequencies by 1 GHz up.

The goal of the pilot project was to assess using real data (1) the sensitivity of 4.3/7.6 GHz absolute astrometry, (2) the errors in the determination of ionosphere-free combinations of group delays at two frequencies, and (3) the magnitude of the systematic differences in source positions with respect to the traditional 2.3/8.6 GHz absolute astrometry.

I found that fringe phase and amplitude are oscillating within 5 s at the beginning of each scan and after each receiver change. The first 5 s of data after receiver change were masked out in further analysis. Therefore, the total integration time in each scan was 90 s at 2.3/8.6 GHz and 45 s at 4.3/7.6 GHz.

First, 8.6 GHz data were processed using the fringe fitting procedure implemented in ${ \mathcal P }{ \mathcal I }{ \mathcal M }{ \mathcal A }$ software package.2 Residual group delays, phase-delay rates, group-delay rates, and fringe phases were computed at the fringe reference time that was set to the weighted mean epoch of 8.6 GHz data. Then, the data from three other bands were processed, and the fringe reference time for each scan was set to be the same as for 8.6 GHz data. Then, the total group delays, phase-delay rates, group-delay rates, and fringe phases at 2.3/8.6 GHz were computed to the scan reference time—the common moment of time for all observations at all baselines of a given scan. After that, similar quantities at 4.3/7.6 GHz were computed to the same scan reference time as the 2.3/8.6 GHz data.

These quantities were imported to the geodetic VLBI data analysis software VTD/pSolve and were preprocessed the same way as all other VLBI observations under geodesy and absolute astronomy programs. Data analysis included editing, suppression of the outliers that exceed 3.5 times the normalized error, checking for clock breaks, and update of additive weight corrections. In total, 3% of group delays were flagged out. Careful analysis revealed sudden phase jumps at approximately 1 rad that affected about 2% observations. This problem was traced to the digital baseband converter hardware and was fixed before the start of the main observing campaign.

Figure 3 shows the relationship between the TEC estimates from 2.3/8.6 and 7.6/4.3 GHz data. The correlation coefficient is 0.997. No systematic differences were found. Thus, I was compelled to conclude that the ionospheric contribution derived from 2.3/8.6 and 4.3/7.6 GHz is practically the same.

Figure 3.

Figure 3. The estimates of the total electron contents from quasi-simultaneous dual-band observations at 2.3/8.6 and 4.3/7.6 GHz.

Standard image High-resolution image

Figure 4 shows the histograms of the ratio of the uncertainty of the ionosphere-free combination of group delays derived from 4.3/7.6 GHz data to the uncertainty of such a quantity derived from simultaneous 2.3/8.6 GHz observations. The 2.3/8.6 uncertainties were scaled by $1/\sqrt{2}$ because 4.3/7.6 GHz observations were twice shorter. The histogram shows two peaks: around 1.1, i.e., cases when the uncertainties are about the same, and around 0.5, i.e., cases when 4.3/7.6 GHz uncertainties are twice smaller. I investigated that result further. When I kept only the observations with 8.6 GHz group-delay uncertainties greater than 30 ps, only the peak around 0.35 remained (the red curve in the right graph of Figure 4). When I kept only the observations with strong S/N which provided uncertainty less than 5 ps at 8.6 GHz, the peak shifted to 0.8. This can be easily explained. ${ \mathcal P }{ \mathcal I }{ \mathcal M }{ \mathcal A }$ applies additive reweighting when computes uncertainties in group delay (see Petrov et al. 2011a for details) that accounts for instrumental phase variations. This sets the floor in group-delay uncertainties, typically 5–10 ps. When the floor is reached, the effective sensitivity at 7.6 and 8.6 GHz is about the same, and a wider frequency separation, 2.3 and 8.6 GHz, provides an advantage. That explains the peak around 1.1 in the left histogram of Figure 4. This error floor is not reached for weaker sources, and the higher sensitivity at 7.6 GHz explains the peak at 0.35 (red curve on the histogram in Figure 4).

Figure 4.

Figure 4. Left graph, green curve: the normalized histograms of the ratio of the uncertainty of the ionosphere-free combination of group delays derived using the 4.3/7.6 GHz data to the uncertainty of such a quantity derived from simultaneous 2.3/8.6 GHz observations. Right graph, red curve: a similar histogram but built using only the group delays with uncertainties at 8.6 GHz greater 30 ps. Right graph, blue curve: similar histogram, but using only group delays at 8.6 GHz uncertainties less than 5 ps.

Standard image High-resolution image

Analysis of the median in the cumulative distributions derived from the distribution presented in Figure 4 allows to us conclude that when a weak source is observed with S/N < 30 at the X band with VLBA, the estimates of the ionosphere-free combinations of group delays from 4.3/7.6 GHz observations are a factor of 1.53 more precise than the combinations of delays derived from 2.3/8.6 GHz. When a strong source is observed, with S/N > 200 at the X band, the 2.3/8.6 GHz ionosphere-free combinations of group delays have uncertainties lower by a factor of 1.18. Thus, the tests have confirmed predictions.

The 4.3/7.6 and 2.3/8.6 GHz dual-band observables were used to estimate the positions of 394 observed sources in two separate least-squares (LSQ) solutions. Prior to that, I ran a global reference solution that used all dual-band data from 1980 through 2020, except for the data in the test campaign. That reference solution was made the same way as the analysis of the main campaign was done. Then, the variance-covariance matrix of the reference solutions was reduced by stripping the elements related to source positions. That reduced variance-covariance matrix was used as an input for analysis of the test data. Therefore, only observations of the test data contributed to source positions. The test solutions used the same parameterization as the reference solution, except for fixing Earth's orientation parameters to the IERS C04 series (Bizouard et al. 2019).

Figure 5 shows the differences in estimates of 394 source positions. The median position uncertainty is 0.62 mas over R.A. scaled by cos δ and 0.46 mas over decl. The rms of position differences is 0.54 mas over R.A. scaled by cos δ and 0.56 mas over decl. The mean weighted bias of the 4.3/7.6 GHz position estimates versus the 2.3/8.6 GHz position estimates is −0.04 ± 0.03 mas over R.A. and 0.03 ± 0.03 mas over decl. The bias is statistically insignificant. This compels us to draw the conclusion that switching from the traditional 2.3/8.6 GHz setup to 4.3/7.6 GHz does not introduce measurable systematic errors.

Figure 5.

Figure 5. The differences in source position estimates from simultaneous dual-band observations at 2.3/8.6 and 4.3/7.6 GHz.

Standard image High-resolution image

2.5. Pilot Observations to Test Off-beam Observations

To check whether the off-beam observations provide the expected results, I ran two tests. First, the same scan was recorrelated multiple times using different clock models that caused an increase in residual delay. The fringe amplitude is expected to decrease linearly with an increase in the residual delay according to expression (4), and the fringe phase is expected to remain the same. The test showed no statistically significant deviation from the linear dependence. The phase difference remained statistically insignificant for Lt above 0.1 and a systematic deviation reached 0.2 rad at the lowest Lt tested, 0.02. The source was not detected with greater residual delays.

A 25 minute long test was conducted on 2020 January 31 as part of VLBI experiment BP245 that had the same frequency and correlation setup as the main campaign. A strong source, 1741-038, was observed in 14 scans. In each scan, the antennas were first pointed 1', 2', 3' ... 7' away from its a priori position along the R.A. (hereafter, off-beam pointing) and then 1', 2', 3' ... 7' along the decl. Within each scan, the antenna recorded for 30 s during off-beam pointing, then pointed to 1741-038, recorded for 30 s (hereafter, in-beam pointing), and then moved again to the same off-beam pointing. Then, the procedure was repeated for the next off-beam pointing.

Fringe fitting was performed using both portions of the same off-beam pointing, and the fringe reference time was set the same for both in-beam and off-beam pointings. The S/N around 1000 was achieved during in-beam pointing. The S/N was less for most of the off-beam pointings, and some observations at large offsets were not detected at all, which means the fringe amplitude was less than 0.006 with respect to the direct pointing.

The in-beam and off-beam pointings were treated as different sources with different names. Both 4.3 and 7.6 GHz data were processed, and dual-band ionosphere-free combinations of group delays were used for estimation of the 1741-038 position from in-beam scans and 14 separate off-beam scans. The parameter estimation results are shown in Table 2. Astrometry from off-beam pointings showed no systematic errors above 1σ when a source is observed up to 5' off beam, and the fringe amplitude was above the 0.05 level achieved during in-beam pointing. There are systematic errors at 3σ only at pointings 6' and 7' off the beam when the amplitudes are less than 1% of the in-beam pointing.

Table 2.  The Differences in Estimates of the 1741-038 Off-beam Position with Respect to Its In-beam Position Derived from Analysis of the Test Experiment (Columns 2–3 and 7–8) and the Flux Density Estimates at 4.3 and 7.6 GHz Integrated over the Restored Image

Off Offset over α Off Offset over δ
Beam Δα Δδ $C{}_{\mathrm{tot}}$ $X{}_{\mathrm{tot}}$ Beam Δα Δδ $C{}_{\mathrm{tot}}$ $X{}_{\mathrm{tot}}$
(') (mas) (mas) (Jy) (Jy) (') (mas) (mas) (Jy) (Jy)
1 −0.07 ± 0.25 −0.07 ± 0.53 3.84 4.25 1 0.03 ± 0.25 −0.06 ± 0.53 3.89 4.40
2 −0.02 ± 0.26 −0.15 ± 0.53 3.82 4.34 2 0.03 ± 0.25 −0.07 ± 0.52 3.98 4.53
3 −0.01 ± 0.26 −0.13 ± 0.54 3.97 4.34 3 −0.02 ± 0.24 −0.10 ± 0.51 3.84 4.56
4 −0.14 ± 0.31 −0.72 ± 0.58 3.74 4.57 4 −0.10 ± 0.24 −0.11 ± 0.50 3.95 4.87
5 0.54 ± 0.51 −0.59 ± 0.72 3.69 6.02 5 0.27 ± 0.25 −0.67 ± 0.54 3.87 6.34
6 3.28 ± 1.11 1.55 ± 1.26 3.65 n/a 6 0.90 ± 0.36 −0.97 ± 1.00 3.70 n/a
7 4.54 ± 1.63 11.20 ± 2.22 3.33 n/a 7 −1.97 ± 0.49 2.64 ± 1.58 3.66 n/a

Note. The flux densities at zero offset are 3.84 and 4.26 Jy, respectively. The five left columns show the position differences and the flux densities when pointing was offset along the R.A. direction. The five right columns show position differences and flux densities when pointing was offset along the decl. direction.

Download table as:  ASCIITypeset image

I analyzed the normalized ratios η(θx, θy) of the off-beam fringe amplitudes aob to aib:

Equation (9)

These ratios should differ from 1 within the uncertainty of amplitude measurements. However, analysis revealed a significant scatter and a positive bias. Middelberg et al. (2013) investigated the primary beam of the VLBA at 1.38 GHz. They adjusted the effective antenna diameter as a single parameter for all the antennas. I tried to do the same, but this resulted in a small reduction of variance. Then I extended the estimation model: I estimated the effective antenna diameter for each site separately and included estimation of the antenna pointing corrections along the R.A. and decl. Considering that the loss factors Lt and Lts are known precisely, we can present the unnormalized amplitude ratio ρ (θx, θy) for a given baseline via the off-beam amplitude ${a}_{{\rm{ob}}}^{c}$ corrected for tapering and time smearing as

Equation (10)

where ${x}_{i},{y}_{i}$ are the pointing corrections along R.A. and decl. and di is a correction to the a priori antenna effective diameter D with respect to the geometric diameter 25 m. After taking logarithms from both sides, we get a system of nonlinear equations for xi, yi, and di. In order to avoid strong variations of fringe amplitude at large pointing offsets over 480 MHz bandwidth, I reprocessed the data at a narrower bandwidth, keeping three IFs around 4.54 GHz and three around 7.43 GHz. An iterative weighted LSQ was used to solve this system of equations. The following weights were used for parameter estimation:

Equation (11)

where σ0 = 0.01 is the floor of the amplitude variance and the factor of $\sqrt{2/\pi }$ is due to the relationship between the variance and the mean for the Rayleigh distribution. Only observations at pointing offsets not exceeding 5' were used for processing 7 GHz data. Analysis was performed separately for 4 and 7 GHz data.

Surprisingly large pointing corrections up to 55'' were found. Pointing corrections at 4.5 and 7.4 GHz agree for most of the stations within 10''. The estimates of the mean effective antenna diameters are 23.6 m and 25.4 m at 4.5 and 7.4 GHz, respectively. Unfortunately, this test was not designed to determe pointing corrections because the antennas were pointed off the beam toward only one direction, which resulted in 0.95 correlations between estimates of pointing corrections and the effective diameters. Therefore, specific values of pointing corrections should be interpreted with caution. However, estimation of pointing corrections and effective diameters resulted in a very substantial reduction of variance (see Figure 6)—a factor of 4 at the C band and a factor of 3 at 7.4 GHz.

Figure 6.

Figure 6. The normalized amplitude ratios η(θ1, θ2) for 4.5 GHz (left) and 7.4 GHz (right). The green points denote observed ratios as a function of nominal pointing direction. The blue points denote adjusted ratios when pointing corrections were estimated.

Standard image High-resolution image

Adjusted ratios show no systematic bias with respect to 1 and their scatter is greatly reduced. Figure 7 shows the root mean square (rms) of the scatter with respect to 1 as a function of the position offset before adjustment (green hollow circles) and after adjustment (blue filled points). Amplitude errors at the offsets that reduce the primary beam power by a factor of 2 (5' at 4.5 GHz and 3' at 7.4 GHz) are 12% before adjustment for pointing offsets. After adjusting for pointing corrections, these errors are reduced to a level of 3%. It should be noted that on average, pointing errors reduced the amplitude in the in-beam direction by 3% at 7.4 GHz. A 30% increase in the total flux density estimate at the pointing offset is consistent with expected amplitude errors shown in Figure 7, and therefore, I consider pointing errors as its most probable cause. No correction for pointing offsets were made during the course of the WFCS campaign.

Figure 7.

Figure 7. The amplitude errors as a function of pointing offset at 4.5 GHz (left) and 7.4 GHz (right). The green points denote observed ratios as a function of nominal pointing direction without adjustments. The blue points denote adjusted ratios when pointing corrections were estimated.

Standard image High-resolution image

2.6. Source Selection

Even with the use of the wide-band VLBI technique, the VLBI field of view is still narrow. With rare exceptions, we observe with VLBI the sources detected with low-resolution single-dish or connected interferometer instruments. Scheduling algorithms ingest the input list of sources. A priority is assigned to a given source. Usually, an input source list has more objects than one can observe within the allotted time. The higher the oversubscription rate, i.e., the ratio of time required to observe every source in the list to the amount of the allotted time, the more efficient observing schedules the algorithm is able to generate because it has more candidates to choose from.

There were three observing campaigns within the survey, VCS7, VCS8, and VCS9, that had different input source lists.

The parent catalog for VCS7 was generated using the CATS database3 (Verkhodanov et al. 1997) containing almost all radio catalogs known by 2013. Sources from these catalogs within 20'' of their reported positions were matched against each other, and the spectral index was computed. The following sources were included in the input list: (a) at decl. > −45°, (b) with flux densities extrapolated at 8 GHz and greater 100 mJy, (c) with spectral index >−0.55, (d) that have no planetary nebulae or H ii region with 2', and (e) those that were not observed with VLBI before. The sources in the zone of low density of known VLBI calibrators had higher priorities. In addition, 151 sources that are known to have intraday variability and listed in MASIV catalog (Lovell et al. 2008) and 27 sources previously observed with VLBI that showed significant differences between VLBI and optical catalog NOMAD (Zacharias & Zacharias 2014) were added (V. Makarov 2020, private communication). In total, 6554 objects were selected as candidates and 1486 were observed.

The parent catalog for VCS8 was also generated with the use of the CATS database. The selection criteria were similar to those in VCS7, except the extrapolated flux density limit was raised to 150 mJy. In addition, 23 sources detected in prior observations with position accuracy worse than 25 mas were added to the schedule. In total, 5712 objects were selected as candidates and 1233 were observed.

The parent catalogs for VCS9 were GB6 (Gregory et al. 1996) and PMN (Condon et al. 1993; Griffith & Wright 1993; Griffith et al. 1994, 1995; Tasker et al. 1994; Wright et al. 1994, 1996) at 4.8 GHz. The input list included 20,641 sources with decl. > −40° brighter 70 mJy, excluding those that were previously observed. The list also contained 174 previously observed sources with significant differences between VLBI and the optical catalog NOMAD, 414 sources candidates with γ-ray associations, and 183 sources with poor positions. Of 20,641 input sources, 10,575 were observed.

2.7. Observing Schedules

Scheduling was made with the use of the software program sur_sked in a totally automatic fashion. The VLBI schedule is the sequence of start and stop times when the array or its part called a subarray records radio emission. This time interval is called a scan, and data collected at a given baseline at this interval are called an observation. Considering that VLBA has baselines over two-thirds of Earth's diameter, generating the optimal sequence of scans that satisfies a number of constraints is a highly nontrivial problem (for more details, see Schartner & Böhm 2020).

First, the minimum elevation angle and the minimum number of required stations was determined. It was set to 15° and 10 stations, i.e., the full VLBA network, for sources with decl. above −10°, and it was gradually lowered to 7° and 7 stations for sources with decl. below −41° and 5°, and 6 stations for sources with decl. below −46°, although very few sources were scheduled at decl. below −40°. Otherwise, sources with low decl. would have little chance to be observed. Then, the interval of time when a given source is above the specified elevation limits at the minimum number of required stations was computed, and azimuth and elevation angles were calculated and expanded into the B-spline basis as a function of time for fast interpolation. Such sources are called visible at a given moment of time. After that, the sequence of observed sources was generated. The list of sources that are visible was determined for the moment of time at the end of the integration time of the previous source. Slewing time was calculated for each visible source. The following score was computed for every visible source:

Equation (12)

where δ—decl. in degrees, ts—slewing time in seconds, and tc—time elapsed since the upper culmination at the pietown station in seconds. The first term upweights low-decl. sources because they have a shorter visible time, the second term upweights sources with short slewing time, and the third term upweights sources that are near the meridian at pietown station and therefore are observed at higher elevations. The last term p is a priority—a number assigned to a source to quantify its preference. The source with the highest score is put in the schedule, and the process is repeated. The scheduled source is excluded from further consideration. With some exceptions, each source was scheduled in one scan 60 s long. The average schedule efficiency, defined as the ratio of the total on-source time for target sources to the total duration of an experiment, is 0.51. Accounting for observations of calibrator sources, the schedule efficiency is 0.63. Remaining time is used for slewing.

Some sources were scheduled in more than one scan 120–300 s long. These are the sources with large radio-optical offsets and with an excessive number of outliers from analysis of previous observing sessions.

The process of generating the sequence of sources is interrupted every hour, and four calibrators are inserted. Calibrator sources are selected in such a way that at least two of them are observed at the elevation range [15°, 35°] and two of them are observed at the elevation range [45°, 88°] at each station. The minimum number of stations for calibrator observations was six. The calibrator list consists of 323 compact sources with the correlated flux density at 2 and 8 GHz greater than 0.3 Jy and with the ratio of the correlated flux density at projected baseline lengths over 5000 km to the total flux density above 0.5. A brute force algorithm was used to find four calibrators at a given moment of time first selecting those that have the maximum number of participating stations in each scan and then choosing among them the variant that has the minimum slewing time. Each calibrator was observed for 60 s.

The calibrators were included in order to (a) improve estimation of the atmospheric path delay in the zenith direction by including the observations at low and high elevation angles, which helps to separate variables, (b) connect positions of the new sources, never observed with VLBI before, with frequently observed objects with precisely known positions, (c) provide observations with high S/N to compute complex bandpasses, and (d) provide observations of strong sources, images of which can be determined with the high dynamic range for gain calibration.

The scheduling algorithm picks up the sources to observe from the input list automatically. Assigning source priorities increases the chance for the sources with high priorities to be included in the schedule at the expense of decreasing the overall schedule efficiency. The priorities were increased for (a) the sources in the area with a low density of known sources detected with VLBI, (b) flat-spectrum sources, i.e., sources with spectral index flatter than −0.5, (c) brighter sources, and (d) sources of special interest, such as MASIV, or possible counterparts of a γ-ray object detected with Fermi, or sources with large radio-optical offsets.

All three campaigns were scheduled in the so-called filler mode. That means the start time of observing sessions and their duration are not known beforehand. The array operator finds a gap between high-priority projects that requires good weather, enters the start and stop dates using the web form, and within 1–3 minutes gets an automatically generated schedule file. The minimum duration of the observing session is 3.5 hr. The advantage of this approach is that the project can be observed almost for free: it takes time that the array would be idle otherwise. This is the only practical way to acquire hundreds of hours of VLBA observing time for projects like that. The first disadvantage of this approach is that there is no direct control over which source will be observed. The only leverage the principal investigator has is raising or lowering source priorities. That increases or decreases the chance of a given source to be included in the schedule. The second disadvantage is that the distribution of scheduled sources over the R.A. depends on the scheduling pressure of the array. There are ranges in the R.A., specifically near the Galactic plane, that have a higher demand from competing projects. Because the filler projects are scheduled at the lowest priority, the chance to observe the sources that culminate in the R.A. ranges that are in high demand is lower. As a result, the distribution of scheduled sources is not uniform over the R.A., although the distribution of the input catalog is uniform, and it inherits the footprint of the VLBA subscription pressure.

Table 3 summarizes the statistics of the observing schedules.

Table 3.  The Summary of the Observing Campaigns

Number of Sources    
camp inp new known # seg obs time time interval
          in hours  
(1) (2) (3) (4) (5) (6) (7)
vcs7 6554 1486 140 17 71.7 2013 Apr–2013 Aug
vcs8 5712 1233 153 10 47.7 2014 Jan–2014 Feb
vcs9 20641 10575 433 99 536.0 2015 Aug–2016 Sep
all 27,609 13,154 491 126 655.4 2013 Apr–2016 Sep

Note. Because some sources were listed in several campaigns, the total may be less than the sum. Columns: (1) the campaign ID, (2) the number of sources in the input catalog, (3) the number of new sources observed in this campaign that have never been observed with VLBI before, (4) the number of known sources in the input catalog, (5) the number of observing sessions, (6) the amount of observed time allotted in hours, and (7) time interval of observations.

Download table as:  ASCIITypeset image

3. Data Analysis

Raw data were processed with the DiFX (Deller et al. 2007, 2011) correlator. The spectral and time resolutions were 62.5 KHz and 100 ms, respectively. These settings are 4 times finer for the spectral resolution and 20 times finer for the time resolution than typical values. That choice has increased the data set size by a factor of 80 and increased computation time at least by a factor of 4 log2× 20 log2 20 ≈ 700 with respect to a commonly used correlation setup. The correlator is able to provide even finer resolution and therefore a wider field of view. The choice of resolutions was dictated by the available computer resources that the author was able to secure: a 20 core Xeon E5-2660-v3. Processing a 48 Tb data set with 580,626,235 visibilities required 12 yr of CPU time per single core. Because some experiments have to be reprocessed due to errors discovered at latter stages during quality control, fringe fitting took one and a half years to finish.

The correlator provides the cross- and autospectrum of visibilities, the spectrum of phase-calibration signal, and the coefficients of the polynomials for the a priori model used during correlation. The output data set also contains system temperature and atmospheric parameters.

3.1. Postcorrelator Analysis

I used a custom-designed software package ${ \mathcal P }{ \mathcal I }{ \mathcal M }{ \mathcal A }$ (Petrov et al. 2011a) for postcorrelator processing. The goals of the postcorrelator analysis are (a) to determine the group-delay and phase-delay rate for each observation as well as realistic uncertainties of these quantities for their subsequent use for absolute astrometry; and (b) to determine time- and frequency-averaged visibilities for their subsequent use for imaging.

The data analysis pipeline has a number of steps.

  • 1.  
    Automatic examination of the data set, splitting the data into scans, indexing, cross-referencing, discarding bad and orphaned data, i.e., visibilities without a header, time tag, corresponding autocorrelation, or phase calibration.
  • 2.  
    Editing the phase-calibration signal. The VLBA injects narrowband impulses into the feedhorn. Its spectrum in the form of a rail with a step of 1 MHz is computed by the correlator. Because the phase-calibration unit and other parts of the VLBI hardware are fed by the signal from the same hydrogen maser distributed at 1 MHz, harmonics of the 1 MHz signal are spilled over and interfere with the phase-calibration signal. However, their impact on a wide-band signal from observed sources is negligible with some exceptions. The phase-calibration signal has to be edited before use by removing spikes in the phase of the spectrum and masking out tones at the IF edges. The VLBA IF filter has significant spillover to adjacent IFs. This reduces the fringe amplitude at the edges but also causes an interference of the phase-calibration signals from adjacent IFs. Figure 8 demonstrates typical phase-calibration signals. After removal of phase-calibration tones affected by the internal interference, the phase of the phase-calibration signal spectrum is interpolated and/or extrapolated across each IF and applied to data, i.e., subtracted from fringe phases.
  • 3.  
    Coarse fringe fitting. A simplified procedure of fringe fitting is performed, without oversampling and without LSQ refinement. The goal of the coarse fringe fitting is to get a set of observations with high S/N at each baseline. Fringe fitting is preformed for the 4.3 and 7.6 GHz bands, separately.
  • 4.  
    Computation of the complex bandpass. Deficiencies in the implementation of phase calibration does not allow us to restore precisely the amplitude and phase characteristics of the VLBI signal chain. Assuming that the spectrum of observed sources within 480 MHz bandwidth is flat, we can derive the phase and amplitude response by processing a number of strong sources with a high S/N. First, the algorithm runs fine fringe fitting for 12 sources with the highest S/N at all baselines with a certain station taken as a reference. Then, it subtracts the contribution of phase delay and phase-delay rate, coherently averages over time and over four adjacent spectral frequency channels in order to improve the S/N in each spectral bin, and then normalizes the amplitude for the integral over frequency at each IF to be equal to unity. Then, the algorithm solves for the station-based complex bandpass Bi(f) using LSQ by fitting the residual phases and logarithm of the normalized amplitudes in a form of a B-spline expansion with nine knots uniformly distributed over frequency within each IF that relates the observed visibility ${V}_{{ij}}^{\mathrm{obs}}(f)$ and the ideal signal Π(f) through: the ${V}_{{ij}}^{\mathrm{obs}}(f)\,={B}_{i}(f)\,{B}_{j}^{\star }(f)\,{\rm{\Pi }}(f)$ relationship. Some spectral channels are masked out, i.e., corresponding visibilities are replaced with zero. These are the channels at the edge of IFs, channels at 4.2 and 7.8 GHz that coincide with the synthesizer frequencies, and channels affected by the RFI. Occasionally, entire IFs have to be masked out due to hardware failures.
  • 5.  
    Fine fringe fitting. The fringe fitting was repeated with the oversampling factor 4, with the digitization calibration, phase calibration, and phase bandpass applied. Group delay, phase-delay rate, and group-delay rate were refined using LSQ in the vicinity of the main maximum of the 2D Fourier transform with additive reweighting applied using all visibility data of a given observation. See Petrov et al. (2011a) for a detailed explanation of how this is done.
  • 6.  
    First astrometric solution. Results of fringe fitting are exported in a database in geoVLBI format4 and ingested by the VLBI astrometry/geodesy data analysis software pSolve.5 The first problem is to filter out observations with no signal detected. Because, on average, the signal was detected only in 45% of observation, LSQ will not work unless the fraction of observations with nondetections is reduced to a manageable level, <10%. Figure 9 shows the S/N distribution. The observed distribution is a superposition of two very distinct partially overlapping distributions: the distribution of detections (blue line) that gradually increases toward low S/N because the number of weaker sources is greater than the number of strong sources and the distribution of nondetections (red line) that peaked at S/N = 4.8. These two distributions can be separated, fitted, and integrated. The cumulative normalized distribution of nondetections immediately provides us an estimate of the probability of a false detection (Petrov et al. 2012). The probability of false detection at S/N = 6.0 is 0.0016. The probability reaches 0.1 at S/N = 5.25 and 0.3 at S/N = 5.01. Therefore, we can consider that observations with S/N <5 are nondetections, with S/N > 6 mostly detections, and with S/N from 5 to 6 in a transition zone. It should be noted that the false detection probability depends on the search window size. At the first step, observations with S/N < 6.0 are temporarily suppressed, i.e., excluded from consideration. The positions of the sources without prior astrometric VLBI observations are estimated using LSQ, as well as clock functions for all stations except the one taken as a reference. Then, the automatic procedure for outlier elimination is executed: the observations with the largest normalized residuals are suppressed, the solution is updated, and the procedure is iterated until no observations with normalized residuals greater than 4.5σ remained. Although the mathematical expectation of the number of nondetected sources with S/N > 6.0 among 5000 observations (the typical number of sources in each individual observing session) is 8, group delays may be wrong for a greater number of sources for a variety of reasons: interference, phase instability due to hardware malfunction, etc. The sources with fewer than three remaining observations are considered to be nondetected, because one can always solve for two parameters using two observations and get zero residuals. Postfit group-delay residuals of detected sources have the Gaussian distribution with σ = 30–70 ps, while group delays derived from the noise have the uniform distribution within the search window [−8000, 8000] ns. Considering that among detected observations the typical rms of postfit residuals is 60 ps, the probability of finding residuals <4.5σ among three observations with at least one nondetection is 3 × 4.5 × 0.06/8000 ≈ 10−4. This estimate shows that a requirement for a source to have at least three observations with postfit residuals less than <4.5σ is a very powerful filter. The sources with more than three observations with S/N > 6 but with fewer than three observations remaining after outlier elimination are reexamined. It may happen that the outlier elimination process removed good observations but one or two bad observations affected by the RFI were kept. Examination of fringe phase residuals allows us to identify the observations with a certain systematic pattern, flag them out, and the rerun the outlier elimination process for the remaining sources from the very beginning. This usually fixes the problem and allows us to restore the observations that were incorrectly flagged out. Then, the S/N limit is reduced to 5.9, 5.8, and 5.7, and the reexamination process is repeated. After that, the parametric model is expanded, and estimation of the residual atmospheric path delay in the zenith direction at each station is included. At the next step, a procedure reciprocal to the outlier elimination runs. It examines the suppressed observations with S/N > 4.8, finds the one that has the minimum normalized residual, flips the suppression flag, updates the solution, and repeats iterations until the minimum normalized residual reaches 4.5σ. Because the presence of outliers distorts parameter estimates and residuals, the procedure of the outlier elimination and restoration has to be repeated two to three times to reach convergence.
  • 7.  
    The second fringe fitting run. The source positions determined in the previous step are used as new a priori. The nonlinear term in phases due to large differences between the actual source positions and the positions used for computation of the correlator delay is evaluated. The visibilities are phase-rotated to cancel the contribution of the quadratic term in phases, and the fringe fitting process is repeated. Refringing improves the S/N of the sources with large corrections to their initial a priori positions. The results of fringe fitting were transformed to a database, but the previous flags were preserved.
  • 8.  
    Second astrometric solution. The procedure for outlier elimination and restoration of the previously suppressed observations is repeated, starting with the flags preserved in the previous astrometric solution. This time, no limit on the S/N is imposed.
  • 9.  
    Third fringe fitting run. The fringe fitting procedure is repeated for all suppressed observations. Using results of the second astrometric solution, expected group delays are computed. The fringe search window has the semi-width of 2 ns at 4.3 GHz and 1.3 ns at 7.6 GHz and is centered at the expected values of group delay. The observations with the S/N > 4.8 are selected and added to the group-delay data set. This S/N limit is significantly lower as the search window is much narrower, and the probability that the peak in the narrow window with S/N greater 4.8 could be found by chance is lower. Here, information from another observation is utilized, which allows us to detect sources with a lowerr S/N. This procedure also often helps recover observations affected by the RFI.
  • 10.  
    Third astrometric solution. The procedure for restoration of previously suppressed observations is repeated. The observations with normalized residuals less than 4.5σ are retained. Finally, the ionosphere-free combinations of group delays of those observations that are detected at both bands are formed. The algorithm runs a new LSQ solution using the ionosphere-free combinations. The procedure of outlier elimination and restoration of previously suppressed observation is repeated until no outliers exceeding 3.5σ or no suppressed observations with normalized residuals less than 3.5σ remains. The criteria is lowered for the dual-band solution because the contribution of the additional noise due to the ionosphere is eliminated.

Figure 8.

Figure 8. The phase of the phase-calibration signal at station br-vlba during experiment BP192J8 on 2016 September 7 after fitting for group delay, its removal, and unwrapping phase ambiguities. Red points denote the tones that were masked out because they are affected by internal interference.

Standard image High-resolution image
Figure 9.

Figure 9. The S/N distribution. The green points show the observed distribution. The blue line shows the distribution of the S/N among detected observations. The line is extrapolated to S/N = 4.0. Red points show the distribution of nondetections.

Standard image High-resolution image

3.2. Detection Statistics

Of 13,154 target sources, 6755, or 51%, have been detected in at least one band. Among 491 calibrator sources, 465 have been detected. Table 4 shows the cumulative distribution of the distance of the detected sources from the a priori position.

Table 4.  The Share of Sources Detected at a Given Distance from the A Priori Position

50% >10''
20% >23''
10% >36''
5% >59''
1% >207''

Download table as:  ASCIITypeset image

I computed the spectral indices of the emission at kiloparsec scales between 4.85 and 1.4 GHz by cross-matching the GB6 and PMN catalogs against the NVSS catalog. If a given source was observed in both GB6 and PMN, GB6 was used. The distributions of spectral indices for detected and nondetected sources are shown in Figure 10.

Figure 10.

Figure 10. The distributions of observed sources over the spectral indices at kiloparsec scale among those that have not been detected (left) and among those that have been detected. Red color shows the steep-spectrum source population (spectral index <−0.5). Green color shows the flat-spectrum population (spectral index >−0.5). The filled boxes show the spectral histogram built using both PMN, GB6, and NVSS catalogs. The light rose line shows the histograms built using only GB6/NVSS data.

Standard image High-resolution image

Among detected sources, the share of steep-spectrum objects is 42%, while among nondetected sources, it is 83%. The detection rate among observed flat-spectrum sources is 79% and among observed steep-spectrum sources is 35%. The detection rate drops to 20% among the sources with spectrum index steeper than −1.0. However, caution should be taken in interpreting these numbers because the input sample is neither complete over the flux density nor over the spectral index.

GB6, PMN, and NVSS catalogs were observed in different epochs: in 1986–1987, 1990, and 1993–1996, respectively. Source variability affects spectral index estimates made from flux densities measured at different epochs. Although as a comparison of 544 detected sources that are present in both GB6 and PMN catalogs show that the spectral indices derived from GB6/NVSS and PMN/NVSS cross-matching indeed differ, the impact of these differences on the distributions is negligible. The Kolmogorov–Smirnov test shows the spectral index distributions formed from a subset of GB6/NVSS and PMN/NVSS data belong to the same parent distribution (p value = 0.53).

3.3. Final Astrometry Analysis

I ran three independent global least-squares solutions. They differ by the observables used. The experiment list consisted of two parts: basic and specific. The basic part contains all observations collected under all VLBI geodesy programs, the VLBA regular geodesy program RDV, and VCS1-6. The specific part contains 4.3 GHz group-delay observables for the wfcs_c solution, 7.6 GHz group-delay observables for the wfcs_x solution, and ionosphere-free linear combinations of 4.3 C and 7.6 GHz observables for the wfcs_xc solution.

Estimated parameters are split into three categories. Global parameters included station positions, station velocities, parameters of station nonlinear motion, and source coordinates. They are estimated using all of the data. Session-wide parameters included pole coordinates, UT1 angle, their time derivatives, nutation angle offsets, baseline clock offsets, and clock breaks for some stations. They are estimated using data from each observing session. Segment-wide parameters included the atmospheric path delay in zenith direction and the clock function for each station. They are modeled with a B spline with the time span of 1 hr with constraints imposed on their rate of change with reciprocal weights of 40 ps s−1 and 2 × 10−14 for the atmospheric path delay rate and the clock rate, respectively.

The estimated parameters include sine and cosine coefficients of the harmonic site position variations at diurnal, semidiurnal, annual, and semiannual frequencies to take into account residual mass loading, thermal variations, and systematic errors in modeling the atmospheric path delay. In addition, parameters of B spline with multiple nodes were estimated for some stations to account for coseismic crustal deformation (mk-vlba station) and local motion of the antenna foundation (pietown station, see Petrov et al. 2009 for details).

No-net-translation constraints were imposed on station positions and velocities, and non-net-rotation constraints were imposed on station positions and velocities as well as source coordinates to find the solution of the system of equations of the incomplete rank. In particular, the new adjustments of the so-called 212 defining sources listed in the ICRF1 catalog are required to have zero net rotation with respect to the positions reported in that catalog.

Weights associated with observations are computed the following way:

Equation (13)

where σg is the group-delay uncertainty, k is the multiplicative factor, a is the elevation-independent additive weight correction, and b is the elevation-dependent weight correction. This form of weights accounts for the contribution of systematic errors. I used k = 1.3 based on the analysis of VLBI–Gaia offsets (Petrov et al. 2019b). The additive parameter a was found by an iterative procedure that makes the ratio of the weighted sum of postfit residuals to their mathematical expectation close to unity. Parameters b(e) were computed differently for dual-band and single-band solutions.

I used b(e) in the form of $b{(e)}^{2}=\beta \,(\tau ({e}_{1}{)}_{\mathrm{atm},1}^{2}+\tau ({e}_{2}{)}_{\mathrm{atm},2}^{2})$ for processing dual-band observations, where $\tau {({e}_{i})}_{i,\mathrm{atm}}$ is the atmospheric path delay at the ith station. I set β to 0.02 because it minimizes the baseline length repeatabilities (see Petrov et al. 2009 for details on how such tests are performed).

I computed the contribution of the ionosphere to group delay utilizing the TEC maps derived from analysis of Global Navigation Satellite System (GNSS) observations and used them for processing single-band observations. Specifically, CODE TEC time series (Schaer 1999)6 with a resolution of 5° × 2fdg× 2h were used. However, the TEC maps account only partially for the ionospheric path delay due to the coarseness of their spatial and temporary resolution. I have developed a simplified stochastic model in order to account for the impact of the mismodeled ionosphere contribution to positions derived from processing of single-band data.

I collected statistics of the differences between the ionospheric contribution to group delays at 4.3 GHz derived from VLBI data, ${\tau }_{{\rm{v}},\mathrm{iono}}$, and derived from GNSS TEC maps, ${\tau }_{{\rm{g}},\mathrm{iono}}$. For each experiment and each baseline, I computed the weighted mean ${\tau }_{{\rm{m}},\mathrm{iono}}$ and the rms of the zenith residual ionospheric contribution as $({\tau }_{{\rm{g}},\mathrm{iono}}-{\tau }_{{\rm{v}},\mathrm{iono}}-{\tau }_{{\rm{m}},\mathrm{iono}})/\tilde{M}(e)$, where $\tilde{M}(e)$ is the arithmetic mean of the ionospheric mapping functions over two stations at a given baseline. Then, I computed the expected variance of the residual ionospheric contribution for each observation as ${\sigma }_{\mathrm{res},\mathrm{iono}}={\rm{rms}}\cdot \tilde{M}(e)$. That variance was added in quadrature to parameter a computed by the reweighting procedure. The additive variance was rescaled by the (4.3/7.6)2 factor when applied to group-delay weights at 7.6 GHz.

I compared 4.3 and 7.6 GHz only solutions against the solution that used the ionosphere-free combinations of observables. First, I modified source position uncertainties σ0 by adding in quadrature an ad hoc term lr = 0.05 mas to account for systematic errors. This quantity was derived from the decimation test: the data set was divided into two equal subsets and the differences in source positions derived from the subsets were analyzed.

I divided the position differences of the 4.3 and 7.6 GHz solutions by the maximum between the dual-band and single-band formal uncertainties σm and computed their distributions. I fitted them into the Gaussian distribution and adjusted the mean position offset and two parameters of the formal uncertainty rescaling law, l and s, in the form ${\sigma }_{a}=\sqrt{{l}^{2}+{(s{\sigma }_{m})}^{2}}$. The results of the fit are shown in Table 5, and the plots of the normalized position uncertainties over R.A. and decl. from the 4.3 GHz solution are shown in Figure 11. The position offset was subtracted from all single-band position estimates. The original source position formal errors from the 4.3 GHz group-delay solution were rescaled as

Equation (14)

Formal errors from the 7.6 GHz group-delay solution were rescaled in a similar way. The final catalog contains positions from the solution that provided the smallest semimajor error ellipse axes for a given source.

Figure 11.

Figure 11. The distributions of the normalized position differences between the 4.3 GHz and the dual-band solutions after their rescaling (green connected points). The blue lines show the Gaussian distribution with σ = 1 as a reference. The left plot shows the differences over R.A. scaled by $\cos \,\delta $, and the right plot show the differences over decl.

Standard image High-resolution image

Table 5.  The Parameters of the Uncertainty Rescaling Law and the Position Offsets in R.A. and Decl. for the 4.3 GHz and 7.6 GHz Solutions with Respect to the Dual-band Solution

Freq R.A. Decl.
  l s off l s off
GHz mas   mas mas   mas
4.3 0.160 1.259 0.107 0.271 1.299 0.003
7.6 0.018 1.005 0.055 0.151 1.149 0.041

Download table as:  ASCIITypeset image

4. Imaging Analysis

The same visibility data were used for imaging. Astronomers often spend hours to get a VLBI image manually. Considering that the project involved generating over 15,000 images, manual imaging was not an option. Therefore, an automated pipeline has been developed. The imaging procedure includes a number of steps: cleaning system temperature, computing the a priori gain, recalibration, flagging visibilities when the antennas were off source, averaging the data over time and frequency, generating images using hybrid self-calibration, and computation of the total and median flux densities at long and short baselines.

4.1. Amplitude Calibration

The VLBA hardware measures the system temperature (Tsys) every 30 s. It characterizes the antenna power in the absence of a signal. In addition, the NRAO periodically measures antenna gains and makes the results of these measurements available. The ratios of Tsys to gain would be sufficient for amplitude calibration if they were perfect. However, the measured Tsys often suffers from RFIs that manifest themselves as spikes in time series, overflows due to hardware problems, and missed values for some observations. Therefore, the raw Tsys data are to be processed in order to clean them. I developed two procedures called if-clean and tmod-clean.

The if-clean procedure assumes the ratio of Tsys between IFs within a given band is kept the same because the data come from the same receiver and are transferred to the data acquisition system using the same cable. First, the algorithm determines the reference IF that has the least amount of missing data and spikes. Second, the time series of logarithms of Tsys ratios with respect to the reference IF is computed, and an iterative procedure discards the values exceeding 3 × rms. Then, the ratios ${r}_{\mathrm{ij}}$ between a given IF and the reference IF are computed for all IF combinations. Third, I compute a substitute for discarded values of Tsys if Tsys in at least one IF for the same moment of time is not flagged. The ${T}_{\mathrm{sys},k}$ substitute at station k is determined as the geometric mean of the products of ${T}_{\mathrm{sys},i}\,{r}_{{ik}}$.

The if-clean procedure cannot recover Tsys if all IFs are affected. Because Tsys has a strong elevation dependence and observations at adjacent scans are made at substantially different elevations, direct interpolation will not work, and a more sophisticated procedure is required.

The tmod-clean procedure performs a decomposition of Tsys into time and elevation dependence:

Equation (15)

Both Ta(e) and Tz(t) are sought using iterative nonlinear LSQ in the form of their expansion into the B-spline basis of the first degree normalized in such a way that their minimum value is 1.0. Iterations are started using Tz(t) = 1 and expanding Tsys cleaned by the if-clean procedure into the B-splines basis over elevations with 16 knots in the range of 3°–92°. The outliers exceeding n normalized deviations are discarded. This expansion is normalized for Ta(e), postfit residuals are computed, and they are used for the evaluation of the Tz(t) B-spline expansion with the time span between knots of 20 minutes. Again, the outliers are eliminated, and the procedure is repeated for the residuals with respect to Tz(t) and Ta(e) taken from previous iterations. In total, eight iterations are performed. The outlier elimination criterion is 8.0 for the first iteration, and it is consecutively reduced to 6.0, 5.0, 4.0, 3.5, 3.0, 3.0, and 3.0σ. Constraints on Tsys values as well as derivatives over time and elevation with reciprocal weights 9 K, 0.0001 K s−1, and 5 K rad−1 are imposed to stabilize the solution. The procedure may not converge if the number of observations at a given station is fewer than 8–10 per hour. The Tsys substitute is computed according to the empirical model in Equation (15) for missing measurements or those flagged as outliers. Figure 12 shows the result of Tsys decomposition at station fd-vlba as an example.

Figure 12.

Figure 12. The result of the system temperature decomposition on time and elevation dependence at station fd-vlba for experiment BP192J8, which ran on 2016 September 7. Left: time dependence. Right: elevation dependence. The connected green points show the model value. The blue points show the sum of the model and residuals.

Standard image High-resolution image

Finally, the amplitude was multiplied by the a priori SEFD: the ratio of cleaned Tsys to the antenna gain.

4.2. Amplitude Renormalization

The response of an ideal system to a source with the continuum flat spectrum at a given IF has a Π shape. The bandpass of the VLBA system deviates from the ideal and falls off at the edges. To compensate the signal loss due to the bandpass' nonrectangular shape, the visibilities are divided after fringe fitting by the function called the amplitude calibration bandpass, which is the average of the normalized observed cross-spectrum of strong sources. The cross-correlation bandpass falls at edges stronger than the product of autocorrelations. This suggests that the signal incurs additional losses due to decorrelation at the edges. To account for these losses, I split the band into three areas: the central part called the band kernel and the two edge parts (see Figure 13). The amplitude bandpass is normalized in such a way that its integral over the kernel area is unity.

Figure 13.

Figure 13. The amplitude of the visibility spectrum of WFCS J0745+1011 at baseline la-vlba/nl-vlba within IF 1 at 4.128 GHz. The amplitude is reduced at the band edges due to the interference with adjacent bands. The shadowed area shows the band kernel that was used for amplitude renormalization.

Standard image High-resolution image

The part of the band where the bandpass falls below 0.1 or has spikes due to internal RFI is usually masked out before fringe fitting, as visibilities in this part of the spectrum are corrupted and bring more noise than signal. The autocorrelation spectrum was originally normalized to unity over the entire IF. Because the system temperature was recorded in the entire IF, removal of a part of the spectrum distorts normalization. Therefore, the autocorrelation is renormalized to unity over the unmasked fraction of the spectra, and the cross-correlation amplitude is divided by the renormalization factor Rm,

Equation (16)

where Ai is the ith constituents of the autospectrum, mi is the mask, 0 or 1, and n is the number of spectral channels.

4.3. On–Off Flagging

Although the VLBA antennas have a flagging mechanism to report events when the antenna has finished slewing and reached the source, it is not uncommon to receive data recorded during slewing (see Figure 14). If not flagged, the use of such data will seriously corrupt the image. I implemented an algorithm for detecting off-source data. It assumes there is a period of time when the data are trusted, called a scan kernel, and a period of time when the data are questionable. The scan kernel was defined as a [0.4, 0.95] scan fraction.

Figure 14.

Figure 14. The amplitude of the visibility spectrum of the calibrator source J1642+6856 at baseline br-vlba/nl-vlba at 7.6 GHz. The signal-to-noise ratio is 355. The antenna was off the source for the first 13.9 s (the red area), on the source since 15.0 s (the green area), and was moving on source in between (the yellow area).

Standard image High-resolution image

The first step is to compute the amplitude averaged over time and frequency over the scan kernel, applying the results of fringe fitting. Then, the data before the kernel are split into segments. The segment size measured in the number of accumulation periods is defined as

Equation (17)

where a is the kernel fringe amplitude, σa is the rms of the amplitude scatter with respect to a over the kernel interval, and n is a parameter. The algorithm coherently averages the visibilities within each segment and checks their amplitudes in the reverse time order starting from the segment that is the closest to the kernel. If it finds the segment amplitude is less than the kernel amplitude by s, where σs is the segment amplitude uncertainty, all of the visibilities of that segment and the preceding segments are flagged out. Then, a similar procedure is executed at the end of the scan. Although late on source is the most common reason of the amplitude drop, the fringe amplitude may have spikes or drops due to the internal RFI and hardware malfunction as well. Therefore, at the end, the kernel span is extended to all visibilities, and segments with amplitude less than s are flagged out.

The parameter n is selected depending on what is less desirable: to flag out good points or miss bad points. I selected n = 3, which flags the data more aggressively, arguing that the presence of data with wrong amplitudes causes more damage than the removal of a fraction of good data points.

4.4. Aggregation of Visibilities

After flagging and calibration, visibility phases are rotated using group delays, phase-delay rates, and group-delay rates. However, several complications have to be taken into account. First, because the fringe fitting procedure processed each observation individually, in general, the fringe reference time used for the computation of the scan-averaged phases defined as the weighted mean epoch is different. Second, independently derived group delay, phase delay, and group-delay rate estimates along any baseline ij, ik, and jk do not preserve the closure relationship: ${a}_{{ij}}-{a}_{{ik}}+{a}_{{jk}}\ne 0$. If group delays or phase-delay rates have nonzero closures, phase rotation will distort the closure relationship of original visibility phases. Third, if the a priori quadratic term was added to phases before fringe fitting when processing a source with a large a priori position error, it has to be subtracted.

The common scan reference time ts is found as the weighted mean over the baseline fringe reference time tf. Then, station-based group delays τgi, phase-delay rates ${\dot{\tau }}_{i}^{p}$, and group-delay rates ${\dot{\tau }}_{i}^{g}$ are computed from baseline quantities using weighted least squares:

Equation (18)

The system has 3(n − 1) equations, where n is the number of stations. Only a portion of the system is shown. One of the stations is taken as a reference. Then, using estimates of station-based ${\dot{\tau }}_{i}^{p}$, ${\tau }_{i}^{g}$, and ${\dot{\tau }}_{i}^{g}$, new baseline-based quantities are computed. Now, they are referred to the same scan reference time and have a zero closure.

The right-hand side of system (18) contains ${\dot{\tau }}_{{ij}}^{p}$, ${\tau }_{{ij}}^{g}$, and ${\dot{\tau }}_{{ij}}^{g}$ of detected observations. When the inverse transformation from the station-based to the baseline-based quantities is performed, a situation may occur where a given observation at the baseline ij has not been detected, but the station-based quantities at stations i and j are available because there were detected observations at other baselines with these stations. Therefore, we can restore baseline-based ${\dot{\tau }}_{{ij}}^{p}$, ${\tau }_{{ij}}^{g}$, and ${\dot{\tau }}_{{ij}}^{g}$ for an observation that has not been detected during the fringe fitting process. I compute the amplitude averaged over time and frequency for such observations and retain those with the amplitude a factor of 4.0 greater than the noise amplitude. This criterion is more relaxed than the previously used detection thresholds. We are able to lower the detection threshold here because the visibilities at other baselines are utilized for processing a given observation.

Finally, the visibility phases are rotated according to phase-delay rates, group delays, and group-delay rates, and they are averaged over frequency within each IF and over time within 4 s long segments. The variance of the fringe amplitude within each segment is computed and used later for image restoration. All visibilities of a given source are combined and written into separate files, one file per source and per band. If a source was observed in several experiments, the data collected within 6 months are merged into a single file.

Source variability within a 6 months period will cause some blurring in the image made using merged data from two or more epochs, but for most of the sources, this effect is small. In almost all cases, change in structure is associated with a change of the flux density of a compact component. Such changes manifest in a synchronous increase or decrease of the correction factors determined during amplitude self-calibration. Such a systematic change in correction factors was found in several sources. These sources were reimaged for each individual epoch.

4.5. Image Restoration

I used Difmap (Shepherd 1997) for image restoration. First, for each session I selected three strong calibrators that were observed at all baselines. These sources were imaged manually using the conventional technique described in the Difmap manual. It includes iterations of phase self-calibration, selecting the areas on the image plane for the CLEAN algorithm (Högbom 1974), and amplitude self-calibration (Cornwell & Fomalont 1999). The empirical multiplicative gain corrections adjusted during the amplitude self-calibration were extracted by differencing the original visibility data and the self-calibrated data saved by Difmap. The gain corrections are averaged over three calibrator sources. IFs with unstable or low amplitudes were flagged out by assigning them zero-gain corrections in some cases. Then, these gain corrections were applied to the remaining data.

Further imaging was performed in an automatic fashion using the Difmap script originally developed by M. Shepherd and G. Taylor and modified by Y. Y. Kovalev. The pipeline starts with phase self-calibration with the solution interval of 3600 s, then it runs a number of times in the inner loop, which consists of the CLEAN procedure over established windows that have the size a factor of w greater than the CLEAN beam size and the phase self-calibration with a solution interval of s. The inner loop of cleaning and phase calibration is repeated until no new peak above d times the image rms is found. After each step, a set of CLEAN windows is accumulated. The first time, the inner loop runs with s = 3600 s, w = 4, and d = 6 with the uniform weights. Then, it is repeated with natural weights with w = 6.4, and d = 5.5. After that, the amplitude self-calibration is performed followed by the phase self-calibration with the solution interval of 12 s. Then, the inner loop with s = 12 s, w = 6.4, d = 5.0 and natural weights is performed. Then, the amplitude self-calibration is performed with the solution interval of 3600 s followed by the phase self-calibration and the inner loop with s = 12 s, w = 6.4, d = 4.75 and natural weighting. At that point, the accumulated model is cleared, the CLEAN procedure with the uniform weights is performed over established CLEAN windows, followed by the CLEAN procedure with natural weighting and with the modified inner loop with the same parameters but without phase self-calibration. At the end, the CLEAN procedure over established windows runs once more followed by the phase self-calibration, and then the final CLEAN procedure runs over the entire map a last time, and Difmap creates the final image.

An example of an image derived from processing the survey data is shown in Figure 15. The self-calibrated visibilities are plotted against the projection of the baseline vector into the jet direction. This source was detected at all 45 baselines, and no station failed. When the number of detections is fewer, either because a source is weak, or station failures, the image quality degrades. An image rarely provides useful information beyond the total flux density if the number of detections drops below 8–10. But anyway, the automatic procedure processed all of the sources that have enough data to form closures, i.e., if the number of detections was at least four to six.

Figure 15.

Figure 15. A sample image of WFCS J0018+2921 (left) and the self-calibrated visibilities averaged over time and frequency (right). The source has been detected at all baselines of the 10 station network in one scan 60 s long with the S/N in the range of 15–35.

Standard image High-resolution image

Upon completion of automatic imaging, I performed a manual inspection of all the images. If an image or a plot of self-calibrated visibilities showed abnormalities, I reprocessed these images manually. Manual intervention was required in 20%–25% cases. The most common problems are RFI, a sudden amplitude drop due to a hardware malfunction, and emission beyond the default mapping area. The manual intervention solved the problem in most of the cases.

4.6. Generation of Median Flux Densities

An image is a two-dimensional array. To provide a more concise but coarse image characterization, I followed the practice used in prior VLBI calibrator surveys (e.g., Kovalev et al. 2007) and computed the total flux density by summing all CLEAN components, the flux density at short baselines—the median flux density at baseline projection lengths <900 km, and the unresolved flux density—the median flux density at baseline projection lengths >5000 km. For some sources, no information about the unresolved flux density is available.

The imaging process failed for 108 sources: 32 at the C band and 76 at the X band because the number of observations, three to six, was too small. The calibrated visibilities were coherently averaged for these sources over time and frequency for each individual observation and then the amplitudes were incoherently averaged over all observations at different baselines. The mean visibilities produced that way were used as a substitute for the total flux density. No estimates of the flux density at short baselines and the unresolved flux densities are available for these sources.

In total, images in at least at one band are available for 6714 detected target sources out of 6755, and 6038 of them, or 90%, were generated using only one 60 s long scan. In addition, images for 465 calibrator sources were made, and many of them at more than at one epoch.

The distribution of flux densities at short and long baseline projection lengths of all target sources is shown in Figure 16. The target sources are rather weak. Table 6 shows the median flux densities of detected sources. Only 86 sources with flux density greater 200 mJy at 7.6 GHZ have been detected among the 13,154 observed. The detection limit was in the range of 10–15 mJy.

Figure 16.

Figure 16. The distribution of the median flux densities at 4.3 GHz (left) and at 7.6 GHz (right). The green points show the median flux densities at short baseline projection lengths. The blue points show the median flux densities at long baseline projection lengths.

Standard image High-resolution image

Table 6.  The Median Total Flux Densities, the Flux Densities at Short Baselines, and the Unresolved Flux Densities at 4.3 GHz and 7.6 GHz

  4.3 GHz 7.6 GHz
  Flux # Src Flux # Src
  (Jy)   (Jy)  
Total flux density 0.041 6738 0.038 6115
Flux density at short baselines 0.036 6688 0.034 6208
Unresolved flux density 0.023 5137 0.024 4601

Note. The table also shows the number of target sources for which flux density information is available.

Download table as:  ASCIITypeset image

4.7. Multiple Sources in the Field of View

It is not uncommon to find several objects in the field of view that covers several arcminutes. If the field of view contains several strong objects, each of them at least a factor of 2 brighter than the detection limit, then the astrometric solution will have an unusually high number of outliers. I applied a technique of component separation for such cases. I examined fringe plots and flagged those that showed a pattern that is consistent with internal or external radio interference—spikes, unusual spectrum, or sinc-shape pattern of fringe amplitude versus time. See Figures 3 and 4 in Shu et al. (2017) for examples. If the outliers remained, I inverted the suppression status and tried to estimate positions using only those observations that were previously suppressed. Three outcomes of this procedure are possible. The solution may converge to a new position, and most of the group delays that were outliers for the first component will be used in the solution for the second component. This happens if an observation was suppressed because the fringe fitting procedure picked up one source component at some baselines and another at others. The solution may not converge to a new position, and the result may be inconclusive: a solution may converge to a new position, but a significant fraction of observations is included for estimation of both components or is used in neither. If the distance between components exceeded 1'', its a priori coordinates were updated, and a new iteration of fringe fitting accounting for the quadratic term in the fringe phase followed. If the distance between components was below 2'', the field was imaged (see Figure 17 as an example). If the second component did not to appear at the image, the hypothesis that the source is double was rejected, and the observations were reflagged. The component separation procedure works easily if the distance between components is above 0farcs3–0farcs5. It does not work if the distance is less than 0farcs07–0farcs15. The range 0farcs1–0farcs4 is intermediate where the component separation is not always reliable. The second component may be missed if the flux density of the second component is a factor of 10 lower than the flux density of the primary component.

Figure 17.

Figure 17. The images of the visually double source J0241+6126/J0241+612A at 4.3 GHz (left) and at 7.6 GHz (right).

Standard image High-resolution image

In total, second components were found for 88 objects that resulted in separate catalog entries. They are treated as different sources in the context of this paper although in many cases they may be a part of the same object. The source components are found at separations from several milliarseconds to several arcminutes. The separation of sources into several catalog entries was not done based on their physical properties. It was done entirely on the basis of whether the position of the second component can be determined in the astrometric solution independently from the position of the first component.

Among 88 sources with several components found in the beam, 34 sources have separations in the range of 0farcs08–1farcs9 and 9 have separations in the range of 5'' to 60''. The first eight pairs are shown in Table 7. In a case where the IAU name derived from positions is the same for two or more sources, the convention adopted in the paper is to replace the final digit with a letter for the second source. I kept the notation of components for known multiple sources. The entire table is available in machine-readable format. All second components (and for some sources, the third and fourth components) at distances less than 2'' were identified in images. Therefore, two or more components share the same image. Analysis of the nature of multiple structure is beyond the scope of this paper.

Table 7.  The First 8 Rows of the Table of 43 Target Sources that have Two or more Components with Positions that were Independently Determined

Component 1 Component 2 Separation
WFCS J0403+702A WFCS J0403+7026 0farcs0782
WFCS J0241+612A WFCS J0241+6126 0farcs1140
WFCS J2108-210A WFCS J2108-2101 0farcs1167
WFCS J0904+593A WFCS J0904+5938 0farcs1401
WFCS J0031+540A WFCS J0031+5401 0farcs1452
WFCS J0132+521A WFCS J0132+5211 0farcs1481
WFCS J0023+273A WFCS J0023+2734 0farcs1523
WFCS J0716+470A WFCS J0716+470B 0farcs1743
...    

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.

Download table as:  DataTypeset image

5. Source Catalog

The Wide-field VLBA Calibrator Survey catalogs has 6755 entries. The catalog presents source positions at the J2000.0 epoch, position uncertainties, correlation between R.A. and decl., the number of observations used in 4.3 GHz (C band), 7.6 GHz (X band), and dual-band solutions, the solution that was used for reporting positions, the total flux density, the median flux density at baseline projection lengths <900 km, and the unresolved flux density defined as the median flux density at baseline projection lengths >5000 km at 4.3 and 7.6 GHz. The first eight rows of the catalog are presented in Table 8. The entire table is available in machine-readable format.

Table 8.  The First Eight Records of the Wide-field VLBA Calibrator Survey Catalog

J2000-name B1950-name R.A. Decl. Dra Ddec Corr Num obs Band Flux Density
                      4.3 GHz 7.6 GHz
              X C X/C Tot Shr Unr Tot Shr Unr
    (hh mm ss.ffffff) (dd mm ss.fffff) (mas) (mas)           (Jy) (Jy) (Jy) (Jy) (Jy) (Jy)
WFCS J0000−1352 2357−141 00 00 03.124493 −13 52 00.75819 1.71 4.30 0.720 19 19 17 X 0.021 0.021 0.009 0.019 0.018 0.015
WFCS J0000−3738 2357−379 00 00 08.414016 −37 38 20.67746 3.90 6.96 0.921 32 32 29 X/C 0.019 0.018 0.019 0.019 0.020 0.020
WFCS J0000+0248 2357+025 00 00 19.282530 +02 48 14.68956 0.56 1.35 0.025 29 28 28 X 0.039 0.033 0.018 0.038 0.037 0.024
WFCS J0000+1139 2357+113 00 00 19.564227 +11 39 20.72629 1.18 2.54 0.009 29 18 18 C 0.022 0.025 0.008 0.018 0.014 0.011
WFCS J0000+0307 2357+028 00 00 27.022571 +03 07 15.64962 0.47 1.15 −0.080 36 36 36 X/C 0.085 0.085 0.047 0.075 0.071 0.027
WFCS J0000+3918 2358+390 00 00 41.527612 +39 18 04.14826 0.43 0.67 −0.142 45 45 45 X/C 0.070 0.066 0.056 0.094 0.089 0.076
WFCS J0000+5157 2358+516 00 00 51.385221 +51 57 19.89483 7.30 7.88 0.216 11 6 6 C 0.037 0.026 −9.9 0.015 0.017 −9.9
WFCS J0001−1741 2358−179 00 01 06.264989 −17 41 26.61572 17.71 18.90 −0.431 5 3 3 C 0.078 0.025 −9.9 0.040 0.019 −9.9
...                                

Note. The uncertainty in the R.A. Dra is given without the $\cos \,\delta $ factor. The column Corr contains the correlation between R.A. and decl. The column Num Obs contains the number of observations used in the astrometric solutions at band X and C, and the linear combinations of group-delay observables. The column Band contains a flag for which of the solutions was used for reported positions. The six columns Flux Density contain the estimates of the total flux density Tot, the median flux density at baseline projection lengths shorter 900 km Shr, and the median flux density at baseline projection lengths longer 5000 km Unr for both bands, 4.3 and 7.6 GHz.

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.

Download table as:  DataTypeset image

The source distribution over the sky is shown in Figure 18. The gap in R.A. 15h to 21h is because this range was oversubscribed, and too few observing sessions covering it were scheduled.

Figure 18.

Figure 18. The distribution of the detected sources over the sky. The red line shows the Galactic plane.

Standard image High-resolution image

The catalog is also accompanied by a data set of 15,542 images. Multiple sources with component separations less than 1farcs5 are shown in one image file. The data set provides six files associated with each image: the image in FITS format, calibrated visibilities in FITS format, a table with estimates of the correlated flux densities, as well as pictures in postscript format of the source maps, calibrated visibilities, and uv-coverage plots. The data set is available at http://astrogeo.org/wfcs/images. Two auxiliary tables with positions derived from all three solutions and the list of undetected sources are described in the Appendix.

The position accuracy of the catalog cannot be characterized by one number because it varies within four orders of magnitude, from 0.07 mas to 7''. Figure 19 shows the cumulative distribution of the semimajor axes of the error ellipse. The median is 1.7 mas (0.9 mas for R.A. scaled by cos δ, and 1.6 mas for decl.). This is noticeably higher than in previous surveys VCS1–VCS6. Figure 20 shows the position uncertainties as a function of the unresolved flux density at the 7.6 GHz band and as a function of the number of observations used in the solution. The mean position uncertainty is around 0.7 mas for the sources with the unresolved flux density of 100 mJy. It drops to 1.0 mas for sources with flux density 50 mJy, 1.7 mas for sources with flux density 20 mJy, and 3.0 mas for sources with flux density 10 mJy. Similarly, the position uncertainty steadily grows when the number of observations used decreases from 45 to 15, and then sharply increases.

Figure 19.

Figure 19. The cumulative distribution of the semimajor error ellipse axes.

Standard image High-resolution image
Figure 20.

Figure 20. The dependence of the semimajor error ellipse axes as a function of the unresolved flux density at the X band (left) and the number of observations used in the solution (right).

Standard image High-resolution image

Thus, we see two factors that affected position accuracy. First, the WFCS tapped the weak population—stronger sources have already been observed in previous surveys and the follow-up VCS-II campaign (Gordon et al. 2016). Therefore, remaining sources are faint. Second, for the amount of on-source time, one scan 60 s long is sufficient to detect a source to get its position with a milliarcsecond level of accuracy, get its coarse images, but not sufficient for reaching the 0.2 mas accuracy level and getting high-fidelity, high-dynamic-range images. In order to reach that level, approximately one order of magnitude more resources is required.

6. Summary

A list of 13,645 sources has been observed in 2013–2016 with VLBA, including 13,154 target objects never before observed with VLBI. Of them, more than one half, 6755 sources, have been detected. This number exceeds all prior published VLBI astrometry surveys combined.

A novel technique of a wide-band survey has been successfully demonstrated. The previous restrictions on the field of view caused by the limitations of hardware correlators and a lack of adequate computer resources are lifted. VLBI with the field of view comparable with the antenna primary beam size is feasible and is expected to become routine in forthcoming surveys. Unlike correlating with multiple phase centers, this approach does not involve a priori knowledge of where a source should be. If one needs to observe several sources with known positions, then the multiple phase center approach is preferable. If one needs to search for a source everywhere within the beam, a direct approach of correlation with high time and frequency resolution is preferable due to its simplicity. Tests showed that a precise calibration for the amplitude loss as far from the center as the 20% level of the primary beam power pattern can be performed. The tests also highlighted the necessity of pointing offset monitoring for such observations.

Historically, absolute astrometry surveys were conducted at the S and X bands, i.e., at 2.2 and 8.4 GHz. The choice was motivated by the availability of dual-band receivers that matched the NASA deep space navigation frequency bands. It was demonstrated that simultaneous observations at remote wings of a single broadband C-band receiver provide a noticeable advantage in sensitivity and accuracy of derived source positions. Such observations are affected by the radio interference to a much lesser extent than observations at 2.2 GHz. It was demonstrated that transition from 2.2/8.4 GHz to 4.3/7.6 GHz does not introduce any noticeable systematic differences related to the ionospheric contribution.

In the past, the preference was given to VLBI observations of flat-spectrum sources. As the pool of flat-spectrum sources not yet observed with VLBI is close to depleted, the spectral index selection criterion has to be lifted. The detection rate among flat- and steep-spectrum sources was 79% and 35% respectively. We can find compact sources among steep-spectrum sources and vice versa; flat-spectrum sources can be resolved out. Because the WFCS catalog does not form a complete flux-limited sample, these numbers should be taken with caution. The recent paper of Popkov et al. (2020) provides detailed statistics of the compactness of steep- and flat-spectrum sources. Two follow-up programs to reach the completeness at a given flux density limit in some areas covered by the WFCS ran with VLBA in 2019–2020. Analysis of these campaigns that will be published soon will investigate the properties of parsec-scale emission of flat and steep sources drawn from parent catalogs at different frequencies in detail.

Although snapshot images do not have a high dynamic range, a number of sources with multiple components was spotted in the data set. Analysis of their nature is beyond the scope of the present paper. It will require follow-up observations, and several such campaigns have already commenced.

The WFCS median semimajor axis of the position uncertainty is 1.7 mas, which is noticeably lower than that in a number of prior surveys. Two factors played the role: the median flux density was much lower, 35 mJy, and the sources were observed mostly only in one scan. The partly resolved sources were not detected at long baselines, and this caused a significant position accuracy degradation. Allocation of substantially more resources is required in order to reach a submilliarcsecond level of accuracy. It may not be practical to reobserve all of the WFCS sources, although reobservation of a subset of sources, for instance those that are compact and suitable as calibrators, or sources exhibiting unusual morphology, is warranted.

The catalog presented in Table 8 was built using only the data from the three observing campaigns VCS7, VCS8, and VCS9 in 2013–2016. Since then, a number of sources have been reobserved. Up-to-date positions of all these sources and many others can be found in the online Radio Fundamental Catalogue at http://astrogeo.org/rfc that is updated on a quarterly basis.

This work was done with data sets BP171, BP175, BP177, BP192, and BP245 collected with the VLBA instrument of the NRAO and available at https://archive.nrao.edu/archive. The NRAO is a facility of the National Science Foundation operated under cooperative agreement by Associated Universities, Inc. This work made use of the Swinburne University of Technology software correlator, developed as part of the Australian Major National Research Facilities Programme and operated under license.

It is my pleasure to thank Yuri Y. Kovalev for lengthy and enlightening discussions about imaging and amplitude normalization.

Appendix: Auxiliary Tables

For completeness, two auxiliary tables are presented. Table 9, containing detected sources, has the following columns: (1) WFCS J2000 source name; (2) B1950 source name; (3) R.A. from the C-band solution; (4) decl. from the C-band solution; (5) error in R.A. (without cos(delta) factor) from the C-band solution in milliarseconds; (6) error in decl. from the C-band solution in milliarseconds; (7) correlation between R.A. and decl. from the C-band solution; (8) number of observations used in the C-band solution; (9) R.A. from the X-band solution; (10) decl. from the X-band solution; (11) error in R.A. (without cos(delta) factor) from the X-band solution in milliarseconds; (12) error in decl. from the X-band solution in milliarseconds; (13) correlation between R.A. and decl. from the X-band solution; (14) number of observations used in the X-band solution (15) R.A. from the dual-band X/C solution; (16) decl. from the dual-band X/C solution; (17) error in R.A. (without cos(delta) factor) from the dual-band solution in milliarseconds; (18) error in decl. from the X-band solution in milliarseconds; (19) correlation between R.A. and decl. from the dual-band solution; (20) number of observations used in the dual-band X/C solution; (21) flux density at 1.4 GHz from the cross-matched NVSS catalog; (22) flux density at 4.85 GHz from the cross-matched GB6 or PMN catalog; and (23) spectral index between the 1.4 GHz and 4.8 GHz catalogs. The entire table is available in machine-readable format.

Table 9.  The First Eight Rows of the Auxiliary Table with Positions and Flux Densities of 6755 WFCS Program Sources Detected in Each of the Three Solutions

(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) ...
WFCS J0000+0248 2357+025 00 00 19.282714 +02 48 14.69019 1.20 2.57 0.021 29 00 00 19.282530 +02 48 14.68956 0.56 ...
RFCS J0000+0307 2357+028 00 00 27.022719 +03 07 15.64335 1.17 2.68 0.101 36 00 00 27.022615 +03 07 15.64531 0.52 ...
WFCS J0000+1139 2357+113 00 00 19.564227 +11 39 20.72629 1.18 2.54 0.009 29 00 00 19.564105 +11 39 20.72772 1.01 ...
WFCS J0000+3918 2358+390 00 00 41.527499 +39 18 04.14617 1.37 2.18 −0.040 45 00 00 41.527528 +39 18 04.14665 0.49 ...
WFCS J0000+5157 2358+516 00 00 51.385221 +51 57 19.89483 7.30 7.88 0.216 11 00 00 51.385344 +51 57 19.89833 14.43 ...
WFCS J0000−1352 2357−141 00 00 03.124573 −13 52 00.76274 3.01 6.45 0.586 19 00 00 03.124493 −13 52 00.75819 1.71 ...
WFCS J0000−3738 2357−379 00 00 08.414694 −37 38 20.70320 8.89 18.68 0.812 32 00 00 08.414234 −37 38 20.68723 3.41 ...
WFCS J0001+1456 2358+146 00 01 32.830898 +14 56 08.07612 1.44 2.83 0.058 28 00 01 32.830859 +14 56 08.07853 0.53 ...
...                      

Note. Only the first 11 columns out of 23 are shown.

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.

Download table as:  DataTypeset image

Table 10 lists 6399 target sources that were observed but have not been detected. It contains their a priori coordinates, the flux densities for matching sources from NVSS and either GB6 or PMN catalogs, and the spectral index. Among 6399 sources not detected in the WFCS campaign, 51 have been detected with other VLBI programs that ran after WFCS. These sources are marked with a flag. The entire table is available in machine-readable format.

Table 10.  The First Eight Rows of the Auxiliary Table with Positions and Flux Densities of 6399 WFCS Program Sources that were Observed, but Not Detected

Source Name IVS Name R.A. Decl. F1.4 F4.8 Sp. ind Flag
WFCScand J0000+0957 2357+096 00 00 02.87 +09 57 06.6 0.3014 0.1030 −0.86  
WFCScand J0000+1114 2358+109 00 00 47.60 +11 14 12.0 0.2019 0.1070 −0.51  
WFCScand J0000+1214 2358+119 00 00 37.80 +12 13 57.0 −9.9 0.0500 −9.9  
WFCScand J0000+2947 2358+295 00 00 57.60 +29 47 58.0 0.1298 0.0530 −0.72  
WFCScand J0000+5539 2357+553 00 00 19.40 +55 39 03.0 1.5178 0.4650 −0.95  
WFCScand J0000+6020 2357+600 00 00 32.40 +60 20 55.0 0.3741 0.1570 −0.70  
WFCScand J0000+6126 2358+611 00 00 58.30 +61 26 10.0 0.4360 0.1740 −0.74  
WFCScand J0000−1423 2358−146 00 00 40.22 −14 23 47.2 0.3088 −9.9 −9.9  

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.

Download table as:  DataTypeset image

Footnotes

Please wait… references are loading.
10.3847/1538-3881/abc4e1