
Target Selection and Validation of DESI Luminous Red Galaxies


Published 2023 January 18 © 2023. The Author(s). Published by the American Astronomical Society.
Focus on Early Data from the Dark Energy Spectroscopic Instrument (DESI)
Citation: Rongpu Zhou et al 2023 AJ 165 58. DOI: 10.3847/1538-3881/aca5fb


1538-3881/165/2/58

Abstract

The Dark Energy Spectroscopic Instrument (DESI) is carrying out a five-year survey that aims to measure the redshifts of tens of millions of galaxies and quasars, including 8 million luminous red galaxies (LRGs) in the redshift range 0.4 < z ≲ 1.0. Here we present the selection of the DESI LRG sample and assess its spectroscopic performance using data from Survey Validation (SV) and the first two months of the Main Survey. The DESI LRG sample, selected using g, r, z, and W1 photometry from the DESI Legacy Imaging Surveys, is highly robust against imaging systematics. The sample has a target density of 605 deg⁻² and a comoving number density of 5 × 10⁻⁴ h³ Mpc⁻³ in 0.4 < z < 0.8; this is a significantly higher density than previous LRG surveys (such as SDSS, BOSS, and eBOSS) while also extending to z ∼ 1. After applying a bright star veto mask developed for the sample, 98.9% of the observed LRG targets yield confident redshifts (with a catastrophic failure rate of 0.2% in the confident redshifts), and only 0.5% of the LRG targets are stellar contamination. The LRG redshift efficiency varies with source brightness and effective exposure time, and we present a simple model that accurately characterizes this dependence. In the appendices, we describe the extended LRG samples observed during SV.


1. Introduction

Galaxy redshift surveys have been established as a pillar of observational cosmology over the past several decades. The large-scale structures traced by galaxies reveal the imprint of baryon acoustic oscillations (BAO), a feature that can be used to measure the expansion history of the universe. The redshift space distortions (RSD) caused by the peculiar velocities of galaxies enable measurements of the growth of the large-scale structure and tests of general relativity.

Luminous red galaxies (LRGs) are an important type of galaxies for large-area redshift surveys, and are specifically selected for observations because of two main advantages: (1) they are bright galaxies with the prominent 4000 Å break in their spectra, thus allowing for relatively easy target selection and redshift measurements; and (2) they are highly biased tracers of the large-scale structure, thus yielding a higher signal-to-noise ratio (S/N) per object for the BAO measurement than typical galaxies. In addition to the cosmological constraints from BAO and RSD, there will be significant gains in constraining powers when the LRG sample is combined with other observations, e.g., using the LRGs (and their massive dark matter halos) as gravitational lenses of background galaxies and the cosmic microwave background (e.g., Mandelbaum et al. 2013; Singh et al. 2020; White et al. 2022).

The Dark Energy Spectroscopic Instrument (DESI; DESI Collaboration et al. 2016a, 2016b; DESI Collaboration et al. 2023, in preparation) is undertaking the largest galaxy redshift survey to date, and LRGs will be the primary galaxy targets that DESI will observe in the redshift range 0.4 < z ≲ 1.0. Compared to LRG samples from previous surveys, such as the Sloan Digital Sky Survey (SDSS) LRG survey (Eisenstein et al. 2001, 2005), BOSS (Reid et al. 2016; Alam et al. 2017), and eBOSS (Prakash et al. 2016; eBOSS Collaboration et al. 2021), the DESI LRG sample has a significantly higher target density and extends to higher redshifts (see Figure 1). This is made possible by DESI's higher fiber multiplexing, larger telescope aperture, and better spectroscopic performance, and by the availability of the deeper (and highly uniform) imaging data necessary for target selection.

Figure 1. The redshift distribution of the DESI LRG sample and its comparison with LRG samples from earlier surveys. The y-axis is the number of objects in each redshift bin (of width Δz = 0.05) per deg². The survey area and the total number of LRGs that have been or will be observed in each survey are listed in the legend. The dashed curve corresponds to the redshift distribution of a hypothetical sample with a constant comoving density of 5 × 10⁻⁴ h³ Mpc⁻³, which is approximately the DESI LRG target density in the redshift range 0.4 < z < 0.8; the area under the curve is proportional to the enclosed comoving volume. We describe how we obtain the redshifts for DESI LRGs in Section 4.1.
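As a rough illustration of the dashed curve, the sketch below converts a constant comoving density into counts per deg² per redshift bin. It assumes a flat ΛCDM cosmology with Ωm = 0.3, which is a hypothetical choice made here for illustration; the density and bin width come from the caption, and the function names are not from the paper.

```python
import numpy as np

# Assumed flat LambdaCDM with Omega_m = 0.3 (hypothetical; not taken from the paper).
OMEGA_M = 0.3
D_H = 2997.92458                   # Hubble distance c/H0 in h^-1 Mpc
N_COMOVING = 5e-4                  # comoving density in h^3 Mpc^-3 (from the caption)
DZ = 0.05                          # redshift bin width (from the caption)
SR_PER_DEG2 = (np.pi / 180.0) ** 2 # steradians per square degree


def e_of_z(z):
    """Dimensionless Hubble parameter E(z) = H(z)/H0 for flat LambdaCDM."""
    return np.sqrt(OMEGA_M * (1.0 + z) ** 3 + 1.0 - OMEGA_M)


def comoving_distance(z, n_steps=2048):
    """Line-of-sight comoving distance in h^-1 Mpc (trapezoidal integration)."""
    grid = np.linspace(0.0, z, n_steps)
    integrand = 1.0 / e_of_z(grid)
    step = grid[1] - grid[0]
    return D_H * step * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))


def counts_per_bin_per_deg2(z):
    """Objects per deg^2 in a bin of width DZ centred on z, at constant comoving density."""
    # Comoving volume element in (h^-1 Mpc)^3 per steradian per unit redshift.
    dv_dz_dsr = D_H * comoving_distance(z) ** 2 / e_of_z(z)
    return N_COMOVING * dv_dz_dsr * SR_PER_DEG2 * DZ


for z in (0.4, 0.6, 0.8, 1.0):
    print(f"z = {z:.1f}: {counts_per_bin_per_deg2(z):.1f} per deg^2 per bin")
```

Because distances are expressed in h⁻¹ Mpc, the result does not depend on the value of H0; only the assumed Ωm affects the shape of the curve.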

In this paper, we describe the selection of the DESI LRG targets and assess the selection uniformity and spectroscopic performance. Significant efforts were made to minimize the impact of imaging systematics (i.e., the modulation of the target density caused by variations in image quality across the sky). These include improvements to the image reduction pipeline, which were motivated by the need for uniform target selection and are discussed in D. Schlegel et al. (2023, in preparation), as well as making careful choices for target selection. We will describe these choices for the DESI LRG sample in this paper.

The structure of the paper is as follows. We describe the imaging data, selection cuts, stellar mass completeness, and veto masks in Section 2. We assess potential imaging systematics in Section 3. In Section 4, we evaluate the spectroscopic redshift efficiency and model its dependence on source brightness and exposure time. We summarize our work in Section 5. In the appendices, we describe the selections and redshift performance of the extended LRG samples observed before the start of the Main Survey, specifically during Survey Validation ("SV1") and the 1% Survey ("SV3"); these observations were done separately and are not included in the Main LRG sample (which we simply refer to as DESI LRGs).

This paper is part of a series of papers presenting the selection of DESI targets and their characterization. DESI Collaboration et al. (2023, in preparation) gives an overview of the DESI targets and the Survey Validation program, and Myers et al. (2022) describes how the target selection algorithms are implemented in DESI. The Milky Way Survey (MWS) sample is presented in Cooper et al. (2022), the Bright Galaxy Survey (BGS) sample in Hahn et al. (2022), the emission line galaxy (ELG) sample in Raichoor et al. (2022), and the quasar (QSO) sample in Chaussidon et al. (2022). Lan et al. (2022) and Alexander et al. (2022) describe the creation of spectroscopic truth tables based on visual inspections of the galaxy (BGS, LRG, and ELG) spectra and the QSO spectra, respectively.

2. Target Selection

2.1. Imaging Data

The LRG targets are selected from the DESI Legacy Imaging Surveys Data Release 9 (Dey et al. 2019; D. Schlegel et al. 2023, in preparation, hereafter LS DR9), specifically the g, r, and z optical bands (4000–10000 Å, without the i band at 7800 Å) and forced photometry of the Wide-field Infrared Survey Explorer (WISE, Wright et al. 2010) W1 band in the infrared (3.4 μm). The imaging footprint is shown in Figure 2. Table 1 lists the sky areas and other summary information about the DESI LRG sample.

Figure 2. The footprint of the DESI Legacy Imaging Surveys DR9, with the colors representing the surface density (in deg⁻²) of the LRG targets (after applying the LRG veto masks). DESI will only observe regions with decl. > −20°, and the DESI footprint also avoids regions close to the edge of the imaging footprint. See the actual DESI footprint in E. Schlafly et al. (2023, in preparation). This density map is computed with a HEALPix resolution of NSIDE = 256, and we only plot pixels that are >20% occupied by the imaging survey footprint. The curve that separates the two regions is the Galactic plane.

Table 1. Useful Information about DESI and the LRG Targets

Area in the imaging footprint                   19,700 deg²
Area in the DESI survey                         14,800 deg²
Fraction of area in target mask                 1.0%
Fraction of area in LRG veto mask               8.5%
Target density                                  605 deg⁻²
Spectroscopically confirmed star fraction       0.5%
Spectroscopically confirmed quasar fraction     1.6%
Fraction rejected by redshift quality cut       1.1%
Fraction of catastrophic redshift failures      0.2%

Note. The areas include all regions with optical (grz) coverage without any masking. The area in the DESI survey is approximate. The LRG veto mask includes the target mask. The target density is the average density over the DESI footprint. The target density and all the spectroscopy/redshift-related values are calculated after the LRG veto mask is applied. The catastrophic redshift failure rate is calculated after applying the redshift quality cut.

The optical grz imaging consists of two regions separated at decl. δ ≃ 32°, with each region observed by different telescopes (with similar grz filter sets). The southern (δ ≲ 32°) part of the imaging footprint (hereafter "the South") is observed by DECam on the 4 m Blanco Telescope at the Cerro Tololo Inter-American Observatory (CTIO). Most of the observation is done by the DECam Legacy Survey (DECaLS, Dey et al. 2019), and data from other observations, most importantly the Dark Energy Survey (DES, The Dark Energy Survey Collaboration 2005), are also used. The northern (δ ≳ 32°) part of the imaging footprint (hereafter "the North"), which is inaccessible from CTIO, is observed by two telescopes at the Kitt Peak National Observatory in two surveys: the Beijing–Arizona Sky Survey (BASS, Zou et al. 2017) observed in the g and r bands using the 90Prime Camera on the 2.3 m Bok Telescope, and the Mayall z-band Legacy Survey (MzLS) observed in the z band using the Mosaic-3 Camera on the 4 m Mayall Telescope. The Mayall Telescope has since been repurposed for DESI.

For the same photometric band, small differences in the filter sets, detectors, observing conditions, and image processing between the North and the South cause their photometry to differ slightly. For the LRGs, these differences are often nonlinear functions of magnitude and color, and cannot be quantified by a constant offset. Thus we find it necessary to implement slightly different color cuts for each region. The cuts are optimized using both photometric and spectroscopic data to achieve uniform density and redshift distributions across the footprint. The specific color cuts are described in the next subsection.

2.2. Selection Cuts

The LRG targets are selected using optical photometry in the grz bands and near-infrared photometry in the WISE W1 band. The LRG selection cuts for the South, shown in Figure 3, are

Equation (1a)

Equation (1b)

Equation (1c)

Equation (1d)

where g, r, z, and W1 are magnitudes and $z_{\mathrm{fiber}}$ is the z-band fiber magnitude, i.e., the magnitude corresponding to the expected flux within a DESI fiber. Throughout this paper, unless otherwise specified, all the magnitudes are in the AB system and are corrected for Galactic extinction based on Schlegel et al. (1998) with correction from Schlafly & Finkbeiner (2011).

Figure 3. Selection cuts for the LRG targets in the South footprint. The points are color-coded by their redshifts measured by DESI. The upper left panel shows the stellar rejection cut, with gray points representing stars (which are plotted to show the stellar locus and are not LRG targets). The upper right panel shows the cut that removes lower-redshift and bluer galaxies. The lower left panel shows the sliding color–magnitude cut that serves as the luminosity cut and also shapes the redshift distribution; the "knee" at W1 ≃ 19 introduces more galaxies at higher redshift. The lower right panel shows the magnitude limit in z-band fiber magnitude that ensures enough S/N for DESI observations.

Equation (1b) utilizes the 1.6 μm (rest-frame) "bump" (John 1988; Sawicki 2002) to efficiently remove stars from the sample (as shown in the upper left panel of Figure 3), similar to the stellar rejection cut in Prakash et al. (2015), resulting in a low stellar contamination rate of ∼0.5% (see Section 4.2). Equation (1c) removes galaxies at lower redshifts while retaining high completeness of massive galaxies at z > 0.4. The sliding color–magnitude cuts in Equation (1d) function as luminosity cuts: as shown in the lower left panel of Figure 3, the r − W1 color is a good proxy for redshift, and the W1 magnitude limit that shifts with r − W1 effectively selects the most luminous (in the observed W1 band) galaxies at any redshift. For objects with r − W1 ≳ 3.3, which are the faintest LRGs in our sample and are near the faint limit (Equation (1a)), the r − W1 versus W1 sliding cut is dropped to boost the number density of the highest-redshift LRGs. The cuts are tuned so that the comoving number density of the selected sample is close to constant at 0.4 < z < 0.8.

Finally, Equation (1a) sets the faint limit for the sample to ensure a high redshift success rate for DESI spectroscopic observations (see Sections 4.3 and 4.4). We choose the fiber magnitude over the total magnitude as the faint limit because the former is much more strongly correlated with the spectroscopic S/N. Note that the fiber magnitude cut, which is effectively a morphology cut, could introduce systematics in both sample selection (e.g., systematics dependence on seeing) and theoretical modeling (as galaxy orientation could be aligned with the tidal field; e.g., see Lamman et al. 2022), and such effects will need to be carefully studied in the clustering analysis.

The photometry in the North is slightly different from that in the South, and the selection cuts are tuned to match the number density and the redshift distribution in the South. The cuts for the North are

Equation (2a)

Equation (2b)

Equation (2c)

Equation (2d)

In addition to the aforementioned cuts, we apply the following quality cuts (e.g., to remove objects with bad photometry) in the target selection pipeline (Myers et al. 2022). We require that each object be observed at least once in all of the three optical bands. We also require that the inverse-variance values for r, z, and W1 fluxes be positive; this rejects problematic imaging data. A small number of stars are not removed by the aforementioned selection cuts due to saturation in the imaging. We remove them by requiring $z_{\mathrm{fibertot}}$ > 17.5, and if Gaia (Gaia Collaboration et al. 2018) photometry is available, we also require $G_{\mathrm{Gaia}}$ > 18.
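The quality cuts above can be sketched as a boolean mask over a catalog. This is an illustrative sketch rather than the pipeline's implementation: the column names are assumed to follow the LS DR9 catalog conventions, fluxes are assumed to be in nanomaggies (zero-point 22.5 mag), and a Gaia G magnitude of 0 is assumed to mark objects without Gaia photometry.

```python
import numpy as np


def lrg_quality_mask(cat):
    """Boolean mask of objects passing the quality cuts described in the text.

    `cat` maps column names to equal-length arrays. Column names, flux units,
    and the "0 means no Gaia match" convention are assumptions of this sketch.
    """
    cat = {key: np.asarray(val) for key, val in cat.items()}
    mask = np.ones(len(cat["FIBERTOTFLUX_Z"]), dtype=bool)

    # At least one observation in each of the three optical bands.
    for band in ("G", "R", "Z"):
        mask &= cat[f"NOBS_{band}"] > 0

    # Positive inverse variance for the r, z, and W1 fluxes.
    for band in ("R", "Z", "W1"):
        mask &= cat[f"FLUX_IVAR_{band}"] > 0

    # z_fibertot > 17.5 mag, written as a flux limit (100 nanomaggies).
    bright_flux_limit = 10.0 ** (-0.4 * (17.5 - 22.5))
    mask &= cat["FIBERTOTFLUX_Z"] < bright_flux_limit

    # Gaia G > 18, applied only where Gaia photometry is available.
    gaia_g = cat["GAIA_PHOT_G_MEAN_MAG"]
    mask &= (gaia_g == 0) | (gaia_g > 18.0)

    return mask
```

Expressing the bright limit in flux rather than magnitude avoids taking the logarithm of zero or negative fluxes.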

The DESI target selection pipeline also removes objects close to bright stars (in Gaia and Tycho-2), star clusters, and large galaxies, which are flagged by the LS DR9 MASKBITS 1, 12, and 13, respectively. These masks are very minimal and only remove regions with the worst contamination, and additional masking is needed for the clustering analysis. We describe these additional masks in Section 2.4.

The selection cuts for the Survey Validation and the 1% Survey samples are described in Appendices A and B, respectively.

2.3. Stellar Mass Completeness

In order to accurately model galaxy clustering (e.g., Zu & Mandelbaum 2015; Zhou et al. 2021) and study galaxy–galaxy lensing (e.g., Alam et al. 2017; Jullo et al. 2019), halo occupation distribution (e.g., Rodríguez-Torres et al. 2016), and evolution of the most massive galaxies (e.g., Bundy et al. 2017), it is desirable to have a large spectroscopic sample of strongly clustered galaxies with well defined stellar populations. The target selection cuts for the DESI LRG sample have therefore been optimized to select the most massive galaxies with a high degree of completeness. We define completeness as the ratio of the number of selected LRGs to the total number of galaxies brighter than the LRG magnitude limit (defined by Equations (1a) and (2a)). In this section we refer to objects that satisfy our stellar rejection cut (defined by Equation (1b)) as "galaxies."

The cuts defined by Equations (1c) and (2c) reject objects with low redshifts while retaining the most massive galaxies for redshifts over 0.4. The design of these cuts was guided by estimates of stellar masses of galaxies obtained using a random forest algorithm (Breiman 2001) trained on DESI Legacy Imaging Survey photometry and stellar masses of galaxies from Bundy et al. (2015). A detailed description of the method used to obtain the stellar masses can be found in Appendix C.

Figure 4 shows the stellar mass completeness of the DESI LRG sample as a function of both stellar mass and redshift. As spectroscopic redshifts are only available for some of the selected LRGs and not the magnitude-limited sample, we use the photometric redshifts in LS DR9 (Zhou et al. 2021) for this analysis. The selection cuts result in a sample that is highly complete for the most massive galaxies (i.e., ${\mathrm{log}}_{10}({M}_{\star }[{M}_{\odot }])\gt 11.5$) in the redshift range 0.4–1.0. The completeness decreases significantly for redshifts lower than 0.4 but the decrease is less severe for redshifts above 1.0. This high mass completeness is one of the defining characteristics of the DESI LRG sample and will aid a multitude of scientific studies.
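The completeness defined in this section can be sketched as a cumulative ratio: among galaxies in the magnitude-limited sample above a given stellar mass, the fraction selected as LRGs. The function below is a minimal illustration with hypothetical names; in practice it would be evaluated separately in each photometric-redshift bin.

```python
import numpy as np


def cumulative_completeness(log_mass, is_lrg, thresholds):
    """Fraction of magnitude-limited galaxies above each mass threshold that are LRGs.

    log_mass   : log10 stellar mass of every galaxy in the magnitude-limited sample
    is_lrg     : boolean flag, True where the galaxy passes the LRG selection
    thresholds : log10 stellar-mass thresholds at which to evaluate the fraction
    Returns NaN for thresholds with no galaxies above them.
    """
    log_mass = np.asarray(log_mass, dtype=float)
    is_lrg = np.asarray(is_lrg, dtype=bool)
    completeness = np.full(len(thresholds), np.nan)
    for i, threshold in enumerate(thresholds):
        above = log_mass > threshold
        if above.any():
            completeness[i] = is_lrg[above].mean()
    return completeness
```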

Figure 4. Stellar mass completeness of the DESI LRG sample as a function of stellar mass and photometric redshift. The dashed curve shows the fraction of galaxies above a given stellar mass that have been selected as DESI LRGs compared to a magnitude-limited sample. The blue histogram shows the distribution of stellar masses of a magnitude-limited sample of galaxies (having the same magnitude limit as the DESI LRG sample) whereas the black histogram denotes the subset of galaxies that have been selected as DESI LRGs. The stellar masses were obtained using a random forest-based algorithm (described in Appendix C) and the photometric redshifts are from Zhou et al. (2021). As spectroscopic redshifts are not available for the magnitude-limited sample, we use photometric redshifts for this demonstration. The figure uses objects from both the North and the South where valid photometry in g, r, z, W1, and W2 is available. The selected sample is highly complete for the most massive galaxies (i.e., ${\mathrm{log}}_{10}({M}_{\star }[{M}_{\odot }])\gt 11.5$) in the redshift range 0.4–1.0. The completeness decreases significantly for redshifts lower than 0.4 but the decrease is less steep for redshifts above 1.0.

2.4. Veto Masks for Clustering Analysis

Here we describe the additional veto masks specifically optimized for the LRG targets to create a clean sample for the DESI clustering analysis. They are mostly masks of bright stars, although they also include masking for large galaxies and star clusters, etc. The veto masks comprise four separate sets of masks:

  1. unWISE (Meisner et al. 2019) pixel-level bitmask: we use all but bit five ("AllWISE-like circular halo") of the collapsed mask bits as listed in Table A4 of Meisner et al. (2019). We exclude bit five because these circular masks are not optimal (either too large or too small, depending on the magnitude of the star) for the LRG targets; they are replaced by the masks in item 2.
  2. WISE circular geometric masks: these masks replace the "AllWISE-like circular halo" masks in unWISE. The relation between radius and W1 magnitude is optimized for the LRGs, so that the excess or deficit of LRG targets at the edge of the mask is less than ∼10%.
  3. Gaia/Tycho-2 circular masks: similarly, we obtain the relation between radius and magnitude for Gaia stars. We use stars in Gaia Early Data Release 3 (EDR3, Gaia Collaboration et al. 2021) supplemented at the bright end (where Gaia photometry is unreliable) by Tycho-2 and Two Micron All Sky Survey (2MASS) photometry.
  4. Custom masks: these are masks for large galaxies, star clusters, and planetary nebulae that were not masked by the LS DR9 MASKBITS, and regions with other imaging artifacts (identified from visual inspection of regions with high LRG densities). The total area of the custom masks is much smaller than the area of the other masks.

We describe sets 2–4 in more detail in Appendix D. The combined veto masks remove 8.5% of the DESI footprint. Within the masked (contaminated) area, the LRG target density is 1100 deg⁻² and the stellar contamination rate (based on spectroscopic classification) is ∼10%, compared to the 605 deg⁻² density and ∼0.5% stellar contamination in the unmasked ("clean") area. The stellar contamination rate is much higher in the masked region because the photometry (especially in the W1 band) used in the stellar rejection cut (Equation (1b)) is contaminated by the nearby bright stars.

While the full LRG spectroscopic and target catalogs include objects flagged by the LRG veto masks, we recommend that those objects be removed for analyses that require a clean sample with uniform selection. In the rest of the paper, we only use the "clean" LRG sample (with the LRG veto masks applied) instead of the full target/spectroscopic sample.

Finally, we note that the LRG veto masks presented here are not definitive, and they may see further improvements, e.g., for the DESI Year 1 science analyses.

3. Target Selection Systematics

3.1. Imaging and Foreground Systematics

The variation of imaging properties (such as depth and seeing) over the footprint and the presence of astrophysical foregrounds (such as Galactic dust) can imprint on the density of galaxy targets. Here we examine the impact of these systematics on the LRG target density. The systematics properties that we examine here include depth (galaxy depth in grz and point-spread function (PSF) depth in W1), seeing (in grz), Gaia stellar density, and Galactic extinction E(B − V) (from Schlegel et al. 1998).

While the photometry used in target selection has been corrected for Galactic extinction, we include E(B − V) here because imperfections in the correction, e.g., due to errors in the dust map, can still bias the photometry and affect the target density.

We use the STARDENS values from Myers et al. (2022) for Gaia stellar density, and we use values from the imaging catalog for all other systematics properties (GALDEPTH, PSFDEPTH, PSFSIZE, and EBV).

Figure 5 shows the dependence of LRG target density on the imaging and foreground systematics. The LRG sample is much brighter than the imaging detection limit (with the faintest LRG targets being at least ∼2 mag brighter than the median z-band 5σ detection limit), and the stellar rejection cut and the LRG veto masks efficiently remove the contamination caused by stars. Therefore the LRG sample is relatively robust against these imaging/foreground systematics. The density deviations caused by these systematics are almost all within ±5%.

Figure 5. Density of LRG targets in bins of imaging/foreground systematics values in the three imaging regions. (DECaLS and DES are both observed with DECam and have the same selection cuts and linear regression coefficients, but DES is significantly deeper and we plot it as a separate region to illustrate the difference.) The error bars represent "the error of the mean" assuming a Gaussian distribution. The histograms show the distribution of each systematics property for each imaging region. "Galaxy depth" is based on the "GALDEPTH" value in the LS DR9 catalog: it is the 5σ detection magnitude of an ELG-like galaxy and it assumes zero Galactic extinction; to account for Galactic extinction, we add an E(B − V) term to obtain the imaging depth relevant for extragalactic sources. "PSF size" is the PSF FWHM and measures the seeing. The trends are computed using a HEALPix density map with NSIDE = 512 by averaging over the pixels in bins of imaging/foreground properties.

The systematics trends in Figure 5 can be almost completely removed via linear regression of the systematics properties:

Equation (3)

where $N_{\mathrm{predict},k}$ is the "predicted" number of LRG targets in the kth HEALPix pixel, $S_{i,l}$ is the value of the ith systematics property of the lth random point, $N_{\mathrm{rand},k}$ is the number of random points in the kth pixel, and $c_i$ are the coefficients. We use the random catalogs in LS DR9. The "corrected" systematics trends are shown in Figure 6. For the linear regression, we include all systematics properties except stellar density, because (1) the stellar contamination rate is already very low and (2) the stellar density maps are inherently noisy and the stars in the Gaia catalog (on which the stellar density map is based) are much brighter than the stars that contaminate the LRG targets. Indeed we find little correlation with stellar densities after applying the systematics weights. The coefficients for the North and the South are different, but DECaLS and DES are treated as one sample in the linear regression and have the same coefficients. The LRG density in the DES region is ∼3% lower than the average density due to its deeper photometry, and the linear regression weights accurately predict this density difference.
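The idea of the linear regression correction can be illustrated with a simplified sketch. Unlike Equation (3), which sums over individual random points, the version below assumes that each HEALPix pixel is summarized by the average value of each systematics property and fits the pixel density directly; the function names and the weight definition (mean density divided by predicted density) are assumptions made for illustration.

```python
import numpy as np


def fit_linear_systematics(density, systematics):
    """Least-squares fit of density_k ~ c0 + sum_i c_i * S_{i,k} over pixels.

    density     : (n_pix,) target density per HEALPix pixel
    systematics : (n_pix, n_sys) pixel-averaged systematics values
    Returns the coefficients (c0 first) and the predicted density per pixel.
    """
    density = np.asarray(density, dtype=float)
    systematics = np.asarray(systematics, dtype=float)
    # Design matrix with a leading column of ones for the constant term.
    design = np.column_stack([np.ones(len(density)), systematics])
    coeffs, *_ = np.linalg.lstsq(design, density, rcond=None)
    return coeffs, design @ coeffs


def systematics_weights(density, predicted):
    """Per-pixel weights that flatten the predicted trend (mean density / prediction)."""
    return np.mean(density) / np.asarray(predicted, dtype=float)


# Synthetic example: density depends linearly on two made-up systematics.
rng = np.random.default_rng(0)
sys_maps = rng.normal(size=(500, 2))
dens = 600.0 + 20.0 * sys_maps[:, 0] - 10.0 * sys_maps[:, 1] + rng.normal(scale=5.0, size=500)
coeffs, predicted = fit_linear_systematics(dens, sys_maps)
weights = systematics_weights(dens, predicted)
print("fitted coefficients:", np.round(coeffs, 2))
```

Multiplying the observed pixel densities by these weights removes the fitted linear trend while preserving the mean density.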

Figure 6. As Figure 5 but with linear regression weights applied. The stellar density is not included in the parameters for the linear regression.

There is an unexpected dependence on E(B − V) that remains after correcting for depth and seeing dependence: the LRG density at E(B − V) ≲ 0.015 is more than 5% lower than average. We also see similar trends at very low E(B − V) in the other DESI tracers (which have very different selections and redshifts), so it is unlikely to be a statistical fluke. While we are not certain what causes this drop in target density at very low E(B − V), we speculate that it is caused by systematics in the Galactic extinction map. Specifically, the E(B − V) map from Schlegel et al. (1998) is based on dust emission in the far infrared (FIR), which may have been contaminated by FIR emissions from background galaxies. The remaining trends (at higher E(B − V)) might be due to effects dependent on the spectral energy distribution (SED): ideally, we would calculate the extinction coefficients for each galaxy based on its SED, but in practice we use a single stellar spectrum for calculating the extinction coefficients, and this could lead to systematic errors in the dereddened fluxes. We will examine the Galactic dust-related systematics in future investigations.

Note that while the aforementioned linear weights work well for the projected density of the full LRG sample, for subsets of the sample (e.g., in redshift bins) the coefficients and weights should be recomputed for each subset, because different subsamples are affected by the selection cuts differently (e.g., brighter subsamples might be more sensitive to the sliding cut in r − W1 versus W1 while fainter subsamples might be more sensitive to the $z_{\mathrm{fiber}}$ faint limit) and have different sensitivities to the different systematics.

There were no formal requirements on DESI targeting with respect to these systematic trends in target density. Instead, this analysis was used as a diagnostic to ensure no trends existed beyond expectations, e.g., from previous surveys. One can observe that the trends are generally more moderate than those found in BOSS and eBOSS (Ross et al. 2017, 2020). Further, we have demonstrated that a simple linear regression, similar again to that applied to BOSS/eBOSS, successfully models the trends. The exception is a relationship with E(B − V) that should be studied in more detail in future DESI work. The details and modeling here thus represent a first step in the process of determining the selection function for the DESI LRG sample, a vital component for producing 3D clustering measurements once the redshifts for the LRG sample have been measured, and a reference that will aid future steps.

3.2. Zero-point Sensitivities

Another source of imaging systematics is the zero-point (ZP) uncertainty—the uncertainty of the photometric zero-point for each exposure of the imaging data. While depth and seeing can be accurately measured and their effects on targeting can in principle be modeled, the ZP uncertainty is a systematic uncertainty that is mostly due to imperfect modeling of the observing conditions. Thus, the imprint of ZP uncertainties on the sky is difficult, if not impossible, to correct for. We design the LRG target selection to be as insensitive to the ZP uncertainties as possible.

One way to quantify the effects of ZP uncertainties on the LRG targets is to estimate the level of fluctuation in target density caused by the ZP uncertainties. The estimated ZP uncertainties in g, r, z, and W1 are 3, 3, 6, and 1 mmag, respectively (D. Schlegel et al. 2023, in preparation). For the LRG selection, a net change of +10 mmag in g, r, z, and W1 causes changes of +0.11%, +1.40%, −1.23%, and −2.89% in target density, respectively. If we assume that the ZP errors in the four bands are all uncorrelated with each other, we can treat the combined effect on the target density as a sum of independent Gaussian random variables, and the resulting rms of the target density is 0.9%. During Survey Validation, we considered an alternative LRG selection that implements the luminosity cut using z-band fiber magnitude and r − z color (see Appendix A), and this selection has a much larger rms of ∼4% due to the large ZP uncertainty in the z band. (The z-band photometric calibration has large uncertainties mainly because the effective z-band filter transmission can vary significantly due to absorption by telluric water vapor.) Its insensitivity to zero-point errors is the main reason why we chose to adopt the WISE-based luminosity cut.
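The quoted 0.9% rms can be reproduced from the numbers in this paragraph. The short check below assumes, beyond what the text states, that the density response is linear in the zero-point shift, so that the per-band response is the quoted change per 10 mmag scaled by the ZP uncertainty; the variable names are illustrative.

```python
import numpy as np

# Values quoted in the text, in the order g, r, z, W1.
zp_sigma_mmag = np.array([3.0, 3.0, 6.0, 1.0])
density_change_pct_per_10mmag = np.array([0.11, 1.40, -1.23, -2.89])

# Assumed linear response: scale each sensitivity by the band's ZP uncertainty.
per_band_pct = density_change_pct_per_10mmag * zp_sigma_mmag / 10.0

# Uncorrelated Gaussian errors add in quadrature.
rms_pct = np.sqrt(np.sum(per_band_pct ** 2))

for band, value in zip(("g", "r", "z", "W1"), per_band_pct):
    print(f"{band:>2}: {value:+.3f}%")
print(f"combined rms: {rms_pct:.2f}%")
```

The z band contributes the largest single term even though the W1 sensitivity per mmag is higher, because its ZP uncertainty is six times larger.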

4. Spectroscopic Assessment

4.1. Spectroscopic Data

We use spectroscopic data from SV1, SV3, and the first two months of the Main Survey. We only select LRG+QSO tiles in SV1 and dark tiles in SV3 and the Main Survey. The sky coverage of the spectroscopic LRGs is shown in Figure 7. Figure 8 shows some example spectra of LRGs observed during the Main Survey and the image cutouts (from the Legacy Surveys Viewer).

Figure 7. LRGs observed by DESI during Survey Validation and the first two months of the Main Survey. We use these data for evaluating the LRG redshift performance.


Figure 8. Example spectra and image cutouts of DESI LRGs that were observed in the Main Survey to nominal spectroscopic depth. The observed and model spectra are convolved with a Gaussian kernel with σ = 2.4 Å to reduce the noise. The three spectra from the B/R/Z spectrographs are coadded into a single spectrum in the figure. The target ID, g/r/z/W1 magnitudes, $z_{\mathrm{fiber}}$ magnitude, best-fit redshift, best-fit spectral type, ZWARNING flag, and Δχ² values are listed for each object. Major absorption and emission lines, which are taken from the DESI visual inspection tool Prospect (https://github.com/desihub/prospect), are shown as green dashed lines. The image cutouts are 34″ × 34″ composites in g/r/z (top) and W1/W2 (bottom).

SV1 has several flavors of coadds, and we use the single-exposure coadds, the nominal (1×) depth coadds, and the cumulative (deep) coadds. LRG targets are assigned the target bit 20 in all three programs. A significant number of brighter LRGs are also observed in the BGS program under very different observing conditions, and we do not include them here. We remove objects affected by instrument issues by requiring that the COADD_FIBERSTATUS value in the catalogs is equal to 0. We apply the veto mask (Section 2.4) to create a clean sample.

The redshift fitting is done using Redrock (S. Bailey et al. 2023, in preparation). It uses 1D spectra produced by the DESI spectroscopic pipeline (Guy et al. 2022) as input. For each spectrum, it computes the best-fit redshift and Δχ2, which is the difference in χ2 between the best-fit model and the second best-fit model and is an indication of the reliability of the best-fit redshift.

We use a sample of "true" redshifts as the reference sample for measuring the catastrophic failure rates. While the redshifts from visual inspection (Lan et al. 2022) could also be used as true redshifts, they are only available for a few thousand SV LRGs and a few hundred Main Survey LRGs. Therefore we use the much larger sample of redshifts from deep coadded SV1 spectra as the true redshifts. We require a minimal effective exposure time (teff, see its definition in Guy et al. 2022) of 3000 s, which is three times the DESI nominal depth of teff = 1000 s. We validate these deep redshifts by comparing them with the redshifts from visual inspection, and we find that the redshifts disagree for less than 0.5% of the Main Survey LRGs.

For assessing the spectroscopic performance at nominal depth, we require teff > 800 s; for the SV1 coadds (which have a wider range of teff than the Main Survey), we also require teff < 1200 s. To assess whether a redshift (e.g., obtained at nominal depth) is correct, we compare it with the redshift of the same object from the deep coadded spectra. If it differs from the deep redshift by more than 1000 km s−1, that redshift is considered a "catastrophic redshift failure." (A small number of deep redshifts have ZWARN ≠ 0 or zredrock > 1.5 and are likely not reliable. We treat the corresponding nominal-depth redshifts as catastrophic failures regardless of their redshift values.)
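As an illustration of this criterion, the following minimal sketch flags a redshift as a catastrophic failure. The text quotes only the 1000 km s−1 threshold; the conversion from a redshift difference to a velocity, c|Δz|/(1 + z_deep), is a common convention that we assume here, and the function names are hypothetical.

```python
C_KM_S = 299792.458  # speed of light in km/s


def velocity_offset_km_s(z, z_ref):
    """Velocity offset between two redshifts, c * |dz| / (1 + z_ref).

    This convention is an assumption; the text only states the threshold.
    """
    return C_KM_S * abs(z - z_ref) / (1.0 + z_ref)


def is_catastrophic(z, z_deep, deep_zwarn=0, threshold_km_s=1000.0):
    """Return True if z counts as a catastrophic failure relative to z_deep."""
    # Unreliable deep redshifts (ZWARN != 0 or z_deep > 1.5) are treated as
    # failures regardless of the nominal-depth redshift value.
    if deep_zwarn != 0 or z_deep > 1.5:
        return True
    return velocity_offset_km_s(z, z_deep) > threshold_km_s


print(is_catastrophic(0.7001, 0.7))  # small offset
print(is_catastrophic(0.71, 0.7))    # large offset
```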

Figure 1 shows the redshift distribution of LRGs observed in the first two months of the Main Survey, which covers roughly 2000 deg$^{2}$. The sample has a roughly constant comoving density of 5 × 10−4 h3 Mpc−3 in 0.4 < z < 0.8 and a high-z tail that extends beyond z = 1.0. Figure 1 excludes a small (1%) fraction of the observed LRG targets that are rejected by the quality cut (see Section 4.3); these objects are the faintest LRG targets and are mostly at the high-redshift end of the sample.

4.2. Spectroscopic Classification

Only 0.5% of the LRG targets are spectroscopically classified by Redrock as stars and 1.7% as QSOs. The rest are classified as galaxies. Almost all the stellar contaminants are cool stars such as M dwarfs whose strong infrared emissions allow them to pass the stellar rejection cut. About 0.6% of the LRG targets are also QSO targets, and 61% of these are spectroscopically classified as QSOs and 2% as stars. If we exclude QSO targets, the stellar fraction is 0.5% and the QSO fraction is 1.2%. A large fraction of the area observed in the first two months of the Main Survey is at relatively low Galactic latitude, and the full DESI footprint will include a larger fraction of high-latitude area where the stellar density is lower. Therefore we expect the 0.5% stellar contamination rate to be an upper bound for the full DESI sample.

Note that in the first two months of the Main Survey observations, the observed LRG targets include a larger fraction (by a factor of ∼2) of objects that are also QSO targets than in the full survey. This is because the overall fiber-assignment completeness is lower at the beginning of the survey and the QSO targets have a higher fiber-assignment priority than the LRGs (see Raichoor et al. 2022). We correct for this by downweighting the QSO targets so that the fractions described above are estimates for the final Main Survey sample.

4.3. Redshift Failure Rate and Quality Cut

The catastrophic redshift failure rate of LRGs observed at nominal depth is 0.7% (110/15,379) from comparison with the deep redshifts. This is consistent with what we find from repeat observations in the overlap between SV3 and the Main Survey: 1.2% (84/7233) of the repeats have different redshifts, which implies a per-object catastrophic failure rate of 0.6% if we assume that the redshift efficiency is the same in SV3 and the Main Survey.
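The step from the pair disagreement rate to the per-object rate can be made explicit. The sketch below assumes that each of the two observations fails independently with the same probability p, and that a failed redshift never agrees with its repeat, so that the disagreement fraction is d = 1 − (1 − p)^2 ≈ 2p. These assumptions and the function name are ours.

```python
import math


def per_object_failure_rate(n_disagree, n_pairs):
    """Invert d = 1 - (1 - p)**2 for the per-object failure rate p.

    Assumes independent failures with equal probability in both
    observations, and that a failed redshift never matches its repeat.
    """
    d = n_disagree / n_pairs
    return 1.0 - math.sqrt(1.0 - d)


p = per_object_failure_rate(84, 7233)
print(f"per-object catastrophic failure rate: {100 * p:.2f}%")
```

For small d the exact inversion and the simple halving d/2 agree closely, which is why the quoted 1.2% translates to about 0.6%.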

To reject incorrect redshifts, we apply the following redshift quality cut (shown in Figure 9): we require Δχ2 > 15, zredrock < 1.5, and the ZWARNING flag ZWARN = 0. The zredrock < 1.5 cut removes the pileup of catastrophic failures at z ∼ 1.6. (While it is not entirely clear what causes this pileup, based on the fact that these objects have mostly noisy and featureless spectra, we speculate that to best match the observed featureless spectra, Redrock finds the best fit at z ≳ 1.5 so that the 4000 Å break feature of the principal component analysis templates is redshifted outside the DESI spectral coverage.) We note that the redshift quality cut is preliminary, and it may change, e.g., as the spectroscopic pipeline and Redrock evolve.
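The quality cut is a simple conjunction of three conditions, sketched below in plain Python. The function names and the toy catalog rows are invented for illustration and are not DESI data.

```python
def passes_quality_cut(delta_chi2, z_redrock, zwarn):
    """Redshift quality cut from the text: dchi2 > 15, z < 1.5, ZWARN == 0."""
    return delta_chi2 > 15 and z_redrock < 1.5 and zwarn == 0


def redshift_success_rate(catalog):
    """Fraction of (delta_chi2, z_redrock, zwarn) rows that pass the cut."""
    n_pass = sum(passes_quality_cut(*row) for row in catalog)
    return n_pass / len(catalog)


# Toy rows (hypothetical values)
toy_catalog = [
    (250.0, 0.72, 0),  # confident redshift
    (9.5, 0.95, 0),    # rejected: low delta chi2
    (40.0, 1.62, 0),   # rejected: in the z >= 1.5 pileup
    (120.0, 0.55, 4),  # rejected: ZWARN flag set
]
print(redshift_success_rate(toy_catalog))
```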

Figure 9. Δχ2 vs. redshift (best-fit from Redrock) at nominal depth. The black lines are the redshift quality cut. We distinguish between correct redshifts and catastrophic redshift failures by using redshifts from the deep coadds as truth.

The redshift quality cut removes 1.1% of the LRGs, 43% (82/191) of which are catastrophic failures. The catastrophic failure rate in the accepted (confident) redshifts is 0.2% (28/15,188). From visual inspection of the spectra and images, we find that a significant fraction of the catastrophic failures that pass the redshift quality cut are blends (i.e., there is more than one object within the fiber diameter), and both redshift solutions produce reasonable fits to the observed spectra.

Hereafter, we refer to the fraction of objects that meets the quality cut as the "redshift success rate" and the fraction that fails the cut as the "failure/rejection rate," and we refer to the incorrect redshifts (as determined by comparing with deep or repeat spectra) as "catastrophic failures."

In the above assessments, we have excluded QSO targets and objects spectroscopically classified as QSOs, because they have much higher redshift failure rates and are a very different population than the "normal" LRGs. The objects that are targeted or classified as QSOs have approximately 10 times higher catastrophic failure rates than the rest of the LRG targets, and they are also about 10 times more likely to fail the redshift quality cut. It should be noted that the distinction between a "galaxy" and "QSO" is not always sharply defined (e.g., Redrock may classify some "galaxies" with features of active galactic nuclei as "QSOs" but not others), and more careful consideration may be needed for the selection of the DESI clustering sample.

4.4. Depth and Magnitude Dependence

The spectroscopic redshifts of the LRGs are primarily based on absorption lines and the 4000 Å break, and sufficient spectroscopic S/N is critical for confident redshift estimation. The S/N mainly depends on two factors: the source brightness and the spectroscopic depth (teff). Here we investigate how the two factors affect the LRG redshift failure rate. (The redshift failure rate also depends on other factors such as the strength of the absorption and emission lines and prominence of the 4000 Å break, but they do not vary significantly within the LRG sample and are thus less important factors for the redshift determination.)

The z-band fiber magnitude (zfiber) is very well correlated with Δχ2 and catastrophic redshift failures, as shown in Figure 10, and the correlation is much stronger than for the fiber magnitudes in the g and r bands. Therefore we use zfiber as the parameter for source brightness. In Figure 11 we show the catastrophic redshift failure rate and rejection rate as a function of zfiber at nominal depth. The error bars show the uncertainty for a binomial distribution: ${\sigma }_{p}=\sqrt{{Np}(1-p)}/N$, where N is the total number of objects and p is the failure/rejection rate. At the zfiber limit, the LRGs have the highest catastrophic failure rate of ∼2% and rejection rate of ∼4%. The catastrophic failure rate after applying the redshift quality cut is much less than 1% at the zfiber limit.
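For reference, the binomial error bar can be computed as in the following sketch, here applied to the overall nominal-depth catastrophic failure counts quoted in Section 4.3 (110 out of 15,379) rather than to a single zfiber bin. The function name is hypothetical.

```python
import math


def binomial_rate_and_error(n_fail, n_total):
    """Return the rate p and its binomial uncertainty sqrt(N p (1 - p)) / N."""
    p = n_fail / n_total
    sigma_p = math.sqrt(n_total * p * (1.0 - p)) / n_total
    return p, sigma_p


p, sigma_p = binomial_rate_and_error(110, 15379)
print(f"p = {100 * p:.2f}% +/- {100 * sigma_p:.2f}%")
```

Note that this expression gives zero uncertainty when no failures are observed, which is consistent with the figures omitting error bars for fractions equal to zero.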

Figure 10. Similar to Figure 9 but with the x-axis replaced by z-band fiber magnitude.

Figure 11. Catastrophic failure rates and rejection rates as a function of z-band fiber magnitude for the nominal exposures. (Error bars are not shown for fractions equal to zero.) The gray crosses are the predicted rejection rates based on zfiber and teff (see Section 4.4). The histogram shows the zfiber distribution of the LRG sample. We restrict to coadds with 800 s < teff < 1200 s.

Figure 12 shows the catastrophic redshift failure rate and the rejection rate as a function of teff. The rejection rate flattens out above teff ≃ 1000 s. And while the catastrophic failure rate and rejection rate increase significantly at teff < 800 s, the catastrophic failure rate of the accepted redshifts remains well below 1%.

Figure 12. Similar to Figure 11, but with the effective exposure time teff on the x-axis (and with the restriction on teff removed). The histogram shows the teff distribution of LRGs in the Main Survey.

For clustering measurements, it is important to correct for redshift incompleteness caused by targeting and observational factors. Here we use the following function of the effective exposure time and z-band fiber flux to predict the LRG redshift failure/rejection probability:

${P}_{\mathrm{fail}}=\exp ({a}_{0}S+{a}_{1})+\frac{{a}_{2}}{{f}_{z,\mathrm{fiber}}/1\,\mathrm{nanomaggy}}$  (4)

where nanomaggy is a linear flux unit used in LS DR9 catalogs (with a flux of 1 nanomaggy corresponding to an AB magnitude of 22.5), $S\equiv ({f}_{z,\mathrm{fiber}}/1\,\mathrm{nanomaggy})\sqrt{{t}_{\mathrm{eff}}/1\,{\rm{s}}}$ is (approximately) proportional to the spectroscopic S/N when the sky is much brighter than the source (which is true for the fainter LRGs), fz,fiber is the z-band fiber flux, and a0, a1, and a2 are constant coefficients.

The exponential term is motivated by the observation that the redshift failure rate decreases exponentially with S. At brighter magnitudes and in deeper exposures (where the per-object failure rate becomes less than ∼1%) the exponential term approaches zero faster than the observed failure rate, so we add the second term a2/fz,fiber to account for the higher observed redshift failure rate. From visual inspection, we find that many of the redshift failures at brighter magnitudes are blends or have problematic spectra due to instrument issues.

The best-fit coefficients are found by minimizing ${\sum }_{i}{({P}_{\mathrm{fail},i}-{Q}_{i})}^{2}$, where Pfail,i is the predicted failure probability of the ith object, and Qi = 0 if the object passes the quality cut and Qi = 1 if it fails the cut. For the fitting we use Main LRGs observed in SV1 and the first two months of the Main Survey, and SV3 LRGs (which are ∼0.1 magnitude fainter than the Main LRGs). To best match the redshift failure rates of the Main Survey, we only use LRGs with 500 s < teff < 2000 s; this prevents the large number of redshift failures at very low teff (mainly from SV1) from dominating the fit. The best-fit coefficients are a0 = −0.0911, a1 = 3.34, and a2 = 0.0228.
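A minimal sketch of the model evaluation and the fitting objective follows. We assume that the failure probability has the form exp(a0 S + a1) + a2/(fz,fiber/1 nanomaggy), which is our reading of the description above (an exponential term that decreases with S, given the negative a0, plus an inverse-flux term). The function names are hypothetical, and the magnitude-to-flux conversion uses the 22.5 mag zero-point of the nanomaggy.

```python
import math

# Best-fit coefficients quoted in the text
A0, A1, A2 = -0.0911, 3.34, 0.0228


def fiber_flux_nanomaggy(zfiber_mag):
    """Convert an AB magnitude to nanomaggies (1 nanomaggy <-> 22.5 mag)."""
    return 10.0 ** ((22.5 - zfiber_mag) / 2.5)


def predicted_failure_prob(zfiber_mag, teff_s, a0=A0, a1=A1, a2=A2):
    """Assumed model form: exp(a0 * S + a1) + a2 / flux, with
    S = flux [nanomaggy] * sqrt(teff [s])."""
    flux = fiber_flux_nanomaggy(zfiber_mag)
    s = flux * math.sqrt(teff_s)
    return math.exp(a0 * s + a1) + a2 / flux


def squared_error_loss(objects, a0, a1, a2):
    """Sum of (P_fail,i - Q_i)^2 over (zfiber_mag, teff_s, q) tuples,
    where q = 1 if the object fails the quality cut and 0 otherwise."""
    return sum(
        (predicted_failure_prob(m, t, a0, a1, a2) - q) ** 2
        for m, t, q in objects
    )


# Predicted rejection probability at the Main Survey limit and nominal depth
print(predicted_failure_prob(21.6, 1000.0))
```

Under this assumed form, the prediction at zfiber = 21.6 and teff = 1000 s is a few percent, comparable to the ∼4% rejection rate at the zfiber limit. In practice the loss could be passed to a general-purpose minimizer to recover the coefficients.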

The gray crosses in Figures 11 and 12 are the predicted failure rates, and they match the observed failure rates (green error bars) very well. Figure 13 shows the observed and predicted dependence of the LRG redshift failure rate on both zfiber and teff. The residuals are negligible in the range of zfiber and teff relevant for LRGs in the Main Survey.

Figure 13. Left: redshift failure rate in bins of spectroscopic depth (teff) and fiber magnitude (zfiber). The horizontal dashed line marks the nominal depth of 1000 s. The vertical line marks the magnitude limit (zfiber < 21.6) of the Main LRGs. At zfiber < 21.6 the failure rate is computed for the combined sample of Main and SV3 LRGs, and at zfiber ≥ 21.6 SV1 LRGs are added. Middle: the redshift failure rate from the model prediction using Equation (4). Right: the residual, i.e., the measured failure rate subtracted from the predicted failure rate. The model fitting is done using Main and SV3 LRGs with teff > 500 s, which is why the prediction is less accurate at lower teff and in the faintest magnitude bin.

4.5. Fiber-to-fiber Variation in Failure Rate

For the DESI clustering analysis, it is important that variations in the spectroscopic efficiency of each fiber do not imprint on the measured galaxy densities. For example, Ross et al. (2012) found that the redshift efficiency of the BOSS CMASS sample varies with the fiber location on the focal plane. To quantify the per-fiber efficiency, we compute the average LRG redshift failure rate for each fiber using observations during the Main Survey. The per-fiber LRG failure rate is shown in Figure 14 (left panel), and we do not see clear patterns in the fiber efficiency given the statistical uncertainties. To more rigorously assess whether the observed failure rates are consistent with uniform fiber efficiency, we perform Monte Carlo simulations with the per-object failure probability given by the model described in Section 4.4. We find that the observed distribution of the measured per-fiber failure rates is mostly consistent with the simulated distributions for all except a few fibers, as shown in Figure 14 (right panel), indicating very uniform spectroscopic efficiencies for almost all the fibers. We will revisit the fiber-to-fiber uniformity when more data become available.
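The Monte Carlo test can be sketched as follows. This is a toy version under stated assumptions: every fiber is given 70 observations with a flat 1% failure probability (loosely mirroring the median number of observations and average failure rate quoted in the Figure 14 caption), whereas the actual analysis assigns each object its own model probability. All names are hypothetical.

```python
import random


def simulate_fiber_failure_rates(fail_probs_by_fiber, rng):
    """One Monte Carlo realization: draw a pass/fail outcome for every
    observation and return the failure rate of each fiber."""
    rates = []
    for probs in fail_probs_by_fiber:
        n_fail = sum(1 for p in probs if rng.random() < p)
        rates.append(n_fail / len(probs))
    return rates


rng = random.Random(42)

# Toy input: 500 fibers, 70 observations each, flat 1% failure probability
n_fibers, n_obs, n_sims = 500, 70, 100
fail_probs_by_fiber = [[0.01] * n_obs for _ in range(n_fibers)]

realizations = [
    simulate_fiber_failure_rates(fail_probs_by_fiber, rng)
    for _ in range(n_sims)
]
mean_rate = sum(sum(r) for r in realizations) / (n_sims * n_fibers)
print(f"mean simulated failure rate: {100 * mean_rate:.2f}%")
```

The distribution of measured per-fiber rates can then be compared with the spread of the simulated distributions; fibers whose observed rate falls far outside the simulated range are candidates for non-uniform efficiency.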

Figure 14. Left: per-fiber LRG redshift failure rate. Each point represents a fiber on the DESI focal plane with the colors indicating its average LRG failure rate during the first two months of the Main Survey. Only fibers with >40 LRG observations are plotted, and the median number of LRG observations by each fiber is 70. Since the average failure rate is ∼1%, most fibers have either zero or one redshift failure, and much of the variation in this figure is simply noise. Right: the distribution of the per-fiber LRG redshift failure rates and the simulated distributions from 100 Monte Carlo simulations. In the simulations, the redshift failure probability of each object is determined by its zfiber and teff via Equation (4). The fact that the measured distribution matches the simulations (for all except a handful of fibers) suggests that the spectroscopic efficiency is very uniform across the fibers.

5. Summary

To achieve the required accuracy on cosmological measurements for DESI, it is critical that the sample selection and spectroscopic observations have minimal and well-understood systematics. With this in mind, the DESI LRG sample is designed to be robust against variations in imaging properties and zero-point uncertainties and to achieve a high redshift success rate and low stellar contamination rate. The high stellar mass completeness of the sample also ensures high large-scale bias and should facilitate modeling of the galaxy–halo connection. In addition to the already robust target selection, we developed veto masks specifically optimized for the LRG targets to produce a clean sample suitable for clustering analysis. We also created a simple model that can accurately predict (and thus correct for) the per-object redshift failure rate based on the object's brightness and spectroscopic depth.

The LRG target catalogs are publicly available (see footnote 44 and Myers et al. 2022 for details). The data points in all figures are publicly available at Zenodo: 10.5281/zenodo.6987401.

This research is supported by the Director, Office of Science, Office of High Energy Physics of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231, and by the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility under the same contract; additional support for DESI is provided by the U.S. National Science Foundation, Division of Astronomical Sciences under Contract No. AST-0950945 to the NSF's National Optical–Infrared Astronomy Research Laboratory; the Science and Technologies Facilities Council of the United Kingdom; the Gordon and Betty Moore Foundation; the Heising–Simons Foundation; the French Alternative Energies and Atomic Energy Commission (CEA); the National Council of Science and Technology of Mexico (CONACYT); the Ministry of Science and Innovation of Spain (MICINN), and by the DESI Member Institutions: https://www.desi.lbl.gov/collaborating-institutions.

The DESI Legacy Imaging Surveys consist of three individual and complementary projects: the Dark Energy Camera Legacy Survey (DECaLS), the Beijing–Arizona Sky Survey (BASS), and the Mayall z-band Legacy Survey (MzLS). DECaLS, BASS and MzLS together include data obtained, respectively, at the Blanco telescope, Cerro Tololo Inter-American Observatory, NSF's NOIRLab; the Bok telescope, Steward Observatory, University of Arizona; and the Mayall telescope, Kitt Peak National Observatory, NOIRLab. NOIRLab is operated by the Association of Universities for Research in Astronomy (AURA) under a cooperative agreement with the National Science Foundation. Pipeline processing and analyses of the data were supported by NOIRLab and the Lawrence Berkeley National Laboratory. Legacy Surveys also uses data products from the Near-Earth Object Wide-field Infrared Survey Explorer (NEOWISE), a project of the Jet Propulsion Laboratory/California Institute of Technology, funded by the National Aeronautics and Space Administration. Legacy Surveys was supported by: the Director, Office of Science, Office of High Energy Physics of the U.S. Department of Energy; the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility; the U.S. National Science Foundation, Division of Astronomical Sciences; the National Astronomical Observatories of China, the Chinese Academy of Sciences and the Chinese National Natural Science Foundation. LBNL is managed by the Regents of the University of California under contract to the U.S. Department of Energy. The complete acknowledgments can be found at https://www.legacysurvey.org/.

The authors are honored to be permitted to conduct scientific research on Iolkam Du'ag (Kitt Peak), a mountain with particular significance to the Tohono O'odham Nation.

BASS is a key project of the Telescope Access Program (TAP), which has been funded by the National Astronomical Observatories of China, the Chinese Academy of Sciences (the Strategic Priority Research Program "The Emergence of Cosmological Structures" grant #XDB09000000), and the Special Fund for Astronomy from the Ministry of Finance. The BASS is also supported by the External Cooperation Program of Chinese Academy of Sciences (grant #114A11KYSB20160057), and Chinese National Natural Science Foundation (grant #12120101003, #11433005).

A.D.M. was supported by the U.S. Department of Energy, Office of Science, Office of High Energy Physics, under Award Number DE-SC0019022.

Software: Astropy (Astropy Collaboration et al. 2013, 2018), HEALPix/healpy (Górski et al. 2005; Zonca et al. 2019), Matplotlib (Hunter 2007), Numpy (Harris et al. 2020), scikit-learn (Pedregosa et al. 2011), Scipy (Virtanen et al. 2020).

Appendix A: SV1 Selection and Considerations for the Final Selection

Here we describe the LRG sample observed during Survey Validation (or "SV1") before the Main Survey operations. We also discuss the reasoning behind the choices for the Main selection based on data from SV1. See DESI Collaboration et al. (2023, in preparation) for an overview of the Survey Validation program.

The SV1 selection cuts are listed in Table 2 ("IR" selection) and Table 3 ("optical" selection); an object that meets either selection is selected. The SV1 cuts are shown in Figure 15. The SV1 selection was designed as a parent selection within which we could explore different selection cuts for the final LRG sample, and for this reason its selection boundaries generally extend beyond the final selection. The SV1 observations are also significantly deeper than the Main Survey, and this allows us to create deeper coadds for assessing the redshift performance. The SV1 redshift distribution is shown in Figure 16. The target density of the SV1 sample is ∼2100 deg−2. The SV1 program observed 46,000 unique SV1 LRG targets (excluding SV1 LRGs observed in BGS tiles).

Figure 15. LRG selection cuts for SV1 (solid green and dashed–dotted green lines), SV3 (dashed blue lines), and the Main Survey (solid red lines). The SV1 selection includes two sub-selections, the IR selection (solid green) and optical selection (dashed–dotted green); see text. The cuts shown here are for the South and are slightly different from the cuts in the North. The points are color-coded by their PRLS photometric redshifts (Zhou et al. 2021) and the gray points in panel (a) show the stellar locus. The cuts in panels (a)–(d) serve similar purposes to those in Figure 3. Panel (e) shows the sliding color–magnitude cut for optical selection that was explored in the SV but not implemented in the Main selection.

Figure 16. Redshift distributions of the various DESI LRG samples. The "SV1 bright" sample (gray, showing South only) is the subset of SV1 LRGs that have the same faint limit as the Main LRGs (zfiber < 21.6) and a very similar redshift success rate to the Main LRGs.

Table 2. SV1 IR Selection Cuts

Cuts | Comment
South
(zfiber < 22.0) OR (z < 21.0) | Faint limit
z − W1 > 0.8 × (r − z) − 0.8 | Stellar rejection
r − W1 > 1.0 | Remove low-z galaxies
(r − W1 > (W1 − 17.48) × 1.8) OR (r − W1 > 3.1) | Luminosity cut
North
(zfiber < 22.0) OR (z < 21.0) | Faint limit
z − W1 > 0.8 × (r − z) − 0.8 | Stellar rejection
r − W1 > 1.03 | Remove low-z galaxies
(r − W1 > (W1 − 17.44) × 1.8) OR (r − W1 > 3.1) | Luminosity cut

Table 3. SV1 Optical Selection Cuts

Cuts | Comment
South
(zfiber < 22.0) OR (z < 21.0) | Faint limit
z − W1 > 0.8 × (r − z) − 0.8 | Stellar rejection
((g − W1 > 2.5) AND (g − r > 1.3)) OR (r − W1 > 1.7) | Remove low-z galaxies
((z < 20.2) AND (r − z > (z − 17.20) × 0.45) AND (r − z > (z − 14.17) × 0.19)) OR ((z ≥ 20.2) AND ($((z-23.18)/1.3)^{2}+(r-z+2.5)^{2} > 4.48^{2}$)) | Luminosity cut
North
(zfiber < 22.0) OR (z < 21.0) | Faint limit
z − W1 > 0.8 × (r − z) − 0.8 | Stellar rejection
((g − W1 > 2.57) AND (g − r > 1.35)) OR (r − W1 > 1.75) | Remove low-z galaxies
((z < 20.2) AND (r − z > (z − 17.17) × 0.45) AND (r − z > (z − 14.14) × 0.19)) OR ((z ≥ 20.2) AND ($((z-23.15)/1.3)^{2}+(r-z+2.5)^{2} > 4.48^{2}$)) | Luminosity cut

The SV1 sample combines two selections: an "IR" selection and an "optical" selection, with a modified version of the former being chosen as the final LRG selection for the Main Survey. Slightly different versions of the "optical" selection were presented in Zhou et al. (2020, 2021). The main difference between the two selections is in the sliding color–magnitude cut (Figures 15(d) and (e)). Recall that this cut serves as the luminosity cut and shapes the redshift distribution. The IR selection implements this cut in r − W1 versus W1, whereas the optical selection does it in r − z versus z. Both cuts can be tuned to yield the desired redshift distribution, and both yield similar redshift success rates. The optical selection yields slightly higher stellar mass completeness than the IR selection, and it also has a slightly higher bias (by ∼5% based on angular correlation amplitudes at intermediate scales). The main advantage of the IR selection is its robustness to imaging systematics. As we discussed in Section 3.2, the WISE W1 photometry is much better calibrated than the ground-based z band, and as a result, the WISE-based IR selection is expected to be more uniform than the z-band-based optical selection. While the IR selection may be more sensitive to the effects of blending in WISE (due to its much larger PSF than the optical bands), we did not find any clear evidence of systematics introduced by WISE blending (apart from contamination by bright stars in WISE that can be removed by the veto mask).

The major differences between the SV1 IR selection and the Main Survey selection are in the magnitude limit and the sliding color–magnitude cut (r − W1 versus W1). The SV1 magnitude limit is 0.4 magnitude fainter than the Main sample, which increases the density at z ≳ 0.8. The SV1 IR selection also has a more relaxed sliding color–magnitude cut that, combined with the fainter magnitude limit, more than doubles the comoving density of the Main sample. The redshift failure rate versus zfiber of the SV1 LRGs is shown in Figure 17. The failure/rejection rate increases drastically beyond zfiber ∼ 21.6 for the nominal DESI exposure time. The magnitude limit that we chose for the Main Survey is therefore a trade-off between redshift success completeness and number density at higher redshift.

Figure 17. Catastrophic failure rates and rejection rates of SV1 LRGs as a function of z-band fiber magnitude for the nominal (1× depth) exposures (points with error bars) and 4× depth exposures (black dashed line with error bars). (Error bars are not shown for fractions equal to zero.) The histogram shows the zfiber distribution of the SV1 LRG sample.

We note that the failure rates as a function of zfiber and teff are very similar to those of the Main sample, even though the comoving number density is much higher. This means that it is feasible to conduct a future survey with DESI (or a similar instrument) of a significantly denser sample of LRG-like galaxies at z ≲ 0.8 without increasing the nominal exposure time. For instance, the "SV1 bright" sample, as shown in Figure 16, is a brighter subset of the SV1 LRGs that has the same magnitude limit as the Main LRGs and thus a very similar redshift success rate. (The density of the "SV1 bright" sample is limited by the SV1 selection, and one can design a selection with a significantly higher density at z ≲ 0.8.) A higher comoving density at z > 0.8 can be achieved with a fainter magnitude limit and longer exposure times.

SV1 includes magnitude limits in both fiber magnitude and total magnitude, and this allowed us to compare the two faint limits. We find that, as expected, the fiber magnitude is a much better predictor of redshift success rate than the total magnitude, and a selection implementing a fiber magnitude limit includes significantly more high-z LRGs than is possible for a selection with similar redshift efficiency that implements a total magnitude limit. For this reason, we chose to set the faint limit in fiber magnitude for the Main Survey. As noted in Section 2.2, the fiber magnitude cut is effectively a morphology cut that could introduce systematics in both sample selection and theoretical modeling, and such effects should be carefully investigated in the clustering analysis.

Appendix B: SV3 Sample

Here we describe the LRG sample observed during the 1% Survey (or "SV3") before the Main Survey operations. See DESI Collaboration et al. (2023, in preparation) for an overview of the 1% Survey program.

The SV3 selection was designed after the decision was made to adopt the IR selection and before the Main selection was finalized. Therefore the SV3 selection is very similar to the Main selection but with small differences. The selection cuts are listed in Table 4 and shown in Figure 15. The SV3 selection is based on the IR selection, as is the Main selection, although the sliding cut is slightly more extended. The magnitude limit is also fainter by ∼0.1 magnitude. As a result, the SV3 sample has a target density of ∼800 deg−2, compared to the 605 deg−2 of the Main sample. Although almost all Main LRGs are within the SV3 selection, ∼2 deg−2 of Main LRGs are not in SV3 because of differences in the low-z cuts (in g − r versus r − W1, see Figure 15(b)). The SV3 program observed 140,000 unique SV3 LRG targets (excluding SV3 LRGs observed in BGS tiles).

Table 4. SV3 Selection Cuts

Cuts | Comment
South
zfiber < 21.7 | Faint limit
z − W1 > 0.8 × (r − z) − 0.6 | Stellar rejection
((g − r > 1.3) AND (g − r > −1.55 × (r − W1) + 3.13)) OR (r − W1 > 1.8) | Remove low-z galaxies
((r − W1 > (W1 − 17.26) × 1.8) AND (r − W1 > W1 − 16.36)) OR (r − W1 > 3.29) | Luminosity cut
North
zfiber < 21.72 | Faint limit
z − W1 > 0.8 × (r − z) − 0.6 | Stellar rejection
((g − r > 1.34) AND (g − r > −1.55 × (r − W1) + 3.23)) OR (r − W1 > 1.8) | Remove low-z galaxies
((r − W1 > (W1 − 17.24) × 1.83) AND (r − W1 > W1 − 16.33)) OR (r − W1 > 3.39) | Luminosity cut
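As an illustration, the South cuts of Table 4 can be written as a single boolean function. This sketch reads the color terms in the table as magnitude differences (e.g., r − W1 and g − r) and assumes that the inputs are extinction-corrected AB magnitudes; the function name and the example magnitudes are invented for illustration.

```python
def sv3_lrg_south(g, r, z, w1, zfiber):
    """SV3 LRG selection for the South (Table 4).

    Inputs are assumed to be extinction-corrected AB magnitudes.
    """
    faint_limit = zfiber < 21.7
    not_star = (z - w1) > 0.8 * (r - z) - 0.6
    not_low_z = (
        (g - r > 1.3) and (g - r > -1.55 * (r - w1) + 3.13)
    ) or (r - w1 > 1.8)
    luminosity = (
        (r - w1 > (w1 - 17.26) * 1.8) and (r - w1 > w1 - 16.36)
    ) or (r - w1 > 3.29)
    return faint_limit and not_star and not_low_z and luminosity


# Hypothetical red galaxy that satisfies all four cuts
print(sv3_lrg_south(g=23.5, r=21.8, z=20.3, w1=18.6, zfiber=21.2))
# Same colors but fainter than the fiber-magnitude limit
print(sv3_lrg_south(g=23.5, r=21.8, z=20.3, w1=18.6, zfiber=21.8))
```

The North selection has the same structure with the slightly shifted constants listed in the table.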

Appendix C: Stellar Masses of DESI Legacy Imaging Survey Galaxies Using Random Forests

To assess the stellar mass completeness of the LRG sample and guide our selection cuts, we created a catalog of stellar masses of galaxies (not restricted to LRGs) in the DESI Legacy Imaging Surveys. The catalog is general-purpose and can be used for a variety of science cases, and it is publicly available (see footnote 45). In this appendix, we describe the data set and methods to estimate the stellar masses along with metrics used to assess their quality.

To estimate the stellar masses using only photometry from the DESI Legacy Imaging Surveys, we use a machine learning-based regression method called Random Forests (Breiman 2001). We train the model to take Milky Way extinction-corrected colors g − r, r − z, z − W1, and W1 − W2 and the photometric redshifts from Zhou et al. (2021) as inputs to predict the ratio of stellar mass to observed total light of a galaxy. The training set comprises galaxies in the Stripe 82 region whose photometry is available in the DESI Legacy Survey imaging catalog and whose stellar masses were measured by Bundy et al. (2015; S82-MGC) using SDSS ugriz and UKIDSS YJHK photometry. We use the stellar mass estimates from S82-MGC, specifically the "MASS_OPT_ZREIS" value from the "Mstar-z_ukwide" catalog (see footnote 46), as the "truth" based on which we train the random forest model. Our mass estimates inherit any systematic uncertainty present in the stellar mass measurements from S82-MGC.

To decouple the effects of uncertainties in the photometric redshift from the prediction of stellar masses, we train our model to predict the ratio of stellar mass to observed total light instead of the stellar mass. The photometric redshift is used to calculate a galaxy's luminosity distance, which is then used to calculate the total light emitted by the galaxy in the z band in the observer's frame of reference. The predictions of our model are then converted back to the stellar masses. To generate the catalog of stellar masses, we use only the objects in the DESI Legacy Survey imaging that have valid photometry (namely positive fluxes) in the g, r, z, W1, and W2 bands and satisfy the stellar rejection cut of r − W1 > 1.75 × (r − z) − 1.1. Objects that do not meet these requirements are assigned the value −99 in the stellar mass catalog.

To assess the accuracy of our stellar mass predictions, we divide the data set into five equal random subsets, train the model using four of those subsets combined, and compare the predicted and true values of stellar masses for the remaining subset. We do this five times, each time selecting a different 4:1 split of the data. We calculate the performance metrics using the union of all five test sets. This process is also called fivefold cross-validation. We quantify the accuracy of our predictions using the following three common metrics:

  1. Prediction bias: defined as $\langle \Delta \log_{10}(M_\star) \rangle$, where $\Delta \log_{10}(M_\star) = \log_{10}(\text{true stellar mass}) - \log_{10}(\text{predicted stellar mass})$. This quantifies the average error in our predictions.
  2. Normalized median absolute deviation (σNMAD): defined as $1.4826 \times \mathrm{median}(|\Delta \log_{10}(M_\star)|)$. σNMAD measures the spread in our predictions and is robust to outliers.
  3. Fraction of outliers (foutlier): defined as the fraction of stellar mass predictions for which $|\Delta \log_{10}(M_\star)| > 0.5\ \mathrm{dex}$, i.e., the fraction of cases where the prediction error is very high. We chose the threshold of 0.5 dex as it is a typical value of the systematic difference between many common stellar mass prediction methods.
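The three metrics and the fivefold procedure can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, the inputs are assumed to be base-10 logarithms of stellar mass, and `mean_fit` is a trivial placeholder model standing in for the random forest, which is not reproduced here.

```python
import random
import statistics


def prediction_bias(delta):
    """Mean of delta = log10(true mass) - log10(predicted mass)."""
    return statistics.fmean(delta)


def sigma_nmad(delta):
    """Normalized median absolute deviation, 1.4826 * median(|delta|)."""
    return 1.4826 * statistics.median([abs(d) for d in delta])


def outlier_fraction(delta, threshold=0.5):
    """Fraction of predictions with |delta| above the threshold (in dex)."""
    return sum(abs(d) > threshold for d in delta) / len(delta)


def kfold_indices(n, k=5, seed=0):
    """Shuffle 0..n-1 and deal the indices into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_delta(x, y, fit, k=5, seed=0):
    """Train on k-1 folds, predict the held-out fold, and pool all residuals.

    fit(train_x, train_y) must return a callable predict(row) -> log10 mass.
    """
    delta = [None] * len(y)
    for test in kfold_indices(len(y), k, seed):
        held_out = set(test)
        train = [i for i in range(len(y)) if i not in held_out]
        predict = fit([x[i] for i in train], [y[i] for i in train])
        for i in test:
            delta[i] = y[i] - predict(x[i])
    return delta


def mean_fit(train_x, train_y):
    """Placeholder model (NOT a random forest): always predicts the training mean."""
    mean = statistics.fmean(train_y)
    return lambda row: mean
```

Pooling the residuals from all five held-out folds, as done here, means every galaxy contributes exactly one out-of-sample prediction to the metrics.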

Our predictions have a moderate scatter of σNMAD = 0.127 dex, a small fraction of outliers (foutlier = 1.93%), and a negligibly small bias ($\langle \Delta \log_{10}(M_\star) \rangle = -0.0044$ dex). As shown in Figure 18, the scatter in predictions is symmetric and does not show any obvious patterns at the boundaries of the training data. The prediction performance is also stable across the entire range of stellar masses (as shown in Figure 19): the magnitude of the prediction bias stays under 0.1 dex and σNMAD stays under 0.2 dex throughout. These values are typical when two methods of estimating stellar masses are compared, as shown in Bundy et al. (2015).

Figure 18. Comparison of predicted stellar masses and their true values (i.e., values from Bundy et al. 2015) for galaxies in the test set. The central gray line tracks the identity axis and the dashed lines mark the threshold beyond which a prediction is considered to be an outlier. The color of the points represents the number of data points present in each pixel of the figure. The scatter in the predictions is tight and symmetrically distributed about the identity line with a very small bias. The distribution of points is random and does not show any obvious pattern at the boundaries of training data, indicating a stable performance across the entire range of stellar masses.

Figure 19. Evolution of prediction bias and σNMAD of our stellar mass predictions as a function of the true stellar masses. The metrics have been calculated for 10 equally populated bins with varying widths in stellar mass. This ensures that the standard errors on the binned statistics (even though very small due to a large number of data points per bin) are comparable across all bins. The gray scatter points show the distribution of galaxies in the test set. We observe that the magnitude of the prediction bias (i.e., $\langle \Delta \log_{10}(M_\star) \rangle$) stays under 0.1 dex and the scatter in predictions (quantified by σNMAD) is between 0.1 and 0.2 dex for the entire range of stellar masses. These values of the metrics are typical of such comparisons between two methods of determining stellar mass from photometry (as shown in Bundy et al. 2015).
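The equally populated binning described in the caption of Figure 19 can be sketched as follows. This is a hedged illustration with a hypothetical function name; it assumes there are at least as many galaxies as bins, and that `true_log` and `delta` hold the true log masses and the residuals for the same galaxies in the same order.

```python
import statistics


def binned_metrics(true_log, delta, n_bins=10):
    """Bias and sigma_NMAD in bins holding (nearly) equal numbers of galaxies."""
    # Sort galaxies by true mass, then cut the sorted list into equal-count slices,
    # so bin widths in mass vary while the population per bin stays fixed.
    order = sorted(range(len(true_log)), key=lambda i: true_log[i])
    n = len(order)
    rows = []
    for b in range(n_bins):
        members = order[b * n // n_bins:(b + 1) * n // n_bins]
        d = [delta[i] for i in members]
        rows.append({
            "lo": true_log[members[0]],    # lowest true log mass in the bin
            "hi": true_log[members[-1]],   # highest true log mass in the bin
            "count": len(members),
            "bias": statistics.fmean(d),
            "sigma_nmad": 1.4826 * statistics.median([abs(v) for v in d]),
        })
    return rows
```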

If we instead train the model using the redshifts in S82-MGC and test using spectroscopic redshifts from DESI, we observe a significant improvement in the accuracy of the stellar mass estimates, with σNMAD = 0.086 dex, foutlier = 0.75%, and $\langle \Delta \log_{10}(M_\star) \rangle = -0.0006$ dex. This indicates that the error in redshifts is a significant source of uncertainty for the stellar mass predictions. Since spectroscopic redshifts are only available for a subset of the objects in DESI Legacy Survey imaging, better quality mass predictions are available for only a fraction of objects. We notice that training with redshifts from S82-MGC and testing with photometric redshifts, and vice versa, results in poorer prediction accuracy; the above combinations of data sets were therefore chosen to produce the two different mass catalogs. We use the catalog of stellar masses produced using photometric redshifts to analyze the stellar mass completeness of the DESI LRG sample.

Appendix D: Veto Masks for Clustering Analysis

The WISE circular geometric mask is based on the AllWISE star catalog (supplemented by 2MASS at the bright end) that was used for the unWISE mask (Meisner et al. 2019). Stars with a limiting magnitude of W1 < 10.0 are used. We derive the magnitude–radius relation empirically by checking the excess/deficit of LRG density around the stars in bins of magnitude and setting the mask radius to where the differential density excess/deficit is less than 10%. In unWISE, the "HALO" mask radius depends not only on magnitude but also on ecliptic latitude and the sky background level. We find the trend of LRG densities around stars has no noticeable dependence on either of the two values, so we make the mask radius a function of magnitude only. The magnitude–radius relation for the WISE mask is shown in Figure 20 (top).
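One way to turn a measured density profile into a mask radius is sketched below. This is only an illustration of the 10% criterion: the function name is hypothetical, the input is assumed to be the fractional LRG density excess or deficit in concentric annuli around stars of one magnitude bin, and the radius is taken as the outer edge of the last annulus that still exceeds the tolerance, which is one possible reading of the criterion.

```python
def mask_radius(bin_edges, frac_excess, tol=0.10):
    """Smallest radius beyond which every annulus has |excess| below tol.

    bin_edges   : ascending annulus edges (length n + 1), e.g. in arcsec
    frac_excess : fractional density excess/deficit per annulus (length n)
    """
    radius = bin_edges[0]
    for i, excess in enumerate(frac_excess):
        if abs(excess) >= tol:
            # This annulus is still contaminated, so the mask must cover it.
            radius = bin_edges[i + 1]
    return radius
```

Repeating this for each magnitude bin gives one radius per bin, from which a smooth magnitude–radius relation can then be fitted or interpolated.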

Figure 20. Top: radius–magnitude relation for the WISE circular mask. Bottom: radius–magnitude relation for the Gaia circular mask. The "mask magnitude" is typically the Gaia G magnitude, although for very red stars (with z + 1 < G, where z is the predicted DECam z magnitude), z + 1 is used instead. In both panels, the vertical dashed lines indicate the magnitude limit of the stars used in the masks.

The Gaia mask is based on Gaia EDR3 with a limiting magnitude of G < 18 (compared to G < 16 in the LS DR9 MEDIUM mask) and is supplemented with Tycho-2 stars at the bright end. Compared to Gaia DR2 (which was used in LS DR9), EDR3 has far fewer missing stars or stars with badly underestimated fluxes. It also contains few galaxies to G < 18 (and those are at much lower redshift than the DESI LRGs), which makes the cut on astrometric excess noise (see D. Schlegel et al. 2023, in preparation) unnecessary. We use the Gaia G magnitude as the mask magnitude (mask_mag), except for very red stars that have zguess + 1 < G, where "zguess" is the predicted DECam z-band magnitude based on Gaia photometry (see D. Schlegel et al. 2023, in preparation). For these very red stars, we use zguess + 1 as the mask magnitude, which ensures that they are sufficiently masked.
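The choice of mask magnitude reduces to a single comparison, sketched here with a hypothetical function name. Allowing `zguess` to be missing is an assumption added for illustration; the text does not say how stars without a predicted z magnitude are handled.

```python
def mask_magnitude(g_mag, zguess=None):
    """Return G, or zguess + 1 for very red stars with zguess + 1 < G."""
    if zguess is not None and zguess + 1.0 < g_mag:
        return zguess + 1.0  # red star: the brighter z-based magnitude sets the radius
    return g_mag
```

When `zguess` is available this is equivalent to taking the brighter of G and zguess + 1, so a red star is never assigned a smaller mask than its G magnitude alone would give.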

A few extremely bright stars are still missing from (or have incorrect photometry in) Gaia EDR3, so we supplement it with Tycho-2: if a Tycho-2 star with VT < 10 is not within 1.0″ of a Gaia star (with proper motion correction), we add it to the Gaia catalog. For these Tycho-2 stars, we predict the Gaia G magnitude ("ggguess") and the DECam z magnitude ("zguess"): we cross-match Tycho-2 to 2MASS to obtain the J-band photometry from 2MASS, derive the G − VT versus VT − J and z − VT versus VT − J relations (polynomials) using common stars in Tycho-2 and Gaia, and finally obtain ggguess and zguess for all Tycho-2 stars with 2MASS photometry. In the rare cases where a Tycho-2 star is not matched to 2MASS, the VT magnitude is used as the masking magnitude.
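The Tycho-2 supplement can be sketched as a positional cross-match. This is a simplified illustration, not the code used to build the catalog: the function names and dictionary keys are hypothetical, positions are assumed to be in degrees and already propagated to a common epoch, the match is a brute-force loop rather than a spatial index, and applying the zguess + 1 rule to the predicted magnitudes is an assumption.

```python
import math


def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(math.sqrt(a))) * 3600.0


def tycho_stars_to_add(tycho, gaia, radius_arcsec=1.0, vt_limit=10.0):
    """Tycho-2 stars with VT < vt_limit that have no Gaia star within the radius."""
    added = []
    for star in tycho:
        if star["vt"] >= vt_limit:
            continue
        matched = any(
            angular_sep_arcsec(star["ra"], star["dec"], g["ra"], g["dec"]) <= radius_arcsec
            for g in gaia
        )
        if not matched:
            added.append(star)
    return added


def tycho_mask_magnitude(vt, ggguess=None, zguess=None):
    """Fall back to VT when no 2MASS-based predictions are available."""
    if ggguess is None or zguess is None:
        return vt
    return min(ggguess, zguess + 1.0)  # assumed: same red-star rule as for Gaia stars
```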

In LS DR9 the source detection and source fitting are slightly different inside the MEDIUM mask compared to outside the mask, and this could lead to slightly different LRG densities. Therefore, we also identify stars from the Gaia reference catalog in LS DR9 that have mask_mag < 8 and are brighter than the new Gaia catalog by more than 0.05 mag, and add them to the Gaia catalog, so that the mask radii of these stars are at least as large as their DR9 mask radii.

Similar to the WISE mask, we obtain the Gaia radius–magnitude relation empirically based on LRG densities around the stars. We see significant differences in LRG density trends around the stars between the North and the South (presumably due to differences in optics and detectors), in particular significantly more excess targets around stars of 10 < G < 16 in the North, and we implement different radius–magnitude relations for the two regions. The magnitude–radius relations for the Gaia mask are shown in Figure 20 (bottom).

Figure 21 shows the LRG–WISE correlation for one of the WISE bins before and after applying the (full set of) veto masks. Figure 22 shows the LRG–Gaia correlation for one of the Gaia bins before and after applying the (full set of) veto masks in the South.

Figure 21. Relative LRG densities near WISE stars with 8 < W1 < 8.5 before (top panels) and after (bottom panels) applying the veto masks. The left panels show the fractional excess/deficit of LRG targets around the stars in 2D bins of Δλ and Δβ (λ and β are ecliptic coordinates). The right panels show the fractional excess/deficit as a function of distance for each Δλ–Δβ bin (dots) and their binned average (curve). The excess targets along the diffraction spikes are removed by the unWISE mask.

Figure 22. Same as Figure 21 but for Gaia stars with 7 < G < 7.5 with the axes replaced by ΔR.A. and Δdecl.

In addition to the Gaia and WISE masks, we developed custom masks for any remaining problematic regions. Such regions include a small number of large galaxies that were not masked due to a bug in LS DR9 and regions with imaging artifacts that were found by visually inspecting regions identified by a DBSCAN cluster analysis (Ester et al. 1996) as having very high LRG densities.

The stellar reference catalog (with mask radii) and the list of custom masks are publicly available.
