The following article is Open access

Self-consistent Stellar Radial Velocities from LAMOST Medium-resolution Survey DR7

Published 2021 September 7 © 2021. The Author(s). Published by the American Astronomical Society.
Citation Bo Zhang et al 2021 ApJS 256 14 DOI 10.3847/1538-4365/ac0834

Abstract

Radial velocity (RV) is among the most fundamental physical quantities obtainable from stellar spectra and is rather important in the analysis of time-domain phenomena. LAMOST Medium-resolution Survey (MRS) DR7 contains five million single-exposure stellar spectra with spectral resolution R ∼ 7500. However, the temporal variation of the RV zero-points (RVZPs) of the MRS, which makes the RVs from multiple epochs inconsistent, has not been addressed. In this paper, we measure the RVs of 3.8 million single-exposure spectra (for 0.6 million stars) with signal-to-noise ratios (S/N) higher than 5 based on the cross-correlation function method, and propose a robust method to self-consistently determine the RVZPs exposure by exposure for each spectrograph with the help of Gaia DR2 RVs. Such RVZPs are estimated for 3.6 million RVs and can reach a mean precision of ∼0.38 km s−1. The result of the temporal variation of RVZPs indicates that our algorithm is efficient and necessary before we use the absolute RVs to perform time-domain analyses. Validating the results with APOGEE DR16 shows that our absolute RVs can reach an overall precision of 0.84/0.80 km s−1 in the blue/red arm at 50 < S/N < 100 and of 1.26/1.99 km s−1 at 5 < S/N < 10. The cumulative distribution function of the standard deviations of multiple RVs (Nobs ≥ 8) for 678 standard stars reaches 0.45/0.54, 1.07/1.39, and 1.45/1.86 km s−1 in the blue/red arm at the 50%, 90%, and 95% levels, respectively. Catalogs of the RVs, RVZPs, and selected candidate RV standard stars are available at https://github.com/hypergravity/paperdata.

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Large spectroscopic surveys, for example, RAVE (Steinmetz et al. 2006, 2020a), the Sloan Digital Sky Survey/SEGUE (Yanny et al. 2009), the Large Sky Area Multi-object Fiber Spectroscopic Telescope (LAMOST; Cui et al. 2012; Deng et al. 2012; Zhao et al. 2012; Luo et al. 2015), APOGEE (Majewski et al. 2017), GALAH (De Silva et al. 2015), Gaia-ESO (Gilmore et al. 2012), and the Gaia Radial Velocity Spectrometer (Gaia-RVS; Katz et al. 2004; Cropper et al. 2018), have obtained tens of millions of stellar spectra over the last two decades, aiming at understanding the formation and evolution of the Galaxy. Among the most fundamental physical quantities derived from stellar spectra is radial velocity (RV), which forms the basis of many studies, such as on stellar multiplicity (e.g., Gao et al. 2014, 2017; El-Badry et al. 2018; Yang et al. 2020), stellar kinematics (e.g., Tian et al. 2020; Bird et al. 2020), and Galactic substructures (e.g., Yang et al. 2019; Xu et al. 2020).

LAMOST, after a five-year low-resolution spectroscopic survey (LRS; R ∼ 1800, 3800 Å < λ < 9000 Å), started a five-year medium-resolution spectroscopic survey (MRS; R ∼ 7500, 4950 Å < λ < 5350 Å and 6300 Å < λ < 6800 Å; Liu et al. 2020) in 2017 September. DR6, the first data release of the MRS, contains the data obtained from 2017 September to 2018 June and is already available to the international astronomical community. DR7, which includes the data from 2017 September to 2019 June (about five million spectra for over 800,000 stars), is currently open to the Chinese astronomical community only.

In the beginning, an Sc lamp was used to calibrate the wavelength of LAMOST MRS spectra; later, it was switched to a ThAr lamp (in 2018). Because of the short wavelength coverage, sky lines are not used in the wavelength calibration as they are in the LRS (Wang et al. 2010). The released spectra have undergone barycentric correction and the wavelength uses the vacuum standard. Both the LAMOST pipeline and Wang et al. (2019) have measured RVs from DR7 spectra and estimated a static RV zero-point (RVZP) for each spectrograph by comparing the measured RVs of a sample of RV standard stars (Huang et al. 2018) to those in the literature. However, the temporal variation of the RVZPs (between exposures) remains unclear despite a few attempts. For example, Liu et al. (2019b) and Zong et al. (2020) show the temporal variation of RVZPs based on data from the Kepler field using a sample of roughly selected RV-invariant stars, and Ren et al. (2021) show that the RVZPs of the red arm vary between exposures and the difference can reach 4 km s−1 in the MRS-N fields (nebula survey; Wu et al. 2021).

Physically, the variation of the LAMOST MRS RVZPs can be explained by several factors.

  • 1.  
    LAMOST has a long optical path and a large focal plane (1.75 m diameter) on which 4000 fibers are installed (Cui et al. 2012); temperature variation is unavoidable.
  • 2.  
    Currently, arc lamp exposures are taken about every 2 hr (typical observing time of a plate), which is not frequent enough. The instrument might change its state between lamp exposures.
  • 3.  
    Since the LAMOST MRS needs ∼20 ThAr lamps to illuminate the big focal plane simultaneously, the lamps are easily damaged and frequently replaced, and the lamp exposure time needs frequent adjustment. These factors affect the signal-to-noise ratio (S/N) of the lamp spectra and thus the wavelength calibration consistency over time.
  • 4.  
    The MRS spectrographs are mounted and dismounted monthly because the MRS is scheduled on the 14 bright/gray nights, while the other nights are for the LRS. Therefore, the focuses are adjusted monthly.

All these factors and other potential defects in the data reduction pipeline are finally reflected in the RVZPs of the spectra. Therefore, it is insufficient to use the RVs from either the LAMOST pipeline or Wang et al. (2019) at different epochs and perform a time-domain analysis, such as in studies of pulsating stars (the LAMOST-Kepler Project; Zong et al. 2020; Fu et al. 2020) and spectroscopic binaries (Gao et al. 2014, 2017; Liu 2019) and even in searches for black holes (Liu et al. 2019a; Gu et al. 2019), which are important scientific goals of the MRS (Liu et al. 2020).

In this paper, aiming at subsequent time-domain analysis, we measure the RVs for the 3.8 million single-exposure spectra with S/N > 5 in LAMOST MRS DR7 v1.1, and propose a robust method to self-consistently determine the absolute RVZPs with the help of Gaia DR2 data. In Section 2, we briefly describe the observational rules of the LAMOST MRS and the instrumental parameters. In Section 3, we describe how we measure the RVs. In Section 4, we show the algorithm that determines the RVZPs self-consistently. In Section 5, we present our RV and RVZP measurements and assess the precision and self-consistency, and select a sample of candidate RV standard stars based on our results. Information on and instructions regarding our data products are described in Section 6, and a summary of this work is given in Section 7.

2. The LAMOST MRS

2.1. Targeting and Observational Rules

The scientific goals of the LAMOST MRS mainly include stellar multiplicity, stellar pulsation, star formation, emission nebulae, Galactic archeology, host stars of exoplanets, and open clusters. The details of the scientific plan and the survey strategy are described in Liu et al. (2020). In this section, we summarize them briefly.

Each scientific goal has a principal investigator who is responsible for its targeting. A planid is assigned to each MRS plate (pointing), which has the form [TD/NT]hhmmss[N/S]ddmmssXnn. The first two characters denote time-domain (TD, will be repeatedly observed during the five-year survey) or non-time-domain (NT, just observed on a single night), hhmmss[N/S]ddmmss represents the equatorial coordinates of the plate center, [N/S] means north/south, the letter X is used to denote the scientific goal (see Table 1), and the last two digits represent a serial number. For example, the planid TD062610N184524B01 means the plate is a time-domain plate dedicated to binarity/multiplicity research. There also exist some irregular planids for test fields, such as HIP8426401 and NGC216801. HIP8426401 means the central star of the plate is HIP 84264, and NGC216801 means a plate toward the star cluster NGC 2168.
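The planid convention can be illustrated with a short parser. This is a minimal sketch, not part of the LAMOST software: the function name `parse_planid` and the returned field names are hypothetical, and we assume that hhmmss and ddmmss are plain sexagesimal right ascension and declination.

```python
import re

# Science-goal codes as listed in Table 1.
SCIENCE_GOALS = {
    "K": "Kepler fields",
    "H": "high-frequency Kepler fields",
    "B": "binarity/multiplicity",
    "T": "TESS fields",
    "M": "Milky Way",
    "S": "star formation",
    "C": "star cluster",
    "N": "nebula",
}

# [TD/NT] hhmmss [N/S] ddmmss X nn
PLANID_PATTERN = re.compile(
    r"^(?P<kind>TD|NT)(?P<ra>\d{6})(?P<hemisphere>[NS])"
    r"(?P<dec>\d{6})(?P<goal>[A-Z])(?P<serial>\d{2})$"
)


def parse_planid(planid):
    """Split a regular planid into its fields; return None for irregular ones."""
    m = PLANID_PATTERN.match(planid)
    if m is None:
        return None  # e.g., test fields such as HIP8426401
    ra, dec = m["ra"], m["dec"]
    # Assumption: the digits are sexagesimal hours/degrees, minutes, seconds.
    ra_hours = int(ra[0:2]) + int(ra[2:4]) / 60 + int(ra[4:6]) / 3600
    dec_deg = int(dec[0:2]) + int(dec[2:4]) / 60 + int(dec[4:6]) / 3600
    if m["hemisphere"] == "S":
        dec_deg = -dec_deg
    return {
        "time_domain": m["kind"] == "TD",
        "ra_deg": 15.0 * ra_hours,
        "dec_deg": dec_deg,
        "science": SCIENCE_GOALS.get(m["goal"], "unknown"),
        "serial": int(m["serial"]),
    }
```

For the example in the text, `parse_planid("TD062610N184524B01")` reports a time-domain plate for binarity/multiplicity, while irregular identifiers such as HIP8426401 return None and need separate handling.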

Table 1. Abbreviations of Scientific Goals (X) Used in planid

X   NT/TD   Science
K   TD      Kepler fields
H   TD      high-frequency Kepler fields
B   TD      binarity/multiplicity
T   TD      TESS fields
M   NT      Milky Way
S   NT      star formation
C   NT      star cluster
N   NT      nebula

The MRS uses the local modified Julian minute (LMJM), an eight-digit integer defined as 1440× the local modified Julian date at the beginning of each exposure, as the stamp of each exposure. The typical exposure time is set to 1200 s, while 900 s and 600 s exposures also exist, depending on the brightness of the targets. Each NT plate is observed with three consecutive exposures while each TD plate is observed until it is unobservable (usually five to six 1200 s exposures are allowed). An arc lamp exposure is taken at the beginning of each plate, and at the end of an observing night (every ∼2 hr). The targeting is mostly based on the Gaia DR2 source catalog (Gaia Collaboration et al. 2018). For a 1200 s exposure, the magnitudes corresponding to S/N = 5 in blue- and red-arm spectra are G ∼ 14.5 and 15.2 mag, respectively (see Figure 1 in Liu et al. 2020). Magnitudes brighter than these cover the majority of the objects observed by the LAMOST MRS.
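As an illustration of the LMJM stamp, the following sketch converts between LMJM and local civil time. The function names are hypothetical, and rounding to the nearest minute is our assumption; the survey software may truncate instead.

```python
from datetime import datetime, timedelta

# Modified Julian date 0.0 corresponds to 1858 November 17, 00:00.
MJD_EPOCH = datetime(1858, 11, 17)


def lmjm_to_local_time(lmjm):
    """Local civil time at the start of the exposure with stamp lmjm."""
    return MJD_EPOCH + timedelta(minutes=int(lmjm))


def local_time_to_lmjm(local_time):
    """LMJM = 1440 x local MJD, rounded to an integer (rounding is assumed)."""
    return round((local_time - MJD_EPOCH).total_seconds() / 60.0)
```

For dates within the survey (local MJD near 58,000), the stamp is about 8.4 × 10^7; for instance, local midnight on MJD 58649 maps to 84,454,560.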

2.2. LAMOST MRS Spectra

The whole LAMOST MRS DR7 catalog (R ∼ 7500) contains over five million single-exposure spectra of over 800,000 stars obtained from 2017 September to 2019 June. In this work, 3,753,659 spectra with S/N > 5 either in the blue arm or in the red arm for 600,771 stars are selected. The distributions of the number of exposures and the time span of these stars are presented in Figure 1. The spectra are oversampled. The typical wavelength steps are 0.11 and 0.14 Å at the blue and red arms, respectively. The sampling rate (λ/Δλ ∼ 45,000) is as high as six times the spectral resolution (R ∼ 7500). In Figure 2, we show a spectrum of a K-type giant star as an example of the LAMOST MRS. The blue arm and red arm are designed mainly for the Mg i b triplet at around 5175 Å and for Hα at around 6564 Å. The released spectra are not corrected with response curves.

Figure 1. The distributions of the number of exposures and the time span of the stars observed in LAMOST MRS DR7.

Figure 2. An example of a single-exposure spectrum of a K-type giant star observed in LAMOST MRS DR7 (med-58649-TD193637N444141K01_sp16-249, KIC 9833073). The upper panels show the continuum-normalized blue- and red-arm spectra. The lower panels show zoomed-in views of the upper panels at the Mg b and Hα regions. The zoomed-in regions are shown as gray blocks in the upper panels. The S/N of the blue and red arms are 29 and 61, respectively.

3. Measurement of RVs

3.1. Preparation for RV Measurements

We notice that cosmic rays frequently pollute MRS single-exposure spectra and are neither identified nor removed by the LAMOST pipeline. Therefore, in our method, the first step is to carefully remove cosmic rays in spectra. We smooth spectra with a 21-pixel median filter followed by a 9-pixel Gaussian filter, and remove the original pixels that deviate from the smoothed spectrum by four and eight times the local standard deviation in the upper and lower directions. The parameters of the filters are set empirically so that the absorption lines in the spectra of A-, F-, G-, and K-type stars are not affected. The removed pixels are replaced by linearly interpolated values using neighboring pixels. We also note that the two ends of both the blue and red arms are sometimes tilted and show unreasonable flux values probably due to extrapolation of the sky flux modeling, so we trim 50 pixels at both edges of each arm.
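The cosmic-ray step can be sketched as follows. This is an illustrative reading of the description above, not our production code: the function name is hypothetical, the 9-pixel Gaussian width is interpreted as an FWHM, and the local standard deviation is approximated by a scaled running median of the absolute residuals in a 101-pixel window (both are assumptions).

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d, median_filter


def remove_cosmic_rays(flux, n_upper=4.0, n_lower=8.0, trim=50):
    """Return the cleaned, edge-trimmed flux and the mask of replaced pixels."""
    flux = np.asarray(flux, dtype=float)
    # 21-pixel median filter followed by a Gaussian filter.
    # Assumption: "9-pixel" is an FWHM, so sigma = 9 / 2.355 pixels.
    smooth = gaussian_filter1d(
        median_filter(flux, size=21, mode="nearest"), sigma=9 / 2.355, mode="nearest"
    )
    resid = flux - smooth
    # Assumption: local scatter from a running median absolute residual.
    local_sigma = 1.4826 * median_filter(np.abs(resid), size=101, mode="nearest")
    local_sigma = np.maximum(local_sigma, 1e-12)
    # Asymmetric clipping: 4 sigma above, 8 sigma below the smoothed spectrum.
    bad = (resid > n_upper * local_sigma) | (resid < -n_lower * local_sigma)
    pix = np.arange(flux.size)
    cleaned = flux.copy()
    # Replace flagged pixels by linear interpolation over the good neighbors.
    cleaned[bad] = np.interp(pix[bad], pix[~bad], flux[~bad])
    # Trim 50 pixels at both edges of the arm.
    return cleaned[trim:-trim], bad[trim:-trim]
```

The looser threshold on the lower side is what protects absorption lines, which only deviate downward from the smoothed spectrum.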

The second step is to normalize spectra to a pseudocontinuum to place the spectral features (usually absorption lines) on the same scale. We iteratively fit the spectrum with a smoothing spline function and clip the pixels away from the median values by three times the standard deviation in each 100 Å window. The number of iterations is set to 3. As shown in Figure 2, for a typical K-type star with a medium S/N, the normalization is quite adequate for the subsequent RV measurements.
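A minimal sketch of such an iterative normalization is given below. It is not the exact routine used in this work: a least-squares cubic spline with knots every 100 Å (`scipy.interpolate.LSQUnivariateSpline`) stands in for the smoothing spline, and the function name and knot placement are assumptions.

```python
import numpy as np
from scipy.interpolate import LSQUnivariateSpline


def fit_pseudocontinuum(wave, flux, window=100.0, n_sigma=3.0, n_iter=3):
    """Iteratively fit a spline and clip outliers within each 100 A window."""
    wave = np.asarray(wave, dtype=float)
    flux = np.asarray(flux, dtype=float)
    keep = np.ones(wave.size, dtype=bool)
    # Assumption: interior knots spaced by one window length.
    knots = np.arange(wave[0] + window, wave[-1] - window / 2, window)
    # Index of the window that each pixel belongs to.
    win = ((wave - wave[0]) // window).astype(int)
    for _ in range(n_iter):
        spline = LSQUnivariateSpline(wave[keep], flux[keep], knots, k=3)
        resid = flux - spline(wave)
        for w in np.unique(win):
            sel = win == w
            med = np.median(resid[sel])
            std = np.std(resid[sel])
            # Keep pixels within n_sigma standard deviations of the window median.
            keep[sel] = np.abs(resid[sel] - med) <= n_sigma * std
    # Final fit on the pixels that survived the last clipping pass.
    spline = LSQUnivariateSpline(wave[keep], flux[keep], knots, k=3)
    return spline(wave)
```

Dividing the observed flux by the returned pseudocontinuum gives the normalized spectrum; the same function would be applied to the templates so that both sides of the comparison share any normalization bias.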

3.2. Spectral Templates

We adopt the synthetic grid published by Allende Prieto et al. (2018) based on the ATLAS9 stellar atmosphere model (Kurucz 1979; Mészáros et al. 2012) as our spectral library. We degrade the spectral resolution from 10,000 to 7500 to fit the MRS configuration and convert the air wavelength to vacuum wavelength using the formula proposed by Morton (2000). To limit the computational cost of the subsequent RV measurements to a reasonable amount, we interpolate the synthetic library to generate 100 spectral templates with stellar parameters randomly drawn from a uniform distribution in the ranges of 3500 < Teff/K < 15,000, $0\lt \mathrm{log}g/\mathrm{dex}\lt 5$, −2 < [Fe/H]/dex < 0.5, and −0.5 < [α/Fe]/dex < 0.7. Extremely metal-poor and extremely hot templates are not considered because the spectral features are not significant. Tests on high-S/N MRS spectra show that the sparsity of our templates induces statistical errors ∼0.10 and 0.20 km s−1 in the blue and red arms, respectively, which are negligible as compared to other sources of uncertainties. Figure 3 shows the distribution of parameters of the 100 spectral templates and a series of PARSEC isochrones (Bressan et al. 2012) with solar metallicity and logarithmic age ${\mathrm{log}}_{10}\tau =7$, 8, 9, and 10. These spectral templates are normalized to a pseudocontinuum in the same way as in Section 3.1.
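Two ingredients of the template construction can be written down compactly. The sketch below assumes a Gaussian line-spread function, so that resolutions combine in quadrature, and it draws parameters from the full rectangular ranges without checking the model grid coverage; the function names are hypothetical.

```python
import numpy as np


def degrade_kernel_fwhm(wavelength, r_in=10000.0, r_out=7500.0):
    """FWHM of the Gaussian kernel that degrades resolution r_in to r_out.

    Assumption: Gaussian profiles, so FWHM_out^2 = FWHM_in^2 + FWHM_kernel^2
    with FWHM = wavelength / R.
    """
    return wavelength * np.sqrt(1.0 / r_out**2 - 1.0 / r_in**2)


def draw_template_parameters(n=100, seed=0):
    """Draw n parameter sets uniformly from the ranges quoted in the text."""
    rng = np.random.default_rng(seed)
    return {
        "teff": rng.uniform(3500.0, 15000.0, n),
        "logg": rng.uniform(0.0, 5.0, n),
        "feh": rng.uniform(-2.0, 0.5, n),
        "alpha_fe": rng.uniform(-0.5, 0.7, n),
    }
```

Under the Gaussian assumption, the kernel FWHM near the Mg i b triplet (5175 Å) is about 0.46 Å, i.e., roughly four blue-arm pixels.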

Figure 3. The stellar parameters of the 100 randomly drawn templates that are used in the RV measurements. The gray area shows the model grid coverage. The black solid lines are the PARSEC isochrones with solar metallicity and ${\mathrm{log}}_{10}\tau =7$, 8, 9, and 10, where τ is age/yr.

3.3. RV Estimates

The cross-correlation function (CCF; Tonry & Davis 1979) method is widely used in spectroscopic surveys to measure stellar RVs (e.g., Nidever et al. 2015; Steinmetz et al. 2020b). One important advantage is that the CCF can be accelerated using fast Fourier transformation (FFT) once a spectrum is continuum-subtracted and resampled to a logarithmic wavelength grid. However, the drawback of such a scheme is that the sampling of the resulting CCF is generally very sparse. To evaluate the CCF at a smaller RV step, we do not follow the FFT way. In our implementation, the CCF at an RV of v is evaluated as

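$\mathrm{CCF}(v\,| \,{\boldsymbol{F}},{\boldsymbol{G}})=\frac{\mathrm{Cov}({\boldsymbol{F}},{\boldsymbol{G}}(v))}{\sqrt{\mathrm{Var}({\boldsymbol{F}})\,\mathrm{Var}({\boldsymbol{G}}(v))}},$    (1)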

where F is the vector of the normalized observed spectrum, G (v) is the vector of the normalized synthetic spectrum shifted by an RV (v) and resampled to the wavelength grid of F , Var represents the variance operator, and Cov represents the covariance operator (see Appendix A for more details).
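A direct (non-FFT) evaluation of such a normalized CCF might look like the sketch below. The function name, the linear interpolation used for resampling, and the non-relativistic Doppler factor are assumptions made for illustration; Appendix A gives the details of our implementation.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s


def ccf(v, wave_obs, flux_obs, wave_tpl, flux_tpl):
    """Normalized CCF between an observed spectrum and a template shifted by v (km/s)."""
    # Shift the template wavelengths and resample onto the observed grid.
    # Assumption: non-relativistic Doppler shift and linear interpolation.
    shifted = np.interp(wave_obs, wave_tpl * (1.0 + v / C_KMS), flux_tpl)
    f = flux_obs - np.mean(flux_obs)
    g = shifted - np.mean(shifted)
    # Cov(F, G) / sqrt(Var(F) Var(G)); the 1/N factors cancel.
    return np.sum(f * g) / np.sqrt(np.sum(f**2) * np.sum(g**2))
```

Because the function takes an arbitrary velocity rather than an integer pixel lag, it can be passed directly to a continuous optimizer, which is the reason for not following the FFT route.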

Deriving the final RV consists of three steps:

  • 1.  
    The initial estimates are made from an RV grid of −1500 to 1500 km s−1, with a step of 10 km s−1. The template with the maximum CCF value is selected as the best-match template, and the corresponding RV is adopted as the initial guess of the final RV of the observed star. The parameters of the template (Teff, $\mathrm{log}g$, [Fe/H], and [α/Fe]) are recorded, which make a good prior for some following analyses, such as for stellar atmospheric parameter determination.
  • 2.  
    With the best-match template, we maximize the CCF to determine the final RV (vobs) using the optimization routine scipy.optimize.minimize with the Nelder–Mead algorithm (Nelder & Mead 1965). The corresponding CCF value is recorded as CCFMAX to assess the likelihood between the best-match template and the observed spectrum. The S/N–CCFMAX relations are shown in Figure 4.
  • 3.  
    To obtain the measurement error σv,obs, we use a Monte Carlo method: we repeat this process 100 times, each time adding Gaussian random noise to the spectrum according to the flux error. The measurement error is computed using the 16th and 84th percentiles, i.e., σv,obs = (v84 − v16)/2. The S/N–σv,obs relations are shown in Figure 5.
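The three steps above can be sketched as follows. This is a simplified illustration rather than our pipeline: the names `ccf` and `measure_rv` are hypothetical, templates are passed as (wavelength, flux) pairs, and in the Monte Carlo step only the refinement is repeated while the best-match template and the coarse initial guess are kept fixed (an assumption).

```python
import numpy as np
from scipy.optimize import minimize

C_KMS = 299792.458  # speed of light in km/s


def ccf(v, wave, flux, wave_tpl, flux_tpl):
    """Normalized CCF of a spectrum with a template shifted by v (km/s)."""
    g = np.interp(wave, wave_tpl * (1.0 + v / C_KMS), flux_tpl)
    f = flux - flux.mean()
    g = g - g.mean()
    return np.sum(f * g) / np.sqrt(np.sum(f**2) * np.sum(g**2))


def measure_rv(wave, flux, flux_err, templates, n_mc=100, seed=0):
    """Return (v_obs, sigma_v_obs, ccf_max, index of the best-match template)."""
    rng = np.random.default_rng(seed)
    # Step 1: coarse grid from -1500 to 1500 km/s in steps of 10 km/s.
    v_grid = np.arange(-1500.0, 1500.0 + 10.0, 10.0)
    grid = np.array(
        [[ccf(v, wave, flux, wt, ft) for v in v_grid] for wt, ft in templates]
    )
    i_tpl, i_v = np.unravel_index(np.argmax(grid), grid.shape)
    wt, ft = templates[i_tpl]
    v_init = v_grid[i_v]

    def refine(spectrum):
        # Step 2: maximize the CCF with Nelder-Mead (minimize its negative).
        res = minimize(
            lambda p: -ccf(p[0], wave, spectrum, wt, ft),
            x0=[v_init],
            method="Nelder-Mead",
        )
        return res.x[0], -res.fun

    v_obs, ccf_max = refine(flux)
    # Step 3: Monte Carlo error from noise-perturbed copies of the spectrum.
    v_mc = [
        refine(flux + rng.normal(0.0, flux_err, size=flux.shape))[0]
        for _ in range(n_mc)
    ]
    v16, v84 = np.percentile(v_mc, [16, 84])
    return v_obs, (v84 - v16) / 2.0, ccf_max, int(i_tpl)
```

The coarse grid only has to land within the main CCF peak; the optimizer then supplies the sub-grid resolution that an FFT-based CCF would lack.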

We avoid Gaussian fitting to the CCF, which is widely used in literature (e.g., Nidever et al. 2015; Wang et al. 2019). Our reasons for doing this include the fact that the CCF peak is obviously non-Gaussian and that at low S/N the fitting process is fragile. The final error of the RV includes the measurement error and a noise floor term that is due to systematics (e.g., background, detector imperfections, temperature changes, or focusing issues) and will be assessed below.

Figure 4. The left/middle/right panel shows the S/N–CCFMAX relation for the B (blue arm) / R (red arm) / Rm (red arm with Hα masked) CCF results of 5000 randomly selected spectra, respectively. The color denotes the Teff of the best-match template.

Figure 5. The left/middle/right panel shows the S/N–σv,obs (measurement error) relation for the B (blue arm) / R (red arm) / Rm (red arm with Hα masked) CCF results of 5000 randomly selected spectra, respectively. The color denotes the Teff of the best-match template.

In this work, one RV is estimated for the blue arm (vB) and two for the red arm (vR and vRm). The vRm is measured using the Hα-masked red-arm spectrum. As shown in Figures 4 and 5, these RV measurements deteriorate rapidly at S/N < 20. For cool stars (e.g., FGK types), at a given S/N, vB is more precise than vR because of rich spectral features in the blue arm (for a detailed discussion of spectral information content in MRS spectra, see Zhang et al. 2020b). However, for hot stars, vR is more reliable than vB because of the Hα feature and vRm is significantly less precise than vR due to the absence of Hα. Therefore, we recommend that readers only consider vRm when the targets have (possible) Hα emissions.

4. RVZPs

In essence, the RVZP is the bias of the wavelength solution of a specific spectrum as compared to its true wavelength solution in terms of RV. It is affected by many factors, e.g., the condition of the instrument, the quality of the arc lamp exposure, the reduction algorithm, and the nonsimultaneous nature of the arc lamp exposures and the object exposures. In this paper, we define the RVZP correction value Δv by

$\Delta v={v}_{\mathrm{abs}}-{v}_{\mathrm{obs}},$    (2)

where vabs is the absolute RV and vobs the RV directly measured from the spectrum.

4.1. The Scheme

LAMOST has 16 spectrographs of which each has 250 fibers (4000 fibers in total). Excluding a few tens of sky fibers and a few problematic fibers, each spectrograph typically produces ≲200 spectra in an exposure, depending on the targeting, the condition of the instrument, the data quality, and the reduction algorithm. Let i denote the exposure epoch or LMJM, j the spectrograph ID, and k the fiber ID; ideally, we seek the solution of the RVZP for each fiber, for each spectrograph, and exposure by exposure, namely, Δvi,j,k . This scheme is infeasible because the true/reference RVs vi,j,k,abs of the targets are not always known.

In this work, assuming that the fibers in a spectrograph in one exposure (hereafter, we refer to it as a spectrograph exposure unit, or an SEU) share similar RVZPs, the systematic RVZP Δvi,j can be determined as long as a homogeneous reference set of RVs can be found for a fraction of fibers in that SEU. The assumption is quite reasonable given the fact that the wavelength calibration of a multifiber spectrograph is done by fitting a 2D grating equation. And as we will see in Section 4.2, the Gaia DR2 RVs (Katz et al. 2019) meet our needs for the reference set.

By contrast, both the LAMOST pipeline and Wang et al. (2019) calculate Δvj assuming that the RVZPs for a specific spectrograph do not vary with time, and therefore ignore the temporal variation of RVZPs. However, we notice that some anomalous absolute RVs exist as a result (as we will see in Section 5).

4.2. Gaia DR2 RVs as the Reference Set

Thanks to the European Space Agency's Gaia mission (Gaia Collaboration et al. 2016), we now have access to the largest RV data set, which matches the LAMOST MRS in terms of velocity precision and magnitude limit. The spectral resolution of Gaia-RVS (R ∼ 11,500; Cropper et al. 2018) is slightly higher than that of the MRS (R ∼ 7500). The magnitude limit of the Gaia DR2 RV catalog (Katz et al. 2019) is at GRVS = 12 mag or G ∼ 14 mag depending on spectral type and line-of-sight interstellar extinction, which is slightly brighter than the LAMOST MRS magnitude limit. Gaia DR2 contains qualified median RVs for 7,224,631 stars derived from the Gaia-RVS spectra, with Teff in the range [3550, 6900] K, excluding large RV-variant stars (see Katz et al. 2019 for details). At the faint end, GRVS = 11.75 mag, the precisions for Teff = 5000 and 6500 K are 1.4 and 3.7 km s−1, respectively.

Aiming at studying time-domain astrophysical phenomena, e.g., spectroscopic binaries, we proceed to carrying out the second-best scheme—Δvi,j . Cross-matching the LAMOST MRS DR7 catalog with Gaia DR2, we find 1,582,948 out of the 3,753,659 single-exposure spectra (42.1%) have Gaia RVs, and the common objects usually have good S/N in the MRS because they are relatively bright in that survey. The number of objects in Gaia DR2 is ∼1000 times larger than that in catalogs of RV standard stars, such as that of Huang et al. (2018). The challenge arises because not all of the seven million objects in the Gaia-RVS catalog are RV-invariant, i.e., quite a number of them are pulsating stars or binary/multiple systems that have periodic/nonperiodic RV variations. Below, we demonstrate a robust method that can determine the RVZP self-consistently for each spectrograph in each exposure (Δvi,j ) by comparing the observed RVs to the Gaia DR2 RVs without identifying RV standard stars.

4.3. Self-consistent RVZPs

Assuming that the RV variables vary with random periods at random phases or nonperiodically, and are not the majority of the observed stars, we can regard them as outliers and use a robust method, e.g., least absolute residual (LAR) regression (or least absolute deviation regression; see Press et al. 2007), to estimate Δvi,j (the common RV bias shared by the objects in an SEU). From a Bayesian perspective, LAR regression originates from an exponential likelihood while least squares (LSQ) regression comes from a Gaussian likelihood. Utilizing the LAR technique, extreme values have less influence on the fit compared to those in LSQ regression. Besides, since we aim at time-domain analysis, as long as our RVZPs are temporally self-consistent, the absolute scales are not very important.
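The robustness argument can be illustrated with a toy experiment. For a single common offset with equal weights, the LSQ estimate is the mean of the RV differences and the LAR estimate is their median. The numbers below (a 5 km s−1 zero-point, 1 km s−1 noise, 20% of stars perturbed by a 30 km s−1 scatter) are arbitrary assumptions chosen for illustration.

```python
import numpy as np


def compare_estimators(true_zp=5.0, n_star=100, frac_variable=0.2, seed=42):
    """Return the (LSQ, LAR) zero-point estimates for a contaminated toy sample."""
    rng = np.random.default_rng(seed)
    # v_ref - v_obs for RV-invariant stars: true offset plus 1 km/s noise.
    diff = true_zp + rng.normal(0.0, 1.0, n_star)
    # A minority of RV variables caught at random phases act as outliers.
    is_variable = rng.random(n_star) < frac_variable
    diff[is_variable] += rng.normal(0.0, 30.0, is_variable.sum())
    zp_lsq = diff.mean()      # minimizes the sum of squared residuals
    zp_lar = np.median(diff)  # minimizes the sum of absolute residuals
    return zp_lsq, zp_lar
```

Repeating the experiment with different seeds typically shows the median staying within a few tenths of a km s−1 of the true offset, while the mean scatters by several times more because each outlier pulls it in proportion to its amplitude.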

Using the indices proposed in Section 4.1, for each group of pointings, we construct a global cost function f as follows:

Equation (3)

where Δ v is the vector of {Δvi,j } for all relevant SEUs in the group of pointings; vi,j,k,obs and σi,j,k,obs are the RV and associated measurement error of the kth star in the SEU {i, j}; vi,j,k,GAIA and σi,j,k,GAIA are the Gaia RV and associated uncertainty of the kth star in SEU {i, j}; σmin is the noise floor of the measured RV, which indicates the stability of the wavelength calibration, i.e., the dispersion of Δvi,j,k in SEU {i, j}; Δvi,j is the RVZP correction value of SEU {i, j}, which is a free variable to be solved; $\overline{{v}_{\cdot ,\mathrm{obs}}}$ and σ.,obs are the median and scatter of the (RVZP-corrected) measured RVs of the star {i, j, k} in other SEUs; and Λ1 and Λ2 are the regularization parameters of the two terms. In this scheme, the first term guarantees that the absolute scale of our RVZP-corrected RVs is close to that of Gaia DR2, while the second term makes use of multiple exposures and guarantees that the relative RVZPs are self-consistent. We set Λ1 = Λ2 = 1 so that the final correction of each SEU is determined as an average of the two effects. Then the vector Δ v is determined by minimizing the cost function f, i.e., ${\boldsymbol{\Delta }}{\boldsymbol{v}}=\arg \,\min f$. This algorithm can be implemented by minimizing

Equation (4)

for each SEU {i, j} sequentially and iteratively, where Δvi,j is the RVZP correction value for SEU {i, j}. Therefore, the problem is to solve for the vector Δ v that has NSEU elements, where NSEU is the number of related SEUs. We consider Δ v determined when the L∞ norm of the difference between the solutions in the lth and (l + 1)th iterations is less than a specified value, i.e., $\max (| {\boldsymbol{\Delta }}{{\boldsymbol{v}}}_{l}-{\boldsymbol{\Delta }}{{\boldsymbol{v}}}_{l+1}| )\lt \epsilon $, where ε is the tolerance and is set to 0.075 km s−1.

SEUs with RVZP correction values ∣Δvi,j ∣ > 50 km s−1 or associated uncertainties ${\sigma }_{{\rm{\Delta }}{v}_{i,j}}\gt 10$ km s−1 (see Section 4.5) are excluded from the iterations. These results are generally due to (1) the number of Gaia DR2 objects being ≤10, (2) the spectral S/N being too low, or (3) bad spectra due to saturation or instrumental problems. We do not think that for these SEUs our scheme and assumptions are valid, so we only keep their initial guesses of Δvi,j (see Section 4.4) and their uncertainties (see Section 4.5) in our catalog (see Section 6).

4.4. Tricks to Accelerate the Algorithm

Several tricks are used to accelerate the algorithm. The first trick is to get a good initial estimation of Δ v . A good approximation can be made by ignoring the second term in Equation (4), so that with Gaia DR2 RVs we can roughly estimate Δvi,j by minimizing

Equation (5)

We note that if an SEU has only a few objects in common with the Gaia catalog, the estimation is risky. Therefore, we require that an SEU at least have 10 objects in common with Gaia DR2 to proceed; otherwise we calculate the initial guess Δvi,j,init but exclude this SEU in the iteration process.
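A sketch of this initial estimate for a single SEU is given below. The cost used here, a sum of absolute Gaia-minus-corrected residuals weighted by the quadrature sum of the measurement error, the Gaia uncertainty, and the noise floor, is our assumed reading of the Gaia-only term built from the symbols defined in Section 4.3; the function name and return convention are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def initial_rvzp(v_obs, sig_obs, v_gaia, sig_gaia, sig_min=0.0, n_min=10):
    """Return (initial RVZP correction, whether the SEU enters the iteration)."""
    v_obs = np.asarray(v_obs, dtype=float)
    v_gaia = np.asarray(v_gaia, dtype=float)
    sigma = np.sqrt(
        np.asarray(sig_obs, dtype=float) ** 2
        + np.asarray(sig_gaia, dtype=float) ** 2
        + sig_min**2
    )

    def cost(p):
        # Weighted least-absolute-residual cost (assumed form).
        return np.sum(np.abs(v_obs + p[0] - v_gaia) / sigma)

    # Start from the unweighted median difference and refine with Nelder-Mead.
    x0 = np.median(v_gaia - v_obs)
    dv_init = minimize(cost, x0=[x0], method="Nelder-Mead").x[0]
    # SEUs with fewer than n_min Gaia stars keep dv_init but skip the iteration.
    return dv_init, v_obs.size >= n_min
```

Returning the estimate together with a flag mirrors the rule above: sparse SEUs still receive an initial guess, but they are not allowed to influence the self-consistent solution.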

The second trick is to cut down NSEU by separating physically detached SEUs, which hastens the index evaluation in each iteration. In Equations (3) and (4), the second terms contain cross-terms, meaning that when solving the NSEU elements, an iteration process is needed to guarantee that the solution of Δ v is stable. However, the evaluation of indices is computationally expensive when NSEU grows. Therefore, before the optimization process, physically detached sky areas can be separated, so that the many index evaluation processes can be accelerated by cutting down the array sizes. We group the pointings of LAMOST MRS DR7 using a friends-of-friends algorithm with a 5° linking length, which is double the radius of the field of view of LAMOST in case of any possible common stars between them. Eventually, 137 groups are obtained, as shown in Figure 6. Initial guesses of Δv for each SEU are made by optimizing the cost function of Equation (3) for each group of plates.
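The grouping step can be sketched with a connected-components search on the graph of pointings closer than the linking length. This is one possible implementation, assuming that a dense pairwise separation matrix is affordable for the number of pointings; the function name is hypothetical.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components


def group_pointings(ra_deg, dec_deg, linking_length_deg=5.0):
    """Friends-of-friends grouping of plate centers; returns (n_groups, labels)."""
    ra = np.radians(np.asarray(ra_deg, dtype=float))
    dec = np.radians(np.asarray(dec_deg, dtype=float))
    # Unit vectors on the sphere; their dot product is cos(angular separation).
    xyz = np.column_stack(
        [np.cos(dec) * np.cos(ra), np.cos(dec) * np.sin(ra), np.sin(dec)]
    )
    cos_sep = np.clip(xyz @ xyz.T, -1.0, 1.0)
    # Two pointings are friends if they are closer than the linking length.
    linked = cos_sep >= np.cos(np.radians(linking_length_deg))
    # Groups are the connected components of the friendship graph.
    return connected_components(csr_matrix(linked), directed=False)
```

Chains of overlapping plates end up in one group even when their end members are more than 5° apart, which is the desired behavior because common stars can link SEUs transitively.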

Figure 6. Grouped pointings of the LAMOST MRS DR7 observations using a friends-of-friends method with a linking length of 5°. The field of view of LAMOST is a circle with a 2.5° radius if all spectrographs are in operation. The figure uses the equatorial coordinate system with Mollweide projection, and the pointings in a particular group are shown with the same color.

In practice, we find that if ε is too small, Δ v may jump back and forth between two solutions and fail to converge. Further analysis shows that this problem is related to the optimization method (we use the Nelder–Mead solver; switching to the Powell solver does not solve the problem, although the two solutions are different). We suspect that it stems from numerical issues in the optimization routine. To avoid such a situation, we add a random process to the algorithm: if in the lth iteration the solution is Δ v l and after looping over all related SEUs the solution is Δ v l,opt, we evaluate the (l + 1)th solution by Δ v l+1 = η(Δ v l,opt − Δ v l ) + Δ v l , where η is the learning rate randomly generated between η0 and η1. We set η0 = 0.5, η1 = 1.0, and ε = 0.075 km s−1; since the expectation of η is 0.75, the effective tolerance of our solution of Δ v is 0.075/0.75 = 0.10 km s−1, which is acceptable when compared to the typical precision of measured RVs (e.g., ∼1.5 km s−1 as reported by Wang et al. 2019).
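The damped iteration can be illustrated on a toy problem. The sketch below is not our production code: it assumes unit weights, that every star is observed in every SEU (at least two SEUs), and a two-term absolute-residual cost (a Gaia term plus a term tying each SEU to the median corrected RV of the same stars in the other SEUs); the function name is hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def solve_rvzps(v_obs, v_gaia, eps=0.075, eta_range=(0.5, 1.0), max_iter=100, seed=0):
    """Toy solver. v_obs has shape (n_seu, n_star); v_gaia has shape (n_star,)."""
    rng = np.random.default_rng(seed)
    v_obs = np.asarray(v_obs, dtype=float)
    v_gaia = np.asarray(v_gaia, dtype=float)
    # Initial guess from the Gaia term alone.
    dv = np.median(v_gaia[None, :] - v_obs, axis=1)
    for _ in range(max_iter):
        dv_opt = dv.copy()
        # Loop over SEUs sequentially, reusing corrections already updated.
        for s in range(v_obs.shape[0]):
            others = np.delete(v_obs + dv_opt[:, None], s, axis=0)
            v_bar = np.median(others, axis=0)

            def cost(p, s=s, v_bar=v_bar):
                shifted = v_obs[s] + p[0]
                return np.sum(np.abs(shifted - v_gaia)) + np.sum(np.abs(shifted - v_bar))

            dv_opt[s] = minimize(cost, x0=[dv_opt[s]], method="Nelder-Mead").x[0]
        # Damped update with a random learning rate between eta_0 and eta_1.
        eta = rng.uniform(*eta_range)
        dv_next = eta * (dv_opt - dv) + dv
        converged = np.max(np.abs(dv_next - dv)) < eps  # L-infinity criterion
        dv = dv_next
        if converged:
            break
    return dv
```

Because η changes at every iteration, a two-state oscillation cannot repeat exactly, which is the intent of the random step; the price is that the convergence test is applied to a damped step, hence the effective tolerance ε/⟨η⟩.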

Finally, the RVZP corrections for all the 137 groups of pointings converge after several tens of iterations with a Dell Precision R740 workstation with two Intel Xeon Platinum 8260 CPUs (2.40 GHz), among which the longest solution takes ∼10 hr. Compared to the computation of RV measurements using the CCF for the 3.8 million spectra, including the blue and red arms (∼1 week with the same machine), computing the RVZPs is quite fast.

4.5. Uncertainty Estimation

Rigorous uncertainties are very difficult to obtain for our RVZP corrections. The uncertainties of the RVZP correction values consist of two parts, namely the tolerance in the iteration process and the formal error. The effective tolerance is ε = 0.1 km s−1, as mentioned above. For the latter part, based on the discussion presented in Appendix B, we use the 16th and 84th percentiles to construct a fiducial error of our RVZP correction values Δvi,j, which is divided by an empirical correction ξ to give the formal error. Hence, the total uncertainties of the RVZP correction values are evaluated via

Equation (6)

where i and j index the SEUs, q16 and q84 denote the 16th and 84th percentiles of the residuals of the Gaia DR2 RVs and the RVZP-corrected LAMOST MRS RVs, Ni,j is the number of Gaia DR2 objects with RVs, and ξ is the empirical correction factor for small-number statistics.

5. Results and Validation

In total, we have measured RVs from 3,181,157/3,723,934 single-exposure blue/red-arm spectra in LAMOST MRS DR7 with S/N higher than 5. For 36,301/37,122/37,122 (B/R/Rm) of the 37,624 SEUs, we have successfully derived the initial values of the RVZPs. After eliminating bad SEUs with the criteria described at the end of Section 4.3, we estimate the final RVZPs for 33,073/35,207/35,199 (B/R/Rm) SEUs, which cover 2,985,015/3,631,023/3,629,895 B/R/Rm RVs. Roughly, the percentages of coverage are 87.9%/93.6%/93.6% for B/R/Rm in terms of SEUs and 93.8%/97.5%/97.5% for B/R/Rm in terms of RVs.

5.1. The Temporal Variation of RVZPs

In Figure 7, we present the Δvi,j for each SEU for Sc and ThAr arc lamps versus the date. The Sc lamp was in use until 2018 October, after which it was replaced by the ThAr lamp. The mean uncertainties of ΔvB, ΔvR, and ΔvRm are all ∼0.38 km s−1, which is quite good. The median uncertainties are even 20% smaller. The ΔvB and ΔvR have different patterns while ΔvR and ΔvRm are very similar. In Figure 8 we show the distribution of ΔvB, ΔvR, and ΔvRm of the SEUs solved. The μ here is estimated using the median, and σ is estimated using (q84 − q16)/2. The μ are 0.49 and 6.47 km s−1 for the ThAr and Sc lamps in the blue arm, while in the red arm they are 0.26 and 4.91 km s−1, respectively. The σ are 1.07 and 1.06 km s−1 for the ThAr and Sc lamps in the blue arm, and in the red arm they are 0.85 and 0.68 km s−1. Generally, the different systematics for the Sc- and ThAr-lamp-calibrated data are consistent with Wang et al. (2019). Despite the large systematics, we find that the precision of the Sc-lamp-calibrated data is no worse than that of the data calibrated by the ThAr lamp. The Rm results are very similar to those of R, except their μ have a 0.3 km s−1 difference. We note that this is reasonable considering that the systematics vary with wavelength as shown in Ren et al. (2021). In addition, ∼4000 single-exposure spectra calibrated using an Ne lamp are also found in DR7 v1.1. We confirm that the Ne lamp was used to calibrate the LRS spectra and these mistakenly calibrated data will be removed from the internationally available version of DR7, so we exclude these data in the following analysis.
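The robust location and scale used here are straightforward to compute; a minimal sketch follows (the function name is hypothetical).

```python
import numpy as np


def robust_mu_sigma(values):
    """Median and half the 16th-84th percentile range of a sample."""
    q16, q50, q84 = np.percentile(values, [16, 50, 84])
    return q50, (q84 - q16) / 2.0
```

For a Gaussian distribution, (q84 − q16)/2 is very close to the standard deviation, but unlike the standard deviation it is barely affected by a small fraction of badly calibrated SEUs.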


Figure 7. The temporal variations of the RVZP correction values for the blue arm (ΔvB), red arm (ΔvR), and Hα-masked red arm (ΔvRm) are shown in the top, middle, and bottom panels, respectively. The gray/cyan/olive square markers show the RVZP correction values of SEUs calibrated with an Sc/ThAr/Ne lamp in LAMOST MRS DR7. The maximum/minimum peak fluxes of Th 5231 Å and Ar 6752 Å of the 16 spectrographs are shown using gray areas.


Figure 8. Histograms of ΔvB, ΔvR, and ΔvRm. The Sc- and ThAr-lamp-calibrated data are shown in gray and cyan, respectively. The μ and σ are calculated using the median and (q84 − q16)/2, respectively.


It is clear that the RVZPs are reasonably stable except after around 2019 May 1, when they change in a way that seems to be correlated with the arc lamp exposure flux. We plot the Δv for each spectrograph for the time interval with available arc lamp intensity in Figures 9 and 10. Since our mean uncertainty of Δvi,j is 0.38 km s−1, we regard shifts in Δvi,j larger than 1 km s−1 as significant; these are seen in the 7th, 12th, 13th, and 15th spectrographs of the blue arm and the 7th, 9th, and 15th spectrographs of the red arm. It is currently not known whether these shifts are due to the new ThAr lamps that were brought into use at around 2019 May 1 or to some other issues introduced during maintenance. With the longer time baseline of RVZP data in DR8, we may be able to address this problem.


Figure 9. The temporal variation of ΔvB before/after 2019 May 1 for the blue arm. The gray filled areas show the 16th and 84th percentiles of ΔvB in the two time intervals, and the black solid lines show the medians. The colors denote the square root of the peak flux of the Th5231 line, which can be used as an indicator of S/N. Here the new batch of ThAr lamps also introduces RVZP variation.


Figure 10. The same as Figure 9 but for the red arm (ΔvR). The colors denote the square root of the peak flux of the Ar6752 line.


5.2. Validating with Other Data Sets

We also validate our RVs with stars common to other data sets, namely, Gaia DR2 (Katz et al. 2019), APOGEE DR16 (Column VHELIO_AVG; Jönsson et al. 2020), the RV standard stars from Huang et al. (2018), GALAH DR3 (Column rv_galah; Buder et al. 2021), and RAVE DR6 (Column hrv_sparv; Steinmetz et al. 2020b). The mean μ and scatter σ derived from Gaussian fitting to the residuals are tabulated in Table 2 and shown as functions of S/N in Figure 11. Also shown are the results of the LAMOST pipeline and Wang et al. (2019). Note that, in DR7 v1.1, the LAMOST pipeline only provides two RVs measured from the blue arm using ELODIE empirical templates (Moultaka et al. 2004) and ATLAS9 synthetic templates (Castelli & Kurucz 2003), respectively. In future MRS data releases, the RVs using ELODIE templates will be removed, and the RVs based on ATLAS9 will be provided for both blue and red arms for better performance. Therefore, we use calibrated RVs of blue arms based on ATLAS9 templates in our comparison (Column rv_ku1). The Wang et al. (2019) catalog is a subset of ours (including data taken during the first year and a half with an S/N cut at 10). Spectra without our measurements (i.e., either S/N < 5 or Δv is invalid) are excluded from this comparison, so that it is fair to the other two RV sources. At the high-S/N end (50–100), we find the standard deviations derived from Gaussian fitting for our results can reach 1.00/1.10, 0.84/0.80, 0.69/0.74, 0.72/0.78, and 1.77/1.88 km s−1 with respect to Gaia DR2, APOGEE DR16, Huang et al. (2018), GALAH DR3, and RAVE DR6 in the blue/red arm. The common stars among LAMOST MRS DR7, Gaia DR2, and the other four reference sets are used to calculate the fiducial accuracy of Gaia DR2. At the high-S/N end (50 < S/N < 100), the LAMOST MRS gets close to the performance of Gaia DR2 (see Figure 11). Note that, in our algorithm, the Gaia DR2 RVs are used as a reference set, so the comparison with Gaia DR2 is not an independent validation. 
These comparisons are encouraging. At the high-S/N end, we outperform Wang et al. (2019) and the LAMOST pipeline by ∼20% and 30%, respectively, according to the blue-arm results compared to APOGEE DR16. At the low-S/N end (10–20), the advantages increase to 47% and 58%, respectively (Table 2), indicating that our algorithm of RV measurements and RVZP determinations is quite efficient. Since the Huang et al. (2018) sample is from APOGEE DR14, the comparison to Huang et al. (2018) has a 0.4 km s−1 systematic bias, which does not exist in our comparison to APOGEE DR16. This may be due to the update of the APOGEE data release. GALAH DR3 has a 0.23 km s−1 systematic bias as compared to Gaia DR2. The comparison with RAVE DR6, whose spectral resolution is the same as that of the LAMOST MRS but lower than that of Gaia-RVS, APOGEE, and GALAH, shows a larger scatter that is still reasonable.
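The μ and σ in these comparisons are described as coming from Gaussian fitting to the RV residuals. The exact fitting procedure is not specified in this section, so the sketch below assumes one common choice: a least-squares fit of a Gaussian profile to the histogram of residuals. The function names, bin width, and mock data are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit


def gaussian(x, amp, mu, sigma):
    """Unnormalized Gaussian profile."""
    return amp * np.exp(-0.5 * ((x - mu) / sigma) ** 2)


def fit_residuals(dv, half_width=10.0, bin_width=0.25):
    """Fit a Gaussian to the histogram of RV residuals (km/s).

    Assumed procedure: histogram the residuals within +/- half_width of
    the median and fit the bin counts by least squares.
    """
    dv = np.asarray(dv, dtype=float)
    center = np.median(dv)
    edges = np.arange(center - half_width,
                      center + half_width + bin_width, bin_width)
    counts, edges = np.histogram(dv, bins=edges)
    x = 0.5 * (edges[:-1] + edges[1:])
    p0 = [counts.max(), center, 1.0]
    popt, _ = curve_fit(gaussian, x, counts, p0=p0)
    return popt[1], abs(popt[2])


# Mock residuals: a Gaussian core plus 5% broad outliers (e.g., binaries).
rng = np.random.default_rng(1)
core = rng.normal(0.1, 0.8, size=5000)
outliers = rng.uniform(-30.0, 30.0, size=250)
mu_fit, sigma_fit = fit_residuals(np.concatenate([core, outliers]))
print(f"mu = {mu_fit:.2f} km/s, sigma = {sigma_fit:.2f} km/s")
```

Fitting the core of the distribution in this way keeps the scatter estimate from being inflated by stars with genuinely variable RVs.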


Figure 11. Comparisons of our absolute RVs from LAMOST MRS DR7 (including the RVs of this work, the LAMOST pipeline, and Wang et al. 2019) with other data sets, i.e., Gaia DR2 (Katz et al. 2019), APOGEE DR16 (Column VHELIO_AVG; Jönsson et al. 2020), the RV standard stars from Huang et al. (2018), GALAH DR3 (Column rv_galah; Buder et al. 2021), and RAVE DR6 (Column hrv_sparv; Steinmetz et al. 2020b). The scatters (σ) are estimated using the Gaussian fitting method.


Table 2. Comparison of the LAMOST MRS RVs to Other Data Sets

Each cell lists N / μ / σ. The Gaia DR2 RVS entry applies to all S/N bins of a reference set.

| Ref. Data Set | S/N | This Work, Blue Arm | This Work, Red Arm | This Work, Hα Masked | LAMOST, Blue Arm | Wang et al. (2019), Blue Arm | Wang et al. (2019), Red Arm | Gaia DR2 RVS |
|---|---|---|---|---|---|---|---|---|
| Gaia DR2 | 5 < S/N < 10 | 166,218 / −0.02 / 1.68 | 89,032 / −0.05 / 2.38 | 89,009 / −0.07 / 2.47 | 166,218 / −0.56 / 3.78 | N = 0 | N = 0 | |
| Gaia DR2 | 10 < S/N < 20 | 301,963 / −0.04 / 1.42 | 164,068 / −0.10 / 1.74 | 164,010 / −0.10 / 1.81 | 301,963 / −0.53 / 2.52 | 209,446 / −0.55 / 2.14 | 26,195 / −0.59 / 2.72 | |
| Gaia DR2 | 20 < S/N < 50 | 572,355 / −0.03 / 1.22 | 531,044 / −0.07 / 1.39 | 530,913 / −0.08 / 1.44 | 572,355 / −0.44 / 1.88 | 389,277 / −0.46 / 1.66 | 313,059 / −0.66 / 2.01 | |
| Gaia DR2 | 50 < S/N < 100 | 270,880 / 0.00 / 1.00 | 497,328 / −0.00 / 1.10 | 497,126 / −0.01 / 1.14 | 270,880 / −0.33 / 1.45 | 180,183 / −0.35 / 1.32 | 322,647 / −0.58 / 1.55 | |
| APOGEE DR16 | 5 < S/N < 10 | 35,352 / 0.22 / 1.26 | 23,362 / 0.08 / 1.99 | 23,354 / 0.06 / 2.17 | 35,352 / −0.39 / 3.72 | N = 0 | N = 0 | 28,155 / 0.02 / 0.67 |
| APOGEE DR16 | 10 < S/N < 20 | 49,482 / 0.13 / 0.95 | 38,971 / 0.08 / 1.27 | 38,987 / 0.02 / 1.38 | 49,482 / −0.41 / 2.24 | 33,297 / −0.46 / 1.81 | 4781 / −0.47 / 2.14 | |
| APOGEE DR16 | 20 < S/N < 50 | 80,556 / 0.06 / 0.82 | 87,168 / 0.06 / 0.92 | 87,159 / 0.03 / 0.99 | 80,556 / −0.41 / 1.51 | 52,872 / −0.43 / 1.28 | 48,104 / −0.49 / 1.60 | |
| APOGEE DR16 | 50 < S/N < 100 | 41,479 / −0.04 / 0.84 | 73,363 / 0.04 / 0.80 | 73,350 / 0.04 / 0.85 | 41,479 / −0.41 / 1.22 | 26,724 / −0.42 / 1.08 | 45,191 / −0.51 / 1.26 | |
| Huang et al. (2018) | 5 < S/N < 10 | 2753 / 0.44 / 1.11 | 1659 / 0.33 / 1.53 | 1661 / 0.36 / 1.69 | 2753 / 0.28 / 3.34 | N = 0 | N = 0 | 2070 / 0.34 / 0.59 |
| Huang et al. (2018) | 10 < S/N < 20 | 3404 / 0.43 / 0.84 | 2890 / 0.36 / 1.13 | 2890 / 0.36 / 1.18 | 3404 / 0.01 / 2.02 | 2316 / −0.18 / 1.54 | 161 / 0.18 / 2.07 | |
| Huang et al. (2018) | 20 < S/N < 50 | 5050 / 0.42 / 0.76 | 5923 / 0.39 / 0.81 | 5920 / 0.40 / 0.87 | 5050 / 0.01 / 1.31 | 3352 / −0.05 / 1.13 | 2938 / 0.10 / 1.47 | |
| Huang et al. (2018) | 50 < S/N < 100 | 2507 / 0.38 / 0.69 | 4917 / 0.44 / 0.74 | 4916 / 0.47 / 0.79 | 2507 / 0.06 / 0.98 | 1711 / 0.05 / 0.87 | 3127 / −0.08 / 1.03 | |
| GALAH DR3 | 5 < S/N < 10 | 30,272 / 0.32 / 1.37 | 17,317 / 0.26 / 2.18 | 17,271 / 0.22 / 2.37 | 30,272 / −0.25 / 4.06 | N = 0 | N = 0 | 8778 / 0.23 / 0.77 |
| GALAH DR3 | 10 < S/N < 20 | 42,242 / 0.27 / 1.00 | 32,885 / 0.29 / 1.40 | 32,768 / 0.23 / 1.54 | 42,242 / −0.20 / 2.48 | 29,780 / −0.28 / 1.99 | 5135 / −0.19 / 2.13 | |
| GALAH DR3 | 20 < S/N < 50 | 52,015 / 0.26 / 0.83 | 63,642 / 0.27 / 0.96 | 63,593 / 0.22 / 1.01 | 52,015 / −0.10 / 1.66 | 37,996 / −0.14 / 1.47 | 40,401 / −0.32 / 1.75 | |
| GALAH DR3 | 50 < S/N < 100 | 18,796 / 0.24 / 0.72 | 37,239 / 0.29 / 0.78 | 37,239 / 0.27 / 0.81 | 18,796 / −0.07 / 1.16 | 13,727 / −0.13 / 1.08 | 27,296 / −0.31 / 1.30 | |
| RAVE DR6 | 5 < S/N < 10 | 593 / 0.53 / 2.17 | 414 / 0.93 / 2.99 | 414 / 0.77 / 3.08 | 593 / 0.13 / 4.63 | N = 0 | N = 0 | 1674 / 0.33 / 1.60 |
| RAVE DR6 | 10 < S/N < 20 | 1273 / 0.47 / 1.94 | 719 / 0.50 / 2.08 | 719 / 0.37 / 2.03 | 1273 / −0.29 / 3.10 | 862 / −0.19 / 2.55 | 146 / 0.14 / 3.02 | |
| RAVE DR6 | 20 < S/N < 50 | 3343 / 0.39 / 1.81 | 2273 / 0.62 / 2.15 | 2273 / 0.55 / 2.17 | 3343 / −0.09 / 2.34 | 2539 / −0.16 / 2.14 | 1487 / 0.15 / 2.57 | |
| RAVE DR6 | 50 < S/N < 100 | 2332 / 0.35 / 1.77 | 3209 / 0.41 / 1.88 | 3209 / 0.35 / 1.86 | 2332 / 0.08 / 2.11 | 1798 / 0.05 / 1.93 | 2368 / −0.25 / 2.10 | |

Note. In each cell, we present the number of stars (N), the Gaussian-fitted systematic bias (μ/km s−1), and the standard error (σ/km s−1) with respect to a specific reference set.


5.3. Self-consistency

Besides the precision, a check on self-consistency is necessary and important before using RVs in time-domain research. As a demonstration, in Figures 12 and 13 we validate the temporal RV variations of the standard stars (whose RVs are assumed invariant) from Huang et al. (2018) in two pointings, namely, planid = HIP8426401 and HIP4312101, which have 10 and 17 exposures and contain 27 and 41 RV standard stars, respectively. The RVs from the LAMOST pipeline show large fluctuations, as expected from the comparison in Section 5.2. For many stars, the RVs of multiple exposures from the Wang et al. (2019) catalog have exactly the same values. This happens when their Gaussian fitting process fails and the algorithm falls back on the best estimate from the 1 km s−1 RV grid, which is a defect of the algorithm. By comparing the measured RVs and the RVZP-corrected RVs, we find that the RVZP-corrected RVs are much cleaner, indicating that the RVs do benefit from our algorithm for Δvi,j determination. The LAMOST pipeline and Wang et al. (2019) assume that the RVZP for a spectrograph is static, so their RVZP corrections are basically a constant shift of the RVs.


Figure 12. A comparison of the RVs of standard stars observed in LAMOST MRS DR7 (planid = HIP8426401) determined in this work, in LAMOST, and in Wang et al. (2019). Each color represents a unique RV standard star. The thick ticks show the RVs from Huang et al. (2018), the dashed lines show the RVs before RVZP corrections, and the circles connected by solid lines show the RVZP-corrected RVs. The first exposure is in 2018 April and is calibrated with an Sc lamp, so the correction value is quite different from those of other exposures.


Figure 13. The same as Figure 12 but for planid = HIP4312101.


To quantify the performance in self-consistency, we select RV standard stars from Huang et al. (2018) with at least eight exposures having valid RVZP-corrected RVs in our catalog and calculate their standard deviation, empirically corrected for the small-number statistics effect, which is discussed in Appendix B. The cumulative distribution function (CDF) of the standard deviations is shown in Figure 14. The absolute RVs from the blue arm (B) show the best self-consistency, followed by those from the red arm (R) and the Hα-masked red arm (Rm). By calculating the 50th, 90th, and 95th percentiles of the CDFs for the blue-arm results, we find our absolute RVs have significant advantages over those of the LAMOST pipeline, namely 16.5%, 35.5%, and 37.5% better (see Table 3), while the Wang et al. (2019) RVs are at nearly the same level as those of the LAMOST pipeline. This reveals the excellent self-consistency of our absolute RVs, which will be used in the time-domain analysis. Note that these estimates of the advantages are very conservative because, for example, the Wang et al. (2019) data set cuts the S/N at 10, whereas LAMOST and our sample cut at 5. In addition, we do not exclude any of the "exactly the same" RVs from Wang et al. (2019); readers may notice that the CDFs for LAMOST and Wang et al. (2019) jump at around the 0 km s−1 position.
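The per-star scatter and the CDF levels described above can be sketched as follows. This is a simplified illustration on mock data: the function name `per_star_std` is hypothetical, and the empirical small-number correction of Appendix B is deliberately left out.

```python
import numpy as np


def per_star_std(star_id, rv_abs, min_exposures=8):
    """Sample standard deviation of the RVs of each star with enough exposures.

    The small-number correction of Appendix B is NOT applied here.
    """
    star_id = np.asarray(star_id)
    rv_abs = np.asarray(rv_abs, dtype=float)
    good = np.isfinite(rv_abs)
    out = {}
    for sid in np.unique(star_id[good]):
        v = rv_abs[good & (star_id == sid)]
        if v.size >= min_exposures:
            out[sid] = np.std(v, ddof=1)
    return out


# Mock catalog: 200 stars with 10 exposures and 50 stars with only 5,
# all with constant true RV and 1 km/s Gaussian noise.
rng = np.random.default_rng(2)
star_id = np.concatenate([np.repeat(np.arange(200), 10),
                          np.repeat(np.arange(200, 250), 5)])
rv_abs = rng.normal(0.0, 1.0, size=star_id.size)

stds = per_star_std(star_id, rv_abs)
q50, q90, q95 = np.percentile(list(stds.values()), [50, 90, 95])
print(len(stds), round(q50, 2), round(q90, 2), round(q95, 2))
```

Stars with fewer than eight valid exposures are dropped, mirroring the Nobs ≥ 8 requirement used for the 678 standard stars.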


Figure 14. The CDFs of the standard deviations of multiple observations for 678 RV standard stars (Huang et al. 2018) observed in LAMOST MRS DR7. The blue/red/orange colors denote the B/R/Rm measurements, and the solid/dashed/dotted lines represent the results of this work/LAMOST/Wang et al. (2019).


Table 3. Comparison of the Self-consistency of This Work to That of the LAMOST Pipeline and Wang et al. (2019)

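| Percentile | This Work, Blue Arm | This Work, Red Arm | This Work, Hα Masked | LAMOST, Blue Arm | Wang et al. (2019), Blue Arm | Wang et al. (2019), Red Arm |
|---|---|---|---|---|---|---|
| q = 50% | 0.45 (16.5%) | 0.54 | 0.59 | 0.53 | 0.54 (−0.7%) | 0.50 |
| q = 90% | 1.07 (35.5%) | 1.39 | 1.61 | 1.66 | 1.76 (−5.8%) | 1.72 |
| q = 95% | 1.45 (37.5%) | 1.86 | 2.10 | 2.32 | 2.29 (1.5%) | 2.08 |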

Note. Each column shows the RV standard deviation for 678 standard stars at corresponding levels of the CDF. In the parentheses we show the advantages over the LAMOST blue-arm results.
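As a worked check of the parenthesized values, the sketch below assumes that the "advantage" is the relative reduction of the standard deviation with respect to the LAMOST blue-arm value, (σ_LAMOST − σ)/σ_LAMOST. This definition is inferred from the numbers rather than stated explicitly. With the rounded table entries it reproduces the 90% and 95% levels; the 50% level comes out near 15% rather than 16.5%, presumably because the published percentage was computed from unrounded values.

```python
# Rounded blue-arm values from Table 3 (km/s), keyed by CDF level in percent.
lamost_blue = {50: 0.53, 90: 1.66, 95: 2.32}
this_work_blue = {50: 0.45, 90: 1.07, 95: 1.45}


def advantage(reference, value):
    """Relative reduction of the scatter with respect to a reference, in percent."""
    return 100.0 * (reference - value) / reference


adv = {q: advantage(lamost_blue[q], this_work_blue[q]) for q in lamost_blue}
for q, a in adv.items():
    print(f"q = {q}%: {a:.1f}% better than the LAMOST pipeline")
```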


With this excellent self-consistency, we select 10,320 candidate RV standard stars by requiring that in the blue arm (B) and red arm (R)

  1. their numbers of exposures are at least 8,
  2. their absolute RVs have standard deviations (corrected for small-number statistics) of less than 1.45 and 1.86 km s−1, respectively (corresponding to the 95% level of the CDF, or to 95% completeness with respect to Huang et al. 2018), and
  3. their time baselines are longer than 180 days.

These stars can be useful for the RV calibration of low-resolution surveys, such as the LAMOST LRS (R ∼ 1800).
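The three selection criteria can be expressed as a boolean mask, as in the sketch below. It assumes that a star must pass all criteria in both arms, and it uses the thresholds 1.45 and 1.86 km s−1 quoted for the blue and red arms in Section 6. The function and argument names are hypothetical, and the inputs are per-star summary arrays like the columns of Table 6.

```python
import numpy as np


def select_rv_standard_candidates(nexp_b, rvstd_b, ts_b,
                                  nexp_r, rvstd_r, ts_r,
                                  min_exp=8, max_std_b=1.45, max_std_r=1.86,
                                  min_baseline=180.0):
    """Boolean mask of stars passing all three criteria in both arms."""
    nexp_b, rvstd_b, ts_b = map(np.asarray, (nexp_b, rvstd_b, ts_b))
    nexp_r, rvstd_r, ts_r = map(np.asarray, (nexp_r, rvstd_r, ts_r))
    ok_b = (nexp_b >= min_exp) & (rvstd_b < max_std_b) & (ts_b > min_baseline)
    ok_r = (nexp_r >= min_exp) & (rvstd_r < max_std_r) & (ts_r > min_baseline)
    return ok_b & ok_r


# Four mock stars: only the first one satisfies every criterion.
mask = select_rv_standard_candidates(
    nexp_b=[10, 7, 12, 9], rvstd_b=[0.5, 0.5, 1.6, 0.6], ts_b=[400, 400, 400, 100],
    nexp_r=[10, 8, 12, 9], rvstd_r=[0.7, 0.7, 1.0, 0.8], ts_r=[400, 400, 400, 100],
)
print(mask)
```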

6. Data Products

The data products of this work include

  1. a catalog of ∼3.8 million measured RVs (but 5 million rows for completeness), associated errors, and information on observations for ∼0.8 million stars (Table 4),
  2. a catalog of RVZP corrections (Δvi,j, for B, R, and Rm) and their uncertainties for all SEUs (Table 5), and
  3. a catalog of 10,320 candidate RV standard stars with at least eight exposures and standard deviation less than 1.45/1.86 km s−1 in the blue/red arm over a time baseline longer than 180 days (Table 6).

The RV and RVZP catalogs can be cross-matched using the columns spid and lmjm. All catalogs will be available online in FITS format and also on GitHub (https://github.com/hypergravity/paperdata).
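A minimal sketch of this cross-match with pandas is given below, using small mock tables whose column names follow Tables 4 and 5. Equation (2) is not reproduced in this section, so the sketch assumes that the absolute RV is the measured RV plus the final RVZP correction (rv_B + rv_corr2_B); the sign convention should be checked against Equation (2) before use.

```python
import pandas as pd

# Mock rows standing in for the RV catalog (Table 4).
rv = pd.DataFrame({
    "obsid": [1, 2, 3],
    "spid": [1, 1, 2],
    "lmjm": [84000000, 84000020, 84000000],
    "rv_B": [10.0, 10.5, -3.0],
})

# Mock rows standing in for the RVZP catalog (Table 5): one row per SEU.
zp = pd.DataFrame({
    "spid": [1, 1],
    "lmjm": [84000000, 84000020],
    "rv_corr2_B": [0.4, 0.6],
})

# A left join keeps every RV row; spectra whose SEU has no valid RVZP get NaN.
merged = rv.merge(zp, on=["spid", "lmjm"], how="left")

# ASSUMPTION: absolute RV = measured RV + RVZP correction.
merged["rv_abs_B"] = merged["rv_B"] + merged["rv_corr2_B"]
print(merged)
```

The real FITS catalogs can be loaded into data frames with, for example, `astropy.table.Table.read(...).to_pandas()` before applying the same merge.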

A few tips: Users who want to correct Doppler effects of their spectra (e.g., Zhang et al. 2020a) should use RVs without RVZP corrections, while those who want to use absolute RVs can obtain them from our catalogs via Equation (2). The uncertainties of the absolute RVs can be evaluated via

Equation (7)

where σv,obs is the measurement error; σmin is the wavelength calibration error floor, which we can infer from the comparison to APOGEE DR16 to be approximately 0.85 km s−1 or conservatively 1 km s−1; σΔv is the uncertainty of the RVZP; and σmod is the contribution from the sparsity of the spectral templates (0.10 km s−1 for the blue arm and 0.20 km s−1 for the red arm).
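For illustration only, the sketch below assumes that the four independent terms listed above are combined in quadrature, which is the usual convention for independent error sources. The exact form of Equation (7) is not reproduced here and should be checked against the published equation. The function name and default values (0.85 km s−1 floor, 0.10 km s−1 blue-arm template term) are taken from the text or chosen for the example.

```python
import numpy as np


def abs_rv_uncertainty(sigma_obs, sigma_dv, sigma_min=0.85, sigma_mod=0.10):
    """Total uncertainty of an absolute RV in km/s.

    ASSUMPTION: independent terms add in quadrature.
    sigma_obs : measurement error (rv_err_* in Table 4)
    sigma_dv  : RVZP uncertainty (rv_corr2_unc_* in Table 5)
    sigma_min : wavelength calibration error floor
    sigma_mod : template sparsity term (0.10 blue arm, 0.20 red arm)
    """
    terms = np.array([sigma_obs, sigma_min, sigma_dv, sigma_mod], dtype=float)
    return float(np.sqrt(np.sum(terms ** 2)))


# Example: a blue-arm RV with a 0.5 km/s measurement error and the mean
# RVZP uncertainty of 0.38 km/s.
sigma_total = abs_rv_uncertainty(0.5, 0.38)
print(f"{sigma_total:.2f} km/s")
```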

Table 4. The 3.8 Million RVs (vobs) Obtained from LAMOST MRS DR7 (v1.1)

| Index | Label (FITS) | Format | Units | Description |
|---|---|---|---|---|
| 1 | obsid | Integer | | LAMOST observational ID (unique for each .fits file) |
| 2 | lmjm | Integer | min | LMJM (note a) |
| 3 | bjdmid | Double | | Barycentric Julian date of the middle of exposure |
| 4 | planid | String | | Plan ID |
| 5 | spid | Short | | Spectrograph ID |
| 6 | fiberid | Short | | Fiber ID |
| 7 | ra | Double | deg | R.A. (J2000) |
| 8 | dec | Double | deg | Decl. (J2000) |
| 9 | snr_B | Float | | S/N of blue arm |
| 10 | snr_R | Float | | S/N of red arm |
| 11 | lamp_B | String | | Lamp used to calibrate blue arm |
| 12 | lamp_R | String | | Lamp used to calibrate red arm |
| 13 | rv_B | Float | km s−1 | RV (vobs) |
| 14 | rv_err_B | Float | km s−1 | RV measurement error (σv,obs) |
| 15 | rv_teff_B | Float | K | Teff of the best template |
| 16 | ccfmax_B | Float | | CCF max value |
| 17 | rv_R | Float | km s−1 | RV (vobs) |
| 18 | rv_err_R | Float | km s−1 | RV measurement error (σv,obs) |
| 19 | rv_teff_R | Float | K | Teff of the best template |
| 20 | ccfmax_R | Float | | CCF max value |
| 21 | rv_Rm | Float | km s−1 | RV (vobs) |
| 22 | rv_err_Rm | Float | km s−1 | RV measurement error (σv,obs) |
| 23 | rv_teff_Rm | Float | K | Teff of the best template |
| 24 | ccfmax_Rm | Float | | CCF max value |

Notes. The suffixes _B, _R, and _Rm represent results for the blue arm, red arm, and red arm without Hα, respectively. Table 4 is published in its entirety in machine-readable (FITS) format. A portion is shown here for guidance regarding its form and content.

a The LMJM is 1440× the local modified Julian date of the beginning of exposure, i.e., the local modified Julian date expressed in minutes, which is an 8-digit integer assigned to each exposure.
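Since the LMJM counts minutes of the local modified Julian date, it can be converted back to a calendar date and time with the standard library, as sketched below. The function name is hypothetical, and the result is a naive local time: no time zone conversion is attempted here.

```python
from datetime import datetime, timedelta

# Zero point of the modified Julian date (MJD 0.0).
MJD_EPOCH = datetime(1858, 11, 17)


def lmjm_to_local_datetime(lmjm):
    """Convert an LMJM value (minutes since MJD 0, local time) to a datetime."""
    return MJD_EPOCH + timedelta(minutes=int(lmjm))


# Example: local MJD 58000 plus 90 minutes.
example_lmjm = 58000 * 1440 + 90
example_time = lmjm_to_local_datetime(example_lmjm)
print(example_lmjm, example_time)
```

For dates within the survey the value has eight digits, which is why lmjm together with spid conveniently identifies a single exposure of a single spectrograph.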


Table 5. The RVZP Correction Values (Δv) for LAMOST MRS DR7 (v1.1)

| Index | Label (FITS) | Format | Units | Description |
|---|---|---|---|---|
| 1 | planid | String | | Plan ID |
| 2 | lmjm | Integer | | LMJM (note a) |
| 3 | spid | Integer | | Spectrograph ID |
| 4 | rv_corr0_B | Float | km s−1 | Initial guess of RVZP correction |
| 5 | nStar_fnt_B | Integer | | Number of stars in this SEU with Gaia DR2 RVs |
| 6 | rv_corr2_B | Float | km s−1 | Final RVZP correction (Δv) |
| 7 | nF1_B | Integer | | Number of stars for first term of cost function |
| 8 | nF2_B | Integer | | Number of stars for second term of cost function |
| 9 | nOther_med_B | Integer | | Median number of exposures |
| 10 | nOther_max_B | Integer | | Maximum number of exposures |
| 11 | nOther_min_B | Integer | | Minimum number of exposures |
| 12 | rv_corr2_unc_B | Float | km s−1 | Uncertainty of the final RVZP correction (σΔv) |
| 13 | rv_corr0_R | Float | km s−1 | Initial guess of RVZP correction |
| 14 | nStar_fnt_R | Integer | | Number of stars in this SEU with Gaia DR2 RVs |
| 15 | rv_corr2_R | Float | km s−1 | Final RVZP correction (Δv) |
| 16 | nF1_R | Integer | | Number of stars for first term of cost function |
| 17 | nF2_R | Integer | | Number of stars for second term of cost function |
| 18 | nOther_med_R | Integer | | Median number of exposures |
| 19 | nOther_max_R | Integer | | Maximum number of exposures |
| 20 | nOther_min_R | Integer | | Minimum number of exposures |
| 21 | rv_corr2_unc_R | Float | km s−1 | Uncertainty of the final RVZP correction (σΔv) |
| 22 | rv_corr0_Rm | Float | km s−1 | Initial guess of RVZP correction |
| 23 | nStar_fnt_Rm | Integer | | Number of stars in this SEU with Gaia DR2 RVs |
| 24 | rv_corr2_Rm | Float | km s−1 | Final RVZP correction (Δv) |
| 25 | nF1_Rm | Integer | | Number of stars for first term of cost function |
| 26 | nF2_Rm | Integer | | Number of stars for second term of cost function |
| 27 | nOther_med_Rm | Integer | | Median number of exposures |
| 28 | nOther_max_Rm | Integer | | Maximum number of exposures |
| 29 | nOther_min_Rm | Integer | | Minimum number of exposures |
| 30 | rv_corr2_unc_Rm | Float | km s−1 | Uncertainty of the final RVZP correction (σΔv) |

Notes. The suffixes _B, _R, and _Rm represent results for the blue arm, red arm, and red arm without Hα, respectively. Table 5 is published in its entirety in machine-readable (FITS) format. A portion is shown here for guidance regarding its form and content.

a The LMJM is 1440× the local modified Julian date of the beginning of exposure, i.e., the local modified Julian date expressed in minutes, which is an 8-digit integer assigned to each exposure.


Table 6. Selected Candidates of RV Standard Stars from LAMOST MRS DR7 (v1.1)

| Index | Label (FITS) | Format | Units | Description |
|---|---|---|---|---|
| 1 | ra | Double | deg | R.A. (J2000) |
| 2 | dec | Double | deg | Decl. (J2000) |
| 3 | rvmed_B | Float | km s−1 | Median absolute RV |
| 4 | rvstd_B | Float | km s−1 | Standard deviation of absolute RVs |
| 5 | Nexp_B | Integer | | Number of exposures |
| 6 | ts_B | Double | days | Time span |
| 7 | rvmed_R | Float | km s−1 | Median absolute RV |
| 8 | rvstd_R | Float | km s−1 | Standard deviation of absolute RVs |
| 9 | Nexp_R | Integer | | Number of exposures |
| 10 | ts_R | Double | days | Time span |

Note. The suffixes _B and _R represent results for the blue arm and red arm, respectively. Table 6 is published in its entirety in machine-readable (FITS) format. A portion is shown here for guidance regarding its form and content.


7. Summary

In this paper, we measure the RVs from LAMOST MRS DR7 stellar spectra and determine the RVZPs with the help of Gaia DR2 RVs, aiming at making the absolute RVs self-consistent and proper for time-domain analysis. More specifically,

  1. we have measured the RVs of ∼3.8 million single-exposure spectra for more than 0.8 million stars obtained from LAMOST MRS DR7, including the blue arm and red arm (with and without Hα);
  2. we determine the RVZPs exposure by exposure (for 3.6 million spectra) by comparing the measured RVs to those of Gaia DR2 and multiple MRS exposures using a robust method to a mean precision of 0.38 km s−1;
  3. we find the RVZPs vary significantly for some spectrographs before/after 2019 May 1, which confirms the utility of our algorithm for determining RVZPs;
  4. we find good consistency in the comparisons of our absolute RVs with those of Gaia DR2, APOGEE DR16, RV standard stars (Huang et al. 2018), GALAH DR3, and RAVE DR6, and the precision at 50 < S/N < 100 can reach 1.00/1.10, 0.84/0.80, 0.69/0.74, 0.72/0.78, and 1.77/1.88 km s−1 in the blue/red arm, respectively;
  5. we show that compared to those of the LAMOST pipeline and Wang et al. (2019), our absolute RVs have 16.5%, 35.5%, and 37.5% better self-consistency at the 50%, 90%, and 95% levels of the CDF of the standard deviations, respectively, which benefits the subsequent time-domain analysis; and
  6. we select a set of 10,320 candidate RV standard stars whose standard deviations of RVs are less than 1.45 and 1.86 km s−1 in the blue arm and red arm, respectively, over a time baseline of at least 180 days.

LAMOST MRS DR7 v1.2 and v1.3 have been released. We confirm that in DR7 v1.2/v1.3 the spectra are the same as those in v1.1, while the catalogs and parameters have some minor changes. Therefore, our results can be cross-matched with the v1.2/v1.3 catalogs directly. We will release a new version of the RVs on GitHub once DR8 is released. In addition, since the RVs in Gaia eDR3 are the same as those in DR2 but with moderate filtering, our absolute RVs should be consistent with Gaia eDR3. In future LAMOST MRS data releases, we will update our RVs using the most recent Gaia RVs as a reference set.

B.Z. thanks Qin Lai, Feng Luo, Dr. Rui Wang, Dr. Hai-Bo Yuan, Prof. Jian-Jun Chen, Prof. Hao-Tong Zhang, Prof. A-Li Luo, and Prof. Jian-Rong Shi for constructive discussions and generous help with the paper. B.Z. also acknowledges support from the LAMOST FELLOWSHIP fund. J.-N.F. acknowledges support from the National Natural Science Foundation of China (NSFC) through grants 11833002, 12090042, and 12090040. This work is supported by the National Key R&D Program of China (No. 2019YFA0405500). C.L. thanks the NSFC for grant No. 11835057. This work is supported by the Chinese Space Station Telescope pre-research projects Key Problems in Binaries and Chemical Evolution of the Milky Way and Its Nearby Galaxies, NSFC (No. U1931209). W.Z. is supported by Fundamental Research Funds for the Central Universities. The authors also thank the reviewer of this paper for the constructive comments and patience during the review process.

The LAMOST FELLOWSHIP is supported by Special Funding for Advanced Users, budgeted and administered by the Center for Astronomical Mega-science, Chinese Academy of Sciences (CAMS-CAS). This work is supported by the Cultivation Project for LAMOST Scientific Payoff and Research Achievement of CAMS-CAS.

Guoshoujing Telescope (LAMOST) is a National Major Scientific Project built by CAS. Funding for the project has been provided by the National Development and Reform Commission. LAMOST is operated and managed by the National Astronomical Observatories, CAS.

Facility: LAMOST.

Software: laspec (Zhang 2020a), regli (Zhang 2020b), berliner (Zhang 2020c), NumPy (van der Walt et al. 2011), SciPy (Virtanen et al. 2020), Astropy (Astropy Collaboration et al. 2013, 2018).

Appendix A: CCF

In this section, we explain our definitions of mean, variance, covariance, and CCF. Let X denote a vector containing N elements {Xi } (e.g., a continuum-normalized spectrum with N pixels); the mean is defined as

Equation (A1)

the variance as

Equation (A2)

and the covariance of two vectors X and Y as

Equation (A3)

A normalized CCF can be calculated with standardized f and g, namely

Equation (A4)

where F is one vector and G (v) is another vector but it is shifted by the RV v. When utilizing this CCF to estimate stellar RVs, G (v) is usually a spectral template whose S/N is infinite and covers the wavelength range of F . Therefore, the shift could be implemented with interpolation. The CCF in this form is essentially the linear correlation coefficient and varies between −1 and 1.
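The normalized CCF described above can be sketched in a few lines of NumPy. The sketch assumes a non-relativistic Doppler shift and linear interpolation of the template onto the observed pixels; the names `standardize` and `ccf`, the wavelength range, and the mock line list are illustrative choices and are not taken from the laspec implementation.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s


def standardize(x):
    """Subtract the mean and divide by the (population) standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()


def ccf(wave, flux, wave_t, flux_t, rv_grid):
    """Normalized CCF of an observed spectrum against a template.

    For each trial RV the template is Doppler shifted and interpolated onto
    the observed wavelengths; the CCF value is the linear correlation
    coefficient of the two vectors, so it lies between -1 and 1.
    """
    f = standardize(flux)
    out = np.empty(len(rv_grid))
    for k, v in enumerate(rv_grid):
        shifted = np.interp(wave, wave_t * (1.0 + v / C_KMS), flux_t)
        out[k] = np.mean(f * standardize(shifted))
    return out


# Mock noiseless template with 40 Gaussian absorption lines.
rng = np.random.default_rng(3)
wave_t = np.linspace(6290.0, 6810.0, 6000)
centers = rng.uniform(6300.0, 6800.0, 40)
depths = rng.uniform(0.1, 0.5, 40)
profiles = depths[:, None] * np.exp(
    -0.5 * ((wave_t[None, :] - centers[:, None]) / 0.4) ** 2)
flux_t = 1.0 - profiles.sum(axis=0)

# Mock observation: the template shifted by +30 km/s plus Gaussian noise.
wave = np.linspace(6300.0, 6800.0, 3000)
flux = np.interp(wave, wave_t * (1.0 + 30.0 / C_KMS), flux_t)
flux = flux + rng.normal(0.0, 0.01, wave.size)

rv_grid = np.arange(-100.0, 101.0, 1.0)
ccf_values = ccf(wave, flux, wave_t, flux_t, rv_grid)
rv_best = rv_grid[np.argmax(ccf_values)]
print(rv_best)
```

In practice the grid maximum is only a first estimate; a finer estimate can be obtained by fitting or optimizing the CCF around its peak.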

Appendix B: Bias in Small-number Statistics

Estimators of dispersion are often biased low when the number of samples is small. For example, if we only have three or five measurements of a physical quantity, the standard deviation could be underestimated. In this section, we propose an empirical correction of this bias for the error of mean and standard deviation assuming a Gaussian distribution P(x | μ, σ), where μ is its position and σ is its standard error.

B.1. The Deviation of the Mean

We can minimize an L1-norm or L2-norm cost function, namely, ${\sum }_{i}| {x}_{i}-\hat{\mu }| $ or ${\sum }_{i}{\left({x}_{i}-\hat{\mu }\right)}^{2}/2$, respectively, to get an estimate of the mean $\hat{\mu }$. The true deviation is by definition

Equation (B1)

However, in practice we do not know μ when we tackle such a problem. A fiducial deviation associated with $\hat{\mu }$ can be constructed using the 84th and 16th percentiles (or the interquantiles; see Lupton 1993; Ivezić et al. 2014), i.e.,

Equation (B2)

where N is the sample size. To obtain an empirical relation between δest and δtrue, we assume the following form:

Equation (B3)

where ξ(N) is the empirical correction factor and is a function of N. Then, we draw mock data from a standard Gaussian distribution with the numpy.random module. In each experiment, we draw N samples and calculate δest and δtrue, and derive ξ. We repeat this experiment 3000 times for each N, which ranges from 2 to 250 (which is enough for our purposes), and show the 16th, 50th, and 84th percentiles of the results for each N in the left panel of Figure B1. This fiducial deviation is an underestimation of δtrue. The relation between the medians of logξ and logN is fitted with a fifth-order polynomial function, whose coefficients are tabulated in Table B1. With this relation, we can scale the fiducial deviation to a standard that is less affected by the sample size N.


Figure B1. The empirical correction factor for the error of mean (ξ) and for the standard error (ζ) of the Gaussian distributions in small-number statistics.


Table B1. The Best-fit Coefficients of the Fifth-order Polynomials for the Empirical Relationship between log10 ξ and log10 N

| Cost Function | β5 | β4 | β3 | β2 | β1 | β0 |
|---|---|---|---|---|---|---|
| $\sum_{i} \lvert x_{i}-\hat{\mu} \rvert$ | 0.07349721 | −0.60647022 | 1.97806105 | −3.22994084 | 2.72007585 | −0.92812989 |
| $\sum_{i} \left(x_{i}-\hat{\mu}\right)^{2}/2$ | 0.08434813 | −0.69694429 | 2.28047526 | −3.73821453 | 3.15612997 | −0.99158461 |

Note. Our definition of a polynomial is poly(x | β) = ∑_i β_i x^i.
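The fitted relation can be evaluated directly from the tabulated coefficients, as sketched below. The coefficients are copied from Table B1 in the order β5 to β0, which matches the highest-order-first convention of `numpy.polyval`; the function name `xi` is hypothetical. How ξ is then applied to a measured deviation follows Equation (B3), which is not reproduced here.

```python
import numpy as np

# Coefficients from Table B1, ordered beta_5 ... beta_0.
BETA_L1 = [0.07349721, -0.60647022, 1.97806105,
           -3.22994084, 2.72007585, -0.92812989]
BETA_L2 = [0.08434813, -0.69694429, 2.28047526,
           -3.73821453, 3.15612997, -0.99158461]


def xi(n, beta):
    """Empirical factor xi(N) from log10(xi) = poly(log10(N) | beta).

    The fit was derived for 2 <= N <= 250 and should not be extrapolated.
    """
    n = np.asarray(n, dtype=float)
    return 10.0 ** np.polyval(beta, np.log10(n))


xi_l2 = xi([2, 10, 250], BETA_L2)
xi_l1 = xi([2, 10, 250], BETA_L1)
print(np.round(xi_l2, 3), np.round(xi_l1, 3))
```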


B.2. The Standard Error

For a Gaussian distribution P(x | μ, σ), we can use a sample-based standard deviation and the 16th and 84th percentiles to estimate the true standard error σ, i.e.,

Equation (B4)

and

Equation (B5)

respectively. We define ζ by

Equation (B6)

The results clearly show that both methods underestimate the standard error. Similar to the procedures in the previous test, we fit the relation with a fifth-order polynomial to the medians of ζ and ${\mathrm{log}}_{10}N$; the best-fit polynomials are shown in the right panel of Figure B1 and the coefficients are tabulated in Table B2. Compared to Equation (B4), Equation (B5) is more robust to outliers but suffers from more significant underestimation when N is small.

Table B2. The Best-fit Coefficients of the Fifth-order Polynomials for the Empirical Relationship Between ζ and log10 N

| Estimator | β5 | β4 | β3 | β2 | β1 | β0 |
|---|---|---|---|---|---|---|
| (q84 − q16)/2 | 0.05712779 | −0.47050592 | 1.56888875 | −2.75649024 | 2.71900493 | −0.28637106 |
| $\sqrt{\tfrac{1}{N-1}\sum {\left({x}_{i}-\hat{\mu }\right)}^{2}}$ | 0.08344418 | −0.68630504 | 2.21084202 | −3.50529186 | 2.77854093 | 0.08576048 |
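The Monte Carlo experiment behind Table B2 can be sketched as follows. The sketch assumes that ζ is the ratio of the estimated to the true standard error (Equation (B6) is not reproduced here), so that for a standard Gaussian the median ζ is simply the median of the estimator over many trials. The function names and random seed are illustrative, and the percentiles use NumPy's default linear interpolation.

```python
import numpy as np

# Coefficients from Table B2, ordered beta_5 ... beta_0.
BETA_PCT = [0.05712779, -0.47050592, 1.56888875,
            -2.75649024, 2.71900493, -0.28637106]
BETA_STD = [0.08344418, -0.68630504, 2.21084202,
            -3.50529186, 2.77854093, 0.08576048]


def zeta_fit(n, beta):
    """Fitted median zeta as a polynomial in log10(N)."""
    return np.polyval(beta, np.log10(n))


def zeta_mc(n, n_trials=3000, seed=0):
    """Median of both estimators over n_trials draws of N standard-normal samples."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_trials, n))  # true sigma = 1
    s_std = np.std(x, axis=1, ddof=1)
    q16, q84 = np.percentile(x, [16, 84], axis=1)
    s_pct = (q84 - q16) / 2.0
    return np.median(s_std), np.median(s_pct)


for n in (2, 10):
    mc_std, mc_pct = zeta_mc(n)
    print(n, round(mc_std, 3), round(float(zeta_fit(n, BETA_STD)), 3),
          round(mc_pct, 3), round(float(zeta_fit(n, BETA_PCT)), 3))

mc_std2, mc_pct2 = zeta_mc(2)
mc_std10, mc_pct10 = zeta_mc(10)
```

Under these assumptions the simulated medians follow the tabulated polynomials closely, and the percentile-based estimator is visibly more biased than the sample standard deviation at small N.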


Appendix C: Several Related Python Packages

Three packages are developed in this work.

  1. laspec (Zhang 2020a): A toolkit for LAMOST MRS/LRS spectra, including modules for file IO, spectral convolution, continuum normalization, removal of cosmic rays, CCFs, and empirical correction evaluation (Appendix B).
  2. regli (Zhang 2020b): The Regular Grid Linear Interpolator, a multidimensional linear interpolator based on gridded data. In our performance test it is faster than scipy.interpolate.LinearNDInterpolator from SciPy.
  3. berliner (Zhang 2020c): A toolkit for manipulating MIST (Dotter 2016) and PARSEC (Bressan et al. 2012) stellar evolutionary tracks and isochrones, including a Python interface for downloading PARSEC isochrones from the CMD 3.4 website (http://stev.oapd.inaf.it/cgi-bin/cmd).

The source code and some tutorials of these packages can be found at https://github.com/hypergravity. Readers who are interested in LAMOST MRS spectra might find them useful for their research.

10.3847/1538-4365/ac0834