Self-consistent Stellar Radial Velocities from LAMOST Medium-resolution Survey DR7

Bo Zhang; Jiao Li; Fan Yang; Jian-Ping Xiong; Jian-Ning Fu; Chao Liu; Hao Tian; Yin-Bi Li; Jia-Xin Wang; Cai-Xia Liang; Yu-Tao Zhou; Weikai Zong; Cheng-Qun Yang; Nian Liu; Yong-Hui Hou

doi:10.3847/1538-4365/ac0834

1. Introduction

Large spectroscopic surveys, for example, RAVE (Steinmetz et al. 2006, 2020a), the Sloan Digital Sky Survey/SEGUE (Yanny et al. 2009), the Large Sky Area Multi-object Fiber Spectroscopic Telescope (LAMOST; Cui et al. 2012; Deng et al. 2012; Zhao et al. 2012; Luo et al. 2015), APOGEE (Majewski et al. 2017), GALAH (De Silva et al. 2015), Gaia-ESO (Gilmore et al. 2012), and the Gaia Radial Velocity Spectrometer (Gaia-RVS; Katz et al. 2004; Cropper et al. 2018), have obtained tens of millions of stellar spectra over the last two decades, aiming at understanding the formation and evolution of the Galaxy. Among the most fundamental physical quantities derived from stellar spectra is radial velocity (RV), which forms the basis of many studies, such as on stellar multiplicity (e.g., Gao et al. 2014, 2017; El-Badry et al. 2018; Yang et al. 2020), stellar kinematics (e.g., Tian et al. 2020; Bird et al. 2020), and Galactic substructures (e.g., Yang et al. 2019; Xu et al. 2020).

LAMOST, after a five-year low-resolution spectroscopic survey (LRS; R ∼ 1800, 3800 Å < λ < 9000 Å), started a five-year medium-resolution spectroscopic survey (MRS; R ∼ 7500, 4950 Å < λ < 5350 Å and 6300 Å < λ < 6800 Å; Liu et al. 2020) in 2017 September. DR6,¹¹ the first data release of the MRS, contains the data obtained from 2017 September to 2018 June and is already available to the international astronomical community. DR7,¹² including the data from 2017 September to 2019 June (about five million spectra for over 800,000 stars), is currently open to the Chinese astronomical community only.

In the beginning, an Sc lamp was used to calibrate the wavelength of LAMOST MRS spectra; later, it was switched to a ThAr lamp (in 2018). Due to the short wavelength coverage, sky lines are not used in wavelength calibration as in the LRS (Wang et al. 2010). The released spectra have undergone barycentric correction and the wavelength uses the vacuum standard. Both the LAMOST pipeline and Wang et al. (2019) have measured RVs from DR7 spectra and estimated a static RV zero-point (RVZP) by comparing the measured RVs to those in the literature of a sample of RV standard stars (Huang et al. 2018) for each spectrograph. However, the temporal variation of the RVZPs (between exposures) is not clear so far despite a few trials. For example, Liu et al. (2019b) and Zong et al. (2020) show the temporal variation of RVZPs based on data from the Kepler field using a sample of roughly selected RV-invariant stars, and Ren et al. (2021) show that the RVZPs of the red arm vary between exposures and the difference can reach 4 km s⁻¹ in the MRS-N fields (nebula survey; Wu et al. 2021).

Physically, the variation of the LAMOST MRS RVZPs can be explained by several factors.

1. LAMOST has a long optical path and a large focal plane (1.75 m diameter) on which 4000 fibers are installed (Cui et al. 2012); temperature variation is unavoidable.
2. Currently, arc lamp exposures are taken about every 2 hr (typical observing time of a plate), which is not frequent enough. The instrument might change its state between lamp exposures.
3. Since the LAMOST MRS needs ∼20 ThAr lamps to illuminate the big focal plane simultaneously, the lamps are easily damaged and frequently replaced, and the lamp exposure time needs frequent adjustment. These factors affect the signal-to-noise ratio (S/N) of the lamp spectra and thus the wavelength calibration consistency over time.
4. The MRS spectrographs are mounted and dismounted monthly because the MRS is scheduled on the 14 bright/gray nights, while the other nights are for the LRS. Therefore, the focuses are adjusted monthly.

All these factors and other potential defects in the data reduction pipeline are finally reflected in the RVZPs of the spectra. Therefore, it is insufficient to use the RVs from either the LAMOST pipeline or Wang et al. (2019) at different epochs and perform a time-domain analysis, such as in studies of pulsating stars (the LAMOST-Kepler Project; Zong et al. 2020; Fu et al. 2020) and spectroscopic binaries (Gao et al. 2014, 2017; Liu 2019) and even in searches for black holes (Liu et al. 2019a; Gu et al. 2019), which are important scientific goals of the MRS (Liu et al. 2020).

In this paper, aiming at subsequent time-domain analysis, we measure the RVs for the 3.8 million single-exposure spectra with S/N > 5 in LAMOST MRS DR7 v1.1,¹³ and propose a robust method to self-consistently determine the absolute RVZPs with the help of Gaia DR2 data. In Section 2, we briefly describe the observational rules of the LAMOST MRS and the instrumental parameters. In Section 3, we describe how we measure the RVs. In Section 4, we show the algorithm that determines the RVZPs self-consistently. In Section 5, we present our RV and RVZP measurements and assess the precision and self-consistency, and select a sample of candidate RV standard stars based on our results. Information on and instructions regarding our data products are described in Section 6, and a summary of this work is given in Section 7.

2. The LAMOST MRS

2.1. Targeting and Observational Rules

The scientific goals of the LAMOST MRS mainly include stellar multiplicity, stellar pulsation, star formation, emission nebulae, Galactic archeology, host stars of exoplanets, and open clusters. The details of the scientific plan and the survey strategy are described in Liu et al. (2020). In this section, we summarize them briefly.

Each scientific goal has a principal investigator who is responsible for its targeting. A planid is assigned to each MRS plate (pointing), which has the form [TD/NT]hhmmss[N/S]ddmmssXnn. The first two characters denote time-domain (TD, will be repeatedly observed during the five-year survey) or non-time-domain (NT, just observed on a single night), hhmmss[N/S]ddmmss represents the equatorial coordinates of the plate center, [N/S] means north/south, the character X is used to denote the scientific goal (see Table 1), and the last two digits represent a serial number. For example, the planid TD062610N184524B01 means the plate is a time-domain plate dedicated to binarity/multiplicity research. There also exist some irregular planids that belong to testing fields, such as HIP8426401 and NGC216801. HIP8426401 means the central star of the plate is HIP 84264, and NGC216801 means a plate toward the star cluster NGC 2168.
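The naming rule above can be illustrated with a small parser. This is only a sketch: the function name `parse_planid` and the returned field names are hypothetical, and it assumes that hhmmss is the right ascension in hours and ddmmss the declination in degrees. Irregular planids of testing fields are simply reported as unparsed.

```python
import re

# Science-goal codes X, copied from Table 1.
SCIENCE_GOALS = {
    "K": "Kepler fields",
    "H": "high-frequency Kepler fields",
    "B": "binarity/multiplicity",
    "T": "TESS fields",
    "M": "Milky Way",
    "S": "star formation",
    "C": "star cluster",
    "N": "nebula",
}

# Regular planid: [TD/NT]hhmmss[N/S]ddmmssXnn
PLANID_RE = re.compile(
    r"^(?P<mode>TD|NT)(?P<ra>\d{6})(?P<hemi>[NS])(?P<dec>\d{6})"
    r"(?P<goal>[A-Z])(?P<serial>\d{2})$"
)


def parse_planid(planid):
    """Return a dict for a regular planid, or None for an irregular one."""
    m = PLANID_RE.match(planid)
    if m is None or m["goal"] not in SCIENCE_GOALS:
        return None
    ra, dec = m["ra"], m["dec"]
    # Assumption: hhmmss is R.A. in hours, ddmmss is decl. in degrees.
    ra_deg = 15.0 * (int(ra[:2]) + int(ra[2:4]) / 60 + int(ra[4:]) / 3600)
    dec_deg = int(dec[:2]) + int(dec[2:4]) / 60 + int(dec[4:]) / 3600
    if m["hemi"] == "S":
        dec_deg = -dec_deg
    return {
        "time_domain": m["mode"] == "TD",
        "ra_deg": ra_deg,
        "dec_deg": dec_deg,
        "science": SCIENCE_GOALS[m["goal"]],
        "serial": int(m["serial"]),
    }
```

For the example in the text, `parse_planid("TD062610N184524B01")` yields a time-domain plate with the science goal "binarity/multiplicity", while `parse_planid("HIP8426401")` returns `None`.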

Table 1. Abbreviations of Scientific Goals (X) Used in planid

X	NT/TD	Science
K	TD	Kepler fields
H	TD	high-frequency Kepler fields
B	TD	binarity/multiplicity
T	TD	TESS fields
M	NT	Milky Way
S	NT	star formation
C	NT	star cluster
N	NT	nebula


The MRS uses the local modified Julian minute (LMJM), an 8-digit integer defined as 1440× the local modified Julian date at the beginning of each exposure, as the stamp of each exposure. The typical exposure time is set to 1200 s, while 900 s and 600 s exposures also exist, depending on the brightness of the targets. Each NT plate is observed with three consecutive exposures while each TD plate is observed until it is unobservable (usually five to six 1200 s exposures are allowed). An arc lamp exposure is taken at the beginning of each plate, and at the end of an observing night (every ∼2 hr). The targeting is mostly based on the Gaia DR2 source catalog (Gaia Collaboration et al. 2018). For a 1200 s exposure, the magnitudes corresponding to S/N = 5 in blue- and red-arm spectra are G ∼ 14.5 and 15.2 mag, respectively (see Figure 1 in Liu et al. 2020). Magnitudes brighter than these cover the majority of the objects observed by the LAMOST MRS.
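As a quick illustration of the LMJM definition, the conversion in both directions is a factor of 1440 (minutes per day). The helper names below are hypothetical, and the sketch stays in local time because the offset to UTC is not specified here.

```python
MINUTES_PER_DAY = 1440


def local_mjd_to_lmjm(local_mjd):
    """LMJM = 1440 x local MJD, rounded to an integer minute."""
    return int(round(local_mjd * MINUTES_PER_DAY))


def lmjm_to_local_mjd(lmjm):
    """Inverse conversion: local MJD (in days) of the exposure start."""
    return lmjm / MINUTES_PER_DAY
```

For survey dates (local MJD ≈ 58,000) the stamp has eight decimal digits; for example, local MJD 58649.0 corresponds to LMJM 84454560.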

2.2. LAMOST MRS Spectra

The whole LAMOST MRS DR7 catalog (R ∼ 7500) contains over five million single-exposure spectra of over 800,000 stars obtained from 2017 September to 2019 June. In this work, 3,753,659 spectra with S/N > 5 either in the blue arm or in the red arm for 600,771 stars are selected. Their distributions on the number of exposures and time span are presented in Figure 1. The spectra are oversampled. The typical wavelength steps are 0.11 and 0.14 Å at the blue and red arms, respectively. The sampling rate (λ/Δλ ∼ 45,000) is as high as six times the spectral resolution (R ∼ 7500). In Figure 2, we show a spectrum of a K-type giant star as a demo of the LAMOST MRS. The blue arm and red arm are designed mainly for the Mg i b triplet at around 5175 Å and for Hα at around 6564 Å. The released spectra are not corrected with response curves.

**Figure 1.** The distributions of the number of exposures and the time span of the stars observed in LAMOST MRS DR7.

**Figure 2.** An example of a single-exposure spectrum of a K-type giant star observed in LAMOST MRS DR7 (med-58649-TD193637N444141K01_sp16-249, KIC 9833073). The upper panels show the continuum-normalized blue- and red-arm spectra. The lower panels show zoomed-in views of the upper panels at the Mg b and Hα regions. The zoomed-in regions are shown as gray blocks in the upper panels. The S/N of the blue and red arms are 29 and 61, respectively.

3. Measurement of RVs

3.1. Preparation for RV Measurements

We notice that cosmic rays frequently pollute MRS single-exposure spectra and are neither identified nor removed by the LAMOST pipeline. Therefore, in our method, the first step is to carefully remove cosmic rays in spectra. We smooth spectra with a 21-pixel median filter followed by a 9-pixel Gaussian filter, and remove the original pixels that deviate from the smoothed spectrum by four and eight times the local standard deviation in the upper and lower directions. The parameters of the filters are set empirically so that the absorption lines in the spectra of A-, F-, G-, and K-type stars are not affected. The removed pixels are replaced by linearly interpolated values using neighboring pixels. We also note that the two ends of both the blue and red arms are sometimes tilted and show unreasonable flux values probably due to extrapolation of the sky flux modeling, so we trim 50 pixels at both edges of each arm.
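The cleaning step described above could be sketched as follows. This is not the authors' code: the function name and defaults are hypothetical, the "9-pixel Gaussian filter" is interpreted as a kernel with a 9-pixel FWHM, and a single global robust scatter (from the median absolute deviation) stands in for the local standard deviation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d, median_filter


def remove_cosmic_rays(flux, n_med=21, gauss_fwhm=9.0,
                       upper=4.0, lower=8.0, n_trim=50):
    """Flag and interpolate over outlying pixels, then trim both edges."""
    flux = np.asarray(flux, dtype=float)
    # 21-pixel median filter followed by a Gaussian filter.
    # Assumption: "9-pixel" is read as the FWHM of the Gaussian kernel.
    sigma_pix = gauss_fwhm / 2.3548
    smooth = gaussian_filter1d(
        median_filter(flux, size=n_med, mode="nearest"),
        sigma_pix, mode="nearest")
    resid = flux - smooth
    # Assumption: one global robust scatter replaces the local std.
    scatter = 1.4826 * np.median(np.abs(resid - np.median(resid)))
    # Asymmetric clipping: 4 sigma upward, 8 sigma downward.
    bad = (resid > upper * scatter) | (resid < -lower * scatter)
    pix = np.arange(flux.size)
    cleaned = flux.copy()
    # Replace flagged pixels by linear interpolation over good neighbors.
    cleaned[bad] = np.interp(pix[bad], pix[~bad], flux[~bad])
    # Trim n_trim pixels at both ends of the arm.
    stop = flux.size - n_trim
    return cleaned[n_trim:stop], bad[n_trim:stop]
```

The looser lower threshold is what protects genuine absorption lines, which only deviate downward from the smoothed spectrum.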

The second step is to normalize spectra to a pseudocontinuum to place the spectral features (usually absorption lines) on the same scale. We iteratively fit the spectrum with a smoothing spline function and clip the pixels away from the median values by three times the standard deviation in each 100 Å window. The number of iterations is set to 3. As shown in Figure 2, for a typical K-type star with a medium S/N, the normalization is quite adequate for the subsequent RV measurements.

3.2. Spectral Templates

We adopt the synthetic grid published by Allende Prieto et al. (2018) based on the ATLAS9 stellar atmosphere model (Kurucz 1979; Mészáros et al. 2012) as our spectral library. We degrade the spectral resolution from 10,000 to 7500 to fit the MRS configuration and convert the air wavelength to vacuum wavelength using the formula proposed by Morton (2000). To limit the computational cost of the subsequent RV measurements to a reasonable amount, we interpolate the synthetic library to generate 100 spectral templates with stellar parameters randomly drawn from a uniform distribution in the ranges of 3500 < T_eff/K < 15,000, $0\lt \mathrm{log}\,g/\mathrm{dex}\lt 5$, −2 < [Fe/H]/dex < 0.5, and −0.5 < [α/Fe]/dex < 0.7. Extremely metal-poor and extremely hot templates are not considered because the spectral features are not significant. Tests on high-S/N MRS spectra show that the sparsity of our templates induces statistical errors ∼0.10 and 0.20 km s⁻¹ in the blue and red arms, respectively, which are negligible as compared to other sources of uncertainties. Figure 3 shows the distribution of parameters of the 100 spectral templates and a series of PARSEC isochrones (Bressan et al. 2012) with solar metallicity and logarithmic age ${\mathrm{log}}_{10}\tau =7$, 8, 9, and 10. These spectral templates are normalized to a pseudocontinuum in the same way as in Section 3.1.

**Figure 3.** The stellar parameters of the 100 randomly drawn templates that are used in the RV measurements. The gray area shows the model grid coverage. The black solid lines are the PARSEC isochrones with solar metallicity and ${\mathrm{log}}_{10}\tau =7$, 8, 9, and 10, where τ is age/yr.

3.3. RV Estimates

The cross-correlation function (CCF; Tonry & Davis 1979) method is widely used in spectroscopic surveys to measure stellar RVs (e.g., Nidever et al. 2015; Steinmetz et al. 2020b). One important advantage is that the CCF can be accelerated using the fast Fourier transform (FFT) once a spectrum is continuum-subtracted and resampled to a logarithmic wavelength grid. However, the drawback of such a scheme is that the sampling of the resulting CCF is generally very sparse. To evaluate the CCF at a smaller RV step, we do not use the FFT approach. In our implementation, the CCF at an RV of v is evaluated as

$\begin{eqnarray}&&\mathrm{CCF}(v| {\boldsymbol{F}},{\boldsymbol{G}})=\displaystyle \frac{\mathrm{Cov}\left({\boldsymbol{F}},{\boldsymbol{G}}(v)\right)}{\sqrt{\mathrm{Var}({\boldsymbol{F}})\mathrm{Var}({\boldsymbol{G}}(v))}},\end{eqnarray} \tag{ 1 }$

where F is the vector of the normalized observed spectrum, G (v) is the vector of the normalized synthetic spectrum shifted by an RV (v) and resampled to the wavelength grid of F , Var represents the variance operator, and Cov represents the covariance operator (see Appendix A for more details).
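Equation (1) is the Pearson correlation coefficient between the observed spectrum and the shifted template, so a direct (non-FFT) evaluation can be sketched in a few lines. The details of Appendix A are not reproduced here; the non-relativistic Doppler factor (1 + v/c) and the linear resampling with `np.interp` are assumptions of this sketch.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s


def ccf(v, wave_obs, flux_obs, wave_mod, flux_mod):
    """Equation (1): correlation of F with the template G shifted by v."""
    # Shift the template wavelengths by v (assumed non-relativistic)
    # and resample the template onto the observed wavelength grid.
    shifted = np.interp(wave_obs, wave_mod * (1.0 + v / C_KMS), flux_mod)
    f = flux_obs - np.mean(flux_obs)
    g = shifted - np.mean(shifted)
    # Cov(F, G) / sqrt(Var(F) Var(G)); the 1/N factors cancel.
    return np.sum(f * g) / np.sqrt(np.sum(f * f) * np.sum(g * g))
```

Because v enters as a continuous argument, the function can be evaluated at any RV step, at the price of one resampling per evaluation. The template should extend beyond the observed wavelength range so that the shift never runs off its edges.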

Deriving the final RV consists of three steps:

1. The initial estimates are made from an RV grid of −1500 to 1500 km s⁻¹, with a step of 10 km s⁻¹. The template with the maximum CCF value is selected as the best-match template, and the corresponding RV is adopted as the initial guess of the final RV of the observed star. The parameters of the template (T_eff, $\mathrm{log}\,g$, [Fe/H], and [α/Fe]) are recorded, which make a good prior for some following analyses, such as for stellar atmospheric parameter determination.
2. With the best-match template, we maximize the CCF to determine the final RV (v_obs) using the optimization routine scipy.optimize.minimize with the Nelder–Mead algorithm (Nelder & Mead 1965). The corresponding CCF value is recorded as CCFMAX to assess the likelihood between the best-match template and the observed spectrum. The S/N–CCFMAX relations are shown in Figure 4.
3. To obtain the measurement error σ_v,obs, we use a Monte Carlo method: namely, we repeat this process 100 times and each time we add Gaussian random noise to the spectrum according to the flux error. The measurement error is computed using the 16th and 84th percentiles, i.e., σ_v,obs = (v₈₄ − v₁₆)/2. The S/N–σ_v,obs relations are shown in Figure 5.
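The three steps can be put together in a compact sketch. It is an illustration under stated assumptions rather than the pipeline itself: `measure_rv` and its argument layout are hypothetical, the Doppler factor is non-relativistic, and each Monte Carlo realization repeats only the refinement of step 2 (starting from v_obs), not the full grid search.

```python
import numpy as np
from scipy.optimize import minimize

C_KMS = 299792.458  # speed of light in km/s


def ccf(v, wave, flux, wave_t, flux_t):
    """Equation (1) with the template shifted by v and resampled."""
    g = np.interp(wave, wave_t * (1.0 + v / C_KMS), flux_t)
    f0, g0 = flux - flux.mean(), g - g.mean()
    return np.sum(f0 * g0) / np.sqrt(np.sum(f0 ** 2) * np.sum(g0 ** 2))


def measure_rv(wave, flux, flux_err, templates, n_mc=100, seed=0):
    """templates: list of (wave_t, flux_t). Returns v, sigma, CCFMAX, index."""
    # Step 1: coarse grid (-1500..1500 km/s, 10 km/s step) over all templates.
    grid = np.arange(-1500.0, 1500.0 + 10.0, 10.0)
    table = np.array([[ccf(v, wave, flux, wt, ft) for v in grid]
                      for wt, ft in templates])
    i_best, i_v = np.unravel_index(np.argmax(table), table.shape)
    wave_t, flux_t = templates[i_best]

    def refine(spec, v0):
        # Step 2: maximize the CCF (minimize its negative) with Nelder-Mead.
        res = minimize(lambda p: -ccf(p[0], wave, spec, wave_t, flux_t),
                       x0=[v0], method="Nelder-Mead")
        return res.x[0], -res.fun

    v_obs, ccfmax = refine(flux, grid[i_v])

    # Step 3: Monte Carlo error from noisy realizations of the spectrum.
    rng = np.random.default_rng(seed)
    draws = [refine(flux + rng.normal(0.0, flux_err), v_obs)[0]
             for _ in range(n_mc)]
    v16, v84 = np.percentile(draws, [16, 84])
    return v_obs, 0.5 * (v84 - v16), ccfmax, int(i_best)
```

The coarse grid only needs to land within the main CCF peak; the simplex search then locates the maximum without assuming any functional form for the peak.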

We avoid Gaussian fitting to the CCF, which is widely used in literature (e.g., Nidever et al. 2015; Wang et al. 2019). Our reasons for doing this include the fact that the CCF peak is obviously non-Gaussian and that at low S/N the fitting process is fragile. The final error of the RV includes the measurement error and a noise floor term that is due to systematics (e.g., background, detector imperfections, temperature changes, or focusing issues) and will be assessed below.

**Figure 4.** The left/middle/right panel shows the S/N–CCFMAX relation for the B (blue arm) / R (red arm) / Rm (red arm with Hα masked) CCF results of 5000 randomly selected spectra, respectively. The color denotes the T_eff of the best-match template.

**Figure 5.** The left/middle/right panel shows the S/N–σ_v,obs (measurement error) relation for the B (blue arm) / R (red arm) / Rm (red arm with Hα masked) CCF results of 5000 randomly selected spectra, respectively. The color denotes the T_eff of the best-match template.

In this work, one RV is estimated for the blue arm (v_B) and two for the red arm (v_R and v_Rm). The v_Rm is measured using the Hα-masked red-arm spectrum. As shown in Figures 4 and 5, these RV measurements deteriorate rapidly at S/N < 20. For cool stars (e.g., FGK types), at a given S/N, v_B is more precise than v_R because of rich spectral features in the blue arm (for a detailed discussion of spectral information content in MRS spectra, see Zhang et al. 2020b). However, for hot stars, v_R is more reliable than v_B because of the Hα feature and v_Rm is significantly less precise than v_R due to the absence of Hα. Therefore, we recommend that readers only consider v_Rm when the targets have (possible) Hα emissions.

4. RVZPs

In essence, the RVZP is the bias of the wavelength solution of a specific spectrum as compared to its true wavelength solution in terms of RV. It is affected by many factors, e.g., the condition of the instrument, the quality of the arc lamp exposure, the reduction algorithm, and the nonsimultaneous nature of the arc lamp exposures and the object exposures. In this paper, we define the RVZP correction value Δv by

$\begin{eqnarray}&&{v}_{\mathrm{abs}}={v}_{\mathrm{obs}}+{\rm{\Delta }}v,\end{eqnarray} \tag{ 2 }$

where v_abs is the absolute RV and v_obs the RV directly measured from the spectrum.

4.1. The Scheme

LAMOST has 16 spectrographs of which each has 250 fibers (4000 fibers in total). Excluding a few tens of sky fibers and a few problematic fibers, each spectrograph typically produces ≲200 spectra in an exposure, depending on the targeting, the condition of the instrument, the data quality, and the reduction algorithm. Let i denote the exposure epoch or LMJM, j the spectrograph ID, and k the fiber ID; ideally, we seek the solution of the RVZP for each fiber, for each spectrograph, and exposure by exposure, namely, Δv_i,j,k. This scheme is infeasible because the true/reference RVs v_i,j,k,abs of the targets are not always known.

In this work, assuming that the fibers in a spectrograph in one exposure (hereafter, we refer to it as a spectrograph exposure unit, or an SEU) share similar RVZPs, the systematic RVZP Δv_i,j can be determined as long as a homogeneous reference set of RVs can be found for a fraction of fibers in that SEU. The assumption is quite reasonable given the fact that the wavelength calibration of a multifiber spectrograph is done by fitting a 2D grating equation. And as we will see in Section 4.2, the Gaia DR2 RVs (Katz et al. 2019) meet our needs for the reference set.

In contrast, both the LAMOST pipeline and Wang et al. (2019) calculate Δv_j assuming the RVZPs for a specific spectrograph do not vary with time, and, therefore, get around the temporal variation of RVZPs. However, we notice that this approach leaves some anomalous absolute RVs (as we will see in the results in Section 5).

4.2. Gaia DR2 RVs as the Reference Set

Thanks to the European Space Agency's Gaia mission (Gaia Collaboration et al. 2016), we now have access to the largest RV data set, which matches the LAMOST MRS in terms of velocity precision and magnitude limit. The spectral resolution of Gaia-RVS (R ∼ 11,500; Cropper et al. 2018) is slightly higher than that of the MRS (R ∼ 7500). The magnitude limit of the Gaia DR2 RV catalog (Katz et al. 2019) is at G_RVS = 12 mag or G ∼ 14 mag depending on spectral type and line-of-sight interstellar extinction, which is slightly brighter than the LAMOST MRS magnitude limit. Gaia DR2 contains qualified median RVs for 7,224,631 stars derived from the Gaia-RVS spectra, with T_eff in the range [3550, 6900] K, excluding large RV-variant stars (see Katz et al. 2019 for details). At the faint end, G_RVS = 11.75 mag, the precisions for T_eff = 5000 and 6500 K are 1.4 and 3.7 km s⁻¹, respectively.

Aiming at studying time-domain astrophysical phenomena, e.g., spectroscopic binaries, we proceed to carry out the second-best scheme, Δv_i,j. Cross-matching the LAMOST MRS DR7 catalog with Gaia DR2, we find 1,582,948 out of the 3,753,659 single-exposure spectra (42.2%) have Gaia RVs, and the common objects usually have good S/N in the MRS because they are relatively bright in that survey. The number of objects in Gaia DR2 is ∼1000 times larger than that in catalogs of RV standard stars, such as that of Huang et al. (2018). The challenge arises because not all of the seven million objects in the Gaia-RVS catalog are RV-invariant, i.e., quite a number of them are pulsating stars or binary/multiple systems that have periodic/nonperiodic RV variations. Below, we demonstrate a robust method that can determine the RVZP self-consistently for each spectrograph in each exposure (Δv_i,j) by comparing the observed RVs to the Gaia DR2 RVs without identifying RV standard stars.

4.3. Self-consistent RVZPs

Assuming that the RV variables vary with random periods at random phases or nonperiodically, and are not the majority of the observed stars, we can regard them as outliers and use a robust method, e.g., least absolute residual (LAR) regression (or least absolute deviation regression; see Press et al. 2007), to estimate Δv_i,j (the common RV bias shared by the objects in an SEU). From a Bayesian perspective, LAR regression originates from an exponential likelihood while least squares (LSQ) regression comes from a Gaussian likelihood. Utilizing the LAR technique, extreme values have less influence on the fit compared to those in LSQ regression. Besides, since we aim at time-domain analysis, as long as our RVZPs are temporally self-consistent, the absolute scales are not very important.
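The difference between the two regressions is easy to see for a single constant offset: the LSQ solution is the weighted mean of the RV differences, whereas the LAR solution is their weighted median. The sketch below uses hypothetical function names and is only meant to illustrate this property.

```python
import numpy as np


def weighted_median(values, weights):
    """A minimizer of sum_k weights[k] * |x - values[k]|."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    cum = np.cumsum(weights[order])
    return values[order][np.searchsorted(cum, 0.5 * cum[-1])]


def lar_offset(v_obs, v_ref, sigma):
    """Offset dv minimizing sum |v_obs + dv - v_ref| / sigma."""
    return weighted_median(np.asarray(v_ref) - np.asarray(v_obs),
                           1.0 / np.asarray(sigma))


def lsq_offset(v_obs, v_ref, sigma):
    """Offset dv minimizing sum (v_obs + dv - v_ref)^2 / sigma^2."""
    diff = np.asarray(v_ref) - np.asarray(v_obs)
    return np.average(diff, weights=1.0 / np.asarray(sigma) ** 2)
```

With, say, 80 RV-invariant stars scattered by 1 km s⁻¹ around a true offset and 20 variables displaced by tens of km s⁻¹ to one side (a deliberately pessimistic, one-sided contamination), the weighted mean is dragged by several km s⁻¹ while the weighted median moves by only a fraction of the scatter.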

Using the indices proposed in Section 4.1, for each group of pointings, we construct a global cost function f as follows:

$\begin{eqnarray}\begin{array}{rcl}f\left({\boldsymbol{\Delta }}{\boldsymbol{v}}\right) & = & {{\rm{\Lambda }}}_{1}\displaystyle \sum _{i}\displaystyle \sum _{j}\displaystyle \sum _{k}\displaystyle \frac{| {v}_{i,j,k,\mathrm{obs}}+{\rm{\Delta }}{v}_{i,j}-{v}_{i,j,k,\mathrm{GAIA}}| }{\sqrt{{\sigma }_{\min }^{2}+{\sigma }_{i,j,k,\mathrm{obs}}^{2}+{\sigma }_{i,j,k,\mathrm{GAIA}}^{2}}}\\ & & +{{\rm{\Lambda }}}_{2}\displaystyle \sum _{i}\displaystyle \sum _{j}\displaystyle \sum _{k}\displaystyle \frac{| {v}_{i,j,k,\mathrm{obs}}+{\rm{\Delta }}{v}_{i,j}-\overline{{v}_{\cdot ,\mathrm{obs}}}| }{\sqrt{{\sigma }_{\min }^{2}+{\sigma }_{i,j,k,\mathrm{obs}}^{2}+{\sigma }_{\cdot ,\mathrm{obs}}^{2}}},\end{array}\end{eqnarray} \tag{ 3 }$

where Δ v is the vector of {Δv_i,j} for all relevant SEUs in the group of pointings; v_i,j,k,obs and σ_i,j,k,obs are the RV and associated measurement error of the kth star in the SEU {i, j}; v_i,j,k,GAIA and σ_i,j,k,GAIA are the Gaia RV and associated uncertainty of the kth star in SEU {i, j}; σ_min is the noise floor of the measured RV, which indicates the stability of the wavelength calibration, i.e., the dispersion of Δv_i,j,k in SEU {i, j}; Δv_i,j is the RVZP correction value of SEU {i, j}, which is a free variable to be solved; $\overline{{v}_{\cdot ,\mathrm{obs}}}$ and σ_.,obs are the median and scatter of the (RVZP-corrected) measured RVs of the star {i, j, k} in other SEUs; and Λ₁ and Λ₂ are the regularization parameters of the two terms. In this scheme, the first term guarantees that the absolute scale of our RVZP-corrected RVs is close to that of Gaia DR2, while the second term makes use of multiple exposures and guarantees that the relative RVZPs are self-consistent. We set Λ₁ = Λ₂ = 1 so that the final correction of each SEU is determined as an average of the two effects. Then the vector Δ v is determined by minimizing the cost function f, i.e., ${\boldsymbol{\Delta }}{\boldsymbol{v}}=\arg \,\min f$ . This algorithm can be implemented by minimizing

$\begin{eqnarray}\begin{array}{rcl}{f}_{i,j}\left({\rm{\Delta }}{v}_{i,j}\right) & = & {{\rm{\Lambda }}}_{1}\displaystyle \sum _{k}\displaystyle \frac{| {v}_{i,j,k,\mathrm{obs}}+{\rm{\Delta }}{v}_{i,j}-{v}_{i,j,k,\mathrm{GAIA}}| }{\sqrt{{\sigma }_{\min }^{2}+{\sigma }_{i,j,k,\mathrm{obs}}^{2}+{\sigma }_{i,j,k,\mathrm{GAIA}}^{2}}}\\ & & +{{\rm{\Lambda }}}_{2}\displaystyle \sum _{k}\displaystyle \frac{| {v}_{i,j,k,\mathrm{obs}}+{\rm{\Delta }}{v}_{i,j}-\overline{{v}_{\cdot ,\mathrm{obs}}}| }{\sqrt{{\sigma }_{\min }^{2}+{\sigma }_{i,j,k,\mathrm{obs}}^{2}+{\sigma }_{\cdot ,\mathrm{obs}}^{2}}}\end{array}\end{eqnarray} \tag{ 4 }$

for each SEU {i, j} sequentially and iteratively, where Δv_i,j is the RVZP correction value for SEU {i, j}. Therefore, the problem is to solve the vector Δ v that has N_SEU elements, where N_SEU is the number of related SEUs. We claim that Δ v is determined when the L_∞ norm of the difference between the solutions in the lth and (l + 1)th iterations is less than a specified value, i.e., $\max (| {\boldsymbol{\Delta }}{{\boldsymbol{v}}}_{l}-{\boldsymbol{\Delta }}{{\boldsymbol{v}}}_{l+1}| )\lt \epsilon$, where ε is the tolerance and is set to 0.075 km s⁻¹.
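A toy version of the sequential iteration over Equation (4) is sketched below. Several choices here are assumptions and not the authors' implementation: each one-dimensional problem is solved exactly with a weighted median (Equation (4) is a sum of weighted absolute values in one variable) instead of a general-purpose optimizer; the iteration starts from zero rather than from the Gaia-only initial guess; the default `sig_min` is arbitrary; and the flat array layout (`seu`, `star` index per measurement) is hypothetical.

```python
import numpy as np


def weighted_median(t, w):
    """A minimizer of sum_k w[k] * |x - t[k]|."""
    order = np.argsort(t)
    cum = np.cumsum(w[order])
    return t[order][np.searchsorted(cum, 0.5 * cum[-1])]


def solve_rvzp(seu, star, v_obs, sig_obs, v_gaia, sig_gaia,
               sig_min=0.5, lam1=1.0, lam2=1.0, eps=0.075, max_iter=100):
    """seu, star: per-measurement indices; v_gaia: per star (NaN if absent)."""
    n_seu = seu.max() + 1
    dv = np.zeros(n_seu)
    for _ in range(max_iter):
        dv_old = dv.copy()
        for s in range(n_seu):              # sequential sweep over SEUs
            targets, weights = [], []
            for m in np.flatnonzero(seu == s):
                k = star[m]
                if np.isfinite(v_gaia[k]):  # first term: Gaia reference
                    targets.append(v_gaia[k] - v_obs[m])
                    weights.append(lam1 / np.sqrt(
                        sig_min ** 2 + sig_obs[m] ** 2 + sig_gaia[k] ** 2))
                others = (star == k) & (seu != s)
                if others.any():            # second term: self-consistency
                    corrected = v_obs[others] + dv[seu[others]]
                    targets.append(np.median(corrected) - v_obs[m])
                    weights.append(lam2 / np.sqrt(
                        sig_min ** 2 + sig_obs[m] ** 2
                        + np.std(corrected) ** 2))
            if targets:
                dv[s] = weighted_median(np.array(targets), np.array(weights))
        # Stop when the L-infinity norm of the change drops below eps.
        if np.max(np.abs(dv - dv_old)) < eps:
            break
    return dv
```

Updating each SEU with the latest values of the others is what couples the exposures: the Gaia term anchors the absolute scale, and the second term pulls repeated observations of the same star toward a common RV.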

SEUs with RVZP correction values ∣Δv_i,j∣ > 50 km s⁻¹ or associated uncertainties ${\sigma }_{{\rm{\Delta }}{v}_{i,j}}\gt 10$ km s⁻¹ (see Section 4.5) are excluded from the iterations. These results are generally due to (1) the number of Gaia DR2 objects being ≤10, (2) the spectral S/N being too low, or (3) bad spectra due to saturation or instrumental problems. We do not think that for these SEUs our scheme and assumptions are valid, so we only keep their initial guesses of Δv_i,j (see Section 4.4) and their uncertainties (see Section 4.5) in our catalog (see Section 6).

4.4. Tricks to Accelerate the Algorithm

Several tricks are used to accelerate the algorithm. The first trick is to get a good initial estimation of Δ v . A good approximation can be made by ignoring the second term in Equation (4), so that with Gaia DR2 RVs we can roughly estimate Δv_i,j by minimizing

$\begin{eqnarray}&&{f}_{i,j,\mathrm{init}}\left({\rm{\Delta }}{v}_{i,j}\right)=\sum _{k}\displaystyle \frac{| {v}_{i,j,k,\mathrm{obs}}+{\rm{\Delta }}{v}_{i,j}-{v}_{i,j,k,\mathrm{GAIA}}| }{\sqrt{{\sigma }_{\min }^{2}+{\sigma }_{i,j,k,\mathrm{obs}}^{2}+{\sigma }_{i,j,k,\mathrm{GAIA}}^{2}}}.\end{eqnarray} \tag{ 5 }$

We note that if an SEU has only a few objects in common with the Gaia catalog, the estimation is risky. Therefore, we require that an SEU at least have 10 objects in common with Gaia DR2 to proceed; otherwise we calculate the initial guess Δv_i,j,init but exclude this SEU in the iteration process.

The second trick is to cut down N_SEU by separating physically detached SEUs, which hastens the index evaluation in each iteration. In Equations (3) and (4), the second terms contain cross-terms, meaning that when solving the N_SEU elements, an iteration process is needed to guarantee that the solution of Δ v is stable. However, the evaluation of indices is computationally expensive when N_SEU grows. Therefore, before the optimization process, physically detached sky areas can be separated, so that the many index evaluation processes can be accelerated by cutting down the array sizes. We group the pointings of LAMOST MRS DR7 using a friends-of-friends algorithm with a 5° linking length, which is double the radius of the field of view of LAMOST in case of any possible common stars between them. Eventually, 137 groups are obtained, as shown in Figure 6. Initial guesses of Δv for each SEU are made by optimizing the cost function of Equation (3) for each group of plates.
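The grouping step can be sketched with a small friends-of-friends routine based on a union-find structure. The function names are hypothetical and the O(N²) pair search is only adequate for a modest number of pointings; this is an illustration of the technique, not the code used to produce the 137 groups.

```python
import numpy as np


def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(np.radians, (ra1, dec1, ra2, dec2))
    h = (np.sin((dec2 - dec1) / 2) ** 2
         + np.cos(dec1) * np.cos(dec2) * np.sin((ra2 - ra1) / 2) ** 2)
    return np.degrees(2 * np.arcsin(np.sqrt(np.clip(h, 0.0, 1.0))))


def friends_of_friends(ra, dec, linking_deg=5.0):
    """Return a group label per pointing; linked pointings share a label."""
    ra, dec = np.asarray(ra, float), np.asarray(dec, float)
    n = ra.size
    parent = list(range(n))

    def find(i):
        # Root of the set containing i, with path compression.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        sep = angular_sep_deg(ra[i], dec[i], ra[i + 1:], dec[i + 1:])
        for j in np.flatnonzero(sep <= linking_deg) + i + 1:
            parent[find(i)] = find(int(j))   # merge the two groups

    roots = [find(i) for i in range(n)]
    _, labels = np.unique(roots, return_inverse=True)
    return labels
```

Two pointings closer than the linking length may share stars, so they must be solved together; pointings connected only through a chain of such neighbors end up in the same group as well.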

**Figure 6.** Grouped pointings of the LAMOST MRS DR7 observations using a friends-of-friends method with a linking length of 5°. The field of view of LAMOST is a circle with a 2.5° radius if all spectrographs are in operation. The figure uses the equatorial coordinate system with Mollweide projection, and the pointings in a particular group are shown with the same color.

In practice, we find that if ε is too small, there is the possibility that Δ v jumps back and forth between two solutions and does not converge. Further analysis shows that this is an optimization-method-related problem (we use the Nelder–Mead solver; changing it to the Powell solver does not solve the problem but the two solutions are different). We guess that this might be due to the numerical problem of the optimization routine. To avoid such a situation, we then add random processes into the algorithm, i.e., if in the lth iteration the solution is Δ v _l and after looping over all related SEUs the solution is Δ v _l,opt, we evaluate the (l + 1)th solution by Δ v _l+1 = η(Δ v _l,opt − Δ v _l) + Δ v _l, where η is the learning rate randomly generated between η₀ and η₁. We set η₀ = 0.5, η₁ = 1.0, and ε = 0.075; considering the expectation for η is 0.75, the effective tolerance of our solution of Δ v is 0.10 km s⁻¹, which is acceptable when compared to the typical precision of measured RVs (e.g., ∼1.5 km s⁻¹ as reported by Wang et al. 2019).
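The randomized update is a one-liner, and a toy map shows why it can break a two-solution cycle. The sketch assumes one scalar η per iteration (drawn uniformly) applied to the whole vector, and the oscillating map x → 2c − x is purely illustrative; it is not a model of the actual optimizer behavior.

```python
import numpy as np


def damped_update(dv_l, dv_l_opt, rng, eta0=0.5, eta1=1.0):
    """dv_{l+1} = dv_l + eta * (dv_{l,opt} - dv_l), eta ~ U(eta0, eta1)."""
    eta = rng.uniform(eta0, eta1)
    return dv_l + eta * (dv_l_opt - dv_l)


# Toy cycle: the undamped map x -> 2c - x jumps between x0 and 2c - x0
# forever, whereas the damped version shrinks |x - c| by |1 - 2*eta|
# at every step.
rng = np.random.default_rng(0)
c = 1.0
x_plain = np.array([3.0])
x_damped = np.array([3.0])
for _ in range(200):
    x_plain = 2.0 * c - x_plain
    x_damped = damped_update(x_damped, 2.0 * c - x_damped, rng)

# A step of eta * delta below eps = 0.075 only bounds delta itself by
# eps / E[eta] = 0.075 / 0.75 = 0.10 km/s on average.
effective_tolerance = 0.075 / 0.75
```

In this toy case the damped sequence settles on the fixed point c while the undamped one keeps alternating between the same two values.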

Finally, the RVZP corrections for all the 137 groups of pointings converge after several tens of iterations with a Dell Precision R740 workstation with two Intel Xeon Platinum 8260 CPUs (2.40 GHz), among which the longest solution takes ∼10 hr. Compared to the computation of RV measurements using the CCF for the 3.8 million spectra, including the blue and red arms (∼1 week with the same machine), computing the RVZPs is quite fast.

4.5. Uncertainty Estimation

Rigorous uncertainties are very difficult to obtain for our RVZP corrections. The uncertainties of the RVZP correction values consist of two parts, namely the tolerance in the iteration process and the formal error. The tolerance is ε = 0.1 km s⁻¹ as mentioned above. For the latter part, based on the discussion presented in Appendix B, we use the 16th and 84th percentiles of the residuals to construct a fiducial error of our RVZP correction values Δv_i,j and divide it by an empirical correction ξ to obtain the formal error. Hence, the total uncertainties of the RVZP correction values are evaluated via

$\begin{eqnarray}&&{\sigma }_{{\boldsymbol{\Delta }}{v}_{i,j}}^{2}={\left(\displaystyle \frac{q{84}_{i,j}-q{16}_{i,j}}{2\xi \sqrt{{N}_{i,j}}}\right)}^{2}+{\epsilon }^{2},\end{eqnarray} \tag{ 6 }$

where i and j index the SEUs, q16 and q84 denote the 16th and 84th percentiles of the residuals of the Gaia DR2 RVs and the RVZP-corrected LAMOST MRS RVs, N_i,j is the number of Gaia DR2 objects with RVs, and ξ is the empirical correction factor for small-number statistics.
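Equation (6) translates directly into code. In this sketch the function name is hypothetical, and ξ is left as a required argument because its value comes from Appendix B, which is not reproduced here.

```python
import numpy as np


def rvzp_uncertainty(residuals, xi, eps=0.1):
    """Equation (6): percentile-based formal error plus the tolerance.

    residuals: Gaia DR2 RV minus RVZP-corrected MRS RV for the N_ij
               Gaia objects in one SEU (km/s).
    xi:        empirical small-number correction (from Appendix B).
    eps:       tolerance term in km/s.
    """
    residuals = np.asarray(residuals, dtype=float)
    q16, q84 = np.percentile(residuals, [16, 84])
    formal = (q84 - q16) / (2.0 * xi * np.sqrt(residuals.size))
    return np.sqrt(formal ** 2 + eps ** 2)
```

Since the two terms add in quadrature, the result can never fall below ε, however many Gaia objects an SEU contains.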

5. Results and Validation

In total, we have measured RVs from 3,181,157/3,723,934 single-exposure blue/red-arm spectra in LAMOST MRS DR7 with S/N higher than 5. For 36,301/37,122/37,122 (B/R/Rm) of the 37,624 SEUs, we have successfully derived the initial values of the RVZPs. After eliminating bad SEUs with the criteria described at the end of Section 4.3, we estimate the final RVZPs for 33,073/35,207/35,199 (B/R/Rm) SEUs, which cover 2,985,015/3,631,023/3,629,895 B/R/Rm RVs. Roughly, the percentages of coverage are 87.9%/93.6%/93.6% for B/R/Rm in terms of SEUs and 93.8%/97.5%/97.5% for B/R/Rm in terms of RVs.

5.1. The Temporal Variation of RVZPs

In Figure 7, we present the Δv_i,j for each SEU for Sc and ThAr arc lamps versus the date. The Sc lamp was in use until 2018 October, after which it was replaced by the ThAr lamp. The mean uncertainties of Δv_B, Δv_R, and Δv_Rm are all ∼0.38 km s⁻¹, which is quite good. The median uncertainties are even 20% smaller. The Δv_B and Δv_R have different patterns while Δv_R and Δv_Rm are very similar. In Figure 8 we show the distribution of Δv_B, Δv_R, and Δv_Rm of the SEUs solved. The μ here is estimated using the median, and σ is estimated using (q84 − q16)/2. The μ are 0.49 and 6.47 km s⁻¹ for the ThAr and Sc lamps in the blue arm, while in the red arm they are 0.26 and 4.91 km s⁻¹, respectively. The σ are 1.07 and 1.06 km s⁻¹ for the ThAr and Sc lamps in the blue arm, and in the red arm they are 0.85 and 0.68 km s⁻¹. Generally, the different systematics for the Sc- and ThAr-lamp-calibrated data are consistent with Wang et al. (2019). Despite the large systematics, we find that the precision of the Sc-lamp-calibrated data is no worse than that of the data calibrated by the ThAr lamp. The Rm results are very similar to those of R, except their μ have a 0.3 km s⁻¹ difference. We note that this is reasonable considering that the systematics vary with wavelength as shown in Ren et al. (2021). In addition, ∼4000 single-exposure spectra calibrated using an Ne lamp are also found in DR7 v1.1. We confirm that the Ne lamp was used to calibrate the LRS spectra and these mistakenly calibrated data will be removed from the internationally available version of DR7, so we exclude these data in the following analysis.

**Figure 8.** Histograms of Δv_B, Δv_R, and Δv_Rm. The Sc- and ThAr-lamp-calibrated data are shown in gray and cyan, respectively. The μ and σ are calculated using the median and (q84 − q16)/2, respectively.

It is clear that the RVZPs are reasonably stable except after around 2019 May 1, when the change seems to be correlated with the arc lamp exposure flux.¹⁴ We plot the Δv for each spectrograph for the time interval with available arc lamp intensity in Figures 9 and 10. Since the mean uncertainty of Δv_i,j is 0.38 km s⁻¹, we regard shifts in Δv_i,j larger than 1 km s⁻¹ as significant; these occur in the 7th, 12th, 13th, and 15th spectrographs of the blue arm and the 7th, 9th, and 15th spectrographs of the red arm. It is currently not known whether these shifts are due to the fact that the new ThAr lamps were brought into use at around 2019 May 1 or to some other issues induced in the maintenance. With the longer time baseline of RVZP data expected in DR8, we may be able to address this problem.

**Figure 9.** The temporal variance of Δv_B before/after 2019 May 1 for the blue arm. The gray filled areas show the 16th and 84th percentiles of Δv_B in the two time intervals, and the black solid lines show the medians. The colors denote the square root of the peak flux of the Th5231 line, which can be used as an indicator of S/N. Here the new batch of ThAr lamps also introduces RVZP variance.

**Figure 10.** The same as Figure 9 but for the red arm (Δv_R). The colors denote the square root of the peak flux of the Ar6752 line.

5.2. Validating with Other Data Sets

We also validate our RVs with stars common to other data sets, namely, Gaia DR2 (Katz et al. 2019), APOGEE DR16 (Column VHELIO_AVG; Jönsson et al. 2020), the RV standard stars from Huang et al. (2018), GALAH DR3 (Column rv_galah; Buder et al. 2021), and RAVE DR6 (Column hrv_sparv; Steinmetz et al. 2020b). The mean μ and scatter σ derived from Gaussian fitting to the residuals are tabulated in Table 2 and shown as functions of S/N in Figure 11. Also shown are the results of the LAMOST pipeline and Wang et al. (2019). Note that, in DR7 v1.1, the LAMOST pipeline only provides two RVs measured from the blue arm using ELODIE empirical templates (Moultaka et al. 2004) and ATLAS9 synthetic templates (Castelli & Kurucz 2003), respectively. In future MRS data releases, the RVs using ELODIE templates will be removed, and the RVs based on ATLAS9 will be provided for both blue and red arms for better performance. Therefore, we use calibrated RVs of blue arms based on ATLAS9 templates in our comparison (Column rv_ku1). The Wang et al. (2019) catalog is a subset of ours (including data taken during the first year and a half with an S/N cut at 10). Spectra without our measurements (i.e., either S/N < 5 or Δv is invalid) are excluded from this comparison, so that it is fair to the other two RV sources. At the high-S/N end (50–100), we find the standard deviations derived from Gaussian fitting for our results can reach 1.00/1.10, 0.84/0.80, 0.69/0.74, 0.72/0.78, and 1.77/1.88 km s⁻¹ with respect to Gaia DR2, APOGEE DR16, Huang et al. (2018), GALAH DR3, and RAVE DR6 in the blue/red arm. The common stars among LAMOST MRS DR7, Gaia DR2, and the other four reference sets are used to calculate the fiducial accuracy of Gaia DR2. At the high-S/N end (50 < S/N < 100), the LAMOST MRS gets close to the performance of Gaia DR2 (see Figure 11). Note that, in our algorithm, the Gaia DR2 RVs are used as a reference set, so the comparison with Gaia DR2 is not an independent validation. 
These comparison results are encouraging. At the high-S/N end, we outperform Wang et al. (2019) and the LAMOST pipeline by ∼20% and 30%, respectively, according to the blue-arm results compared to APOGEE DR16. At the low-S/N end (10–20), the advantages increase to ∼47% and 58%, respectively, indicating that our algorithm of RV measurements and RVZP determinations is quite effective. The comparison to Huang et al. (2018), whose sample is drawn from APOGEE DR14, has a 0.4 km s⁻¹ systematic bias that does not exist in our comparison to APOGEE DR16; this may be due to the update of the APOGEE data release. GALAH DR3 has a 0.23 km s⁻¹ systematic bias as compared to Gaia DR2. The comparison with RAVE DR6, whose spectral resolution is the same as that of the LAMOST MRS but is lower than that of Gaia-RVS, APOGEE, and GALAH, shows a large scatter but it is still reasonable.
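The relative advantages quoted for the blue arm can be reproduced approximately from the scatters in Table 2. The short calculation below is only illustrative: the helper `relative_gain` is a hypothetical name, and because it starts from the rounded table values, the results may differ from the quoted percentages by about one percentage point.

```python
def relative_gain(sigma_ours, sigma_other):
    """Fractional reduction in scatter relative to another RV source."""
    return 1.0 - sigma_ours / sigma_other


# Blue-arm scatter (km/s) with respect to APOGEE DR16, copied from Table 2
high_snr = {"this_work": 0.84, "wang2019": 1.08, "lamost": 1.22}  # 50 < S/N < 100
low_snr = {"this_work": 0.95, "wang2019": 1.81, "lamost": 2.24}   # 10 < S/N < 20

for label, sig in (("50 < S/N < 100", high_snr), ("10 < S/N < 20", low_snr)):
    gain_wang = relative_gain(sig["this_work"], sig["wang2019"])
    gain_lamost = relative_gain(sig["this_work"], sig["lamost"])
    print(f"{label}: {gain_wang:.1%} vs. Wang et al. (2019), "
          f"{gain_lamost:.1%} vs. the LAMOST pipeline")
```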

**Figure 11.** Comparisons of our absolute RVs from LAMOST MRS DR7 (including the RVs of this work, the LAMOST pipeline, and Wang et al. 2019) with other data sets, i.e., Gaia DR2 (Katz et al. 2019), APOGEE DR16 (Column `VHELIO_AVG`; Jönsson et al. 2020), the RV standard stars from Huang et al. (2018), GALAH DR3 (Column `rv_galah`; Buder et al. 2021), and RAVE DR6 (Column `hrv_sparv`; Steinmetz et al. 2020b). The scatters (σ) are estimated using the Gaussian fitting method.

Table 2. Comparison of the LAMOST MRS RVs to Other Data Sets

Ref. Data Set	S/N	This Work (Blue Arm)	This Work (Red Arm)	This Work (Hα Masked)	LAMOST (Blue Arm)	Wang et al. (2019) (Blue Arm)	Wang et al. (2019) (Red Arm)
Gaia DR2	5 < S/N < 10	N = 166,218, μ = −0.02, σ = 1.68	N = 89,032, μ = −0.05, σ = 2.38	N = 89,009, μ = −0.07, σ = 2.47	N = 166,218, μ = −0.56, σ = 3.78	N = 0	N = 0
Gaia DR2	10 < S/N < 20	N = 301,963, μ = −0.04, σ = 1.42	N = 164,068, μ = −0.10, σ = 1.74	N = 164,010, μ = −0.10, σ = 1.81	N = 301,963, μ = −0.53, σ = 2.52	N = 209,446, μ = −0.55, σ = 2.14	N = 26,195, μ = −0.59, σ = 2.72
Gaia DR2	20 < S/N < 50	N = 572,355, μ = −0.03, σ = 1.22	N = 531,044, μ = −0.07, σ = 1.39	N = 530,913, μ = −0.08, σ = 1.44	N = 572,355, μ = −0.44, σ = 1.88	N = 389,277, μ = −0.46, σ = 1.66	N = 313,059, μ = −0.66, σ = 2.01
Gaia DR2	50 < S/N < 100	N = 270,880, μ = 0.00, σ = 1.00	N = 497,328, μ = −0.00, σ = 1.10	N = 497,126, μ = −0.01, σ = 1.14	N = 270,880, μ = −0.33, σ = 1.45	N = 180,183, μ = −0.35, σ = 1.32	N = 322,647, μ = −0.58, σ = 1.55
APOGEE DR16	5 < S/N < 10	N = 35,352, μ = 0.22, σ = 1.26	N = 23,362, μ = 0.08, σ = 1.99	N = 23,354, μ = 0.06, σ = 2.17	N = 35,352, μ = −0.39, σ = 3.72	N = 0	N = 0
APOGEE DR16	10 < S/N < 20	N = 49,482, μ = 0.13, σ = 0.95	N = 38,971, μ = 0.08, σ = 1.27	N = 38,987, μ = 0.02, σ = 1.38	N = 49,482, μ = −0.41, σ = 2.24	N = 33,297, μ = −0.46, σ = 1.81	N = 4781, μ = −0.47, σ = 2.14
APOGEE DR16	20 < S/N < 50	N = 80,556, μ = 0.06, σ = 0.82	N = 87,168, μ = 0.06, σ = 0.92	N = 87,159, μ = 0.03, σ = 0.99	N = 80,556, μ = −0.41, σ = 1.51	N = 52,872, μ = −0.43, σ = 1.28	N = 48,104, μ = −0.49, σ = 1.60
APOGEE DR16	50 < S/N < 100	N = 41,479, μ = −0.04, σ = 0.84	N = 73,363, μ = 0.04, σ = 0.80	N = 73,350, μ = 0.04, σ = 0.85	N = 41,479, μ = −0.41, σ = 1.22	N = 26,724, μ = −0.42, σ = 1.08	N = 45,191, μ = −0.51, σ = 1.26
Huang et al. (2018)	5 < S/N < 10	N = 2753, μ = 0.44, σ = 1.11	N = 1659, μ = 0.33, σ = 1.53	N = 1661, μ = 0.36, σ = 1.69	N = 2753, μ = 0.28, σ = 3.34	N = 0	N = 0
Huang et al. (2018)	10 < S/N < 20	N = 3404, μ = 0.43, σ = 0.84	N = 2890, μ = 0.36, σ = 1.13	N = 2890, μ = 0.36, σ = 1.18	N = 3404, μ = 0.01, σ = 2.02	N = 2316, μ = −0.18, σ = 1.54	N = 161, μ = 0.18, σ = 2.07
Huang et al. (2018)	20 < S/N < 50	N = 5050, μ = 0.42, σ = 0.76	N = 5923, μ = 0.39, σ = 0.81	N = 5920, μ = 0.40, σ = 0.87	N = 5050, μ = 0.01, σ = 1.31	N = 3352, μ = −0.05, σ = 1.13	N = 2938, μ = 0.10, σ = 1.47
Huang et al. (2018)	50 < S/N < 100	N = 2507, μ = 0.38, σ = 0.69	N = 4917, μ = 0.44, σ = 0.74	N = 4916, μ = 0.47, σ = 0.79	N = 2507, μ = 0.06, σ = 0.98	N = 1711, μ = 0.05, σ = 0.87	N = 3127, μ = −0.08, σ = 1.03
GALAH DR3	5 < S/N < 10	N = 30,272, μ = 0.32, σ = 1.37	N = 17,317, μ = 0.26, σ = 2.18	N = 17,271, μ = 0.22, σ = 2.37	N = 30,272, μ = −0.25, σ = 4.06	N = 0	N = 0
GALAH DR3	10 < S/N < 20	N = 42,242, μ = 0.27, σ = 1.00	N = 32,885, μ = 0.29, σ = 1.40	N = 32,768, μ = 0.23, σ = 1.54	N = 42,242, μ = −0.20, σ = 2.48	N = 29,780, μ = −0.28, σ = 1.99	N = 5135, μ = −0.19, σ = 2.13
GALAH DR3	20 < S/N < 50	N = 52,015, μ = 0.26, σ = 0.83	N = 63,642, μ = 0.27, σ = 0.96	N = 63,593, μ = 0.22, σ = 1.01	N = 52,015, μ = −0.10, σ = 1.66	N = 37,996, μ = −0.14, σ = 1.47	N = 40,401, μ = −0.32, σ = 1.75
GALAH DR3	50 < S/N < 100	N = 18,796, μ = 0.24, σ = 0.72	N = 37,239, μ = 0.29, σ = 0.78	N = 37,239, μ = 0.27, σ = 0.81	N = 18,796, μ = −0.07, σ = 1.16	N = 13,727, μ = −0.13, σ = 1.08	N = 27,296, μ = −0.31, σ = 1.30
RAVE DR6	5 < S/N < 10	N = 593, μ = 0.53, σ = 2.17	N = 414, μ = 0.93, σ = 2.99	N = 414, μ = 0.77, σ = 3.08	N = 593, μ = 0.13, σ = 4.63	N = 0	N = 0
RAVE DR6	10 < S/N < 20	N = 1273, μ = 0.47, σ = 1.94	N = 719, μ = 0.50, σ = 2.08	N = 719, μ = 0.37, σ = 2.03	N = 1273, μ = −0.29, σ = 3.10	N = 862, μ = −0.19, σ = 2.55	N = 146, μ = 0.14, σ = 3.02
RAVE DR6	20 < S/N < 50	N = 3343, μ = 0.39, σ = 1.81	N = 2273, μ = 0.62, σ = 2.15	N = 2273, μ = 0.55, σ = 2.17	N = 3343, μ = −0.09, σ = 2.34	N = 2539, μ = −0.16, σ = 2.14	N = 1487, μ = 0.15, σ = 2.57
RAVE DR6	50 < S/N < 100	N = 2332, μ = 0.35, σ = 1.77	N = 3209, μ = 0.41, σ = 1.88	N = 3209, μ = 0.35, σ = 1.86	N = 2332, μ = 0.08, σ = 2.11	N = 1798, μ = 0.05, σ = 1.93	N = 2368, μ = −0.25, σ = 2.10

Gaia DR2 RVS with respect to each reference set (a single entry per reference set):

Ref. Data Set	N	μ	σ
Gaia DR2	⋯	⋯	⋯
APOGEE DR16	28,155	0.02	0.67
Huang et al. (2018)	2070	0.34	0.59
GALAH DR3	8778	0.23	0.77
RAVE DR6	1674	0.33	1.60

Note. In each cell, we present the number of stars (N), the Gaussian-fitted systematic bias (μ/km s⁻¹), and the standard error (σ/km s⁻¹) with respect to a specific reference set.


5.3. Self-consistency

Besides the precision, a check on self-consistency is necessary and important before using RVs in time-domain research. As a demonstration, in Figures 12 and 13 we validate the temporal RV variations of the standard stars (whose RVs are assumed invariant) from Huang et al. (2018) in two pointings, namely, planid = HIP8426401 and HIP4312101, which have 10 and 17 exposures and contain 27 and 41 RV standard stars, respectively. The RVs from the LAMOST pipeline show large fluctuations, as expected from the comparison in Section 5.2. For many stars, the RVs of multiple exposures from the Wang et al. (2019) catalog have exactly the same values. This happens when their Gaussian fitting process fails and the measurement falls back on the best estimate from the 1 km s⁻¹ RV grid, which is a defect in the algorithm. By comparing the measured RVs and the RVZP-corrected RVs, we find that the RVZP-corrected RVs are much cleaner, indicating that the RVs do benefit from our algorithm for Δv_i,j determination. The LAMOST pipeline and Wang et al. (2019) assume that the RVZP for a spectrograph is static so that the RVZP corrections are basically a shift of the RVs.

**Figure 12.** A comparison of the RVs of standard stars observed in LAMOST MRS DR7 (`planid` = HIP8426401) determined in this work, in LAMOST, and in Wang et al. (2019). Each color represents a unique RV standard star. The thick ticks show the RVs from Huang et al. (2018), the dashed lines show the RVs before RVZP corrections, and the circles connected by solid lines show the RVZP-corrected RVs. The first exposure is in 2018 April and is calibrated with an Sc lamp, so the correction value is quite different from those of other exposures.

**Figure 13.** The same as Figure 12 but for `planid` = HIP4312101.

To quantify the performance in self-consistency, we select RV standard stars from Huang et al. (2018) with at least eight exposures having valid RVZP-corrected RVs in our catalog and calculate their standard deviation, empirically corrected for the small-number statistic effect, which is discussed in Appendix B. The cumulative distribution function (CDF) of the standard deviations is shown in Figure 14. The absolute RVs from the blue arm (B) show the best self-consistency, followed by those from the red arm (R) and the Hα-masked red arm (Rm). By calculating the 50th, 90th, and 95th percentiles of the CDFs for the blue-arm results, we find our absolute RVs have significant advantages over those of the LAMOST pipeline, namely 16.5%, 35.5%, and 37.5% better (see Table 3), while Wang et al. (2019) are at nearly the same level as the LAMOST pipeline. This reveals the excellent self-consistency of our absolute RVs, which will be used in the time-domain analysis. Note that these estimates of the advantages are very conservative because, for example, the Wang et al. (2019) data set cuts the S/N at 10 whereas the LAMOST pipeline and our sample cut at 5. Besides, we do not exclude any of the "exactly the same" RVs from Wang et al. (2019), so readers may notice that the CDFs for LAMOST and Wang et al. (2019) jump at around the 0 km s⁻¹ position.
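The self-consistency statistic described above can be sketched as follows. This is an illustration under stated assumptions rather than the code used for Figure 14: the names `corrected_std` and `self_consistency_percentiles` are hypothetical, the input is assumed to be a dictionary mapping each standard star to its RVZP-corrected RVs, and the small-number correction ζ(N) of Appendix B is supplied by the caller as a function.

```python
import numpy as np


def corrected_std(rvs, zeta):
    """Sample standard deviation (Equation (B4)) divided by zeta(N)."""
    rvs = np.asarray(rvs, dtype=float)
    return np.std(rvs, ddof=1) / zeta(rvs.size)


def self_consistency_percentiles(rv_by_star, zeta, min_exp=8, q=(50, 90, 95)):
    """Percentiles of the per-star RV scatter (km/s).

    rv_by_star : dict mapping a star identifier to its absolute RVs (km/s)
    zeta       : callable returning the small-number correction for N exposures
    min_exp    : minimum number of exposures required per star
    """
    stds = [corrected_std(rvs, zeta)
            for rvs in rv_by_star.values() if len(rvs) >= min_exp]
    return np.percentile(stds, q)
```

The returned values correspond to the 50%, 90%, and 95% levels of the CDF of the standard deviations.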

**Figure 14.** The CDFs of the standard deviations of multiple observations for 678 RV standard stars (Huang et al. 2018) observed in LAMOST MRS DR7. The blue/red/orange colors denote the B/R/Rm measurements, and the solid/dashed/dotted lines represent the results of this work/LAMOST/Wang et al. (2019).

Table 3. Comparison of the Self-consistency of This Work to That of the LAMOST Pipeline and Wang et al. (2019)

Percentile	This Work (Blue Arm)	This Work (Red Arm)	This Work (Hα Masked)	LAMOST (Blue Arm)	Wang et al. (2019) (Blue Arm)	Wang et al. (2019) (Red Arm)
q = 50%	0.45 (16.5%)	0.54	0.59	0.53	0.54 (−0.7%)	0.50
q = 90%	1.07 (35.5%)	1.39	1.61	1.66	1.76 (−5.8%)	1.72
q = 95%	1.45 (37.5%)	1.86	2.10	2.32	2.29 (1.5%)	2.08

Note. Each column shows the RV standard deviation for 678 standard stars at corresponding levels of the CDF. In the parentheses we show the advantages over the LAMOST blue-arm results.


With this excellent self-consistency, we select 10,320 candidate RV standard stars by requiring that in the blue arm (B) and red arm (R)

1. their numbers of exposures are at least 8,
2. their absolute RVs have standard deviations (corrected for small-number statistics) of less than 1.45 and 1.86 km s⁻¹, respectively (corresponding to the 95% level of the CDF, or to 95% completeness with respect to Huang et al. 2018), and
3. their time baselines are longer than 180 days.

These stars can be useful for the RV calibration of low-resolution surveys, such as the LAMOST LRS (R ∼ 1800).
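The selection of candidate RV standard stars can be written as a simple boolean mask. The sketch below assumes that all three criteria must hold in both arms and uses the thresholds of 1.45 and 1.86 km s⁻¹ quoted for the catalog in Section 6; the function name `select_rv_standards` is hypothetical, while the argument names mirror the columns of Table 6.

```python
import numpy as np


def select_rv_standards(nexp_b, rvstd_b, ts_b, nexp_r, rvstd_r, ts_r,
                        min_exp=8, max_std_b=1.45, max_std_r=1.86,
                        min_span=180.0):
    """Boolean mask of candidate RV standard stars.

    nexp_*  : number of exposures per star
    rvstd_* : standard deviation of absolute RVs (km/s), already corrected
              for small-number statistics
    ts_*    : time span of the exposures (days)
    """
    blue_ok = ((np.asarray(nexp_b) >= min_exp)
               & (np.asarray(rvstd_b) < max_std_b)
               & (np.asarray(ts_b) > min_span))
    red_ok = ((np.asarray(nexp_r) >= min_exp)
              & (np.asarray(rvstd_r) < max_std_r)
              & (np.asarray(ts_r) > min_span))
    # Assumption: a candidate must pass the cuts in both arms
    return blue_ok & red_ok
```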

6. Data Products

The data products of this work include

1. a catalog of ∼3.8 million measured RVs (but 5 million rows for completeness), associated errors, and information on observations for ∼0.8 million stars (Table 4),
2. a catalog of RVZP corrections (Δv_i,j, for B, R, and Rm) and their uncertainties for all SEUs (Table 5), and
3. a catalog of 10,320 candidate RV standard stars with at least eight exposures and standard deviation less than 1.45/1.86 km s⁻¹ in the blue/red arm over a time baseline longer than 180 days (Table 6).

The RV and RVZP catalogs can be cross-matched using the columns spid and lmjm. All catalogs will be available online in FITS format and also on GitHub.¹⁵
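A cross-match on `lmjm` and `spid` might look like the sketch below, which attaches to every RV row the RVZP correction of its SEU. The function `attach_rvzp` is a hypothetical helper, not part of the released packages; reading the FITS files is left out, and the inputs are assumed to be plain arrays of the `lmjm`, `spid`, and (for example) `rv_corr2_B` columns of Tables 4 and 5.

```python
import numpy as np


def attach_rvzp(rv_lmjm, rv_spid, zp_lmjm, zp_spid, zp_corr):
    """Return, for each RV row, the RVZP correction of its SEU (NaN if none).

    rv_lmjm, rv_spid : lmjm and spid columns of the RV catalog
    zp_lmjm, zp_spid : lmjm and spid columns of the RVZP catalog
    zp_corr          : RVZP correction column of the RVZP catalog (km/s)
    """
    # Each (lmjm, spid) pair identifies one exposure of one spectrograph
    lookup = {(int(t), int(s)): float(c)
              for t, s, c in zip(zp_lmjm, zp_spid, zp_corr)}
    return np.array([lookup.get((int(t), int(s)), np.nan)
                     for t, s in zip(rv_lmjm, rv_spid)])
```

Rows whose exposure has no valid RVZP are returned as NaN, so they can be filtered out before absolute RVs are computed.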

A few tips: Users who want to correct Doppler effects of their spectra (e.g., Zhang et al. 2020a) should use RVs without RVZP corrections, while those who want to use absolute RVs can obtain them from our catalogs via Equation (2). The uncertainties of the absolute RVs can be evaluated via

$\begin{eqnarray}&&{\sigma }_{\mathrm{abs}}^{2}={\sigma }_{v,\mathrm{obs}}^{2}+{\sigma }_{\min }^{2}+{\sigma }_{{\rm{\Delta }}v}^{2}+{\sigma }_{\mathrm{mod}}^{2},\end{eqnarray} \tag{ 7 }$

where σ_v,obs is the measurement error; σ_min is the wavelength calibration error floor, which we can infer from the comparison to APOGEE DR16 to be approximately 0.85 km s⁻¹ or conservatively 1 km s⁻¹; σ_Δv is the uncertainty of the RVZP; and σ_mod is the contribution from the sparsity of the spectral templates (0.10 km s⁻¹ for the blue arm and 0.20 km s⁻¹ for the red arm).
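Equation (7) is a quadrature sum and can be sketched in a few lines. The function name `absolute_rv_uncertainty` and the dictionary `SIGMA_MOD` are hypothetical; the default σ_min = 0.85 km s⁻¹ follows the value inferred above, and users preferring the conservative choice can pass `sigma_min=1.0`.

```python
import numpy as np

# Template-sparsity contribution (km/s) for the blue and red arms
SIGMA_MOD = {"B": 0.10, "R": 0.20}


def absolute_rv_uncertainty(sigma_obs, sigma_dv, arm, sigma_min=0.85):
    """Uncertainty of an absolute RV (km/s), following Equation (7).

    sigma_obs : RV measurement error (km/s)
    sigma_dv  : uncertainty of the RVZP correction (km/s)
    arm       : "B" for the blue arm or "R" for the red arm
    sigma_min : wavelength calibration error floor (km/s)
    """
    sigma_obs = np.asarray(sigma_obs, dtype=float)
    sigma_dv = np.asarray(sigma_dv, dtype=float)
    return np.sqrt(sigma_obs**2 + sigma_min**2
                   + sigma_dv**2 + SIGMA_MOD[arm]**2)
```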

Table 4. The 3.8 Million RVs (v_obs) Obtained from LAMOST MRS DR7 (v1.1)

Index	Label (FITS)	Format	Units	Description
1	obsid	Integer	⋯	LAMOST observational ID (unique for each .fits file)
2	lmjm	Integer	min	LMJM^a
3	bjdmid	Double	⋯	Barycentric Julian date of the middle of exposure
4	planid	String	⋯	Plan ID
5	spid	Short	⋯	Spectrograph ID
6	fiberid	Short	⋯	Fiber ID
7	ra	Double	deg	R.A. (J2000)
8	dec	Double	deg	Decl. (J2000)
9	snr_B	Float	⋯	S/N of blue arm
10	snr_R	Float	⋯	S/N of red arm
11	lamp_B	String	⋯	Lamp used to calibrate blue arm
12	lamp_R	String	⋯	Lamp used to calibrate red arm
13	rv_B	Float	km s⁻¹	RV (v_obs)
14	rv_err_B	Float	km s⁻¹	RV measurement error (σ_v,obs)
15	rv_teff_B	Float	K	T_eff of the best template
16	ccfmax_B	Float	⋯	CCF max value
17	rv_R	Float	km s⁻¹	RV (v_obs)
18	rv_err_R	Float	km s⁻¹	RV measurement error (σ_v,obs)
19	rv_teff_R	Float	K	T_eff of the best template
20	ccfmax_R	Float	⋯	CCF max value
21	rv_Rm	Float	km s⁻¹	RV (v_obs)
22	rv_err_Rm	Float	km s⁻¹	RV measurement error (σ_v,obs)
23	rv_teff_Rm	Float	K	T_eff of the best template
24	ccfmax_Rm	Float	⋯	CCF max value

Notes. The suffixes _B, _R, and _Rm represent results for the blue arm, red arm, and red arm without Hα, respectively. Table 4 is published in its entirety in machine-readable (FITS) format. A portion is shown here for guidance regarding its form and content.

^aThe LMJM is 1440× the local modified Julian date of the beginning of exposure, which is an 8-digit integer assigned to each exposure. (This table is available in its entirety in FITS format.)


Table 5. The RVZP Correction Values (Δv) for LAMOST MRS DR7 (v1.1)

Index	Label (FITS)	Format	Units	Description
1	planid	String	⋯	Plan ID
2	lmjm	Integer	⋯	LMJM^a
3	spid	Integer	⋯	Spectrograph ID
4	rv_corr0_B	Float	km s⁻¹	Initial guess of RVZP correction
5	nStar_fnt_B	Integer	⋯	Number of stars in this SEU with Gaia DR2 RVs
6	rv_corr2_B	Float	km s⁻¹	Final RVZP correction (Δv)
7	nF1_B	Integer	⋯	Number of stars for first term of cost function
8	nF2_B	Integer	⋯	Number of stars for second term of cost function
9	nOther_med_B	Integer	⋯	Median number of exposures
10	nOther_max_B	Integer	⋯	Maximum number of exposures
11	nOther_min_B	Integer	⋯	Minimum number of exposures
12	rv_corr2_unc_B	Float	km s⁻¹	Uncertainty of the final RVZP correction (σ_Δv)
13	rv_corr0_R	Float	km s⁻¹	Initial guess of RVZP correction
14	nStar_fnt_R	Integer	⋯	Number of stars in this SEU with Gaia DR2 RVs
15	rv_corr2_R	Float	km s⁻¹	Final RVZP correction (Δv)
16	nF1_R	Integer	⋯	Number of stars for first term of cost function
17	nF2_R	Integer	⋯	Number of stars for second term of cost function
18	nOther_med_R	Integer	⋯	Median number of exposures
19	nOther_max_R	Integer	⋯	Maximum number of exposures
20	nOther_min_R	Integer	⋯	Minimum number of exposures
21	rv_corr2_unc_R	Float	km s⁻¹	Uncertainty of the final RVZP correction (σ_Δv)
22	rv_corr0_Rm	Float	km s⁻¹	Initial guess of RVZP correction
23	nStar_fnt_Rm	Integer	⋯	Number of stars in this SEU with Gaia DR2 RVs
24	rv_corr2_Rm	Float	km s⁻¹	Final RVZP correction (Δv)
25	nF1_Rm	Integer	⋯	Number of stars for first term of cost function
26	nF2_Rm	Integer	⋯	Number of stars for second term of cost function
27	nOther_med_Rm	Integer	⋯	Median number of exposures
28	nOther_max_Rm	Integer	⋯	Maximum number of exposures
29	nOther_min_Rm	Integer	⋯	Minimum number of exposures
30	rv_corr2_unc_Rm	Float	km s⁻¹	Uncertainty of the final RVZP correction (σ_Δv)

Notes. The suffixes _B, _R, and _Rm represent results for the blue arm, red arm, and red arm without Hα, respectively. Table 5 is published in its entirety in machine-readable (FITS) format. A portion is shown here for guidance regarding its form and content.

^aThe LMJM is 1440× the local modified Julian date of the beginning of exposure, which is an 8-digit integer assigned to each exposure. (This table is available in its entirety in FITS format.)


Table 6. Selected Candidates of RV Standard Stars from LAMOST MRS DR7 (v1.1)

Index	Label (FITS)	Format	Units	Description
1	ra	Double	deg	R.A. (J2000)
2	dec	Double	deg	Decl. (J2000)
3	rvmed_B	Float	km s⁻¹	Median absolute RV
4	rvstd_B	Float	km s⁻¹	Standard deviation of absolute RVs
5	Nexp_B	Integer	⋯	Number of exposures
6	ts_B	Double	days	Time span
7	rvmed_R	Float	km s⁻¹	Median absolute RV
8	rvstd_R	Float	km s⁻¹	Standard deviation of absolute RVs
9	Nexp_R	Integer	⋯	Number of exposures
10	ts_R	Double	days	Time span

Note. The suffixes _B and _R represent results for the blue arm and red arm, respectively. Table 6 is published in its entirety in machine-readable (FITS) format. A portion is shown here for guidance regarding its form and content.

(This table is available in its entirety in FITS format.)


7. Summary

In this paper, we measure the RVs from LAMOST MRS DR7 stellar spectra and determine the RVZPs with the help of Gaia DR2 RVs, aiming at making the absolute RVs self-consistent and proper for time-domain analysis. More specifically,

1. we have measured the RVs of ∼3.8 million single-exposure spectra for more than 0.8 million stars obtained from LAMOST MRS DR7, including the blue arm and red arm (with and without Hα);
2. we determine the RVZPs exposure by exposure (for 3.6 million spectra) by comparing the measured RVs to those of Gaia DR2 and multiple MRS exposures using a robust method to a mean precision of 0.38 km s⁻¹;
3. we find the RVZPs vary significantly for some spectrographs before/after 2019 May 1, which confirms the utility of our algorithm for determining RVZPs;
4. we find good consistency in the comparisons of our absolute RVs with those of Gaia DR2, APOGEE DR16, RV standard stars (Huang et al. 2018), GALAH DR3, and RAVE DR6, for which the scatter at 50 < S/N < 100 reaches 1.00/1.10, 0.84/0.80, 0.69/0.74, 0.72/0.78, and 1.77/1.88 km s⁻¹ in the blue/red arm, respectively;
5. we show that compared to those of the LAMOST pipeline and Wang et al. (2019), our absolute RVs have 16.5%, 35.5%, and 37.5% better self-consistency at the 50%, 90%, and 95% levels of the CDF of the standard deviations, respectively, which benefits the subsequent time-domain analysis; and
6. we select a set of 10,320 candidate RV standard stars whose standard deviations of RVs are less than 1.45 and 1.86 km s⁻¹ in the blue arm and red arm, respectively, over a time baseline of at least 180 days.

LAMOST MRS DR7 v1.2 and v1.3 have been released. We confirm that in DR7 v1.2/v1.3 the spectra are the same as those in v1.1, while the catalogs and parameters have some minor changes.¹⁶,¹⁷ Therefore, our results can be cross-matched with the v1.2/v1.3 catalogs directly. We will release a new version of the RVs on GitHub once DR8 is released. In addition, since the RVs in Gaia eDR3 are the same as those in DR2 but with moderate filtering, our absolute RVs should be consistent with Gaia eDR3. In future LAMOST MRS data releases, we will update our RVs using the most recent Gaia RVs as a reference set.

B.Z. thanks Qin Lai, Feng Luo, Dr. Rui Wang, Dr. Hai-Bo Yuan, Prof. Jian-Jun Chen, Prof. Hao-Tong Zhang, Prof. A-Li Luo, and Prof. Jian-Rong Shi for constructive discussions and generous help with the paper. B.Z. also acknowledges support from the LAMOST FELLOWSHIP fund. J.-N.F. acknowledges support from the National Natural Science Foundation of China (NSFC) through grants 11833002, 12090042, and 12090040. This work is supported by the National Key R&D Program of China (No. 2019YFA0405500). C.L. thanks the NSFC for grant No. 11835057. This work is supported by the Chinese Space Station Telescope pre-research projects Key Problems in Binaries and Chemical Evolution of the Milky Way and Its Nearby Galaxies, NSFC (No. U1931209). W.Z. is supported by Fundamental Research Funds for the Central Universities. The authors also thank the reviewer of this paper for the constructive comments and patience during the review process.

The LAMOST FELLOWSHIP is supported by Special Funding for Advanced Users, budgeted and administered by the Center for Astronomical Mega-science, Chinese Academy of Sciences (CAMS-CAS). This work is supported by the Cultivation Project for LAMOST Scientific Payoff and Research Achievement of CAMS-CAS.

Guoshoujing Telescope (LAMOST) is a National Major Scientific Project built by CAS. Funding for the project has been provided by the National Development and Reform Commission. LAMOST is operated and managed by the National Astronomical Observatories, CAS.

Facility: LAMOST.

Software: laspec (Zhang 2020a), regli (Zhang 2020b), berliner (Zhang 2020c), NumPy (van der Walt et al. 2011), SciPy (Virtanen et al. 2020), Astropy (Astropy Collaboration et al. 2013, 2018).

Appendix A: CCF

In this section, we explain our definitions of mean, variance, covariance, and CCF. Let X denote a vector containing N elements {X_i} (e.g., a continuum-normalized spectrum with N pixels); the mean is defined as

$\begin{eqnarray}&&\overline{{\boldsymbol{X}}}=\displaystyle \frac{1}{N}\sum _{i}{X}_{i},\end{eqnarray} \tag{ A1 }$

the variance as

$\begin{eqnarray}&&\mathrm{Var}({\boldsymbol{X}})=\displaystyle \frac{1}{N}\sum _{i}{\left({X}_{i}-\overline{{\boldsymbol{X}}}\right)}^{2},\end{eqnarray} \tag{ A2 }$

and the covariance of two vectors X and Y as

$\begin{eqnarray}&&\mathrm{Cov}({\boldsymbol{X}},{\boldsymbol{Y}})=\displaystyle \frac{1}{N}\sum _{i}\left({X}_{i}-\overline{{\boldsymbol{X}}}\right)({Y}_{i}-\overline{{\boldsymbol{Y}}}).\end{eqnarray} \tag{ A3 }$

A normalized CCF can be calculated with standardized F and G, namely

$\begin{eqnarray}&&\mathrm{CCF}(v| {\boldsymbol{F}},{\boldsymbol{G}})=\displaystyle \frac{\mathrm{Cov}\left({\boldsymbol{F}},{\boldsymbol{G}}(v)\right)}{\sqrt{\mathrm{Var}({\boldsymbol{F}})\mathrm{Var}({\boldsymbol{G}}(v))}},\end{eqnarray} \tag{ A4 }$

where F is one vector and G(v) is another vector shifted by the RV v. When utilizing this CCF to estimate stellar RVs, G(v) is usually a spectral template whose S/N is infinite and that covers the wavelength range of F. Therefore, the shift can be implemented with interpolation. The CCF in this form is essentially the linear correlation coefficient and varies between −1 and 1.
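The definition in Equation (A4) can be illustrated with a short NumPy sketch. It is not the laspec implementation: the function `ccf`, the toy spectrum, and the grids are hypothetical, the Doppler shift is taken in its non-relativistic form λ(1 + v/c) as an assumption, and the template is resampled with simple linear interpolation.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s


def ccf(v, wave_obs, flux_obs, wave_mod, flux_mod):
    """Normalized CCF of Equation (A4) for a trial RV v (km/s)."""
    # Shift the template wavelengths by v and resample the template
    # onto the observed wavelength grid by linear interpolation.
    shifted = np.interp(wave_obs, wave_mod * (1.0 + v / C_KMS), flux_mod)
    f = flux_obs - flux_obs.mean()
    g = shifted - shifted.mean()
    # Cov(F, G) / sqrt(Var(F) Var(G)) with the 1/N definitions (A1)-(A3)
    return np.mean(f * g) / np.sqrt(np.mean(f**2) * np.mean(g**2))


def toy_spectrum(wave):
    """Hypothetical continuum-normalized spectrum with two absorption lines."""
    return (1.0
            - 0.5 * np.exp(-0.5 * ((wave - 5100.0) / 0.5) ** 2)
            - 0.3 * np.exp(-0.5 * ((wave - 5200.0) / 0.4) ** 2))


# Noiseless template, wider than the observed wavelength range
wave_mod = np.linspace(4900.0, 5400.0, 20001)
flux_mod = toy_spectrum(wave_mod)

# Mock observation of the same spectrum shifted by v_true
wave_obs = np.linspace(4950.0, 5350.0, 4001)
v_true = 30.0
flux_obs = toy_spectrum(wave_obs / (1.0 + v_true / C_KMS))

# Evaluate the CCF on a 1 km/s grid and take the maximum
v_grid = np.arange(-100.0, 101.0, 1.0)
ccf_grid = np.array([ccf(v, wave_obs, flux_obs, wave_mod, flux_mod)
                     for v in v_grid])
v_best = v_grid[np.argmax(ccf_grid)]
```

In this noiseless toy case the CCF peaks at the grid point equal to the input shift; with real spectra the peak would normally be refined beyond the grid resolution.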

Appendix B: Bias in Small-number Statistics

Estimators of dispersion tend to be biased low when the number of samples is small. For example, if we only have three or five measurements of a physical quantity, the standard deviation could be underestimated. In this section, we propose an empirical correction of this bias for the error of mean and standard deviation assuming a Gaussian distribution P(x∣μ, σ), where μ is its position and σ is its standard error.

B.1. The Deviation of the Mean

We can minimize an L₁-norm or L₂-norm cost function, namely, ${\sum }_{i}| {x}_{i}-\hat{\mu }|$ or ${\sum }_{i}{\left({x}_{i}-\hat{\mu }\right)}^{2}/2$ , respectively, to get an estimate of the mean $\hat{\mu }$ . The true deviation is by definition

$\begin{eqnarray}&&{\delta }_{\mathrm{true}}=| \hat{\mu }-\mu | .\end{eqnarray} \tag{ B1 }$

However, in practice we do not know μ when we tackle such a problem. A fiducial deviation associated with $\hat{\mu }$ can be constructed using the 84th and 16th percentiles (or the interquantiles; see Lupton 1993; Ivezić et al. 2014), i.e.,

$\begin{eqnarray}&&{\delta }_{\mathrm{est}}=\displaystyle \frac{{\{{x}_{i}\}}_{q84}-{\{{x}_{i}\}}_{q16}}{2\sqrt{N}},\end{eqnarray} \tag{ B2 }$

where N is the sample size. To obtain an empirical relation between δ_est and δ_true, we assume the following form:

$\begin{eqnarray}&&{\delta }_{\mathrm{true}}={\delta }_{\mathrm{est}}/\xi (N),\end{eqnarray} \tag{ B3 }$

where ξ(N) is the empirical correction factor and is a function of N. Then, we draw mock data from a standard Gaussian distribution with the numpy.random module. In each experiment, we draw N samples and calculate δ_est and δ_true, and derive ξ. We repeat this experiment 3000 times for each N, which ranges from 2 to 250 (which is enough for our purposes), and show the 16th, 50th, and 84th percentiles of the results for each N in the left panel of Figure B1. This fiducial deviation is an underestimation of δ_true. The relation between the medians of logξ and logN is fitted with a fifth-order polynomial function, whose coefficients are tabulated in Table B1. With this relation, we can scale the fiducial deviation to a standard that is less affected by the sample size N.
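The mock experiment can be sketched as follows. Two details are assumptions rather than statements from the text: ξ is computed per trial as the ratio δ_est/δ_true before taking percentiles, and the percentiles use NumPy's default linear interpolation. The function name `simulate_xi` is hypothetical.

```python
import numpy as np


def simulate_xi(n, n_trials=3000, estimator=np.mean, seed=0):
    """16th, 50th, and 84th percentiles of xi for samples of size n.

    estimator : np.mean minimizes the L2-norm cost function and
                np.median minimizes the L1-norm cost function
    """
    rng = np.random.default_rng(seed)
    # Mock data from a standard Gaussian (true mu = 0, sigma = 1)
    x = rng.standard_normal((n_trials, n))
    mu_hat = estimator(x, axis=1)
    q16, q84 = np.percentile(x, [16, 84], axis=1)
    delta_est = (q84 - q16) / (2.0 * np.sqrt(n))  # Equation (B2)
    delta_true = np.abs(mu_hat - 0.0)             # Equation (B1)
    # Equation (B3) rearranged: xi = delta_est / delta_true
    return np.percentile(delta_est / delta_true, [16, 50, 84])


# Median xi for a few sample sizes, using the L2-norm estimate of the mean
for n in (2, 5, 20, 100):
    print(n, simulate_xi(n)[1])
```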

**Figure B1.** The empirical correction factor for the error of mean (ξ) and for the standard error (ζ) of the Gaussian distributions in small-number statistics.

Table B1. The Best-fit Coefficients of the Fifth-order Polynomials for the Empirical Relationship between log₁₀ ξ and log₁₀ N

Cost Function	β₅	β₄	β₃	β₂	β₁	β₀
${\sum }_{i}| {x}_{i}-\hat{\mu }|$	0.07349721	−0.60647022	1.97806105	−3.22994084	2.72007585	−0.92812989
${\sum }_{i}{\left({x}_{i}-\hat{\mu }\right)}^{2}/2$	0.08434813	−0.69694429	2.28047526	−3.73821453	3.15612997	−0.99158461

Note. Our definition of a polynomial is poly(x∣β) = ∑_i β_i xⁱ.


B.2. The Standard Error

For a Gaussian distribution P(x∣μ, σ), we can use a sample-based standard deviation and the 16th and 84th percentiles to estimate the true standard error σ, i.e.,

$\begin{eqnarray}&&{\sigma }_{\mathrm{est}}=\sqrt{\displaystyle \frac{\sum _{i}{\left({x}_{i}-\hat{\mu }\right)}^{2}}{N-1}},\end{eqnarray} \tag{ B4 }$

and

$\begin{eqnarray}&&{\sigma }_{\mathrm{est}}=\displaystyle \frac{{\{{x}_{i}\}}_{q84}-{\{{x}_{i}\}}_{q16}}{2},\end{eqnarray} \tag{ B5 }$

respectively. We define ζ by

$\begin{eqnarray}&&{\sigma }_{\mathrm{true}}={\sigma }_{\mathrm{est}}/\zeta (N).\end{eqnarray} \tag{ B6 }$

Repeating the mock experiment of Appendix B.1 for ζ, we find that both estimators underestimate the standard error when N is small (right panel of Figure B1). As in the previous test, we fit a fifth-order polynomial to the relation between the medians of ζ and ${\mathrm{log}}_{10}N$; the best-fit polynomials are shown in the right panel of Figure B1 and the coefficients are tabulated in Table B2. Compared to Equation (B4), Equation (B5) is more robust to outliers but suffers from more significant underestimation when N is small.

Table B2. The Best-fit Coefficients of the Fifth-order Polynomials for the Empirical Relationship Between ζ and log₁₀ N

Estimator	β₅	β₄	β₃	β₂	β₁	β₀
(q₈₄−q₁₆)/2	0.05712779	−0.47050592	1.56888875	−2.75649024	2.71900493	−0.28637106
$\sqrt{\tfrac{1}{N-1}\sum {\left({x}_{i}-\hat{\mu }\right)}^{2}}$	0.08344418	−0.68630504	2.21084202	−3.50529186	2.77854093	0.08576048
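As an illustration of how Table B2 might be used, the sketch below evaluates ζ(N) from the tabulated coefficients and applies Equation (B6) to a small sample. The names `ZETA_COEFFS`, `zeta`, and `corrected_sigma` are hypothetical and are not taken from laspec. Because the table lists the coefficients from β₅ down to β₀, they can be passed to `numpy.polyval` in the same order. The range of N over which the fit was made is not restated here, so the polynomial should not be extrapolated beyond the range shown in Figure B1.

```python
import numpy as np

# Coefficients copied from Table B2, ordered beta_5 ... beta_0.
ZETA_COEFFS = {
    "percentile": [0.05712779, -0.47050592, 1.56888875,
                   -2.75649024, 2.71900493, -0.28637106],
    "std": [0.08344418, -0.68630504, 2.21084202,
            -3.50529186, 2.77854093, 0.08576048],
}


def zeta(n, estimator="std"):
    """Evaluate the empirical factor zeta(N) = poly(log10 N | beta)."""
    return np.polyval(ZETA_COEFFS[estimator], np.log10(n))


def corrected_sigma(x, estimator="std"):
    """Estimate sigma from a sample and correct it with Equation (B6)."""
    x = np.asarray(x, dtype=float)
    if estimator == "std":
        # Equation (B4), assuming the sample mean as the estimate of mu.
        sigma_est = x.std(ddof=1)
    else:
        # Equation (B5), using NumPy's default percentile interpolation.
        q16, q84 = np.percentile(x, [16, 84])
        sigma_est = 0.5 * (q84 - q16)
    return sigma_est / zeta(x.size, estimator)


if __name__ == "__main__":
    sample = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up values for illustration
    print(zeta(10, "std"), zeta(10, "percentile"))
    print(corrected_sigma(sample, "std"))
```

Since ζ < 1 for both estimators, the corrected value is larger than the raw estimate, and the difference shrinks as N grows.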


Appendix C: Several Related Python Packages

Three Python packages were developed in this work.

1. laspec (Zhang 2020a): A toolkit for LAMOST MRS/LRS spectra, including modules for file IO, spectral convolution, continuum normalization, removal of cosmic rays, CCFs, and empirical correction evaluation (Appendix B).
2. regli (Zhang 2020b): The Regular Grid Linear Interpolator, a multidimensional linear interpolator based on gridded data. It was faster than scipy.interpolate.LinearNDInterpolator from the SciPy library in our performance test.
3. berliner (Zhang 2020c): A toolkit for manipulating MIST (Dotter 2016) and PARSEC (Bressan et al. 2012) stellar evolutionary tracks and isochrones, including a Python interface for downloading PARSEC isochrones from the CMD 3.4 website (http://stev.oapd.inaf.it/cgi-bin/cmd).

The source code and some tutorials of these packages can be found at https://github.com/hypergravity. Readers who are interested in LAMOST MRS spectra might find them useful for their research.
