Binary Companions of Evolved Stars in APOGEE DR14: Search Method and Catalog of ∼5000 Companions

Adrian M. Price-Whelan; David W. Hogg; Hans-Walter Rix; Nathan De Lee; Steven R. Majewski; David L. Nidever; Nicholas Troup; José G. Fernández-Trincado; Domingo A. García-Hernández; Penélope Longa-Peña; Christian Nitschelm; Jennifer Sobeck; Olga Zamora

doi:10.3847/1538-3881/aac387

1. Introduction

Time-domain radial velocity measurements of stars contain information about massive companions. Even with two successive observations of a single star, a difference in the measured radial velocities implies the existence of at least one companion. However, with few or imprecise radial velocity (RV) measurements, the orbital properties of the companion(s) may be poorly constrained (e.g., Price-Whelan et al. 2017). The vast majority of spectroscopic targets with repeat observations in the largest (by number of objects) stellar spectroscopic surveys are observed just a few times with sparse, non-uniform phase coverage. Most prior searches for companions using survey RV data have therefore restricted their searches to only sources with many, high-precision epochs, so that the orbital solution can be unambiguously determined (e.g., Troup et al. 2016), or have used simple statistics computed from the data to study multiplicity (e.g., RV_max; Badenes et al. 2018).

If there are only a few radial velocity epochs per star, and if the companion spectrum is not observed, the data will be consistent with many different combinations of primary orbital parameters (period, amplitude, eccentricity, etc.). To identify companions to the typical star observed in a spectroscopic survey, we therefore face at least one major challenge: how, given a small number of observations of a primary star, do we reliably obtain posterior information about the properties of the binary system? In general, the likelihood function—and the posterior probability distribution function (pdf) under any reasonable prior pdf —will be highly multimodal, and many of the modes will have comparable integrated probability density. For example, with just two radial velocity measurements, a harmonic series of period modes will exist in the likelihood function.

We have solved the problem of deriving comprehensive multimodal pdf samplings previously with The Joker (Price-Whelan et al. 2017), a Monte Carlo rejection sampler that is computationally expensive but probabilistically righteous: it delivers independent posterior pdf samples for single-companion binary orbital parameters, given any number of radial velocity measurements. Here we use The Joker to generate posterior pdf samples for stars observed by the APOGEE survey (see Section 2; Majewski et al. 2017).

The APOGEE surveys primarily target red giant branch (RGB) and other evolved stars (e.g., red clump giants, RC), which are ideal for the study of single-line binary systems. In general, they are unlikely to have comparably bright companions, and their spectra are therefore fit as single-line objects. When this constraint is not met, The Joker will in general fail, and a model that fits for a mixture of stellar spectra is more appropriate (e.g., El-Badry et al. 2018). The subset of RC stars are even more powerful because they are standard candles and have masses that can be estimated using spectroscopy (using dredged-up elements; Martig et al. 2016; Ness et al. 2016). With mass estimates for the primary star, the binary-orbit fitting will return M₂ sin i (minimum mass) estimates for the secondary, and not just estimates of the so-called "binary mass function." Additionally, the APOGEE pipelines (García Pérez et al. 2016) and also The Cannon (Ness et al. 2015) produce detailed abundance estimates for RGB and RC stars. If there are causal relationships between chemical abundances and binary companions—as are expected—these should be measurable. By making cuts on this library of posterior pdf samples (described in detail in Section 5), we deliver catalogs of binary star systems from the APOGEE survey, and show the bulk properties of these systems.

Binary and multiple star systems are of great interest in astrophysics: the population of stars and their companions encodes information about star formation processes, stellar parameters and evolution, and the dynamics of multi-body systems (for recent reviews, see Duchêne & Kraus 2013; Moe & Di Stefano 2017). Most of what is known about stellar companions comes from studies of nearby main-sequence (MS) stars (e.g., Duquennoy et al. 1991; Raghavan et al. 2010; Tokovinin 2014; Moe & Di Stefano 2017). Nearly 50% of MS stars in the solar neighborhood have companions (e.g., Tokovinin 2014). MS stars with companions have a large dynamic range of constituent and orbital characteristics. For example, binary stars have mass ratios that span from ≈0.03 to 1 (e.g., Kraus et al. 2008), and have periods from days to millions of years (e.g., Raghavan et al. 2010). Less is known about population properties of non-interacting or detached companions to evolved stars. The catalogs and methodology presented in this work are a first step toward performing population inferences of binary star systems with evolved star members.

2. Data

All data used in this work come from the publicly available data release 14 (DR14) of the APOGEE survey (Abolfathi et al. 2018; Majewski et al. 2017), a component of the Sloan Digital Sky Survey IV (SDSS-IV; Gunn et al. 2006; Blanton et al. 2017). APOGEE is designed to map stars across much of the Milky Way by obtaining high-resolution (R ∼ 22,500) infrared (H-band) spectroscopy of primarily RGB stars. Targets are selected with simple color and brightness cuts, but the survey uses fiber-plugged plates with a maximum of 300 fibers per ≈1.5 deg² field of view, leading to "pencil-beam"-like sampling of the Galactic stellar distribution. In order to meet requirements on signal-to-noise ratio, most APOGEE stars are observed multiple times in a series of "visits," typically with at least one visit separated from others by a month or more in order to help identify binary stars.

Data taken as part of the APOGEE survey are reduced with a multi-step data reduction pipeline that ultimately solves for the stellar parameters, chemical abundances, and radial velocities for each target (Nidever et al. 2015). Most relevant for this work, the visit radial velocities are determined using an iterative scheme: the individual visit spectra are combined using initial guesses for the relative RVs into a coadded spectrum, which is then used to re-derive the relative visit velocities. The stellar parameters—surface gravity, $\mathrm{log}g$ , and effective temperature, ${T}_{\mathrm{eff}}$ —and the chemical abundances are determined from the coadded spectrum as a part of the APOGEE Stellar Parameters and Chemical Abundances Pipeline (ASPCAP; García Pérez et al. 2016).

We use the primary data products from APOGEE DR14 (i.e., the allStar and allVisit files), which contain 258,475 unique source IDs (APOGEE_ID) and 1,054,381 unique visits. We select all stars with ≥3 visits that each pass a set of quality cuts, described below. For each visit, we require that the uncertainty in the visit velocity is $\lt 100\,\mathrm{km}\,{{\rm{s}}}^{-1}$ (VRELERR) and the following bits are not set in the STARFLAGS bitmask: PERSIST_HIGH, PERSIST_JUMP_POS, PERSIST_JUMP_NEG, VERY_BRIGHT_NEIGHBOR, LOW_SNR. For each star, we require that $0\lt \mathrm{log}g\lt 4$ and the following bits are not set in the ASPCAPFLAGS bitmask: STAR_BAD. After these cuts, and the requirement of ≥3 visits for a given star, we are left with 397,559 visits for 96,231 unique sources. Figure 1 shows the number of stars in several bins of number of visits that pass the above quality cuts: 92% of the stars in our parent sample have <8 visits. Figure 2 shows the stellar parameters of all 96,231 stars in the parent sample used in this work.

**Figure 1.** Number of APOGEE DR14 stars in logarithmic bins of number of visits that pass the quality cuts described in Section 2. In total, this work uses 397,559 visits and 96,231 unique sources.
Download figure:
Standard image High-resolution image

**Figure 2.** Left panel: number of stars in bins of iron abundance, [Fe/H], that pass the quality cuts described in Section 2. Right panel: distribution of stars in our sample in stellar parameters log(surface gravity), $\mathrm{log}g$ , and effective temperature, ${T}_{\mathrm{eff}}$ , with points colored by the iron abundance.
Download figure:
Standard image High-resolution image

**Figure 2.** Left panel: number of stars in bins of iron abundance, [Fe/H], that pass the quality cuts described in Section 2. Right panel: distribution of stars in our sample in stellar parameters log(surface gravity), $\mathrm{log}g$ , and effective temperature, ${T}_{\mathrm{eff}}$ , with points colored by the iron abundance.
Download figure:
Standard image High-resolution image

3. Methods

3.1. Orbit Inference and Velocity Modeling

For every source in the sample of APOGEE stars defined in Section 2, we obtain a posterior sampling in binary-system parameter space, treating it as a single-lined (SB1) spectroscopic binary system with a single companion. This sampling is performed with The Joker (Price-Whelan et al. 2017) under a relatively uninformative prior pdf, and the resulting posterior samplings are used to discover and characterize individual binary star systems and generate a catalog of companions. We now describe the assumptions and method used to generate samplings for individual systems:

no multiple companions: All variations in radial velocity of the primary star are induced by a single companion. This is motivated by the idea that triple star systems are usually hierarchical so that the period of the inner binary is typically much shorter than the orbital period of the outer body (e.g., Tokovinin 2018). At present, we ignore the possibility of higher-order multiple systems.
Kepler: All variations in velocity of the primary are gravitational, and we therefore ignore the possibility of coherent intrinsic variation from, e.g., stellar oscillations.
SB1: All spectra are single-lined; that is, we assume that the secondary is significantly fainter and is thus undetected in the spectra. This assumption is motivated by the fact that we expect RGB stars to be substantially more luminous than their typical companion. However, there are known main-sequence double-lined binary stars in the APOGEE catalog (El-Badry et al. 2018), and an expected but unknown fraction of RGB–RGB binaries.
simple noise model: Measurements are unbiased and noise estimates are correct up to an unknown excess variance. All noise contributions result in Gaussian uncertainties on the individual radial velocity measurements.

The Joker is a custom-built Monte Carlo sampler designed to produce independent posterior samples in Keplerian orbital parameters, given radial velocity measurements under the assumptions listed above. Our parameterization of the orbital elements is similar to the notation in Murray & Correia (2010): the radial velocity v at time t is given by

$\begin{eqnarray}&&v(t;{\boldsymbol{\theta }})={v}_{0}+K\,[\cos (\omega +f(t;e,P,{M}_{0},{t}_{0}))+e\,\cos \omega ]\end{eqnarray} \tag{ 1 }$

where ${\boldsymbol{\theta }}=(P,e,{M}_{0},\omega ,K,{v}_{0})$ —period, eccentricity, mean anomaly at a reference time t₀, argument of pericenter, velocity semi-amplitude, barycentric velocity—and the true anomaly, f, is a function of time and the specified parameters (see Section 2 of Price-Whelan et al. 2017 or Equation (63) in Murray & Correia 2010). In addition to the orbital parameters listed above, The Joker can also generate samples in an "excess variance" parameter, s², that is added to the per-visit measurement variances. This parameter allows us to test whether the visit RV uncertainties are underestimated For upper RGB stars the inferred excess variance will be a combination of extra systematic uncertainty and true astrophysical surface jitter (e.g., Hekker et al. 2008).

The Joker was designed for the extremely multimodal pdfs expected when the number of radial velocity measurements of a source is small, or the data are sparse (in phase coverage) or noisy. While other Markov chain Monte Carlo (MCMC) methods have difficulty producing independent samples with such data, The Joker succeeds by brute force: after generating an initial (very large) library of prior samples from an assumed prior pdf (see below), the (typically multimodal) likelihood is evaluated at each sample and used to reject all prior samples above the likelihood surface (the rejection sampling step). In practice, given a number of requested samples for each star, the sampling proceeds iteratively: since it is easier to accept samples when the data are sparse or noisy, far more prior sample draws (and thus likelihood evaluations) must occur under very constraining data.

3.2. Individual-system Posterior Samplings

Here we describe the specifics of generating posterior samplings for all of the APOGEE targets in our parent sample. We execute the full procedure twice for different goals (as described in Section 5), and only here outline the key steps in the pipeline.

For all 96,231 APOGEE stars with ≥3 good visits (see Section 2), we use The Joker to generate posterior samplings for each star under the assumptions listed above (see Section 3.1). We start by generating a library of 536,870,912 prior samples generated under a prior similar to that defined in Price-Whelan et al. (2017):

1.
uniform or isotropic in angle parameters,
2.
uniform in log-period over the domain [1, 32768] day,
3.
a beta distribution over eccentricity (using parameters from Kipping 2013).

For the excess variance parameter, s², we use a Gaussian over the transformed parameter $y=\mathrm{ln}{s}^{2}$ with the mean and standard deviation (μ_y, σ_y) indicated where the runs are described (see Section 5). The reference time for each star is set to the first visit epoch; M₀ then becomes the mean anomaly at the first visit observation. Table 1 contains descriptions of all parameters and priors used.

Table 1. Summary and Description of Parameters

Name	Prior	Description
P	$\mathrm{ln}P\sim { \mathcal U }(1,32768)\,\mathrm{day}$	period
e	e ∼ Beta(0.867, 3.03)	eccentricity
t₀	fixed	reference time
M₀	${M}_{0}\sim { \mathcal U }(0,2\pi )\,\mathrm{rad}$	mean anomaly at reference time
ω	$\omega \sim { \mathcal U }(0,2\pi )\,\mathrm{rad}$	argument of pericenter
s²	$\mathrm{ln}{s}^{2}\sim { \mathcal N }({\mu }_{y},{\sigma }_{y}^{2})$	extra variance added to each visit variance
K	${ \mathcal N }(0,{\sigma }_{v}^{2})\,\mathrm{km}\,{{\rm{s}}}^{-1}$	velocity semi-amplitude
v₀	${ \mathcal N }(0,{\sigma }_{v}^{2})\,\mathrm{km}\,{{\rm{s}}}^{-1}$	system barycentric velocity

Note. Beta(a, b) is the beta distribution with shape parameters (a, b), ${ \mathcal U }(a,b)$ the uniform distribution over the domain (a, b), and ${ \mathcal N }(\mu ,{\sigma }^{2})$ is the normal distribution with mean μ and variance σ². For the systemic velocity and semi-amplitude, (v₀, K), we use a broad Gaussian prior that is formally inconsistent between The Joker and follow-up MCMC sampling (see Section 3.2.1): in The Joker we assume Gaussian priors with σ_v much larger than the measurement uncertainty, so they can be neglected ( ${\sigma }_{v}\approx \infty$ ), whereas when running MCMC we fix ${\sigma }_{v}={10}^{3}\,\mathrm{km}\,{{\rm{s}}}^{-1}$ .

Download table as: ASCII Typeset image

We request 256 posterior samples for each source. Depending on the data quality and phase coverage of the visits, The Joker will require different numbers of prior samples in order to rejection-sample down to the requested number of posterior samples: for few-epoch or noisy RV data, many prior samples will pass the rejection step, whereas for very precise or many-epoch RV data, The Joker may need to process the full library of prior samples. We therefore generate the posterior samples using an iterative procedure that adaptively predicts how many prior samples to test for each star. For sources with very constraining data, The Joker may return fewer than the requested number of samples (as few as one sample). When this occurs, we continue sampling either using standard MCMC or by increasing the size of the prior cache and continuing rejection sampling with The Joker.

3.2.1. "Needs MCMC": Following up The Joker with MCMC

If just one posterior sample is returned after exhausting the full library of prior samples, or if multiple (but fewer than 256) are returned that all lie within a single mode of the posterior pdf, the posterior pdf over orbital parameters is treated as effectively unimodal: these stars are flagged "needs MCMC." In this case, we use the location of the returned sample (if only one is returned) or a randomly chosen sample from those returned (if multiple samples are returned within one mode) to generate a small Gaussian ball of initial conditions and use standard MCMC to continue sampling until we obtain 256 samples.

In detail, we use an ensemble MCMC sampling algorithm (Goodman & Weare 2010) implemented in Python (emcee; Foreman-Mackey et al. 2013) to perform the samplings. emcee uses an ensemble of "walkers" (separate Markov chains) that naturally adapt to the geometry of the parameter space by generating proposal steps based on the locations of other walkers (i.e., the ensemble chains are not independent). We transform the standard Keplerian orbital parameters to a safer parameterization, $(\mathrm{ln}P,\sqrt{K}\,\cos ({M}_{0}+\omega )$ , $\sqrt{K}\,\sin ({M}_{0}+\omega )$ , $\sqrt{e}\,\cos \omega$ , $\sqrt{e}\,\cos \omega$ , $\mathrm{ln}{s}^{2},{v}_{0})$ , for MCMC sampling. This reparameterization is safer and more efficient for sampling with emcee, which expects parameters to be components of a vector so that linear operations can be applied (see, e.g., Hogg & Foreman-Mackey 2018); the angle variables (ω, M₀) do not meet this requirement in the standard parameterization. We use the same prior pdfs as in The Joker when running MCMC (see Table 1).

For each star that is flagged "needs MCMC," we run emcee with 1024 walkers for 16,384 steps, take the final walker positions, and downsample at random until we have 256 samples. We compute the Gelman–Rubin convergence statistic, ${\hat{R}}_{j}$ , (Gelman & Rubin 1992) for each parameter j and include these values in the catalogs below when standard MCMC is run. We also provide a binary flag, "converged," for each sampling continued with MCMC that is set to true if

$\begin{eqnarray}&&\mathop{\mathrm{mean}}\limits_{j}({\hat{R}}_{j})\lt 1.1.\end{eqnarray} \tag{ 2 }$

3.2.2. "Needs More Prior Samples": Extending The Joker Sampling

If more than one posterior sample is returned after exhausting the full library of prior samples, and the samples lie in multiple modes of the posterior pdf, the only way to proceed is to generate more prior samples and continue running The Joker: these stars are flagged "needs more prior samples." In this case, we generate another equal-sized library of prior samples (a total of $2\times 536,870,912$ samples) and redo the rejection sampling. We note that because of the way the rejection sampling step is done, this is not equivalent to concatenating the results from a second, independent run of The Joker: the log-likelihood values for all of the prior samples must be used. If at the end of this second run the target still has fewer than 256 samples, the sampling is flagged as "incomplete."

3.3. Null Comparison Sample

We construct a comparison sample of simulated data with no RV variability to assess our false-positive rates in the selections below (see Section 5). We randomly pick 16,384 stars from the parent sample used in this work and replace the visit velocity measurements with simulated data. For each star n, we randomly sample an excess variance parameter value from the prior, s_n. For each visit k, we then sample a new velocity v_nk by drawing from a Gaussian with mean equal to the mean of the real data visit velocities, ${\bar{v}}_{n}$ , and variance equal to the sum of the visit variance (uncertainty), σ_nk, and the excess variance,

$\begin{eqnarray}&&{v}_{{nk}}\sim { \mathcal N }({\bar{v}}_{n},{\sigma }_{{nk}}^{2}+{s}_{n}^{2}).\end{eqnarray} \tag{ 3 }$

When we save the comparison data, we only store the visit uncertainty and "forget" the fact that the simulated data are generated with excess variance (to be inferred with The Joker).

4. Experiment: Infer the Excess Variance Distribution

With posterior samplings for all APOGEE DR14 systems in hand, and as an initial use of the per-source posterior samplings, we use a hierarchical Bayesian model to infer the parameters of the (assumed Gaussian) prior over the log-excess-variance parameter, (μ_y, σ_y) (see Table 1). This inference serves as a test case for future work, where we intend to use the independent posterior samplings to construct a hierarchical inference over properties of companion populations. This is also a test of the visit velocity uncertainties reported in the APOGEE data products: if the catalog uncertainties are significantly underestimated, we expect the inferred log-excess-variance distribution parameters to tend toward larger values.

In detail, we maximize the marginal likelihood of the population-level parameters (μ_y, σ_y). We compute this marginal likelihood using the per-object posterior samples reweighted by the ratio of the value of the hyperprior evaluated at a given, new set of parameters (μ_y, σ_y) to the value of the default prior at the previously assumed values, (μ_y,0, σ_y,0) (see above). This trick has been used in other hierarchical inferences as a way to marginalize over the per-object parameters to infer population-level parameters (Hogg et al. 2010; Foreman-Mackey et al. 2014); we describe how to compute the marginal likelihood in detail in the Appendix.

We execute a full run of The Joker on all 96,231 APOGEE stars in our parent sample using initial values for the excess variance distribution parameters chosen so that $\sqrt{{e}^{{\mu }_{y,0}}}\,\approx 200\,{\rm{m}}\,{{\rm{s}}}^{-1}$ : $({\mu }_{y,0},{\sigma }_{y,0})=(10.6,3)$ in units of ${\rm{m}}\,{{\rm{s}}}^{-1}$ . This run took approximately 300 hours on a computer cluster with 448 cores, with the time dominated by sources with many (≳10) visits. For this initial run, we do not follow up on stars that return <256 samples.

We use 1825 stars with >10 visits and $\mathrm{log}g\gt 2$ (to avoid upper RGB stars that have large intrinsic jitter) and maximize the above likelihood to determine better hyperparameters for the log-excess-variance parameter distribution. Figure 3 shows the distribution corresponding to the maximum-likelihood hyperparameters, ${\alpha }^{* }=({\mu }_{y}^{* },{\sigma }_{y}^{* })=(9.50,1.64)$ . These values are consistent with the estimated systematic floor of the visit velocity uncertainties of ≈100– $200\,{\rm{m}}\,{{\rm{s}}}^{-1}$ estimated from stars observed multiple times and on multiple plates (Nidever et al. 2015).

**Figure 3.** Inferred prior over excess variance parameter ( $y=\mathrm{ln}{s}^{2}$ ) using posterior samples for 1825 lower-RGB stars with $\gt 10$ visits. Left and right panels show the distribution corresponding to the maximum-likelihood parameters (μ_y, σ_y) = (9.50, 1.64) on log and linear scales, respectively.
Download figure:
Standard image High-resolution image

**Figure 3.** Inferred prior over excess variance parameter ( $y=\mathrm{ln}{s}^{2}$ ) using posterior samples for 1825 lower-RGB stars with $\gt 10$ visits. Left and right panels show the distribution corresponding to the maximum-likelihood parameters (μ_y, σ_y) = (9.50, 1.64) on log and linear scales, respectively.
Download figure:
Standard image High-resolution image

However, there are a number of caveats to keep in mind about this estimate of the excess variance distribution. First, we do not remove triple or other multiple systems: for any individual system, the variations in radial velocity induced by other massive bodies will lead to larger preferred values for the excess variance parameter. We do not expect there to be a large number of triple systems in our sample, but this is still an important consideration for future efforts. Second, we are sensitive to outliers and very non-Gaussian systematic error distributions. We later assess the non-Gaussianity of the visit velocity uncertainties by looking at the visit velocity residuals away from the orbit samples produced by The Joker using the updated excess variance distribution (see Section 7.2).

5. Companion Catalogs

Using the most likely hyperparameters derived from the initial posterior samplings (see Section 4), we update and fix the excess variance prior distribution parameters— $({\mu }_{y},{\sigma }_{y})\,=(9.50,1.64)$ , in units of m s⁻¹—and rerun The Joker on the parent sample of APOGEE DR14 stars. This approach of estimating a prior pdf from the data and then fixing the parameters is typically referred to as "empirical Bayes"; in this case, it is an approximation to doing a hierarchical inference of the individual system orbital parameters and the excess variance distribution hyperparameters simultaneously.

From this run, 91,096 stars completed and successfully returned 256 samples using The Joker alone. The remaining 5135 stars did not return 256 samples: 4744 were flagged as "needs more prior samples", 391 as "needs MCMC." We then proceed to generate 256 samples for these two subsets using the methodology explained above (see Section 3.2). The full catalog of posterior samples for all APOGEE stars in our parent sample is available online.¹⁷

We also run on the null comparison sample (see Section 3.3) with the same parameters. From the comparison sample run, 13 "stars" are flagged as "needs MCMC," and 1023 are flagged as "needs more prior samples." For these comparison sample stars, we do not continue sampling with The Joker or MCMC and only use the (<256) posterior samples returned from The Joker.

5.1. Stars with Companions

We do not expect a sharp transition in inferred orbital parameters between stars with and without companions: there is a continuum of companion properties. For example, the companions can have low mass or long periods, or be at high inclination, all of which will make the velocity semi-amplitude, K, small for a given system. Therefore, there is no simple cut on radial velocity data alone that would select a complete sample of stars with companions. It is nevertheless possible to define a cut that selects stars with high-confidence companions.

We select a sample of stars that confidently have companions using percentiles computed from the log of the posterior samples in the velocity semi-amplitude parameter, $\mathrm{ln}K$ . Figure 4 shows the distribution of first percentiles of $\mathrm{ln}K$ computed for all stars with posterior samplings from The Joker (filled, blue histogram), along with the same for posterior samplings for the null comparison sample (solid, dark line). To select stars with companions, we use a cut in the first percentile of the $\mathrm{ln}K$ samples such that 1% of the comparison sample is selected, i.e., our estimated false-positive rate is ≈1%. We use a threshold of $\mathrm{ln}K=-0.2$ to meet this constraint (vertical, dashed line in Figure 4); 4898 stars pass this cut. For brevity below, we refer to this sample as the "high-K" stars, and the complementary sample as the "low-K" stars.

**Figure 4.** Distribution of first percentiles in $\mathrm{ln}K$ (in units of $\mathrm{km}\,{{\rm{s}}}^{-1}$ ) for the parent APOGEE sample (blue, filled) and for the comparison sample (solid, dark line). The vertical line (dashed, orange) indicates our adopted cut to select high-K stars that likely have companions at $\mathrm{ln}K=-0.2;$ 1% of the comparison sample falls above this cut.
Download figure:
Standard image High-resolution image

Table 2 contains a list of all stars in the parent sample and the value of the first percentile over the $\mathrm{ln}K$ samples for each star. Figures 5 and 6 show eight examples of high-K stars, i.e., stars that likely have companions, with different numbers of visits, N, indicated on the panels. Table 3 contains a list of all APOGEE_IDs for stars in the high-K sample. Figures 7 and 8 shows the same for eight low-K stars, i.e., stars that have either no companions, low-mass companions, or companions at long periods or high inclination. The majority of the stars in the high-K sample have poorly constrained orbital parameters because the posterior pdfs are very multimodal. The high-K sample selected from Table 2 includes all stars in the catalogs described in the next two subsections, but the following two catalogs are mutually exclusive.

**Figure 5.** Examples of stars in our high-K sample, i.e., stars that likely have companions (see Section 5.1), with different numbers of visits. Left panels show the data (black markers, error bars are the visit velocity uncertainties) with 128 random orbits from the 256 posterior samples underplotted (lines, blue). The `APOGEE`_`ID` of each target and the number of visits that pass our quality cuts, N, are indicated on each panel. Right panels show the 256 posterior samples visualized in period–eccentricity space. In most cases, despite confidently having companions based on the RV amplitude, the permitted orbit fits are highly multimodal.
Download figure:
Standard image High-resolution image

**Figure 6.** Continuation of Figure 5.
Download figure:
Standard image High-resolution image

**Figure 7.** Same as Figure 5, but for stars in the low-K sample, i.e. stars that have undetected or no companions.
Download figure:
Standard image High-resolution image

**Figure 8.** Continuation of Figure 7.
Download figure:
Standard image High-resolution image

Table 2. Parent Sample of APOGEE DR14 Stars

`APOGEE`_`ID`	`lnK`_`per`_`1`

2M00000002+7417074	−2.028
2M00000068+5710233	−2.600
2M00000222+5625359	2.134
2M00000446+5854329	−7.199

(96,231 rows)

Note. This table contains APOGEE_IDs for all 96,231 stars in the parent sample used in this work, for which we have posterior samplings in orbital parameters. The second column contains the first percentile computed over the $\mathrm{ln}K$ samples for each source, lnK_per_1: the "high-K" sample (see Section 5.1) is defined using this column, by selecting lnK_per_1 > 0.

(This table is available in its entirety in FITS format.)

Download table as: ASCII Typeset image

Table 3. High-K Stars

`APOGEE`_`ID`
2M00000222+5625359
2M00001071+6258172
2M00001296+5851378
2M00003649+7948407
2M00003987+6236456

(4898 rows)

Note. This table contains APOGEE_IDs for 4898 stars in the "high-K" sample.

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.

Download table as: Data Typeset image

5.2. Companions with Uniquely Determined Orbits

Starting from the high-K sample, we select stars that have posterior period samples that fall within a single mode. We define a period resolution for each star ${\rm{\Delta }}=4{P}_{min}^{2}/(2\pi T)$ , where P_min is the minimum period sample value and T is the epoch span of the data for that star. We consider a sampling to be unimodal when

$\begin{eqnarray}&&{P}_{\max }-{P}_{\min }\lt {\rm{\Delta }}\end{eqnarray} \tag{ 4 }$

(see Section 2 of Price-Whelan et al. 2017); 320 stars pass this cut. Unlike the multimodal stars, the posterior pdfs for these stars can be approximated using point estimates and standard deviations of their respective samplings. We report maximum a posteriori (MAP) sample values for the orbital parameters, along with other computed quantities for all 320 stars in this high-K, unimodal sample. For stars with measured primary masses, M₁, from other work (Ness et al. 2015), we compute the minimum companion mass, ${M}_{2,\min }$ , and include both masses in this catalog. We additionally join with the APOGEE DR14 allStar catalog and provide all columns from this catalog for convenience. Table 4 contains descriptions and units for all columns in the high-K, unimodal sample catalog, available online.¹⁸

Table 4. High-K, Unimodal Systems

Column name	Unit/format	Description
`APOGEE`_`ID`		identifier used by APOGEE
`P`	days	P, period
`P`_`err`	days
`M0`	rad	M₀, phase at reference epoch
`M0`_`err`	rad
`e`		e, eccentricity
`e`_`err`
`omega`	rad	ω, argument of pericenter
`omega`_`err`	rad
`jitter`	km s⁻¹	s, excess variance parameter
`jitter`_`err`	km s⁻¹
`K`	km s⁻¹	K, velocity semi-amplitude
`K`_`err`	km s⁻¹
`v0`	km s⁻¹	v₀, systemic velocity
`v0`_`err`	km s⁻¹
`t0`	barycentric MJD	t₀, reference epoch
`converged`		binary flag indicating whether the sampling converged
`Gelman–Rubin`		Gelman–Rubin statistic for each MCMC parameter
`M1`	M_⊙	primary mass estimate (Ness et al. 2015)
`M1`_`err`	M_⊙
`M2`_`min`	M_⊙	M_2,min, minimum M₂ mass
`M2`_`min`_`err`	M_⊙
`clean`_`flag`		[0 = good, 1 = suspicious, 2 = bad], score from by-eye vetting
`q`_`min`		minimum mass ratio
`q`_`min`_`err`
`R1`	${R}_{\odot }$	radius of the primary star
`R1`_`err`	${R}_{\odot }$
`a`_`sini`	au	projected separation
`a`_`sini`_`err`	au
`a2`_`sini`	au	projected semimajor axis of the companion orbit
`a2`_`sini`_`err`	au
`DR14RC`	`True`/`False`	if the star is included in the APOGEE DR14 red clump catalog
`TINGRC`	`True`/`False`	if the star is included in red clump catalog of Ting et al. (2018)
⋮		all columns from Ness et al. (2015)
⋮		all columns from APOGEE DR14 `allStar` file

(320 rows)

Note. Description of the data table containing summary information for all stars in the high-K, unimodal sample: stars that likely have companions and have well-determined orbital parameters. All orbital parameter values are from the maximum a posteriori (MAP) posterior sample (either from The Joker or from emcee). All columns ending in _err are estimates of the standard deviation of the posterior samples, σ, computed using the median absolute deviation, MAD, as σ ≈ 1.5 × MAD.

(This table is available in its entirety in FITS format.)

Download table as: ASCII Typeset image

We have also visually inspected the inferred orbits for all systems in this sample and have flagged systems with questionable or invalid fits. The value of the flag is: 0, when the orbits look reasonable; 1, when the orbits look reasonable but the value of the inferred excess variance is large; and 2, when the orbits are clearly poor fits. Many of the stars flagged as "2" look like they may be triple systems, because they tend to have variations in radial velocity over two distinct timescales that are never well-fit by a single Keplerian orbit. The stars flagged as "1" tend to be either upper RGB stars, where astrophysical surface jitter can lead to RV modulations, or SB2 systems, where absorption lines from the secondary confuse the RV pipeline and lead to strange RV signals. This flag is included in the catalog (Table 4) as the column clean_flag; to select a clean sample of companions that have been successfully vetted by eye, select only stars with clean_flag == 0.

5.3. Companions with Highly Constrained Orbits

A subset of the high-K sample (Section 5.1) have multimodal posterior distributions over orbital parameters that are limited to a few qualitatively different solutions. For these stars, one or a few more RV measurements would likely lead to uniquely determined orbital parameters, making these stars prime candidates for follow-up efforts. As examples of such cases, we include an additional catalog of systems that have effectively bimodal posterior samplings in orbital period. We identify these systems using k-means clustering (Lloyd 1982) of the posterior samples in period. In detail, for all stars in the high-K sample, we compute $\mathrm{ln}P$ for all period samples, then use k-means clustering with k = 2 as implemented in the scikit-learn package (Pedregosa et al. 2011) to identify two clusters of samples. We initialize the cluster positions at either end of the range of possible periods. For each cluster, we then ask whether the samples assigned to that cluster are unimodal (Equation (4)). If the samples in each respective cluster are unimodal, we call the sampling "bimodal"; 106 systems are identified as bimodal.

Figure 9 shows a few examples of systems that meet this criterion. In many of these cases, there are certain times at which a future observation would be far more informative. Visually, those times correspond to periods when the predicted RVs have large variance based on the posterior samples at hand. For example, in the bottom left panel of Figure 9, an observation at t = 60 days would rule out one of the two possible classes of orbits.

Table 5 contains a list of all APOGEE targets with bimodal samplings identified in this way. The table contains two rows for each source, with the values of period, eccentricity, and velocity semi-amplitude from the MAP sample from each period mode. When available, this table also includes an estimate of the primary mass, M₁, from Ness et al. (2015) and an estimate of the minimum companion mass, M_2,min, for each mode.

Table 5. High-K, bimodal systems

`APOGEE`_`ID`	`P`	`e`	`K`	`M₁`	`M₂`_,`_min`	`clean`_`flag`
	(days)		(km s⁻¹)	(M_⊙)	(M_⊙)
2M21362657-0017579	14.639	0.1171	30.68	⋯	⋯	0
2M21362657-0017579	2.3049	0.0026	28.69	⋯	⋯	0
2M03180303-0004215	35.553	0.1302	4.541	1.20	0.20	0
2M03180303-0004215	107.22	0.3192	8.792	1.20	0.13	0

(210 rows)

Note. This table contains APOGEE_IDs and limited orbital parameter information for all 106 stars identified as having bimodal posterior samplings in orbital period. Each source is listed twice, with MAP values of period, P, eccentricity, e, and velocity semi-amplitude, K, from each period mode of the posterior sampling for the source. When available, this table also includes point-mass estimates of the primary mass, M₁, from Ness et al. (2015), and minimum companion masses, M_2,_min, computed for each period mode.

(This table is available in its entirety in FITS format.)

Download table as: ASCII Typeset image

We again visually inspect all systems in this sample and assign a flag based on the apparent quality of the inferred orbits (see Section 5.2). Again, to select a clean sample of companions from this catalog that have been successfully vetted by eye, select only stars with clean_flag == 0.

6. Results

Here we highlight a few interesting results from visualizing the properties of companions in the unimodal and bimodal samples defined above (see Sections 5.2 and 5.3). In all figures below, we only plot companions with inferred orbits that pass our visual inspection (clean_flag == 0).

6.1. Systems with Unimodal and Bimodal Posterior Samplings

Figure 10 shows the companion and stellar properties of the unimodal and bimodal samples: the upper left panel shows inferred period and eccentricity of the binary orbit, the lower left panel shows inferred period and surface gravity of the primary star, and the lower right panel shows the stellar parameters of the primary star.

The period–eccentricity plot (upper left panel in Figure 10) shows a clear signature of tidal circularization: such plots for main-sequence stars typically show a more distinct circularization period, below which the majority of stars have eccentricities close to zero. Here, we see that close to P ≈ 10 days the majority of companion orbits appear to be circular, but between 10 and 100 days there appears to be a gradual decrease in eccentricity rather than a sharp transition. This is likely because the primary stars in this sample have a large range in surface gravities and therefore a large range in stellar radii (see lower right panel); this result is explored in more detail in a companion work (Price-Whelan & Goodman 2018).

In the lower left panel of Figure 10, the diagonal (thick) line shows the orbital period of a hypothetical massless companion at the surface of a $1.35\,{M}_{\odot }$ primary star (the median of our sample with measured masses) with the given surface gravity. The vertical line in this panel shows the same for a $1.35\,{M}_{\odot }$ primary with $\mathrm{log}g=0$ , i.e., the orbital period at the surface of such a star close to the tip of the RGB. The horizontal line in this panel shows the approximate upper bound of the red clump, above which most stars should be fully convective. Interestingly, there is a dearth of short-period systems for stars above the red clump ( $\mathrm{log}g\lesssim 2.3$ ) in the triangle defined by the three qualitative lines, suggestive of possible companion engulfment as the primary star envelope encases the companion.

The lower right panel of Figure 10 shows the stellar parameters of all primary stars in the unimodal and bimodal samples. Relative to Figure 2, upper RGB stars appear to be under-represented in these companion catalogs, another tentative signature of engulfment or depletion of companions on shorter-period orbits.

6.2. Companion Masses and Mass Ratios

Most of the companions we find have minimum masses ${M}_{2,\min }\lt 1\,{M}_{\odot }$ . The left panel of Figure 11 shows estimates of primary and companion masses for systems with unimodal (circle markers) and bimodal (square markers, blue lines) period samplings for the subset of 69/320 unimodal and 25/106 bimodal systems with primary mass estimates, again using primary masses from Ness et al. (2015). The upper dashed line shows the curve M_2,min = M₁, and the lower dashed line shows the hydrogen-burning limit, ${M}_{2,\min }\,=0.08\,{M}_{\odot }$ . For the systems with bimodal samplings, we compute the minimum companion mass for each period mode and connect these estimates with a line. Here we restrict the selection to stars with $\mathrm{log}g\gt 2$ to avoid upper RGB stars that may have surface oscillations that mimic RV modulations from companions.

**Figure 11.** Left panel: estimated minimum companion mass and primary mass for a subset of the systems in the high-K, unimodal converged sample (31 systems; black circles), unimodal unconverged sample (38 systems; gray circles), and bimodal sample (25 systems; blue squares and lines) with $\mathrm{log}g\gt 2$ and previously measured primary masses (Ness et al. 2015). Lines connecting markers (blue) show estimates from each period mode for each system with bimodal period samplings. The upper dashed line shows the equal-mass curve, and the lower dashed line shows the hydrogen-burning limit. Right panel: ratio of primary stellar radius, R₁, to projected separation of the two bodies, $a\,\sin i$ , and minimum mass ratio, q_min, for the same systems and markers as in the left panel.
Download figure:
Standard image High-resolution image

The right panel of Figure 11 shows the ratio of the primary stellar radius to the projected separation of the two bodies, ${R}_{1}/(a\,\sin i)$ , and the minimum mass ratio, q_min = M_2,min/M₁, for the same systems. Again, circle (black, gray) markers indicate systems from the unimodal sample, and square (blue) markers and lines connect the two estimates for systems with bimodal period samplings. The upper dashed line in this panel indicates the Roche radius, computed using an approximate functional form (Eggleton 1983)

$\begin{eqnarray}&&\displaystyle \frac{R}{a}=\displaystyle \frac{0.49\,{q}^{-2/3}}{0.6\,{q}^{-2/3}+\mathrm{ln}(1+{q}^{-1/3})}\end{eqnarray} \tag{ 5 }$

where $q={M}_{2}/{M}_{1}$ . All of these systems are consistent with being detached binaries, with the exception of one notable system (discussed below).

In Figure 11, a few interesting systems are immediately obvious: from the left panel, two systems have bimodal period samplings in which both period modes put the minimum companion mass below the hydrogen-burning limit (brown dwarf candidates), and two systems have M_2,min > M₁ (neutron star or black hole candidates). In the right panel, one system appears right on the limit of being an interacting binary (q_min ≈ 0.5), but has no significant UV flux (GALEX DR5; Bianchi et al. 2011). These systems are prime candidates for spectroscopic follow-up.

7. Discussion

7.1. Impossible Companions and the Upper RGB

Paradoxically, there appear to exist short-period (≲100 day) companions at low surface gravities ( $\mathrm{log}g\lesssim 1$ ), obvious in the lower left panel of Figure 10: these companions would orbit within the surface of their primary stars. Each of these systems appears fine from the APOGEE data quality flags, but could nonetheless have incorrect stellar parameters. As a test, we would ideally be able to compare the spectroscopic stellar parameters to an independent determination from, e.g., asteroseismology. Unfortunately, none of these short-period, low- $\mathrm{log}g$ stars in the unimodal or bimodal samples appears in the APOKASC (Serenelli et al. 2017) catalog, which provides asteroseismic stellar parameters for a few hundred APOGEE dwarf and giant stars. We therefore instead cross-match all high-K stars with $\mathrm{log}g\lt 1$ and 99% of their period samples below 100 days to the APOKASC catalog: one star appears in both samples, 2M19024490+4419523. In APOGEE DR14, this star has $\mathrm{log}g=0.57$ , but asteroseismic $\mathrm{log}g\gt 4$ (e.g., HUBER_LOGG), consistent with being an M dwarf. Many of these systems could therefore be low-mass dwarf stars with incorrect stellar parameters in APOGEE DR14.

Another possibility is that The Joker interprets semi-coherent surface oscillations (from asteroseismic modes) with sparse time coverage as orbital RV variations (see, e.g., Hekker et al. 2008). However, for RGB stars with $\mathrm{log}g\lt 1$ , these modes would likely have frequencies of ν_max ≈0.5–5 μHz (Garcia & Stello 2018), corresponding to periods of P ≈ 20–2 days and amplitudes of ≈20–200 m s⁻¹ (Huber et al. 2011; Huber 2017). Except for stars with the lowest $\mathrm{log}g$ , these amplitudes would not pass our cut on K (Section 5.1).

These systems would need precise, long-term photometric follow-up to obtain asteroseismic parameters to confirm or explain their existence.

7.2. Assumptions

It is important to keep in mind that the companion catalogs and results presented in this article depend on the assumptions laid out in Section 3.1. We have only considered radial velocity modulations from a single massive companion—no multiple companions. This is a fundamental limitation of our search and methodology. However, stars with multiple companions will likely be included in our companion catalog anyway, but with two-body orbital solutions that either do not fit the data well or pick out the shortest-period orbit.

Related to the above, we assume that all RV variations are gravitational—Kepler. We see tentative evidence for surface oscillations at the upper giant branch (see Figure 10, lower left), which we treat at present as excess variance or intrinsic jitter. Further observations are needed to determine whether these "systems" have incorrect stellar parameters, or indeed show surface oscillations related to asteroseismic modes.

We assume that one star in each two-body system overwhelmingly dominates the luminosity and therefore the spectrum of each system—SB1. As seen in Figure 11, there are some companions with minimum masses comparable to or consistent with being larger than the masses of the observed stars. These systems are either (a) nearly edge-on MS–RGB binaries (i.e., consistent with our assumption), (b) SB2 systems where the APOGEE pipeline failed to flag the source as having broad lines or a bad fit, or (c) RGB–stellar remnant systems with a black hole or neutron star companion. A subset of these systems that look like they fall into category (c) are being followed-up to obtain further RV measurements to test whether any companions are black holes. However, The Joker can and should be extended to support sampling for SB2 systems with just one additional parameter (typically the mass ratio, or the ratio of velocity semi-amplitudes).

Finally, we have assumed that the visit velocity error distribution is Gaussian, and that the visit velocity uncertainties could be underestimated—simple noise model. We can test this assumption using posterior samples from The Joker by computing the residuals away from our best-fit two-body orbital solutions. We find that the normalized residuals appear to be very close to Gaussian over a large dynamic range of normalized residual values, indicating that this assumption may be sufficient. Figure 12 shows the distribution of normalized residuals, R_nk, for each visit: the k visit velocities for each n star, v_nk, minus the predicted radial velocity from the best-fitting sample returned by The Joker, ${\hat{v}}_{{nk}}^{* }$ , normalized by the excess-variance-included uncertainty, ${\sigma }_{{nk}}^{* }=\sqrt{{\sigma }_{{nk}}^{2}+{\hat{s}}_{n}^{2}}$ , where ${\hat{s}}_{n}$ is computed from the excess variance parameter of the best-fitting sample:

$\begin{eqnarray}&&{R}_{{nk}}=\displaystyle \frac{{v}_{{nk}}-{\hat{v}}_{{nk}}^{* }}{\sqrt{{\sigma }_{{nk}}^{2}+{\hat{s}}_{n}^{2}}}.\end{eqnarray} \tag{ 6 }$

**Figure 12.** The histogram shows the distribution of normalized visit residuals (see Equation (6)) for all 397,559 visits used in the final run of *The Joker* (i.e., with updated excess variance distribution parameters inferred in Section 4). The orange curve shows that expected for Gaussian uncertainties. The distribution appears mostly Gaussian over a large range of residual values, with slightly more populated tails and evidence of a few catastrophic outliers.
Download figure:
Standard image High-resolution image

7.3. Comparison to other APOGEE Companion Catalogs

There are at least two other recent catalogs of stellar systems and companions based on APOGEE data.

One of these studies focused on decomposing spectra of MS stars into mixtures of stellar spectra (El-Badry et al. 2018). Conceptually, this method works because (a) for two stars of unequal mass, unexpected absorption lines will appear superimposed on the brighter star's spectrum, and (b) for stars of close to equal mass, the line depths and ratios will not be well-matched by a single stellar model. Using this technique, they identified thousands of candidate MS binaries and trinaries, but did not consider giant stars ( $\mathrm{log}g\gt 4$ ). This sample is therefore complementary to and non-overlapping with the catalog presented in this work.

The other recent catalog searched for stellar and substellar companions to all stars in APOGEE DR12 (including the RGB) that passed a series of quality cuts and had $\geqslant 8$ visits (Troup et al. 2016). For each star in the sample, orbits were fit to the visit RVs using a multi-step orbit-fitting procedure: it starts by identifying significant periods and a few harmonics of those periods, then fits a Keplerian orbit at each of these harmonics using least-squares fitting (De Lee et al. 2013) with a modified χ² statistic that penalizes fits in which the phase coverage of the data is poor. This procedure is not guaranteed to provide a unique orbit solution.

Of the 382 companions released as a part of the search of Troup et al. (2016), only 188 of the host stars passed the stellar parameter and quality cuts used to define the parent sample in this work (see Section 2). We have looked at all of the overlapping stars to compare the previously derived companion orbital properties to the posterior samplings derived with The Joker. We find that the comparisons fall into three categories: (1) the parameters reported in Troup et al. (2016) agree with the posterior samplings, and the period distribution appears unimodal, (2) the parameters reported in Troup et al. (2016) identify one possible mode of a likely multimodal posterior pdf over orbital parameters, and (3) the radial velocity data used in Troup et al. (2016) changed significantly between APOGEE DR12 and DR14, so no meaningful comparisons can be made; roughly 1/3 of the comparison sample falls into each class. Figure 13 shows a few representative cases in which the orbital parameters of Troup et al. (2016) (orbit shown as an orange line in the left panels, parameters shown as orange + markers in the right panels) are consistent with the orbit samples from The Joker. Figure 14 shows a few representative cases in which we find that the posterior pdf over orbital parameters is multimodal, and the orbit of Troup et al. (2016) identifies one of these modes. For completeness, Figure 15 shows two instances in which the orbital parameters from Troup et al. (2016) no longer make sense, likely because the data changed between data releases.

**Figure 13.** Four examples of stars that also appear in the companion catalog of Troup et al. (2016) where our orbit samplings are unimodal and agree well with the previous orbit fits. Left panels show the data from APOGEE DR14 (black markers), the orbit fit (orange, thick line; Troup et al. 2016), and 128 orbits computed from posterior samples generated in this work (blue, thinner lines). Right panels show the period and eccentricity of the orbit fit (orange, + marker; Troup et al. 2016) and the posterior samples from this work (blue markers).
Download figure:
Standard image High-resolution image

**Figure 14.** Same as Figure 13, but for four stars where we find a multimodal posterior sampling. In this cases, the orbit fit from Troup et al. (2016) generally identifies one of the possible period modes, but multiple orbit solutions are typically allowed.
Download figure:
Standard image High-resolution image

**Figure 15.** Same as Figures 13 and 14, but for two stars in which no meaningful comparison can be made because the visit velocities changed significantly between DR12 and DR14.
Download figure:
Standard image High-resolution image

**Figure 16.** Randomly selected examples of high-K stars with effectively unimodal posterior samplings and converged MCMC samplings. Each panel shows the data (black markers), phase-folded at the period of the posterior sample with maximum posterior probability. Visit velocity uncertainties are shown as black error bars (these are typically smaller than or equal to the size of the markers), and the inferred jitter as gray, capped error bars. The line (blue) shows the orbit computed from the posterior sample with maximum posterior probability.
Download figure:
Standard image High-resolution image

7.4. Population Inference

The main motivation for this work was to produce posterior samplings for all APOGEE DR14 stars to be used in a population inference. To use these samplings for population or hierarchical inference, the orbits of each individual system do not have to be unimodal and thus all 96,231 samplings can be used in conjunction without well-determined orbital parameters for the majority of the stars. These samplings will be useful for constraining (through hierarchical inference) the period and eccentricity distributions of evolved stars, and the occurrence rates of companions to evolved stars to test for signatures of engulfment.

8. Conclusions

We have selected a catalog of nearly 5000 stellar systems with companions by making cuts on posterior beliefs about the amplitude of variations in orbital radial velocity for often low-epoch, sparsely sampled radial velocity data. We provide posterior samplings over orbital parameters for all 96,231 in the parent sample of APOGEE DR14 systems used in this work, along with several sub-catalogs with better constrained orbital information:

1.
A catalog of 320 systems with unimodal posterior samplings, and therefore uniquely determined orbits, of which 225 are newly discovered binary star systems.
2.
A catalog of 106 systems with highly constrained posterior samplings that, in period, span two distinct period modes. For these systems, one or a few more radial velocity measurements would uniquely determine their orbits. 90 of these systems are newly discovered binary star systems.
3.
A catalog of 4898 systems with variations in radial velocity consistent with having a companion, but which need further radial velocity measurements to better constrain the orbital properties.

All companion catalogs described in this work (Section 5) are available online.¹⁹

The source code for this project is open source and available from https://github.com/adrn/TwoFace under the MIT open source software license. This paper version was generated with Git commit 0a5146a(2018-04-24).

It is a pleasure to thank Kevin Covey (WWU), Marla Geha (Yale), Marina Kounkel (WWU), Keith Hawkins (Columbia), Melissa Ness (Columbia/Flatiron), David Spergel (Princeton/Flatiron), and Yuan-Sen Ting (Princeton). We thank the anonymous referee for constructive comments that improved this manuscript.

D.W.H. was partially supported by the NSF (grant AST-1517237). H.W.R. acknowledges support by Sonderforschungsbereich SFB 881 (A3) of the German Research Foundation (DFG), and from the European Research Council under the European Unions Seventh Framework Programme (FP 7) ERC Grant Agreement n. [321035]. P.L. was partly funded by "Programa de Iniciación en Investigación-Universidad de Antofagasta." D.A.G.H. and O.Z. acknowledge support provided by the Spanish Ministry of Economy and Competitiveness (MINECO) under grant AYA-2017-88254-P.

Funding for the Sloan Digital Sky Survey IV has been provided by the Alfred P. Sloan Foundation, the U.S. Department of Energy Office of Science, and the Participating Institutions. SDSS-IV acknowledges support and resources from the Center for High-Performance Computing at the University of Utah. The SDSS website is http://www.sdss.org.

SDSS-IV is managed by the Astrophysical Research Consortium for the Participating Institutions of the SDSS Collaboration including the Brazilian Participation Group, the Carnegie Institution for Science, Carnegie Mellon University, the Chilean Participation Group, the French Participation Group, Harvard-Smithsonian Center for Astrophysics, Instituto de Astrofísica de Canarias, The Johns Hopkins University, Kavli Institute for the Physics and Mathematics of the Universe (IPMU)/University of Tokyo, Lawrence Berkeley National Laboratory, Leibniz Institut für Astrophysik Potsdam (AIP), Max-Planck-Institut für Astronomie (MPIA Heidelberg), Max-Planck-Institut für Astrophysik (MPA Garching), Max-Planck-Institut für Extraterrestrische Physik (MPE), National Astronomical Observatories of China, New Mexico State University, New York University, University of Notre Dame, Observatário Nacional/MCTI, The Ohio State University, Pennsylvania State University, Shanghai Astronomical Observatory, United Kingdom Participation Group, Universidad Nacional Autónoma de México, University of Arizona, University of Colorado Boulder, University of Oxford, University of Portsmouth, University of Utah, University of Virginia, University of Washington, University of Wisconsin, Vanderbilt University, and Yale University.

The authors are pleased to acknowledge that the work reported on in this paper was substantially performed at the TIGRESS high performance computer center at Princeton University, which is jointly supported by the Princeton Institute for Computational Science and Engineering and the Princeton University Office of Information Technology's Research Computing department.

Software: Astropy (Astropy Collaboration et al. 2013), emcee (Foreman-Mackey et al. 2013), IPython (Pérez & Granger 2007), matplotlib (Hunter 2007), numpy (Van der Walt et al. 2011), scikit-learn (Pedregosa et al. 2011), scipy (https://www.scipy.org/), schwimmbad (Price-Whelan & Foreman-Mackey 2018), sqlalchemy (https://www.sqlalchemy.org/), thejoker (Price-Whelan & Hogg 2017).

Facilities: - 2.5m Sloan Digital Sky Survey (SDSS) Telescope at Apache Point Observatory (APO) - .

Appendix: Hierarchical Inference of the Excess Variance Parameter

For each n of N RGB stars in APOGEE, we obtain M posterior samples over primary orbital parameters ${\boldsymbol{\theta }}\,=(P,e,\omega ,{M}_{0},K,{v}_{0})$ and the excess variance parameter, $y=\mathrm{ln}{s}^{2}$ , using The Joker. For brevity in expressions below, we will use the vector

$\begin{eqnarray}&&{\boldsymbol{w}}=({\boldsymbol{\theta }},y)\end{eqnarray} \tag{ 7 }$

to represent the full set of parameters. To obtain this sampling, we use an interim (Gaussian) prior on the excess variance parameter parameterized by a mean and standard deviation, i.e. α₀ = (μ_y,0, σ_y,0) as described above. For a given source, the posterior samples in the above parameters are drawn from the distribution

$\begin{eqnarray}&&{{\boldsymbol{w}}}_{m}\sim p({{\boldsymbol{w}}}_{m}\,| \,D,{{\boldsymbol{\alpha }}}_{0})\end{eqnarray} \tag{ 8 }$

where D represents the data for a given object.

We want to compute the likelihood of all data from all N stars, {D_n}, given a new set of hyperparameters ${\boldsymbol{\alpha }}$

$\begin{eqnarray}&&p(\{{D}_{n}\}\,| \,{\boldsymbol{\alpha }})=\displaystyle \prod _{n}^{N}p({D}_{n}\,| \,{\boldsymbol{\alpha }})\end{eqnarray} \tag{ 9 }$

where in the above we have assumed that this likelihood is separable (the data for each source are independent). The per-source marginal likelihood in the above expression is given by

$\begin{eqnarray}&&p({D}_{n}\,| \,{\boldsymbol{\alpha }})=\int d{{\boldsymbol{w}}}_{n}\,p({D}_{n}\,| \,{{\boldsymbol{w}}}_{n})\,p({{\boldsymbol{w}}}_{n}\,| \,{\boldsymbol{\alpha }})\end{eqnarray} \tag{ 10 }$

$\begin{eqnarray}&&=\,\int d{{\boldsymbol{w}}}_{n}\,p({D}_{n}\,| \,{{\boldsymbol{w}}}_{n})\,p({{\boldsymbol{w}}}_{n}\,| \,{\boldsymbol{\alpha }})\,\displaystyle \frac{p({{\boldsymbol{w}}}_{n}\,| \,{D}_{n},{{\boldsymbol{\alpha }}}_{0})}{p({{\boldsymbol{w}}}_{n}\,| \,{D}_{n},{{\boldsymbol{\alpha }}}_{0})}\end{eqnarray} \tag{ 11 }$

$\begin{eqnarray}&&=\,p({D}_{n}\,| \,{{\boldsymbol{\alpha }}}_{0})\,\int d{{\boldsymbol{w}}}_{n}\,\displaystyle \frac{p({{\boldsymbol{w}}}_{n}\,| \,{\boldsymbol{\alpha }})}{p({{\boldsymbol{w}}}_{n}\,| \,{{\boldsymbol{\alpha }}}_{0})}\,p({{\boldsymbol{w}}}_{n}\,| \,{D}_{n},{{\boldsymbol{\alpha }}}_{0}).\end{eqnarray} \tag{ 12 }$

Using the Monte Carlo integration approximation, Equation (12) can be simplified to a sum over prior value ratios of the M posterior samples in the log-excess variance parameter for each n star

$\begin{eqnarray}&&\approx \displaystyle \frac{{{ \mathcal Z }}_{n}}{M}\,\displaystyle \sum _{m}^{M}\displaystyle \frac{p({y}_{{nm}}\,| \,{\boldsymbol{\alpha }})}{p({y}_{{nm}}\,| \,{{\boldsymbol{\alpha }}}_{0})}\end{eqnarray} \tag{ 13 }$

where we have canceled the other priors (over ${\boldsymbol{\theta }}$ ), and all normalization constants appear in the constant scale factor ${{ \mathcal Z }}_{n}$ .

The above expression gives the marginal likelihood of the velocity data for a single source given new hyperparameters ${\boldsymbol{\alpha }}$ . The full marginal likelihood is then the product of these individual likelihoods:

$\begin{eqnarray}&&p(\{{D}_{n}\}\,| \,\alpha )\propto \displaystyle \prod _{n}^{N}\displaystyle \frac{1}{M}\,\displaystyle \sum _{m}^{M}\displaystyle \frac{p({y}_{{nm}}\,| \,\alpha )}{p({y}_{{nm}}\,| \,{\alpha }_{0})}.\end{eqnarray} \tag{ 14 }$

In practice, we evaluate the log-marginal-likelihood

$\begin{eqnarray}&&\mathrm{ln}p(\{{D}_{n}\}\,| \,\alpha )\propto \displaystyle \sum _{n}^{N}\left[\mathrm{ln}\left(\displaystyle \sum _{m}^{M}\displaystyle \frac{p({y}_{{nm}}\,| \,\alpha )}{p({y}_{{nm}}\,| \,{\alpha }_{0})}\right)-\mathrm{ln}M\right]\end{eqnarray} \tag{ 15 }$

$\begin{eqnarray}&&\propto \,\displaystyle \sum _{n}^{N}[\mathop{\mathrm{logsumexp}}\limits_{k}[\mathrm{ln}p({y}_{{nm}}\,| \,\alpha )-\mathrm{ln}p({y}_{{nm}}\,| \,{\alpha }_{0})]-\mathrm{ln}M]\end{eqnarray} \tag{ 16 }$

where logsumexp (the log-sum-exp trick) provides a more stable estimate of the sum in Equation (15).

Binary Companions of Evolved Stars in APOGEE DR14: Search Method and Catalog of ∼5000 Companions

Article metrics

Permissions

Author e-mails

Author affiliations

ORCID iDs

Dates

Abstract

1. Introduction

2. Data

3. Methods

3.1. Orbit Inference and Velocity Modeling

3.2. Individual-system Posterior Samplings

3.2.1. "Needs MCMC": Following up The Joker with MCMC

3.2.2. "Needs More Prior Samples": Extending The Joker Sampling

3.3. Null Comparison Sample

4. Experiment: Infer the Excess Variance Distribution

5. Companion Catalogs

5.1. Stars with Companions

5.2. Companions with Uniquely Determined Orbits

5.3. Companions with Highly Constrained Orbits

6. Results

6.1. Systems with Unimodal and Bimodal Posterior Samplings

6.2. Companion Masses and Mass Ratios

7. Discussion

7.1. Impossible Companions and the Upper RGB

7.2. Assumptions

7.3. Comparison to other APOGEE Companion Catalogs

7.4. Population Inference

8. Conclusions

Appendix: Hierarchical Inference of the Excess Variance Parameter

Footnotes

Binary Companions of Evolved Stars in APOGEE DR14: Search Method and Catalog of ∼5000 Companions

Article metrics

Permissions

Share this article

Author e-mails

Author affiliations

ORCID iDs

Dates

Abstract

1. Introduction

2. Data

3. Methods

3.1. Orbit Inference and Velocity Modeling

3.2. Individual-system Posterior Samplings

3.2.1. "Needs MCMC": Following up The Joker with MCMC

3.2.2. "Needs More Prior Samples": Extending The Joker Sampling

3.3. Null Comparison Sample

4. Experiment: Infer the Excess Variance Distribution

5. Companion Catalogs

5.1. Stars with Companions

5.2. Companions with Uniquely Determined Orbits

5.3. Companions with Highly Constrained Orbits

6. Results

6.1. Systems with Unimodal and Bimodal Posterior Samplings

6.2. Companion Masses and Mass Ratios

7. Discussion

7.1. Impossible Companions and the Upper RGB

7.2. Assumptions

7.3. Comparison to other APOGEE Companion Catalogs

7.4. Population Inference

8. Conclusions

Appendix: Hierarchical Inference of the Excess Variance Parameter

Footnotes