Binary Black Hole Population Properties Inferred from the First and Second Observing Runs of Advanced LIGO and Advanced Virgo

, , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , and

Published 2019 September 9 © 2019. The American Astronomical Society. All rights reserved.
, , Citation B. P. Abbott et al 2019 ApJL 882 L24 DOI 10.3847/2041-8213/ab3800

Download Article PDF
DownloadArticle ePub

You need an eReader or compatible software to experience the benefits of the ePub3 file format.

2041-8205/882/2/L24

Abstract

We present results on the mass, spin, and redshift distributions with phenomenological population models using the 10 binary black hole (BBH) mergers detected in the first and second observing runs completed by Advanced LIGO and Advanced Virgo. We constrain properties of the BBH mass spectrum using models with a range of parameterizations of the BBH mass and spin distributions. We find that the mass distribution of the more massive BH in such binaries is well approximated by models with no more than 1% of BHs more massive than 45 ${M}_{\odot }$ and a power-law index of α = ${1.3}_{-1.7}^{+1.4}$ (90% credibility). We also show that BBHs are unlikely to be composed of BHs with large spins aligned to the orbital angular momentum. Modeling the evolution of the BBH merger rate with redshift, we show that it is flat or increasing with redshift with 93% probability. Marginalizing over uncertainties in the BBH population, we find robust estimates of the BBH merger rate density of R = ${53.2}_{-28.2}^{+55.8}$ Gpc−3 yr−1 (90% credibility). As the BBH catalog grows in future observing runs, we expect that uncertainties in the population model parameters will shrink, potentially providing insights into the formation of BHs via supernovae, binary interactions of massive stars, stellar cluster dynamics, and the formation history of BHs across cosmic time.

Export citation and abstract BibTeX RIS

1. Introduction

The second LIGO/Virgo observing run (O2) spanned 9 months between 2016 November and 2017 August, building on the first, 4-month run (O1) in 2015. The LIGO/Virgo gravitational-wave (GW) interferometer network is composed of two instruments in the United States (LIGO; LIGO Scientific Collaboration et al. 2015; Abbott et al. 2016a) and a third in Europe (Virgo; Acernese et al. 2015), the latter joining the run in the summer of 2017. In total, 10 binary black hole (BBH) mergers have been detected to date (Abbott et al. 2018a). The BBHs detected possess a wide range of physical properties. The lightest so far is GW170608 (Abbott et al. 2017a), with an inferred total mass of ${18.7}_{-0.7}^{+3.3}$ ${M}_{\odot }$. GW170729 (Abbott et al. 2018a)—exceptional in several ways—is likely to be the heaviest BBH to date, having total mass ${85.2}_{-11.2}^{+15.4}$ ${M}_{\odot }$, as well as the most distant, at redshift ${0.48}_{-0.20}^{+0.19}$. Both GW151226 and GW170729 show evidence for at least one BH with a spin greater than zero (Abbott et al. 2016b, 2018a).

By measuring the distributions of mass, spin, and merger redshift in the BBH population, we may make inferences about the physics of binary mergers and better understand the origin of these systems. We employ Bayesian inference and modeling (Gelman et al. 2004; Mandel 2010; Foreman-Mackey et al. 2014; Hilbe et al. 2017; Asensio Ramos 2018), which, when applied to parameterized models of the population, is able to infer population-level parameters—sometimes called hyperparameters to distinguish them from the event-level parameters—while properly accounting for the uncertainty in the measurements of each event's parameters (Hogg et al. 2010; Mandel 2010).

The structure and parameterization of BBH population models are guided by the physical processes and evolutionary environments in which BBHs are expected to form and merge. Several BBH formation channels have been proposed in the literature, each of them involving a specific environment and a number of physical processes. For example, BBHs might form from isolated massive binaries in the galactic field through common-envelope evolution (Bethe & Brown 1998; Portegies Zwart & Yungelson 1998; Belczynski et al. 2002, 2007, 2008, 2014; Voss & Tauris 2003; Dewi et al. 2006; Dominik et al. 2013; Mennekens & Vanbeveren 2014; Spera et al. 2015; Eldridge & Stanway 2016; Mapelli et al. 2017; Stevenson et al. 2017b; Tauris et al. 2017; Chruslinska et al. 2018; Giacobbo & Mapelli 2018; Giacobbo et al. 2018; Kruckow et al. 2018; Mapelli & Giacobbo 2018) or via chemically homogeneous evolution (de Mink & Mandel 2016; Mandel & de Mink 2016; Marchant et al. 2016). Alternatively, BBHs might form via dynamical processes in stellar clusters (Kulkarni et al. 1993; Sigurdsson & Hernquist 1993; Portegies Zwart & McMillan 2000; Grindlay et al. 2006; O'Leary et al. 2006; Ivanova et al. 2008; Sadowski et al. 2008; Downing et al. 2010, 2011; Clausen et al. 2013; Ziosi et al. 2014; Rodriguez et al. 2015, 2016a; Mapelli 2016; Askar et al. 2017; Banerjee 2017; Chatterjee et al. 2017) and galactic nuclei (Antonini & Perets 2012; Antonini & Rasio 2016; Petrovich & Antonini 2017), evolution of hierarchical triple systems (Antonini et al. 2014, 2017; Kimpson et al. 2016; Liu & Lai 2018), gas drag and stellar scattering in accretion disks surrounding supermassive BHs (McKernan et al. 2012; Bartos et al. 2017; Stone et al. 2017). Finally, BBHs might originate as part of a primordial BH population in the early universe (Carr & Hawking 1974; Bird et al. 2016; Carr et al. 2016; Inayoshi et al. 2016; Sasaki et al. 2016; Ali-Haïmoud et al. 2017; Clesse & García-Bellido 2017; Inomata et al. 2017; Ando et al. 2018; Chen & Huang 2018), where their mass spectrum is typically proposed as having power-law behavior, but spanning a much wider range of masses than stellar-mass BHs. Each channel contributes differently to the distributions of the mass, spin, distance, and orbital characteristics of BBHs.

There are several processes common to most pathways through stellar evolution that affect the properties of the resultant BBH system. Examples include mass loss (Vink et al. 2001; Vink & de Koter 2005; Gräfener & Hamann 2008) and supernovae (O'Connor & Ott 2011; Fryer et al. 2012; Janka 2012; Ugliano et al. 2012; Ertl et al. 2016; Sukhbold et al. 2016). The mass of the compact object left after the supernova is directly related to its pre-supernova mass and the supernova mechanism itself. Metallicity has been shown(Kudritzki & Puls 2000; Vink et al. 2001; Brott et al. 2011) to have important effects on stellar mass loss through winds—line-driven winds are quenched in metal-poor progenitors, enabling large BHs to form through direct collapse or post-supernova mass fallback (Heger et al. 2003; Mapelli et al. 2009; Belczynski et al. 2010; Spera et al. 2015). This also, in turn, might suppress supernova kicks (Fryer et al. 2012) and hence enhance the number of binaries that are not disrupted.

Theoretical and phenomenological models of BBH formation are explored by population synthesis. This requires modeling of not only stellar evolution but also the influence of their evolutionary environments. For instance, isolated evolution in galactic fields requires prescriptions for binary interactions, such as common-envelope physics, as well as mass transfer episodes (see reviews in Kalogera et al. 2007; Vanbeveren 2009; Postnov & Yungelson 2014) and, more recently, the effects of rapid rotation (de Mink et al. 2009; Mandel & de Mink 2016; Marchant et al. 2016). Meanwhile, BBH formation in dense stellar clusters (Ziosi et al. 2014; Rodriguez et al. 2015, 2016a; Mapelli 2016; Askar et al. 2017; Banerjee 2017) is impacted primarily by dynamical interactions within the cluster (Fregeau 2004; Morscher et al. 2013), but also by cluster size and initial mass functions (Scheepmaker et al. 2007; Portegies Zwart et al. 2010; Kremer et al. 2019). GW observations provide an alternative to sharpen our understanding of those processes.

Electromagnetic observations and modeling of systems containing BHs have led to speculation about the existence of potential gaps in the BH mass spectrum. Both gaps may be probed using data from current ground-based GW interferometers and as such have been the target of parametric studies. At low masses, observations of X-ray binaries (XRBs) combined via Bayesian population modeling (Bailyn et al. 1998; Özel et al. 2010; Farr et al. 2011b) suggest a minimum BH mass well above the largest neutron star masses. While the existence and nature of this gap are still uncertain (Kreidberg et al. 2012), it is proposed to exist between the most massive neutron stars (Freire et al. 2008; Özel & Freire 2016; Margalit & Metzger 2017; 2.1–2.5 ${M}_{\odot }$) and the lightest BHs (∼5 ${M}_{\odot }$). It is possible to constrain the existence of this lower-mass gap with GW observations (Littenberg et al. 2015; Mandel et al. 2015, 2017; Kovetz et al. 2017). In Section 3, we find that our current GW observations do not inform the upper edge of this gap, inferring a minimum mass on the primary BH at ${m}_{\min }$ ≲ 9 ${M}_{\odot }$. Our volumetric sensitivity to BBH systems with masses less than 5 ${M}_{\odot }$ is small enough that we expect (and observe) no events in the lower gap region. Thus, our ability to place constraints in this region is severely limited.

Recently, there have been claims of an upper cutoff in the BBH mass spectrum based on the first few LIGO detections (Fishbach & Holz 2017; Bai et al. 2018; Talbot & Thrane 2018; Wysocki et al. 2018; Roulet & Zaldarriaga 2019). This might be expected as a consequence of a different supernova type, called the (pulsational) pair-instability supernova (Heger & Woosley 2002; Belczynski et al. 2016a; Spera & Mapelli 2017; Woosley 2017; Marchant et al. 2018). Evolved stars with a helium core mass ≳30 ${M}_{\odot }$ are expected to become unstable because efficient pair production softens their equation of state. For helium core mass ∼30–64 ${M}_{\odot }$, the star undergoes a sequence of pulsations, losing mass until stability is reestablished (Woosley et al. 2007). The enhanced mass loss during pulsational pair instability is expected to affect the final collapse of the star, leading to smaller BH masses. The fate of a star with He core mass ∼64–135 ${M}_{\odot }$ is more dramatic: the entire star is disrupted by a pair-instability supernova, leaving no remnant (Fowler & Hoyle 1964; Barkat et al. 1967; Rakavy & Shaviv 1967). From the combination of pair instability and pulsational pair instability, it is expected that pair-instability supernovae should leave no BH remnants between ∼50 and 150 ${M}_{\odot }$ because the progenitor star is partially or entirely disrupted by the explosion. It is also possible that contributions from the merger of previous merger products—second-generation mergers (O'Leary et al. 2016; Fishbach et al. 2017; Gerosa & Berti 2017; Rodriguez et al. 2018b)—could occupy this gap. Primordial BHs could also span numerous decades of the mass spectrum (Georg & Watson 2017), but their number density in either mass gap is dependent on the behavior of fluctuations in the early universe (Byrnes et al. 2018). Nonetheless, consistent with prior work, we find that all our mass models have almost no merging BHs above ∼45 ${M}_{\odot }$.

Observational constraints on the BBH merger rate (Abbott et al. 2016c, 2018a) generally assume a rate density that is uniform in the comoving volume. As first shown in Fishbach et al. (2018), it is also possible to search for redshift evolution in the rate density using current data. Different redshift-dependent evolutionary behavior is possible (Dominik et al. 2013; Mandel & de Mink 2016; Rodriguez et al. 2016a; Mapelli et al. 2017; Rodriguez & Loeb 2018) with different environments and stellar evolution scenarios (O'Shaughnessy et al. 2010; Belczynski et al. 2016b). For instance, theoretical models of isolated evolution through common envelope lead to a distribution of times to merger $p({t}_{\mathrm{GW}})\propto {t}_{\mathrm{GW}}^{-1}$ (Dominik et al. 2012; Belczynski et al. 2016b). This would imply that many isolated binaries will coalesce near their formation redshift and produce a BBH merger rate that approximately tracks the star formation rate, peaking near z ∼ 2. We find in Section 4 that the current sample of BBH mergers does not provide enough information to confidently constrain any but the most extreme models. While we place more posterior mass on merger rates that increase with increasing redshift than those that decrease, the scenario of a uniform rate in comoving volume is comfortably within our constraints.

BH spin measurements also provide a powerful tool to discriminate between different channels of BBH formation (Mandel & O'Shaughnessy 2010; Abbott et al. 2016d; Rodriguez et al. 2016c; Farr et al. 2017, 2018; Gerosa & Berti 2017; Vitale et al. 2017; Gerosa et al. 2018). For example, BBHs formed in a dynamic environment will have no preferred direction for alignment, producing isotropically oriented spins (Sigurdsson & Hernquist 1993; Portegies Zwart & McMillan 2000; Mandel & O'Shaughnessy 2010; Rodriguez et al. 2015, 2016c; Stone et al. 2017). However, some evidence has been presented for correlation in spin direction due to the natal environment of the progenitor stars within the cluster (Corsaro et al. 2017). In contrast, isolated binaries are expected to preferentially produce mergers with alignment between the spins of the constituent BHs and the orbital angular momentum of the system (Tutukov & Yungelson 1993; Kalogera 2000; Grandclément et al. 2004; Belczynski et al. 2016b; Mandel & de Mink 2016; Marchant et al. 2016; Rodriguez et al. 2016c; O'Shaughnessy et al. 2017; Stevenson et al. 2017b; Gerosa et al. 2018). Other effects occurring in stellar systems like hierarchical triples could also produce a weak preference for certain spin–orbit misalignments (Rodriguez & Antonini 2018). All of our parameterized models point to preferences against high spin magnitudes when the spin tilts are aligned with the orbital angular momentum. In Section 5, we find that the dimensionless spin magnitude inference prefers distributions that decline as the spin magnitude increases from zero, but our ability to distinguish between assumed distributions of spin orientation is very limited.

GW170817, the first binary neutron star merger observed through GW emission (Abbott et al. 2017b), was detected by GW observatories and associated with a short GRB (Abbott et al. 2017c) in 2017 August. A subsequent post-merger transient (AT 2017gfo) was observed across the electromagnetic spectrum, from radio (Alexander et al. 2017), to near-IR/optical (Chornock et al. 2017; Coulter et al. 2017; Cowperthwaite et al. 2017; Nicholl et al. 2017; Pian et al. 2017; Soares-Santos et al. 2017), to X-ray (Margutti et al. 2017; Troja et al. 2017), to γ-ray (Abbott et al. 2017c; Goldstein et al. 2017; Savchenko et al. 2017). Unfortunately, with only one confident detection, it is not yet possible to infer details of binary neutron star populations more than to note that the GW measurement is mostly compatible with the observed Galactic population (Özel et al. 2012). However, if GW170817 did form a BH, it would also occupy the lower-mass gap described previously.

We structure the paper as follows. First, notation and models are established in Section 2. Section 3 describes our modeling of the BH mass distribution, followed by rate distributions and evolution in Section 4. The BH spin magnitude and orientation distributions are discussed in Section 5. We conclude in Section 6. Studies of various systematics are presented in Appendix A. In Appendix B we present additional studies of spin distributions with model selection for a number of zero-parameter spin models and mixtures of spin orientations. To motivate and enable more detailed studies, we have established a repository of our samples and other derived products.184

2. Data, Notation, and Models

In this work, we analyze the population of 10 BBH merger events confidently identified in the first and second observing runs (O1 and O2; Abbott et al. 2018a). We do not include marginal detections, but these likely have a minimal impact on our conclusions here (Gaebel et al. 2019). Ordered roughly from smallest to most massive by source-frame chirp mass, the mergers considered in this paper are GW170608, GW151226, GW151012, GW170104, GW170814, GW170809, GW170818, GW150914, GW170823, and GW170729.

The individual properties of those 10 sources were inferred using a Bayesian framework, with results summarized in Abbott et al. (2018a). For BBH systems, two waveform models have been used, both calibrated to numerical relativity simulations and incorporating spin effects, albeit differently: IMRPhenomPv2 (Hannam et al. 2014; Husa et al. 2016; Khan et al. 2016), which includes an effective representation (Schmidt et al. 2015) of precession effects, and SEOBNRv3 (Pan et al. 2014; Taracchini et al. 2014; Babak et al. 2017), which incorporates all spin degrees of freedom. The results presented in this work use IMRPhenomPv2; we discuss potential systematic biases in our inference in Appendix A. We also refer to Appendix B in Abbott et al. (2018a) for more details on comparisons between those two waveform families.

To assess the stability of our results to statistical effects and systematic error, we focus on one modestly exceptional event. Both GW151226 and GW170729 exhibit evidence for measurable BH spin, but GW170729 in particular is an outlier by several other metrics as well. In addition to spins, it is also more massive and more distant than any of the other events in the catalog. All events used in the population analysis have confident probabilities of astrophysical origin, but GW170729 is the least significant, having the smallest odds ratio of astrophysical versus noise origin (Abbott et al. 2018a). As we describe in Sections 3 and 4, this event has an impact on our inferred merger rate versus both mass and redshift. To demonstrate the robustness of our result, we present these analyses twice: once using every event, and again omitting GW170729.

2.1. Binary Parameters

A coalescing compact binary in a quasi-circular orbit can be completely characterized by its eight intrinsic parameters, namely, its component masses mi and spins ${{\boldsymbol{S}}}_{i}$, and its seven extrinsic parameters: R.A., decl., luminosity distance, coalescence time, and three Euler angles characterizing its orientation (e.g., inclination, orbital phase, and polarization). Binary eccentricity is also a potentially observable quantity in BBH mergers, with several channels having imprints on eccentricity distributions (e.g., Quinlan & Shapiro 1987; Kocsis & Levin 2012; Samsing et al. 2014; Fragione et al. 2018; Rodriguez et al. 2018a). However, our ability to parameterize (Huerta et al. 2014, 2017; Hinder et al. 2018; Klein et al. 2018) and measure (Coughlin et al. 2015; Abbott et al. 2016d, 2017d; Lower et al. 2018) eccentricity is an area of active development. For low to moderate eccentricity at formation, binaries are expected to circularize (Peters 1964; Hinder et al. 2008) before entering the bandwidth of ground-based GW interferometers. We therefore assume zero eccentricity in our models.

In this work, we define the mass ratio as q = m2/m1, where m1 ≥ m2. The frequency of GW emission is directly related to the component masses. However, due to the expansion of spacetime as the GW is propagating, the frequencies measured by the instrument are redshifted relative to those emitted at the source (Thorne 1983). We capture these effects by distinguishing between masses as they would be measured in the source frame, denoted as above, and the redshifted masses, (1 + z) mi, which are measured in the detector frame. Meanwhile, the amplitude of the wave scales inversely with the luminosity distance (Misner et al. 1973). We use the GW measurement of the luminosity distance to obtain the cosmological redshift and therefore convert between detector-frame and source-frame masses. We assume a fixed Planck 2015 (Planck Collaboration et al. 2016) cosmology throughout to convert between a source's luminosity distance and its redshift (Hogg 1999).

We characterize BH spins using the dimensionless spin parameter ${{\boldsymbol{\chi }}}_{i}={{\boldsymbol{S}}}_{i}/{m}_{i}^{2}$. Of particular interest are the magnitude of the dimensionless spin, ${a}_{i}=| {{\boldsymbol{\chi }}}_{i}| $, and the tilt angle with respect to the orbital angular momentum, $\hat{{\boldsymbol{L}}}$, given by $\cos {t}_{i}=\hat{{\boldsymbol{L}}}\cdot {\hat{{\boldsymbol{\chi }}}}_{i}$. We also define an overall effective spin, ${\chi }_{\mathrm{eff}}$ (Damour 2001; Racine 2008; Ajith et al. 2011), which is a combination of the individual spin components along the orbital angular momentum:

Equation (1)

${\chi }_{\mathrm{eff}}$ is approximately proportional to the lowest-order contribution to the GW waveform phase that contains spin for systems with similar masses. Additionally, ${\chi }_{\mathrm{eff}}$ is conserved throughout the binary evolution to high accuracy (Racine 2008; Gerosa et al. 2015).

2.2. Model Features

The current sample is not sufficient to allow for a high-fidelity comparison with models (e.g., population synthesis) that include more detailed descriptions of stellar evolution and environmental influences. As such, we adopt the union of the parameterizations presented in Talbot & Thrane (2017), Fishbach & Holz (2017), Wysocki et al. (2018), Talbot & Thrane (2018), and Fishbach et al. (2018). This allows for better facilitation of comparison between models and the ability to vary the subsets of parameters influencing the mass and spin distributions while leaving others fixed.

The general model family has eight parameters to characterize the mass model, three parameters to characterize each BH's spin distribution, one parameter describing the local merger rate ${{ \mathcal R }}_{0}$, and one parameter characterizing redshift dependence. We refer to the set of these population parameters as θ. All of the population parameters introduced in this section are summarized in Table 1.

Table 1.  Parameters Describing the Binary Black Hole Population

α Spectral index of m1 for the power-law distributed component of the mass spectrum
mmax Maximum mass of the power-law distributed component of the mass spectrum
mmin Minimum BH mass
βq Spectral index of the mass ratio distribution
λm Fraction of binary BHs in the Gaussian component
μm Mean mass of BHs in the Gaussian component
σm Standard deviation of masses of BHs in the Gaussian component
δm Mass range over which BH mass spectrum turns on
ζ Fraction of binaries with isotropic spin orientations
σi Width of the preferentially aligned component of the distribution of BH spin orientations
${\mathbb{E}}[a]$ Mean of the beta distribution of spin magnitudes
Var[a] Variance of the beta distribution of spin magnitudes
λ How the merger rate evolves with redshift

Note. See the text for a more thorough discussion and the functional forms of the models.

Download table as:  ASCIITypeset image

2.3. Parameterized Mass Models

The power-law distribution considered previously (Abbott et al. 2016c, 2017e) modeled the BBH primary mass distribution as a one-parameter power law, with fixed limits on the minimum and maximum allowed BH mass. With our sample of 10 binaries, we extend this analysis by considering three increasingly complex models for the distribution of BH masses. The first extension, Model A (derived from Fishbach & Holz 2017; Wysocki et al. 2018), allows the maximum BH mass ${m}_{\max }$ and the power-law index α to vary. In Model B (derived from Fishbach & Holz 2017; Kovetz et al. 2017; Talbot & Thrane 2018) the minimum BH mass ${m}_{\min }$ and the mass ratio power-law index βq are also free parameters. However, the priors on Models B and C enforce a minimum of 5 ${M}_{\odot }$ on ${m}_{\min }$—see Table 2. Explicitly, the mass distribution in Models A and B takes the form

Equation (2)

where $C\left({m}_{1}\right)$ is chosen so that the marginal distribution is a power law in m1: $p\left({m}_{1}| {m}_{\min },{m}_{\max },\alpha ,{\beta }_{q}\right)={m}_{1}^{-\alpha }$.

Table 2.  Summary of Models Used in Sections 35, with the Prior Ranges for the Population Parameters

  Mass Parameters Spin Parameters
Model α mmax mmin βq λm μm σm δm ${\mathbb{E}}[a]$ Var[a] ζ σi
A [−4, 12] [30, 100] 5 0 0 N/A N/A N/A [0, 1] [0, 0.25] 1 [0, 4]
B [−4, 12] [30, 100] [5, 10] [−4, 12] 0 N/A N/A N/A [0, 1] [0, 0.25] 1 [0, 4]
C [−4, 12] [30, 100] [5, 10] [−4, 12] [0, 1] [20, 50] (0, 10] [0, 10] [0, 1] [0, 0.25] [0, 1] [0, 4]

Note. The fixed parameters are in bold. Each of these distributions is uniform over the stated range. All models in this section assume rates that are uniform in the comoving volume (λ = 0). The lower limit on ${m}_{\min }$ is chosen to be consistent with Abbott et al. (2018a).

Download table as:  ASCIITypeset image

Model A fixes ${m}_{\min }$ = 5 ${M}_{\odot }$ and βq = 0, whereas Model B fits for all four parameters. Equation (2) implies that the conditional mass ratio distribution is a power law with p(qm1) ∝ qβq. When βq = 0, C(m1) ∝ 1/(m1 − ${m}_{\min }$), as assumed in Abbott et al. (2016c, 2017e).

Model C (from Talbot & Thrane 2018) further builds on the mass distribution in Equation (2) by allowing for a second, Gaussian component at high mass, as well as introducing smoothing scales δm, which taper the hard edges of the low- and high-mass cutoffs of the primary- and secondary-mass power law. The second Gaussian component is designed to capture a possible buildup of high-mass BHs created from pulsational pair-instability supernovae. The tapered low-mass smoothing reflects the fact that parameters such as metallicity probably blur the edge of the lower-mass gap, if it exists. Model C therefore introduces four additional model parameters, the mean, μm, and standard deviation, σm, of the Gaussian component; λm, the fraction of primary BHs in this Gaussian component; and δm, the smoothing scale at the low-mass end of the distribution.

The full form of this distribution is

Equation (3)

The factors A, B, and C ensure that the power-law component, Gaussian component, and mass ratio distributions are correctly normalized. S is a smoothing function that rises from 0 at mmin to 1 at mmin + δm as defined in Talbot & Thrane (2018). Θ is the Heaviside step function. Models A, B, and C are displayed with a selection of parameters for demonstration purposes in the left panels of Figure 1.

Figure 1.

Figure 1. Probability distributions for models encoded by Equations (3), (4), and (6) are shown in the left panels, top right panel, and bottom right panel, respectively. In each, the legend indicates the parameter values for the models plotted. In the case of the lower left (q) distribution, we condition the value of m1 = 40 ${M}_{\odot }$ rather than marginalizing for simplicity.

Standard image High-resolution image

2.4. Parameterized Spin Models

The BH spin distribution is decomposed into independent models of spin magnitudes, a, and orientations, t. For simplicity and lacking compelling evidence to the contrary, we assume that both BH spin magnitudes in a binary, ai, are drawn from a beta distribution (Wysocki et al. 2018):

Equation (4)

This distribution is a convenient and flexible parameterization for describing values on the unit interval (Ferrari & Cribari-Neto 2004). Two examples of this distribution are shown in the top right panel of Figure 1. We choose to model the moments of the beta distribution using the mean (${\mathbb{E}}[a]$) and variance (Var[a]), given by

Equation (5)

We adopt a prior on the spin magnitude model parameters, which are uniform over the values of ${\mathbb{E}}[a]$ and Var[a], which satisfy αa, βa ≥ 1. This choice of which values to sample avoids numerically challenging singular spin distributions.

To describe the spin orientation, we assume that the tilt angles between each BH spin and the orbital angular momentum, ti, are drawn from a mixture of two distributions: an isotropic component and a preferentially aligned component, represented by a truncated Gaussian distribution in $\cos {t}_{i}$ peaked at $\cos {t}_{i}=1$ (Talbot & Thrane 2017),

Equation (6)

We choose to parameterize the cosine of the tilt angles, rather than the angles themselves. This choice prompts the selection of a Gaussian (or uniform) model, rather than a wrapped distribution, which would be more appropriate for an angular variable. An example of the Mixture distribution is displayed in the bottom right panel of Figure 1.

The parameter ζ denotes the fraction of binaries that are preferentially aligned with the orbital angular momentum; ζ = 1 implies that all BH spins are preferentially aligned, and ζ = 0 is an isotropic distribution of spin orientations. The typical degree of spin misalignment is represented by the σi. For spin orientations we explore two parameterized families of models:

  • 1.  
    Gaussian (G): ζ = 1.
  • 2.  
    Mixture (M): 0 ≤ ζ ≤ 1.

The Gaussian model is motivated by formation in isolated binary evolution, with significant natal misalignment, while the mixture scenarios allow for an arbitrary combination of this scenario and randomly oriented spins, which arise naturally in dynamical formation.

2.5. Redshift Evolution Models

The previous two subsections described the probability distributions of intrinsic parameters p(ξ) (i.e., masses and spins) that characterize the population of BBHs. In addition, we also measure the value of one extrinsic parameter of the population: the overall merger rate density R. The models described in the previous two subsections assume that the distribution of intrinsic parameters is independent of cosmological redshift z, at least over the redshift range accessible to the LIGO and Virgo interferometers during the first two observing runs (z ≲ 1). However, we consider an additional model in which the overall event rate evolves with redshift. We follow Fishbach et al. (2018) by parameterizing the evolving merger rate density R(z) in the comoving frame by

Equation (7)

where R0 is the rate density at z = 0. In this model, λ = 0 corresponds to a merger rate density that is uniform in comoving volume and source-frame time, while λ ∼ 3 corresponds to a merger rate that approximately follows the star formation rate in the redshift range relevant to the detections in O1 and O2 (Madau & Dickinson 2014). Various BBH formation channels predict different merger rate histories, ranging from rate densities that will peak in the future (λ < 0) to rate densities that peak earlier than the star formation rate (λ ≳ 3). These depend on the formation rate history and the distribution of delay times between formation and redshift. In cases where we do not explicitly write the event rate density as R(z), it is assumed that the rate density R is constant in comoving volume and source-frame time.

The general model family, including the distributions of masses, spins, and merger redshift, is therefore given by the distribution

Equation (8)

where N is the total number of mergers that occur within the detection horizon (i.e., the maximum redshift considered) over the total observing time, Tobs, as measured in the detector frame, θ is the collection of all hyperparameters that characterize the distribution, and d ${V}_{c}$/dz is the differential comoving volume per unit redshift. The merger rate density R(z) is related to N by

Equation (9)

where t is the time in the source frame, so that Equation (8) can be written equivalently in terms of the merger rate density:

Equation (10)

2.6. Hierarchical Population Model

We perform a hierarchical Bayesian analysis, accounting for measurement uncertainty and selection effects (Loredo 2004; Abbott et al. 2016c; Fishbach et al. 2018; Wysocki et al. 2018; Mandel et al. 2019; Mortlock et al. 2018). We model the occurrence rate of events through a Poisson process with a mean dependent on the parameter distribution of the binaries.185 The likelihood of the observed GW data given the population hyperparameters θ that describe the general astrophysical distribution, dN/dξdz, is given by the inhomogeneous Poisson likelihood:

Equation (11)

where μ(θ) is the rate constant describing the mean number of events as a function of the population hyperparameters, Nobs is the number of detections, and ${ \mathcal L }({d}_{n}| \xi ,z)$ is the individual-event likelihood for the nth detection having parameters ξ, z.

In order to calculate the expected number of detections μ(θ), we must understand the selection effects of our detectors. The sensitivity of GW detectors is a strong function of the binary masses and distance and also varies with spin. For any binary, we define the sensitive spacetime volume VT(ξ) of a network with a given sensitivity to be

Equation (12)

where the sensitivity is assumed to be constant over the observing time, Tobs, as measured in the detector frame and $f(z| \xi )$ is the detection probability of a BBH with the given parameter set ξ at redshift z (O'Shaughnessy et al. 2010), averaged over the extrinsic binary orientation parameters (Finn & Chernoff 1993). The factor of 1/(1 + z) arises from the difference in clocks timed between the source frame and the detector frame. For a given population with hyperparameters θ, we can calculate the total observed spacetime volume

Equation (13)

where $p(\xi | \theta )$ describes the underlying distribution of the intrinsic parameters. We performed large-scale simulation runs wherein the spacetime volume in the above equation is estimated by Monte Carlo integration (Tiwari 2018)—these runs are restricted to have no BH less massive than 5 ${M}_{\odot }$. We then use a semianalytic prescription, calibrated to the simulation results, to derive the $\langle {VT}\rangle $θ for specific hyperparameters.

Allowing the merger rate to evolve with redshift, the expected number of detections is given by

Equation (14)

If the merger rate does not evolve with redshift, i.e., R(z) = R0, this reduces to μ(θ) = R0 $\langle {VT}\rangle $θ.

We note that the hyperparameter likelihood given by Equation (11) reduces to the likelihood used in the O1 mass distribution analysis (Equation (D10) of Abbott et al. 2016c), which fit only for the shape, not the rate/normalization of the mass distribution, if one marginalizes over the rate parameter with a flat-in-log prior p(R0) ∝ 1/R0 (Fishbach et al. 2018; Mandel et al. 2019). For consistency with previous analyses, we adopt a flat-in-log prior on the rate parameter throughout this work.

2.7. Statistical Framework and Prior Choices

In practice, we sample the likelihood ${ \mathcal L }({d}_{n}| \xi ,z)$ using the parameter estimation pipeline LALInference (Veitch et al. 2015). Since LALInference gives us a set of posterior samples for each event, we first divide out the priors used in the individual-event analyses before applying Equation (11) (Hogg et al. 2010; Mandel 2010; see Appendix C).

Where not fixed, we adopt uniform priors on population parameters describing the models. Unless otherwise noted, for the event rate distribution we use a log-uniform distribution in R0, bounded between [10−1, 103]. While this is a different form than the priors adopted in Abbott et al. (2018a), we note that similar results are obtained on the rates (see Section 4), indicating that the choice of prior does not strongly influence the posterior distributions. We provide specific limits on all priors when the priors for a given model are introduced. Unless otherwise stated, all posterior credible intervals are 90% intervals, symmetric in the quantiles around the median. The MCMC-based analyses presented in this work have approximately 104 effective samples, after thinning by their autocorrelation time.

The normalization factor of the posterior density in Bayes's theorem is the evidence—it is the probability of the data given the model. We are interested in the preferences of the data for one model versus another. This preference is encoded in the Bayes factor, or the ratio of evidences. The odds ratio is the Bayes factor multiplied by their ratio of the model prior probabilities. In all cases presented here, the prior model probabilities are assumed to be equal, and odds ratios are equivalent to Bayes factors.

We often present the posterior population distribution (PPD) of various quantities. The PPD is the expected distribution of new mergers conditioned on previously obtained observations. It integrates the distribution of values (e.g., ξ, such as the masses and spins) conditioned on the model parameters (e.g., the power-law index) over the posteriors obtained for the model parameters:

Equation (15)

It is a predictor for future merger values ξnew given observed data ξobserved and factors in the uncertainties imposed by the posterior on the model parameters. Note that the PPD does not incorporate the detector sensitivity and therefore is not a straightforward predictor of the properties of future observed mergers.

3. The Mass Distribution

For context, Figure 4 in Abbott et al. (2018a) illustrates the inferred masses for all of the significant BBH observations identified in our GW surveys in O1 and O2. Despite at least moderate sensitivity to total masses between 0.1 and 500 ${M}_{\odot }$, current observations occupy only a portion of the binary mass parameter space. Notably, we have not yet observed a pair of very massive (e.g., 100 ${M}_{\odot }$) BHs, a binary that is bounded away from equal mass in its posterior, or a binary with a component mass confidently below 5 ${M}_{\odot }$. In our survey, we also find a preponderance of observations at higher masses: six with significant posterior support above 30 ${M}_{\odot }$. In this section, we attempt to reconstruct the BBH merger rate as a function of the component masses using parameterized models. Table 2 summarizes the mass models adopted from Section 2.3 and the prior distributions for each of the parameters in those models. We present results for three increasingly general mass and spin models, the most complex of which ranges over the full set of model parameters in Section 2, with the exception of dependence of rate on redshift. The interdependence of the mass and redshift distribution is explored more fully in Section 4.

3.1. Parameterized Modeling Results

Figure 2 shows our updated inference for the compact binary primary mass m1 and mass ratio q distributions for several increasingly general population models. In addition to inferring the mass distribution, all of these calculations self-consistently marginalize over the parameterized spin distribution presented in Section 5 and the merger rate. Figures 3 and 4 show the posterior distribution on selected model hyperparameters.

Figure 2.

Figure 2. Inferred differential merger rate as a function of primary mass, m1, and mass ratio, q, for three different assumptions. For each of the three increasingly complex assumptions A, B, C described in the text we show the PPD (dashed) and median (solid), as well as the 90% symmetric credible intervals (shaded regions), for the differential rate. The results shown marginalize over the spin distribution model. The falloff at small masses in Models B and C is driven by our choice of the prior limits on the ${m}_{\min }$ parameter (see Table 2). All three models give consistent mass distributions within their 90% credible intervals over a broad range of masses, consistent with their near-unity evidence ratios (Table 3); in particular, the peaks and trough seen in Model C, while suggestive, are not identified at high credibility in the mass distribution.

Standard image High-resolution image
Figure 3.

Figure 3. 1D and 2D posterior distributions for the hyperparameters describing Models A and B. Large values of α correspond to a mass distribution that rapidly decays with increasing mass. Large values of β correspond to a mass ratio distribution that prefers equal-mass binaries. Also shown is the 1D posterior distribution for the merger rate discussed in Abbott et al. (2018a) and the stability of Model A to the removal of the GW170729 event.

Standard image High-resolution image
Figure 4.

Figure 4. 1D and 2D posterior distributions for the hyperparameters describing Model C. This model consists of the power-law distribution in Model B with an additional Gaussian component at high mass. The parameters α, β, ${m}_{\max }$, and ${m}_{\min }$ describe the power-law component. The Gaussian has mean μm and standard deviation σm. The fraction of BHs in the Gaussian component is λm. This model also allows for a gradual turn-on at low masses over a mass range δm.

Standard image High-resolution image

If we assume that the BH masses follow a power-law distribution and fix the minimum BH mass to be ${m}_{\min }$ = 5 M (Model A), we find α = ${0.4}_{-1.9}^{+1.4}$ and ${m}_{\max }$ = ${41.6}_{-4.3}^{+9.6}\,{M}_{\odot }$. In Model B we infer the power-law index of the primary mass to be α = ${1.3}_{-1.7}^{+1.4}$ with corresponding limits ${m}_{\min }$ = ${7.8}_{-2.5}^{+1.2}\,{M}_{\odot }$ and ${m}_{\max }$ = ${40.8}_{-4.4}^{+11.8}\,{M}_{\odot }$.

Figure 4, shows the posterior over the population parameters present in Models A and B, as well as a second, Gaussian population parameterized with ${m}_{\max }$ and σm. λm is the mixing fraction of binaries in the Gaussian population versus the power law, with λm = 0 indicating only the power-law component. The Gaussian component is centered at μm = ${29.8}_{-7.3}^{+5.8}\,{M}_{\odot }$, has a width σm = ${6.4}_{-4.2}^{+3.2}\,{M}_{\odot }$, and is consistent with the parameters of the seven highest-mass events in our sample as seen in Figure 5. Also as a consequence of this mixture, the second component can account for many of the high-mass events. Without needing to accommodate higher-mass events, the inferred power law is much steeper (α = ${7.1}_{-4.8}^{+4.4}$) than Model A or Model B; however, the posterior distribution for Model C is less informative for α ≳ 4. This in turn means that we cannot constrain the parameter ${m}_{\max }$ in Model C since the power-law component has negligible support above ∼45 ${M}_{\odot }$ (see the top panel of Figure 2). In the intermediate regime, ∼15–25 M, Model C infers a smaller rate than Model A or Model B as a consequence of the steeper power-law behavior. The low-mass smoothing allowed in this model also weakens constraints we can place on the minimum BH mass; in this model we find ${m}_{\min }$ = ${6.9}_{-2.8}^{+1.7}\,{M}_{\odot }$. All three models produce consistent results for the marginal merger rate distribution, as is further discussed in Section 4.

Figure 5.

Figure 5. Left panel: compact-object masses (mCO) from GW detections in O1 and O2, with the black squares and error bars representing the component masses of the merging BHs and their uncertainties, and red triangles representing the mass and associated uncertainties of the merger products. The horizontal green line shows the 99th percentile of the mass distribution inferred from the Model B PPD. Right panel: predicted compact-object mass as a function of the zero-age main-sequence mass of the progenitor star (mZAMS) and for four different metallicities of the progenitor star (ranging from Z = 10−4 to Z = 2 × 10−2; Spera & Mapelli 2017). This model accounts for single stellar evolution from the PARSEC stellar evolution code (Bressan et al. 2012), for core-collapse supernovae (Fryer et al. 2012), and for pulsational pair-instability and pair-instability supernovae (Woosley 2017). The shaded areas represent the lower- and upper-mass gaps. There is uncertainty as to the final product of GW170817. It is shown in the left panel to emphasize that BNS mergers might fill the lower gap.

Standard image High-resolution image

All models feature a parameter ${m}_{\max }$, which defines a cutoff of the power law. However, the interpretation of that parameter within Model C is not a straightforward comparison with Models A and B, due to the presence of the Gaussian component at high mass and the large value of the power-law spectral index. Instead, to compare those two features, we compute the 99th percentile of the mass distribution inferred from the model PPDs (see Equation (15)). Model A obtains 44.0 M, Model B obtains 41.8 M, and Model C obtains 41.8 M. Therefore, all models self-consistently infer a dearth of BHs above ∼45 ${M}_{\odot }$. This is determined by the lower limit for the mass of the most massive BH in the sample because ${m}_{\max }$ can be no smaller than this value. Similarly, the models that allow ${m}_{\min }$ to vary (B and C) disfavor populations with ${m}_{\min }$ above ≃9 M. This parameter is close to the largest allowed mass for the least massive BH in the sample, for similar reasons.

The lower limits we place on ${m}_{\min }$ are dominated by our prior choices that constrain ${m}_{\min }$ ∈ [5, 10] ${M}_{\odot }$ (see Table 2). For example, in Figure 3, the posterior on ${m}_{\min }$ becomes flat as ${m}_{\min }$ approaches the prior boundary at 5 ${M}_{\odot }$. Given current sensitivities, this is to be expected (Littenberg et al. 2015; Mandel et al. 2015). In the inspiral-dominated regime, the sensitive time volume scales as VT ∼ m15/6 (Finn & Chernoff 1993); extending our inferred mass distributions and merger rates into the possible lower BH mass gap from 3–5 ${M}_{\odot }$ (Özel et al. 2010; Farr et al. 2011b; Kreidberg et al. 2012) yields an expected number of detected BBH mergers ≲1. Thus, we are unable to place meaningful constraints on the presence or absence of a mass gap at low BH mass.

Models B and C also allow the distribution of mass ratios to vary according to βq. In these cases the inferred mass ratio distribution favors comparable-mass binaries (i.e., distributions with most support near q ≃ 1); see the bottom panel of Figure 2. Within the context of our parameterization, we find βq = ${6.9}_{-5.7}^{+4.6}$ for Model B and βq = ${4.5}_{-5.2}^{+6.6}$ for Model C. These values are consistent with each other and are bounded above zero at 95% confidence, thus implying that the mass ratio distribution is nearly flat or declining with more extreme mass ratios. The posterior on βq returns the prior for βq ≳ 4. Thus, we cannot say much about the relative likelihood of asymmetric binaries, beyond their overall rarity.

The distribution of the parameter controlling the fraction of the power law versus the Gaussian component in Model C is λm = ${0.3}_{-0.2}^{+0.4}$, which peaks away from zero, implying that this model prefers a contribution to the mass distribution from the Gaussian population in addition to the power laws modeled in A and B. To determine preference among the three models presented in this section, we compute the Bayes factors comparing the mass models using a nested sampler (Skilling 2004), CPNest (Veitch et al. 2017). These are shown in Table 3. Model B, which allows ${m}_{\min }$ and βq to vary, is preferred over Model A (ln BFAB = −1.42). To isolate the contributions of the Gaussian component and low-mass smoothing in Model C, we compute the Savage–Dickey density ratio, p(θ = 0)/pprior(θ = 0), equivalent to the Bayes factor comparing without and with the feature. The model including a Gaussian component in addition to the power-law distribution is preferred over the pure power-law models (ln BFλ=0C = −1.92); nevertheless, all models infer mass distributions that agree within their 90% credible bounds (see Figure 2). We caution that the mild preferences in Table 3 are influenced by our choices of the range and shape of the priors we apply to the parameters, particularly for models where the number of parameters is comparable to the number of events. Moreover, the credible intervals on the distributions of the primary mass overlap, indicating that the model predictions agree to within the individual model uncertainties. We are unable to distinguish between a gradual and a sharp cutoff at low mass (ln ${\mathrm{BF}}_{C}^{{\delta }_{m}=0}$ = 0.14). This is unsurprising, since we are less sensitive to structure in the mass distribution at low masses (Talbot & Thrane 2018).

Table 3.  The Log Bayes Factor Comparing Each of the Models Described in Table 2 to the Most Complex Model, Model C

Model A B C, λm = 0 C, δm = 0
ln BFiC −2.28  −0.86  −1.92  0.14 

Note. The evidence for the three mass models is computed using nested sampling, while the limits λm = 0 and δm = 0 of Model C are computed using the Savage–Dickey density ratio.

Download table as:  ASCIITypeset image

The analysis above includes all 10 BBH detections, though not all events have the same statistical detection confidence (Gaebel et al. 2019). To assess the stability of our results against systematics in the estimated significance, we have repeated these analyses after omitting the least significant detection. For our sample, the least significant detection, GW170729, is also the most massive binary. Most features we derive from our observations remain unchanged, with one exception shown in Figure 3: since we have omitted the most massive binary, the maximum BH mass ${m}_{\max }$ reported in Models A and B is decreased by about 5 ${M}_{\odot }$. Without GW170729, the ${m}_{\max }$ distribution is ${38.3}_{-3.6}^{+7.3}\,{M}_{\odot }$ for Model A and ${37.3}_{-3.4}^{+8.5}\,{M}_{\odot }$ for Model B. This is consistent with the difference between GW170729 and the next-highest-mass binary, GW170823, when comparing the less massive end of their primary mass posteriors.

3.2. Comparison with Theoretical and Observational Models

Previous modeling of the primary mass distribution with a power-law distribution (Abbott et al. 2016c) was last updated with the discovery of GW170104 (Abbott et al. 2017e). This analysis measured spectral index of the power law to be α = ${2.3}_{-1.4}^{+1.3}$ at 90% confidence assuming a minimum BH mass of 5 ${M}_{\odot }$ and maximum total mass of 100 ${M}_{\odot }$. None of our models directly emulate this one, but Model A is the closest analog. When allowing ${m}_{\max }$ to vary, 100 ${M}_{\odot }$ is strongly disfavored. As a consequence of the lower ${m}_{\max }$, the power-law index inferred is also shallower than previously obtained (Fishbach & Holz 2017) but remains consistent with the previous distribution.

In Figure 5, we highlight the two mass gaps predicted by models of stellar evolution: the first gap between ∼2 and ∼5 M and the second between ∼50 and ∼150 M, compared against the observed BHs. A set of tracks (Spera & Mapelli 2017) relating the progenitor mass and compact object is also shown for reference purposes. The tracks are subject to many uncertainties in stellar and binary evolution and only serve as representative examples. We discuss some of those uncertainties in the context of our results below.

The minimum mass of a BH and the existence of a mass gap between neutron stars and BHs (lower gray shaded area, right panel of Figure 5) are currently debated. Claims (Özel et al. 2010; Farr et al. 2011b) of the existence of a mass gap between the heaviest neutron stars (∼2 M) and the lightest BHs (∼5 M) are based on the sample of about a dozen XRBs with dynamical mass measurements. However, Kreidberg et al. (2012) suggested that the dearth of observed BH masses in the gap could be due to a systematic offset in mass measurements. Moreover, only a subset of theoretical models (e.g., the "rapid" model in Fryer et al. 2012) reproduce this gap in stellar modeling. We can see in Figure 5 that none of the observed binaries sit in this gap, but the sample is not sufficient to definitively confirm or refute the existence of this mass gap.

From the first six announced BBH detections, Fishbach & Holz (2017) argued that there is evidence for missing BHs with mass greater than ≳40 M. The existence of this second mass gap—see the upper gray shaded area in the right panel of Figure 5 between ∼50 and ∼150 M—has been further explored by Talbot & Thrane (2018), Wysocki et al. (2018), Bai et al. (2018), and Roulet & Zaldarriaga (2019). This gap might arise from the combined effect of pulsational pair-instability (Barkat et al. 1967; Heger et al. 2003; Woosley et al. 2007; Woosley 2017) and pair-instability (Fowler & Hoyle 1964; Ober et al. 1983; Bond et al. 1984) supernovae. Uncertainties in stellar evolution models (e.g., stellar winds, rotation) and in the treatment of the final outcomes of (pulsational) pair instability lead to a range of possible low-mass edges for the upper-mass gap, as well as the shape and abundance in a putative buildup. Predictions for the maximum mass of BHs born after pulsational pair-instability supernovae are ∼50 ${M}_{\odot }$ (Belczynski et al. 2016a; Spera & Mapelli 2017). Our inferred maximum mass is consistent with these predictions.

4. Merger Rates and Evolution with Redshift

As illustrated in previous work (Abbott et al. 2016e, 2018a; Fishbach & Holz 2017; Fishbach et al. 2018; Wysocki et al. 2018), the inferred BBH merger rate depends on and correlates with our assumptions about their intrinsic mass (and to a lesser extent, spin) distribution. In the most recent catalog of GW BBH events (Abbott et al. 2018a), we infer the overall BBH merger rate for two fixed-parameter populations. The first of these populations follows the power-law model given by Equation (2) with α = 2.3, βq = 0, ${m}_{\min }$ = 5 ${M}_{\odot }$, and ${m}_{\max }$ = 50 ${M}_{\odot }$. The second population follows a distribution in which both BH masses are independently drawn from a flat-in-log distribution:

Equation (16)

subject to the same mass cutoffs 5 ${M}_{\odot }$ < m2 < m1 < 50 ${M}_{\odot }$ as the fixed power-law population. Both the power-law and flat-in-log populations assume an isotropic and uniform-magnitude spin distribution (αa = βa = 1). These two fixed-parameter populations are used to estimate the population-averaged sensitive volume $\langle {VT}\rangle $ with a Monte Carlo injection campaign as described in Abbott et al. (2018a), with each population corresponding to a different $\langle {VT}\rangle $ because of the strong correlation between the mass spectrum and the sensitive volume. Under the assumption of a constant-in-redshift rate density, these $\langle {VT}\rangle $ estimates yield two different estimates of the rate: ${57}_{-25}^{+40}$ Gpc−3 yr−1 for the α = 2.3 population and ${19}_{-8.2}^{+13}$ Gpc−3 yr−1 for the flat-in-log population (90% credibility; combining the rate posteriors from the two analysis pipelines).

The two fixed-parameter distributions do not incorporate all information about the mass, mass ratio, spin distribution, and redshift evolution suggested by our observations in O1 and O2. In this section, rather than fixing the mass and spin distribution, we estimate the rate by marginalizing over the uncertainty in the underlying population, which we parameterize with the mass and spin models employed in Sections 3 and 5. When carrying out these analyses, it is computationally infeasible to determine VT(ξ) for each point in parameter space with the full Monte Carlo injection campaign described in Abbott et al. (2018a), so we employ the semianalytic methods described in Appendix A. Furthermore, while the rate calculations in Abbott et al. (2018a) incorporate all triggers down to a very low threshold and fit the number of detections by modeling the signal and background distributions in the detection pipelines (Farr et al. 2015b; Abbott et al. 2016e), in this work we fix a high detection threshold Abbott et al. (2018a), which sets the number of detections to Nobs = 10. In principle, our results are sensitive to the choice of threshold, but this effect has been shown to be much smaller than the statistical uncertainties (Gaebel et al. 2019). The choice of detection threshold is further discussed in Appendix A. The full set of models used in this section is enumerated in Table 4.

Table 4.  Summary of Models in Section 4, with Prior Ranges for the Population Parameters Determining the Rate Models

  Mass Model Rate Parameters Spin Parameters
Model   λ αa βa ${\mathbb{E}}[a]$ Var[a]
Fixed parameter (power law) A, with α = 2.3, 0 1 1 N/A N/A
  ${m}_{\max }$ = 50 ${M}_{\odot }$           
Fixed parameter (flat-in-log) Equation (16) 0 1 1 N/A N/A
Nonevolving A, B, C 0 N/A N/A [0, 1] [0, 0.25]
Evolvinga A [−25, 25] N/A N/A 0 0

Notes. The fixed-parameter models are drawn from Abbott et al. (2018a). The fixed parameters are in bold. Each of these distributions is uniform over the stated range; as previously, we require αa, βa ≥ 1. Details of the mass models listed here are described in Table 2.

aThis model assumes that the BHs have zero spin.

Download table as:  ASCIITypeset image

In these calculations, we first maintain the assumption in Abbott et al. (2018a) that the merger rate is uniform in comoving volume and source-frame time, as discussed in Section 2. We then relax this assumption and consider a merger rate that evolves in redshift according to Equation (7), fitting the mass distribution jointly with the rate density as a function of redshift.

4.1. Nonevolving Merger Rate

We first consider the case of a uniform-in-volume merger rate and examine the effects of fitting the rate jointly with the distribution of masses and spins. The first column in Figures 3 and 4 shows the results of self-consistently determining the rate using the models for the mass and spin distribution described in the previous two sections.

Table 5 contains the intervals on the distribution of R0 for all three models. For Models B and C we deduce a merger rate in the range of R0 = $24.9\mbox{--}109.0\,{\mathrm{Gpc}}^{-3}\,{\mathrm{yr}}^{-1}$. Adopting Model A for the mass distribution yields a slightly higher rate estimate, R0 = $31.0-\,137.5\,{\mathrm{Gpc}}^{-3}\,{\mathrm{yr}}^{-1}$, as this model fixes ${m}_{\min }$ = 5 ${M}_{\odot }$, whereas Models B and C favor a higher minimum mass and therefore larger population-averaged sensitive volumes. The rate estimates are consistent between all mass models considered, including the results presented for the fixed-parameter power-law model in Abbott et al. (2018a). However, the fixed-parameter models in Abbott et al. (2018a) are disfavored by our full fit to the mass distribution, particularly with respect to the maximum mass. Our results favor maximum masses ≲45 ${M}_{\odot }$, rather than 50 ${M}_{\odot }$ as used in Abbott et al. (2018a), and power-law slopes closer to α ∼ 1. For this reason, although we infer a mass distribution slope that is similar to the flat-in-log population from Abbott et al. (2018a), we infer a rate that is closer to the rate inferred for the fixed-parameter power-law model.186 While $\langle {VT}\rangle $ gets larger (implying a smaller rate estimate) as α is decreased, decreasing ${m}_{\max }$ has the opposite effect, and so the $\langle {VT}\rangle $ for the fixed-parameter power-law model is similar to the $\langle {VT}\rangle $ values for our best-fit mass distributions, which favor smaller α and smaller ${m}_{\max }$.

Table 5.  BBH Merger Rate Intervals for Each of the Mass Models Tested

Model A B C
R0 (Gpc−3 yr−1) ${64.0}_{-33.0}^{+73.5}$  ${53.2}_{-28.2}^{+55.8}$  ${58.3}_{-32.2}^{+72.3}$ 

Note. These rates assume no evolution in redshift but otherwise marginalize over all other population parameters.

Download table as:  ASCIITypeset image

We note that while our analysis differs from the rate calculations in Abbott et al. (2018a) by the choice of prior on the rate parameter (log-uniform in this work compared to a Jeffreys prior p(R0) ∝ R0−0.5 in Abbott et al. 2018a), adopting a Jeffreys prior has a negligible effect on our rate posteriors. For example, under a log-uniform prior, we recover a rate for Model A of ${62.8}_{-33.3}^{+74.0}\,{\mathrm{Gpc}}^{-3}\,{\mathrm{yr}}^{-1}$, whereas under a Jeffreys prior this shifts by only ∼10% to ${56.7}_{-30.4}^{+65.4}\,{\mathrm{Gpc}}^{-3}\,{\mathrm{yr}}^{-1}$.

4.2. Evolution of the Merger Rate with Redshift

As discussed in the introduction, most formation channels predict some evolution of the merger rate with redshift, due to factors such as the star formation rate, time-delay distribution, metallicity evolution, and globular cluster formation rate (Dominik et al. 2013; Belczynski et al. 2016b; Mandel & de Mink 2016; Rodriguez & Loeb 2018). Therefore, in this section we allow the merger rate to evolve with redshift, and we infer the redshift evolution jointly with the mass distribution. For simplicity, we adopt the two-parameter Model A for the mass distribution and fix spins to zero for this analysis. As discussed in Section 3, the additional mass and spin degrees of freedom have only a weak effect on the inferred merger rate. We assume the redshift evolution model given by Equation (7). Because massive binaries are detectable at higher redshifts, the observed redshift evolution correlates with the observed mass distribution of the population, and so we must fit them simultaneously. However, as in Fishbach et al. (2018), we assume that the underlying mass distribution does not vary with redshift. We therefore fit the joint mass-redshift distribution according to the model:

Equation (17)

Note that this model assumes that the merger rate density increases or decreases monotonically with redshift over the sensitive range z < 1. If the merger rate follows the star formation rate, we expect the rate to peak around z ∼ 2, which is currently far beyond the horizon redshift for BBH detections.

Figure 6 shows the merger rate density as a function of redshift (blue band), compared to the rate inferred in Section 4.1 for the nonevolving Model B (orange band). The joint posterior probability density function (pdf) on λ, α, and ${m}_{\max }$, marginalized over the local rate parameter R0, is shown in Figure 7. There is a strong correlation between the mass power-law slope and the redshift evolution parameter. This is due to the fact that higher-mass BBHs are detectable at higher redshifts, and so, for the same underlying mass distribution, an increasing rate evolution with redshift implies that a greater fraction of detected BBHs will be massive. This effect is hard to disentangle from a shallower mass distribution, which will also produce comparatively more massive BBH detections. Note that the constraints on α and ${m}_{\max }$ in Section 3 are correlated by the same effect. Compared to the constraints on α and ${m}_{\max }$ discussed in Section 3, which assume a constant-in-redshift merger rate density, allowing for additional freedom in the redshift distribution of BBHs relaxes the constraints on the mass distribution parameters, especially the power-law slope α (${m}_{\max }$ is sufficiently well measured that the correlation with λ is not as noticeable). Under the assumption of a constant merger rate density, Model A in Section 3 finds α = ${0.4}_{-1.9}^{+1.4}$, ${m}_{\max }$ = ${41.6}_{-4.3}^{+9.6}\,{M}_{\odot }$, whereas allowing for redshift evolution yields α = ${1.8}_{-2.0}^{+1.7}$, ${m}_{\max }$ = ${41}_{-5}^{+11}$ ${M}_{\odot }$ when analyzing the sample of 10 BBHs from O1 and O2. As in Section 3, we carry out a leave-one-out analysis, excluding the most massive and distant BBH, GW170729, from the sample (red curves in Figure 7). Without GW170729, the marginalized mass distribution posteriors become α = ${0.9}_{-2.2}^{+1.8}$, ${m}_{\max }$ = ${38}_{-4}^{+10}$ ${M}_{\odot }$.

Figure 6.

Figure 6. Constraints on evolution of the BBH merger rate density as a function of redshift. Including the 10 BBHs from O1 and O2 in our analysis, we find a preference for a merger rate that increases with increasing redshift. The solid blue line gives the posterior median merger rate density, and dark and light bands give 50% and 90% credible intervals, respectively. In orange, the solid line and shaded region show the median and 90% credible interval, respectively, of the rate inferred for Model B as discussed in Section 4.1, assuming a nonevolving merger rate.

Standard image High-resolution image
Figure 7.

Figure 7. Posterior pdf on the redshift evolution parameter λ, mass power-law slope α, and maximum mass ${m}_{\max }$, marginalized over the local rate parameter R0, and assuming a flat prior on λ, α, and ${m}_{\max }$ and a flat-in-log prior on R0. In order to analyze the stability of the model against outliers, we repeat the analysis once with the sample of 10 BBHs (results shown in blue) and once excluding the most distant and massive event in our sample, GW170729 (results shown in red). The contours show 50% and 90% credible intervals. The dashed black lines show the values of hyperparameters assumed for the fixed-parameter power-law model. We infer a redshift evolution that is consistent with a flat in comoving volume and source-frame time merger rate (λ = 0) with a preference for λ ≥ 0 at 0.93 credibility when considering all 10 events. This preference becomes less significant with the exclusion of GW170729 from the analysis. The inferred power-law slope and maximum mass are consistent with the values inferred in Section 3. This analysis recovers a broader posterior on the mass power-law slope because of the correlation with the redshift evolution parameter, but the maximum mass remains well constrained at ≲45 ${M}_{\odot }$.

Standard image High-resolution image

Marginalizing over the two mass distribution parameters and the redshift evolution parameter, the merger rate density is consistent with being constant in redshift (λ = 0), and in particular, it is consistent with the rate estimates recovered under the different mass distribution models in Section 4.1 above. However, we find a preference for a merger rate density that increases at higher redshift (λ ≥ 0) with probability 0.93. This implies that models that predict a constant or slightly decreasing merger rate with redshift, such as certain models of primordial BHs (Mandic et al. 2016), are disfavored. This preference for a merger rate that increases with increasing redshift becomes less significant when GW170729 is excluded from the analysis because this event likely merged at redshift z ≳ 0.5, close to the O1–O2 detection horizon. Although GW170729 shifts the posterior toward larger values of λ, implying a stronger redshift evolution of the merger rate, the posterior remains well within the uncertainties inferred from the remaining nine BBHs. When including GW170729 in the analysis, we find λ = ${8.4}_{-9.5}^{+9.6}$ at 90% credibility, compared to λ = ${2.3}_{-10.9}^{+9.9}$ when excluding GW170729 from the analysis. With only 10 BBH detections so far, the wide range of possible values for λ is consistent with most astrophysical formation channels. The precision of this measurement will improve as we accumulate more detections in future observing runs and may enable us to discriminate between different formation rate histories or time-delay distributions (Sathyaprakash et al. 2012; Van Den Broeck 2014; Fishbach et al. 2018).

5. The Spin Distribution

The GW signal depends on spins in a complicated way, but at leading order, and in the regime we are interested in here, some combinations of parameters have more impact on our inferences than others and thus are measurable. One such parameter is ${\chi }_{\mathrm{eff}}$. For binaries that are near equal mass, we can see from Equation (1) that only when BH spins are high and aligned with the orbital angular momentum will ${\chi }_{\mathrm{eff}}$ be measurably greater than zero. Figure 5 in Abbott et al. (2018a) illustrates the inferred ${\chi }_{\mathrm{eff}}$ spin distributions for all of the BBHs identified in our GW surveys in O1 and O2. Only GW170729 and GW151226 show significant evidence for positive ${\chi }_{\mathrm{eff}}$; the rest of the posteriors cluster around ${\chi }_{\mathrm{eff}}$ = 0.

Despite these degeneracies, several tests have been proposed to use spins to constrain BBH formation channels (Farr et al. 2017, 2018; Gerosa & Berti 2017; Stevenson et al. 2017a; Talbot & Thrane 2017; Vitale et al. 2017; Gerosa et al. 2018; Wysocki et al. 2018). Drawing on these methods, we now seek to estimate the BH spin magnitude and misalignment distributions, under different assumptions regarding isotropy or alignment.

5.1. Spin Magnitude and Tilt Distributions

We examine here the individual spin magnitudes and tilt distributions. Throughout this section, when referring to the parametric models, we also allow the merger rate and population parameters describing the most general mass model to vary (Model C, see Table 2). Changing the parameterization of the mass model does not significantly change our inferences about the spin distribution. However, to account for degeneracies between mass and spin that grow increasingly significant for longer, low-mass signals (Baird et al. 2013), we must consistently model the mass and spin distributions together. See Table 6 for a summary of the models and priors used in this section.

Table 6.  Summary of Spin Distribution Models Examined in Section 5.1, with Prior Ranges for the Population Parameters Determining the Spin Models

  Mass Model Spin Parameters
Model   ${\mathbb{E}}[a]$ Var[a] αa, βa ζ σi
Gaussian (G) C [0, 1] [0, 0.25] ≥1 1 [0, 4]
Mixture (M) C [0, 1] [0, 0.25] ≥1 [0, 1] [0, 4]

Note.  The fixed parameters are in bold. Each of these distributions is uniform over the stated range, with boundary conditions such that the inferred parameters αa, βa must be ≥1. Details of the mass model listed here are described in Table 2.

Download table as:  ASCIITypeset image

The inferred distributions of spin magnitude are shown in Figure 8. The top panel shows the PPD and the median and associated uncertainties on the spin magnitude inferred from the parametric Mixture model defined in Section 2.4 and using prior distributions shown in Table 6. It marginalizes over all other parameters, including the mass parameters in Model C and the spin mixture fraction. We observe that spin distributions that decline with increasing magnitude are preferred. In terms of our Beta function parameterization—${\mathbb{E}}[a]$ and Var[a], defined in Equation (5)—these have mean spin ${\mathbb{E}}[a]\lt 1/2$ or equivalently have βa > αa, at posterior probability 0.79. We find that 90% of BH spins in BBHs are less than a ≤ 0.55 from the PPD, and 50% of BH spins are less than a ≤ 0.27. We find similar conclusions if both BH spins are drawn from different distributions (i.e., 90% of BH spins on the more massive BH are less than 0.6). When avoiding singular values in the spin magnitude model distribution, the distribution exhibits a peak structure, i.e., p(a = 0) = p(a = 1) = 0. If allowed to capture the full range of model parameters, including "singular" configurations, the support for small values of a is more pronounced. However, this scenario forces a small—and otherwise observationally unsupported—uptick of probability mass at a near maximal spins. In both cases, the recovered spin distribution in the top panel of Figure 8 is driven by favoring declining spin distributions, which are more compatible with the observed population. This conclusion is also consistent with the preference in Appendix B for the very low spin magnitude model.

Figure 8.

Figure 8. Inferred distribution of spin magnitude for a parametric (top) and nonparametric binned model (bottom). Both component magnitudes are included in these distributions. The solid lines show the median, and the dashed line shows the PPD. The shaded regions denote the 50% and 90% symmetric intervals. In the upper panel, the parametric model is presented with both singular (orange) and non-singular (blue) model configurations. For comparison purposes, the V (very low spin magnitude) model is plotted in a dash-dotted black line. In the bottom panel, the distribution of spin magnitude is inferred over five bins, assuming an either perfectly aligned (pink) or isotropic (green) population. The solid lines denote the median, and the shaded regions denote the central 90% posterior credible bounds. In both cases, the magnitude is consistent within the uncertainties with the parametric (singular and nonsingular) results. The number of bins in the model was chosen to balance resolution with the amount of information in the data; analyses with more bins do not indicate any additional features in the spin distributions.

Standard image High-resolution image

We also compute the posterior distribution for the magnitude of BH spins from ${\chi }_{\mathrm{eff}}$ measurements by modeling the distribution of BH spin magnitudes nonparametrically with five bins, assuming either an isotropic or perfectly aligned population following Farr et al. (2018). We show in the bottom panel of Figure 8 that under the perfectly aligned scenario there is a preference for small BH spin, inferring 90% of BHs to have spin magnitudes below ${0.6}_{-0.28}^{+0.24}$. However, when spins are assumed to be isotropic, the distribution is relatively flat, with 90% of BH spin magnitudes below ${0.8}_{-0.24}^{+0.15}$. Thus, the nonparametric analysis produces conclusions consistent with our parametric analyses described above. These conclusions are also reinforced by computing the Bayes factor for a set of fixed-parameter models of spin magnitude and orientation in Appendix B. There we find that the very low spin magnitude model is preferred by a log Bayes factor of 1 or greater in most mass and spin orientation configurations tested (see Figure 13 and Table 8 for details).

Figure 9 shows the inferred distribution of the primary spin tilt for the more massive BH. These results were obtained without including the effects of component spins on the detection probability; see Appendix A for further discussion. In the Gaussian model (ζ = 1), all BH spin orientations are drawn from spin tilt distributions that are preferentially aligned and parameterized with σi. In that model, the σi distributions do not differ appreciably from their flat priors. As such, the inferred spin tilt distribution is influenced by large σi, and the result resembles an isotropic distribution. The Mixture distribution does not return a decisive measurement of the mixture fraction, obtaining ζ = ${0.6}_{-0.5}^{+0.4}$. Since the Gaussian model is a subset of the Mixture model, we can compare preferences via the Savage–Dickey ratio. The log Bayes factor for ζ = 1 is ln BF = 0.15, indicating virtually no preference for any particular orientation distribution. While we allow both BHs to have different typical misalignment, the inference on the second tilt is less informative than the primary. The inferred distribution for cos t2 is similar to cos t1 but also closer to the prior.

Figure 9.

Figure 9. Inferred distribution of cosine spin tilt for the more massive BH for two choices of prior (see Section 2.4). The dashed–dotted line denotes a completely isotropic distribution (see Appendix B). The solid lines show the median. The shaded regions denote the 90% symmetric intervals, and the dashed line denotes the PPD.

Standard image High-resolution image

The mixture fraction distribution is also modeled with the fixed-parameter models in Appendix B. The fixed magnitude distributions considered in Appendix B prefer isotropic to aligned, but the preference is weakened for distributions concentrated at lower spins. A few exceptions occur for the very low spin fixed mass ratio models, with aligned models being slightly preferred.

In general, we are not able to place strong constraints on the distribution of spin orientations. We elaborate in Appendix B.3 on how our BH spin measurements are not yet informative enough to discern between isotropic and aligned orientation distribution via ${\chi }_{\mathrm{eff}}$.

5.2. Interpretation of Spin Distributions

The spins of BHs are affected by a number of uncertain processes that occur during the evolution of the binary. As a consequence, the magnitude distribution is difficult to predict from theoretical models of these processes alone. While the spin of a BH should be related to the rotation of the core of its progenitor star, the amount of spin that is lost during the final stages of the progenitor's life is still highly uncertain. While we have modeled the spins independently, correlations from binary evolution and stellar collapse are possible (Belczynski et al. 2017; Gerosa et al. 2018; Qin et al. 2018; Arca Sedda & Benacquista 2019; Postnov & Kuranov 2019). The core rotational angular momentum before the supernova can be changed from the birth spin of the progenitor by several processes (Langer 2012; de Mink et al. 2013; Amaro-Seoane & Chen 2016). Examples include mass transfer (Packet 1981; Shu & Lubow 1981) and tidal interactions (Petrovic et al. 2005), as well as internal mixing of the stellar layers across the core-envelope boundary via magnetic torquing (Spruit 2002; Maeder & Meynet 2003) and gravity waves (Talon & Charbonnel 2005, 2008; Fuller et al. 2015). In principle, an off-center supernova explosion could also impart significant angular momentum and tilt the spin of the remnant into the collapsing star (Farr et al. 2011a).

Once a BH is formed, however, changing the spin magnitude is more difficult owing to limitations on mass accretion rates affecting how much a BH can be spun up (Thorne 1974; Valsecchi et al. 2010; Wong et al. 2012; Qin et al. 2019). Once the BBH system is formed, the spin magnitudes do not change appreciably over the inspiral (Farr et al. 2014).

No BBH detected to date has a component with confidently high and aligned spin magnitude. The results in the previous section imply that BHs tend to be born with spin less than our PPD bound of 0.55, or that another process (e.g., supernova kicks or dynamical processes involved in binary formation) induces tilts such that ${\chi }_{\mathrm{eff}}$ is small.

The possibility of a spin magnitude distribution that peaks at low spins incurs a degeneracy between models that is not easily overcome: when the spin magnitudes are small enough, models produce features that cannot be distinguished within observational uncertainties.

6. Discussion and Conclusions

We have presented a variety of estimates for the mass, spin, and redshift distributions of BBH, based on the observed sample of 10 BBH and generic phenomenological population models motivated by electromagnetic observations and theory. Some model independent features are evident from the observations. Notably, no BBHs more massive than GW170729 have been observed to date, but several binaries have component masses likely between 20 and 40 M. No highly asymmetric (small q) system has been observed. Only two systems (GW151226 and GW170729) produce a ${\chi }_{\mathrm{eff}}$ distribution that is confidently different from zero; conversely, most BH binaries are consistent with ${\chi }_{\mathrm{eff}}$ near zero. These features drive our inferences about the mass and spin distribution.

Despite exploring a wide range of mass and spin distributions, we find that the BBH merger rate density is R = ${64.0}_{-33.0}^{+73.5}$ Gpc−3 yr−1 for Model A and is within R = ${53.2}_{-28.2}^{+55.8}$ Gpc−3 yr−1 for Models B and C. This result is consistent with the fixed model assumptions reported in the combined O1 and O2 observational periods (Abbott et al. 2018a). We find a significant reduction in the merger rate for BBHs with primary masses larger than $\sim 45{M}_{\odot }$. We do not have enough sensitivity to binaries with a BH mass less than 5 ${M}_{\odot }$ to be able to place meaningful constraints on the minimum mass of BHs. We find mild evidence that the mass distribution of coalescing BHs may not be a pure power law, instead being slightly better fit by a model including a broad Gaussian distribution at high mass. We find that the best-fitting models preferentially produce comparable-mass binaries (i.e., βq > 0 is preferred).

The mass models in this work supersede results from an older model from O1 that inferred only the power-law index (Abbott et al. 2016c, 2017e). That model found systematically larger values of α than its nearest counterpart in this work, Model A, because the older model used a fixed value for the minimum and maximum mass of 5 and 100 ${M}_{\odot }$, respectively. This extreme ${m}_{\max }$ is highly disfavored by our current results, and so the older model is also disfavored. Moreover, volumetric sensitivity grows as a strong function of mass. The lack of detections near the older ${m}_{\max }$ drives a preference for a much smaller maximum BH mass in the new models (Fishbach & Holz 2017). A reduced maximum mass is associated with a shallower power-law fit.

Inferring the redshift distribution is difficult with only a small sample of local events (Fishbach et al. 2018). We have constrained models with extreme variation over redshift, favoring instead those that are uniform in the comoving volume or have increasing merger rates with higher redshift. Many potential formation channels in the literature (Belczynski et al. 2014; Antonini & Rasio 2016; Inayoshi et al. 2016; Mandel & de Mink 2016; Rodriguez et al. 2016b; Bartos et al. 2017; Mapelli et al. 2017; Kruckow et al. 2018) produce event rates that are compatible with those from the previous observing runs (Abbott et al. 2018a) and this work. It is, of course, plausible that several are contributing simultaneously, and no combination of mass, rate, or redshift dependence explored here rules out any of the channels proposed to date. The next generation of interferometers will allow for an exquisite probe into this dependence at large redshifts (Sathyaprakash et al. 2012; Van Den Broeck 2014; Vitale & Farr 2018).

We have modeled the spin distribution in several ways, forming inferences on the spin magnitude and tilt distributions. In all of our analysis, the evidence disfavors distributions with large spin components aligned (or nearly aligned) with the orbital angular momentum; specifically, we find that 90% of the spin magnitude PPD is smaller than 0.55. We cannot significantly constrain the degree of spin–orbit misalignment in the population. However, regardless of the mass or assumed spin tilt distribution, there is a preference (demonstrated in Figure 8 and Appendix B) for distributions that emphasize lower spin magnitudes. Our inferences suggest that 90% of coalescing BH binaries are formed with ${\chi }_{\mathrm{eff}}$ < 0.3. Low spins argue against so-called second-generation mergers, where at least one of the components of the binary is a BH formed from a previous merger (Berti et al. 2007; González et al. 2007) and possesses spins near 0.7 (Fishbach et al. 2017).

GW170729 is notable in several ways: it is the most massive, largest ${\chi }_{\mathrm{eff}}$, and most distant redshift event detected so far. To quantify the impact it has on our results, where possible we have presented model posteriors that reflect its presence in or exclusion from the event set. Many of our predictions are robust despite its extreme values—by far, and not unexpectedly, its influence is most significant in the distribution of ${m}_{\max }$. It also impacts our conclusions about redshift evolution, where its absence flattens the inferred redshift evolution.

Recent modeling using only the first six released events (Wysocki et al. 2018; Roulet & Zaldarriaga 2019) has come to similar conclusions about low spin magnitudes and the shape of the power-law distribution. The presence of an apparent upper limit to the merging BBH mass distribution was also observed after the first six released events (Fishbach & Holz 2017). An enhancement that will benefit these types of analyses in the future is a simultaneous fit of the astrophysical model and its parameters and noise background model (Gaebel et al. 2019).

Several studies have noted that population features (Mandel & O'Shaughnessy 2010; Stevenson et al. 2015, 2017a; Farr et al. 2017, 2018; Fishbach et al. 2017; Fishbach & Holz 2017; Gerosa & Berti 2017; Kovetz et al. 2017; Talbot & Thrane 2017, 2018; Zevin et al. 2017; Barrett et al. 2018; Gerosa et al. 2018; Wysocki et al. 2018) and complementary physics (Abbott et al. 2016f; Stevenson et al. 2017a; Zevin et al. 2017; Chen et al. 2018) will be increasingly accessible as observations accumulate. Additional events will also permit the enhancement of the simple phenomenological models used in this work and comparison with modeling of astrophysical processes. Given the event merger rates estimated here and anticipated improvements in sensitivity (Abbott et al. 2018b), hundreds of BBHs and tens of binary neutron stars are expected to be collected in the operational lifetime of second-generation GW instruments. Thus, the inventory of BBH in the coming years will enable inquiries into astrophysics that were previously unobtainable.

The authors gratefully acknowledge the support of the United States National Science Foundation (NSF) for the construction and operation of the LIGO Laboratory and Advanced LIGO as well as the Science and Technology Facilities Council (STFC) of the United Kingdom, the Max-Planck-Society (MPS), and the State of Niedersachsen/Germany for support of the construction of Advanced LIGO and construction and operation of the GEO600 detector. Additional support for Advanced LIGO was provided by the Australian Research Council. The authors gratefully acknowledge the Italian Istituto Nazionale di Fisica Nucleare (INFN), the French Centre National de la Recherche Scientifique (CNRS), and the Foundation for Fundamental Research on Matter supported by the Netherlands Organisation for Scientific Research, for the construction and operation of the Virgo detector and the creation and support of the EGO consortium. The authors also gratefully acknowledge research support from these agencies, as well as by the Council of Scientific and Industrial Research of India, the Department of Science and Technology, India, the Science & Engineering Research Board (SERB), India, the Ministry of Human Resource Development, India, the Spanish Agencia Estatal de Investigación, the Vicepresidència i Conselleria d'Innovació Recerca i Turisme and the Conselleria d'Educació i Universitat del Govern de les Illes Balears, the Conselleria d'Educació Investigació Cultura i Esport de la Generalitat Valenciana, the National Science Centre of Poland, the Swiss National Science Foundation (SNSF), the Russian Foundation for Basic Research, the Russian Science Foundation, the European Commission, the European Regional Development Funds (ERDF), the Royal Society, the Scottish Funding Council, the Scottish Universities Physics Alliance, the Hungarian Scientific Research Fund (OTKA), the Lyon Institute of Origins (LIO), the Paris Île-de-France Region, the National Research, Development and Innovation Office Hungary (NKFIH), the National Research Foundation of Korea, Industry Canada and the Province of Ontario through the Ministry of Economic Development and Innovation, the Natural Science and Engineering Research Council Canada, the Canadian Institute for Advanced Research, the Brazilian Ministry of Science, Technology, Innovations, and Communications, the International Center for Theoretical Physics South American Institute for Fundamental Research (ICTP-SAIFR), the Research Grants Council of Hong Kong, the National Natural Science Foundation of China (NSFC), the Leverhulme Trust, the Research Corporation, the Ministry of Science and Technology (MOST), Taiwan, and the Kavli Foundation. The authors gratefully acknowledge the support of the NSF, STFC, MPS, INFN, CNRS, and the State of Niedersachsen/Germany for provision of computational resources.

Appendix A: Systematics

In this section, we discuss the systematic uncertainties that affect our analysis and show that they are subdominant to statistical uncertainties. We focus on two major sources of systematic uncertainty. The first of these is introduced by the waveform models that are used to extract the parameters of individual events, and the second is in the estimation of the detection efficiency.

A.1. Waveform Systematics

In Abbott et al. (2018a), two waveform families are used to extract the parameters of individual events: SEOBNRv3 (Pan et al. 2014; Babak et al. 2017) and IMRPhenomPv2 (Hannam et al. 2014; Husa et al. 2016; Khan et al. 2016). While both families capture a wide variety of physical effects, including simple precession and other spin effects, they do not match each other exactly over the whole of the parameter space. Differences between the waveforms can therefore lead to slight biases in the inference of individual events' parameters and thereby impact the inferred population distributions. To directly assess the impact of these uncertainties on our results, we have repeated our calculations using parameter estimates based on SEOBNRv3 (the results in the main text all use IMRPhenomPv2). See Table 7—we find that the two waveform models produce at most modestly different inferences about key parameters. For example, the standard Model B mass and spin distribution analysis with SEOBNRv3 leads us to infer that the 90% upper bound of a1 is 0.5 and credible intervals on ${m}_{\max }$ and ${ \mathcal R }$ are 36.7–53.0 M and $24.8\mbox{--}105.7\,{\mathrm{Gpc}}^{-3}\,{\mathrm{yr}}^{-1}$, which is consistent with the IMRPhenomPv2 Model B estimates of 0.6, 36.4–52.6 M, and $24.9\mbox{--}109.0\,{\mathrm{Gpc}}^{-3}\,{\mathrm{yr}}^{-1}$ presented in the main text. Similarly, in the redshift evolution analysis, we infer that the redshift evolution parameter λ = ${8.4}_{-9.7}^{+9.6}$ under the SEOBNRv3 waveform compared to λ = ${8.4}_{-9.5}^{+9.6}$ under the IMRPhenomPv2 waveform. In summary, for the mass and rate part of the distributions, we find that there is no significant change whatsoever. The most significant change is to the parameter controlling the primary tilt angle. The SEOBNRv3 waveform predicts that this parameter is smaller by 50%, along with a smaller reduction in the secondary tilt angle parameter. Compare Figure 10, produced with SEOBNRv3 derived event posteriors, with Figure 9. SEOBNRv3 produces a distribution of tilts that is closer to isotropic than its IMRPhenomPv2 counterpart. However, the models are compatible to within their uncertainties over the distribution p(cos t1).

Figure 10.

Figure 10. Inferred distribution of cosine spin tilt for the more massive BH for two choices of prior (see Section 2.4) with the SEOBNRv3 waveform model, with the same definitions as in Figure 9.

Standard image High-resolution image

Table 7.  Summary of Intervals for Each of the Parameters Considered in the Models of Sections 35

Parameter/Model Reference Spin $\langle {VT}\rangle $  using SEOBNR
Mass
α (Model A) ${0.4}_{-1.9}^{+1.4}$  ${0.4}_{-1.9}^{+1.3}$  ${0.3}_{-2.0}^{+1.4}$ 
α (Model B) ${1.3}_{-1.7}^{+1.4}$  ${1.2}_{-1.7}^{+1.4}$  ${1.2}_{-1.8}^{+1.4}$ 
α (Model C) ${7.1}_{-4.8}^{+4.4}$  ${7.3}_{-4.8}^{+4.2}$ 
βq (Model B) ${6.9}_{-5.7}^{+4.6}$  ${7.0}_{-5.6}^{+4.6}$  ${7.1}_{-6.0}^{+4.4}$ 
βq (Model C) ${4.5}_{-5.2}^{+6.6}$  ${5.0}_{-5.7}^{+6.3}$ 
mmax (Model A) ${41.6}_{-4.3}^{+9.6}$  ${41.6}_{-4.4}^{+9.9}$  ${41.3}_{-4.2}^{+9.2}$ 
mmax (Model B) ${40.8}_{-4.4}^{+11.8}$  ${40.6}_{-4.3}^{+10.8}$  ${40.5}_{-3.9}^{+12.5}$ 
mmax (Model C) ${62.0}_{-28.7}^{+34.0}$  ${62.5}_{-29.1}^{+33.8}$ 
mmin (Model B) ${7.8}_{-2.5}^{+1.2}$  ${7.7}_{-2.4}^{+1.3}$  ${7.7}_{-2.4}^{+1.3}$ 
mmin (Model C) ${6.9}_{-2.8}^{+1.7}$  ${6.9}_{-2.7}^{+1.7}$ 
λm (Model C) ${0.3}_{-0.2}^{+0.4}$  ${0.3}_{-0.2}^{+0.4}$ 
μm (Model C) ${29.8}_{-7.3}^{+5.8}$  ${30.1}_{-7.4}^{+5.8}$ 
σm (Model C) ${6.4}_{-4.2}^{+3.2}$  ${5.9}_{-4.0}^{+3.5}$ 
Rate
R (Model A) ${64.0}_{-33.0}^{+73.5}$  ${62.8}_{-33.3}^{+74.0}$  ${62.4}_{-31.9}^{+74.0}$ 
R (Model B) ${53.2}_{-28.2}^{+55.8}$  ${51.8}_{-26.9}^{+55.3}$  ${52.9}_{-28.2}^{+52.7}$ 
R (Model C) ${58.3}_{-32.2}^{+72.3}$  ${58.0}_{-32.0}^{+70.3}$ 
Spin
cos t1 (Model A) ${2.0}_{-1.4}^{+1.8}$  ${2.2}_{-1.6}^{+1.6}$  ${1.2}_{-1.0}^{+2.4}$ 
cos t1(Model B) ${2.1}_{-1.5}^{+1.7}$  ${2.2}_{-1.6}^{+1.6}$  ${1.4}_{-1.2}^{+2.2}$ 
cos t1 (Model C) ${1.8}_{-1.7}^{+1.9}$  ${1.2}_{-1.1}^{+2.4}$ 
cos t2 (Model A) ${2.3}_{-1.6}^{+1.5}$  ${2.4}_{-1.6}^{+1.5}$  ${2.1}_{-1.6}^{+1.7}$ 
cos t2 (Model B) ${2.3}_{-1.5}^{+1.5}$  ${2.4}_{-1.6}^{+1.5}$  ${2.0}_{-1.5}^{+1.8}$ 
cos t2 (Model C) ${2.0}_{-1.6}^{+1.8}$  ${2.0}_{-1.6}^{+1.8}$ 

Note. The reference uses the posteriors derived from the IMRPhenomPv2 waveform model and without spin effects included in $\langle {VT}\rangle $. The second column allows for spin effects in $\langle {VT}\rangle $ estimation. Finally, the third column shows the population model parameters inferred when the SEOBNRv3 waveform model is used to derive the event posteriors. Spin enabled $\langle {VT}\rangle $ is only available for Models A and B, but we expect that Model C would exhibit similar trends. Broadly, the mass and rate parameters are nearly the same and well within their respective uncertainties with and without spin effects in $\langle {VT}\rangle $, as well as considering the SEOBNRv3 waveform model. The most notable difference comes from the parameterized spin distribution. The differences are primarily related to the spin tilt distribution, and, for clarity, we suppress the spin magnitude distribution and mixture parameters, which are nearly identical.

Download table as:  ASCIITypeset image

A.2. Selection Effects and Sensitive Volume

In this subsection we detail the various assumptions and possible systematics that enter into our calculation of the detection efficiency. The detectability of a BBH merger in GWs depends on the distance and orientation of the binary along with its intrinsic parameters, especially its component masses. In order to model the underlying population and determine the BBH merger rate, we must properly model the mass-, redshift-, and spin-dependent selection effects and incorporate them into our population analysis according to Equation (11). One way to infer the sensitivity of the detector network to a given population of BBH mergers is by carrying out large-scale simulations in which synthetic GW waveforms are injected into the detector data and subsequently searched for. The parameters of the injected waveforms can be drawn directly from the fixed population of interest, or alternatively, the injections can be placed to more broadly cover parameter space and reweighed to match the properties of the population (Tiwari 2018). Such injection campaigns were carried out in Abbott et al. (2018a) to measure the total sensitive spacetime volume $\langle {VT}\rangle $ and the corresponding merger rate for two fixed-parameter populations (power law and flat-in-log). However, it is computationally expensive to carry out an injection campaign that sufficiently covers the multidimensional population hyperparameter space considered in this work. For this reason, for the parametric population studies in this work, we employ a semianalytic method to estimate the fraction of found detections as a function of masses, spins, and redshift (or equivalently, distance).

Our estimates of the network sensitivity are based on the semianalytic method that was used to infer the BBH mass distribution from the first four GW detections (Abbott et al. 2016c, 2017e). This method assumes that a BBH system is detectable if and only if it produces a signal-to-noise ratio (S/N) ρ ≥ ρth in a single detector, where the threshold S/N, ρth, is typically chosen to be 8. Given a BBH system with known component masses, spins, and cosmological redshift, and a detector with stationary Gaussian noise characterized by a given power spectral density (PSD), one can calculate the optimal S/N, ρopt, of the signal emitted by the BBH merger. The optimal S/N corresponds to the S/N of the signal produced by a face-on, directly overhead BBH merger with the same masses, spins, and redshift. Given ρopt, the distribution of single-detector S/Ns ρ—corresponding to sources with random orientations with respect to the detector—can be calculated using the analytic distribution of angular factors Θ ≡ ρ/ρopt (Finn & Chernoff 1993). Under these assumptions, the probability of detecting a system of given masses, spins, and redshift, Pdet(m1, m2, χ1, χ2, z), is given by the probability that ρ ≥ ρth, or equivalently, that a randomly drawn Θ ≥ ρth/ρopt(m1, m2, χ1, χ2, z). Pdet referred to in this section is equivalent to the f(zξ) that appears in Equation (12) of Section 2.

The semianalytic calculation relies on two main simplifying assumptions: the detection threshold ρth, and the choice of PSD for characterizing the detector noise. When fitting the mass distribution to the first four BBH events in Abbott et al. (2017e), we assumed that the PSD in each LIGO interferometer could be approximated by the Early High Sensitivity curve in Abbott et al. (2018b) during O1 and the first few months of O2, and we fixed ρth = 8. We refer to the sensitivity estimate under these assumptions as the raw semianalytic calculation. In reality, the detector PSD fluctuates throughout the observing period. Additionally, the fixed detection threshold on S/N does not directly account for the empirical distributions of astrophysical and noise triggers and does not have a direct correspondence with the detection statistic used by the GW searches to rank significance of triggers (Messick et al. 2017; Nitz et al. 2017; Abbott et al. 2018a). Consequently, the sensitive spacetime volume of a population estimated using an S/N threshold may differ from the one obtained using injections, which has a threshold on the pipeline-dependent detection statistic.

We therefore pursue two modifications to the raw semianalytic calculation in order to reduce the bias in our sensitivity estimates and the resulting population estimates. We emphasize that these modifications do not noticeably affect the inferred shape of the population, e.g., the mass power-law slope, but do lead to different rate estimates, reflecting a systematic uncertainty in the inferred merger rate and its evolution with redshift that, given the small number of events and uncertainty in the phenomenological population models, remains subdominant to the statistical uncertainty. This is explicitly shown in the remainder of this section.

In the first modification, which we employ throughout the mass distribution analysis (Section 3), we calibrate the raw semianalytic method to the injection campaign in Abbott et al. (2018a). The calibration takes the form of mass-dependent calibration factors, calculated by least-squares regression as described below; see Wysocki & O'Shaughnessy (2018) for relevant data products. Specifically, we use injections to evaluate ${\left\langle {VT}\right\rangle }_{i}\equiv \int d\xi p(\xi | {\theta }_{i}){{VT}}_{\mathrm{true}}(\xi )$ for a set of reference hyperparameters θi (here, mass distribution models with different exponents α and maximum masses ${m}_{\max }$), where ξ denotes all binary parameters. To calculate $\left\langle {\text{}}{VT}\right\rangle $i from injections into the PyCBC detection pipeline, we consider injections to be "detected" if they have a detection statistic ϱ ≥ 8, where ϱ is the statistic used in the PyCBC analysis of O2 data (Nitz et al. 2017; Abbott et al. 2018a). This is comparable to the detection statistic ϱ = 8.7 of the lowest-significance GW event included in our analysis, GW170729. Note that, as discussed in Section 4, because we adopt a fixed detection threshold, our analysis differs from the rate analysis in Abbott et al. (2018a), which does not fix a detection threshold, instead assigning to each trigger a probability of astrophysical origin (Farr et al. 2015b). Once we have computed $\left\langle {\text{}}{VT}\right\rangle $i, we correct the raw semianalytic model VTraw described above by a factor f(ξ), which is a low-order polynomial in ξ: $f(\xi )={\sum }_{\alpha }{\lambda }_{\alpha }{F}_{\alpha }(\xi )$, with Fα the relevant basis polynomials. We minimize the mean-square difference between $\left\langle {\text{}}{VT}\right\rangle $ as computed by injections and $\int d\xi f(\xi )\langle {VT}{\rangle }_{\mathrm{raw}}(\xi )p(\xi | \theta )$. If H is the precomputed matrix of weight "moments" ${H}_{k,\alpha }=\int d\xi p(\xi | {\theta }_{k}){VT}(\xi ){F}_{\alpha }(\xi )$, then the coefficients of this least-squares expression can be computed analytically as $\lambda ={({H}^{T}\gamma H)}^{-1}{H}^{T}\gamma \left\langle {VT}\right\rangle $, where γ is a diagonal inverse covariance matrix characterizing the Monte Carlo integration errors of each individual $\left\langle {\text{}}{VT}\right\rangle $i. This procedure yields the mass-calibrated sensitive volume $\langle {VT}\rangle $cal.

The top panel of Figure 11 shows the comparison between the raw semianalytic $\langle {VT}\rangle $, the calibrated $\langle {VT}\rangle $, and the injection $\langle {VT}\rangle $ across the 2D hyperparameter space of Model A for the mass distribution. We have repeated our mass distribution analysis with different choices of the $\langle {VT}\rangle $ calibration and found that the effect on the shape of the mass distribution and the overall merger rate ${ \mathcal R }$ are much smaller than the differences between Models A, B, and C and the statistical errors associated with a small sample of 10 events.

Figure 11.

Figure 11. Ratio between the raw semianalytic computation of $\langle {VT}\rangle $ and the $\langle {VT}\rangle $ computed by injections into the PyCBC search pipeline (top left panel), and the same ratio for the mass-calibrated $\langle {VT}\rangle $ (top right panel), for different mass distributions described by the two-parameter Model A. The bottom panel shows the same ratios, but this time comparing the $\langle {VT}\rangle $ derived with the time-varying PSDs applied to the semianalytic calculation VTpsd to the raw and the calibrated $\langle {VT}\rangle $. The $\langle {VT}\rangle $ for the injections is calculated for a threshold of ϱ = 8.0, where ϱ is the signal-to-noise ratio model statistic used in the PyCBC analysis of O2 data. This threshold roughly matches the detection statistic of the lowest-significance detection, GW170729, which has ϱ = 8.7 in PyCBC. We use the mass-calibrated VTcal for the parametric mass and spin distribution analyses in Sections 3 and 5 in order to better match the injection results, while the redshift evolution analysis uses VTpsd in order to carefully track the sensitivity at high redshift. The discrepancy between the methods may be due to the limited number of high-redshift injections. However, the difference between all three methods is relatively constant as a function of the mass population, particularly where posterior support for the mass distribution hyperparameters is high, indicating that systematic uncertainties in the $\langle {VT}\rangle $ estimation do not have a large impact on our results.

Standard image High-resolution image

As shown in Figure 11, the main effect of this calibration is to decrease $\langle {VT}\rangle $ by a factor of ∼1.6. Over the relevant part of parameter space (i.e., the regions of the α${m}_{\max }$ plane that have likelihood support), this factor remains fairly constant, implying that the inferred shape of the mass distribution is not affected by applying the $\langle {VT}\rangle $ calibration, although the overall rate is increased by about a factor of ∼1.6 compared to the raw semianalytic calculation. We have verified this explicitly by repeating the analysis with and without calibrated $\langle {VT}\rangle $.

For the redshift evolution analysis (Section 4), it is not sufficient to calibrate the mass dependence of the detection probability; we must verify that the semianalytic calculation reproduces the proper redshift dependence. Therefore, we pursue an alternative modification to the raw semianalytic calculation. In this modification, we replace the single PSD of the raw semianalytic calculation with a different PSD calculated for the Livingston detector for each 5-day chunk of observing time in O1 and O2. We find that this assumption correctly reproduces the redshift-dependent sensitivity empirically determined by the injection campaigns into the GstLAL pipeline for two fixed mass distributions (see Figure 12), whereas adopting different assumptions, such as using the PSDs calculated for the Hanford detector instead of the Livingston detector, or changing the single-detector S/N threshold away from 8, yields curves in Figure 12 that deviate noticeably from the distribution of recovered injections. This modification to the sensitivity calculation is necessary in the redshift analysis because the detection probability can fluctuate significantly at high redshifts z > 0.5, where there is a very small probability of detection but considerable physical volume. Due to computational cost, the number of detections available at high redshift is insufficient to directly calibrate the redshift-dependent detection probability to injections as we did in the mass distribution section.

Figure 12.

Figure 12. Redshift distribution of injections recovered with a false-alarm rate (FAR) less than 0.1 yr−1 by the search pipeline GstLAL for the two fixed-parameter injection sets, power law (red) and flat-in-log (green), compared to the expectation from the semianalytic calculation used for the redshift evolution analysis, as described in the text. The underlying redshift distributions of the injected populations are assumed to follow a uniform in comoving volume and source-frame time distribution. The FAR threshold of 0.1 yr−1 nearly matches the FAR of the lowest-significance GW event, GW170729, with an FAR of 0.18 yr−1 in the GstLAL pipeline. The semianalytic calculation closely predicts the redshift distribution of the found injections.

Standard image High-resolution image

We find that between the two methods we use to estimate detection efficiency, the effect on the inferred mass distribution is negligible. However, the second time-varying approach employed in the redshift analysis underpredicts the overall merger rate by ∼70% compared to the first calibrated approach (see the bottom right panel of Figure 11). This reflects a systematic uncertainty in the high-redshift detection efficiency and the implied merger rate. When additional detections lead to improved statistical constraints on the merger rate across redshift, it will become increasingly necessary to place a very large number of injections at high redshift and closely spaced in time in order to accurately estimate the high-redshift sensitivity.

Another difference between the mass distribution analysis presented in Section 3 and the redshift evolution analysis of Section 4 is in the treatment of BBH spins. Section 3 marginalizes over the spin distribution and includes first-order spin effects in the calculation of $\langle {VT}\rangle $, while the redshift analysis of Section 4 does not. From Table 7, we find that including first-order spin effects in the calculation of Pdet and the corresponding sensitive spacetime volume $\langle {VT}\rangle $ results in mostly indistinguishable population estimates compared to neglecting spin entirely. Similarly, fixing the spin distribution does not appreciably affect the inferred mass distribution. Therefore, for simplicity we neglect the effect of spin distribution in the redshift evolution analysis. Meanwhile, the effects of spin on the sensitive volume $\langle {VT}\rangle $ do have a moderate influence on inferences about the spin tilt angles, presented in Section 5.1. When considering the effects of spin with $\langle {VT}\rangle $, there is about a 10% shift in the median spin tilt angle parameters inferred, but this is well within the much wider credible interval. Therefore, such effects do not change our overall astrophysical conclusions, and their influence on the results shown is comparable to what would result from different priors on the population parameters (e.g., choosing a different prior range of σi as compared to Table 6).

We also note that all our calculations of the detection efficiency are based on the IMRPhenomPv2 waveform. Differences between the phasing and, more importantly, the amplitude of the waveform can lead to different S/Ns and detection statistics for the same sets of physical parameters. To bound the significance of this effect, we carry out the injection-based $\langle {VT}\rangle $ estimation for both the IMRPhenomPv2 and SEOBNRv2 waveforms and find that for populations described by the two-parameter mass Model A, the waveforms produce $\langle {VT}\rangle $ estimates consistent to 10% across the relevant region of hyperparameter space with high posterior probability. Therefore, compared to the statistical uncertainties, the choice in waveform does not contribute a significant systematic uncertainty for the $\langle {VT}\rangle $ estimation.

Finally, an additional systematic uncertainty we have neglected in the $\langle {VT}\rangle $ and parametric rate calculations is the calibration uncertainty. While the event posterior samples have incorporated a marginalization over uncertainties on the calibration (Farr et al. 2015a) for both strain amplitude and phase, the $\langle {VT}\rangle $ estimation here does not. The amplitude calibration uncertainty results in an 18% volume uncertainty (Abbott et al. 2018a), which is currently below the level of statistical uncertainty in our population-averaged merger rate estimate.

Appendix B: Alternative Spin Models

We perform here a number of complementary analyses to reinforce the robustness of the results in Section 5 and gauge the effect of fixed-parameter choices on spin inferences. Instead of a parameterized model such as those used in Section 5, we focus on a few discrete choices of model parameters to reinforce the conclusions in that section. These choices provide a complementary view to the results presented earlier and also display our current ability (or inability) to measure features in differing parts of the mass and spin parameter space.

B.1. Model Selection

We choose a set of specific realizations of the general model described in Section 2.2, building on Farr et al. (2017) and Tiwari et al. (2018). Four discrete spin magnitude models are considered, the first three being special cases of Equation (4):

  • 1.  
    Low (L): p(a) = 2 (1 − a), i.e., αa = 1, βa = 2.
  • 2.  
    Flat (F): p(a) = 1, i.e., αa = 1, βa = 1.
  • 3.  
    High (H): p(a) = 2a, i.e., αa = 2, βa = 1.
  • 4.  
    Very low (V): p(a) ∝ e−(a/0.2).

Such magnitude distributions are chosen as simple representations of low, moderate, and highly spinning individual BHs. The very low (V) population is added to capture the features of an even lower spinning population—this is motivated by the features at low spin of the parametric distribution displayed in Figure 8.

For spin orientations we consider three fixed models representing extreme cases of Equation (6):

  • 1.  
    Isotropic (I): p(cos ti) = 1/2; −1 < cos ti < 1, i.e., ζ = 0.
  • 2.  
    Aligned (A): p(cos ti) = δ(cos ti − 1), i.e., ζ = 1, σi = 0.
  • 3.  
    Restricted (R): p(cos ti) = 1; 0 < cos ti < 1. This is the same as I, except the spins are restricted to point above the orbital plane.

The isotropic distribution is motivated by dynamical or similarly disordered assembly scenarios, while the aligned one better captures a population of isolated binaries, under the simplifying assumption that the stars remain perfectly aligned throughout their evolution. In order to assess any preferences in the data for binaries with χeff > 0, we introduce the restricted model: it resembles the isotropic distribution but limits tilt angles to be positive. While we have mathematically defined the R model by assuming tilted spins, the same χeff distribution can be generated with nonprecessing spins.

Here we perform our inference entirely through ${\chi }_{\mathrm{eff}}$, whose 12 different distributions are illustrated in Figure 13. Since we do not have conclusive results on βq from Figure 3, we cannot make a single simplifying assumption on the mass model, which the ${\chi }_{\mathrm{eff}}$ distribution depends on. We therefore consider three limiting cases: two of these fix the mass ratio to fiducial values, q = 1 and q = 0.5. The third corresponds to a fixed-parameter model with α = 1, ${m}_{\min }$ = 5, ${m}_{\max }$ = 50. Figure 13 illustrates the ${\chi }_{\mathrm{eff}}$ distributions implied by each of these scenarios.

Figure 13.

Figure 13. Top row: p(χeff) under various model assumptions. Labels in each subpanel legend correspond to the tilt and magnitude models defined in Appendix B.1. Isotropic models (left) provide support for both negative and positive χeff. Aligned models (middle) assume perfect alignment for each of the four magnitude distributions. Restricted models (right) have the same shape as the Isotropic ones, with support over χeff > 0 only. However, they can be generated with nonprecessing spins. Bottom row: posterior on the mixture fraction ζ between isotropic and aligned distributions. ζ = 0 corresponds to a completely isotropic distribution.

Standard image High-resolution image

Following Farr et al. (2017) and Tiwari et al. (2018), we calculate the evidence and compute the Bayes factors for each of the zero-dimensional spin models. Results are provided in Table 8, with the low and isotropic distribution (LI) as the reference.

Table 8.  Natural Log Bayes Factors for Various Spin Distributions

q = 1 Very low Low Flat High
Isotropic 1.29 0.0 −1.04 −2.25
Restricted 3.5 3.22 1.06 −0.2
Aligned 1.39 −4.57 −13.62 −33.13
q = 0.5 Very low Low Flat High
Isotropic 1.32 0.0 −1.12 −2.6
Restricted 3.55 3.23 1.0 −0.58
Aligned 1.52 −4.15 −12.86 −31.6
Fixed param. Very low Low Flat High
Isotropic 0.64 0.0 −2.0 −3.85
Restricted 1.83 0.8 −2.3 −5.0
Aligned −2.69 −11.98 −21.98 −44.6

Note. The orientation models are described in Section 2. We find modest evidence for small spins. When spins are small, we cannot make strong statements about the distribution of spin orientations.

Download table as:  ASCIITypeset image

Because of degeneracies in the GW waveform between mass ratio and ${\chi }_{\mathrm{eff}}$, the choice of mass distribution impacts inferences about spins. This effect explains the significant difference in Bayes factors for the third row in the table. We find again that our result moderately favors small BH spins. The restricted models with χeff strictly positive consistently produce the highest Bayes factors. For the small-spin magnitude models we cannot make strong statements about the distribution of spin orientations. Models containing highly spinning components are significantly disfavored, with high or flat aligned spins particularly selected against (e.g., FA and HA are disfavored with Bayes factors ranging in $\left[{10}^{-11},{10}^{-6}\right]$ and $\left[{10}^{-21},{10}^{-13}\right]$, respectively). As a bracket for our uncertainty on the mass and mass ratio distribution, we evaluated the Bayes factors for the fixed-parameter model α = 2.3, ${m}_{\min }$ = 5, ${m}_{\max }$ = 50. They differ from the third mass model in Table 8 by a factor comparable to unity.

B.2. Spin Mixture Models

The models considered for model selection in Table 8 all assume a fixed set of spin magnitudes and tilts. There is no reason to believe, however, that the universe produces from only one of these distributions. A natural extension is to allow for a mixing fraction describing the relative abundances of perfectly aligned and isotropically distributed BH spins.

We assume that the aligned and isotropic components follow the same spin magnitude distribution. It is possible that BHs with a different distribution of spin orientations would have a different distribution of spin magnitudes, but given our weaker constraints on spin magnitudes, we focus on spin tilts sharing the same magnitude distribution.

We compute the posterior on the fraction of aligned binaries ζ in the population as per Equation (6) in the limit (${\sigma }_{i}\to 0$). The models here are subsets of the Mixture distribution, with purely isotropic being ζ = 0 and completely aligned being ζ = 1. The prior on the mixing fraction is flat.

All of the models that contain a completely aligned component favor isotropy over alignment. This ability to distinguish a mixing fraction diminishes with smaller spin magnitudes. This is because such spin magnitudes yield populations that are not distinguishable to within measurement uncertainty of ${\chi }_{\mathrm{eff}}$. We do not include the most-favored restricted (R) configuration, but we expect that the results would be similar. Coupled with the model selection results in the previous section, this implies that the mixing fraction is not well determined when fixed to the models (low and very low) that are favored by the data (see Figure 13). As stated above, in this case our ability to measure the mixing fraction is negligible.

B.3. Three-bin Analysis of χeff

We illustrate here how ${\chi }_{\mathrm{eff}}$ measurements can provide insights into discerning spin orientation distributions. Following Farr et al. (2018), we split the range of ${\chi }_{\mathrm{eff}}$ into three bins. One encompasses the fraction of uninformative binaries with ${\chi }_{\mathrm{eff}}$ consistent with zero ($| {\chi }_{\mathrm{eff}}| \leqslant 0.05$); the vertical axis of Figure 14 shows the fraction of binaries lying outside of this bin. The other two capture significantly positive (${\chi }_{\mathrm{eff}}$ > 0.05) and significantly negative (${\chi }_{\mathrm{eff}}$ < −0.05) binaries. The width 0.05 is chosen to be of the order of the uncertainty in a typical event posterior.

Figure 14.

Figure 14. Posterior distribution for the fraction of informative binaries (i.e., $| {\chi }_{\mathrm{eff}}| \gt 0.05$), and the fraction of those informative binaries with positive ${\chi }_{\mathrm{eff}}$ (i.e., ${\chi }_{\mathrm{eff}}$ > 0.05).

Standard image High-resolution image

The aligned spin scenario is preferred in the posterior support on the right half of Figure 14: the small fraction of binaries that are informative tend to possess ${\chi }_{\mathrm{eff}}$ greater than zero. Conversely, if the spins are isotropic, there would be no preference for positive or negative ${\chi }_{\mathrm{eff}}$, and the posterior in Figure 14 would peak toward the middle. However, of the 10 observed binaries, 8 are consistent with zero ${\chi }_{\mathrm{eff}}$ and only 2 are informative, thus demonstrating that our ability to distinguish between the two scenarios is weak.

Appendix C: Importance Resampling the Single-event Likelihood

Our hierarchical population analysis uses the individual-event likelihood for each event n = 1, ..., N, ${ \mathcal L }\left({d}_{n}| \xi ,z\right)$ (see Section 2, Equation (11)). Individual-event analyses report posterior samples drawn from a density that is proportional to this likelihood times a prior (Veitch et al. 2015; Abbott et al. 2018a). The prior density used is uniform in detector-frame masses and proportional to the square of the luminosity distance (Veitch et al. 2015); in terms of the source-frame masses and redshift, the prior is

Equation (18)

The derivative of the luminosity distance in a spatially flat universe (Hogg 1999) is

Equation (19)

where dH = c/H0 is the Hubble distance and

Equation (20)

in a ΛCDM universe.

Given a set of posterior samples as described above, we can transform them to samples from the likelihood over source-frame masses and redshift by importance resampling with weights that are the inverse prior

Equation (21)

The integral in Equation (11) may then be approximated as

Equation (22)

where the sum is taken over all posterior samples, and we assume that the population distribution is expressed in terms of source-frame masses (i.e., $\xi =\left({m}_{1}^{\mathrm{source}},{m}_{2}^{\mathrm{source}},\ \ldots \right)$). Alternately, we can construct a random resampling of the set of existing posterior samples, with sample i appearing in the resampling with probability proportional to $w\left({m}_{1,i}^{\mathrm{source}},{m}_{2,i}^{\mathrm{source}},{z}_{i}\right);$ the integral is then proportional to the average value of $d$N/$d$ξ$d$z over the resampled set. The (unknown) constant of proportionality is related to the Bayesian evidence for event n; as long as a consistent method (weighted sum or resampling) is used to compare different population models, this constant is irrelevant to computing Bayes factors between models.

Footnotes

  • 184 

    The data release for this work can be found at https://dcc.ligo.org/LIGO-P1800324/public.

  • 185 

    While this assumption is embedded (Farr et al. 2015b) in the selection of events used in this work, studies of event count per time do not show significant evidence for deviations from Poissonian statistics (Abbott et al. 2018a).

  • 186 

    The flat-in-log population (Equation (16)) cannot be parameterized by the mass Models A, B, and C used in this work because the mass ratio distribution takes a different form. However, it is very close to Model A with α = 1.

Please wait… references are loading.
10.3847/2041-8213/ab3800