Abstract

We use microwave observations from the South Pole Telescope (SPT) to examine the Sunyaev–Zel'dovich effect (SZE) signatures of a sample of 46 X-ray selected groups and clusters drawn from ∼6 deg² of the XMM–Newton Blanco Cosmology Survey. These systems extend to redshift z = 1.02 and probe the SZE signal to the lowest X-ray luminosities (≥10⁴² erg s⁻¹) yet; these sample characteristics make this analysis complementary to previous studies. We develop an analysis tool, using X-ray luminosity as a mass proxy, to extract selection-bias-corrected constraints on the SZE significance–mass and Y500–mass relations. The former is in good agreement with an extrapolation of the relation obtained from high-mass clusters. However, the latter, at low masses, while in good agreement with the extrapolation from the high-mass SPT clusters, is in tension at 2.8σ with the Planck constraints, indicating the low-mass systems exhibit lower SZE signatures in the SPT data. We also present an analysis of potential sources of contamination. For the radio galaxy point source population, we find 18 of our systems have 843 MHz Sydney University Molonglo Sky Survey sources within 2 arcmin of the X-ray centre, and three of these are also detected at significance >4 by SPT. Of these three, two are associated with the group brightest cluster galaxies, and the third is likely an unassociated quasar candidate. We examine the impact of these point sources on our SZE scaling relation analyses and find no evidence of biases. We also examine the impact of dusty galaxies using constraints from the 220 GHz data. The stacked sample provides 2.8σ significant evidence of dusty galaxy flux, which would correspond to an average underestimate of the SPT Y500 signal that is (17 ± 9) per cent in this sample of low-mass systems. Finally, we explore the impact of future data from SPTpol and XMM-XXL, showing that it will lead to a factor of 4 to 5 tighter constraints on these SZE mass–observable relations.

1 INTRODUCTION

The Sunyaev–Zel'dovich effect (SZE; Sunyaev & Zel'dovich 1970, 1972) is a spectral distortion of the cosmic microwave background (CMB) arising from interactions between CMB photons and hot, ionized gas. Surveys of galaxy clusters using the SZE have opened a new window on the Universe by providing samples of hundreds of massive galaxy clusters with well-understood selection over a broad redshift range. Both space- and ground-based instruments, including the Planck satellite (Tauber et al. 2010), the South Pole Telescope (SPT; Carlstrom et al. 2011), and the Atacama Cosmology Telescope (Fowler et al. 2007), have released catalogues of their SZE selected clusters. The cluster samples have provided new cosmological constraints (Hasselfield et al. 2013; Planck Collaboration XX 2014b; Reichardt et al. 2013) and have enabled important evolution studies of cluster galaxies and the intracluster medium over a broad range of redshift (e.g. Zenteno et al. 2011; Semler et al. 2012; McDonald et al. 2013).

Understanding the relationship between the SZE observable and cluster mass is important for both cosmological applications and astrophysical studies. Among observables, the integrated Comptonization from the SZE has been shown by numerical simulations (Motl et al. 2005; Nagai, Kravtsov & Vikhlinin 2007) to be a good mass proxy with low intrinsic scatter. Cluster mass estimates derived from X-ray observations of SZE selected clusters have largely confirmed this expectation (Andersson et al. 2011; Planck Collaboration XI 2011b). A related quantity, the SPT signal-to-noise ratio ξ, is linked to the underlying virial mass of the cluster by a power law with lognormal scatter at the ∼20 per cent level (Benson et al. 2013, hereafter B13).

Probing the SZE signature of low-mass clusters and groups is also important, although it is much more challenging with the current generation of experiments. These low-mass clusters and groups are far more numerous and are presumably important environments for the transformation of galaxies from the field to the cluster. Studies of their baryonic content show that low-mass clusters and groups are not simply scaled-down versions of the more massive clusters (e.g. Mohr, Mathiesen & Evrard 1999; Sun et al. 2009; Laganá et al. 2013). This breaking of self-similarity in moving from the cluster to the group mass scale is likely due to processes such as star formation and active galactic nucleus (AGN) feedback.

The Planck team has recently studied this low-mass population by stacking the Planck maps around samples of X-ray selected clusters in the nearby universe (Planck Collaboration X 2011a, hereafter P11). They show that the SZE signal is consistent with the self-similar scaling relation based on the X-ray luminosity over a mass range spanning 1.4 orders of magnitude.

Here we pursue a study of the SZE signatures of low-mass clusters extending over a broad range of redshift. We use the South Pole Telescope Sunyaev–Zel'dovich survey (SPT-SZ) data with the XMM–Newton Blanco Cosmology Survey (XMM-BCS) over 6 deg², from which a sample of 46 X-ray groups and clusters has been selected (Šuhada et al. 2012, hereafter S12). The SPT-SZ data enable us to extract cluster SZE signal with high angular resolution and low instrument noise, making the most of this small sample.

The paper is organized as follows. In Section 2, we describe the data used from the XMM-BCS and the extraction of the SZE signature from the SPT-SZ maps. In Section 3, we introduce the calibration method for the mass–observable scaling relation, and we apply it to the cluster sample in Section 4. We also discuss possible systematic effects and present a discussion of the point source population associated with our sample. We conclude in Section 5 with a prediction of the improvement based on future surveys.

The cosmological model parameters adopted in this paper are the same as the ones used for the X-ray measurement from the XMM-BCS project (S12): (ΩM, ΩΛ, H0) = (0.3, 0.7, 70 km s⁻¹ Mpc⁻¹). The amplitude of the matter power spectrum, which is needed to estimate bias corrections in the analysis, is fixed to σ8 = 0.8.

2 DATA DESCRIPTION AND OBSERVABLES

In this analysis, we adopt an X-ray selected sample of clusters, described in Section 2.1, together with published LX–mass scaling relations to examine the corresponding SPT-SZ significance and Y500 mass relations. The SPT-SZ observable ξ is measured by a matched filter approach, which we discuss in Sections 2.2 and 2.3. The estimation of Y500 is described in Section 2.4.

2.1 X-ray catalogue

The XMM-BCS project consists of an X-ray survey mapping a 14 deg² area of the Southern hemisphere sky that overlaps the griz bands Blanco Cosmology Survey (BCS; Desai et al. 2012) and the millimetre-wavelength SPT-SZ survey (Carlstrom et al. 2011). S12 analyse the initial 6 deg² core area, construct a catalogue of 46 galaxy clusters and present a simple selection function. Here we present a brief summary of the characteristics of that sample. The cluster physical parameters from table 2 of S12 are repeated in Table 2 with the same IDs.

The initial cluster sample was selected via a source detection pipeline in the 0.5–2 keV band. Because clusters are spatially extended, more counts are needed to reach a given detection threshold than for point sources. S12 modelled the extended source sensitivity as an offset from the point source limit; the cluster sample is approximately flux limited, with fmin = 1 × 10⁻¹⁴ erg s⁻¹ cm⁻².

The X-ray luminosity LX was measured in the detection band (0.5–2.0 keV) within a radius of R500c, which is iteratively determined using mass estimates from the LX–mass relation and is defined such that the interior density is 500 times the critical density of the Universe at the corresponding redshift. This luminosity was converted to a bolometric luminosity and to a 0.1–2.4 keV band luminosity using the characteristic temperature for a cluster with this 0.5–2.0 keV luminosity and redshift (see equation 3 in S12). The core radius, Rc, of the β model is calculated using (see equation 1 in S12)
\begin{equation} R_{\rm c} = 0.07{\times} R_{500} \Big (\frac{T}{{\rm 1\ keV}}\Big )^{0.63}, \end{equation}
(1)
where T is the X-ray temperature determined through the LX–T relation. The redshifts of the sample are primarily photometric redshifts extracted using the BCS optical imaging data. The optical data and their processing and calibration are described in detail elsewhere (Desai et al. 2012). The photometric redshift estimator has been demonstrated on clusters with spectroscopic redshifts and on simulations (Song et al. 2012a) and has been used for redshift estimation within the SPT-SZ collaboration (Song et al. 2012b). The typical photometric redshift uncertainty in this XMM-BCS sample is 〈Δz/(1 + z)〉 = 0.023, which is determined using a subsample of 12 clusters (z < 0.4) with spectroscopic redshifts. This value is consistent with the uncertainty 〈Δz/(1 + z)〉 = 0.017 we obtained on the more massive main sample SPT-SZ clusters.

The X-ray luminosities and photometric redshifts of the sample are shown in Fig. 1 in black squares and the approximate flux limit of the sample is shown as a red curve. For comparison, we also include a high-mass SPT-SZ cluster sample (blue triangles) with published X-ray properties (Andersson et al. 2011).

Figure 1.

The luminosity–redshift distribution of the XMM-BCS clusters from S12 (black dots) and the SPT-SZ clusters from Andersson et al. (2011, blue triangles). The X-ray sample is selected with a flux cut that varies somewhat across the field. The red line is the corresponding luminosity sensitivity determined by the median flux limit in the 0.5–2.0 keV band. The SPT-SZ sample is more massive and approximately mass limited.

In the analysis that follows we use the X-ray luminosity as the primary mass estimator for each cluster. We adopt the LX–mass scaling relation used in S12, which is based on the hydrostatic mass measurements in an ensemble of 31 nearby clusters observed with XMM–Newton (REXCESS; Pratt et al. 2009):
\begin{equation} L_{{\rm X}}=L_{0}\Big (\frac{M_{\rm 500c}}{2{\times} 10^{14}\,\mathrm{M_{{\odot }}}} \Big )^{\alpha _{\rm LM}}E(z)^{7/3}, \end{equation}
(2)
where H(z) = H0E(z). The intrinsic scatter in LX at fixed mass is modelled as a lognormal distribution with width |$\sigma _{\ln L_{{\rm X}}}$|, and the observational scatter is given in S12.
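
As a minimal sketch of how equation (2) can be evaluated and inverted, the Python snippet below uses the 0.5–2.0 keV parameters listed in Table 1 and the flat cosmology adopted in this paper. The function names are illustrative and are not taken from S12, and the naive inversion deliberately ignores the intrinsic scatter and the bias corrections, so it should not be read as the mass estimator used in the analysis.

```python
import math

def e_of_z(z, omega_m=0.3, omega_lambda=0.7):
    """Dimensionless Hubble parameter E(z) = H(z)/H0 for flat LambdaCDM."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + omega_lambda)

def lx_from_mass(m500c, z, l0=0.48e44, alpha_lm=1.83):
    """Equation (2): L_X in erg/s for M500c in solar masses (0.5-2.0 keV band)."""
    return l0 * (m500c / 2.0e14) ** alpha_lm * e_of_z(z) ** (7.0 / 3.0)

def mass_from_lx(lx, z, l0=0.48e44, alpha_lm=1.83):
    """Naive inversion of equation (2); no scatter or selection-bias correction."""
    return 2.0e14 * (lx / (l0 * e_of_z(z) ** (7.0 / 3.0))) ** (1.0 / alpha_lm)
```

For example, `mass_from_lx(1.0e43, z=0.5)` returns the mass at which the mean relation gives a luminosity of 10⁴³ erg s⁻¹ at z = 0.5.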

This scaling relation includes corrections for Malmquist and Eddington biases. Both biases are affected by the intrinsic scatter and the skewness of the underlying sample distribution. In general, the bias on the true mass is |$\Delta \ln M\ \propto \ \gamma \sigma _{\ln M}^2$|, where γ is the logarithmic slope of the mass distribution, dn(M)/dln M ∝ |$M^{\gamma }$|, and σln M is the scatter in mass at fixed observable (for more discussion, we refer the reader to Stanek et al. 2006; Vikhlinin et al. 2009; Mortonson, Hu & Huterer 2011). Typically γ is negative, and the result is that mass inferred from an observable must be corrected to a lower value than that suggested by naive application of the scaling relation.
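
The scaling of this bias can be checked numerically. For the idealized case of a pure power-law mass function and Gaussian scatter in ln M, the posterior mean of ln M is offset from the naive estimate by γσ², so the proportionality becomes an equality. The sketch below evaluates that offset on a grid; the values of γ and σ in the usage note are chosen purely for illustration and are not fitted quantities from this work.

```python
import math

def posterior_mean_shift(gamma, sigma_ln_m, half_width=8.0, n=4001):
    """Mean of (ln M - ln M_obs) for a prior dn/dlnM ~ M^gamma and
    Gaussian scatter of width sigma_ln_m in ln M at fixed observable."""
    num = 0.0
    den = 0.0
    for i in range(n):
        # x = ln M - ln M_obs, scanned over +/- half_width standard deviations
        x = (-half_width + 2.0 * half_width * i / (n - 1)) * sigma_ln_m
        weight = math.exp(gamma * x - 0.5 * (x / sigma_ln_m) ** 2)
        num += x * weight
        den += weight
    return num / den
```

With the illustrative choice γ = −2 and σ = 0.3, `posterior_mean_shift(-2.0, 0.3)` reproduces γσ² = −0.18, that is, a downward correction of the inferred mass.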

The scaling relation parameters for different X-ray bands are listed in Table 1. We find the choice of luminosity bands has negligible impact on the parameter estimation given the current constraint precision. In addition, we investigate using the LX–mass scaling relations from Chandra observations (Vikhlinin et al. 2009; Mantz et al. 2010b). These studies draw upon higher mass cluster samples than the REXCESS sample, and therefore we adopt the Pratt et al. (2009) relation for our primary analysis. We discuss the impact of changing the LX–mass scaling relation in Section 4.3.

Table 1.

LX–mass relations with different luminosity bands (equation 2).

Type           L0 (10⁴⁴ erg s⁻¹)   αLM            |$\sigma _{\ln L_{{\rm X}}}$|
0.5–2.0 keV    0.48 ± 0.04         1.83 ± 0.14    0.412 ± 0.071
0.1–2.4 keV    0.78 ± 0.07         1.83 ± 0.14    0.414 ± 0.071
Bolometric     1.38 ± 0.12         2.08 ± 0.13    0.383 ± 0.061

2.2 SPT observations

The SPT (Carlstrom et al. 2011) is a 10-m diameter, millimetre-wavelength, wide-field telescope that was deployed in 2007 and has been used since then to make arcminute-resolution observations of the CMB over large areas of the sky. The high angular resolution is crucial to detecting the SZE signal from high-redshift clusters. The SPT-SZ survey (e.g. Story et al. 2013), completed in 2011, covers a 2500 deg² region of contiguous sky area in three bands – centred at 95, 150, and 220 GHz – at a typical noise level of < 18 μK-arcmin in the 150 GHz band.

The details of the SPT-SZ observation strategy, data processing and map making are documented in Schaffer et al. (2011); we briefly summarize them here. The SPT-SZ survey data were taken primarily in a raster pattern with azimuth scans at discrete elevation steps. A high-pass filter was applied to the time-ordered data to remove low-frequency atmospheric and instrumental noise. The beams, or angular response functions, were measured using observations of planets and bright AGNs in the field. The main lobe of the beam for a field observation is well approximated as a Gaussian with a full width at half-maximum (FWHM) of 1.6, 1.2, and 1.0 arcmin at 95, 150, and 220 GHz, respectively. The final temperature map was calibrated by the Galactic H ii regions RCW38 and MAT5a (cf. Vanderlinde et al. 2010). The SPT-SZ maps used in this work are from a 100 deg² field centred at (α, δ) = (23h30m, −55°) and consist of observations from the 2008 and 2010 SPT-SZ observing seasons. The characteristic depths are 37, 12 and 35 μK-arcmin at 95, 150 and 220 GHz, respectively.

2.3 SPT-SZ cluster significance

The process of determining the SPT-SZ significance for our X-ray sample is very similar to the process of finding clusters in SPT-SZ maps, but there are certain key differences, which we highlight below. Clusters of galaxies are extracted from SPT-SZ maps through their distinct angular scale- and frequency-dependent imprint on the CMB. We adopt the multifrequency matched filter approach (Melin, Bartlett & Delabrouille 2006) to extract the cluster signal. The matched filter is designed to maximize the signal-to-noise ratio for the given signal profile while suppressing all noise sources. A detailed description appears elsewhere (Vanderlinde et al. 2010; Williamson et al. 2011); here we provide a summary. The SZE introduces a spectral distortion of the CMB at a given frequency ν as
\begin{equation} \Delta T_{\rm CMB}(\boldsymbol {\theta },\nu ) = y(\boldsymbol {\theta }) g(\nu ) T_{\rm CMB}, \end{equation}
(3)
where g(ν) is the frequency dependency and the Compton-y parameter y(θ) is the SZE signature at direction θ, which is linearly related to the integrated pressure along the line of sight. To model the SZE signal y(θ), two common templates are adopted: the circular β model (Cavaliere & Fusco-Femiano 1976) and the Arnaud profile (Arnaud et al. 2010). The cluster profiles are convolved with the SPT beams to get the expected signal profiles. The map noise assumed in constructing the filter includes the measured instrumental and atmospheric noise and sources of astrophysical noise, including the primary CMB. Point sources are identified in a similar manner within each band independently, using only the instrument beams as the source profile (Vieira et al. 2010).

Once SPT-SZ maps have been convolved with the multifrequency matched filter, clusters are extracted with a simple peak-finding algorithm, with the primary observable ξ defined as the maximum signal-to-noise ratio of a given peak across a range of filter scales. The SPT-SZ significance ξ is a biased estimator that links to the underlying ζ as |$\langle \xi \rangle = \sqrt{\zeta ^2+3}$|⁠, because it is the maximum value identified through a search in sky position and filter angular scale (Vanderlinde et al. 2010). The observational scatter of ξ around ζ is a unit-width Gaussian distribution corresponding to the underlying rms noise of the SPT-SZ filtered maps.

In this work, we use the 95 and 150 GHz maps and employ the method described above to define an SPT-SZ significance for each X-ray selected cluster, but with two important differences: (1) we measure the SPT-SZ significance at the X-ray location and (2) we use a cluster profile shape informed from the X-ray data. We define this SPT-SZ significance as ξX, which is related to the unbiased SPT-SZ significance ζ as
\begin{equation} \zeta =\langle \xi _{{\rm X}} \rangle , \end{equation}
(4)
where the angle brackets denote the average over many realizations of the experiment. The observational scatter of ξX around ζ is also a unit-width Gaussian distribution. Therefore ξX is an unbiased estimator of ζ, under the assumption that the true X-ray position and profile are identical to the true SZE position and profile. This is a reasonable assumption, given that both the X-ray and the SZE signatures reflect the intracluster medium properties of the clusters. Note, however, that in the midst of a major merger the different density weighting of the X-ray and SZE signatures can lead to offsets (Molnar, Hearn & Stadel 2012).
We model the relationship between ζ and the cluster mass through
\begin{equation} \zeta = A_{{\rm SZ}}^{\rm SPT}\Big (\frac{M_{\rm 500c}}{4.3{\times} 10^{14} \,\mathrm{M_{{\odot }}}} \Big )^{B_{{\rm SZ}}} \Big [ \frac{E(z)}{E(0.6)}\Big ]^{C_{{\rm SZ}}}, \end{equation}
(5)
where the intrinsic scatter on ζ is described by a lognormal distribution of width DSZ (B13; Reichardt et al. 2013). We use |$A_{{\rm SZ}}^{\rm SPT}$| to denote the amplitude of the original SPT-SZ scaling relation. The differences in the depths of the SPT-SZ fields result in a re-scaling of the SPT-SZ cluster significance in spatially filtered maps. For the field we study here, the relation requires a factor of 1.38 larger normalization compared to the value in Reichardt et al. (2013).

For the massive SPT-SZ clusters (with ξ > 4.5), the ζ–mass relation is best parametrized as shown in Table 3 with CSZ = 0.83 ± 0.30 and DSZ = 0.21 ± 0.09 (B13). In our analysis, we examine the characteristics of the lower mass clusters within the SPT-SZ survey. To avoid a degeneracy between the scaling relation amplitude and slope, we shift the pivot mass to 1.5 × 10¹⁴ M⊙, near the median mass of our sample, and term the associated amplitude ASZ. At this pivot mass, with the normalization factor mentioned previously, the equivalent amplitude parameter for the main SPT-SZ sample corresponds to ASZ = 1.50. In Table 3 we also note the priors we adopt in our analysis of the low-mass sample. For our primary analysis, we adopt flat priors on the amplitude and slope parameters and fix the redshift evolution and scatter at the values obtained by B13.
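
The relations in this subsection can be collected in a few lines of Python. The sketch below is only illustrative: it evaluates equation (5) with the pivot mass moved to 1.5 × 10¹⁴ M⊙, contrasts the blind-search bias with equation (4), and shows how an amplitude is carried from one pivot mass to another for a fixed slope. The function names are ours, and the slope `b_sz` is left as a required argument because the Table 3 values are not reproduced here; any number passed for it is an assumption of the caller.

```python
import math

def e_of_z(z, omega_m=0.3, omega_lambda=0.7):
    """E(z) = H(z)/H0 for the flat cosmology adopted in this paper."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + omega_lambda)

def zeta_from_mass(m500c, z, a_sz, b_sz, c_sz=0.83, pivot_mass=1.5e14):
    """Equation (5) with the pivot mass moved to 1.5e14 Msun."""
    return a_sz * (m500c / pivot_mass) ** b_sz * (e_of_z(z) / e_of_z(0.6)) ** c_sz

def mean_xi_blind(zeta):
    """<xi> = sqrt(zeta^2 + 3) for a blind search over position and filter scale."""
    return math.sqrt(zeta ** 2 + 3.0)

def mean_xi_xray(zeta):
    """Equation (4): <xi_X> = zeta at a fixed X-ray position and profile."""
    return zeta

def rescale_amplitude(a_old, b_sz, old_pivot, new_pivot):
    """Amplitude of the same power law expressed at a different pivot mass."""
    return a_old * (new_pivot / old_pivot) ** b_sz
```

At the pivot mass and z = 0.6, `zeta_from_mass` returns the amplitude itself, which is why a pivot near the median sample mass decouples the amplitude from the slope.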

2.4 Integrated Y500

To facilitate the comparison of our sample with cluster physical properties reported in the literature, we also convert the ξX to Y500, which is the integration of the Compton-y parameter within a spherical volume with radius R500c. The central y0 is linearly linked to ξX in the matched filter approach (Melin et al. 2006), with the corresponding Arnaud profile or β profile as the cluster template. The characteristic radii (R500c and Rc) are based on the X-ray measurements (S12), because the SZE observations are too noisy to constrain the profile accurately.

The projected circular β profile for the filter is
\begin{equation} y^\mathrm{(\beta )}_{\rm cyl}(r) \propto \left(1+r^2/R_{\rm c}^2\right)^{-(3\beta -1)/2}, \end{equation}
(6)
where β is fixed to 1, consistent with higher signal-to-noise ratio cluster studies (Plagge et al. 2010). The spherical Y500 within R500c is
\begin{equation} Y_{\rm 500}^\mathrm{(\beta )} = y_0 {\times} \pi R_{\rm c}^{2} \ln \left(1+R_{\rm 500c}^{2}/R_{\rm c}^{2}\right) {\times} f(R_{\rm 500c}/R_{\rm c}), \end{equation}
(7)
where f (x) corrects the cylindrical result to the spherical value for the β profile as
\begin{equation} f(x) = 2\frac{\ln (x+\sqrt{1+x^2})-x/\sqrt{1+x^2}}{\ln (1+x^2)}. \end{equation}
(8)
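
Equations (7) and (8) are straightforward to evaluate. The following sketch, with function names of our own choosing, implements both and includes a numerical cross-check of f(x): for β = 1 the three-dimensional profile is proportional to (1 + u²)^(−3/2) with u = r/Rc, whose line-of-sight projection is proportional to 1/(1 + s²), so the spherical integral can be compared directly with the cylindrical one.

```python
import math

def f_sph(x):
    """Equation (8): spherical-to-cylindrical ratio for the beta = 1 profile."""
    root = math.sqrt(1.0 + x * x)
    return 2.0 * (math.log(x + root) - x / root) / math.log(1.0 + x * x)

def y500_beta(y0, r_c, r_500c):
    """Equation (7): spherical Y500 for beta = 1, in units of y0 * r_c**2."""
    x = r_500c / r_c
    return y0 * math.pi * r_c ** 2 * math.log(1.0 + x * x) * f_sph(x)

def f_sph_numeric(x, n=20000):
    """Midpoint-rule check of f(x) from the 3D profile (1 + u^2)^(-3/2)."""
    du = x / n
    sph = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        sph += 4.0 * math.pi * u * u * (1.0 + u * u) ** -1.5 * du
    # The projection of (1 + u^2)^(-3/2) along the line of sight is
    # 2 / (1 + s^2), so the cylindrical integral within x is 2*pi*ln(1 + x^2).
    cyl = 2.0 * math.pi * math.log(1.0 + x * x)
    return sph / cyl
```

The analytic and numerical values of f agree closely for any x > 0, and f(x) < 1 because the cylinder includes pressure along the line of sight outside the sphere.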
The |$Y_{\rm 500}^{\rm (A)}$| for the Arnaud profile is calculated similarly except that the projected profile is calculated numerically within 5R500c along the line-of-sight direction:
\begin{equation} y^{\rm (A)}_{\rm cyl}(r) \propto \int _{-5R_{\rm 500c}}^{5R_{\rm 500c}} P\Big (\frac{\sqrt{r^2+z^2}}{R_{\rm 500c}}\Big ) \mathrm{d}z, \end{equation}
(9)
where the pressure profile has the form
\begin{equation} P(x) \propto (c_{\rm 500}x)^{-\gamma _{\rm A}} [1+(c_{\rm 500}x)^{\alpha _{\rm A}}]^{(\gamma _{\rm A}-\beta _{\rm A})/\alpha _{\rm A}}, \end{equation}
(10)
with [c500, γA, αA, βA] = [1.177, 0.3081, 1.0510, 5.4905] (Arnaud et al. 2010). The integration up to 5R500c includes more than 99 per cent of the total pressure contribution. The spherical Y500 for the Arnaud profile is
\begin{equation} Y_{\rm 500}^{\rm (A)} = 2\pi y_{0} \int _0^{R_{\rm 500c}} y^{\rm (A)}_{\rm cyl}(r) r\mathrm{d}r /1.203, \end{equation}
(11)
where the numerical factor 1.203 is the ratio between cylindrical integration and spherical integration for the adopted Arnaud profile parameters.
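
The cylindrical-to-spherical factor can be estimated with a direct numerical integration of equations (9) and (10). The sketch below uses a simple midpoint rule, which avoids evaluating the integrable singularity at the cluster centre; the grid sizes and function names are our own choices, and the result should land close to the quoted 1.203 only to the accuracy of this coarse quadrature.

```python
import math

C500, GAMMA_A, ALPHA_A, BETA_A = 1.177, 0.3081, 1.0510, 5.4905

def pressure(x):
    """Equation (10), unnormalised, with x = r / R500c."""
    cx = C500 * x
    return cx ** -GAMMA_A * (1.0 + cx ** ALPHA_A) ** ((GAMMA_A - BETA_A) / ALPHA_A)

def spherical_integral(n=4000):
    """Integral of 4 pi x^2 P(x) over 0 < x < 1 (midpoint rule)."""
    dx = 1.0 / n
    return sum(4.0 * math.pi * ((i + 0.5) * dx) ** 2 * pressure((i + 0.5) * dx) * dx
               for i in range(n))

def cylindrical_integral(n_r=200, n_z=1000, z_max=5.0):
    """Integral of P over projected radius < 1 and |z| < z_max (midpoint rule)."""
    dr = 1.0 / n_r
    dz = z_max / n_z
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        column = sum(pressure(math.hypot(r, (j + 0.5) * dz)) for j in range(n_z)) * dz
        # Factor 2 accounts for both sides of the z = 0 plane.
        total += 2.0 * math.pi * r * 2.0 * column * dr
    return total

def cyl_to_sph_ratio():
    """Ratio of cylindrical to spherical integrals within R500c."""
    return cylindrical_integral() / spherical_integral()
```

Because only the ratio is needed, the normalization of the pressure profile drops out.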

Measurements of Y500 are sensitive to the assumed profile. The Arnaud profile depends only on R500c, while the β profile depends on both R500c and Rc and therefore Y500 is sensitive to the ratio Rc/R500c. We find that with Rc/R500c = 0.2 the β and Arnaud profiles provide Y500 measurements in good agreement; this ratio is consistent with the previous SZE profile study using high-mass clusters (Plagge et al. 2010). Interestingly, the X-ray data indicate a characteristic ratio of 0.11 ± 0.03 for our sample, and a shift in the Rc/R500c ratio from 0.2 to 0.1 leads to a ∼40 per cent decrease in Y500. Given that the Planck analysis to which we compare is carried out using the Arnaud profile, we adopt that profile for the analysis in Section 4.4.

The Y500–mass scaling relation has been modelled using a representative local X-ray cluster sample (Arnaud et al. 2010) and further studied in the SZE (Andersson et al. 2011; P11) as
\begin{equation} Y_{\rm 500}= A_{\rm Y}\Big (\frac{M_{500}}{1.5{\times} 10^{14} M_{{\odot }}}\Big )^{B_{\rm Y}}E(z)^{2/3}\Big [\frac{D_{\rm A}(z)}{500\ {\rm Mpc}}\Big ]^{-2}, \end{equation}
(12)
where DA(z) is the angular diameter distance and the intrinsic scatter on Y500 is described by a lognormal distribution of width |$\sigma _\mathrm{\ln Y} = 0.21$|⁠. The observational scatter of Y500 is propagated from the scatter of ξX. In Section 4, we fit this relation to the observations.
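
Equation (12) can be evaluated with a short numerical routine. The sketch below assumes the flat cosmology adopted in this paper and computes DA(z) with a midpoint-rule integral; the function names are illustrative, and `a_y` and `b_y` are left as inputs because they are the quantities being fitted. The result carries whatever units are chosen for `a_y`.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def e_of_z(z, omega_m=0.3, omega_lambda=0.7):
    """E(z) = H(z)/H0 for flat LambdaCDM."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + omega_lambda)

def angular_diameter_distance(z, h0=70.0, n=2000):
    """D_A in Mpc for flat LambdaCDM, from a midpoint-rule comoving distance."""
    dz = z / n
    comoving = sum(dz / e_of_z((i + 0.5) * dz) for i in range(n)) * C_KM_S / h0
    return comoving / (1.0 + z)

def y500_from_mass(m500, z, a_y, b_y):
    """Equation (12); valid for z > 0, returned in the units of a_y."""
    return (a_y * (m500 / 1.5e14) ** b_y * e_of_z(z) ** (2.0 / 3.0)
            * (angular_diameter_distance(z) / 500.0) ** -2)
```

For instance, passing the self-similar slope `b_y = 5.0 / 3.0` shows how the predicted signal scales with mass at fixed redshift; that slope is used here only as an example input.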

3 METHOD

In this section, we describe the method we developed to fit the SZE–mass scaling relations of the low-mass cluster population selected through the XMM-BCS and observed by the SPT. In principle, we could use our cluster sample observed in X-ray and SZE to simultaneously constrain the cosmology and the scaling relations, in the so-called self-calibration approach (Majumdar & Mohr 2004). However, self-calibration requires a large sample. Lacking such a sample, we take advantage of strong, existing cosmology constraints (e.g. Planck Collaboration XVI 2014a; Bocquet et al. 2015) and knowledge of the LX–mass scaling relation (e.g. Pratt et al. 2009). We focus only on the SZE–mass scaling relations, exploring the SZE characteristics of low-mass galaxy clusters and groups. In Section 3.1 we present the method and in Section 3.2 we validate it using mock catalogues.

3.1 Description of the method

The selection biases on scaling relations include the Malmquist bias and the Eddington bias, which are manifestations of scatter and population variations associated with the selection observable. Several methods have previously been developed (e.g. Vikhlinin et al. 2009; Mantz et al. 2010a; Allen, Evrard & Mantz 2011; B13; Bocquet et al. 2015) to account for the sampling biases when fitting scaling relation and cosmological parameters simultaneously. In this analysis, we use a likelihood function that can be derived from the one presented in B13. For a detailed discussion we refer the reader to Appendix A; here we present an overview of the key elements of this likelihood function.

The likelihood function |$\mathcal {L}(\boldsymbol {r}_{\rm SZ})$| we use to constrain the SZE–mass relations is the product of the individual conditional probabilities to observe each cluster with SZE observable Yi (e.g. SPT-SZ significance ξX or Y500), given the cluster has been observed to have an X-ray observable Li and redshift zi:
\begin{equation} \mathcal {L}(\boldsymbol {r}_{\rm SZ})=\Pi _{i}\ P(Y_{i}|L_{i},z_{i},\boldsymbol {c}, \boldsymbol {r}_{\rm X}, \boldsymbol {r}_{\rm SZ}, \Theta _{\rm X}), \end{equation}
(13)
where i runs over the cluster sample, rSZ contains the parameters describing the SZE mass–observable scaling relation that we wish to study, |$\boldsymbol c$| contains the cosmological parameters, rX contains the parameters describing the X-ray mass–observable scaling relation, and the survey selection in X-ray is encoded within ΘX. Note that the redshifts are assumed to be accurate such that the X-ray luminosity (LX) is used instead of the true survey selection observable, which is the X-ray flux.
As noted above, given the size of our data set we adopt fixed cosmology |$\boldsymbol c$| and X-ray scaling relation parameters rX to focus on the SZE–mass scaling relation. In Section 4 we examine the sensitivity of our results to the current uncertainties in cosmology and the X-ray scaling relation and find them to be unimportant for our analysis. Within this context, the conditional probability density function for cluster i can be written as the ratio of the expected number of clusters dN with observables Yi, Li and zi within infinitesimal volumes dY, dL and dz:
\begin{equation} P(Y_{i}|L_{i},z_{i},\boldsymbol {r}_{\rm SZ},\Theta _{\rm X}) =\frac{\mathrm{d}N(Y_{i}, L_{i}, z_{i}|\boldsymbol {r}_{\rm SZ},\Theta _{\rm X})}{\mathrm{d}N(L_{i}, z_{i}|\Theta _{\rm X})}, \end{equation}
(14)
where we have dropped the cosmology c and X-ray scaling relation parameters rX because they are held constant. Typically, the survey selection ΘX is a complex function of the redshift and X-ray flux, but in the above expression it is simply the probability that a cluster with X-ray luminosity Li and redshift zi is observed [i.e. |$\mathrm{d}N\left(Y_{i},L_{i},z_{i}|\boldsymbol {r}_{\rm SZ},\Theta _{\rm X}\right)=\Theta _{\rm X}(L_{i},z_{i})\mathrm{d}N(Y_{i},L_{i},z_{i}|\boldsymbol {r}_{\rm SZ})$|]; in equation (14) this same factor appears in both the numerator and denominator, and therefore it cancels out. Thus, studying the SZE properties of an X-ray selected sample does not require detailed modelling of the selection. If the selection were based on both L and Y, then there would be no cancellation, because the selection probability in the numerator would be just Θ(Li, Yi, zi) while in the denominator it would have to be marginalized over the unobserved Y as ∫Θ(Y, Li, zi)dY (see equation A8).
With knowledge of the cosmologically dependent mass function |$n(M,z)\equiv \mathrm{d}N(M,z|\boldsymbol {c})/\mathrm{d}M\mathrm{d}z$| (Tinker et al. 2008), the ratio of the expected number of clusters can be written as
\begin{equation} P(Y_{i}|L_{i},z_{i},\boldsymbol {r}_{\rm SZ}) =\frac{\int \mathrm{d}M P(Y_{i},L_{i}|M,z_{i},\boldsymbol {r}_{\rm SZ})\,n(M,z_{i})}{\int \mathrm{d}M P(L_{i}|M,z_{i})\,n(M,z_{i})}. \end{equation}
(15)

We emphasize that there is a residual dependence on the X-ray selection in our analysis in the sense that we can only study the SZE properties of the clusters that have sufficient X-ray luminosity to have made it into the sample. This effectively limits the mass range over which we can use the X-ray selected sample to study the SZE properties of the clusters.

To constrain the scaling relation in the presence of both observational uncertainties and intrinsic scatter, we further expand the conditional probability density functions in equation (15) as
\begin{eqnarray} P(Y_{i},L_{i}|M,z_{i},\boldsymbol {r}_{\rm SZ}) = \iint &\mathrm{d}Y_\mathrm{t} \mathrm{d}L_\mathrm{t}\ P(Y_{i}, L_{i}|Y_\mathrm{t},L_\mathrm{t}) \nonumber \\ &{\times} P(Y_\mathrm{t},L_\mathrm{t} |M,z_{i},\boldsymbol {r}_{\rm SZ}), \end{eqnarray}
(16)
\begin{eqnarray} P(L_{i}|M,z_{i}) = \int &\mathrm{d}L_\mathrm{t}\ P(L_{i}|L_\mathrm{t}) P(L_\mathrm{t}|M,z_{i}), \end{eqnarray}
(17)
where, as above, Yi and Li are the observed values, and Yt and Lt are the true underlying observables related to mass through scaling relations that have intrinsic scatter. The first factor in each integral represents the measurement error, and the second factor describes the relationship between the pristine observables and the halo mass. Improved data quality affects the first factor, but cluster physics dictates the form of the second. These second factors are fully described by the power-law mass–observable relations in equations (2), (5), and (12) together with the adopted lognormal scatter.

We use this likelihood function under the assumption that there is no correlated scatter in the observables; in Section 3.2 we use mock samples that include correlated scatter to examine the impact on our results.
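
To make the structure of equations (13)–(17) concrete, the following Python sketch evaluates the conditional probability P(ξX | LX, z) on a mass grid under several simplifying assumptions that are ours and not part of the analysis: the mass function is replaced by a toy power law dn/dln M ∝ M^γ rather than the Tinker et al. (2008) form, the X-ray measurement error is neglected, the lognormal scatter is centred on the mean relation in log space, and the slope passed as `b_sz` is a free input. All function names are hypothetical.

```python
import math

def e_of_z(z, omega_m=0.3, omega_lambda=0.7):
    return math.sqrt(omega_m * (1.0 + z) ** 3 + omega_lambda)

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def mass_weights(lx_obs, z, masses, gamma=-2.0,
                 l0=0.48, alpha_lm=1.83, sigma_lnl=0.412):
    """Normalised weights proportional to P(L_i | M, z) n(M, z) on a log-spaced grid.

    Masses are in 1e14 Msun and luminosities in 1e44 erg/s. The mass function
    is a TOY power law (assumption), and X-ray measurement error is neglected.
    """
    weights = []
    for m in masses:
        ln_l_mean = math.log(l0 * (m / 2.0) ** alpha_lm * e_of_z(z) ** (7.0 / 3.0))
        weights.append(normal_pdf(math.log(lx_obs), ln_l_mean, sigma_lnl) * m ** gamma)
    total = sum(weights)
    return [w / total for w in weights]

def prob_xi_given_mass(xi_obs, m, z, a_sz, b_sz, c_sz=0.83, d_sz=0.21, n_nodes=41):
    """P(xi_X | M, z): lognormal intrinsic scatter convolved with unit Gaussian noise."""
    zeta_mean = a_sz * (m / 1.5) ** b_sz * (e_of_z(z) / e_of_z(0.6)) ** c_sz
    num = 0.0
    den = 0.0
    for k in range(n_nodes):
        t = -4.0 + 8.0 * k / (n_nodes - 1)  # offset in ln(zeta), in units of d_sz
        q = math.exp(-0.5 * t * t)          # Gaussian weight of this node
        num += q * normal_pdf(xi_obs, zeta_mean * math.exp(d_sz * t), 1.0)
        den += q
    return num / den

def prob_xi_given_lx(xi_obs, lx_obs, z, a_sz, b_sz, n_mass=120):
    """Equation (15) on a grid: P(xi_X | M) averaged with weights P(L | M) n(M)."""
    ln_lo, ln_hi = math.log(0.05), math.log(50.0)
    masses = [math.exp(ln_lo + (ln_hi - ln_lo) * i / (n_mass - 1)) for i in range(n_mass)]
    weights = mass_weights(lx_obs, z, masses)
    return sum(w * prob_xi_given_mass(xi_obs, m, z, a_sz, b_sz)
               for w, m in zip(weights, masses))

def log_likelihood(clusters, a_sz, b_sz):
    """Equation (13) in log form, for an iterable of (xi_X, L_X, z) tuples."""
    return sum(math.log(prob_xi_given_lx(xi, lx, z, a_sz, b_sz))
               for xi, lx, z in clusters)
```

Because the X-ray selection term cancels in equation (14), no selection function appears anywhere in the sketch; the only place the X-ray data enter is through the mass weights.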

3.2 Validation with mock cluster catalogues

We use mock samples of clusters to validate our likelihood and fitting approach and to explore our ability to constrain different parameters. Specifically, we generate 10 larger mock surveys of 60 deg², with a similar flux limit of 1 × 10⁻¹⁴ erg s⁻¹ cm⁻² and z > 0.2. Each mock catalogue contains ∼400 clusters, or approximately eight times as many as in the observed sample. The ξX of the sample spans −2.2 ≤ ξX ≤ 7.8 with a median value of 1.4. We include both the intrinsic scatter and observational uncertainties for both the LX and the ξX in the mock catalogue. The intrinsic scatter is lognormally distributed, with widths |$\sigma _{\ln L_{{\rm X}}}$| for LX and DSZ for the SZE significance. The observational uncertainties in LX and ξX are modelled as normal distributions. The standard deviation used for LX is proportional to |$\sqrt{L_{{\rm X}}}$| to mimic the Poisson distribution of photon counts, while the standard deviation for ξX is 1.

Here we focus on recovering the four SPT-SZ ζ–mass relation parameters from the mock catalogue; the fiducial values for these parameters are the B13 best-fitting values. We scan through the parameter space using a fixed grid; the results that follow use 41 bins in each parameter direction. Given the limited constraining power, we validate the parameters using two different sets of priors. In the first set we adopt flat priors on ASZ, BSZ, and CSZ with fixed DSZ. In the second set we adopt flat positive priors on ASZ, BSZ, and DSZ with fixed CSZ. All other relevant parameters are fixed, including the LX–mass scaling and the cosmological model.

Our tests show good performance of the method. Using 10 mock samples that each cover 10 times the area of our observed sample, and fitting for three parameters in each mock, we recover the parameters to within the marginalized 1σ statistical uncertainty 70 per cent of the time and to within 2σ for the rest. Fig. 2 illustrates our ζ–mass parameter constraints from one mock sample. Note that the constraints on CSZ and DSZ are both weak and exhibit no significant degeneracy with the other two SPT-SZ scaling parameters. We take this as motivation to fix CSZ and DSZ and focus on the amplitude ASZ and slope BSZ in the analysis of the observed sample. We have repeated this testing in the case of the Y500–mass relation, and we see no difference in behaviour.

Figure 2.

Constraints on the ζ–mass relation from an analysis of the mock catalogue. The left-hand panel shows the constraints on ASZ, BSZ, and CSZ with fixed DSZ. The right-hand panel shows the result when CSZ is fixed instead of DSZ. The red lines and stars denote the input values of the scaling relation parameters of the mock catalogue. Histograms in each case show the recovered projected likelihood distribution for each parameter. Joint constraints for different pairs of parameters are shown in blue with different shades indicating the 1σ, 2σ, and 3σ levels.

We also investigate the sensitivity of our method when a correlation between intrinsic scatter in the X-ray and ξX is included. Cluster observables can be correlated through an analysis approach. For example, if one uses the LX as a virial mass estimate, then when LX scatters up by 40 per cent, it leads to a 5 per cent increase in radius, and 8 per cent increase in Y500 if the underlying SZE brightness distribution is described by the Arnaud et al. (2010) profile. In comparison, the intrinsic scatter of Y500 about mass is about 20 per cent, which in this example would still dominate over the correlated component of the scatter. Correlated scatter in different observable–mass relations can also reflect underlying physical properties of the cluster that impact the two observables in a similar manner.
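The radius shift in this example follows from simple power-law arithmetic, sketched below. The sketch assumes LX ∝ M^B with B = 2.08 (the Pratt et al. 2009 slope quoted in Section 4.3) and R500 ∝ M^(1/3) at fixed redshift; the function name is hypothetical, and the corresponding Y500 change is not reproduced because it requires integrating the assumed pressure profile.

```python
def radius_factor(lx_factor, lx_mass_slope):
    """Factor by which the inferred radius changes when L_X is rescaled."""
    # Invert L_X ∝ M^B to get the change in the inferred mass.
    mass_factor = lx_factor ** (1.0 / lx_mass_slope)
    # R500 ∝ M^(1/3) at fixed redshift and critical density.
    return mass_factor ** (1.0 / 3.0)


# L_X scattering up by 40 per cent, with the assumed slope of 2.08.
boost = radius_factor(1.4, 2.08)
print(f"radius increases by {100 * (boost - 1):.1f} per cent")
```

Under these assumptions the radius grows by between 5 and 6 per cent, in line with the roughly 5 per cent quoted above.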

We find that even with a correlation coefficient ρ = 0.5 between the intrinsic scatter of the two observables, the change in the constraints extracted under the assumption of no correlation is small. Thus, our approximation does not lead to significant bias in the analysis of this sample. This result is also consistent with the fact that by extending equations (16) and (17) to include multidimensional lognormal scatter distributions, we find the constraint on correlated scatter in the mock catalogue to be very weak. We therefore do not include the possibility of correlated scatter when studying the real sample.

4 RESULTS

In this section, we present the observed relationship between the SZE significance ξX at the position of the X-ray selected cluster and the predicted value given the measured X-ray luminosity of the system. Thereafter, we test – and rule out – the null hypothesis that the SZE signal at the locations of the X-ray selected clusters is consistent with noise. We then present constraints on the SPT-SZ ζ–mass and Y500–mass relations. We end with a discussion of possible systematics and a presentation of the point source population for this X-ray selected group and cluster sample.

4.1 SPT significance extraction

We extract the ξX from the SPT-SZ multifrequency-filtered map at the location of each XMM-BCS selected cluster as described in Section 2. In the primary analysis, we adopt three matched-filtered maps from the SPT-SZ data, one each for β-model profiles with Rc = 0.25, 0.5, and 0.75 arcmin, and we extract the value of ξX for each cluster from the map that most closely matches the X-ray-derived Rc value for that cluster. The ξX is extracted at the X-ray-derived cluster position. The measured ξX values are presented in Table 2. We have also tried extracting SPT-SZ significance by making a matched-filtered map for every cluster, using a filter with the exact X-ray-derived value of Rc, and the change in the results is negligible.
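The choice of filtered map for each cluster amounts to a nearest-value lookup, sketched here. The assumption is that "most closely matches" means the smallest absolute difference in Rc; the function and constant names are hypothetical.

```python
# Core radii (arcmin) of the three beta-model matched filters.
FILTER_CORE_RADII = (0.25, 0.5, 0.75)


def nearest_filter_rc(rc_xray):
    """Return the filter core radius closest to the X-ray-derived R_c."""
    return min(FILTER_CORE_RADII, key=lambda rc: abs(rc - rc_xray))


# Example: an X-ray-derived R_c of 0.367 arcmin selects the 0.25 arcmin map.
chosen = nearest_filter_rc(0.367)
```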

Table 2.

SPT-SZ ξX of XMM-BCS sample.

ID   LX,500,bol       ΔLX,500,bol      Redshift  Redshift     Rc        ξX      SPT point source           SUMSS point source
     (10^42 erg s−1)  (10^42 erg s−1)            uncertainty  (arcmin)          separation (arcmin) (S/N)  separation (arcmin)
011  345.2            51.6             0.97      0.10         0.185     0.99    -                          -
018  66.3             6.5              0.39      0.04         0.239     1.90    -                          0.92
032  684.0            56.8             0.83      0.07         0.272     3.04    -                          1.70, 2.30, 3.97
033  209.0            17.6             0.79      0.05         0.189     2.34    -                          -
034  16.0             2.5              0.28      0.02         0.197     −0.38   -                          -
035  91.0             14.3             0.67      0.05         0.164     2.78    -                          0.10, 1.56
038  16.3             2.5              0.39      0.05         0.147     −0.20   -                          1.85
039  19.4             1.2              0.18      0.04         0.315     −0.34   -                          2.91
044  310.5            20.5             0.44      0.02         0.367     4.58    3.87 (4.84)                0.22
069  124.9            21.5             0.75      0.07         0.165     1.38    3.40 (6.34)                3.42
070  137.9            2.8              0.152     0.001        0.726     1.80    -                          -
081  93.1             15.4             0.85      0.12         0.133     −1.56   -                          -
082  53.6             9.2              0.63      0.05         0.144     0.55    -                          -
088  122.1            16.7             0.43      0.04         0.271     −0.10   -                          2.96
090  25.4             5.8              0.58      0.02         0.120     0.30    -                          -
094  26.3             2.9              0.269     0.001        0.243     2.20    -                          1.48
109  196.9            28.8             1.02      0.09         0.145     1.09    -                          0.19
110  68.8             9.3              0.47      0.06         0.205     −1.07   -                          0.10
126  82.0             6.1              0.42      0.02         0.240     0.03    -                          1.22
127  8.4              1.0              0.207     0.001        0.207     1.28    -                          -
132  319.3            35.7             0.96      0.17         0.182     1.74    -                          -
136  86.8             7.3              0.36      0.02         0.282     −3.58   1.11 (5.84)                1.00
139  8.7              1.2              0.169     0.001        0.252     −0.17   -                          0.44
150  37.7             1.8              0.176     0.001        0.403     −3.34   0.13 (4.23)                0.05, 2.29
152  3.4              0.6              0.139     0.001        0.219     −0.45   -                          -
156  166.0            11.7             0.67      0.06         0.202     3.01    -                          -
158  104.2            15.6             0.55      0.03         0.205     1.94    -                          -
210  45.0             9.0              0.83      0.09         0.105     0.18    -                          -
227  14.5             1.8              0.346     0.001        0.157     −1.03   -                          0.06
245  38.1             7.1              0.62      0.03         0.130     0.24    -                          1.38
275  17.8             2.7              0.29      0.03         0.198     −0.46   -                          2.12
287  31.1             11.0             0.57      0.04         0.131     −0.02   -                          -
288  89.0             17.4             0.60      0.04         0.180     −0.25   -                          0.62
357  66.3             8.3              0.48      0.06         0.198     −0.97   -                          -
386  17.7             4.8              0.53      0.05         0.115     0.83    0.417 (4.53)^a             -
430  4.5              0.9              0.206     0.001        0.167     −0.67   -                          -
444  69.1             13.8             0.71      0.05         0.141     −0.13   -                          -
457  1.1              0.3              0.100     0.001        0.201     −1.24   -                          -
476  6.2              0.7              0.101     0.001        0.365     −0.12   -                          1.03
502  47.2             4.2              0.55      0.05         0.156     −0.30   -                          -
511  23.4             3.7              0.269     0.001        0.233     0.11    -                          0.15, 2.37
527  160.8            26.2             0.79      0.06         0.172     0.83    -                          3.96
528  6.4              2.1              0.35      0.02         0.117     0.57    -                          -
538  5.1              2.1              0.20      0.02         0.179     0.30    -                          -
543  134.5            29.6             0.57      0.03         0.217     1.10    -                          -
547  4.1              1.3              0.241     0.001        0.140     −6.45   0.20 (6.75)                0.12, 2.89

^a Detected at 220 GHz.

We have also investigated the dependence of ξX on the assumed cluster profile. We repeated the analysis described above using the Arnaud profile and a β profile with β = 2/3. The resulting changes in the extracted values of ξX are less than 3 per cent of the measurement uncertainty on the individual ξX values. A similar lack of sensitivity to the assumed cluster profile is seen in the ξ > 5 SPT-SZ derived cluster samples.

The cluster with the strongest detection in the SPT-SZ maps is illustrated in Fig. 3, which contains a pseudo-colour optical image with SPT-SZ signal-to-noise ratio contours in white. The SPT-SZ significance, ξ, of this cluster is 6.23, corresponding to the maximum signal-to-noise ratio in the filtered map (SPT-CL J2316−5453; Bleem et al. 2015), whereas the ξX is 4.58 at the X-ray position with Rc of 0.367 arcmin. This reduction in signal-to-noise ratio is expected because there is noise in the SZE map, and the SPT-SZ cluster is selected to lie at the peak ξ.

Figure 3.

Blanco Cosmology Survey (BCS) optical pseudo-colour image of cluster 044 in gri bands. The yellow circle (1.5 arcmin diameter) centred at the X-ray peak indicates the rough size of the SPT beam (1.2 arcmin FWHM at 150 GHz and 1.6 arcmin at 95 GHz). The SPT-SZ filtered map is overlaid with white contours, which are marked with the significance levels. The offset between the X-ray centre and the SZE peak is 0.75 arcmin, and the BCG for this system lies near those two centres.

4.2 Testing the null hypothesis

To gain a sense of the strength of the SZE detection of the ensemble of XMM-BCS clusters, we test the measured significance around SZE null positions. A single null catalogue consists of the same number of clusters as the XMM-BCS sample where the X-ray luminosities and redshifts are maintained, but the SPT-SZ significances ξX are measured at random positions. We then carry out a likelihood analysis of three null catalogues. When fixing the slope BSZ of the scaling relation, we find that the normalization factor ASZ is constrained to be <0.56 at 99 per cent confidence level for all three null samples we tested. Because this constraint on the amplitude is small compared to the expected normalization for the XMM-BCS sample, we have essentially shown that there should be sufficient signal-to-noise ratio to detect the SZE signature of the cluster ensemble.

4.3 SPT ζ–mass relation

We explore the SZE signature of low-mass clusters by constraining the ASZ and BSZ parameters with the approach described and tested above. The X-ray luminosity–mass scaling relation, equation (2), is directly adopted with the additional observational uncertainties of each cluster that are listed in Table 2 (bolometric luminosities presented in S12).

We present results for four different subsets of our sample: (1) the full sample without removal of any cluster; (2) the sample excluding any cluster with a point source detected at >4σ in any SPT observing band within a 4 arcmin radius of the X-ray cluster (see Table 2), hereafter the SPT-NPS sample; (3) the SPT-NPS clusters with redshift larger than 0.3, hereafter SPT-NPS(z > 0.3), which is the best match to the selection of the SPT-SZ high-mass sample in B13; and (4) the sample without any Sydney University Molonglo Sky Survey (SUMSS; Bock, Large & Sadler 1999; Mauch et al. 2003) point sources within a 4 arcmin radius, hereafter the SPT-No-SUMSS sample. We discuss further the astrophysical nature and impact of point sources in Section 4.6.
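The subset definitions can be made concrete with a handful of entries from Table 2. The sketch below is illustrative only: the tuple layout and function names are hypothetical, only six of the 46 clusters are included, and the SUMSS-free subset is implemented exactly as worded (no SUMSS source within the radius), without any further cut.

```python
# (ID, redshift, separation in arcmin to the nearest SPT point source
#  or None, list of SUMSS source separations in arcmin)
CLUSTERS = [
    ("018", 0.39, None, [0.92]),
    ("039", 0.18, None, [2.91]),
    ("044", 0.44, 3.87, [0.22]),
    ("070", 0.152, None, []),
    ("156", 0.67, None, []),
    ("386", 0.53, 0.417, []),
]


def spt_nps(clusters, radius=4.0):
    """Clusters with no SPT point source within `radius` arcmin."""
    return [c for c in clusters if c[2] is None or c[2] > radius]


def spt_nps_high_z(clusters, z_min=0.3, radius=4.0):
    """SPT-NPS clusters with redshift above z_min."""
    return [c for c in spt_nps(clusters, radius) if c[1] > z_min]


def no_sumss(clusters, radius=4.0):
    """Clusters with no SUMSS source within `radius` arcmin."""
    return [c for c in clusters if all(sep > radius for sep in c[3])]


nps_ids = [c[0] for c in spt_nps(CLUSTERS)]
high_z_ids = [c[0] for c in spt_nps_high_z(CLUSTERS)]
no_sumss_ids = [c[0] for c in no_sumss(CLUSTERS)]
```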

In Fig. 4, we illustrate the ζ–mass relation obtained by plotting the observed ξX versus the expected 〈ζ(LX, z)〉, estimated using equation (13). Here we use the best-fitting scaling relation from the SPT-NPS (black points only). Note that the typical bias correction on the mass is about 10 per cent at the high-mass end.

Figure 4.

The measured significance ξX versus the expected SPT-SZ 〈ζ(LX, z)〉, where the best-fitting relation from the SPT-NPS sample and sampling bias corrections are applied. Overplotted is the line of equality. Clusters close to SPT point sources are marked with red diamonds.

We explore the likelihood as a function of ASZ and BSZ and show the parameter constraints for these subsamples in Table 3, and we show the likelihood distribution of the SPT-NPS sample in Fig. 5. We also show marginalized single parameter probability distributions, which we use to calculate the 68 per cent confidence region for each parameter. This confidence region along with the modal value is reported in Table 3. For comparison, the constraints from the B13 analysis are shown in red.

Figure 5.

Constraints on the SPT-SZ ζ–mass relation parameters ASZ and BSZ for the SPT-NPS. The different shading indicates 1σ, 2σ, and 3σ confidence regions. The constraints from the SPT-SZ high-mass clusters (B13) are shown in red with 68 per cent confidence regions marked with dashed lines. The amplitudes for low- and high-mass clusters are compatible, but the slope is higher for low-mass systems by about 1.4σ.

Table 3.

Constraints on the SZE ζ–mass relation parameters.

Sample               ASZ                        BSZ
SPT high mass (B13)  1.50 ± 0.34                1.40 ± 0.16
Prior                [0.1–5]                    [0.1–6]
Full sample          |$1.38^{+0.46}_{-0.36}$|   |$2.80^{+0.66}_{-0.63}$|
SPT-NPS              |$1.37^{+0.48}_{-0.38}$|   |$2.14^{+0.86}_{-0.66}$|
SPT-NPS (z > 0.3)    |$1.37^{+0.60}_{-0.46}$|   |$2.31^{+1.31}_{-0.86}$|
SPT-No-SUMSS         |$1.42^{+0.58}_{-0.43}$|   |$2.14^{+0.91}_{-0.71}$|

All of the low-mass subsamples show similar normalization to the extrapolated high-mass SPT-SZ sample, but there is a preference for larger slopes. The SPT-NPS sample is the best for comparison to the SPT-SZ high-mass sample used in B13; this is because the SPT point sources have been removed to mimic the SPT cluster catalogue selection and because there is no measurable difference between the SPT-NPS samples with or without the redshift cut.

The fact that we find consistent results with or without a low-redshift cut may at first be surprising, given that analyses of the high-mass SPT-SZ sample cut all clusters below z = 0.3. In the SPT-SZ high-mass sample, the low-redshift clusters are cut because the angular scales of these clusters begin to overlap the scales where there is significant CMB primary anisotropy, making extraction with the matched filter approach using two frequencies difficult. However, the XMM-BCS clusters are low-mass systems with corresponding Rc less than 1 arcmin even at low redshift. We are therefore able to recover the same scaling relation with or without the low-redshift clusters.

The fully marginalized posterior probability distributions for BSZ can be used to quantify consistency between the two data sets. We do this for any pair of the distributions Pi(θ) by first calculating the probability density distribution of the difference Δθ:
\begin{equation} P(\Delta \theta ) = \int \mathrm{d}\theta P_{1}(\theta )P_{2}(\theta -\Delta \theta ). \end{equation}
(18)
We then calculate the likelihood p that the origin (Δθ = 0) lies within this distribution as
\begin{equation} p = \int _S \mathrm{d}\Delta \theta \ P(\Delta \theta ), \end{equation}
(19)
where S is the space where P(Δθ) < P(Δθ = 0). We then convert this p value to an equivalent Nσ significance within a normal distribution.
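Equations (18) and (19) can be evaluated numerically for two one-dimensional posteriors tabulated on a common uniform grid, as sketched below. The function names are hypothetical, the example posteriors are toy Gaussians rather than the distributions of this work, and the conversion to Nσ assumes the usual two-sided convention for a normal distribution.

```python
import math
from statistics import NormalDist


def difference_distribution(p1, p2):
    """Discrete equation (18) for two posteriors on one uniform grid.

    Returns probabilities for offsets k = -(n-1), ..., n-1 in units of the
    grid step; the zero offset sits at index n - 1.
    """
    n = len(p1)
    diff = []
    for k in range(-(n - 1), n):
        lo, hi = max(0, k), min(n, n + k)
        diff.append(sum(p1[i] * p2[i - k] for i in range(lo, hi)))
    total = sum(diff)
    return [d / total for d in diff]


def tension(p1, p2):
    """Equation (19): probability in the region below P(0), and N sigma."""
    diff = difference_distribution(p1, p2)
    p_at_zero = diff[len(p1) - 1]
    p_value = sum(d for d in diff if d < p_at_zero)
    if p_value < 1e-15:
        return p_value, math.inf
    # Two-sided conversion to an equivalent Gaussian significance.
    return p_value, NormalDist().inv_cdf(1.0 - 0.5 * p_value)


def gaussian_on_grid(grid, mean, sigma):
    """Unnormalized Gaussian posterior sampled on the grid (toy input)."""
    return [math.exp(-0.5 * ((x - mean) / sigma) ** 2) for x in grid]


grid = [-3.0 + 0.01 * i for i in range(701)]
p_value, n_sigma = tension(gaussian_on_grid(grid, 1.0, 0.4),
                           gaussian_on_grid(grid, 0.0, 0.3))
```

For two Gaussian posteriors the analytic expectation is a significance of |Δμ| divided by the quadrature sum of the two widths, which equals 2 for the toy inputs above and provides a check on the grid implementation.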

Overall, there is no strong statistical evidence that the low-mass clusters behave differently than expected by simply extrapolating the high-mass scaling relation to low mass; the slope parameter BSZ of the SPT-SZ high-mass and SPT-NPS samples differs by only 1.4σ (Table 3). The full sample has a 2.6σ higher BSZ than the SPT-SZ high-mass sample (B13). This steeper slope is presumably due to the contaminating effects of the SPT point sources. We find three outliers below the LX–ξX distribution (Fig. 4) that are all contaminated by SPT point sources. We list the separation between the cluster centres and the nearest SPT point source in Table 2.

It is clear from Fig. 4 and from the results for the full sample that including X-ray-selected clusters that are associated with point sources that are independently detected in SPT-SZ data can bias the derived SZE–mass relation. In these cases, the affected clusters can be removed from the sample, and this particular bias can be easily avoided. Point sources that are not detected in the SPT-SZ data but which could be significantly affecting the measured SZE signal – particularly in low-mass clusters and groups – do remain a potential issue. We discuss this and the effect of point sources on our results more generally in Section 4.6.

In addition to the X-ray bolometric luminosities, we test the luminosities based on two other bands (0.5–2.0 keV and 0.1–2.4 keV) as predictors of the cluster mass. After applying the appropriate LX–mass relations listed in Table 1 we find that the changes to the parameter estimates are small. The largest change is on the slope of the SPT-SZ ζ–mass relation, but the difference is less than 0.2σ. Thus, the choice of X-ray luminosity band is not important to our analysis.

Our results show some dependence on the assumed LX–mass scaling relation. Adopting the Vikhlinin et al. (2009) scaling relation has no significant impact on our results. However, with the Mantz et al. (2010b) LX–mass relation, the slope decreases to BSZ ∼ 1.57 from 2.14, which makes the SPT-NPS sample almost a perfect match to the high-mass SPT-SZ scaling relation. This shift is not surprising, because the Mantz et al. (2010b) LX–mass relation has a very different slope from Pratt et al. (2009) (1.63 versus 2.08, respectively). This causes clusters with an LX < 1 × 1044 erg s−1 to have significantly lower estimated masses when assuming the Mantz et al. (2010b) relation (20 per cent on average and ∼40 per cent at the low-mass end). We expect the Pratt et al. (2009) relation to be more appropriate for our analysis, because the Mantz et al. (2010b) relation was calibrated from higher mass clusters, using only clusters with LX > 2.5 × 1044 erg s−1, above the majority of XMM-BCS clusters. We also note that the change of ξX caused by the updated R500c(LX) is negligible, as has also been shown in Saliwanchik et al. (2015).

4.4 SZE Y500–mass relation

We measure the Y500–mass relation using the SPT-NPS sample. A similar fitting approach is used to account for the selection bias, with the same shifted pivot mass in equation (12) of 1.5 × 1014 M⊙. The best-fitting parameters and uncertainties are presented in Table 4 along with the results from Andersson et al. (2011) and P11, which are adjusted to use our lower pivot mass. The Y500 is based on the Arnaud profile and the LX is based on the X-ray luminosity measured within the 0.1–2.4 keV band, which facilitates the comparison with the P11 result. The impact from different profiles is discussed later in this section.

Table 4.

Constraints on the Y500–mass relation.

Sample        AY (10^−4 arcmin^2)        BY
SPT-NPS       |$1.59^{+0.63}_{-0.48}$|   |$2.94^{+0.77}_{-0.74}$|
SPT-No-SUMSS  |$1.72^{+1.01}_{-0.66}$|   |$3.29^{+0.84}_{-0.96}$|
SPT           2.19 ± 0.63                1.67 ± 0.29
Planck        2.57 ± 0.11                1.78 ± 0.05

Fig. 6 shows the joint parameter and fully marginalized constraints for AY and BY. The shaded regions denote the 1σ, 2σ, and 3σ confidence regions as in Fig. 5 with blue for the SPT-NPS, red for the SPT-SZ sample (Andersson et al. 2011), and green for the Planck sample (P11). This figure shows that the low-mass SPT-NPS sample has rather weak constraints that are shifted with respect to the high-mass SPT-SZ sample and the Planck sample.

Figure 6.

Constraints on the Y500–mass relation parameters AY and BY for the SPT-NPS. The SPT-NPS constraints are shown in blue and different shades show the 1σ, 2σ, and 3σ levels. The red is for the SPT-SZ result (Andersson et al. 2011), and the green is the best fit from the Planck analysis (P11). Marginalized constraints for each parameter are shown in blue with best fit and 1σ confidence regions marked by solid and dashed lines, respectively.

We estimate the significance of the difference using the method described in Section 4.3. We quantify the consistency between any pair of the two-parameter distributions |$P_i\left(\boldsymbol \theta \right)$| by calculating a p value in a manner similar to that in equation (18) with the null hypothesis |$\Delta \boldsymbol \theta =0$|. Using this approach, we calculate that the SPT-NPS sample is roughly consistent with the high-mass SPT-SZ sample (a 1.4σ difference) but is in tension with the Planck result (a 2.8σ difference).

Also shown in Fig. 6 are the fully marginalized single parameter constraints. These distributions indicate that the normalization differs by 0.8σ (1.6σ), and the slope parameter differs by 1.7σ (1.7σ) for the SPT-SZ (Planck) sample. Alternatively, we fix BY = 1.67 (1.78) to limit the impact of the large uncertainty on the slope on the constraint of the normalization. In this case, we find |$A_{\rm Y}=1.33^{+0.34}_{-0.31}$| (|$1.37^{+0.36}_{-0.32}$|) and the discrepancy on AY is 1.5σ (3.1σ) for the SPT-SZ (Planck) sample. As in the ζ–mass relation, there is no strong statistical evidence that the SPT-SZ clusters at low mass behave differently than those at high mass. Tighter constraints on the high-mass SPT-SZ scaling relation will be helpful to understand the tension.

The tension with the Planck sample is intriguing; here we discuss several possible issues that could contribute. One difference is in the mass ranges probed in the two studies. In P11, the Planck team studies the relation between X-ray and SZE properties of 1600 clusters from the Meta-Catalogue of X-ray detected Clusters of galaxies (MCXC; Piffaretti et al. 2011) that span two decades in luminosity (1043 erg s−1 ≲ L500, [0.1–2.4 keV]E(z)−7/3 ≲ 2 × 1045 erg s−1). In contrast, our sample spans the range 1042 erg s−1 ≲ L500, [0.1–2.4 keV]E(z)−7/3 ≲ 1044 erg s−1, extending into the galaxy group regime. Thus, it is interesting to probe for any mass trends in the discrepancy. In Fig. 7, we show our measurements along with the Planck relation with fixed slope and redshift evolution as listed in table 4 in P11 (solid black line). At the luminous (massive) end, our sample matches well with the Planck result (cyan points are taken from fig. 4 in P11). Beyond the Planck sample at the faint end, we find a preference for lower Y500 relative to the Planck relation.

Figure 7.

Comparison with the Planck Y500–LX relation. The green dots are XMM-BCS clusters with 1σ uncertainty on ξX and measured uncertainties on LX converted from the 0.5–2 keV band. The blue points are inverse variance weighted means of ensembles of the XMM-BCS sample. The black line is the Planck SZE relation from table 4 in P11 with the last four binned data points from fig. 4 of P11 in cyan. The red line is the best-fitting relation from the SPT sample. The correction of selection bias leads to a higher-than-measured Y500 at the high-mass (luminous) end as the mass function is steep. Consistent with our parameter constraints in Fig. 6, our measurements prefer a lower value than the Planck relation. Clusters close to SPT point sources are marked with red diamonds.
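For reference, an inverse variance weighted mean of the kind used for the binned points can be computed as in the sketch below. The function name and the example numbers are hypothetical, and the measurements within a bin are assumed to be independent with Gaussian uncertainties.

```python
import math


def inverse_variance_mean(values, sigmas):
    """Return the inverse-variance weighted mean and its 1-sigma uncertainty."""
    weights = [1.0 / s ** 2 for s in sigmas]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, math.sqrt(1.0 / total)


# Toy example: three measurements with different uncertainties.
mean, err = inverse_variance_mean([1.0, 2.0, 4.0], [1.0, 1.0, 2.0])
```

Measurements with smaller uncertainties dominate the average, so a single well-measured cluster can pull a binned point more than several noisy ones.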

In the Planck analysis, an LX–mass relation without Malmquist bias correction is used (Pratt et al. 2009). They argue that based on the similarity between the REXCESS and MCXC samples, there is no bias correction needed. In our analysis, we use the Malmquist bias-corrected relation and our likelihood corrects for selection bias. Using the non-corrected relation (Pratt et al. 2009) has very little impact. Interestingly, if we adopt the Mantz et al. (2010b) relation, the tension between our result and the Planck result disappears mainly due to the lower masses predicted by the relation as discussed in Section 4.3. However, given that the Planck analysis adopted the Pratt et al. (2009) relation, it is with this same relation that the most meaningful comparisons can be made.

Secondly, the Planck relation is dominated by the high-mass clusters, and their measurements at the low-luminosity end (marked by cyan points in Fig. 7) also tend to fall below their best-fitting relation. The lowest luminosity Planck point has a Y500 that is 68 per cent (2σ offset) of the value of the best-fitting model at the same X-ray luminosity. Interestingly, the best-fitting normalization of the SPT-NPS sample is 53 per cent of the Planck model normalization. In this sense, the tension between the two low-mass samples is less than the tension between our sample and the best-fitting Planck relation.

Thirdly, we note that the redshift dependence of the Y500–mass relation could lead to a different normalization because the XMM-BCS sample is on average at higher redshift than the Planck sample. In P11, they show a weak redshift evolution of Y500, where the index of the E(z) term is −0.007 ± 0.518. When they fit with the redshift evolution fixed to the self-similar expectation (2/3), it changes the Y500 normalization by −5 per cent (0.451/0.476), because E(z) is larger than 1 for z > 0. In comparison, if we assume an index of 0 for E(z) it will increase our Y500 normalization by 19 per cent compared to the |$E(z)^{2/3}$| case (the XMM-BCS sample has a mean redshift of 0.48). In this sense, there is some systematic uncertainty in the tension between the two samples that depends on the true redshift evolution of the Y500–mass relation. If the samples evolve self-similarly, then the Planck normalization should be reduced by 5 per cent.
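The two percentages quoted above can be checked with a short calculation. The sketch assumes the flat cosmology with (ΩM, ΩΛ) = (0.3, 0.7) adopted in Section 4.5 and evaluates E(z) at the mean sample redshift rather than averaging over individual clusters, which is a simplification; the function name is hypothetical.

```python
import math


def e_of_z(z, omega_m=0.3, omega_lambda=0.7):
    """Dimensionless Hubble parameter E(z) for a flat LCDM cosmology."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + omega_lambda)


# Dropping the E(z)^(2/3) factor at z = 0.48 rescales the normalization by:
no_evolution_boost = e_of_z(0.48) ** (2.0 / 3.0)
# Ratio of the two Planck normalizations quoted in the text:
planck_ratio = 0.451 / 0.476

print(f"XMM-BCS normalization change: {100 * (no_evolution_boost - 1):+.0f} per cent")
print(f"Planck normalization change:  {100 * (planck_ratio - 1):+.0f} per cent")
```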

Finally, the comparison to Planck is complicated because of differences between the SPT and Planck instruments and data sets and also differences between the analyses. Our analysis of SPT-SZ data calculates the SZE signal exclusively at frequencies below the SZE null (95 and 150 GHz), where the SZE signal is negative, while Planck also includes information from frequencies above the 220 GHz SZE null, where the signal is positive. Thus, contamination from sources like radio galaxies with steeply falling spectra, which primarily affect the lowest frequency bands in both instruments, would tend to bias both the Planck and SPT-SZ relations in the same way. But there are other possible sources of contamination such as dusty star-forming galaxies that are much brighter at higher frequencies. A population of star-forming galaxies associated with clusters could artificially increase the Planck measured Y500, but could only negatively bias the SPT-SZ measurements. In their paper, the Planck team shows that at the low-mass end (stellar mass smaller than 1011.25 M⊙), the Y500 estimated directly from the six high-frequency bands is higher than both the Y500 estimated when using a thermal dust model and the Y500 estimated using only the three low frequencies (100, 143, and 217 GHz) (Planck Collaboration XI 2013). However, at higher masses where we are seeing the discrepancy between the SPT and Planck signals there is no clear evidence for a dust related systematic in the cross-checks carried out by the Planck team. We present 2.8σ significant evidence for dusty galaxy flux in our cluster ensemble in Section 4.6. If present, this flux is likely contributing to some degree to the discrepancy we find between the SPT and Planck SZE signatures on these mass scales.

In summary, there are several potential contributing factors to the 2.8σ tension between the two results. None of them provide a convincing explanation for the offset on their own, but there are indications that differential sensitivity to dusty galaxy flux in SPT and Planck could be playing a role. What is needed next is a larger sample with higher quality data to probe this tension and – if the tension persists – to provide insights into the underlying causes of the discrepancy.

4.5 Potential systematics

In the likelihood approach, we fix the cosmological parameters and assume no redshift uncertainty to improve the efficiency of the calculation. We test both of these assumptions and find that neither significantly impacts the analysis. Specifically, the mass function used for correcting the sampling bias is adopted from a fixed cosmology (ΩM, ΩΛ, H0) = (0.3, 0.7, 70 km s−1 Mpc−1). When we alter these to the recent Wilkinson Microwave Anisotropy Probe results for Λ cold dark matter (Komatsu et al. 2011), we find a negligible impact.

We test the importance of possible photometric redshift biases by shifting the redshifts of all clusters up (or down) by 1σ. We update LX appropriately for the new redshifts, and we find a small (0.5σ) shift in the normalization and no change to the slope. Therefore, redshift biases at this level would not significantly bias the analysis.

4.6 Point source population

As already noted (see Section 4.3), there is a tendency for the systems with the most negative ξX to be those with nearby SPT point sources (see Fig. 4). In this section, we explore this association in more detail, testing whether it is biasing our constraints on the SZE mass–observable relations. For the purposes of our analysis, an object is identified as an SPT point source if it appears as a 4σ detection in a single frequency point-source filtered SPT-SZ map in any of the three bands (95, 150, or 220 GHz). An area within a 4 arcmin radius around each point source is defined, and all X-ray selected clusters within that region are flagged. There are six clusters flagged in our sample, and these are denoted with red diamonds in the figures presented above. Given the number densities of the SPT point sources (6 deg−2 in this field) and the X-ray selected clusters together with the association radius, we estimate a 36 per cent chance that these point sources are random associations with the clusters.
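The flagging step described above can be sketched as follows. This is a minimal illustration only, assuming sky positions are given as (RA, Dec) pairs in degrees; the function names `angular_separation_arcmin` and `flag_clusters` are hypothetical and are not taken from the SPT-SZ pipeline.

```python
import math

def angular_separation_arcmin(ra1, dec1, ra2, dec2):
    """Great-circle separation from the haversine formula; inputs in degrees."""
    ra1, dec1, ra2, dec2 = (math.radians(v) for v in (ra1, dec1, ra2, dec2))
    a = (math.sin(0.5 * (dec2 - dec1)) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin(0.5 * (ra2 - ra1)) ** 2)
    # Guard against rounding pushing the argument marginally above 1
    return math.degrees(2.0 * math.asin(math.sqrt(min(1.0, a)))) * 60.0

def flag_clusters(clusters, sources, radius_arcmin=4.0):
    """Return one boolean per cluster: True if any source lies within the radius.

    clusters, sources: lists of (ra, dec) tuples in degrees (illustrative inputs).
    """
    return [
        any(angular_separation_arcmin(cra, cdec, sra, sdec) <= radius_arcmin
            for sra, sdec in sources)
        for cra, cdec in clusters
    ]
```

Calling `flag_clusters` with `radius_arcmin=2.0` instead of the default reproduces the tighter association radius considered in the text.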

If we consider a smaller 2 arcmin association radius between the X-ray centre and the SPT point source location, we still find four associations: three of which correspond to the most negative ξX in Fig. 4, and the fourth is detected only at 220 GHz by SPT (and therefore is likely a dusty galaxy). With the smaller association radius the probability of a random association drops to 7 per cent, providing ∼2σ evidence that these point sources are physically associated with the X-ray selected groups.

To further study the point source issue, we cross-match our cluster sample with radio sources detected at 843 MHz by the SUMSS. The survey covers the whole sky at δ ≤ −30° with |b| > 10° down to a limiting source brightness of 6 mJy beam−1. For the cross-matching, we utilize the latest version 2.1 of the catalogue and a similar matching radius of 2 arcmin. This threshold is much larger than the SUMSS positional uncertainty, which has a median value of ∼2.3 arcsec.

Within 2 arcmin of the X-ray centres, we find a total of 19 SUMSS point sources matching 18 clusters from our sample. In comparison, given the number density of SUMSS sources (31.6 deg−2, Mauch et al. 2003), the number density of our clusters, and our association radius, we would expect to find ∼5 clusters randomly overlapping with point sources in the 6 deg$^2$ survey; there is a $3 \times 10^{-4}$ per cent chance of explaining the associations as random superpositions. Thus, our small sample provides clear evidence of physical associations between low-frequency radio point sources and X-ray selected groups and clusters; this is consistent with previous findings (Best et al. 2005; Lin & Mohr 2007) that low-frequency radio sources are associated with cluster galaxies in both optically and X-ray selected cluster samples. As expected, given the tendency for radio galaxies to have steeply falling spectra as a function of frequency, only a small fraction (3 out of 19) of these low-frequency radio galaxies are detectable at SPT frequencies.
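The expected number of chance SUMSS associations can be checked with a short calculation. The sketch below assumes that random matches follow Poisson statistics and ignores survey geometry and overlapping search regions; these simplifications, and the helper name `poisson_tail`, are ours rather than a description of the calculation actually performed.

```python
import math

def poisson_tail(k_min, mu, n_terms=200):
    """P(N >= k_min) for a Poisson distribution with mean mu (direct tail sum)."""
    return sum(
        math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))
        for k in range(k_min, k_min + n_terms)
    )

source_density = 31.6        # SUMSS sources per deg^2 (Mauch et al. 2003)
radius_deg = 2.0 / 60.0      # 2 arcmin association radius
n_clusters = 46              # size of the X-ray selected sample

area = math.pi * radius_deg ** 2                      # search area per cluster [deg^2]
expected_matches = n_clusters * source_density * area
p_chance = poisson_tail(18, expected_matches)         # >= 18 matched clusters by chance

print(f"expected random matches: {expected_matches:.2f}")
print(f"chance probability:      {p_chance:.1e}")
```

Under these assumptions the expected number of random matches is close to five, and the chance probability of 18 or more matches is of order $10^{-6}$, the same order of magnitude as the value quoted above.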

We use the BCS data (Desai et al. 2012) to examine the optical counterparts of the six SPT point sources that lie within 4 arcmin of our X-ray selected group and cluster sample. We do this by first associating the SPT point sources with a SUMSS source, which in general is only possible for the radio galaxies and not the dusty galaxies (Vieira et al. 2010). For our sample, three of the SPT point sources within 4 arcmin of the X-ray selected groups and clusters have SUMSS counterparts. All three of these have strongly negative ξX (see Fig. 4). For two of the three point sources, the optical counterpart is the brightest cluster galaxy (BCG) of the group. In the third case the SPT point source corresponds to a quasar candidate (MRC 2319−550; Wright & Otrupcek 1990) and does not appear to be a cluster member. The three remaining SPT point sources do not have SUMSS counterparts and are likely dusty galaxies; the SZE signatures ξX of those systems are not obviously impacted. Thus we confirm that in two of our 46 low-mass systems there are associated radio galaxies bright enough to be detected at SPT frequencies.

Based on the prediction from Lin et al. (2009), we would have expected that radio sources completely fill in the YSZ signal (100 per cent contamination) at a redshift of 0.1 (or a redshift of 0.6) in approximately 2.5 (or 0.5) per cent of clusters with similar mass ($M_{200} = 10^{14}\,\mathrm{M}_{\odot}$). For our 46 cluster sample, we would have expected this to happen for 1.15 (or 0.23) clusters, consistent with the two clusters we find associated with radio galaxies detected as point sources by SPT-SZ. We also expect a 20 per cent level YSZ contamination on 9 (2) per cent of the sample. This predicted contamination is significantly smaller than our current uncertainties on the YSZ normalization, and therefore cannot be tested in this analysis.
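The expected counts quoted above follow directly from the predicted fractions. The snippet below reproduces that arithmetic and, as an additional assumption of ours (not stated in the analysis), uses Poisson statistics to gauge how likely it is to find two or more fully contaminated systems; the function name is hypothetical.

```python
import math

def expected_and_tail(n_clusters, fraction):
    """Expected number of contaminated clusters and Poisson P(N >= 2)."""
    expected = n_clusters * fraction
    p_two_or_more = 1.0 - math.exp(-expected) * (1.0 + expected)
    return expected, p_two_or_more

n_clusters = 46
# Fractions with 100 per cent contamination at z = 0.1 and z = 0.6 (Lin et al. 2009)
for redshift, fraction in ((0.1, 0.025), (0.6, 0.005)):
    expected, p2 = expected_and_tail(n_clusters, fraction)
    print(f"z = {redshift}: expected {expected:.2f}, P(N >= 2) = {p2:.2f}")
```

For the z = 0.1 prediction this gives 1.15 expected systems and a roughly 30 per cent chance of finding at least two, which illustrates why two detections are not in conflict with the prediction.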

We repeat the SZE–mass relation analysis while excluding the half of the clusters with SUMSS point source associations. We find that the results are qualitatively similar using either the SPT-NPS or SPT-No-SUMSS sample (see Tables 3 and 4), although the uncertainties increase; this is consistent with the expectation that the level of the effect is too small to be measured with our sample. As already shown in Tables 3 and 4, our analysis shows no statistically significant difference in the SZE–mass relations when excluding or including the systems with nearby SPT point sources.

As pointed out in Section 4.4, the dusty star-forming galaxies would have a net negative biasing impact on the SPT-SZ measurement. We examine the contamination from the dusty galaxies, which are not bright enough to be directly detectable in the 150 and 95 GHz bands. To do this, we measure the specific intensities at 220 GHz in a single frequency adaptive filter that uses cluster profiles at the locations of our X-ray selected cluster sample. In the SPT-NPS sample, the evidence for dusty galaxies is significant at the 2.8σ level. We then convert the 220 GHz intensities to temperature fluctuations at 150 and 95 GHz by assuming the intensity follows $I \propto \nu^{3.6}$ for dusty sources (Shirokoff et al. 2011). These are then converted to the corresponding values of Y500. Dividing by the expected Y500 for a cluster of this redshift and X-ray luminosity, we estimate the inverse variance weighted mean contamination to be 32 ± 18 and 7 ± 4 per cent at 150 and 95 GHz, respectively. Together, this contamination would lead the SPT-SZ observed Y500 signature to be biased low by ∼(17 ± 9) per cent. This fractional contamination depends on the mass and redshift of the cluster together with the typical star formation activity. In particular, as a function of mass the SZE signature grows as $Y_{500} \propto M^{5/3}$, whereas the blue or star-forming component of the galaxy population falls (e.g. Weinmann et al. 2006); thus, contamination would fall with mass. As one pushes to even higher redshift than this sample (i.e. z > 1) where star formation is more prevalent, the contamination would be expected to increase.
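The frequency extrapolation used for the dusty galaxy flux can be illustrated with a short sketch. It assumes the power-law spectrum with index 3.6 from the text, nominal band centres of 95, 150 and 220 GHz (the effective SPT band centres differ slightly), and the standard conversion between specific intensity and CMB temperature units via the derivative of the Planck function; the function names are hypothetical.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant [J s]
K_BOLTZ = 1.380649e-23      # Boltzmann constant [J / K]
T_CMB = 2.7255              # CMB temperature [K]

def dust_intensity_ratio(nu_ghz, nu_ref_ghz=220.0, index=3.6):
    """I(nu) / I(nu_ref) for a power-law dust spectrum I ~ nu**index."""
    return (nu_ghz / nu_ref_ghz) ** index

def planck_derivative_shape(nu_ghz):
    """Frequency dependence of dB_nu/dT at T_CMB, up to a constant prefactor."""
    x = H_PLANCK * nu_ghz * 1.0e9 / (K_BOLTZ * T_CMB)
    return x ** 4 * math.exp(x) / (math.exp(x) - 1.0) ** 2

def dust_temperature_ratio(nu_ghz, nu_ref_ghz=220.0, index=3.6):
    """Dust signal in CMB temperature units at nu relative to nu_ref."""
    intensity = dust_intensity_ratio(nu_ghz, nu_ref_ghz, index)
    return intensity * planck_derivative_shape(nu_ref_ghz) / planck_derivative_shape(nu_ghz)

for nu in (150.0, 95.0):
    print(nu, dust_intensity_ratio(nu), dust_temperature_ratio(nu))
```

With these assumptions the dust intensity at 150 GHz is about a quarter of its 220 GHz value and at 95 GHz about 5 per cent, which is why the inferred contamination is much larger at 150 GHz than at 95 GHz.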

This level of contamination is consistent with a recent study of ∼550 galaxy clusters selected via optical red-sequence techniques. Using Herschel and SPT millimetre-wave data to jointly fit an SZE+dust spectral model, Bleem (2013) finds the contamination at 150 GHz to be 40 ± 30 per cent for low-richness optical groups ($M_{200} \sim 1 \times 10^{14}\,\mathrm{M}_{\odot}$). The fractional contamination declines as a function of optical richness and is measured to be 5 ± 5 per cent for the richest 3 per cent of clusters in the sample ($M_{200} \sim 3$–$6 \times 10^{14}\,\mathrm{M}_{\odot}$). A larger sample size combined with deeper millimetre-wave data will improve our ability to estimate the contamination from dusty galaxies in clusters and groups.

In summary, this small sample of 46 X-ray selected groups and low-mass clusters provides highly significant evidence of physical association with low-frequency SUMSS radio galaxies. For the SPT point source sample within 2 arcmin, there is less than 2σ statistical evidence of physical association, but two of the sources have optical counterparts that lie in the groups. Although we would expect physically associated high-frequency radio galaxies to bias the SZE mass–observable relation, our analysis provides no evidence of this impact. We use the 220 GHz SPT-SZ data in this sample to estimate that the Y500 measured by the SPT is biased low by ∼(17 ± 9) per cent. A larger sample from a broader survey (through XMM-XXL or eROSITA, for example) or a deeper SZE survey would help to improve our understanding of the impact of point sources.

5 CONCLUSIONS

Using data from the SPT-SZ survey, we have explored the SZE signatures of low-mass clusters and groups selected from a uniform XMM–Newton X-ray survey. The cluster and group sample from the XMM-BCS has a well-understood selection, and previously published calibrations of the LX–mass relation allow us to estimate the masses of each of these systems. Although these systems have masses that are too low for them to have been individually detected within the SPT-SZ survey, we are able to use the ensemble to constrain the underlying relationship between the halo mass and the SZE signature for low-mass systems.

Our method corrects for the Eddington bias and shows that there is no Malmquist-like bias effect on the SZE mass–observable relation within this X-ray selected sample. We test our likelihood using a large mock sample, and we show that with the current sample size we can at most constrain two scaling relation parameters: the power-law amplitude ASZ and slope BSZ (see equations 5 and 12).

We separate the sample of 46 groups and clusters into three subsamples: (1) the full sample, (2) the point source-free sample, for which we exclude systems with point sources detected at significance >4 at either 95, 150, or 220 GHz in the SPT-SZ data within 4 arcmin radius of the X-ray centre, and (3) the point source-free sample, with clusters at z < 0.3 excluded. We find that, due to the point source contamination in three of the lowest ξX groups, the full sample exhibits a steep slope ($B_{\rm SZ}=2.80^{+0.66}_{-0.63}$) that is in tension at 2.6σ with the high-mass SPT sample (BSZ = 1.40 ± 0.16). The point source-free subsample has a slope ($B_{\rm SZ}=2.14^{+0.86}_{-0.66}$) that is in rough agreement with the slope of the high-mass SPT sample (1.4σ difference). We find no evidence that the low-redshift clusters deviate from the scaling relation of the point source-free sample.

We also measure the Y500–mass relation for our sample and compare it to the results from the SPT-SZ high-mass clusters and the Planck sample. Our low-mass sample exhibits a preference for lower normalization and steeper slope than the other two samples, but the uncertainties are large (see Fig. 6 and Table 4). Within the SPT samples, there is no statistically significant evidence for differences in the scaling relation as one moves from high to low masses. On the other hand, the Planck sample exhibits a 2.8σ significant tension with our sample. As shown in Fig. 7, the lowest X-ray luminosity portion of our sample has lower Y500 than expected from the Planck relation. We discuss a range of possible explanations for this tension (Section 4.4), in particular contamination from dusty sources. Given the significance level of the tension the appropriate next step is to enlarge the sample to better quantify the differences in the SZE signatures of low- and high-mass clusters and the possible differences between Planck and SPT.

We examine radio point source contamination. Cross-matching our X-ray selected groups and clusters with the SUMSS catalogue, we find that 18 of 46 members have associated 843 MHz SUMSS point sources within 2 arcmin. This represents highly significant evidence of physical association between our sample and low-frequency point sources. At higher frequencies, we find four systems with associated SPT detected point sources; three of these also have SUMSS counterparts. Two of these three point sources have optical counterparts that lie within the X-ray group, and the third is a quasar candidate that is likely unassociated with the group. Having two out of 46 groups or clusters with physically associated bright, high-frequency point sources is consistent with the expectations from Lin et al. (2009). The predicted contamination from undetected radio point sources (Lin & Mohr 2007; Lin et al. 2009) in the remainder of the sample is significantly smaller than our measurement uncertainty on the Y500 normalization, and so we cannot test these predictions here.

We also examine the impact of undetected dusty galaxies. Using the SPT-SZ 220 GHz band, we find 2.8σ significant evidence of a flux excess due to dusty galaxies. Extrapolating to lower frequencies, we estimate that the measured Y500 signature is biased low by ∼(17 ± 9) per cent in this ensemble of low-mass clusters and groups. Given the different frequency coverage of Planck and SPT, it is not clear that the Planck bias due to dusty galaxy flux would be the same. If flux from dusty galaxies would induce a smaller negative bias or even a positive bias in Planck Y500 measurements, then that would reduce the tension between the Planck Y500–mass relation and ours.

We point out that these contamination levels are for this X-ray selected low-mass sample with a median mass of $10^{14}\,\mathrm{M}_{\odot}$, which is a factor of several below the typical mass of SPT selected clusters. Given the increasing rarity of blue and star-forming galaxies as one moves from groups to high-mass clusters (e.g. Weinmann et al. 2006), any contamination in the SPT selected sample would be much lower.

Finally, the receiver on the SPT was upgraded in 2012. The SPTpol camera provides sensitivity to CMB polarization and, more importantly for SZE work, increased sensitivity to CMB temperature fluctuations. The final SPTpol maps are expected to cover 500 square degrees of sky to noise levels of ∼5 and ∼9 μK-arcmin at 150 and 95 GHz (Austermann et al. 2012). Meanwhile, the XXL survey (Pierre et al. 2011) has increased the survey area that has a characteristic 10 ks XMM–Newton exposure from 6 to 25 deg$^2$. This should enable an interesting new insight into possible differences in the SZE signatures of low- and high-mass clusters. We make a forecast with a mock catalogue that consists of 144 clusters within the redshift range 0.2–1.2 and a bolometric flux limit of $1 \times 10^{-14}$ erg s$^{-1}$ cm$^{-2}$. Analysing this sample with the appropriate SPTpol increase in depth indicates that with the future sample we can tighten the fractional error on ASZ to 6 per cent compared to our current result of 30 per cent. On BSZ the uncertainty shrinks from 34 to 8 per cent. These improvements should enable a more revealing comparison of the SZE signatures of low- and high-mass clusters and perhaps also enable a detailed study of potential contamination of the SZE signal by associated radio or dusty galaxies.

ACKNOWLEDGEMENTS

We acknowledge the support of the DFG through TR33 ‘The Dark Universe’ and the Cluster of Excellence ‘Origin and Structure of the Universe’. Some calculations have been carried out on the computing facilities of the Computational Center for Particle and Astrophysics (C2PAP). The South Pole Telescope is supported by the National Science Foundation through grant PLR-1248097. Partial support is also provided by the NSF Physics Frontier Center grant PHY-1125897 to the Kavli Institute of Cosmological Physics at the University of Chicago, the Kavli Foundation and the Gordon and Betty Moore Foundation grant GBMF 947. This work is also supported by the US Department of Energy. Galaxy cluster research at Harvard is supported by NSF grants AST-1009012 and DGE-1144152. Galaxy cluster research at SAO is supported in part by NSF grants AST-1009649 and MRI-0723073. The McGill group acknowledges funding from the Natural Sciences and Engineering Research Council of Canada, Canada Research Chairs programme, and the Canadian Institute for Advanced Research.

REFERENCES

Allen S. W., Evrard A. E., Mantz A. B., 2011, ARA&A, 49, 409
Andersson K. et al., 2011, ApJ, 738, 48
Arnaud M., Pratt G. W., Piffaretti R., Böhringer H., Croston J. H., Pointecouteau E., 2010, A&A, 517, A92
Austermann J. E. et al., 2012, Proc. SPIE, 8452, 84521E
Benson B. A. et al., 2013, ApJ, 763, 147 (B13)
Best P. N., Kauffmann G., Heckman T. M., Brinchmann J., Charlot S., Ivezić Ž., White S. D. M., 2005, MNRAS, 362, 25
Bleem L. E., 2013, PhD thesis, Univ. Chicago
Bleem L. E. et al., 2015, ApJS, 216, 27B
Bock D. C.-J., Large M. I., Sadler E. M., 1999, AJ, 117, 1578
Bocquet S. et al., 2015, ApJ, 799, 214
Carlstrom J. E. et al., 2011, PASP, 123, 568
Cavaliere A., Fusco-Femiano R., 1976, A&A, 49, 137
Desai S. et al., 2012, ApJ, 757, 83
Fowler J. W. et al., 2007, Appl. Opt., 46, 3444
Hasselfield M. et al., 2013, J. Cosmol. Astropart. Phys., 7, 8
Komatsu E. et al., 2011, ApJS, 192, 18
Laganá T. F., Martinet N., Durret F., Lima Neto G. B., Maughan B., Zhang Y.-Y., 2013, A&A, 555, A66
Lin Y.-T., Mohr J. J., 2007, ApJS, 170, 71
Lin Y., Partridge B., Pober J. C., Bouchefry K. E., Burke S., Klein J. N., Coish J. W., Huffenberger K. M., 2009, ApJ, 694, 992
McDonald M. et al., 2013, ApJ, 774, 23
Majumdar S., Mohr J. J., 2004, ApJ, 613, 41
Mantz A., Allen S. W., Rapetti D., Ebeling H., 2010a, MNRAS, 406, 1759
Mantz A., Allen S. W., Ebeling H., Rapetti D., Drlica-Wagner A., 2010b, MNRAS, 406, 1773
Mauch T., Murphy T., Buttery H. J., Curran J., Hunstead R. W., Piestrzynski B., Robertson J. G., Sadler E. M., 2003, MNRAS, 342, 1117
Melin J.-B., Bartlett J. G., Delabrouille J., 2006, A&A, 459, 341
Mohr J. J., Mathiesen B., Evrard A. E., 1999, ApJ, 517, 627
Molnar S. M., Hearn N. C., Stadel J. G., 2012, ApJ, 748, 45
Mortonson M. J., Hu W., Huterer D., 2011, Phys. Rev. D, 83, 023015
Motl P. M., Hallman E. J., Burns J. O., Norman M. L., 2005, ApJ, 623, L63
Nagai D., Kravtsov A. V., Vikhlinin A., 2007, ApJ, 668, 1
Pierre M., Pacaud F., Juin J. B., Melin J. B., Valageas P., Clerc N., Corasaniti P. S., 2011, MNRAS, 414, 1732
Piffaretti R., Arnaud M., Pratt G. W., Pointecouteau E., Melin J.-B., 2011, A&A, 534, A109
Plagge T. et al., 2010, ApJ, 716, 1118
Planck Collaboration X, 2011a, A&A, 536, A10 (P11)
Planck Collaboration XI, 2011b, A&A, 536, A11
Planck Collaboration XI, 2013, A&A, 557, A52
Planck Collaboration XVI, 2014a, A&A, 571, A16
Planck Collaboration XX, 2014b, A&A, 571, A20
Pratt G. W., Croston J. H., Arnaud M., Böhringer H., 2009, A&A, 498, 361
Reichardt C. L. et al., 2013, ApJ, 763, 127
Saliwanchik B. R. et al., 2015, ApJ, 799, 137
Schaffer K. K. et al., 2011, ApJ, 743, 90
Semler D. R. et al., 2012, ApJ, 761, 183
Shirokoff E. et al., 2011, ApJ, 736, 61
Song J., Mohr J. J., Barkhouse W. A., Warren M. S., Rude C., 2012a, ApJ, 747, 58
Song J. et al., 2012b, ApJ, 761, 22
Stanek R., Evrard A. E., Böhringer H., Schuecker P., Nord B., 2006, ApJ, 648, 956
Story K. T. et al., 2013, ApJ, 779, 86
Šuhada R. et al., 2012, A&A, 537, A39 (S12)
Sun M., Voit G. M., Donahue M., Jones C., Forman W., Vikhlinin A., 2009, ApJ, 693, 1142
Sunyaev R. A., Zel'dovich Y. B., 1970, Comments Astrophys. Space Phys., 2, 66
Sunyaev R. A., Zel'dovich Y. B., 1972, Comments Astrophys. Space Phys., 4, 173
Tauber J. A. et al., 2010, A&A, 520, A1
Tinker J., Kravtsov A. V., Klypin A., Abazajian K., Warren M., Yepes G., Gottlöber S., Holz D. E., 2008, ApJ, 688, 709
Vanderlinde K. et al., 2010, ApJ, 722, 1180
Vieira J. D. et al., 2010, ApJ, 719, 763
Vikhlinin A. et al., 2009, ApJ, 692, 1033
Weinmann S. M., van den Bosch F. C., Yang X., Mo H. J., 2006, MNRAS, 366, 2
Williamson R. et al., 2011, ApJ, 738, 139
Wright A., Otrupcek R., 1990, Parkes Catalogue. ATNF
Zenteno A. et al., 2011, ApJ, 734, 3

APPENDIX A: LIKELIHOOD FUNCTION

We start from the full likelihood function based on B13 to constrain both the cosmological model and the scaling relations as (note that the observables are different from the ones used in B13)
\begin{eqnarray} \ln \ \mathcal {L}(\boldsymbol {c},\boldsymbol {r}_{\rm SZ},\boldsymbol {r}_{\rm X},\Theta ) &=& \sum _{i}\ln \frac{\mathrm{d}N(Y_{i}, f_{i}, z_{i}|\boldsymbol {c},\boldsymbol {r}_{\rm SZ},\boldsymbol {r}_{\rm X},\Theta )}{\mathrm{d}Y\mathrm{d}f\mathrm{d}z}\nonumber \\ &&- \int \!\!\!\!\int \!\!\!\!\int \frac{\mathrm{d}N(Y, f, z|\boldsymbol {c},\boldsymbol {r}_{\rm SZ},\boldsymbol {r}_{\rm X},\Theta )}{\mathrm{d}Y\mathrm{d}f\mathrm{d}z}\mathrm{d}Y\mathrm{d}f\mathrm{d}z, \nonumber\\ \end{eqnarray}
(A1)
where i runs over the cluster sample, Yi is the SZE signal (i.e. ξX or Y500), fi is the X-ray flux, and zi is the redshift. c represents the cosmological parameters, rSZ represents the SZE scaling relation, rX represents the X-ray scaling relation, and Θ describes the sample selection. dN(Yi, fi, zi|c, rSZ, rX, Θ) is the expected number of clusters within a three-dimensional cell dYdfdz, and the second term is the integral of the differential cluster number density over all Y, f and z.

Given the limited sample size, we focus on the SZE–mass scaling relation, keeping the cosmological parameters c and the X-ray scaling relation rX fixed. In addition, we assume the redshift measurements have insignificant uncertainties. Within this context, the X-ray flux is equivalent to the X-ray luminosity L.

The differential number density of clusters can be expressed as
\begin{eqnarray} &&{\frac{\mathrm{d}N(Y, L, z|\boldsymbol {c},\boldsymbol {r}_{\rm SZ},\boldsymbol {r}_{\rm X},\Theta )}{\mathrm{d}Y\mathrm{d}L\mathrm{d}z}}\nonumber \\ &&{\quad = P(Y|L,z,\boldsymbol {c},\boldsymbol {r}_{\rm SZ},\boldsymbol {r}_{\rm X}, \Theta )\,\frac{\mathrm{d}N(L, z| \boldsymbol {c}, \boldsymbol {r}_{\rm SZ}, \boldsymbol {r}_{\rm X}, \Theta )}{\mathrm{d}L\mathrm{d}z}, } \end{eqnarray}
(A2)
where the first factor is the conditional probability of Y given observables L and z with other model parameters, and we are using the relation dN/dY = P(Y)N. The second factor is the differential number density of clusters as a function of L and z.
The full likelihood can be split into three parts:
\begin{eqnarray} \ln \mathcal {L}(\boldsymbol {c},\boldsymbol {r}_{\rm SZ},\boldsymbol {r}_{\rm X},\Theta ) &=& \sum _{i}\ln P(Y_{i}|L_{i},z_i,\boldsymbol {c},\boldsymbol {r}_{\rm SZ},\boldsymbol {r}_{\rm X}, \Theta ) \nonumber \\ &&+\; \sum _{i}\ln \frac{\mathrm{d}N(L_{i}, z_{i}|\boldsymbol {c},\boldsymbol {r}_{\rm SZ},\boldsymbol {r}_{\rm X}, \Theta )}{\mathrm{d}L\mathrm{d}z} \nonumber \\ &&- \,\int \!\!\!\!\int \!\!\!\!\int \frac{\mathrm{d}N(Y, L, z|\boldsymbol {c},\boldsymbol {r}_{\rm SZ},\boldsymbol {r}_{\rm X}, \Theta )}{\mathrm{d}Y\mathrm{d}L\mathrm{d}z}\mathrm{d}Y\mathrm{d}L\mathrm{d}z. \nonumber\\ \end{eqnarray}
(A3)
If the sample selection is based on the X-ray only, then we have
\begin{equation} \mathrm{d}N\left(L_{i},z_{i}|\boldsymbol {c},\boldsymbol {r}_{\rm SZ},\boldsymbol {r}_{\rm X},\Theta _{\rm X}\right)=\Theta _{\rm X}(L_{i},z_{i})\mathrm{d}N(L_{i},z_{i}|\boldsymbol {c},\boldsymbol {r}_{\rm X}), \end{equation}
(A4)
where ΘX is simply the probability that a cluster with X-ray luminosity Li and redshift zi is observed. In addition,
\begin{equation} \int P(Y|L,z,\boldsymbol {c},\boldsymbol {r}_{\rm SZ},\boldsymbol {r}_{\rm X}, \Theta _{\rm X}) \mathrm{d}Y = 1, \end{equation}
(A5)
which simply means that, because there is only X-ray selection ΘX, any cluster that makes it into the sample due to its X-ray properties will always have a corresponding value Y. Using this condition together with equation (A2) allows us to write the third term in equation (A3) as
\begin{eqnarray} &&{\int \!\!\!\!\int \!\!\!\!\int \frac{\mathrm{d}N(Y, L, z|\boldsymbol {c},\boldsymbol {r}_{\rm SZ},\boldsymbol {r}_{\rm X}, \Theta _{\rm X})}{\mathrm{d}Y\mathrm{d}L\mathrm{d}z}\mathrm{d}Y\mathrm{d}L\mathrm{d}z }\nonumber\\ &&{\quad=\int \!\!\!\!\int \frac{\mathrm{d}N(L, z|\boldsymbol {c},\boldsymbol {r}_{\rm X}, \Theta _{\rm X})}{\mathrm{d}L\mathrm{d}z}\mathrm{d}L\mathrm{d}z. } \end{eqnarray}
(A6)
Note that by adopting equations (A4) and (A6), the last two terms in equation (A3) have no remaining dependence on Y and depend only on cosmology c, the X-ray–mass scaling relation rX and the X-ray selection ΘX. Within the context of a fixed cosmology and X-ray scaling relation, these two terms are therefore constant and do not contribute to constraining the SZE scaling relation rSZ. For the final likelihood that we use in this analysis, we thus obtain
\begin{equation} \ln \mathcal {L}(\boldsymbol {r}_{\rm SZ}) = \sum _{i}\ln P(Y_{i}|L_{i},z_{i},\boldsymbol {c},\boldsymbol {r}_{\rm X},\boldsymbol {r}_{\rm SZ},\Theta _{\rm X}). \end{equation}
(A7)
The derivation of the likelihood is correct even in the presence of correlated scatter between L and Y.
However, if the selection were based on both L and Y, then equation (A7) would no longer be equivalent to the full likelihood. For instance, equation (A4) would need to be extended as
\begin{eqnarray} &&{\mathrm{d}N\left(L_{i},z_{i}|\boldsymbol {c},\boldsymbol {r}_{\rm SZ},\boldsymbol {r}_{\rm X},\Theta \right)}\nonumber\\ &&{\quad=\int \mathrm{d}Y \Theta (Y,L_{i},z_{i}) \mathrm{d}N(Y, L_{i},z_{i}|\boldsymbol {c},\boldsymbol {r}_{\rm X}, \boldsymbol {r}_{\rm SZ}).} \end{eqnarray}
(A8)
Detailed modelling of the selection would therefore be required to calculate the likelihood and constrain the scaling relation parameters.
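To make the structure of equation (A7) concrete, the following toy sketch evaluates ln L(rSZ) as a sum of ln P(Y_i | L_i, z_i) over clusters. It is a minimal illustration under assumptions of our own and is not the analysis code: P(Y | L, z) is obtained by marginalizing over the unknown mass, weighted by a toy mass function and a lognormal L–mass relation; the SZE signal has Gaussian measurement noise and no intrinsic scatter; masses are in units of a pivot mass and luminosities in units of the luminosity at that mass; redshift dependence is omitted. All function names, parameter values and the form of `toy_mass_function` are hypothetical.

```python
import math

def gaussian(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def trapezoid(y, x):
    """Trapezoidal integration of samples y on grid x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))

def toy_mass_function(m):
    """Toy dn/dlnM (illustrative only): falling power law with exponential cut-off."""
    return m ** -1.0 * math.exp(-m)

# Grid in ln(M / M_pivot), spanning 0.1 to 10 pivot masses
LN_M = [math.log(0.1) + i * math.log(100.0) / 299 for i in range(300)]
MASSES = [math.exp(v) for v in LN_M]

def prob_y_given_l(y, ln_l, a_sz, b_sz, sigma_y, b_x=1.6, sigma_ln_l=0.4):
    """P(Y | L): marginalize P(Y | M) over P(M | L), with P(M | L) ~ P(L | M) dn/dlnM."""
    weights = [gaussian(ln_l, b_x * v, sigma_ln_l) * toy_mass_function(m)
               for v, m in zip(LN_M, MASSES)]
    integrand = [gaussian(y, a_sz * m ** b_sz, sigma_y) * w
                 for m, w in zip(MASSES, weights)]
    return trapezoid(integrand, LN_M) / trapezoid(weights, LN_M)

def log_likelihood(a_sz, b_sz, clusters):
    """Equation (A7): clusters is an iterable of (Y_i, ln L_i, sigma_Y_i)."""
    return sum(math.log(prob_y_given_l(y, ln_l, a_sz, b_sz, s))
               for y, ln_l, s in clusters)
```

Because the mass-function weighting enters through P(M | L), the steeply falling abundance of massive systems shifts the inferred mass at fixed luminosity downward, which is how an Eddington-type bias correction appears in this toy setting. The normalization over Y in equation (A5) holds by construction, since each P(Y | M) integrates to unity.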