A publishing partnership

A Statistical Inference Method for Interpreting the CLASP Observations

, , , , , , , , , , , , , , , , , , , , , , , , , , , and

Published 2018 September 20 © 2018. The American Astronomical Society. All rights reserved.
, , Citation J. Štěpán et al 2018 ApJ 865 48 DOI 10.3847/1538-4357/aad910

Download Article PDF
DownloadArticle ePub

You need an eReader or compatible software to experience the benefits of the ePub3 file format.

0004-637X/865/1/48

Abstract

On 2015 September 3, the Chromospheric Lyman-Alpha SpectroPolarimeter (CLASP) successfully measured the linear polarization produced by scattering processes in the hydrogen Lyα line of the solar disk radiation, revealing conspicuous spatial variations in the Q/I and U/I signals. Via the Hanle effect, the line-center Q/I and U/I amplitudes encode information on the magnetic field of the chromosphere–corona transition region, but they are also sensitive to the three-dimensional structure of this corrugated interface region. With the help of a simple line-formation model, here we propose a statistical inference method for interpreting the Lyα line-center polarization observed by CLASP.

Export citation and abstract BibTeX RIS

1. Introduction

Recently, the Chromospheric Lyman-Alpha Spectro-Polarimeter (CLASP) sounding rocket experiment provided the first-ever successful measurement of the linear polarization Q/I and U/I profiles produced by scattering processes in the hydrogen Lyα line of the solar disk radiation (Kano et al. 2017), as well as in the Si iii resonance line at 1206 Å (Ishikawa et al. 2017). Such novel spectropolarimetric observations, with a spatial resolution of about 3'', confirmed the following theoretical predictions for the Lyα line (Trujillo Bueno et al. 2011; Belluzzi et al. 2012; Štěpán et al. 2015):

  • 1.  
    at the CLASP spatial resolution, the line-center Q/I and U/I signals, where the Hanle effect operates, are smaller than 1% (see Figure 1);
  • 2.  
    the Q/I and U/I wing signals are larger than 1%, with the Q/I ones showing a clear center-to-limb variation (CLV) with negative amplitudes increasing toward the limb; and
  • 3.  
    both the line-core and wing signals show fluctuations along the spatial direction of the spectrograph's slit, which in the CLASP experiment was radially oriented from 20'' off-limb to 380'' on the solar disk.

Figure 1.

Figure 1. CLASP observations of the intensity (top panel), Q/I (middle panel), and U/I (bottom panel) line-center signals of the hydrogen Lyα line. Each point corresponds to a pixel along the slit of the spectrograph with its corresponding value for the cosine of the heliocentic angle, μ. The reference direction for the positive Stokes Q is the parallel to the nearest limb. The dashed curve in the middle panel shows the CLV of the Q/I line-center signal calculated by Trujillo Bueno et al. (2011) in the semi-empirical model C of Fontenla et al. (1993), while the dotted curve shows the CLV of the $\langle Q\rangle /\langle I\rangle $ line-center signal calculated by Štěpán et al. (2015) in the 3D MHD model of Carlsson et al. (2016; with the symbol $\langle ...\rangle $ meaning spatial average at each μ position). We note that the standard deviation of the observed Q/I and U/I signals due to the presence of noise is σ = 0.05%, i.e., comparable to the vertical size of the data markers in the plot.

Standard image High-resolution image

Although the CLASP observation of the hydrogen Lyα line confirmed the abovementioned theoretical predictions, it also revealed an interesting surprise, namely the lack of CLV in the Q/I line-center signal, in contrast with the predictions resulting from detailed radiative transfer calculations in one-dimensional (1D) semi-empirical and hydrodynamical models (Trujillo Bueno et al. 2011; Belluzzi et al. 2012) and in the Carlsson et al. (2016) three-dimensional (3D) magnetohydrodynamical (MHD) model of the solar atmosphere (Štěpán et al. 2015).

The CLASP observations encode unique information on the 3D structure of the upper solar chromosphere. In particular, via the Hanle effect, the line-center Q/I and U/I signals encode information on the magnetic field of the chromosphere–corona transition region (TR), but the same linear polarization signals are also sensitive to the geometry of this complex interface region. In order to constrain the mean field strength and geometrical complexity of the chromosphere–corona TR from the CLASP observation, it is necessary to develop suitable inference methods. In this paper, we propose a statistical inference method for interpreting the observed Lyα line-center polarization and illustrate its applicability with the help of a simple radiative transfer model of the formation of the Lyα line-core radiation in the corrugated surface that delineates the TR. In our next paper (J. Trujillo Bueno et al. 2018, in preparation), the statistical inference method explained here will be applied to the CLASP line-center data in order to constrain the magnetic field and geometrical complexity of the solar TR via radiative transfer calculations in 3D models of the solar atmosphere.

2. Lyα Line Formation in a 3D Medium

In semi-empirical models of the solar atmosphere, the core of the hydrogen Lyα line forms in a thin (10–100 km) layer at the base of the chromosphere–corona TR. Below this layer, the anisotropy of the line-core radiation is practically zero (see Figure 1 of Trujillo Bueno et al. 2011); therefore, such deeper regions do not contribute to the scattering polarization at the line core. An analogous situation occurs in 3D models resulting from MHD simulations, such as that of Carlsson et al. (2016). However, in this 3D model of the solar atmosphere, the layer of line-core formation is no longer horizontally oriented but is now a highly corrugated layer whose upper surface delineates the chromosphere–corona TR (see Figure 3 of Štěpán et al. 2015). At each spatial point, the thickness of this corrugated layer is typically much smaller than the local radius of curvature of its surface; hence, one can look at the Lyα line-core formation as taking place locally in a thin planar slab located at the height of line-center optical depth unity and having a well-defined normal vector, ${\boldsymbol{n}}$, that points in the direction of the gradient of the kinetic temperature.17 The concept of a local normal vector of the TR has already been introduced in Figure 7 of Štěpán et al. (2015) in order to quantify the degree of corrugation of the TR.

The variation of the physical quantities along the local normal vector ${\boldsymbol{n}}$ is much more important than that along the perpendicular direction. Therefore, it is a reasonable approximation to assume that the Lyα line core originates in a plane-parallel atmosphere whose normal vector ${\boldsymbol{n}}$ is inclined with respect to the local vertical. This picture is useful because it allows us to use locally inclined 1D models to approximate the complex 3D structure of the chromosphere–corona TR. We call this approximation the corrugated transition region (CTR) model, and we develop it quantitatively in the Appendix.

As shown below, our simple CTR model is capable of relatively easily explaining the surprising behavior of the Lyα polarization observed by CLASP, namely that the Q/I line-center signal does not show any clear CLV. In this paper, we use the CTR model to shed light on the sensitivity of the Lyα line-center scattering polarization to the mean field strength and geometrical complexity of the TR and to illustrate the general statistical inference method we propose for interpreting the CLASP observations. We note, however, that the CTR model should be understood only as a rough approximation to the realistic 3D problem of the scattering polarization in the Lyα line, because it fails at some fundamental level if applied to the analysis of real spectropolarimetric observations (see Appendix A.3). In other words, in this paper, the CTR model is used only for illustrative purposes and for facilitating the explanation of our statistical inference method, but not for interpreting any real observation of the Lyα scattering polarization. As we shall show in our next paper (J. Trujillo Bueno et al. 2018; in preparation), our interpretation of the CLASP data is instead based on a statistical model of the solar TR that uses many 3D models of the solar atmosphere, which we have constructed starting from a state-of-the-art model resulting from an MHD simulation.

Finally, it is necessary to emphasize that while calculations of the line intensity using the so-called 1.5D approximation, by means of which the radiation transfer in each column of the model atmosphere is treated as if it were a 1D plane-parallel atmosphere, can sometimes be a sufficiently good approximation, if scattering polarization is considered, the 1.5D approximation often fails (e.g., Štěpán & Trujillo Bueno 2016).

2.1. CLV and the Symmetry-breaking Problem

The linear polarization of the emergent spectral-line radiation is determined by the atomic polarization induced in the line's levels by the pumping radiation field and the orientation of the line of sight. The atomic polarization of a level of angular momentum J, which can be suitably quantified by the ${\rho }_{Q}^{K}$ multipolar components of the atomic density matrix, are closely connected to the radiation field anisotropy in the line-formation region. The symmetry properties of the radiation field can be conveniently described in terms of the irreducible radiation field tensor ${J}_{Q}^{K}$ (Landi Degl'Innocenti & Landolfi 2004). These quantities are determined by the thermodynamical and dynamical structure of the atmosphere. In the idealized case of a static, unmagnetized one-dimensional (1D) plane-parallel model atmosphere, the radiation field is fully determined by the vertical stratification of the atmosphere, and, due to the cylindrical symmetry of the problem, the only nonzero components of the radiation field tensor are the familiar mean intensity, ${J}_{0}^{0}$, and the anisotropy component, ${J}_{0}^{2}$. In such a model atmosphere, the linear polarization of the emergent spectral-line radiation can be either parallel or perpendicular to the projection of the local vertical (symmetry axis of the problem) onto the field of view (FOV). For more information on the anisotropy and polarization of the hydrogen Lyα line in plane-parallel models of the solar atmosphere, see Section 3 of Trujillo Bueno et al. (2011).

In the CTR model, the local normal vector ${\boldsymbol{n}}$ of the TR can be inclined with respect to the local vertical, and the reference direction of linear polarization changes accordingly, being either parallel or perpendicular to the projection of ${\boldsymbol{n}}$ onto the FOV (because the local symmetry axis of the radiation field is now ${\boldsymbol{n}}$ instead of Z). The modification of the emergent polarization by deviations of the model's physical conditions from the 1D plane-parallel limit is usually referred to as the symmetry-breaking effect (e.g., Manso Sainz & Trujillo Bueno 2011), and it has been demonstrated, using self-consistent 3D simulations, that it affects the polarization signals of many chromospheric lines, including Lyα (e.g., Štěpán 2015; Štěpán et al. 2015; Štěpán & Trujillo Bueno 2016).

The symmetry-breaking effects pose a great challenge for the interpretation of the scattering polarization spectra because the magnitude and orientation of the spectral-line polarization can be misinterpreted as being due to the action of a magnetic field via the Hanle effect. Carlin et al. (2012) emphasized that the gradients of the vertical component of the macroscopic velocity can change the polarization amplitudes, while Štěpán & Trujillo Bueno (2016) pointed out that the horizontal components of the macroscopic velocity can have an important impact on the scattering polarization signals via their breaking of the axial symmetry of the pumping radiation field. Fortunately, for the particular case of the Lyα line, the impact of the dynamical state of the model atmosphere on the anisotropy and symmetry breaking of the Lyα radiation is not very important, due to the large Doppler width of Lyα and its line-core formation taking place within a geometrically thin layer.

The CTR model allows us to construct random realizations of the TR at any given point on the solar disk, characterized by the cosine of the heliocentric angle, $\mu =\cos \theta $. In each such realization, the TR normal vector has a random inclination (sampled from the distribution given by Equation (18)), and the magnetic field vector has a random orientation and a magnitude sampled from the distribution of Equation (19). If we average the Stokes I, Q, and U signals emerging from many such random realizations, we can construct the CLV curves, which provide some information about the average properties of the atmosphere (hereafter referred to as the average CLV curves).

The impact of different levels of magnetization, quantified by the $\bar{B}$ parameter of Equation (19), and corrugation, quantified by the a parameter of Equation (18), can be seen in the CLV curves of Q/I shown in Figure 2. The simplest case of a nonmagnetized 1D plane-parallel atmosphere corresponds to the CLV curve with the largest amplitude, shown in the top left panel of the figure. As the magnetic field increases, the polarization amplitudes are reduced by the Hanle effect. In the limit of very strong fields (the so-called Hanle effect saturation regime; Landi Degl'Innocenti & Landolfi 2004), the polarization amplitudes are about a factor of 1/5 lower than in the nonmagnetized atmosphere.

Figure 2.

Figure 2. CLV of the Lyα Q/I line-center signals calculated in several CTR models. Top left panel: case of the plane-parallel FAL-C model ($a=\infty $) with increasing mean field strengths $\bar{B}$. Top right panel: unmagnetized CTR models ($\bar{B}=0\,{\rm{G}}$) with various corrugation parameters a. The black dotted line shows the CLV of the original FAL-C model, $(\infty ,0)$. Bottom left panel: CTR models with a = 5 and increasing $\bar{B}$ values. Bottom right panel: CTR magnetized model ($\bar{B}=50$ G) with various corrugation parameters a. The step between the curves corresponding to different $\bar{B}$ values is 5 G, while the step between the curves with different corrugation parameters a is 2.

Standard image High-resolution image

The case of a slightly corrugated atmosphere (a = 5) is shown in the bottom left panel. As in the 1D plane-parallel model, we see the depolarizing effect of a magnetic field of increasing strength up to the Hanle saturation limit.

The top right panel of Figure 2 demonstrates the effect of different corrugations of a nonmagnetized atmosphere. In the limit of a very large positive a value, we obtain the plane-parallel solution. As the corrugation parameter a decreases, the Q/I amplitudes also decrease. At the same time, the near-limb Q/I values tend to increase for an interval of a values. This is due to the fact that the sharp Q/I ≈ 0 value around μ = 0.1 in the plane-parallel case gets mixed with larger Q/I signals from models having slightly different orientations of the local TR normal vector. Similar behavior can be seen in more realistic investigations based on 3D models (see Figure 12 of Štěpán et al. 2015). For a = 0, the orientation of the TR normal vectors is isotropic in the upper hemisphere. This case corresponds to the maximum level of corrugation of the atmosphere, and the corresponding CLV practically vanishes.18 For a < 0, we see that the CLV curve is mostly positive, although it shows a sign reversal around μ = 0.8. An increasingly negative corrugation parameter makes the local TR normal vectors increasingly predominantly horizontal. This situation corresponds to the case in which the plasma structures in the model atmosphere are mostly vertically oriented, such as in the case of solar spicules. In the limit $a\to -\infty $, the normal vectors become horizontal. It is in this extreme case that we obtain the largest negative Q/I values near the disk center (μ ≈ 1). We note, however, that in the limit μ → 1, the Q/I value is zero.

In the bottom right panel of Figure 2, we show a case analogous to that of the top right panel but with an average strength $\bar{B}=50$ G, which is close to the critical Hanle field of Lyα. The impact of the magnetic field is simply a depolarization of the signals.

2.2. Distribution of the Polarization Signals and Statistical Models of the Atmosphere

Hereafter, we will often use q = Q/I, u = U/I, and ${\boldsymbol{S}}={[q,u]}^{{\rm{T}}}$ in order to simplify the notation.

As seen in Figure 1, the q and u line-center signals observed by CLASP are fluctuating around zero, with typical amplitudes around 0.2%. The Doppler core of the hydrogen Lyα line forms in the upper chromosphere, and it is important to note that the Hanle effect operates at the line center, where the CRD approximation provides a suitable approximation (see Belluzzi et al. 2012). The linear polarization of this resonance line is due to scattering processes in a complex 3D medium and the action of the Hanle effect. The Zeeman effect can be neglected due to the large Doppler width of the line and the weakness of the magnetic field in the upper chromosphere of the quiet Sun. Since the hydrogen Lyα line is very broad, the scattering polarization produced by anisotropic radiation pumping in this line is not very sensitive to the Doppler shifts produced by the macroscopic motions of the plasma. This contrasts with the case of narrower chromospheric lines whose linear polarization signals can be significantly affected by the presence of macroscopic velocity field gradients (see, e.g., Štěpán & Trujillo Bueno 2016). An estimate of the average CLV of the q signals observed by CLASP by, for instance, a parabolic fit of the data gives an almost zero CLV, which is in sharp contrast with the results obtained using the abovementioned semi-empirical and 3D MHD models (see the dashed and dotted lines of Figure 1).

In general, the statistical models chosen for interpreting the CLASP observations should take into account the fact that there may exist correlations between the intensity and polarization signals. This was predicted by Štěpán et al. (2015) and later confirmed by CLASP (see Kano et al. 2017), and in our next paper (J. Trujillo Bueno et al. 2018), we will take into account that plasma structures, such as internetwork and network regions, differ not only in intensity but also in their polarization signals. However, for the sake of simplicity, in this paper we ignore such correlations. This is natural in the case of our simple CTR model, which is based on a single semi-empirical model atmosphere with its corresponding line-center intensity at any given μ value.

While a fitting of the data with a smooth CLV curve (e.g., with the parabolic fit previously mentioned) can be visually compared with the synthetic CLVs, such an approach is not sufficient for a quantitative analysis. The main reason is that by fitting just the CLV, we ignore a significant amount of information contained in the q and u data. If the average CLV of q is zero, then all the models with a zero average CLV would be considered to be equally suitable, regardless of the amplitude of local polarization fluctuations.

Let us assume that the quiet-Sun atmosphere can be regarded as a stochastic system with observables having a definite probability distribution function at any given time, point on the FOV, and line of sight. The observable quantities of interest are the line-center Stokes parameters of the spectral line. In general, the statistical properties of the observables may depend on the phase of the solar cycle and the solar latitude of the quiet Sun. It is important to clarify that we do not assume the so-called microturbulent limit in this paper. The magnetic field vector and the orientation of the TR normal vector have definite values at any point of the FOV. However, these quantities can be considered to have random values because the plasma structures of the solar disk are randomly oriented over the FOV and as time goes by.

Consider the idealized case of a spatially and temporally resolved observation. The observed fractional linear polarization signals at a given μ value result from a particular realization of the atmospheric corrugation and magnetic field at a given spatial point and time. Let us assume that we have a statistical model of the quiet-solar atmosphere depending on a finite set of parameters ${\boldsymbol{\theta }}$ common to all the points in the FOV.19 These parameters define the global properties of the model atmosphere.

There is an associated probability density function (PDF) ${p}^{\mu }({\boldsymbol{S}}| {\boldsymbol{\theta }})$ that quantifies the probability of ${\boldsymbol{S}}$ conditional on the truth of ${\boldsymbol{\theta }}$, which can be obtained from the histograms of the Stokes signals produced by the model. In the simple case of the CTR model, the set of model parameters is ${\boldsymbol{\theta }}=\{a,\bar{B}\}$, and the corresponding signals are obtained simply by generating random orientations and magnetic field strengths of the TR from the model's PDFs controlled by ${\boldsymbol{\theta }}$ (see the Appendix). In general, the model can result from a 3D numerical simulation of the solar atmosphere, as we will show in our next paper.

In Figure 3, we show some examples of such PDFs for different model parameters. The general properties of these PDFs can be summarized as follows.

  • (1)  
    The u PDFs are symmetric with respect to the zero value. This follows from the definition of the Stokes parameters and the symmetry of the problem (u and −u must be equally likely).
  • (2)  
    The q PDFs are generally asymmetric, except at μ = 1, where they are equal to the u PDFs.

Figure 3.

Figure 3. PDFs ${p}^{\mu }(q| {\boldsymbol{\theta }})$ (top panels) and ${p}^{\mu }(u| {\boldsymbol{\theta }})$ (bottom panels) for various CTR model parameters ${\boldsymbol{\theta }}=\{a,\bar{B}\}$ (indicated at the top of each panel). For the sake of simplicity, we assume that the joint PDF can be factorized as ${p}^{\mu }({\boldsymbol{S}}| {\boldsymbol{\theta }})={p}^{\mu }(q| {\boldsymbol{\theta }}){p}^{\mu }(u| {\boldsymbol{\theta }})$. For each value of μ (horizontal axis), the darker shades of gray indicate larger PDF values. The dashed red curves in the top panels show the CLV of the q signals after averaging the q values at each μ (cf. Figure 2). The PDFs are normalized to unity at each μ value. See the text for more details. The gray-scale color table differs from panel to panel in order to improve the contrast of the individual plots. The normalization is such that $\int {p}^{\mu }(q| {\boldsymbol{\theta }})\ {dq}=\int {p}^{\mu }(u| {\boldsymbol{\theta }})\ {du}=1$ for every μ.

Standard image High-resolution image

The left panels of Figure 3 correspond to the most corrugated (a = 0) unmagnetized ($\bar{B}=0$) model. The CLV indicated by the red dashed line is very small, but the spread of possible values of both q and u around zero is very significant for all LOS μ values. This is due to the fact that the relative orientation of the TR normal vectors with respect to the line of sight (LOS) is only weakly dependent on the LOS. As mentioned in the previous section, there is a remaining asymmetry in the problem due to the restriction of the TR normal vectors to the upper hemisphere, which causes a nonzero CLV and generally a dependence of pμ on μ.

The panels in the second column correspond to the same level of corrugation (a = 0) but with a magnetic field in the saturation regime of the Hanle effect. The depolarizing effect of the magnetic field leads to a significant reduction of the probability of having large polarization signals. The q and u signals generated from this model would therefore have much smaller amplitudes along the spatial direction of the spectrograph slit, assuming it is radially oriented and extending from μ = 0.1 to 1.

The third column of Figure 3 shows the PDFs of a strongly magnetized FAL-C plane-parallel model atmosphere. In this model, the symmetry-breaking effects are only due to the action of the magnetic field. In contrast to the unmagnetized FAL-C model, in which the PDF would coincide with the red dashed curve of the upper panel of the figure, the magnetic field can produce both negative and positive q and u signals. While positive and negative q values are equally likely near the disk center, there is a significant predominance of negative signals for μ < 1. In contrast to these models, the q signals observed by CLASP along the spectrograph slit show a symmetric distribution around zero (see the middle panel of Figure 1).

The last column of Figure 3 contains another example of the PDFs in a slightly corrugated unmagnetized model with predominantly vertical normal vectors (a = 5). While there is a clear average CLV in q, the spread of values around it and the distribution of u signals around zero indicate that the symmetry-breaking mechanism plays a very significant role. This model resembles the case of the 3D MHD model of Carlsson et al. (2016; see the Appendix).

Different combinations of magnetic fields and corrugation parameters produce a rich variety of PDFs that contain much more information than the CLV curves. Instead of a single value of q at a given μ, one has PDFs for q and u that can be quantitatively compared with the distribution of observed data. This is especially important for observations like CLASP, where we have only a small amount of data. In the following section, we formulate the statistical inference method in a more rigorous way.

3. The Statistical Inference Method

3.1. Bayesian Analysis

We denote the observed line-center Q/I and U/I signals at the spatial position i along the CLASP spectrograph slit by ${{\boldsymbol{S}}}^{\mathrm{obs}}({\mu }_{i})={[{q}^{\mathrm{obs}}({\mu }_{i}),{u}^{\mathrm{obs}}({\mu }_{i})]}^{{\rm{T}}}$, where μi is the cosine of the heliocentric angle. Likewise, ${\boldsymbol{S}}({\mu }_{i};{\boldsymbol{\theta }})={[q({\mu }_{i};{\boldsymbol{\theta }}),u({\mu }_{i};{\boldsymbol{\theta }})]}^{{\rm{T}}}$ represents the random Q/I and U/I line-center signals sampled from our model atmosphere characterized by the hyperparameters ${\boldsymbol{\theta }}$. For example, in the CTR model, ${\boldsymbol{\theta }}=\{a,\bar{B}\}$. Our generative model of the CLASP observations can be defined as

Equation (1)

where ${\boldsymbol{\epsilon }}$ represents uncorrelated Gaussian noise with variance σ2.

Using the notation of Gregory (2005), the Bayes theorem reads

Equation (2)

where, for the case of the CLASP experiment, we have

  • 1.  
    $D\equiv {{\boldsymbol{S}}}^{\mathrm{obs}}$: proposition representing the data;
  • 2.  
    ${H}_{j}\equiv {\boldsymbol{S}},{\boldsymbol{\theta }}$: proposition asserting the truth of the model hypothesis; and
  • 3.  
    I: proposition representing our prior information (e.g., the μi value, point spread function (PSF) of the instrument, etc.).

The quantity $p({H}_{j}| D,I)$ is the desired posterior PDF of the hypothesis given the observed data, $p({H}_{j}| I)$ is the prior probability distribution of the model parameters, and $p(D| {H}_{j},I)$ is the likelihood, i.e., a measure of how well the given model can fit the observed data.

In the following, we drop I from the equations for notational simplicity because it represents obvious information we have about our measurements. Finally, we note that the denominator of Equation (2), the so-called evidence, satisfies the normalization condition

Equation (3)

which ensures that ${\sum }_{j}p({H}_{j}| D,I)=1$. Since this normalization constant is not relevant for the final expressions given below, hereafter we take it equal to unity.

In our notation, the Bayes theorem applied to each spatial pixel (characterized by μi) of the CLASP spectrograph slit establishes that20

Equation (4)

We now marginalize this posterior PDF by integrating over ${\boldsymbol{S}}$, which allows us to obtain the following expression for the posterior of the hyperparameters:

Equation (5)

The prior distribution ${p}^{{\mu }_{i}}({\boldsymbol{S}},{\boldsymbol{\theta }})$ can be factorized as (cf. Gregory 2005)

Equation (6)

where $p({\boldsymbol{\theta }})$ is the prior of the hyperparameters. Under the assumption of uncorrelated Gaussian noise, the likelihood factor ${p}^{{\mu }_{i}}({{\boldsymbol{S}}}^{\mathrm{obs}}| {\boldsymbol{S}},{\boldsymbol{\theta }})$ reads

Equation (7)

where we have used the compact notation $\parallel {\boldsymbol{S}}-{{\boldsymbol{S}}}^{\mathrm{obs}}{\parallel }^{2}={(q-{q}^{\mathrm{obs}})}^{2}+{(u-{u}^{\mathrm{obs}})}^{2}$. Finally, the marginal posterior of the hyperparameters for a single pixel reads

Equation (8)

By using the notation for the likelihood

Equation (9)

we can finally write the posterior of the hyperparameters based on the data from all the CLASP pixels:

Equation (10)

It is important to emphasize that, in our forthcoming paper regarding the interpretation of the Lyα line-center data observed by CLASP, we will not aim at performing a point-by-point inversion of the atmospheric physical quantities. Instead, we will use the information contained in all the pixels along the CLASP spectrograph slit in order to determine the parameterization $\hat{{\boldsymbol{\theta }}}$ of suitable statistical models of the solar atmosphere that maximizes the marginal posterior of Equation (10). In other words, our first step will be to estimate some global properties of the quiet-Sun atmosphere observed by CLASP (e.g., the average field strength and degree of corrugation of the chromosphere–corona TR).

3.2. Maximum-likelihood Estimation

The general Equation (10) allows us to impose a reasonable prior distribution for the hyperparameters. In this paper, for the hyperparameters of our CTR model, we assume a flat prior distribution, $p({\boldsymbol{\theta }})\propto 1$, and we therefore restrict our analysis to the study of the likelihood ${ \mathcal L }$.

Our goal is to find the estimator

Equation (11)

which maximizes the likelihood function; i.e., it corresponds to the model with the strongest observational evidence. Given two model parameterizations ${{\boldsymbol{\theta }}}_{A}$ and ${{\boldsymbol{\theta }}}_{B}$, the likelihood ratio

Equation (12)

measures the strength of evidence of ${{\boldsymbol{\theta }}}_{A}$ with respect to ${{\boldsymbol{\theta }}}_{B}$. In the following, the maximum of the likelihood function corresponding to a particular observation will be denoted by ${{ \mathcal L }}_{\max }={ \mathcal L }(\hat{{\boldsymbol{\theta }}}| {{\boldsymbol{S}}}^{\mathrm{obs}})$, and the likelihood ratio

Equation (13)

will be used to quantify the model fitness.

3.3. Maximum-likelihood Estimation Using the CTR Model

In this section, we show a practical application of the maximum-likelihood estimation (MLE) method using synthetic data corresponding to the CTR model introduced in the Appendix. To this end, we assume that the true quiet-solar chromosphere corresponds to our CTR model with the particular value of parameters, ${{\boldsymbol{\theta }}}_{0}=\{{a}_{0},{\bar{B}}_{0}\}=\{5,50\,{\rm{G}}\}$. We use this model to generate random realizations of the Stokes parameters along the imaginary spectrograph slit oriented radially from near the solar limb toward the disk center. The abovementioned data play the role of an actual observation, and we will show how to infer the parameters of the model by applying the MLE.

Figure 4 shows three different realizations (i.e., synthetic observations) of a CLASP-like experiment with different numbers of data points affected by Gaussian noise. It is important to realize that the accuracy of the MLE depends sensitively on the amount of available data. Under the assumption that the real solar atmosphere can be exactly described by the CTR model, then in the limit of an infinite number of data points, the MLE provides exact results. Finite data sets lead to approximate results. For this reason, it is important to study the accuracy of the method with respect to N.

Figure 4.

Figure 4. Multiple random realizations of the line-center Q/I (left panels) and U/I (right panels) signals of the Lyα line sampled from the CTR model with parameters ${{\boldsymbol{\theta }}}_{0}=\{{a}_{0},{\bar{B}}_{0}\}=\{5,50\,{\rm{G}}\}$. For an increasing number of data points, N (i.e., the number of pixels along the imaginary spectrograph slit extending from μ = 0.1 to 0.9), the fractional polarization signals are distributed around the average CLV curves (shown with the red lines) according to the PDF ${P}_{0}^{{\mu }_{i}}({q}_{i},{u}_{i}| {{\boldsymbol{\theta }}}_{0})$, where i = 1, ..., N. We have added Gaussian noise with a standard deviation σ = 0.05% to the data (the 1σ bars are only shown in the top panels).

Standard image High-resolution image

Without a loss of generality, we use the approximation from Section 2.2 for decoupling the Stokes parameters,

Equation (14)

This factorization simplifies the practical calculations, and our numerical tests show that it does not significantly affect the results, at least in the case of the CTR model. The practical calculation of the likelihood of model hyperparameters ${\boldsymbol{\theta }}=\{a,\bar{B}\}$ requires us to evaluate the integral on the right-hand side of Equation (9) for every pixel along the spectrograph slit. The PDFs of q and u can be approximated by normalized histograms of these quantities constructed from random samples of the TR realization using the CTR model. For any given parameterization, ${\boldsymbol{\theta }}$, the distribution of the normal vectors of the TR is given by Equation (18), and the distribution of magnetic field strengths follows from Equation (19). The polarization of the emergent radiation can then be efficiently calculated using the methods of the Appendix and incorporated into the histogram of the polarization signals.

Since the number of free parameters is only two, it is feasible to efficiently calculate a dense grid of models covering the relevant part of the parameter space and determine the estimator $\hat{{\boldsymbol{\theta }}}$. If the number of free parameters is larger, more advanced methods, such as the Markov chain Monte Carlo method, can be considered.

As the magnetic field strength reaches the saturation limit of the Hanle effect ($\bar{B}\gg {B}_{H}$), the polarization signals are no longer sensitive to the value of $\bar{B}$. In order to be able to visualize the infinite range of magnetic field intensities, we use the parameter

Equation (15)

instead of $\bar{B}$. The likelihood ratios corresponding to the slit realizations of Figure 4 are shown in Figure 5. The parameter b on the horizontal axis is equal to zero for $\bar{B}=0$ G, 0.5 for $\bar{B}={B}_{H}$, and 1 for $\bar{B}=\infty $. It is clear that as the number of data points in the observation increases above about 102, the MLE becomes very close to the true value, and the uncertainty of the inferred parameters significantly decreases.

Figure 5.

Figure 5. Maps of the likelihood ratios (see Equation (13)) corresponding to the three particular data realizations of Figure 4. From top to bottom, the number of observed points is 10, 50, and 200. The white dot at a = 5, $\bar{B}=50$ G (b ≈ 0.5) indicates the true model parameters. For N = 200, the estimator, i.e., the point of maximum likelihood, is already very close to the true value.

Standard image High-resolution image

3.4. Note on the Limited Spatial Resolution of the Observations

The observed data are always affected by the finite spatial and spectral resolution of the instrument, and the limited resolution affects the statistical distribution of the observed signals. In the analysis of the data, one needs to take this fact into account and apply the correct degradation procedure equivalent to that of the telescope and spectrograph to the synthetic spectra before constructing the ${p}^{\mu }({\boldsymbol{S}}| {\boldsymbol{\theta }})$ functions.

This cannot be done in the CTR model, because this simple model does not provide any information about the local spatial variability of the Stokes parameters in the FOV (see Appendix A.3). However, such information is available in 3D MHD models in which one has to convolve the theoretical spectra with the point-spread function of the instrument and degrade the spectra in the wavelength space. In our following paper, we apply the statistical inference method described in Section 3.1 to the CLASP observations.

4. Concluding Comments

Recently, unprecedented spectropolarimetric observations of the hydrogen Lyα line of the solar disk radiation have been provided by the CLASP sounding rocket experiment. The Q/I and U/I line-center signals observed along the spatial direction of the (radially oriented) spectrograph slit encode key information on the magnetic field and geometrical complexity of the corrugated layer that delineates the chromosphere–corona TR. With only one spectral line, it is not possible to determine the magnetic field of the solar TR, unless the geometrical complexity of its defining corrugated layer is known beforehand. However, it should be possible to constrain the mean field strength and geometrical complexity via the application of a suitable statistical inference method.

The aim of this paper has been to propose a statistical inference method based on the concept of maximum likelihood, which we consider suitable for interpreting the CLASP observations. Here we have illustrated this method by applying it to the theoretical Q/I and U/I line-center signals resulting from a simple line-formation model (the CTR model) that we have introduced for qualitatively understanding the CLASP observations. In our next paper (J. Trujillo Bueno et al. 2018, in preparation), we will apply the statistical inference method developed here in order to interpret the CLASP data themselves and estimate global properties of the quiet-Sun atmosphere by means of more realistic models of the chromosphere–corona TR resulting from 3D numerical simulations.

The CLASP team is an international partnership between the NASA Marshall Space Flight Center, National Astronomical Observatory of Japan (NAOJ), Japan Aerospace Exploration Agency (JAXA), Instituto de Astrofísica de Canarias (IAC), and Institut d'Astrophysique Spatiale; additional partners are the Astronomical Institute ASCR, Istituto Ricerche Solari Locarno (IRSOL), Lockheed Martin, and University of Oslo. The US participation was funded by NASA Low Cost Access to Space (award Number 12-SHP 12/2-0283). The Japanese participation was funded by the basic research program of ISAS/JAXA, internal research funding of NAOJ, and JSPS KAKENHI grant Numbers 23340052, 24740134, 24340040, and 25220703. The Spanish participation was funded by the Ministry of Economy and Competitiveness through project AYA2010-18029 (Solar Magnetism and Astrophysical Spectropolarimetry). The French hardware participation was funded by the Centre National d'Etudes Spatiales (CNES). Moreover, we acknowledge the supercomputing grants provided by the Barcelona Supercomputing Center (National Supercomputing Center, Barcelona, Spain), as well as the financial support received through grant 16-16861S of the Grant Agency of the Czech Republic, project RVO:67985815 of the Czech Academy of Sciences, and projects AYA2014-60476-P and AYA2014-55078-P of the Spanish Ministry of Economy and Competitiveness. JS is grateful to the Visiting Researchers Programme of the Jesús Serra Foundation for financing a four-month working visit at the IAC. Finally, JTB acknowledges the funding received from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (ERC Advanced Grant agreement No. 742265).

Appendix: Corrugated Transition Region Model

A.1. Definition

The CTR model is a simple approximation to the full 3D problem of formation of the Lyα line core. Inspired by the peculiarities of the Lyα formation that have been discussed in Section 2, the CTR model assumes that the TR can be locally approximated by a thin slab whose normal vector is inclined with respect to the local vertical (see Figure 6). For our purposes, we assume that the thermal structure of the slab is that of the semi-empirical FAL-C model (Fontenla et al. 1993). Since in the CTR model, the problem becomes effectively 1D instead of 3D, it allows us to greatly speed up the calculations while retaining some of the crucial properties of the symmetry-breaking mechanism of the full 3D problem.

Figure 6.

Figure 6. Motivation for the CTR model. The physical quantities of the TR of geometrical thickness D in which the Lyα line core forms vary much faster along the local TR normal vector ${\boldsymbol{n}}$ than along the perpendicular direction. Therefore, we approximate the line-core formation region using an inclined plane-parallel model whose normal vector ${\boldsymbol{n}}$ makes an angle θn with respect to the local vertical direction Z. The quantity L is of the order of the local radius of curvature of the TR.

Standard image High-resolution image

In the CTR model, the physical state of the TR at any given spatial point and instant of time can be described by a set of parameters ${\boldsymbol{\xi }}\equiv \{{\mu }_{B},{\chi }_{B},B,{\mu }_{n},{\chi }_{n}\}$, where ${\mu }_{B}=\cos {\theta }_{B}$ and χB are the cosine of the inclination and the azimuth of the magnetic field vector, respectively; B is the magnetic field strength; and ${\mu }_{n}=\cos {\theta }_{n}$ and χn are the cosine of the inclination and azimuth of the normal vector of the TR, respectively. For a given set of parameters ${\boldsymbol{\xi }}$, it is easy to synthesize the emergent Stokes profiles for any LOS using a 1D non-LTE radiative transfer code and applying the necessary rotations of the reference frame (see below).

The statistical model of the quiet-Sun atmosphere introduced in Section 2.2 allows us to randomly sample the local atmospheric parameters ${\boldsymbol{\xi }}$ from a probability density distribution $p({\boldsymbol{\xi }}| {\boldsymbol{\theta }})$ given the model parameters ${\boldsymbol{\theta }}$ that play the role of hyperparameters for the local physical quantities. In the CTR model, we define the joint PDF in a factorized form:

Equation (16)

It is evident that $p({\chi }_{n}| {\boldsymbol{\theta }})$ and $p({\chi }_{B}| {\boldsymbol{\theta }})$ must be uniform in the interval [0, 2π). In order to keep the model simple, we furthermore assume that the magnetic field vector is distributed isotropically; hence, $p({\mu }_{B}| {\boldsymbol{\theta }})$ is uniform in [−1, 1]. Therefore, the resulting distribution reads

Equation (17)

In this paper, we assume the PDF of μn to have the form

Equation (18)

and we assume μn ∈ [0, 1]; i.e., we do not allow the TR to be oriented toward the solar interior. The hyperparameter $a\in (-\infty ,\infty )$, hereafter the corrugation parameter, determines whether the TR normal vector is predominantly vertical (a > 0) or horizontal (a < 0). The limiting cases of interest are the well-known case of a plane-parallel atmosphere ($a=\infty $) and the case of an atmosphere composed of vertically oriented plasma structures ($a=-\infty $). In the a ≈ 0 case, the distribution of ${\mu }_{{\boldsymbol{n}}}$ is that of highly chaotic orientations of the TR. As an illustration of the adequacy of the distribution (Equation (18)), we show in Figure 7 the histogram of μn obtained from a 3D model snapshot of an enhanced network region resulting from an MHD simulation by Carlsson et al. (2016). The distribution qualitatively agrees with the functional form of Equation (18), with the best fit obtained for a ≈ 5 (solid curve).

Figure 7.

Figure 7. Histogram of the cosines of the normal vectors, ${\mu }_{{\boldsymbol{n}}}$, in the 3D MHD model of Carlsson et al. (2016) and its best least-squares fit with the distribution of Equation (18) with a ≈ 5.

Standard image High-resolution image

The CTR distribution of the magnetic field intensity, B, is given by the Rayleigh distribution,

Equation (19)

parameterized by the average magnetic field value, $\bar{B}$, which is the second hyperparameter of the CTR model.21

We note that the factorization in Equation (16) is possible only if the involved quantities are independent. In reality, it is very likely that the orientation of the magnetic field vector is closely related to the orientation of the observed plasma structures in the TR (see, e.g., Vourlidas et al. 2010). In our academic CTR model, for simplicity, we do not consider such correlations. In the next paper of this series, we will show that our inference of information from the CLASP line-center data is based on a more realistic statistical model of the solar TR, based on parameterized 3D models having different levels of magnetization and geometrical complexity.

A.2. Calculation of the Emergent Polarization Signals

The hyperparameters ${\boldsymbol{\theta }}=\{a,\bar{B}\}$ define the model for the random atmosphere realizations. Since we use the FAL-C semi-empirical model for our calculations, the emergent Stokes parameters need to be calculated numerically by applying a non-LTE radiative transfer code. It would be very time-consuming to perform a non-LTE calculation for every possible combination of the normal and magnetic field vectors. Therefore, we use the Eddington–Barbier approximation to speed up the calculations. We consider an unmagnetized 1D plane-parallel atmosphere. For each LOS defined by its μ value, the emergent Q/I signal at the center of Lyα can be approximated by (Trujillo Bueno et al. 2011)

Equation (20)

where ${J}_{Q}^{K}$ are the multipolar components of the radiation field tensor at the height along the LOS where the line-center optical depth τ is unity. For each μ value, the self-consistent non-LTE calculation provides the line-center Q/I. From Equation (20), we calculate the corresponding ${J}_{0}^{2}$ and ${J}_{0}^{0}$ values.22

In order to calculate the emergent signals from the inclined TR at the point being considered, we use the formalism of the irreducible spherical tensors. Given the components of a general spherical tensor ${[{T}_{P}^{K}]}_{\mathrm{old}}$ in an "old" reference frame, its components in a "new" frame are given by the transformation

Equation (21)

where R is the rotation that brings the old frame to the new one and ${{ \mathcal D }}_{{PQ}}^{K}(R)$ is the corresponding rotation matrix (see Section 2.7 of Landi Degl'Innocenti & Landolfi 2004).

Calculation of the emergent fractional polarization from the rotated TR involves several transformations of the coordinate systems. In the following, we use three different reference frames: (1) the laboratory frame, L, whose positive Z axis is parallel to the local vertical in the atmosphere; (2) the TR frame, T, whose positive Z axis is parallel to the normal vector of the TR; and (3) the magnetic field reference frame, M, whose positive Z axis is parallel to the local magnetic field vector. The algorithm to calculate the emergent fractional polarization for a particular LOS, orientation of the TR normal vector, and arbitrary magnetic field is as follows.

  • 1.  
    Calculate the cosine of the inclination of the LOS vector, $\hat{{\boldsymbol{l}}}$, in the TR frame: ${[\mu ]}_{T}=\hat{{\boldsymbol{l}}}\cdot {\boldsymbol{n}}$.
  • 2.  
    For [μ]T, we obtain the value of [Q/I]T from the CLV curve of the FAL-C model (this curve is calculated by numerically solving the full non-LTE radiative transfer problem in the absence of magnetic fields).
  • 3.  
    From Equation (20), we get ${[{J}_{0}^{2}/{J}_{0}^{0}]}_{T}$.
  • 4.  
    We express the ${[{J}_{Q}^{K}]}_{M}$ tensor in the magnetic field reference frame (hereafter the M-frame) by rotating the reference frame using Equation (21).
  • 5.  
    We solve the statistical equilibrium equations in the magnetic field reference frame to obtain ${[{\rho }_{Q}^{K}]}_{M}$ (i.e., the multipolar components of the atomic density matrix of the only level of Lyα that contributes to its scattering polarization, namely its upper level with J = 3/2). This is particularly easy in the M-frame by using Equation (10.27) of Landi Degl'Innocenti & Landolfi (2004).
  • 6.  
    Express the atomic density matrix in the laboratory frame ${[{\rho }_{Q}^{K}]}_{L}$ by rotating the reference frame from M to L.
  • 7.  
    Calculate the emergent Stokes parameters in the laboratory frame.

A.3. CTR Model Limitations

The CTR model helps us to easily understand some of the key aspects of the Lyα line-core polarization in the solar TR. Thanks to its simplicity, it is also useful for demonstrating the general statistical inference method that will be applied in our next paper. Below, we summarize the main deficiencies of the CTR model that make it of limited use for practical applications.

The CTR model is based on only one atmospheric component, FAL-C. Therefore, it cannot take into account differences between network and internetwork, and, in general, it cannot be used to fit the line-center intensity. This problem can be partially solved by using 1D multicomponent models, but since the CTR model cannot be used in practice for a number of other reasons, such a generalization would probably be inappropriate.

With the CTR model, we cannot account for the limited spatial resolution of the observations. The 3D radiative transfer calculations show that the fractional polarization amplitudes can reach up to several percent in the case of observations with very high spatial resolution (Štěpán et al. 2015). The spatial smearing of the signals leads to a significant decrease of the amplitudes. Accidentally, the amplitudes of the Q/I and U/I CLASP signals are comparable to those corresponding to the CTR model, but the CLASP amplitudes are already significantly affected by the limited spatial resolution, and the true solar signals have, therefore, necessarily higher amplitudes.

While the CTR model is an intuitive approximation of reality, the symmetry breaking at a given point in the TR is always affected by the illumination from distant regions that reaches that point through the optically thin gaps of coronal plasma (see Figure 3 of Štěpán et al. 2015). In addition, the local curvature of the TR is often not negligible, and it produces additional symmetry breaking.

Footnotes

  • 17 

    An alternative definition is possible, with ${\boldsymbol{n}}$ being the normal vector to the τ = 1 surface. Both definitions lead to very similar results, which provides an additional justification of the approximation.

  • 18 

    The small CLV residual is due to the fact that the inclination of the normal vectors in the CTR model is restricted to the [0°, 90°] interval.

  • 19 

    In the hierarchical Bayesian formulation of the inference problem described below, we can consider these parameters as the hyperparameters of the model. Following the Bayesian terminology, the term hyperparameter stands for the parameter of the prior distribution (Gelman & Hill 2007).

  • 20 

    For notational simplicity, we drop the explicit dependence of ${\boldsymbol{S}}$ and ${{\boldsymbol{S}}}^{\mathrm{obs}}$ on μi.

  • 21 

    In more realistic atmospheric models, one should instead assume a lognormal distribution. However, the Rayleigh distribution depends on just one free parameter, which is advantageous in the inference problem.

  • 22 

    We point out that the use of these values for ${J}_{0}^{2}/{J}_{0}^{0}$ provides better accuracy than the use of those at τ = 1.

Please wait… references are loading.
10.3847/1538-4357/aad910