Error Analysis for Dual‐Beam Optical Linear Polarimetry

Ferdinando Patat and Martino Romaniello

Published 2006 January 3 © 2006. The Astronomical Society of the Pacific. All rights reserved. Printed in U.S.A.
Citation: Ferdinando Patat and Martino Romaniello 2006 PASP 118 146, DOI 10.1086/497581

1538-3873/118/839/146

ABSTRACT

In this paper we present an error analysis for polarimetric data obtained with dual‐beam instruments. After recalling the basic concepts, we introduce the analytical expressions for the uncertainties of polarization degree and angle. These are then compared with the results of Monte Carlo simulations, which are also used to briefly discuss the statistical bias. We then approach the problem of background subtraction and the errors introduced by an imperfect Wollaston prism, flat‐fielding, and retarder plate defects. Finally, we investigate the effects of instrumental polarization and propose a simple test to detect and characterize it. The application of this method to real VLT‐FORS1 data has shown the presence of a spurious polarization that is of the order of ∼1.5% at the edges of the field of view. The cause of this problem has been identified as the presence of rather curved lenses in the collimator, combined with the incomplete removal of reflections by the coatings. This problem is probably common to all focal‐reducer instruments equipped with a polarimetric mode. An additional spurious and asymmetric polarization field, whose cause is still unclear, is visible in the B band.

1. INTRODUCTION

Performing polarimetry basically means measuring flux differences along different electric field oscillation planes. In ground‐based astronomy, this becomes a particularly difficult task, because variable atmospheric conditions hamper the detection of the relatively low polarization degrees that characterize most astronomical sources (a few percent; see, e.g., Leroy 2000). These fluctuations in fact introduce flux variations among different polarization directions, which can eventually be mistaken for genuine polarization effects.

This problem has been solved in a number of different ways, reviewed by Tinbergen (1996), to which we refer the reader for a detailed description. In this paper, we focus on the so‐called dual‐beam configuration, which is the most popular one for instruments currently mounted at large telescopes. Despite new technologies, the basic concept of astronomical dual‐beam polarimeters (see, e.g., Appenzeller 1967; Scarrot et al. 1983) has remained unchanged. A mask is placed on the focal plane, preventing image (or spectra) overlap, followed by a Wollaston prism, which splits the incoming beam into two rays that are characterized by orthogonal polarization states and are separated by a suitable angular throw. The rotation of the polarization plane is usually achieved with the introduction of a turnable retarder plate (half‐ or quarter‐wave for linear and circular polarization, respectively) just before the Wollaston prism (see, e.g., Schmidt et al. 1992). Recently, new solutions have been proposed to fully solve the problem in a single exposure (see Oliva 1997; Pernechele et al. 2003, for an example application), but so far they have been implemented in a few cases only.

Alternatives to Wollaston‐based systems have been devised. These are mainly based on the charge transfer in CCDs, which allows an on‐chip storage of two different polarization states that are obtained by rotating a polarization modulator. After the pioneering work of McLean et al. (1983), this technique, originally proposed by P. Stockman, has been successfully applied in a number of instruments (McLean 1997).

In this work, we address the most relevant problems connected to two‐beam polarimetric observations and data reduction. The paper is organized as follows. In § 2 we introduce the basic concepts of the problem, and in § 3 we recall the analytical expressions for the uncertainties of polarization degree and angle, which are then compared to Monte Carlo simulations in § 4. In the same section we also recap the basics of polarization bias. Section 5 deals with the effects of background on polarization measurements, and § 6 covers flat‐fielding issues. Section 7 is devoted to Wollaston prism deviations from the ideal. The consequences of retarder plate defects are addressed in § 8, while the effects of postanalyzer optics are discussed in § 9. Section 10 is dedicated to instrumental polarization, and § 11 deals with the case of VLT‐FORS1. Finally, in § 12 we discuss and summarize our results.

2. BASIC CONCEPTS

The polarization state of incoming light can be described through a Stokes vector S(I, Q, U, V); see, e.g., Chandrasekhar (1950). Its components, also known as Stokes parameters, have the following meanings: I is the intensity, Q and U describe the linear polarization, and V is the circular polarization. Linear polarization degree P and polarization angle χ are related to the Stokes parameters as

$$P = \sqrt{\bar{Q}^2 + \bar{U}^2}, \qquad \chi = \frac{1}{2}\arctan\frac{\bar{U}}{\bar{Q}},$$

where we have introduced the normalized Stokes parameters $\bar{Q} = Q/I$ and $\bar{U} = U/I$. The above relations can be easily inverted to yield

$$\bar{Q} = P\cos 2\chi, \qquad \bar{U} = P\sin 2\chi.$$

Finally, the degree of circular polarization, not discussed in this paper, is simply $P_c = \bar{V} \equiv V/I$. For the sake of clarity, we set V = 0 and neglect all circular polarization effects throughout the paper.

The ideal measurement system for linear polarization is composed of a half‐wave retarder plate (HWP) followed by the analyzer, which is a Wollaston prism (WP) producing two beams with orthogonal directions of polarization. In general, each of these elements can be treated as a mathematical operator that acts on the input Stokes vector S (see, e.g., Shurcliff 1962; Goldstein 2003). What one measures on the detector are the intensities in the ordinary and extraordinary beams at a given HWP angle θi, which are related to the Stokes parameters by

$$f_{O,i} = \frac{1}{2}\left[I + Q\cos 4\theta_i + U\sin 4\theta_i\right], \qquad f_{E,i} = \frac{1}{2}\left[I - Q\cos 4\theta_i - U\sin 4\theta_i\right]. \tag{4}$$

If the observations are carried out using N positions for the HWP, the whole problem of computing I, Q, and U reduces to the solution of the system of 2N linear equations given by equation (4). It is clear that, given three unknowns (I, Q, and U), at least N = 2 HWP position angles have to be used.

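Introducing the normalized flux differences Fi,

$$F_i \equiv \frac{f_{O,i} - f_{E,i}}{f_{O,i} + f_{E,i}}, \tag{5}$$

and noting that fO,i + fE,i = I, equation (4) reduces to the N equations

$$F_i = \bar{Q}\cos 4\theta_i + \bar{U}\sin 4\theta_i = P\cos(4\theta_i - 2\chi). \tag{6}$$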

We note that each F‐parameter is totally determined by a single observation and is therefore independent of changes in sky conditions. It is also worth mentioning that there are alternative approaches to the normalized flux ratios. One example can be found in Miller et al. (1987).

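In principle, one can use any set of HWP angles to solve the problem, but it is easy to show that adopting a constant step Δθ = π/8 is the optimal choice. In fact, besides minimizing the errors of the Stokes parameters, this choice makes the solution of equation (6) trivial:

$$\bar{Q} = \frac{2}{N}\sum_{i=0}^{N-1} F_i\cos\left(\frac{\pi}{2}i\right), \tag{7}$$

$$\bar{U} = \frac{2}{N}\sum_{i=0}^{N-1} F_i\sin\left(\frac{\pi}{2}i\right). \tag{8}$$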

Finally, it prevents "power leakage" (see, e.g., Press et al. 1999) when performing a Fourier analysis (see below).
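The reduction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes the optimal angle set θi = πi/8, noise-free data, and N equal to 2 or a multiple of 4. The function names `stokes_from_beams` and `ideal_beams` are hypothetical helpers introduced here.

```python
import math

def stokes_from_beams(f_ord, f_ext):
    """Return (Qbar, Ubar, P, chi) from ordinary/extraordinary fluxes.

    Assumes HWP angles theta_i = i*pi/8, so that 4*theta_i = i*pi/2.
    """
    n = len(f_ord)
    # Normalized flux differences F_i = (fO - fE) / (fO + fE).
    F = [(o - e) / (o + e) for o, e in zip(f_ord, f_ext)]
    q = 2.0 / n * sum(Fi * math.cos(i * math.pi / 2) for i, Fi in enumerate(F))
    u = 2.0 / n * sum(Fi * math.sin(i * math.pi / 2) for i, Fi in enumerate(F))
    p = math.hypot(q, u)
    chi = 0.5 * math.atan2(u, q)  # quadrant-aware form of 0.5*arctan(U/Q)
    return q, u, p, chi

def ideal_beams(intensity, p, chi, n):
    """Noise-free beams for an ideal HWP + Wollaston (illustrative helper)."""
    q = intensity * p * math.cos(2 * chi)
    u = intensity * p * math.sin(2 * chi)
    f_ord, f_ext = [], []
    for i in range(n):
        mod = q * math.cos(i * math.pi / 2) + u * math.sin(i * math.pi / 2)
        f_ord.append(0.5 * (intensity + mod))
        f_ext.append(0.5 * (intensity - mod))
    return f_ord, f_ext

# Round trip: a 2% polarized source at chi = 30 degrees, N = 4 HWP positions.
f_ord, f_ext = ideal_beams(1000.0, 0.02, math.radians(30.0), 4)
q, u, p, chi = stokes_from_beams(f_ord, f_ext)
```

The use of `atan2` rather than a plain arctangent is a design choice of this sketch; it resolves the quadrant of 2χ from the signs of the two normalized Stokes parameters.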

In the ideal case, the normalized flux differences Fi obey equation (6), which is a pure cosinusoid. Since all possible effects introduced by the HWP must reproduce after a full revolution, it is natural to consider them as harmonics of a fundamental function whose period is 2π.

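Therefore, if θi = πi/8, equation (6) can be rewritten as the following Fourier series:

$$F_i = a_0 + \sum_{k=1}^{N/2}\left[a_k\cos\left(k\frac{2\pi i}{N}\right) + b_k\sin\left(k\frac{2\pi i}{N}\right)\right], \tag{9}$$

where the Fourier coefficients are given by

$$a_0 = \frac{1}{N}\sum_{i=0}^{N-1}F_i, \qquad a_k = \frac{2}{N}\sum_{i=0}^{N-1}F_i\cos\left(k\frac{2\pi i}{N}\right), \qquad b_k = \frac{2}{N}\sum_{i=0}^{N-1}F_i\sin\left(k\frac{2\pi i}{N}\right),$$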

which are valid for N = 4, 8, 12, and 16. Comparing equation (9) with equation (6), it is clear that the polarization signal is carried by the k = N/4 harmonic. In a quasi‐ideal case, all Fourier coefficients are expected to be small compared with $a_{N/4}$ and $b_{N/4}$, and deviations from this behavior could arise from a number of effects. For such an approach to the error analysis, and for the meaning of the various harmonics, the reader is referred to Fendt et al. (1996). Here we just note that the $a_0$ term, which should be rigorously null in the ideal case, is related to the deviations of the WP from the ideal result (see § 7).

In general, a Fourier analysis is meaningful when N = 16, and it can reveal possible problems directly related to HWP quality (cf. § 8). In most cases, however, for practical reasons, one typically uses N = 4, and in that case, a different error treatment is required.
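A harmonic decomposition of this kind can be sketched as a plain discrete Fourier sum. The example below is illustrative only: the function name `harmonics` and the synthetic amplitudes are assumptions, and the k = N/2 term is normalized by 1/N (rather than 2/N) so that the series reproduces the samples exactly, which is a choice of this sketch.

```python
import math

def harmonics(F):
    """Fourier coefficients (a0, a_k, b_k) of N equally spaced samples F_i.

    The phase of harmonic k at sample i is 2*pi*k*i/N.
    """
    n = len(F)
    a0 = sum(F) / n
    a, b = {}, {}
    for k in range(1, n // 2 + 1):
        # The highest (Nyquist) harmonic is counted once, the others twice.
        norm = 1.0 / n if 2 * k == n else 2.0 / n
        a[k] = norm * sum(Fi * math.cos(2 * math.pi * k * i / n)
                          for i, Fi in enumerate(F))
        b[k] = norm * sum(Fi * math.sin(2 * math.pi * k * i / n)
                          for i, Fi in enumerate(F))
    return a0, a, b

# Synthetic N = 16 data: constant offset, polarization (k = 4), and a small
# k = 1 modulation. All amplitudes are made up for illustration.
N = 16
F = [0.004
     + 0.010 * math.cos(4 * i * math.pi / 8)
     + 0.005 * math.sin(4 * i * math.pi / 8)
     + 0.0005 * math.cos(i * math.pi / 8)
     for i in range(N)]
a0, a, b = harmonics(F)
```

With N = 16 the polarization signal is recovered in `a[4]` and `b[4]`, the constant offset in `a0`, and the slow modulation in `a[1]`, while the remaining coefficients vanish to rounding precision.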

3. ANALYTICAL ERROR ANALYSIS

Under the assumption that all relevant quantities are distributed according to Gaussian laws, one can analytically derive simple expressions for the corresponding errors of the final results. As we see in § 4, this assumption is not always correct, and when this happens, a numerical treatment is required in order to test the analytical results and their range of validity. Assuming that the background level is the same in the ordinary and extraordinary beams, and that the readout noise can be neglected, the analytical expression for the absolute error of P can be readily derived (see, e.g., Miller et al. 1987) by propagating the various errors through the relevant equations:

$$\sigma_P = \frac{1}{\sqrt{N/2}\;\mathrm{S/N}}, \tag{10}$$

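where S/N is the signal‐to‐noise ratio of the intensity image (fO + fE). The S/N one expects to achieve in the polarization degree, (S/N)P = P/σP, is simply given by

$$(\mathrm{S/N})_P = \sqrt{N/2}\;P\;\mathrm{S/N}.$$

As for the error of χ, this is given by

$$\sigma_\chi = \frac{\sigma_P}{2P} = \frac{1}{2\sqrt{N/2}\;P\;\mathrm{S/N}},$$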

from which it is clear that at variance with the degree of polarization, the accuracy of the polarization angle does depend on the intrinsic polarization degree.
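As a quick numerical illustration, the sketch below evaluates these uncertainties. It assumes the standard Gaussian propagation results σP = 1/(√(N/2) S/N) and σχ = σP/(2P), with σχ in radians; the function names are hypothetical.

```python
import math

def sigma_p(snr, n_hwp):
    """Absolute error of P for intensity-image S/N `snr` and N HWP positions."""
    return 1.0 / (math.sqrt(n_hwp / 2.0) * snr)

def snr_p(p, snr, n_hwp):
    """Signal-to-noise ratio of the polarization degree, P / sigma_P."""
    return p / sigma_p(snr, n_hwp)

def sigma_chi(p, snr, n_hwp):
    """Error of the polarization angle in radians, sigma_P / (2 P)."""
    return sigma_p(snr, n_hwp) / (2.0 * p)

# Example: P = 1%, S/N = 1000 per HWP position, N = 4.
sp = sigma_p(1000.0, 4)                           # about 0.07% absolute
sc_deg = math.degrees(sigma_chi(0.01, 1000.0, 4))  # about 2 degrees
```

The example makes the asymmetry explicit: σP does not involve P, whereas σχ scales as 1/P, so halving the polarization degree doubles the angle uncertainty at fixed S/N.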

4. MONTE CARLO SIMULATIONS

The analytical treatment presented in § 3 relies on the assumption that all relevant variables obey Gaussian statistics. Numerical simulations are required in order to derive more realistic distributions and to verify the validity of the analytical results. One can easily implement the concepts we have developed until now in a Monte Carlo (MC) code, which also allows higher sophistication, such as the inclusion of Poissonian noise. With this tool, one can readily investigate the effects of non‐Gaussian distributions of the derived quantities, the most important of which is the systematic error of the polarization, as first pointed out by Serkowski (1958).
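A toy version of such a simulation is sketched below. It is not the code used for the results in this paper: photon noise is approximated by a Gaussian with variance equal to the mean counts, the total counts per HWP position are set to (S/N)², and the bias is measured with the sample mean rather than the mode. The function name `simulate_p` and all parameter values are assumptions.

```python
import math
import random

def simulate_p(p0, chi0, snr, n=4, trials=4000, seed=1):
    """Return (mean, rms) of the measured polarization degree over many trials."""
    rng = random.Random(seed)
    counts = snr ** 2  # expected photons in fO + fE at each HWP position
    q0, u0 = p0 * math.cos(2 * chi0), p0 * math.sin(2 * chi0)
    values = []
    for _ in range(trials):
        q = u = 0.0
        for i in range(n):
            c, s = math.cos(i * math.pi / 2), math.sin(i * math.pi / 2)
            x = q0 * c + u0 * s
            mean_o = 0.5 * counts * (1 + x)
            mean_e = 0.5 * counts * (1 - x)
            # Gaussian approximation to Poisson noise in each beam.
            f_o = rng.gauss(mean_o, math.sqrt(mean_o))
            f_e = rng.gauss(mean_e, math.sqrt(mean_e))
            F = (f_o - f_e) / (f_o + f_e)
            q += 2.0 / n * F * c
            u += 2.0 / n * F * s
        values.append(math.hypot(q, u))
    mean = sum(values) / trials
    rms = math.sqrt(sum((v - mean) ** 2 for v in values) / trials)
    return mean, rms

# High-eta case (P0 * S/N = 10): the rms should approach 1 / (sqrt(N/2) S/N).
mean_hi, rms_hi = simulate_p(0.05, 0.3, 200.0)
# Unpolarized input: the mean of P stays positive, which is the bias.
mean_0, rms_0 = simulate_p(0.0, 0.0, 100.0)
```

For unpolarized input the measured P follows a Rayleigh-like distribution, so its mean is positive and its scatter is smaller than the Gaussian prediction.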

4.1. Linear Polarization Bias

Due to the various noise sources, the vector components $\bar{Q}$ and $\bar{U}$ are normally distributed, but since P is defined as the quadrature sum of $\bar{Q}$ and $\bar{U}$, the statistical errors always add in the positive direction, leading to a systematic increase of the estimated polarization degree, thus introducing a bias. The problem was addressed by several authors, using both analytical and numerical methods (Serkowski 1958; Wardle & Kronberg 1974; Simmons & Stewart 1985; Clarke & Stewart 1986; Sparks & Axon 1999). We refer the reader to those papers for a detailed description of the problem; here we recall just the basic concepts and apply them to our case.

The polarization bias is usually quantified using a robust estimator, supposedly giving a statistically significant representation of the observed value, which is then compared with the input polarization in order to derive the systematic correction. Different choices have been adopted (see Sparks & Axon 1999). Following the considerations by Wardle & Kronberg (1974) we have adopted here the mode 〈P〉 of the distribution in order to estimate the bias, which we therefore define as ΔP = 〈P〉-P0, where P0 is the input polarization. Once applied to the observed data, the bias correction ΔP tends to restore the symmetry of the deviation distribution (see, e.g., Sparks & Axon 1999, their Fig. 4).

In Figure 1 we show the results of our MC simulations for the estimated rms error of the polarization σP (top panel) and polarization bias ΔP (bottom) for N = 4. Following Sparks & Axon (1999), we have used η≡P0(S/N) and 〈η〉≡〈P〉(S/N) as independent variables in our plots. As we had anticipated, σP follows the analytical prediction of equation (10) when η>2. For lower values of η, σP tends to be systematically smaller than the analytical prediction, and it converges to the value expected for the Rayleigh distribution (dashed line), which becomes a very good approximation for η<0.5, provided that S/N>3. For intermediate values of η, the distribution is described by a Rice function (Rice 1944). In conclusion, one can safely use the analytical solution given by equation (10) for η≥2 only.

Fig. 1.— Top: Comparison between the rms error on the polarization degree from MC simulations (circles) and eq. (10). The dashed line traces the expected rms error for the Rayleigh distribution (see text). Bottom: Bias estimated using the mode (circles) and the average (crosses). For comparison, the solid curve traces the Wardle & Kronberg (1974) solution for the mode, while the dashed line shows the Sparks & Axon (1999) solution for the average.

As for the polarization bias, we have plotted it as a function of measurable quantities, namely the S/N and the observed polarization 〈P〉.

The results of our MC simulations, as plotted in the bottom panel of Figure 1, are in good agreement with the analytical solution found by Wardle & Kronberg (1974) for the same statistical estimator. For comparison, we have also plotted the results obtained when the average is adopted (crosses). For $(N/2)^{1/2}\langle\eta\rangle > 4$, the relation between log (ΔP/〈P〉) and log $[(N/2)^{1/2}\langle\eta\rangle]$ is well approximated by a linear law. A least‐squares fit gives the result

which can be used to correct the observed polarization values according to the input S/N, measured polarization, and number of HWP positions. In general, the bias effect is present even at reasonably high values of S/N when the polarization is small and σP and ΔP tend to become similar, so that the systematic bias correction is comparable to the random uncertainty of the polarization. This is better seen in Figure 2, where we have plotted the ratio of ΔP to σP as a function of P0/σP ≡ (S/N)P deduced from our simulations. As anticipated, for low values of (S/N)P, the ratio between ΔP and σP tends to unity, with some variations among different estimators. For (S/N)P≥3, the bias correction is less than 10% of the expected accuracy, and it is therefore negligible. Moreover, above that threshold, all estimators give practically identical results.

Fig. 2.— Comparison between systematic bias ΔP and random error σP for different estimators as a function of input polarization S/N.

In Figure 2 we have plotted for comparison the function computed by Simmons & Stewart (1985), who have used a maximum likelihood estimator in order to evaluate the bias. As these authors have shown, this is the best estimator for (S/N)P≤0.7, while for (S/N)P>0.7 the mode first used by Wardle & Kronberg (1974) should be used.

5. THE EFFECTS OF THE BACKGROUND

Until now we have assumed that one is able to perfectly subtract the background contribution. This is most likely the case when performing polarimetric measurements on pointlike sources, since in that situation local background subtraction is in most cases straightforward.

We note that the background, whatever its nature is, must be subtracted before calculating normalized Stokes parameters (see also Tinbergen 1996), so that possible background polarization can be vectorially removed.

If we assume that the object is characterized by Po and χo, and the background by Pb and χb, the two polarization fields can be expressed using Stokes vectors defined as So(Io, IoPo cos2χo, IoPo sin2χo) and Sb(Ib, IbPb cos2χb, IbPb sin2χb), where we have neglected any circular polarization. Since Stokes vectors are additive (see, e.g., Chandrasekhar 1950), the resulting polarization field is described by S = So + Sb, and therefore the total polarization is given by the formula

$$P = \frac{I_o P_o}{I_o + I_b}\sqrt{1 + r^2 + 2r\cos 2(\chi_o - \chi_b)},$$

where r = (IbPb)/(IoPo); i.e., the ratio between the polarized fluxes of background and object. The corresponding polarization angle is

$$\chi = \frac{1}{2}\arctan\frac{\sin 2\chi_o + r\sin 2\chi_b}{\cos 2\chi_o + r\cos 2\chi_b}.$$

Clearly, the background is going to significantly influence the object when r≳1. For r∼1, one can write

$$P \simeq \frac{2 I_o P_o}{I_o + I_b}\,\left|\cos(\chi_o - \chi_b)\right|,$$

which implies that for comparable polarized fluxes, the resulting polarization is nulled when the polarization fields are perpendicular (|χo - χb| = π/2).
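The additivity of Stokes vectors makes this easy to check numerically. The following sketch sums the object and background vectors directly; the function name `combine` and the input values are illustrative assumptions.

```python
import math

def combine(i_o, p_o, chi_o, i_b, p_b, chi_b):
    """Polarization degree and angle of object plus background (V neglected)."""
    q = i_o * p_o * math.cos(2 * chi_o) + i_b * p_b * math.cos(2 * chi_b)
    u = i_o * p_o * math.sin(2 * chi_o) + i_b * p_b * math.sin(2 * chi_b)
    total = i_o + i_b
    return math.hypot(q, u) / total, 0.5 * math.atan2(u, q)

# Equal polarized fluxes (r = 1): a 5% object on a brighter 1% background.
p_perp, _ = combine(100.0, 0.05, 0.0, 500.0, 0.01, math.pi / 2)
p_par, chi_par = combine(100.0, 0.05, 0.0, 500.0, 0.01, 0.0)
```

With perpendicular polarization planes the two polarized fluxes cancel and `p_perp` vanishes to rounding precision, while for parallel planes they add and the result is diluted by the total intensity.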

6. FLAT‐FIELDING

One of the basic problems in reducing the data produced by dual‐beam instruments is the flat‐fielding. Because image splitting occurs after the focal mask, collimator, and the HWP, one would in principle need to obtain flat exposures with all optical components in the light path. Unfortunately, in all practical conditions, this introduces strong artificial effects, due to the strong polarization typical of flat‐field sources (either twilight sky or internal screens). In principle, one can reduce this effect using the continuous rotation of the HWP as a depolarizer. This is implemented, for example, in EFOSC2, currently mounted at the ESO 3.6 m telescope (Patat 1999), and it is effective only if the HWP rotation time is much shorter than the required exposure time. The depolarizing effect can also be achieved by averaging flats taken with the same set of HWP angles used for the scientific exposures. In fact, with the use of the optimal angle set (see § 2), one has

$$\sum_{i=0}^{N-1} f_{O,i} = \frac{N}{2}\,I,$$

and a similar expression for fE,i, which do not contain any polarization information. The problem is that this is true only if the source is stable in intensity, which is surely not the case for the twilight sky, and probably not really true for most lamps.

An alternative solution (at least for imaging) is the use of a set of twilight flats obtained without HWP and WP. While on the one hand this eliminates source polarization, on the other hand it does not allow for a proper flat‐field correction. In fact, while the pixel‐to‐pixel variations are properly taken into account, the large‐scale patterns are not, due to the splitting of the beam, which maps a given focal plane area into two different regions of the post‐WP optics and the detector. Moreover, these calibrations do not carry any information about possible spatial effects introduced by the HWP and the WP. However, as the simulations show, this problem becomes milder if some redundancy is introduced. For example, if one uses N = 4 HWP positions, the ordinary and extraordinary rays will just swap when the angle differs by π/4 within each of the two redundant pairs ($f_{O,i} = f_{E,i+2}$; see eq. [4]). This tends to cancel out the flat‐field effect and becomes more efficient if the maximum redundancy (N = 16) is used. However, it must be noted that time‐dependent effects, such as fringing, may affect the redundant pairs in a different way, therefore decreasing the cancellation efficiency.

7. EFFECTS OF A NONIDEAL WOLLASTON PRISM

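So far we have assumed that our system is described by equation (4); i.e., that the Wollaston prism splits incoming unpolarized light into identical fractions. A deviation from this ideal behavior can be described by the introduction of a new parameter t in equation (4), which can be reformulated as

$$f_{O,i} = t\left[I + Q\cos 4\theta_i + U\sin 4\theta_i\right], \qquad f_{E,i} = (1-t)\left[I - Q\cos 4\theta_i - U\sin 4\theta_i\right]. \tag{12}$$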

An ideal system is obtained for t = 1/2. Now, for unpolarized light (Q = U = 0), these new equations give fO,i = tI and fE,i = (1-t)I for all HWP angles, so that all normalized flux differences turn out to be identical (i.e., Fi = 2t-1). Therefore, the value of t can be directly estimated observing an unpolarized source.

In the simplest situation, where N = 2, neglecting the presence of the t term would lead to a spurious polarization degree $P = \sqrt{2}\,(2t-1)$, with a polarization angle χ = π/8. It is interesting to note that this is not the case, for example, when N = 4. In fact, in that situation, because all redundant F's are identical, equations (7) and (8) would correctly yield null Stokes parameters.
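The difference between N = 2 and N = 4 can be verified with a short numerical check. The sketch below assumes unpolarized input, so that every normalized flux difference equals 2t − 1, and applies the ideal-case reduction blindly; `blind_reduction` and the value of t are illustrative assumptions.

```python
import math

def blind_reduction(F):
    """Ideal-case reduction applied to flux differences at theta_i = i*pi/8."""
    n = len(F)
    q = 2.0 / n * sum(Fi * math.cos(i * math.pi / 2) for i, Fi in enumerate(F))
    u = 2.0 / n * sum(Fi * math.sin(i * math.pi / 2) for i, Fi in enumerate(F))
    return math.hypot(q, u), 0.5 * math.atan2(u, q)

t = 0.51            # hypothetical splitting fraction of the Wollaston prism
kappa = 2 * t - 1
# Unpolarized light: fO = t*I and fE = (1 - t)*I at every HWP angle.
p2, chi2 = blind_reduction([kappa] * 2)   # spurious signal at chi = pi/8
p4, chi4 = blind_reduction([kappa] * 4)   # redundant pairs cancel the offset
```

With two angles the constant offset leaks equally into both Stokes parameters, whereas with four angles each redundant pair subtracts it out.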

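The problem is more complicated when the incoming light is polarized, since the normalized flux differences are no longer a linear combination of $\bar{Q}$ and $\bar{U}$, as one can verify from equation (12):

$$F_i = \frac{\kappa + \bar{Q}\cos 4\theta_i + \bar{U}\sin 4\theta_i}{1 + \kappa\left(\bar{Q}\cos 4\theta_i + \bar{U}\sin 4\theta_i\right)}, \tag{13}$$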

where we have set κ = 2t-1 (|κ|≤1, κ = 0 in the ideal case). If κ is known, one can correct the observed fO and fE by dividing them by 2t and 2(1-t), respectively, before following the procedure adopted for the ideal case. Of course, if the source has a known polarization (e.g., a polarized standard), one can use this information together with the observed F ratios to derive κ for each HWP angle, according to the relation

$$\kappa = \frac{F_i - \left(\bar{Q}\cos 4\theta_i + \bar{U}\sin 4\theta_i\right)}{1 - F_i\left(\bar{Q}\cos 4\theta_i + \bar{U}\sin 4\theta_i\right)}.$$

If the input polarization is unknown, then one can in principle derive κ from the observations themselves, provided that N≥4. In fact, after introducing the parameter

and using equation (13), it is easy to demonstrate that

where j = 1,...,N/2-2 and the positive sign refers to the case gj≤0. For instance, for N = 4, one can determine two independent estimates of κ, which can be averaged to improve on the accuracy.
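One way to see how κ can be recovered without knowing the input polarization is sketched below. The definition of g used here, g = (1 + Fa Fb)/(Fa + Fb) for two flux differences taken at HWP angles π/4 apart, is an assumption of this sketch and may differ from the parameter defined in the text; it follows from a splitting model in which the two beams receive fractions t and 1 − t, and it leads to κ² − 2gκ + 1 = 0, whose root with |κ| ≤ 1 is selected according to the sign of g. The function names are hypothetical.

```python
import math

def flux_difference(kappa, q, u, i):
    """Normalized flux difference at theta_i = i*pi/8 for a nonideal WP."""
    x = q * math.cos(i * math.pi / 2) + u * math.sin(i * math.pi / 2)
    return (kappa + x) / (1.0 + kappa * x)

def kappa_from_pair(F_a, F_b):
    """Estimate kappa from two flux differences at angles pi/4 apart."""
    total = F_a + F_b
    if total == 0.0:
        return 0.0  # ideal prism: the pair is exactly antisymmetric
    g = (1.0 + F_a * F_b) / total
    root = math.sqrt(max(g * g - 1.0, 0.0))
    # Pick the root of kappa^2 - 2*g*kappa + 1 = 0 with |kappa| <= 1.
    return g + root if g <= 0 else g - root

# N = 4 gives two redundant pairs, (0, 2) and (1, 3), hence two estimates.
F = [flux_difference(0.05, 0.03, -0.02, i) for i in range(4)]
estimates = [kappa_from_pair(F[0], F[2]), kappa_from_pair(F[1], F[3])]
kappa_mean = sum(estimates) / len(estimates)
```

For small |κ| the selected root is a difference of two nearly equal numbers, so a production implementation might prefer the algebraically equivalent reciprocal form to limit rounding errors.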

It is interesting to note that when P≪1, equation (13) can be approximated as $F_i \simeq \kappa + \bar{Q}\cos 4\theta_i + \bar{U}\sin 4\theta_i$. If N is a multiple of 4 (i.e., if the F function is sampled for an integer number of periods π/2), one has

$$a_0 = \frac{1}{N}\sum_{i=0}^{N-1} F_i \simeq \kappa, \tag{16}$$

which clarifies the meaning of the a0 term in the Fourier series (see eq. [9]).

It is worth mentioning that the redundancy in the F parameters does eliminate the effects of a nonideal WP, to a large extent. For example, a blind application of equations (7) and (8) to the case of N = 4 gives the result

$$\frac{F_0 - F_2}{2} = \bar{Q}\,\frac{1-\kappa^2}{1-\kappa^2\bar{Q}^2},$$

and a similar expression for $\bar{U}$. If the polarization is small (P0≤0.1), we have that $\kappa^2\bar{Q}^2 \ll 1$, and therefore the application of the procedure for an ideal case in a nonideal situation would lead to a value of $\bar{Q}$ that is smaller than the real one by a factor $(1-\kappa^2)$. Since the same is true for $\bar{U}$, the resulting polarization P will also be reduced by a factor $(1-\kappa^2)$ with respect to the input value, while the polarization angle remains unchanged. For example, if |κ|≤0.2, σP/P is less than 4%.
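The size of this effect can be checked numerically. The sketch below generates flux differences for a nonideal prism, assuming splitting fractions t and 1 − t so that Fi = (κ + x)/(1 + κx) with x the ideal modulation, and then applies the ideal N = 4 reduction; the function names and input values are illustrative.

```python
import math

def flux_difference(kappa, q, u, i):
    """Normalized flux difference at theta_i = i*pi/8 for a nonideal WP."""
    x = q * math.cos(i * math.pi / 2) + u * math.sin(i * math.pi / 2)
    return (kappa + x) / (1.0 + kappa * x)

def blind_qu(F):
    """Ideal-case estimate of the normalized Stokes parameters."""
    n = len(F)
    q = 2.0 / n * sum(Fi * math.cos(i * math.pi / 2) for i, Fi in enumerate(F))
    u = 2.0 / n * sum(Fi * math.sin(i * math.pi / 2) for i, Fi in enumerate(F))
    return q, u

kappa, q_in, u_in = 0.2, 0.03, 0.04   # a 5% polarized source, large kappa
F = [flux_difference(kappa, q_in, u_in, i) for i in range(4)]
q_out, u_out = blind_qu(F)
p_ratio = math.hypot(q_out, u_out) / math.hypot(q_in, u_in)
chi_shift = 0.5 * (math.atan2(u_out, q_out) - math.atan2(u_in, q_in))
```

Even for this deliberately large κ, the recovered polarization degree is reduced by about 4% and the angle is essentially unaffected.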

Equation (12) describes a particular case only, in which the incoming unpolarized flux is distributed into two fractions, t and (1-t). More generally, one should replace the term (1-t) with an independent parameter s so that the fractions of light split by the WP into the ordinary and extraordinary rays become uncorrelated. Using the same procedure, it is easy to demonstrate that one can estimate the ratio κj = (t-s)/(t+s) still using equation (15).

Another effect we have investigated is the possibility that the difference in polarization direction between the ordinary and the extraordinary rays of the WP is not π/2. If we call Δα the deviation from this ideal angle, then using the general expression of the Mueller matrix for a linear polarizer (see, e.g., Goldstein 2003) and deriving the expressions for the normalized flux ratios, one gets

where the approximation is valid under the assumption that P≪1. From this expression, it is easy to conclude that for reasonably small values of Δα (≤10°), the implied errors are of the order of 0.05% on the polarization degree and 5° on the polarization angle, irrespective of the number of HWP positions used.

8. HWP DEFECTS

In the ideal case, the normalized flux differences are modulated by the HWP rotation according to equation (6), which is a pure cosinusoid. If defects such as dirt or inhomogeneously distributed dust are present on the HWP, one can expect spurious flux modulations that are unrelated to the polarization of the incoming light and that can reduce the performance of the instrument. As a consequence, error estimates based on pure photon statistics are systematically smaller than the actual errors.

These kinds of problems can be investigated with the aid of Fourier analysis, following the procedure we have outlined at the end of § 2. This method becomes particularly effective when the observations are taken sampling the full HWP angle range (i.e., 2π), which, given the choice of the optimal angle set θi = πi/8, implies N = 16 retarder plate positions. Under these circumstances, one can determine the Fourier coefficients ak and bk for the first eight harmonics, the fundamental harmonic (k = 1) being related to local transparency fluctuations, which repeat themselves after a full revolution, like dirt or dust. By definition (e.g., eq. [6]), the fourth harmonic is directly related to the linear polarization (i.e., $a_4 \equiv \bar{Q}$ and $b_4 \equiv \bar{U}$). All other harmonics, with the only remarkable exception being the second one (k = 2), are simply overtones of harmonics with lower frequencies, and include part of the noise generated by the photon statistics, which is present at all frequencies and is therefore indicated as white noise. For this reason, the global random error is often estimated as the signal carried by the harmonics with k = 3, 5–8 (see, e.g., Fendt et al. 1996, their Appendix A.3).

The second harmonic deserves a separate discussion. Ideally, the HWP operates as a pure rotator of the input Stokes vector, with the advantage that one does not need to rotate the whole instrument in order to analyze different polarization planes. In the real case, in which the HWP is usually constructed using birefringent materials, it is affected by so‐called pleochroism. This is a wavelength‐dependent variation of the transmission that takes place when the direction of the incoming light is changed with respect to the crystal lattice. Because of the way the HWP is manufactured, the crystals have an axial symmetry, which gives a period of π. Therefore, this effect is seen as the k = 2 component.

In Figure 3 we show a real case in which we have applied this analysis to archival data obtained with FORS1, which is currently mounted at the Cassegrain focus of the ESO VLT 8.2 m telescope (Szeifert 2002).

Fig. 3.— Example of Fourier analysis applied to archival VLT‐FORS1 observations of a bright star in the V passband (see text). Top: Normalized flux differences. Partial reconstructions using eight harmonics (solid curve) and the fourth harmonic only (dashed curve). The dashed horizontal line is placed at the average of F values (a0). Middle: Residuals after subtracting the k = 1, 2, and 4 components. Bottom: Harmonics power spectrum.

A bright (and supposedly unpolarized) star was observed using N = 16 HWP positions. First, a Fourier analysis indicates the presence of a small deviation of the WP from the ideal ($\kappa \approx a_0 \simeq 4.1 \times 10^{-3}$; see § 7). Second, a clear polarization is detected at the level of about 0.4%, while all other components are smaller than 0.05% (this polarization is actually an instrumental effect present in FORS1; see § 10). The effective significance of harmonics other than k = 4 can be judged on the basis of the expected errors of the Fourier coefficients. For example, using the expression for $a_k$, one finds that

$$\sigma_{a_k} \simeq \frac{1}{\sqrt{N/2}\;\mathrm{S/N}}.$$

With this kind of analysis, one can see that in the example of Figure 3, ak and bk are consistent with a null value for k = 3, 5–7 (see center panel). As for the k = 1 and 2 harmonics, the Fourier coefficients are non‐null at a 2 σ level. Since for the test star it was S/N∼1600, it is clear that to detect k = 1 and 2 harmonics of this amplitude (0.05%), a S/N≥3000 is required.

It is important to note that the presence of these non‐null harmonics is implicitly corrected for when one has a sufficient number of HWP positions covering the maximum period 2π. In the most common case, in which N = 4 angles spaced by π/8 are used, one can derive the fundamental (i.e., linear polarization, period π/2) and the first overtone (period π/4) only. The latter corresponds to the k = 8 component of the N = 16 cases, which therefore carries the high‐frequency information only. As a consequence, if other harmonics are present, they are not properly removed and contribute to the final error, effectively setting the maximum accuracy one can achieve, irrespective of the S/N. Numerical simulations performed assuming a virtually infinite S/N show that in the presence of k = 1 and 2 components, the use of N = 4 HWP angles leads to systematic errors that are of the same order of amplitude as the two harmonics. From this and the example reported in Figure 3, we estimate that the absolute maximum accuracy attainable with FORS1 using N = 4 is of the order of 0.05%.
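The aliasing of low-order harmonics into the polarization signal can be illustrated with noise-free synthetic data. In the sketch below an unpolarized source is modulated by k = 1 and k = 2 terms of equal amplitude; the amplitudes, the choice of pure cosine terms, and the function names are assumptions made for illustration.

```python
import math

def contaminated_F(n, q, u, a1, a2):
    """Flux differences with extra k = 1 and k = 2 cosine terms in theta."""
    F = []
    for i in range(n):
        theta = i * math.pi / 8
        F.append(q * math.cos(4 * theta) + u * math.sin(4 * theta)
                 + a1 * math.cos(theta) + a2 * math.cos(2 * theta))
    return F

def reduce_p(F):
    """Polarization degree from the ideal-case reduction."""
    n = len(F)
    q = 2.0 / n * sum(Fi * math.cos(i * math.pi / 2) for i, Fi in enumerate(F))
    u = 2.0 / n * sum(Fi * math.sin(i * math.pi / 2) for i, Fi in enumerate(F))
    return math.hypot(q, u)

amp = 5e-4  # 0.05% harmonics, comparable to the example discussed above
err4 = reduce_p(contaminated_F(4, 0.0, 0.0, amp, amp))    # N = 4
err16 = reduce_p(contaminated_F(16, 0.0, 0.0, amp, amp))  # N = 16
```

With N = 16 the extra harmonics are orthogonal to the polarization term and `err16` vanishes to rounding precision, while with N = 4 a spurious polarization comparable to the harmonic amplitude survives.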

Another typical problem that affects the retarder plates is the chromatic dependence of the angle zero point. This is usually measured by means of a Glan‐Thompson prism, and it can change by more than 5° across the optical wavelength range. The computed polarization angle can be corrected by simply adding the HWP angle offset for the relevant wavelength (or effective wavelength, in the case of broadband imaging); see, for example, Szeifert (2002).

Finally, we have investigated the effects produced by a deviation Δβ from the nominal phase retardance of an HWP (π). Using the general expression of the Mueller matrix for a retarder (see, e.g., Goldstein 2003), the normalized flux ratios turn out to be

from which it is clear that the measured linear polarization depends on the circular polarization of the input signal. For V = 0 and Δβ≤10°, the corresponding absolute error of the polarization degree is less than 0.05%, while the outcome on the polarization angle is negligible. For V≠0, the exact effect depends on the ratio between the degrees of circular and linear polarization. For example, for Δβ = 10° and Q = U = V = 0.01, the absolute error of the computed polarization degree is about 0.1% for N = 4, which decreases to 0.01% for N = 16. It is worthwhile noting that this defect would be detected by a Fourier analysis as a component with a period π and an intensity of $|\bar{V}\sin\Delta\beta|$.

9. EFFECTS OF POSTANALYZER OPTICS

Typically, the analyzer is followed by additional optics, such as filters, grisms, and camera lenses, which, due to their possible tilt with respect to the optical axis, may behave as poor linear polarizers. In the most probable case, in which the polarization is produced by transmission (see also § 10), the properties of postanalyzer (PA) components can be described by the approximate Mueller matrix

where A ≈ 1, C ≈ 1, c = cos2φ, s = sin2φ, and φ is the polarization angle (which can change across the field of view), while B is related to the polarization degree introduced by the PA optics. This expression can be deduced from the general formulation (see Keller 2002, eq. [4.63]) after applying the usual matrix rotation (see, e.g., Keller 2002, eq. [2.5]). If S0 = (I0, Q0, U0, V0) is the input Stokes vector, the effect of PA optics can be evaluated by computing the Stokes vectors that correspond to the ordinary and extraordinary beams produced by the WP, transforming them using the operator MPA, and using the resulting intensity components to compute the normalized flux differences Fi. After simple calculations, one arrives at the expression

where we have assumed that |B|≪1, i.e., that the linear polarization induced by the PA optics is small. Given this expression, it is clear that the redundancy in the HWP positions (N = 4, 8, and 16) eliminates this problem, since the additive term Bcos2φ is not modulated by the HWP rotation, while for N = 2 the derived polarization degree and angle would be affected, possibly severely. If the optimal HWP angle set has been used, it is easy to verify that $B\cos 2\varphi = \sum_{i=0}^{N-1} F_i/N$, which is identical to equation (16). This means that in a first approximation, it is not possible to distinguish between an imperfect WP and the presence of polarization in the PA optics. Therefore, the fact that a0≃0.4% in Figure 3 can actually be attributed to both kinds of problems. The PA optics effect definitely becomes stronger when these include highly tilted components, such as grisms. This is very well illustrated by the two examples in Figure 4, where we show the results obtained using VLT‐FORS1 archival data of a highly polarized star (Vela 1 95, α = 09h06m00s, δ = −47°19'00'') and an unpolarized star (WD 1615−154, α = 16h17m55s, δ = −15°35'51''), which were observed on the optical axis, where the instrumental polarization is known to be null (Szeifert 2002).

Fig. 4.— VLT‐FORS1 observations of Vela 1 95 (left) and WD 1615−154 (right). The plots show the linear polarization derived with N = 4 HWP positions (top), N = 2 (middle) and ∑i = 0N Fi/N for N = 4 (bottom; see text for more details). The original spectra were obtained with the 300V grism and a slit of 1''; for presentation, they have been binned to 25 Å. The open circles in the top left panel mark the broadband polarimetric measurements for Vela 1 95 (UBVRI, from left to right).

In both cases, the polarization degree deduced using N = 2 (center panels) is markedly different from that derived with N = 4 (top), and the deviation is particularly severe for the unpolarized object. As is finally apparent, the resulting values of Bcos2φ show a strong wavelength dependency and are higher than 5% at about 800 nm. It is interesting to note that Bcos2φ∼0 at about 450 nm; i.e., at the wavelength where the antireflection coatings are optimized (see § 10). This fact, together with the marked wavelength dependency and the much lower level seen in broadband imaging (see Fig. 3), strongly suggests that the effect seen in Figure 4 is indeed produced by the tilted surfaces of the grism.
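
The cancellation of the additive term by HWP redundancy can be checked numerically. The sketch below is illustrative only: it assumes that the normalized flux differences follow F_i = P cos(4θ_i − 2χ) plus a constant offset standing for B cos2φ, that the HWP angles are θ_i = i × 22.5°, and that Q and U are estimated with the usual Fourier sums. These forms are taken as assumptions here, since the corresponding equations are not part of this excerpt; the function names and the numerical values are hypothetical.

```python
import numpy as np

def simulate_flux_differences(P, chi, offset, N):
    """Normalized flux differences F_i at HWP angles i * 22.5 deg.

    `offset` stands for the unmodulated term B*cos(2*phi) (assumed form).
    """
    theta = np.arange(N) * np.pi / 8.0
    return P * np.cos(4.0 * theta - 2.0 * chi) + offset

def stokes_from_flux_differences(F):
    """Fourier estimate of the normalized Stokes parameters (assumed form)."""
    F = np.asarray(F, dtype=float)
    N = F.size
    theta = np.arange(N) * np.pi / 8.0
    q = 2.0 / N * np.sum(F * np.cos(4.0 * theta))
    u = 2.0 / N * np.sum(F * np.sin(4.0 * theta))
    return q, u

# Illustrative input values, not taken from the paper.
P_true, chi_true, offset = 0.02, np.radians(30.0), 0.004

results = {}
for N in (2, 4):
    F = simulate_flux_differences(P_true, chi_true, offset, N)
    q, u = stokes_from_flux_differences(F)
    # Store the recovered polarization degree and the mean of the F_i.
    results[N] = (np.hypot(q, u), F.mean())

for N, (P_est, mean_F) in results.items():
    print(f"N = {N}: P = {P_est:.4f}, mean(F_i) = {mean_F:.4f}")
```

Under these assumptions, the N = 4 case returns the input polarization degree and the mean of the F_i returns the offset, whereas the N = 2 estimate is contaminated by the offset.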

10. EFFECTS OF INSTRUMENTAL POLARIZATION

So far, we have assumed that all optics preceding the analyzer do not introduce any polarization. Of course, this is not generally true (see, e.g., Tinbergen 1996; Leroy 2000 for a general introduction to the subject).

To show the effect of instrumental polarization, we assume that the preanalyzer optics, which include telescope mirrors, collimator, HWP, and so on, introduce an artificial polarization, which depends on the position in the field. For the sake of simplicity, we assume that these optics act as a nonperfect linear polarizer characterized by a position‐dependent polarization degree p(x,y) and polarization angle φ(x,y). This can be described by the Mueller matrix

where we have neglected circular polarization and have set s = sin2φ and c = cos2φ. For p = 0, one obtains a totally transparent component, while p = 1 gives an ideal linear polarizer.

If S0(I0, Q0, U0) is the Stokes vector describing the input polarization state, it will be transformed by preanalyzer optics into the vector S1 = MI·S0 before entering the analyzer:

and therefore the measurements would lead to S1, which would then need to be corrected for the instrumental effect by inverting equation (17), provided that p(x,y) and φ(x,y) are known.

Of course, if the observed source is known to be unpolarized, p and φ can be derived immediately, for example by placing a single target on different positions of the field of view, or observing an unpolarized stellar field.

If the source is polarized, the problem becomes much more complicated, since equation (17) is nonlinear in c and s. The solution can be simplified by assuming that p≪1 and P0≪1, which is a reasonable hypothesis in most real cases, since instrumental polarization is typically less than a few percent. In this case, equation (17) can be rewritten as

It is important to note that the instrumental polarization is not removed by the local background subtraction. Moreover, it is independent of the object's intensity; in fact, using the previous expressions, one can verify that

where P0 and χ0 are the input polarization degree and angle, respectively. From this expression, it is clear that when P0 ≫ p, one also has P ≈ P0, while in the case that object and instrumental polarization are comparable (p ≈ P0), the observed polarization is approximately given by

which, according to the value of (χ0-φ), gives values that range from 0 to 2 P0. It is important to note that the main difference between instrumental polarization and a polarized background is that the latter is effective only when the background is ≳I (see § 5), while the former acts regardless of the object intensity; what counts is its polarization.
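
The range quoted above can be illustrated with a small numerical sketch. It assumes that, to first order in p and P0, the observed normalized Stokes parameters are the sum of the input ones and of the instrumental term, i.e., Q/I ≈ P0 cos2χ0 + p cos2φ and U/I ≈ P0 sin2χ0 + p sin2φ. This additive form is an assumption consistent with the text, not a transcription of the equations that are missing from this excerpt, and the function name is hypothetical.

```python
import math

def observed_polarization(P0, chi0, p, phi):
    """First-order combination of source and instrumental polarization.

    Angles are in radians. Returns the observed degree and angle.
    """
    q = P0 * math.cos(2.0 * chi0) + p * math.cos(2.0 * phi)
    u = P0 * math.sin(2.0 * chi0) + p * math.sin(2.0 * phi)
    return math.hypot(q, u), 0.5 * math.atan2(u, q)

# Equal source and instrumental polarization (p = P0), illustrative value.
P0 = 0.01
P_aligned, _ = observed_polarization(P0, 0.0, P0, 0.0)            # chi0 - phi = 0
P_crossed, _ = observed_polarization(P0, 0.0, P0, math.pi / 2.0)  # chi0 - phi = -90 deg

print(P_aligned, P_crossed)
```

With aligned angles the two contributions add up to 2 P0, while with angles that differ by 90° they cancel, which reproduces the two extremes mentioned in the text.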

With the aid of these approximate expressions (eq. [18]), one can easily evaluate the instrumental polarization, provided that the input polarization field is known and the observed source covers a large fraction of the instrument field of view. In fact, solving equation (18) for φ and p yields

and

where p1 and p2 are two independent estimates of p that can be averaged to increase the accuracy.
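
A minimal sketch of this inversion is given below. It assumes the first-order relations Q1 ≈ Q0 + p I0 cos2φ and U1 ≈ U0 + p I0 sin2φ, which are consistent with the description in the text but are not copied from the missing equations. The function name is hypothetical, and `atan2` is used instead of a plain arctangent so that the angle falls in the correct quadrant.

```python
import math

def estimate_instrumental(I0, Q0, U0, Q1, U1):
    """Estimate instrumental polarization from known input Stokes parameters.

    Returns (phi, p1, p2), where p1 and p2 are the estimates of p from the
    Q and U differences, respectively. An estimate is NaN when its
    denominator is too close to zero to be useful.
    """
    dq = (Q1 - Q0) / I0
    du = (U1 - U0) / I0
    phi = 0.5 * math.atan2(du, dq)
    c, s = math.cos(2.0 * phi), math.sin(2.0 * phi)
    p1 = dq / c if abs(c) > 1e-6 else float("nan")
    p2 = du / s if abs(s) > 1e-6 else float("nan")
    return phi, p1, p2

# Forward model with illustrative values, then recovery.
I0 = 1000.0
P_sky, chi_sky = 0.03, math.radians(20.0)
p_true, phi_true = 0.009, math.radians(-50.0)

Q0 = I0 * P_sky * math.cos(2.0 * chi_sky)
U0 = I0 * P_sky * math.sin(2.0 * chi_sky)
Q1 = Q0 + p_true * I0 * math.cos(2.0 * phi_true)
U1 = U0 + p_true * I0 * math.sin(2.0 * phi_true)

phi_est, p1, p2 = estimate_instrumental(I0, Q0, U0, Q1, U1)
print(math.degrees(phi_est), p1, p2)
```

In real data the two estimates differ because of noise, and the one with the larger denominator is the better constrained of the two.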

As is well known, the night sky shows a polarization that varies according to the ecliptic and Galactic coordinates (see Leinert et al. 1998 for an extensive review). It is mostly dominated by zodiacal light polarization, which reaches its minimum, below a few percent, at the antisolar position (Roach & Gordon 1973). Since this is not expected to vary on scales of a few arcminutes, in principle, relatively empty fields represent suitable targets for panoramic polarization tests, provided that the S/N per spatial resolution element is of the order of several thousand.

11. THE CASE OF FORS1 AT THE ESO VLT

As an example application, we have performed a test using real data obtained with FORS1 at the ESO VLT. In this instrument, the polarimetric mode is achieved by inserting into the beam a superachromatic HWP and a WP that has a throw of about 22'' (Szeifert 2002).

We have identified in the ESO archive three sets of data obtained in rather empty fields in B, V, and I passbands. Table 1 lists the equatorial coordinates (α, δ), ecliptic longitude and latitude (λ, β), helioecliptic longitude (λ − λ⊙), and degree of sky polarization and angle (Psky, χsky) for the different fields. In all three cases, the S/N achieved on the sky background Isky in the combined images is larger than ≃200 pixel−1. With such a signal and for a typical 1% polarization, the bias effect is expected to be small (see Fig. 1), and the rms error of the polarization degree, according to equation (10), is of the order of 0.3%, while the uncertainty of χ is about 9° (see eq. [11]). In order to further increase the accuracy and to allow for outlier rejection, we have computed a clipped average in 30 × 30 pixel bins, which, given the FORS1 detector scale (0.2'' pixel−1), translates into an angular resolution of 6''.
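
The orders of magnitude quoted above can be reproduced with a short sketch. It assumes that equation (10) has the form σ_P = 1/(√(N/2) · S/N) and that equation (11) has the form σ_χ = σ_P/(2P) in radians; both forms are assumptions here, since the equations themselves are not part of this excerpt. The number of HWP positions (N = 4) is also assumed for illustration.

```python
import math

def sigma_P(snr, n_hwp):
    """RMS error of the polarization degree (assumed form of eq. [10])."""
    return 1.0 / (math.sqrt(n_hwp / 2.0) * snr)

def sigma_chi_deg(sigma_p, P):
    """RMS error of the polarization angle in degrees (assumed form of eq. [11])."""
    return math.degrees(sigma_p / (2.0 * P))

snr_per_pixel = 200.0  # lower limit quoted in the text
n_hwp = 4              # assumed number of HWP positions
P = 0.01               # typical 1% polarization

err_P = sigma_P(snr_per_pixel, n_hwp)
err_chi = sigma_chi_deg(err_P, P)
bin_size_arcsec = 30 * 0.2  # 30 pixel bins at 0.2 arcsec per pixel

print(f"sigma_P = {100 * err_P:.2f}%, sigma_chi = {err_chi:.1f} deg, bin = {bin_size_arcsec:.0f} arcsec")
```

Since the S/N is stated to be larger than 200, these values are upper limits under the stated assumptions; with σ_P = 0.3% the same relation gives an angle uncertainty of about 9°.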

Since the instrumental polarization on the optical axis, measured with unpolarized standard stars, is smaller than 0.03% (Szeifert 2002), we are confident that the sky background polarization field (Psky, χsky) measured close to that area is not affected by spurious effects (the values are reported in the last two columns of Table 1). Therefore, we can easily compute p(x,y) and φ(x,y), using the method previously outlined, where I0 = Isky, Q0 = Isky Psky cos2χsky, and U0 = Isky Psky sin2χsky. The results of these calculations are presented in Figures 5, 7, and 9. With the remarkable exception of the B band, the instrumental polarization of FORS1 shows a quasi‐symmetric radial pattern. For example, for the V filter, the instrumental polarization remains below 0.1% within 1' of the geometrical center of the detector, while it grows to ∼0.6% at 3', to reach the maximum (i.e., ∼1.4%) at the corners of the field of view.

Fig. 5.— FORS1 instrumental polarization map in the B band. The contours trace 0.3%, 0.6%, and 0.9% polarization levels. Coordinates, expressed in arcminutes, refer to the geometrical center of the detector.

Fig. 6.— Top: FORS1 instrumental polarization radial profile for the B band. Each point is the result of a 30 × 30 pixel binning in the original images. Radius, expressed in arcminutes, is computed from the geometrical center of the detector. The thick line traces a linear least‐squares fit, while the thin line is the polarimetric ray‐tracing prediction. Bottom: Instrumental polarization angle as a function of pixel polar angle. The solid line is not a fit to the data, but rather has unit slope and zero intercept.

Fig. 7.— Same as Fig. 5, but for the V band. The dark segment marked by a circle in the lower right corner indicates the values obtained from an unpolarized standard star.

Fig. 8.— Same as Fig. 6, but for the V band.

Fig. 9.— Same as Fig. 5, but for the I band.

This is illustrated in the top panels of Figures 6, 8, and 10, where we have plotted the estimated instrumental polarization for each 30 × 30 pixel bin as a function of its average distance r from the center. The deviation from a perfect central symmetry is distinctly shown by the dispersion of the points, which is larger than the measurement error. Particularly marked is the case of the B band, which shows a strong azimuthal dependence and thus deserves a separate discussion (see § 11.1). For the V band, there is a systematic deviation from central symmetry for a polar angle α = arctan (y/x) between 10° and 80°. This region is probably disturbed by the presence of a saturated star and a reflection caused by the HWP, visible in the input images. Excluding these points, a linear least‐squares fit to the V data gives

where r is expressed in arcminutes (Fig. 8, solid curve). The I band shows the smoothest behavior, and the observations are described very well by the polynomial

Fig. 10.— Same as Fig. 6, but for the I band.

The absolute rms deviations shown by the data from the best fits are of the order of 0.05%, so that in these two passbands the spurious polarization can be corrected with an accuracy that is comparable to that dictated by the photon statistics. In both cases, but especially in the I band, the pattern is remarkably radial, as shown in the bottom panels of Figures 8 and 10, where we have plotted φ as a function of polar angle α.

In order to verify these results for the V filter, we have carried out a test observing an unpolarized standard star placed in the lower right corner of the detector. Measured polarization was P = 0.92% ± 0.04% and χ = −48° ± 1.4°, while according to the previous analysis, the expected instrumental polarization in that position is p = 0.96% and φ = −51.9°, which are in very good agreement with each other (see also Fig. 7, lower right corner).

Once the instrumental polarization is mapped, one can correct for it using the approximate equation (18), which holds when p and P0 are small, and only if the instrumental polarization is produced by a linear polarizer preceding the analyzer.
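
The correction step can be sketched as follows. The code assumes the same first-order additive form used for the approximate equation (18), namely that the instrumental term p I (cos2φ, sin2φ) is simply added to the input Q and U; this is an assumption consistent with the text rather than a transcription of the missing equation. The function name and all numerical values are hypothetical and chosen only to show a round trip.

```python
import math

def correct_stokes(I, Q1, U1, p, phi):
    """First-order removal of instrumental polarization.

    Subtracts the term p * I * (cos 2phi, sin 2phi) from the observed
    Q and U. Valid only for small p and small source polarization.
    """
    Q0 = Q1 - p * I * math.cos(2.0 * phi)
    U0 = U1 - p * I * math.sin(2.0 * phi)
    return Q0, U0

# Round trip with illustrative values.
I = 1.0
P0, chi0 = 0.02, math.radians(15.0)   # hypothetical source polarization
p, phi = 0.009, math.radians(-50.0)   # hypothetical value read from a map of p and phi

Q0_true = I * P0 * math.cos(2.0 * chi0)
U0_true = I * P0 * math.sin(2.0 * chi0)
Q1 = Q0_true + p * I * math.cos(2.0 * phi)
U1 = U0_true + p * I * math.sin(2.0 * phi)

Q0_rec, U0_rec = correct_stokes(I, Q1, U1, p, phi)
P_rec = math.hypot(Q0_rec, U0_rec) / I
chi_rec = 0.5 * math.atan2(U0_rec, Q0_rec)
print(P_rec, math.degrees(chi_rec))
```

In practice p and φ would be evaluated at the field position of the target, either from a smooth radial fit or by interpolating the measured map.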

11.1. The Cause of Instrumental Polarization in FORS1

As we have seen, the spurious polarization detected in FORS1 in V and I passbands has a clear central symmetry, is null on the optical axis (Szeifert 2002), shows a radial pattern, and grows with the distance from the optical axis. All these facts suggest that this must be generated by the optics that precede the analyzer (i.e., within the collimator). In fact, when a light beam enters an optical interface along a nonnormal direction, the component of the transmitted beam perpendicular to the plane of incidence is attenuated, according to Fresnel equations (see, e.g., Born & Wolf 1980). As a consequence, the emerging beam is linearly polarized in a direction that is parallel to the plane of incidence. If the surface is curved, as it is in the case of lenses, the incidence angle quickly increases, moving away from the optical axis, and this in turn produces an increase in the induced polarization. The effect becomes more pronounced if the lens is strongly curved; i.e., if the curvature radius is comparable to its diameter. Of course, on the optical axis, the incidence angle is null, so no polarization is produced. Therefore, at least from a qualitative point of view, the polarization induced by transmission has all the features necessary to explain the observed pattern.

The polarization induced by transmission can be easily evaluated using the appropriate expression for the corresponding Mueller matrix (see Keller 2002, eq. [4.63]). For a typical refraction index n = 1.5 and an incidence angle of 30°, refraction through an uncoated glass would produce a polarization of about B = 1.7% per optical surface. This polarization is usually reduced in a drastic way (i.e., down to 0.1%–0.2%) by antireflection (AR) coatings. Nevertheless, since the effect of multiple surfaces is roughly additive, in the presence of numerous and rather curved lenses, residual polarization can be nonnegligible. Another important aspect is that this mechanism has no effect on circular polarization, supported by the fact that no instrumental circular polarization has been measured in FORS1 (Bagnulo et al. 2002).
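
The value quoted for an uncoated surface can be verified with the Fresnel equations directly. The sketch below is a simplified check, not the Mueller matrix calculation referenced in the text: it assumes a single air-to-glass interface, unpolarized input light, and no coating, and it computes the polarization of the transmitted beam from the intensity transmittances parallel and perpendicular to the plane of incidence. The function name is hypothetical.

```python
import math

def transmission_polarization(n, incidence_deg):
    """Polarization degree induced by refraction at one uncoated surface.

    Uses the Fresnel reflectances for the s (perpendicular) and p (parallel)
    components; the transmittances are T = 1 - R for a lossless interface.
    """
    ti = math.radians(incidence_deg)
    if ti == 0.0:
        return 0.0  # normal incidence: both components are transmitted equally
    tt = math.asin(math.sin(ti) / n)  # Snell's law, air to glass
    Rs = (math.sin(ti - tt) / math.sin(ti + tt)) ** 2
    Rp = (math.tan(ti - tt) / math.tan(ti + tt)) ** 2
    Ts, Tp = 1.0 - Rs, 1.0 - Rp
    return (Tp - Ts) / (Tp + Ts)

B = transmission_polarization(1.5, 30.0)
print(f"B = {100 * B:.2f}%")
```

For n = 1.5 and an incidence angle of 30° this gives about 1.7%, in agreement with the value quoted above, and it returns zero on the optical axis where the incidence is normal.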

We have run polarization ray‐tracing simulations, including telescope mirrors, collimator lenses, and AR coatings. This kind of calculation allows one to describe in detail the optical system, taking into account partial polarization cancellation produced by symmetries within the optical beams, and the depolarizing effect of AR coatings.

The standard resolution collimator of FORS1 contains three lenses and a doublet, all treated with a single‐layer MgF2 quarter‐wave AR coating at 450 nm (Seifert 1994). The ray‐tracing calculations (Avila 2005) show that the polarization induced by transmission is indeed not totally removed by the AR coatings. For V and I filters, a best fit to the simulated data gives a radial dependence that is very similar to the results we have derived from the experimental data. The deviation reaches maximum at the edges of the field of view, where the ray‐tracing model gives a polarization that is ∼0.08% and ∼0.05% smaller than what is actually observed in V and I, respectively (see also Figs. 8 and 10, thin curves, top panel). This small discrepancy can possibly be attributed to imperfections in the AR coatings and to the effects of nonorthogonal incidence on the HWP.

In principle, since single‐layer AR coatings are optimized for one specific wavelength (450 nm, in the case of FORS1), the residual polarization is expected to be higher at other wavelengths. Simulations have been run in order to sample the wavelength range 400–900 nm, and these show that the expected wavelength dependency can be very well approximated by the linear relation

where λ is expressed in nm. This relation predicts reasonably accurately the ∼40% relative increase we indeed see passing from V to the I passband, and it can thus be safely used to predict the effect in the R band.

According to the simulations, one would expect that the spurious polarization in B is about 25% smaller than in V. But as we have already mentioned, this passband shows a rather weird behavior and does not conform to the model predictions. In fact, the polarization pattern strongly deviates from central symmetry, displaying a marked azimuthal dependence (Fig. 5, left panel). This becomes more evident in the radial profile presented in Figure 6 (top): the purely radial dependence is clearly disturbed by an asymmetric field. In some directions, the polarization field grows much faster than in others, producing a great spread in the observed data, and in most of the cases, the observed polarization is larger than that predicted by the polarization ray‐tracing (thin solid curve). The deviations from a centrally symmetric pattern reach up to 0.5% (see Fig. 11), making the correction in the B band quite difficult, and certainly not feasible using simple smooth functions, as in the cases of V and I. Rather, a much more accurate correction can be obtained by interpolating the map of Figure 5 at the required field position. We must note, however, that a rigorous correction for this secondary effect will be possible only once its physical cause is identified and its mathematical description is formulated.

Fig. 11.— Residual field obtained subtracting the ray‐tracing model from the observed polarization in the B band.

We have tried to reproduce the observed behavior by introducing defects in the system, such as a weak linear polarization from the HWP, and the presence of linear polarization in the post‐analyzer optics. In both cases, the effect is completely different from what we see in the B band. Therefore, the physical reason for this phenomenon is still unclear (see also the discussion in § 12). What we can say here is that the deviation from central symmetry is also present, although to a much smaller extent, in the V band. This is shown by the contours at constant polarization, which are clearly box‐shaped (see Fig. 7), while in the I band they are practically circular (see Fig. 9). The conclusion is that this additional effect, whatever its origin is, becomes more severe at shorter wavelengths.

We must note that our method relies heavily on the assumptions that the instrumental polarization is null on the optical axis and that the sky background polarization is constant across the field of view. In fact, the night‐sky polarization is not very well studied. The only extensive analysis we could find in the literature is that of Wolstencroft & Bandermann (1973), who concluded that the polarization structure varies in scale from a few degrees to about 30°; i.e., on scales that are much larger than the field of view of FORS1 (6.8' × 6.8'). In order to explain the deviations we observe from a centrally symmetric pattern, one would need a variation in the sky polarization of the order of 0.5%, on a scale of a few arcminutes. Even though this seems to be quite a large gradient, in principle we cannot exclude it. Only further tests will clarify the nature of the effect we see in the B band.

12. DISCUSSION AND CONCLUSIONS

Dual‐beam polarizers coupled to two‐dimensional arrays provide a tool with which to perform panoramic imaging polarimetry and multiobject spectropolarimetry. For these instruments, the problem of atmospheric fluctuations is solved by obtaining simultaneous measurements of two orthogonal polarization states. Of course, the use of a WP also has some drawbacks, such as the flat‐fielding issue discussed in § 6. With the exception of this one feature, data reduction and analysis are entirely analogous to those of other polarimetric systems, as we have shown with both analytical and numerical approaches (§§ 3 and 4).

When the targets of study are extended and cover a large fraction of the field of view, accurate background subtraction becomes an issue whose effects we have investigated in § 5. This is particularly important when the background is not the simple sky background, but rather has a complicated structure. This is the case, for instance, for a faint supernova projected onto a galactic spiral arm.

Another problem that may reduce the performance of a dual‐beam polarimeter is the imperfect behavior of the WP. In § 7 we discuss this issue and present a test to determine possible deviations from the ideal case. As an example, we have applied it to the FORS1 archive data described in § 10. Using an object‐free region roughly in the center of the field of view, we have used equations (14) and (15) to compute t (see eq. [12]), which turns out to be t = 0.502 ± 0.001, i.e., perfectly compatible with the value derived from the Fourier analysis (§ 8). As we have shown, the redundancy introduced by having N≥4 strongly reduces this problem, even in the cases where t differs by about 10% from the ideal case (t = 0.5). This is also the case for the presence of linear polarization in the postanalyzer optics, whose effects are practically eliminated by the redundancy (§ 9).

Finally, we have addressed the instrumental polarization issue, described its consequences, and proposed an easy test to detect any spurious effect, with rather high accuracy (§ 10). As an example, we have applied it to archival FORS1 data and have detected an instrumental polarization pattern that is roughly centrally symmetric (for V and I) and has a radial dependency. The presence of this spurious polarization affects all objects placed at distances larger than 1.5' from the optical axis, with intrinsic polarizations of a few percent or less. The problem becomes particularly severe when p≃P, in which case the measured Stokes parameters can be very wrong. For objects filling most of the field of view, there will always be regions that are affected by this problem. Moreover, the correct sky background estimate, which is absolutely necessary to recover the intrinsic object field in the outer parts of the Galaxy, becomes impossible if the instrumental polarization is not taken care of properly. The spurious field must be removed before one is able to estimate the background contribution. Both our data and ray‐tracing simulations show that the effect is wavelength dependent. In the case of FORS1, a strong deviation from central symmetry is seen in the B band, and we have interpreted this as a signature of an additional effect that has yet to be explained and is not included in the ray‐tracing simulations that, in contrast, accurately reproduce the observed data in V and I.

One possible source of asymmetric instrumental polarization is the unrelieved stress birefringence in the optical glasses, due to thermal strain and mechanical loading (see, e.g., Theocaris & Gdoutos 1979). This phenomenon is known to introduce a retardance that can in turn change the polarization status of incoming polarized light (the effect is null if the light is unpolarized). Since the incoming radiation is certainly polarized by FORS1 in a differential way across the field of view, this would also imply that the secondary effect should be weaker where the centrally symmetric component is smaller. The fact that this is indeed the case (see Fig. 11), and also that the retardance is expected to grow faster than λ⁻¹, seems to suggest that this is a plausible explanation for the asymmetric component. If this is indeed the case, then it is not possible to correct the measured linear polarization just by vector‐subtracting the residual field (like the one shown in Fig. 11), simply because the effect of retardance depends on the polarization state of the incoming light. This requires a more sophisticated treatment that is necessarily based on the exact knowledge of the physical mechanism and its mathematical description through Mueller matrix formalism.

In general, instrumental polarization induced by transmission is most likely common to all focal reducers equipped with a polarimetric mode. While the overall pattern should be a general feature of these instruments, the exact radial dependence may change according to the optical design and the curvature of the lenses. The method we have described in this paper provides an accurate way to characterize the instrument, as well as a tool to correct for this effect.

This paper is partially based on observations made with ESO telescopes at Paranal Observatory under programs 066.A‐0397, 69.C‐0579, 069.D‐0461, and 072.A‐0025. The authors would like to thank T. Szeifert for his kind support and collaboration, J. Walsh and S. Bagnulo for interesting discussions, G. Ruprecht and W. Seifert for providing us with the optical design specifications of FORS1, R. Tommasini for introducing us to polarimetric ray‐tracing, S. D'Odorico and H. Dekker for their kind support, and G. Avila for his polarimetric ray‐tracing calculations. Special thanks go to C. Keller for his invaluable advice, clarifications, and help during the analysis of the instrumental polarization of FORS1. Finally, we would like to thank an anonymous referee for her/his comments and suggestions, which helped to improve the quality of the paper.

Footnotes

  • This paper is partially based on observations made with ESO telescopes at Paranal Observatory, under programs 066.A‐0397, 69.C‐0579, 069.D‐0461, and 072.A‐0025.

  • Here we consider photon shot noise as the only source of random error. Another potential source is represented by the mispositioning of the HWP with respect to the optimal angles. However, as analytical solutions and numerical simulations show, with the typical positioning accuracy currently attainable (<1°), the associated error of the polarization degree and angle is negligible.
