Efficient and exact tests of the risk ratio in a correlated table with structural zero
Introduction
A sample of n individuals have a binary response measured. For reasons of design, only those who give a certain response on the first occasion are measured a second time. Such designs arise, for instance, in both treating and testing for disease; see Johnson and May (1995). An often-quoted example is Toyota et al. (1999), who study the detection rates of a screening test for tuberculosis. For those who test negative on the first occasion the test is applied a second time 1–3 weeks later, whereas those who test positive on the first occasion do not need to be retested. It is suspected that application of the first test, even if negative, makes infected individuals more sensitive to subsequent tests. This booster phenomenon can be measured by the extent to which the probability of a negative response decreases from the first to the second occasion, given the first.
Another example, which we will study in some detail, was given in Agresti (1990, p. 45) (Table 1). A sample of 156 calves was tested for pneumonia during the first 60 days of life and a total of 93 were positive. Of these 93 calves with primary infection, suffered a secondary infection in the following two weeks. There was interest in comparing the rate of primary infection, estimated to be , with the rate of secondary infection, estimated to be . The ratio of these two probabilities, known as the risk ratio (RR), represents the factor by which the chance of infection changes after first infection, and is here estimated to be . An RR less than 1.0 suggests that primary infection has an immunising effect. Agresti defines a kind of statistic for testing whether the . For this example, the value turns out to be 19.7. The signed version of this statistic is and the approximate one-sided p-value is . Certainly the evidence for a protective effect of first infection seems to be overwhelming.
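The point estimates described in this example follow directly from the three counts. A minimal sketch, assuming the rates are estimated by simple sample proportions; the function name `risk_ratio_estimate` and the counts in the usage line are hypothetical illustrations, not the calf data:

```python
def risk_ratio_estimate(n, n_primary, n_secondary):
    """Return (primary rate, secondary rate, risk ratio) from raw counts.

    n           -- number of individuals sampled
    n_primary   -- number positive on the first occasion
    n_secondary -- number of those also positive on the second occasion
    """
    if not 0 <= n_secondary <= n_primary <= n:
        raise ValueError("counts must satisfy 0 <= n_secondary <= n_primary <= n")
    if n_primary == 0:
        raise ValueError("risk ratio is undefined when nobody is positive first")
    primary_rate = n_primary / n               # estimated P(first positive)
    secondary_rate = n_secondary / n_primary   # estimated P(second positive | first positive)
    return primary_rate, secondary_rate, secondary_rate / primary_rate


# Hypothetical counts, chosen only to illustrate the call.
primary, secondary, rr = risk_ratio_estimate(100, 50, 10)
```

An estimated RR below 1.0 corresponds to a secondary rate that is lower than the primary rate.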
Lui (1998, 2000) studied confidence intervals for the ratio and difference in response probabilities, respectively. This work was further developed by Tang and Tang (2002) for the ratio and Tang and Tang (2003) for the difference of probabilities. Lloyd and Moldovan (2007a, 2007b) have recently applied the exact method of Buehler (1957) to confidence limits for the RR. There has been less work on the testing problem, though the confidence intervals can obviously be used to define two-sided tests.
This paper is motivated by several considerations. First, in this problem it is computationally feasible to calculate a p-value with exact statistical properties. This is achieved by maximising over the nuisance parameter. Within the frequentist paradigm of inference it is essential to account for the worst possible parameter values if the statistical properties are to be guaranteed. While this may seem conservative, maximisation is the most efficient method possible of achieving this guarantee. Tests which are not maximised over the nuisance parameter are either systematically conservative or explicitly violate their stated properties, as explained in Lloyd (2005). Second, standard asymptotic tests have statistical properties that are far from ideal, even for large samples. The issue of exactness is a practical one. For instance, in the above example of Agresti, the exact p-value obtained by maximising over the nuisance parameter is 0.00361 which, while still small, corresponds to an equivalent z-statistic of rather than . Such behaviour is not at all uncommon. Third, such behaviour can be largely eliminated by replacing the nuisance parameter with a null estimate and then maximising, as described in Section 3. This results in a p-value that is less sensitive to the nuisance parameter, and consequently the maximised versions tend to be smaller. This will be seen to translate into superior power for guaranteed size. Lastly, we look at the performance when and separately and discover quite different behaviour.
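The idea of maximising a p-value over the nuisance parameter can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: the ordering statistic `stat` is a hypothetical unstandardised difference rather than any of the paper's statistics, the nuisance parameter is taken to be the first-occasion probability `pi1`, and the supremum is approximated by a grid search.

```python
from math import comb


def table_prob(n, n1, n11, pi1, pi2):
    """Probability of n1 first positives and n11 second positives among them."""
    return (comb(n, n1) * pi1**n1 * (1 - pi1)**(n - n1)
            * comb(n1, n11) * pi2**n11 * (1 - pi2)**(n1 - n11))


def stat(n, n1, n11, psi0):
    """Hypothetical ordering statistic: estimated pi2 minus psi0 * estimated pi1."""
    if n1 == 0:
        return 0.0  # assumed convention for the uninformative table
    return n11 / n1 - psi0 * n1 / n


def maximised_p_value(n, n1_obs, n11_obs, psi0=1.0, grid=200):
    """Grid approximation to the one-sided p-value for RR = psi0 against
    RR < psi0, maximised over the nuisance parameter pi1."""
    t_obs = stat(n, n1_obs, n11_obs, psi0)
    # Tables at least as extreme as the observed one (small values favour RR < psi0).
    tail = [(a, b) for a in range(n + 1) for b in range(a + 1)
            if stat(n, a, b, psi0) <= t_obs + 1e-12]
    upper = min(1.0, 1.0 / psi0)  # keeps pi2 = psi0 * pi1 inside [0, 1]
    best = 0.0
    for k in range(1, grid):
        pi1 = upper * k / grid
        p = sum(table_prob(n, a, b, pi1, psi0 * pi1) for a, b in tail)
        best = max(best, p)
    return best


# Small hypothetical data set: 6 individuals, 4 first positives, 1 second positive.
p_max = maximised_p_value(6, 4, 1)
```

Because every table is enumerated, the cost grows roughly with the square of the sample size times the grid size. A finer grid or a numerical optimiser would tighten the approximation to the supremum.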
Model notation and approximate test statistics
The possible responses of an individual are , where 00 denotes a negative response on occasion 1, in which case the second response is negative by convention. Let be the number of individuals with response ij and the probability of this response. The count is absent by design. The probability of a positive response on the first occasion is . The probability of a second positive response given a first positive response is . The ratio of these two
Exact tests and p-values
Tang and Tang (2002) have shown numerically that confidence intervals based on and can have poor coverage properties even for moderate sample sizes. This leads to poor performance of the implied two-sided test, at least for some null values. In this section we give a brief overview of methods for constructing the so-called exact tests from a given, possibly approximate, test statistic. We have data Y and parameter and want to test the null hypothesis , against either one or
Numerical study
We have described four basic test statistics and their modified versions. Each of these eight statistics generates four p-values, namely the approximate p-value based on the normal asymptotic approximation, the M p-value, the E p-value and the E+M p-value. Only the M and E+M p-values can be guaranteed to be valid. For all possible data sets when and , all 32 p-values for testing the null hypothesis versus were computed. This allows a full investigation of the performance of the
Discussion
Another method of comparing secondary and primary probability of success is by the simple difference rather than the ratio. Approximate confidence intervals which generate two-sided tests are given by Lui (2000) and Tang and Tang (2003). When only a proportion of individuals have a structural zero, inference has been studied by Tang and Tang (2004). The study in Lloyd (2005) has considered some other basic generating statistics, including one based on the conditional distribution of X given T.
References (19)
- Agresti, A., 1990. Categorical data analysis. first ed. Wiley,...
- On the elimination of nuisance parameters. J. Amer. Statist. Assoc. (1977)
- et al. P values maximised over a confidence set for the nuisance parameter. J. Amer. Statist. Assoc. (1994)
- et al. Exact unconditional tests for a matched pairs design. Statist. Methods Med. Res. (2003)
- et al. Mathematical Statistics (1977)
- Confidence intervals for the product of two binomial parameters. J. Amer. Statist. Assoc. (1957)
- et al. EM algorithm and its application to testing hypotheses. Sci. China A (2003)
- et al. Combining tables that contain structural zero. Statist. Med. (1995)
- Lloyd, C.J., 2005. E+M P-values. Austral. NZ. J. Statist., submitted for publication and available as Working Paper...