Article

Confidence Intervals and Sample Size to Compare the Predictive Values of Two Diagnostic Tests

by
José Antonio Roldán-Nofuentes
1,* and
Saad Bouh Regad
2
1
Department of Statistics, School of Medicine, University of Granada, 18016 Granada, Spain
2
Epidemiology and Public Health Research Unit and URMCD, University of Nouakchott Alaasriya, BP 880 Nouakchott, Mauritania
*
Author to whom correspondence should be addressed.
Mathematics 2021, 9(13), 1462; https://doi.org/10.3390/math9131462
Submission received: 11 May 2021 / Revised: 14 June 2021 / Accepted: 18 June 2021 / Published: 22 June 2021

Abstract

A binary diagnostic test is a medical test that is applied to an individual in order to determine the presence or the absence of a certain disease and whose result can be positive or negative. A positive result indicates the presence of the disease, and a negative result indicates the absence. Positive and negative predictive values represent the accuracy of a binary diagnostic test when it is applied to a cohort of individuals, and they are measures of the clinical accuracy of the binary diagnostic test. In this manuscript, we study the comparison of the positive (negative) predictive values of two binary diagnostic tests subject to a paired design through confidence intervals. We have studied confidence intervals for the difference and for the ratio of the two positive (negative) predictive values. Simulation experiments have been carried out to study the asymptotic behavior of the confidence intervals, giving some general rules for application. We also study a method to calculate the sample size to compare the parameters using confidence intervals. We have written a program in R to solve the problems studied in this manuscript. The results have been applied to the diagnosis of colorectal cancer.

1. Introduction

A diagnostic test is a medical test that is applied to an individual in order to determine the presence of a certain disease. Binary diagnostic tests are a very common type of diagnostic test in clinical practice. A binary diagnostic test (BDT) is a diagnostic test whose result is either positive or negative: a positive result indicates the presence of the disease, and a negative result indicates its absence. Mammography for the diagnosis of breast cancer is an example of a BDT. The accuracy of a BDT is measured in terms of two fundamental parameters: sensitivity and specificity. Sensitivity (Se) is the probability that the result of the BDT is positive when the individual has the disease, and specificity (Sp) is the probability that the result of the BDT is negative when the individual does not have the disease. Therefore, Se and Sp are probabilities of a correct diagnosis, and they represent the intrinsic accuracy of the BDT, since these parameters depend on the physical, chemical, or biological properties upon which the BDT is based. Other parameters that are used to assess and compare two BDTs are the positive and negative predictive values. The positive predictive value ($\tau$) is the probability that an individual has the disease when the result of the BDT is positive, and the negative predictive value ($\upsilon$) is the probability that an individual does not have the disease when the result of the BDT is negative. Predictive values represent the accuracy of the diagnostic test when it is applied to a cohort of individuals, and they are measures of the clinical accuracy of the BDT. Predictive values depend on Se, Sp, and the disease prevalence (p), and they are easily calculated by applying Bayes' theorem, i.e.,
$$\tau = \frac{p \times Se}{p \times Se + (1-p) \times (1-Sp)} \quad \text{and} \quad \upsilon = \frac{(1-p) \times Sp}{p \times (1-Se) + (1-p) \times Sp}.$$
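As a quick numerical illustration of these formulas, the short R snippet below (our own sketch; the function name predictive_values is not part of the authors' software) computes the predictive values from the sensitivity, the specificity, and the prevalence.

```r
# Bayes' theorem for predictive values (illustrative helper, not the authors' program)
predictive_values <- function(se, sp, p) {
  q <- 1 - p
  ppv <- (p * se) / (p * se + q * (1 - sp))          # positive predictive value
  npv <- (q * sp) / (p * (1 - se) + q * sp)          # negative predictive value
  c(PPV = ppv, NPV = npv)
}

# Example: Se = 0.80, Sp = 0.90, prevalence 25%
predictive_values(se = 0.80, sp = 0.90, p = 0.25)    # PPV approx 0.727, NPV approx 0.931
```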
The accuracy of a BDT is assessed in relation to a gold standard. A gold standard (GS) is a medical test that determines without error whether or not an individual has the disease. Biopsy for the diagnosis of breast cancer is an example of GS.
On the other hand, the comparison of parameters of two BDTs is an important topic in the study of statistical methods for diagnosis in Medicine. The most frequent sample design to compare the parameters of two BDTs is the paired design, which consists of applying the two BDTs and the GS to all of the individuals in a random sample of size n. The comparison of the predictive values of two BDTs subject to a paired design has been the subject of different studies. Bennett [1,2], Leisenring et al. [3], Wang et al. [4], Kosinski [5], Tsou [6], and Takahashi and Yamamoto [7] have studied hypothesis tests to compare the two positive predictive values and the two negative predictive values independently. Roldán-Nofuentes et al. [8] studied a global hypothesis test to simultaneously compare the positive and negative predictive values of two BDTs. However, the comparison of predictive values using confidence intervals has been little studied. If the hypothesis test is significant at an α error, the confidence interval (CI) allows us to determine by how much one predictive value is greater than the other. Moskowitz and Pepe [9] proposed a Wald-type CI for the ratio of the two positive (negative) predictive values. A Wald-type CI for the difference of the two positive (negative) predictive values is easily obtained by inverting the test statistic of the hypothesis test studied by Wang et al. [4].
The objective of this manuscript is to study CIs to compare the positive (negative) predictive values of two BDTs subject to a paired design. For this purpose, we have studied CIs for the difference and for the ratio of the two positive (negative) predictive values. If a CI for the difference (ratio) does not contain the value zero (one), then we reject the equality of the two positive (negative) predictive values, and we estimate by how much one predictive value exceeds the other. The problem of calculating the sample size to compare the two positive (negative) predictive values through a CI is also studied.
This manuscript is structured in the following way. In Section 2, the existing CIs are presented and other new CIs are proposed, both for the ratio and for the difference of the positive (negative) predictive values. In Section 3, simulation experiments are carried out to study the coverage probabilities and the average lengths of the CIs. In Section 4, a method to calculate the sample size to compare the parameters through CIs is proposed. In Section 5, we present the program "cicpvbdt", a program written in R that solves the problems studied in this manuscript. In Section 6, the results are applied to an example on the diagnosis of colorectal cancer, and in Section 7, the results are discussed.

2. Confidence Intervals

Let us consider two BDTs that are assessed in relation to the same GS. Let $T_i$ be the variable that models the result of the $i$th BDT, with $i = 1, 2$: $T_i = 0$ indicates that the test result is negative, and $T_i = 1$ indicates that it is positive. Let $D$ be the random variable that models the result of the GS, so that $D = 1$ when the individual is diseased and $D = 0$ when the individual is non-diseased. Let $Se_i$ and $Sp_i$ be the sensitivity and specificity of the $i$th BDT. Table 1 shows the observed frequencies obtained subject to a paired design. The frequencies $s_{jk}$ and $r_{jk}$ arise from a multinomial distribution whose probabilities are $p_{jk} = P(D = 1, T_1 = j, T_2 = k)$ and $q_{jk} = P(D = 0, T_1 = j, T_2 = k)$, with $j, k = 0, 1$. Applying the conditional dependence model of Vacek [10], the probabilities $p_{jk}$ and $q_{jk}$ are written as
$$p_{jk} = p\left[ Se_1^{\,j}(1-Se_1)^{1-j}\, Se_2^{\,k}(1-Se_2)^{1-k} + \delta_{jk}\varepsilon_1 \right]$$
and
$$q_{jk} = q\left[ Sp_1^{\,1-j}(1-Sp_1)^{j}\, Sp_2^{\,1-k}(1-Sp_2)^{k} + \delta_{jk}\varepsilon_0 \right],$$
where $\delta_{jk} = 1$ if $j = k$ and $\delta_{jk} = -1$ if $j \neq k$, with $j, k = 0, 1$; $\varepsilon_1$ is the dependence factor between the two BDTs when $D = 1$, and $\varepsilon_0$ is the dependence factor between the two BDTs when $D = 0$. It is verified that $0 \leq \varepsilon_1 \leq Min\{Se_1(1-Se_2), Se_2(1-Se_1)\}$ and $0 \leq \varepsilon_0 \leq Min\{Sp_1(1-Sp_2), Sp_2(1-Sp_1)\}$. If $\varepsilon_1 = \varepsilon_0 = 0$, then the two BDTs are conditionally independent given the disease status. This assumption is not realistic, so in practice it is verified that $\varepsilon_1 > 0$ and/or $\varepsilon_0 > 0$. Let $\boldsymbol{\pi} = (p_{11}, p_{10}, p_{01}, p_{00}, q_{11}, q_{10}, q_{01}, q_{00})^T$ be the vector of probabilities of the multinomial distribution, $p = \sum_{i,j=0}^{1} p_{ij}$ and $q = 1 - p = \sum_{i,j=0}^{1} q_{ij}$. The maximum likelihood estimators of $p_{jk}$ and $q_{jk}$ are:
$$\hat{p}_{jk} = \frac{s_{jk}}{n} \quad \text{and} \quad \hat{q}_{jk} = \frac{r_{jk}}{n}.$$
From Equation (1), the sensitivity and specificity of each BDT are written, in terms of the predictive values and of p, as
$$Se_i = \frac{\tau_i(\upsilon_i - q)}{p Y_i} \quad \text{and} \quad Sp_i = \frac{\upsilon_i(\tau_i - p)}{q Y_i},$$
where $q = 1 - p$ and $Y_i = \tau_i + \upsilon_i - 1$. Then
$$0 \leq \varepsilon_1 \leq Min\left\{ \frac{\tau_1(\upsilon_1-q)}{pY_1}\left(1 - \frac{\tau_2(\upsilon_2-q)}{pY_2}\right),\; \frac{\tau_2(\upsilon_2-q)}{pY_2}\left(1 - \frac{\tau_1(\upsilon_1-q)}{pY_1}\right) \right\}$$
and
$$0 \leq \varepsilon_0 \leq Min\left\{ \frac{\upsilon_1(\tau_1-p)}{qY_1}\left(1 - \frac{\upsilon_2(\tau_2-p)}{qY_2}\right),\; \frac{\upsilon_2(\tau_2-p)}{qY_2}\left(1 - \frac{\upsilon_1(\tau_1-p)}{qY_1}\right) \right\}.$$
In terms of predictive values, Equations (2) and (3) are written as
$$p_{jk} = p\left[ \frac{\tau_1^{\,j}(\upsilon_1-q)^{j}(\tau_1-p)^{1-j}(1-\upsilon_1)^{1-j}}{p^{\,j}\,p^{\,1-j}\,Y_1^{\,j}\,Y_1^{\,1-j}} \times \frac{\tau_2^{\,k}(\upsilon_2-q)^{k}(\tau_2-p)^{1-k}(1-\upsilon_2)^{1-k}}{p^{\,k}\,p^{\,1-k}\,Y_2^{\,k}\,Y_2^{\,1-k}} + \delta_{jk}\varepsilon_1 \right]$$
and
$$q_{jk} = q\left[ \frac{(1-\tau_1)^{j}(\upsilon_1-q)^{j}(\tau_1-p)^{1-j}\upsilon_1^{\,1-j}}{q^{\,j}\,q^{\,1-j}\,Y_1^{\,j}\,Y_1^{\,1-j}} \times \frac{(1-\tau_2)^{k}(\upsilon_2-q)^{k}(\tau_2-p)^{1-k}\upsilon_2^{\,1-k}}{q^{\,k}\,q^{\,1-k}\,Y_2^{\,k}\,Y_2^{\,1-k}} + \delta_{jk}\varepsilon_0 \right].$$
The estimators of the sensitivities and specificities are
$$\hat{Se}_1 = \frac{s_{11}+s_{10}}{s}, \quad \hat{Se}_2 = \frac{s_{11}+s_{01}}{s}, \quad \hat{Sp}_1 = \frac{r_{01}+r_{00}}{r} \quad \text{and} \quad \hat{Sp}_2 = \frac{r_{10}+r_{00}}{r},$$
and applying the delta method, the estimators of the variances–covariances of $\hat{Se}_i$ and $\hat{Sp}_i$ are
$$\hat{Var}(\hat{Se}_i) = \frac{\hat{Se}_i(1-\hat{Se}_i)}{s}, \quad \hat{Var}(\hat{Sp}_i) = \frac{\hat{Sp}_i(1-\hat{Sp}_i)}{r}, \quad \hat{Cov}(\hat{Se}_1, \hat{Se}_2) = \frac{\hat{\varepsilon}_1}{s} \quad \text{and} \quad \hat{Cov}(\hat{Sp}_1, \hat{Sp}_2) = \frac{\hat{\varepsilon}_0}{r},$$
where $\hat{\varepsilon}_1 = \frac{n\hat{p}_{11}}{s} - \hat{Se}_1\hat{Se}_2 = \frac{s_{11}s_{00} - s_{10}s_{01}}{s^2}$ and $\hat{\varepsilon}_0 = \frac{n\hat{q}_{00}}{r} - \hat{Sp}_1\hat{Sp}_2 = \frac{r_{11}r_{00} - r_{10}r_{01}}{r^2}$, and the estimator of the disease prevalence is
$$\hat{p} = \frac{s}{n}.$$
Let $Q_i = pSe_i + q(1-Sp_i)$ be the probability that the result of the $i$th BDT is positive and let $\bar{Q}_i = 1 - Q_i = p(1-Se_i) + qSp_i$ be the probability that the result is negative. Their estimators are:
$$\hat{Q}_1 = \frac{s_{10}+s_{11}+r_{10}+r_{11}}{n} \quad \text{and} \quad \hat{Q}_2 = \frac{s_{01}+s_{11}+r_{01}+r_{11}}{n},$$
and
$$\hat{\bar{Q}}_1 = \frac{s_{01}+s_{00}+r_{01}+r_{00}}{n} \quad \text{and} \quad \hat{\bar{Q}}_2 = \frac{s_{10}+s_{00}+r_{10}+r_{00}}{n},$$
respectively. With respect to the predictive values, their estimators are:
$$\hat{\tau}_1 = \frac{s_{11}+s_{10}}{s_{11}+s_{10}+r_{11}+r_{10}}, \quad \hat{\tau}_2 = \frac{s_{11}+s_{01}}{s_{11}+s_{01}+r_{11}+r_{01}}, \quad \hat{\upsilon}_1 = \frac{r_{01}+r_{00}}{s_{01}+s_{00}+r_{01}+r_{00}}$$
and
$$\hat{\upsilon}_2 = \frac{r_{10}+r_{00}}{s_{10}+s_{00}+r_{10}+r_{00}}.$$
Applying the delta method, the estimators of the variances–covariances of $\hat{\tau}_i$ and $\hat{\upsilon}_i$ are [8]:
$$\begin{aligned}
\hat{Var}(\hat{\tau}_1) &= \frac{(s_{10}+s_{11})(r_{10}+r_{11})}{(s_{10}+s_{11}+r_{10}+r_{11})^3}, \quad
\hat{Var}(\hat{\upsilon}_1) = \frac{(s_{00}+s_{01})(r_{00}+r_{01})}{(s_{00}+s_{01}+r_{00}+r_{01})^3},\\
\hat{Var}(\hat{\tau}_2) &= \frac{(s_{01}+s_{11})(r_{01}+r_{11})}{(s_{01}+s_{11}+r_{01}+r_{11})^3}, \quad
\hat{Var}(\hat{\upsilon}_2) = \frac{(s_{00}+s_{10})(r_{00}+r_{10})}{(s_{00}+s_{10}+r_{00}+r_{10})^3},\\
\hat{Cov}(\hat{\tau}_1, \hat{\tau}_2) &= \frac{s_{11}r_{10}r_{01} + r_{11}\left[s_{01}(s_{10}+s_{11}) + s_{11}(s_{11}+s_{10}+r_{11}+r_{10}+r_{01})\right]}{(s_{01}+s_{11}+r_{01}+r_{11})^2(s_{10}+s_{11}+r_{10}+r_{11})^2}
\end{aligned}$$
and
$$\hat{Cov}(\hat{\upsilon}_1, \hat{\upsilon}_2) = \frac{s_{10}(s_{00}+s_{01})r_{00} + s_{00}\left[r_{00}^2 + r_{01}r_{10} + r_{00}(s_{00}+s_{01}+r_{10}+r_{01})\right]}{(s_{00}+s_{01}+r_{00}+r_{01})^2(s_{00}+s_{10}+r_{00}+r_{10})^2}.$$
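The estimators and variances–covariances above translate directly into code. The following R sketch (a hypothetical helper we call pv_estimates, not the authors' "cicpvbdt" program) computes them from the eight observed frequencies of Table 1.

```r
# Estimated predictive values and their variances-covariances under a paired design
pv_estimates <- function(s11, s10, s01, s00, r11, r10, r01, r00) {
  tau1 <- (s11 + s10) / (s11 + s10 + r11 + r10)
  tau2 <- (s11 + s01) / (s11 + s01 + r11 + r01)
  ups1 <- (r01 + r00) / (s01 + s00 + r01 + r00)
  ups2 <- (r10 + r00) / (s10 + s00 + r10 + r00)
  var_tau1 <- (s10 + s11) * (r10 + r11) / (s10 + s11 + r10 + r11)^3
  var_tau2 <- (s01 + s11) * (r01 + r11) / (s01 + s11 + r01 + r11)^3
  var_ups1 <- (s00 + s01) * (r00 + r01) / (s00 + s01 + r00 + r01)^3
  var_ups2 <- (s00 + s10) * (r00 + r10) / (s00 + s10 + r00 + r10)^3
  cov_tau <- (s11 * r10 * r01 +
              r11 * (s01 * (s10 + s11) + s11 * (s11 + s10 + r11 + r10 + r01))) /
             ((s01 + s11 + r01 + r11)^2 * (s10 + s11 + r10 + r11)^2)
  cov_ups <- (s10 * (s00 + s01) * r00 +
              s00 * (r00^2 + r01 * r10 + r00 * (s00 + s01 + r10 + r01))) /
             ((s00 + s01 + r00 + r01)^2 * (s00 + s10 + r00 + r10)^2)
  list(tau = c(tau1, tau2), ups = c(ups1, ups2),
       var_tau = c(var_tau1, var_tau2), var_ups = c(var_ups1, var_ups2),
       cov_tau = cov_tau, cov_ups = cov_ups)
}
```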
When two parameters are compared in Statistics, the interest lies in the difference or the ratio between them. Then, we compare the positive (negative) predictive values of two BDTs through CIs for the difference, i.e., $\delta_\tau = \tau_1 - \tau_2$ and $\delta_\upsilon = \upsilon_1 - \upsilon_2$, and for the ratio, i.e., $\rho_\tau = \tau_1/\tau_2$ and $\rho_\upsilon = \upsilon_1/\upsilon_2$.

2.1. CIs for the Difference

Three CIs for each difference δ τ and δ υ are studied: Wald CI, bias-corrected bootstrap CI, and Monte Carlo Bayesian CI.

2.1.1. Wald CI

Wang et al. [4] have studied the comparison of the predictive values of two BDTs through the weighted least squares method. The test statistics for $H_0: \tau_1 = \tau_2$ and $H_0: \upsilon_1 = \upsilon_2$ are
$$z_\tau = \frac{\hat{\delta}_\tau}{\sqrt{\hat{Var}(\hat{\delta}_\tau)}} \quad \text{and} \quad z_\upsilon = \frac{\hat{\delta}_\upsilon}{\sqrt{\hat{Var}(\hat{\delta}_\upsilon)}},$$
respectively. Both test statistics asymptotically follow a standard normal distribution, where
$$\hat{Var}(\hat{\delta}_\tau) = \hat{Var}(\hat{\tau}_1) + \hat{Var}(\hat{\tau}_2) - 2\hat{Cov}(\hat{\tau}_1, \hat{\tau}_2)$$
and
$$\hat{Var}(\hat{\delta}_\upsilon) = \hat{Var}(\hat{\upsilon}_1) + \hat{Var}(\hat{\upsilon}_2) - 2\hat{Cov}(\hat{\upsilon}_1, \hat{\upsilon}_2)$$
are the estimators of the variances of δ ^ τ and δ ^ υ , respectively. Inverting the two test statistics, the Wald CIs for δ τ and for δ υ are
$$\delta_\tau \in \hat{\delta}_\tau \pm z_{1-\alpha/2}\sqrt{\hat{Var}(\hat{\delta}_\tau)} \quad \text{and} \quad \delta_\upsilon \in \hat{\delta}_\upsilon \pm z_{1-\alpha/2}\sqrt{\hat{Var}(\hat{\delta}_\upsilon)},$$
respectively, where $z_{1-\alpha/2}$ is the $100(1-\alpha/2)$th percentile of the standard normal distribution.
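A minimal R sketch of these Wald CIs, built on the hypothetical pv_estimates helper introduced above, could look as follows.

```r
# Wald CIs for delta_tau = tau1 - tau2 and delta_ups = ups1 - ups2
wald_ci_diff <- function(est, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)
  d_tau  <- est$tau[1] - est$tau[2]
  d_ups  <- est$ups[1] - est$ups[2]
  se_tau <- sqrt(est$var_tau[1] + est$var_tau[2] - 2 * est$cov_tau)
  se_ups <- sqrt(est$var_ups[1] + est$var_ups[2] - 2 * est$cov_ups)
  rbind(delta_tau = c(d_tau - z * se_tau, d_tau + z * se_tau),
        delta_ups = c(d_ups - z * se_ups, d_ups + z * se_ups))
}
```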

2.1.2. Bias-Corrected Bootstrap CI

The bias-corrected bootstrap CI is calculated from B random samples with replacement generated from the sample of n individuals. In each of the B samples, we calculate $\hat{\tau}_{1b}$, $\hat{\tau}_{2b}$, $\hat{\upsilon}_{1b}$, $\hat{\upsilon}_{2b}$, $\hat{\delta}_{\tau b} = \hat{\tau}_{1b} - \hat{\tau}_{2b}$ and $\hat{\delta}_{\upsilon b} = \hat{\upsilon}_{1b} - \hat{\upsilon}_{2b}$, with $b = 1, \ldots, B$. Then, the average differences are calculated as $\bar{\hat{\delta}}_{\tau B} = \frac{1}{B}\sum_{b=1}^{B}\hat{\delta}_{\tau b}$ and $\bar{\hat{\delta}}_{\upsilon B} = \frac{1}{B}\sum_{b=1}^{B}\hat{\delta}_{\upsilon b}$. Assuming that the bootstrap statistics $\bar{\hat{\delta}}_{\tau B}$ and $\bar{\hat{\delta}}_{\upsilon B}$ can be transformed to a normal distribution, the bias-corrected bootstrap CIs [11] are calculated in the following way. Let $A_\tau = \#(\hat{\delta}_{\tau b} < \hat{\delta}_\tau)$ be the number of bootstrap estimators $\hat{\delta}_{\tau b}$ that are lower than the maximum likelihood estimator (MLE) $\hat{\delta}_\tau$, and let $A_\upsilon = \#(\hat{\delta}_{\upsilon b} < \hat{\delta}_\upsilon)$ be the number of bootstrap estimators $\hat{\delta}_{\upsilon b}$ that are lower than the MLE $\hat{\delta}_\upsilon$. Let $\hat{z}_\tau = \Phi^{-1}(A_\tau/B)$ and $\hat{z}_\upsilon = \Phi^{-1}(A_\upsilon/B)$, where $\Phi^{-1}(\cdot)$ is the inverse of the standard normal cumulative distribution function. Let $\alpha_{1\tau} = \Phi(2\hat{z}_\tau - z_{1-\alpha/2})$, $\alpha_{2\tau} = \Phi(2\hat{z}_\tau + z_{1-\alpha/2})$, $\alpha_{1\upsilon} = \Phi(2\hat{z}_\upsilon - z_{1-\alpha/2})$, and $\alpha_{2\upsilon} = \Phi(2\hat{z}_\upsilon + z_{1-\alpha/2})$; then, the bias-corrected bootstrap CI for $\delta_\tau$ is
$$\delta_\tau \in \left(\hat{\delta}_{\tau B}(\alpha_{1\tau}),\; \hat{\delta}_{\tau B}(\alpha_{2\tau})\right)$$
and the bias-corrected bootstrap CI for δ υ is
$$\delta_\upsilon \in \left(\hat{\delta}_{\upsilon B}(\alpha_{1\upsilon}),\; \hat{\delta}_{\upsilon B}(\alpha_{2\upsilon})\right),$$
where $\hat{\delta}_{\tau B}(\gamma)$ is the $\gamma$th quantile of the distribution of the B bootstrap estimations of $\delta_\tau$, and $\hat{\delta}_{\upsilon B}(\gamma)$ is the $\gamma$th quantile of the distribution of the B bootstrap estimations of $\delta_\upsilon$.
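The following R sketch illustrates the bias-corrected bootstrap CI for $\delta_\tau$ under the assumptions stated above (B = 2000 resamples; resampling individuals with replacement is implemented here as a multinomial resampling of the eight cells; function names are ours, not the authors' program).

```r
bcb_ci_diff_tau <- function(s11, s10, s01, s00, r11, r10, r01, r00,
                            B = 2000, conf = 0.95) {
  freqs <- c(s11, s10, s01, s00, r11, r10, r01, r00)
  n <- sum(freqs)
  est   <- pv_estimates(s11, s10, s01, s00, r11, r10, r01, r00)
  d_hat <- est$tau[1] - est$tau[2]                     # MLE of delta_tau
  d_boot <- replicate(B, {
    f <- as.vector(rmultinom(1, n, freqs / n))         # one resample of the 8 cells
    e <- pv_estimates(f[1], f[2], f[3], f[4], f[5], f[6], f[7], f[8])
    e$tau[1] - e$tau[2]
  })
  z0 <- qnorm(mean(d_boot < d_hat, na.rm = TRUE))      # bias-correction constant
  z  <- qnorm(1 - (1 - conf) / 2)
  a1 <- pnorm(2 * z0 - z)
  a2 <- pnorm(2 * z0 + z)
  quantile(d_boot, probs = c(a1, a2), na.rm = TRUE)    # bias-corrected quantiles
}
```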

2.1.3. Monte Carlo Bayesian CI

The number of diseased individuals ($s$) follows a binomial distribution, i.e., $s \sim B(n, p)$. Conditioning on $D = 1$, it is verified that
$$s_{11} + s_{10} \sim B(s, Se_1) \quad \text{and} \quad s_{11} + s_{01} \sim B(s, Se_2).$$
The number of non-diseased individuals ($r$) also follows a binomial distribution, i.e., $r \sim B(n, q)$. Conditioning on $D = 0$, it is verified that
$$r_{01} + r_{00} \sim B(r, Sp_1) \quad \text{and} \quad r_{10} + r_{00} \sim B(r, Sp_2).$$
On the other hand, the estimators $\hat{Se}_i$, $\hat{Sp}_i$, and $\hat{p}$ (Equations (9) and (10)) are estimators of binomial proportions. Therefore, for these estimators, we propose conjugate beta prior distributions, i.e.,
$$\hat{Se}_i \sim Beta(\alpha_{Se_i}, \beta_{Se_i}), \quad \hat{Sp}_i \sim Beta(\alpha_{Sp_i}, \beta_{Sp_i}) \quad \text{and} \quad \hat{p} \sim Beta(\alpha_p, \beta_p).$$
Let $\mathbf{n} = (s_{11}, s_{10}, s_{01}, s, r_{11}, r_{10}, r_{01}, n - s)$ be the vector of observed frequencies, with $s_{00} = s - s_{11} - s_{10} - s_{01}$, $r = n - s$, and $r_{00} = n - s - r_{11} - r_{10} - r_{01}$. Then, the posterior distributions for the estimators of $Se_i$, $Sp_i$, and $p$ are:
$$\begin{aligned}
\hat{Se}_1 \mid \mathbf{n} &\sim Beta(s_{11} + s_{10} + \alpha_{Se_1},\; s - s_{11} - s_{10} + \beta_{Se_1}), &
\hat{Sp}_1 \mid \mathbf{n} &\sim Beta(r_{01} + r_{00} + \alpha_{Sp_1},\; n - s - r_{01} - r_{00} + \beta_{Sp_1}),\\
\hat{Se}_2 \mid \mathbf{n} &\sim Beta(s_{11} + s_{01} + \alpha_{Se_2},\; s - s_{11} - s_{01} + \beta_{Se_2}), &
\hat{Sp}_2 \mid \mathbf{n} &\sim Beta(r_{10} + r_{00} + \alpha_{Sp_2},\; n - s - r_{10} - r_{00} + \beta_{Sp_2}),\\
\hat{p} \mid \mathbf{n} &\sim Beta(s + \alpha_p,\; n - s + \beta_p).
\end{aligned}$$
The posterior distribution of the positive (negative) predictive value of each BDT, and of $\delta_\tau$ and $\delta_\upsilon$, can be approximated by applying the Monte Carlo method, a computational method that consists of generating M values from the posterior distributions (12). In the mth iteration, the values generated for $Se_i^{(m)}$, $Sp_i^{(m)}$, and $p^{(m)}$ are plugged into the equations
$$\tau_i^{(m)} = \frac{p^{(m)} \times Se_i^{(m)}}{p^{(m)} \times Se_i^{(m)} + (1 - p^{(m)}) \times (1 - Sp_i^{(m)})}$$
and
$$\upsilon_i^{(m)} = \frac{(1 - p^{(m)}) \times Sp_i^{(m)}}{p^{(m)} \times (1 - Se_i^{(m)}) + (1 - p^{(m)}) \times Sp_i^{(m)}},$$
with $i = 1, 2$, and then $\delta_\tau^{(m)} = \tau_1^{(m)} - \tau_2^{(m)}$ and $\delta_\upsilon^{(m)} = \upsilon_1^{(m)} - \upsilon_2^{(m)}$ are calculated. As estimators of $\delta_\tau$ and $\delta_\upsilon$, we calculate the averages of the M estimations of the differences, i.e., $\bar{\hat{\delta}}_{\tau Bay} = \frac{1}{M}\sum_{m=1}^{M}\delta_\tau^{(m)}$ and $\bar{\hat{\delta}}_{\upsilon Bay} = \frac{1}{M}\sum_{m=1}^{M}\delta_\upsilon^{(m)}$. Finally, based on the M values of $\delta_\tau^{(m)}$ and of $\delta_\upsilon^{(m)}$, we propose CIs based on quantiles. Therefore, the $100(1-\alpha)\%$ CI for $\delta_\tau$ is
$$\delta_\tau \in \left(q_{\tau Bay}(\alpha/2),\; q_{\tau Bay}(1 - \alpha/2)\right)$$
and the $100(1-\alpha)\%$ CI for $\delta_\upsilon$ is
$$\delta_\upsilon \in \left(q_{\upsilon Bay}(\alpha/2),\; q_{\upsilon Bay}(1 - \alpha/2)\right),$$
where $q_{\tau Bay}(\gamma)$ ($q_{\upsilon Bay}(\gamma)$) is the $\gamma$th quantile of the distribution of the M values $\delta_\tau^{(m)}$ ($\delta_\upsilon^{(m)}$).
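A minimal R sketch of the Monte Carlo Bayesian CIs for $\delta_\tau$ and $\delta_\upsilon$, assuming the non-informative Beta(1, 1) priors used later in the simulations and M = 10,000 draws (the function name is ours), is shown below.

```r
mcb_ci_diff <- function(s11, s10, s01, s00, r11, r10, r01, r00,
                        M = 10000, conf = 0.95) {
  s <- s11 + s10 + s01 + s00                 # diseased
  r <- r11 + r10 + r01 + r00                 # non-diseased
  # Posterior draws with Beta(1, 1) priors (an assumption of this sketch)
  se1 <- rbeta(M, s11 + s10 + 1, s - s11 - s10 + 1)
  se2 <- rbeta(M, s11 + s01 + 1, s - s11 - s01 + 1)
  sp1 <- rbeta(M, r01 + r00 + 1, r - r01 - r00 + 1)
  sp2 <- rbeta(M, r10 + r00 + 1, r - r10 - r00 + 1)
  p   <- rbeta(M, s + 1, r + 1)
  # Plug the draws into Bayes' theorem (Equation (1))
  tau1 <- p * se1 / (p * se1 + (1 - p) * (1 - sp1))
  tau2 <- p * se2 / (p * se2 + (1 - p) * (1 - sp2))
  ups1 <- (1 - p) * sp1 / (p * (1 - se1) + (1 - p) * sp1)
  ups2 <- (1 - p) * sp2 / (p * (1 - se2) + (1 - p) * sp2)
  alpha <- 1 - conf
  rbind(delta_tau = quantile(tau1 - tau2, c(alpha / 2, 1 - alpha / 2)),
        delta_ups = quantile(ups1 - ups2, c(alpha / 2, 1 - alpha / 2)))
}
```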

2.2. CIs for the Ratio

Five CIs for each ratio ρ τ = τ 1 / τ 2 and ρ υ = υ 1 / υ 2 are studied: Wald CI, logarithmic CI, Fieller CI, bias-corrected bootstrap CI, and Monte Carlo Bayesian CI.

2.2.1. Wald CI

Moskowitz and Pepe [9] have studied a Wald-type confidence interval for the ratio of the two positive (negative) predictive values. The $100(1-\alpha)\%$ Wald CI for $\rho_\tau$ is
$$\rho_\tau \in \hat{\rho}_\tau \pm z_{1-\alpha/2}\sqrt{\hat{Var}(\hat{\rho}_\tau)}$$
and the $100(1-\alpha)\%$ Wald CI for $\rho_\upsilon$ is
$$\rho_\upsilon \in \hat{\rho}_\upsilon \pm z_{1-\alpha/2}\sqrt{\hat{Var}(\hat{\rho}_\upsilon)},$$
where $\hat{Var}(\hat{\rho}_\tau)$ and $\hat{Var}(\hat{\rho}_\upsilon)$, obtained applying the delta method, are
$$\hat{Var}(\hat{\rho}_\tau) = \frac{\hat{\tau}_2^2\hat{Var}(\hat{\tau}_1) + \hat{\tau}_1^2\hat{Var}(\hat{\tau}_2) - 2\hat{\tau}_1\hat{\tau}_2\hat{Cov}(\hat{\tau}_1, \hat{\tau}_2)}{\hat{\tau}_2^4}$$
and
$$\hat{Var}(\hat{\rho}_\upsilon) = \frac{\hat{\upsilon}_2^2\hat{Var}(\hat{\upsilon}_1) + \hat{\upsilon}_1^2\hat{Var}(\hat{\upsilon}_2) - 2\hat{\upsilon}_1\hat{\upsilon}_2\hat{Cov}(\hat{\upsilon}_1, \hat{\upsilon}_2)}{\hat{\upsilon}_2^4}.$$
These CIs are for ρ τ = τ 1 / τ 2 and ρ υ = υ 1 / υ 2 . If we want to calculate the CI for the ratio τ 2 / τ 1 = 1 / ρ τ and for the ratio υ 2 / υ 1 = 1 / ρ υ , then we have to divide the CI for ρ τ by ρ ^ τ 2 and the CI for ρ υ by ρ ^ υ 2 . For example, if ( L τ , U τ ) is the Wald CI for τ 1 / τ 2 , then ( L τ / ρ ^ τ 2 , U τ / ρ ^ τ 2 ) is the Wald CI for τ 2 / τ 1 .

2.2.2. Logarithmic CI

Assuming the asymptotic normality of the natural logarithm of $\hat{\rho}_\tau$ and of $\hat{\rho}_\upsilon$, i.e., $\ln(\hat{\rho}_\tau) \sim N(\ln(\rho_\tau), Var[\ln(\hat{\rho}_\tau)])$ and $\ln(\hat{\rho}_\upsilon) \sim N(\ln(\rho_\upsilon), Var[\ln(\hat{\rho}_\upsilon)])$ when n is large, an asymptotic CI for $\ln(\rho_\tau)$ is
$$\ln(\hat{\rho}_\tau) \pm z_{1-\alpha/2}\sqrt{\hat{Var}[\ln(\hat{\rho}_\tau)]}$$
and an asymptotic CI for $\ln(\rho_\upsilon)$ is
$$\ln(\hat{\rho}_\upsilon) \pm z_{1-\alpha/2}\sqrt{\hat{Var}[\ln(\hat{\rho}_\upsilon)]}.$$
Taking exponentials in each of the previous expressions, the logarithmic CI for $\rho_\tau$ is
$$\rho_\tau \in \hat{\rho}_\tau \times \exp\left\{\pm z_{1-\alpha/2}\sqrt{\hat{Var}[\ln(\hat{\rho}_\tau)]}\right\}$$
and the logarithmic CI for $\rho_\upsilon$ is
$$\rho_\upsilon \in \hat{\rho}_\upsilon \times \exp\left\{\pm z_{1-\alpha/2}\sqrt{\hat{Var}[\ln(\hat{\rho}_\upsilon)]}\right\},$$
where $\hat{Var}[\ln(\hat{\rho}_\tau)]$ and $\hat{Var}[\ln(\hat{\rho}_\upsilon)]$, obtained applying the delta method, are:
$$\hat{Var}[\ln(\hat{\rho}_\tau)] = \frac{\hat{Var}(\hat{\tau}_1)}{\hat{\tau}_1^2} + \frac{\hat{Var}(\hat{\tau}_2)}{\hat{\tau}_2^2} - \frac{2\hat{Cov}(\hat{\tau}_1, \hat{\tau}_2)}{\hat{\tau}_1\hat{\tau}_2}$$
and
$$\hat{Var}[\ln(\hat{\rho}_\upsilon)] = \frac{\hat{Var}(\hat{\upsilon}_1)}{\hat{\upsilon}_1^2} + \frac{\hat{Var}(\hat{\upsilon}_2)}{\hat{\upsilon}_2^2} - \frac{2\hat{Cov}(\hat{\upsilon}_1, \hat{\upsilon}_2)}{\hat{\upsilon}_1\hat{\upsilon}_2}.$$
If we want to calculate the logarithmic CI for the ratio τ 2 / τ 1 , then the CI is obtained by calculating the inverse of each boundary of CI for ρ τ = τ 1 / τ 2 . In a similar way, the CI for υ 2 / υ 1 is calculated.
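The Wald and logarithmic CIs for the ratio can be sketched in R as follows (again using the hypothetical pv_estimates helper; shown only for $\rho_\tau$, the case of $\rho_\upsilon$ is analogous).

```r
# Wald and logarithmic CIs for rho_tau = tau1/tau2
ratio_ci_tau <- function(est, conf = 0.95) {
  z  <- qnorm(1 - (1 - conf) / 2)
  t1 <- est$tau[1]; t2 <- est$tau[2]
  v1 <- est$var_tau[1]; v2 <- est$var_tau[2]; cv <- est$cov_tau
  rho <- t1 / t2
  var_rho  <- (t2^2 * v1 + t1^2 * v2 - 2 * t1 * t2 * cv) / t2^4   # delta method
  var_lrho <- v1 / t1^2 + v2 / t2^2 - 2 * cv / (t1 * t2)          # delta method, log scale
  rbind(Wald = rho + c(-1, 1) * z * sqrt(var_rho),
        Log  = rho * exp(c(-1, 1) * z * sqrt(var_lrho)))
}
```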

2.2.3. Fieller CI

The method of Fieller [12] is a classic method used to estimate the ratio of two parameters. In order to apply this method, it is necessary to assume that $\hat{\boldsymbol{\tau}} \sim N(\boldsymbol{\tau}, \boldsymbol{\Sigma}_\tau)$ and that $\hat{\boldsymbol{\upsilon}} \sim N(\boldsymbol{\upsilon}, \boldsymbol{\Sigma}_\upsilon)$; i.e., it is necessary to assume that the estimators of the positive (negative) predictive values follow a bivariate normal distribution, where $\hat{\boldsymbol{\tau}} = (\hat{\tau}_1, \hat{\tau}_2)^T$, $\hat{\boldsymbol{\upsilon}} = (\hat{\upsilon}_1, \hat{\upsilon}_2)^T$,
$$\boldsymbol{\Sigma}_\tau = \begin{pmatrix} Var(\hat{\tau}_1) & Cov(\hat{\tau}_1, \hat{\tau}_2) \\ Cov(\hat{\tau}_1, \hat{\tau}_2) & Var(\hat{\tau}_2) \end{pmatrix} = \begin{pmatrix} \sigma_{\tau 11} & \sigma_{\tau 12} \\ \sigma_{\tau 21} & \sigma_{\tau 22} \end{pmatrix}$$
and
$$\boldsymbol{\Sigma}_\upsilon = \begin{pmatrix} Var(\hat{\upsilon}_1) & Cov(\hat{\upsilon}_1, \hat{\upsilon}_2) \\ Cov(\hat{\upsilon}_1, \hat{\upsilon}_2) & Var(\hat{\upsilon}_2) \end{pmatrix} = \begin{pmatrix} \sigma_{\upsilon 11} & \sigma_{\upsilon 12} \\ \sigma_{\upsilon 21} & \sigma_{\upsilon 22} \end{pmatrix}.$$
Applying the method of Fieller, it is verified that $\hat{\tau}_1 - \rho_\tau\hat{\tau}_2 \sim N(0, \sigma_{\tau 11} + \rho_\tau^2\sigma_{\tau 22} - 2\rho_\tau\sigma_{\tau 12})$ and that $\hat{\upsilon}_1 - \rho_\upsilon\hat{\upsilon}_2 \sim N(0, \sigma_{\upsilon 11} + \rho_\upsilon^2\sigma_{\upsilon 22} - 2\rho_\upsilon\sigma_{\upsilon 12})$ when n is large. The Fieller CI for $\rho_\tau$ is obtained by solving the inequality
$$\frac{(\hat{\tau}_1 - \rho_\tau\hat{\tau}_2)^2}{\hat{\sigma}_{\tau 11} + \rho_\tau^2\hat{\sigma}_{\tau 22} - 2\rho_\tau\hat{\sigma}_{\tau 12}} < z_{1-\alpha/2}^2,$$
and the Fieller CI for ρ υ is obtained by solving the inequality
$$\frac{(\hat{\upsilon}_1 - \rho_\upsilon\hat{\upsilon}_2)^2}{\hat{\sigma}_{\upsilon 11} + \rho_\upsilon^2\hat{\sigma}_{\upsilon 22} - 2\rho_\upsilon\hat{\sigma}_{\upsilon 12}} < z_{1-\alpha/2}^2.$$
Finally, the Fieller CI for ρ τ is
$$\rho_\tau \in \frac{\hat{\beta}_{\tau 12} \pm \sqrt{\hat{\beta}_{\tau 12}^2 - \hat{\beta}_{\tau 11}\hat{\beta}_{\tau 22}}}{\hat{\beta}_{\tau 22}},$$
where $\hat{\beta}_{\tau ij} = \hat{\tau}_i\hat{\tau}_j - \hat{\sigma}_{\tau ij}z_{1-\alpha/2}^2$ with $i, j = 1, 2$, and it is verified that $\hat{\beta}_{\tau 12} = \hat{\beta}_{\tau 21}$. This CI is valid when $\hat{\beta}_{\tau 12}^2 > \hat{\beta}_{\tau 11}\hat{\beta}_{\tau 22}$ and $\hat{\beta}_{\tau 22} > 0$. Similarly, the Fieller CI for $\rho_\upsilon$ is
$$\rho_\upsilon \in \frac{\hat{\beta}_{\upsilon 12} \pm \sqrt{\hat{\beta}_{\upsilon 12}^2 - \hat{\beta}_{\upsilon 11}\hat{\beta}_{\upsilon 22}}}{\hat{\beta}_{\upsilon 22}},$$
where $\hat{\beta}_{\upsilon ij} = \hat{\upsilon}_i\hat{\upsilon}_j - \hat{\sigma}_{\upsilon ij}z_{1-\alpha/2}^2$ with $i, j = 1, 2$, and $\hat{\beta}_{\upsilon 12} = \hat{\beta}_{\upsilon 21}$. This CI is valid when $\hat{\beta}_{\upsilon 12}^2 > \hat{\beta}_{\upsilon 11}\hat{\beta}_{\upsilon 22}$ and $\hat{\beta}_{\upsilon 22} > 0$. The Fieller CI for $\tau_2/\tau_1$ ($\upsilon_2/\upsilon_1$) is calculated by inverting the limits of the CI for $\rho_\tau$ ($\rho_\upsilon$).
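A short R sketch of the Fieller CI for $\rho_\tau$, solving the quadratic described above (hypothetical helper names; it returns NA when the validity conditions fail), is given below.

```r
fieller_ci_tau <- function(est, conf = 0.95) {
  z2 <- qnorm(1 - (1 - conf) / 2)^2
  t1 <- est$tau[1]; t2 <- est$tau[2]
  b11 <- t1 * t1 - z2 * est$var_tau[1]
  b22 <- t2 * t2 - z2 * est$var_tau[2]
  b12 <- t1 * t2 - z2 * est$cov_tau
  disc <- b12^2 - b11 * b22
  if (disc <= 0 || b22 <= 0) return(c(NA, NA))   # validity conditions
  (b12 + c(-1, 1) * sqrt(disc)) / b22
}
```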

2.2.4. Bias-Corrected Bootstrap CI

The bias-corrected bootstrap CI for $\rho_\tau$ ($\rho_\upsilon$) is obtained in a similar way to that for $\delta_\tau$ ($\delta_\upsilon$). In each sample with replacement, we calculate $\hat{\tau}_{1b}$, $\hat{\tau}_{2b}$, $\hat{\upsilon}_{1b}$, $\hat{\upsilon}_{2b}$, $\hat{\rho}_{\tau b} = \hat{\tau}_{1b}/\hat{\tau}_{2b}$, and $\hat{\rho}_{\upsilon b} = \hat{\upsilon}_{1b}/\hat{\upsilon}_{2b}$, with $b = 1, \ldots, B$. Then, based on the B ratios, we estimate the average ratios as $\bar{\hat{\rho}}_{\tau B} = \frac{1}{B}\sum_{b=1}^{B}\hat{\rho}_{\tau b}$ and $\bar{\hat{\rho}}_{\upsilon B} = \frac{1}{B}\sum_{b=1}^{B}\hat{\rho}_{\upsilon b}$. Assuming that these statistics can be transformed to a normal distribution, the bias-corrected bootstrap CI [11] for $\rho_\tau$ ($\rho_\upsilon$) is calculated in a similar way to the bias-corrected bootstrap CI for $\delta_\tau$ ($\delta_\upsilon$), considering that $A_\tau = \#(\hat{\rho}_{\tau b} < \hat{\rho}_\tau)$ and that $A_\upsilon = \#(\hat{\rho}_{\upsilon b} < \hat{\rho}_\upsilon)$. Finally, the bias-corrected bootstrap CI for $\rho_\tau$ is
$$\rho_\tau \in \left(\hat{\rho}_{\tau B}(\alpha_1),\; \hat{\rho}_{\tau B}(\alpha_2)\right),$$
where $\hat{\rho}_{\tau B}(\gamma)$ is the $\gamma$th quantile of the distribution of the B bootstrap estimations of $\rho_\tau$. Similarly, the bias-corrected bootstrap CI for $\rho_\upsilon$ is
$$\rho_\upsilon \in \left(\hat{\rho}_{\upsilon B}(\alpha_1),\; \hat{\rho}_{\upsilon B}(\alpha_2)\right),$$
where $\hat{\rho}_{\upsilon B}(\gamma)$ is the $\gamma$th quantile of the distribution of the B bootstrap estimations of $\rho_\upsilon$. The bias-corrected bootstrap CI for $\tau_2/\tau_1$ ($\upsilon_2/\upsilon_1$) is calculated by inverting the limits of the bias-corrected bootstrap CI for $\rho_\tau$ ($\rho_\upsilon$).

2.2.5. Monte Carlo Bayesian CI

The Monte Carlo Bayesian CI for $\rho_\tau$ ($\rho_\upsilon$) is obtained in a similar way to the Monte Carlo Bayesian CI for $\delta_\tau$ ($\delta_\upsilon$). Considering the same distributions (10) and (11), in the mth iteration, we calculate the ratios $\rho_\tau^{(m)} = \tau_1^{(m)}/\tau_2^{(m)}$ and $\rho_\upsilon^{(m)} = \upsilon_1^{(m)}/\upsilon_2^{(m)}$. As estimators, we calculate $\bar{\hat{\rho}}_{\tau Bay} = \frac{1}{M}\sum_{m=1}^{M}\rho_\tau^{(m)}$ and $\bar{\hat{\rho}}_{\upsilon Bay} = \frac{1}{M}\sum_{m=1}^{M}\rho_\upsilon^{(m)}$. Finally, we calculate the CIs based on quantiles, i.e.,
$$\rho_\tau \in \left(q_{\tau Bay}(\alpha/2),\; q_{\tau Bay}(1-\alpha/2)\right) \quad \text{and} \quad \rho_\upsilon \in \left(q_{\upsilon Bay}(\alpha/2),\; q_{\upsilon Bay}(1-\alpha/2)\right),$$
where $q_{\tau Bay}(\gamma)$ ($q_{\upsilon Bay}(\gamma)$) is the $\gamma$th quantile of the distribution of the M values $\rho_\tau^{(m)}$ ($\rho_\upsilon^{(m)}$). The Monte Carlo Bayesian CI for $\tau_2/\tau_1$ ($\upsilon_2/\upsilon_1$) is calculated by inverting the limits of the Monte Carlo Bayesian CI for $\rho_\tau$ ($\rho_\upsilon$).

3. Simulation Experiments

The CIs studied in Section 2 are approximate, and therefore, it is necessary to study their asymptotic behavior. For this purpose, Monte Carlo simulation experiments have been carried out to study the coverage probabilities and the average lengths of the CIs studied, considering a confidence level of 95%. These experiments have consisted of generating N = 10,000 random samples from multinomial distributions of size n = {50, 100, 200, 500, 1000}, whose probabilities have been calculated from Equations (7) and (8). The experiments have been designed from the predictive values of both BDTs. As values of the disease prevalence, we have taken p = {10%, 25%, 50%, 75%}, and as predictive values, we have taken $\tau_i, \upsilon_i = \{0.70, 0.75, \ldots, 0.90, 0.95\}$, which are realistic values in clinical practice. Next, using these values, we have calculated the maximum values of the dependence factors $\varepsilon_1$ and $\varepsilon_0$ (Equations (5) and (6)). As values of $\varepsilon_1$ and $\varepsilon_0$, we have taken intermediate and high values, i.e., 50% and 90% of the maximum value of $\varepsilon_i$, respectively. Finally, we have calculated the probabilities of the multinomial distributions using Equations (7) and (8). In each scenario, we have calculated all the CIs for each of the N random samples.
For the bias-corrected bootstrap CIs, for each one of the N random samples, B = 2000 samples with replacement have been generated, and from these B samples, the bias-corrected bootstrap CIs have been calculated.
For the Monte Carlo Bayesian CIs, we have considered a Beta(1, 1) distribution as the prior distribution for the estimators of the sensitivities, specificities, and prevalence. The Beta(1, 1) distribution is a non-informative distribution, which is flat for every possible value of the sensitivities, specificities, and prevalence; therefore, its impact on the posterior distributions is minimal. Moreover, for each of the N random samples, M = 10,000 random draws have been generated, and the Monte Carlo Bayesian CIs have been calculated from them.
In each of the N samples generated, all the CIs have been calculated. Furthermore, it has been checked whether each CI contains the value of the parameter (difference or ratio, depending on the type of CI). The coverage probability has been calculated by dividing the number of samples in which the CI contains the parameter by the total number of samples. For each CI, its length (upper limit minus the lower limit) has also been calculated, and finally, the average length of each CI has been calculated.
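The simulation scheme described above can be sketched in R as follows; the snippet estimates the coverage probability of the Wald CI for $\delta_\tau$ for one scenario, given the vector of the eight multinomial probabilities (our own illustrative code, reusing the hypothetical helpers from Section 2).

```r
coverage_wald_tau <- function(probs, delta_tau_true, n, N = 10000, conf = 0.95) {
  hits <- 0
  for (i in seq_len(N)) {
    f <- as.vector(rmultinom(1, n, probs))   # (s11, s10, s01, s00, r11, r10, r01, r00)
    est <- pv_estimates(f[1], f[2], f[3], f[4], f[5], f[6], f[7], f[8])
    ci  <- wald_ci_diff(est, conf)["delta_tau", ]
    if (!any(is.na(ci)) && ci[1] <= delta_tau_true && delta_tau_true <= ci[2])
      hits <- hits + 1
  }
  hits / N   # estimated coverage probability
}
```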

3.1. CIs for the Differences and Ratios of Positive Predictive Values

Table 2 shows some of the results obtained for the three CIs for the difference $\delta_\tau$ for four different scenarios and for intermediate values of $\varepsilon_1$ and $\varepsilon_0$. When the sample size is small (n = 50) or moderate (n = 100), the CIs for $\delta_\tau$ have a coverage probability close to 1. For the difference $\delta_\tau$, in very general terms, the Wald CI is the interval whose coverage probability fluctuates closest to 95%, especially when n is moderate or large (n ≥ 200). The bias-corrected bootstrap CI behaves very similarly to the Wald CI, especially when the sample size is large. In general terms, the Monte Carlo Bayesian CI has a coverage probability greater than that of the other two intervals, even when the coverage probability of the other two intervals fluctuates around 95%.
Regarding the CIs for the ratio $\rho_\tau$, Table 3 shows the results obtained for the same scenarios as in Table 2. When the sample size is small (n = 50), the five CIs for $\rho_\tau$ have a coverage probability close to 1. In general terms, there is no important difference between the coverage probabilities and the average lengths of the Wald, logarithmic, and Fieller CIs, especially when n ≥ 100. When the sample size is small, the logarithmic CI and the Fieller CI have an average length slightly greater than that of the Wald CI. The bias-corrected bootstrap CI behaves very similarly to the Wald, logarithmic, and Fieller CIs, especially when the sample size is large. In general terms, the Monte Carlo Bayesian CI has a coverage probability greater than that of the other four intervals.

3.2. CIs for the Differences and Ratios of Negative Predictive Values

Table 4 shows the results for the three CIs for the difference $\delta_\upsilon$ for the same scenarios as in Table 2 and Table 3. In general terms, when the sample size is small, the three CIs have a coverage probability close to 1, although in some situations the bias-corrected bootstrap CI may have a coverage probability well below 95%. In general terms, the Monte Carlo Bayesian CI has a coverage probability that almost always fluctuates above 95%. For the difference $\delta_\upsilon$, the Wald CI is the interval whose coverage probability fluctuates closest to 95%, especially when the sample size is moderate or large.
Regarding the CIs for the ratio $\rho_\upsilon$, Table 5 shows the results obtained for the same scenarios as in Table 4. When the sample size is small (n = 50), the five CIs for $\rho_\upsilon$ either fail or have a coverage probability close to 1. In general terms, the conclusions for the bias-corrected bootstrap CI and for the Monte Carlo Bayesian CI are very similar to those obtained for the corresponding intervals for the difference $\delta_\upsilon$. With respect to the other intervals, there is no important difference between the coverage probabilities and the average lengths of the Wald, logarithmic, and Fieller CIs, especially when n ≥ 100. When the sample size is small, the logarithmic CI and the Fieller CI have an average length slightly greater than that of the Wald CI.
Similar conclusions are obtained when ε 1 and ε 0 take high values. Therefore, the dependency factors ε 1 and ε 0 do not have an important effect on the behavior of the CIs for the difference (ratio) of the two negative predictive values.
As a conclusion, the following general rules of application can be given depending on the sample size, since the sample size is the only parameter controlled by the researcher: (a) apply the Wald CI for the difference of the positive (negative) predictive values whatever the sample size; (b) apply the Wald CI for the ratio of the two positive (negative) predictive values when the sample size is small, and apply the Wald CI, the logarithmic CI, the Fieller CI, or the bias-corrected bootstrap CI when the sample size is moderate or high.
Once some general rules of application have been established, which is better: a CI for the difference or a CI for the ratio? Simulation experiments have shown that the Wald CIs for the difference and the Wald CIs for the ratio have very similar coverage probabilities. Furthermore, the Wald CI for the difference has a coverage probability very similar to that of the Fieller CI when the sample size is large. The Wald CIs for the difference are obtained by inverting the Wald test statistics of the tests $H_0: \tau_1 - \tau_2 = 0$ and $H_0: \upsilon_1 - \upsilon_2 = 0$, and the Wald CIs for the ratio are obtained by inverting the Wald test statistics of the tests $H_0: \tau_1/\tau_2 = 1$ and $H_0: \upsilon_1/\upsilon_2 = 1$. Wang et al. [4] have shown through simulation experiments that the hypothesis tests $H_0: \tau_1 - \tau_2 = 0$ and $H_0: \upsilon_1 - \upsilon_2 = 0$ have better asymptotic behavior than the tests $H_0: \tau_1/\tau_2 = 1$ and $H_0: \upsilon_1/\upsilon_2 = 1$. Furthermore, Wang et al. recommend the difference-based approach as more straightforward and more understandable for researchers. Therefore, we recommend using a CI for the difference instead of a CI for the ratio.

4. Sample Size

The calculation of the sample size to compare parameters is of great interest in Statistics. Next, we propose a procedure to determine the sample size necessary to estimate the difference between the two positive (negative) predictive values with a precision $\phi_\tau$ ($\phi_\upsilon$) and a confidence of $100(1-\alpha)\%$. This procedure is based on the Wald CI for the difference $\delta_\tau$ ($\delta_\upsilon$), since in general terms this is the interval with the best asymptotic behavior. The procedure requires a pilot sample (or another study) from which to estimate the predictive values and their differences. If the pilot sample is not small and the Wald CI for the difference $\delta_\tau$ ($\delta_\upsilon$) contains the value 0, then the null hypothesis of equality of the predictive values is not rejected, and it does not make sense to calculate the sample size. However, if the sample is small, it may be necessary to calculate the sample size, since the Wald CI will be very wide and may contain the value 0 even if the predictive values are different. Let us consider that $\tau_1 \neq \tau_2$ ($\upsilon_1 \neq \upsilon_2$) and therefore $\delta_\tau \neq 0$ ($\delta_\upsilon \neq 0$), and let $\phi_\tau$ and $\phi_\upsilon$ be the precisions set by the researcher ($\phi$ must be small if the researcher wants high precision). Based on the asymptotic normality of $\hat{\delta}_\tau = \hat{\tau}_1 - \hat{\tau}_2$ and of $\hat{\delta}_\upsilon = \hat{\upsilon}_1 - \hat{\upsilon}_2$, it is verified that
$$\hat{\delta}_\tau \in \delta_\tau \pm z_{1-\alpha/2}\sqrt{Var(\hat{\delta}_\tau)} \quad \text{and} \quad \hat{\delta}_\upsilon \in \delta_\upsilon \pm z_{1-\alpha/2}\sqrt{Var(\hat{\delta}_\upsilon)},$$
i.e., the estimator $\hat{\delta}_\tau$ ($\hat{\delta}_\upsilon$) falls in this interval with a probability of $100(1-\alpha)\%$.
For positive predictive values, the method is as follows. Setting a precision ϕ τ , the sample size is calculated from the equation
$$\phi_\tau = z_{1-\alpha/2}\sqrt{Var(\hat{\delta}_\tau)},$$
where the variance is
$$Var(\hat{\delta}_\tau) = \frac{pqQ_2\tau_1\bar{\tau}_1 + pqQ_1\tau_2\bar{\tau}_2 - 2\left(pq^2\tau_1\tau_2\varepsilon_0 + p^2q\bar{\tau}_1\bar{\tau}_2\varepsilon_1 + \tau_1\tau_2\bar{\tau}_1\bar{\tau}_2Q_1Q_2\right)}{npqQ_1Q_2}.$$
The proof can be seen in Appendix A. This variance depends on the positive predictive values ($\tau_i$), on the disease prevalence ($p$), on the probability of a positive result of each test ($Q_i$), on the dependence factors ($\varepsilon_i$), and on the sample size $n$. Substituting in Equation (13) the parameters with their estimators and solving for $n$, the sample size to estimate the difference $\delta_\tau$ with precision $\phi_\tau$ and confidence $100(1-\alpha)\%$ is
$$n_\tau = \frac{z_{1-\alpha/2}^2}{\phi_\tau^2} \times \frac{\hat{p}\hat{q}\hat{\tau}_1\hat{\bar{\tau}}_1\hat{Q}_2 + \hat{p}\hat{q}\hat{\tau}_2\hat{\bar{\tau}}_2\hat{Q}_1 - 2\left(\hat{p}\hat{q}^2\hat{\tau}_1\hat{\tau}_2\hat{\varepsilon}_0 + \hat{p}^2\hat{q}\hat{\bar{\tau}}_1\hat{\bar{\tau}}_2\hat{\varepsilon}_1 + \hat{\tau}_1\hat{\tau}_2\hat{\bar{\tau}}_1\hat{\bar{\tau}}_2\hat{Q}_1\hat{Q}_2\right)}{\hat{p}\hat{q}\hat{Q}_1\hat{Q}_2}.$$
Once the equation for the sample size is obtained, the method to calculate the sample size consists of the following steps:
(1) Take a pilot sample of size $n'_\tau$ and, from this sample, calculate $\hat{\tau}_i$, $\hat{\upsilon}_i$, $\hat{\varepsilon}_i$, $\hat{p}$, $\hat{Q}_i$ and the Wald CI for the difference $\delta_\tau$. If the Wald CI has the precision $\phi_\tau$, then the precision has been reached with the pilot sample and the process ends; in this situation, the difference $\delta_\tau$ has been estimated with precision $\phi_\tau$ and confidence $100(1-\alpha)\%$. Otherwise, go to the next step.
(2) From the estimates obtained with the pilot sample, calculate the sample size $n_\tau$ applying Equation (15).
(3) Take the sample of size $n_\tau$ ($n_\tau - n'_\tau$ individuals are added to the initial pilot sample) and, from this sample, calculate all the estimators and the Wald CI for the difference $\delta_\tau$. If the Wald CI has the precision $\phi_\tau$, then the process ends (the precision has been reached with the new sample). If the Wald CI does not have the precision $\phi_\tau$, then this sample is considered as a pilot sample; go to step 1.
This method to calculate the sample size n is an iterative method, which depends on the initial pilot sample and therefore does not guarantee that the difference between the positive predictive values will be estimated with the precision ϕ τ .
The sample size to estimate the difference $\delta_\upsilon$ is calculated in a similar way. In this case,
$$Var(\hat{\delta}_\upsilon) = \frac{pq\upsilon_1\bar{\upsilon}_1\bar{Q}_2 + pq\upsilon_2\bar{\upsilon}_2\bar{Q}_1 - 2\left(pq^2\bar{\upsilon}_1\bar{\upsilon}_2\varepsilon_0 + p^2q\upsilon_1\upsilon_2\varepsilon_1 + \upsilon_1\upsilon_2\bar{\upsilon}_1\bar{\upsilon}_2\bar{Q}_1\bar{Q}_2\right)}{npq\bar{Q}_1\bar{Q}_2}$$
and the sample size $n_\upsilon$ to estimate the difference $\delta_\upsilon$ with precision $\phi_\upsilon$ and confidence $100(1-\alpha)\%$ is
$$n_\upsilon = \frac{z_{1-\alpha/2}^2}{\phi_\upsilon^2} \times \frac{\hat{p}\hat{q}\hat{\upsilon}_1\hat{\bar{\upsilon}}_1\hat{\bar{Q}}_2 + \hat{p}\hat{q}\hat{\upsilon}_2\hat{\bar{\upsilon}}_2\hat{\bar{Q}}_1 - 2\left(\hat{p}\hat{q}^2\hat{\bar{\upsilon}}_1\hat{\bar{\upsilon}}_2\hat{\varepsilon}_0 + \hat{p}^2\hat{q}\hat{\upsilon}_1\hat{\upsilon}_2\hat{\varepsilon}_1 + \hat{\upsilon}_1\hat{\upsilon}_2\hat{\bar{\upsilon}}_1\hat{\bar{\upsilon}}_2\hat{\bar{Q}}_1\hat{\bar{Q}}_2\right)}{\hat{p}\hat{q}\hat{\bar{Q}}_1\hat{\bar{Q}}_2}.$$
If the researcher wants to estimate $\delta_\tau$ with precision $\phi_\tau$ and also wants to estimate $\delta_\upsilon$ with precision $\phi_\upsilon$, at the same confidence level, then the final sample size is $n = Max(n_\tau, n_\upsilon)$. Using the larger of the two sample sizes guarantees that the CI for the difference of the two positive predictive values and the CI for the difference of the two negative predictive values both verify the precision set for each of them.
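Under the same assumptions, the two sample-size formulas can be sketched in R as follows (hypothetical functions whose inputs are pilot-sample estimates; we apply ceiling to round up to a whole number of individuals, a choice the manuscript does not state explicitly).

```r
# Sample size for the difference of positive predictive values
sample_size_tau <- function(tau1, tau2, Q1, Q2, p, eps1, eps0, phi, conf = 0.95) {
  z2 <- qnorm(1 - (1 - conf) / 2)^2
  q  <- 1 - p
  num <- p * q * tau1 * (1 - tau1) * Q2 + p * q * tau2 * (1 - tau2) * Q1 -
    2 * (p * q^2 * tau1 * tau2 * eps0 + p^2 * q * (1 - tau1) * (1 - tau2) * eps1 +
         tau1 * tau2 * (1 - tau1) * (1 - tau2) * Q1 * Q2)
  ceiling(z2 / phi^2 * num / (p * q * Q1 * Q2))
}

# Sample size for the difference of negative predictive values
sample_size_ups <- function(ups1, ups2, Qb1, Qb2, p, eps1, eps0, phi, conf = 0.95) {
  z2 <- qnorm(1 - (1 - conf) / 2)^2
  q  <- 1 - p
  num <- p * q * ups1 * (1 - ups1) * Qb2 + p * q * ups2 * (1 - ups2) * Qb1 -
    2 * (p * q^2 * (1 - ups1) * (1 - ups2) * eps0 + p^2 * q * ups1 * ups2 * eps1 +
         ups1 * ups2 * (1 - ups1) * (1 - ups2) * Qb1 * Qb2)
  ceiling(z2 / phi^2 * num / (p * q * Qb1 * Qb2))
}
```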
The method for calculating the sample size depends on the values of the estimators obtained from the pilot sample. As the values of the estimators depend on each sample (and therefore vary from one sample to another), it is necessary to study how the values of the estimators affect the calculation of the sample size. Therefore, we have carried out simulation experiments to study the effect of the values of the estimators on the calculation of the sample size. These simulation experiments consisted of the following steps:
(1) Calculate the sample size $n_\tau$ ($n_\upsilon$) from Equations (14) and (16), using the values of the parameters in the scenarios considered (Table 2 and Table 4). Therefore, these equations have been applied using the values of the parameters instead of the values of the estimators.
(2) Generate N = 10,000 multinomial random samples of size $n_\tau$ ($n_\upsilon$) whose probabilities have been calculated from Equations (7) and (8), using the parameters of the scenarios considered and intermediate (50%) and high (90%) values of $\varepsilon_i$. From each of the N random samples, all the estimators ($\hat{\tau}_i$, $\hat{\upsilon}_i$, $\hat{\varepsilon}_i$, $\hat{p}$ and $\hat{Q}_i$) have been calculated, and then the sample size $n_\tau$ ($n_\upsilon$) has been calculated applying Equation (14) (Equation (16)).
(3) In each scenario considered, the average sample size and the relative bias have been calculated.
Table 6 shows the results obtained for different precision values (2.5% and 5%, values that can be considered as high precision) and $1 - \alpha = 0.95$. The relative biases are very small, so the equations for the sample sizes provide robust values, and the pilot sample therefore has little effect on the calculation of the sample size.

5. Program cicpvbdt

We have written a program in R [13] to solve the problems raised in this manuscript. The program is called “cicpvbdt” (confidence intervals to compare the predictive values of binary diagnostic tests), and it calculates all of the CIs and the sample sizes. The program is run with the command “cicpvbdt(s11, s10, s01, s00, r11, r10, r01, r00, ϕτ, ϕυ)”, where the last two arguments are the precisions set for the differences. By default, the confidence level is 95%. The program does not calculate the sample sizes when ϕτ = 0 and ϕυ = 0, and only calculates the sample sizes when ϕτ > 0 and/or ϕυ > 0; in this last situation, the program checks whether the set precision has been reached. The program checks that all input values are valid (e.g., that there are no negative observed frequencies) and that all the parameters and their variances–covariances can be estimated. For the bias-corrected bootstrap CIs, 2000 samples with replacement are generated, and for the Monte Carlo Bayesian CIs, 10,000 random samples are generated. The results obtained are saved in a file called “results_cicpvbt.txt” in the folder from which the program is run. The program is available as Supplementary Material of this manuscript.

6. Example

The results obtained have been applied to a study on the diagnosis of colorectal cancer using two diagnostic tests: Fecal Immunochemical Testing (FIT) and Fecal Occult Blood Testing (FOBT). The GS for the diagnosis of colorectal cancer is the biopsy. Table 7 shows the observed frequencies obtained when the two BDTs and the GS were applied to a sample of 168 adult men suspected of having colorectal cancer. Running the program “cicpvbdt” with the command “cicpvbdt(68, 18, 1, 13, 4, 1, 2, 61, 0, 0)”, all the results shown in Table 7 are obtained.
The estimated positive predictive values of FIT and of FOBT are 94.5% and 92.0%, and the estimated negative predictive values are 81.8% and 66.7%, respectively. Using the recommendations given in Section 3, the 95% Wald CI for the difference between the two positive predictive values contains the value zero, and therefore (with α = 5 % ), the equality of the two positive predictive values is not rejected.
Regarding the negative predictive values, the 95% Wald CI does not contain the value zero, and therefore, we reject the equality of the two negative predictive values: the negative predictive value of FIT is significantly greater than that of FOBT. With a confidence of 95%, the negative predictive value of FIT is between 8.1% and 22.2% greater than the negative predictive value of FOBT. The same conclusions are obtained using the other CIs.
To illustrate the method for calculating the sample size, suppose that the clinician is interested in the sample size necessary to estimate the difference between the two negative predictive values with a precision $\phi_\upsilon = 0.05$ and $1 - \alpha = 0.95$. The 95% Wald CI for $\delta_\upsilon = \upsilon_1 - \upsilon_2$ is (0.081, 0.222), whose precision is 0.0705 (= (0.222 − 0.081)/2). Since $\phi_\upsilon = 0.05 < 0.0705$, the desired precision has not been reached with the sample of 168 individuals, and therefore, the sample size must be calculated. Using the sample of 168 patients as a pilot sample and executing the command “cicpvbdt(68, 18, 1, 13, 4, 1, 2, 61, 0, 0.05)”, it is obtained that $n_\upsilon = 338$: a sample of 338 patients is necessary to estimate the difference between the two negative predictive values with a precision $\phi_\upsilon = 0.05$ and a confidence of 95%. Thus, another 170 new patients must be added to the sample of 168 patients, and the two BDTs and the biopsy should be applied to the new patients. Finally, it is necessary to recalculate the CIs with the sample of 338 patients and to check that the set precision is verified.
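Using the hypothetical helpers sketched in the previous sections, the example can be reproduced approximately as follows (the frequencies are those of Table 7; the computed sample size should be about 338).

```r
# Table 7 frequencies: s11, s10, s01, s00, r11, r10, r01, r00
est <- pv_estimates(68, 18, 1, 13, 4, 1, 2, 61)
round(wald_ci_diff(est), 3)                  # delta_ups CI approx (0.081, 0.222)

n    <- 168
Qb1  <- (1 + 13 + 2 + 61) / n                # estimate of P(T1 = 0) = 77/168
Qb2  <- (18 + 13 + 1 + 61) / n               # estimate of P(T2 = 0) = 93/168
eps1 <- (68 * 13 - 18 * 1) / 100^2           # dependence factor, diseased
eps0 <- (4 * 61 - 1 * 2) / 68^2              # dependence factor, non-diseased
sample_size_ups(est$ups[1], est$ups[2], Qb1, Qb2, p = 100 / 168,
                eps1 = eps1, eps0 = eps0, phi = 0.05)   # about 338
```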

7. Discussion

Comparison of the predictive values of two medical tests is a topic of interest in biostatistics. There are several articles that have studied hypothesis tests to solve these problems; however, the comparison of predictive values through confidence intervals has been little studied. In this manuscript, we have studied confidence intervals for the difference and for the ratio of the positive (negative) predictive values of two diagnostic tests under a paired design. We have carried out simulation experiments to study the asymptotic behaviors of the CIs, and we have given general rules of application. These rules are based on the sample size, since this is the only parameter that is set by the researcher, and also on the practical interpretation of the CIs. As a general conclusion, we recommend using the Wald interval for the difference of the two positive (negative) predictive values.
We have also proposed a method, based on the Wald CI for the difference, to calculate the sample size to estimate the difference between the two positive (negative) predictive values with a determined precision and confidence. This method starts from an initial pilot sample, and then the sample size is calculated from the estimators obtained with the initial sample. This method depends on the estimators of the pilot sample, so we have carried out simulation experiments to study the effect of the pilot sample on the sample size. The results obtained in these experiments have shown that the pilot sample does not have any important effect on the calculation of the sample size, and that therefore, the method has practical validity.

Supplementary Materials

The following are available online at https://www.mdpi.com/article/10.3390/math9131462/s1.

Author Contributions

The two authors have collaborated equally in the realization of this work. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

We thank the Academic Editor and the anonymous referees for their helpful comments that improved the quality of the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Let $\boldsymbol{\pi} = (p_{11}, p_{10}, p_{01}, p_{00}, q_{11}, q_{10}, q_{01}, q_{00})^T$ be the vector of probabilities of the multinomial distribution; then, the variance–covariance matrix of $\hat{\boldsymbol{\pi}}$ is $\boldsymbol{\Sigma}_{\hat{\pi}} = \{\mathrm{diag}(\boldsymbol{\pi}) - \boldsymbol{\pi}\boldsymbol{\pi}^T\}/n$. In terms of $\boldsymbol{\pi}$, the predictive values are written as
$$\tau_1 = \frac{p_{10}+p_{11}}{Q_1}, \quad \tau_2 = \frac{p_{01}+p_{11}}{Q_2}, \quad \upsilon_1 = \frac{q_{00}+q_{01}}{\bar{Q}_1} \quad \text{and} \quad \upsilon_2 = \frac{q_{00}+q_{10}}{\bar{Q}_2},$$
where
$$\begin{aligned}
Q_1 &= P(T_1 = 1) = p_{10} + p_{11} + q_{10} + q_{11} = pSe_1 + q(1 - Sp_1),\\
\bar{Q}_1 &= 1 - Q_1 = P(T_1 = 0) = p_{00} + p_{01} + q_{00} + q_{01} = p(1 - Se_1) + qSp_1,\\
Q_2 &= P(T_2 = 1) = p_{01} + p_{11} + q_{01} + q_{11} = pSe_2 + q(1 - Sp_2)
\end{aligned}$$
and
$$\bar{Q}_2 = 1 - Q_2 = P(T_2 = 0) = p_{00} + p_{10} + q_{00} + q_{10} = p(1 - Se_2) + qSp_2.$$
Let $\boldsymbol{\omega} = (\tau_1, \tau_2, \upsilon_1, \upsilon_2)^T$ be the vector of predictive values; then, applying the delta method, the matrix of asymptotic variances–covariances of $\hat{\boldsymbol{\omega}}$ is
$$\boldsymbol{\Sigma}_{\hat{\omega}} = \left(\frac{\partial \boldsymbol{\omega}}{\partial \boldsymbol{\pi}}\right) \boldsymbol{\Sigma}_{\hat{\pi}} \left(\frac{\partial \boldsymbol{\omega}}{\partial \boldsymbol{\pi}}\right)^T.$$
Performing the algebraic operations, it is obtained that
$$\begin{aligned}
Var(\hat{\tau}_1) &= \frac{(p_{10}+p_{11})(q_{10}+q_{11})}{nQ_1^3} = \frac{\tau_1\bar{\tau}_1}{nQ_1}, \quad
Var(\hat{\tau}_2) = \frac{(p_{01}+p_{11})(q_{01}+q_{11})}{nQ_2^3} = \frac{\tau_2\bar{\tau}_2}{nQ_2},\\
Var(\hat{\upsilon}_1) &= \frac{(q_{01}+q_{00})(p_{01}+p_{00})}{n\bar{Q}_1^3} = \frac{\upsilon_1\bar{\upsilon}_1}{n\bar{Q}_1}, \quad
Var(\hat{\upsilon}_2) = \frac{(q_{00}+q_{10})(p_{00}+p_{10})}{n\bar{Q}_2^3} = \frac{\upsilon_2\bar{\upsilon}_2}{n\bar{Q}_2},\\
Cov(\hat{\tau}_1, \hat{\tau}_2) &= \frac{pq^2\tau_1\tau_2\varepsilon_0 + p^2q\bar{\tau}_1\bar{\tau}_2\varepsilon_1 + \tau_1\tau_2\bar{\tau}_1\bar{\tau}_2Q_1Q_2}{npqQ_1Q_2}
\end{aligned}$$
and
$$Cov(\hat{\upsilon}_1, \hat{\upsilon}_2) = \frac{pq^2\bar{\upsilon}_1\bar{\upsilon}_2\varepsilon_0 + p^2q\upsilon_1\upsilon_2\varepsilon_1 + \upsilon_1\upsilon_2\bar{\upsilon}_1\bar{\upsilon}_2\bar{Q}_1\bar{Q}_2}{npq\bar{Q}_1\bar{Q}_2},$$
where $\bar{\tau}_i = 1 - \tau_i$ and $\bar{\upsilon}_i = 1 - \upsilon_i$, with $i = 1, 2$. The estimated variances–covariances are obtained by substituting the parameters with their estimators. Equations (15) and (16) are obtained by substituting into the equations
$$Var(\hat{\delta}_\tau) = Var(\hat{\tau}_1) + Var(\hat{\tau}_2) - 2Cov(\hat{\tau}_1, \hat{\tau}_2)$$
and
$$Var(\hat{\delta}_\upsilon) = Var(\hat{\upsilon}_1) + Var(\hat{\upsilon}_2) - 2Cov(\hat{\upsilon}_1, \hat{\upsilon}_2)$$
the variances–covariances by their corresponding expressions obtained previously.

References

  1. Bennett, B.M. On comparison of sensitivity, specificity and predictive value of a number of diagnostic procedures. Biometrics 1972, 28, 793–800.
  2. Bennett, B.M. On tests for equality of predictive values for t diagnostic procedures. Stat. Med. 1985, 4, 535–539.
  3. Leisenring, W.; Alonzo, T.; Pepe, M.S. Comparisons of predictive values of binary medical diagnostic tests for paired designs. Biometrics 2000, 56, 345–351.
  4. Wang, W.; Davis, C.S.; Soong, S.J. Comparison of predictive values of two diagnostic tests from the same sample of subjects using weighted least squares. Stat. Med. 2006, 25, 2215–2229.
  5. Kosinski, A.S. A weighted generalized score statistic for comparison of predictive values of diagnostic tests. Stat. Med. 2013, 32, 964–977.
  6. Tsou, T.S. A new likelihood approach to inference about predictive values of diagnostic tests in paired designs. Stat. Methods Med. Res. 2018, 27, 541–548.
  7. Takahashi, K.; Yamamoto, K. An exact test for comparing two predictive values in small-size clinical trials. Pharm. Stat. 2020, 19, 31–43.
  8. Roldán-Nofuentes, J.A.; Luna del Castillo, J.D.; Montero-Alonso, M.A. Global hypothesis test to simultaneously compare the predictive values of two binary diagnostic tests. Comput. Stat. Data Anal. 2012, 56, 1161–1173.
  9. Moskowitz, C.S.; Pepe, M.S. Comparing the predictive values of diagnostic tests: Sample size and analysis for paired study designs. Clin. Trials 2006, 3, 272–279.
  10. Vacek, P.M. The effect of conditional dependence on the evaluation of diagnostic tests. Biometrics 1985, 41, 959–968.
  11. Efron, B.; Tibshirani, R.J. An Introduction to the Bootstrap; Chapman and Hall: Binghamton, NY, USA, 1993.
  12. Fieller, E.C. The biological standardization of insulin. J. R. Stat. Soc. 1940, 7, 1–64.
  13. R Core Team. R: A Language and Environment for Statistical Computing; R Core Team: Vienna, Austria, 2016. Available online: https://www.R-project.org/ (accessed on 11 June 2021).
Table 1. Observed frequencies subject to a paired design.

|                      | T1 = 1, T2 = 1 | T1 = 1, T2 = 0 | T1 = 0, T2 = 1 | T1 = 0, T2 = 0 | Total |
|----------------------|----------------|----------------|----------------|----------------|-------|
| Diseased (D = 1)     | s11            | s10            | s01            | s00            | s     |
| Non-diseased (D = 0) | r11            | r10            | r01            | r00            | r     |
| Total                | s11 + r11      | s10 + r10      | s01 + r01      | s00 + r00      | n     |
Table 2. Asymptotic behaviors of the CIs for the difference of the two positive predictive values.

Scenario: τ1 = 0.75, τ2 = 0.75, υ1 = 0.95, υ2 = 0.95, δτ = 0; ε1 = 0.124, ε0 = 0.010, p = 0.10

| n    | Wald CP | Wald AL | BCB CP | BCB AL | MCB CP | MCB AL |
|------|---------|---------|--------|--------|--------|--------|
| 50   | 1       | 0.749   | 1      | 0.743  | 1      | 0.787  |
| 100  | 1       | 0.538   | 1      | 0.527  | 1      | 0.683  |
| 200  | 0.988   | 0.409   | 1      | 0.397  | 1      | 0.454  |
| 500  | 0.953   | 0.261   | 0.990  | 0.254  | 0.998  | 0.363  |
| 1000 | 0.944   | 0.185   | 0.957  | 0.186  | 0.993  | 0.259  |

Scenario: τ1 = 0.90, τ2 = 0.85, υ1 = 0.95, υ2 = 0.90, δτ = 0.05; ε1 = 0.021, ε0 = 0.044, p = 0.50

| n    | Wald CP | Wald AL | BCB CP | BCB AL | MCB CP | MCB AL |
|------|---------|---------|--------|--------|--------|--------|
| 50   | 0.982   | 0.251   | 1      | 0.242  | 0.999  | 0.326  |
| 100  | 0.951   | 0.174   | 0.966  | 0.182  | 0.993  | 0.223  |
| 200  | 0.954   | 0.122   | 0.952  | 0.126  | 0.991  | 0.156  |
| 500  | 0.941   | 0.077   | 0.933  | 0.077  | 0.981  | 0.099  |
| 1000 | 0.952   | 0.055   | 0.948  | 0.055  | 0.988  | 0.070  |

Scenario: τ1 = 0.85, τ2 = 0.75, υ1 = 0.95, υ2 = 0.85, δτ = 0.10; ε1 = 0.037, ε0 = 0.024, p = 0.25

| n    | Wald CP | Wald AL | BCB CP | BCB AL | MCB CP | MCB AL |
|------|---------|---------|--------|--------|--------|--------|
| 50   | 0.998   | 0.513   | 1      | 0.499  | 1      | 0.602  |
| 100  | 0.981   | 0.354   | 1      | 0.343  | 0.999  | 0.445  |
| 200  | 0.941   | 0.250   | 0.987  | 0.243  | 0.991  | 0.318  |
| 500  | 0.956   | 0.158   | 0.959  | 0.159  | 0.989  | 0.204  |
| 1000 | 0.953   | 0.112   | 0.954  | 0.113  | 0.989  | 0.145  |

CP: coverage probability. AL: average length. Wald: Wald CI. BCB: bias-corrected bootstrap CI. MCB: Monte Carlo Bayesian CI.
Table 3. Asymptotic behaviors of the CIs for the ratio of the two positive predictive values.

Scenario: τ1 = 0.75, τ2 = 0.75, υ1 = 0.95, υ2 = 0.95, ρτ = 1; ε1 = 0.124, ε0 = 0.010, p = 0.10

| n    | Wald CP | Wald AL | Log CP | Log AL | Fieller CP | Fieller AL | BCB CP | BCB AL | MCB CP | MCB AL |
|------|---------|---------|--------|--------|------------|------------|--------|--------|--------|--------|
| 50   | 1       | 1.326   | 1      | 1.450  | 1          | 2.046      | 1      | 1.348  | 1      | 2.183  |
| 100  | 0.999   | 0.966   | 1      | 1.018  | 1          | 1.311      | 1      | 0.973  | 1      | 1.534  |
| 200  | 0.989   | 0.630   | 0.992  | 0.643  | 0.994      | 0.652      | 1      | 0.648  | 1      | 0.978  |
| 500  | 0.962   | 0.359   | 0.966  | 0.361  | 0.953      | 0.369      | 0.988  | 0.364  | 0.998  | 0.535  |
| 1000 | 0.956   | 0.250   | 0.952  | 0.251  | 0.944      | 0.253      | 0.956  | 0.256  | 0.993  | 0.363  |

Scenario: τ1 = 0.90, τ2 = 0.85, υ1 = 0.95, υ2 = 0.90, ρτ = 1.06; ε1 = 0.021, ε0 = 0.044, p = 0.50

| n    | Wald CP | Wald AL | Log CP | Log AL | Fieller CP | Fieller AL | BCB CP | BCB AL | MCB CP | MCB AL |
|------|---------|---------|--------|--------|------------|------------|--------|--------|--------|--------|
| 50   | 0.986   | 0.326   | 0.994  | 0.328  | 0.993      | 0.334      | 1      | 0.347  | 1      | 0.448  |
| 100  | 0.949   | 0.216   | 0.952  | 0.216  | 0.950      | 0.218      | 0.957  | 0.232  | 0.994  | 0.288  |
| 200  | 0.952   | 0.151   | 0.955  | 0.151  | 0.954      | 0.152      | 0.953  | 0.157  | 0.990  | 0.196  |
| 500  | 0.941   | 0.095   | 0.941  | 0.095  | 0.940      | 0.095      | 0.935  | 0.095  | 0.982  | 0.122  |
| 1000 | 0.950   | 0.067   | 0.951  | 0.067  | 0.949      | 0.067      | 0.946  | 0.067  | 0.987  | 0.086  |

Scenario: τ1 = 0.85, τ2 = 0.75, υ1 = 0.95, υ2 = 0.85, ρτ = 1.13; ε1 = 0.037, ε0 = 0.024, p = 0.25

| n    | Wald CP | Wald AL | Log CP | Log AL | Fieller CP | Fieller AL | BCB CP | BCB AL | MCB CP | MCB AL |
|------|---------|---------|--------|--------|------------|------------|--------|--------|--------|--------|
| 50   | 0.997   | 0.950   | 1      | 0.958  | 0.998      | 0.961      | 1      | 0.992  | 1      | 1.407  |
| 100  | 0.972   | 0.591   | 0.983  | 0.596  | 0.979      | 0.636      | 1      | 0.689  | 0.999  | 0.841  |
| 200  | 0.941   | 0.390   | 0.943  | 0.392  | 0.940      | 0.396      | 0.988  | 0.398  | 0.989  | 0.528  |
| 500  | 0.950   | 0.241   | 0.954  | 0.241  | 0.957      | 0.242      | 0.960  | 0.244  | 0.989  | 0.314  |
| 1000 | 0.951   | 0.169   | 0.953  | 0.170  | 0.950      | 0.171      | 0.953  | 0.171  | 0.988  | 0.218  |

CP: coverage probability. AL: average length. Wald: Wald CI. Log: logarithmic CI. Fieller: Fieller CI. BCB: bias-corrected bootstrap CI. MCB: Monte Carlo Bayesian CI.
Table 4. Asymptotic behaviors of the CIs for the difference of the two negative predictive values.

Scenario: τ1 = 0.75, τ2 = 0.75, υ1 = 0.95, υ2 = 0.95, δυ = 0; ε1 = 0.124, ε0 = 0.010, p = 0.10

| n    | Wald CP | Wald AL | BCB CP | BCB AL | MCB CP | MCB AL |
|------|---------|---------|--------|--------|--------|--------|
| 50   | 1       | 0.127   | 1      | 0.119  | 1      | 0.154  |
| 100  | 0.999   | 0.072   | 1      | 0.069  | 0.999  | 0.095  |
| 200  | 0.989   | 0.046   | 1      | 0.042  | 0.999  | 0.063  |
| 500  | 0.949   | 0.028   | 0.968  | 0.028  | 0.994  | 0.040  |
| 1000 | 0.946   | 0.020   | 0.943  | 0.020  | 0.993  | 0.028  |

Scenario: τ1 = 0.90, τ2 = 0.85, υ1 = 0.95, υ2 = 0.90, δυ = 0.05; ε1 = 0.021, ε0 = 0.044, p = 0.50

| n    | Wald CP | Wald AL | BCB CP | BCB AL | MCB CP | MCB AL |
|------|---------|---------|--------|--------|--------|--------|
| 50   | 0.999   | 0.264   | 1      | 0.276  | 1      | 0.344  |
| 100  | 0.966   | 0.169   | 0.952  | 0.170  | 0.995  | 0.222  |
| 200  | 0.950   | 0.115   | 0.932  | 0.119  | 0.989  | 0.147  |
| 500  | 0.949   | 0.073   | 0.948  | 0.073  | 0.983  | 0.090  |
| 1000 | 0.952   | 0.052   | 0.953  | 0.052  | 0.984  | 0.063  |

Scenario: τ1 = 0.85, τ2 = 0.75, υ1 = 0.95, υ2 = 0.85, δυ = 0.10; ε1 = 0.037, ε0 = 0.024, p = 0.25

| n    | Wald CP | Wald AL | BCB CP | BCB AL | MCB CP | MCB AL |
|------|---------|---------|--------|--------|--------|--------|
| 50   | 0.936   | 0.207   | 0.720  | 0.182  | 0.948  | 0.218  |
| 100  | 0.938   | 0.142   | 0.874  | 0.133  | 0.960  | 0.151  |
| 200  | 0.948   | 0.099   | 0.937  | 0.096  | 0.967  | 0.107  |
| 500  | 0.957   | 0.062   | 0.961  | 0.062  | 0.975  | 0.068  |
| 1000 | 0.946   | 0.044   | 0.947  | 0.044  | 0.964  | 0.048  |

CP: coverage probability. AL: average length. Wald: Wald CI. BCB: bias-corrected bootstrap CI. MCB: Monte Carlo Bayesian CI.
Table 5. Asymptotic behaviors of the CIs for the ratio of the two negative predictive values.

Scenario: τ1 = 0.75, τ2 = 0.75, υ1 = 0.95, υ2 = 0.95, ρυ = 1; ε1 = 0.124, ε0 = 0.010, p = 0.10

| n    | Wald CP | Wald AL | Log CP | Log AL | Fieller CP | Fieller AL | BCB CP | BCB AL | MCB CP | MCB AL |
|------|---------|---------|--------|--------|------------|------------|--------|--------|--------|--------|
| 50   | 1       | 0.144   | 1      | 0.144  | 1          | 0.145      | 1      | 0.128  | 1.000  | 0.173  |
| 100  | 0.999   | 0.076   | 1      | 0.079  | 0.999      | 0.080      | 1      | 0.074  | 0.999  | 0.103  |
| 200  | 0.988   | 0.046   | 0.991  | 0.047  | 0.992      | 0.048      | 1      | 0.045  | 0.999  | 0.068  |
| 500  | 0.950   | 0.030   | 0.950  | 0.030  | 0.949      | 0.030      | 0.971  | 0.029  | 0.994  | 0.042  |
| 1000 | 0.948   | 0.021   | 0.947  | 0.021  | 0.946      | 0.021      | 0.946  | 0.021  | 0.993  | 0.030  |

Scenario: τ1 = 0.90, τ2 = 0.85, υ1 = 0.95, υ2 = 0.90, ρυ = 1.06; ε1 = 0.021, ε0 = 0.044, p = 0.50

| n    | Wald CP | Wald AL | Log CP | Log AL | Fieller CP | Fieller AL | BCB CP | BCB AL | MCB CP | MCB AL |
|------|---------|---------|--------|--------|------------|------------|--------|--------|--------|--------|
| 50   | 1       | 0.324   | 1      | 0.326  | 0.999      | 0.332      | 1      | 0.349  | 1      | 0.462  |
| 100  | 0.954   | 0.201   | 0.964  | 0.201  | 0.962      | 0.202      | 0.964  | 0.219  | 0.995  | 0.274  |
| 200  | 0.945   | 0.134   | 0.946  | 0.134  | 0.947      | 0.135      | 0.921  | 0.138  | 0.989  | 0.175  |
| 500  | 0.946   | 0.085   | 0.945  | 0.085  | 0.946      | 0.085      | 0.947  | 0.085  | 0.982  | 0.105  |
| 1000 | 0.954   | 0.060   | 0.952  | 0.060  | 0.952      | 0.060      | 0.950  | 0.060  | 0.983  | 0.074  |

Scenario: τ1 = 0.85, τ2 = 0.75, υ1 = 0.95, υ2 = 0.85, ρυ = 1.12; ε1 = 0.037, ε0 = 0.024, p = 0.25

| n    | Wald CP | Wald AL | Log CP | Log AL | Fieller CP | Fieller AL | BCB CP | BCB AL | MCB CP | MCB AL |
|------|---------|---------|--------|--------|------------|------------|--------|--------|--------|--------|
| 50   | 0.936   | 0.271   | 0.934  | 0.272  | 0.933      | 0.275      | 0.724  | 0.283  | 0.933  | 0.291  |
| 100  | 0.939   | 0.184   | 0.936  | 0.184  | 0.935      | 0.185      | 0.849  | 0.191  | 0.961  | 0.199  |
| 200  | 0.945   | 0.129   | 0.947  | 0.129  | 0.945      | 0.129      | 0.923  | 0.125  | 0.963  | 0.140  |
| 500  | 0.957   | 0.081   | 0.958  | 0.081  | 0.958      | 0.081      | 0.959  | 0.081  | 0.974  | 0.088  |
| 1000 | 0.949   | 0.057   | 0.950  | 0.057  | 0.950      | 0.057      | 0.948  | 0.057  | 0.964  | 0.062  |

CP: coverage probability. AL: average length. Wald: Wald CI. Log: logarithmic CI. Fieller: Fieller CI. BCB: bias-corrected bootstrap CI. MCB: Monte Carlo Bayesian CI.
Table 6. Sample size for estimating the difference between the positive (negative) predictive values.

Positive predictive values

Scenario: τ1 = 0.90, τ2 = 0.85, υ1 = 0.95, υ2 = 0.90, δτ = 0.05; ε1 = 0.021, ε0 = 0.044, p = 0.50

|                     | φτ = 0.025 | φτ = 0.05 |
|---------------------|------------|-----------|
| Sample size         | 1203       | 301       |
| Average sample size | 1204       | 302       |
| Relative bias (%)   | 0.17       | 0.33      |

Scenario: τ1 = 0.85, τ2 = 0.75, υ1 = 0.95, υ2 = 0.85, δτ = 0.10; ε1 = 0.037, ε0 = 0.024, p = 0.25

|                     | φτ = 0.025 | φτ = 0.05 |
|---------------------|------------|-----------|
| Sample size         | 5048       | 1262      |
| Average sample size | 5054       | 1267      |
| Relative bias (%)   | 0.12       | 0.40      |

Negative predictive values

Scenario: τ1 = 0.90, τ2 = 0.85, υ1 = 0.95, υ2 = 0.90, δυ = 0.05; ε1 = 0.021, ε0 = 0.044, p = 0.50

|                     | φυ = 0.025 | φυ = 0.05 |
|---------------------|------------|-----------|
| Sample size         | 1079       | 270       |
| Average sample size | 1080       | 272       |
| Relative bias (%)   | 0.09       | 0.74      |

Scenario: τ1 = 0.85, τ2 = 0.75, υ1 = 0.95, υ2 = 0.85, δυ = 0.10; ε1 = 0.037, ε0 = 0.024, p = 0.25

|                     | φυ = 0.025 | φυ = 0.05 |
|---------------------|------------|-----------|
| Sample size         | 782        | 196       |
| Average sample size | 783        | 198       |
| Relative bias (%)   | 0.13       | 1.02      |
Table 7. Observed frequencies and CIs.

Observed frequencies

| Biopsy | FIT +, FOBT + | FIT +, FOBT − | FIT −, FOBT + | FIT −, FOBT − | Total |
|--------|---------------|---------------|---------------|---------------|-------|
| Cancer | 68            | 18            | 1             | 13            | 100   |
| Normal | 4             | 1             | 2             | 61            | 68    |
| Total  | 72            | 19            | 3             | 74            | 168   |

Results

|      | Positive predictive value | Negative predictive value |
|------|---------------------------|---------------------------|
| FIT  | 0.945 ± 0.024             | 0.818 ± 0.044             |
| FOBT | 0.920 ± 0.031             | 0.667 ± 0.049             |

| p     | ε1    | ε0    | Q1    | Q2    |
|-------|-------|-------|-------|-------|
| 0.595 | 0.087 | 0.052 | 0.542 | 0.446 |

CIs for δτ = τ1 − τ2

| Wald            | BCB             | MCB             |
|-----------------|-----------------|-----------------|
| (−0.016, 0.066) | (−0.013, 0.073) | (−0.045, 0.105) |

CIs for ρτ = τ1/τ2

| Wald           | Log            | Fieller        | BCB            | MCB            |
|----------------|----------------|----------------|----------------|----------------|
| (0.981, 1.073) | (0.982, 1.074) | (0.983, 1.076) | (0.985, 1.084) | (0.952, 1.124) |

CIs for δυ = υ1 − υ2

| Wald           | BCB            | MCB            |
|----------------|----------------|----------------|
| (0.081, 0.222) | (0.089, 0.231) | (0.049, 0.248) |

CIs for ρυ = υ1/υ2

| Wald           | Log            | Fieller        | BCB            | MCB            |
|----------------|----------------|----------------|----------------|----------------|
| (1.101, 1.353) | (1.108, 1.350) | (1.112, 1.368) | (1.121, 1.382) | (1.069, 1.420) |

Wald: Wald CI. Log: logarithmic CI. Fieller: Fieller CI. BCB: bias-corrected bootstrap CI. MCB: Monte Carlo Bayesian CI.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
