Article

An Asymmetric Bimodal Double Regression Model

by Yolanda M. Gómez 1, Diego I. Gallardo 1,*, Osvaldo Venegas 2 and Tiago M. Magalhães 3

1 Departamento de Matemática, Facultad de Ingeniería, Universidad de Atacama, Copiapó 1530000, Chile
2 Departamento de Ciencias Matemáticas y Físicas, Facultad de Ingeniería, Universidad Católica de Temuco, Temuco 4780000, Chile
3 Department of Statistics, Institute of Exact Sciences, Federal University of Juiz de Fora, Juiz de Fora 36036-900, MG, Brazil
* Author to whom correspondence should be addressed.
Symmetry 2021, 13(12), 2279; https://doi.org/10.3390/sym13122279
Submission received: 26 October 2021 / Revised: 24 November 2021 / Accepted: 26 November 2021 / Published: 30 November 2021
(This article belongs to the Special Issue Symmetric and Asymmetric Bimodal Distributions with Applications)

Abstract

In this paper, we introduce an extension of the sinh Cauchy distribution including a double regression model for both the quantile and scale parameters. This model can assume different shapes: unimodal or bimodal, symmetric or asymmetric. We discuss some properties of the model and perform a simulation study in order to assess the performance of the maximum likelihood estimators in finite samples. A real data application is also presented.

1. Introduction

A wide range of phenomena can be described appropriately using probability distributions. Such a description is useful because of the properties associated with a distribution: expectation, shape, range, etc. However, fitting data can be difficult when their distribution is bimodal, which occurs commonly in practice: in astrophysics, the metallicity of the globular cluster system in the Milky Way (see [1]); in ecology, the tree cover of moist savanna and tropical forest ecosystems (see [2]); in genetics, gene expression measurements (see [3]). Other practical examples of bimodality in data can be seen in [4,5,6]. The literature contains many proposals for bimodal distributions; e.g., the works of [7,8,9,10,11,12]. Bimodal data can be fitted by a mixture of two unimodal distributions. When both components come from the same family, the main difficulty is the non-identifiability of the mixture model; the traditional example is the mixture of normals. A more workable alternative in practice is to use a single distribution that can itself be bimodal. For this purpose, we consider the gamma–sinh Cauchy (GSC) distribution proposed by [13]. Note that the initials GSC also appear in the literature as an acronym for Generalized Skew-Cauchy, and readers should be aware of this when reviewing the literature. The GSC model can be unimodal or bimodal. However, unlike the distributions discussed in the works mentioned above, one of its parameters can be interpreted as the q-th quantile under certain conditions. This is very convenient, because it allows covariates to be introduced directly to model non-homogeneous populations. The probability density function (pdf) of the GSC distribution is given by
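$$f(x;\mu,\sigma,\lambda,\phi) = \frac{\lambda\cosh\left(\frac{x-\mu}{\sigma}\right)}{\sigma\,\pi\,\Gamma(\phi)\left[1+\left\{\lambda\sinh\left(\frac{x-\mu}{\sigma}\right)\right\}^{2}\right]}\left[-\log\left\{\frac{1}{2}-\frac{1}{\pi}\arctan\left(\lambda\sinh\left(\frac{x-\mu}{\sigma}\right)\right)\right\}\right]^{\phi-1},$$
where $x, \mu \in \mathbb{R}$ and $\sigma, \lambda, \phi > 0$. The corresponding cumulative distribution function (cdf) is given by
$$F(x;\mu,\sigma,\lambda,\phi) = G\left(-\log\left\{\frac{1}{2}-\frac{1}{\pi}\arctan\left(\lambda\sinh\left(\frac{x-\mu}{\sigma}\right)\right)\right\};\phi\right),$$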
where $G(\cdot\,;\phi)$ denotes the cdf of the gamma distribution with shape and scale parameters equal to $\phi$ and 1, respectively. The GSC distribution can be symmetric or asymmetric and, as its main advantage, its cdf has a closed-form expression in terms of the gamma cdf, which is available in most statistical software. This is useful for computing quantiles and for generating random data.
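As an illustration of how the closed-form cdf can be used, the following minimal Python sketch (not the authors' R code) evaluates the GSC pdf and draws random values by inverse transform. If $Y$ follows a gamma distribution with shape $\phi$ and scale 1, inverting the argument of $G$ in the cdf gives $X = \mu + \sigma\,\mathrm{asinh}\{\tan(\pi(1/2 - e^{-Y}))/\lambda\}$. The function names `gsc_pdf` and `gsc_rvs` are hypothetical, and only the Python standard library is assumed.

```python
import math
import random


def gsc_pdf(x, mu, sigma, lam, phi):
    """GSC density at x, evaluated on the log scale for numerical stability."""
    z = (x - mu) / sigma
    s = lam * math.sinh(z)
    u = 0.5 - math.atan(s) / math.pi      # lies in (0, 1)
    w = -math.log(u)                      # argument of the gamma cdf G
    log_f = (math.log(lam) + math.log(math.cosh(z)) - math.log(sigma)
             - math.log(math.pi) - math.lgamma(phi)
             - math.log1p(s * s) + (phi - 1.0) * math.log(w))
    return math.exp(log_f)


def gsc_rvs(n, mu, sigma, lam, phi, rng=None):
    """Draw n GSC values by transforming gamma(phi, 1) variates (illustrative helper)."""
    rng = rng or random.Random()
    draws = []
    for _ in range(n):
        y = rng.gammavariate(phi, 1.0)                  # Y ~ gamma(shape=phi, scale=1)
        t = math.tan(math.pi * (0.5 - math.exp(-y)))    # equals lam * sinh(z)
        draws.append(mu + sigma * math.asinh(t / lam))
    return draws


if __name__ == "__main__":
    sample = gsc_rvs(5, mu=0.0, sigma=1.0, lam=0.4, phi=1.0, rng=random.Random(1))
    print([round(v, 3) for v in sample])
```

For $\phi = 1$, about half of the simulated values should fall below $\mu$, which gives a quick sanity check of the sampler.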
Regression models seek to describe the behavior of a variable of interest (the response) in terms of covariates (explanatory variables). In general, a link function relates a characteristic of the response variable to the explanatory variables through parameters estimated from observed data. In our case, the response variable is bimodal and described by the GSC distribution, while the relationship between the response and the explanatory variables is established through the quantile. This type of relation is known as quantile regression. The literature on parametric models in the context of quantile regression has increased considerably in recent years. For instance, for responses in the unit interval, see the works of [14,15]; for responses on the positive real line, see the work of [16]; and for responses on the real line, see the works of [17,18].
One advantage of quantile regression is that it is more robust against outliers. A further advantage is that it gives a more informative description of the response than simply modeling a single measure of the population, such as the mean or median. The applicability of this method can be seen in ecology [19], econometrics [20], environmetrics [21], and medicine [22], for instance. In general, distributions are not parametrized directly in terms of a general quantile $q \in (0,1)$ or any specific quantile, except for some particular cases. For instance, for the $N(\theta,\sigma^2)$ model, $\theta$ represents the mean and the median of the distribution, but the q-th quantile is given by $\theta + z_q\sigma$, where $z_q$ is the q-th quantile of the standard normal distribution. On the other hand, which quantile is of interest depends on the research question. In some cases, small quantiles will be of interest, whereas in other contexts, large quantiles will be the focus. For instance, in a nutrition context, larger quantiles of weight are of interest because they enable the nutritionist to identify which patients are at higher risk; on this basis, special treatments can be defined for such patients.
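To make the normal example concrete, the following short Python sketch computes $\theta + z_q\sigma$ for a few quantiles using the standard library. The helper name `normal_quantile` and the numerical values of $\theta$ and $\sigma$ are illustrative assumptions only.

```python
from statistics import NormalDist


def normal_quantile(q, theta, sigma):
    """q-th quantile of N(theta, sigma^2), computed as theta + z_q * sigma."""
    z_q = NormalDist().inv_cdf(q)   # q-th quantile of the standard normal
    return theta + z_q * sigma


if __name__ == "__main__":
    # theta is the median (q = 0.5); other quantiles also depend on sigma.
    for q in (0.10, 0.50, 0.90):
        print(q, round(normal_quantile(q, theta=20.0, sigma=4.0), 3))
```

Only for $q = 0.5$ does the quantile coincide with $\theta$, which is why a parametrization indexed directly by the quantile of interest is convenient for regression.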
In view of the importance of bimodal distributions, the chief object of this paper is to build on the quantile regression structure in the GSC distribution. Double regression has been studied quite extensively in the literature; for example, the authors of [23] consider a regression structure for both components based on a new parameterization indexed by mean and dispersion parameters; in [24], a regression model is proposed that is useful for situations where the variable of interest is continuous and restricted to the positive real line and is related to other variables through the mean and precision parameters; and in [25], a new parameterization of the gamma distribution is used that is indexed by mode and precision parameters.
The paper is organized as follows. In Section 2, we develop the GSC regression model with its properties. In Section 3, we perform a small-scale simulation and evaluate the point estimation. An application to real data, which illustrates the usefulness of the proposed model, is discussed in Section 4. Finally, conclusions are given in Section 5.

2. The GSC Regression Model

Gómez et al. [13] show that $F(\mu;\mu,\sigma,\lambda,\phi) = G(\log(2);\phi)$ and then, for a fixed $q \in (0,1)$ and $\phi$ such that
$$G(\log(2);\phi) = q, \tag{1}$$
the parameter $\mu$ represents the q-th quantile of the distribution. For a given $q$, Equation (1) can be solved numerically for $\phi$. Henceforth, we use the notation $\mathrm{GSC}_q(\mu,\sigma,\lambda)$ to refer to a random variable with GSC distribution where $\phi$ is fixed as in (1), $\mu$ is the q-th quantile of the distribution, $\sigma$ is a scale parameter and $\lambda$ is a shape parameter. Figure 1 shows the relation between $q$ and $\phi$ and the regions where the model is unimodal or bimodal, depending on the modeled quantile $q$ and the parameter $\lambda$. Note that, for each $q \in (0,1)$, there exist $(\lambda_1,\lambda_2) \in \mathbb{R}_+^2$ such that the model is unimodal for $\lambda_1$ and bimodal for $\lambda_2$. The next proposition states a property related to the unimodality of the model.
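Since Equation (1) has no closed-form solution in $\phi$ for general $q$, a numerical root search is needed. The sketch below is one possible way to do this in Python with the standard library only: it evaluates the gamma cdf at $\log(2)$ through its power series and then bisects on $\phi$, using the fact that the gamma cdf at a fixed point decreases as the shape increases. The names `gamma_cdf` and `phi_for_quantile`, as well as the search bounds, are assumptions of this sketch and not part of the original implementation.

```python
import math


def gamma_cdf(x, shape):
    """Gamma(shape, scale=1) cdf at x > 0 via the series of the incomplete gamma function."""
    term = math.exp(shape * math.log(x) - x - math.lgamma(shape + 1.0))
    total, n = term, 0
    while term > 1e-17 * total and n < 10_000:
        n += 1
        term *= x / (shape + n)
        total += term
    return total


def phi_for_quantile(q, lo=1e-8, hi=1e3, iters=200):
    """Solve G(log 2; phi) = q for phi by bisection on the log scale."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie in (0, 1)")
    x = math.log(2.0)
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if gamma_cdf(x, mid) > q:   # cdf too large: a larger shape is needed
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


if __name__ == "__main__":
    for q in (0.10, 0.25, 0.50, 0.75, 0.90):
        print(q, phi_for_quantile(q))
```

For $q = 0.5$ the solution is $\phi = 1$, because $G(\log 2; 1) = 1 - e^{-\log 2} = 1/2$; this gives a simple check of the routine. Quantiles extremely close to 0 or 1 may require wider search bounds.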
Proposition 1.
The $\mathrm{GSC}_{0.5}(\mu,\sigma,\lambda)$ model is unimodal for $\lambda \ge 1/\sqrt{2}$ and bimodal for $0 < \lambda < 1/\sqrt{2}$.
Proof. 
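Since $\mu$ and $\sigma$ are location and scale parameters, without loss of generality, we can consider $\mu = 0$ and $\sigma = 1$. For $q = 0.5$, Equation (1) gives $\phi = 1$. Differentiating the pdf of the $\mathrm{GSC}_{0.5}(\mu=0,\sigma=1,\lambda)$ model with respect to $x$, we obtain
$$f'(x) = \frac{\partial f(x)}{\partial x} = \frac{\sinh(x)\,\{1-2\lambda^2-\lambda^2\sinh^2(x)\}}{\cosh(x)\,\{1+\lambda^2\sinh^2(x)\}}\, f(x).$$
Note that $f'(x) = 0$ if and only if $\sinh(x) = 0$ or $1-2\lambda^2-\lambda^2\sinh^2(x) = 0$. From the last equation, it follows that $\sinh(x) = \pm\sqrt{1-2\lambda^2}/\lambda$, which has two solutions if $0 < \lambda < 1/\sqrt{2}$, one solution if $\lambda = 1/\sqrt{2}$ (namely $x = 0$), and no solution if $\lambda > 1/\sqrt{2}$. Therefore,
  • For $0 < \lambda < 1/\sqrt{2}$, the equation $f'(x) = 0$ has three solutions. In this case, $f'(x) > 0$, $\forall x \in \left(-\infty, -\mathrm{asinh}\left(\sqrt{1-2\lambda^2}/\lambda\right)\right) \cup \left(0, \mathrm{asinh}\left(\sqrt{1-2\lambda^2}/\lambda\right)\right)$ and $f'(x) < 0$, $\forall x \in \left(-\mathrm{asinh}\left(\sqrt{1-2\lambda^2}/\lambda\right), 0\right) \cup \left(\mathrm{asinh}\left(\sqrt{1-2\lambda^2}/\lambda\right), +\infty\right)$. Then, $\pm\,\mathrm{asinh}\left(\sqrt{1-2\lambda^2}/\lambda\right)$ are the two modes of the distribution.
  • For $\lambda \ge 1/\sqrt{2}$, the equation $f'(x) = 0$ has one solution. In this case, $f'(x) > 0$, $\forall x < 0$ and $f'(x) < 0$, $\forall x > 0$. Then, $x = 0$ is the only mode of the distribution.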
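The statement of Proposition 1 can be checked numerically. The following Python sketch returns the mode locations implied by the proof for the median case ($\phi = 1$) and evaluates the corresponding density, which for $\mu = 0$ and $\sigma = 1$ reduces to $\lambda\cosh(x)/[\pi\{1+\lambda^2\sinh^2(x)\}]$. The function names are hypothetical and the code is only an illustration of the result, not part of the original analysis.

```python
import math


def gsc_median_modes(lam, mu=0.0, sigma=1.0):
    """Mode(s) of the GSC_0.5(mu, sigma, lam) model according to Proposition 1."""
    if lam <= 0.0:
        raise ValueError("lam must be positive")
    disc = 1.0 - 2.0 * lam * lam
    if disc <= 0.0:                 # lam >= 1/sqrt(2): unimodal at mu
        return [mu]
    a = math.asinh(math.sqrt(disc) / lam)
    return [mu - sigma * a, mu + sigma * a]


def gsc_median_pdf(x, lam):
    """Density of GSC_0.5(0, 1, lam), i.e., the GSC pdf with phi = 1."""
    return lam * math.cosh(x) / (math.pi * (1.0 + (lam * math.sinh(x)) ** 2))


if __name__ == "__main__":
    for lam in (0.3, 0.5, 1.0):
        modes = gsc_median_modes(lam)
        print(lam, [round(m, 4) for m in modes],
              [round(gsc_median_pdf(m, lam), 4) for m in modes])
```

In the bimodal case the density at either mode exceeds the density at $x = 0$, which is then a local minimum (antimode).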
Suppose now that we are interested in modeling the q-th quantile of the distribution for a non-homogeneous population. We assume that, for a fixed $q \in (0,1)$, the q-th quantile of the distribution $\mu$ and the scale parameter $\sigma$ satisfy the following functional relations:
$$\mu_i(q) = \mathbf{x}_{1i}^\top \boldsymbol{\beta}_1(q) \quad \text{and} \quad \log(\sigma_i(q)) = \mathbf{x}_{2i}^\top \boldsymbol{\beta}_2(q),$$
where $\boldsymbol{\beta}_1(q) = (\beta_{11}(q),\ldots,\beta_{1p_1}(q))^\top$ and $\boldsymbol{\beta}_2(q) = (\beta_{21}(q),\ldots,\beta_{2p_2}(q))^\top$ are vectors of unknown regression coefficients such that $(\boldsymbol{\beta}_1(q)^\top, \boldsymbol{\beta}_2(q)^\top)^\top \in \mathbb{R}^{p_1+p_2}$, with $p_1 + p_2 < n$; and $\mathbf{x}_{1i} = (x_{11i},\ldots,x_{1p_1i})^\top$ and $\mathbf{x}_{2i} = (x_{21i},\ldots,x_{2p_2i})^\top$ are the observations on the $p_1$ and $p_2$ known regressors, respectively. Note that the vector $\mathbf{x}_{1i}$ is linked with the $\mu_i(q)$ parameter using the identity link; hence, the covariates in $\mathbf{x}_{1i}$ can be interpreted as in an ordinary linear regression. For instance, if the jth covariate, $1 \le j \le p_1$, is a quantitative variable, then, holding the rest of the covariates fixed, the q-th quantile of the distribution increases by $\beta_{1j}(q)$ units when $x_{1j}$ is increased to $x_{1j} + 1$. Similarly, because the scale parameter is modeled through a log link, holding the rest of the covariates fixed, the scale of the distribution is multiplied by $\exp(\beta_{2k}(q))$ when $x_{2k}$ is increased to $x_{2k} + 1$, $1 \le k \le p_2$. We highlight that in [13], a regression structure was assumed only for $\mu_i(q)$. However, the assumption that all the observations have the same scale parameter could be unrealistic, as each observation could have its own scale. For this reason, it seems reasonable to assume this double regression structure.
In this setting, the log-likelihood function for $\boldsymbol{\psi}(q) = (\boldsymbol{\beta}_1(q)^\top, \boldsymbol{\beta}_2(q)^\top, \lambda)^\top$, up to a constant, is given by
$$\ell(\boldsymbol{\psi}(q)) = \sum_{i=1}^{n}\left\{\log\cosh z_i(q) - \log\left(1+[\lambda\sinh z_i(q)]^2\right) + (\phi-1)\log\left(-\log\left[\frac{1}{2}-\frac{1}{\pi}\arctan\{\lambda\sinh z_i(q)\}\right]\right) - \log\sigma_i(q)\right\} + n\log\lambda,$$
where $z_i(q) = (x_i - \mu_i(q))/\sigma_i(q)$ and $x_i$ denotes the i-th observed response (not to be confused with the covariate vectors $\mathbf{x}_{1i}$ and $\mathbf{x}_{2i}$). The maximum likelihood (ML) estimator of $\boldsymbol{\psi}(q)$, say $\hat{\boldsymbol{\psi}}(q)$, is obtained by maximizing $\ell(\boldsymbol{\psi}(q))$ with respect to $\boldsymbol{\psi}(q)$. For this model, the maximization does not have a closed-form solution, meaning that numerical procedures need to be implemented. Specifically, we use the Broyden–Fletcher–Goldfarb–Shanno (BFGS) quasi-Newton method; see [26] (p. 199). This procedure is implemented in the R software [27]. The programs are available on request. Finally, under regularity conditions, $\hat{\boldsymbol{\psi}}(q)$ satisfies
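$$\left\{-\hat{H}(\hat{\boldsymbol{\psi}})\right\}^{1/2}\left(\hat{\boldsymbol{\psi}}-\boldsymbol{\psi}\right) \xrightarrow{\;D\;} N_{p_1+p_2+1}\left(\mathbf{0}, I_{p_1+p_2+1}\right), \quad \text{as } n \to +\infty, \tag{3}$$
where $N_p(\mathbf{0}, I_p)$ denotes the $p$-variate standard normal distribution and $\hat{H}(\hat{\boldsymbol{\psi}})$ denotes the Hessian matrix of the log-likelihood function with respect to $\boldsymbol{\psi}$, evaluated at $\hat{\boldsymbol{\psi}}$.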
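To illustrate how the double regression structure enters the likelihood, the following Python sketch evaluates the full GSC log-likelihood (including the constants omitted above) for given coefficient vectors, with the identity link for the quantile and the log link for the scale. It is a minimal sketch under stated assumptions and not the authors' R implementation: the function name `gsc_loglik`, the argument layout, and the choice to parametrize the shape through $\log(\lambda)$ are all hypothetical, and $\phi$ is assumed to have been fixed beforehand through Equation (1).

```python
import math


def gsc_loglik(beta1, beta2, log_lam, phi, y, X1, X2):
    """Log-likelihood of the GSC double regression model.

    beta1, beta2 : coefficient lists for the quantile and the log-scale
    log_lam      : logarithm of the shape parameter lambda
    phi          : value fixed in advance for the chosen quantile q
    y            : responses; X1, X2 : lists of covariate rows
    """
    lam = math.exp(log_lam)
    total = 0.0
    for yi, x1, x2 in zip(y, X1, X2):
        mu = sum(b * v for b, v in zip(beta1, x1))               # identity link
        sigma = math.exp(sum(b * v for b, v in zip(beta2, x2)))  # log link
        z = (yi - mu) / sigma
        s = lam * math.sinh(z)
        w = -math.log(0.5 - math.atan(s) / math.pi)
        total += (log_lam + math.log(math.cosh(z)) - math.log(sigma)
                  - math.log(math.pi) - math.lgamma(phi)
                  - math.log1p(s * s) + (phi - 1.0) * math.log(w))
    return total


if __name__ == "__main__":
    # Toy data: intercept plus one covariate in both linear predictors.
    y = [0.3, -1.2, 2.1, 0.8]
    X = [[1.0, -1.0], [1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
    print(gsc_loglik([0.0, 0.5], [0.0, 0.1], math.log(0.5), 1.0, y, X, X))
```

In practice, the negative of this function would be passed to a general-purpose quasi-Newton optimizer, and standard errors would be taken from the inverse of the negative Hessian at the optimum.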

3. Simulation Study

In this section, we present a simulation study to evaluate the performance of the ML estimators in finite samples. The computational procedure is implemented in R [27]. Values from the GSC distribution were drawn using inverse transform sampling. We considered a scheme with two covariates, both simulated from the uniform distribution between −2 and 2. As detailed in Table 1, the scenarios combine the following values: quantile $q$: 0.10, 0.50 and 0.75; vectors for $\boldsymbol{\beta}_1$: $(1, -1, 0.5)$ and $(-0.5, -2, 1)$; vectors for $\boldsymbol{\beta}_2$: $(-1, 1.6, -0.5)$ and $(1, 0.7, -0.3)$; and the parameter $\log(\lambda)$: $-1.39$ and $1.61$. We also considered three sample sizes: 100, 200 and 500. Based on 5000 replicates, we compute the mean of the estimated bias for each estimator (bias), the mean of the estimated standard errors (SE), the root of the estimated mean squared error (RMSE), and the coverage probabilities of the 95% confidence intervals (CP). Table 1 summarizes the results. From Table 1, it can be observed that the bias, SE, and RMSE for all the parameters tend toward zero as the sample size increases, which suggests that the ML estimators are consistent. Moreover, the CP values approach the nominal level used in their construction (95%) as the sample size increases, suggesting that the asymptotic distribution in Equation (3) is a reasonable approximation, even in finite samples.
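The four summary measures reported in Table 1 can be computed from the replicated estimates as sketched below in Python. To keep the example self-contained, the replicates come from a toy problem (the sample mean of normal data with its usual standard error) rather than from the GSC model, so the numbers it produces are unrelated to Table 1; the function name `summarize` and all settings are illustrative assumptions.

```python
import math
import random
import statistics


def summarize(estimates, std_errors, truth, z=1.959964):
    """Bias, mean SE, RMSE and coverage of Wald-type 95% intervals."""
    m = len(estimates)
    bias = sum(e - truth for e in estimates) / m
    mean_se = sum(std_errors) / m
    rmse = math.sqrt(sum((e - truth) ** 2 for e in estimates) / m)
    covered = sum(abs(e - truth) <= z * s for e, s in zip(estimates, std_errors))
    return bias, mean_se, rmse, covered / m


def toy_replicates(truth, n, reps, rng):
    """Toy stand-in for the ML fits: sample mean and its estimated standard error."""
    ests, ses = [], []
    for _ in range(reps):
        sample = [rng.gauss(truth, 1.0) for _ in range(n)]
        ests.append(statistics.fmean(sample))
        ses.append(statistics.stdev(sample) / math.sqrt(n))
    return ests, ses


if __name__ == "__main__":
    rng = random.Random(2021)
    for n in (100, 200, 500):
        ests, ses = toy_replicates(0.5, n, 2000, rng)
        print(n, [round(v, 4) for v in summarize(ests, ses, 0.5)])
```

When the SE is well calibrated, the mean SE and the RMSE are close to each other and the coverage is close to 0.95, which is the pattern observed for the largest sample size in Table 1.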

4. Application

To illustrate the GSC double regression model, we consider the Australian athletes data set available in the R package sn [28], which includes data on 202 athletes collected at the Australian Institute of Sport. The computations were carried out in R [27], and the code is available upon request. Our main aim is to explain the body fat percentage (Bfat) in terms of the body mass index (bmi) and the lean body mass (lbm). In particular, we consider $\mathrm{Bfat}_i \sim \mathrm{GSC}(\mu_i(q), \sigma_i(q), \lambda, \phi(q))$, where $\phi(q)$ satisfies (1), $q \in (0,1)$ and, for $i = 1,\ldots,202$, we have that
$$\mu_i(q) = \beta_{11}(q) + \beta_{12}(q)\,\mathrm{bmi}_i + \beta_{13}(q)\,\mathrm{lbm}_i \quad \text{and} \quad \log(\sigma_i(q)) = \beta_{21}(q) + \beta_{22}(q)\,\mathrm{bmi}_i + \beta_{23}(q)\,\mathrm{lbm}_i.$$
In other words, the bmi and lbm explain both the q-th quantile of Bfat and the scale of the distribution. The same structure of covariates was considered in [13], but without modeling the scale parameter; i.e., considering $\beta_{22}(q) = \beta_{23}(q) = 0$. We refer to those models as GSC and GSC$_0$ for the cases where $\sigma$ is modeled and not modeled, respectively. We considered $q \in \{0.1, 0.25, 0.5, 0.75, 0.9\}$. Our approach is compared with the skewed Laplace (SKL) model in [29]. Table 2 shows the Akaike information criterion (AIC [30]) for the three models. We also present the statistic of the likelihood ratio test (LRT) for $H_0: \beta_{22}(q) = \beta_{23}(q) = 0$ versus $H_1: \beta_{22}(q) \neq 0$ or $\beta_{23}(q) \neq 0$, for all the quantiles considered. In addition, we compute the quantile residuals [31] for the GSC model. If the model is correctly specified, the residuals should behave as a random sample from the standard normal distribution. We checked this assumption with the traditional Kolmogorov–Smirnov (KS) test. Note that the GSC model presents the lowest AIC for the quantiles up to the median, and GSC$_0$ presents the lowest AIC for the rest of the quantiles. This is explained by the fact that, according to the LRT, the coefficients related to the bmi and lbm variables are not significant (at any usual significance level) for modeling the scale parameter for $q = 0.75$ and $q = 0.9$, while they are significant for the rest of the quantiles. Finally, based on the quantile residuals, the GSC double regression model seems to be appropriate for modeling all quantiles except the largest. Figure 2 also shows the regression coefficients in terms of the quantiles and their respective 95% confidence intervals. Note that $\beta_{12}(q)$ and $\beta_{13}(q)$ are significant (at the 5% level) for all the quantiles considered; i.e., the bmi and lbm variables are relevant for explaining the different quantiles of Bfat. Specifically, we can obtain the following interpretations for $\beta_{12}(q)$ and $\beta_{13}(q)$:
  • (Interpreting $\beta_{12}(q)$) For a fixed lbm, for athletes in the lowest 10% of Bfat, the Bfat is increased by 0.8572 units (95% confidence interval 0.4191; 1.2954) for each unit increase in bmi, and for athletes in the highest 10% of Bfat, the Bfat is increased by 2.5834 units (95% confidence interval 2.3227; 2.8442) for each unit increase in bmi.
  • (Interpreting $\beta_{13}(q)$) For a fixed bmi, for athletes in the lowest 10% of Bfat, the Bfat is decreased by 0.3039 units (95% confidence interval for the coefficient −0.4093; −0.1984) for each unit increase in lbm, and for athletes in the highest 10% of Bfat, the Bfat is decreased by 0.5511 units (95% confidence interval for the coefficient −0.6077; −0.4945) for each unit increase in lbm.
We highlight the large difference between the interpretations for athletes in the lowest 10% and highest 10% of Bfat.
Figure 3 shows the estimated pdfs of Bfat under different scenarios for bmi and lbm. These estimated pdfs take different shapes (unimodal, bimodal symmetric, and bimodal asymmetric), which justifies the use of the double regression GSC model in this example. Finally, Figure 4 shows the pairs $(\lambda, q)$ for the five quantiles modeled, identifying the unimodal and bimodal cases.
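The LRT values in Table 2 can be reproduced, up to rounding, from the reported log-likelihoods. Under $H_0$ the statistic is referred to a chi-square distribution with two degrees of freedom (two restrictions), whose survival function is $\exp(-x/2)$. The following Python sketch performs this calculation; the helper name `lrt_two_restrictions` is hypothetical and the log-likelihood values are copied from Table 2.

```python
import math


def lrt_two_restrictions(loglik_null, loglik_full):
    """LRT statistic and chi-square(2) p-value for H0: beta_22 = beta_23 = 0."""
    stat = 2.0 * (loglik_full - loglik_null)
    p_value = math.exp(-stat / 2.0)   # survival function of chi-square with 2 df
    return stat, p_value


# (log-likelihood of GSC_0, log-likelihood of GSC) for each quantile, from Table 2.
TABLE2_LOGLIK = {
    0.10: (-574.27, -563.32),
    0.25: (-576.36, -568.27),
    0.50: (-577.37, -571.91),
    0.75: (-575.75, -574.25),
    0.90: (-601.85, -600.88),
}

if __name__ == "__main__":
    for q, (l0, l1) in TABLE2_LOGLIK.items():
        stat, p = lrt_two_restrictions(l0, l1)
        print(f"q = {q:.2f}: LRT = {stat:.2f}, p-value = {p:.4f}")
```

Small differences in the last digit with respect to Table 2 are expected, because the log-likelihoods are reported with only two decimals.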

5. Conclusions

In this paper, we present a new extension of the GSC model, introducing a double regression structure to model both the quantile and the scale of the distribution. This structure produces a competitive model for heterogeneous populations with different shapes: unimodal symmetric, unimodal asymmetric, bimodal symmetric, and bimodal asymmetric. In the illustration with a real data set, the proposed model attains a lower AIC than the SKL model for all the quantiles considered, and a lower AIC than the GSC model without regression on the scale for the quantiles up to the median. A limitation of the model is that the shape parameter is common to all the observations. Further extensions should also include covariates in this parameter.

Author Contributions

Conceptualization, Y.M.G. and D.I.G.; methodology, Y.M.G. and D.I.G.; software, Y.M.G. and D.I.G.; validation, O.V. and T.M.M.; formal analysis, Y.M.G. and D.I.G.; investigation, O.V. and T.M.M.; resources, O.V. and T.M.M.; data curation, Y.M.G. and D.I.G.; writing—original draft preparation, Y.M.G. and D.I.G.; writing—review and editing, O.V. and T.M.M. All authors have read and agreed to the published version of the manuscript.

Funding

Nothing to declare.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

For data, we refer the reader to [28].

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ashman, K.M.; Bird, C.M.; Zepf, S.E. Detecting bimodality in astronomical datasets. Astron. J. 1994, 108, 2348–2361.
  2. De Michele, C.; Accatino, F. Tree cover bimodality in savannas and forests emerging from the switching between two fire dynamics. PLoS ONE 2014, 9, e91195.
  3. Wang, J.; Wen, S.; Symmans, W.F.; Pusztai, L.; Coombes, K.R. The bimodality index: A criterion for discovering and ranking bimodal signatures from cancer gene expression profiling data. Cancer Inform. 2009, 7, 199–216.
  4. Freeman, J.B.; Dale, R. Assessing bimodality to detect the presence of a dual cognitive process. Behav. Res. Methods 2013, 45, 83–97.
  5. Sambrook-Smith, G.H.; Nicholas, A.P.; Ferguson, R.I. Measuring and defining bimodal sediments: Problems and implications. Water Resour. Res. 1997, 33, 1179–1195.
  6. Sturrock, P.A. Analysis of bimodality in histograms formed from GALLEX and GNO solar neutrino data. Sol. Phys. 2008, 249, 1–10.
  7. Gómez, H.W.; Elal-Olivero, D.; Salinas, H.S.; Bolfarine, H. Bimodal extension based on the skew-normal distribution with application to pollen data. Environmetrics 2018, 22, 50–62.
  8. Venegas, O.; Salinas, H.S.; Gallardo, D.I.; Bolfarine, H.; Gómez, H.W. Bimodality based on the generalized skew-normal distribution. J. Stat. Comput. Simul. 2018, 88, 156–181.
  9. Butt, N.S.; Khalil, M.G. A new bimodal distribution for modeling asymmetric bimodal heavy-tail real lifetime data. Symmetry 2020, 12, 2058.
  10. Iriarte, Y.A.; de Castro, M.; Gómez, H.W. A Unimodal/Bimodal Skew/Symmetric Distribution Generated from Lambert’s Transformation. Symmetry 2021, 13, 269.
  11. Reyes, J.; Gómez-Déniz, E.; Gómez, H.W.; Calderín-Ojeda, E. A bimodal extension of the exponential distribution with applications in risk theory. Symmetry 2021, 13, 679.
  12. Reyes, J.; Arrué, J.; Leiva, V.; Martín-Barreiro, C. A new Birnbaum–Saunders distribution and its mathematical features applied to bimodal real-world data from environment and medicine. Mathematics 2021, 9, 1891.
  13. Gómez, Y.; Gómez-Déniz, E.; Venegas, O.; Gallardo, D.I.; Gómez, H.W. An asymmetric bimodal distribution with application to quantile regression. Symmetry 2019, 11, 899.
  14. Bayes, C.L.; García, C. A new robust regression model for proportions. Bayesian Anal. 2012, 7, 841–866.
  15. Bayes, C.L.; Bazán, J.L.; de Castro, M. A quantile parametric mixed regression model for bounded response variables. Stat. Interface 2017, 10, 483–493.
  16. Sánchez, L.; Leiva, V.; Galea, M.; Saulo, H. Birnbaum-Saunders quantile regression and its diagnostics with application to economic data. Appl. Stoch. Model. Bus. Ind. 2021, 37, 53–73.
  17. Bernardi, M.; Bottone, M.; Petrella, L. Bayesian quantile regression using the skew exponential power distribution. Comput. Stat. Data Anal. 2018, 126, 92–111.
  18. Korkmaz, M.Ç.; Chesneau, C.; Korkmaz, Z.S. On the arcsecant hyperbolic normal distribution. Properties, quantile regression modeling and applications. Symmetry 2021, 13, 117.
  19. Cade, B.S.; Noon, B.R. A gentle introduction to quantile regression for ecologists. Front. Ecol. Environ. 2003, 1, 412–420.
  20. Xiao, Z.; Guo, H.; Lam, M.S. Quantile regression and value at risk. In Handbook of Financial Econometrics and Statistics; Lee, C.F., Lee, J., Eds.; Springer: New York, NY, USA, 2015; pp. 1143–1167.
  21. Alencar, A.P.; Santos, B.R. Association of pollution with quantiles and expectations of the hospitalization rate of elderly people by respiratory diseases in the city of São Paulo, Brazil. Environmetrics 2014, 25, 165–171.
  22. Wei, Y.; Pere, A.; Koenker, R.; He, X. Quantile regression methods for reference growth charts. Stat. Med. 2005, 25, 1369–1382.
  23. Bourguignon, M.; Gallardo, D.I.; Medeiros, R.M.R. A simple and useful regression model for underdispersed count data based on Bernoulli–Poisson convolution. Stat. Pap. 2021, in press.
  24. Bourguignon, M.; Santos-Neto, M.; de Castro, M. A new regression model for positive random variables with skewed and long tail. METRON 2021, 79, 33–55.
  25. Bourguignon, M.; Leão, J.; Gallardo, D.I. Parametric modal regression with varying precision. Biom. J. 2020, 62, 2002–2020.
  26. Mittelhammer, R.C.; Judge, G.G.; Miller, D.J. Econometric Foundations; Cambridge University Press: New York, NY, USA, 2000.
  27. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2021.
  28. Azzalini, A. The R Package sn: The Skew-Normal and Related Distributions Such as the Skew-t and the SUN (Version 2.0.0); Università di Padova: Padova, Italy, 2021.
  29. Galarza, C.E.; Lachos, V.H.; Barbosa, C.; Castro, L.M. Robust quantile regression using a generalized class of skewed distributions. Stat 2017, 6, 113–130.
  30. Akaike, H. A new look at the statistical model identification. IEEE Trans. Autom. Control 1974, 19, 379–397.
  31. Dunn, P.K.; Smyth, G.K. Randomized Quantile Residuals. J. Comput. Graph. Stat. 1996, 5, 236–244.
Figure 1. (a) Relation between q and ϕ and (b) regions of unimodality and bimodality for the GSC model in terms of q and λ .
Figure 2. Estimated parameters for regression coefficients (and 95% confidence intervals) for different quantile regression models in the athlete data set.
Figure 3. Estimated density function for different quantiles of Bfat under different combinations of bmi and lbm: (a) q = 0.25, bmi = 32, lbm = 40; (b) q = 0.50, bmi = 32, lbm = 40; (c) q = 0.25, bmi = 30, lbm = 80; and (d) q = 0.75, bmi = 30, lbm = 80.
Figure 4. Points of unimodality and bimodality for the GSC model in the athlete data set for the different quantiles modeled.
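Table 1. Estimated bias, SE, RMSE, and 95% CP for the ML estimators for the GSC double regression model under different scenarios based on 5000 Monte Carlo replicates. The scenario ($q$, $\beta_1$, $\beta_2$, $\log(\lambda)$) is given in the first row of each block.

| q | β1 | β2 | log(λ) | Parameter | Bias (n = 100) | SE (n = 100) | RMSE (n = 100) | CP (n = 100) | Bias (n = 200) | SE (n = 200) | RMSE (n = 200) | CP (n = 200) | Bias (n = 500) | SE (n = 500) | RMSE (n = 500) | CP (n = 500) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.10 | (1, −1, 0.5) | (−1, 1.6, −0.5) | −1.39 | β10 | 0.0146 | 0.0514 | 0.0648 | 0.8694 | 0.0066 | 0.0329 | 0.0363 | 0.9198 | 0.0025 | 0.0200 | 0.0208 | 0.9394 |
| | | | | β11 | 0.0060 | 0.0275 | 0.0340 | 0.8786 | 0.0029 | 0.0177 | 0.0191 | 0.9246 | 0.0010 | 0.0106 | 0.0110 | 0.9398 |
| | | | | β12 | −0.0011 | 0.0078 | 0.0099 | 0.8676 | −0.0003 | 0.0058 | 0.0066 | 0.9086 | −0.0002 | 0.0038 | 0.0040 | 0.9410 |
| | | | | β20 | −0.0544 | 0.1004 | 0.1184 | 0.9006 | −0.0257 | 0.0699 | 0.0750 | 0.9314 | −0.0085 | 0.0438 | 0.0449 | 0.9424 |
| | | | | β21 | 0.0194 | 0.0505 | 0.0598 | 0.9040 | 0.0083 | 0.0357 | 0.0385 | 0.9314 | 0.0033 | 0.0224 | 0.0236 | 0.9390 |
| | | | | β22 | −0.0089 | 0.0457 | 0.0503 | 0.9222 | −0.0051 | 0.0332 | 0.0359 | 0.9268 | −0.0009 | 0.0206 | 0.0210 | 0.9462 |
| | | | | log(λ) | −0.0983 | 0.3151 | 0.3520 | 0.9332 | −0.0498 | 0.2163 | 0.2273 | 0.9460 | −0.0162 | 0.1340 | 0.1360 | 0.9498 |
| 0.10 | (1, −1, 0.5) | (−1, 1.6, −0.5) | 1.61 | β10 | 0.0003 | 0.0449 | 0.0536 | 0.8920 | −0.0002 | 0.0287 | 0.0310 | 0.9310 | 0.0002 | 0.0175 | 0.0179 | 0.9428 |
| | | | | β11 | 0.0003 | 0.0255 | 0.0311 | 0.8866 | −0.0001 | 0.0164 | 0.0181 | 0.9222 | 0.0001 | 0.0099 | 0.0102 | 0.9410 |
| | | | | β12 | −0.0001 | 0.0069 | 0.0098 | 0.8308 | 0.0001 | 0.0053 | 0.0064 | 0.8950 | 0.0000 | 0.0036 | 0.0039 | 0.9334 |
| | | | | β20 | −0.0627 | 0.1116 | 0.1311 | 0.8904 | −0.0315 | 0.0791 | 0.0863 | 0.9162 | −0.0113 | 0.0501 | 0.0514 | 0.9398 |
| | | | | β21 | 0.0113 | 0.0460 | 0.0504 | 0.9264 | 0.0047 | 0.0337 | 0.0352 | 0.9386 | 0.0014 | 0.0215 | 0.0216 | 0.9472 |
| | | | | β22 | −0.0058 | 0.0467 | 0.0503 | 0.9280 | −0.0023 | 0.0336 | 0.0350 | 0.9388 | −0.0012 | 0.0211 | 0.0215 | 0.9464 |
| | | | | log(λ) | −0.1492 | 0.3012 | 0.3461 | 0.9266 | −0.0777 | 0.2073 | 0.2277 | 0.9312 | −0.0274 | 0.1288 | 0.1333 | 0.9446 |
| 0.50 | (−0.5, −2, 1) | (−1, 1.6, −0.5) | −1.39 | β10 | 0.0126 | 0.0512 | 0.0636 | 0.8744 | 0.0073 | 0.0331 | 0.0368 | 0.9182 | 0.0023 | 0.0200 | 0.0204 | 0.9398 |
| | | | | β11 | 0.0050 | 0.0275 | 0.0338 | 0.8812 | 0.0032 | 0.0178 | 0.0195 | 0.9230 | 0.0009 | 0.0106 | 0.0108 | 0.9422 |
| | | | | β12 | −0.0011 | 0.0077 | 0.0098 | 0.8628 | −0.0005 | 0.0058 | 0.0067 | 0.9118 | −0.0003 | 0.0039 | 0.0041 | 0.9350 |
| | | | | β20 | −0.0576 | 0.1001 | 0.1197 | 0.8924 | −0.0239 | 0.0700 | 0.0748 | 0.9288 | −0.0087 | 0.0438 | 0.0446 | 0.9446 |
| | | | | β21 | 0.0186 | 0.0499 | 0.0586 | 0.9074 | 0.0080 | 0.0358 | 0.0381 | 0.9342 | 0.0036 | 0.0224 | 0.0236 | 0.9376 |
| | | | | β22 | −0.0084 | 0.0455 | 0.0489 | 0.9274 | −0.0044 | 0.0333 | 0.0352 | 0.9346 | −0.0011 | 0.0207 | 0.0207 | 0.9492 |
| | | | | log(λ) | −0.1142 | 0.3149 | 0.3507 | 0.9382 | −0.0432 | 0.2162 | 0.2242 | 0.9448 | −0.0154 | 0.1340 | 0.1391 | 0.9422 |
| 0.50 | (−0.5, −2, 1) | (1, 0.7, −0.3) | −1.39 | β10 | −0.0068 | 0.0529 | 0.0668 | 0.8808 | −0.0020 | 0.0346 | 0.0384 | 0.9198 | −0.0008 | 0.0210 | 0.0220 | 0.9428 |
| | | | | β11 | −0.0025 | 0.0297 | 0.0382 | 0.8720 | −0.0005 | 0.0196 | 0.0223 | 0.9164 | −0.0003 | 0.0118 | 0.0124 | 0.9428 |
| | | | | β12 | 0.0002 | 0.0082 | 0.0121 | 0.8276 | 0.0001 | 0.0063 | 0.0079 | 0.8782 | 0.0000 | 0.0043 | 0.0047 | 0.9342 |
| | | | | β20 | −0.0667 | 0.1081 | 0.1297 | 0.8862 | −0.0275 | 0.0765 | 0.0813 | 0.9306 | −0.0102 | 0.0482 | 0.0497 | 0.9402 |
| | | | | β21 | 0.0154 | 0.0511 | 0.0574 | 0.9182 | 0.0048 | 0.0375 | 0.0393 | 0.9352 | 0.0023 | 0.0238 | 0.0245 | 0.9430 |
| | | | | β22 | −0.0070 | 0.0512 | 0.0560 | 0.9208 | −0.0047 | 0.0371 | 0.0386 | 0.9426 | −0.0010 | 0.0232 | 0.0234 | 0.9510 |
| | | | | log(λ) | −0.1579 | 0.3161 | 0.3625 | 0.9240 | −0.0694 | 0.2164 | 0.2292 | 0.9398 | −0.0227 | 0.1342 | 0.1375 | 0.9440 |
| 0.75 | (1, −1, 0.5) | (1, 0.7, −0.3) | 1.61 | β10 | −0.0061 | 0.5813 | 0.6422 | 0.9130 | −0.0004 | 0.4112 | 0.4314 | 0.9352 | −0.0020 | 0.2552 | 0.2604 | 0.9442 |
| | | | | β11 | 0.0032 | 0.3709 | 0.4267 | 0.9004 | 0.0015 | 0.2696 | 0.2875 | 0.9296 | 0.0000 | 0.1632 | 0.1668 | 0.9426 |
| | | | | β12 | 0.0059 | 0.2288 | 0.2930 | 0.8690 | 0.0061 | 0.1650 | 0.1863 | 0.9152 | −0.0001 | 0.1074 | 0.1115 | 0.9414 |
| | | | | β20 | −0.0629 | 0.1118 | 0.1297 | 0.8912 | −0.0292 | 0.0792 | 0.0866 | 0.9180 | −0.0127 | 0.0500 | 0.0510 | 0.9422 |
| | | | | β21 | 0.0079 | 0.0458 | 0.0497 | 0.9274 | 0.0033 | 0.0338 | 0.0351 | 0.9400 | 0.0019 | 0.0214 | 0.0219 | 0.9490 |
| | | | | β22 | −0.0037 | 0.0467 | 0.0503 | 0.9292 | −0.0012 | 0.0337 | 0.0354 | 0.9382 | −0.0007 | 0.0211 | 0.0213 | 0.9462 |
| | | | | log(λ) | −0.1481 | 0.3016 | 0.3408 | 0.9286 | −0.0727 | 0.2073 | 0.2253 | 0.9362 | −0.0313 | 0.1289 | 0.1305 | 0.9508 |
| 0.75 | (−0.5, −2, 1) | (1, 0.7, −0.3) | 1.61 | β10 | −0.0043 | 0.1204 | 0.1302 | 0.9226 | −0.0013 | 0.0834 | 0.0886 | 0.9322 | −0.0007 | 0.0516 | 0.0519 | 0.9474 |
| | | | | β11 | 0.0026 | 0.0772 | 0.0880 | 0.9102 | 0.0009 | 0.0547 | 0.0594 | 0.9242 | 0.0005 | 0.0330 | 0.0333 | 0.9462 |
| | | | | β12 | −0.0013 | 0.0494 | 0.0627 | 0.8624 | −0.0006 | 0.0338 | 0.0366 | 0.9208 | 0.0001 | 0.0217 | 0.0224 | 0.9370 |
| | | | | β20 | −0.0547 | 0.2266 | 0.2380 | 0.9246 | −0.0307 | 0.1588 | 0.1659 | 0.9306 | −0.0099 | 0.1000 | 0.0997 | 0.9500 |
| | | | | β21 | 0.0264 | 0.1123 | 0.1207 | 0.9282 | 0.0098 | 0.0805 | 0.0833 | 0.9414 | 0.0045 | 0.0504 | 0.0517 | 0.9428 |
| | | | | β22 | −0.0142 | 0.1147 | 0.1221 | 0.9342 | −0.0080 | 0.0803 | 0.0843 | 0.9382 | −0.0014 | 0.0496 | 0.0508 | 0.9470 |
| | | | | log(λ) | −0.0064 | 0.2947 | 0.2999 | 0.9452 | −0.0107 | 0.2041 | 0.2084 | 0.9432 | −0.0028 | 0.1282 | 0.1277 | 0.9488 |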
Table 2. AIC for the GSC, GSC$_0$, and SKL models in the athlete data set for different quantiles. We also present the log-likelihood of the GSC$_0$ and GSC models, the LRT statistic and p-value, and the KS p-value.

| Quantile (τ) | AIC: GSC$_0$ | AIC: GSC | AIC: SKL | Log-likelihood: GSC$_0$ | Log-likelihood: GSC | LRT statistic | LRT p-value | KS p-value |
|---|---|---|---|---|---|---|---|---|
| 0.10 | 1168.54 | 1154.64 | 1194.28 | −574.27 | −563.32 | 21.90 | <0.0001 | 0.988 |
| 0.25 | 1172.72 | 1164.55 | 1172.70 | −576.36 | −568.27 | 16.17 | 0.0003 | 0.646 |
| 0.50 | 1174.74 | 1171.83 | 1182.66 | −577.37 | −571.91 | 10.91 | 0.0043 | 0.180 |
| 0.75 | 1171.50 | 1176.51 | 1221.65 | −575.75 | −574.25 | 3.00 | 0.2235 | 0.839 |
| 0.90 | 1223.71 | 1229.75 | 1280.45 | −601.85 | −600.88 | 1.95 | 0.3768 | 0.004 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
