Article

Extended Generalized Sinh-Normal Distribution

by Guillermo Martínez-Flórez 1,†, David Elal-Olivero 2 and Carlos Barrera-Causil 3,*
1 Departamento de Matemática y Estadística, Facultad de Ciencias Básicas, Universidad de Córdoba, Montería 230001, Colombia
2 Departamento de Matemática, Facultad de Ingeniería, Universidad de Atacama, Copiapó 1530000, Chile
3 Grupo de Investigación Davinci, Facultad de Ciencias Exactas y Aplicadas, Instituto Tecnológico Metropolitano, Medellín 050034, Colombia
* Author to whom correspondence should be addressed.
† Present address: Programa de Pós-Graduação em Modelagem e Métodos Quantitativos, Universidade Federal do Ceará, Fortaleza 60020-181, Brazil.
Mathematics 2021, 9(21), 2793; https://doi.org/10.3390/math9212793
Submission received: 21 September 2021 / Revised: 20 October 2021 / Accepted: 25 October 2021 / Published: 4 November 2021
(This article belongs to the Section Probability and Statistics)

Abstract

Positively skewed data sets are common in many areas. Data on material fatigue, reaction time, neuronal reaction time, agricultural engineering, and spatial phenomena, among others, need models that match their features while maintaining a good quality of fit. Skewness and bimodality are two features that such data sets can present simultaneously, so flexible statistical models are needed. In this paper, a general extended class of the sinh-normal distribution is presented. Additionally, the asymmetric distribution family is extended, and, as a natural extension of this model, the extended Birnbaum–Saunders distribution is studied as well. The proposed model presents a better goodness of fit than the other models studied.

1. Introduction

When materials are exposed to pressure or stress, structural damage can occur. This is known as material fatigue. A statistical model for the fatigue failure time of materials was proposed by Birnbaum and Saunders (1969) [1]; it is known in the literature as the Birnbaum–Saunders distribution and is generally denoted by $BS(\alpha, \beta)$, where $\alpha > 0$ is a shape parameter and $\beta > 0$ is a scale parameter and the median of the distribution. Later, Desmond (1985) [2] showed that the BS distribution describes the failure time that occurs when some kind of damage has accumulated after a given time.
A distribution associated with the Birnbaum–Saunders (BS) distribution is the sinh-normal (SHN) distribution. This distribution, introduced by Rieck and Nederman (1991) [3], is based on a nonlinear transformation of a normal distribution. Let $Z = \frac{2}{\alpha}\sinh\left(\frac{Y-\xi}{\sigma}\right) \sim N(0,1)$, where $\xi$ and $\sigma$ are location and scale parameters, respectively, and $\alpha$ is a shape parameter. Then, the random variable (r.v.) $Y$ follows a sinh-normal distribution, denoted by $SHN(\alpha, \xi, \sigma)$. The probability density function (pdf) of a random variable with an SHN distribution is given by
$$f(y) = b'_y\,\phi(b_y), \tag{1}$$
where $b_y = b(y, \alpha, \xi, \sigma) = \frac{2}{\alpha}\sinh\left(\frac{y-\xi}{\sigma}\right)$, $b'_y = \frac{2}{\alpha\sigma}\cosh\left(\frac{y-\xi}{\sigma}\right)$ is the derivative of $b_y$ with respect to $y$, and $\phi(\cdot)$ is the pdf of the standard normal distribution. It can be shown that the $s$-th moment of a random variable with an SHN distribution is given by
$$\mu_s = E(Y^s) = \sum_{k=0}^{s}\binom{s}{k}\sigma^k\xi^{s-k}c_k(\alpha),$$
where
$$c_k(\alpha) = \int_{-\infty}^{\infty}\left[\sinh^{-1}\left(\frac{\alpha w}{2}\right)\right]^k\phi(w)\,dw.$$
From this result, we obtain $\mu_1 = E(Y) = \xi$, $\mu_2 = \xi^2 + \sigma^2 c_2(\alpha)$ and $var(Y) = \sigma^2 c_2(\alpha)$. It can also be shown, from the central moments, that $E(Y-\xi)^3 = 0$, so the distribution is symmetric with respect to $\xi$. In general, we find that
$$E(Y-\xi)^s = \sigma^s c_s(\alpha).$$
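The SHN density and the constants $c_k(\alpha)$ can be checked numerically. The following is a minimal sketch in plain Python, not the authors' implementation; the helper names (`shn_pdf`, `c_k`, `simpson`), the use of Simpson's rule, and the truncation of the integrals to a finite range are assumptions made for illustration.

```python
import math

def phi(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def shn_pdf(y, alpha, xi=0.0, sigma=1.0):
    """SHN density b'_y * phi(b_y), with b_y = (2/alpha) * sinh((y - xi)/sigma)."""
    z = (y - xi) / sigma
    b = 2.0 / alpha * math.sinh(z)
    db = 2.0 / (alpha * sigma) * math.cosh(z)
    return db * phi(b)

def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

def c_k(k, alpha):
    """c_k(alpha): integral of asinh(alpha*w/2)^k * phi(w), truncated to [-10, 10]."""
    return simpson(lambda w: math.asinh(alpha * w / 2.0) ** k * phi(w), -10.0, 10.0)
```

For example, with $\alpha = 1.5$, $\xi = 0.5$ and $\sigma = 2$, integrating $(y-\xi)^2 f(y)$ numerically should agree with $\sigma^2 c_2(\alpha)$ up to quadrature error, and `c_k(1, alpha)` should be numerically zero.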
Asymmetric extensions of the SHN model have been considered, for instance, by Leiva et al. (2010) [4] and Lemonte et al. (2011) [5], who studied the skewed SHN model; by Martínez-Flórez et al. (2017) [6], who investigated the power SHN model; and by Moreno-Arenas et al. (2016) [7], who presented the proportional hazard BS (PHBS) model.
The SHN distribution is also known as the log-Birnbaum–Saunders (LBS) distribution because, if $Y \sim SHN(\alpha, \xi, \sigma = 2)$, then $T = \exp(Y)$ follows a BS distribution with parameters $\alpha$ and $\beta = \exp(\xi)$.
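The pdf of a random variable $T \sim BS(\alpha, \beta)$ is given by
$$f(t) = \phi(a_t)\,a'_t, \quad t > 0,$$
where
$$a_t = a_t(\alpha, \beta) = \frac{1}{\alpha}\left(\sqrt{\frac{t}{\beta}} - \sqrt{\frac{\beta}{t}}\right) \quad \text{and} \quad a'_t = a'_t(\alpha, \beta) = \frac{d a_t}{dt} = \frac{t^{-3/2}(t+\beta)}{2\alpha\beta^{1/2}}. \tag{6}$$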
An important feature of this distribution is its robustness with respect to the estimation of its parameters, an aspect that was analyzed by Barros, Paula, and Leiva (2010) [8]. Moreover, extensions of this distribution to an elliptical family and to a skew-elliptical family (the latter is known as the doubly generalized BS distribution) have been studied by Díaz-García et al. (2005) [9] and by Vilca-Labra and Leiva-Sánchez (2006) [10], respectively. Martínez-Flórez et al. [11] presented an extension to a power-skew-elliptical family. Other types of extensions have also been considered by authors such as Castillo et al. (2011) [12], Cordeiro and Lemonte (2014) [13], and Reyes et al. (2018) [14].
All these extensions are suited to skewed unimodal data but are not appropriate for bimodal data. However, Martínez-Flórez et al. (2017) [6] and Olmos et al. (2017) [15] recently presented BS models to fit positive bimodal data. Likewise, Bolfarine et al. (2011) [16] introduced another model to fit positive bimodal data generated by the log-skew-normal distribution.
Cortés et al. (2018) [17] presented a class of extended distributions generated by the pdf, $g(x)$, of the bimodal-normal distribution. Specifically, they defined a general class of distributions with a pdf given by
$$f(x) = \frac{1 + \epsilon\,h(x)}{1 + \epsilon\kappa}\,g(x), \tag{7}$$
where $g(x)$ is a pdf, $\epsilon \geq 0$ is a shape parameter, and $h$ is a continuous positive function such that $\kappa = E_g(h(X)) < \infty$. This family is called the “general class of distributions”. The authors further study the normal, Student's t, Laplace, and BS distributions as special cases. As special cases of this family, from Elal-Olivero et al. (2010) [18] and the bimodal log-skew-normal of Bolfarine et al. (2011) [16], the class of bimodal skew-elliptical distributions can be found in the literature.
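As an illustration of this construction, the sketch below evaluates the general-class density for one simple choice: $g = \phi$ (standard normal) and $h(x) = x^2$, for which $\kappa = E_g(X^2) = 1$. These choices, the function names, and the trapezoidal check are assumptions made for the example; they are not taken from Cortés et al. [17].

```python
import math

def phi(x):
    """Standard normal pdf, used here as the base density g."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def extended_pdf(x, eps, g=phi, h=lambda u: u * u, kappa=1.0):
    """General-class density (1 + eps*h(x)) / (1 + eps*kappa) * g(x).

    kappa must equal E_g[h(X)]; the default kappa = 1 matches
    g = standard normal and h(x) = x^2 (illustrative choice).
    """
    return (1.0 + eps * h(x)) / (1.0 + eps * kappa) * g(x)

def trapezoid(f, lo, hi, n=20000):
    """Plain trapezoidal rule, enough for a sanity check."""
    step = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * step)
    return total * step
```

With `eps = 2`, the density still integrates to one, and its value at $x = 1.5$ exceeds its value at $x = 0$, which is the bimodal shape the extra parameter is meant to produce.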
In this study, we analyze a general extended class of SHN distributions, which adds one parameter to the SHN model and thereby makes it more flexible. Additionally, an extension of the BS model is presented. We highlight that this kind of model could be applied to data sets related to material fatigue [2], reaction and neuronal reaction time [19], agricultural engineering [20], and spatial data (see [21,22], among others). Note that this distribution can fit bimodal data sets, which can arise in problems where the population is divided into groups, such as by gender or by level of HIV-RNA.
This paper unfolds as follows: Section 2 presents the extended sinh-normal distribution (ESHN) model. Section 3 provides the statistical approximation of the moments of an ESHN random variable. Section 4 outlines the properties of the extended generalized sinh-normal (EGSHN) distribution. Section 5 shows the features of the extended sinh-normal regression model. Section 6 presents the results of a simulation study to analyze the properties of the EGSHN model. Section 7 develops two numerical illustrations to evaluate the relevance of the EGSHN. Section 8 discusses the statistical and practical implications of the proposed distribution.

2. Extended Sinh-Normal Distribution

We now propose an extension of the SHN distribution introduced by Rieck and Nederman (1991) [3] to a general class of distributions. Taking $h(y) = b_y^2$ and $g(y) = b'_y\phi(b_y)$ in Equation (7), where $b_y$ and $b'_y$ are defined as in (1), the extended class of SHN distributions is defined through the pdf given by
$$f(y) = \frac{1 + \gamma b_y^2}{1 + \gamma}\,b'_y\,\phi(b_y),$$
with $\xi \in \mathbb{R}$ and $\alpha, \sigma > 0$ defined as in the SHN distribution of Rieck and Nederman (1991) [3], and $\gamma \in \mathbb{R}^+$ as a shape parameter. This distribution is denoted by $ESHN(\alpha, \xi, \sigma, \gamma)$ and, for some values of $\gamma$, it is bimodal. In this context, $\gamma$ can be considered a bimodality parameter. It can be easily deduced that for $\gamma = 0$ the SHN distribution is obtained, and if $\gamma \to \infty$, then $f(y) \to b_y^2\,b'_y\,\phi(b_y)$, which is a new family of distributions. It can also be shown that the random variable
$$b_Y = \frac{2}{\alpha}\sinh\left(\frac{Y-\xi}{\sigma}\right) \sim EN,$$
where $EN$ is the extended normal distribution studied by Cortés et al. (2018) [17].
Denoting $Z = \frac{2}{\alpha}\,\frac{Y-\xi}{\sigma}$, we find that $\frac{Y-\xi}{\sigma} = \frac{\alpha}{2}Z$; thus, when $\alpha \to 0$, $b_y \to z$, $b'_y \to 1$ and $b'_y\phi(b_y) \xrightarrow{D} \phi(z)$, where $\phi(z)$ is the $N(0,1)$ density (see Rieck et al. [3]). It is then possible to conclude that when $\alpha \to 0$, $f(y) \to EN$.
Figure 1 shows the pdf of the extended sinh-normal (ESHN) distribution for several parameter values. For $\gamma = 0$, the distribution is symmetric and unimodal in cases (b) and (c); in the other cases, it is strongly bimodal.
Denoting $p = \frac{1}{1+\gamma}$ and $1 - p = \frac{\gamma}{1+\gamma}$, the pdf of $ESHN(\alpha, \xi, \sigma, \gamma)$ can be written as a mixture of two distributions:
$$f(y) = p\,b'_y\,\phi(b_y) + (1-p)\,b_y^2\,b'_y\,\phi(b_y). \tag{9}$$
From (9), and recalling that the cumulative distribution function (cdf) of $SHN(\alpha, \xi, \sigma)$ is given by $\Phi(b_y)$, it can be shown that the cdf of the $ESHN(\alpha, \xi, \sigma, \gamma)$ distribution is given by
$$F(y) = \frac{1}{1+\gamma}\Phi(b_y) + \frac{\gamma}{1+\gamma}\int_{-\infty}^{b_y} z^2\phi(z)\,dz = \Phi(b_y) - \frac{\gamma}{1+\gamma}\,b_y\,\phi(b_y).$$
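The last equality in the cdf rests on an integration by parts that is left implicit. A short sketch of that step, using only $\phi'(z) = -z\phi(z)$ and the notation already introduced:

```latex
\int_{-\infty}^{b_y} z^2\,\phi(z)\,dz
  = \Big[-z\,\phi(z)\Big]_{-\infty}^{b_y} + \int_{-\infty}^{b_y}\phi(z)\,dz
  = \Phi(b_y) - b_y\,\phi(b_y),
\qquad\text{hence}\qquad
F(y) = \frac{1}{1+\gamma}\,\Phi(b_y)
     + \frac{\gamma}{1+\gamma}\,\big[\Phi(b_y) - b_y\,\phi(b_y)\big]
     = \Phi(b_y) - \frac{\gamma}{1+\gamma}\,b_y\,\phi(b_y).
```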
Thus, the survival, risk (or hazard), and inverse risk functions of this distribution are, respectively, given by
$$S(t) = S_{SHN}(t) + \frac{\gamma}{1+\gamma}\,b_t\,\phi(b_t),$$
and
$$h(t) = \frac{(1 + \gamma b_t^2)\,h_{SHN}(t)}{(1+\gamma) + \sigma\gamma\tanh(z)\,h_{SHN}(t)} \quad \text{and} \quad R(t) = \frac{(1 + \gamma b_t^2)\,R_{SHN}(t)}{(1+\gamma) - \sigma\gamma\tanh(z)\,R_{SHN}(t)},$$
where $z = \frac{t-\xi}{\sigma}$, $b_t$ is defined as $b_y$, and $S_{SHN}(t)$, $h_{SHN}(t)$, and $R_{SHN}(t)$ are the survival, hazard, and inverse risk functions of the $SHN(\alpha, \xi, \sigma)$ model.
Another important result of this distribution is presented as follows:
Let $Y \sim ESHN(\alpha, \xi, 2, \gamma)$; then, the random variable $T = \exp(Y)$ follows an $EBS(\alpha, \beta, \gamma)$ distribution, where $\beta = \exp(\xi)$ and $EBS$ denotes the extended BS distribution, whose properties and moments are studied in Cortés et al. (2018) [17]. So, if $T \sim EBS(\alpha, \beta, \gamma)$, then (i) $aT \sim EBS(\alpha, a\beta, \gamma)$ with $a > 0$ and (ii) $T^{-1} \sim EBS(\alpha, \beta^{-1}, \gamma)$.

3. Moments of an Extended Sinh-Normal Random Variable

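For a random variable $Y \sim ESHN(\alpha, \xi, \sigma, \gamma)$, the $r$-th moment is given by
$$E(Y^r) = \frac{1}{1+\gamma}\sum_{k=0}^{r}\binom{r}{k}\sigma^k\xi^{r-k}\left[c_{k0}(\alpha) + \gamma\,c_{k2}(\alpha)\right],$$
where
$$c_{kl}(\alpha) = \int_{-\infty}^{\infty}\left[\sinh^{-1}\left(\frac{\alpha w}{2}\right)\right]^k w^l\,\phi(w)\,dw.$$
From (9), it can be shown that the $k$-th moment of the $ESHN(\alpha, 0, 1, \gamma)$ distribution is
$$E(Z^k) = \frac{1}{1+\gamma}\int_{-\infty}^{\infty} z^k\,b'_z\,\phi(b_z)\,dz + \frac{\gamma}{1+\gamma}\int_{-\infty}^{\infty} z^k\,b_z^2\,b'_z\,\phi(b_z)\,dz = \frac{1}{1+\gamma}\int_{-\infty}^{\infty}\left[\sinh^{-1}\left(\frac{\alpha w}{2}\right)\right]^k\phi(w)\,dw + \frac{\gamma}{1+\gamma}\int_{-\infty}^{\infty}\left[\sinh^{-1}\left(\frac{\alpha w}{2}\right)\right]^k w^2\,\phi(w)\,dw,$$
where the second step is obtained using the transformation $w = \frac{2}{\alpha}\sinh(z)$. Then, denoting
$$c_{kl}(\alpha) = \int_{-\infty}^{\infty}\left[\sinh^{-1}\left(\frac{\alpha w}{2}\right)\right]^k w^l\,\phi(w)\,dw,$$
we find that
$$E(Z^k) = \frac{1}{1+\gamma}\,c_{k0}(\alpha) + \frac{\gamma}{1+\gamma}\,c_{k2}(\alpha).$$
The location-scale case of the $ESHN(\alpha, \xi, \sigma, \gamma)$ model can be obtained using the binomial theorem and the previous result in the expression
$$E(Y^r) = E(\xi + \sigma Z)^r = \sum_{k=0}^{r}\binom{r}{k}\xi^{r-k}\sigma^k E(Z^k) = \frac{1}{1+\gamma}\sum_{k=0}^{r}\binom{r}{k}\xi^{r-k}\sigma^k\left(c_{k0}(\alpha) + \gamma\,c_{k2}(\alpha)\right).$$
From this result, and given that $c_{10}(\alpha) = c_{12}(\alpha) = 0$ (both integrands are odd), it is obtained that $E(Y) = \xi$ and $var(Y) = \frac{\sigma^2}{1+\gamma}\left(c_{20}(\alpha) + \gamma\,c_{22}(\alpha)\right)$. It can also be shown, from the central moments, that $E(Y-\xi)^3 = 0$, so the distribution is symmetric with respect to $\xi$, and that $E(Y-\xi)^4 = \frac{\sigma^4}{1+\gamma}\left(c_{40}(\alpha) + \gamma\,c_{42}(\alpha)\right)$.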

4. Extended Generalized Sinh-Normal Distribution

The ESHN distribution studied in the previous section is suited to symmetric bimodal data, so it is necessary to extend this model to asymmetric data. This asymmetric extension follows from the results of Azzalini and Capitanio (2003) [23] and Azzalini (2005) [24], who showed that if $g$ is a pdf symmetric around zero, and $H$ is a cdf whose density $h$ is also symmetric around zero, then for any odd function $w(x)$, $f(x) = 2g(x)H(w(x))$, for $-\infty < x < \infty$, is a pdf in $\mathbb{R}$.
Then, since the pdf $f(y)$ of the standard distribution, $ESHN(\alpha, 0, 1, \gamma)$, is continuous and symmetric around zero; $H(\cdot) = \Phi(\cdot)$ is an absolutely continuous distribution function whose density, $\phi(\cdot)$, is symmetric around zero; and $w(y) = \lambda b_y = \lambda\frac{2}{\alpha}\sinh y$, for constant values $\alpha$ and $\lambda$, is an odd function, it follows that $2f(y)\Phi(\lambda b_y)$ is a pdf for any $\lambda \in \mathbb{R}$. Thus, the location-scale extended generalized sinh-normal (EGSHN) distribution is defined through the pdf given by
$$g(y) = 2\,\frac{1 + \gamma b_y^2}{1+\gamma}\,b'_y\,\phi(b_y)\,\Phi(\lambda b_y),$$
where $\lambda$ is a skewness parameter. In this work, this model is denoted by $EGSHN(\alpha, \xi, \sigma, \gamma, \lambda)$.
Figure 2 shows the pdf of the EGSHN distribution for different values of the parameters. As observed, the distribution could be unimodal or bimodal depending on such values.
Likewise, for $\gamma = 0$, we obtain the asymmetric SHN distribution based on the model considered by Leiva et al. (2010) [4]. In addition, if $\gamma \to \infty$, analogously to the ESHN case, we have the new family of distributions $g(y) \to 2\,b_y^2\,b'_y\,\phi(b_y)\,\Phi(\lambda b_y)$. For $\gamma = \lambda = 0$, the SHN model is obtained. It can also be shown that the random variable
$$b_Y = \frac{2}{\alpha}\sinh\left(\frac{Y-\xi}{\sigma}\right) \sim ESN,$$
where $ESN$ is the extended skew-normal distribution, which was introduced by Elal-Olivero et al. (2009) [25]. As in the ESHN model, it can be shown that when $\alpha \to 0$, $g(y) \to ESN$.
For $p$ as in the ESHN model, the pdf of the $EGSHN(\alpha, \xi, \sigma, \gamma, \lambda)$ distribution can be written as
$$f(y) = 2p\,b'_y\,\phi(b_y)\,\Phi(\lambda b_y) + 2(1-p)\,b_y^2\,b'_y\,\phi(b_y)\,\Phi(\lambda b_y).$$
From (18), it follows that the cdf of $EGSHN(\alpha, \xi, \sigma, \gamma, \lambda)$ is given by
$$F(y) = \Phi_{SN}(b_y; \lambda) - \frac{\gamma}{1+\gamma}\,b_y\,\phi_{SN}(b_y; \lambda),$$
where $\Phi_{SN}(\cdot; \lambda)$ and $\phi_{SN}(\cdot; \lambda)$ denote the cdf and pdf of the skew-normal distribution, respectively, with a location parameter of 0 and a scale parameter of 1.
Then, the survival, risk (or hazard), and inverse risk functions of this distribution are, respectively, given by
$$S_G(t) = S_{SSHN}(t) + \frac{\gamma}{1+\gamma}\,b_t\,\phi_{SN}(b_t; \lambda),$$
and
$$h_G(t) = \frac{(1 + \gamma b_t^2)\,h_{SSHN}(t)}{(1+\gamma) + \sigma\gamma\tanh(z)\,h_{SSHN}(t)} \quad \text{and} \quad R_G(t) = \frac{(1 + \gamma b_t^2)\,R_{SSHN}(t)}{(1+\gamma) - \sigma\gamma\tanh(z)\,R_{SSHN}(t)},$$
where $S_{SSHN}(t)$, $h_{SSHN}(t)$, and $R_{SSHN}(t)$ are the survival, hazard, and inverse risk functions of the skew-SHN model, $SSHN(\alpha, \xi, \sigma, \lambda)$, respectively.

4.1. Stochastic Representation

The stochastic representation of the EGSHN model is based on Elal-Olivero (2010) [18] and Elal-Olivero et al. (2009) [25], and is presented below.
Definition 1.
If the random variable $X$ has a pdf given by
$$r(x) = x^2\phi(x),$$
then we say that $X$ follows a bimodal-normal distribution, denoted as $X \sim NB$ (see [18]).
Remark 1.
Let $W$ and $U$ be independent random variables, with $W \sim \chi^2_{(3)}$, a chi-square distribution with three degrees of freedom, and $U$ a discrete random variable such that $P(U = 1) = P(U = -1) = \frac{1}{2}$. If $Y = \sqrt{W}\,U$, then $Y \sim NB$.
Remark 2.
Let $Y \sim NB$ and consider the random variable $X_2$, defined as
$$X_2 = \operatorname{arcsinh}(\alpha Y/2);$$
then,
$$F_{X_2}(x) = P(X_2 \leq x) = P\left(\operatorname{arcsinh}(\alpha Y/2) \leq x\right) = P\left(Y \leq \frac{2}{\alpha}\sinh(x)\right) = F_Y\left(\frac{2}{\alpha}\sinh(x)\right).$$
Then,
$$f_{X_2}(x) = r\left(\frac{2}{\alpha}\sinh(x)\right)\frac{2}{\alpha}\cosh(x) = \left(\frac{2}{\alpha}\sinh(x)\right)^2\phi\left(\frac{2}{\alpha}\sinh(x)\right)\frac{2}{\alpha}\cosh(x),$$
which is denoted as $X_2 \sim \left(\frac{2}{\alpha}\sinh(x)\right)^2\phi\left(\frac{2}{\alpha}\sinh(x)\right)\frac{2}{\alpha}\cosh(x)$.
Proposition 1.
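Let $X_1 \sim SHN(\alpha, 0, 1) = SHN(\alpha)$ and $X_2 \sim \left(\frac{2}{\alpha}\sinh(x)\right)^2\phi\left(\frac{2}{\alpha}\sinh(x)\right)\frac{2}{\alpha}\cosh(x)$, and let $U \sim U(0,1)$ be a uniform random variable, independent of $X_1$ and $X_2$. If
$$X = \begin{cases} X_1, & \text{if } U < \frac{1}{1+\gamma}, \\ X_2, & \text{if } U \geq \frac{1}{1+\gamma}, \end{cases}$$
with $\gamma > 0$, then $X \sim ESHN(\alpha, \gamma)$.
Proof. 
$$F_X(x) = P(X \leq x) = P\left(X \leq x \,\Big|\, U < \tfrac{1}{1+\gamma}\right)P\left(U < \tfrac{1}{1+\gamma}\right) + P\left(X \leq x \,\Big|\, U \geq \tfrac{1}{1+\gamma}\right)P\left(U \geq \tfrac{1}{1+\gamma}\right) = P(X_1 \leq x)\,\frac{1}{1+\gamma} + P(X_2 \leq x)\,\frac{\gamma}{1+\gamma} = \frac{1}{1+\gamma}F_{X_1}(x) + \frac{\gamma}{1+\gamma}F_{X_2}(x),$$
so
$$f_X(x) = \frac{1}{1+\gamma}f_{X_1}(x) + \frac{\gamma}{1+\gamma}f_{X_2}(x) = \frac{1 + \gamma\left(\frac{2}{\alpha}\sinh(x)\right)^2}{1+\gamma}\,\phi\left(\frac{2}{\alpha}\sinh(x)\right)\frac{2}{\alpha}\cosh(x).$$
 □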
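Proposition 1, together with Remarks 1 and 2, gives a direct way to simulate from the standard ESHN distribution. The following is a minimal sketch of that mixture representation; the function names are hypothetical and the standard-library generator is an arbitrary choice.

```python
import math
import random

def sample_nb(rng):
    """Bimodal-normal draw (Remark 1): sqrt of a chi-square(3) times a random sign."""
    w = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(3))  # chi-square, 3 d.f.
    sign = 1.0 if rng.random() < 0.5 else -1.0
    return sign * math.sqrt(w)

def sample_eshn(alpha, gamma, rng):
    """One draw from ESHN(alpha, gamma) via the mixture of Proposition 1."""
    if rng.random() < 1.0 / (1.0 + gamma):
        y = rng.gauss(0.0, 1.0)   # X1 = asinh(alpha*Z/2) ~ SHN(alpha)
    else:
        y = sample_nb(rng)        # X2 = asinh(alpha*Y/2), Y ~ NB (Remark 2)
    return math.asinh(alpha * y / 2.0)
```

A quick check: since $b_X = \frac{2}{\alpha}\sinh(X)$ equals the underlying normal or bimodal-normal draw, its second moment should be close to $\frac{1 + 3\gamma}{1+\gamma}$ in a large sample, and the sample mean of $X$ should be close to zero.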
Definition 2.
If the random variable $W$ has a pdf
$$m(x) = 2f(x)\,\Phi\left(\lambda\frac{2}{\alpha}\sinh(x)\right), \quad x \in \mathbb{R},$$
with $\lambda \in \mathbb{R}$ and $f$ the pdf of the $ESHN(\alpha, \gamma)$ distribution, then $W$ follows an EGSHN distribution, denoted as $W \sim EGSHN(\alpha, \gamma, \lambda)$.
Proposition 2.
Let $Z$ and $X$ be independent random variables with $Z \sim N(0,1)$ and $X \sim ESHN(\alpha, \gamma)$.
If
$$W = \begin{cases} X, & \text{if } Z < \lambda\frac{2}{\alpha}\sinh(X), \\ -X, & \text{if } Z \geq \lambda\frac{2}{\alpha}\sinh(X), \end{cases}$$
then $W \sim EGSHN(\alpha, \gamma, \lambda)$.
Proof. 
Note that $w(x) = \lambda\frac{2}{\alpha}\sinh(x)$ is an odd function, and that both the density of the random variable $X$ and the density of $Z$ are symmetric around zero; thus, applying Lemma 1 in [24], the result follows. □

Location-Scale Extension

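Let $W \sim EGSHN(\alpha, \gamma, \lambda)$ and let $V = \mu + \sigma W$; then
$$F_V(v) = P(V \leq v) = P\left(W \leq \frac{v-\mu}{\sigma}\right) = F_W\left(\frac{v-\mu}{\sigma}\right).$$
Then, the pdf of $V$ is given by
$$f_V(v) = \frac{1}{\sigma}f_W\left(\frac{v-\mu}{\sigma}\right) = \frac{2}{\sigma}f\left(\frac{v-\mu}{\sigma}\right)\Phi\left(\lambda\frac{2}{\alpha}\sinh\left(\frac{v-\mu}{\sigma}\right)\right),$$
which is denoted by $V \sim EGSHN(\alpha, \mu, \sigma, \gamma, \lambda)$.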
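Proposition 2 and the location-scale transformation can be combined into a simple simulator for $EGSHN(\alpha, \mu, \sigma, \gamma, \lambda)$. The sketch below is illustrative only: the function names and the default arguments are assumptions, and the ESHN draw reuses the mixture representation of Proposition 1.

```python
import math
import random

def sample_eshn(alpha, gamma, rng):
    """ESHN(alpha, gamma) draw via the mixture of Proposition 1."""
    y = rng.gauss(0.0, 1.0)
    if rng.random() >= 1.0 / (1.0 + gamma):
        # Bimodal-normal component: sqrt(chi-square_3) with an independent sign.
        w = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(3))
        y = math.sqrt(w) if rng.random() < 0.5 else -math.sqrt(w)
    return math.asinh(alpha * y / 2.0)

def sample_egshn(alpha, gamma, lam, mu=0.0, sigma=1.0, rng=None):
    """One draw from EGSHN(alpha, mu, sigma, gamma, lam).

    Uses W = X if Z < lam*(2/alpha)*sinh(X), else -X (Proposition 2),
    followed by V = mu + sigma*W.
    """
    rng = rng or random
    x = sample_eshn(alpha, gamma, rng)
    z = rng.gauss(0.0, 1.0)
    w = x if z < lam * (2.0 / alpha) * math.sinh(x) else -x
    return mu + sigma * w
```

With $\lambda > 0$ the reflected draws pile up on the positive side, so the proportion of positive values in a standard sample ($\mu = 0$, $\sigma = 1$) should be clearly above one half; with $\lambda = 0$ the sampler reduces to the symmetric ESHN case.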

4.2. Moments of an Extended Generalized Sinh-Normal Random Variable

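For a random variable $Y \sim EGSHN(\alpha, \xi, \sigma, \gamma, \lambda)$, the $r$-th moment is given by
$$E(Y^r) = \frac{1}{1+\gamma}\sum_{k=0}^{r}\binom{r}{k}\sigma^k\xi^{r-k}\left[d_{k0}(\alpha, \lambda) + \gamma\,d_{k2}(\alpha, \lambda)\right],$$
where
$$d_{kl}(\alpha, \lambda) = 2\int_{-\infty}^{\infty}\left[\sinh^{-1}\left(\frac{\alpha w}{2}\right)\right]^k w^l\,\phi(w)\,\Phi(\lambda w)\,dw.$$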

4.3. Extended Generalized Birnbaum–Saunders Distribution

Let $Y \sim EGSHN(\alpha, \xi, 2, \gamma, \lambda)$. Then, the random variable $T = \exp(Y)$ follows an extended generalized BS distribution, which is denoted by $EGBS(\alpha, \beta, \gamma, \lambda)$, where $\beta = \exp(\xi)$. The proof of this result is obtained from the transformation theorem of random variables. The pdf of an $EGBS(\alpha, \beta, \gamma, \lambda)$ random variable is given by
$$\varphi(t) = 2\,\frac{1 + \gamma a_t^2}{1+\gamma}\,a'_t\,\phi(a_t)\,\Phi(\lambda a_t),$$
where $a_t = a_t(\alpha, \beta)$ and $a'_t = a'_t(\alpha, \beta)$ are defined as in (6).
Note that, for $\gamma = 0$, we have the doubly generalized BS distribution developed by Vilca et al. (2006) [10] for the special case of the skew-normal distribution introduced by Azzalini (1985) [26], denoted BSSN. If $\gamma \to \infty$, then $\varphi(t) \to 2\,a_t^2\,a'_t\,\phi(a_t)\,\Phi(\lambda a_t)$; if $\lambda = 0$, then $\varphi(t)$ belongs to the extended BS distribution class studied by Cortés et al. (2018) [17], and for $\gamma = \lambda = 0$, $\varphi(t)$ is the pdf of a BS model. Likewise, it can be shown that the random variable $a_T \sim ESN$.
As in the EGSHN model, it can be shown that the cdf of the $EGBS(\alpha, \beta, \gamma, \lambda)$ distribution is given by
$$F(t) = \Phi_{SN}(a_t; \lambda) - \frac{\gamma}{1+\gamma}\,a_t\,\phi_{SN}(a_t; \lambda).$$
Then, the survival, risk (or hazard), and inverse risk functions of this distribution are given, respectively, by
$$S_{EGBS}(t) = S_{BSSN}(t) + \frac{\gamma}{1+\gamma}\,a_t\,\phi_{SN}(a_t; \lambda),$$
and
$$h_{EGBS}(t) = \frac{(1 + \gamma a_t^2)(t+\beta)\,h_{BSSN}(t)}{(1+\gamma)(t+\beta) + 2\gamma\,t\,(t-\beta)\,h_{BSSN}(t)} \quad \text{and}$$
$$R_{EGBS}(t) = \frac{(1 + \gamma a_t^2)(t+\beta)\,R_{BSSN}(t)}{(1+\gamma)(t+\beta) - 2\gamma\,t\,(t-\beta)\,R_{BSSN}(t)},$$
where $S_{BSSN}(t)$, $h_{BSSN}(t)$, and $R_{BSSN}(t)$ are the survival, hazard, and inverse risk functions of the BS skew-normal model, $BSSN(\alpha, \beta, \lambda)$; the factor $2t(t-\beta)/(t+\beta)$ is the ratio $a_t/a'_t$ obtained from (6). For the BSSN model, $\lim_{t\to\infty} h_{BSSN}(t) = \frac{1+\lambda^2}{2\alpha^2\beta}$; then, for the EGBS model, it follows that $\lim_{t\to\infty} h_{EGBS}(t) = 0$.
Some properties of the BS model remain true for the EGBS model. Thus, if $T \sim EGBS(\alpha, \beta, \gamma, \lambda)$, then (i) $aT \sim EGBS(\alpha, a\beta, \gamma, \lambda)$, with $a > 0$, and (ii) $T^{-1} \sim EGBS(\alpha, \beta^{-1}, \gamma, \lambda)$.
The moments of an EGBS random variable with parameters $\alpha$, $\beta$, $\gamma$, and $\lambda$ can be obtained by means of the following expression:
$$E(T^r) = \frac{1}{\alpha^2\beta(1+\gamma)}\left[\gamma\beta^2\,E_{BSSN}(T^{r-1}) + (\alpha^2 - 2\gamma)\,\beta\,E_{BSSN}(T^r) + \gamma\,E_{BSSN}(T^{r+1})\right],$$
where $E_{BSSN}(\cdot)$ is the expectation operator of the Birnbaum–Saunders skew-normal distribution (BSSN); the expression follows from writing $1 + \gamma a_t^2 = \frac{\alpha^2\beta t + \gamma(t-\beta)^2}{\alpha^2\beta t}$. To calculate the mean, variance, and the skewness and kurtosis coefficients of the EGBS model, the expressions of Corollaries 2.1 and 2.2 of Vilca et al. [10] can be used.
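For $(\alpha, \beta, \gamma, \lambda)$, the log-likelihood function corresponding to the random sample $t_1, t_2, \ldots, t_n$ is, up to an additive constant,
$$\ell(\alpha, \beta, \gamma, \lambda) = \sum_{i=1}^{n}\log\left(1 + \gamma a_{t_i}^2\right) - n\log(1+\gamma) + \sum_{i=1}^{n}\log(a'_{t_i}) - \frac{1}{2}\sum_{i=1}^{n}a_{t_i}^2 + \sum_{i=1}^{n}\log\left(\Phi(\lambda a_{t_i})\right).$$
The score function is composed of the following elements:
$$U(\alpha) = -\frac{2\gamma}{\alpha}\sum_{i=1}^{n}\frac{a_{t_i}^2}{1 + \gamma a_{t_i}^2} - \frac{1}{\alpha}\sum_{i=1}^{n}\left(1 - a_{t_i}^2 + \lambda a_{t_i}\zeta_i\right) = 0,$$
$$U(\beta) = \frac{\gamma}{\alpha^2}\sum_{i=1}^{n}\frac{\frac{1}{t_i} - \frac{t_i}{\beta^2}}{1 + \gamma a_{t_i}^2} - \frac{n}{2\beta} + \sum_{i=1}^{n}\frac{1}{\beta + t_i} - \frac{1}{2\alpha^2}\sum_{i=1}^{n}\left(\frac{1}{t_i} - \frac{t_i}{\beta^2}\right) - \frac{\lambda}{2\alpha\beta^{3/2}}\sum_{i=1}^{n}\frac{t_i + \beta}{t_i^{1/2}}\,\zeta_i = 0,$$
$$U(\gamma) = -\frac{n}{1+\gamma} + \sum_{i=1}^{n}\frac{a_{t_i}^2}{1 + \gamma a_{t_i}^2} = 0, \quad \text{and} \quad U(\lambda) = \sum_{i=1}^{n}a_{t_i}\zeta_i = 0,$$
where $\zeta_i = \frac{\phi(\lambda a_{t_i})}{\Phi(\lambda a_{t_i})}$, with $i = 1, \ldots, n$. Iterative numerical methods must be used to solve this system of nonlinear equations.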
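Because the score equations have no closed-form solution, a practical route is to code the log-likelihood and hand it to a numerical optimizer. The sketch below covers only the objective function; it is an illustration under stated assumptions, not the authors' code. Unlike the expression above, it keeps the additive constants $n\log 2 - \frac{n}{2}\log(2\pi)$, so that the exponential of a one-observation log-likelihood equals the EGBS density. The function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def a_t(t, alpha, beta):
    """a_t = (sqrt(t/beta) - sqrt(beta/t)) / alpha."""
    return (math.sqrt(t / beta) - math.sqrt(beta / t)) / alpha

def da_t(t, alpha, beta):
    """Derivative of a_t with respect to t."""
    return t ** -1.5 * (t + beta) / (2.0 * alpha * math.sqrt(beta))

def egbs_loglik(params, times):
    """EGBS log-likelihood, constants included; -inf outside the parameter space."""
    alpha, beta, gamma, lam = params
    if alpha <= 0.0 or beta <= 0.0 or gamma < 0.0:
        return float("-inf")
    n = len(times)
    total = n * (math.log(2.0) - 0.5 * math.log(2.0 * math.pi) - math.log1p(gamma))
    for t in times:
        a = a_t(t, alpha, beta)
        cdf = 0.5 * math.erfc(-lam * a / SQRT2)   # Phi(lam * a)
        if cdf <= 0.0:                            # numerical underflow in the tail
            return float("-inf")
        total += (math.log1p(gamma * a * a) + math.log(da_t(t, alpha, beta))
                  - 0.5 * a * a + math.log(cdf))
    return total
```

In practice one would maximize `egbs_loglik` (or minimize its negative) with a general-purpose optimizer, starting from reasonable initial values; the choice of optimizer is left open here.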

5. Extended Sinh-Normal Regression Model

One of our main goals is to develop a log-BS regression model based on the $EGSHN(\alpha, \xi, \sigma, \gamma, \lambda)$ model. This regression model is an alternative to the log-BS model introduced by Rieck et al. (1991) [3] for fitting bimodal or asymmetric survival data. The extended generalized log-linear BS regression model is defined following the considerations given by Rieck et al. (1991) [3]: $Y_i$ is the dependent variable; a set of $p$ explanatory variables, denoted by $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})^\top$, is given; and $\theta = (\theta_1, \theta_2, \ldots, \theta_p)^\top$ is a $p$-dimensional vector of unknown parameters, so that a linear predictor, $\xi_i = x_i^\top\theta$, is obtained for $i = 1, 2, \ldots, n$.
Then, let us suppose that $T_1, T_2, \ldots, T_n$ are independent random variables such that $T_i \sim EGBS(\alpha_i, \beta_i, \gamma_i, \lambda_i)$, and that the distribution of $T_i$ depends on the set of explanatory variables, $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})^\top$, where
  • $\beta_i = \exp(x_i^\top\theta)$, for $i = 1, 2, \ldots, n$, with $\theta = (\theta_1, \theta_2, \ldots, \theta_p)^\top$ being a $p$-dimensional vector of unknown parameters.
  • The shape, bimodality, and skewness parameters do not depend on $x_i$; i.e., $\alpha_i = \alpha$, $\gamma_i = \gamma$, and $\lambda_i = \lambda$ for $i = 1, 2, \ldots, n$.
Let us suppose that $Y_i = \log(T_i)$. Then, the extended generalized log-linear BS regression model is defined as
$$y_i = x_i^\top\theta + \epsilon_i, \quad i = 1, \ldots, n,$$
where $\epsilon_i \sim EGSHN(\alpha, 0, 2, \gamma, \lambda)$, for $i = 1, \ldots, n$, and $y_i$ is the log-survival time for the $i$-th individual. This model is denoted by $MRESHN(\alpha, \theta, 2, \gamma, \lambda)$. When $\gamma = \lambda = 0$, it reduces to the log-BS regression model, $LBS(\alpha, \theta, 2)$, of Rieck et al. [3]; i.e., the MRESHN model is more flexible than the log-BS model in terms of skewness and kurtosis.
When $\lambda = 0$, it follows that $\epsilon_i \sim ESHN(\alpha, 0, 2, \gamma)$, for $i = 1, \ldots, n$, and important results are obtained. We find that $E(\epsilon_i) = 0$ and $Var(\epsilon_i) = 4c_{20}(\alpha)$; additionally, as the errors are independent random variables, for $i \neq i'$ it follows that $Cov(\epsilon_i, \epsilon_{i'}) = 0$. Furthermore, considering that the explanatory variables are independent of the shape parameter, it is possible to conclude from the above results that $Y_i \sim ESHN(\alpha, x_i^\top\theta, 2, \gamma)$ for $i = 1, \ldots, n$. Moreover, since $E(\epsilon_i) = 0$, it can be shown that $E(Y_i) = \xi_i = x_i^\top\theta$, so linear estimators for $\theta$ can be derived by ordinary least squares, whose solution is given by
$$\hat\theta = (X^\top X)^{-1}X^\top Y,$$
with covariance matrix
$$Cov(\hat\theta) = 4c_{20}(\alpha)(X^\top X)^{-1}.$$
So, a biased estimator for $c_{20}(\alpha)$ could be
$$\hat c_{20}(\alpha) = \frac{1}{4(n-p)}\sum_{i=1}^{n}\left(y_i - x_i^\top\hat\theta\right)^2.$$
Now, the model provided in (26) is a linear regression model similar to those of the theory of linear models, with the characteristic that the random component follows an $EGSHN(\alpha, 0, 2, \gamma, \lambda)$ distribution; that is, the errors have location parameter zero and scale parameter equal to 2. Then, the interpretation of the regression parameters, in relation to the observed variable $Y$, is the same as in the linear regression model.
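For the vector $(\alpha, \theta, \gamma, \lambda)$, we find that the log-likelihood function corresponding to the random sample $y_1, y_2, \ldots, y_n$ is, up to an additive constant,
$$\ell(\alpha, \theta, \gamma, \lambda) = \sum_{i=1}^{n}\log\left(1 + \gamma\xi_{i2}^2\right) - n\log(1+\gamma) + \sum_{i=1}^{n}\log(\xi_{i1}) - \frac{1}{2}\sum_{i=1}^{n}\xi_{i2}^2 + \sum_{i=1}^{n}\log\Phi(\lambda\xi_{i2}),$$
where $\xi_{i1} = 2\alpha^{-1}\cosh z_i$ and $\xi_{i2} = 2\alpha^{-1}\sinh z_i$ with $z_i = \frac{y_i - x_i^\top\theta}{2}$, for $i = 1, 2, \ldots, n$. The elements of the score function are given by
$$U(\alpha) = -\frac{2\gamma}{\alpha}\sum_{i=1}^{n}\frac{\xi_{i2}^2}{1 + \gamma\xi_{i2}^2} - \frac{n}{\alpha} + \frac{1}{\alpha}\sum_{i=1}^{n}\xi_{i2}^2 - \frac{\lambda}{\alpha}\sum_{i=1}^{n}w_i\xi_{i2},$$
$$U(\theta_j) = -\gamma\sum_{i=1}^{n}\frac{x_{ij}\,\xi_{i1}\xi_{i2}}{1 + \gamma\xi_{i2}^2} + \frac{1}{2}\sum_{i=1}^{n}x_{ij}\left(\xi_{i1}\xi_{i2} - \frac{\xi_{i2}}{\xi_{i1}}\right) - \frac{\lambda}{2}\sum_{i=1}^{n}x_{ij}\,\xi_{i1}w_i, \quad j = 1, 2, \ldots, p,$$
$$U(\gamma) = -\frac{n}{1+\gamma} + \sum_{i=1}^{n}\frac{\xi_{i2}^2}{1 + \gamma\xi_{i2}^2}, \quad \text{and} \quad U(\lambda) = \sum_{i=1}^{n}\xi_{i2}w_i,$$
where $w_i = \frac{\phi(\lambda\xi_{i2})}{\Phi(\lambda\xi_{i2})}$, for $i = 1, \ldots, n$.
The maximum likelihood estimators of $\theta_1, \theta_2, \ldots, \theta_p$; $\alpha$; $\gamma$; and $\lambda$ are the solutions to the equations $U(\theta_j) = 0$, for $j = 1, 2, \ldots, p$; $U(\alpha) = 0$; $U(\gamma) = 0$; and $U(\lambda) = 0$, respectively, which require iterative numerical methods.
The least squares estimator ($\hat\theta$) may be used to initialize the iterative numerical process for $\theta$, and with these initial values, we can calculate $\hat\alpha = \sqrt{\frac{4}{n}\sum_{i=1}^{n}\sinh^2\left(\frac{y_i - x_i^\top\hat\theta}{2}\right)}$.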
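The initialization step can be sketched as follows for the case of an intercept and a single covariate. This is a minimal illustration, not the authors' code: the function names are hypothetical, the closed-form simple-regression formulas stand in for the general $(X^\top X)^{-1}X^\top Y$ solution, and the square-root form of the starting value for $\alpha$ is the one used for the log-BS model.

```python
import math

def ols_simple(x, y):
    """Least squares intercept and slope for y = b0 + b1*x + error."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def alpha_start(x, y, b0, b1):
    """Starting value sqrt((4/n) * sum sinh^2(residual/2)) for alpha."""
    n = len(x)
    s = sum(math.sinh((yi - b0 - b1 * xi) / 2.0) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(4.0 * s / n)
```

For data generated from the log-BS special case ($\gamma = \lambda = 0$), both the least squares coefficients and `alpha_start` should land near the true values in large samples; for the full model they serve only as starting points for the iterative maximization.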
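The elements of the observed information matrix, defined as minus the second derivatives of the log-likelihood function, are denoted by $I_{\alpha\alpha}$, $I_{\alpha\theta_j}$, $I_{\alpha\gamma}$, $I_{\alpha\lambda}$, $I_{\theta_j\theta_k}$, $I_{\theta_j\gamma}$, $I_{\theta_j\lambda}$, $I_{\gamma\lambda}$, and $I_{\lambda\lambda}$, and are, respectively, given by
$$I_{\alpha\alpha} = -\frac{n}{\alpha^2} - \frac{2\gamma}{\alpha^2}\sum_{i=1}^{n}\frac{3\xi_{i2}^2 + \gamma\xi_{i2}^4}{(1 + \gamma\xi_{i2}^2)^2} + \frac{3}{\alpha^2}\sum_{i=1}^{n}\xi_{i2}^2 + \frac{\lambda}{\alpha^2}\sum_{i=1}^{n}w_i\xi_{i2}\left[-2 + \lambda\xi_{i2}(\lambda\xi_{i2} + w_i)\right],$$
$$I_{\alpha\theta_j} = -\frac{2\gamma}{\alpha}\sum_{i=1}^{n}\frac{x_{ij}\,\xi_{i1}\xi_{i2}}{(1 + \gamma\xi_{i2}^2)^2} + \frac{1}{\alpha}\sum_{i=1}^{n}x_{ij}\,\xi_{i1}\xi_{i2} + \frac{\lambda}{2\alpha}\sum_{i=1}^{n}x_{ij}\,w_i\xi_{i1}\left[-1 + \lambda\xi_{i2}(\lambda\xi_{i2} + w_i)\right],$$
$$I_{\alpha\gamma} = \frac{2}{\alpha}\sum_{i=1}^{n}\frac{\xi_{i2}^2}{(1 + \gamma\xi_{i2}^2)^2}, \quad I_{\alpha\lambda} = \frac{1}{\alpha}\sum_{i=1}^{n}w_i\xi_{i2}\left[1 - \lambda\xi_{i2}(\lambda\xi_{i2} + w_i)\right],$$
$$I_{\theta_j\theta_k} = \frac{1}{4}\sum_{i=1}^{n}x_{ij}x_{ik}\left[2\xi_{i2}^2 + \frac{4}{\alpha^2} - 1 + \frac{\xi_{i2}^2}{\xi_{i2}^2 + 4/\alpha^2}\right] - \frac{\gamma}{2}\sum_{i=1}^{n}x_{ij}x_{ik}\,\frac{2\xi_{i2}^2 + \frac{4}{\alpha^2} - \frac{4\gamma}{\alpha^2}\xi_{i2}^2}{(1 + \gamma\xi_{i2}^2)^2} + \frac{\lambda}{4}\sum_{i=1}^{n}x_{ij}x_{ik}\,w_i\left[-\xi_{i2} + \lambda\xi_{i1}^2(\lambda\xi_{i2} + w_i)\right],$$
$$I_{\theta_j\gamma} = \sum_{i=1}^{n}\frac{x_{ij}\,\xi_{i1}\xi_{i2}}{(1 + \gamma\xi_{i2}^2)^2}, \quad I_{\theta_j\lambda} = \frac{1}{2}\sum_{i=1}^{n}x_{ij}\,w_i\xi_{i1}\left[1 - \lambda\xi_{i2}(\lambda\xi_{i2} + w_i)\right],$$
$$I_{\gamma\lambda} = 0, \quad \text{and} \quad I_{\lambda\lambda} = \sum_{i=1}^{n}\xi_{i2}^2\,w_i(\lambda\xi_{i2} + w_i).$$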
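The information matrix ($\Psi$) can be estimated as the expected value of the elements of the observed information matrix, which must be calculated using numerical approximation methods. For $\lambda = 0$, the sub-matrix of the information matrix for the vector $(\alpha, \theta, \gamma)$ has the following elements:
$$\psi_{\alpha\alpha} = \frac{2n}{\alpha^2}\,\frac{1 + 4\gamma}{1+\gamma} - \frac{2n}{\alpha^2}\,\frac{\gamma\,q_2(\gamma)}{1+\gamma}, \quad \psi_{\alpha\beta_j} = \frac{4\gamma a_1}{\alpha}\sum_{i=1}^{n}x_{ij}, \quad \psi_{\alpha\gamma} = \frac{2n}{\gamma\alpha}\,q_1(\gamma),$$
$$\psi_{\beta_j\beta_l} = \frac{1}{\alpha^2(1+\gamma)}\left[\gamma + \frac{1}{2}\alpha^2(1 + 3\gamma) + \left(\frac{4\gamma}{\alpha^2} - 1\right)m(\alpha)\right]\sum_{i=1}^{n}x_{ij}x_{il} + \frac{\gamma}{\alpha^2(1+\gamma)}\left[2 + \alpha^2\left(1 + \frac{1}{\gamma}\right)\left(\frac{4\gamma}{\alpha^2} - 1\right)q_1(\gamma)\right]\sum_{i=1}^{n}x_{ij}x_{il},$$
$$\psi_{\beta_j\gamma} = 2a_1\sum_{i=1}^{n}x_{ij}, \quad \psi_{\gamma\gamma} = -\frac{n}{(1+\gamma)^2} + \frac{n}{1+\gamma}\,q_3(\gamma),$$
where $a_1 = \int\frac{z\sqrt{1 + \frac{4}{\alpha^2 z^2}}}{1 + \gamma z^2}\,\phi(z)\,dz$, $q_1(\gamma) = 1 - \left(\frac{\pi}{2\gamma}\right)^{1/2}\left[1 - \operatorname{erf}\left(\left(\frac{1}{2\gamma}\right)^{1/2}\right)\right]\exp\left(\frac{1}{2\gamma}\right)$, $q_2(\gamma) = 1 + \frac{2}{\gamma}q_1(\gamma)$, $q_3(\gamma) = 1 - \frac{1}{\gamma}q_1(\gamma)$ and $m(\alpha) = \left(\frac{\pi\alpha^2}{8}\right)^{1/2}\left[1 - \operatorname{erf}\left(\left(\frac{2}{\alpha^2}\right)^{1/2}\right)\right]\exp\left(\frac{2}{\alpha^2}\right)$, with $\operatorname{erf}(\cdot)$ the error function (see Prudnikov et al. [27]).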
For $\gamma > 0$, the determinant of the information sub-matrix is not equal to zero; that is, $|\Psi(\alpha, \theta, \gamma)| \neq 0$. Thus, it is possible to conclude that the information matrix of the ESHN model is nonsingular. Likewise, for $\lambda = 0$, we find that
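$$\psi_{\theta_j\lambda} = \frac{1}{\alpha\sqrt{8\pi}\,(2\gamma)^{1/2}}\left[E_{G\left(\frac{3}{2}, \frac{1}{2\gamma}\right)}\left(\sqrt{4\gamma + \alpha^2 U}\right) + E_{G\left(\frac{1}{2}, \frac{1}{2\gamma}\right)}\left(\sqrt{4\gamma + \alpha^2 U}\right)\right]\sum_{i=1}^{n}x_{ij},$$
$$\psi_{\lambda\gamma} = 0, \quad \text{and} \quad \psi_{\lambda\lambda} = \frac{2n}{\pi}\,\frac{1 + 3\gamma}{1+\gamma},$$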
where $E(\cdot)$ denotes the expected value and $G(a, b)$ is the gamma distribution with parameters $a$ and $b$. The rows (or columns) of the information matrix of the parameter vector $(\alpha, \theta, \gamma, \lambda)$ of the MRESHN regression model are linearly independent in the case of $\lambda = 0$. Therefore, the information matrix of this model is nonsingular, and its inverse is the variance–covariance matrix, $\Sigma_{\hat\alpha, \hat\theta, \hat\gamma, \hat\lambda}$, of the maximum likelihood estimator of the parameter vector; the estimated standard errors of the estimators are then the square roots of the diagonal elements of the matrix $\hat\Sigma_{\hat\alpha, \hat\theta, \hat\gamma, \hat\lambda}$.
Then, when $n \to \infty$, the approximation $N_{p+3}\left((\alpha, \theta^\top, \gamma, \lambda)^\top, \frac{1}{n}\Sigma_{\alpha, \theta, \gamma, \lambda}\right)$ might be used to obtain a confidence interval for the parameter $\theta_r$, for $r = 1, 2, \ldots, p$, which is given by $\hat\theta_r \mp z_{1-\delta/2}\,\hat\sigma(\hat\theta_r)$, where $\hat\sigma(\hat\theta_r)$ is the square root of the $r$-th diagonal element of the sub-matrix $\hat\Sigma(\hat\theta)$ and $z_{1-\delta/2}$ is the $100(1-\delta/2)$-th percentile of the standard normal distribution.

6. Simulation Study

A simulation study is presented to analyze the performance of the maximum likelihood estimators of the parameters of the $EGSHN(\alpha, \beta_0, \beta_1, \gamma, \lambda)$ regression model. In general, $m = 5000$ simulations with $n = 30, 60, 90, 120$, and $500$ were generated for different scenarios. The following model, with $\beta_0 = 2.25$ and $\beta_1 = 0.75$, is studied:
$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \quad i = 1, 2, \ldots, n.$$
Note that, for $i = 1, 2, \ldots, n$, $x_i$ is drawn from a uniform random variable, $X \sim U(0, 1)$, and $\epsilon_i \sim EGSHN(\alpha, 0, 2, \gamma, \lambda)$. The empirical standard deviation (sd), relative bias (RB), and $\sqrt{MSE}$ of the estimates are calculated for the EGSHN model.
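The three summary statistics can be computed from the $m$ replicated estimates of a parameter as sketched below. The text does not spell out the definitions, so the ones used here are assumptions: RB is taken as the mean estimate minus the true value, divided by the true value; sd is the sample standard deviation; and $\sqrt{MSE}$ is the root of the mean squared deviation from the true value. The function name is hypothetical.

```python
import math
import statistics

def summarize(estimates, true_value):
    """Relative bias, empirical sd, and root MSE of replicated estimates.

    Definitions are assumed (see the lead-in); true_value must be nonzero.
    """
    m = len(estimates)
    mean_est = sum(estimates) / m
    rb = (mean_est - true_value) / true_value
    sd = statistics.stdev(estimates)
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / m)
    return {"RB": rb, "sd": sd, "RMSE": rmse}
```

In a study like the one described, `estimates` would hold the 5000 maximum likelihood estimates of one parameter for a given sample size, and the function would be called once per parameter and scenario.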
The results for each studied scenario are described as follows.
Scenario 1 (Varying α): In this scenario, the used alpha values are 0.75 , 1.75 , and 2.75 , keeping γ = 1.5 and λ = 1 fixed. For each studied case of the E G S H N ( α , 2.25 , 0.75 , 1.5 , 1 ) model (see Figure 3 and Table 1), it can be seen that for the parameters β 0 and β 1 , the relative biases are small, especially for the parameter β 1 . Note that the statistics RB, sd, and M S E decrease when the sample size is increased.
Additionally, it can also be observed that the relative bias of the parameter α is not very important (not very large). Something similar is observed for the parameter λ . On the other hand, the relative bias and the root of the mean square error of the parameter γ are not very small, especially when the sample size is small (30 or 60).
The statistics RB, sd, and M S E for the parameters α , γ , λ decrease when the sample size increases. These results guarantee a lack of bias and the asymptotic consistency of the estimates of the parameters α , β 0 , β 1 , γ , λ .
Scenario 2 (Varying γ): Here, the used gamma values are 1, 2.5, and 4, keeping α = 1.75 and λ = 1 fixed. Similar to scenario 1, we can see that the parameters β 0 and β 1 present small relative biases (see Figure 4 and Table 2). The statistics RB, sd, and M S E of the E G S H N ( 1.75 , 2.25 , 0.75 , γ , 1 ) model decrease when the sample size is increased.
The parameters α and λ show very small relative biases, but the parameter γ presents large relative biases and M S E . This behavior of the gamma parameter is more striking for small sample sizes.
The asymptotic consistency of the estimates of the parameters α , β 0 , β 1 , γ , λ can be guaranteed because the calculated statistics for α , γ , λ decrease when sample size increases.
Scenario 3 (Varying λ): In Scenario 3, where the EGSHN(1.75, 2.25, 0.75, 2, λ) model is considered, λ takes the values 0.5, 1.5, and 3, with α = 1.75 and γ = 2 held fixed (see Figure 5 and Table 3). As in the previous simulation scenarios, the relative biases of $\beta_0$ and $\beta_1$ are small, especially for $\beta_1$. The statistics RB, sd, and $\sqrt{\mathrm{MSE}}$ tend to decrease as the sample size increases.
The RB and $\sqrt{\mathrm{MSE}}$ of the parameter γ are large for small sample sizes. The parameter α presents small relative biases, but the parameter λ shows a large sd and $\sqrt{\mathrm{MSE}}$ when λ = 3. In general, the results are consistent with the asymptotic consistency of the estimates of α, $\beta_0$, $\beta_1$, γ, and λ.
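The three summary statistics used in these scenarios can be sketched in a few lines. The snippet below is only an illustration, not the authors' simulation code: it assumes that RB is the absolute bias divided by the absolute true value, and it uses a toy estimator (the mean of 30 normal draws) in place of the EGSHN maximum likelihood fit. The function names `summarize` and `rmse_from_sd_rb` are hypothetical.

```python
import math
import random
import statistics


def summarize(estimates, true_value):
    """Return (sd, relative bias, root MSE) for Monte Carlo estimates of one parameter."""
    mean_est = statistics.fmean(estimates)
    # Population sd, so that MSE = sd^2 + bias^2 holds exactly.
    sd = statistics.pstdev(estimates)
    bias = mean_est - true_value
    rb = abs(bias) / abs(true_value)  # assumed definition of RB
    mse = statistics.fmean([(e - true_value) ** 2 for e in estimates])
    return sd, rb, math.sqrt(mse)


def rmse_from_sd_rb(sd, rb, true_value):
    """Recover the root MSE from a reported sd and relative bias."""
    return math.sqrt(sd ** 2 + (rb * true_value) ** 2)


# Toy Monte Carlo study: 1000 replicates of a sample mean with n = 30.
rng = random.Random(0)
true_mean = 2.0
estimates = [
    statistics.fmean([rng.gauss(true_mean, 1.0) for _ in range(30)])
    for _ in range(1000)
]
sd, rb, rmse = summarize(estimates, true_mean)

# Consistency check against the first row of Table 1 (alpha = 0.75, n = 30):
# sd = 0.1600 and RB = 0.0903 for the estimator of alpha.
table1_check = rmse_from_sd_rb(0.1600, 0.0903, 0.75)  # about 0.1737
```

Under this assumed definition, the values reported for the estimator of α in the first row of Table 1 (sd = 0.1600, RB = 0.0903) reproduce the tabulated root MSE of 0.1737.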

7. Numerical Illustrations

7.1. Illustration 1

In this paper, we employed the dataset studied by Hirose (1993) [28] to show the relevance of the EGSHN model. This dataset contains the results of an accelerated life test on polyethylene terephthalate (PET) (used for electrical insulation) in SF6 gas-insulated transformers. The accelerated life test was performed at four voltage levels (5, 7, 10, and 15), with 7, 15, 10, and 9 observations per level, respectively. The main purpose of the study was to analyze the resistance times (t) of the insulating films at different voltages (v). Hence, we consider the following regression model:
$$y_i = \log(t_i) = \beta_0 + \beta_1 v_i + \epsilon_i,$$
where $y_i$ follows an EGSHN(α, $\beta_0 + \beta_1 v_i$, σ, γ, λ) distribution.
In this work, the SHN and ESHN models are also fitted (see Ortega et al. [29]). To compare these models, the Akaike information criterion (AIC) (Akaike [30]) and the corrected Akaike information criterion (AICC) (Cavanaugh [31]) are used. These measures are defined as follows:
$$\mathrm{AIC} = -2\,\ell(\hat{\theta}) + 2p \quad \text{and} \quad \mathrm{AICC} = -2\,\ell(\hat{\theta}) + \frac{2n(p+1)}{n-p-2},$$
where n is the sample size, p is the number of parameters, and $\ell(\cdot)$ is the log-likelihood function evaluated at the MLEs of the parameters. The best model is that with the smallest AIC or AICC value. To fit the bivariate model, we used the optim function of the R statistical software (R Core Team).
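As a rough illustration of the objective that a general-purpose optimizer such as optim minimizes, the sketch below writes the negative log-likelihood of the baseline SHN regression model. It assumes the standard sinh-normal parameterization of Rieck and Nedelman [3], in which $Z = (2/\alpha)\sinh((Y-\mu)/\sigma)$ is standard normal. The EGSHN density itself is not reproduced here, and the function names are hypothetical.

```python
import math


def shn_logpdf(y, alpha, mu, sigma):
    """Log-density of SHN(alpha, mu, sigma), where
    Z = (2/alpha) * sinh((y - mu)/sigma) follows a standard normal law."""
    u = (y - mu) / sigma
    z = (2.0 / alpha) * math.sinh(u)
    # log of the Jacobian dz/dy = (2/alpha) * cosh(u) / sigma
    log_jacobian = math.log(2.0 / alpha) + math.log(math.cosh(u)) - math.log(sigma)
    return log_jacobian - 0.5 * math.log(2.0 * math.pi) - 0.5 * z * z


def shn_negloglik(params, y, v):
    """Negative log-likelihood of y_i ~ SHN(alpha, beta0 + beta1 * v_i, sigma)."""
    alpha, beta0, beta1, sigma = params
    if alpha <= 0 or sigma <= 0:
        return math.inf  # outside the parameter space
    return -sum(
        shn_logpdf(yi, alpha, beta0 + beta1 * vi, sigma)
        for yi, vi in zip(y, v)
    )


# Sanity check: the density should integrate to about 1 (Riemann sum on a fine grid).
step = 0.001
total = step * sum(
    math.exp(shn_logpdf(-12.0 + k * step, 1.5, 0.3, 0.8)) for k in range(24001)
)
```

A function of this shape, taking the parameter vector as its first argument, is what one would hand to optim in R or to a comparable minimizer elsewhere. Returning infinity outside the parameter space is one simple way to keep an unconstrained search inside the valid region.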
The estimated parameters of these models, accompanied by their standard errors in parentheses, are obtained using the maximum likelihood method. Table 4 shows the results. Note that, according to the AIC and AICC, the EGSHN and ESHN models present the best fits.
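The two criteria are simple to compute once the maximized log-likelihood is known. The sketch below uses the AIC values reported in Table 4 to back out the log-likelihoods and then recomputes the AICC, taking n = 7 + 15 + 10 + 9 = 41 observations and p = 4, 5, and 6 parameters for the SHN, ESHN, and EGSHN models. The function names are hypothetical.

```python
def aic(loglik, p):
    """Akaike information criterion."""
    return -2.0 * loglik + 2.0 * p


def aicc(loglik, p, n):
    """Corrected AIC in the form -2*loglik + 2n(p + 1)/(n - p - 2)."""
    return -2.0 * loglik + 2.0 * n * (p + 1) / (n - p - 2)


n = 7 + 15 + 10 + 9  # observations across the four voltage levels

# (reported AIC from Table 4, number of parameters)
reported = {"SHN": (89.59, 4), "ESHN": (74.42, 5), "EGSHN": (73.1924, 6)}

recomputed_aicc = {}
for name, (aic_value, p) in reported.items():
    loglik = -(aic_value - 2.0 * p) / 2.0  # invert the AIC definition
    recomputed_aicc[name] = aicc(loglik, p, n)
```

The recomputed values agree with the AICC row of Table 4 to within rounding, which also confirms the signs and the denominator n − p − 2 in the formula.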
In order to identify atypical observations and/or model misspecification, we analyzed the transformation of the martingale residual, $r_{MT_i}$, proposed by Barros et al. [8]. These residuals are defined by
$$r_{MT_i} = \mathrm{sgn}(r_{M_i}) \sqrt{-2\left[ r_{M_i} + \alpha_i \log(\alpha_i - r_{M_i}) \right]}, \quad \text{for } i = 1, 2, 3, \ldots, n,$$
where $r_{M_i} = \alpha_i + \log(S(e_i; \hat{\theta}))$ is the martingale residual proposed by Ortega et al. [32]; $\alpha_i = 0, 1$ indicates whether the i-th observation is censored or not, respectively; $\mathrm{sgn}(r_{M_i})$ denotes the sign of $r_{M_i}$; and $S(e_i; \hat{\theta})$ represents the survival function evaluated at $e_i$, where $\hat{\theta}$ is the MLE of θ.
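A minimal sketch of these two residuals is given below, assuming the fitted survival probability $S(e_i; \hat{\theta})$ has already been computed and lies strictly between 0 and 1. The argument `delta` plays the role of the censoring indicator (1 for an observed failure, 0 for a censored case); the function names are hypothetical.

```python
import math


def martingale_residual(surv, delta):
    """r_M = delta + log(S), with delta = 1 (uncensored) or 0 (censored)."""
    return delta + math.log(surv)


def martingale_type_residual(surv, delta):
    """r_MT = sgn(r_M) * sqrt(-2 * [r_M + delta * log(delta - r_M)])."""
    r_m = martingale_residual(surv, delta)
    # For censored cases the term delta * log(delta - r_M) is taken as 0.
    log_term = math.log(delta - r_m) if delta == 1 else 0.0
    inner = -2.0 * (r_m + log_term)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.copysign(math.sqrt(max(inner, 0.0)), r_m)


r_uncensored = martingale_type_residual(0.5, 1)  # positive: failure came later than typical
r_censored = martingale_type_residual(0.5, 0)    # negative for any censored case
```

For an uncensored observation the residual is zero when the fitted survival probability equals exp(−1), positive above that value, and negative below it, which is why the transformed residuals are roughly symmetric around zero and suitable for normal probability plots with envelopes.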
The plots of $r_{MT_i}$ with generated confidence envelopes are presented in Figure 6. From this figure, we can see clearly that the EGSHN model fits the data better than the SHN and ESHN models, since, in these cases, there are no observations that lie outside the envelopes.
In Figure 6b, we observe two points: one on the border and the other outside it. There is also a point far from the set of observations. Since these points could be values influencing the estimates of the parameters, we calculated the generalized Cook’s distance (GCD) and showed the residual components of the deviation plot. Figure 7 illustrates the behavior of these two statistics. As can be seen in this figure, observations 8, 23, and 34 are potentially influential. Thus, to assess their impact on the estimates, we recomputed the estimates after eliminating each of these observations or groups of them.
The relative change (RC), in percentage, of each parameter estimate is used to evaluate the effect of the potentially influential cases. The RC is given by $RC(\theta_{(i)}) = 100 \times |(\hat{\theta} - \hat{\theta}_{(i)})/\hat{\theta}|$, where $\hat{\theta}_{(i)}$ denotes the MLE of θ after removing the i-th observation. Table 5 lists the obtained RC values. According to this table, the relative changes of the MLE of parameter λ are very pronounced in all cases, mainly for observation 8 and the {8, 23} set. Thus, after deleting observations 8 and 23, the new parameter estimates (standard errors in parentheses) are $\hat{\alpha} = 8.5382\ (2.7109)$, $\hat{\beta}_0 = 9.4767\ (0.1500)$, $\hat{\beta}_1 = -0.4311\ (0.0184)$, $\hat{\sigma} = 0.6937\ (0.0833)$, $\hat{\gamma} = 5.8458\ (3.0420)$, and $\hat{\lambda} = -0.6106\ (0.1838)$, with AIC = 59.30 and AICC = 64.91, as illustrated in Figure 7c.
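The RC statistic itself is a one-line computation. The sketch below applies it to made-up estimates, chosen only to show the arithmetic; they are not values from this study, and the function and variable names are hypothetical.

```python
def relative_change(theta_full, theta_deleted):
    """Percentage relative change of an estimate after case deletion."""
    return 100.0 * abs((theta_full - theta_deleted) / theta_full)


# Hypothetical estimates with all observations and after deleting one case.
full_fit = {"alpha": 2.0, "beta1": -0.50}
deleted_fit = {"alpha": 1.9, "beta1": -0.55}

rc = {name: relative_change(full_fit[name], deleted_fit[name]) for name in full_fit}
# rc["alpha"] is about 5 (percent) and rc["beta1"] is about 10 (percent)
```

Because the change is scaled by the full-data estimate, parameters whose estimates are close to zero can show large RC values even when the absolute change is small.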

7.2. Illustration 2

We used the dataset reported by the Center for Applied Statistics of the Institute of Mathematics and Statistics at the University of São Paulo to illustrate the relevance of the EGBSSN model. This dataset contains data on the amount of DNA within the nucleus (ploidy) of mammary cells (250 samples) from women with breast cancer. The ploidy variable exhibits bimodal, asymmetric behavior: the dip test of Hartigan and Hartigan (1985) [33] gives D = 0.0399 with p-value = 0.0059. In addition, according to its descriptive statistics (presented in Table 6), this variable has considerable positive skewness and high kurtosis. Figure 8a also shows the bimodal distribution of the data, which is why the EGBSSN model is an alternative for fitting this kind of data.
In this study, the Birnbaum–Saunders (BS) and Birnbaum–Saunders skew-normal (BSSN) models were fitted. We also considered the following bimodal models: the log-bimodal-skew-normal (BLSN) model developed by Bolfarine et al. (2011) [16], the extended class (EBN) introduced by Cortés et al. (2018) [17], and the EGBS model.
Table 7 presents the maximum likelihood estimates and the AIC and AICC values of the five models. As observed in this table, the BS and BSSN models provide a poor fit to the DNA dataset. Conversely, the EGBS model shows the best fit among the fitted models, which is explained by its flexibility to fit asymmetric bimodal data.
Figure 8a,b shows the estimated densities of the models with the best fit (BSSN, BLSN, EBS, and EGBS); the empirical cumulative distribution functions of the BLSN, EBS, and EGBS models; and the parameter estimates. Note that the EGBS model provides the best fit when compared to the BSSN, BLSN, EBN, and BS models.

8. Discussion

Bimodality and skewness are two common features that may be present in data from engineering, geo-spatial, medicine, and other areas. The natural complexity of data from these areas needs to be fitted using models that offer great flexibility and goodness of fit. In some cases, the data only present positive skewness, or in other cases, the distribution is only bimodal, but these two features could be present simultaneously. Thus, in this paper, a distribution capable of fitting bimodal and positively skewed data sets was proposed. In addition, the extended Birnbaum–Saunders distribution was studied as well.
Although the literature offers proposals, such as those cited in the previous sections, for fitting asymmetric or bimodal data, our model can accommodate data that exhibit both characteristics simultaneously, with great flexibility and goodness of fit. It is a new, promising, and user-friendly option to consider in statistical analysis.
To conclude, the EGBS model was proposed as a new statistical distribution suitable for fitting real data sets with positive skewness and bimodality; it performs well compared with models available in the literature that aim to achieve the same objective.

Author Contributions

The contributions of the authors in this paper are described as follows: Conceptualization, G.M.-F., D.E.-O. and C.B.-C.; methodology, G.M.-F. and D.E.-O.; formal analysis, G.M.-F. and C.B.-C.; investigation, G.M.-F., D.E.-O. and C.B.-C.; resources and data curation, G.M.-F. and C.B.-C.; writing, original draft preparation, C.B.-C.; writing, review and editing, C.B.-C.; visualization, C.B.-C.; supervision, G.M.-F. and C.B.-C.; project administration, G.M.-F. and C.B.-C.; funding acquisition, C.B.-C. and D.E.-O. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Instituto Tecnológico Metropolitano (ITM) through the project (P20244) Priorización de zonas para la restauración ecológica y de uso público mediante la armonización de técnicas de mapeo participativo y modelación espacial multicriterio en el municipio de Belmira, Antioquia, Colombia.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

C.B.-C. extends their sincere gratitude to the Instituto Tecnológico Metropolitano (ITM) and South Pole for their support. D.E.-O. thanks the DIUDA REGULAR project No. 22409 of the Universidad de Atacama, Chile, and the project Distribuições de Probabilidade Mutivariadas Assimétricas e Flexíveis, Universidade Federal do Ceará, Fortaleza, Brazil. G.M.-F. thanks Universidad de Córdoba and Universidade Federal do Ceará, Brazil.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Birnbaum, Z.W.; Saunders, S.C. A New Family of Life Distributions. J. Appl. Probab. 1969, 6, 319–327.
  2. Desmond, A. Stochastic models of Failure in random environments. Can. J. Stat. 1985, 13, 171–183.
  3. Rieck, J.R.; Nedelman, J.R. A log-linear model for the Birnbaum-Saunders distribution. Technometrics 1991, 33, 51–60.
  4. Leiva, V.; Vilca-Labra, F.; Balakrishnan, N.; Sanhueza, A. A skewed sinh-normal distribution and its properties and application to air pollution. Commun. Stat. Theory Methods 2010, 39, 426–443.
  5. Lemonte, A.J. A log-Birnbaum-Saunders regression model with asymmetric errors. J. Stat. Comput. Simul. 2011, 82, 1775–1787.
  6. Martínez-Flórez, G.; Bolfarine, H.; Gómez, H.W. The Log-Linear Birnbaum-Saunders Power Model. Methodol. Comput. Appl. Probab. 2017, 19, 913–933.
  7. Moreno-Arenas, G.; Martínez-Flórez, G.; Barrera-Causil, C. Proportional Hazard Birnbaum-Saunders Distribution with Application to the Survival Data Analysis. Rev. Colomb. Estad. 2016, 39, 129–147.
  8. Barros, M.; Galea, M.; Gonzalez, M.; Leiva, V. Influence diagnostics in the tobit censored response model. Stat. Methods Appl. 2010, 19, 379–397.
  9. Díaz-García, J.A.; Leiva-Sánchez, V. A new family of life distributions based on the elliptically contoured distributions. J. Stat. Plan. Inference 2005, 128, 445–457.
  10. Vilca-Labra, F.; Leiva-Sanchez, V. A new fatigue life model based on the family of skew-elliptical distributions. Commun. Stat. Theory Methods 2006, 35, 229–244.
  11. Martínez-Flórez, G.; Bolfarine, H.; Gómez, Y.M.; Gómez, H.W. A Unification of Families of Birnbaum-Saunders Distributions with Applications. REVSTAT Stat. J. 2020, 15, 637–660.
  12. Castillo, N.; Gomez, H.W.; Bolfarine, H. Epsilon Birnbaum-Saunders distribution family: Properties and inference. Stat. Pap. 2011, 52, 871–883.
  13. Cordeiro, G.M.; Lemonte, A.J. The exponentiated generalized Birnbaum-Saunders distribution. Appl. Math. Comput. 2014, 247, 762–779.
  14. Reyes, J.; Barranco-Chamorro, I.; Gallardo, D.I.; Gómez, H.W. Generalized Modified Slash Birnbaum-Saunders Distribution. Symmetry 2018, 10, 724.
  15. Olmos, N.M.; Martínez-Flórez, G.; Bolfarine, H. Bimodal Birnbaum-Saunders Distribution with Application to Corrosion Data. Commun. Stat. Theory Methods 2017, 46, 6240–6257.
  16. Bolfarine, H.; Gómez, H.W.; Rivas, L. The log-bimodal-skew-normal model. A geochemical application. J. Chemom. 2011, 25, 329–332.
  17. Cortés, M.A.; Elal-Olivero, D.; Olivares-Pacheco, J.F. A New Class of Distributions Generated by the Extended Bimodal-Normal Distribution. J. Probab. Stat. 2018, 2018, 9753439.
  18. Elal-Olivero, D. Alpha-skew-normal distribution. Proyecc. J. Math. 2010, 29, 224–240.
  19. Martínez-Flórez, G.; Barrera-Causil, C.; Marmolejo-Ramos, F. The Exponential-Centred Skew-Normal Distribution. Symmetry 2020, 12, 1140.
  20. Garcia-Papani, F.; Uribe-Opazo, M.A.; Leiva, V.; Aykroyd, R.G. Birnbaum-Saunders spatial modelling and diagnostics applied to agricultural engineering data. Stoch. Environ. Res. Risk Assess. 2017, 31, 105–124.
  21. Sánchez, L.; Leiva, V.; Galea, M.; Saulo, H. Birnbaum-Saunders quantile regression models with application to spatial data. Mathematics 2020, 8, 1000.
  22. Martinez, S.; Giraldo, R.; Leiva, V. Birnbaum-Saunders functional regression models for spatial data. Stoch. Environ. Res. Risk Assess. 2019, 33, 1765–1780.
  23. Azzalini, A.; Capitanio, A. Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution. J. R. Stat. Soc. Ser. B Stat. Methodol. 2003, 65, 367–389.
  24. Azzalini, A. The skew-normal distribution and related multivariate families. Scand. J. Stat. Theory Appl. 2005, 32, 159–200.
  25. Elal-Olivero, D.; Gómez, H.W.; Quintana, F.A. Bayesian modeling using a class of bimodal skew-elliptical distributions. J. Stat. Plan. Inference 2009, 139, 1484–1492.
  26. Azzalini, A. A class of distributions which includes the normal ones. Scand. J. Stat. 1985, 12, 171–178.
  27. Prudnikov, A.P.; Brychkov, Y.A.; Marichev, O.I. Integrals and Series; Gordon and Breach Science Publishers: Amsterdam, The Netherlands, 1990; Volumes 1–3.
  28. Hirose, H. Estimation of threshold stress in accelerated life-testing. IEEE Trans. Reliab. 1993, 42, 650–657.
  29. Ortega, E.M.; Cordeiro, G.; Lemonte, A. A log-linear regression model for the β-Birnbaum-Saunders distribution with censored data. Comput. Stat. Data Anal. 2012, 56, 698–718.
  30. Akaike, H. A new look at the statistical model identification. IEEE Trans. Autom. Control 1974, 19, 716–723.
  31. Cavanaugh, J.E. Unifying the derivations for the Akaike and corrected Akaike information criteria. Stat. Probab. Lett. 1997, 33, 201–208.
  32. Ortega, E.M.; Bolfarine, H.; Paula, G.A. Influence diagnostics in generalized log-gamma regression models. Comput. Stat. Data Anal. 2003, 42, 165–186.
  33. Hartigan, J.A.; Hartigan, P.M. The dip test of unimodality. Ann. Stat. 1985, 13, 70–84.
Figure 1. Distribution (a) ESHN(2.75, 0, 1, γ) for γ = 3.5 (solid line), γ = 2.5 (dashed line), γ = 1.5 (dotted line), and γ = 0 (dash-dotted line); (b) ESHN(1.75, 0, 1, γ) for γ = 3.5 (solid line), γ = 2.5 (dashed line), γ = 1.5 (dotted line), and γ = 0 (dash-dotted line); and (c) ESHN(0.75, 0, 1, γ) for γ = 3.5 (solid line), γ = 2.5 (dashed line), γ = 1.5 (dotted line), and γ = 0 (dash-dotted line).
Figure 2. EGSHN distribution (a) ESHN(0.75, 0, 1, 3.5, 0.75) (solid line), ESHN(0.75, 0, 1, 2.5, 0.5) (dashed line), ESHN(0.75, 0, 1, 0.5, 1.5) (dotted line), and ESHN(0.75, 0, 1, 1.5, 0.75) (dash-dotted line); (b) ESHN(1.5, 0, 1, 3.5, 2.5) (solid line), ESHN(1.5, 0, 1, 2.5, 1.5) (dashed line), ESHN(1.5, 0, 1, 1.5, 2.5) (dotted line), and ESHN(1.5, 0, 1, 0.5, 1.5) (dash-dotted line); and (c) ESHN(2.5, 0, 1, 3.5, 0.25) (solid line), ESHN(2.5, 0, 1, 3.5, 0.75) (dashed line), ESHN(2.5, 0, 1, 2.5, 0.75) (dotted line), and ESHN(2.5, 0, 1, 0.75, 0.25) (dash-dotted line).
Figure 3. Empirical sd, relative bias, and √MSE for the estimators of the EGSHN(α, 2.25, 0.75, 1.5, 1) model parameters with sample sizes of 30, 60, 90, 120, and 500.
Figure 4. Empirical sd, relative bias, and √MSE for the estimators of the EGSHN(1.75, 2.25, 0.75, γ, 1) model parameters with sample sizes of 30, 60, 90, 120, and 500.
Figure 5. Empirical sd, relative bias, and √MSE for the estimators of the EGSHN(1.75, 2.25, 0.75, 2.0, λ) model parameters with sample sizes of 30, 60, 90, 120, and 500.
Figure 6. Normal probability plots for $r_{MT_i}$ with envelopes of Q-Q plots for the scaled residuals from the fitted models: (a) ESHN and (b) EGSHN.
Figure 7. Influence measures for the EGSHN model: (a) Cook’s distance, (b) $r_{MT_i}$, and (c) envelope plot of the EGSHN model.
Figure 8. (a) Histogram of the variable (amount of DNA in cancer cells) with the fitted EGBS (solid line), EBS (dashed line), BLSN (dotted line), and BSSN (dash-dotted line) distributions; and (b) empirical cumulative distribution (solid line) and the cumulative distributions of the EGBS (dashed line), EBS (dotted line), and BLSN (dash-dotted line) models.
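Table 1. Empirical sd, relative bias (RB), and √MSE for the EGSHN(α, 2.25, 0.75, 1.5, 1) model.

| α | n | sd(α̂) | RB(α̂) | √MSE(α̂) | sd(γ̂) | RB(γ̂) | √MSE(γ̂) | sd(λ̂) | RB(λ̂) | √MSE(λ̂) | sd(β̂0) | RB(β̂0) | √MSE(β̂0) | sd(β̂1) | RB(β̂1) | √MSE(β̂1) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.75 | 30 | 0.1600 | 0.0903 | 0.1737 | 3.4599 | 1.1505 | 3.8661 | 0.6997 | 0.0677 | 0.7029 | 0.3332 | 0.0386 | 0.3443 | 0.4105 | 0.0177 | 0.4107 |
| 0.75 | 60 | 0.1339 | 0.0502 | 0.1390 | 2.5812 | 0.7354 | 2.8067 | 0.4756 | 0.0380 | 0.4770 | 0.2737 | 0.0226 | 0.2784 | 0.2818 | 0.0145 | 0.2819 |
| 0.75 | 90 | 0.1194 | 0.0311 | 0.1216 | 1.9235 | 0.5013 | 2.0651 | 0.4127 | 0.0216 | 0.4132 | 0.2396 | 0.0117 | 0.241 | 0.2252 | 0.0046 | 0.2252 |
| 0.75 | 120 | 0.1087 | 0.0187 | 0.1096 | 1.2017 | 0.3818 | 1.3311 | 0.3681 | 0.0028 | 0.3681 | 0.2147 | 0.0040 | 0.2148 | 0.1897 | 0.0034 | 0.1897 |
| 0.75 | 500 | 0.0541 | 0.0038 | 0.0542 | 0.3686 | 0.0705 | 0.3834 | 0.1759 | 0.0002 | 0.1759 | 0.1069 | 0.0017 | 0.1069 | 0.0913 | 0.0026 | 0.0913 |
| 1.75 | 30 | 0.4831 | 0.0640 | 0.4958 | 6.4445 | 1.1316 | 6.6637 | 0.5962 | 0.0201 | 0.5965 | 0.5550 | 0.0531 | 0.5677 | 0.6102 | 0.0124 | 0.6102 |
| 1.75 | 60 | 0.4162 | 0.0263 | 0.4187 | 1.9279 | 0.5686 | 2.1079 | 0.5113 | 0.0198 | 0.5116 | 0.4404 | 0.0211 | 0.4429 | 0.4255 | 0.0093 | 0.4255 |
| 1.75 | 90 | 0.3661 | 0.0135 | 0.3669 | 1.2541 | 0.3858 | 1.3811 | 0.4518 | 0.0292 | 0.4527 | 0.3657 | 0.0086 | 0.3662 | 0.3324 | 0.0137 | 0.3325 |
| 1.75 | 120 | 0.3354 | 0.0063 | 0.3356 | 0.8840 | 0.2594 | 0.9657 | 0.4098 | 0.0267 | 0.4107 | 0.3303 | 0.0072 | 0.3307 | 0.2839 | 0.0040 | 0.2839 |
| 1.75 | 500 | 0.1476 | 0.0012 | 0.1476 | 0.3285 | 0.0486 | 0.3364 | 0.1646 | 0.0101 | 0.1649 | 0.1568 | 0.0001 | 0.1568 | 0.1379 | 0.0036 | 0.1379 |
| 2.75 | 30 | 0.8355 | 0.0578 | 0.8504 | 5.0777 | 1.0509 | 5.3163 | 0.6025 | 0.0095 | 0.6025 | 0.6021 | 0.0555 | 0.6149 | 0.6722 | 0.0149 | 0.6722 |
| 2.75 | 60 | 0.7030 | 0.0225 | 0.7056 | 2.2156 | 0.5590 | 2.3687 | 0.5214 | 0.0280 | 0.5221 | 0.4612 | 0.0257 | 0.4648 | 0.4591 | 0.0102 | 0.4591 |
| 2.75 | 90 | 0.6090 | 0.0079 | 0.6094 | 1.2718 | 0.3672 | 1.3858 | 0.4443 | 0.0301 | 0.4452 | 0.3931 | 0.0139 | 0.3943 | 0.3725 | 0.0151 | 0.3727 |
| 2.75 | 120 | 0.5096 | 0.0104 | 0.5104 | 0.9270 | 0.2499 | 0.9998 | 0.3758 | 0.0157 | 0.3761 | 0.3296 | 0.0127 | 0.3308 | 0.3076 | 0.0022 | 0.3075 |
| 2.75 | 500 | 0.2182 | 0.0033 | 0.2184 | 0.3371 | 0.0563 | 0.3475 | 0.1417 | 0.0027 | 0.1417 | 0.1525 | 0.0022 | 0.1525 | 0.1496 | 0.0000 | 0.1496 |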
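Table 2. Empirical sd, relative bias (RB), and √MSE for the EGSHN(1.75, 2.25, 0.75, γ, 1) model.

| γ | n | sd(α̂) | RB(α̂) | √MSE(α̂) | sd(γ̂) | RB(γ̂) | √MSE(γ̂) | sd(λ̂) | RB(λ̂) | √MSE(λ̂) | sd(β̂0) | RB(β̂0) | √MSE(β̂0) | sd(β̂1) | RB(β̂1) | √MSE(β̂1) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1.0 | 30 | 0.5060 | 0.0784 | 0.5242 | 2.6997 | 1.3552 | 3.0205 | 0.6351 | 0.0473 | 0.6368 | 0.5845 | 0.0553 | 0.5975 | 0.6630 | 0.0204 | 0.6631 |
| 1.0 | 60 | 0.4768 | 0.0359 | 0.4809 | 1.4215 | 0.6769 | 1.5743 | 0.6234 | 0.0014 | 0.6233 | 0.4910 | 0.0272 | 0.4948 | 0.4506 | 0.0130 | 0.4507 |
| 1.0 | 90 | 0.4368 | 0.0159 | 0.4376 | 0.7941 | 0.4384 | 0.9071 | 0.5594 | 0.0290 | 0.5601 | 0.4321 | 0.0130 | 0.4330 | 0.3520 | 0.0017 | 0.3520 |
| 1.0 | 120 | 0.3930 | 0.0121 | 0.3935 | 0.5962 | 0.2952 | 0.6652 | 0.5107 | 0.0281 | 0.5115 | 0.3900 | 0.0113 | 0.3908 | 0.3088 | 0.0100 | 0.3088 |
| 1.0 | 500 | 0.1957 | 0.0000 | 0.1957 | 0.2305 | 0.0703 | 0.2409 | 0.2402 | 0.0121 | 0.2404 | 0.1977 | 0.0005 | 0.1977 | 0.1493 | 0.0026 | 0.1493 |
| 2.5 | 30 | 0.4612 | 0.0446 | 0.4677 | 6.0222 | 0.7279 | 6.2906 | 0.5485 | 0.0185 | 0.5487 | 0.5092 | 0.0417 | 0.5178 | 0.5756 | 0.0272 | 0.5759 |
| 2.5 | 60 | 0.3695 | 0.0123 | 0.3701 | 4.8084 | 0.6148 | 5.0476 | 0.4371 | 0.0360 | 0.4386 | 0.3835 | 0.0093 | 0.3841 | 0.3859 | 0.0084 | 0.3859 |
| 2.5 | 90 | 0.3035 | 0.0089 | 0.3039 | 2.7667 | 0.3974 | 2.9394 | 0.3429 | 0.0270 | 0.3439 | 0.3177 | 0.0083 | 0.3182 | 0.3121 | 0.0062 | 0.3121 |
| 2.5 | 120 | 0.2655 | 0.0026 | 0.2655 | 1.9695 | 0.2684 | 2.0805 | 0.3056 | 0.0274 | 0.3068 | 0.2768 | 0.0029 | 0.2768 | 0.2694 | 0.0026 | 0.2690 |
| 2.5 | 500 | 0.1129 | 0.0007 | 0.1129 | 0.5670 | 0.0575 | 0.5849 | 0.1213 | 0.0049 | 0.1214 | 0.1257 | 0.0008 | 0.1257 | 0.1270 | 0.0014 | 0.1270 |
| 4.0 | 30 | 0.4246 | 0.0245 | 0.4267 | 13.3364 | 0.6570 | 13.5521 | 0.5613 | 0.0693 | 0.5655 | 0.4786 | 0.0269 | 0.4824 | 0.5404 | 0.0179 | 0.5406 |
| 4.0 | 60 | 0.3298 | 0.0048 | 0.3298 | 11.4564 | 0.5569 | 11.7093 | 0.4094 | 0.0631 | 0.4142 | 0.3450 | 0.0060 | 0.3452 | 0.3628 | 0.0054 | 0.3628 |
| 4.0 | 90 | 0.2629 | 0.0015 | 0.2629 | 5.9366 | 0.4821 | 6.2414 | 0.2938 | 0.0363 | 0.2960 | 0.2792 | 0.0002 | 0.2792 | 0.2922 | 0.0103 | 0.2922 |
| 4.0 | 120 | 0.2210 | 0.0018 | 0.2210 | 5.6103 | 0.3773 | 5.8091 | 0.2506 | 0.0218 | 0.2515 | 0.2328 | 0.0005 | 0.2328 | 0.2498 | 0.0027 | 0.2498 |
| 4.0 | 500 | 0.0982 | 0.0007 | 0.0982 | 1.0651 | 0.0704 | 1.1016 | 0.1075 | 0.0057 | 0.1076 | 0.1111 | 0.0001 | 0.1111 | 0.1218 | 0.0018 | 0.1218 |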
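Table 3. Empirical sd, relative bias (RB), and √MSE for the EGSHN(1.75, 2.25, 0.75, 2.0, λ) model.

| λ | n | sd(α̂) | RB(α̂) | √MSE(α̂) | sd(γ̂) | RB(γ̂) | √MSE(γ̂) | sd(λ̂) | RB(λ̂) | √MSE(λ̂) | sd(β̂0) | RB(β̂0) | √MSE(β̂0) | sd(β̂1) | RB(β̂1) | √MSE(β̂1) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.50 | 30 | 0.3354 | 0.0083 | 0.3357 | 5.5409 | 0.7808 | 5.7563 | 0.3244 | 0.0541 | 0.3255 | 0.4508 | 0.0048 | 0.4509 | 0.6493 | 0.0215 | 0.6494 |
| 0.50 | 60 | 0.2278 | 0.0042 | 0.2279 | 3.0014 | 0.5044 | 3.1661 | 0.2077 | 0.0389 | 0.2086 | 0.2991 | 0.0020 | 0.2991 | 0.4186 | 0.0087 | 0.4186 |
| 0.50 | 90 | 0.1779 | 0.0055 | 0.1782 | 2.2964 | 0.3269 | 2.3874 | 0.1637 | 0.0189 | 0.1640 | 0.2386 | 0.0014 | 0.2386 | 0.3376 | 0.0011 | 0.3376 |
| 0.50 | 120 | 0.1570 | 0.0008 | 0.1570 | 1.2968 | 0.2010 | 1.3575 | 0.1412 | 0.0246 | 0.1417 | 0.2094 | 0.0011 | 0.2094 | 0.2892 | 0.0046 | 0.2890 |
| 0.50 | 500 | 0.0711 | 0.0018 | 0.0711 | 0.4458 | 0.0465 | 0.4554 | 0.0639 | 0.0035 | 0.0639 | 0.0955 | 0.0000 | 0.0955 | 0.1374 | 0.0053 | 0.1374 |
| 1.5 | 30 | 0.4651 | 0.1093 | 0.5029 | 4.9920 | 0.8175 | 5.2525 | 0.9968 | 0.0428 | 0.9988 | 0.5470 | 0.0949 | 0.5872 | 0.5720 | 0.0377 | 0.5727 |
| 1.5 | 60 | 0.4308 | 0.0552 | 0.4414 | 3.8817 | 0.6443 | 4.0896 | 0.7276 | 0.0189 | 0.7276 | 0.4703 | 0.0479 | 0.4824 | 0.3853 | 0.0162 | 0.3855 |
| 1.5 | 90 | 0.4041 | 0.0385 | 0.4096 | 2.7730 | 0.4517 | 2.9162 | 0.6860 | 0.0133 | 0.6860 | 0.4268 | 0.0335 | 0.4333 | 0.3179 | 0.0066 | 0.3179 |
| 1.5 | 120 | 0.3746 | 0.0211 | 0.3764 | 1.5575 | 0.3195 | 1.6834 | 0.6473 | 0.0097 | 0.6475 | 0.3894 | 0.0179 | 0.3915 | 0.2758 | 0.0123 | 0.2760 |
| 1.5 | 500 | 0.2312 | 0.0024 | 0.2312 | 0.5188 | 0.0785 | 0.5420 | 0.3727 | 0.0013 | 0.3738 | 0.2313 | 0.0004 | 0.2313 | 0.1314 | 0.0037 | 0.1314 |
| 3.0 | 30 | 0.3930 | 0.1271 | 0.4515 | 3.4510 | 0.3915 | 3.5384 | 17.0967 | 0.1390 | 17.0977 | 0.4872 | 0.1168 | 0.5535 | 0.5789 | 0.0631 | 0.5808 |
| 3.0 | 60 | 0.3635 | 0.0677 | 0.3823 | 3.1596 | 0.3342 | 3.2292 | 8.0921 | 0.1382 | 8.0979 | 0.4215 | 0.0661 | 0.4469 | 0.3792 | 0.0234 | 0.3796 |
| 3.0 | 90 | 0.3421 | 0.0446 | 0.3508 | 3.0435 | 0.2479 | 3.0834 | 2.4923 | 0.1378 | 2.5170 | 0.4013 | 0.0493 | 0.4163 | 0.3132 | 0.0092 | 0.3133 |
| 3.0 | 120 | 0.3237 | 0.0240 | 0.3264 | 2.0577 | 0.2166 | 2.1026 | 2.0773 | 0.1180 | 2.1181 | 0.3781 | 0.0313 | 0.3846 | 0.2592 | 0.0039 | 0.2592 |
| 3.0 | 500 | 0.2590 | 0.0113 | 0.2597 | 1.1857 | 0.1159 | 1.208 | 1.6695 | 0.1095 | 1.7197 | 0.2996 | 0.0019 | 0.2996 | 0.1298 | 0.0018 | 0.1290 |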
Table 4. Estimated parameters, with their standard errors in parentheses, for the SHN, ESHN, and EGSHN models.

| Estimator | SHN | ESHN | EGSHN |
|---|---|---|---|
| α̂ | 245.9799 (230.85) | 66.9952 (48.8845) | 7.9489 (2.6274) |
| β̂0 | 9.2750 (0.1595) | 9.3422 (0.1666) | 9.3456 (0.1616) |
| β̂1 | −0.4217 (0.0190) | −0.4077 (0.0165) | −0.4144 (0.0189) |
| σ̂ | 0.3572 (0.0536) | 0.4306 (0.0585) | 0.7539 (0.0903) |
| γ̂ |  | 0.3070 (0.1166) | 6.1671 (3.3568) |
| λ̂ |  |  | −0.6493 (0.1965) |
| AIC | 89.59 | 74.42 | 73.1924 |
| AICC | 93.30 | 78.89 | 78.58 |
Table 5. Relative change of the estimates of the EGSHN model.

| Observation | α̂ | β̂0 | β̂1 | σ̂ | γ̂ | λ̂ |
|---|---|---|---|---|---|---|
| 8 | 0.0752 | 1.0751 | 3.6291 | 4.4136 | 0.2496 | 9.5977 |
| 23 | 0.1942 | 0.0200 | 0.2091 | 1.2090 | 0.0877 | 2.5542 |
| 34 | 0.2357 | 1.5518 | 4.3536 | 1.0540 | 0.3764 | 7.9936 |
| 8, 23 | 0.0839 | 0.8594 | 2.8812 | 5.7076 | 0.3233 | 13.8282 |
| 8, 34 | 0.0253 | 0.6924 | 2.0139 | 3.4098 | 0.1873 | 10.0372 |
| 23, 34 | 0.0731 | 0.9189 | 3.0378 | 0.6211 | 0.2350 | 8.8319 |
| 8, 23, 34 | 0.0326 | 0.6332 | 1.7068 | 5.2954 | 0.2438 | 14.2785 |
Table 6. Descriptive statistics of the ploidy dataset.

| $\bar{y}$ | $s_y^2$ | $b_1$ | $b_2$ |
|---|---|---|---|
| 3.636 | 1.432 | 0.452 | 0.865 |
Table 7. Estimated parameters (standard errors) for the fitted models.

| Estimators | BS | BSSN | BLSN | EBS | EGBS |
|---|---|---|---|---|---|
| α̂ | 0.3145 (0.0140) | 0.5254 (0.0263) | 1.3564 (0.0169) | 0.2033 (0.0066) | 0.2136 (0.0082) |
| β̂ | 3.5194 (0.0698) | 2.3042 (0.032) | 0.2066 (0.0070) | 3.7995 (0.0551) | 4.0200 (0.0643) |
| γ̂ 1 |  |  | 3.9845 (1.2161) | 4.4899 (1.7541) | 5.6760 (2.5460) |
| λ̂ |  | 7.7814 (1.2943) | −0.2874 (0.0677) |  | −0.3724 (0.0701) |
| AIC | 745.58 | 698.51 | 671.95 | 668.41 | 637.8484 |
| AICC | 747.68 | 700.68 | 674.20 | 670.58 | 640.09 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Martínez-Flórez, G.; Elal-Olivero, D.; Barrera-Causil, C. Extended Generalized Sinh-Normal Distribution. Mathematics 2021, 9, 2793. https://doi.org/10.3390/math9212793
