Article

Generalized Cardioid Distributions for Circular Data Analysis

by Fernanda V. Paula 1,*,†, Abraão D. C. Nascimento 2,†, Getúlio J. A. Amaral 2,† and Gauss M. Cordeiro 2,†

1 Mathematics Degree Course, Federal University of the Tocantins, Araguaína 77824-838, TO, Brazil
2 Department of Statistics, Federal University of Pernambuco, Recife 50740-540, PE, Brazil
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Stats 2021, 4(3), 634-649; https://doi.org/10.3390/stats4030038
Submission received: 29 June 2021 / Revised: 1 August 2021 / Accepted: 7 August 2021 / Published: 11 August 2021
(This article belongs to the Section Applied Stochastic Models)

Abstract

The Cardioid (C) distribution is one of the most important models for circular data. Although some of its structural properties have been derived, this distribution is not appropriate for asymmetric and multimodal phenomena on the circle, so extensions are required. There are various general methods that can be used to produce circular distributions. This paper proposes four extensions of the C distribution based on the beta, Kumaraswamy, gamma, and Marshall–Olkin generators. We obtain a single linear representation of their densities and some mathematical properties. Inference procedures for the parameters are also investigated. We perform two applications to real data, where the new models are compared to the C distribution and one of its extensions.

1. Introduction

Fitting densities to data has a long history. Statistical distributions are very useful in describing and predicting real world phenomena. Hundreds of extended distributions have been developed by introducing one or more parameters to a baseline distribution over the past decades for modeling data in several disciplines, in particular in reliability engineering [1], survival analysis [2], demography [3], actuarial science [4], etc.
Adding parameters to a well-established distribution is a time-honored device for obtaining more flexible families of distributions, and several such classes have been introduced in the statistical literature. Recent developments address definitions of new families that extend well-known distributions and, at the same time, provide great flexibility in modeling real data. The well-known generators are the Marshall–Olkin-G [5], beta-G [6], gamma-G [7], Kumaraswamy-G (Kw-G) [8], exponentiated generalized (EG) [9], type I half-logistic-G [10], Burr X-G [11], and exponentiated Weibull-H [12], among others. These generators have been applied in the context of linear data, i.e., data whose support is a subset of $\mathbb{R}$.
Several phenomena in practice provide angles (expressed in degrees or radians) as outputs called circular data, such as in the analysis of phase features obtained from radar imagery [13], time series analysis of wind speeds and directions [14], etc. As one of the most used circular distributions, the two-parameter Cardioid (C) law was pioneered by Jeffreys [15] for describing directional spectra of ocean waves. This model has a cumulative distribution function (cdf), $G(x) = G(x;\mu,\rho)$, and probability density function (pdf), $g(x) = g(x;\mu,\rho)$, given by (for $0 < x \le 2\pi$)
$$G(x) = \frac{x}{2\pi} + \frac{\rho}{\pi}\left[\sin(x-\mu) + \sin(\mu)\right] \tag{1}$$
and
$$g(x) = \frac{1}{2\pi}\left[1 + 2\rho\cos(x-\mu)\right], \tag{2}$$
respectively, where $0 < \mu \le 2\pi$ is a location parameter, and $|\rho| \le 0.5$ represents a concentration index. Some known competing distributions to the C distribution are the wrapped normal, wrapped Cauchy, wrapped Lévy, and wrapped Lindley. A novel circular distribution introduced by Wang and Shimizu [16] applied the Möbius transformation to the C model. The Papakonstantinou family studied by Abe et al. [17] also extended (1). However, these extensions have densities with complicated analytic formulas. Recently, Paula et al. [18] introduced a simple extended C distribution, called the exponentiated Cardioid (EC), derived from the exponentiated G (exp-G) generator (after adapting the mapping from linear to circular), which can describe asymmetric and some bimodal cases beyond those of the C model. The models mentioned and those presented in this work are also classified as trigonometric distributions. In recent years, many trigonometric models have been proposed, such as the transformed Sin-G family [19] and the Cos-G class [20], thus highlighting their importance.
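As a quick numerical companion to Equations (1) and (2), the following minimal sketch evaluates the Cardioid cdf and pdf. The function names and the parameter values are illustrative assumptions, not part of the original paper.

```python
import math

TWO_PI = 2.0 * math.pi


def cardioid_pdf(x, mu, rho):
    """Cardioid density g(x; mu, rho) for 0 < x <= 2*pi and |rho| <= 0.5."""
    return (1.0 + 2.0 * rho * math.cos(x - mu)) / TWO_PI


def cardioid_cdf(x, mu, rho):
    """Cardioid cdf G(x; mu, rho) for 0 < x <= 2*pi."""
    return x / TWO_PI + (rho / math.pi) * (math.sin(x - mu) + math.sin(mu))


if __name__ == "__main__":
    mu, rho = 2.0, 0.2  # illustrative values only
    print(cardioid_cdf(TWO_PI, mu, rho))  # total mass over one turn
    print(cardioid_pdf(1.0, mu, rho))
```

Differentiating the cdf term by term gives the pdf, which is an easy consistency check on the two formulas.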
In this work, we derive four extensions of the C model through the adapted β-G, Kw-G, Γ-G, and MO-G generators, which extend the exp-G family. We propose four new circular distributions called the beta Cardioid (βC), Kumaraswamy Cardioid (KwC), gamma Cardioid (ΓC), and Marshall–Olkin Cardioid (MOC). Their densities are expressed in a single linear representation, which is the result of weighting the term $1 + 2\rho\cos(x-\mu)$ in Equation (2). Circular data phenomena often demand tailored clustering structures. Abraham et al. [21] presented a discussion on an unsupervised clustering algorithm for circular data obtained from X-ray beam projectors. Based on mixtures of one-dimensional Langevin distributions, Qiu and Wu [22] derived a new information criterion to cluster circular data. We understand that these works motivate our proposals as potential inputs for future clustering structures. Furthermore, some mathematical properties of the new models are derived, such as expansions and trigonometric moments [23]. A brief discussion about likelihood-based estimation procedures is provided. Finally, two applications to real data are performed to illustrate the flexibility of our proposals.
The remainder of this paper is organized as follows. New circular distributions are defined in Section 2. Section 3 provides some of their properties, and an estimation procedure is addressed in Section 4. Subsequently, two applications to real data are performed in Section 5, and some conclusions are offered in Section 6.

2. Generalized Cardioid Models

We provide some three- and four-parameter distributions by transforming the C distribution according to four well-known generators.
Let $G(x)$ be the cdf of a baseline distribution with p parameters:
(a) The β-G cdf defined by Eugene et al. [6] is
$$F_{\beta\text{-}G}(x) = I_{G(x)}(\theta,\phi) = \frac{1}{B(\theta,\phi)}\int_0^{G(x)} \omega^{\theta-1}(1-\omega)^{\phi-1}\,d\omega, \tag{3}$$
where $\theta,\phi > 0$ are two additional parameters, $I_{G(x)}(\theta,\phi)$ is the incomplete beta function ratio evaluated at $G(x)$, and $B(\theta,\phi) = \int_0^1 \omega^{\theta-1}(1-\omega)^{\phi-1}\,d\omega$ is the complete beta function;
(b) The Kw-G cdf pioneered by Cordeiro and de Castro [8] is
$$F_{\text{Kw-}G}(x) = 1 - \left[1 - G(x)^{\theta}\right]^{\phi}, \tag{4}$$
where $\theta,\phi > 0$ are two additional parameters;
(c) The Γ-G cdf reported by Zografos and Balakrishnan [7] is
$$F_{\Gamma\text{-}G}(x) = \frac{\gamma\left(\theta, -\log[1 - G(x)]\right)}{\Gamma(\theta)}, \tag{5}$$
where $\theta > 0$, $\Gamma(\theta) = \int_0^{\infty} t^{\theta-1}e^{-t}\,dt$ is the gamma function, and $\gamma(\theta,z) = \int_0^{z} t^{\theta-1}e^{-t}\,dt$ is the incomplete gamma function;
(d) The MO-G cdf defined by Marshall and Olkin [5] is
$$F_{\text{MO-}G}(x) = 1 - \frac{\theta[1 - G(x)]}{1 - (1-\theta)[1 - G(x)]} = 1 - \frac{\theta[1 - G(x)]}{G(x) + \theta[1 - G(x)]}, \tag{6}$$
where $\theta > 0$ is a shape parameter.
For the first two generators, a p-parameter baseline cdf yields new (p + 2)-parameter models, whereas the remaining generators yield (p + 1)-parameter distributions.
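The two generators with closed-form cdfs, (4) and (6), can be sketched as plain functions of a baseline cdf value. This is a minimal illustration; the function names are hypothetical, and the β-G and Γ-G generators are omitted because they need incomplete beta and gamma functions, which the Python standard library does not provide.

```python
def kw_g_cdf(G, theta, phi):
    """Kw-G cdf, Equation (4), evaluated at a baseline cdf value G in [0, 1]."""
    return 1.0 - (1.0 - G ** theta) ** phi


def mo_g_cdf(G, theta):
    """Marshall-Olkin-G cdf, Equation (6), evaluated at a baseline cdf value G."""
    sf = 1.0 - G  # baseline survival function
    return 1.0 - theta * sf / (1.0 - (1.0 - theta) * sf)


if __name__ == "__main__":
    for G in (0.0, 0.3, 1.0):
        print(G, kw_g_cdf(G, 2.0, 3.0), mo_g_cdf(G, 0.5))
```

Both functions return the baseline value unchanged when the extra parameters equal 1, which is the sense in which the generators extend the baseline.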
Let $A(x) = (2\pi)^{-1}[x - \operatorname{mod}(x, 2\pi)]$, where $\operatorname{mod}(x, y)$ is the remainder after x is divided by y. In what follows, we adapt the generators (3)–(6) in order to propose generalized Cardioid models with cdf $F(\cdot)$ and pdf $f(\cdot)$ that satisfy the conditions
  • $f(x + 2\pi) = f(x)$;
  • $F(x + 2\pi) - F(x) = 1$.
These conditions are required for circular data studies (see Mardia and Sutton [24]). The new models are discontinuous on $\{2k\pi : k \in \mathbb{Z}\}$. This pattern also holds for other circular models in the literature, such as the wrapped exponential [25].
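The role of A(x) can be illustrated with a short sketch that extends a cdf defined on one turn to the whole real line. It assumes that mod denotes the non-negative (floored) remainder, which is what Python's `%` operator returns for a positive divisor; the helper names and the Cardioid baseline with fixed parameters are illustrative choices.

```python
import math

TWO_PI = 2.0 * math.pi


def turns(x):
    """A(x) = [x - mod(x, 2*pi)] / (2*pi): the number of completed turns."""
    return (x - x % TWO_PI) / TWO_PI


def periodic_cdf(x, base_cdf):
    """F(x) = A(x) + base_cdf(mod(x, 2*pi)) for a cdf given on one turn."""
    return turns(x) + base_cdf(x % TWO_PI)


def cardioid_cdf(x, mu=2.0, rho=0.2):
    """Cardioid cdf on one turn (illustrative parameter values)."""
    return x / TWO_PI + (rho / math.pi) * (math.sin(x - mu) + math.sin(mu))


if __name__ == "__main__":
    for x in (1.3, -4.0, 9.0):
        step = periodic_cdf(x + TWO_PI, cardioid_cdf) - periodic_cdf(x, cardioid_cdf)
        print(x, step)  # each step should be close to 1
```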

2.1. Beta Cardioid

By applying (1) to Equation (3), the cdf of the βC distribution is
$$F_1(x) = A(x) + I_{\frac{\operatorname{mod}(x,2\pi)}{2\pi} + \frac{\rho}{\pi}[\sin(x-\mu) + \sin(\mu)]}(\theta,\phi),$$
for $x \in \mathbb{R} \setminus \{2\pi k : k \in \mathbb{Z}\}$. This case is denoted by $X \sim \beta\mathrm{C}(\theta,\phi,\mu,\rho)$. By differentiating the last equation, the βC pdf, say $f_1(x) = f_1(x;\theta,\phi,\mu,\rho)$, has the form
$$f_1(x) = \frac{h_1(x)}{2\pi B(\theta,\phi)}\left[1 + 2\rho\cos(x-\mu)\right] = \bar h_1(x)\left[1 + 2\rho\cos(x-\mu)\right],$$
where $\bar h_1(x) = h_1(x)/[2\pi B(\theta,\phi)]$ and
$$h_1(x) = h_1(x;\theta,\phi,\mu,\rho) = \left\{\frac{\operatorname{mod}(x,2\pi)}{2\pi} + \frac{\rho}{\pi}[\sin(x-\mu) + \sin(\mu)]\right\}^{\theta-1}\left\{1 - \frac{\operatorname{mod}(x,2\pi)}{2\pi} - \frac{\rho}{\pi}[\sin(x-\mu) + \sin(\mu)]\right\}^{\phi-1}.$$
For $\phi = 1$, the βC model reduces to the EC distribution discussed by Paula et al. [18].
Figure 1a–d display βC densities for some parametric points.
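The βC density can be checked numerically with a short sketch. The function names, the midpoint quadrature, and the parameter values below are illustrative assumptions; shape parameters of at least 1 are used so that the integrand stays bounded at the endpoints.

```python
import math

TWO_PI = 2.0 * math.pi


def cardioid_cdf(x, mu, rho):
    return x / TWO_PI + (rho / math.pi) * (math.sin(x - mu) + math.sin(mu))


def beta_cardioid_pdf(x, theta, phi, mu, rho):
    """Beta Cardioid density: weight h1 / (2*pi*B) times the Cardioid kernel."""
    G = cardioid_cdf(x % TWO_PI, mu, rho)
    beta = math.gamma(theta) * math.gamma(phi) / math.gamma(theta + phi)
    h1 = G ** (theta - 1.0) * (1.0 - G) ** (phi - 1.0)
    return h1 / (TWO_PI * beta) * (1.0 + 2.0 * rho * math.cos(x - mu))


def midpoint_integral(f, a, b, n=20000):
    """Composite midpoint rule on [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


if __name__ == "__main__":
    total = midpoint_integral(
        lambda x: beta_cardioid_pdf(x, 2.0, 3.0, 2.0, 0.2), 0.0, TWO_PI)
    print(total)  # should be close to 1
```

Setting phi to 1 in this sketch gives theta times G to the power theta minus 1 times the Cardioid density, which is the EC special case mentioned above.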

2.2. Kumaraswamy Cardioid

By inserting (1) in Equation (4), the KwC cdf, say $F_2(x) = F_2(x;\theta,\phi,\mu,\rho)$, can be expressed as
$$F_2(x) = A(x) + 1 - \left\{1 - \left[\frac{\operatorname{mod}(x,2\pi)}{2\pi} + \frac{\rho}{\pi}[\sin(x-\mu) + \sin(\mu)]\right]^{\theta}\right\}^{\phi}$$
for $x \in \mathbb{R} \setminus \{2\pi k : k \in \mathbb{Z}\}$. This case is denoted by $X \sim \mathrm{KwC}(\theta,\phi,\mu,\rho)$. The KwC pdf, $f_2(x) = f_2(x;\theta,\phi,\mu,\rho)$, can be reduced to
$$f_2(x) = \frac{\theta\phi\,h_2(x)}{2\pi}\left[1 + 2\rho\cos(x-\mu)\right] = \bar h_2(x)\left[1 + 2\rho\cos(x-\mu)\right],$$
where $\bar h_2(x) = \theta\phi\,h_2(x)/(2\pi)$ and
$$h_2(x) = h_2(x;\theta,\phi,\mu,\rho) = \left\{\frac{\operatorname{mod}(x,2\pi)}{2\pi} + \frac{\rho}{\pi}[\sin(x-\mu) + \sin(\mu)]\right\}^{\theta-1}\left\{1 - \left[\frac{\operatorname{mod}(x,2\pi)}{2\pi} + \frac{\rho}{\pi}[\sin(x-\mu) + \sin(\mu)]\right]^{\theta}\right\}^{\phi-1}.$$
Figure 2a–d display KwC densities for some parametric points.
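Because the KwC cdf is available in closed form, the cdf and pdf can be compared directly by a finite difference. The sketch below is illustrative: function names and parameter values are assumptions, and `%` is taken as the non-negative remainder.

```python
import math

TWO_PI = 2.0 * math.pi


def cardioid_cdf(x, mu, rho):
    return x / TWO_PI + (rho / math.pi) * (math.sin(x - mu) + math.sin(mu))


def kwc_cdf(x, theta, phi, mu, rho):
    """Kumaraswamy Cardioid cdf on the real line (away from multiples of 2*pi)."""
    xm = x % TWO_PI
    G = cardioid_cdf(xm, mu, rho)
    return (x - xm) / TWO_PI + 1.0 - (1.0 - G ** theta) ** phi


def kwc_pdf(x, theta, phi, mu, rho):
    """Kumaraswamy Cardioid density."""
    G = cardioid_cdf(x % TWO_PI, mu, rho)
    h2 = G ** (theta - 1.0) * (1.0 - G ** theta) ** (phi - 1.0)
    return theta * phi * h2 / TWO_PI * (1.0 + 2.0 * rho * math.cos(x - mu))


if __name__ == "__main__":
    pars = (2.0, 3.0, 2.0, 0.2)  # theta, phi, mu, rho (illustrative)
    h = 1e-5
    slope = (kwc_cdf(1.0 + h, *pars) - kwc_cdf(1.0 - h, *pars)) / (2 * h)
    print(slope, kwc_pdf(1.0, *pars))  # the two values should agree closely
```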

2.3. Gamma Cardioid

By applying (1) in Equation (5), the ΓC cdf, $F_3(x) = F_3(x;\theta,\mu,\rho)$, has the form
$$F_3(x) = A(x) + \frac{\gamma\left(\theta, -\log\left\{1 - \frac{\operatorname{mod}(x,2\pi)}{2\pi} - \frac{\rho}{\pi}[\sin(x-\mu) + \sin(\mu)]\right\}\right)}{\Gamma(\theta)},$$
for $x \in \mathbb{R} \setminus \{2\pi k : k \in \mathbb{Z}\}$. This case is denoted by $X \sim \Gamma\mathrm{C}(\theta,\mu,\rho)$. By differentiating the last equation, the ΓC pdf, $f_3(x) = f_3(x;\theta,\mu,\rho)$, reduces to
$$f_3(x) = \frac{h_3(x)}{2\pi\,\Gamma(\theta)}\left[1 + 2\rho\cos(x-\mu)\right] = \bar h_3(x)\left[1 + 2\rho\cos(x-\mu)\right],$$
where $\bar h_3(x) = h_3(x)/[2\pi\,\Gamma(\theta)]$ and
$$h_3(x) = h_3(x;\theta,\mu,\rho) = \left\{-\log\left[1 - \frac{\operatorname{mod}(x,2\pi)}{2\pi} - \frac{\rho}{\pi}[\sin(x-\mu) + \sin(\mu)]\right]\right\}^{\theta-1}.$$
Figure 3a–d display ΓC densities for some parametric points.
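The general ΓC cdf needs an incomplete gamma function, but for the special case θ = 2 the identity γ(2, z) = 1 − (1 + z)e^{−z} gives a closed form that can be compared with the density. The sketch below covers only that special case; function names and parameter values are illustrative assumptions.

```python
import math

TWO_PI = 2.0 * math.pi


def cardioid_cdf(x, mu, rho):
    return x / TWO_PI + (rho / math.pi) * (math.sin(x - mu) + math.sin(mu))


def gamma_cardioid_pdf(x, theta, mu, rho):
    """Gamma Cardioid density for any theta > 0."""
    G = cardioid_cdf(x % TWO_PI, mu, rho)
    h3 = (-math.log(1.0 - G)) ** (theta - 1.0)
    return h3 / (TWO_PI * math.gamma(theta)) * (1.0 + 2.0 * rho * math.cos(x - mu))


def gamma_cardioid_cdf_theta2(x, mu, rho):
    """Gamma Cardioid cdf for theta = 2 only, using exp(-z) = 1 - G."""
    xm = x % TWO_PI
    G = cardioid_cdf(xm, mu, rho)
    z = -math.log(1.0 - G)
    return (x - xm) / TWO_PI + 1.0 - (1.0 + z) * (1.0 - G)


if __name__ == "__main__":
    h = 1e-5
    slope = (gamma_cardioid_cdf_theta2(3.0 + h, 2.0, 0.2)
             - gamma_cardioid_cdf_theta2(3.0 - h, 2.0, 0.2)) / (2 * h)
    print(slope, gamma_cardioid_pdf(3.0, 2.0, 2.0, 0.2))  # should agree closely
```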

2.4. Marshall–Olkin Cardioid

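By inserting (1) in Equation (6), the MOC cdf, $F_4(x) = F_4(x;\theta,\mu,\rho)$, is given by
$$F_4(x) = A(x) + 1 - \frac{\theta\left\{1 - \frac{\operatorname{mod}(x,2\pi)}{2\pi} - \frac{\rho}{\pi}[\sin(x-\mu) + \sin(\mu)]\right\}}{\frac{\operatorname{mod}(x,2\pi)}{2\pi} + (1-\theta)\frac{\rho}{\pi}[\sin(x-\mu) + \sin(\mu)] + \theta\left[1 - \frac{\operatorname{mod}(x,2\pi)}{2\pi}\right]},$$
for $x \in \mathbb{R} \setminus \{2\pi k : k \in \mathbb{Z}\}$. This case is denoted by $X \sim \mathrm{MOC}(\theta,\mu,\rho)$. Thus, the MOC pdf, $f_4(x) = f_4(x;\theta,\mu,\rho)$, becomes
$$f_4(x) = \frac{\theta\,h_4(x)}{2\pi}\left[1 + 2\rho\cos(x-\mu)\right] = \bar h_4(x)\left[1 + 2\rho\cos(x-\mu)\right],$$
where $\bar h_4(x) = \theta\,h_4(x)/(2\pi)$ and
$$h_4(x) = h_4(x;\theta,\mu,\rho) = \left\{1 - (1-\theta)\left[1 - \frac{\operatorname{mod}(x,2\pi)}{2\pi} - \frac{\rho}{\pi}[\sin(x-\mu) + \sin(\mu)]\right]\right\}^{-2}.$$
Figure 4a–d display MOC densities for some parametric points.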
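A brief sketch of the MOC density, with two sanity checks: it integrates to one over a turn, and θ = 1 recovers the Cardioid density. Function names, the midpoint quadrature, and the parameter values are illustrative assumptions.

```python
import math

TWO_PI = 2.0 * math.pi


def cardioid_cdf(x, mu, rho):
    return x / TWO_PI + (rho / math.pi) * (math.sin(x - mu) + math.sin(mu))


def moc_pdf(x, theta, mu, rho):
    """Marshall-Olkin Cardioid density: theta * h4 / (2*pi) times the kernel."""
    G = cardioid_cdf(x % TWO_PI, mu, rho)
    h4 = (1.0 - (1.0 - theta) * (1.0 - G)) ** (-2.0)
    return theta * h4 / TWO_PI * (1.0 + 2.0 * rho * math.cos(x - mu))


def midpoint_integral(f, a, b, n=20000):
    """Composite midpoint rule on [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


if __name__ == "__main__":
    total = midpoint_integral(lambda x: moc_pdf(x, 4.5, 2.0, 0.2), 0.0, TWO_PI)
    print(total)  # should be close to 1
```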

2.5. A General Formula

All four extensions have the same support, and their densities can be expressed as
$$f_i(x) = \bar h_i(x)\left[1 + 2\rho\cos(x-\mu)\right], \quad \text{for } i = 1,\ldots,4, \tag{11}$$
where $\bar h_i(x)$ is defined in Table 1.
The new densities can be interpreted as weighted versions of the baseline pdf kernel $1 + 2\rho\cos(x-\mu)$. Thus, the behavior of $\bar h_i(x)$ in (11) plays an important role in studying the flexibility of the new models. Figure 5 displays the weight functions $\bar h_i(x)$. For these plots, we set $(\mu,\rho) = (2, 0.2)$ and consider $\theta = \phi \in (0, 100)$ and $x \in (0, 2\pi)$. Note that although $\bar h_3(x)$ and $\bar h_4(x)$ have the highest values, $\bar h_1(x)$ and $\bar h_2(x)$ present larger domain regions, which lead to more flexible scenarios. Thus, we conclude that the βC and KwC models can be the most flexible among these models.

3. Mathematical Properties

In this section, we obtain the trigonometric moments for the new models. First, we recall some concepts in the area of circular distributions. We follow the notation of Pewsey et al. [26].
As on the real line, a circular distribution can also be described by its characteristic function (cf). However, as the random variables X considered in this paper are periodic, we can write
$$\phi_X(t) = E\left(e^{itX}\right) = E\left(e^{it(X+2\pi)}\right) = e^{it2\pi}E\left(e^{itX}\right),$$
where $i = \sqrt{-1}$, which implies $\phi_X(t) = 0$ or $e^{it2\pi} = 1$; i.e., the cf should be defined only at integer values.
The cf evaluated at an integer p is called the pth trigonometric moment of X, defined by
$$\tau_{p,0} = E\left(e^{ipX}\right) = \underbrace{E[\cos(pX)]}_{\alpha_p} + i\,\underbrace{E[\sin(pX)]}_{\beta_p}.$$
The quantity $\tau_{p,0}$ is the mean resultant vector in the complex plane of length $\rho_p = |\tau_{p,0}| = \sqrt{\alpha_p^2 + \beta_p^2} \in [0,1]$ and direction
$$\mu_p = \begin{cases} \arctan(\beta_p/\alpha_p), & \alpha_p > 0, \\ \arctan(\beta_p/\alpha_p) + \pi, & \beta_p \ge 0,\ \alpha_p < 0, \\ \arctan(\beta_p/\alpha_p) - \pi, & \beta_p < 0,\ \alpha_p < 0, \\ \pi/2, & \beta_p > 0,\ \alpha_p = 0, \\ -\pi/2, & \beta_p < 0,\ \alpha_p = 0, \\ \text{undefined}, & \beta_p = \alpha_p = 0, \end{cases}$$
where $|\cdot|$ is the norm of a complex argument. The quantities $\rho_1$ and $\mu_1$ are fundamental measures of concentration and location, respectively. The polar representation of $\tau_{p,0}$ is
$$\tau_{p,0} = \rho_p e^{i\mu_p} = \underbrace{\rho_p\cos(\mu_p)}_{\alpha_p} + i\,\underbrace{\rho_p\sin(\mu_p)}_{\beta_p}.$$
Furthermore, the pth central trigonometric moment of a circular distribution is
$$\tau_{p,\mu_1} = \underbrace{E\{\cos[p(X-\mu_1)]\}}_{\bar\alpha_p} + i\,\underbrace{E\{\sin[p(X-\mu_1)]\}}_{\bar\beta_p},$$
where $\bar\alpha_p$ and $\bar\beta_p$ are its real and imaginary parts. The polar representation of $\tau_{p,\mu_1}$ is given by
$$\tau_{p,\mu_1} = \tau_{p,0}\,e^{-ip\mu_1} = \rho_p\left[\cos(\mu_p - p\mu_1) + i\sin(\mu_p - p\mu_1)\right].$$
Here, we are interested in finding expressions for $\tau_{p,\mu_1}$.
In what follows, $\mu$ refers to the parameter discussed previously in the models, while $\mu_1$ is the mean direction.
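The sample analogues of these quantities are easy to compute. The sketch below is illustrative (the function names and the toy angles are assumptions); `math.atan2` implements the case analysis for the direction and returns a value in (−π, π], except that it returns 0 rather than "undefined" when both components vanish.

```python
import cmath
import math


def trig_moment(angles, p=1):
    """Sample p-th trigonometric moment: the mean of exp(i * p * angle)."""
    return sum(cmath.exp(1j * p * a) for a in angles) / len(angles)


def resultant_length_and_direction(angles, p=1):
    """Return (rho_p, mu_p) for a sample of angles given in radians."""
    tau = trig_moment(angles, p)
    return abs(tau), math.atan2(tau.imag, tau.real)


if __name__ == "__main__":
    sample = [0.1, 0.2, 0.3]  # toy angles in radians
    print(resultant_length_and_direction(sample))
```

For a sample placed symmetrically around 0.2, the first-order direction is 0.2 and the resultant length is (1 + 2cos 0.1)/3.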
Furthermore, we derive expansions for $f_i(x)$ by means of the following results. First, consider a baseline distribution having cdf $G(x)$ and pdf $g(x)$. The exp-G family with power parameter $\theta > 0$ has cdf and pdf given by
$$\Pi_\theta(x) = G(x)^{\theta} \quad \text{and} \quad \pi_\theta(x) = \theta\,g(x)\,G(x)^{\theta-1},$$
respectively. Expansions for densities obtained from Equations (3)–(6) have often been given in terms of the last two functions:
  • From Nadarajah et al. [27]:
    $$f_{\beta\text{-}G}(x) = \sum_{i=0}^{\infty} \underbrace{\frac{(-1)^i}{(\theta+i)}\binom{\phi-1}{i}[B(\theta,\phi)]^{-1}}_{w_i^{(1)} = w_i^{(1)}(\theta,\phi)}\ \underbrace{(\theta+i)\,g(x)\,\Pi_{\theta+i-1}(x)}_{\pi_{\theta+i}(x)}. \tag{13}$$
  • From Cordeiro and de Castro [8]:
    $$f_{\text{Kw-}G}(x) = \sum_{i=0}^{\infty} \underbrace{\frac{(-1)^i\,\theta\phi}{\theta(i+1)}\binom{\phi-1}{i}}_{w_i^{(2)} = w_i^{(2)}(\theta,\phi)}\ \underbrace{\theta(i+1)\,g(x)\,\Pi_{(i+1)\theta-1}(x)}_{\pi_{(i+1)\theta}(x)}. \tag{14}$$
  • From Castellares and Lemonte [28]:
    $$f_{\Gamma\text{-}G}(x) = \sum_{i=0}^{\infty} \underbrace{\frac{\varphi_i(\theta)}{(i+\theta)}}_{w_i^{(3)} = w_i^{(3)}(\theta)}\ \pi_{\theta+i}(x), \tag{15}$$
    where
    $$\varphi_0(\theta) = \frac{1}{\Gamma(\theta)}, \qquad \varphi_i(\theta) = \frac{\rho_i(\theta)}{\Gamma(\theta)} = \frac{(\theta-1)}{\Gamma(\theta)}\,\psi_{i-1}(i+\theta-2), \quad i \ge 1,$$
    and $\psi_{i-1}(\cdot)$ are the Stirling polynomials given in Castellares and Lemonte [28].
  • From Cordeiro et al. [29]:
    $$f_{\text{MO-}G}(x) = \sum_{i=0}^{\infty} w_i^{(4)}\,\pi_{i+1}(x), \tag{16}$$
    where the coefficients $w_i^{(4)} = w_i^{(4)}(\theta)$ are given by ($i = 0, 1, \ldots$)
    $$w_i^{(4)} = \begin{cases} \dfrac{(-1)^i\,\theta}{(i+1)}\displaystyle\sum_{j=i}^{\infty}(j+1)\binom{j}{i}\bar\theta^{\,j}, & \theta \in (0,1), \\[2ex] \theta^{-1}\left(1 - \theta^{-1}\right)^{i}, & \theta > 1, \end{cases}$$
    and $\bar\theta = 1 - \theta$.
  • From Paula et al. [18]:
    Let $X_{EC} \sim \mathrm{EC}(\theta,\mu,\rho)$. The cdf of $X_{EC}$ is [18]
    $$\Pi_\theta(x;\mu,\rho) = \sum_{k=0}^{\infty}\sum_{l=0}^{k} b_{k,l}(\theta,\mu,\rho)\,x^{\theta-k}\sin^{l}(x-\mu). \tag{17}$$
    By simple differentiation, we can write
    $$\pi_\theta(x;\mu,\rho) = \sum_{k=0}^{\infty}\sum_{l=0}^{k}\sum_{q=0}^{1}\left[q\,l + (1-q)(\theta-k)\right] b_{k,l}(\theta,\mu,\rho) \times m_{\theta-k+q-1,\,q,\,l-q}(x;\mu),$$
    where $m_{p,q,r}(x;\mu) = x^{p}\cos^{q}(x-\mu)\sin^{r}(x-\mu)$ and
    $$b_{k,l}(\theta,\mu,\rho) = \binom{\theta}{k}\binom{k}{l}\left(\frac{\rho}{\pi}\right)^{k}\frac{\sin^{k-l}(\mu)}{(2\pi)^{\theta-k}}.$$
    After some algebraic manipulations, the pth central circular trigonometric moment of $X_{EC}$, say $\tau^{EC}_{p,\mu_1}(\theta)$, with mean direction $\mu_1$, follows as
    $$\begin{aligned} \tau^{EC}_{p,\mu_1}(\theta) &= e^{-ip\mu_1} + p\left\{\int_0^{2\pi}\sin[p(x-\mu_1)]\,\Pi_\theta(x;\mu,\rho)\,dx - i\int_0^{2\pi}\cos[p(x-\mu_1)]\,\Pi_\theta(x;\mu,\rho)\,dx\right\} \\ &= e^{-ip\mu_1} + p\sum_{k=0}^{\infty}\sum_{l=0}^{k} b_{k,l}(\theta,\mu,\rho) \times \left[A_1(\theta-k,l,p) - i\,A_2(\theta-k,l,p)\right], \end{aligned} \tag{18}$$
    where $A_1(a,b,c) = \int_0^{2\pi} x^{a}\sin^{b}(x-\mu)\sin(c(x-\mu_1))\,dx$ and $A_2(a,b,c) = \int_0^{2\pi} x^{a}\sin^{b}(x-\mu)\cos(c(x-\mu_1))\,dx$. The functions $A_1(\cdot,\cdot,\cdot)$ and $A_2(\cdot,\cdot,\cdot)$ are easily handled both numerically and analytically.
    For example, Table 2 displays some special quantities obtained using the symbolic computation software wxMaxima.
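The integrals A_1 and A_2 can also be evaluated by simple quadrature and compared with the closed forms for the (1, 1, 1) case listed in Table 2. The sketch below is illustrative: the midpoint rule, the function names, and the values of μ and μ_1 are assumptions, and the signs of the closed forms are those obtained by integrating by parts.

```python
import math

TWO_PI = 2.0 * math.pi


def _midpoint(f, n=20000):
    """Composite midpoint rule on (0, 2*pi)."""
    h = TWO_PI / n
    return h * sum(f((i + 0.5) * h) for i in range(n))


def A1(a, b, c, mu, mu1):
    """Integral of x^a * sin^b(x - mu) * sin(c * (x - mu1)) over (0, 2*pi)."""
    return _midpoint(lambda x: x ** a * math.sin(x - mu) ** b * math.sin(c * (x - mu1)))


def A2(a, b, c, mu, mu1):
    """Integral of x^a * sin^b(x - mu) * cos(c * (x - mu1)) over (0, 2*pi)."""
    return _midpoint(lambda x: x ** a * math.sin(x - mu) ** b * math.cos(c * (x - mu1)))


def A1_111(mu, mu1):
    return (4 * math.pi * math.sin(mu1 + mu) + 8 * math.pi ** 2 * math.cos(mu1 - mu)) / 8


def A2_111(mu, mu1):
    return -(4 * math.pi * math.cos(mu1 + mu) - 8 * math.pi ** 2 * math.sin(mu1 - mu)) / 8


if __name__ == "__main__":
    mu, mu1 = 2.0, 1.0  # illustrative values
    print(A1(1, 1, 1, mu, mu1), A1_111(mu, mu1))
    print(A2(1, 1, 1, mu, mu1), A2_111(mu, mu1))
```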
By applying (17) to Equations (13)–(16), we obtain linear representations for (11), which hold for the four generalized C distributions.
Theorem 1.
The pdf (11) can be expanded as
$$f_i(x) = \frac{1}{2\pi}\left[1 + 2\rho\cos(x-\mu)\right]\sum_{k=0}^{\infty}\sum_{l=0}^{k}\sum_{t=0}^{\infty} b^{(i)}_{k,l,t}\,x^{\mathrm{ind}_i(t)-1-k}\sin^{l}(x-\mu), \tag{19}$$
where $b^{(i)}_{k,l,t} = \mathrm{ind}_i(t)\,w_t^{(i)}(\theta,\phi)\,b_{k,l}(\mathrm{ind}_i(t)-1,\mu,\rho)$ for $i = 1,\ldots,4$, $\mathrm{ind}_1(t) = \theta + t$, $\mathrm{ind}_2(t) = \theta(t+1)$, $\mathrm{ind}_3(t) = \theta + t$ and $\mathrm{ind}_4(t) = t$.
Equation (19) can be used to derive some mathematical properties (having intractable analytical forms) of $f_i(x)$ (for $i = 1,\ldots,4$). Furthermore, as a consequence, we have expansions for the weights $\bar h_i(x)$ (which have complex forms) as linear combinations of the terms $x^{\mathrm{ind}_i(t)-1-k}\sin^{l}(x-\mu)$ appearing in (19). Proposing criteria for choosing the best $f_i(x)$ based on these expansions may be a promising research branch. In particular, we obtain expressions for the central trigonometric moments of distributions with pdf (11).
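The binomial expansion (17) that underlies Theorem 1 can be checked numerically when θ is a positive integer, because the series then terminates at k = θ. The sketch below covers only that case (non-integer θ would need generalized binomial coefficients and a truncated infinite sum); the function names and parameter values are illustrative assumptions.

```python
import math

TWO_PI = 2.0 * math.pi


def cardioid_cdf(x, mu, rho):
    return x / TWO_PI + (rho / math.pi) * (math.sin(x - mu) + math.sin(mu))


def b_coef(k, l, theta, mu, rho):
    """Coefficient b_{k,l}(theta, mu, rho) for a positive integer theta."""
    return (math.comb(theta, k) * math.comb(k, l) * (rho / math.pi) ** k
            * math.sin(mu) ** (k - l) / TWO_PI ** (theta - k))


def ec_cdf_series(x, theta, mu, rho):
    """Double sum in Equation (17), finite because theta is an integer."""
    return sum(b_coef(k, l, theta, mu, rho) * x ** (theta - k) * math.sin(x - mu) ** l
               for k in range(theta + 1) for l in range(k + 1))


if __name__ == "__main__":
    x, theta, mu, rho = 1.7, 3, 2.0, 0.2  # illustrative values
    print(ec_cdf_series(x, theta, mu, rho), cardioid_cdf(x, mu, rho) ** theta)
```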
Corollary 1.
Let $\tau^{(j)}_{p,\mu_1}$ be the pth central trigonometric moment of the model $F_j$. We obtain
$$\tau^{(j)}_{p,\mu_1} = e^{-ip\mu_1}\sum_{t=0}^{\infty} w_t^{(j)}(\theta,\phi) + p\sum_{k=0}^{\infty}\sum_{l=0}^{k} d^{(j)}_{k,l},$$
where
$$d^{(j)}_{k,l} = \sum_{t=0}^{\infty}\left[A_1(\mathrm{ind}_j(t)-k,l,p) - i\,A_2(\mathrm{ind}_j(t)-k,l,p)\right] w_t^{(j)}(\theta,\phi)\,b_{k,l}(\mathrm{ind}_j(t),\mu,\rho),$$
and $\mathrm{ind}_j(t)$ is given in Theorem 1.
Proof of Corollary 1.
Let $\tau^{(j)}_{p,\mu_1}$ be the pth central trigonometric moment of $X^{(j)} \sim F_j$.
Furthermore, let $\mathrm{ind}_1(t) = \theta + t$, $\mathrm{ind}_2(t) = \theta(t+1)$, $\mathrm{ind}_3(t) = \theta + t$, and $\mathrm{ind}_4(t) = t$. Thus, denoting by $\bar\alpha^{EC}_p(\theta)$ and $\bar\beta^{EC}_p(\theta)$ the real and imaginary parts of $\tau^{EC}_{p,\mu_1}(\theta)$ in (18), it follows from Equations (13)–(16) that
$$\begin{aligned} \tau^{(j)}_{p,\mu_1} &= E\left\{\cos\left[p\left(X^{(j)} - \mu_1\right)\right]\right\} + i\,E\left\{\sin\left[p\left(X^{(j)} - \mu_1\right)\right]\right\} \\ &= \sum_{t=0}^{\infty} w_t^{(j)}(\theta,\phi)\,\bar\alpha^{EC}_p(\mathrm{ind}_j(t)) + i\sum_{t=0}^{\infty} w_t^{(j)}(\theta,\phi)\,\bar\beta^{EC}_p(\mathrm{ind}_j(t)) \\ &= \sum_{t=0}^{\infty} w_t^{(j)}(\theta,\phi)\,\tau^{EC}_{p,\mu_1}(\mathrm{ind}_j(t)) \\ &= e^{-ip\mu_1}\sum_{t=0}^{\infty} w_t^{(j)}(\theta,\phi) + p\sum_{k=0}^{\infty}\sum_{l=0}^{k} d^{(j)}_{k,l}, \end{aligned}$$
where
$$d^{(j)}_{k,l} = \sum_{t=0}^{\infty} w_t^{(j)}(\theta,\phi)\,b_{k,l}(\mathrm{ind}_j(t),\mu,\rho)\left[A_1(\mathrm{ind}_j(t)-k,l,p) - i\,A_2(\mathrm{ind}_j(t)-k,l,p)\right].$$
   □

4. Estimation

This section briefly discusses maximum likelihood estimation of the parameters of the pdf family (11). Several approaches for estimating the parameters have been proposed in the literature, but the maximum likelihood method is the most commonly employed. The maximum likelihood estimates (MLEs) have desirable properties for constructing confidence intervals for the parameters. They are easily computed using well-known platforms such as R (optim function), SAS (PROC NLMIXED), and Ox (MaxBFGS sub-routine).
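Let $x_1, \ldots, x_n$ be an observed sample from a random variable having pdf (11). Thus, the associated log-likelihood function for $\delta = (\theta,\phi,\mu,\rho)$ can be expressed as (for $i = 1,\ldots,4$)
$$\ell_i(\delta) = \sum_{j=1}^{n}\log f_i(x_j) = \sum_{j=1}^{n}\left\{\log\bar h_i(x_j) + \log\left[1 + 2\rho\cos(x_j-\mu)\right]\right\}.$$
The score vector follows from $\ell_i(\delta)$ as
$$\left(U_{\theta,i}, U_{\phi,i}, U_{\mu,i}, U_{\rho,i}\right) = \left(\frac{\partial\ell_i(\delta)}{\partial\theta}, \frac{\partial\ell_i(\delta)}{\partial\phi}, \frac{\partial\ell_i(\delta)}{\partial\mu}, \frac{\partial\ell_i(\delta)}{\partial\rho}\right),$$
whose components are
$$U_{\theta,i} = \sum_{j=1}^{n}\frac{1}{\bar h_i(x_j)}\frac{\partial\bar h_i(x_j)}{\partial\theta}, \qquad U_{\phi,i} = \sum_{j=1}^{n}\frac{1}{\bar h_i(x_j)}\frac{\partial\bar h_i(x_j)}{\partial\phi},$$
$$U_{\mu,i} = \sum_{j=1}^{n}\left\{\frac{1}{\bar h_i(x_j)}\frac{\partial\bar h_i(x_j)}{\partial\mu} + \frac{2\rho\sin(x_j-\mu)}{1 + 2\rho\cos(x_j-\mu)}\right\},$$
and
$$U_{\rho,i} = \sum_{j=1}^{n}\left\{\frac{1}{\bar h_i(x_j)}\frac{\partial\bar h_i(x_j)}{\partial\rho} + \frac{2\cos(x_j-\mu)}{1 + 2\rho\cos(x_j-\mu)}\right\}.$$
Thus, the MLE of $\delta$ is $\hat\delta = \operatorname{argmax}_{\delta\in\Delta}\{\ell_i(\delta)\}$, where $\Delta$ is the parameter space or, equivalently, the solution of the system of nonlinear equations $U_{\theta,i} = U_{\phi,i} = U_{\mu,i} = U_{\rho,i} = 0$. The compactness of the parameter space $\Delta$ and the continuity of the log-likelihood function on $\Delta$ are sufficient for the existence of the MLE.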
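For the baseline Cardioid model the weight is the constant 1/(2π), so only the kernel terms of the score remain. The sketch below evaluates that log-likelihood and score and can be used to confirm the kernel terms by finite differences. The data values are made-up angles for illustration only, and the function names are assumptions.

```python
import math

TWO_PI = 2.0 * math.pi


def cardioid_loglik(data, mu, rho):
    """Log-likelihood of the Cardioid model for angles in radians."""
    return sum(math.log((1.0 + 2.0 * rho * math.cos(x - mu)) / TWO_PI) for x in data)


def cardioid_score(data, mu, rho):
    """Score components (U_mu, U_rho) contributed by the Cardioid kernel."""
    u_mu = sum(2.0 * rho * math.sin(x - mu) / (1.0 + 2.0 * rho * math.cos(x - mu))
               for x in data)
    u_rho = sum(2.0 * math.cos(x - mu) / (1.0 + 2.0 * rho * math.cos(x - mu))
                for x in data)
    return u_mu, u_rho


if __name__ == "__main__":
    data = [0.5, 1.2, 2.0, 2.8, 4.1, 5.5]  # hypothetical angles, not from the paper
    print(cardioid_loglik(data, 2.0, 0.2))
    print(cardioid_score(data, 2.0, 0.2))
```

In practice these functions would be passed to a numerical optimizer; the choice of optimizer is left open here.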
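The partitioned observed information matrix for the model $F_i(x)$ takes the form (for $i = 1,\ldots,4$)
$$J_i(\delta) = -\begin{pmatrix} U_{\theta\theta,i} & U_{\theta\phi,i} & U_{\theta\mu,i} & U_{\theta\rho,i} \\ U_{\phi\theta,i} & U_{\phi\phi,i} & U_{\phi\mu,i} & U_{\phi\rho,i} \\ U_{\mu\theta,i} & U_{\mu\phi,i} & U_{\mu\mu,i} & U_{\mu\rho,i} \\ U_{\rho\theta,i} & U_{\rho\phi,i} & U_{\rho\mu,i} & U_{\rho\rho,i} \end{pmatrix}, \qquad U_{ab,i} = \frac{\partial^2\ell_i(\delta)}{\partial a\,\partial b} \ \text{ for } a,b = \theta,\phi,\mu,\rho,$$
whose elements are
$$U_{\tau\nu,i} = \sum_{j=1}^{n} U^{(j)}_{\tau\nu,i},$$
for $\{\tau,\nu \in (\theta,\phi,\mu,\rho)\} \setminus \{(\tau,\nu) : \tau = \nu = \mu,\rho \text{ and } (\tau,\nu) = (\mu,\rho),(\rho,\mu)\}$, where
$$U^{(j)}_{\tau\nu,i} = \frac{1}{\bar h_i(x_j)}\frac{\partial^2\bar h_i(x_j)}{\partial\tau\,\partial\nu} - \frac{1}{\bar h_i^2(x_j)}\frac{\partial\bar h_i(x_j)}{\partial\tau}\frac{\partial\bar h_i(x_j)}{\partial\nu},$$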
$$U_{\mu\rho,i} = \sum_{j=1}^{n}\left\{U^{(j)}_{\mu\rho,i} + \frac{2\sin(x_j-\mu)}{\left[1 + 2\rho\cos(x_j-\mu)\right]^2}\right\},$$
$$U_{\mu\mu,i} = \sum_{j=1}^{n}\left\{U^{(j)}_{\mu\mu,i} - \frac{4\rho^2 + 2\rho\cos(x_j-\mu)}{\left[1 + 2\rho\cos(x_j-\mu)\right]^2}\right\},$$
and
$$U_{\rho\rho,i} = \sum_{j=1}^{n}\left\{U^{(j)}_{\rho\rho,i} - \frac{4\cos^2(x_j-\mu)}{\left[1 + 2\rho\cos(x_j-\mu)\right]^2}\right\}.$$
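The kernel contributions to the second derivatives can be verified numerically. The sketch below differentiates the kernel term log[1 + 2ρcos(x − μ)] twice in closed form, obtained by differentiating the corresponding score terms, and is meant to be compared with finite differences. The function names and evaluation point are illustrative assumptions.

```python
import math


def log_kernel(x, mu, rho):
    """Kernel part of the log-density: log[1 + 2*rho*cos(x - mu)]."""
    return math.log(1.0 + 2.0 * rho * math.cos(x - mu))


def kernel_second_derivs(x, mu, rho):
    """Second partial derivatives of log_kernel with respect to (mu, rho)."""
    c, s = math.cos(x - mu), math.sin(x - mu)
    d = (1.0 + 2.0 * rho * c) ** 2
    return {
        "mumu": -(4.0 * rho ** 2 + 2.0 * rho * c) / d,
        "murho": 2.0 * s / d,
        "rhorho": -4.0 * c ** 2 / d,
    }


if __name__ == "__main__":
    print(kernel_second_derivs(1.0, 2.0, 0.2))  # illustrative point
```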
For interval estimation of the parameters in $F_i(x)$, we obtain the Fisher information matrix (FIM) $K_i(\delta) = E(J_i(\delta))$ under standard regularity conditions.
For n sufficiently large, $\sqrt{n}\,(\hat\delta - \delta) \xrightarrow{D} N_4\left(0, \dot K_i(\delta)^{-1}\right)$ from a result in Casella and Berger [30], where $\dot K_i = K_i/n$ is the unit FIM, “$N_k(\mu,\Sigma)$” denotes the k-dimensional multivariate normal distribution with parameters $\mu$ and $\Sigma$, and “$\xrightarrow{D}$” means convergence in distribution.
However, the FIM is seldom tractable. As a solution, we can adopt J i instead of K i . This last strategy will be used in the numerical results. In the next section, the last asymptotic result will be used to determine the standard errors associated with MLEs.

5. Applications

In this section, we provide two applications to illustrate the potentiality of the proposed models. The first dataset consists of 21 wind directions obtained by a Milwaukee weather station at 6:00 a.m. on consecutive days (see [31]). The second one corresponds to the directions taken by 76 turtles after treatment addressed by Stephens [32].
The Cartesian histograms of the first and second datasets in Figure 6a and Figure 7a indicate positive (0.4313) and negative (−0.0816) skewness, respectively. Furthermore, these datasets have bimodal shapes.
First, the MLEs and their standard errors (SEs, given in parentheses) are evaluated and, subsequently, the values of the Kuiper (K), Watson (W), Akaike information criterion (AIC), and Bayesian information criterion (BIC) statistics. The first two goodness-of-fit measures are used in the context of circular statistics and can be found in Jammalamadaka and Sengupta [23]. All computations were performed using the maxLik function of the R statistical software (see [33]).
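For orientation, the sketch below gives textbook versions of three of these measures: AIC, BIC, and the unscaled Kuiper statistic V = D+ + D−. These are standard definitions and not necessarily the exact conventions behind the reported tables (some references scale the Kuiper statistic by the square root of n, for instance). The function names are assumptions.

```python
import math


def aic(loglik, k):
    """Akaike information criterion for maximized log-likelihood and k parameters."""
    return 2.0 * k - 2.0 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion for a sample of size n."""
    return k * math.log(n) - 2.0 * loglik


def kuiper_statistic(data, cdf):
    """Unscaled Kuiper statistic V = D+ + D- for a fitted cdf on one turn."""
    u = sorted(cdf(x) for x in data)
    n = len(u)
    d_plus = max((i + 1) / n - u[i] for i in range(n))
    d_minus = max(u[i] - i / n for i in range(n))
    return d_plus + d_minus


if __name__ == "__main__":
    print(aic(-10.0, 2), bic(-10.0, 2, 100))
    print(kuiper_statistic([0.25, 0.5, 0.75], lambda x: x))  # uniform cdf on (0, 1)
```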
The results for the first and second datasets are reported in Table 3 and Table 4, respectively. We note that all generalized models fit both datasets better than the Cardioid model according to these statistics. For the first dataset, the EC distribution stands out according to the K, AIC, and BIC measures, while the βC model yields the best fit to the dataset according to the W statistic. The βC model outperforms the other models for the second dataset.
Figure 6 and Figure 7 display plots of the empirical and fitted densities to these data. The plots support the indications from these tables.

6. Conclusions

We propose four new distributions with supports on the circle. These extensions of the Cardioid (C) distribution follow by inserting this distribution in the beta-G, gamma-G, Kumaraswamy-G (Kw-G), and Marshall–Olkin-G generators, considering a specific adaptation. We derive expansions for the densities and trigonometric moments of the new models. We also discuss the maximum likelihood estimation for their parameters. Two applications illustrate the flexibility of the proposed models to fit real data.

Author Contributions

All authors discussed the results and contributed to all sections. All authors have read and agreed to the published version of the manuscript.

Funding

This research received funding from Federal University of Pernambuco, FACEPE and CNPq.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets analyzed in this study are available in [31,32].

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Astfalck, L.C.; Cripps, E.J.; Gosling, J.P.; Hodkiewicz, M.R.; Milne, I.A. Expert elicitation of directional metocean parameters. Ocean Eng. 2018, 161, 268–276.
  2. Broly, P.; Deneubourg, J.-L. Behavioural contagion explains group cohesion in a social crustacean. PLoS Comput. Biol. 2015, 11, e1004290.
  3. García-González, A.; Damon, A.; River, F.B.; Gosling, J.P. Circular distribution of three species of epiphytic orchids in shade coffee plantations, in Soconusco, Chiapas, Mexico. Plant Ecol. Evol. 2016, 149, 189–198.
  4. Gatto, R. Saddlepoint approximations to tail probabilities and quantiles of inhomogeneous discounted compound Poisson processes with periodic intensity functions. Methodol. Comput. Appl. Probab. 2012, 14, 1053–1074.
  5. Marshall, A.W.; Olkin, I. A new method for adding a parameter to a family of distributions with application to the exponential and Weibull families. Biometrika 1997, 84, 641–652.
  6. Eugene, N.; Lee, C.; Famoye, F. Beta-normal distribution and its applications. Commun. Stat. Theory Methods 2002, 31, 497–512.
  7. Zografos, K.; Balakrishnan, N. On families of beta- and generalized gamma-generated distributions and associated inference. Stat. Methodol. 2009, 6, 344–362.
  8. Cordeiro, G.M.; de Castro, M. A new family of generalized distributions. J. Stat. Comput. Simul. 2011, 81, 883–898.
  9. Cordeiro, G.M.; Ortega, E.M.; Cunha, D.C. The exponentiated generalized class of distribution. J. Data Sci. 2013, 11, 1–27.
  10. Cordeiro, G.M.; Alizadeh, M.; Marinho, P.R.D. The type I half-logistic family of distributions. J. Stat. Comput. Simul. 2016, 86, 707–728.
  11. Yousof, H.M.; Afify, A.Z.; Hamedani, G.; Aryal, G.R. The Burr X generator of distributions for lifetime data. J. Stat. Theory Appl. 2016, 16, 288–305.
  12. Cordeiro, G.M.; Afify, A.Z.; Yousof, H.M.; Pescim, R.R.; Aryal, G.R. The exponentiated Weibull-H family of distributions: Theory and applications. Mediterr. J. Math. 2017, 14, 155–176.
  13. Lee, J.-S.; Hoppel, K.W.; Mango, S.A.; Miller, A.R. Intensity and phase statistics of multilook polarimetric and interferometric SAR imagery. IEEE Trans. Geosci. Remote Sens. 1994, 32, 1017–1028.
  14. Breckling, J. The Analysis of Directional Time Series: Applications to Wind Speed and Direction; Springer: Berlin/Heidelberg, Germany, 1989.
  15. Jeffreys, H. Theory of Probability; Oxford University Press: Oxford, UK, 1983.
  16. Wang, M.Z.; Shimizu, K. On applying Möbius transformation to Cardioid random variables. Stat. Methodol. 2012, 9, 604–614.
  17. Abe, T.; Pewsey, A.; Shimizu, K. On Papakonstantinou’s extension of the Cardioid distribution. Stat. Probab. Lett. 2009, 79, 2138–2147.
  18. Paula, F.V.; Nascimento, A.D.C.; Amaral, G.J.A. A new extended Cardioid model: An application to wind data. Submitt. J. Math. Imaging Vis. 2020.
  19. Jamal, F.; Chesneau, C.; Bouali, D.L.; Ul Hassan, M. Beyond the Sin-G family: The transformed Sin-G family. PLoS ONE 2021, 16, e0250790.
  20. Souza, L.; Júnior, W.R.O.; Brito, C.C.R.; Chesneau, C.; Ferreira, T.A.E.; Soares, L.G.M. General properties for the Cos-G Class of Distributions with Applications. Eurasian Bull. Math. 2019, 2, 63–79.
  21. Abraham, C.; Molinari, N.; Servien, R. Unsupervised clustering of multivariate circular data. Stat. Med. 2013, 32, 1376–1382.
  22. Qiu, X.; Wu, S.; Wu, H. A new information criterion based on Langevin mixture distribution for clustering circular data with application to time course genomic data. Stat. Sin. 2015, 25, 1459–1476.
  23. Jammalamadaka, S.R.; Sengupta, A. Topics in Circular Statistics; World Scientific Publishing: Singapore, 2001.
  24. Mardia, K.V.; Sutton, T.W. On the modes of a mixture of two von Mises distributions. Biometrika 1975, 62, 699–701.
  25. Yilmaz, A.; Biçer, C. A new wrapped exponential distribution. Math. Sci. 2018, 12, 285–293.
  26. Pewsey, A.; Neuhäuser, M.; Ruxton, G. Circular Statistics in R; Oxford University Press: Oxford, UK, 2013.
  27. Nadarajah, S.; Cordeiro, G.M.; Ortega, E.M. General results for the beta-modified Weibull distribution. J. Stat. Comput. Simul. 2011, 81, 1211–1232.
  28. Castellares, F.; Lemonte, A.J. A new generalized Weibull distribution generated by gamma random variables. J. Egypt. Math. Soc. 2015, 23, 382–390.
  29. Cordeiro, G.M.; Lemonte, A.J.; Ortega, E.M. The Marshall–Olkin family of distributions: Mathematical properties and new models. J. Stat. Theory Pract. 2014, 8, 343–366.
  30. Casella, G.; Berger, R. Statistical Inference; Thomson Learning: Boston, MA, USA, 2002.
  31. Johnson, R.A.; Wehrly, T. Measures and models for angular correlation and angular-linear correlation. J. R. Stat. Soc. (Ser. B) 1977, 39, 222–229.
  32. Stephens, M.A. Techniques for Directional Data; Stanford University: Stanford, CA, USA, 1969.
  33. Henningsen, A.; Toomet, O. maxLik: A package for maximum likelihood estimation in R. Comput. Stat. 2011, 26, 443–458.
Figure 1. Cartesian and circular βC densities for some parametric points.
Figure 2. Cartesian and circular KwC densities for some parametric points.
Figure 3. Cartesian and circular ΓC densities for some parametric points.
Figure 4. Cartesian and circular MOC densities for some parametric points.
Figure 5. Weighted curves of $f_i(x)$.
Figure 6. Fitted densities of the C, EC, βC, KwC, ΓC, and MOC models to the first dataset. (a) histogram and (b) rose diagram.
Figure 7. Fitted densities of the C, EC, βC, KwC, ΓC, and MOC models to the second dataset. (a) histogram and (b) rose diagram.
Table 1. The weighted multipliers for the proposed models.
| Model | C | βC | KwC | ΓC | MOC |
| --- | --- | --- | --- | --- | --- |
| Index (i) | – | 1 | 2 | 3 | 4 |
| Expression | $(2\pi)^{-1}$ | $\bar h_1(x)$ | $\bar h_2(x)$ | $\bar h_3(x)$ | $\bar h_4(x)$ |
Table 2. Some expressions for $A_1(\cdot,\cdot,\cdot)$ and $A_2(\cdot,\cdot,\cdot)$.
$$A_2(1,1,1) = -\frac{4\pi\cos(\mu_1+\mu) - 8\pi^2\sin(\mu_1-\mu)}{8}$$
$$A_1(1,1,1) = \frac{4\pi\sin(\mu_1+\mu) + 8\pi^2\cos(\mu_1-\mu)}{8}$$
$$A_2(2,2,2) = \frac{(96\pi^2-3)\sin(2\mu_1+2\mu) - 24\pi\cos(2\mu_1+2\mu) - 256\pi^3\cos(2\mu_1-2\mu) + (48-384\pi^2)\sin(2\mu_1) + 192\pi\cos(2\mu_1)}{384} + \frac{\sin(2\mu_1+2\mu) - 16\sin(2\mu_1)}{128}$$
$$A_1(2,2,2) = \frac{24\pi\sin(2\mu_1+2\mu) + (96\pi^2-3)\cos(2\mu_1+2\mu) + 256\pi^3\sin(2\mu_1-2\mu) - 192\pi\sin(2\mu_1) + (48-384\pi^2)\cos(2\mu_1)}{384} + \frac{\cos(2\mu_1+2\mu) - 16\cos(2\mu_1)}{128}$$
Table 3. MLEs of the parameters for the first dataset, their standard errors (given in parentheses), and the Kuiper, Watson, AIC, and BIC statistics.
| Model | ρ | μ | θ | ϕ | Kuiper | Watson | AIC | BIC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| C | 0.2436 (0.1463) | 4.6708 (0.6835) | – | – | 1.1590 | 0.0711 | 76.5763 | 80.6654 |
| EC | 0.2164 (0.1465) | 1.1780 (0.6168) | 2.8755 (0.8929) | – | 0.7367 | 0.0257 | 68.6317 | 74.7653 |
| βC | 0.2774 (0.0980) | 0.9093 (0.5030) | 3.9353 (2.6919) | 1.4044 (0.6641) | 0.8060 | 0.0247 | 68.8623 | 77.0404 |
| KwC | 0.2767 (0.0989) | 0.9381 (0.4933) | 3.6511 (2.1205) | 1.4144 (1.1213) | 0.8038 | 0.0240 | 68.8919 | 77.0700 |
| ΓC | 0.1364 (0.1349) | 1.7274 (0.6820) | 1.8563 (0.2566) | – | 0.8289 | 0.0328 | 70.0159 | 76.1495 |
| MOC | 0.2007 (0.1277) | 2.0632 (0.7659) | 4.5765 (1.8103) | – | 0.8134 | 0.0327 | 70.6394 | 76.7730 |
Table 4. MLEs of the parameters for the second dataset, their standard errors (given in parentheses), and the Kuiper, Watson, AIC, and BIC statistics.
| Model | ρ | μ | θ | ϕ | Kuiper | Watson | AIC | BIC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| C | 0.3259 (0.0553) | 1.2022 (0.3337) | – | – | 2.4852 | 0.4855 | 254.6554 | 261.3169 |
| EC | 0.3067 (0.1256) | 1.6025 (0.0829) | 0.7688 (0.5656) | – | 2.3610 | 0.4443 | 251.9396 | 261.9318 |
| βC | 0.3978 (0.0538) | 0.1441 (0.1892) | 1.8836 (0.4959) | 3.1088 (1.0142) | 1.2683 | 0.0917 | 231.1254 | 244.4483 |
| KwC | 0.3985 (0.0543) | 0.1712 (0.2713) | 1.6295 (0.3180) | 3.2205 (1.8769) | 1.3204 | 0.1007 | 231.9943 | 245.3172 |
| ΓC | 0.2829 (0.0776) | 1.5308 (0.4744) | 0.7788 (0.0760) | – | 2.3005 | 0.4006 | 249.8842 | 259.8764 |
| MOC | 0.2160 (0.1088) | 1.9026 (0.7594) | 0.3880 (1.1475) | – | 1.9593 | 0.2614 | 242.0967 | 252.0889 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
