Finite Mixture Model-Based Analysis of Yarn Quality Parameters

Karakaş, Esra; Koyuncu, Melik; Ükelge, Mülayim Öngün

doi:10.3390/app15126407


by Esra Karakaş ¹,*, Melik Koyuncu ² and Mülayim Öngün Ükelge ³

¹ Department of Business, Faculty of Economic, Administrative and Social Sciences, Adana Alparslan Türkeş Science and Technology University, Adana 01250, Turkey

² Department of Industrial Engineering, Faculty of Engineering, Cukurova University, Adana 01250, Turkey

³ Quality Management Coordination Office, Adana Alparslan Türkeş Science and Technology University, Adana 01250, Turkey

* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(12), 6407; https://doi.org/10.3390/app15126407

Submission received: 24 April 2025 / Revised: 2 June 2025 / Accepted: 4 June 2025 / Published: 6 June 2025


Abstract

This study investigates the applicability of finite mixture models (FMMs) for accurately modeling yarn quality parameters in 28/1 Ne ring-spun polyester/viscose yarns, focusing on both yarn imperfections and mechanical properties. The research addresses the need for advanced statistical modeling techniques to better capture the inherent heterogeneity in textile production data. To this end, the Poisson mixture model is employed to represent count-based defects, such as thin places, thick places, and neps, while the gamma mixture model is used to model continuous variables, such as tenacity and elongation. Model parameters are estimated using the expectation–maximization (EM) algorithm, and model selection is guided by the Akaike and Bayesian information criteria (AIC and BIC). The results reveal that thin places are optimally modeled using a two-component Poisson mixture distribution, whereas thick places and neps require three components to reflect their variability. Similarly, a two-component gamma mixture distribution best describes the distributions of tenacity and elongation. These findings highlight the robustness of FMMs in capturing complex distributional patterns in yarn data, demonstrating their potential in enhancing quality assessment and control processes in the textile industry.

Keywords: finite mixture model; expectation–maximization algorithm; Poisson mixture; gamma mixture; yarn quality

1. Introduction

Effective data mining methods are needed to access valuable information and knowledge hidden within large heterogeneous data groups encountered in various fields, such as business, government, engineering, and health. These heterogeneous populations are typically composed of subpopulations, often referred to as components [1,2]. Finite mixture models (FMMs), which provide a flexible and powerful probabilistic approach, have been used by researchers in many fields to analyze heterogeneous populations due to their well-principled approach to clustering. Known as unsupervised learning models, they can also be used in latent class analysis, image analysis, survival analysis, disease mapping, and meta-analysis. Fields in which mixture models have been successfully applied include, but are not limited to, chemistry [3], engineering [4], criminology [5], psycho-oncology [6], psychiatry [7], education [8], health and medical sciences [9,10,11,12], and acoustic signal processing and sonar data analysis [13].

However, despite their wide range of applications in the aforementioned fields, the use of mixture distributions in the textile industry remains relatively limited. In view of the textile literature, it is observed that the FMM is mostly applied to generate the cotton fiber length probability density function. Notably, an FMM based on two Weibull distributions has been proposed as an effective method for this purpose [14]. This approach was further validated by [15,16], both of which applied a two-component Weibull mixture model to represent fiber length distribution. Additionally, the normal distribution and the power function were introduced as part of the FMM framework to derive the probability density functions for carded and combed cotton fibers, utilizing parameters derived from fiber length measurements [17]. The characterization of cotton fiber length distribution using an FMM is also addressed in [18]. Based on the characteristics of fiber length histograms, a mixture of Weibull distributions was used to generate a parametric continuous function. This model was applied to cotton fiber samples collected from different spinning processes (bale, card mat, carded sliver, combed sliver, and finished sliver) to describe the fiber length distribution. Collectively, these studies highlight the FMM as a reliable and robust method for generating cotton fiber length probability density functions.

Building on these findings, in this study, we investigate the effectiveness of FMMs in modeling the probability distributions of yarn quality parameters, including yarn imperfection and mechanical parameters. Yarn imperfection parameters include the total number of neps and thick and thin places within a specified length of yarn. In yarn manufacturing, “thick and thin places” are irregularities where the yarn’s diameter varies significantly. Thin places indicate reductions in mass, while thick places signify increases in mass [19]. More specifically, thick places are yarn defects identified by a diameter larger than that of adjacent sections, extending approximately 6 mm (1/4 inch) in length, and are primarily due to inadequate fiber drafting during yarn formation. Conversely, thin places are defects marked by segments at least 25% smaller in diameter than the average yarn diameter, usually resulting from inconsistencies in sliver or yarn count [20]. Another imperfection parameter is called neps, and is defined as small fiber clusters that become entangled within the yarn and enlarge its diameter, which can negatively influence the fabric’s visual appeal and tactile quality [21]. The presence of neps adversely affects both the appearance of the yarn and the fabrics produced from it, impacting their smoothness and the evenness of dye application [22]. The reason for the negative impact on the appearance of dyed or printed fabrics is typically the absorption of dye by neps. The low cellulose content of the immature fibers reflects light differently, resulting in white spots or flecks on finished fabrics [23]. Although such white spots are visible when fabrics are dyed with lighter colors, these spots or specks are especially visible in fabrics dyed with darker colors. Once these marks and specks are present in the fabric, they can lead to downgrading or rejection because there is no cost-effective method of concealing them [24]. 
Because these imperfections can significantly degrade the quality of both the yarn and the resulting fabric, understanding the probability distribution of yarn imperfections is crucial for predicting these properties, ensuring quality control, and optimizing production. This knowledge also contributes to a better understanding of variations in yarn properties and enables the rapid identification of potential issues.

In addition to the imperfection parameters of the yarn, the mechanical properties, including tenacity and elongation, have also been considered in the scope of the study. Tenacity and elongation are two crucial quality parameters that directly impact the overall performance and suitability of textile fibers and yarns for various applications [25,26,27,28,29,30]. Tenacity, also known as tensile strength, refers to the ability of a fiber or yarn to withstand tensile stress before breaking [26,27]. This property is essential in high-speed weaving and other fabric manufacturing processes where the yarn is subjected to high stresses imposed by various machine elements. Yarns with higher tenacity are less prone to breakage during these dynamic processes, leading to improved efficiency and reduced waste [31]. The elongation of a fiber or yarn represents its maximum extension before rupture [29,30]. This property is closely related to the fabric’s deformability and elasticity, which are important for the comfort and fit of the final textile products. Fabrics with higher elongation can better accommodate user movements and stresses, enhancing the overall user experience [32]. In the literature, studies on this subject have generally focused on identifying the factors influencing the yarn tenacity and elongation parameters. Overall, the tenacity and elongation of textile fibers are crucial parameters that determine the suitability and performance of the final textile products. Understanding these factors is essential for optimizing the quality and performance of textile materials.

Finally, this study aims to evaluate the applicability and effectiveness of FMMs in analyzing yarn imperfections—specifically thin places, thick places, and neps—as well as yarn mechanical properties, including tenacity and elongation. By leveraging Poisson and gamma mixture distributions, this work highlights a novel approach for characterizing these quality parameters, offering practical insights for quality control and process optimization in the textile industry.

To provide a clear understanding of the research process and findings, the paper is structured as follows: Section 2 describes the methodology employed in this study, including the data collection procedures and the theoretical background of the specific FMMs used to analyze yarn imperfections and mechanical properties. Section 3 presents the results of the analysis and discusses the findings in detail. Finally, Section 4 concludes the study by summarizing the key findings, discussing their implications for the textile industry, and outlining potential avenues for future research.

2. Materials and Methods

2.1. Datasets

The data were sourced from a textile company in Adana, Turkey, that has preferred to remain anonymous. The company offers an extensive range of fabrics, employing various techniques, such as fiber dyeing, piece dyeing, texturing, and digital printing. In addition, it produces yarns in diverse counts and fiber blends using the ring-spinning method, and these yarns are used in its own fabric production.

Specifically, the data utilized in this study were obtained from a 28/1 Ne ring-spun, combed (RNG COM) polyester/viscose (PE/V) yarn with a composition ratio of 65/35. The dataset consists of 2979 test results from tests conducted between December 2017 and January 2022. All yarns were produced under the standard ring spinning process parameters defined by the manufacturer for 28/1 Ne combed polyester/viscose (65/35) yarns. In the ring spinning stage, the spindle speed was set to 15,000 rpm, the twist level to 730 t/m, the draft ratio to 3.50, and the delivery speed to 20.5 m/min. In the roving stage, the spindle speed was 1000 rpm, the draft ratio was 0.83, and the twist was 31 t/m. These parameters were consistently applied throughout the production process, thereby ensuring the reliability of the dataset used in the analysis.

The company employs the Uster Tester-4 (Uster Technologies AG, Uster, Switzerland) instrument to evaluate yarn imperfections, including thick places, thin places, and neps, at a testing speed of 400 m/min. Sensitivity thresholds are set at −40% for identifying thin places, +50% for thick places, and at 140% and 200% for detecting different levels of neps. Additionally, the company utilizes the Uster Tensorapid 3 (Uster Technologies AG, Uster, Switzerland) device to assess the tensile strength and elasticity of the yarn.

2.2. Theoretical Background of Poisson Mixture Distribution

As a statistical modeling tool, FMMs are widely utilized for analyzing diverse types of random phenomena and can effectively represent complex probability distributions. Mixture models are particularly suitable for data characterized by the following properties: the dataset is derived from a mixture of different populations, and the distribution of each underlying component within the mixture is unknown [18].

Mathematically, an FMM is expressed as a weighted sum of the probability density functions of the individual components in the mixture. Assuming that a random variable x is drawn from a mixture distribution, its probability density function f is defined as in Equation (1).

f (x; θ) = \sum_{k = 1}^{K} w_{k} g_{k} (x; λ_{k})

(1)

This is called a Poisson mixture distribution, where $θ = (w_{1}, λ_{1}, \dots, w_{K}, λ_{K})$. Equation (2) is as follows:

g_{k} (x; λ_{k}) = \frac{λ_{k}^{x} e^{- λ_{k}}}{x!}, \quad x = 0, 1, 2, \dots \ \text{and} \ k = 1, \dots, K

(2)

where $g_{k} (x; λ_{k})$ represents the Poisson density function with parameter $λ_{k}$. Furthermore, the weights $w_{k}$ must satisfy the condition given in Equation (3).

\sum_{k = 1}^{K} w_{k} = 1

(3)
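The mixture density of Equations (1)–(3) can be illustrated with a minimal Python sketch. The function names and the two-component parameter values are hypothetical and chosen only for illustration; they are not estimates from the study. The Poisson term is evaluated in log space to avoid overflow for large counts.

```python
import math

def poisson_pmf(x, lam):
    """Poisson probability of count x for rate lam > 0, computed in log space."""
    return math.exp(x * math.log(lam) - lam - math.lgamma(x + 1))

def poisson_mixture_pmf(x, weights, rates):
    """Weighted sum of Poisson probabilities, following Equation (1)."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1 (Equation (3))")
    return sum(w * poisson_pmf(x, lam) for w, lam in zip(weights, rates))

# Hypothetical two-component mixture (illustrative values only).
weights = [0.7, 0.3]
rates = [2.0, 9.0]

# Sanity check: the mixture probabilities should sum to (almost) one.
total = sum(poisson_mixture_pmf(x, weights, rates) for x in range(200))
```

Because each component is itself a valid distribution and the weights sum to one, the mixture probabilities sum to one as well, which is what `total` checks numerically.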

2.2.1. EM Algorithm for Poisson Mixture Distribution

In FMM analysis, four primary methods are commonly employed to estimate the model parameters, namely the method of moments, the minimum distance approach, maximum likelihood estimation, and Bayesian inference. Among these approaches, maximum likelihood estimation is the most frequently utilized.

Given a set of observations $(x_{1}, \dots, x_{n})$ drawn from a mixture density, as defined in Equation (1), the likelihood function and the log-likelihood function for the Poisson mixture distribution are given in Equations (4) and (5), respectively. Equation (4) is as follows:

L (X; θ) = \prod_{i = 1}^{n} \sum_{k = 1}^{K} w_{k} g_{k} (x_{i}; λ_{k})

(4)

where $L (X; θ)$ represents the likelihood of the observed data given the parameters $θ$, $w_{k}$ denotes the mixing proportion associated with the k-th component, and $g_{k} (x_{i}; λ_{k})$ is the Poisson density function evaluated at the i-th observation $x_{i}$ with parameter $λ_{k}$. Equation (5) is as follows:

\log L (X; θ) = l (θ) = \sum_{i = 1}^{n} \log \left( \sum_{k = 1}^{K} w_{k} g_{k} (x_{i}; λ_{k}) \right)

(5)

where $l (θ)$ represents the log-likelihood of the observed data, obtained by taking the logarithm of the sum of the weighted Poisson densities for each observation. This transformation converts the product in the likelihood function into a sum, which is easier to handle mathematically, especially when maximizing over the parameters $θ$.
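As a hedged illustration of Equation (5), the following sketch evaluates the mixture log-likelihood with a log-sum-exp step for numerical stability. The data and parameter values are hypothetical, and the weights are assumed to be strictly positive.

```python
import math

def log_poisson_pmf(x, lam):
    """Logarithm of the Poisson probability of count x for rate lam > 0."""
    return x * math.log(lam) - lam - math.lgamma(x + 1)

def poisson_mixture_loglik(data, weights, rates):
    """Log-likelihood of Equation (5): sum_i log(sum_k w_k g_k(x_i; lam_k))."""
    total = 0.0
    for x in data:
        terms = [math.log(w) + log_poisson_pmf(x, lam)
                 for w, lam in zip(weights, rates)]
        m = max(terms)  # log-sum-exp: factor out the largest term
        total += m + math.log(sum(math.exp(t - m) for t in terms))
    return total

# Hypothetical counts with a low group and a high group (illustrative only).
counts = [0, 1, 3, 2, 8, 10, 1, 9]
ll_two = poisson_mixture_loglik(counts, [0.6, 0.4], [1.5, 9.0])
ll_one = poisson_mixture_loglik(counts, [1.0], [sum(counts) / len(counts)])
```

For these illustrative counts, the two-component parameters give a higher log-likelihood than a single Poisson distribution fitted at the sample mean, which is the kind of comparison that motivates a mixture model.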

The likelihood and log-likelihood functions are fundamental components in the statistical analysis of Poisson mixture distributions. They provide the basis for parameter estimation through methods such as maximum likelihood estimation (MLE).

In the context of MLE for Poisson mixture distributions, the estimate $\hat{θ}$ is a root of the log-likelihood equation, which is expressed mathematically as follows:

\frac{\partial l (θ)}{\partial θ} = 0

(6)

Equation (6) indicates that the score function, which is the gradient of the log-likelihood function with respect to the parameters $θ$, must equal zero at the maximum likelihood estimates.

The score functions derived from Equation (6) are as follows:

\frac{\partial l (θ)}{\partial λ_{k}} = \sum_{i = 1}^{n} \frac{w_{k} g_{k}^{'} (x_{i}; λ_{k})}{\sum_{v = 1}^{K} w_{v} g_{v} (x_{i}; λ_{v})}, \quad k = 1, \dots, K

(7)

where $g_{k}^{'} (x_{i}; λ_{k})$ represents the derivative of the Poisson density function with respect to the parameter $λ_{k}$. Equation (8) is as follows:

\frac{\partial l (θ)}{\partial w_{k}} = \sum_{i = 1}^{n} \frac{g_{k} (x_{i}; λ_{k}) - g_{K} (x_{i}; λ_{K})}{\sum_{v = 1}^{K} w_{v} g_{v} (x_{i}; λ_{v})}, \quad k = 1, \dots, K - 1

(8)

Equation (8) reflects the sensitivity of the log-likelihood to changes in the mixing proportions $w_{k}$. Because the weights sum to one (Equation (3)), the last weight is $w_{K} = 1 - \sum_{k = 1}^{K - 1} w_{k}$, so the numerator compares the contribution of the k-th component with that of the K-th component.
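As a worked step, the derivative $g_{k}^{'}$ appearing in Equation (7) can be written out explicitly. The sketch below assumes the standard Poisson density $λ^{x} e^{- λ} / x!$ and uses only symbols already defined in the text.

```latex
\begin{align*}
g_k'(x_i; \lambda_k)
  &= \frac{\partial}{\partial \lambda_k}\,
     \frac{\lambda_k^{x_i} e^{-\lambda_k}}{x_i!}
   = \frac{x_i \lambda_k^{x_i - 1} e^{-\lambda_k}
           - \lambda_k^{x_i} e^{-\lambda_k}}{x_i!}
   = g_k(x_i; \lambda_k)\left(\frac{x_i}{\lambda_k} - 1\right), \\
\frac{\partial l(\theta)}{\partial \lambda_k}
  &= \sum_{i=1}^{n}
     \frac{w_k\, g_k(x_i; \lambda_k)}{\sum_{v=1}^{K} w_v\, g_v(x_i; \lambda_v)}
     \left(\frac{x_i}{\lambda_k} - 1\right).
\end{align*}
```

The ratio in the second line is the posterior probability that observation i belongs to component k. Since this ratio depends on all parameters simultaneously, setting the score to zero does not give an explicit solution, which motivates the iterative approach described next.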

Since a closed-form solution for the maximum likelihood estimation (MLE) of Poisson mixture distributions is not available, the expectation–maximization (EM) algorithm, originally developed by Dempster et al. [33], is employed to estimate the parameters of the mixture distribution. The EM algorithm is an iterative computational method used to determine the maximum likelihood or maximum a posteriori estimates of parameters in statistical models. It involves defining missing data, representing incomplete information that can be assumed to simplify the model and its likelihood function. The details of the EM algorithm for the Poisson distribution have been demonstrated by numerous researchers.

Let $(x_{1}, \dots, x_{n})$ represent an independent and identically distributed (i.i.d.) random sample drawn from the Poisson mixture distribution defined in Equation (1). The likelihood function of $f (x; θ)$ is expressed as Equation (4). For computational convenience, it is more practical to work with the log-likelihood function in Equation (9), which was presented in [34]:

\log L (x; θ) = l (θ) = \log \prod_{i = 1}^{n} \sum_{k = 1}^{K} w_{k} g_{k} (x_{i}; λ_{k}) = \sum_{i = 1}^{n} \log \left\{ \sum_{k = 1}^{K} w_{k} g_{k} (x_{i}; λ_{k}) \right\}

(9)

In the context of the Poisson mixture model, the missing data can be defined as follows:

z_{i k} = \begin{cases} 1, & \text{if the } i \text{th observation is from the } k \text{th Poisson distribution} \\ 0, & \text{otherwise} \end{cases} \quad i = 1, \dots, n, \ k = 1, \dots, K; \qquad z_{i} = (z_{i 1}, \dots, z_{i K})^{T}

By incorporating the missing data $z_{i k}$, the augmented data can be defined as $y = (y_{1}, \dots, y_{n})$, where $y_{i} = (x_{i}, z_{i}^{T})$. The augmented log-likelihood function is then expressed by Equation (10).

\log L_{a} (θ; y) = \sum_{i = 1}^{n} \sum_{k = 1}^{K} z_{i k} \log \{ w_{k} g_{k} (x_{i}; λ_{k}) \}

(10)

Since the missing data $z_{i k}$ are unobserved, the augmented log-likelihood cannot be maximized directly, and a closed-form solution for the MLE of λ does not exist. Instead of directly obtaining the optimal likelihood estimate, the EM algorithm alternates between two main steps: the Expectation (E) step and the Maximization (M) step.

In the E-step, it is necessary to compute the expected value of the augmented log-likelihood function, conditioned on the observed data and the current parameter estimates, as given in Equation (11).

Q (θ | θ^{(r)}) = E \{ \log L_{a} (θ; y) \mid x; θ^{(r)} \}

(11)

where $θ^{(r)}$ represents the parameter estimates at the r-th iteration. Because $\log L_{a} (θ; y)$ is a linear function of the missing data $z_{i k}$, computing this expectation reduces to replacing each $z_{i k}$ with its conditional expectation. These expected values are given in Equation (12).

p_{i k}^{(r + 1)} = E (z_{i k} | x_{i}, θ^{(r)}) = P (z_{i k} = 1 | x_{i}, θ^{(r)}) = \frac{w_{k}^{(r)} g_{k} (x_{i}; λ_{k}^{(r)})}{\sum_{v = 1}^{K} w_{v}^{(r)} g_{v} (x_{i}; λ_{v}^{(r)})}

(12)

where $p_{i k}^{(r + 1)}$ represents the posterior probability that the i-th observation belongs to the k-th Poisson component, based on the current parameter estimates $θ^{(r)}$. These probabilities are used in the M-step to update the parameter estimates, and the two steps are repeated until convergence. Table 1 presents the steps of the EM algorithm applied to the Poisson mixture distribution.

In the E-step of the EM algorithm, the probability density function of the Poisson mixture distribution (as defined in Equation (1)) is substituted into Equation (13) to compute the value of $\hat{p}_{i k}$. The resulting expression is given in Equation (16).

\hat{p}_{i k} = \frac{\hat{w}_{k} \left( \frac{e^{- \hat{λ}_{k}} \hat{λ}_{k}^{x_{i}}}{x_{i}!} \right)}{\sum_{v = 1}^{K} \hat{w}_{v} \left( \frac{e^{- \hat{λ}_{v}} \hat{λ}_{v}^{x_{i}}}{x_{i}!} \right)}

(16)

where $\hat{p}_{i k}$ represents the probability that the i-th observation belongs to the k-th Poisson component, given the current parameter estimates.
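The E-step of Equation (16) can be sketched as follows. This is a minimal illustration with hypothetical inputs, assuming strictly positive weights and rates. The factor $x_{i}!$ is common to every component and cancels in the ratio, so it is omitted.

```python
import math

def e_step(data, weights, rates):
    """Posterior probabilities p[i][k] that observation i belongs to component k."""
    resp = []
    for x in data:
        # Log of w_k * lam_k^x * exp(-lam_k); the 1/x! factor cancels in the ratio.
        logs = [math.log(w) + x * math.log(lam) - lam
                for w, lam in zip(weights, rates)]
        m = max(logs)  # subtract the maximum before exponentiating, for stability
        num = [math.exp(v - m) for v in logs]
        s = sum(num)
        resp.append([v / s for v in num])
    return resp

# Hypothetical counts and parameters (illustrative only).
resp = e_step([0, 2, 12], [0.5, 0.5], [1.0, 10.0])
```

Each row of `resp` sums to one. With these illustrative parameters, a count of 0 is assigned almost entirely to the low-rate component and a count of 12 almost entirely to the high-rate component.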

In the M-step of the EM algorithm, Equations (14) and (15) represent a constrained optimization problem and can be expressed in the form of Equations (17) and (18). This optimization problem can be transformed into Equation (19) using the Lagrange multiplier ($δ$).

\hat{λ}_{k}, \hat{w}_{k} = \arg \max_{λ_{k}, w_{k}} Q (λ_{1}, \dots, λ_{K}, w_{1}, \dots, w_{K})

(17)

s.t.

\sum_{k = 1}^{K} w_{k} = 1

(18)

L (λ_{1}, \dots, λ_{K}, w_{1}, \dots, w_{K}, δ) = Q (λ_{1}, \dots, λ_{K}, w_{1}, \dots, w_{K}) - δ \left( \sum_{k = 1}^{K} w_{k} - 1 \right)

(19)

By substituting the expected augmented log-likelihood (Equations (9)–(11)) and the Poisson density of the mixture distribution (Equation (1)) into Equation (19), the Lagrangian can be explicitly expressed as shown in Equations (20) and (21).

L (λ_{1}, \dots, λ_{K}, w_{1}, \dots, w_{K}, δ) = \sum_{i = 1}^{n} \sum_{k = 1}^{K} [\hat{p}_{i k} \log w_{k} + \hat{p}_{i k} \log g_{k} (x_{i}; λ_{k})] - δ \left( \sum_{k = 1}^{K} w_{k} - 1 \right)

(20)

L (λ_{1}, \dots, λ_{K}, w_{1}, \dots, w_{K}, δ) = \sum_{i = 1}^{n} \sum_{k = 1}^{K} [\hat{p}_{i k} \log w_{k} + \hat{p}_{i k} (- λ_{k} + x_{i} \log λ_{k} - \log x_{i}!)] - δ \left( \sum_{k = 1}^{K} w_{k} - 1 \right)

(21)

When the partial derivative of the Lagrangian function in Equation (21) with respect to $λ_{k}$ is taken and set equal to zero, as shown in Equations (22) and (23), the value of $\hat{λ}_{k}$ given in Equation (24) is obtained.

\frac{\partial L}{\partial λ_{k}} = \sum_{i = 1}^{n} \hat{p}_{i k} \frac{\partial}{\partial λ_{k}} (\log w_{k} + x_{i} \log λ_{k} - \log x_{i}! - λ_{k})

(22)

\frac{\partial L}{\partial λ_{k}} = \sum_{i = 1}^{n} \left[ \hat{p}_{i k} \left( \frac{x_{i}}{λ_{k}} - 1 \right) \right] = 0

(23)

\hat{λ}_{k} = \frac{\sum_{i = 1}^{n} \hat{p}_{i k} x_{i}}{\sum_{i = 1}^{n} \hat{p}_{i k}}

(24)

By performing the same operations and taking the partial derivatives with respect to $w_{k}$ and $δ$, and setting them equal to zero, an estimate for $w_{k}$ can also be obtained.

\frac{\partial L}{\partial w_{k}} = \frac{\partial}{\partial w_{k}} \left[ \sum_{i = 1}^{n} \sum_{v = 1}^{K} \hat{p}_{i v} (\log w_{v} + x_{i} \log λ_{v} - \log x_{i}! - λ_{v}) - δ \left( \sum_{v = 1}^{K} w_{v} - 1 \right) \right]

(25)

\frac{\partial L}{\partial w_{k}} = \sum_{i = 1}^{n} \frac{\hat{p}_{i k}}{w_{k}} - δ = 0 \quad \Rightarrow \quad \hat{w}_{k} = \frac{1}{δ} \sum_{i = 1}^{n} \hat{p}_{i k}

(26)

\frac{\partial L}{\partial δ} = - \left( \sum_{k = 1}^{K} w_{k} - 1 \right) = 0

(27)

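By substituting Equation (26) into Equation (27), the Lagrange multiplier δ is obtained, as shown in Equation (28).

\sum_{k = 1}^{K} \frac{1}{δ} \sum_{i = 1}^{n} \hat{p}_{i k} = 1 \quad \Rightarrow \quad δ = \sum_{i = 1}^{n} \sum_{k = 1}^{K} \hat{p}_{i k}

(28)

Thus, the estimate $\hat{w}_{k}$ is obtained as shown in Equation (29). The last equality holds because the posterior probabilities of each observation sum to one over the components, so that $δ = n$.

\hat{w}_{k} = \frac{\sum_{i = 1}^{n} \hat{p}_{i k}}{\sum_{i = 1}^{n} \sum_{v = 1}^{K} \hat{p}_{i v}} = \frac{1}{n} \sum_{i = 1}^{n} \hat{p}_{i k}

(29)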
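Taken together, the E-step of Equation (16) and the M-step updates of Equations (24) and (29) can be sketched as a short EM loop. This is a minimal illustration, not the implementation used in the study: the function name, the stopping rule, the starting values, and the data are all hypothetical, and no safeguards are included for components whose weight or rate collapses to zero.

```python
import math

def em_poisson_mixture(data, weights, rates, n_iter=500, tol=1e-10):
    """Fit a Poisson mixture by EM from the given starting weights and rates."""
    n, K = len(data), len(rates)
    weights, rates = list(weights), list(rates)
    prev = -math.inf
    loglik = prev
    for _ in range(n_iter):
        # E-step: posterior probabilities and the current log-likelihood.
        resp, loglik = [], 0.0
        for x in data:
            logs = [math.log(w) + x * math.log(lam) - lam - math.lgamma(x + 1)
                    for w, lam in zip(weights, rates)]
            m = max(logs)
            s = sum(math.exp(v - m) for v in logs)
            loglik += m + math.log(s)
            resp.append([math.exp(v - m) / s for v in logs])
        # M-step: weighted mean for each rate, average posterior for each weight.
        for k in range(K):
            nk = sum(r[k] for r in resp)
            rates[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            weights[k] = nk / n
        if abs(loglik - prev) < tol:
            break
        prev = loglik
    return weights, rates, loglik

# Hypothetical counts: a low group and a high group (illustrative only).
data = [1, 2, 3, 2, 1, 0, 2, 3, 1, 2, 10, 12, 9, 11, 13, 10, 8, 12]
w_hat, lam_hat, ll = em_poisson_mixture(data, [0.5, 0.5], [1.0, 8.0])
```

A property that follows directly from the two update formulas is that the fitted mixture mean, the sum of $\hat{w}_{k} \hat{λ}_{k}$ over the components, equals the sample mean after every M-step. Since EM converges only to a local maximum, several starting values are usually tried in practice.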

2.2.2. The Mean and Variance of the Poisson Mixture Distribution

If the mean of a Poisson mixture distribution, with parameters defined in the preceding section, is denoted by $λ_{\text{mix}}$, it can be calculated using Equation (30).

λ_{\text{mix}} = \sum_{k = 1}^{K} w_{k} λ_{k}

(30)

The variance of any distribution can be computed using Equation (31).

\mathrm{Var} (X) = E [X^{2}] - (E [X])^{2}

(31)

where $E [X^{2}]$ represents the expected value of the square of the random variable, while $E [X]$ denotes the expected value itself. For the Poisson mixture distribution, the variance formula in Equation (31) is adapted to its parameters as shown in Equation (32).

\mathrm{Var} (X) = E [X^{2} | w_{k}, λ_{k}] - (E [X | w_{k}, λ_{k}])^{2}

(32)

After algebraic manipulations, the variance of the Poisson mixture distribution, as expressed in Equation (33), is obtained.

σ_{\text{mix}}^{2} = \sum_{k = 1}^{K} w_{k} (σ_{k}^{2} + λ_{k}^{2}) - λ_{\text{mix}}^{2}

(33)

An important property of the Poisson distribution is that its mean equals its variance. Therefore, in the preceding equation, $σ_{k}^{2}$ is replaced with $λ_{k}$, leading to the expression in Equation (34).

σ_{\text{mix}}^{2} = \sum_{k = 1}^{K} w_{k} (λ_{k} + λ_{k}^{2}) - λ_{\text{mix}}^{2}

(34)

Equation (34) presents the final expression needed to compute the variance of the Poisson mixture distribution.
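Equations (30) and (34) can be checked with a small numerical example. The function name and the parameter values below are hypothetical and serve only to illustrate the arithmetic.

```python
def poisson_mixture_mean_var(weights, rates):
    """Mean (Equation (30)) and variance (Equation (34)) of a Poisson mixture."""
    mean = sum(w * lam for w, lam in zip(weights, rates))
    second_moment = sum(w * (lam + lam ** 2) for w, lam in zip(weights, rates))
    return mean, second_moment - mean ** 2

# Hypothetical two-component mixture (illustrative values only).
mean, var = poisson_mixture_mean_var([0.7, 0.3], [2.0, 9.0])
# mean = 0.7*2 + 0.3*9 = 4.1
# var  = 0.7*(2 + 4) + 0.3*(9 + 81) - 4.1**2 = 31.2 - 16.81 = 14.39
```

In this example the variance (14.39) is considerably larger than the mean (4.1), whereas a single Poisson distribution would have equal mean and variance. This overdispersion is one reason a mixture can describe heterogeneous count data better than a single Poisson distribution.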

2.3. Theoretical Background of the Gamma Mixture Distribution

The probability density function (PDF) of the gamma mixture distribution is given in Equation (35).

f (x; Θ) = \sum_{k = 1}^{K} π_{k} g_{k} (x; α_{k}, β_{k})

(35)

where $Θ = (θ_{1}, θ_{2}, \dots, θ_{K})$ represents the parameter vector of the mixture distribution, with $θ_{k} = (π_{k}, α_{k}, β_{k})$ for each $k = 1, \dots, K$. Here, $α_{k}$ denotes the shape parameter of the k-th component, $β_{k}$ denotes the scale parameter of the k-th component, and $π_{k}$ denotes the mixture proportion of the k-th component, satisfying $π_{k} \geq 0$ for all $k$ and the constraint $\sum_{k = 1}^{K} π_{k} = 1$.

In this formulation, $g_{k} (x; α_{k}, β_{k})$ is the density of each individual gamma component, which is weighted by its respective proportion $π_{k}$ and is given in Equation (36).

g_{k} (x; α_{k}, β_{k}) = \frac{x^{α_{k} - 1} e^{- x / β_{k}}}{β_{k}^{α_{k}} Γ (α_{k})}, \quad x \geq 0 \ \text{and} \ k = 1, \dots, K

(36)

where $Γ (α_{k})$ is the gamma function given in Equation (37).

Γ (α_{k}) = \int_{0}^{\infty} t^{α_{k} - 1} e^{- t} \, d t, \quad k = 1, \dots, K

(37)
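The gamma mixture density of Equation (35) can be sketched in Python as follows. The sketch assumes that $β_{k}$ is a scale parameter, as stated in the text, so that the component mean is $α_{k} β_{k}$. The function names and parameter values are hypothetical.

```python
import math

def gamma_pdf(x, shape, scale):
    """Gamma density with shape alpha and scale beta (assumed parametrization)."""
    if x <= 0:
        return 0.0  # density taken as zero outside the positive half-line
    log_pdf = ((shape - 1) * math.log(x) - x / scale
               - shape * math.log(scale) - math.lgamma(shape))
    return math.exp(log_pdf)

def gamma_mixture_pdf(x, props, shapes, scales):
    """Weighted sum of gamma densities, following Equation (35)."""
    return sum(p * gamma_pdf(x, a, b) for p, a, b in zip(props, shapes, scales))

# Hypothetical two-component mixture (illustrative values only).
props, shapes, scales = [0.6, 0.4], [20.0, 30.0], [0.8, 0.9]

# Sanity check: a Riemann sum of the density over (0, 80] should be close to one.
step = 0.01
area = step * sum(gamma_mixture_pdf(i * step, props, shapes, scales)
                  for i in range(1, 8001))
```

Working with `math.lgamma` rather than the gamma function itself avoids overflow for large shape values.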

2.3.1. EM Algorithm for Gamma Mixture Distribution

For a set of observations assumed to follow a gamma mixture distribution, the likelihood and log-likelihood functions can be expressed using the density function provided in Equation (36). These are written as Equations (38) and (39), respectively.

L (X; θ) = \prod_{i = 1}^{n} \sum_{k = 1}^{K} π_{k} g_{k} (x_{i}; α_{k}, β_{k})

(38)

\log L (X; θ) = l (θ) = \sum_{i = 1}^{n} \log \left( \sum_{k = 1}^{K} π_{k} g_{k} (x_{i}; α_{k}, β_{k}) \right)

(39)

With maximum likelihood estimation (MLE), the estimate $\hat{θ}$ is a root of the log-likelihood equation given in Equation (6). Due to the absence of a closed-form solution for the MLE, the parameters of the gamma mixture distribution are estimated using the EM algorithm. The EM algorithm for gamma distributions has been detailed by various researchers [35,36].

The missing data can be defined as follows:

z_{i k} = \begin{cases} 1, & \text{if the } i \text{th observation is from the } k \text{th gamma distribution} \\ 0, & \text{otherwise} \end{cases} \quad i = 1, \dots, n, \ k = 1, \dots, K; \qquad z_{i} = (z_{i 1}, \dots, z_{i K})^{T}

(40)

Let $y = (y_{1}, \dots, y_{n})$ be the augmented data, where $y_{i} = (x_{i}, z_{i}^{T})$. Then, the augmented log-likelihood function can be defined as shown in Equation (41).

\log L_{a} (θ; y) = \sum_{i = 1}^{n} \sum_{k = 1}^{K} z_{i k} \log \{ π_{k} g_{k} (x_{i}; α_{k}, β_{k}) \}

(41)

Since the missing data $z_{i k}$ are not observable, the MLE for $θ$ does not have a closed-form solution. Rather than directly determining the optimal likelihood estimate, the EM algorithm operates through two core steps, i.e., the expectation (E) step and the maximization (M) step.

In the E-step, as previously demonstrated in the context of the Poisson mixture distribution, it is essential to compute $Q (θ | θ^{(r)})$ using Equation (11). Because $\log L_{a} (θ; y)$ is a linear function of the missing data $z_{i k}$, this reduces to replacing each $z_{i k}$ with its conditional expectation. These expected values are given in Equation (42).

p_{i k}^{(r + 1)} = E (z_{i k} | x_{i}, θ^{(r)}) = P (z_{i k} = 1 | x_{i}, θ^{(r)}) = \frac{π_{k}^{(r)} g_{k} (x_{i}; α_{k}^{(r)}, β_{k}^{(r)})}{\sum_{v = 1}^{K} π_{v}^{(r)} g_{v} (x_{i}; α_{v}^{(r)}, β_{v}^{(r)})}

(42)

where $p_{i k}^{(r + 1)}$ represents the posterior probability that the i-th observation belongs to the k-th gamma component, based on the current parameter estimates $θ^{(r)}$. These probabilities are subsequently utilized in the M-step to update the parameter estimates, following a process analogous to that in the Poisson mixture model. Table 2 gives the EM algorithm for the gamma mixture model.

In the E-step of the EM algorithm, the probability density function of the gamma mixture distribution (as defined in Equation (35)) is substituted into Equation (43) to calculate the value of $\hat{p}_{i k}$, which represents the probability that the i-th observation belongs to the k-th gamma component, given the current parameter estimates.

In the M-step of the EM algorithm, Equations (44)–(46) represent a constrained optimization problem and can be expressed in the form of Equations (47) and (48).

\hat{θ}_{k}, \hat{π}_{k} = \arg \max_{θ_{k}, π_{k}} Q (θ_{1}, \dots, θ_{K}, π_{1}, \dots, π_{K})

(47)

s.t.

\sum_{k = 1}^{K} π_{k} = 1

(48)

This optimization problem can be transformed into Equation (49) using the Lagrange multiplier ($δ$).

L (θ_{1}, \dots, θ_{K}, π_{1}, \dots, π_{K}, δ) = Q (θ_{1}, \dots, θ_{K}, π_{1}, \dots, π_{K}) - δ \left( \sum_{k = 1}^{K} π_{k} - 1 \right)

(49)

By substituting the likelihood and log-likelihood functions (as defined in Equations (38) and (39)) and the gamma density of the mixture distribution (from Equation (35)) into Equation (49), the Lagrangian can be explicitly expressed as given in Equations (50) and (51).

L (θ_{1}, \dots, θ_{K}, π_{1}, \dots, π_{K}, δ) = \sum_{i = 1}^{n} \sum_{k = 1}^{K} [\hat{p}_{i k} \log π_{k} + \hat{p}_{i k} \log g_{k} (x_{i}; α_{k}, β_{k})] - δ \left( \sum_{k = 1}^{K} π_{k} - 1 \right)

(50)

\begin{aligned} L (θ_{1}, \dots, θ_{K}, π_{1}, \dots, π_{K}, δ) = {} & \sum_{i = 1}^{n} \sum_{k = 1}^{K} \left[ \hat{p}_{i k} \log π_{k} + \hat{p}_{i k} \left( (α_{k} - 1) \log x_{i} - \frac{x_{i}}{β_{k}} - α_{k} \log β_{k} - \log Γ (α_{k}) \right) \right] \\ & - δ \left( \sum_{k = 1}^{K} π_{k} - 1 \right) \end{aligned}

(51)

When the partial derivative of the Lagrangian function in Equation (51) with respect to $π_{k}$ is taken and set equal to zero, the relationship in Equation (52) is obtained.

\frac{\partial L}{\partial π_{k}} = \frac{1}{π_{k}} \sum_{i = 1}^{n} \hat{p}_{i k} - δ = 0 \quad \Rightarrow \quad \sum_{i = 1}^{n} \hat{p}_{i k} - δ π_{k} = 0

(52)

Summing both sides over k, and using $\sum_{k = 1}^{K} \hat{p}_{i k} = 1$ and $\sum_{k = 1}^{K} π_{k} = 1$, yields Equation (53).

δ = n

(53)

By substituting this result back into Equation (52) and simplifying, we arrive at the final expression for $\hat{π}_{k}$, as given in Equation (54).

\hat{π}_{k} = \frac{1}{n} \sum_{i = 1}^{n} \hat{p}_{i k}

(54)

This expression represents the maximum likelihood estimate of the mixture proportion $π_{k}$ based on the posterior probabilities $\hat{p}_{i k}$ obtained during the expectation step. Taking the partial derivative of the Lagrangian function with respect to $β_{k}$ and setting it equal to zero yields Equation (55).

\frac{\partial L}{\partial β_{k}} = \sum_{i = 1}^{n} \hat{p}_{i k} \left( \frac{x_{i}}{β_{k}^{2}} - \frac{α_{k}}{β_{k}} \right) = 0

(55)

Then, solving for $β_{k}$, the updated parameter estimate is obtained as shown in Equation (56).

\hat{β}_{k} = \frac{1}{α_{k}} \frac{\sum_{i = 1}^{n} \hat{p}_{i k} x_{i}}{\sum_{i = 1}^{n} \hat{p}_{i k}}

(56)

Equation (56) provides the maximum likelihood estimate of $β_{k}$, expressed in terms of the posterior probabilities $\hat{p}_{i k}$, the observed data $x_{i}$, and the shape parameter $α_{k}$.
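The closed-form updates of Equations (54) and (56) can be sketched as below. The function name, the data, and the posterior probabilities are hypothetical; the shape parameters are treated as fixed at their current values, and $β_{k}$ is assumed to be a scale parameter.

```python
def m_step_props_scales(data, resp, shapes):
    """Update mixing proportions (Eq. (54)) and scales (Eq. (56)) for fixed shapes."""
    n, K = len(data), len(shapes)
    props, scales = [], []
    for k in range(K):
        nk = sum(r[k] for r in resp)  # effective number of points in component k
        weighted_mean = sum(r[k] * x for r, x in zip(resp, data)) / nk
        props.append(nk / n)                       # Equation (54)
        scales.append(weighted_mean / shapes[k])   # Equation (56)
    return props, scales

# Hypothetical measurements with hard (0/1) posterior probabilities for clarity.
data = [10.0, 12.0, 20.0, 22.0]
resp = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
props, scales = m_step_props_scales(data, resp, shapes=[11.0, 21.0])
```

In this toy case the weighted means are 11 and 21, so with shapes 11 and 21 both updated scales equal 1 and both proportions equal 0.5. The update makes each component mean $α_{k} \hat{β}_{k}$ match the posterior-weighted sample mean.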

Taking the partial derivative of the Lagrangian function with respect to

α_{k}

and setting it as equal to zero yields Equation (57).

\frac{\partial L}{{\partial α}_{k}} = \sum_{i = 1}^{n} {\hat{p}}_{i k} \log (x_{i}) - \sum_{i = 1}^{n} {\hat{p}}_{i k} l o g (\frac{\sum_{i = 1}^{n} {\hat{p}}_{i k} x_{i}}{\sum_{i = 1}^{n} {\hat{p}}_{i k}}) + \sum_{i = 1}^{n} {\hat{p}}_{i k} \log (α_{k}) - \sum_{i = 1}^{n} {\hat{p}}_{i k} ψ (α_{k}) = 0

(57)

Since Equation (57) has no closed-form solution, a closed-form update is not available for

α_{k}

. To address this limitation, the iterative and gradient-based nature of the EM algorithm is utilized. In each iteration, the parameter updates yield a positive projection onto the gradient of the log-likelihood function with respect to the system parameters. Accordingly, the update of

α_{k}

can be performed in the direction of the gradient using a step size

ε_{k}

(ε_{k} \in (0,1))

. This yields an iterative update scheme for

α_{k}

as defined in Equations (58) and (59), which moves the estimate toward a local maximum of the likelihood function [35,36].

α_{k}^{(r + 1)} = α_{k}^{(r)} + ε_{k} G_{α_{k}} (X, θ^{(r)})

(58)

G_{α_{k}} (X, θ^{(r)}) = \frac{\partial Q (X, θ^{(r)})}{n \partial α_{k}} = \frac{1}{n} \sum_{i = 1}^{n} [\log (x_{i}) - \log (β_{k}^{(r)}) - ψ (α_{k}^{(r)})] p_{i k}^{(r + 1)}

(59)

This equation involves the digamma function

ψ (x)

, which is defined in Equation (60).

ψ (x) = \frac{Γ^{'} (x)}{Γ (x)}

(60)

This function plays a crucial role in the estimation of the shape parameter

α_{k}

of the k-th gamma distribution in the EM algorithm for the gamma mixture model.

Since the digamma function in Equation (60) has no closed-form expression, it can be approximated as given in Equation (61), which is accurate especially for

x \geq 2

.

ψ (x) \approx G (x) = \log (x - \frac{1}{2}) + \frac{1}{24 (x - \frac{1}{2})^{2}}

(61)
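As a quick plausibility check of Equation (61), the approximation can be compared with a numerical derivative of the log-gamma function. The sketch below is illustrative only; the function names and the finite-difference step are assumptions, not part of the original study.

```python
import math


def digamma_approx(x):
    """Approximation of Equation (61): log(x - 1/2) + 1 / (24 (x - 1/2)^2)."""
    h = x - 0.5
    return math.log(h) + 1.0 / (24.0 * h * h)


def digamma_numeric(x, step=1e-5):
    """Reference value: central difference of log Gamma(x), i.e. Gamma'(x) / Gamma(x)."""
    return (math.lgamma(x + step) - math.lgamma(x - step)) / (2.0 * step)


for x in (2.0, 5.0, 20.0, 100.0):
    err = abs(digamma_approx(x) - digamma_numeric(x))
    print(f"x = {x:6.1f}  approx = {digamma_approx(x):.6f}  abs. error = {err:.2e}")
```

The absolute error shrinks quickly as the argument grows, which is relevant here because the fitted shape parameters reported later in the paper are in the hundreds.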

Using the approximation in Equation (61), the digamma function can be expressed as a function of

α_{k}

, as shown in Equation (62).

ψ (α_{k}^{(r)}) \approx \log (α_{k}^{(r)} - \frac{1}{2}) + \frac{1}{24 (α_{k}^{(r)} - \frac{1}{2})^{2}}

(62)
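The gradient update of Equation (58), combined with the digamma approximation of Equation (62), can be sketched as follows. This is a hedged illustration under stated assumptions: the gamma density is parameterized with a scale parameter as in Equation (51), so the derivative of the complete-data log-likelihood with respect to the shape parameter is log(x) - log(β) - ψ(α); the function names, the step size of 0.5, the starting value, and the toy data are all hypothetical.

```python
import math


def digamma_approx(a):
    """Digamma approximation of Equation (62); requires a > 1/2."""
    h = a - 0.5
    return math.log(h) + 1.0 / (24.0 * h * h)


def alpha_gradient_step(x, resp_k, alpha_k, beta_k, eps_k=0.5):
    """One gradient update of the shape parameter of component k (Equation (58)).

    The gradient is the responsibility-weighted average of
    log(x_i) - log(beta_k) - psi(alpha_k), assuming beta_k is a scale parameter.
    """
    psi = digamma_approx(alpha_k)
    grad = sum(p * (math.log(xi) - math.log(beta_k) - psi)
               for xi, p in zip(x, resp_k)) / len(x)
    return alpha_k + eps_k * grad


# Toy single-component example: every observation has responsibility 1.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
resp = [1.0] * len(x)
mean_x = sum(x) / len(x)

alpha = 2.0  # hypothetical starting value
for _ in range(2000):
    beta = mean_x / alpha  # closed-form scale update of Equation (56)
    alpha = alpha_gradient_step(x, resp, alpha, beta)

print(alpha, mean_x / alpha)
```

Alternating the closed-form scale update with a small gradient step on the shape parameter drives the gradient toward zero, which is the stationarity condition expressed by Equation (57).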

2.3.2. The Mean and Variance of the Gamma Mixture Distribution

In the context of a gamma mixture distribution, where each component follows a

G a m m a (α_{k}, β_{k})

distribution with a mixing proportion

π_{k} (k = 1, \dots, K)

, the mean (

μ_{m g}

) can be represented by Equation (63).

μ_{m g} = \sum_{k = 1}^{K} π_{k} (α_{k} β_{k})

(63)

Utilizing the variance definition in Equation (31), the variance (

σ_{m g}^{2}

) for the gamma mixture distribution can be expressed by Equation (64).

σ_{m g}^{2} = \sum_{k = 1}^{K} π_{k} (α_{k} β_{k}^{2} + α_{k}^{2} β_{k}^{2}) - μ_{m g}^{2}

(64)

Alternatively, it can also be presented in factored form, as shown in Equation (65).

σ_{m g}^{2} = \sum_{k = 1}^{K} π_{k} (α_{k} + 1) (α_{k} β_{k}^{2}) - {(μ_{m g})}^{2}

(65)
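The mixture moments can be evaluated directly from fitted parameters. The sketch below assumes the scale parameterization of Equation (63), under which each gamma component has mean αβ and second moment α(α + 1)β², and uses the rounded elongation parameters listed in Table 5; the function name is hypothetical.

```python
import math


def gamma_mixture_moments(pi, alpha, beta):
    """Mean and standard deviation of a gamma mixture (beta is a scale parameter)."""
    mean = sum(p * a * b for p, a, b in zip(pi, alpha, beta))
    # Second raw moment of Gamma(alpha, scale=beta) is alpha * (alpha + 1) * beta^2.
    second = sum(p * a * (a + 1.0) * b * b for p, a, b in zip(pi, alpha, beta))
    return mean, math.sqrt(second - mean * mean)


# Rounded elongation parameters as listed in Table 5.
mean_el, sd_el = gamma_mixture_moments([0.33, 0.67], [320.0, 110.0], [0.04, 0.146])
print(round(mean_el, 2), round(sd_el, 2))
```

The printed values can be compared with the model-based mean and standard deviation of elongation in Table 6; small differences are expected because the tabulated parameters are rounded.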

2.4. Model Evaluation

To determine the optimal model, we employ the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), which are widely used statistical criteria for model selection. The formal definitions for these criteria are given in Equations (66) and (67).

\text{AIC} = 2 k - 2 \ln (L)

(66)

\text{BIC} = \ln (n) k - 2 \ln (L)

(67)

where

k

denotes the number of parameters,

L

represents the maximized value of the likelihood function, and

n

is the sample size.

Both the AIC and BIC penalize model complexity through the number of parameters. However, the BIC penalty of ln(n) per parameter exceeds the AIC penalty of 2 per parameter whenever the sample contains at least eight observations, making the BIC more conservative about selecting complex models. The AIC tends to favor models that fit the data more closely, which may lead to overfitting, particularly when dealing with large datasets. The BIC mitigates this risk through its stronger penalty, making it more suitable for cases where a balance between model complexity and goodness-of-fit is essential [13].

In this study, we utilize an iterative approach to evaluate various candidate models with different parameter configurations. The selection process involves comparing AIC and BIC values, where the model yielding the smallest criterion value is deemed the optimal choice for representing the given dataset.
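Equations (66) and (67) can be applied to a fitted mixture as sketched below. The log-likelihood is the value reported for thin places in Table 3. The sample size of 768 is an assumption inferred from the gap between the reported AIC and BIC values, since it is not stated in this section, and the helper names are hypothetical.

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion, Equation (66)."""
    return 2 * n_params - 2 * log_lik


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion, Equation (67)."""
    return math.log(n_obs) * n_params - 2 * log_lik


def n_free_params(n_components, params_per_component):
    """Component parameters plus the (K - 1) free mixing proportions."""
    return n_components * params_per_component + (n_components - 1)


# Two-component Poisson mixture for thin places: 2 rates + 1 free proportion.
k_thin = n_free_params(2, 1)
log_lik_thin = -3464.14   # reported in Table 3
n_obs = 768               # assumption, not stated in this section

aic_thin = aic(log_lik_thin, k_thin)
bic_thin = bic(log_lik_thin, k_thin, n_obs)
print(round(aic_thin, 2), round(bic_thin, 2))
```

In a model search, the same two functions would be evaluated for each candidate number of components, and the candidate with the smallest criterion value would be retained.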

3. Results and Discussion

3.1. Distribution Fitting of Yarn Imperfection Parameters

The results of the chi-square (

χ^{2})

goodness-of-fit test revealed that the observed data for thin places, thick places, and neps in the yarn samples did not conform to standard single distributions, such as binomial, geometric, or Poisson distributions. This finding strongly suggests that these parameters follow a more complex, heterogeneous pattern that can be better described using mixture models.

The histograms of the observed data further supported this conclusion by showing multiple peaks, indicative of an underlying mixture of distributions (Figure 1 and Figure 2).

A Poisson mixture distribution was fitted to each dataset, with parameters estimated using the EM algorithm. The results of these estimations are summarized in Table 3. The proportions and parameters of each Poisson mixture distribution were calculated using the Flexmix (version 2.3-19) package in the R (version 4.1.0) programming environment.

These results indicate that a two-component Poisson mixture distribution is sufficient for modeling thin places and neps (200%), while a three-component distribution better fits the data for thick places and neps (140%).

The estimated proportions and distribution parameters provide valuable insights into the occurrence of imperfections in the yarn.

Thin places: The analysis identifies a high-proportion component (0.79) with a mean of 11.64. This indicates that most observations show relatively minor levels of thin places, with occasional severe irregularities (mean of 35.68, proportion 0.21).
Thick places: The three components represent varying degrees of thickness irregularities, with the largest proportion (0.72) attributed to minor variations (mean of 4.42). The presence of more extreme components (means of 25.76 and 11.80) points to possible process inefficiencies, which may be attributable to factors such as inappropriate tension settings or irregular fiber feeding.
Neps: At both the 140% and 200% sensitivity levels, the fitted components separate low-count from high-count instances (three components at 140% and two at 200%). The extreme values identified could be indicative of inconsistencies in fiber preparation or during spinning processes. Such irregularities are often linked to fiber contamination or processing errors that can lead to significant adverse effects on the overall fabric quality.

The validity of the Poisson mixture models was evaluated through both visual inspection and statistical testing using the Mann–Whitney U test. The same dataset used for parameter estimation was also used for this evaluation. When the histograms of the observed data are overlaid with the probability mass functions derived from the estimated models, the two agree closely. These visual comparisons are presented for thin places, thick places, neps (140%), and neps (200%) in Figure 3, Figure 4, Figure 5 and Figure 6, respectively.

A more formal way to assess whether the observed data and the data generated from the fitted distribution come from the same population is the Mann–Whitney U test.

The null hypothesis (H₀) of the Mann–Whitney U test suggests that the two groups are drawn from the same population. In other words, it proposes that the two independent groups are homogeneous and follow the same distribution. The alternative hypothesis (H₁), which is tested against the null hypothesis, suggests that the distribution of data in the first group differs from that of the second group [37]. Thus, the hypotheses are as follows:

H₀.

There is no significant difference between the distribution of the observed dataset and the Poisson mixture distribution.

H₁.

There is a significant difference between the distribution of the observed dataset and the Poisson mixture distribution.

According to the Mann–Whitney U test results, all examined datasets produced p-values greater than 0.05. Specifically, the p-value was 0.26 for the thin places dataset, 0.49 for the thick places dataset, 0.75 for the neps 200% dataset, and 0.30 for the neps 140% dataset. Since the null hypothesis is not rejected for any dataset, these findings indicate that the Poisson mixture distributions are consistent with each dataset.
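The test described above can be sketched in plain Python. This is an illustrative implementation using midranks and the tie-corrected normal approximation without continuity correction; the paper does not state which software routine or variant was used, so the function name and these choices are assumptions.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the tie-corrected normal approximation.

    Returns the U statistic of the first sample and an approximate p-value.
    """
    n1, n2 = len(x), len(y)
    combined = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    ranks = [0.0] * len(combined)
    tie_term = 0
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0          # average of ranks i+1 .. j
        for t in range(i, j):
            ranks[t] = midrank
        size = j - i
        tie_term += size ** 3 - size         # tie correction contribution
        i = j
    r1 = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    u1 = r1 - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1))))
    z = (u1 - mu) / sigma
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u1, p_value


# Hypothetical toy samples: fully separated, then identical.
u_sep, p_sep = mann_whitney_u([1, 2, 3], [4, 5, 6])
u_same, p_same = mann_whitney_u([1, 2, 3, 4], [1, 2, 3, 4])
print(u_sep, p_sep, u_same, p_same)
```

In the setting of this study, the first sample would be the observed counts and the second a sample generated from the fitted mixture; a p-value above 0.05 means the null hypothesis of a common distribution is not rejected.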

The Poisson mixture parameters provided in Table 3 were applied in conjunction with Equations (30) and (34) to compute the mean and standard deviation for thin places, thick places, and neps. The computed means agree closely with the observed data for these imperfection parameters, as presented in Table 4; the standard deviations are also close, except for thin places, where the modeled value (10.60) is lower than the observed value (14.83). Overall, the agreement between the observed and modeled values further supports the reliability of the proposed Poisson mixture models.
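The moment calculation can be reproduced from the rounded parameters in Table 3. The sketch below assumes the standard mixture identities (the mean is the weighted sum of the component rates, and each Poisson component has second moment λ + λ²); Equations (30) and (34) are not reproduced in this section, so their exact form is an assumption, and the function name is hypothetical.

```python
import math


def poisson_mixture_moments(weights, rates):
    """Mean and standard deviation of a Poisson mixture."""
    mean = sum(w * lam for w, lam in zip(weights, rates))
    # Second raw moment of Poisson(lam) is lam + lam^2.
    second = sum(w * (lam + lam * lam) for w, lam in zip(weights, rates))
    return mean, math.sqrt(second - mean * mean)


# Rounded thin-places parameters as listed in Table 3.
mean_thin, sd_thin = poisson_mixture_moments([0.79, 0.21], [11.64, 35.68])
print(round(mean_thin, 2), round(sd_thin, 2))
```

The result can be set against the Poisson mixture model columns of Table 4; the same function applies unchanged to the thick places and neps parameters.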

After establishing the validity of the mixture distribution using the Mann–Whitney U test, the probability mass functions were obtained by substituting the Poisson mixture distribution parameters provided in Table 3 into Equation (2).

f_{\text{thin}} (x) = 0.79 \frac{{11.64}^{x} e^{- 11.64}}{x!} + 0.21 \frac{{35.68}^{x} e^{- 35.68}}{x!}

(68)

f_{\text{thick}} (x) = 0.073 \frac{{25.76}^{x} e^{- 25.76}}{x!} + 0.21 \frac{{11.8}^{x} e^{- 11.8}}{x!} + 0.72 \frac{{4.42}^{x} e^{- 4.42}}{x!}

(69)

f_{\text{neps140}} (x) = 0.71 \frac{{19.31}^{x} e^{- 19.31}}{x!} + 0.23 \frac{{45.89}^{x} e^{- 45.89}}{x!} + 0.06 \frac{{107.06}^{x} e^{- 107.06}}{x!}

(70)

f_{\text{neps200}} (x) = 0.15 \frac{{30.80}^{x} e^{- 30.80}}{x!} + 0.85 \frac{{8.25}^{x} e^{- 8.25}}{x!}

(71)

Determining the probability mass functions of thin places (Equation (68)), thick places (Equation (69)), and neps (Equations (70) and (71)) provides significant benefits for optimizing quality control and enhancing production efficiency. The derived probability functions offer predictive insight into the likelihood of yarn imperfections, enabling more proactive and data-driven quality control practices. These models can be directly compared against specification limits defined for each imperfection type to estimate the probability of conformance. This capability can significantly enhance quality control mechanisms in production processes.

To further evaluate the practical applicability of the proposed Poisson mixture models in quality control, conformance probabilities were calculated based on upper specification limits provided by the yarn manufacturer. These limits reflect the quality thresholds expected by the company’s customers for each defect type—namely, thin places (≤35), thick places (≤10), neps at 140% sensitivity (≤127), and neps at 200% sensitivity (≤26). The probability of conformance was computed as the cumulative probability

P (X \leq \text{spec})

under the respective mixture distribution functions (Equations (68)–(71)).

The results reveal that the estimated conformance probabilities are 0.923 for thin places, 0.793 for thick places, 0.998 for neps at 140%, and 0.892 for neps at 200%. These values indicate that, while neps at 140% and thin place imperfections are largely within acceptable bounds, neps at 200% and thick places exhibit relatively lower conformance levels. This insight can help quality engineers prioritize improvement efforts and implement corrective measures more effectively for specific defect categories.
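The conformance calculation can be sketched by summing each mixture probability mass function up to its specification limit. The parameters below are the rounded values from Table 3 and the limits are those quoted above; because of rounding (the tabulated thick places proportions sum to 1.003, for example), the output need not reproduce the reported probabilities exactly. The function and variable names are hypothetical.

```python
import math


def poisson_pmf(x, lam):
    """Poisson probability mass, computed in log space for numerical stability."""
    return math.exp(x * math.log(lam) - lam - math.lgamma(x + 1))


def mixture_cdf(spec, weights, rates):
    """P(X <= spec) under a Poisson mixture."""
    return sum(w * sum(poisson_pmf(x, lam) for x in range(spec + 1))
               for w, lam in zip(weights, rates))


# (proportions, rates, upper specification limit) per defect type.
models = {
    "thin": ([0.79, 0.21], [11.64, 35.68], 35),
    "thick": ([0.073, 0.21, 0.72], [25.76, 11.80, 4.42], 10),
    "neps140": ([0.71, 0.23, 0.06], [19.31, 45.89, 107.06], 127),
    "neps200": ([0.15, 0.85], [30.80, 8.25], 26),
}

conformance = {name: mixture_cdf(spec, w, lam)
               for name, (w, lam, spec) in models.items()}
for name, prob in conformance.items():
    print(f"{name:8s} P(X <= spec) = {prob:.3f}")
```

A monitoring system could reuse `mixture_cdf` with refitted parameters and raise an alert whenever the conformance probability of a defect type drops below a chosen threshold.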

Thus, understanding the probability distribution enables predictive insights, allowing for proactive measures to prevent similar irregularities in future production batches. Such foresight is instrumental in minimizing potential quality issues, thereby fostering higher levels of customer satisfaction. Furthermore, integration of these models into real-time quality monitoring systems allows for instant alerts when defect probabilities exceed tolerable bounds. This capability ensures timely intervention and minimizes the occurrence of defective batches.

3.2. Distribution Fitting of Yarn Mechanical Properties

Following the comprehensive evaluation of yarn imperfection parameters, the subsequent analysis focused on investigating the mechanical properties of the yarn, particularly tensile strength (tenacity) and elongation. This investigation was conducted under the assumption that the underlying data might follow a specific statistical distribution. However, the application of the chi-square goodness-of-fit test indicated that the datasets related to these mechanical properties did not conform to any single distribution, suggesting a more complex underlying structure in the observed data.

Figure 7 and Figure 8 show the histograms of the respective datasets, revealing multiple peaks, which suggest that the data may follow a mixture distribution.

Each dataset was modeled using gamma mixture distributions, with parameters estimated through the EM algorithm. A summary of these estimation results is provided in Table 5. The proportions and parameters of each gamma mixture distribution were determined using the Flexmix package in the R programming environment.

A comparison of the histograms representing the observed data with the probability density functions derived from the fitted gamma mixture models is illustrated in Figure 9 and Figure 10, which clearly demonstrates a strong degree of agreement between the empirical data and the models. Such visual consistency underscores the assertion that the proposed gamma mixture models adequately capture the underlying distribution of the observed data for both tensile strength and elongation.

In addition to visual assessments, the Mann–Whitney U test was employed to statistically analyze the differences between the observed and modeled distributions. The results of this non-parametric test indicated that all examined datasets produced p-values exceeding the conventional threshold of 0.05. Specifically, the p-value for the tensile strength dataset was 0.69, while the elongation dataset produced a p-value of 0.40. These findings imply that there are no significant differences between the distributions of the observed data and those modeled by the gamma mixture framework.

The parameters of the gamma mixture models, detailed in Table 5, were employed alongside Equations (63) and (64) to calculate the mean and standard deviation for both elongation and tensile strength. The computed values from these equations indicate a close alignment with the observed data, which are documented in Table 6. This congruence reinforces the validity of the gamma mixture models in accurately capturing the statistical characteristics of the underlying datasets.

Following the validation of the mixture distribution, the corresponding probability functions for elongation and tensile strength were derived by substituting the gamma mixture distribution parameters from Table 5 into Equation (35), with the second parameter of each component acting as a scale parameter, consistent with Equations (51) and (63). The resulting probability density functions are given in Equations (72) and (73).

f_{\text{elongation}} (x) = 0.33 \frac{x^{319} e^{- x / 0.04}}{{0.04}^{320} Γ (320)} + 0.67 \frac{x^{109} e^{- x / 0.146}}{{0.146}^{110} Γ (110)}

(72)

f_{\text{tstrength}} (x) = 0.52 \frac{x^{119} e^{- x / 0.17}}{{0.17}^{120} Γ (120)} + 0.48 \frac{x^{279} e^{- x / 0.09}}{{0.09}^{280} Γ (280)}

(73)

In spinning mills, the variability in elongation and tensile strength directly affects the performance characteristics of the produced yarns. Establishing the probability density function for these properties is critical for effectively modeling the statistical behavior of elongation and tensile strength, providing insights that can be leveraged for quality control and predictive analysis in yarn production processes.
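Evaluating such a density numerically requires some care, because terms like the 319th power of x and Γ(320) overflow double-precision arithmetic if computed directly. The sketch below works in log space and treats the second parameter of each component as a scale parameter, an assumption that matches the log-likelihood in Equation (51) and the mixture mean in Equation (63). The parameter values are the rounded elongation entries from Table 5, and the function names and integration grid are hypothetical.

```python
import math


def gamma_logpdf(x, alpha, beta):
    """Log-density of Gamma(shape=alpha, scale=beta) at x > 0."""
    return ((alpha - 1.0) * math.log(x) - x / beta
            - alpha * math.log(beta) - math.lgamma(alpha))


def gamma_mixture_pdf(x, pi, alpha, beta):
    """Density of a gamma mixture; zero for non-positive x."""
    if x <= 0.0:
        return 0.0
    return sum(p * math.exp(gamma_logpdf(x, a, b))
               for p, a, b in zip(pi, alpha, beta))


# Rounded elongation parameters as listed in Table 5.
ELONGATION = ([0.33, 0.67], [320.0, 110.0], [0.04, 0.146])

# Riemann-sum check that the density integrates to about one on (0, 40].
step = 0.01
grid = [i * step for i in range(1, 4001)]
area = sum(gamma_mixture_pdf(x, *ELONGATION) for x in grid) * step
print(round(area, 4), gamma_mixture_pdf(15.0, *ELONGATION))
```

The same function can be integrated between specification limits to estimate the probability that elongation or tensile strength falls within a required range.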

In summary, employing a suitable FMM for both yarn irregularity and mechanical properties offers several key advantages, as follows:

Prediction and control: The obtained probability density functions allow for the prediction of the likelihood of specific parameter values occurring in future yarn production. This predictive capability is crucial for process control, enabling adjustments to maintain desired quality standards. When integrated into real-time quality monitoring systems, this predictive capability enables timely interventions by triggering alerts when the probability of conformance falls below critical thresholds.
Modeling and simulation: Probability functions can be integrated into more complex models and simulations of textile manufacturing processes. This facilitates virtual experimentation and optimization, reducing reliance on costly and time-consuming physical trials.
Risk assessment: The obtained probability density functions allow for a more precise evaluation of the risk associated with producing fabrics that do not meet predefined performance specifications. By quantifying the probability of conformance, manufacturers can assess the likelihood of deviation from quality standards in advance. This type of risk assessment is particularly crucial in high-performance applications—such as medical textiles, aerospace composites, or protective fabrics—where yarn failure could lead to severe functional or safety consequences.

4. Conclusions

This study demonstrates the effective application of finite mixture models for characterizing key yarn quality parameters in a 28/1 Ne ring-spun polyester/viscose yarn. By employing Poisson and gamma mixture distributions and using advanced statistical techniques, such as the EM algorithm, this research provides a robust framework for modeling the heterogeneity observed in yarn quality parameters. Moreover, the validation of the model using goodness-of-fit tests and the Mann–Whitney U test underscores the reliability of the proposed method.

The findings have significant implications for the textile industry. Understanding the probability distributions of quality parameters enables manufacturers to implement proactive quality control measures, optimize production processes, and predict irregularities in future batches. This approach not only reduces waste and production costs but also improves quality and customer satisfaction. These insights pave the way for future studies to explore mixture models in other aspects of textile production, such as in relation to fiber properties or a wider range of yarn types.

Author Contributions

Conceptualization, M.K. and E.K.; methodology, M.K., E.K., and M.Ö.Ü.; software, M.K., E.K., and M.Ö.Ü.; validation, M.K., E.K., and M.Ö.Ü.; formal analysis, M.K., E.K., and M.Ö.Ü.; investigation, M.K., E.K., and M.Ö.Ü.; writing—original draft preparation M.K., E.K., and M.Ö.Ü.; writing—review and editing, M.K., E.K., and M.Ö.Ü.; visualization, E.K.; supervision, M.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy reasons.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

FMM	Finite mixture model
EM	Expectation–maximization
AIC	Akaike information criterion
BIC	Bayesian information criterion
MLE	Maximum likelihood estimation

References

1. Bouguila, N.; Fan, W. Mixture Models and Applications; Springer: Berlin/Heidelberg, Germany, 2020; Volume 530.
2. Ng, S.-K. Mixture Modelling for Medical and Health Sciences; Chapman and Hall/CRC: Boca Raton, FL, USA, 2019.
3. Yu, J.; Qin, S.J. Multiway Gaussian mixture model based multiphase batch process monitoring. Ind. Eng. Chem. Res. 2009, 48, 8585–8594.
4. Hafemeister, C.; Costa, I.G.; Schönhuth, A.; Schliep, A. Classifying short gene expression time-courses with Bayesian estimation of piecewise constant functions. Bioinformatics 2011, 27, 946–952.
5. Neema, I.; Böhning, D. Improved methods for surveying and monitoring crimes through likelihood based cluster analysis. Int. J. Criminol. Sociol. Theory 2010, 3, 477–495.
6. Chambers, S.K.; Ng, S.K.; Baade, P.; Aitken, J.F.; Hyde, M.K.; Wittert, G.; Frydenberg, M.; Dunn, J. Trajectories of quality of life, life satisfaction, and psychological adjustment after prostate cancer. Psycho-Oncology 2017, 26, 1576–1585.
7. Wessman, J.; Paunio, T.; Tuulio-Henriksson, A.; Koivisto, M.; Partonen, T.; Suvisaari, J.; Turunen, J.A.; Wedenoja, J.; Hennah, W.; Pietiläinen, O.P. Mixture model clustering of phenotype features reveals evidence for association of DTNBP1 to a specific subtype of schizophrenia. Biol. Psychiatry 2009, 66, 990–996.
8. Palardy, G.J.; Vermunt, J.K. Multilevel growth mixture models for classifying groups. J. Educ. Behav. Stat. 2010, 35, 532–565.
9. ElMoaqet, H.; Kim, J.; Tilbury, D.; Ramachandran, S.K.; Ryalat, M.; Chu, C.-H. Gaussian mixture models for detecting sleep apnea events using single oronasal airflow record. Appl. Sci. 2020, 10, 7889.
10. Fahey, M.T.; Ferrari, P.; Slimani, N.; Vermunt, J.K.; White, I.R.; Hoffmann, K.; Wirfält, E.; Bamia, C.; Touvier, M.; Linseisen, J. Identifying dietary patterns using a normal mixture model: Application to the EPIC study. J. Epidemiol. Community Health 2012, 66, 89–94.
11. Ji, Z.; Xia, Y.; Sun, Q.; Chen, Q.; Xia, D.; Feng, D.D. Fuzzy local Gaussian mixture model for brain MR image segmentation. IEEE Trans. Inf. Technol. Biomed. 2012, 16, 339–347.
12. Tentoni, S.; Astolfi, P.; De Pasquale, A.; Zonta, L.A. Birthweight by gestational age in preterm babies according to a Gaussian mixture model. BJOG Int. J. Obstet. Gynaecol. 2004, 111, 31–37.
13. Sun, T.; Wen, Y.; Zhang, X.; Jia, B.; Zhou, M. Gaussian mixture model for marine reverberations. Appl. Sci. 2023, 13, 12063.
14. Krifa, M. Fiber length distribution in cotton processing: A finite mixture distribution model. Text. Res. J. 2008, 78, 688–698.
15. Cui, X.; Rodgers, J.; Cai, Y.; Li, L.; Belmasrour, R.; Pang, S.-S. Obtaining cotton fiber length distributions from the beard test method Part 1-Theoretical distributions related to the beard method. J. Cotton Sci. 2009, 13, 265–273.
16. Belmasrour, R.; Li, L.; Cui, X.; Cai, Y.; Rodgers, J. Obtaining cotton fiber length distributions from the beard test method. Part 1—A new approach through PLS Regression. J. Cotton Sci. 2011, 15, 73–79.
17. Lin, Q.; Xing, M.; Oxenham, W.; Yu, C. Generation of cotton fiber length probability density function with length measures. J. Text. Inst. 2012, 103, 225–230.
18. Kuang, X.; Hu, Y.; Yang, J.; Yu, C. Application of finite mixture model in cotton fiber length probability distribution. J. Text. Inst. 2015, 106, 146–151.
19. Repon, M.R.; Al Mamun, R.; Reza, S.; Das, M.K.; Islam, T. Effect of spinning parameters on thick, thin places and neps of rotor spun yarn. J. Text. Sci. Technol. 2016, 2, 47–55.
20. Mwasıagı, J.; Mırembel, J. Influence of Spinning Parameters on thin and thick Places of rotor spun yarns. Int. J. Comput. Exp. Sci. Eng. 2018, 4, 1–7.
21. Ozcelik, G.; Kirtay, E.; Derici, I.M. Examination of fiber neps count during yarn manufacturing. In Proceedings of the Beltwide Cotton Conference, New Orleans, LA, USA, 4–7 January 2005; pp. 2229–2235.
22. Frydrych, I.; Matusiak, M. Predicting the Nep Number in Cotton Yarn—Determining the Critical Nep Size. Text. Res. J. 2002, 72, 917–923.
23. Nath, J.M.; Shukla, S.; Patil, P. Neps in cotton. J. Cotton Res. Dev. 2015, 29, 329–332.
24. Habib, A.; Olgun, Y.; Babaarslan, O. Development of dual-core spun yarn using different filaments as a core and its impact on denim fabric properties. Text. Leather Rev. 2024, 7, 534–549.
25. Kebabcı, M.; Babaarslan, O.; Hacıoğulları, S.Ö.; Telli, A. The effect of drawing ratio and cross-sectional shapes on the properties of polypropylene CF and BCF yarns. Tekst. Mühendis 2015, 22, 46–53.
26. Lemmi, T.S.; Barburski, M.; Kabziński, A.; Frukacz, K. Effect of thermal aging on the mechanical properties of high tenacity polyester yarn. Materials 2021, 14, 1666.
27. Li, Z.; Meng, C.; Zhou, J.; Li, Z.; Ding, J.; Liu, F.; Yu, C. Characterization and control of oxidized cellulose in ramie fibers during oxidative degumming. Text. Res. J. 2017, 87, 1828–1840.
28. Liu, Y.; Thibodeaux, D.; Gamble, G.; Rodgers, J. Preliminary study of relating cotton fiber tenacity and elongation with crystallinity. Text. Res. J. 2014, 84, 1829–1839.
29. Široká, B.; Manian, A.P.; Noisternig, M.F.; Henniges, U.; Kostic, M.; Potthast, A.; Griesser, U.J.; Bechtold, T. Wash–dry cycle induced changes in low-ordered parts of regenerated cellulosic fibers. J. Appl. Polym. Sci. 2012, 126, E397–E408.
30. Upreti, M.; Jahan, S. Study the effect of scouring time on Grewia asiatica (Phalsa) fibres properties. J. Appl. Nat. Sci. 2017, 9, 1388.
31. Maiti, S.; Islam, M.R.; Uddin, M.A.; Afroj, S.; Eichhorn, S.J.; Karim, N. Sustainable fiber-reinforced composites: A review. Adv. Sustain. Syst. 2022, 6, 2200258.
32. Gorjanc, D.S.; Bizjak, M. The influence of constructional parameters on deformability of elastic cotton fabrics. J. Eng. Fibers Fabr. 2014, 9, 155892501400900106.
33. Dempster, A.P.; Laird, N.M.; Rubin, D.B. Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B Methodol. 1977, 39, 1–22.
34. Ghojogh, B.; Ghojogh, A.; Crowley, M.; Karray, F. Fitting a mixture distribution to data: Tutorial. arXiv 2019, arXiv:1901.06708.
35. Almhana, J.; Liu, Z.; Choulakian, V.; McGorman, R. A recursive algorithm for gamma mixture models. In Proceedings of the 2006 IEEE International Conference on Communications, Istanbul, Turkey, 11–15 June 2006; pp. 197–202.
36. Liu, Z.; Almhana, J.; Choulakian, V.; McGorman, R. Traffic modeling with gamma mixtures and dynamical bandwidth provisioning. In Proceedings of the 4th Annual Communication Networks and Services Research Conference (CNSR’06), Moncton, NB, Canada, 24–25 May 2006; pp. 8–130.
37. Nachar, N. The Mann-Whitney U: A test for assessing whether two independent samples come from the same distribution. Tutor. Quant. Methods Psychol. 2008, 4, 13–20.

Figure 1. Histograms of thin and thick places.

Figure 2. Histograms of neps (140%, 200%).

Figure 3. Thin places: histogram of observed data and fitted Poisson mixture data.

Figure 4. Thick places: histogram of observed data and fitted Poisson mixture data.

Figure 5. Neps (140%): histogram of observed data and fitted Poisson mixture data.

Figure 6. Neps (200%): histogram of observed data and fitted Poisson mixture data.

Figure 7. Histogram of elongation (%).

Figure 8. Histogram of tensile strength (cN/tex).

Figure 9. Tensile strength observed data and fitted gamma mixture data.

Figure 10. Elongation observed data and fitted gamma mixture data.

Table 1. EM algorithm for the Poisson mixture distribution.

i. Start with initial parameter values ${\hat{λ}}_{k}^{(r)}, {\hat{w}}_{k}^{(r)}$ $(r = 0)$, $k = 1, \dots, K$, and iterate the following steps until convergence.
ii. E-step: Find the classification probability that the observation $x_{i}$ comes from the $k^{\text{th}}$ Poisson distribution based on the following estimate:
$p_{i k}^{(r + 1)} = \frac{w_{k}^{(r)} g_{k} (x_{i}; λ_{k}^{(r)})}{\sum_{v = 1}^{K} w_{v}^{(r)} g_{v} (x_{i}; λ_{v}^{(r)})}$	(13)
where $i = 1, \dots, n$ and $k = 1, \dots, K$
iii. M-step: Update the Poisson mixture distribution parameters as follows:
$λ_{k}^{(r + 1)} = \arg \max_{λ_{k}} \sum_{i = 1}^{n} p_{i k}^{(r + 1)} \log g_{k} (x_{i}; λ_{k})$	(14)
$w_{k}^{(r + 1)} = \frac{\sum_{i = 1}^{n} p_{i k}^{(r + 1)}}{n}$	(15)
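The steps of Table 1 can be sketched in plain Python as follows. This is an illustration on simulated data, not the authors' implementation (the paper reports using the Flexmix package in R). For a Poisson component, the maximization in Equation (14) has the closed-form solution of a responsibility-weighted mean, which is what the code uses. The function names, starting values, simulated rates, and convergence tolerance are all hypothetical.

```python
import math
import random


def poisson_logpmf(x, lam):
    return x * math.log(lam) - lam - math.lgamma(x + 1)


def em_poisson_mixture(data, lam, w, max_iter=500, tol=1e-8):
    """EM for a K-component Poisson mixture; returns rates, weights, log-likelihood."""
    lam, w = list(lam), list(w)
    n, n_components = len(data), len(lam)
    prev_ll = -math.inf
    ll = prev_ll
    for _ in range(max_iter):
        # E-step, Equation (13): posterior probability of each component.
        resp, ll = [], 0.0
        for x in data:
            joint = [w[k] * math.exp(poisson_logpmf(x, lam[k]))
                     for k in range(n_components)]
            total = sum(joint)
            ll += math.log(total)   # log-likelihood at the current parameters
            resp.append([j / total for j in joint])
        # M-step, Equations (14) and (15): weighted mean and average responsibility.
        for k in range(n_components):
            nk = sum(r[k] for r in resp)
            lam[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            w[k] = nk / n
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return lam, w, ll


def sample_poisson(lam, rng):
    """Knuth's multiplication method for drawing one Poisson variate."""
    limit, count, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


# Simulated data: 70% from Poisson(5), 30% from Poisson(30) (hypothetical values).
rng = random.Random(0)
data = [sample_poisson(5.0 if rng.random() < 0.7 else 30.0, rng) for _ in range(500)]

lam_hat, w_hat, ll_hat = em_poisson_mixture(data, lam=[3.0, 20.0], w=[0.5, 0.5])
print(lam_hat, w_hat, ll_hat)
```

On well-separated simulated components such as these, the estimates should land close to the generating values; real data with overlapping components are more sensitive to the starting values, so several initializations are usually worth trying.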

Table 2. EM algorithm for the gamma mixture distribution.

i. Start with initial parameter values $α_{k}^{(r)}, β_{k}^{(r)}, π_{k}^{(r)}$ $(r = 0)$, $k = 1, \dots, K$, and iterate the following steps until convergence:
ii. E-step: Find the classification probability that the observation $x_{i}$ comes from the $k^{\text{th}}$ gamma distribution based on the following estimate:
$p_{i k}^{(r + 1)} = \frac{π_{k}^{(r)} g_{k} (x_{i}; α_{k}^{(r)}, β_{k}^{(r)})}{\sum_{v = 1}^{K} π_{v}^{(r)} g_{v} (x_{i}; α_{v}^{(r)}, β_{v}^{(r)})}$	(43)
where $i = 1, \dots, n$ and $k = 1, \dots, K$
iii. M-step: Update the gamma mixture distribution parameters as follows:
$β_{k}^{(r + 1)} = \frac{\sum_{i = 1}^{n} x_{i} p_{i k}^{(r + 1)}}{α_{k}^{(r)} \sum_{i = 1}^{n} p_{i k}^{(r + 1)}}$	(44)
$α_{k}^{(r + 1)} = α_{k}^{(r)} + ε_{k} G_{α_{k}} (X, θ^{(r)})$	(45)
$π_{k}^{(r + 1)} = \frac{\sum_{i = 1}^{n} p_{i k}^{(r + 1)}}{n}$	(46)

Table 3. Poisson mixture distribution parameters for each dataset.

Data	k	Proportions			Poisson Mixture Distribution Parameters			Selection Criteria
Data	k	$w_{1}$	$w_{2}$	$w_{3}$	$λ_{1}$	$λ_{2}$	$λ_{3}$	AIC	BIC	LL
Thin 40%	2	0.79	0.21	-	11.64	35.68	-	6934.28	6948.21	−3464.14
Thick 50%	3	0.073	0.21	0.72	25.76	11.80	4.42	4466.874	4490.099	4634.936
Neps 140%	3	0.71	0.23	0.06	19.31	45.89	107.06	7152.014	7175.239	7224.290
Neps 200%	2	0.15	0.85	-	30.80	8.25	-	5508.402	5522.338	5550.933

Table 4. Mean and standard deviation of yarn imperfection parameters: observed data vs. Poisson mixture model data.

Quality Parameters	Observed Mean	Observed Standard Dev.	PMM Mean	PMM Standard Dev.
Thin Places (40%)	16.63	14.83	16.68	10.60
Thick Places (50%)	7.52	6.67	7.54	6.51
Neps (140%)	29.70	23.61	30.68	22.92
Neps (200%)	11.55	9.61	11.63	8.74

Table 5. Gamma mixture distribution parameters for each dataset.

Data	k	Proportions		Gamma Mixture Distribution Parameters		Selection Criteria
Data	k	$π_{1}$	$π_{2}$	$(α_{1}, β_{1})$	$(α_{2}, β_{2})$	AIC	BIC	LL
Elongation	2	0.33	0.67	(320, 0.04)	(110, 0.146)	3121.86	3145.06	−1555.93
Tensile Str.	2	0.52	0.48	(120, 0.17)	(280, 0.09)	3825.70	3848.90	−1907.85

Table 6. Mean and standard deviation of elongation and tensile strength: observed data vs. gamma mixture model data.

Quality Parameters	Observed Mean	Observed Standard Dev.	GMM Mean	GMM Standard Dev.
Elongation	14.97	2.03	14.98	2.01
Tensile strength	22.89	3.25	22.86	3.35

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Karakaş, E.; Koyuncu, M.; Ükelge, M.Ö. Finite Mixture Model-Based Analysis of Yarn Quality Parameters. Appl. Sci. 2025, 15, 6407. https://doi.org/10.3390/app15126407

