MDPI - Publisher of Open Access Journals

20 pages, 3978 KiB

Open Access Article

Cotton-YOLO: A Lightweight Detection Model for Falled Cotton Impurities Based on Yolov8

by Jie Li, Zhoufan Zhong, Youran Han and Xinhou Wang

Symmetry 2025, 17(8), 1185; https://doi.org/10.3390/sym17081185 (registering DOI) - 24 Jul 2025

As an important pillar of the global economic system, the cotton industry faces critical challenges from non-fibrous impurities (e.g., leaves and debris) during processing, which severely degrade product quality, inflate costs, and reduce efficiency. Traditional detection methods suffer from insufficient accuracy and low efficiency, failing to meet practical production needs. While deep learning models excel in general object detection, their massive parameter counts render them ill-suited for real-time industrial applications. To address these issues, this study proposes Cotton-YOLO, an optimized YOLOv8 model. By leveraging principles of symmetry in model design and system setup, the study integrates the CBAM attention module, with its inherent dual-path (channel-spatial) symmetry, to enhance feature capture for tiny impurities and mitigate insufficient focus on key areas. The C2f_DSConv module, exploiting functional equivalence via quantization and shift operations, reduces model complexity by 12% (to 2.71 million parameters) without sacrificing accuracy. Considering angle and shape variations in complex scenarios, the loss function is upgraded to Wise-IoU for more accurate bounding box regression. Experimental results show that Cotton-YOLO achieves 86.5% precision, 80.7% recall, 89.6% mAP50, 50.1% mAP50–95, and a detection speed of 50.51 fps, a 3.5% speed increase over the original YOLOv8. This work demonstrates the effective application of symmetry concepts (in algorithmic structure and performance balance) to create a model that balances lightweight design and high efficiency, providing a practical solution for industrial impurity detection and key technical support for automated cotton sorting systems.
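As background for the bounding box regression mentioned above, the following is a minimal sketch of plain intersection over union (IoU), the overlap measure that IoU-family losses build on. It is not the Wise-IoU loss from the article; the function name `iou` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumes x1 < x2 and y1 < y2 for each box (hypothetical layout).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region; zero if the boxes do not overlap.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1) = 1/7.
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

A loss of the form `1 - iou(pred, target)` is the simplest use of this quantity; weighted variants such as Wise-IoU modify how that term is scaled.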

(This article belongs to the Section Computer)

17 pages, 343 KiB

Open Access Article

On the Conflation of Poisson and Logarithmic Distributions with Applications

by Abdulhamid A. Alzaid, Anfal A. Alqefari and Najla Qarmalah

Axioms 2025, 14(7), 518; https://doi.org/10.3390/axioms14070518 - 6 Jul 2025

Viewed by 203

Real-life count data frequently show inflation at low values; however, most of the well-known count distributions cannot capture this feature. The present paper introduces a new distribution for modeling count data inflated at small values, based on a conflation of distributions approach. The new distribution inherits some properties from the Poisson distribution (PD) and the logarithmic distribution (LD), making it a powerful modeling tool. It can serve as an alternative to PD, LD, and zero-truncated distributions. The new distribution is worth considering theoretically, as it belongs to the weighted PD family. With zero as a support point, two additional models are suggested for the new distribution. These modifications yield distributions that exhibit overdispersion comparable to the negative binomial distribution (NBD) while retaining essential PD properties, making them suitable for accurately representing count data with frequent events of low frequency and high variance. Furthermore, we discuss the superior performance of the three new distributions in modeling real count data compared to traditional count distributions such as PD and NBD, as well as other discrete distributions. This paper examines the key statistical properties of the proposed distributions. A comparison of the novel distributions with others in the literature is shown using real-life data from several domains. All of the computations shown in this study are generated using the R programming language.
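To make the conflation idea concrete, here is a minimal sketch assuming the usual definition of conflation as the normalized product of the component probability mass functions. Under that assumption the result is proportional to (λθ)^x / (x · x!) on x = 1, 2, ..., which is a Poisson weighted by 1/x. The function names, the parameter names `lam` and `theta`, and the truncation point `x_max` are hypothetical, and the article's own parameterization may differ. The article's computations use R; Python is used here only for illustration.

```python
import math


def poisson_pmf(x, lam):
    """Poisson probability mass at x >= 0."""
    return math.exp(-lam) * lam**x / math.factorial(x)


def logarithmic_pmf(x, theta):
    """Logarithmic (log-series) probability mass at x >= 1, 0 < theta < 1."""
    return -(theta**x) / (x * math.log(1.0 - theta))


def conflation_pmf(lam, theta, x_max=60):
    """Normalized product of the two pmfs on x = 1..x_max.

    The infinite normalizing sum is truncated at x_max (an assumption);
    the terms decay factorially, so the truncation error is negligible
    for moderate lam.
    """
    support = range(1, x_max + 1)
    weights = [poisson_pmf(x, lam) * logarithmic_pmf(x, theta) for x in support]
    total = sum(weights)
    return {x: w / total for x, w in zip(support, weights)}


pmf = conflation_pmf(lam=2.0, theta=0.5)
```

With these illustrative values the ratio `pmf[2] / pmf[1]` equals λθ/4 = 0.25, which shows how the product shifts mass toward the smallest counts.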

(This article belongs to the Special Issue Advances in the Theory and Applications of Statistical Distributions)

24 pages, 347 KiB

Open Access Article

Estimating the Ratio of Means in a Zero-Inflated Poisson Mixture Model

by Michael Pearce and Michael D. Perlman

Stats 2025, 8(3), 55; https://doi.org/10.3390/stats8030055 - 5 Jul 2025

Viewed by 133

The problem of estimating the ratio of the means of a two-component Poisson mixture model is considered, when each component is subject to zero-inflation, i.e., excess zero counts. The resulting zero-inflated Poisson mixture (ZIPM) model can be viewed as a three-component Poisson mixture model with one degenerate component. The EM algorithm is applied to obtain frequentist estimators and their standard errors, the latter determined via an explicit expression for the observed information matrix. As an intermediate step, we derive an explicit expression for standard errors in the two-component Poisson mixture model (without zero-inflation), a new result. The ZIPM model is applied to simulated data and real ecological count data of frigatebirds on the Coral Sea Islands off the coast of Northeast Australia.
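The three-component view described above can be sketched as follows. This is an illustration under assumed notation, not the article's estimator: `w0` is the weight of the degenerate component at zero, `w1` the weight of the first Poisson component, and `lam1`, `lam2` the two Poisson means; all of these names are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability mass at k >= 0."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def zipm_pmf(k, w0, w1, lam1, lam2):
    """Mass at k for a mixture of a point mass at zero and two Poissons.

    Weights: w0 (point mass at 0), w1 (Poisson lam1), 1 - w0 - w1 (Poisson lam2).
    """
    w2 = 1.0 - w0 - w1
    point_mass = w0 if k == 0 else 0.0
    return point_mass + w1 * poisson_pmf(k, lam1) + w2 * poisson_pmf(k, lam2)


def ratio_of_means(lam1, lam2):
    """The quantity of interest in this sketch: ratio of the two Poisson means."""
    return lam1 / lam2


# Illustrative values only.
probs = [zipm_pmf(k, w0=0.3, w1=0.4, lam1=2.0, lam2=6.0) for k in range(81)]
```

The extra mass at zero comes entirely from `w0`, so zeros are more frequent than the two Poisson components alone would produce.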

14 pages, 281 KiB

Open Access Article

Leading Logarithm Quantum Gravity

by S. P. Miao, N. C. Tsamis and R. P. Woodard

Universe 2025, 11(7), 223; https://doi.org/10.3390/universe11070223 - 4 Jul 2025

Viewed by 169

The continual production of long wavelength gravitons during primordial inflation endows graviton loop corrections with secular growth factors. During a prolonged period of inflation, these factors eventually overwhelm the small loop-counting parameter of $G H^{2}$, causing perturbation theory to break down. A technique was recently developed for summing the leading secular effects at each order in non-linear sigma models, which possess the same kind of derivative interactions as gravity. This technique combines a variant of Starobinsky’s stochastic formalism with a variant of the renormalization group. Generalizing the technique to quantum gravity is a two-step process, the first of which is the determination of the gauge fixing condition that will allow this summation to be realized; this is the subject of this paper. Moreover, we briefly discuss the second step, which shall obtain the Langevin equation, in which secular changes in gravitational phenomena are driven by stochastic fluctuations of the graviton field.

(This article belongs to the Special Issue Loop Quantum Gravity and Non-Perturbative Approaches to Quantum Cosmology, Second Edition)

20 pages, 3461 KiB

Open Access Article

Effect of Elevated Temperature on Physical Activity and Falls in Low-Income Older Adults Using Zero-Inflated Poisson and Graphical Models

by Tho Nguyen, Dahee Kim, Yingru Li, Christopher T. Emrich, Jennifer Crook, Ladda Thiamwong and Rui Xie

Information 2025, 16(6), 442; https://doi.org/10.3390/info16060442 - 26 May 2025

Viewed by 353

High ambient temperature poses a significant public health challenge, particularly for low-income older adults (LOAs) with preexisting health and social issues and disproportionate living conditions, leaving them vulnerable to heat-related illnesses and associated public health risks. This study aims to utilize advanced statistical regression and machine learning methods to analyze complex relationships between elevated temperature, physical activity (PA), sociodemographic factors and fall incidents among LOAs. We collected data from a cohort of 304 LOAs aged 60 and above, living in free-living conditions in low-income communities in Central Florida, USA. Zero-inflated Poisson regression was employed to examine the linear relationships, which reflect the zero-abundant nature of fall incidents. Then, an advanced machine learning approach, the mixed undirected graphical model (MUGM), was employed to further explore the intricate, nonlinear relationships among daily PA, daily temperature, and fall incidents. The findings suggest that more moderate-to-vigorous PA is significantly associated with fewer fall incidents (RR = 0.90, 95% CI: (0.816, 0.993), p = 0.037), after adjusting for other variables. In contrast, elevated temperature is strongly linked to a greater risk of falls (RR = 1.733, 95% CI: (1.581, 1.901), p < 0.0001), potentially reflecting seasonal influences. Although higher temperature increases fall events, this effect is mitigated among LOAs with increased sedentary behavior (p < 0.0001). Additionally, findings from the MUGM reinforce the intricate nature of falls. Fall counts were highly correlated with race and positively associated with temperature, highlighting the importance of tailoring fall prevention strategies to account for seasonal variations and health disparities, and promoting PA.

(This article belongs to the Special Issue Machine Learning and Artificial Intelligence with Applications)

27 pages, 463 KiB

Open Access Feature Paper Article

An Optional Semimartingales Approach to Risk Theory

by Mahdieh Aminian Shahrokhabadi, Alexander Melnikov and Andrey Pak

Risks 2025, 13(4), 61; https://doi.org/10.3390/risks13040061 - 21 Mar 2025

Viewed by 671

This paper aims to develop optional semimartingale methods in risk theory to allow for a larger class of risk models. Optional semimartingales are left-continuous with right-limit stochastic processes defined on a probability space where the usual conditions (completeness and right-continuity of the filtration) are not assumed. Three risk models are formulated, accounting for inflation, interest rates, and claim occurrences. The first model extends the martingale approach to calculate ruin probabilities, the second employs the Gerber–Shiu function to evaluate the expected discounted penalty from financial oscillations or jumps, and the third introduces a Gaussian risk model using counting processes to capture premium and claim cash flow jumps in insurance companies.

(This article belongs to the Special Issue Advancements in Actuarial Mathematics and Insurance Risk Management)

17 pages, 904 KiB

Open Access Article

Knowledge About HPV and the HPV Vaccine: Observational Study on a Convenience Sample of Adolescents from Select Schools in Three Regions in Italy

by Laura Brunelli, Francesca Valent, Manola Comar, Barbara Suligoi, Maria Cristina Salfa, Daniele Gianfrilli, Franz Sesti, Giuseppina Capra, Alessandra Casuccio, Erik De Luca, Emily Bertola, Silvia Gazzetta, Lorenza Driul, Andrea Isidori, Patrizia Ferro, Nicolò Piazza, Palmira Immordino, Teresa Fasciana and Vincenzo Restivo

Vaccines 2025, 13(3), 227; https://doi.org/10.3390/vaccines13030227 - 24 Feb 2025

Cited by 1 | Viewed by 1974

Background/Objectives: HPV is the most common sexually transmitted infectious agent worldwide and adolescents are at high risk of contracting HPV. The aim of our study was to find out how much adolescents know about the virus and its effects, and to obtain information on attitudes and behaviors regarding HPV vaccination to close these gaps. Methods: As part of the ESPRIT project, 598 lower secondary (11–14 years) and upper secondary (14–19 years) school students from three Italian regions were surveyed between December 2023 and March 2024 using a seven-question online questionnaire on awareness, knowledge, and attitudes about HPV and the HPV vaccine. Count and zero-inflation models were used to determine correlations between sexes, urban/suburban, province of residence, and school type with knowledge. Results: Lower secondary students believed that HPV causes HIV/AIDS (8.9%) or hepatitis C (3.0%) and rarely mentioned anal (21%) and oral sex (9.6%) as ways of transmission. Among upper secondary students, misconceptions were similar, with worrying rates of students stating that HPV only causes cancer in females (18%) or males (2.4%), and low rates of identifying transmission risk through anal (41%) and oral (34%) sex and genital contact (38%). The HPV vaccination rate was quite low (47% in lower secondary students, 61% in upper secondary students). In the regressions, sex, urban/suburban area, and province were the variables associated with higher levels of knowledge for lower secondary students; for upper secondary students, level of knowledge was associated with sex, urban/suburban area, school type, and province of residence. Conclusions: Awareness and knowledge of HPV and the HPV vaccine are low among Italian students in this study and reported vaccination coverage is below the national target. Coordinated efforts at the national level are needed to address this public health issue.

(This article belongs to the Special Issue HPV Vaccination Coverage: Problems and Challenges)

16 pages, 624 KiB

Open Access Article

Factors Influencing Frequency of Depressive Experiences Among Married Working Women in South Korea

by Se Hui Jeong, Chan Mi Kang and Kyung Im Kang

Healthcare 2025, 13(5), 453; https://doi.org/10.3390/healthcare13050453 - 20 Feb 2025

Viewed by 1013

Background/Objectives: This study aimed to identify the factors influencing and predicting the frequency of depressive experiences among married working women in South Korea in the post-COVID-19 period (2022–2023). It examines how alterations in circumstances and the complex difficulties encountered by this demographic group may have shaped their depressive experiences. Through a comparative analysis of the group reporting depressive experiences and the group reporting no depressive experiences, the study delineates the factors influencing depressive experiences within the former group and the predictive factors within the latter group. The findings offer a comprehensive understanding of the factors that may contribute to mental health outcomes within this population. Methods: This study utilized data from the ninth wave (2022–2023) of the Korean Longitudinal Survey of Women and Families, conducted by the Korean Women’s Development Institute. The study included a total of 1735 participants. A zero-inflated negative binomial regression model was applied to analyze the frequency of depressive experiences and the influencing and predictive factors. Results: Among the participants, 38.9% reported no depressive experiences. The count model analysis revealed that subjective health status, physical activity, thoughts about husband, family decision-making, and work–family balance were significant factors associated with the frequency of depressive experiences. In the logistic model, key predictors for those without depression included the spouse’s education, physical activity, satisfaction with the spouse’s housework, and happiness with marital life. Conclusions: These findings provide essential empirical evidence for the development of targeted policies and interventions aimed at mitigating and preventing depression among married working women.

(This article belongs to the Section Women's Health Care)

30 pages, 10797 KiB

Open Access Article

Bayesian Inference for Zero-Modified Power Series Regression Models

by Katiane S. Conceição, Marinho G. Andrade, Victor Hugo Lachos and Nalini Ravishanker

Mathematics 2025, 13(1), 60; https://doi.org/10.3390/math13010060 - 27 Dec 2024

Cited by 1 | Viewed by 968

Count data often exhibit discrepancies in the frequencies of zeros, which commonly occur across various application domains. These data may include excess zeros (zero inflation) or, less frequently, a scarcity of zeros (zero deflation). In regression models, both situations can arise at different levels of covariates. The zero-modified power series regression model provides an effective framework for modeling such count data, as it does not require prior knowledge of the type of zero modification, whether zero inflation or zero deflation, and can accommodate overdispersion, equidispersion, or underdispersion present in the data. This paper proposes a Bayesian estimation procedure based on the stochastic gradient Hamiltonian Monte Carlo algorithm, effectively addressing many challenges associated with estimating the model parameters. Additionally, we introduce a measure of Bayesian efficiency to evaluate the impact of prior information on parameter estimation. The practical utility of the proposed method is demonstrated through both simulated and real data across different types of zero modification.

(This article belongs to the Section D1: Probability and Statistics)

24 pages, 3621 KiB

Open Access Article

Improving Forest Above-Ground Biomass Estimation Accuracy Using Multi-Source Remote Sensing and Optimized Least Absolute Shrinkage and Selection Operator Variable Selection Method

by Er Wang, Tianbao Huang, Zhi Liu, Lei Bao, Binbing Guo, Zhibo Yu, Zihang Feng, Hongbin Luo and Guanglong Ou

Remote Sens. 2024, 16(23), 4497; https://doi.org/10.3390/rs16234497 - 30 Nov 2024

Cited by 7 | Viewed by 3155

Estimation of forest above-ground biomass (AGB) using multi-source remote sensing data is an important method to improve the accuracy of the estimate. However, selecting remote sensing factors that can effectively improve the accuracy of forest AGB estimation from a large amount of data is a challenge when the sample size is small. In this regard, the Least Absolute Shrinkage and Selection Operator (Lasso) has advantages for extensive redundant variables but still has some drawbacks. To address this, the study introduces two Lasso-based variable selection methods: Least Absolute Shrinkage and Selection Operator Genetic Algorithm (Lasso-GA) and Variance Inflation Factor Least Absolute Shrinkage and Selection Operator (VIF-Lasso). Sentinel 2, Sentinel 1, Landsat 8 OLI, ALOS-2 PALSAR-2, Light Detection and Ranging, and Digital Elevation Model (DEM) data were used in this study. To explore the variable selection capabilities of Lasso-GA and VIF-Lasso for remote sensing estimation of forest AGB, the study compares them with Boruta, Random Forest Importance Selection, Pearson Correlation, and Lasso for selecting remote sensing factors. Additionally, it employs eight machine learning models, namely Random Forest (RF), Extreme Gradient Boosting (XGBoost), Support Vector Machine (SVM), Bayesian Regression Neural Network (BRNN), Elastic Net (EN), K-Nearest Neighbors (KNN), Extremely Randomized Trees (ETR), and Stochastic Gradient Boosting (SGBoost), to estimate forest AGB in Wuyi Village, Zhenyuan County. The results showed that the optimized Lasso variable selection could improve the accuracy of forest biomass estimation. The VIF-Lasso method results in a BRNN model with an R² of 0.75 and an RMSE of 16.48 Mg/ha. The Lasso-GA method results in an ETR model with an R² of 0.73 and an RMSE of 16.70 Mg/ha. Compared to the optimal SGBoost model with the Lasso variable selection method (R² of 0.69, RMSE of 18.63 Mg/ha), the VIF-Lasso method improves R² by 0.06 and reduces RMSE by 2.15 Mg/ha, while the Lasso-GA method improves R² by 0.04 and reduces RMSE by 1.93 Mg/ha. From another perspective, the results also demonstrated that the RX sample count and sensitivity provided by LiDAR, as well as the Horizontal Transmit, Vertical Receive provided by Microwave Radar, along with the feature variables (Mean, Contrast, and Correlation) calculated from the Green, Red, and NIR bands of optical remote sensing in 7 × 7 and 5 × 5 windows, play an important role in forest AGB estimation. Therefore, the optimized Lasso variable selection method shows strong potential for forest AGB estimation using multi-source remote sensing data.

30 pages, 3766 KiB

Open Access Article

An Interpretable Machine Learning-Based Hurdle Model for Zero-Inflated Road Crash Frequency Data Analysis: Real-World Assessment and Validation

by Moataz Bellah Ben Khedher and Dukgeun Yun

Appl. Sci. 2024, 14(23), 10790; https://doi.org/10.3390/app142310790 - 21 Nov 2024

Cited by 3 | Viewed by 2376

Road traffic crashes pose significant economic and public health burdens, necessitating an in-depth understanding of crash causation and its links to underlying factors. This study introduces a machine learning-based hurdle model framework tailored for analyzing zero-inflated crash frequency data, addressing the limitations of traditional statistical models like the Poisson and negative binomial models, which struggle with zero-inflation and overdispersion. The research employs a two-stage modeling process using CatBoost. The first stage uses binary classification to identify road segments with potential crash occurrences, applying a customized loss function to tackle data imbalance. The second stage predicts crash frequency, also utilizing a customized loss function for count data. SHapley Additive exPlanations (SHAP) analysis interprets the model outcomes, providing insights into factors affecting crash likelihood and frequency. This study validates the model’s performance with real-world crash data from 2011 to 2015 in South Korea, demonstrating superior accuracy in both the classification and regression stages compared to other machine learning algorithms and traditional models. These findings have significant implications for traffic safety research and policymaking, offering stakeholders a more accurate and interpretable tool for crash data analysis to develop targeted safety interventions.
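The two-stage structure described above follows the general hurdle idea: one stage decides whether the count is zero or positive, and a second stage models the size of positive counts. The sketch below shows that structure with a zero-truncated Poisson as the second stage. This is an assumption chosen for illustration; the article uses CatBoost models with customized loss functions, which are not reproduced here. The names `p_pos` and `lam` are hypothetical.

```python
import math


def hurdle_poisson_pmf(k, p_pos, lam):
    """Mass at k for a Poisson hurdle model.

    Stage 1: the count is positive with probability p_pos, zero otherwise.
    Stage 2: given a positive count, k follows a zero-truncated Poisson(lam).
    """
    if k == 0:
        return 1.0 - p_pos
    poisson_k = math.exp(-lam) * lam**k / math.factorial(k)
    # Renormalize the Poisson over k >= 1 by dividing by P(K > 0).
    truncated = poisson_k / (1.0 - math.exp(-lam))
    return p_pos * truncated


def hurdle_mean(p_pos, lam):
    """Expected count: P(positive) times the zero-truncated Poisson mean."""
    return p_pos * lam / (1.0 - math.exp(-lam))


# Illustrative values only: 40% of segments have at least one crash.
probs = [hurdle_poisson_pmf(k, p_pos=0.4, lam=2.5) for k in range(81)]
```

In a machine learning version of this scheme, `p_pos` would come from a per-segment classifier and `lam` from a per-segment count regressor, so the expected count is the product of the two stage outputs.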

(This article belongs to the Section Transportation and Future Mobility)

30 pages, 3813 KiB

Open Access Feature Paper Article

Matrix Factorization and Prediction for High-Dimensional Co-Occurrence Count Data via Shared Parameter Alternating Zero Inflated Gamma Model

by Taejoon Kim and Haiyan Wang

Mathematics 2024, 12(21), 3365; https://doi.org/10.3390/math12213365 - 27 Oct 2024

Cited by 1 | Viewed by 1967

High-dimensional sparse matrix data frequently arise in various applications. A notable example is the weighted word–word co-occurrence count data, which summarizes the weighted frequency of word pairs appearing within the same context window. This type of data typically contains highly skewed non-negative values with an abundance of zeros. Another example is the co-occurrence of item–item or user–item pairs in e-commerce, which also generates high-dimensional data. The objective is to utilize these data to predict the relevance between items or users. In this paper, we assume that items or users can be represented by unknown dense vectors. The model treats the co-occurrence counts as arising from zero-inflated Gamma random variables and employs cosine similarity between the unknown vectors to summarize item–item relevance. The unknown values are estimated using the shared parameter alternating zero-inflated Gamma regression models (SA-ZIG). Both canonical link and log link models are considered. Two parameter updating schemes are proposed, along with an algorithm to estimate the unknown parameters. Convergence analysis is presented analytically. Numerical studies demonstrate that the SA-ZIG using Fisher scoring without learning rate adjustment may fail to find the maximum likelihood estimate. However, the SA-ZIG with learning rate adjustment performs satisfactorily in our simulation studies.

(This article belongs to the Special Issue Statistics for High-Dimensional Data)

27 pages, 436 KiB

Open Access Article

On the Conflation of Negative Binomial and Logarithmic Distributions

by Anfal A. Alqefari, Abdulhamid A. Alzaid and Najla Qarmalah

Axioms 2024, 13(10), 707; https://doi.org/10.3390/axioms13100707 - 13 Oct 2024

Cited by 1 | Viewed by 1075

In recent decades, the study of discrete distributions has received increasing attention in the field of statistics, mainly because discrete distributions can model a wide range of count data. One common distribution used for modeling count data, for instance, is the negative binomial distribution (NBD), which performs well with over-dispersed data. In this paper, a new count distribution is introduced, called the conflation of negative binomial and logarithmic distributions, which is formed by conflating the negative binomial and logarithmic distributions, resulting in a distribution that possesses some of the properties of both. The distribution has two parameters and is defined on the positive integers. Two modifications of the distribution are proposed, which include zero as a support point. The new distribution is valuable from a theoretical perspective since it is a member of the weighted negative binomial distribution family. In addition, the distribution differs from the NBD in the sense that the probability of lower counts is inflated. This study discusses the characteristics of the proposed distribution and its modified versions, such as moments, probability generating functions, likelihood stochastic ordering, log-concavity, and unimodality properties. Real-world data are used to evaluate the performance of the proposed models against other models. All computations shown in this paper were produced using the R programming language.

(This article belongs to the Special Issue Probability, Statistics and Estimations, 2nd Edition)

24 pages, 3848 KiB

Open Access Article

Analysis of Effects on Scientific Impact Indicators Based on Coevolution of Coauthorship and Citation Networks

by Haobai Xue

Information 2024, 15(10), 597; https://doi.org/10.3390/info15100597 - 30 Sep 2024

Cited by 1 | Viewed by 1110

This study investigates the coevolution of coauthorship and citation networks and their influence on scientific metrics such as the h-index and journal impact factors. Using a preferential attachment mechanism, we developed a model that integrated these networks and validated it with data from the American Physical Society (APS). While the correlations between reference counts, paper lifetime, and team sizes with scientific impact metrics are well-known, our findings demonstrate how these relationships vary depending on specific model parameters. For instance, increasing reference counts or reducing paper lifetime significantly boosts both journal impact factors and h-indexes, while expanding team sizes without adding new authors can artificially inflate h-indexes. These results highlight potential vulnerabilities in commonly used metrics and emphasize the value of modeling and simulation for improving bibliometric evaluations.
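For readers unfamiliar with the metric discussed above, the h-index of an author is the largest h such that the author has h papers with at least h citations each. A minimal sketch of that standard definition follows; the function name `h_index` and the example citation counts are illustrative and not taken from the article.

```python
def h_index(citations):
    """Largest h such that at least h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have >= rank citations
        else:
            break
    return h


# Five papers; the top four each have at least four citations.
example = h_index([10, 8, 5, 4, 3])
```

Because the h-index depends only on per-paper citation counts, anything that raises those counts across many of an author's papers (longer reference lists, more coauthored papers per author) can raise it without any change in the underlying work.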

(This article belongs to the Special Issue Advances in Data and Network Sciences Applied to Computational Social Science)

16 pages, 860 KiB

Open Access Article

Robust Negative Binomial Regression via the Kibria–Lukman Strategy: Methodology and Application

by Adewale F. Lukman, Olayan Albalawi, Mohammad Arashi, Jeza Allohibi, Abdulmajeed Atiah Alharbi and Rasha A. Farghali

Mathematics 2024, 12(18), 2929; https://doi.org/10.3390/math12182929 - 20 Sep 2024

Cited by 3 | Viewed by 1843

Count regression models, particularly negative binomial regression (NBR), are widely used in various fields, including biometrics, ecology, and insurance. Over-dispersion is likely when dealing with count data, and NBR has gained attention as an effective tool to address this challenge. However, multicollinearity among covariates and the presence of outliers can lead to inflated confidence intervals and inaccurate predictions in the model. This study proposes a comprehensive approach integrating robust and regularization techniques to handle the simultaneous impact of multicollinearity and outliers in the negative binomial regression model (NBRM). We investigate the estimators’ performance through extensive simulation studies and provide analytical comparisons. The simulation results and the theoretical comparisons demonstrate the superiority of the proposed robust hybrid KL estimator (M-NBKLE) with predictive accuracy and stability when multicollinearity and outliers exist. We illustrate the application of our methodology by analyzing a forestry dataset. Our findings complement and reinforce the simulation and theoretical results.

(This article belongs to the Special Issue Application of Regression Models, Analysis and Bayesian Statistics)

Search Results (72)
