High-Dimensional Data Analysis and Applications

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "D1: Probability and Statistics".

Deadline for manuscript submissions: 30 September 2025 | Viewed by 1792

Special Issue Editors


Guest Editor
School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 639798, Singapore
Interests: high-dimensional data analysis; subgroup detection; biostatistics; infectious disease modelling; health economic and policy modelling
Department of Biostatistics, City University of Hong Kong, Hong Kong, China
Interests: survival analysis; statistical machine learning; network model; model selection; empirical likelihood; statistical genetics

Special Issue Information

Dear Colleagues,

We are excited to announce a Special Issue dedicated to "High-Dimensional Data Analysis and Applications", inviting researchers and practitioners to contribute their innovative work to this rapidly evolving field. This Special Issue seeks to highlight cutting-edge methodologies and applications addressing the challenges and opportunities presented by high-dimensional data. We encourage submissions that explore novel statistical techniques, advanced machine learning approaches, and real-world applications across diverse domains. By sharing your research, you will contribute to advancing knowledge and fostering discussions on state-of-the-art solutions in high-dimensional data analysis. Join us in shaping the future of this dynamic field and submit your work for consideration in this impactful Special Issue.

Dr. Mu Yue
Dr. Jinfeng Xu
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • high-dimensional data analysis
  • dimensionality reduction
  • statistical learning
  • machine learning
  • predictive modeling
  • big data applications

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad-scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (3 papers)


Research

25 pages, 866 KiB  
Article
Hybrid Deep Neural Network with Domain Knowledge for Text Sentiment Analysis
by Jawad Khan, Niaz Ahmad, Youngmoon Lee, Shah Khalid and Dildar Hussain
Mathematics 2025, 13(9), 1456; https://doi.org/10.3390/math13091456 - 29 Apr 2025
Viewed by 324
Abstract
Sentiment analysis (SA) analyzes online data to uncover insights for better decision-making. Conventional text SA techniques are effective and easy to understand but encounter difficulties when handling sparse data. Deep Neural Networks (DNNs) excel in handling data sparsity but face challenges with high-dimensional, noisy data. Incorporating rich domain semantic and sentiment knowledge is crucial for advancing sentiment analysis. To address these challenges, we propose an innovative hybrid sentiment analysis approach that combines established DNN models like RoBERTa and BiGRU with an attention mechanism, alongside traditional feature engineering and dimensionality reduction through PCA. This leverages the strengths of both techniques: DNNs handle complex semantics and dynamic features, while conventional methods shine in interpretability and efficient sentiment extraction. This complementary combination fosters a robust and accurate sentiment analysis model. Our model is evaluated on four widely used real-world benchmark text sentiment analysis datasets: MR, CR, IMDB, and SemEval 2013. The proposed hybrid model achieved impressive results on these datasets. These findings highlight the effectiveness of this approach for text sentiment analysis tasks, demonstrating its ability to improve sentiment analysis performance compared to previously proposed methods.
(This article belongs to the Special Issue High-Dimensional Data Analysis and Applications)

16 pages, 440 KiB  
Article
Sparse Boosting for Additive Spatial Autoregressive Model with High Dimensionality
by Mu Yue and Jingxin Xi
Mathematics 2025, 13(5), 757; https://doi.org/10.3390/math13050757 - 25 Feb 2025
Viewed by 401
Abstract
Variable selection methods have been a focus in the econometrics and statistics literature. In this paper, we consider an additive spatial autoregressive model with high-dimensional covariates. Instead of adopting the traditional regularization approaches, we offer a novel multi-step sparse boosting algorithm to conduct model-based prediction and variable selection. One main advantage of this new method is that we do not need to perform the time-consuming selection of tuning parameters. Extensive numerical examples illustrate the advantage of the proposed methodology. An application to Boston housing price data is further provided to demonstrate the proposed methodology.
(This article belongs to the Special Issue High-Dimensional Data Analysis and Applications)

40 pages, 5018 KiB  
Article
Global Dense Vector Representations for Words or Items Using Shared Parameter Alternating Tweedie Model
by Taejoon Kim and Haiyan Wang
Mathematics 2025, 13(4), 612; https://doi.org/10.3390/math13040612 - 13 Feb 2025
Viewed by 529
Abstract
In this article, we present a model for analyzing the co-occurrence count data derived from practical fields such as user–item or item–item data from online shopping platforms and co-occurring word–word pairs in sequences of texts. Such data contain important information for developing recommender systems or studying the relevance of items or words from non-numerical sources. Different from traditional regression models, there are no observations for covariates. Additionally, the co-occurrence matrix is typically of such high dimension that it does not fit into a computer’s memory for modeling. We extract numerical data by defining windows of co-occurrence using weighted counts on the continuous scale. Positive probability mass is allowed for zero observations. We present the Shared Parameter Alternating Tweedie (SA-Tweedie) model and an algorithm to estimate the parameters. We introduce a learning rate adjustment used along with the Fisher scoring method in the inner loop to help the algorithm stay on track with optimizing direction. Gradient descent with the Adam update was also considered as an alternative method for the estimation. Simulation studies showed that our algorithm with Fisher scoring and learning rate adjustment outperforms the other two methods. We applied SA-Tweedie to English-language Wikipedia dump data to obtain dense vector representations for WordPiece tokens. The resulting embeddings were then used in an application of the Named Entity Recognition (NER) task. The SA-Tweedie embeddings significantly outperform GloVe, random, and BERT embeddings in the NER task. A notable strength of the SA-Tweedie embedding is that the number of parameters and training cost for SA-Tweedie are only a tiny fraction of those for BERT.
(This article belongs to the Special Issue High-Dimensional Data Analysis and Applications)