Sustainable Big Data Analytics and Machine Learning Technologies

A special issue of Big Data and Cognitive Computing (ISSN 2504-2289).

Deadline for manuscript submissions: closed (15 March 2024) | Viewed by 31194

Special Issue Editor


Dr. Jenq-Haur Wang
Guest Editor
Department of Computer Science and Information Engineering, National Taipei University of Technology, Taipei 10608, Taiwan
Interests: big data analytics; deep learning; social web mining; artificial intelligence; natural language processing

Special Issue Information

Dear Colleagues,

Advances in big data analytics and machine learning technologies have improved people’s daily lives in many ways. For example, major improvements in image processing and language understanding have boosted applications in medical image diagnosis, face recognition, voice recognition, question answering, and machine reading comprehension. These have been possible largely due to the development of deep learning algorithms. However, deep learning algorithms rely on powerful machines and systems with GPUs to carry out long and complex training processes. On the one hand, these solutions are limited by the computational power of single systems, which cannot be scaled up indefinitely. Thus, big data analytics solutions use distributed frameworks to scale out through data parallelism or task parallelism. On the other hand, the global environment has changed so rapidly that it is difficult to maintain it or restore it to its original state. The impact of technology on environmental change could lead to significant damage that also jeopardizes human lives and global ecology. Many efforts have begun to address sustainability issues by containing environmental change and slowing deterioration, for example in climate change, water resources, and air quality, to name a few. This Special Issue focuses on ideas such as big data analytics for sustainability [1], federated learning [2], and distributed deep learning [3]. We seek potential solutions and empirical studies that investigate sustainable technologies that are both energy-efficient and resource-efficient.

This issue includes, but is not limited to, the following topics:

  • Performance of machine learning systems;
  • Efficiency of deep learning algorithms;
  • Resource allocation for improving sustainability in data mining;
  • Effects of federated machine learning on sustainability;
  • Energy efficiency of distributed deep learning systems;
  • Sustainable big data analytics;
  • Sustainable framework for large-scale data collection, processing, and analytics;
  • Social media mining for sustainability;
  • Social media monitoring for sustainability;
  • Fake news detection for sustainability;
  • Application of data science for sustainability in economy;
  • The impact of big data analytics on environmental sustainability.

References:

[1] Zhihan Lv, Rahat Iqbal, Victor Chang, “Big Data Analytics for Sustainability,” Future Generation Computer Systems, 2018, Volume: 86, Pages: 1238-1241.

[2] Jakub Konečný, H. Brendan McMahan, Daniel Ramage, “Federated Optimization: Distributed Optimization Beyond the Datacenter,” NIPS Optimization for Machine Learning Workshop, 2015.

[3] Matthias Langer, Zhen He, Wenny Rahayu, Yanbo Xue, “Distributed Training of Deep Learning Models: A Taxonomic Perspective,” IEEE Transactions on Parallel and Distributed Systems, 2020, Volume: 31, Issue: 12, Pages: 2802-2818.

Dr. Jenq-Haur Wang
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Big Data and Cognitive Computing is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • big data analytics
  • data mining
  • federated machine learning
  • deep learning
  • artificial intelligence
  • distributed computing
  • sustainable technology
  • energy efficiency

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies is available on the MDPI website.

Published Papers (4 papers)

Research

16 pages, 2272 KiB  
Article
Augmenting Multimodal Content Representation with Transformers for Misinformation Detection
by Jenq-Haur Wang, Mehdi Norouzi and Shu Ming Tsai
Big Data Cogn. Comput. 2024, 8(10), 134; https://doi.org/10.3390/bdcc8100134 - 11 Oct 2024
Viewed by 891
Abstract
Information sharing on social media has become a common practice for people around the world. Since it is difficult to check user-generated content on social media, huge amounts of rumors and misinformation are spread alongside authentic information. On the one hand, most social platforms identify rumors through manual fact-checking, which is very inefficient. On the other hand, with an emerging form of misinformation that contains inconsistent image–text pairs, it would be beneficial to compare the meaning of multimodal content within the same post to detect image–text inconsistency. In this paper, we propose a novel approach to misinformation detection by multimodal feature fusion with transformers and credibility assessment with self-attention-based Bi-RNN networks. First, captions are derived from images using an image captioning module to obtain their semantic descriptions. These are compared with the surrounding text by fine-tuning transformers to check semantic consistency. Then, to further aggregate sentiment features into the text representation, we fine-tune a separate transformer for text sentiment classification, whose output is concatenated to augment the text embeddings. Finally, Multi-Cell Bi-GRUs with self-attention are used to train the credibility assessment model for misinformation detection. In experiments on tweets, the best performance, an accuracy of 0.904 and an F1-score of 0.921, is obtained when applying feature fusion of augmented embeddings with sentiment classification results. This shows the potential of the way transformers are applied in our proposed approach to misinformation detection. Further investigation is needed to validate the performance on various types of multimodal discrepancies.
(This article belongs to the Special Issue Sustainable Big Data Analytics and Machine Learning Technologies)

21 pages, 1646 KiB  
Article
Improving Clothing Product Quality and Reducing Waste Based on Consumer Review Using RoBERTa and BERTopic Language Model
by Andry Alamsyah and Nadhif Ditertian Girawan
Big Data Cogn. Comput. 2023, 7(4), 168; https://doi.org/10.3390/bdcc7040168 - 25 Oct 2023
Cited by 6 | Viewed by 3545
Abstract
The disposability of clothing has emerged as a critical concern, precipitating waste accumulation due to product quality degradation. Such consequences exert significant pressure on resources and challenge sustainability efforts. In response, this research focuses on empowering clothing companies to elevate product excellence by harnessing consumer feedback. Beyond insights, this research extends to sustainability by providing suggestions on refining product quality by improving material handling, gradually mitigating waste production, and cultivating longevity, thereby decreasing the amount of discarded clothes. Managing a vast influx of diverse reviews necessitates sophisticated natural language processing (NLP) techniques. Our study introduces a Robustly optimized BERT Pretraining Approach (RoBERTa) model calibrated for multilabel classification and BERTopic for topic modeling. The model adeptly distills vital themes from consumer reviews, exhibiting high accuracy in projecting concerns across various dimensions of clothing quality. NLP’s potential lies in endowing companies with insights into consumer reviews, augmented by BERTopic to facilitate exploration of the harvested review topics. This research presents a thorough case for integrating machine learning to foster sustainability and waste reduction. The contribution of this research is notable for its integration of RoBERTa and BERTopic in multilabel classification tasks and topic modeling in the fashion industry. The results indicate that the RoBERTa model performs well, as demonstrated by its macro-averaged F1 score of 0.87 and micro-averaged F1 score of 0.87. Likewise, BERTopic achieves a coherence score of 0.67, meaning the model can form insightful topics.

18 pages, 1526 KiB  
Article
Big Data Analytics with the Multivariate Adaptive Regression Splines to Analyze Key Factors Influencing Accident Severity in Industrial Zones of Thailand: A Study on Truck and Non-Truck Collisions
by Manlika Seefong, Panuwat Wisutwattanasak, Chamroeun Se, Kestsirin Theerathitichaipa, Sajjakaj Jomnonkwao, Thanapong Champahom, Vatanavongs Ratanavaraha and Rattanaporn Kasemsri
Big Data Cogn. Comput. 2023, 7(3), 156; https://doi.org/10.3390/bdcc7030156 - 21 Sep 2023
Cited by 1 | Viewed by 2392
Abstract
Machine learning currently holds a vital position in predicting collision severity. Identifying factors associated with heightened risks of injury and fatalities aids in enhancing road safety measures and management. Presently, Thailand faces considerable challenges with respect to road traffic accidents. These challenges are particularly acute in industrial zones, where they contribute to a rise in injuries and fatalities. The mixture of heavy traffic, comprising both trucks and non-trucks, significantly amplifies the risk of accidents and raises profound concerns for road safety in Thailand. Consequently, discerning the factors that influence the severity of injuries and fatalities becomes pivotal for formulating effective road safety policies and measures. This study specifically aims to identify the factors contributing to the severity of accidents involving truck and non-truck collisions in industrial zones. It considers a variety of aspects, including roadway characteristics, underlying assumptions of cause, crash characteristics, and weather conditions. Because accident data is big data with specific characteristics and complexity, employing machine learning in tandem with the Multivariate Adaptive Regression Splines technique allows us to make precise predictions and identify the factors influencing the severity of collision outcomes. The analysis demonstrates that various factors augment the severity of accidents involving trucks. These include darting in front of a vehicle, head-on collisions, and pedestrian collisions. Conversely, for non-truck related collisions, the significant factors that heighten severity are tailgating, running signs/signals, angle collisions, head-on collisions, overtaking collisions, pedestrian collisions, obstruction collisions, and collisions during overcast conditions. These findings illuminate the significant factors influencing the severity of accidents involving trucks and non-trucks. Such insights provide valuable information for developing targeted road safety measures and policies, thereby contributing to the mitigation of injuries and fatalities.

Other

23 pages, 2848 KiB  
Systematic Review
Physics-Informed Neural Network (PINN) Evolution and Beyond: A Systematic Literature Review and Bibliometric Analysis
by Zaharaddeen Karami Lawal, Hayati Yassin, Daphne Teck Ching Lai and Azam Che Idris
Big Data Cogn. Comput. 2022, 6(4), 140; https://doi.org/10.3390/bdcc6040140 - 21 Nov 2022
Cited by 43 | Viewed by 22957
Abstract
This research aims to study and assess state-of-the-art physics-informed neural networks (PINNs) from different researchers’ perspectives. The PRISMA framework was used for a systematic literature review, and 120 research articles from the computational sciences and engineering domain were specifically classified through a well-defined keyword search in Scopus and Web of Science databases. Through bibliometric analyses, we have identified journal sources with the most publications, authors with high citations, and countries with many publications on PINNs. Some newly improved techniques developed to enhance PINN performance and reduce high training costs and slowness, among other limitations, have been highlighted. Different approaches have been introduced to overcome the limitations of PINNs. In this review, we categorized the newly proposed PINN methods into Extended PINNs, Hybrid PINNs, and Minimized Loss techniques. Various potential future research directions are outlined based on the limitations of the proposed solutions.
