Search Results (4)

Search Parameters:
Authors = Ali Reza Sajun ORCID = 0000-0003-1270-3005

27 pages, 3698 KiB  
Review
A Historical Survey of Advances in Transformer Architectures
by Ali Reza Sajun, Imran Zualkernan and Donthi Sankalpa
Appl. Sci. 2024, 14(10), 4316; https://doi.org/10.3390/app14104316 - 20 May 2024
Cited by 9 | Viewed by 10875
Abstract
In recent years, transformer-based deep learning models have risen to prominence in machine learning for a variety of tasks such as computer vision and text generation. Given this increased interest, a historical look at the development and rapid progression of transformer-based models is needed to understand the rise of this key architecture. This paper presents a survey of key works related to the early development and implementation of transformer models in domains such as generative deep learning and as backbones of large language models. Previous works are classified by their historical approach, followed by key works on text-based, image-based, and miscellaneous applications. A quantitative and qualitative analysis of the various approaches is presented. Additionally, recent directions of transformer-related research, such as those in the biomedical and time-series domains, are discussed. Finally, future research opportunities, especially regarding multi-modality and the optimization of the transformer training process, are identified.
(This article belongs to the Special Issue Advances in Neural Networks and Deep Learning)
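
To make the core mechanism behind the surveyed architectures concrete, the following is a minimal sketch (not taken from the paper) of scaled dot-product self-attention, the operation at the heart of every transformer variant discussed in the survey; the toy sequence length, embedding size, and random projection matrices are illustrative assumptions.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Core transformer operation: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)      # (seq, seq) similarity scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # row-wise softmax
    return weights @ V                                  # weighted sum of value vectors

# Toy example: a sequence of 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = scaled_dot_product_attention(x @ Wq, x @ Wk, x @ Wv)
print(out.shape)  # (4, 8)
```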

26 pages, 4306 KiB  
Article
Investigating the Performance of FixMatch for COVID-19 Detection in Chest X-rays
by Ali Reza Sajun, Imran Zualkernan and Donthi Sankalpa
Appl. Sci. 2022, 12(9), 4694; https://doi.org/10.3390/app12094694 - 6 May 2022
Cited by 8 | Viewed by 3086
Abstract
The advent of the COVID-19 pandemic has stretched medical resources to their limits. Chest X-rays are one method of diagnosing COVID-19 and are used because of their high efficacy. However, detecting COVID-19 manually from these images is time-consuming and expensive. While neural networks can be trained to detect COVID-19, doing so requires large amounts of labeled data, which are expensive to collect and code. One approach is to use semi-supervised neural networks to detect COVID-19 from a very small number of labeled images. This paper explores how well such an approach could work. FixMatch, a state-of-the-art semi-supervised classification algorithm, was trained on chest X-rays to detect COVID-19, viral pneumonia, bacterial pneumonia, and lung opacity. The model was trained with decreasing amounts of labeled data and compared with the best supervised CNN models trained using transfer learning. FixMatch achieved a COVID-19 F1-score of 0.94 with only 80 labeled samples per class and an overall macro-average F1-score of 0.68 with only 20 labeled samples per class. Furthermore, an exploratory analysis was conducted to determine how well FixMatch detects COVID-19 when trained with imbalanced data. The results show a predictable drop in performance compared to training with uniform data; however, a statistical analysis suggests that FixMatch may be somewhat robust to data imbalance, as in many cases the same types of mistakes were made as the amount of labeled data decreased.
(This article belongs to the Topic Artificial Intelligence in Healthcare)
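
As a rough illustration of the FixMatch training rule summarized above, the sketch below (not the authors' code) shows the confidence-thresholded pseudo-labeling step on an unlabeled batch: predictions on weakly augmented images supply hard pseudo-labels, and only samples whose confidence exceeds a threshold contribute a cross-entropy term against predictions on strongly augmented views. The threshold, batch shapes, and random logits are stand-ins.

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def fixmatch_unlabeled_loss(logits_weak, logits_strong, tau=0.95):
    """Confidence-thresholded consistency loss used by FixMatch.

    logits_weak / logits_strong: model outputs for weakly / strongly
    augmented views of the same unlabeled images, shape (batch, classes).
    """
    probs_weak = softmax(logits_weak)
    pseudo_labels = probs_weak.argmax(axis=-1)          # hard pseudo-labels
    mask = probs_weak.max(axis=-1) >= tau               # keep only confident samples
    log_probs_strong = np.log(softmax(logits_strong) + 1e-12)
    ce = -log_probs_strong[np.arange(len(pseudo_labels)), pseudo_labels]
    return (ce * mask).mean()                           # averaged over the whole batch

# Toy batch: 5 unlabeled chest X-rays, 5 classes (e.g. COVID-19, viral pneumonia,
# bacterial pneumonia, lung opacity, normal); logits here are random placeholders.
rng = np.random.default_rng(1)
lw, ls = rng.normal(size=(5, 5)), rng.normal(size=(5, 5))
print(fixmatch_unlabeled_loss(lw, ls))
```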

21 pages, 2631 KiB  
Review
Survey on Implementations of Generative Adversarial Networks for Semi-Supervised Learning
by Ali Reza Sajun and Imran Zualkernan
Appl. Sci. 2022, 12(3), 1718; https://doi.org/10.3390/app12031718 - 7 Feb 2022
Cited by 28 | Viewed by 5275
Abstract
Given recent advances in deep learning, semi-supervised techniques have seen a rise in interest. Generative adversarial networks (GANs) represent one recent approach to semi-supervised learning (SSL). This paper presents a survey of methods that use GANs for SSL. Previous work applying GANs to SSL is classified into pseudo-labeling/classification, encoder-based, TripleGAN-based, two-GAN, manifold-regularization, and stacked-discriminator approaches. A quantitative and qualitative analysis of the various approaches is presented. The R3-CGAN architecture is identified as the GAN architecture with state-of-the-art results. Given the recent success of non-GAN-based approaches to SSL, future research opportunities involving the adaptation of elements of such SSL approaches into GAN-based implementations are also identified.
(This article belongs to the Special Issue Generative Models in Artificial Intelligence and Their Applications)
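
To ground the pseudo-labeling/classification family named in the abstract, here is an assumption-laden sketch of one common formulation: a discriminator with K real-class logits and an implicit (K+1)-th "fake" class, where real unlabeled samples are pushed towards some real class and generated samples towards the fake class. It illustrates the general idea only and is not code from any surveyed work.

```python
import numpy as np

def logsumexp(logits):
    m = logits.max(axis=-1, keepdims=True)
    return (m + np.log(np.exp(logits - m).sum(axis=-1, keepdims=True))).squeeze(-1)

def softplus(x):
    return np.log1p(np.exp(-np.abs(x))) + np.maximum(x, 0.0)

def sgan_discriminator_loss(logits_labeled, labels, logits_unlabeled, logits_fake):
    """K-logit semi-supervised GAN discriminator loss (pseudo-labeling/classification family).

    The probability that a sample is real is Z / (Z + 1) with Z = sum_k exp(logit_k),
    so -log D(x) = softplus(lse) - lse and -log(1 - D(x)) = softplus(lse),
    where lse = logsumexp over the K real-class logits.
    """
    # Supervised term: ordinary cross-entropy over the K real classes.
    log_probs = logits_labeled - logsumexp(logits_labeled)[:, None]
    sup = -log_probs[np.arange(len(labels)), labels].mean()
    # Unsupervised term: real unlabeled data pushed towards "some real class",
    # generator samples pushed towards the implicit fake class.
    lse_real, lse_fake = logsumexp(logits_unlabeled), logsumexp(logits_fake)
    unsup = (softplus(lse_real) - lse_real).mean() + softplus(lse_fake).mean()
    return sup + unsup

# Toy batches with K = 3 classes; all logits are random placeholders.
rng = np.random.default_rng(2)
loss = sgan_discriminator_loss(rng.normal(size=(4, 3)), np.array([0, 2, 1, 0]),
                               rng.normal(size=(4, 3)), rng.normal(size=(4, 3)))
print(loss)
```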

24 pages, 37226 KiB  
Article
An IoT System Using Deep Learning to Classify Camera Trap Images on the Edge
by Imran Zualkernan, Salam Dhou, Jacky Judas, Ali Reza Sajun, Brylle Ryan Gomez and Lana Alhaj Hussain
Computers 2022, 11(1), 13; https://doi.org/10.3390/computers11010013 - 13 Jan 2022
Cited by 45 | Viewed by 13374
Abstract
Camera traps deployed in remote locations provide an effective method for ecologists to monitor and study wildlife in a non-invasive way. However, current camera traps suffer from two problems. First, the images are manually classified and counted, which is expensive. Second, due to manual coding, the results are often stale by the time they reach the ecologists. Using the Internet of Things (IoT) combined with deep learning represents a good solution to both problems, as the images can be classified automatically and the results immediately made available to ecologists. This paper proposes an IoT architecture that uses deep learning on edge devices to convey animal classification results to a mobile app using the LoRaWAN low-power, wide-area network. The primary goal of the proposed approach is to reduce the cost of the wildlife monitoring process for ecologists and to provide real-time animal sighting data from the camera traps in the field. Camera trap image data consisting of 66,400 images were used to train the InceptionV3, MobileNetV2, ResNet18, EfficientNetB1, DenseNet121, and Xception neural network models. While the performance of the trained models was statistically different (Kruskal–Wallis: accuracy H(5) = 22.34, p < 0.05; F1-score H(5) = 13.82, p = 0.0168), there was only a 3% difference in the F1-score between the worst (MobileNetV2) and the best model (Xception). Moreover, the models made similar errors (Adjusted Rand Index (ARI) > 0.88 and Adjusted Mutual Information (AMI) > 0.82). Subsequently, the best model, Xception (accuracy = 96.1%; F1-score = 0.87, or 0.97 with oversampling), was optimized and deployed on the Raspberry Pi, Google Coral, and Nvidia Jetson edge devices using both TensorFlow Lite and TensorRT frameworks. Optimizing the models to run on edge devices reduced the average macro F1-score to 0.7 and adversely affected the minority classes, reducing their F1-score to as low as 0.18. In a stress test processing 1000 images consecutively, the Jetson Nano running a TensorRT model outperformed the others with a latency of 0.276 s/image (s.d. = 0.002) while drawing an average current of 1665.21 mA. The Raspberry Pi drew the least average current (838.99 mA) but had roughly ten times the latency, at 2.83 s/image (s.d. = 0.036). The Jetson Nano was the only reasonable option as an edge device because it could capture most animals with maximum speeds below 80 km/h, including goats, lions, and ostriches. While the proposed architecture is viable, unbalanced data remain a challenge, and the results can potentially be improved by using object detection to reduce imbalances and by exploring semi-supervised learning.
(This article belongs to the Special Issue Survey in Deep Learning for IoT Applications)
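
To illustrate the edge-deployment step described above (optimizing a trained classifier with TensorFlow Lite and timing per-image inference, as in the stress test), here is a minimal sketch; the tiny stand-in model, input size, and class count are assumptions and not the paper's actual Xception pipeline.

```python
import time
import numpy as np
import tensorflow as tf

NUM_CLASSES = 10  # illustrative; the paper's camera-trap dataset has its own class set

# Stand-in classifier; the paper deploys an optimized Xception model instead.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(224, 224, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])

# Convert to a TensorFlow Lite flatbuffer, as one would before deploying
# to a Raspberry Pi or similar edge device.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

# Run the converted model with the TFLite interpreter and time per-image latency.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

image = np.random.rand(1, 224, 224, 3).astype(np.float32)  # placeholder camera-trap image
start = time.perf_counter()
interpreter.set_tensor(inp["index"], image)
interpreter.invoke()
probs = interpreter.get_tensor(out["index"])
print("latency (s):", time.perf_counter() - start, "predicted class:", int(probs.argmax()))
```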
