MDPI - Publisher of Open Access Journals

12 pages, 340 KiB

Open Access Article

Quantitative Study of Swin Transformer and Loss Function Combinations for Face Anti-Spoofing

by Liang Yu Gong and Xue Jun Li

Electronics 2025, 14(3), 448; https://doi.org/10.3390/electronics14030448 - 23 Jan 2025

Cited by 1 | Viewed by 1316

Face spoofing has long been a hidden danger in network security, especially with the widespread application of facial recognition systems. However, some current face anti-spoofing (FAS) methods are not effective at detecting different forgery types and are prone to overfitting, which means they cannot reliably handle unseen spoof types. For a fixed feature extractor, different loss functions significantly affect classification performance, regardless of the quality of the feature extraction. Therefore, it is necessary to find a loss function, or a combination of loss functions, suited to spoofing detection tasks. This paper mainly aims to compare the effects of different loss functions and loss function combinations. We selected the Swin Transformer as the backbone of our training model to extract facial features and to ensure the accuracy of the ablation experiment. We adopted four classical loss functions: cross-entropy loss (CE loss), semi-hard triplet loss, L1 loss and focal loss. Finally, we tested combinations of the Swin Transformer with different loss functions (and pairs of loss functions) through in-dataset experiments on common FAS datasets (CelebA-Spoofing, CASIA-MFSD, Replay attack and OULU-NPU). We conclude that a single loss function does not produce the best results for the FAS task, and that the best accuracy is obtained when triplet loss, cross-entropy loss and Smooth L1 loss are applied in combination. Full article

(This article belongs to the Special Issue AI Synergy: Vision, Language, and Modality)
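
The abstract above reports that the best accuracy comes from combining triplet loss, cross-entropy loss and Smooth L1 loss. A minimal sketch of how such per-sample loss terms might be computed and combined follows. The weighted-sum form, the weights, the `beta` and `gamma` defaults and all function names are assumptions for illustration, not the authors' implementation; the focal loss shown omits the optional class-balancing factor.

```python
import math

EPS = 1e-7  # keeps log() away from zero


def cross_entropy(p_live, label):
    """Binary cross-entropy for one sample; label is 1 (live) or 0 (spoof)."""
    p = min(max(p_live, EPS), 1 - EPS)
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))


def focal_loss(p_live, label, gamma=2.0):
    """Focal loss: cross-entropy down-weighted for well-classified samples."""
    p = min(max(p_live, EPS), 1 - EPS)
    p_true = p if label == 1 else 1 - p  # probability given to the true class
    return -((1 - p_true) ** gamma) * math.log(p_true)


def smooth_l1(pred, target, beta=1.0):
    """Smooth L1: quadratic for small errors, linear for large ones."""
    diff = abs(pred - target)
    return 0.5 * diff * diff / beta if diff < beta else diff - 0.5 * beta


def combined_loss(ce, triplet, pixel, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of three loss terms (the weights are hypothetical)."""
    w_ce, w_triplet, w_pixel = weights
    return w_ce * ce + w_triplet * triplet + w_pixel * pixel


total = combined_loss(cross_entropy(0.9, 1), 0.1, smooth_l1(0.2, 0.0))
```

In a real training loop each term would be averaged over a batch, and the weights would be tuned per dataset.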

15 pages, 458 KiB

Open Access Article

Facial Anti-Spoofing Using “Clue Maps”

by Liang Yu Gong, Xue Jun Li and Peter Han Joo Chong

Sensors 2024, 24(23), 7635; https://doi.org/10.3390/s24237635 - 29 Nov 2024

Viewed by 1029

Abstract

Facial recognition systems are easily exposed to spoofing attacks (or presentation attacks), which makes online financial systems vulnerable. Given the high demand for spoofing attack detection, it is urgent to develop an anti-spoofing solution with superior generalization ability. Although multi-modality methods, such as combining depth images with RGB images, and feature fusion methods currently perform well on certain datasets, the cost of obtaining depth information and physiological signals, especially biological signals, is relatively high. This paper proposes a representation learning method with an Auto-Encoder structure based on Swin Transformer and ResNet, and applies cross-entropy loss, semi-hard triplet loss, and Smooth L1 pixel-wise loss to supervise model training. The architecture contains three parts, namely an Encoder, a Decoder, and an auxiliary classifier. The Encoder extracts features that capture correlations between patches, and the Decoder aims to generate universal “Clue Maps” for further contrastive learning. Finally, the auxiliary classifier is adopted to assist the model in making the decision, and its output is treated as a preliminary result. Extensive experiments evaluated Attack Presentation Classification Error Rate (APCER), Bona Fide Presentation Classification Error Rate (BPCER) and Average Classification Error Rate (ACER) on popular spoofing databases (CelebA, OULU, and CASIA-MFSD) against several existing anti-spoofing models; our approach outperforms the existing models, reaching 1.2% and 1.6% ACER in the intra-dataset experiments. In addition, the inter-dataset experiment with CASIA-MFSD (training set) and Replay-attack (testing set) reaches a new state-of-the-art performance with a 23.8% Half Total Error Rate (HTER). Full article

(This article belongs to the Special Issue Deep Learning for Perception and Recognition: Method and Applications)
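
The error rates named in the abstract above can be sketched from binary decisions as follows. This is a simplified illustration, not the authors' evaluation code: it pools all attack types into a single APCER (standardized protocols typically report it per attack type), and the label encoding (1 for bona fide, 0 for attack) and the function name are assumptions.

```python
def presentation_attack_error_rates(labels, predictions):
    """Return (APCER, BPCER, ACER) for binary decisions.

    Both sequences use 1 for bona fide and 0 for attack (assumed encoding).
    """
    attack_preds = [p for y, p in zip(labels, predictions) if y == 0]
    bona_fide_preds = [p for y, p in zip(labels, predictions) if y == 1]
    if not attack_preds or not bona_fide_preds:
        raise ValueError("need at least one attack and one bona fide sample")

    # APCER: share of attack presentations accepted as bona fide.
    apcer = sum(1 for p in attack_preds if p == 1) / len(attack_preds)
    # BPCER: share of bona fide presentations rejected as attacks.
    bpcer = sum(1 for p in bona_fide_preds if p == 0) / len(bona_fide_preds)
    # ACER: mean of the two error rates.
    acer = (apcer + bpcer) / 2
    return apcer, bpcer, acer


labels = [0, 0, 0, 0, 1, 1, 1, 1]
predictions = [0, 0, 0, 1, 1, 1, 0, 0]
print(presentation_attack_error_rates(labels, predictions))  # (0.25, 0.5, 0.375)
```

HTER is conventionally computed the same way, as the mean of the false acceptance rate and the false rejection rate at a chosen decision threshold.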

14 pages, 2747 KiB

Open Access Article

Pine Wilt Disease Segmentation with Deep Metric Learning Species Classification for Early-Stage Disease and Potential False Positive Identification

by Nikhil Thapa, Ridip Khanal, Bhuwan Bhattarai and Joonwhoan Lee

Electronics 2024, 13(10), 1951; https://doi.org/10.3390/electronics13101951 - 16 May 2024

Cited by 5 | Viewed by 1698

Abstract

Pine Wilt Disease poses a significant global threat to forests, necessitating swift detection methods. Conventional approaches are resource-intensive, but applying deep learning to ortho-mapped images obtained from Unmanned Aerial Vehicles offers a cost-effective and scalable solution. This study presents a novel method for Pine Wilt Disease detection and classification that uses YOLOv8 to segment diseased areas, crops the diseased regions from the original image, and then applies Deep Metric Learning for classification. We trained a ResNet50 model using semi-hard triplet loss to obtain embeddings, and subsequently trained a Random Forest classifier tasked with identifying tree species and distinguishing false positives. Segmentation was favored over object detection due to its ability to provide pixel-level information, enabling the flexible extension of subsequent bounding boxes. Deep Metric Learning-based classification after segmentation was chosen for its effectiveness in handling visually similar images. The results indicate a mean Intersection over Union of 83.12% for segmentation, with classification accuracies of 98.7% and 90.7% on the validation and test sets, respectively. Full article

(This article belongs to the Special Issue Revolutionizing Medical Image Analysis with Deep Learning)
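
The segmentation result above is reported as mean Intersection over Union. A minimal sketch of that metric for binary masks is shown below. The nested-list mask format, the convention of scoring two empty masks as 1.0, averaging over images rather than classes, and the function names are all assumptions for illustration.

```python
def iou(pred_mask, true_mask):
    """Intersection over Union of two binary masks given as nested lists."""
    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly; score them 1.0 (assumed convention).
    return 1.0 if union == 0 else intersection / union


def mean_iou(mask_pairs):
    """Average IoU over (prediction, ground truth) pairs."""
    scores = [iou(pred, true) for pred, true in mask_pairs]
    return sum(scores) / len(scores)


pred = [[1, 1], [0, 0]]
true = [[1, 0], [0, 0]]
print(iou(pred, true))  # 0.5
```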

20 pages, 2529 KiB

Open Access Article

NFT Image Plagiarism Check Using EfficientNet-Based Deep Neural Network with Triplet Semi-Hard Loss

by Aji Teguh Prihatno, Naufal Suryanto, Sangbong Oh, Thi-Thu-Huong Le and Howon Kim

Appl. Sci. 2023, 13(5), 3072; https://doi.org/10.3390/app13053072 - 27 Feb 2023

Cited by 8 | Viewed by 5255

Abstract

Blockchain technology is used to support digital assets such as cryptocurrencies and tokens. Commonly, smart contracts are used to generate tokens on top of the blockchain network. There are two fundamental types of tokens: fungible and non-fungible (NFTs). This paper focuses on NFTs and offers a technique to spot plagiarism in NFT images. NFTs are information that is appended to files to produce distinctive signatures. They can be found in image files, real artifacts, literature published online, and various other digital media. Plagiarized and fraudulent NFT images are becoming a big concern for artists and customers. This paper proposes an efficient deep learning-based approach for NFT image plagiarism detection using the EfficientNet-B0 architecture and the Triplet Semi-Hard Loss function. We trained our model using a dataset of NFT images and evaluated its performance using several metrics, including loss and accuracy. The results showed that the EfficientNet-B0-based deep neural network with triplet semi-hard loss outperformed other models such as ResNet50, DenseNet, and MobileNetV2 in detecting plagiarized NFTs. The experimental results demonstrate that the approach is sufficient for implementation in various NFT marketplaces. Full article

(This article belongs to the Special Issue Recent Advances in Cybersecurity and Computer Networks)

17 pages, 388 KiB

Open Access Article

Deep Metric Learning Using Negative Sampling Probability Annealing

by Gábor Kertész

Sensors 2022, 22(19), 7579; https://doi.org/10.3390/s22197579 - 6 Oct 2022

Cited by 2 | Viewed by 2210

Abstract

Multiple studies have concluded that the selection of input samples is key for deep metric learning. For triplet networks, the selection of the anchor, positive, and negative samples is referred to as triplet mining. The selection of the negatives is considered to be the most complicated task, due to the large number of possibilities. The goal is to select a negative that results in a positive triplet loss; however, there are multiple approaches for this: semi-hard negative mining and hardest mining are well known, in addition to random selection. Since its introduction, semi-hard mining has been shown to outperform other negative mining techniques; however, in recent years, the selection of the so-called hardest negative has shown promising results in different experiments. This paper introduces a novel negative sampling solution based on dynamic policy switching, referred to as negative sampling probability annealing, which aims to exploit the advantages of all approaches. Results are validated on an experimental synthetic dataset using cluster-analysis methods; finally, the discriminative abilities of trained models are measured on real-life data. Full article

(This article belongs to the Special Issue Image Processing and Analysis for Object Detection)
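
The mining strategies named in the abstract above can be sketched as follows, using the common definitions: a semi-hard negative lies farther from the anchor than the positive but still inside the margin, and the hardest negative is the one closest to the anchor. The Euclidean distance, the margin value and the function names are assumptions for illustration; the annealing policy proposed in the paper is not reproduced here.

```python
import math


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on Euclidean distances."""
    d_ap = math.dist(anchor, positive)
    d_an = math.dist(anchor, negative)
    return max(0.0, d_ap - d_an + margin)


def semi_hard_negatives(anchor, positive, candidates, margin=0.2):
    """Negatives with d(a, p) < d(a, n) < d(a, p) + margin."""
    d_ap = math.dist(anchor, positive)
    return [n for n in candidates if d_ap < math.dist(anchor, n) < d_ap + margin]


def hardest_negative(anchor, candidates):
    """The negative closest to the anchor."""
    return min(candidates, key=lambda n: math.dist(anchor, n))


anchor, positive = (0.0, 0.0), (1.0, 0.0)
negatives = [(0.5, 0.0), (1.1, 0.0), (3.0, 0.0)]  # hard, semi-hard, easy
print(semi_hard_negatives(anchor, positive, negatives))  # [(1.1, 0.0)]
print(hardest_negative(anchor, negatives))               # (0.5, 0.0)
```

The easy negative at distance 3.0 yields a loss of zero, which is why it contributes nothing to training and why the choice of mining strategy matters.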

16 pages, 5582 KiB

Open Access Article

Comparing Class-Aware and Pairwise Loss Functions for Deep Metric Learning in Wildlife Re-Identification

by Nkosikhona Dlamini and Terence L. van Zyl

Sensors 2021, 21(18), 6109; https://doi.org/10.3390/s21186109 - 12 Sep 2021

Cited by 4 | Viewed by 3570

Abstract

Similarity learning using deep convolutional neural networks has been applied extensively in solving computer vision problems. This attraction is supported by its success in one-shot and zero-shot classification applications. The advances in similarity learning are essential for smaller datasets or datasets in which few class labels exist per class such as wildlife re-identification. Improving the performance of similarity learning models comes with developing new sampling techniques and designing loss functions better suited to training similarity in neural networks. However, the impact of these advances is tested on larger datasets, with limited attention given to smaller imbalanced datasets such as those found in unique wildlife re-identification. To this end, we test the advances in loss functions for similarity learning on several animal re-identification tasks. We add two new public datasets, Nyala and Lions, to the challenge of animal re-identification. Our results are state of the art on all public datasets tested except Pandas. The achieved Top-1 Recall is 94.8% on the Zebra dataset, 72.3% on the Nyala dataset, 79.7% on the Chimps dataset and, on the Tiger dataset, it is 88.9%. For the Lion dataset, we set a new benchmark at 94.8%. We find that the best performing loss function across all datasets is generally the triplet loss; however, there is only a marginal improvement compared to the performance achieved by Proxy-NCA models. We demonstrate that no single neural network architecture combined with a loss function is best suited for all datasets, although VGG-11 may be the most robust first choice. Our results highlight the need for broader experimentation and exploration of loss functions and neural network architecture for the more challenging task, over classical benchmarks, of wildlife re-identification. Full article

(This article belongs to the Special Issue Sensors and Artificial Intelligence for Wildlife Conservation)
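
The Top-1 Recall figures above are a retrieval metric. A minimal sketch under the usual definition follows: a query counts as a hit when its nearest neighbor in embedding space, excluding itself, has the same identity. The Euclidean distance, the leave-one-out protocol and the function name are assumptions; the paper's exact evaluation protocol may differ.

```python
import math


def recall_at_1(embeddings, identities):
    """Fraction of samples whose nearest other sample shares their identity."""
    hits = 0
    for i, query in enumerate(embeddings):
        # Index of the closest embedding, skipping the query itself.
        nearest = min(
            (j for j in range(len(embeddings)) if j != i),
            key=lambda j: math.dist(query, embeddings[j]),
        )
        hits += identities[nearest] == identities[i]
    return hits / len(embeddings)


embeddings = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (0.3, 0.0)]
identities = ["a", "a", "b", "b"]
print(recall_at_1(embeddings, identities))  # 0.75
```

In the example, the last sample sits among the "a" embeddings, so its nearest neighbor has the wrong identity and it is the single miss.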

15 pages, 1507 KiB

Open Access Article

Enhancing Multi-tissue and Multi-scale Cell Nuclei Segmentation with Deep Metric Learning

by Tomas Iesmantas, Agne Paulauskaite-Taraseviciene and Kristina Sutiene

Appl. Sci. 2020, 10(2), 615; https://doi.org/10.3390/app10020615 - 15 Jan 2020

Cited by 15 | Viewed by 3454

Abstract

(1) Background: The segmentation of cell nuclei is an essential task in a wide range of biomedical studies and clinical practices. The full automation of this process remains a challenge due to intra- and internuclear variations across a wide range of tissue morphologies, differences in staining protocols and imaging procedures. (2) Methods: A deep learning model with metric embeddings such as contrastive loss and triplet loss with semi-hard negative mining is proposed in order to accurately segment cell nuclei in a diverse set of microscopy images. The effectiveness of the proposed model was tested on a large-scale multi-tissue collection of microscopy image sets. (3) Results: The use of deep metric learning increased the overall segmentation prediction by 3.12% in the average value of Dice similarity coefficients as compared to no metric learning. In particular, the largest gain was observed for segmenting cell nuclei in H&E-stained images when a deep learning network and triplet loss with semi-hard negative mining were considered for the task. (4) Conclusion: We conclude that deep metric learning gives an additional boost to the overall learning process and consequently improves the segmentation performance. Notably, the improvement ranges approximately between 0.13% and 22.31% for different types of images in terms of Dice coefficients when compared to no deep metric learning. Full article

(This article belongs to the Special Issue Image Processing Techniques for Biomedical Applications)
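
The gains above are measured with the Dice similarity coefficient, which for two binary masks is twice the overlap divided by the total number of foreground pixels in both. A minimal sketch follows; the nested-list mask format, the convention of scoring two empty masks as 1.0 and the function name are assumptions for illustration.

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice similarity: 2 * |A and B| / (|A| + |B|) for binary masks."""
    overlap = 0
    pred_total = 0
    true_total = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            overlap += 1 if (p and t) else 0
            pred_total += 1 if p else 0
            true_total += 1 if t else 0
    if pred_total + true_total == 0:
        return 1.0  # both masks empty: treated as perfect agreement (assumed)
    return 2 * overlap / (pred_total + true_total)


pred = [[1, 1], [0, 0]]
true = [[1, 0], [0, 0]]
score = dice_coefficient(pred, true)  # 2 * 1 / (2 + 1)
```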

Search Results (7)