Deep Morphological Anomaly Detection Based on Angular Margin Loss

Abstract: Deep anomaly detection aims to identify "abnormal" data using a deep neural network trained on a normal training dataset. In general, industrial visual anomaly detection systems distinguish between normal and "abnormal" data through small morphological differences, such as cracks and stains. Nevertheless, most existing algorithms emphasize capturing the semantic features of normal data rather than the morphological features. Consequently, they perform poorly on real-world visual inspection, although they show their superiority in simulations with representative image classification datasets. To address this limitation, we propose a novel deep anomaly detection algorithm based on the salient morphological features of normal data. The main idea behind the proposed algorithm is to train a multiclass model to classify hundreds of morphological transformation cases applied to all the given data. To this end, the proposed algorithm utilizes a self-supervised learning strategy, making unsupervised learning straightforward. Additionally, to enhance the performance of the proposed algorithm, we replaced the cross-entropy-based loss function with the angular margin loss function. It is experimentally demonstrated that the proposed algorithm outperforms several recent anomaly detection methodologies on various datasets.


Introduction
In data analysis, anomaly detection refers to the identification of outliers in a data distribution [1]. Several visual anomaly detection algorithms based on deep neural networks (DNNs) have been proposed, including variational autoencoders (VAEs), convolutional neural networks (CNNs), and generative adversarial networks (GANs). However, during training, DNNs can access only "normal" class instances. Therefore, most studies have focused on representing or extracting salient features from normal instances by utilizing various methodologies, such as low-dimensional embedding, data reconstruction, and self-supervised learning. Deep anomaly detection (DAD) methodologies primarily involve the extraction of the semantically salient features of "normal" images using DNNs. Hence, most studies reported excellent results on representative image classification datasets composed of semantically distinguishable classes, e.g., CIFAR-10 [2], Fashion-MNIST [3], and the cats-and-dogs dataset [4]. Figure 1a shows several images that are semantically different from each other. Generally, a semantic difference in the image domain leads to large morphological differences, such as outline and texture. Therefore, if the criterion between "normal" and "abnormal" is defined by semantic differences, DAD tries to extract semantically important features in the training procedure. However, in common real-world anomaly detection problems, the discriminant criterion between "abnormal" and "normal" images is defined by small morphological differences, such as cracks, stains, and noise, which cannot be described semantically. Figure 1b shows two morphologically different images. In general, the criteria to distinguish between "abnormal" and "normal" classes are based on small spatial differences in images. Therefore, previous DAD algorithms developed to capture semantic features are not suitable for morphological anomaly detection.
To address this problem, DAD models that emphasize morphological features from "normal" images are required.
Figure 1. Visual description of semantic and morphological differences in images: (a) semantic difference. Both images are sampled from the cats-and-dogs dataset [4]. The difference between the "cat" and "dog" classes is called the semantic difference. Generally, the semantic difference involves both semantic and morphological differences in the spatial domain of the image. To understand this difference, a DNN must learn salient semantic features, such as the orientation of an object and the relation between dominant parts of a target object; (b) morphological difference. Both images are sampled from the representative industrial visual anomaly detection dataset MVTec [5]. The difference between the "good grid" and "broken grid" classes is called the morphological difference. The morphological difference usually does not involve a semantic difference. Therefore, a DNN that learns semantic features often cannot understand the morphological differences between these two images.
Self-supervised learning is one of the best learning mechanisms for guiding the DAD model to understand the morphological features of "normal" images. As a subset of unsupervised learning, self-supervised learning has been proposed to learn image features without using any human-annotated labels [6]. In particular, a proxy objective function enables the DNN to achieve the goal of the target application. In other words, with a properly designed self-supervised loss function, a DNN can learn the features that we are interested in, e.g., the morphological features of an image. Several methods for self-supervised learning-based DAD have been proposed [7,8].
Existing self-supervised learning-based DAD models train a DNN to recognize the geometric transformation applied to an input image, e.g., 2D rotation and geometric translation. Previous studies demonstrated that this straightforward task provides a powerful supervisory signal for semantic feature learning. Consequently, these previous semantic DAD models cannot maintain their robust performance in visual morphological anomaly detection problems. More specifically, to successfully predict the 2D rotation of an image, the DAD model must learn to (1) localize salient objects in the image and (2) recognize their orientation and object type. Subsequently, it must relate the object orientation with the dominant orientation that each object tends to depict within the available "normal" images. However, a DAD model focusing on the semantic features of "normal" images is not suitable for the morphological anomaly detection problem depicted in Figure 1b. This is because, in most visual morphological anomaly detection problems, the "normal" image does not include a salient object, and the discriminant criteria between "abnormal" and "normal" images are defined by small differences in the spatial domain of the image. To address this limitation, in this study, we propose a DAD algorithm that trains a DNN to recognize morphological transformations applied to the input instance, including dilation, erosion, and the morphological gradient. In addition, we propose a novel objective function called the kernel size prediction loss, which leads the proposed DAD model to recognize the window size of the morphological transformation filter from the transformed image only. To define this loss as a classification loss, we define several window sizes, which facilitates the proposed DAD model learning various morphological features from "normal" images.
Although the proposed DAD model learns the morphological features of a "normal" image via self-supervised learning, several challenges remain in the training procedure, including enhancing the discriminative performance and stabilizing the training process. Unlike semantic feature representation-based DAD models, the proposed algorithm must capture versatile and subtle morphological features of "normal" instances to quantify the abnormality of unobserved input instances. To address these limitations, the proposed DAD model adopts an angular margin loss (AML) to augment the softmax loss, which is widely used in previous self-supervised learning-based DAD models [7-9]. The softmax loss is suitable for optimizing the inter-class difference but unable to reduce the intra-class variation. To enhance the discriminative power of the softmax loss, several AMLs have been developed to minimize the intra-class variation. These AMLs force the classification boundary closer to the weight vector of each class and improve the softmax loss by combining various types of margins. Because the proposed DAD model is based on classification tasks, AMLs can easily be combined without any additional process. Therefore, the proposed DAD model has enhanced discriminative performance in morphological feature representation learning because AMLs enforce intra-class compactness and inter-class discrepancy simultaneously.
In essence, the main contributions of this study are as follows:
• A novel deep morphological anomaly detection model based on straightforward morphological transformations and AML is developed. The proposed algorithm can learn the morphological features of "normal" images intensively.
• Because the proposed algorithm is based on self-supervised learning, it represents and extracts salient features with supervised learning, which often guarantees easier convergence and lower computational cost compared to unsupervised learning-based DAD models.
• To combine self-supervised learning and morphological transformations, we propose a novel objective function called the kernel size prediction loss, enabling the DAD model to recognize the morphological filter size of morphologically transformed inputs.
The remainder of the paper is organized as follows. In Section 2, we briefly introduce several DAD models. In Section 3, we describe AMLs. In Section 4, we detail the proposed algorithm through theoretical analysis. In Section 5, we report and discuss the experimental results. Finally, in Section 6, we summarize the study. Notably, this study is an extension of our previous study [9] by combining self-supervised learning and AMLs.

Related Works
This section provides an outline of the popular reconstruction-based DAD and self-supervised learning-based DAD for visual anomaly detection.

Reconstruction-Based DAD
Reconstruction-based methods project a "normal" sample into a lower-dimensional latent space and then reconstruct it to approximate the original input. It is generally assumed that a high reconstruction error can distinguish a "normal" instance from an "abnormal" instance. Schlegl et al. argued that the discriminator of a GAN, pretrained on "normal" samples, projects an "abnormal" sample far from the features of a "normal" instance [11]. Zenati et al. tried to increase the efficiency of the test process by training the encoder and decoder simultaneously using a Bi-GAN structure [12]. Akcay et al. attempted to capture the distribution of "normal" samples by adding an encoder to the existing autoencoder structure and compared the features of the image reconstructed by the autoencoder (AE) [13]. Sabokrou et al. attempted to reconstruct a more realistic image through adversarial learning with discriminators [14]. Akcay et al. exploited adversarial learning through an encoder-decoder network architecture with a skip connection that helps capture the details of images [15]. Gong et al. proposed the memory-guided AE, which stores the characteristics of "normal" instances to limit the powerful generalization capabilities of CNNs [16]. Park et al. introduced loss functions that, unlike the memory-guided AE, guarantee intra-class compactness and inter-class separateness of "normal" instance patterns based on a 2D convolutional AE to increase the efficiency of the memory module [17]. Perera et al. proposed a one-class GAN structure that includes two hostile discriminators and a classifier to ensure that the feature vector of the "normal" instance has a uniform distribution, thereby obtaining a high reconstruction error for "abnormal" instances [18]. Hong et al. proposed a model with a high reconstruction error for "abnormal" data that does not involve complicated processes, such as adversarial learning; instead, the model uses a dispersion loss function that spreads feature vectors in a limited area [19].

Self-Supervised Learning-Based DAD
Self-supervised learning-based methods predict the transformation applied to an image or restore a damaged image, leading to the learning of the semantic features of "normal" instances. Gidaris et al. argued that the semantic characteristics of "normal" instances could be learned by recognizing arbitrary geometric transformations applied to the input image without accessing the original image [8]. Golan et al. trained a model to discriminate dozens of geometric transformations, including horizontal flipping, translations, and rotations, applied to "normal" instances to learn a meaningful representation of "normal" instances [7]. Kim et al. proposed a self-supervised learning method based on morphological transformations, including dilation, erosion, and morphological gradients, to detect irregularities in data within the same semantic category [9]. Other studies have applied self-supervised learning-based methods using restoration. These methods transform the input image and restore it to learn the critical semantic features that enable the discrimination of "normal" and "abnormal" instances. Sabokrou et al. proposed adversarial visual irregularity detection using two models [20]. The first model is an AE that considers irregular objects in the input image as noise and denoises pixel-level irregularities based on the dominant textures of the image. The other model detects irregularities in the image using patch units. These models were trained using adversarial learning. Zavrtanik et al. divided the original image into small patches and randomly deleted and restored them, allowing the network to learn the semantic features of "normal" instances [21]. Fei et al. proposed an attribute restoration network, which erases some attributes from the "normal" instances and forces the network to restore the erased attributes [22].

Angular Margin Loss
In this section, we provide a simple description of the angular loss and the general definition of AML. The proposed algorithm enhances the anomaly detection performance by defining the objective function in the AML format.

Softmax Loss
The most popularly used classification objective function is the softmax loss $L_{softmax}$:

$$L_{softmax} = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{W_{c_i}^{T}x_i}}{\sum_{j=1}^{n}e^{W_{j}^{T}x_i}},\qquad(1)$$

where $x_i \in \mathbb{R}^d$ is the embedded feature of the $i$th instance in the dataset, which belongs to the $c_i$th class, $W_j \in \mathbb{R}^d$ is the $j$th column vector of the weight $W \in \mathbb{R}^{d \times n}$, and $N$ and $n$ are the batch size and the number of classes, respectively. Despite its popularity, the softmax loss does not explicitly optimize the feature embedding to enforce higher similarity for intra-class samples and diversity for inter-class samples.
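As a minimal illustration, the softmax loss above can be computed with numpy on a random batch (a sketch for exposition only, not the paper's training code; shapes follow the notation above):

```python
import numpy as np

def softmax_loss(X, W, labels):
    """Softmax (cross-entropy) loss over a batch.

    X: (N, d) embedded features x_i; W: (d, n) class weight columns W_j;
    labels: (N,) integer class indices c_i.
    """
    logits = X @ W                                    # (N, n) scores W_j^T x_i
    logits -= logits.max(axis=1, keepdims=True)       # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

rng = np.random.default_rng(0)
X, W = rng.normal(size=(4, 8)), rng.normal(size=(8, 3))
loss = softmax_loss(X, W, np.array([0, 1, 2, 0]))
```

The loss is the average negative log-probability assigned to each instance's true class, so it is always non-negative.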

Angular Loss
Because $W^{T}x$ is equal to $\|W\|\|x\|\cos\theta$, the aforementioned softmax objective function can be reformulated as follows:

$$L_{softmax} = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{\|W_{c_i}\|\|x_i\|\cos\theta_{c_i}}}{\sum_{j=1}^{n}e^{\|W_j\|\|x_i\|\cos\theta_j}},\qquad(2)$$

where $\|\cdot\|$ is the $\ell_2$ norm operation and $\theta_j$ is the angle between $W_j$ and $x_i$. To transform the softmax loss $L_{softmax}$ in Equation (2) into an angular loss, following [23-25], we fix $\|W_j\| = 1$ by $\ell_2$ normalization. Then, following [24-27], we $\ell_2$-normalize the feature $x_i$ and rescale it to $r$. This process makes the classifier depend only on the angle between the embedded feature $x_i$ and the weight $W_j$. Therefore, the features are distributed on a hypersphere with a radius of $r$, and the softmax angular loss $L_{angular}$ can be expressed as follows [28]:

$$L_{angular} = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{r\cos\theta_{c_i}}}{\sum_{j=1}^{n}e^{r\cos\theta_j}}.\qquad(3)$$
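The reparameterization above, ℓ2-normalizing each weight column and each feature and then rescaling by r, can be sketched as follows (illustrative; the function name and shapes are our own assumptions):

```python
import numpy as np

def angular_logits(X, W, r=64.0):
    """Map features and weights to r*cos(theta) logits on a radius-r hypersphere.

    X: (N, d) features; W: (d, n) class weight columns.
    """
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)   # l2-normalize each feature
    Wn = W / np.linalg.norm(W, axis=0, keepdims=True)   # l2-normalize each column W_j
    return r * (Xn @ Wn)                                # entries are r * cos(theta_j)
```

Because both factors are unit vectors, every logit is bounded by r in magnitude, and a feature perfectly aligned with a class weight attains exactly r.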

Angular Margin Loss
To enhance its discriminative power, the softmax angular loss can be transformed into an AML $L_{margin}$ as follows:

$$L_{margin} = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{r(\cos(m_1\theta_{c_i}+m_2)-m_3)}}{e^{r(\cos(m_1\theta_{c_i}+m_2)-m_3)}+\sum_{j\neq c_i}e^{r\cos\theta_j}},\qquad(4)$$

where $m_1$, $m_2$, and $m_3$ are the margins of the multiplicative AML (MAML) [23,29], additive AML (AAML) [28], and additive cosine margin loss (ACML) [24,27], respectively. In numerical analysis, these three AMLs aim to enforce intra-class compactness and inter-class diversity by penalizing the target logit. In geometric analysis, however, AAML has a constant linear angular margin throughout the interval, whereas MAML and ACML have only nonlinear angular margins [28]. In essence, the proposed DAD model can enhance its detection performance by utilizing AML, a robust classifier obtained through a straightforward modification of the softmax loss function.
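A hedged sketch of the combined margin penalty on the target logit (the ArcFace-style form with m1, m2, m3; the function and parameter names are our own, not the authors' code):

```python
import numpy as np

def margin_logits(cos_theta, labels, r=64.0, m1=1.0, m2=0.4, m3=0.0):
    """Apply the combined angular margin to the true-class logit:
    cos(m1*theta + m2) - m3 for the true class, cos(theta) for the rest,
    then rescale all logits by r.

    cos_theta: (N, n) cosine similarities; labels: (N,) class indices.
    """
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    out = cos_theta.copy()
    idx = np.arange(len(labels))
    out[idx, labels] = np.cos(m1 * theta[idx, labels] + m2) - m3
    return r * out
```

The margins only shrink the target logit, which forces the network to pull features closer to their class weight vector before the penalized logit can win the softmax.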

Proposed Method
In this section, we describe the proposed deep morphological anomaly detection algorithm, which effectively represents the dominant morphological features of "normal" instances via self-supervised learning and recognizes "abnormal" samples through an enhanced classifier using AML.

Morphological Image Processing
In digital image processing, a mathematical morphology transformation is a mechanism for extracting image components that are useful for representing and describing the shape of regions, such as boundaries, skeletons, and convex hulls [30]. The proposed deep anomaly detection model learns morphological features via three representative morphological transformations: erosion, dilation, and the morphological gradient.

Erosion and Dilation
The erosion at any location (x, y) of image A by a kernel b is the minimum value of A in the region covered by b when the central point of b is at (x, y). If b is an S × T kernel, obtaining the erosion at a pixel requires finding the smallest of the ST values of A included in the S × T region determined by the kernel when its origin is at that point. Mathematically, the erosion is defined as follows:

$$[A \ominus b](x, y) = \min_{(s,t) \in b} A(x+s,\, y+t),\qquad(5)$$

where $[A \ominus b]$ denotes the erosion of A with filter b, A(x, y) denotes the (x, y) pixel in image A, and (s, t) is the (s, t) pixel in filter b. Notably, the origin points in A and b are defined as the top-left corner pixel and the central pixel, respectively. Because the erosion calculates the minimum pixel value of A in every neighborhood of (x, y) coincident with b, it is expected that the size of bright features will be reduced and the size of dark features will be increased. The third column in Figures 2 and 3 shows the eroded images of the "normal" and "abnormal" samples, respectively, in the "tile" class of MVTec. From these figures, it can be seen that the erosion process enlarges the dark features of an image. Additionally, the shape of the features extracted by the morphological transformation depends on the form of the kernel; if a vertical-shaped filter is used, the erosion causes the dark region to enlarge vertically.
In contrast, the dilation at any location (x, y) of image A by a kernel b is the maximum value of A in the region covered by b when the origin of b is at (x, y). The dilation transformation can be defined as follows:

$$[A \oplus b](x, y) = \max_{(s,t) \in b} A(x+s,\, y+t),\qquad(6)$$

where $[A \oplus b]$ denotes the dilation of A with filter b. In contrast to the erosion process, dilation increases the size of bright features and decreases the size of dark features. The second column in Figures 2 and 3 shows the dilated images of "normal" instances and anomalies in the MVTec "tile" class, respectively. It is evident from these images that dilation has the opposite effect of erosion.
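Grayscale erosion and dilation as defined above reduce to windowed minimum and maximum filters; a small numpy sketch (the edge-padding choice and function names are our own):

```python
import numpy as np

def erode(A, s=3, t=3):
    """Grayscale erosion: minimum of A over an s-by-t window (edge-padded)."""
    P = np.pad(A, ((s // 2,), (t // 2,)), mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(P, (s, t))
    return windows.min(axis=(2, 3))

def dilate(A, s=3, t=3):
    """Grayscale dilation: maximum of A over an s-by-t window (edge-padded)."""
    P = np.pad(A, ((s // 2,), (t // 2,)), mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(P, (s, t))
    return windows.max(axis=(2, 3))
```

A single bright pixel grows into a 3×3 bright block under dilation and vanishes under erosion, which matches the bright/dark behavior described in the text.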

Morphological Gradient
To obtain the morphological gradient of an image, dilation and erosion can be used in combination with image subtraction. This operation can be expressed as follows:

$$[A \odot b] = [A \oplus b] - [A \ominus b],\qquad(7)$$

where $\odot$ denotes the morphological gradient operation. Because dilation expands regions in an image, whereas erosion shrinks them, the difference between them highlights the boundaries between areas. The emphasis of edges and the suppression of homogeneous regions in an image is referred to as the "derivative-like" (gradient) effect. The fourth column in Figures 2 and 3 shows the morphological gradient images of the chosen "normal" and "abnormal" images, respectively. From these images, it is evident that this morphological transformation emphasizes the outline of the cracked area. In addition, the outcome of the morphological gradient depends on the shape of the filter.
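The morphological gradient can then be sketched as the difference of the two windowed extrema (illustrative numpy, not the authors' implementation; edge padding is our choice):

```python
import numpy as np

def morph_gradient(A, s=3, t=3):
    """Morphological gradient: dilation minus erosion over an s-by-t window."""
    P = np.pad(A, ((s // 2,), (t // 2,)), mode="edge")
    w = np.lib.stride_tricks.sliding_window_view(P, (s, t))
    return w.max(axis=(2, 3)) - w.min(axis=(2, 3))
```

On a flat image the gradient is zero everywhere, while at a step edge it is nonzero only in the window-wide band around the boundary, which is exactly the edge-emphasizing effect described above.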

Deep Morphological Anomaly Detection via Angular Margin Loss
The proposed algorithm was developed to detect "abnormal" instances based on morphological criteria in the anomaly detection process. To achieve this goal, the model adopts self-supervised learning, which readily guides the DAD model to learn the features of interest, and a CNN architecture, which exhibits high performance in various computer vision tasks. Because, in self-supervised learning, the learned features are determined by the proxy task, we propose a novel objective function that forces the DAD model to represent the dominant morphological features of "normal" instances.
The problem of anomaly detection in images can be defined as follows:

$$f_{DAD}(A) = \begin{cases} 1, & g(A) \geq \lambda, \\ 0, & \text{otherwise}, \end{cases}\qquad(8)$$

where A denotes an input image, $f_{DAD}$ denotes an anomaly detection function that returns 1 if the input instance A belongs to the "normal" class, g(A) is a scoring function that quantifies the normality of the input, and λ is the threshold parameter that controls the recall and precision of the DAD function $f_{DAD}$. Similar to previous algorithms [8,9], the proposed algorithm aims to estimate the scoring function g in the aforementioned equation.
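The decision rule above is a simple threshold on the normality score; a minimal sketch (names are our own):

```python
def f_dad(A, g, lam):
    """Return 1 ("normal") when the normality score g(A) reaches the
    threshold lam, and 0 ("abnormal") otherwise."""
    return 1 if g(A) >= lam else 0
```

Raising lam makes the detector stricter: fewer inputs pass as "normal", trading recall of the normal class for precision, which is the role of λ described above.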

Self-Supervised Learning Using Morphological Transformations
The proposed algorithm estimates the optimal scoring function using self-supervised learning. Therefore, we constructed a self-labeled dataset of images from an initial "normal" dataset; the applied transformation is unknown to the model F. The proposed DAD model F consists of a single feature extractor and four classifiers, similar to a hard-parameter-sharing-based multi-task model. We denote the classifiers in F as F_M, F_W, F_H, and F_R; they are designed to predict the class of morphological transformation M_i, the class of kernel width W_j, the class of kernel height H_o, and the class of geometric rotation R_p, respectively. All classifier objective functions are defined in the AML fashion to enhance the discriminative power. Therefore, the proposed objective function is defined as follows:

$$L = L_M(\theta_M) + L_W(\theta_W) + L_H(\theta_H) + L_R(\theta_R),\qquad(9)$$

where $\theta_M = \{\theta_{M_1}, \ldots, \theta_{M_{|M|}}\}$, $\theta_W = \{\theta_{W_1}, \ldots, \theta_{W_{|W|}}\}$, $\theta_H = \{\theta_{H_1}, \ldots, \theta_{H_{|H|}}\}$, and $\theta_R = \{\theta_{R_1}, \ldots, \theta_{R_{|R|}}\}$ are the sets of angles between the embedded features of F and the given classes in M, W, H, and R, respectively. The AMLs for M, W, H, and R are defined according to the AML definition in Section 3; we report these four objective functions in Appendix A. Through this objective function, the proposed DAD model learns the dominant morphological features of "normal" images. In addition, because the proposed loss function is based on self-supervised learning and AML, the DAD model can easily be trained on "normal" instances in a discriminative manner.
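Constructing the self-labeled dataset amounts to pairing every "normal" image with every combination of the four transformation labels; an illustrative sketch with hypothetical label spaces (the actual class counts and the transformation pipeline are assumptions, not the paper's specification):

```python
import itertools

# Hypothetical label spaces: 4 morphological ops (including identity),
# 3 kernel widths, 3 kernel heights, 4 rotations -> 4*3*3*4 = 144 classes.
M = ["identity", "erosion", "dilation", "gradient"]
W = [3, 5, 7]
H = [3, 5, 7]
R = [0, 90, 180, 270]

def self_label(images):
    """Expand each 'normal' image into one sample per transformation
    combination, tagged with the four labels (m, w, h, r) that the
    classifiers F_M, F_W, F_H, F_R must predict.  Applying the actual
    transformation to the image is omitted here for brevity."""
    dataset = []
    for img in images:
        for m, w, h, r in itertools.product(range(len(M)), range(len(W)),
                                            range(len(H)), range(len(R))):
            dataset.append((img, (m, w, h, r)))
    return dataset

samples = self_label(["img-0"])
```

The labels come for free from the transformation indices, which is what makes the scheme self-supervised: no human annotation is needed.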
At inference time, given an unseen image A, we decide whether it belongs to the "normal" class by first applying each transformation to it and then applying the classifiers to each of the |M||W||H||R| transformed images. Each such application results in an AML response vector of size |M||W||H||R|. The final normality score is defined using the combined log-likelihood of these vectors under an estimated distribution of "normal" AML output vectors.
Subsequently, we define the normality score function g(A). Fix a set of morphological transformations $T = \{T_0, \ldots, T_{|M||W||H||R|-1}\}$ and define $F(T_i(A))$ as the vector of AML responses of the proposed model F applied to the ith transformed image $T_i(A)$. To construct the normality score, we define

$$g(A) = \sum_{i} \log p\big(F(T_i(A)) \mid T_i\big),$$

which is the combined log-likelihood of a transformed image conditioned on each of the applied transformations in T. Following [7], we approximate each conditional distribution by a Dirichlet distribution, where $\alpha_i$ is the Dirichlet parameter, $A \sim P_{real}(A)$, $i \sim \mathrm{Uni}(0, |M||W||H||R| - 1)$, and $P_{real}(A)$ is the real data probability distribution of "normal" images. The primary reason for choosing the Dirichlet distribution is that it is a common choice for distribution approximation when $F(T_i(A))$ resides in the unit $|M||W||H||R| - 1$ simplex. This holds because, in the proposed algorithm, we modified the softmax function only by applying margin penalties (scalars) to the angle θ and its cosine value; hence, the response vectors of the AML and the softmax function do not differ in form. Therefore, the score function g(A) critically captures normality: for two images A and A′, g(A) > g(A′) tends to imply that A is "more normal" than A′.
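A simplified version of this score, skipping the Dirichlet fitting and, as in the simplified variant of [7], summing the log-probability the model assigns to the transformation that was actually applied, can be sketched as follows (illustrative; the matrix layout is our own assumption):

```python
import numpy as np

def normality_score(probs):
    """Simplified normality score: sum over transformations i of
    log p(label = i | T_i(A)).

    probs: (K, K) array whose row i is the model's softmax/AML response
    for the i-th transformed image, so the diagonal holds the probability
    assigned to the correct transformation.
    """
    K = len(probs)
    eps = 1e-12  # guard against log(0)
    return float(np.log(probs[np.arange(K), np.arange(K)] + eps).sum())

# A "normal" image yields confident, correct predictions (high score);
# an "abnormal" one confuses the classifiers (low score).
good = np.eye(4) * 0.97 + 0.01   # near-one-hot rows
bad = np.full((4, 4), 0.25)      # uniform, uncertain rows
```

Under this proxy, confident correct classification of every applied transformation maximizes the score, mirroring the intuition behind the full Dirichlet-based likelihood.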

Experimental Results and Discussion
In this section, deep anomaly detection experiments are performed to verify the performance of the proposed algorithm on several datasets, including MVTec [5], MNIST [10], and Fashion-MNIST [3]. Following [7,9], we learn the representations by transformation prediction from scratch with ResNet-34 [31]. We set the classes in R by following the geometric rotation classes in [7-9]. Additionally, we follow [28] to set the rescale parameter r, the multiplicative angular margin m1, the additive angular margin m2, and the additive cosine margin m3 to 64, 0.9, 0.4, and 0.15, respectively. All experimental results are reported as the area under the receiver operating characteristic curve (AUROC), which is a useful performance metric to measure the quality of the trade-off of g(A) in (8). Because MNIST and Fashion-MNIST are not designed for anomaly detection, we trained the model on only one class as the "normal" class, and the performance was evaluated on the entire test dataset; classes other than the trained class were assumed to be "abnormal." The proposed algorithm was implemented using PyTorch with GPU acceleration. We performed experiments with an RTX 2080Ti 11 GB graphics processing unit and an Intel i7 processor. To validate the experimental results using statistical methods, we conducted all experiments five times and then calculated the average and variance.
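AUROC itself equals the probability that a randomly chosen "normal" sample outscores a randomly chosen "abnormal" one; a minimal numpy sketch by pairwise comparison (for exposition only, not the evaluation code used in the paper):

```python
import numpy as np

def auroc(scores_normal, scores_abnormal):
    """AUROC = P(score of a normal sample > score of an abnormal one),
    computed over all pairs; ties count as 0.5."""
    a = np.asarray(scores_normal, dtype=float)[:, None]
    b = np.asarray(scores_abnormal, dtype=float)[None, :]
    return float(((a > b).sum() + 0.5 * (a == b).sum()) / (a.size * b.size))
```

A perfect detector separates the two score populations (AUROC 1.0), while a detector that ranks them inversely scores 0.0, and indistinguishable scores give 0.5.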

Experimental Results on the MVTec Dataset
MVTec contains 10 object and 5 texture categories for anomaly detection, with 3629 training instances and 1725 test samples [5]. The dataset is composed of "normal" images for training and both "normal" and "abnormal" images with various industrial defects for testing. In Table 1, we present the anomaly detection performance on the MVTec dataset. In comparison with the semantic-feature-based DAD model [7], the proposed algorithm yields higher detection performance. This result demonstrates that extracting salient morphological features from "normal" images significantly increases the detection performance on industrial images. Compared to the results of our previous study [9], the results of this study show that self-supervised learning-based anomaly detection can be improved by merely applying AMLs. In addition, the proposed algorithm outperforms the existing DAD algorithm [32], which leverages pretrained networks (87.9 AUROC). The results on this dataset confirm that the combination of morphological transformations and AML-based self-supervised learning provides satisfactory performance in industrial anomaly detection problems.

To illustrate the difference between the proposed algorithm and the semantic feature-based DAD [7], visual saliency maps obtained using Grad-CAM++ [34] are presented in Figure 4. Grad-CAM++ is a representative interpretable machine learning technique that enables us to visually identify the parts of the input image that most critically influence the CNN's output. The MVTec dataset is divided into two overarching classes: "texture" and "object." Figure 4 shows visualizations using images of the "carpet" class within the texture class and the "cable" class within the object class. The first row of Figure 4 presents the visualizations for the carpet class.
For images corresponding to the texture class, the DAD model must attend to all regions of the input image because the morphological characteristics are relevant to the entire image. Accordingly, the visualizations of the results of the proposed method (shown in the second and fifth columns of Figure 4) confirm that the saliency map covers the input image as a whole. In contrast, the semantic feature-based DAD shows an uneven visual response, with a high saliency score only for the edge region of the image. This is the primary reason the proposed method achieves higher AUROC performance than existing methods on the texture class of the MVTec dataset. Unlike the texture class, subclasses belonging to the object class contain crucial object information in the center of the image. The second row of Figure 4 shows visual representations for the cable class. The proposed method, similar to the output for the carpet class, has a saliency map covering the entire area of the image, but the saliency map is more intense in the vicinity of the object. Conversely, the semantic feature-based DAD has high saliency concentrations on the wires and on the covering parts of the cables. Most notably, in the experiments using the cable class, the proposed method is quite accurate on the "abnormal" image: it has a high saliency score in the area marked as defective. This implies that the proposed method has sufficiently learned various morphological features of the image and proves that it is suitable for industrial anomaly detection problems.

Figure 4. Visual saliency maps obtained using Grad-CAM++ [34]: (first row) the "carpet" class; (second row) the "cable" class; (first column) "normal" images; (second and fifth columns) visualizations of results using the proposed algorithm; (third and sixth columns) visualizations of results using the semantic feature-based algorithm [7]; (fourth column) "abnormal" images.

Experimental Results on MNIST and Fashion-MNIST Datasets
The MNIST dataset contains 10 categories labeled 0 to 9, and Fashion-MNIST contains 10 categories of clothing. As mentioned above, in these experiments, we followed a one-class classification protocol [19]. In addition, we compared the performance of the proposed method with reconstruction-based DAD models, such as AE-, VAE-, and GAN-based algorithms. Tables 2 and 3 present the results on MNIST and Fashion-MNIST, respectively. In the MNIST experiment, the proposed algorithm achieves slightly better performance than the other algorithms because the classification criteria between different digits are based on several morphological characteristics. In the Fashion-MNIST experiments, the proposed algorithm exhibited lower performance for some classes. In particular, because this dataset contains various styles of "coat," a strong semantic feature understanding is required for the DAD model to achieve better performance. Conversely, the proposed algorithm is designed to focus on the salient morphological features of "normal" images to detect small morphological differences between "normal" and "abnormal" instances. Therefore, the proposed algorithm is not appropriate for some classes in Fashion-MNIST. Notably, this phenomenon is not a limitation of the proposed model but rather indicates the necessity of designing a proper DAD model by considering the target domain.

Statistical Analysis of the Results Using the Wilcoxon Signed-Rank Test
To analyze the experimental results with a statistical method, we performed the Wilcoxon signed-rank test and report the results in Table 4. The Wilcoxon signed-rank test is a popular nonparametric statistical test for matched or paired data. This statistical test is based on both the difference scores and the magnitude of the observed differences. In this paper, we define the difference score as follows:

$$d_i = a_i - b_i,$$

where $a_i$ and $b_i$ are the AUROC values of the compared algorithm and the proposed algorithm, respectively, on the $i$th class of a certain dataset. In the Wilcoxon signed-rank test, the hypotheses concern the population median of the difference scores. In this paper, we consider a one-sided test as follows:

Hypothesis 1 (H1). The median difference is zero.

Hypothesis 2 (H2). The median difference is negative.
The test statistic for the Wilcoxon signed-rank test, W, is defined as the smaller of W+ (the sum of the positive ranks) and W− (the sum of the negative ranks). To determine whether the observed test statistic W supports H1 or H2, we must choose W_critical. If W is less than W_critical, we reject H1 in favor of H2. In contrast, if W exceeds W_critical, we do not reject H1. In this paper, to set an appropriate W_critical, we set the level of significance α to 0.05, which is the most common value for the Wilcoxon signed-rank test.
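The statistic W can be computed directly from the difference scores; a small sketch (for brevity, ties in |d_i| are not average-ranked here, unlike the standard procedure):

```python
import numpy as np

def wilcoxon_w(diffs):
    """Wilcoxon signed-rank statistic: drop zero differences, rank the
    remaining |d_i| from smallest to largest, then return
    W = min(W+, W-), the smaller of the signed rank sums."""
    d = np.asarray(diffs, dtype=float)
    d = d[d != 0]                              # zeros carry no sign information
    order = np.abs(d).argsort()
    ranks = np.empty(len(d))
    ranks[order] = np.arange(1, len(d) + 1)    # rank 1 = smallest |d_i|
    w_plus = ranks[d > 0].sum()
    w_minus = ranks[d < 0].sum()
    return min(w_plus, w_minus)
```

For example, differences [1, −2, 3, −4, 5] give ranks 1..5 by magnitude, W+ = 1 + 3 + 5 = 9 and W− = 2 + 4 = 6, so W = 6.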
For the MVTec dataset, the proposed algorithm yields statistically significantly higher performance than the other algorithms. Moreover, on MNIST and Fashion-MNIST, the proposed algorithm performs better than several algorithms. Because MNIST and Fashion-MNIST are relatively easier datasets than the MVTec dataset, the proposed algorithm exhibits performance similar to that of [33]. However, notably, the proposed algorithm was developed to address the morphological anomaly detection problem. Therefore, the Wilcoxon signed-rank test results on the MVTec dataset clearly demonstrate that the proposed algorithm achieves significantly higher performance than [33] in the morphological anomaly detection task.

Conclusions
We proposed a novel DAD model to extract the dominant morphological features of "normal" images. The proposed algorithm is based on a combination of morphological transformations and a self-supervised learning algorithm. In addition, to improve the discriminative power of the proposed DAD model, we adopted an angular margin in the classification loss function. The experiments confirmed that the proposed algorithm achieves higher performance on the industrial anomaly detection dataset than the previous algorithms. In addition, the proposed algorithm yields slightly better performance than the previous reconstruction-based DAD models. The results validate the satisfactory performance of the proposed algorithm in real-world industrial anomaly inspection applications. In the future, we plan to combine the proposed algorithm with a semantic feature-based algorithm.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:

DNN    Deep neural network
CNN    Convolutional neural network
GAN    Generative adversarial network
VAE    Variational autoencoder
AE     Autoencoder
DAD    Deep anomaly detection
AML    Angular margin loss
MAML   Multiplicative angular margin loss
AAML   Additive angular margin loss
ACML   Additive cosine margin loss
AUROC  Area under the receiver operating characteristic curve

Appendix A
The proposed objective function is composed of four AMLs, one for each classifier, following the AML definition in Section 3:

$$L_M = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{r(\cos(m_1\theta_{M_{c_i}}+m_2)-m_3)}}{e^{r(\cos(m_1\theta_{M_{c_i}}+m_2)-m_3)}+\sum_{j\neq c_i}e^{r\cos\theta_{M_j}}},\qquad(A1)$$

$$L_W = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{r(\cos(m_1\theta_{W_{c_i}}+m_2)-m_3)}}{e^{r(\cos(m_1\theta_{W_{c_i}}+m_2)-m_3)}+\sum_{j\neq c_i}e^{r\cos\theta_{W_j}}},\qquad(A2)$$

$$L_H = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{r(\cos(m_1\theta_{H_{c_i}}+m_2)-m_3)}}{e^{r(\cos(m_1\theta_{H_{c_i}}+m_2)-m_3)}+\sum_{j\neq c_i}e^{r\cos\theta_{H_j}}},\qquad(A3)$$

$$L_R = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{r(\cos(m_1\theta_{R_{c_i}}+m_2)-m_3)}}{e^{r(\cos(m_1\theta_{R_{c_i}}+m_2)-m_3)}+\sum_{j\neq c_i}e^{r\cos\theta_{R_j}}},\qquad(A4)$$

and the proposed objective function is their sum:

$$L = L_M + L_W + L_H + L_R.\qquad(A5)$$