Applied Sciences
  • Article
  • Open Access

8 February 2021

Wheel Hub Defects Image Recognition Based on Zero-Shot Learning

1 School of Mechanical Engineering, Jiangsu University, Zhenjiang 212000, China
2 School of Mechanical Engineering, Anyang Institute of Technology, Anyang 455000, China
* Authors to whom correspondence should be addressed.
This article belongs to the Special Issue Artificial Intelligence for Computer Vision

Abstract

In the wheel hub industry, surface quality control determines the subsequent processing and can be realized through hub defect image recognition based on deep learning. Although existing deep-learning methods have reached human-level performance, they rely on large-scale training sets and cannot cope at all with classes for which no samples exist. Therefore, in this paper, a generalized zero-shot learning framework for hub defect image recognition was built. First, a reverse mapping strategy was adopted to reduce the hubness problem; then, a domain adaptation measure was employed to alleviate the projection domain shift problem; finally, a scaling calibration strategy was used to avoid the recognition preference for seen defects. The proposed model was validated on two data sets, the public aPY data set and the self-built hub defect data set, and the results showed that the method performed better than current popular methods.

1. Introduction

The automobile is an indispensable means of transportation in daily life and an important part of the national economy, and the wheel hub is one of its key components. With the development of China's automobile industry, the technical level of wheel hub manufacturers is constantly improving, and product export volume is large and continues to grow. However, due to rapid production growth and imperfect processing technology, more than 40 defect types have been found in wheel hub products. These defects not only harm the appearance and brand image of products but can also lead to serious traffic accidents. Researchers have therefore done related work. To improve the accuracy of wheel hub defect image detection and recognition, Li [1] proposed an improved peak algorithm, the trend peak algorithm, to extract the wheel hub defect area, combined with a BP neural network for recognition. Zhang [2] targeted internal defects such as air holes and shrinkage cavities in low-pressure hub casting with a defect extraction method based on mathematical morphology. Both belong to traditional recognition methods with manually designed features, so their robustness is poor when facing complex samples. Deep learning, in contrast, offers automatic feature extraction, weight sharing, and no need for image preprocessing. For example, Han [3] used Faster R-CNN with ResNet-101 as the target detection algorithm to detect scratches and points on the wheel hub. Sun [4] put forward an improved Faster R-CNN recognition model for multiple types of wheel hub defects, improving the shared network ZFNet into two revised branches (RPN and Fast R-CNN), with which four typical wheel hub defects were identified.
Although deep-learning methods have made progress, existing supervised learning strategies rely on a large amount of labeled data. Some defect samples are scarce or absent altogether, so they cannot meet the requirements of deep network training, which leads to a performance breakdown. There are different partial solutions, such as transfer learning [5], self-taught learning [6], and few-shot learning [7]; however, these methods cannot cope with the zero-sample condition. The setting in which test samples belong to classes never seen during training is called zero-shot learning (ZSL) [8], as shown in Figure 1.
Figure 1. Zero-shot learning process. During training, seen classes have a large number of labeled images while unseen classes have none; however, semantic descriptors for both seen and unseen classes are available, through which unseen classes are identified.
As we know, humans can retain knowledge from past learning tasks and use it to quickly solve new recognition tasks. Specifically, humans can easily identify rare categories using a semantic description and its relationship to seen classes. For example, a person could identify a new species, "penguin", from the description "a flightless bird living in the Antarctic", or "kangaroo" from "an Australian animal with a pouch on its body". Inspired by this, we try to empower machines with the same ability; zero-shot learning is essential to realizing such machine intelligence.
ZSL has broad prospects in the fields of autonomous driving [9], medical imaging analysis [10], intelligent manufacturing [11], robotics [12], etc. In these fields, although it is difficult to obtain new labeled images, advanced semantic descriptions of categories can be easily obtained.
In order to identify unseen classes, usually, a large scale of labeled samples (seen classes) is needed to train the deep model, and then the model is adapted to an unseen one. For ZSL, the seen and unseen classes are associated through a high-dimensional vector space of semantic descriptors. Each class corresponds to a unique semantic descriptor. Semantic descriptors can take the form of manually defined attributes [13] or automatically extracted word vectors [14]. Figure 1 shows the training and testing process of ZSL.
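The identification mechanism described above can be sketched in a few lines. The class names, three-dimensional attribute vectors, and query feature below are invented for illustration only; real descriptors would be the attribute or word vectors the text mentions:

```python
import numpy as np

# Toy illustration: every class, seen or unseen, has a semantic
# descriptor in a shared space. The 3-dim vectors below are invented.
descriptors = {
    "scratch":  np.array([1.0, 0.0, 0.0]),  # seen class
    "dust":     np.array([0.0, 1.0, 0.0]),  # seen class
    "grinning": np.array([0.0, 0.5, 1.0]),  # unseen: no training images
}

def classify(mapped_feature):
    # The nearest descriptor decides the label, so unseen classes are
    # reachable even though they contributed no training images.
    return min(descriptors,
               key=lambda c: np.linalg.norm(mapped_feature - descriptors[c]))

print(classify(np.array([0.1, 0.6, 0.9])))  # → grinning
```

Because the descriptor of "grinning" exists even though no "grinning" image was ever seen, the nearest-descriptor rule can still assign that label.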

3. Wheel Hub Defect Dataset

3.1. Image Data

According to the task requirements, we cooperated with well-known local hub manufacturers, collected hub defect image data on the production line, and labeled the defect types under the guidance of enterprise engineers and according to the enterprises' defect standards.
To eliminate the influence of the background and better extract the features of hub defect images, we automatically cropped the target area and then manually checked the segmented defect images to optimize the database. In this paper, nine kinds of hub defects (oil pollution, grinning, scratch, block, sagging, indentation, orange peel, deformation, and dust) were used to construct WHD-9, as shown in Figure 3.
Figure 3. Images of wheel hub defects.
From Figure 4, we find that the class distribution is extremely uneven. The largest category is oil pollution, and the smallest is orange peel, so a machine learning algorithm finds it difficult to correctly identify the orange peel type. Moreover, completely unknown defect types may appear in the future, which further increases the difficulty of recognition; the study of this topic is therefore very necessary.
Figure 4. Sample distribution in WHD-9. Note: D-1 = block, D-2 = grinning, D-3 = oil pollution, D-4 = scratch, D-5 = orange peel, D-6 = sagging, D-7 = indentation, D-8 = dust, D-9 = deformation.

3.2. Domain Knowledge Base

The semantic attributes of each hub defect differ according to industry product standards. As can be seen from Table 1, a block is a lump on the surface of the paint film; dust consists of white specks of soot that fall from the oven onto the hub surface; oil pollution is caused by mineral oil or grease attached to the metal surface, mostly round and dark in color with obvious protrusions. Accordingly, 16 attributes were selected to describe hub defect images in the semantic attribute space. Each defect type is described by a 16-dimensional vector (A1, ..., A16) encoded in one-hot [41] mode. These 16-dimensional semantic vectors are used to build the domain knowledge base for ZSL.
Table 1. Semantic attributes space for wheel hub defects.
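A toy sketch of this one-hot attribute encoding follows. The real 16 attributes A1–A16 come from the industry standards in Table 1 and are not reproduced here, so the four attribute names and the defect-attribute assignments below are invented placeholders:

```python
import numpy as np

# Hypothetical attribute vocabulary and assignments (not the paper's).
attributes = ["raised", "round", "dark", "white"]
defect_attributes = {
    "block":         ["raised"],
    "dust":          ["white"],
    "oil pollution": ["round", "dark", "raised"],
}

def one_hot_descriptor(defect):
    # Build the binary semantic vector for one defect type.
    vec = np.zeros(len(attributes))
    for a in defect_attributes[defect]:
        vec[attributes.index(a)] = 1.0
    return vec

print(one_hot_descriptor("oil pollution"))  # [1. 1. 1. 0.]
```

Stacking such vectors for all nine defect types would give the 9 × 16 domain knowledge base used by the model.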

4. Method

4.1. Problem Description

The given training data set $D_{tr}$ is composed of $N_{tr}$ hub defect samples, $D_{tr} = \{(x_i, a_i, y_i),\ i = 1, 2, \ldots, N_{tr}\}$. Here, $x_i \in R^{m \times n \times c}$ is the image sample ($m \times n$ is the image size, $c$ is the number of channels), and $a_i \in R^{s}$ is the semantic descriptor of the defect category. Each semantic descriptor $a_i$ is associated with a unique defect label $y_i \in Y_{tr}$. The goal of ZSL is to predict the category label $y_j \in Y_{te}$ for the $j$-th test sample $x_j$. In traditional ZSL, $Y_{tr} \cap Y_{te} = \varnothing$, i.e., there is no overlap between seen and unseen defects. In GZSL, however, the test set includes not only unseen classes but also seen classes, that is, $Y_{tr} \subset Y_{te}$. During the training phase, semantic descriptors for both seen and unseen classes can be used. Since the probability of seen-class defects is much higher than that of unseen-class defects in specific identification tasks, a generalized zero-shot recognition framework was proposed in this paper to realize hub defect identification, as shown in Figure 5.
Figure 5. Generalized ZSR framework for wheel hub defect images.
As can be seen from Figure 5, a multilayer perceptron was adopted for the mapping of semantic descriptors to image feature space. Then the semantics were embedded into the corresponding features by means of one-to-one pairing. The semantics of the unseen class were then embedded to accommodate the test data of the unseen class. To avoid the framework’s preference for seen class recognition, scaling calibration was performed during the test.

4.2. Structure Matching Strategy

For visual-semantic mapping, both forward mapping and common-space mapping methods embed samples and semantic descriptors, and the embedding is learned by minimizing a similarity function between an embedded sample and the corresponding embedded semantic descriptor; they differ only in the choice of embedding method and similarity function. These methods usually classify by nearest-neighbor search after embedding. In a high-dimensional space, however, nearest-neighbor search suffers from the "hubness" phenomenon: a small number of data points become the nearest neighbors of almost all test points, leading to classification errors. A reverse mapping strategy, i.e., mapping from the semantic descriptor space to the visual feature space, effectively avoids the hubness problem; therefore, the reverse mapping strategy was adopted in this paper.
We need to learn a mapping function $f(\cdot)$ from the semantic descriptor $a_i$ to its corresponding image feature $\varphi(x_i)$, where $x_i$ is the image and $\varphi(\cdot)$ is the CNN architecture used to extract a high-dimensional feature map. The mapping function $f(\cdot)$ is a fully connected neural network. To make the mapped descriptors and image features close to each other, a least-squares loss was employed to minimize the difference. The initial objective function $L_1$ is shown in Equation (1):
$L_1 = \frac{1}{N_{tr}} \sum_{i=1}^{N_{tr}} \| f(a_i) - \varphi(x_i) \|_2^2 + \lambda_r\, g(f)$ (1)
where $g(\cdot)$ stands for the regularization loss on $f(\cdot)$. The loss function $L_1$ minimizes the point-to-point difference between mapped semantic descriptors and image features.
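Equation (1) can be written directly in code. In the sketch below, $f(\cdot)$ is reduced to a single linear layer W, random arrays stand in for real descriptors and CNN features, and the regularizer $g(f)$ is taken as simple weight decay; all of these are simplifying assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
N, s, d = 8, 16, 32                  # samples, descriptor dim, feature dim

A = rng.normal(size=(N, s))          # semantic descriptors a_i (stand-ins)
Phi = rng.normal(size=(N, d))        # CNN features phi(x_i) (stand-ins)
W = rng.normal(size=(s, d)) * 0.1    # f(.) reduced to one linear layer

def l1_loss(W, lam=1e-4):
    diff = A @ W - Phi               # f(a_i) - phi(x_i), one row per sample
    fit = np.mean(np.sum(diff**2, axis=1))   # mean squared L2 distance
    reg = lam * np.sum(W**2)                 # g(f): weight decay assumption
    return fit + reg

# One gradient step on W decreases the loss.
grad = (2 / N) * A.T @ (A @ W - Phi) + 2 * 1e-4 * W
print(l1_loss(W), l1_loss(W - 0.01 * grad))
```

In the paper, $f(\cdot)$ is a multilayer perceptron trained by minibatch gradient descent rather than this closed-form linear step.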
To enforce structural matching between the semantic space and the feature space, we minimize the difference between the pairwise class relationships in the two spaces. We constructed a relational matrix $D_a$ for the mapped semantic descriptors, with elements $[D_a]_{uv} = \| f(a_u) - f(a_v) \|_2^2$, where $a_u$ and $a_v$ are the semantic descriptors of seen defect classes $u$ and $v$, respectively. An image-feature relational matrix $D_{\varphi}$ was built analogously, with elements $[D_{\varphi}]_{uv} = \| \bar{\varphi}_u - \bar{\varphi}_v \|_2^2$, where $\bar{\varphi}_u$ and $\bar{\varphi}_v$ are the mean feature vectors of classes $u$ and $v$, calculated from Equation (2):
$\bar{\varphi}_u = \frac{1}{|y_{tr}^u|} \sum_{y_i \in y_{tr}^u} \varphi(x_i)$ (2)
where the mean $\bar{\varphi}_u$ is taken over defect type $u$, and $|y_{tr}^u|$ is the cardinality of the training set of defect type $u$; the same holds for defect type $v$.
To achieve structure alignment, the structure alignment loss function $L_2$ in Equation (3) needs to be minimized:
$L_2 = \| D_a - D_{\varphi} \|_F^2$ (3)
where $\| \cdot \|_F^2$ denotes the squared Frobenius norm. Combining the loss functions $L_1$ and $L_2$ gives the total loss $L_{total}$, as shown in Equation (4):
$L_{total} = L_1 + \rho L_2$ (4)
where $\rho \ge 0$ weights the loss $L_2$. $L_{total}$ is minimized to optimize the parameters of $f(\cdot)$.
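A minimal NumPy sketch of the structure-alignment term of Equations (2)-(4), again with a linear stand-in for $f(\cdot)$ and random data in place of real descriptors and class-mean features:

```python
import numpy as np

rng = np.random.default_rng(1)
K, s, d = 4, 16, 32                          # seen classes, descriptor/feature dims

A = rng.normal(size=(K, s))                  # one descriptor per seen class
class_means = rng.normal(size=(K, d))        # mean feature per class (Eq. 2)
W = rng.normal(size=(s, d))                  # linear stand-in for f(.)

def pairwise_sq_dists(M):
    # [D]_{uv} = ||m_u - m_v||_2^2, computed by broadcasting
    diff = M[:, None, :] - M[None, :, :]
    return np.sum(diff**2, axis=-1)

def l2_loss(W):
    D_a = pairwise_sq_dists(A @ W)           # relations among mapped descriptors
    D_phi = pairwise_sq_dists(class_means)   # relations among class mean features
    return np.sum((D_a - D_phi)**2)          # squared Frobenius norm, Eq. (3)

print(l2_loss(W))  # this term enters the total loss as L1 + rho * L2 (Eq. 4)
```

The relational matrices are symmetric with zero diagonal, so only the off-diagonal class relations carry the alignment signal.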

4.3. Domain Adaption Strategy

After training, a projection domain shift may occur between the mapped semantic descriptors and the image features of the unseen classes, because data from the unseen classes were not used in the training phase, so the regularized model generalizes poorly to them. Therefore, we use test data from the unseen defects to adjust the mapped semantic descriptors to fit the unseen defects.
The mapped descriptors of the unseen classes are stacked vertically into a matrix $A \in R^{n_u \times d}$, where $n_u$ is the number of unseen classes and $d$ is the dimension of the embedding space.
Suppose $U \in R^{o_u \times d}$ is the matrix of test samples from the unseen classes, where $o_u$ is the number of such test samples. To adapt the mapped descriptors, the correspondence between descriptors and test data is represented as a matrix $C \in R^{n_u \times o_u}$; each row of $CU$ should match the corresponding row of $A$, which is achieved by minimizing the loss function in Equation (5):
$L_3 = \| CU - A \|_F^2$ (5)
$L_3$ forces $CU$ to generate adaptive semantic descriptors. However, a sample may still correspond to more than one descriptor in $A$, i.e., effectively to more than one category. To avoid this, an additional group-based regularization function $L_4$ using the group Lasso [42] was introduced in Equation (6):
$L_4 = \sum_j \sum_c \| [C]_{I_c, j} \|_2$ (6)
where $I_c$ is the index set of the rows corresponding to unseen class $c$, so $[C]_{I_c, j}$ is the vector formed from rows $I_c$ of column $j$. Since $C$ is a correspondence matrix, constraints can be applied to correct the imbalance in sample numbers between the semantic space and the feature space for the unseen classes. The domain-adaptive optimization problem can therefore be expressed as Equation (7):
$\min_C \{ L_3 + \lambda_g L_4 \} \quad \text{s.t.} \quad C \ge 0, \quad C \mathbf{1}_{o_u} = \mathbf{1}_{n_u}, \quad C^{\top} \mathbf{1}_{n_u} = \frac{n_u}{o_u} \mathbf{1}_{o_u}$ (7)
where $\lambda_g$ balances the weight of the loss function $L_4$. The above optimization problem is convex and can be solved effectively by the conditional gradient method [43], which requires minimizing a linear function over the constraint set $D = \{ C : C \ge 0,\ C \mathbf{1}_{o_u} = \mathbf{1}_{n_u},\ C^{\top} \mathbf{1}_{n_u} = \frac{n_u}{o_u} \mathbf{1}_{o_u} \}$ as an intermediate step, as shown in Algorithm 1. Using the simplex formulation [44] of the EMD [45], the variable $C_d$ in Algorithm 1 can be obtained easily [46].
Algorithm 1: Conditional Gradient Method
  • Initialize: $C_0 = \frac{1}{n_u o_u} \mathbf{1}_{n_u \times o_u}$, $t = 1$
  • Repeat
  •   $C_d = \arg\min_{C \in D} \mathrm{Tr}\big( \big[\nabla_C (L_3 + \lambda_g L_4)\big]_{C = C_0}^{\top}\, C \big)$
  •   $C_1 = C_0 + \alpha (C_d - C_0)$, for $\alpha = \frac{2}{t+2}$
  •   $C_0 = C_1$ and $t = t + 1$
  • Until convergence
  • Output: $C_0 = \arg\min_{C \in D} \{ L_3 + \lambda_g L_4 \}$
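The sketch below illustrates the conditional gradient (Frank-Wolfe) iteration of Algorithm 1 on the $L_3$ term only and, for brevity, keeps only the row-simplex constraint ($C \ge 0$, rows summing to 1); the paper's full transport-polytope constraint would make the linear step an EMD/simplex solve rather than a per-row argmin. All data are random stand-ins:

```python
import numpy as np

rng = np.random.default_rng(2)
n_u, o_u, d = 3, 6, 8                        # unseen classes, test samples, dim

U = rng.normal(size=(o_u, d))                # unseen-class test features
A = rng.normal(size=(n_u, d))                # mapped unseen-class descriptors

def objective(C):
    return np.sum((C @ U - A)**2)            # L3 = ||CU - A||_F^2 (L4 omitted)

def gradient(C):
    return 2 * (C @ U - A) @ U.T

C = np.full((n_u, o_u), 1.0 / o_u)           # C_0: uniform start, rows sum to 1
start = objective(C)
for t in range(1, 201):
    G = gradient(C)
    # Linear step: over the row simplex, the minimizing vertex puts all
    # mass of each row on its smallest-gradient column.
    C_d = np.zeros_like(C)
    C_d[np.arange(n_u), G.argmin(axis=1)] = 1.0
    C = C + (2.0 / (t + 2.0)) * (C_d - C)    # alpha = 2 / (t + 2)
print(start, "->", objective(C))
```

Each update is a convex combination of feasible points, so the iterate never leaves the constraint set, which is the defining property of the conditional gradient method.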
Algorithm 1 was used to obtain the final solution of the correspondence matrix $C_0$. Given a test sample, we assigned it to the class with the maximum corresponding variable, and did the same for all test samples. The adapted semantic descriptor of each unseen class was then obtained by averaging the features of its assigned samples, and these adaptive descriptors were stacked vertically into the matrix $A'$.

4.4. Recognition Anti-Bias Mechanism for GZSL

In GZSL [47], classification performance obviously tends to favor seen defects. To eliminate this bias, we apply a multiplicative calibration to the classification scores. In this paper, a 1-nearest-neighbor (1-NN) classifier with the Euclidean distance metric (EDM) was used. For a test sample $x$, the classification score of the seen classes is adjusted as shown in Equation (8):
$y = \arg\min_{c \in T} \| x - f(a_c) \|_2 \cdot I[c \in \varphi]$ (8)
where $\varphi$, $U$, and $T$ denote the seen defects, the unseen defects, and the set of all defects, respectively ($\varphi \cup U = T$), and $I[c \in \varphi] = \gamma$ if $c \in \varphi$ and 1 otherwise. The scaling modifies the effective variance of the seen defects. Nearest-neighbor classification with the Euclidean distance implicitly assumes that all classes have equal variance; since the unseen classes are not used when learning the embedding space, changes in their feature characteristics are not accounted for, which is why the EDM is adjusted for seen classes only. If balance between seen and unseen defects is achieved with $\gamma > 1$, the variance of the seen classes was overestimated; conversely, if balance is achieved with $\gamma < 1$, it was underestimated. Algorithm 2 shows the process of the proposed model from training to testing.
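A toy illustration of the calibrated 1-NN rule of Equation (8); the two-dimensional embeddings, class names, and query point are invented so that the effect of $\gamma$ is visible:

```python
import numpy as np

# Toy embeddings f(a_c); values and class names are invented.
embeddings = {
    "scratch": np.array([0.0, 0.0]),   # seen
    "dust":    np.array([2.0, 0.0]),   # seen
    "sagging": np.array([1.2, 0.0]),   # unseen
}
seen = {"scratch", "dust"}

def classify(x, gamma=1.1):
    # Eq. (8): Euclidean distance, multiplied by gamma for seen classes.
    def score(c):
        dist = np.linalg.norm(x - embeddings[c])
        return dist * gamma if c in seen else dist
    return min(embeddings, key=score)

x = np.array([0.55, 0.0])
print(classify(x, gamma=1.0))  # plain 1-NN picks the seen class "scratch"
print(classify(x, gamma=1.3))  # calibration tips it to unseen "sagging"
```

With $\gamma = 1$ the rule is ordinary 1-NN; $\gamma > 1$ penalizes distances to seen classes, letting borderline samples fall to unseen classes.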
Algorithm 2: Three-step Zero-shot Learning Algorithm
  • Input: Training Dataset { x i ,   a i ,   y i } i = 1 N t r
  • Parameters: λ r ,   ρ ,   λ g ,   γ
  • Repeat (Training)
  •   Sample Minibatch of { ( x i , a i ) } pairs
  •   Gradient descent L 1 + ρ L 2 w.r.t. parameters of f ( )
  • Until Convergence (Step 1)
  • Input: Test Dataset { ( x i ) } i = 1 N t e
  •   Apply Algorithm 1 to obtain adapted descriptors of unseen classes A' (Step 2)
  • Repeat for each test point x (Testing)
  •    $y = \arg\min_{c \in T} \| x - f(a_c) \|_2 \cdot I[c \in \varphi]$ (Step 3)
  • Until all test points covered

5. Experiment and Analysis

5.1. Experiment Preparation

Following the common experimental setup, we evaluated the model on two data sets: aPY [48], with 20 seen classes and 12 unseen classes, each associated with a 64-dimensional semantic descriptor, and WHD-9, the self-built wheel hub defect data set, with 6 seen classes and 3 unseen classes, each associated with a 16-dimensional semantic descriptor. The details of both data sets are shown in Table 2.
Table 2. Dataset information.
With respect to the evaluation criteria, we used class-wise accuracy because it avoids dominance by heavily sampled classes. The average per-class accuracy was calculated as in Equation (9):
$acc = \frac{1}{|Y|} \sum_{y=1}^{|Y|} \frac{\text{number of correct predictions for class } y}{\text{total number of samples for class } y}$ (9)
where $|Y|$ is the number of test classes. In the proposed model, the accuracies of seen and unseen classes are computed separately and combined with the harmonic mean $H$ [49], as shown in Equation (10), which ensures that seen-defect performance does not dominate the overall score:
$H = \frac{2 \times acc_s \times acc_u}{acc_s + acc_u}$ (10)
where $acc_s$ and $acc_u$ are the classification accuracies of the seen and unseen classes, respectively. For a fair comparison, we experimented and recorded results on the same training and test splits for both the public data set and the self-built data set.
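Equations (9) and (10) amount to the following few lines; the class labels and predictions below are a made-up example:

```python
import numpy as np

def class_wise_accuracy(y_true, y_pred, classes):
    # Eq. (9): average the per-class accuracies, one term per class.
    accs = []
    for c in classes:
        mask = y_true == c
        accs.append(np.mean(y_pred[mask] == c))
    return float(np.mean(accs))

def harmonic_mean(acc_s, acc_u):
    # Eq. (10): H is low unless both seen and unseen accuracy are high.
    return 2 * acc_s * acc_u / (acc_s + acc_u)

y_true = np.array(["s1", "s1", "s1", "u1", "u1"])
y_pred = np.array(["s1", "s1", "s1", "u1", "s1"])
acc_s = class_wise_accuracy(y_true, y_pred, ["s1"])   # 1.0
acc_u = class_wise_accuracy(y_true, y_pred, ["u1"])   # 0.5
print(harmonic_mean(acc_s, acc_u))                    # 0.666...
```

Note how the harmonic mean (0.67) sits well below the seen accuracy (1.0): a model that ignores unseen classes cannot score well on H.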

5.2. Experiment Results

For the experiment, a two-layer feed-forward neural network for semantic embedding was employed. For the aPY and WHD-9 data sets, the dimensions of the hidden layer were selected as 1600 and 1200 respectively, and the activation function was ReLU. Image features were acquired by ResNet-101.
The proposed method was compared with previous ones. The baseline is the DEM model [35]. The DEM+R model adds only the structural matching component, using loss $L_2$ in the training phase. The DEM+RA model includes the structural matching and domain adaptation components, using losses $L_3$ and $L_4$. The DEM+RAC model includes all three strategies: structural matching, domain adaptation, and anti-bias calibration. The parameters $(\lambda_r, \rho, \lambda_g, \gamma)$ were set to $(10^{-4}, 10^{-1}, 10^{-1}, 1.1)$ for aPY and $(10^{-3}, 10^{-1}, 10^{-1}, 1.1)$ for WHD-9. Table 3 records the class-wise accuracy for the traditional unseen classes (TU), the generalized unseen classes (GU), the generalized seen classes (GS), and the generalized harmonic mean (H).
Table 3. Experiment results based on aPY and WHD-9.
As shown in Table 3, the proposed method is more effective than previous popular methods, and compared with the baseline model, its harmonic mean is significantly higher. The performance improvement can be attributed to the three-step strategy. For both data sets, using structural matching alone (DEM+R) yielded better performance than the baseline, with 22.7% (aPY) and 8.7% (WHD-9) improvements. Additionally using domain adaptation (DEM+RA) gave much better results than DEM+R, with increases of 62.3% (aPY) and 51.2% (WHD-9), but DEM+RAC with the calibration component produced only marginal improvements, 3.5% (aPY) and 1.2% (WHD-9). This is because the relational-matrix-based component produces class-specific adaptation (DEM+RA) of the semantic embeddings of the unseen classes, whereas the calibration component is not class-specific and only distinguishes seen from unseen classes, so its smaller effect is understandable.

5.3. Experiment Analysis

(1)
Analysis of structure matching components
The effect of structure matching was analyzed by varying $\rho \in \{10^{-3}, 10^{-2}, 10^{-1}, 10^{0}, 10^{1}, 10^{2}\}$ and recording the change in accuracy. We conducted these experiments on WHD-9; the results are shown in Figure 6.
Figure 6. Accuracy versus $\rho$ on WHD-9; the baseline model (DEM) is shown in red, and TU accuracy (a), GS accuracy (b), GU accuracy (c), and H (d) are shown in blue.
It can be seen from Figure 6 that, on WHD-9, the traditional unseen class accuracy (Figure 6a) and the generalized seen class accuracy (Figure 6b) are better than or similar to the baseline DEM, the generalized unseen class accuracy (Figure 6c) is higher than that of DEM, and the harmonic mean (Figure 6d) is significantly better than that of DEM. Figure 6d also shows that the harmonic mean is best when $\rho = 10^{-1}$.
Compared with the baseline DEM, we verified that the DEM+R strategy contributes to hubness reduction. Hubness is measured by the skewness of the 1-nearest-neighbor count histogram (N1) [50]; a smaller skewness means less hubness in the predictions. We used test samples of unseen classes in the generalized setting, as shown in Table 4.
Table 4. Hubness reduction using structure matching strategy.
In Table 4, the WHD-9 and aPY data sets are used for experiments with DEM and DEM+R, with $\rho = 0.1$; the average over multiple runs is recorded. As can be seen, the skewness of the N1 histogram generated by DEM+R is smaller on both data sets, which means the additional structural matching strategy reduces hubness and thereby alleviates the curse of dimensionality.
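The N1 skewness measure can be sketched as follows; here the targets and queries are random stand-ins for mapped class embeddings and unseen-class test features:

```python
import numpy as np

rng = np.random.default_rng(3)

def skewness(x):
    # Sample skewness: third central moment over cubed std deviation.
    x = np.asarray(x, dtype=float)
    mu, sd = x.mean(), x.std()
    return float(np.mean((x - mu)**3) / sd**3)

def n1_skewness(targets, queries):
    # N_1(t): how often target t is some query's 1-nearest neighbour.
    # A strongly right-skewed N_1 histogram means a few "hub" targets
    # attract almost all queries.
    d = np.linalg.norm(queries[:, None, :] - targets[None, :, :], axis=-1)
    counts = np.bincount(d.argmin(axis=1), minlength=len(targets))
    return skewness(counts)

targets = rng.normal(size=(20, 50))   # stand-in for mapped class embeddings
queries = rng.normal(size=(200, 50))  # stand-in for test features
print(n1_skewness(targets, queries))
```

Comparing this statistic for the embeddings produced by DEM and DEM+R reproduces the style of measurement reported in Table 4.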
(2)
Domain adaptive component analysis
As can be seen from Table 3, compared with DEM+R, the unseen class accuracy of DEM+RA increased by 88% (aPY) and 101% (WHD-9), and the harmonic mean increased by 62.3% (aPY) and 51.2% (WHD-9). Figure 7 visualizes the effect of domain adaptation using t-SNE [51] on WHD-9. As shown in Figure 7a, without adaptation the unseen class semantic embeddings (purple) lie very near the seen class features (blue). After the domain adaptation step, as shown in Figure 7b, the unseen class semantics are shifted clearly toward the centers of the unseen class feature clusters (red).
Figure 7. t-SNE diagram of embedded samples. (a) Domain-free adaptation for WHD-9; (b) Domain adaptive operation is adopted. The image features of seen and unseen classes are represented in blue and red respectively. Embedded semantic descriptors for seen and unseen classes are represented in yellow and purple respectively.
(3)
Analysis of Calibration Mechanism
In Table 3, compared with the DEM+RA model, the GU performance of the DEM+RAC model improved by 11.4% (aPY) and 2.9% (WHD-9), and H improved by 3.6% (aPY) and 1.2% (WHD-9). The likely reason is that the domain adaptation step has already shifted the semantic embeddings of the unseen classes and correspondingly shrunk the bias toward seen classes, so the further calibration yields only modest gains.

6. Conclusions

In this paper, a wheel hub defect data set (WHD-9) was first built (image collection and domain knowledge expression). Second, a generalized zero-shot recognition framework for wheel hub defect images was proposed, a three-step method: Step 1, structural matching; Step 2, domain adaptation; Step 3, calibration of classification scores. The model was validated on the public data set aPY and the self-built WHD-9. The experimental results show that the proposed three-step strategy substantially outperforms previous methods: Step 1 significantly reduces the hubness problem, Step 2 largely eliminates the projection domain shift, and Step 3 slightly decreases the seen class bias in generalized recognition. Among them, the improvement from Step 2 is the most obvious.

Author Contributions

Conceptualization, X.S.; Data curation, X.S., Y.M.; Formal analysis, X.S., H.S.; Funding acquisition, J.G., M.W.; Investigation, X.S., Y.M.; Methodology, X.S.; Resources, X.S., H.S.; Software, X.S., H.S.; Supervision, J.G., M.W.; Writing—original draft, X.S.; Writing—review & editing, X.S., J.G., M.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (No. 51875266), Key projects of Science and Technology of Henan Province (No. 202102110114).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to confidentiality requirements of the company that provided the defect images.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Li, W.; Li, K.; Huang, Y.; Deng, X. A New Trend Peak Algorithm with X-ray Image for Wheel Hubs Detection and Recognition. In ISICA 2015: Computational Intelligence and Intelligent Systems; Springer: Singapore, 2015; pp. 23–31. [Google Scholar]
  2. Zhang, J.; Hao, L.; Jiao, T.; Que, L.; Wang, M. Mathematical morphology approach to internal defect analysis of a356 aluminum alloy wheel hubs. AIMS Math. 2020, 5, 3256–3273. [Google Scholar] [CrossRef]
  3. Han, K.; Sun, M.; Zhou, X.; Zhang, G.; Dang, H.; Liu, Z. A new method in wheel hub surface defect detection: Object detection algorithm based on deep learning. In Proceedings of the International Conference on Advanced Mechatronic Systems, Xiamen, China, 6–9 December 2017; pp. 335–338. [Google Scholar]
  4. Sun, X.; Gu, J.; Huang, R.; Zou, R.; Palomares, B. Surface defects recognition of wheel hub based on improved Faster R-CNN. Electronics 2019, 8, 481. [Google Scholar] [CrossRef]
  5. Han, D.; Liu, Q.; Fan, W. A new image classification method using cnn transfer learning and web data augmentation. Expert Syst. Appl. 2018, 95, 43–56. [Google Scholar] [CrossRef]
  6. Jie, Z.; Wei, Y.; Jin, X.; Feng, J.; Liu, W. Deep self-taught learning for weakly supervised object localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 4294–4302. [Google Scholar]
  7. Ren, Z.; Zhu, Y.; Yan, K.; Chen, K.; Kang, W.; Yue, Y. A novel model with the ability of few-shot learning and quick updating for intelligent fault diagnosis. Mech. Syst. Signal Proc. 2020, 138, 106608.1–106608.21. [Google Scholar] [CrossRef]
  8. Luo, C.; Li, Z.; Huang, K.; Feng, J.; Wang, M. Zero-shot learning via attribute regression and class prototype rectification. IEEE Trans. Image Process. 2018, 27, 637–648. [Google Scholar] [CrossRef] [PubMed]
  9. Yu, L.; Feng, L.; Qian, Y.; Liu, W.; Hauptmann, A. Zero-VIRUS *: Zero-shot Vehicle Route Understanding System for Intelligent Transportation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Seattle, WA, USA, 14–19 June 2020; pp. 2534–2543. [Google Scholar]
  10. Wen, G.; Ma, J.; Hu, Y.; Li, H.; Jiang, L. Grouping attributes zero-shot learning for tongue constitution recognition. Artif. Int. Med. 2020, 109, 101951. [Google Scholar] [CrossRef]
  11. Gao, Y.; Gao, L.; Li, X.; Zheng, Y. A zero-shot learning method for fault diagnosis under unknown working loads. J. Int. Manuf. 2020, 31, 899–909. [Google Scholar] [CrossRef]
  12. Miguel, L.; Lin, D.; Guntupalli, J.; George, D. Beyond imitation: Zero-shot task transfer on robots by learning concepts as cognitive programs. Sci. Robot. 2019, 4, 1–16. [Google Scholar] [CrossRef]
  13. Marieke, M.; Mirjam, M.; Jerzy, B.; Rainer, G.; Bandettini, P.; Nikolaus, K. Human object-similarity judgments reflect and transcend the primate-it object representation. Front. Psychol. 2013, 4, 128. [Google Scholar]
  14. Lee, J.; Marzelli, M.; Jolesz, F.; Yoo, S. Automated classification of fMRI data employing trial-based imagery tasks. Med. Image Anal. 2009, 13, 392–404.
  15. Wang, G.; Forsyth, D. Joint learning of visual attributes, object classes and visual saliency. In Proceedings of the IEEE 12th International Conference on Computer Vision, Kyoto, Japan, 29 September–2 October 2009; pp. 537–544.
  16. Coates, A.; Ng, A. Selecting receptive fields in deep networks. In Proceedings of the 24th International Conference on Neural Information Processing Systems, Granada, Spain, 12–14 December 2011; pp. 2528–2536.
  17. Bengio, Y.; Courville, A.; Vincent, P. Representation learning: A review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 1798–1828.
  18. Zou, X.; Zhou, L.; Li, K.; Ouyang, A.; Chen, C. Multi-task cascade deep convolutional neural networks for large-scale commodity recognition. Neural Comput. Appl. 2020, 32, 5633–5647.
  19. Zhao, T.; Chen, Q.; Kuang, Z.; Yu, J.; Zhang, W.; He, M. Deep mixture of diverse experts for large-scale visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 41, 1072–1087.
  20. Leong, M.; Prasad, D.; Lee, Y.; Lin, F. Semi-CNN architecture for effective spatio-temporal learning in action recognition. Appl. Sci. 2020, 10, 557.
  21. Zhang, J.; Chen, Y.; Zhai, Y. Zero-shot classification based on word vector enhancement and distance metric learning. IEEE Access 2020, 8, 102292–102302.
  22. Wang, F.; Liu, H.; Sun, F. Fabric recognition using zero-shot learning. Tsinghua Sci. Technol. 2019, 24, 3–11.
  23. Jangra, M.; Dhull, S.; Singh, K. ECG arrhythmia classification using modified visual geometry group network (MVGGNet). J. Intell. Fuzzy Syst. 2020, 38, 1–15.
  24. Lu, Z.; Jiang, X.; Kot, C. Deep coupled ResNet for low-resolution face recognition. IEEE Signal Process. Lett. 2018, 526–530.
  25. Tang, J.; Zha, Z.J.; Tao, D.; Chua, T. Semantic-gap-oriented active learning for multilabel image annotation. IEEE Trans. Image Process. 2012, 21, 2354–2360.
  26. Amstel, V.; Brand, V.; Protic, Z.; Verhoeff, T. Transforming process algebra models into UML state machines: Bridging a semantic gap? Lect. Notes Comput. Sci. 2008, 5063, 61–75.
  27. Saju, A.; Bella Mary, I.T.; Vasuki, A.; Lakshmi, P. Reduction of semantic gap using relevance feedback technique in image retrieval system. Appl. Digital Inf. Web Technol. 2014, 148–153.
  28. Chang, D.; Cho, G.; Choi, Y. Zero-shot recognition enhancement by distance-weighted contextual inference. Appl. Sci. 2020, 10, 7234.
  29. Weston, J.; Hamel, P. Multi-tasking with joint semantic spaces for large-scale music annotation and retrieval. J. New Music Res. 2011, 40, 337–348.
  30. Liu, G.Z. Semantic vector space model: Implementation and evaluation. J. Assoc. Inf. Sci. Technol. 2010, 48, 395–417.
  31. Dill, S.; Eiron, N.; Gibson, D.; Gruhl, D.; Zien, J.Y. A case for automated large-scale semantic annotation. J. Web Semant. 2003, 1, 115–132.
  32. Snel. Digital shoreline analysis system (DSAS) version 4.3 transects with end-point rate calculations for sheltered shorelines between the Colville River delta and Point Barrow for the time period 1947 to 2005. Nephrol. Dial. Transplant. 2012, 27, 3664.
  33. Krishnan, N.; Cook, D.J.; Wemlinger, Z. Learning a taxonomy of predefined and discovered activity patterns. J. Ambient Intell. Smart Environ. 2013, 5, 621–637.
  34. Allen, D. Automatic one-hot re-encoding for FPLs. In Proceedings of the Second International Workshop on Field Programmable Logic and Applications, Vienna, Austria, 31 August–2 September 1992; pp. 71–77.
  35. Liu, Z.H.; Cui, L.; Zhang, T.; Liu, Q.; Chen, E.Y.; Yang, K.M. Three dimensional mapping of ground deformation disasters caused by underground coal mining. Adv. Mater. Res. 2012, 524, 503–507.
  36. Radovanovic, M.; Nanopoulos, A.; Ivanovic, M. Hubs in space: Popular nearest neighbors in high-dimensional data. J. Mach. Learn. Res. 2010, 11, 2487–2531.
  37. Shigeto, Y.; Suzuki, I.; Hara, K.; Shimbo, M.; Matsumoto, Y. Ridge regression, hubness, and zero-shot learning. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Porto, Portugal, 7–11 September 2015; pp. 135–151.
  38. Fu, Y.; Hospedales, T.; Xiang, T.; Gong, S. Transductive multi-view zero-shot learning. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 2332–2345.
  39. Das, D.; George, C. Sample-to-sample correspondence for unsupervised domain adaptation. Eng. Appl. Artif. Intell. 2018, 73, 80–91.
  40. Chao, W.; Changpinyo, S.; Gong, B.; Sha, F. An empirical study and analysis of generalized zero-shot learning for object recognition in the wild. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; pp. 52–68.
  41. Chren, W.A. One-hot residue coding for low delay-power product CMOS design. IEEE Trans. Circuits Syst. II Analog Digit. Signal Process. 1998, 45, 303–313.
  42. Juan, A.; Mateos, G.; Giannakis, G. Group-lasso on splines for spectrum cartography. IEEE Trans. Signal Process. 2011, 59, 4648–4663.
  43. Bredies, K.; Lorenz, D.; Maass, P. A generalized conditional gradient method and its connection to an iterative shrinkage method. Comput. Optim. Appl. 2009, 42, 173–193.
  44. Shek, E.; Ghani, M.; Jones, R. Simplex search in optimization of capsule formulation. J. Pharm. Sci. 2015, 69, 1135–1142.
  45. Fu, A.; Wenyin, L.; Deng, X. Detecting phishing web pages with visual similarity assessment based on Earth mover's distance (EMD). IEEE Trans. Dependable Secur. Comput. 2006, 3, 301–311.
  46. Bonneel, N.; Panne, M.; Paris, S.; Heidrich, W. Displacement interpolation using Lagrangian mass transport. ACM Trans. Graph. 2011, 30, 158.
  47. Sun, L.; Song, J.; Wang, Y.; Li, B. Cooperative coupled generative networks for generalized zero-shot learning. IEEE Access 2020, 8, 119287–119299.
  48. Farhadi, A.; Endres, I.; Hoiem, D.; Forsyth, D. Describing objects by their attributes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, USA, 20–25 June 2009; pp. 1778–1785.
  49. Caldwell, A.; Eller, P.; Hafych, V.; Schick, R.; Szalay, M. Integration with an adaptive harmonic mean algorithm. Int. J. Mod. Phys. A 2020, 35, 1950142.
  50. Schnitzer, D.; Flexer, A.; Schedl, M.; Widmer, G. Local and global scaling reduce hubs in space. J. Mach. Learn. Res. 2012, 13, 2871–2902.
  51. Van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
