# Image-Label Recovery on Fashion Data Using Image Similarity from Triple Siamese Network

## Abstract

## 1. Introduction

- Designing a novel distance metric learning (DML) model that learns a similarity measure between every pair of images in a dataset and builds a graph from the resulting distances.
- Recovering labels for all images from the graph using a label-propagation algorithm.
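
The two steps listed above can be illustrated with a minimal sketch. It assumes that image embeddings are already available (here they are synthetic 2-D points, not the output of a Siamese network), builds a dense Gaussian-kernel graph from pairwise Euclidean distances, and then applies standard harmonic label propagation. This is a generic textbook formulation used for illustration only, not the recover algorithm of this paper; the function names, the kernel width `sigma`, and the toy data are all assumptions.

```python
import numpy as np


def build_similarity_graph(embeddings, sigma=1.0):
    """Dense Gaussian-kernel graph from pairwise Euclidean distances."""
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)            # (N, N) pairwise distances
    weights = np.exp(-(dist ** 2) / (2.0 * sigma ** 2))
    np.fill_diagonal(weights, 0.0)                  # no self-loops
    return weights


def propagate_labels(weights, labels, observed, n_classes):
    """Harmonic label propagation: solve L_uu F_u = W_ul Y_l for unlabeled nodes."""
    unobserved = ~observed
    laplacian = np.diag(weights.sum(axis=1)) - weights
    y_l = np.eye(n_classes)[labels[observed]]       # one-hot labels of observed nodes
    l_uu = laplacian[np.ix_(unobserved, unobserved)]
    w_ul = weights[np.ix_(unobserved, observed)]
    f_u = np.linalg.solve(l_uu, w_ul @ y_l)         # class scores for unlabeled nodes
    recovered = labels.copy()
    recovered[unobserved] = f_u.argmax(axis=1)
    return recovered


# Toy data (assumption): two well-separated clusters standing in for embeddings.
rng = np.random.default_rng(0)
embeddings = np.vstack([
    rng.normal(0.0, 0.3, size=(20, 2)),
    rng.normal(5.0, 0.3, size=(20, 2)),
])
true_labels = np.repeat([0, 1], 20)

# Partial observation: only two labels per cluster are known; -1 marks unknown.
observed = np.zeros(40, dtype=bool)
observed[[0, 1, 20, 21]] = True
labels = np.where(observed, true_labels, -1)

graph = build_similarity_graph(embeddings)
recovered_labels = propagate_labels(graph, labels, observed, n_classes=2)
```

A dense graph is used here only because the toy set is small; for a dataset with thousands of images, a sparse k-nearest-neighbor graph would be the more practical choice.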

## 2. Related Work

## 3. Preliminary Work

## 4. Proposed Methodology

#### 4.1. Distance Metric Learning

#### 4.1.1. Triple Siamese Network

#### 4.1.2. Graph Representation

#### 4.2. Label Propagation

#### 4.2.1. Spectral Graph Theory

#### 4.2.2. Graph Harmonic Analysis

#### 4.2.3. Problem Setup

#### 4.2.4. Recover Algorithm

## 5. Implementation

#### 5.1. Learning

#### 5.2. Validation

#### 5.3. Twin Siamese vs. Triple Siamese Network

## 6. Experiments and Results

#### Results

## 7. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Conflicts of Interest

## Abbreviations

| Abbreviation | Definition |
|---|---|
| AI | Artificial intelligence |
| ML | Machine learning |
| DML | Distance metric learning |
| CNN | Convolutional neural network |
| MSE | Mean square error |
| SSL | Semisupervised learning |
| ReLU | Rectified linear unit |
| GNN | Graph neural network |
| k-NN | k-nearest neighbors |
| MAE | Mean absolute error |
| RA | Recover algorithm |

## References

1. Niculescu-Mizil, A.; Caruana, R. Predicting good probabilities with supervised learning. In Proceedings of the ICML ’05, Bonn, Germany, 7–11 August 2005.
2. Kotsiantis, S.B. Supervised Machine Learning: A Review of Classification Techniques. In Proceedings of the 2007 Conference on Emerging Artificial Intelligence Applications in Computer Engineering: RealWord AI Systems with Applications in EHealth, HCI, Information Retrieval and Pervasive Technologies, Amsterdam, The Netherlands, 14–16 June 2007; pp. 3–24.
3. Loog, M. Supervised Classification: Quite a Brief Overview. arXiv **2017**, arXiv:cs.LG/1710.09230.
4. Stephen, P.; Jaganathan, S. Linear regression for pattern recognition. In Proceedings of the 2014 International Conference on Green Computing Communication and Electrical Engineering (ICGCCEE), Coimbatore, India, 6–8 March 2014; pp. 1–6.
5. Param, A. Fashion Product Images (Small). 2019. Available online: https://www.kaggle.com/paramaggarwal/fashion-product-images-small (accessed on 20 January 2021).
6. Kim, W.H.; Jalal, M.; Hwang, S.; Johnson, S.C.; Singh, V. Online Graph Completion: Multivariate Signal Recovery in Computer Vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017.
7. Van Engelen, J.E.; Hoos, H. A survey on semi-supervised learning. Mach. Learn. **2019**, 109, 373–440.
8. Iscen, A.; Tolias, G.; Avrithis, Y.; Chum, O. Label Propagation for Deep Semi-supervised Learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019; pp. 5070–5079.
9. Puy, G.; Tremblay, N.; Gribonval, R.; Vandergheynst, P. Random sampling of bandlimited signals on graphs. In Proceedings of the NIPS2015 Workshop on Multiresolution Methods for Large Scale Learning, Montréal, QC, Canada, 12 December 2015.
10. Kim, W.H.; Hwang, S.J.; Adluru, N.; Johnson, S.C.; Singh, V. Adaptive Signal Recovery on Graphs via Harmonic Analysis for Experimental Design in Neuroimaging. In Proceedings of the Computer Vision—ECCV 2016—14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Part VI; Lecture Notes in Computer Science. Leibe, B., Matas, J., Sebe, N., Welling, M., Eds.; Springer: Berlin/Heidelberg, Germany, 2016; Volume 9910, pp. 188–205.
11. Bronstein, M.M.; Bruna, J.; LeCun, Y.; Szlam, A.; Vandergheynst, P. Geometric Deep Learning: Going beyond Euclidean data. IEEE Signal Process. Mag. **2017**, 34, 18–42.
12. Malkov, Y.A.; Yashunin, D.A. Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs. IEEE Trans. Pattern Anal. Mach. Intell. **2020**, 42, 824–836.
13. Saito, K.; Kim, D.; Sclaroff, S.; Darrell, T.; Saenko, K. Semi-supervised Domain Adaptation via Minimax Entropy. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea, 27 October–2 November 2019; pp. 8050–8058.
14. Zhai, X.; Oliver, A.; Kolesnikov, A.; Beyer, L. S4L: Self-Supervised Semi-Supervised Learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea, 27 October–2 November 2019; pp. 1476–1485.
15. Bromley, J.; Guyon, I.; LeCun, Y.; Säckinger, E.; Shah, R. Signature Verification using a “Siamese” Time Delay Neural Network. In Advances in Neural Information Processing Systems; Cowan, J., Tesauro, G., Alspector, J., Eds.; Morgan-Kaufmann: Burlington, MA, USA, 1994; Volume 6, pp. 737–744.
16. Fei-Fei, L.; Fergus, R.; Perona, P. One-shot learning of object categories. IEEE Trans. Pattern Anal. Mach. Intell. **2006**, 28, 594–611.
17. Lake, B.M.; Salakhutdinov, R.; Tenenbaum, J.B. Human-level concept learning through probabilistic program induction. Science **2015**, 350, 1332–1338.
18. Koch, G.; Zemel, R.; Salakhutdinov, R. Siamese Neural Networks for One-shot Image Recognition. In Proceedings of the ICML Deep Learning Workshop, Lille Grand Palais, France, 6–11 July 2015.
19. Vinyals, O.; Blundell, C.; Lillicrap, T.; Kavukcuoglu, K.; Wierstra, D. Matching Networks for One Shot Learning. Adv. Neural Inf. Process. Syst. **2016**, 29, 3630–3638.
20. Schroff, F.; Kalenichenko, D.; Philbin, J. FaceNet: A unified embedding for face recognition and clustering. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 8–10 June 2015.
21. Kertész, G. Metric Embedding Learning on Multi-Directional Projections. Algorithms **2020**, 13, 133.
22. Gori, M.; Monfardini, G.; Scarselli, F. A new model for learning in graph domains. In Proceedings of the 2005 IEEE International Joint Conference on Neural Networks, Montreal, QC, Canada, 31 July–4 August 2005; Volume 2, pp. 729–734.
23. Doersch, C. Tutorial on Variational Autoencoders. arXiv **2016**, arXiv:stat.ML/1606.05908.
24. Kipf, T.N.; Welling, M. Variational Graph Auto-Encoders. arXiv **2016**, arXiv:stat.ML/1611.07308.
25. Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative Adversarial Networks. arXiv **2014**, arXiv:stat.ML/1406.2661.
26. Odena, A. Semi-Supervised Learning with Generative Adversarial Networks. arXiv **2016**, arXiv:stat.ML/1606.01583.
27. Salimans, T.; Goodfellow, I.; Zaremba, W.; Cheung, V.; Radford, A.; Chen, X. Improved Techniques for Training GANs. arXiv **2016**, arXiv:cs.LG/1606.03498.
28. Kipf, T.N.; Welling, M. Semi-Supervised Classification with Graph Convolutional Networks. arXiv **2016**, arXiv:1609.02907.
29. Chang, M.B.; Ullman, T.; Torralba, A.; Tenenbaum, J.B. A Compositional Object-Based Approach to Learning Physical Dynamics. arXiv **2016**, arXiv:1612.00341.
30. Duvenaud, D.; Maclaurin, D.; Aguilera-Iparraguirre, J.; Gómez-Bombarelli, R.; Hirzel, T.; Aspuru-Guzik, A.; Adams, R.P. Convolutional Networks on Graphs for Learning Molecular Fingerprints. Adv. Neural Inf. Process. Syst. **2015**, 28, 2224–2232.
31. Kearnes, S.M.; McCloskey, K.; Berndl, M.; Pande, V.S.; Riley, P. Molecular graph convolutions: Moving beyond fingerprints. J. Comput. Aided Mol. Des. **2016**, 30, 595–608.
32. Gilmer, J.; Schoenholz, S.S.; Riley, P.F.; Vinyals, O.; Dahl, G.E. Neural message passing for quantum chemistry. In Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; Volume 70, pp. 1263–1272.
33. Appalaraju, S.; Chaoji, V. Image similarity using Deep CNN and Curriculum Learning. arXiv **2017**, arXiv:1709.08761.
34. Hammond, D.K.; Vandergheynst, P.; Gribonval, R. Wavelets on Graphs via Spectral Graph Theory. App. Comput. Harmonic Anal. **2011**, 30, 129–150.
35. Turk, G.; Levoy, M. Zippered Polygon Meshes from Range Images. In Proceedings of the 21st Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH ’94), Orlando, FL, USA, 24–29 July 1994; pp. 311–318.
36. Frary, R.B.; Cross, L.H.; Lowry, S.R. Random guessing, correction for guessing, and reliability of multiple-choice test scores. J. Exp. Educ. **1977**, 46, 11–15.
37. Hui, G.G.; Wang, H.; Bell, D.; Bi, Y.; Greer, K. KNN Model-Based Approach in Classification. In Proceedings of the OTM Confederated International Conferences On the Move to Meaningful Internet Systems, Catania, Italy, 3–7 November 2003.
38. Tan, C.; Sun, F.; Kong, T.; Zhang, W.; Yang, C.; Liu, C. A Survey on Deep Transfer Learning. In Proceedings of the International Conference on Artificial Neural Networks, Rhodes, Greece, 4–7 October 2018.
39. Xiao, H.; Rasul, K.; Vollgraf, R. Fashion-MNIST: A Novel Image Dataset for Benchmarking Machine Learning Algorithms. arXiv **2017**, arXiv:1708.07747.
40. Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Fei-Fei, L. ImageNet: A Large-Scale Hierarchical Image Database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR09), Miami, FL, USA, 20–25 June 2009.
41. Vielzeuf, V.; Lechervy, A.; Pateux, S.; Jurie, F. CentralNet: A Multilayer Approach for Multimodal Fusion. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018.
42. Settles, B. Active Learning Literature Survey (Computer Sciences Technical Report 1648); University of Wisconsin-Madison: Madison, WI, USA, 2009.

**Figure 4.** Toy example of our framework on GSP bunny (N = 2503). (**a**) Band-limited random signal in [0, 1] with noise, (**b**) sampled signal at m = 600 locations out of 2503, (**c**) recovered signal using our method with k = 500.
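
The setting in the Figure 4 caption (a band-limited signal on N nodes, observed at m sampled locations, and reconstructed from k spectral components) can be sketched on a much smaller graph. The example below is a generic least-squares reconstruction in the span of the first k graph-Laplacian eigenvectors, offered as an illustration under stated assumptions rather than as the recover algorithm of this paper. The ring graph, the sizes `n`, `k`, `m`, the noise-free signal, and all function names are assumptions chosen to keep the example short.

```python
import numpy as np


def laplacian_eigenbasis(weights, k):
    """First k eigenvectors (lowest graph frequencies) of L = D - W."""
    laplacian = np.diag(weights.sum(axis=1)) - weights
    _, eigvecs = np.linalg.eigh(laplacian)    # eigenvalues returned in ascending order
    return eigvecs[:, :k]


def recover_bandlimited(basis, sample_idx, samples):
    """Fit spectral coefficients to the sampled values by least squares."""
    coeffs, *_ = np.linalg.lstsq(basis[sample_idx], samples, rcond=None)
    return basis @ coeffs                     # signal estimate on all nodes


# Toy graph (assumption): an unweighted ring with n nodes.
n, k, m = 200, 11, 60
nodes = np.arange(n)
weights = np.zeros((n, n))
weights[nodes, (nodes + 1) % n] = 1.0
weights[(nodes + 1) % n, nodes] = 1.0

rng = np.random.default_rng(0)
basis = laplacian_eigenbasis(weights, k)

# Band-limited signal: a random combination of the k lowest-frequency modes.
signal = basis @ rng.normal(size=k)

# Observe the signal at m of the n nodes, then reconstruct it everywhere.
sample_idx = rng.choice(n, size=m, replace=False)
recovered = recover_bandlimited(basis, sample_idx, signal[sample_idx])
max_error = np.abs(recovered - signal).max()
```

Because the toy signal is noise-free and lies exactly in the span of the chosen basis, the reconstruction error here is at the level of floating-point round-off; with noisy observations, as in panel (a), the least-squares fit would only approximate the underlying signal.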

**Table 1.** Accuracy comparison of our model with other models for different partial-observation (P.O.) sizes. Note: RA, recover algorithm; k-NN, k-nearest neighbors.

| Models | P.O. 400 Acc | P.O. 400 Var | P.O. 750 Acc | P.O. 750 Var | P.O. 1000 Acc | P.O. 1000 Var | P.O. 1500 Acc | P.O. 1500 Var |
|---|---|---|---|---|---|---|---|---|
| Triple Siamese net and RA (our method) | 60.91% | 29.37 | 81.1% | 22.45 | 83.7% | 31.83 | 86.4% | 27.32 |
| Twin Siamese net and RA [18] | 55.23% | 21.45 | 62.8% | 12.87 | 70.2% | 22.75 | 79.3% | 17.20 |
| Triple Siamese net and k-NN [37] | 20.1% | 13.8 | 41.6% | 26.92 | 50% | 23.66 | 45.5% | 9.25 |
| Graph convolutional network [24] | 12.81% | 15.2 | 17% | 10.01 | 21.8% | 17.21 | 23.7% | 22.90 |

**Table 2.** Precision comparison of our model with other models for different partial-observation (P.O.) sizes.

| Models | P.O. 400 Prec | P.O. 400 Var | P.O. 750 Prec | P.O. 750 Var | P.O. 1000 Prec | P.O. 1000 Var | P.O. 1500 Prec | P.O. 1500 Var |
|---|---|---|---|---|---|---|---|---|
| Triple Siamese net and RA (our method) | 57.32% | 15.4 | 84.6% | 18.67 | 85.5% | 10.49 | 87.6% | 9.41 |
| Twin Siamese net and RA [18] | 52% | 17.77 | 71.4% | 16.33 | 82.7% | 17.71 | 82.8% | 12.58 |
| Triple Siamese net and k-NN [37] | 39.71% | 17.32 | 60.4% | 32.79 | 65.8% | 23.36 | 67.3% | 18.63 |
| Graph convolutional network [24] | 13.32% | 16.51 | 13.3% | 15.62 | 17.4% | 17.49 | 18% | 17.22 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Banerjee, D.; Kyrarini, M.; Kim, W.H. Image-Label Recovery on Fashion Data Using Image Similarity from Triple Siamese Network. *Technologies* **2021**, *9*, 10.
https://doi.org/10.3390/technologies9010010
