# Garment Categorization Using Data Mining Techniques


## Abstract


## 1. Introduction

## 2. Research Background

## 3. Machine Learning Algorithms for Garment Classification

#### 3.1. Naïve Bayes (NB) Classification

^{th} feature. Based on the features, the object can be classified into a class $c_i$ in $C = (c_1, c_2, \dots, c_w)$. Therefore, according to Bayes' theorem [51],
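As a sketch of the relation this sentence introduces, assume the object is described by a feature vector $X = (x_1, x_2, \dots, x_n)$; the symbols $X$ and $n$ are assumptions made here for illustration. Bayes' theorem then gives the posterior probability of class $c_i$, and the naïve conditional-independence assumption factorizes the likelihood:

```latex
P(c_i \mid X) = \frac{P(X \mid c_i)\, P(c_i)}{P(X)},
\qquad
P(X \mid c_i) = \prod_{j=1}^{n} P(x_j \mid c_i)
```

The predicted class is the $c_i \in C$ that maximizes $P(c_i)\prod_{j} P(x_j \mid c_i)$, since $P(X)$ is the same for every class.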

#### 3.2. Decision Trees (DT)

#### 3.3. Random Forest (RF)

#### 3.4. Bayesian Forest (BF)

## 4. Research Methodology

#### 4.1. Tools and Dataset

- List of garment sub-categories tagged in the images along with the corresponding garment categories.
- List of 289,222 image names with the corresponding garment category (upper, lower, whole).
- List of garment attributes containing the attribute name (A-line, long-sleeve, zipper, etc.) and the corresponding attribute type.
- List of 289,222 image names with 1000 columns, one per garment attribute, indicating the presence or absence of that attribute in the image with a value from (−1, 0, 1).
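A minimal sketch of how the image-by-attribute list and the image-by-category list might be held together in memory, using pandas. The image names, the three attribute columns, and every value below are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical miniature of the image-by-attribute table: one row per image,
# one column per attribute, each cell coded as -1, 0, or 1.
attributes = pd.DataFrame(
    {
        "a-line": [1, -1, 0, -1],
        "long-sleeve": [-1, 1, 1, -1],
        "zipper": [0, -1, 1, -1],
    },
    index=["img_0001.jpg", "img_0002.jpg", "img_0003.jpg", "img_0004.jpg"],
)

# Hypothetical category labels (upper, lower, whole) for the same images.
categories = pd.Series(
    ["whole", "upper", "upper", "lower"], index=attributes.index, name="category"
)

# Join attributes and category on the image name to get one modelling table.
dataset = attributes.join(categories)

# Count how often each code (-1, 0, 1) occurs in every attribute column.
value_counts = (
    attributes.apply(lambda col: col.value_counts()).fillna(0).astype(int)
)
print(dataset)
print(value_counts)
```

Keeping the image name as the index makes the join between the attribute list and the category list explicit, which is the integration step a table like this would need before modelling.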

#### 4.2. Data Preprocessing

#### 4.2.1. Data Extraction, Cleaning, and Integration

#### 4.2.2. Feature Selection and Data Reduction

- $p\left(t\right)$ is the proportion $N_t / N$ of samples reaching $t$.
- $\Delta g$ is the impurity function, i.e., Gini importance or mean decrease Gini.
- $v\left({s}_{t}\right)$ is the feature used in the split ${s}_{t}$.
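These symbols match the mean-decrease-impurity importance described by Louppe et al. As a sketch, assuming a forest of $N_T$ trees and writing $X_m$ for the feature being scored (both symbols are assumptions introduced here), the importance would read:

```latex
\mathrm{Imp}(X_m) = \frac{1}{N_T} \sum_{T} \; \sum_{t \in T \,:\, v(s_t) = X_m} p(t)\, \Delta g(s_t, t)
```

Each split made on $X_m$ contributes its impurity decrease weighted by the share of samples reaching that node, and the contributions are averaged over the trees.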

#### 4.3. Model Building

#### 4.3.1. Development of Subsystems

#### Model Development

- The dataset was randomly split into k (=10) equal-size partitions.
- Of the k partitions, one was reserved as the test dataset for the final evaluation of the model, while the other k−1 partitions were used for model training.
- The process was repeated for each model and machine learning technique k times with each of the k-partitions used exactly once as the test data.
- The k results acquired from each of the test partitions were combined by averaging them, to produce a single estimation.
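The four steps above can be sketched with scikit-learn, which is an assumed library choice here. The data are a synthetic stand-in, and the decision tree is only one of the four techniques; the same loop would apply to the others.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data; the real attribute matrix is not reproduced here.
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=8, n_classes=3, random_state=0
)

k = 10
# Random split into k equal-size partitions.
folds = KFold(n_splits=k, shuffle=True, random_state=0)

fold_accuracies = []
test_sizes = []
for train_idx, test_idx in folds.split(X):
    # k-1 partitions train the model; the remaining one tests it.
    model = DecisionTreeClassifier(min_samples_leaf=3, random_state=1000)
    model.fit(X[train_idx], y[train_idx])
    fold_accuracies.append(model.score(X[test_idx], y[test_idx]))
    test_sizes.append(len(test_idx))

# Average the k per-fold results into a single estimate.
mean_accuracy = float(np.mean(fold_accuracies))
print(f"mean accuracy over {k} folds: {mean_accuracy:.4f}")
```

Because every sample lands in exactly one test partition, the averaged score uses all of the data for evaluation without ever testing a model on samples it was trained on.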

#### Evaluation

#### 4.3.2. Integration of Subsystems

#### Model Development

#### Evaluation

## 5. Experimentation and Results

#### 5.1. Analysis of Subsystems

#### 5.2. Analysis of the Integrated System

## 6. Conclusions

## Supplementary Materials

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## References

1. Guleria, P.; Sood, M. Big data analytics: Predicting academic course preference using hadoop inspired mapreduce. In Proceedings of the 2017 4th International Conference on Image Information Processing, ICIIP 2017, Shimla, India, 21–23 December 2017; pp. 328–331.
2. Hsu, C.-H. Data mining to improve industrial standards and enhance production and marketing: An empirical study in apparel industry. Expert Syst. Appl. **2009**, 36, 4185–4191.
3. Lu, Q.; Lyu, Z.-J.; Xiang, Q.; Zhou, Y.; Bao, J. Research on data mining service and its application case in complex industrial process. In Proceedings of the IEEE International Conference on Automation Science and Engineering, Xi’an, China, 20–23 August 2017; pp. 1124–1129.
4. Buluswar, M.; Campisi, V.; Gupta, A.; Karu, Z.; Nilson, V.; Sigala, R. How Companies are Using Big Data and Analytics; McKinsey & Company: San Francisco, CA, USA, 2016.
5. Xiao, H.; Rasul, K.; Vollgraf, R. Fashion-MNIST: A Novel Image Dataset for Benchmarking Machine Learning Algorithms. arXiv **2017**, arXiv:1708.07747.
6. Adhikari, S.S.; Singh, S.; Rajagopal, A.; Rajan, A. Progressive Fashion Attribute Extraction. arXiv **2019**, arXiv:1907.00157.
7. Zielnicki, K. Simulacra and Selection: Clothing Set Recommendation at Stitch Fix. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, Paris, France, 18 July 2019; pp. 1379–1380.
8. Griva, A.; Bardaki, C.; Pramatari, K.; Papakiriakopoulos, D. Retail business analytics: Customer visit segmentation using market basket data. Expert Syst. Appl. **2018**, 100, 1–16.
9. Juaneda-Ayensa, E.; Mosquera, A.; Murillo, Y.S. Omnichannel Customer Behavior: Key Drivers of Technology Acceptance and Use and Their Effects on Purchase Intention. Front. Psychol. **2016**, 7, 1117.
10. Manfredi, M.; Grana, C.; Calderara, S.; Cucchiara, R. A complete system for garment segmentation and color classification. Mach. Vis. Appl. **2014**, 25, 955–969.
11. Ghani, R.; Probst, K.; Liu, Y.; Krema, M.; Fano, A. Text mining for product attribute extraction. ACM SIGKDD Explor. Newsl. **2006**, 8, 41–48.
12. Gill, S. A review of research and innovation in garment sizing, prototyping and fitting. Text. Prog. **2015**, 47, 1–85.
13. Kausher, H.; Srivastava, S. Developing Structured Sizing Systems for Manufacturing Ready-Made Garments of Indian Females Using Decision Tree-Based Data Mining. Int. J. Mater. Text. Eng. **2019**, 13, 571–575.
14. Zakaria, N.; Ruznan, W.S. Developing apparel sizing system using anthropometric data: Body size and shape analysis, key dimensions, and data segmentation. In Anthropometry, Apparel Sizing and Design; Elsevier: Amsterdam, The Netherlands, 2020; pp. 91–121.
15. Pei, J.; Park, H.; Ashdown, S.P. Female breast shape categorization based on analysis of CAESAR 3D body scan data. Text. Res. J. **2019**, 89, 590–611.
16. Liu, K.; Zeng, X.; Bruniaux, P.; Wang, J.; Kamalha, E.; Tao, X. Fit evaluation of virtual garment try-on by learning from digital pressure data. Knowl.-Based Syst. **2017**, 133, 174–182.
17. Lagė, A.; Ancutienė, K. Virtual try-on technologies in the clothing industry: Basic block pattern modification. Int. J. Cloth. Sci. Technol. **2019**, 31, 729–740.
18. Zakaria, N.; Taib, J.S.M.N.; Tan, Y.Y.; Wah, Y.B. Using data mining technique to explore anthropometric data towards the development of sizing system. In Proceedings of the International Symposium on Information Technology, Kuala Lumpur, Malaysia, 26–28 August 2008; Volume 2.
19. Hsu, C.H.; Wang, M.J.J. Using decision tree-based data mining to establish a sizing system for the manufacture of garments. Int. J. Adv. Manuf. Technol. **2005**, 26, 669–674.
20. Thomassey, S. Sales forecasts in clothing industry: The key success factor of the supply chain management. Int. J. Prod. Econ. **2010**, 128, 470–483.
21. Zhang, Y.; Zhang, C.; Liu, Y. An AHP-Based Scheme for Sales Forecasting in the Fashion Industry. In Analytical Modeling Research in Fashion Business; Springer: Singapore, 2016; pp. 251–267.
22. Craparotta, G.; Thomassey, S. A Siamese Neural Network Application for Sales Forecasting of New Fashion Products Using Heterogeneous Data. Int. J. Comput. Intell. Syst. **2019**, 12, 1537–1546.
23. Beheshti-Kashi, S.; Thoben, K.-D. The Usage of Social Media Text Data for the Demand Forecasting in the Fashion Industry. In Dynamics in Logistics; Springer: Cham, Switzerland, 2016; pp. 723–727.
24. Yang, S.J.; Jang, S. Fashion trend forecasting using ARIMA and RNN: Application of tensorflow to retailers’ websites. Asia Life Sci. **2019**, 18, 407–418.
25. Kumar, S.V.; Poonkuzhali, S. Improvising the Sales of Garments by Forecasting Market Trends using Data Mining Techniques. Int. J. Pure Appl. Math. **2018**, 119, 797–805.
26. Al-halah, Z.; Stiefelhagen, R.; Grauman, K. Fashion Forward: Forecasting Visual Style in Fashion Supplementary Material. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 388–397.
27. Stan, I.; Mocanu, C. An Intelligent Personalized Fashion Recommendation System. In Proceedings of the 2019 22nd International Conference on Control Systems and Computer Science (CSCS), Bucharest, Romania, 28–30 May 2019; pp. 210–215.
28. Sugumaran, P.; Sukumaran, V. Recommendations to improve dead stock management in garment industry using data analytics. Math. Biosci. Eng. **2019**, 16, 8121–8133.
29. Guan, C.; Qin, S.; Ling, W.; Ding, G. Apparel Recommendation System Evolution: An empirical review. Int. J. Cloth. Sci. Technol. **2016**, 28, 854–879.
30. Sun, W.; Lin, H.; Li, C.; Wang, T.; Zhou, K. Research and Application of Clothing Recommendation System Combining Explicit Data and Implicit Data. In Proceedings of the International Conference on Artificial Intelligence and Computing Science (ICAICS 2019), Wuhan, China, 24–25 March 2019.
31. Skiada, M.; Lekakos, G.; Gkika, S.; Bardaki, C. Garment Recommendations for Online and Offline Consumers. In MCIS 2016 Proceedings, Paphos, Cyprus, 2016.
32. Chen, Y.; Qin, M.; Qi, Y.; Sun, L. Improving Fashion Landmark Detection by Dual Attention Feature Enhancement. In Proceedings of the IEEE International Conference on Computer Vision Workshops, Seoul, Korea, 27–28 October 2019.
33. Karessli, N.; Guigourès, R.; Shirvany, R. SizeNet: Weakly Supervised Learning of Visual Size and Fit in Fashion Images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 16–20 June 2019.
34. Surakarin, W.; Chongstitvatana, P. Classification of clothing with weighted SURF and local binary patterns. In Proceedings of the ICSEC 2015—19th International Computer Science and Engineering Conference: Hybrid Cloud Computing: A New Approach for Big Data Era, Chiang Mai, Thailand, 23–26 November 2015.
35. Bhimani, H.; Kaimaparambil, K.; Papan, V.; Chaurasia, H.; Kukreja, A. Web-Based Model for Apparel Classification. In Proceedings of the 2nd International Conference on Advances in Science & Technology (ICAST), K J Somaiya Institute of Engineering & Information Technology, Mumbai, India, 8–9 April 2019. Available online: https://ssrn.com/abstract=3367732 (accessed on 2 June 2020).
36. Cheng, C.-I.; Liu, D.S.-M. An intelligent clothes search system based on fashion styles. In Proceedings of the International Conference on Machine Learning and Cybernetics, Kunming, China, 12–15 July 2008; Volume 3, pp. 1592–1597.
37. Ak, K.E.; Lim, J.H.; Tham, J.Y.; Kassim, A.A. Attribute Manipulation Generative Adversarial Networks for Fashion Images. In Proceedings of the IEEE International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 10541–10550.
38. Yildirim, P.; Birant, D.; Alpyildiz, T. Data mining and machine learning in textile industry. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. **2018**, 8, e1228.
39. Tong, L.; Wong, W.K.; Kwong, C.K. Fabric Defect Detection for Apparel Industry: A Nonlocal Sparse Representation Approach. IEEE Access **2017**, 5, 5947–5964.
40. Wei, B.; Hao, K.; Tang, X.S.; Ren, L. Fabric defect detection based on faster RCNN. In Advances in Intelligent Systems and Computing; Springer: Cham, Switzerland, 2019; Volume 849, pp. 45–51.
41. Gries, T.; Lutz, V.; Niebel, V.; Saggiomo, M.; Simonis, K. Automation in quality monitoring of fabrics and garment seams. In Automation in Garment Manufacturing; Elsevier Inc.: Amsterdam, The Netherlands, 2017; pp. 353–376.
42. Vuruskan, A.; Ince, T.; Bulgun, E.; Guzelis, C. Intelligent fashion styling using genetic search and neural classification. Int. J. Cloth. Sci. Technol. **2015**, 27, 283–301.
43. Tuinhof, H.; Pirker, C.; Haltmeier, M. Image-based fashion product recommendation with deep learning. In International Conference on Machine Learning, Optimization, and Data Science; Springer: Cham, Switzerland, 2019; pp. 472–481.
44. Donati, L.; Iotti, E.; Mordonini, G.; Prati, A. Fashion product classification through deep learning and computer vision. Appl. Sci. **2019**, 9, 1385.
45. Bossard, L.; Dantone, M.; Leistner, C.; Wengert, C.; Quack, T.; van Gool, L. Apparel classification with style. In Asian Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2012; pp. 321–335.
46. Laenen, K.; Zoghbi, S.; Moens, M.-F. Cross-modal Search for Fashion Attributes. In Proceedings of the KDD 2017 Workshop on Machine Learning Meets Fashion, Halifax, NS, Canada, 14 August 2017.
47. Hammar, K.; Jaradat, S.; Dokoohaki, N.; Matskin, M. Deep Text Mining of Instagram Data Without Strong Supervision. In Proceedings of the 2018 IEEE/WIC/ACM International Conference on Web Intelligence (WI), Santiago, Chile, 3–6 December 2018; pp. 158–165.
48. Kreyenhagen, C.D.; Aleshin, T.I.; Bouchard, J.E.; Wise, A.M.I.; Zalegowski, R.K. Using supervised learning to classify clothing brand styles. In Proceedings of the 2014 IEEE Systems and Information Engineering Design Symposium (SIEDS), Charlottesville, VA, USA, 25 April 2014; pp. 239–243.
49. Rutkowski, L.; Jaworski, M.; Pietruczuk, L.; Duda, P. The CART decision tree for mining data streams. Inf. Sci. **2014**, 266, 1–15.
50. Fuster-Parra, P.; García-Mas, A.; Cantallops, J.; Ponseti, F.J.; Luo, Y. Ranking Features on Psychological Dynamics of Cooperative Team Work through Bayesian Networks. Symmetry **2016**, 8, 34.
51. McCallum, A.; Nigam, K. A Comparison of Event Models for Naive Bayes Text Classification. In AAAI-98 Workshop on Learning for Text Categorization; Madison, Wisconsin, 1998; pp. 41–48. Available online: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.65.9324&rep=rep1&type=pdf (accessed on 2 June 2020).
52. Cheng, J.; Greiner, R. Comparing Bayesian network classifiers. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, Stockholm, Sweden, 30–31 July 1999; pp. 101–108.
53. Liu, L.; Su, J.; Zhao, B.; Wang, Q.; Chen, J.; Luo, Y. Towards an Efficient Privacy-Preserving Decision Tree Evaluation Service in the Internet of Things. Symmetry **2020**, 12, 103.
54. Song, Y.Y.; Lu, Y. Decision tree methods: Applications for classification and prediction. Shanghai Arch. Psychiatry **2015**, 27, 130–135.
55. Farid, D.; Zhang, L.; Rahman, C.; Hossain, M.A.; Strachan, R. Hybrid decision tree and naïve Bayes classifiers for multi-class classification tasks. Expert Syst. Appl. **2014**, 41, 1937–1946.
56. Shalev-Shwartz, S.; Ben-David, S. Understanding Machine Learning: From Theory to Algorithms; Cambridge University Press: Cambridge, UK, 2014.
57. Siddiqui, M.F.; Mujtaba, G.; Reza, A.W.; Shuib, L. Multi-Class Disease Classification in Brain MRIs Using a Computer-Aided Diagnostic System. Symmetry **2017**, 9, 37.
58. Aggarwal, C. Data Classification: Algorithms and Applications; CRC Press: Boca Raton, FL, USA, 2014.
59. Mao, W.; Wang, F.-Y. Cultural Modeling for Behavior Analysis and Prediction. In Advances in Intelligence and Security Informatics; Elsevier: Amsterdam, The Netherlands, 2012; pp. 91–102.
60. Taddy, M.; Chen, C.-S.; Yu, J. Bayesian and Empirical Bayesian Forests. arXiv **2015**, arXiv:1502.02312.
61. Liu, Z.; Luo, P.; Qiu, S.; Wang, X.; Tang, X. DeepFashion: Powering Robust Clothes Recognition and Retrieval with Rich Annotations. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1096–1104.
62. Louppe, G.; Wehenkel, L.; Sutera, A.; Geurts, P. Understanding variable importances in forests of randomized trees. In Advances in Neural Information Processing Systems; Neural Information Processing Systems Foundation, Inc.: La Jolla, CA, USA, 2013; pp. 431–439.
63. Kohavi, R. A study of cross-validation and bootstrap for accuracy estimation and model selection. IJCAI **1995**, 14, 1137–1145.
64. Sechidis, K.; Tsoumakas, G.; Vlahavas, I. On the stratification of multi-label data. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases; Springer: Berlin/Heidelberg, Germany, 2011; pp. 145–158.
65. Hossin, M.; Sulaiman, M. A review on evaluation metrics for data classification evaluations. Int. J. Data Min. Knowl. Manag. Process **2015**, 5, 1.
66. Gupta, D.L.; Malviya, A.K.; Singh, S. Performance analysis of classification tree learning algorithms. Int. J. Comput. Appl. **2012**, 55.
67. Liu, Y.; Zhang, H.H.; Wu, Y. Hard or soft classification? large-margin unified machines. J. Am. Stat. Assoc. **2011**, 106, 166–177.
68. Wahba, G. Soft and hard classification by reproducing kernel Hilbert space methods. Proc. Natl. Acad. Sci. USA **2002**, 99, 16524–16530.
69. Tu, D.Q.; Kayes, A.S.M.; Rahayu, W.; Nguyen, K. ISDI: A New Window-Based Framework for Integrating IoT Streaming Data from Multiple Sources. In Advanced Information Networking and Applications; AINA 2019. Advances in Intelligent Systems and Computing; Barolli, L., Takizawa, M., Xhafa, F., Enokido, T., Eds.; Springer: Cham, Switzerland, 2020; Volume 926, pp. 498–511.

**Figure 4.** Ten-fold cross-validation results for (**a**) Dataset A, (**b**) Dataset U, (**c**) Dataset L, and (**d**) Dataset W.

**Figure 6.** Accuracies at different thresholds for (**a**) Naïve Bayes, (**b**) Decision Trees, (**c**) Bayesian Forest, and (**d**) Random Forest.

Dataset | Initial No. of Data Points | Initial No. of Attributes | Final No. of Data Points | Final No. of Attributes |
---|---|---|---|---|
A | 289222 | 1000 | 276253 | 1000 |
U | 137770 | 1000 | 131620 | 430 |
L | 56037 | 1000 | 55915 | 467 |
W | 82446 | 1000 | 82202 | 453 |

S. No. | Data Mining Algorithm | Model Parameters |
---|---|---|
1 | Naïve Bayes | The prior probability distribution is represented by Bernoulli’s Naïve Bayes. |
2 | Decision Trees | Minimum number of samples required to be at a leaf node = 3, Seed value = 1000. |
3 | Random Forest | Minimum number of samples required to be at a leaf node = 3, Number of trees in the forest = 200, Seed value = 1000. |
4 | Bayesian Forest | Minimum number of samples required to be at a leaf node = 3, Number of trees in the forest = 200, Bootstrap = True, Seed value = 1000. |
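Assuming a scikit-learn implementation (this excerpt does not name the library), the first three rows of the table might map onto estimator constructors as sketched below. Bayesian Forest is left out because scikit-learn ships no estimator under that name, and guessing at an equivalent would go beyond what the table states.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import BernoulliNB
from sklearn.tree import DecisionTreeClassifier

SEED = 1000  # "Seed value" from the table

models = {
    # BernoulliNB binarizes inputs at 0.0 by default, so with features coded
    # (-1, 0, 1) both -1 and 0 would be treated as 0 unless recoded first.
    "Naïve Bayes": BernoulliNB(),
    "Decision Trees": DecisionTreeClassifier(min_samples_leaf=3, random_state=SEED),
    "Random Forest": RandomForestClassifier(
        n_estimators=200, min_samples_leaf=3, random_state=SEED
    ),
}

for name, model in models.items():
    print(name, model.get_params())
```

Fixing the seed makes the tree-based models reproducible across runs; Naïve Bayes has no random component, so it takes no seed.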

Algorithm | Dataset | Accuracy | Precision | Recall | F-Score |
---|---|---|---|---|---|
Naïve Bayes | A | 0.7513 | 0.7530 | 0.7513 | 0.7502 |
Naïve Bayes | U | 0.5539 | 0.5444 | 0.5539 | 0.5417 |
Naïve Bayes | L | 0.6734 | 0.6684 | 0.6734 | 0.6682 |
Naïve Bayes | W | 0.8242 | 0.7888 | 0.8242 | 0.7975 |
Decision Trees | A | 0.7957 | 0.7947 | 0.7957 | 0.7940 |
Decision Trees | U | 0.6130 | 0.6085 | 0.6130 | 0.6064 |
Decision Trees | L | 0.7429 | 0.7384 | 0.7429 | 0.7341 |
Decision Trees | W | 0.8577 | 0.8389 | 0.8577 | 0.8388 |
Random Forest | A | 0.8658 | 0.8656 | 0.8658 | 0.8652 |
Random Forest | U | 0.7331 | 0.7323 | 0.7331 | 0.7305 |
Random Forest | L | 0.8232 | 0.8223 | 0.8232 | 0.8206 |
Random Forest | W | 0.9024 | 0.8966 | 0.9024 | 0.8975 |
Bayesian Forest | A | 0.7946 | 0.7947 | 0.7946 | 0.7920 |
Bayesian Forest | U | 0.6113 | 0.6090 | 0.6113 | 0.5963 |
Bayesian Forest | L | 0.7386 | 0.7396 | 0.7386 | 0.7173 |
Bayesian Forest | W | 0.8488 | 0.8395 | 0.8488 | 0.8089 |
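In every row of the table the recall equals the accuracy, which is what support-weighted averaging of per-class recall produces. Assuming that averaging (this excerpt does not state it) and scikit-learn as the library, the four metrics could be computed as in this sketch; the labels are made up for illustration.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Made-up true and predicted garment categories, for illustration only.
y_true = ["upper", "upper", "upper", "lower", "lower", "whole", "whole", "whole"]
y_pred = ["upper", "upper", "lower", "lower", "whole", "whole", "whole", "upper"]

accuracy = accuracy_score(y_true, y_pred)

# average="weighted" weights each class's score by its number of true samples.
precision, recall, f_score, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted", zero_division=0
)

print(
    f"accuracy={accuracy:.4f} precision={precision:.4f} "
    f"recall={recall:.4f} f_score={f_score:.4f}"
)
```

Weighted recall sums the correctly classified samples of every class and divides by the total sample count, which is the definition of accuracy, so the two columns coincide under this averaging while precision and F-score can differ.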

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Jain, S.; Kumar, V.
Garment Categorization Using Data Mining Techniques. *Symmetry* **2020**, *12*, 984.
https://doi.org/10.3390/sym12060984
