# Towards Image Classification with Machine Learning Methodologies for Smartphones

## Abstract

## 1. Introduction

## 2. Related Works

## 3. Methodologies

#### 3.1. Traditional Machine Learning

#### 3.1.1. Support Vector Machine (SVM)

**Linear:** $k(x_i, x_j) = x_i \cdot x_j$

**Polynomial:** $k(x_i, x_j) = (x_i \cdot x_j + 1)^{d}$

**Radial Basis Function:** $k(x_i, x_j) = \exp\left(-\|x_i - x_j\|^{2} / 2\sigma^{2}\right)$

**Sigmoid:** $k(x_i, x_j) = \tanh\left(k (x_i \cdot x_j) - c\right)$
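The four kernels listed above can be written out directly. The following is a minimal sketch in plain Python, not the authors' implementation; the function names and the default values for `d`, `sigma`, `k`, and `c` are illustrative assumptions.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def polynomial_kernel(x, y, d=3):
    # d is the polynomial degree (default chosen only for illustration)
    return (dot(x, y) + 1) ** d


def rbf_kernel(x, y, sigma=1.0):
    # squared Euclidean distance between the two samples
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2 * sigma ** 2))


def sigmoid_kernel(x, y, k=1.0, c=0.0):
    # k scales the inner product and c shifts it, as in the formula above
    return math.tanh(k * dot(x, y) - c)


x_i, x_j = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(x_i, x_j), polynomial_kernel(x_i, x_j, d=2))
```

Note that the RBF kernel of a sample with itself is always 1, since the distance term vanishes.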

- Suppose there are m samples, each of dimension n, and arrange the original data by columns into a matrix X with n rows and m columns;
- Zero-center each row of X (each row represents one attribute field), i.e., subtract the mean of the row;
- Calculate the covariance matrix $C=\frac{1}{m}XX^{T}$;
- Determine the eigenvalues of the covariance matrix and the corresponding eigenvectors;
- Arrange the eigenvectors as rows in decreasing order of their eigenvalues and select the first $n'$ rows to form a matrix P;
- $Y=PX$ is the new data after reduction to $n'$ dimensions.
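The PCA steps above can be sketched with NumPy as follows. This is an illustrative sketch under the stated layout (attributes in rows, samples in columns); the function name `pca_reduce` and the random example data are assumptions, not part of the original work.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Reduce X (n attributes x m samples) to n_components rows."""
    X = np.asarray(X, dtype=float)
    m = X.shape[1]
    # Zero-center each row (attribute)
    X_centered = X - X.mean(axis=1, keepdims=True)
    # Covariance matrix C = (1/m) X X^T
    C = (X_centered @ X_centered.T) / m
    # eigh suits symmetric matrices; eigenvalues come back in ascending order
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]
    # Rows of P are the eigenvectors with the largest eigenvalues
    P = eigvecs[:, order[:n_components]].T
    Y = P @ X_centered
    return Y, P


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(5, 100))  # 5 attributes, 100 samples (made-up data)
Y_demo, P_demo = pca_reduce(X_demo, 2)
print(Y_demo.shape)
```

Because the rows of P are orthonormal eigenvectors, the rows of Y are uncorrelated and ordered by decreasing variance.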

#### 3.1.2. Decision Tree and Random Forests

1. Take a random sample of size N with replacement from the data;
2. Take a random sample without replacement of the predictors;
3. Construct the first Classification And Regression Tree (CART) partition of the data;
4. Repeat Step 2 for each subsequent split until the tree is as large as desired; do not prune;
5. Repeat Steps 1 to 4 a large number of times (e.g., 500).
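The two sampling steps of this procedure can be sketched in plain Python. The sketch covers only the random draws; building and growing the CART itself is omitted, and the predictor draw is shown once per tree although the procedure repeats it at every split. All function names and the parameter `k` (number of predictors drawn) are hypothetical.

```python
import random


def bootstrap_rows(n_rows, rng):
    """Step 1: draw N row indices with replacement."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]


def sample_predictors(n_predictors, k, rng):
    """Step 2: draw k distinct predictor indices (without replacement)."""
    return rng.sample(range(n_predictors), k)


def draw_forest_samples(n_rows, n_predictors, k, n_trees=500, seed=0):
    """Repeat the draws once per tree; tree construction is not shown."""
    rng = random.Random(seed)
    plans = []
    for _ in range(n_trees):
        rows = bootstrap_rows(n_rows, rng)
        predictors = sample_predictors(n_predictors, k, rng)
        plans.append((rows, predictors))
    return plans


plans = draw_forest_samples(n_rows=10, n_predictors=6, k=2, n_trees=5)
for rows, predictors in plans:
    print(rows, predictors)
```

Sampling rows with replacement means some rows appear several times in a tree's sample and others not at all, which is what makes the individual trees differ.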

#### 3.2. Deep Learning

- The first layer in this structure is a Conv2D layer for the convolution operation, which extracts features from the input data by sliding a filter over the input to generate a feature map. In this case, the size of the filter is $3\times 3$.
- The second layer is a MaxPooling2D layer for the max-pooling operation, which reduces the dimensionality of each feature map. In this case, the size of the pooling window is $2\times 2$.
- The third layer is a Dropout layer for reducing overfitting. In this case, the dropout function randomly drops 20% of the outputs during training.
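To make the three operations concrete, here is a small NumPy sketch of what each layer computes on a single-channel input. It is not the Keras implementation behind the layer names used above: it assumes no padding, stride 1, a single filter, and non-overlapping pooling windows, and the helper names are hypothetical.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide a filter over a 2-D input (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # element-wise product of the window and the filter, summed
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def max_pool2d(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    h = feature_map.shape[0] // size * size
    w = feature_map.shape[1] // size * size
    trimmed = feature_map[:h, :w]
    return trimmed.reshape(h // size, size, w // size, size).max(axis=(1, 3))


def dropout(x, rate=0.2, rng=None):
    """Zero each output with probability `rate`; rescale the survivors."""
    rng = np.random.default_rng() if rng is None else rng
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)


image = np.arange(64.0).reshape(8, 8)  # made-up 8 x 8 input
kernel = np.ones((3, 3))               # 3 x 3 filter
feature_map = conv2d_valid(image, kernel)
pooled = max_pool2d(feature_map, size=2)
dropped = dropout(pooled, rate=0.2, rng=np.random.default_rng(0))
print(feature_map.shape, pooled.shape, dropped.shape)
```

With an 8 x 8 input, the 3 x 3 filter yields a 6 x 6 feature map and the 2 x 2 pooling window halves each side to 3 x 3. Dropout is applied only during training; the rescaling keeps the expected activation unchanged.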

#### 3.3. Transfer Learning

## 4. Dataset and Application

#### 4.1. Dataset

#### 4.2. Android Application

#### 4.2.1. Comparison of Models Saved in Server and in Application

#### 4.2.2. TensorFlow Mobile and TensorFlow Lite

#### 4.2.3. Flowchart

## 5. Experimental Results and Discussions

#### 5.1. Optimization

#### 5.1.1. Data Augmentation

#### SVM

#### RF

#### 4-Conv CNN and VGG19

#### 5.1.2. Batch Size

#### 5.1.3. Optimizer

#### 5.1.4. Batch Normalization (BN)

#### 5.2. Comparison between Final Results of Different Methodologies

#### 5.3. Android Application

#### 5.3.1. Structure

#### 5.3.2. Layout and Output

#### 5.3.3. Performance

#### 5.4. Summary

## 6. Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## References

- Mayo, M.; Watson, A.T. Automatic species identification of live moths. Knowl.-Based Syst. **2007**, 20, 195–202.
- Taxonomic Keys. Available online: https://collectionseducation.org/identify-specimen/taxonomic-keys/ (accessed on 24 August 2019).
- Walter, D.E.; Winterton, S. Keys and the Crisis in Taxonomy: Extinction or Reinvention? Annu. Rev. Entomol. **2007**, 52, 193–208.
- Gaston, K.J.; May, R.M. Taxonomy of taxonomists. Nature **1992**, 356, 281–282.
- Hopkins, G.W.; Freckleton, R.P. Declines in the numbers of amateur and professional taxonomists: implications for conservation. Anim. Conserv. **2002**, 5, 245–249.
- Weeks, P.J.D.; O’Neill, M.A.; Gaston, K.J.; Gauld, I.D. Automating insect identification: exploring the limitations of a prototype system. J. Appl. Entomol. **1999**, 123, 1–8.
- Weeks, P.J.D.; O’Neill, M.A.; Gaston, K.J.; Gauld, I.D. Species–identification of wasps using principal component associative memories. Image Vision Comput. **1999**, 17, 861–866.
- Weeks, P.J.D.; Gaston, K.J. Image analysis, neural networks, and the taxonomic impediment to biodiversity studies. Biodivers. Conserv. **1997**, 6, 263–274.
- Dayrat, B. Towards integrative taxonomy. Biol. J. Linn. Soc. **2005**, 85, 407–417.
- Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. **1995**, 20, 273–297.
- Wold, S.; Esbensen, K.; Geladi, P. Principal component analysis. Chemom. Intell. Lab. Syst. **1987**, 2, 37–52.
- Bouzalmat, A.; Kharroubi, J.; Zarghili, A. Comparative Study of PCA, ICA, LDA using SVM Classifier. J. Emerg. Technol. Web Intell. **2014**, 6.
- Hai, N.T.; An, T.H. PCA-SVM Algorithm for Classification of Skeletal Data-Based Eigen Postures. Am. J. Biomed. Eng. **2016**, 6, 47–158.
- Alam, S.; Kang, M.; Pyun, J.-Y.; Kwon, G. Performance of classification based on PCA, linear SVM, and Multi-kernel SVM. In Proceedings of the Eighth International Conference on Ubiquitous and Future Networks (ICUFN), Vienna, Austria, 5–8 July 2016; pp. 987–989.
- Hernández-Serna, A.; Jimenez, L. Automatic identification of species with neural networks. PeerJ **2014**, 11, e563.
- Wang, J.; Ji, L.; Liang, A.; Yuan, D. The identification of butterfly families using content-based image retrieval. Biosyst. Eng. **2012**, 111, 24–32.
- Iamsa-at, S.; Horata, P.; Sunat, K.; Thipayang, N. Improving Butterfly Family Classification Using Past Separating Features Extraction in Extreme Learning Machine. In Proceedings of the 2nd International Conference on Intelligent Systems and Image Processing 2014 (ICISIP2014), Kitakyushu, Japan, 26–29 September 2014.
- Kang, S.-H.; Cho, J.-H.; Lee, S.-H. Identification of butterfly based on their shapes when viewed from different angles using an artificial neural network. J. Asia-Pac. Entomol. **2014**, 17, 143–149.
- Xie, J.; Hou, Q.; Shi, Y.; Peng, L.; Jing, L.; Zhuang, F.; Zhang, J.; Tang, X.; Xu, S. The Automatic Identification of Butterfly Species. arXiv **2018**, arXiv:1803.06626.
- Lane, N.; Bhattacharya, S.; Mathur, A.; Forlivesi, C.; Kawsar, F. DXTK: Enabling Resource-efficient Deep Learning on Mobile and Embedded Devices with the DeepX Toolkit. In Proceedings of the 8th EAI International Conference on Mobile Computing, Applications and Services, Cambridge, UK, 30 November–1 December 2016; ICST: Brussels, Belgium, 2016; pp. 98–107.
- Samangouei, P.; Chellappa, R. Convolutional neural networks for attribute-based active authentication on mobile devices. In Proceedings of the IEEE 8th International Conference on Biometrics Theory, Applications and Systems (BTAS), Niagara Falls, NY, USA, 6–9 September 2016; pp. 1–8.
- Alsing, O. Mobile Object Detection using TensorFlow Lite and Transfer Learning (Dissertation). 2018. Available online: http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-233775 (accessed on 29 April 2019).
- Zararsiz, G.; Elmali, F.; Ozturk, A. Bagging Support Vector Machines for Leukemia Classification. IJCSI **2012**, 9, 355–358.
- Schachtner, R.; Lutter, D.; Stadlthanner, K.; Lang, E.W.; Schmitz, G.; Tomé, A.M.; Vilda, P.G. Routes to identify marker genes for microarray classification. In Proceedings of the 29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Lyon, France, 22–26 August 2007.
- Deepa, V.B.; Thangaraj, P. Classification of EEG data using FHT and SVM based on Bayesian Network. IJCSI **2011**, 8, 239–243.
- Zhang, J.; Bo, L.L.; Xu, J.W.; Park, S.H. Why Can SVM Be Performed in PCA Transformed Space for Classification? Adv. Mater. Res. **2011**, 181–182, 1031–1037.
- Breiman, L.; Friedman, J.; Stone, C.J.; Olshen, R.A. Classification and Regression Trees; Chapman and Hall/CRC: London, UK, 1984; ISBN 9780412048418.
- Breiman, L. Random Forests. Mach. Learn. **2001**, 45, 5–32.
- Quinlan, J.R. Induction of decision trees. Mach. Learn. **1986**, 1, 81–106.
- Quinlan, J.R. C4.5: Programs for Machine Learning; Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 1993; ISBN 1558602402.
- Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv **2014**, arXiv:1409.1556.
- Wang, J.; Markert, K.; Everingham, M. Learning Models for Object Recognition from Natural Language Descriptions. In Proceedings of the 20th British Machine Vision Conference (BMVC2009), London, UK, 7–10 September 2009.
- eNature: FieldGuides. Available online: http://www.biologydir.com/enature-fieldguides-info-33617.html (accessed on 5 December 2018).
- TensorFlow Lite GPU delegate. Available online: https://www.tensorflow.org/lite/performance/gpu (accessed on 30 June 2019).
- Bottou, L. Large-Scale Machine Learning with Stochastic Gradient Descent. In Proceedings of the 19th International Conference on Computational Statistics (COMPSTAT’2010), Paris, France, 22–27 August 2010; pp. 177–186.
- Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. In Proceedings of the 3rd International Conference on Learning Representations (ICLR 2015), San Diego, CA, USA, 7–9 May 2015.
- Batch normalization in Neural Networks. Available online: https://towardsdatascience.com/batch-normalization-in-neural-networks-1ac91516821c (accessed on 21 November 2018).
- Tornay, S.C. Ockham: Studies and Selections; Open Court: La Salle, IL, USA, 1938.

**Figure 7.** An example of image and description in one category [32].

**Figure 13.** Training 4-Conv CNN with different batch sizes: (**a**) batch size is 16; (**b**) batch size is 256.

| Number of Components | Acc | Precision | Recall | F1-score |
|---|---|---|---|---|
| 10 | 46.0% | 50% | 46% | 46% |
| 15 | 48.4% | 52% | 48% | 49% |
| 20 | 52.6% | 56% | 52% | 53% |
| 25 | 44.4% | 49% | 44% | 45% |
| 30 | 44.6% | 47% | 44% | 45% |
| 35 | 44.1% | 49% | 44% | 45% |
| 40 | 44.0% | 49% | 44% | 45% |

| Number of Trees | No PCA | PCA 10 | PCA 20 | PCA 30 | PCA 40 | PCA 50 |
|---|---|---|---|---|---|---|
| 100 | 62.7% | 58.4% | 60.0% | 57.6% | 57.2% | 60.8% |
| 150 | 65.1% | 59.2% | 58.8% | 60.4% | 60.0% | 56.0% |
| 200 | 67.5% | 62.4% | 61.6% | 57.2% | 60.4% | 62.4% |
| 250 | 67.5% | 60.0% | 60.0% | 61.2% | 57.6% | 60.8% |
| 300 | 66.3% | 63.6% | 62.8% | 60.4% | 62.4% | 59.6% |
| 350 | 72.3% | 59.6% | 61.2% | 64.0% | 60.0% | 61.6% |
| 400 | 68.7% | 62.0% | 60.8% | 64.0% | 61.6% | 60.0% |
| 450 | 63.9% | 61.2% | 60.8% | 62.0% | 62.0% | 60.8% |
| 500 | 66.3% | 62.8% | 60.4% | 64.4% | 61.6% | 61.2% |

| Model | Augmentation | Epochs | Acc | Precision | Recall | F1-score |
|---|---|---|---|---|---|---|
| 4-Conv CNN | Before | 36 | 84.4% | 84% | 84% | 84% |
| 4-Conv CNN | After | 100 | 95.2% | 95% | 95% | 95% |
| VGG19 | Before | 41 | 94.1% | 94% | 94% | 94% |
| VGG19 | After | 66 | 96.0% | 96% | 96% | 96% |

| Model | Batch size | Epochs | Acc | Precision | Recall | F1-score |
|---|---|---|---|---|---|---|
| 4-Conv CNN | 128 | 100 | 95.2% | 95% | 95% | 95% |
| 4-Conv CNN | 64 | 79 | 96.3% | 97% | 96% | 96% |
| VGG19 | 128 | 66 | 96.0% | 96% | 96% | 96% |
| VGG19 | 64 | 57 | 97.1% | 97% | 97% | 97% |

| 4-Conv CNN, batch size = 64 | Acc | Precision | Recall | F1-score | Epochs | Time (minutes) |
|---|---|---|---|---|---|---|
| Adam | 96.3% | 97% | 96% | 96% | 79 | 7 |
| SGD | 90.7% | 91% | 91% | 91% | 1278 | 125 |
| Adam+SGD | 97.1% | 97% | 97% | 97% | 79+39 | 7+4 |

| Pre-trained VGG19, batch size = 64 | Acc | Precision | Recall | F1-score | Epochs | Time (minutes) |
|---|---|---|---|---|---|---|
| Adam | 97.1% | 97% | 97% | 97% | 57 | 5 |
| SGD | 96.2% | 96% | 96% | 96% | 324 | 32 |
| Adam+SGD | 98.4% | 98% | 98% | 98% | 57+42 | 5+4 |

| 4-Conv CNN | Epochs | Acc | Precision | Recall | F1-score |
|---|---|---|---|---|---|
| Before BN | 118 | 97.1% | 97% | 97% | 97% |
| After BN | 78 | 98.3% | 98% | 98% | 98% |

| Methodology | Model | Acc | Precision | Recall | F1-score |
|---|---|---|---|---|---|
| Traditional machine learning | SVM | 52.6% | 56% | 52% | 53% |
| Traditional machine learning | Random Forest | 72.3% | 79% | 73% | 73% |
| Deep learning | 4-Conv CNN | 98.3% | 98% | 98% | 98% |
| Transfer learning | VGG19 | 98.4% | 98% | 98% | 98% |

| From | Input Class | Predicted Class: No | Predicted Class: Yes |
|---|---|---|---|
| Camera | No | TN = 4 | FP = 6 |
| Camera | Yes | FN = 2 | TP = 8 |
| Gallery | No | TN = 2 | FP = 8 |
| Gallery | Yes | FN = 1 | TP = 9 |

| From | Acc | Precision | Recall | F1-score |
|---|---|---|---|---|
| Camera | 60.0% | 57.1% | 80.0% | 66.7% |
| Gallery | 55.0% | 52.9% | 90.0% | 66.7% |
| Overall | 57.5% | 54.8% | 85.0% | 66.7% |
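The metrics in the table above follow from the confusion-matrix counts using the standard definitions of accuracy, precision, recall, and F1-score. The short sketch below recomputes them as a check; the function name and the dictionary layout are illustrative assumptions, while the counts are taken from the confusion matrices reported for the camera and gallery inputs.

```python
def classification_metrics(tn, fp, fn, tp):
    """Return (accuracy, precision, recall, f1) from binary confusion counts."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# Counts as (TN, FP, FN, TP)
counts = {"Camera": (4, 6, 2, 8), "Gallery": (2, 8, 1, 9)}
# "Overall" pools the two sources by adding their counts element-wise
counts["Overall"] = tuple(a + b for a, b in zip(counts["Camera"], counts["Gallery"]))

results = {source: classification_metrics(*c) for source, c in counts.items()}
for source, (acc, prec, rec, f1) in results.items():
    print(f"{source}: acc={acc:.1%} precision={prec:.1%} recall={rec:.1%} f1={f1:.1%}")
```

The high recall combined with low precision reflects the large number of false positives in both confusion matrices.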

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Zhu, L.; Spachos, P. Towards Image Classification with Machine Learning Methodologies for Smartphones. *Mach. Learn. Knowl. Extr.* **2019**, *1*, 1039-1057.
https://doi.org/10.3390/make1040059
