# A Dimensionality Reduction Algorithm for Unstructured Campus Big Data Fusion

## Abstract

## 1. Introduction

- Fusion model of unstructured campus data. Representation models for individual data types are mature, but no existing method integrates the video, audio, image, and other unstructured campus data into a single model. This paper proposes a fusion model for heterogeneous campus data. The model converts the heterogeneous student data into vector form and builds a sub-tensor model for each source: students’ class video, class images, answer audio, evaluation text, and so on. The semi-tensor product is then used to fuse tensors of different orders, merging each student's sub-tensor models into a single labeled student model.
- Extraction of core tensors. Once the sub-tensor models are fused, the heterogeneous data can be used by a variety of algorithms. Because the data volume is large, subsequent analysis can be very time-consuming. This paper proposes a core tensor extraction method: the original tensor is decomposed using singular value decomposition and a smaller core tensor is extracted from it, which reduces both the storage required and the computation time.
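The core tensor extraction described in the second item can be illustrated with a plain truncated HOSVD. This is a minimal sketch in NumPy, not the paper's icHOSVD: the function names `unfold`, `mode_product`, `extract_core`, and `reconstruct`, the chosen ranks, and the random toy tensor are all assumptions made for illustration.

```python
import numpy as np

def unfold(T, mode):
    """Mode-`mode` unfolding: put that axis first, flatten the rest into columns."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_product(T, M, mode):
    """n-mode product: multiply matrix M into axis `mode` of tensor T."""
    return np.moveaxis(np.tensordot(M, T, axes=(1, mode)), 0, mode)

def extract_core(T, ranks):
    """Truncated HOSVD: return a small core tensor and one basis per mode."""
    bases = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        bases.append(U[:, :r])                 # keep the r leading left singular vectors
    core = T
    for mode, U in enumerate(bases):
        core = mode_product(core, U.T, mode)   # project onto the truncated basis
    return core, bases

def reconstruct(core, bases):
    """Expand the core back to the original tensor space."""
    T_hat = core
    for mode, U in enumerate(bases):
        T_hat = mode_product(T_hat, U, mode)
    return T_hat

# Toy data (assumed): a 6 x 5 x 4 tensor built to have multilinear rank (2, 2, 2).
rng = np.random.default_rng(0)
G = rng.standard_normal((2, 2, 2))
T = reconstruct(G, [rng.standard_normal((n, 2)) for n in (6, 5, 4)])

core, bases = extract_core(T, ranks=(2, 2, 2))
stored = core.size + sum(U.size for U in bases)
print(core.shape, "entries stored:", stored, "of", T.size)
```

Storing the core and the truncated bases in place of the full tensor is what saves space; later analysis can work on the small core directly.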

## 2. Related Background Knowledge

- ${T}_{m}$ is the mode-$m$ unfolded matrix of tensor $T$;
- $\parallel T\parallel $ is the Frobenius norm of tensor $T$;
- ${\times}_{n}$ is the $n$-mode product of a tensor;
- $\otimes $ is the Kronecker product;
- $\propto $ is the semi-tensor product.
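The operations in this notation list can be made concrete with a short NumPy sketch. The helper names are hypothetical, the unfolding uses one common column ordering (other orderings exist), and the semi-tensor product follows the standard left semi-tensor product definition $A\propto B=(A\otimes {I}_{t/n})(B\otimes {I}_{t/p})$ with $t=\mathrm{lcm}(n,p)$, which is assumed here to be the variant the paper intends.

```python
import math
import numpy as np

def unfold(T, mode):
    """Mode-m unfolding: the chosen axis indexes rows, all other axes are flattened."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def frobenius_norm(T):
    """Square root of the sum of squared entries, for a tensor of any order."""
    return float(np.sqrt(np.sum(np.asarray(T, dtype=float) ** 2)))

def mode_product(T, M, mode):
    """n-mode product: every mode-`mode` fiber of T is multiplied by matrix M."""
    out = np.tensordot(M, T, axes=(1, mode))   # the new axis comes out first
    return np.moveaxis(out, 0, mode)           # move it back to position `mode`

def semi_tensor_product(A, B):
    """Left semi-tensor product of an m x n matrix A and a p x q matrix B."""
    n, p = A.shape[1], B.shape[0]
    t = n * p // math.gcd(n, p)                # least common multiple of n and p
    return np.kron(A, np.eye(t // n)) @ np.kron(B, np.eye(t // p))

T = np.arange(24, dtype=float).reshape(2, 3, 4)
M = np.ones((5, 3))
A = np.arange(8, dtype=float).reshape(2, 4)
B = np.arange(6, dtype=float).reshape(2, 3)

print(unfold(T, 1).shape)              # mode-1 unfolding of a 2 x 3 x 4 tensor
print(mode_product(T, M, 1).shape)     # axis 1 is resized from 3 to 5
print(semi_tensor_product(A, B).shape) # inner dimensions 4 and 2 do not match
```

When the inner dimensions already match, `semi_tensor_product` reduces to the ordinary matrix product, which is why it can combine factors whose sizes would otherwise be incompatible.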

**Theorem 1.**

**Corollary 1.**

**Theorem 2.**

## 3. icHOSVD Algorithm for Unstructured Campus Big Data Fusion

#### 3.1. Framework of the icHOSVD Algorithm

#### 3.2. Fusion Model of Unstructured Campus Data

#### 3.2.1. Subtensor Model of Heterogeneous Data from Multiple Sources

- (1) The sub-tensor representation method of video data.
- (2) The sub-tensor representation method of audio data.
- (3) The sub-tensor representation method of image data.
- (4) The sub-tensor representation method of text data.

#### 3.2.2. A Tensor Space Fusion Method Based on Semi-Tensor Product

#### 3.3. An icHOSVD Algorithm Based on Tensor

#### 3.3.1. Tensor Segmentation

#### 3.3.2. icHOSVD Algorithm

Algorithm 1. The recursive HOSVD algorithm.

Input: matrix ${M}_{i}$, matrix ${C}_{i}$

Output: new left unitary matrix $U$, positive semi-definite diagonal matrix $\Sigma $, right unitary matrix $V$

1. if $i>1$ then
2. $({U}_{j},{\Sigma}_{j},{C}_{j})\leftarrow \mathrm{HOSVD}({M}_{i},{C}_{i})$;
3. $\mathrm{blend}({M}_{i-1},{C}_{i-1},{U}_{i-1},{\Sigma}_{j-1},{C}_{j-1})$;
4. $i\leftarrow i-1$;
5. else if $i=1$
6. $\mathrm{HOSVD}({M}_{i})$
7. end
8. end
9. return $U,\Sigma ,V$;
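Algorithm 1 does not spell out what `blend` does, so the following NumPy sketch rests on an assumption: that blending merges the SVD factors of two column blocks into the factors of their concatenation. It uses the fact that $[{M}_{1}\mid {M}_{2}]$ and $[{U}_{1}{\Sigma}_{1}\mid {U}_{2}{\Sigma}_{2}]$ share the same left singular vectors and singular values. The names `blend_svd` and `recursive_left_svd` are hypothetical, and the right factor $V$ is omitted because HOSVD only needs the left singular vectors of each unfolding.

```python
import numpy as np

def blend_svd(U1, s1, U2, s2, k):
    """Merge left SVD factors of blocks M1 and M2 into those of [M1 | M2]."""
    stacked = np.hstack([U1 * s1, U2 * s2])    # scale each basis column by its singular value
    U, s, _ = np.linalg.svd(stacked, full_matrices=False)
    return U[:, :k], s[:k]                     # truncate to at most k components

def recursive_left_svd(blocks, k):
    """Recursive scheme: decompose block i, recurse on blocks 1..i-1, then blend."""
    U_i, s_i, _ = np.linalg.svd(blocks[-1], full_matrices=False)
    U_i, s_i = U_i[:, :k], s_i[:k]
    if len(blocks) == 1:                       # base case, i = 1
        return U_i, s_i
    U_prev, s_prev = recursive_left_svd(blocks[:-1], k)
    return blend_svd(U_prev, s_prev, U_i, s_i, k)

# Toy check (assumed data): a 6 x 12 matrix arriving as three 6 x 4 column blocks.
rng = np.random.default_rng(1)
M = rng.standard_normal((6, 12))
blocks = [M[:, :4], M[:, 4:8], M[:, 8:]]
U, s = recursive_left_svd(blocks, k=6)
print(np.allclose(s, np.linalg.svd(M, compute_uv=False)))
```

With `k` at least the rank of the full matrix, as in this toy check, the merge is exact. A smaller `k` trades accuracy for speed, since every blend then works on at most `2k` columns rather than on all the data seen so far.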

## 4. Experiment Analysis

- (1) Time complexity.

${C}_{1}$ and ${C}_{2}$ are constants. When columns are added to the raw matrix, decomposing one unfolded matrix by singular value decomposition takes $O\left({k}^{2}n\right)$ time, where $k$ is the number of truncated left singular vectors. A $p$-order tensor unfolds into $p$ mode-unfolding matrices, so the matrix unfolding time is $O\left(p{k}^{2}n\right)$. The semi-tensor product of a tensor with one truncated basis takes $O\left({k}^{2}n\right)$, so the total semi-tensor product time is $O\left(p{k}^{2}n\right)$. The total time of the icHOSVD algorithm is therefore $O\left(1\right)+O\left(p{k}^{2}n\right)+O\left(p{k}^{2}n\right)=O\left(p{k}^{2}n\right)$.

- (2) Computation accuracy.

**Reconstruction error rate:** the formula for the reconstruction error rate is given in Formula (18).

**Dimensionality reduction ratio:** the formula for the dimensionality reduction ratio is given in Formula (19).
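Formulas (18) and (19) are not reproduced in this text, so the sketch below relies on definitions that are commonly used for tensor-based reduction and are assumptions here: the error rate as the Frobenius norm of the residual relative to the original tensor, and the reduction ratio as the number of nonzero entries stored after reduction (core plus truncated bases) divided by the number of nonzero entries in the original tensor. The function names are hypothetical.

```python
import numpy as np

def reconstruction_error_rate(T, T_hat):
    """Assumed definition: ||T - T_hat||_F / ||T||_F."""
    return float(np.linalg.norm(T - T_hat) / np.linalg.norm(T))

def dimensionality_reduction_ratio(T, core, bases):
    """Assumed definition: nonzeros kept (core + bases) over nonzeros in T."""
    kept = np.count_nonzero(core) + sum(np.count_nonzero(U) for U in bases)
    return kept / np.count_nonzero(T)

# Toy values (assumed): a dense 4 x 4 x 4 tensor reduced to a 2 x 2 x 2 core
# with one 4 x 2 basis per mode.
T = np.ones((4, 4, 4))
core = np.ones((2, 2, 2))
bases = [np.ones((4, 2)) for _ in range(3)]

print(reconstruction_error_rate(T, T))            # identical tensors give zero error
print(dimensionality_reduction_ratio(T, core, bases))
```

Under these definitions both quantities are better when smaller: a low ratio means little data is kept, and a low error rate means little information is lost in keeping it.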

- (3) Comparison with other methods.

## 5. Summary and Outlook

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest


© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Wang, Z.; Wang, Y.; Zhang, L.; Zhang, C.; Zhang, X.
A Dimensionality Reduction Algorithm for Unstructured Campus Big Data Fusion. *Symmetry* **2021**, *13*, 345.
https://doi.org/10.3390/sym13020345
