Recently, outlier detection has widespread applications in different areas. The task is to identify outliers in the dataset and extract potential information. The existing outlier detection algorithms mainly do not solve the problems of parameter selection and high computational cost, which leaves enough room for further improvements. To solve the above problems, our paper proposes a parameter-free outlier detection algorithm based on dataset optimization method. Firstly, we propose a dataset optimization method (DOM), which initializes the original dataset in which density is greater than a specific threshold. In this method, we propose the concepts of partition function (P) and threshold function (T). Secondly, we establish a parameter-free outlier detection method. Similarly, we propose the concept of the number of residual neighbors, as the number of residual neighbors and the size of data clusters are used as the basis of outlier detection to obtain a more accurate outlier set. Finally, extensive experiments are carried out on a variety of datasets and experimental results show that our method performs well in terms of the efficiency of outlier detection and time complexity.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited