Multisensor Feature Selection for Maritime Target Estimation
Abstract
1. Introduction
- A novel method is presented for target estimation from maritime multisensor data.
- A comparative analysis of various preprocessing techniques and regression methods is performed to identify processing steps suited to time-series maritime data.
- Novel denoising and hierarchical feature selection techniques that consider the characteristics of the data are proposed.
- The effectiveness of the proposed method is demonstrated on data collected in conditions that closely approximate real marine environments.
2. Related Work
2.1. Underwater IUU Detection
2.2. Multisensor Feature Selection
3. Method
3.1. Data Preprocessing
3.1.1. Synchronizing Data
- Time Alignment—Sensors may collect data at different rates, requiring resampling or interpolation to align them to a common timeline. For example, data from faster sensors can be downsampled, or data from slower sensors can be interpolated up to the higher frequency (see the sketch after this list).
- Identifying Synchronous Events—In certain cases, it is helpful to detect events or triggers common to all sensors. Here, an “event” refers to a specific moment when all sensors react simultaneously during data collection. In this study, data from all sensors were aligned based on the closest target point to ensure synchronized timing across sensors. Events can therefore serve as synchronization points, with data aligned according to when each sensor detected the event.
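Both alignment options can be expressed in a few lines of pandas. The following is a minimal sketch; the sensor names, sampling rates, and random readings are hypothetical, not the study's actual setup.

```python
import numpy as np
import pandas as pd

# Hypothetical sensor streams sampled at different rates.
rng = np.random.default_rng(0)
fast = pd.DataFrame(
    {"acoustic": rng.standard_normal(100)},
    index=pd.date_range("2024-01-01", periods=100, freq="100ms"),
)
slow = pd.DataFrame(
    {"magnetic": rng.standard_normal(10)},
    index=pd.date_range("2024-01-01", periods=10, freq="1s"),
)

# Downsample the fast sensor to the slow sensor's 1 s rate ...
fast_down = fast.resample("1s").mean()

# ... or upsample the slow sensor to 100 ms, filling by time interpolation.
slow_up = slow.resample("100ms").interpolate(method="time")

# Join on the shared timeline; an inner join keeps only common timestamps.
aligned = fast.join(slow_up, how="inner")
```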
3.1.2. Handling Missing Values
- Listwise deletion—Removes all observations with missing values. While simple, this method can lead to significant data loss.
- Mean substitution—Replaces missing values with the mean of the corresponding variable. Although easy to apply, this method reduces data variability and can distort relationships between sensors, especially if the data are not missing at random.
- Regression substitution—Predicts missing values from other variables in a regression model, assuming a linear relationship. If this assumption does not hold, the imputed values can introduce bias.
- Multiple imputation—Creates multiple datasets with different imputed values and combines the results, preserving variability and reducing bias. A code sketch of these strategies follows this list.
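The snippet below sketches three of these strategies with pandas and scikit-learn; the two-sensor frame is a toy example, and IterativeImputer with sample_posterior=True is used here only as a stand-in for a full multiple-imputation procedure.

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer, SimpleImputer

# Toy two-sensor frame with missing readings (values are illustrative).
df = pd.DataFrame({"s1": [1.0, 2.0, np.nan, 4.0, 5.0],
                   "s2": [0.5, np.nan, 1.5, 2.0, 2.5]})

# Listwise deletion: drop every row that contains a missing value.
dropped = df.dropna()

# Mean substitution: replace NaNs with each column's mean.
mean_imputed = SimpleImputer(strategy="mean").fit_transform(df)

# Regression-style imputation: model each feature from the others;
# sample_posterior=True makes repeated runs usable as multiple imputation.
mi = IterativeImputer(sample_posterior=True, random_state=0)
reg_imputed = mi.fit_transform(df)
```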
3.1.3. Handling Outliers
- Z-score method—Flags points that lie more than a chosen number of standard deviations from the mean. It is most effective for data that follow a normal distribution.
- Interquartile range (IQR) method—Uses the first and third quartiles to detect outliers. It is robust and can handle skewed distributions, making it suitable for various data types.
- Density-based spatial clustering of applications with noise (DBSCAN)—A clustering algorithm that identifies outliers based on data point density. It works well for complex distributions, but its effectiveness depends on careful parameter tuning.
- Removing Outliers—In cases where outliers are clearly due to errors or noise, they can simply be removed from the dataset. However, this can result in data loss, requiring caution, especially if the outliers represent important events.
- Imputation with Mean/Median—Outliers can be replaced with the mean or median of the data, especially when they are few in number. Median imputation is particularly useful for skewed data, as it is less sensitive to extreme values than the mean.
- Winsorizing—Involves capping the extreme values to a specific percentile range (e.g., the 5th and 95th percentiles) to reduce the influence of outliers without completely removing them from the dataset.
- K-nearest neighbors (KNN) regression—After detecting outliers (e.g., using the Z-score method), KNN regression can be applied to correct the outlier values, whereby outliers are replaced with predicted values based on the K-nearest data points, ensuring that the imputed values are consistent with the surrounding data.
- Transformation—Applying mathematical transformations (e.g., log or square root) can reduce the effect of outliers. This method is effective when outliers stem from skewed distributions. A sketch combining several of these techniques follows this list.
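As an illustration, here is a minimal sketch of Z-score detection followed by winsorizing and KNN-based correction on a single hypothetical channel; the threshold and neighbor count are illustrative choices, not the study's settings.

```python
import numpy as np
from scipy import stats
from scipy.stats.mstats import winsorize
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 200)
x[50] = 12.0  # injected outlier for illustration

# Z-score detection: flag points more than 3 standard deviations out.
outliers = np.abs(stats.zscore(x)) > 3

# Winsorizing: cap values at the 5th and 95th percentiles.
x_wins = np.asarray(winsorize(x, limits=(0.05, 0.05)))

# KNN correction: predict each flagged value from nearby clean samples,
# using the time index as the lone feature.
t = np.arange(len(x)).reshape(-1, 1)
knn = KNeighborsRegressor(n_neighbors=5).fit(t[~outliers], x[~outliers])
x_fixed = x.copy()
x_fixed[outliers] = knn.predict(t[outliers])
```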
3.1.4. Scaling
- Standard scaling, also known as Z-score normalization, transforms the data to have a mean of 0 and a standard deviation of 1.
- MinMax scaling, in contrast to standard scaling, rescales the data to a fixed range, typically between 0 and 1.
- Robust scaling is designed to handle data with significant outliers by using the median and the interquartile range (IQR) instead of the mean and standard deviation. The IQR is defined as the difference between the 75th and 25th percentiles.
- MaxAbs scaling scales each feature by dividing it by the maximum absolute value of that feature, ensuring that the range of the scaled data lies between −1 and 1.
- Normalization [24] refers to adjusting the magnitude of each data point such that the data are represented as a vector of unit size. A common choice is L2 normalization, which scales each data point by the L2 (Euclidean) norm of the vector. All five scalers are sketched below.
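All five transformations are available in scikit-learn; the sketch below applies each to a small hypothetical feature matrix (the values are illustrative only).

```python
import numpy as np
from sklearn.preprocessing import (MaxAbsScaler, MinMaxScaler, Normalizer,
                                   RobustScaler, StandardScaler)

# Small hypothetical feature matrix (rows = samples, columns = sensors).
X = np.array([[1.0, 200.0],
              [2.0, 150.0],
              [3.0, 900.0]])

scalers = {
    "standard": StandardScaler(),   # mean 0, standard deviation 1
    "minmax": MinMaxScaler(),       # rescale each feature to [0, 1]
    "robust": RobustScaler(),       # center by median, scale by IQR
    "maxabs": MaxAbsScaler(),       # divide by max |value|, range [-1, 1]
    "l2": Normalizer(norm="l2"),    # per-sample unit Euclidean norm
}
scaled = {name: s.fit_transform(X) for name, s in scalers.items()}
```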
3.2. Feature Selection
3.2.1. Denoising
3.2.2. Sensor Stability
3.2.3. Target Relevance
4. Experiments and Results
4.1. Metrics
4.2. Regression Models
4.3. Results
5. Discussion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Correction Statement
References
1. Welch, H.; Clavelle, T.; White, T.D.; Cimino, M.A.; Van Osdel, J.; Hochberg, T.; Kroodsma, D.; Hazen, E.L. Hot spots of unseen fishing vessels. Sci. Adv. 2022, 8, eabq2109.
2. Paolo, F.S.; Kroodsma, D.; Raynor, J.; Hochberg, T.; Davis, P.; Cleary, J.; Marsaglia, L.; Orofino, S.; Thomas, C.; Halpin, P. Satellite mapping reveals extensive industrial activity at sea. Nature 2024, 625, 85–91.
3. Orofino, S.; McDonald, G.; Mayorga, J.; Costello, C.; Bradley, D. Opportunities and challenges for improving fisheries management through greater transparency in vessel tracking. ICES J. Mar. Sci. 2023, 80, 675–689.
4. Watson, J.T.; Ames, R.; Holycross, B.; Suter, J.; Somers, K.; Kohler, C.; Corrigan, B. Fishery catch records support machine learning-based prediction of illegal fishing off US West Coast. PeerJ 2023, 11, e16215.
5. Wu, P.; Zhang, H.; Shi, Y.; Lu, J.; Li, S.; Huang, W.; Tang, N.; Wang, S. Real-time estimation of underwater sound speed profiles with a data fusion convolutional neural network model. Appl. Ocean Res. 2024, 150, 104088.
6. Yang, M.; Sha, Z.; Zhang, F. A Multi-Modal Approach Based on Large Vision Model for Close-Range Underwater Target Localization. arXiv 2024, arXiv:2401.04595.
7. Ge, X.; Zhou, H.; Zhao, J.; Li, X.; Liu, X.; Li, J.; Luo, C. Robust Positioning Estimation for Underwater Acoustics Targets with Use of Multi-Particle Swarm Optimization. J. Mar. Sci. Eng. 2024, 12, 185.
8. Chandrashekar, G.; Sahin, F. A survey on feature selection methods. Comput. Electr. Eng. 2014, 40, 16–28.
9. Li, J.; Cheng, K.; Wang, S.; Morstatter, F.; Trevino, R.P.; Tang, J.; Liu, H. Feature selection: A data perspective. ACM Comput. Surv. 2017, 50, 1–45.
10. Bolón-Canedo, V.; Sánchez-Maroño, N.; Alonso-Betanzos, A. A review of feature selection methods on synthetic data. Knowl. Inf. Syst. 2013, 34, 483–519.
11. Hall, M.A. Correlation-Based Feature Selection for Machine Learning. Ph.D. Thesis, The University of Waikato, Hamilton, New Zealand, 1999.
12. Peng, H.; Long, F.; Ding, C. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans. Pattern Anal. Mach. Intell. 2005, 27, 1226–1238.
13. Guyon, I.; Weston, J.; Barnhill, S.; Vapnik, V. Gene selection for cancer classification using support vector machines. Mach. Learn. 2002, 46, 389–422.
14. Bouaguel, W. A new approach for wrapper feature selection using genetic algorithm for big data. In Intelligent and Evolutionary Systems: Proceedings of the 19th Asia Pacific Symposium, IES 2015, Bangkok, Thailand, 22–25 November 2015; Springer International Publishing: Cham, Switzerland, 2016; pp. 161–171.
15. Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B Stat. Methodol. 1996, 58, 267–288.
16. Zou, H.; Hastie, T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B Stat. Methodol. 2005, 67, 301–320.
17. Khaleghi, B.; Khamis, A.; Karray, F.O.; Razavi, S.N. Multisensor data fusion: A review of the state-of-the-art. Inf. Fusion 2013, 14, 28–44.
18. Fernández-Delgado, M.; Sirsat, M.S.; Cernadas, E.; Alawadi, S.; Barro, S.; Febrero-Bande, M. An extensive experimental survey of regression methods. Neural Netw. 2019, 111, 11–34.
19. Bilal, M.; Ali, G.; Iqbal, M.W.; Anwar, M.; Malik, M.S.A.; Kadir, R.A. Auto-prep: Efficient and automated data preprocessing pipeline. IEEE Access 2022, 10, 107764–107784.
20. Tharwat, A.; Schenck, W. Active Learning for Handling Missing Data. IEEE Trans. Neural Netw. Learn. Syst. 2024.
21. Rousseeuw, P.J.; Hubert, M. Robust statistics for outlier detection. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2011, 1, 73–79.
22. Ahsan, M.M.; Mahmud, M.P.; Saha, P.K.; Gupta, K.D.; Siddique, Z. Effect of data scaling methods on machine learning algorithms and model performance. Technologies 2021, 9, 52.
23. Torgerson, W.S. Theory and Methods of Scaling; Wiley: Hoboken, NJ, USA, 1958.
24. Patro, S.G.; Sahu, K.K. Normalization: A preprocessing stage. arXiv 2015, arXiv:1503.06462.
25. Cleveland, R.B.; Cleveland, W.S.; McRae, J.E.; Terpenning, I. STL: A seasonal-trend decomposition procedure based on loess. J. Off. Stat. 1990, 6, 3–73.
26. Venkatesh, B.; Anuradha, J. A review of feature selection and its methods. Cybern. Inf. Technol. 2019, 19, 3–26.
27. Khalid, S.; Khalil, T.; Nasreen, S. A survey of feature selection and feature extraction techniques in machine learning. In Proceedings of the 2014 Science and Information Conference, London, UK, 27–29 August 2014; pp. 372–378.
28. Gerretzen, J.; Szymanska, E.; Jansen, J.J.; Bart, J.; van Manen, H.J.; van den Heuvel, E.R.; Buydens, L.M. Simple and effective way for data preprocessing selection based on design of experiments. Anal. Chem. 2015, 87, 12096–12103.
29. Karagiannopoulos, M.; Anyfantis, D.; Kotsiantis, S.B.; Pintelas, P.E. Feature selection for regression problems. In Proceedings of the 8th Hellenic European Research on Computer Mathematics & Its Applications, Athens, Greece, 20–22 September 2007.
30. Wu, X.; Kumar, V.; Ross Quinlan, J.; Ghosh, J.; Yang, Q.; Motoda, H.; McLachlan, G.J.; Ng, A.; Liu, B.; Yu, P.S.; et al. Top 10 algorithms in data mining. Knowl. Inf. Syst. 2008, 14, 1–37.
31. Guo, Y.; Wang, W.; Wang, X. A robust linear regression feature selection method for data sets with unknown noise. IEEE Trans. Knowl. Data Eng. 2023, 35, 31–44.
32. Hassani, H.; Mahmoudvand, R.; Yarmohammadi, M. Filtering and denoising in linear regression analysis. Fluct. Noise Lett. 2010, 9, 367–383.
33. Hoerl, A.E.; Kennard, R.W. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics 1970, 12, 55–67.
34. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
35. Friedman, J.H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001, 29, 1189–1232.
36. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; ACM: New York, NY, USA, 2016; pp. 785–794.
37. Cover, T.M.; Hart, P.E. Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 1967, 13, 21–27.
38. Drucker, H.; Burges, C.J.; Kaufman, L.; Smola, A.; Vapnik, V. Support vector regression machines. In Proceedings of the 9th International Conference on Neural Information Processing Systems (NIPS), Denver, CO, USA, 2–5 December 1996; MIT Press: Cambridge, MA, USA, 1996; pp. 155–161.
39. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
Threshold | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
---|---|---|---|---|---|---|---|---|---
Outlier percentage (%) | 26.296 | 4.020 | 0.427 | 0.050 | 0.012 | 0.006 | 0.002 | 0.001 | 0.001
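The percentages above appear to come from sweeping a detection threshold over the data; a minimal sketch of how such a sweep could be computed with Z-scores (on hypothetical Gaussian data, so the numbers will not match the table exactly):

```python
import numpy as np
from scipy import stats

def outlier_percentage(x: np.ndarray, threshold: float) -> float:
    """Percentage of samples whose |z-score| exceeds the threshold."""
    return 100.0 * np.mean(np.abs(stats.zscore(x)) > threshold)

# Hypothetical Gaussian channel; the table's figures come from the study's
# own sensor data, so these values will differ.
x = np.random.default_rng(0).standard_normal(1_000_000)
for thr in range(1, 10):
    print(f"threshold {thr}: {outlier_percentage(x, thr):.3f}%")
```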
Scaling | Denoising | No Feature Selection | Baseline (RFE) | Hierarchical (Proposed)
---|---|---|---|---
Standard Scaling |  | 2.64 | 1.76 | 6.95
 |  | 2.98 | 1.99 | 8.66
 |  | 1.88 | 1.25 | 1.89
Min-Max Scaling |  | 2.10 | 1.12 | 6.36
 |  | 2.29 | 1.22 | 7.81
 |  | 1.42 | 7.61 | 1.77
Robust Scaling |  | 9.43 | 6.64 | 4.20
 |  | 1.07 | 7.38 | 4.85
 |  | 6.77 | 4.72 | 1.06
MaxAbs Scaling |  | 1.47 | 4.75 | 3.77
 |  | 1.55 | 4.81 | 4.31
 |  | 1.04 | 4.99 | 1.46
Normalizer |  | 5.85 | 2.40 | 1.65
 |  | 5.79 | 1.62 | 2.10
 |  | 3.68 | 3.04 | 3.61
Feature Selection (Average R² Score) | Test # | R² Score
---|---|---
No Feature Selection (0.5114) | Test 1 | 0.4839
 | Test 2 | 0.2083
 | Test 3 | 0.6506
 | Test 4 | 0.6697
 | Test 5 | 0.5447
Existing Feature Selection (0.5710) | Test 1 | 0.3698
 | Test 2 | 0.5426
 | Test 3 | 0.7238
 | Test 4 | 0.6336
 | Test 5 | 0.5852
2nd Feature Selection (0.6244) | Test 1 | 0.6601
 | Test 2 | 0.6567
 | Test 3 | 0.6832
 | Test 4 | 0.5563
 | Test 5 | 0.5659
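The parenthesized averages are the means of the five per-test scores; a quick check in Python:

```python
import numpy as np

# Per-test R^2 scores from the table above.
scores = {
    "No Feature Selection": [0.4839, 0.2083, 0.6506, 0.6697, 0.5447],
    "Existing Feature Selection": [0.3698, 0.5426, 0.7238, 0.6336, 0.5852],
    "2nd Feature Selection": [0.6601, 0.6567, 0.6832, 0.5563, 0.5659],
}
for name, s in scores.items():
    print(f"{name}: {np.mean(s):.4f}")
# -> 0.5114, 0.5710, 0.6244, matching the parenthesized averages
```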
Method | MSE | Compared to Baseline (%)
---|---|---
Baseline (RFE) | 6.00 | -
Scaled | 1.21 × 10² | +1916.67%
Denoised | 5.02 × 10¹ | +736.67%
Hierarchical * | 6.52 × 10⁻² | −98.91%
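The comparison column follows the standard relative-change formula, 100 × (MSE − MSE_baseline) / MSE_baseline; a minimal check against the MSE values above:

```python
# Relative change versus the baseline MSE of 6.00.
baseline = 6.00
for name, mse in [("Scaled", 1.21e2), ("Denoised", 5.02e1),
                  ("Hierarchical", 6.52e-2)]:
    print(f"{name}: {100 * (mse - baseline) / baseline:+.2f}%")
# -> Scaled: +1916.67%, Denoised: +736.67%, Hierarchical: -98.91%
```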