Automatic Filtering of Sugarcane Yield Data
Abstract
1. Introduction
2. Materials and Methods
2.1. Obtaining Sugarcane Harvest Data and the Online Sliding Window Algorithm
2.2. The Filtering-Out Method and the Development of the Sliding Window Algorithm
2.3. Dataset and Case Studies
2.4. Validation
3. Results and Discussion
3.1. Sliding Window Filtering Algorithm Performance
3.2. Performance of Artificial Outliers’ Detection
3.3. Methods’ Performance Comparison
4. Conclusions
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Dataset | Number of Fields | Area (ha) | Number of Points | Density (Points ha−1) | Mean Yield (Mg ha−1) |
---|---|---|---|---|---|
1 | 6 | 48.99 | 70,120 | 1431.05 | 108.80 |
2 | 11 | 73.43 | 70,446 | 959.31 | 51.08 |

Dataset | SW | N | Mean (Mg ha−1) | Min (Mg ha−1) | Max (Mg ha−1) | SD (Mg ha−1) | CV (%) | Asy | Kurt |
---|---|---|---|---|---|---|---|---|---|
1 | 10 | 41,795 | 129.89 | 38.00 | 259.47 | 22.74 | 17.51 | 0.07 | 0.02 |
1 | 20 | 40,907 | 128.68 | 39.14 | 223.77 | 22.43 | 17.43 | 0.03 | −0.10 |
1 | 30 | 41,250 | 128.07 | 38.76 | 195.01 | 22.14 | 17.29 | 0.02 | −0.09 |
1 | 50 | 41,771 | 127.40 | 23.92 | 195.01 | 21.72 | 17.05 | −0.02 | −0.08 |
1 | 100 | 42,032 | 127.11 | 21.41 | 805.60 | 21.28 | 16.74 | 0.78 | 24.73 |
1 | 200 | 42,072 | 126.79 | 21.41 | 805.60 | 20.87 | 16.46 | 0.77 | 26.75 |
2 | 10 | 40,820 | 52.25 | 27.15 | 83.43 | 6.63 | 12.69 | 0.21 | 0.06 |
2 | 20 | 41,886 | 51.81 | 21.15 | 83.43 | 6.61 | 12.76 | 0.20 | 0.06 |
2 | 30 | 43,676 | 51.73 | 21.15 | 83.43 | 6.55 | 12.67 | 0.18 | 0.02 |
2 | 50 | 44,113 | 51.63 | 16.91 | 83.43 | 6.50 | 12.59 | 0.15 | 0.03 |
2 | 100 | 44,353 | 51.60 | 14.38 | 80.06 | 6.46 | 12.53 | 0.11 | −0.05 |
2 | 200 | 44,322 | 51.50 | 14.38 | 78.25 | 6.42 | 12.46 | 0.06 | −0.06 |

Dataset | Method | Outlier Magnitude ±0.01 | Outlier Magnitude ±0.05 | Outlier Magnitude ±0.10 | Outlier Magnitude ±0.50 | Outlier Magnitude ±0.80 | Outlier Magnitude ±1.00 |
---|---|---|---|---|---|---|---|
1 | Sliding Window Algorithm | 26.35 | 26.35 | 27.36 | 93.24 | 97.30 | 97.30 |
1 | MapFilter 2.0 | 100.00 | 100.00 | 98.65 | 98.31 | 100.00 | 98.99 |
2 | Sliding Window Algorithm | 26.30 | 24.44 | 21.11 | 98.15 | 99.26 | 99.26 |
2 | MapFilter 2.0 | 100.00 | 100.00 | 99.26 | 99.26 | 100.00 | 99.63 |

Dataset | Method | n | Mean (Mg ha−1) | Min (Mg ha−1) | Max (Mg ha−1) | SD (Mg ha−1) | CV (%) |
---|---|---|---|---|---|---|---|
1 | Raw Data | 70,120 | 108.80 | 0.00 | 860.16 | 58.02 | 53.33 |
1 | Sliding Window Algorithm | 41,771 | 127.40 | 23.92 | 195.01 | 21.72 | 17.05 |
1 | MapFilter 2.0 | 39,645 | 127.28 | 85.54 | 158.82 | 16.32 | 12.82 |
2 | Raw Data | 70,446 | 51.08 | 0.00 | 233.99 | 22.24 | 55.44 |
2 | Sliding Window Algorithm | 44,113 | 51.63 | 16.91 | 83.43 | 6.50 | 12.59 |
2 | MapFilter 2.0 | 48,640 | 51.28 | 34.43 | 63.44 | 5.75 | 11.21 |
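
The columns of the method comparison table are related by simple arithmetic, which the following minimal Python sketch illustrates. It assumes the usual definition of the coefficient of variation, CV = 100 × SD / mean, and adds a derived quantity that is not a column in the table: the percentage of raw points each method retains. The names `rows`, `cv_percent`, `retained_percent`, and `summarize` are hypothetical helpers introduced only for this illustration; the numbers are copied from the table.

```python
# Values copied from the comparison table: (dataset, method) -> (n, mean, sd).
# Mean and SD are in Mg ha^-1.
rows = {
    (1, "Raw Data"): (70120, 108.80, 58.02),
    (1, "Sliding Window Algorithm"): (41771, 127.40, 21.72),
    (1, "MapFilter 2.0"): (39645, 127.28, 16.32),
    (2, "Raw Data"): (70446, 51.08, 22.24),
    (2, "Sliding Window Algorithm"): (44113, 51.63, 6.50),
    (2, "MapFilter 2.0"): (48640, 51.28, 5.75),
}


def cv_percent(mean, sd):
    """Coefficient of variation in percent (assumed definition: 100 * SD / mean)."""
    return 100.0 * sd / mean


def retained_percent(n, n_raw):
    """Share of the raw points that remain after filtering, in percent."""
    return 100.0 * n / n_raw


def summarize(table):
    """Return {(dataset, method): (cv, retained)} rounded to two decimals."""
    summary = {}
    for (dataset, method), (n, mean, sd) in table.items():
        n_raw = table[(dataset, "Raw Data")][0]  # raw point count of the same dataset
        summary[(dataset, method)] = (
            round(cv_percent(mean, sd), 2),
            round(retained_percent(n, n_raw), 2),
        )
    return summary


for (dataset, method), (cv, kept) in summarize(rows).items():
    print(f"Dataset {dataset} | {method}: CV = {cv}%, retained = {kept}% of raw points")
```

Recomputed from the rounded table values, the CV agrees with the tabulated column for every row except the Dataset 2 raw data, where SD / mean gives about 43.5% against the tabulated 55.44%; the table is reproduced here as published.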
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
da Silva, E.R.O.; Molin, J.P.; Wei, M.C.F.; Canal Filho, R. Automatic Filtering of Sugarcane Yield Data. AgriEngineering 2024, 6, 4812-4830. https://doi.org/10.3390/agriengineering6040275