Recent advances in single-molecule science have revealed an astonishing amount of detail about the microscopic states of molecules, which in turn has created a need for simple, automated processing of large numbers of time series. In particular, large datasets of time series of single protein molecules have been obtained using laser optical tweezers. In this system, each molecular state yields a separate time series whose composition is relatively uneven in terms of local descriptive statistics. In the past, the assessment of uncertain data quality and of the heterogeneity of molecular states depended on human experience and was biased by it. Because this data-processing knowledge is not directly transferable to a black-box framework for efficient classification, the rapid evaluation of a large number of simultaneously measured time series can be a serious obstacle. To solve this particular problem, we have implemented a supervised learning method that combines local entropic models with the global Lehmer average. We find that this combination is suitable for fast and simple categorization, which enables rapid pre-processing of the data with minimal optimization and user intervention.
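As an illustration of how local entropic descriptors might be combined with a global Lehmer average, the following minimal Python sketch computes a Shannon entropy for each window of a time series and aggregates the window entropies with the Lehmer mean, L_p(x) = Σ x_i^p / Σ x_i^(p-1). The window length, the histogram binning, the order p = 2, and all function names are assumptions made for this example; the abstract does not specify them, and the supervised step that would map such a score to a class label is not shown.

```python
import math
from collections import Counter


def lehmer_mean(values, p=2.0):
    """Lehmer mean of order p: sum(x**p) / sum(x**(p - 1)), for positive x."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("lehmer_mean expects a non-empty list of positive values")
    numerator = sum(v ** p for v in values)
    denominator = sum(v ** (p - 1) for v in values)
    return numerator / denominator


def window_entropy(window, n_bins, lo, hi):
    """Shannon entropy (in bits) of a fixed-range histogram of one window."""
    if hi <= lo:
        return 0.0  # constant signal: a single occupied bin, zero entropy
    width = (hi - lo) / n_bins
    # Assign each sample to a bin; the maximum value falls into the last bin.
    counts = Counter(min(int((x - lo) / width), n_bins - 1) for x in window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def series_score(series, window=4, n_bins=4, p=2.0):
    """Hypothetical global score: Lehmer mean of the local window entropies."""
    lo, hi = min(series), max(series)
    entropies = [
        window_entropy(series[i:i + window], n_bins, lo, hi)
        for i in range(0, len(series) - window + 1, window)
    ]
    # Assumption: zero-entropy windows are skipped, since the Lehmer mean
    # is defined here for positive values only.
    positive = [h for h in entropies if h > 0]
    return lehmer_mean(positive, p) if positive else 0.0
```

For p = 1 the Lehmer mean reduces to the arithmetic mean, and larger p weights high-entropy windows more strongly, which is one possible reason to prefer it as a global aggregate over a plain average.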
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.