For most of the twentieth century, chemistry was a data-poor discipline that relied on well-thought-out hypotheses and carefully planned experiments to develop solutions to real-world problems. With the advent of computerized multichannel instrumentation, chemistry in the twenty-first century is evolving into a data-rich field. According to Lavine and Workman [1],
this has led to a new approach to chemical problem solving: (1) measure a phenomenon or process using instrumentation to generate multivariate data inexpensively, (2) create, test, and validate models that describe the data, (3) iterate steps 1 and 2 if necessary, and (4) interpret the results to develop a fundamental understanding of, and insights into, complex multivariate phenomena or processes. Framing chemical problem solving within this paradigm capitalizes on the synergy between instruments and advanced algorithms for model development to capture the world from a multivariate perspective.
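The minimal sketch below illustrates how steps 1 through 4 might be realized in practice, using simulated multivariate data and scikit-learn's partial least squares (PLS) regression as the modeling step; the data set, model choice, and figure of merit are illustrative assumptions rather than a prescription.

```python
# A minimal sketch of the measure-model-iterate paradigm, assuming simulated
# "spectra" and scikit-learn's PLS regression as the modeling step.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Step 1: "measure" -- simulate 50 samples x 200 channels driven by one latent factor
concentration = rng.uniform(0.0, 1.0, 50)
peak = np.exp(-0.5 * ((np.arange(200) - 100) / 15.0) ** 2)
X = np.outer(concentration, peak) + 0.01 * rng.standard_normal((50, 200))

# Step 2: create, test, and validate a model that describes the data
model = PLSRegression(n_components=2)
cv_r2 = cross_val_score(model, X, concentration, cv=5, scoring="r2")
print(f"Cross-validated R^2: {cv_r2.mean():.3f}")

# Step 3: iterate steps 1 and 2 (e.g., adjust n_components) if validation is poor
# Step 4: interpret -- the first loading vector recovers the latent spectral feature
model.fit(X, concentration)
print("Dominant channel in first loading:", int(np.argmax(np.abs(model.x_loadings_[:, 0]))))
```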
Chemists have relied on inductive learning [2] for model formulation.
Inductive learning develops a universal model for all samples using a training set. Each individual prediction sample is treated as an isolated random observation that either resides inside the training set or is excluded from consideration. Under inductive learning, prediction samples do not influence model generation. However, inductive learning strategies fail when samples lie outside the training-set boundary. This problem has been described in the context of ‘uncalibrated interferents’, i.e., sample matrices and environmental effects that distort the instrumental signatures of observed samples. Multivariate methods, such as partial least squares regression or discriminant analysis, can indicate when a sample lies outside of the training set but cannot make reliable predictions for such samples. Strategies that identify local windows in the data free of these interferences, or that transform the data to isolate the signal of interest, have been investigated but show only modest improvements in prediction.
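The sketch below illustrates this limitation under stated assumptions: a PLS model trained on interferent-free simulated spectra can flag a prediction sample distorted by an uncalibrated interferent through its Q residual (squared spectral reconstruction error), but its prediction for that sample remains unreliable. The spectra, the interferent shape, and the flagging criterion are hypothetical.

```python
# A hedged sketch of the inductive-learning limitation: a PLS model trained on
# interferent-free spectra can flag an out-of-domain prediction sample through
# its Q residual, but its prediction for that sample is unreliable.
# Data, interferent shape, and threshold are illustrative assumptions.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(1)
channels = np.arange(200)
analyte = np.exp(-0.5 * ((channels - 80) / 10.0) ** 2)
interferent = np.exp(-0.5 * ((channels - 120) / 10.0) ** 2)  # absent from training

# Training set: analyte signal only (the inductive model is built once, from these data)
y_train = rng.uniform(0.0, 1.0, 40)
X_train = np.outer(y_train, analyte) + 0.01 * rng.standard_normal((40, 200))
model = PLSRegression(n_components=1).fit(X_train, y_train)

# Prediction sample distorted by the uncalibrated interferent
x_new = 0.5 * analyte + 0.8 * interferent + 0.01 * rng.standard_normal(200)

def q_residual(x, pls):
    """Squared reconstruction error of a spectrum from its PLS scores."""
    scores = pls.transform(x.reshape(1, -1))
    x_hat = pls.inverse_transform(scores)
    return float(np.sum((x.reshape(1, -1) - x_hat) ** 2))

q_train = np.array([q_residual(x, model) for x in X_train])
print("Training Q residual, 95th percentile:", np.percentile(q_train, 95))
print("New-sample Q residual (flagged if much larger):", q_residual(x_new, model))
print("New-sample prediction (unreliable):", float(np.ravel(model.predict(x_new.reshape(1, -1)))[0]))
```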
Consequently, chemists and other physical scientists turned to transductive learning [3]
to build models around both the training and prediction sets, ensuring that the model addresses the challenges posed by both data sets. Transduction defines the learning task as predicting the correct labels for specified unlabeled test data, not for all possible future data. The model is built to be valid for the specific data it is tasked to predict; this simpler task can yield theoretically tighter bounds on the prediction error. The concept underlying transduction therefore offers a promising framework for addressing the problem of predicting samples that fall outside the training set. The key hypothesis in transductive learning is that making the specific prediction data available, though unlabeled, at the time of training improves model performance by reoptimizing, i.e., updating, the model for every future prediction sample. When the training data consist of relatively few labeled points in a high-dimensional space, the information in the unlabeled data helps prevent the classification or regression model from overfitting the training data.
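A minimal transductive sketch is given below, using scikit-learn's LabelSpreading as one example of a transductive learner; the data are simulated, and the point is only that the specific unlabeled prediction samples (marked with the label -1) are supplied at training time and thereby shape the model.

```python
# A minimal transductive sketch: the specific unlabeled prediction samples
# (labeled -1) are supplied at training time and shape the model through the
# similarity graph built over labeled and unlabeled points. Data are illustrative.
import numpy as np
from sklearn.semi_supervised import LabelSpreading

rng = np.random.default_rng(2)

# Relatively few labeled training samples in a 10-dimensional measurement space
X_labeled = np.vstack([rng.normal(0.0, 0.3, (5, 10)), rng.normal(1.0, 0.3, (5, 10))])
y_labeled = np.array([0] * 5 + [1] * 5)

# The specific prediction set: unlabeled, but available when the model is built
X_unlabeled = np.vstack([rng.normal(0.0, 0.3, (20, 10)), rng.normal(1.0, 0.3, (20, 10))])

X_all = np.vstack([X_labeled, X_unlabeled])
y_all = np.concatenate([y_labeled, -np.ones(len(X_unlabeled), dtype=int)])

model = LabelSpreading(kernel="rbf", gamma=2.0).fit(X_all, y_all)

# transduction_ holds the labels inferred for every sample, including the
# unlabeled prediction set that influenced model construction
print(model.transduction_[len(X_labeled):])
```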
Recently, transfer learning [4],
a particular application of transductive learning, has been investigated as a way to construct better machine learning models from less training data. Specifically, transfer learning aims to improve a model's predictions on a primary task by leveraging data from one or more related auxiliary tasks. The only requirement is that the primary and auxiliary tasks be drawn from related domains. For example, predicting the enantiomeric excess of a cross-coupling reaction on a new substrate could be a representative primary task. By using the correlations learned from a previously measured set of substrates (i.e., the auxiliary tasks), the machine learning model may perform better with limited data on the new substrate (i.e., the primary task). Transfer learning has shown great promise in a variety of process research and development tasks. In addition, transfer learning in the form of model updating offers a potential solution to the computationally expensive problem of retraining deep learning models. The upcoming publication entitled “The Future of Molecular-Scale Measurements Enabled by Chemical Data Science” includes a report describing the 2022 NSF Workshop “Envisioning Data Driven Advances in Measurement and Instrumentation for Chemical Discovery”, which provides a comprehensive view of transfer learning in the context of several outstanding research questions in chemical data science.
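One simple way to realize transfer learning as model updating is sketched below: a small neural network is first trained on abundant data from auxiliary substrates and then updated with a handful of samples from the new substrate. The descriptors, targets, and the relationship between the tasks are simulated assumptions, not a description of any published workflow.

```python
# A hedged sketch of transfer learning as model updating: a network pretrained
# on auxiliary substrates is updated with a few samples from the new substrate.
# Descriptors, targets, and the relationship between tasks are simulated.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(3)
n_descriptors = 12

# Auxiliary tasks: abundant (descriptor, enantiomeric-excess) pairs from
# previously measured substrates sharing the underlying correlations
w_aux = rng.standard_normal(n_descriptors)
X_aux = rng.standard_normal((300, n_descriptors))
y_aux = X_aux @ w_aux + 0.1 * rng.standard_normal(300)

# Primary task: only ten labeled samples for the new substrate, governed by a
# related (slightly perturbed) structure-property relationship
w_new = w_aux + 0.2 * rng.standard_normal(n_descriptors)
X_new = rng.standard_normal((10, n_descriptors))
y_new = X_new @ w_new + 0.1 * rng.standard_normal(10)

# Pretrain on the auxiliary data, then update (fine-tune) on the primary data
model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
model.fit(X_aux, y_aux)
for _ in range(200):
    model.partial_fit(X_new, y_new)

print("Updated-model prediction for one new-substrate sample:",
      float(model.predict(X_new[:1])[0]))
```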