A Framework for the Optimization of Complex Cyber-Physical Systems via Directed Acyclic Graph
Abstract
1. Introduction
2. Data Leakage in ML Experiments
3. Library Design
3.1. Project Management
3.2. Implementation Details
3.3. Example
 scaler: A scikit-learn MinMaxScaler data preprocessor in charge of scaling the dataset.
 classifier: A scikit-learn GaussianMixture classifier in charge of clustering the dataset and classifying any new sample.
 demux: A custom Demultiplexer class in charge of splitting the input arrays according to the selection input vector. This block is provided by PipeGraph.
 lm_0, lm_1, lm_2: A set of scikit-learn LinearRegression objects, one for each cluster of the dataset.
 mux: A custom Multiplexer class in charge of combining different input arrays into a single one according to the selection input vector. This block is provided by PipeGraph.
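
The split-and-recombine role of the demux and mux blocks can be illustrated with a minimal sketch in plain Python. The functions demultiplex and multiplex below are hypothetical stand-ins written for this illustration; they are not PipeGraph's Demultiplexer and Multiplexer classes, and the per-cluster "models" are simple scale factors rather than fitted LinearRegression objects.

```python
# Hypothetical stand-ins for the demux and mux blocks (illustration only;
# these are NOT PipeGraph's Demultiplexer/Multiplexer classes).

def demultiplex(rows, selection):
    """Split rows into groups keyed by the matching label in selection."""
    groups = {}
    for row, label in zip(rows, selection):
        groups.setdefault(label, []).append(row)
    return groups


def multiplex(groups, selection):
    """Merge the per-label groups back into the original sample order."""
    iterators = {label: iter(values) for label, values in groups.items()}
    return [next(iterators[label]) for label in selection]


# Toy samples and the cluster label a classifier step might assign to each.
X = [1, 9, 2, 5, 8]
selection = [0, 2, 0, 1, 2]

# demux: route every sample to the group of its cluster.
X_by_cluster = demultiplex(X, selection)

# One hypothetical "model" per cluster; here just a fixed scale factor.
scale = {0: 10, 1: 100, 2: 1000}
pred_by_cluster = {
    label: [scale[label] * x for x in xs]
    for label, xs in X_by_cluster.items()
}

# mux: reassemble the per-cluster predictions in the original order.
y_pred = multiplex(pred_by_cluster, selection)
print(y_pred)
```

The selection vector is used twice: once to route each sample to its cluster-specific model, and once to place each prediction back at the position of the sample it came from.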
3.4. Implemented Methods
 inject(sink, sink_var, source, source_var) Defines a connection between two nodes of the graph, declaring which variable (source_var) from the origin node (source) is passed to the destination node (sink) under the new variable name (sink_var).
 decision_function(X) Applies PipeGraphClassifier’s predict method and returns the decision_function output of the final estimator.
 fit(X, y=None, **fit_params) Fits the PipeGraph steps one after the other, following the topological order of the graph defined by the connections attribute.
 fit_predict(X, y=None, **fit_params) Applies predict of a PipeGraph to the data following the topological order of the graph, followed by the fit_predict method of the final step in the PipeGraph. Valid only if the final step implements fit_predict.
 get_params(deep=True) Gets parameters for an estimator.
 predict(X) Runs the prediction through the PipeGraph steps one after the other, following the topological order defined by the alternative_connections attribute if it is not None, or by the connections attribute otherwise.
 predict_log_proba(X) Applies PipeGraphClassifier’s predict method and returns the predict_log_proba output of the final estimator.
 predict_proba(X) Applies PipeGraphClassifier’s predict method and returns the predict_proba output of the final estimator.
 score(X, y=None, sample_weight=None) Applies PipeGraphRegressor’s predict method and returns the score output of the final estimator.
 set_params(**kwargs) Sets the parameters of this estimator. Valid parameter keys can be listed with get_params().
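
As an illustration of how connections declared in the style of inject determine the order in which fit visits the steps, the following sketch derives a topological order with Python's standard graphlib module. It is a simplified model under stated assumptions, not PipeGraph's implementation: the function fit_order, the "_External" label for data passed in from outside the graph, and the variable names "predict" and "X_0" are all hypothetical.

```python
from graphlib import TopologicalSorter

# Each tuple mirrors the argument order of
# inject(sink, sink_var, source, source_var).
# "_External", "predict" and "X_0" are hypothetical names for illustration.
connections = [
    ("scaler", "X", "_External", "X"),
    ("classifier", "X", "scaler", "predict"),
    ("demux", "X", "scaler", "predict"),
    ("demux", "selection", "classifier", "predict"),
    ("lm_0", "X", "demux", "X_0"),
    ("mux", "0", "lm_0", "predict"),
    ("mux", "selection", "classifier", "predict"),
]


def fit_order(connections, external="_External"):
    """Return the node names ordered so every source precedes its sinks."""
    predecessors = {}
    for sink, _sink_var, source, _source_var in connections:
        predecessors.setdefault(sink, set())
        if source != external:
            # The source node must be processed before the sink node.
            predecessors[sink].add(source)
            predecessors.setdefault(source, set())
    # static_order() raises graphlib.CycleError if the graph has a cycle.
    return list(TopologicalSorter(predecessors).static_order())


order = fit_order(connections)
print(order)
```

Because every step appears only after all the steps that feed it, each node can be fitted with inputs that have already been computed. A cyclic set of connections has no such order, which is why the graph must be acyclic.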
4. Case Studies
4.1. Anomaly Detection in Manufacturing Processes
4.2. Heat Exchanger Modeling
5. Conclusions
Appendix A
Listing A1. Example code for the PipeGraph shown in Figure 2. 

