Computational Package for Copolymerization Reactivity Ratio Estimation: Improved Access to the Error-in-Variables-Model

The error-in-variables-model (EVM) is the most statistically correct non-linear parameter estimation technique for reactivity ratio estimation. However, many polymer researchers are unaware of the advantages of EVM and therefore still choose to use rather erroneous or approximate methods. The procedure is straightforward but it is often avoided because it is seen as mathematically and computationally intensive. Therefore, the goal of this work is to make EVM more accessible to all researchers through a series of focused case studies. All analyses employ a MATLAB-based computational package for copolymerization reactivity ratio estimation. The basis of the package is previous work in our group over many years. This version is an improvement, as it ensures wider compatibility and enhanced flexibility with respect to copolymerization parameter estimation scenarios that can be considered.


Introduction
In copolymerization kinetics, reactivity ratios are important parameters. Not only do reactivity ratio estimates specify the degree of incorporation of each comonomer into the copolymer (i.e., average copolymer composition) but they also provide information about other copolymer microstructural indicators (namely azeotropic point, sequence length distribution, triad fractions and so on). This knowledge of kinetics and microstructure can be useful in synthesizing copolymers with specific desirable properties for specific applications. Thus, polymer chemists and polymer reaction engineers require reliable reactivity ratio estimates.
Over the years, many different (and incorrect) methods have been implemented for reactivity ratio estimation. Linear parameter estimation techniques (such as the Mayo-Lewis method (method of intersections), the Fineman-Ross method and the Kelen-Tüdös method) were used previously due to lack of computational power. However, these techniques should not be used, as linear estimation techniques applied to non-linear models result in faulty parameter estimates and a distorted error structure [1][2][3]. Other common sources of error in parameter estimation include poorly designed experiments (too few, usually unreplicated, data points chosen at random), inherent experimental difficulties (especially at low conversion levels) and inappropriate kinetic models. Ultimately, this has created a wide variety of reactivity ratios in the literature, even for similar copolymer systems (see, for example, reactivity ratios associated with the copolymer of 2-acrylamido-2-methylpropane sulfonic acid and acrylamide, as summarized by Scott et al. [4]).
The most statistically correct technique for reactivity ratio estimation is the error-in-variables-model (EVM). EVM is a non-linear parameter estimation technique that considers the error present in all variables. The procedure is fairly straightforward but somewhat computationally intensive. As a result, researchers often revert to "historical" (and incorrect) linear parameter estimation techniques. It is speculated that researchers choose not to use EVM for two main reasons: (1) they are unaware of EVM and its advantages, and/or (2) they are intimidated by the complexity of the background mathematics required to use EVM. Therefore, the goal of the current work (based on past work as described in references [1][2][3]) is to make EVM more accessible to all researchers by analyzing a variety of copolymerization case studies using a ready-to-use computational package.
A series of five case studies presented herein revisits copolymerization data from the literature; each analysis has a specific goal in mind. Initially, we will look at current "best practices" and their shortcomings (Section 4.1) by exploring linear parameter estimation techniques (Exhibit A) and the limitations associated with low conversion data sets (Exhibit B). Next, we will demonstrate how to maximize and exploit information content from experimental data (Section 4.2). More specifically, case studies will exhibit the benefits of using cumulative copolymerization data (Exhibit C), the need for replicated experiments (Exhibit D) and the advantages of sequential design of experiments (Exhibit E).

Copolymerization Models
Monomer reactivity ratios (r1 and r2) are parameters used to describe the potential for homopropagation relative to cross-propagation. Reactivity ratios can be estimated using experimental data and a copolymerization model, if the unreacted monomer composition in the polymerizing mixture and the cumulative copolymer composition are known [1][2][3][4][5].
The Mayo-Lewis equation (see Equation (1)), also called the instantaneous copolymer composition (ICC) equation, is the most widely used copolymerization model. Equation (1) can be used to determine the instantaneous mole fraction of monomer 1 incorporated into the copolymer (F1) given the comonomer composition in the polymerizing mixture (as mole fractions of unbound monomer, fi). It is important to note that the Mayo-Lewis equation provides the instantaneous copolymer composition, which means that the model is only applicable for low conversion data (typically <10%, where composition drift is minimal).
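$$F_1 = \frac{r_1 f_1^2 + f_1 f_2}{r_1 f_1^2 + 2 f_1 f_2 + r_2 f_2^2} \qquad (1)$$
where r1 = k11/k12 and r2 = k22/k21 (kij is the rate constant for each of the four possible propagation reactions, with active center i adding monomer j).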
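For readers who prefer to see the ICC model as code, the following is a minimal sketch that assumes the standard form of the Mayo-Lewis equation, F1 = (r1 f1^2 + f1 f2) / (r1 f1^2 + 2 f1 f2 + r2 f2^2) with f2 = 1 - f1. The function name `mayo_lewis` is hypothetical and is not part of the package described in this work; the numerical values are illustrative only.

```python
def mayo_lewis(f1, r1, r2):
    """Instantaneous mole fraction of monomer 1 in the copolymer (F1).

    f1     : mole fraction of unbound monomer 1 in the mixture (0 to 1)
    r1, r2 : reactivity ratios (k11/k12 and k22/k21)
    """
    f2 = 1.0 - f1
    numerator = r1 * f1 ** 2 + f1 * f2
    denominator = r1 * f1 ** 2 + 2.0 * f1 * f2 + r2 * f2 ** 2
    return numerator / denominator


# Illustrative values only (not taken from any data set in this work).
for f1 in (0.2, 0.5, 0.8):
    print(f1, mayo_lewis(f1, r1=0.5, r2=0.5))
```

When r1 = r2 = 1 the function returns F1 = f1 for any feed, which is a convenient sanity check before using it inside an estimation routine.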
In order to analyze copolymerization data for medium or high conversion levels, the cumulative form of the copolymer composition model becomes necessary. Direct numerical integration (DNI) requires combining and solving (simultaneously) an instantaneous mole balance and a cumulative mole balance (after reaching a certain molar conversion level, Xn). The instantaneous mole balance (Equation (2)) is an ordinary differential equation, from which fi can be found at any conversion level (initial conditions f1 = f1,0 at Xn = 0). The cumulative mole fraction corresponding to Xn is given by the well-known Skeist equation (Equation (3)). DNI is a direct numerical approach and does not rely on model transformations or other potentially restrictive assumptions. This is a significant advantage over other estimation approaches with copolymerization models [3,6].
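The DNI idea can be sketched in a few lines. The sketch below assumes the usual forms of the two balances, namely df1/dXn = (f1 - F1)/(1 - Xn) for the instantaneous mole balance and F̄1 = (f1,0 - f1 (1 - Xn))/Xn for the cumulative balance, and it uses a fixed-step fourth-order Runge-Kutta integrator for simplicity. The function names and the choice of integrator are assumptions made for illustration; they are not taken from the MATLAB package.

```python
def mayo_lewis(f1, r1, r2):
    """Instantaneous copolymer composition F1 (standard Mayo-Lewis form)."""
    f2 = 1.0 - f1
    return (r1 * f1 ** 2 + f1 * f2) / (r1 * f1 ** 2 + 2.0 * f1 * f2 + r2 * f2 ** 2)


def cumulative_composition(f10, xn, r1, r2, steps=1000):
    """Return (f1, F1_cumulative) at molar conversion xn, with 0 < xn < 1.

    Integrates df1/dXn = (f1 - F1) / (1 - Xn) from Xn = 0 with classical RK4,
    then applies the cumulative mole balance.
    """
    def rhs(x, f1):
        return (f1 - mayo_lewis(f1, r1, r2)) / (1.0 - x)

    h = xn / steps
    x, f1 = 0.0, f10
    for _ in range(steps):
        k1 = rhs(x, f1)
        k2 = rhs(x + h / 2.0, f1 + h * k1 / 2.0)
        k3 = rhs(x + h / 2.0, f1 + h * k2 / 2.0)
        k4 = rhs(x + h, f1 + h * k3)
        f1 += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x += h
    f1 = min(max(f1, 0.0), 1.0)  # guard against round-off outside [0, 1]
    f1_cumulative = (f10 - f1 * (1.0 - xn)) / xn
    return f1, f1_cumulative


# Illustrative values only: composition drift up to 60% molar conversion.
print(cumulative_composition(f10=0.5, xn=0.6, r1=2.0, r2=0.5))
```

At very low conversion the cumulative composition returned by this sketch approaches the instantaneous value from the Mayo-Lewis equation, which is the reason the instantaneous model is restricted to low conversion data.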

Error-in-Variables-Model (EVM)
A full statistical explanation of the error-in-variables-model (and enumeration of its benefits) has been presented previously; the interested reader should refer to Reilly and Patino-Leal [7] or Kazemi et al. [3,6,8].Only the basics are presented herein, for the reader to have a brief overview before we tackle the case studies.
As mentioned previously, EVM forces the researcher to consider all sources of error, including the error associated with independent variables (such as feed composition) and (measured) cumulative copolymer composition. To obtain estimates of the "true" values of both the independent variables and the parameters, the EVM program uses a nested-iterative loop (this is represented schematically in Figure 1, with variables defined in the discussion below). The inner loop searches for "true" values of the independent variables, since there is inevitably some error associated with the measured values. Mathematically, we can relate the vector of measurements (xi) to the vector of their unknown "true" values (ξi) and an error term (kεi), according to Equation (4). In the error term, k is a constant that represents the magnitude of the error and ε (error) is a random variable that is typically uniformly distributed on the interval [−1, 1] (an additional explanation is included in Appendix A, for the interested reader). At the same time, the outer loop uses a copolymerization model (such as the ICC model, Equation (1)) to relate the "true" variables and the parameter (reactivity ratio) estimates, as shown in Equation (5).
From a statistical perspective, the program uses this nested-iterative approach to minimize the sum of squares between the observed and predicted values, both in terms of the error in the independent variables and in terms of the parameter estimates. When the objective function (Equation (6)) is minimized, the program has found the best estimates for both the independent variables and the parameters (reactivity ratios).
$$\Phi = \frac{1}{2}\sum_{i=1}^{n} r_i\,(\bar{x}_i - \hat{\xi}_i)'\,V^{-1}\,(\bar{x}_i - \hat{\xi}_i) \qquad (6)$$
where n is the number of experimental trials (runs), ri is the number of replicates for the ith trial, x̄i is the average of the ri measurements (xi), ξ̂i is an estimate of the true values of the variables (ξi) and V is the variance-covariance matrix of the variables (which provides information about measurement error of the variables involved).
Alternatively, minimizing the objective function can be considered graphically, as in Figure 2. Given a model and some measured (independent) data, the inner loop minimizes the horizontal distances between the data points and the model (curve). At the same time, the outer loop minimizes the vertical distances between the data points and the model (that is, the outer loop attempts to reconcile model predictions and measurements).
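The nested structure described above can be illustrated with a deliberately simplified sketch for the instantaneous model. It assumes additive error, one measurement per trial (ri = 1), a diagonal V with hypothetical variances `var_f` and `var_F`, and a golden-section search as the inner loop; none of these choices is claimed to match the RREVM or MATLAB implementations. For a trial set of reactivity ratios, the inner search finds the point on the model curve closest (in the weighted sense) to each measurement, and the returned value is what an outer optimizer would then minimize over (r1, r2).

```python
import math


def mayo_lewis(f1, r1, r2):
    """Instantaneous copolymer composition F1 (standard Mayo-Lewis form)."""
    f2 = 1.0 - f1
    return (r1 * f1 ** 2 + f1 * f2) / (r1 * f1 ** 2 + 2.0 * f1 * f2 + r2 * f2 ** 2)


def golden_min(func, lo, hi, tol=1e-10):
    """Golden-section search for the minimizer of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if func(c) < func(d):
            b = d
        else:
            a = c
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return 0.5 * (a + b)


def evm_objective(theta, data, var_f, var_F):
    """Objective value for trial parameters theta = (r1, r2).

    data is a list of (f1_measured, F1_measured) pairs. The inner loop
    estimates the "true" feed composition for each trial; the "true"
    copolymer composition then follows from the model.
    """
    r1, r2 = theta
    phi = 0.0
    for f_meas, F_meas in data:
        def weighted_distance(f_true):
            F_true = mayo_lewis(f_true, r1, r2)
            return (f_meas - f_true) ** 2 / var_f + (F_meas - F_true) ** 2 / var_F

        f_hat = golden_min(weighted_distance, 0.0, 1.0)  # inner loop
        phi += 0.5 * weighted_distance(f_hat)
    return phi


# Synthetic, noise-free data generated from the model itself (illustration only).
data = [(f, mayo_lewis(f, 0.4, 1.6)) for f in (0.1, 0.3, 0.5, 0.7, 0.9)]
print(evm_objective((0.4, 1.6), data, var_f=1e-4, var_F=1e-3))  # essentially zero
print(evm_objective((1.0, 1.0), data, var_f=1e-4, var_F=1e-3))  # much larger
```

In a full implementation the outer loop would adjust (r1, r2) until this value stops decreasing; the sketch only evaluates it at two trial points to show how the inner and outer problems fit together.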
Processes 2018, 6, 8
The computational package described herein that employs EVM for reactivity ratio estimation is based on the RREVM program created by Dubé et al. [1] (in Fortran 77), which was later updated by Polic et al. [2].The program version was further updated and converted to MATLAB by Kazemi et al. [3,6,8].The new and improved software version in MATLAB ensures wider compatibility and allows for possible extensions to other multi-component systems [9].Also, using MATLAB as the program platform allows for open-source programming, which gives researchers the option of modifying and tailoring the program as needed.

Overview
Although the technical aspects of EVM were kept to a minimum in Section 2 (since more details can be found in the references), several comments are now in order about the modifications to the program, in order to make it more user-friendly.The program has been equipped with a graphical user interface (GUI), so that very little knowledge of MATLAB (or programming, in general) is required to invoke the EVM algorithm.Once users open the QuickStart file and execute the program, they are presented with a series of instructions and user prompts.Details and program screenshots are presented in Appendix B for the interested reader, whereas a brief overview is presented in this section.
First, the user must choose their preferred method of data entry (Figure A1).Data input may be manual (with step by step prompts) or may employ a user-prepared data file.Next, the user must indicate whether the copolymerization information (data) for analysis is instantaneous (below 10% conversion) or cumulative (medium-high conversion).
Once these preliminary decisions have been made, the user is prompted to provide the copolymerization data required for analysis (Section 3.2). Finally, the program evaluates the data and presents the results (Section 3.3). The program typically converges within seconds for the instantaneous analysis and in under one minute for the cumulative analysis (usually in fewer than 10 to 20 iterations in both cases). The time (or number of iterations) required for convergence depends on the location of the initial estimates and the precision required in the estimates; better initial estimates will result in faster program convergence. The EVM program is also equipped with a "time-out" option, which is triggered if no solution has been found after a pre-determined number of iterations.

Instantaneous Model
Preliminary estimates of r1 and r2 act as "starting points" for EVM (Figure A2). Depending on how much is known about the copolymer system, this information may be acquired from either the literature or preliminary experiments. Good starting guesses for reactivity ratios can come from a prior study on the same copolymer system (or even from existing values for a similar system) or from simple trending analysis based on preliminary/screening experiments over a limited range of conditions. Prior knowledge can provide valuable information for both the design and estimation steps. If these preliminary estimates are far from the real ones, convergence may simply take slightly longer than the typical orders of magnitude given at the end of Section 3.1.
For the instantaneous case, the copolymerization data required are the feed composition (f1,0) and cumulative copolymer composition (F̄1), both in terms of monomer 1 (see sample program prompt in Figure A3). (The instantaneous F1 is approximated by the measured cumulative F̄1 for low conversion experiments treated by the instantaneous case.) As mentioned previously, the ICC model assumes that composition drift does not occur. Therefore, it is recommended that only low conversion data (below 10% conversion) be included in this analysis.
The final prompt prior to parameter estimation is a review of the default settings (Figure A4).This window gives users the opportunity to check settings such as the error type (additive vs. multiplicative error), the error tolerance level and the variance-covariance matrix for the copolymerization system.Details regarding these input values are presented in the Default Settings section (Section 3.2.3).
If the user prefers to use a pre-made data file for program input (see again Figure A1), the same information is required: preliminary reactivity ratio estimates, experimental data and program settings.However, all of the data input is presented in a single '.txt' file, which can be saved, modified (as necessary) and re-analyzed.This is particularly advantageous if a data set is being altered slightly between analyses, as in some of the case studies presented in Section 4. For the interested reader, a sample data file is presented in Appendix B (Figure A5).

Cumulative Model
The analysis with the cumulative model uses many of the same inputs as the instantaneous analysis but the direct numerical integration (DNI) requires additional information. After the user provides preliminary reactivity ratio estimates, one is prompted to input the molecular weights (MWi) of both comonomers. This information is required to relate weight conversion data (Xw, which can be experimentally determined using gravimetry) to molar conversion data (Xn, which is used in the DNI as per Equations (2) and (3)). The program converts Xw to Xn according to Equation (7).
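The conversion from weight to molar basis follows from a simple mass balance, sketched below. The sketch assumes that the mass of polymer formed equals the moles of monomer converted multiplied by the average molecular weight of the repeat units in the copolymer, which gives Xn = Xw (MW1 f1,0 + MW2 (1 - f1,0)) / (MW1 F̄1 + MW2 (1 - F̄1)). This is a derivation under that assumption and is offered as a plausible reading of Equation (7), not as a transcription of it; the function name is hypothetical.

```python
def weight_to_molar_conversion(xw, f10, f1_cumulative, mw1, mw2):
    """Convert weight conversion Xw to molar conversion Xn by mass balance.

    xw            : weight (mass) conversion, 0 to 1
    f10           : initial mole fraction of monomer 1 in the feed
    f1_cumulative : cumulative mole fraction of monomer 1 in the copolymer
    mw1, mw2      : molecular weights of monomers 1 and 2 (same units)
    """
    avg_mw_feed = mw1 * f10 + mw2 * (1.0 - f10)
    avg_mw_copolymer = mw1 * f1_cumulative + mw2 * (1.0 - f1_cumulative)
    return xw * avg_mw_feed / avg_mw_copolymer


# Illustrative values only: heavier monomer 1 enriched in the copolymer.
print(weight_to_molar_conversion(xw=0.4, f10=0.5, f1_cumulative=0.7, mw1=100.0, mw2=50.0))
```

When the two monomers have equal molecular weights, or when the copolymer composition equals the feed composition, the two conversion measures coincide.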
The next program requirement is the input of the copolymerization data. In this case, since medium or even high conversion data may be analyzed, the conversion values must be included for each point. Therefore, the user enters three arrays of data: Xw (measured mass conversion), f1,0 (known initial feed composition) and F̄1 (measured cumulative copolymer composition). Again, initial reactivity ratio estimates, monomer molecular weights and experimental data may be provided through this series of prompts or in a single data file.

Default Settings
A number of default settings are included in the program to ensure ease of implementation (see again Figure A4).However, settings such as the number of parameters, equations and/or variables involved should not be modified.Changing these settings (and modifying the associated source code) allows for expansion of the program to other applications, such as reactivity ratio estimation for multi-component polymerizations (a program for ternary reactivity ratio estimation has already been developed and applied to experimental terpolymerization data) [9].
The only change that might be made to the default settings is to the variance-covariance matrix, V. The matrix dimension must not be modified but individual entries may be changed to incorporate prior knowledge. For example, since the instantaneous model uses two variables (f1,0 and F1), the variance-covariance matrix is a 2 × 2 matrix. Additional information about default entries in the V matrix can be found in Appendix A.
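As a worked illustration of how entries of V can be tied to the error model, suppose (as an assumption for this sketch, not a statement of the program defaults) that the error in each variable is the uniform term kε described in the EVM section and that the errors in the two variables are uncorrelated. The symbols k_f and k_F are hypothetical names for the error magnitudes of f1,0 and F1. The variance of each error term then follows directly from the uniform distribution on [−1, 1]:

```latex
\operatorname{Var}(k\varepsilon)
  = k^{2}\int_{-1}^{1} \frac{\varepsilon^{2}}{2}\,d\varepsilon
  = \frac{k^{2}}{3},
\qquad
V = \begin{bmatrix} k_f^{2}/3 & 0 \\ 0 & k_F^{2}/3 \end{bmatrix}
```

For example, error magnitudes of k_f = 0.01 and k_F = 0.10 would give diagonal entries of about 3.3 × 10⁻⁵ and 3.3 × 10⁻³, respectively.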

Results & Diagnostics
Once the program has all necessary (input) data, EVM acquires the best possible estimates of the reactivity ratios (θ) and the independent variables (ξi).
Additional outputs are the objective function value (Φ, minimized as per Equation (6)) and G, which is the expected value of the second derivative of Φ with respect to the parameters. This is expressed mathematically in Equation (8).
For the interested reader, more information about Equation (8) (and relevant variables) is given in Appendix A. However, since the program calculates the G matrix "behind the scenes", the average user should focus on the fact that the G matrix gives valuable information about the parameters (θ, specifically r1 and r2). In fact, the inverse of the G matrix provides an approximation of the variance-covariance matrix for the parameters. With this information, the MATLAB program can plot joint confidence regions (JCRs), which are discussed in what follows.

Joint Confidence Regions
JCRs are typically elliptical contours that quantify the level of uncertainty in the parameter estimates; smaller JCRs indicate higher precision and therefore more confidence in the estimation results.
In this program, the joint confidence region for parameter estimates can be visualized using an "error ellipse" (Equation (9)). This assumes that the error is normally distributed and that the variance is known.
where $\chi^2_{p,\alpha}$ represents the chi-squared distribution for p parameters and a confidence level of (1 − α). The program uses $\chi^2_{2,0.05} = 5.991$ to plot JCRs at the 95% confidence level.
Perhaps the most useful information from the calculation of JCRs is the degree of precision (that is, the size and shape of the error ellipse). Ideally, the JCR will be small and round (as in Figure 3, ellipse "A"). A small JCR confirms that the parameter estimates are close to the "true" values and a round JCR indicates that the two parameter estimates have approximately the same amount of associated uncertainty. If, on the other hand, the JCR is long and narrow, it suggests that one parameter may be well-defined, whereas the other parameter may have a significant amount of associated uncertainty (see, for example, Figure 3, ellipse "B").
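To make the error ellipse concrete, the sketch below generates points on the boundary of a two-parameter JCR. It assumes the ellipse has the usual quadratic form (θ − θ̂)' G (θ − θ̂) = χ², with G the symmetric positive-definite matrix whose inverse approximates the parameter variance-covariance matrix, and it uses the value 5.991 quoted above. The function name and the numerical G matrix are hypothetical; this is not the plotting routine of the MATLAB package.

```python
import math

CHI2_2_95 = 5.991  # chi-squared value quoted in the text (p = 2, 95% confidence)


def jcr_ellipse(theta_hat, G, n_points=200, chi2=CHI2_2_95):
    """Points (r1, r2) on the boundary (theta - theta_hat)' G (theta - theta_hat) = chi2.

    G must be a 2 x 2 symmetric positive-definite matrix (nested lists).
    """
    (a, b), (_, c) = G
    # Cholesky factor of G = L L', with L = [[l11, 0], [l21, l22]]
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(c - l21 ** 2)
    radius = math.sqrt(chi2)
    points = []
    for i in range(n_points):
        angle = 2.0 * math.pi * i / n_points
        u = radius * math.cos(angle)
        v = radius * math.sin(angle)
        # Solve L' d = (u, v), so that d' G d = u^2 + v^2 = chi2
        d2 = v / l22
        d1 = (u - l21 * d2) / l11
        points.append((theta_hat[0] + d1, theta_hat[1] + d2))
    return points


# Hypothetical point estimates and G matrix, for illustration only.
boundary = jcr_ellipse(theta_hat=(0.5, 1.5), G=[[400.0, 150.0], [150.0, 100.0]])
print(boundary[0], boundary[50])
```

A non-zero off-diagonal entry in G tilts the ellipse, which is how parameter correlation shows up in the plotted JCR.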
Another important piece of information is the degree of parameter correlation, which can be evaluated according to the slope of the JCR. Parameter correlation is something that should be avoided as much as possible and can be minimized by using designed experiments. If there is a high degree of parameter correlation, the elliptical JCR will be at an angle, as in Figure 3, ellipse "C". Well-behaved copolymerization systems should have reactivity ratios with similar degrees of uncertainty and minimal correlation.

Case Studies

Exhibit A: Why Do Researchers Continue to Use Linear Parameter Estimation Techniques?
As mentioned in the introduction, linear parameter estimation techniques were originally used for reactivity ratio estimation (RRE) due to lack of computational power. However, since the required technology is now readily available, linear parameter estimation techniques should no longer be used for RRE. Linearizing or transforming the model distorts the error structure and may result in faulty parameter estimates. The statement may seem obvious but it is still worth emphasizing: linear parameter estimation techniques should not be used for the estimation of parameters in non-linear models!
Although most polymer researchers know that linear techniques are inaccurate, they are still taught in both introductory and graduate level polymer chemistry/science courses. Additionally, in perhaps the most commonly perused non-technical "reference," Wikipedia, only the outdated linear techniques are mentioned. It is no wonder, then, that researchers continue to use incorrect parameter estimation techniques.
Exhibit A presents an overview of recent literature [10][11][12][13] regarding the copolymerization of 2-methylene-1,3-dioxepane (MDO; monomer 1) and vinyl acetate (VAc; monomer 2) (see also Table 1; RR stands for reactivity ratio). This copolymer has gained considerable attention in the past decade, largely due to its degradable properties. Researchers are especially interested in the reactivity ratios for the system, as reactivity ratios provide information about the copolymer microstructure. However, the reactivity ratio estimation (RRE) techniques used in this field are often incorrect. This case study will focus primarily on the issue of linear parameter estimation techniques, but invalid low conversion assumptions (that is, inappropriate use of the instantaneous copolymerization model) and error-prone data cannot be overlooked. Therefore, to demonstrate the advantages of EVM, select data from the literature will be re-evaluated (properly) and comparisons will be conducted.
In a recent study by Undin et al. [11], experimental data from six distinct feed compositions were used to estimate reactivity ratios for the MDO/VAc copolymerization. These six (batch) runs were allowed to continue until conversion did not change and the final conversion and composition measurements were reported. Finally, the reactivity ratios for the system were calculated using the Fineman-Ross (F-R) method. However, as mentioned briefly in Table 1, the data on the x and y axes were unintentionally flipped in the analysis; thus, the reactivity ratio estimates originally reported are not representative of the experimental data collected.
Besides this unintended error, there are several other problems with the analysis, including (1) the use of undesigned data (that is, no design of experiments used for the selection of feed compositions); (2) [...] technique. For the purposes of this discussion, we will focus on the use of the F-R method for RRE but the other important points should also be noted and kept in mind.
As discussed by Hagiopol [15], the F-R method is often justified by its simplicity. However, it has many shortcomings, including unequal weighting of experimental data and symmetry issues (i.e., calculation results depend on which monomer is selected as M1). The data set presented in [11] is especially vulnerable to these shortcomings, largely due to the undesigned initial feed compositions (collection and use of undesigned data for parameter estimation also induce considerable correlation between the parameters, which is highly undesirable). As shown in Table 2, some of the data are obtained under fairly low M1 comonomer feed fraction; these conditions tend to have the greatest influence on the slope of a line, which ultimately affects reactivity ratio estimates obtained using the F-R method [15].
The more pressing concern with the F-R method (also described by Hagiopol [15]) is the lack of symmetry. That is, values of r1 and r2 depend on which monomer is selected as M1. To demonstrate this point, the data collected by Undin et al. [11] are evaluated with M1 = MDO (which was performed incorrectly in the original work; see Figure 4a) and with M1 = VAc (performed herein for the demonstration; see Figure 4b).
It is clear from Figure 4 that the reactivity ratio estimates depend on which comonomer is selected as M1; the fact that two reactivity ratio pairs can be obtained from a single estimation technique is problematic. It is also interesting to note that both analyses give r1 > 1 and r2 > 1. While this is physically impossible, it is a side-effect of experimental (and estimation) error. In reality, these results suggest that both reactivity ratios should be close to unity (which agrees with the findings of Undin et al. [11] and Hedir et al. [12]) but that at least one reactivity ratio is <1.
The issue of symmetry (combined with the statistical inaccuracy of using linear parameter estimation to evaluate non-linear models) highlights the need for a non-linear parameter estimation technique like EVM. When using EVM for reactivity ratio estimation, the choice of which comonomer is defined as M1 has no impact on the parameter estimates. (If RR estimates are slightly different based on the choice of M1, this is due to experimental error in the data.) As shown in Figure 5, reactivity ratio estimates are within the JCR, regardless of which monomer is identified as M1. That is, slight discrepancies between EVM-obtained reactivity ratio estimates are well within the expected error (1% error in fi,0 and 10% error in Fi; more on typical error levels in Appendix A). As expected, using measured/reported values as program inputs (in this case, fMDO,0 and FMDO) provides us with a greater degree of confidence in our results; note that the JCR in Figure 5a is smaller than that in Figure 5b. There is also significantly more parameter correlation visible in Figure 5b, as evidenced by the diagonal nature of the (more elongated) JCR. This, again, is as expected; the VAc data set was calculated from the measured MDO composition data, so correlation is inevitable here. For the interested reader, data files used for this analysis are provided in Appendix C (Section C.1).
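The lack of symmetry in the F-R method discussed above can be reproduced with a short numerical experiment. The sketch assumes the usual F-R linearization, G = r1 H − r2 with G = x(y − 1)/y, H = x²/y, x = f1/f2 and y = F1/F2, fitted by ordinary least squares. The data are synthetic: they are generated from the Mayo-Lewis equation with hypothetical reactivity ratios and then perturbed by fixed, arbitrary offsets. They are not the data of Undin et al. [11].

```python
def mayo_lewis(f1, r1, r2):
    """Instantaneous copolymer composition F1 (standard Mayo-Lewis form)."""
    f2 = 1.0 - f1
    return (r1 * f1 ** 2 + f1 * f2) / (r1 * f1 ** 2 + 2.0 * f1 * f2 + r2 * f2 ** 2)


def fineman_ross(f1_values, F1_values):
    """Ordinary least-squares fit of G = r1*H - r2; returns (r1, r2)."""
    H, G = [], []
    for f1, F1 in zip(f1_values, F1_values):
        x = f1 / (1.0 - f1)   # feed ratio M1/M2
        y = F1 / (1.0 - F1)   # copolymer ratio M1/M2
        G.append(x * (y - 1.0) / y)
        H.append(x * x / y)
    n = len(H)
    mean_h = sum(H) / n
    mean_g = sum(G) / n
    sxx = sum((h - mean_h) ** 2 for h in H)
    sxy = sum((h - mean_h) * (g - mean_g) for h, g in zip(H, G))
    slope = sxy / sxx
    intercept = mean_g - slope * mean_h
    return slope, -intercept


# Synthetic data (illustration only): hypothetical "true" values r1 = 0.8, r2 = 1.2.
feed = [0.1, 0.2, 0.4, 0.6, 0.8, 0.9]
offsets = [0.02, -0.015, 0.01, -0.02, 0.015, -0.01]  # arbitrary fixed perturbations
copolymer = [mayo_lewis(f, 0.8, 1.2) + e for f, e in zip(feed, offsets)]

# Fit with the original labeling, then with the two monomers swapped.
direct = fineman_ross(feed, copolymer)
swapped = fineman_ross([1.0 - f for f in feed], [1.0 - F for F in copolymer])

# After swapping, the first returned value belongs to the original monomer 2.
print("labels as given: r1, r2 =", direct)
print("labels swapped : r1, r2 =", (swapped[1], swapped[0]))
```

With error-free data both labelings return the same pair of reactivity ratios; once the data are perturbed, the two fits weight the points differently and the estimates no longer agree, which is the symmetry problem described in the text.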
In using EVM to re-analyze the data, r1 > 1 and r2 > 1 are still observed (see again Figure 5). This outcome is likely a result of using cumulative composition data in an instantaneous model, since composition drift was not taken into account for this data set and conversion levels up to 80% are reported. Even the most statistically correct technique cannot reconcile cumulative experimental data with an instantaneous model (and, in this case, appropriate conversion data are unavailable for reanalysis with the cumulative EVM program).
Finally, we can visually evaluate the prediction performance of the reactivity ratio estimates, which involves comparing the experimental values to those predicted by the ICC equation (Equation (1)). As shown in Figure 6, the symmetry issues associated with the Fineman-Ross (F-R) technique have a significant impact on the prediction performance (red curves, Figure 6a). In contrast, both of the predictions using EVM-obtained RR estimates (blue curves, Figure 6b) are in agreement with each other and with the experimental data. This is compelling evidence to choose non-linear parameter estimation techniques like EVM over the statistically incorrect linear parameter estimation techniques.
As described in Section 2.1, the instantaneous copolymer composition equation (Equation (1)) is only valid at low conversion levels (<10%). This limitation, though sometimes difficult to achieve experimentally, allows researchers to assume that composition drift does not occur in the samples being analyzed. Hence, the cumulative and instantaneous mole fractions in the copolymer are about the same. This experimental practice (and subsequent analysis) is considered "best practice," and is used throughout the reactivity ratio estimation literature (see, for example, [10,12–14,16,17]).
However, limiting kinetic investigations to low conversion levels presents some fundamental challenges. In spite of our best efforts to validate the "lack of composition drift" assumption, there is almost inevitably some change in feed composition with increasing conversion. From a more practical perspective, collecting low conversion data presents experimental challenges and the collected data are extremely prone to error.
Researchers should be aware of these limitations and should act accordingly. One might choose to include conversion data in the analysis (using a cumulative model and DNI, as per Section 2.1) to account for composition drift. Alternatively (rather, in addition), researchers might use design of experiments and experimental replication to address the inevitable error associated with the data collected. If nothing else, parameter estimation using EVM considers the error present in all variables, which can account for some of the experimental error. Ultimately, though, even the most statistically correct technique cannot compensate for bad data collection!

Evaluation of HEA/DCP Copolymerization Data
Recent work by Suresh et al. [16] describes the synthesis and reactivity ratio estimation of photosensitive copolymers based on 4-(3-(2,4-dichlorophenyl)-3-oxoprop-1-enyl) phenylacrylate (DCP; monomer 2). In the study, DCP was copolymerized with hydroxyethyl acrylate (HEA; monomer 1) and with styrene, and reactivity ratios were determined to better understand copolymerization behavior. However, as established in Exhibit A (Section 4.1.1), researchers often revert to linear parameter estimation techniques, and the authors (incorrectly) used the Fineman-Ross (F-R) and Kelen-Tüdös (K-T) methods for parameter estimation (see Table 3).
Since the virtues of EVM over linear parameter estimation techniques have already been established, the goal of the current case study is to emphasize the limitations of low conversion data and demonstrate how they can be addressed using EVM. All experimental data that Suresh et al. [16] used for reactivity ratio estimation were kept below 15% conversion, so that the instantaneous copolymerization equation could be used for parameter estimation. But is "below 15% conversion" enough? As mentioned previously, this is largely considered "best practice," but it accounts neither for composition drift (which occurs even at low conversions) nor for experimental error. As shown in Figure 7, only five (seemingly unreplicated) data points were collected for reactivity ratio estimation, with obvious discrepancies between the experimental data and the model predictions.

Table 3 (RRE results):
Ref.    RRE Technique         r1             r2
[16]    Fineman-Ross (F-R)    1.53 ± 0.10    0.76 ± 0.16
[16]    Kelen-Tüdös (K-T)     1.67 ± 0.13    0.58 ± 0.05
[16]    Extended K-T

It is likely that these discrepancies are due to experimental error; this type of behavior is observed very often, especially for low conversion data. In order to address this, researchers should review potential sources of error; they are likely (1) unidentified composition drift and/or (2) experimental difficulties. This case study will demonstrate both of these sources of error (and how to handle them). This is another very important, yet implicit, contribution of EVM. EVM, if nothing else, forces one to think about the possible sources of variation (and quantify them). Relevant data, program screenshots and results are available in Appendix C, Section C.2.
To account for composition drift (even at low conversions), the cumulative copolymerization model (Equations (2) and (3)) should be used. Using direct numerical integration to solve this system of equations ensures that the feed composition (f1) is considered as a function of conversion, thus taking any composition drift into account mathematically. To establish whether unidentified composition drift is the culprit in the current experimental data set, we can evaluate the data using both the instantaneous and cumulative models and compare the results (Figure 8, discussed below). In practice, using the cumulative model would also increase the amount of data available for analysis, as the copolymerization would be allowed to go to higher conversion levels (and data would continue to be collected). In addition, the experimental information is enhanced, since both conversion and copolymer composition data are included; see also Section 4.2.1. Ultimately, this would increase the degree of confidence in the reactivity ratio estimates and decrease the size of the JCR.
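To make the idea of direct numerical integration concrete, the sketch below integrates a drift equation with a fixed-step fourth-order Runge-Kutta scheme and then evaluates the cumulative copolymer composition. It rests on stated assumptions rather than on the paper's Equations (2) and (3), which are not reproduced in this section: conversion X is taken on a molar basis, the drift equation is written as df1/dX = (f1 − F1)/(1 − X), and the cumulative composition follows from the mole balance (f1,0 − f1(1 − X))/X. The reactivity ratios in the example are hypothetical.

```python
def instantaneous_F1(f1, r1, r2):
    """Instantaneous copolymer composition (Mayo-Lewis form)."""
    f2 = 1.0 - f1
    return (r1 * f1 * f1 + f1 * f2) / (r1 * f1 * f1 + 2.0 * f1 * f2 + r2 * f2 * f2)


def integrate_drift(f10, r1, r2, X_end, steps=1000):
    """RK4 integration of df1/dX = (f1 - F1)/(1 - X) from X = 0 to X_end (< 1)."""
    def rhs(X, f1):
        return (f1 - instantaneous_F1(f1, r1, r2)) / (1.0 - X)

    h = X_end / steps
    X, f1 = 0.0, f10
    for _ in range(steps):
        k1 = rhs(X, f1)
        k2 = rhs(X + h / 2.0, f1 + h * k1 / 2.0)
        k3 = rhs(X + h / 2.0, f1 + h * k2 / 2.0)
        k4 = rhs(X + h, f1 + h * k3)
        f1 += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        X += h
    return f1


def cumulative_F1(f10, r1, r2, X_end):
    """Cumulative copolymer composition from a mole balance on monomer 1."""
    f1 = integrate_drift(f10, r1, r2, X_end)
    return (f10 - f1 * (1.0 - X_end)) / X_end


# Hypothetical system: r1 = 2.0, r2 = 0.5, equimolar feed.
F_inst = instantaneous_F1(0.5, 2.0, 0.5)
for X in (0.05, 0.15, 0.80):
    print(X, F_inst, cumulative_F1(0.5, 2.0, 0.5, X))
```

The gap between the instantaneous and cumulative values grows with conversion, which is the quantity one would inspect when deciding whether an instantaneous analysis of a given data set is defensible.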
Figure 8 indicates that the two EVM-obtained reactivity ratio estimates are in good agreement and the JCR sizes and orientations are similar. Thus, in this case, the effect of composition drift is likely minimal. Therefore, we will continue our troubleshooting by investigating the second source of error: experimental difficulties. Since no replicate data are available, we cannot calculate the error associated with the composition measurements shown in Figure 7. However, as discussed in Section 2.2, EVM considers the error present in all variables throughout the parameter estimation process. The program default values of 1% error (associated with f1,0) and 5% error (associated with F1) were therefore used in the analysis (see Appendix A for additional information).
The reactivity ratios calculated using the EVM program are as described in Table 3 and Figure 8. The converged program also provides the best possible estimates of the "true" values of the variables, a feature which was described in Section 3.3 (see also Appendix C, Section C.2). Thus, in Figure 9, we can compare the experimental (measured) values to the "true" experimental values and the EVM model predictions (both instantaneous and cumulative).
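The notion of "true" values can be illustrated with a small sketch. This is not the algorithm implemented in the MATLAB package (which is not shown in this section); it is a simplified stand-in that assumes multiplicative errors handled through a log transform, uses the 1% and 5% error levels quoted above as weights, and finds for each data point the feed composition on the model curve that is closest to the measurement in that weighted sense. The golden-section search, the function names and the synthetic data are all illustrative choices.

```python
import math


def icc(f1, r1, r2):
    """Instantaneous copolymer composition F1 (Mayo-Lewis form)."""
    f2 = 1.0 - f1
    return (r1 * f1 * f1 + f1 * f2) / (r1 * f1 * f1 + 2.0 * f1 * f2 + r2 * f2 * f2)


def golden_min(func, lo, hi, tol=1e-10):
    """Golden-section search for the minimizer of func on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if func(c) < func(d):
            b = d
        else:
            a = c
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2.0


def evm_point(f_meas, F_meas, r1, r2, sig_f=0.01, sig_F=0.05):
    """Estimated 'true' feed fraction and weighted squared distance for one point."""
    def cost(f):
        df = (math.log(f_meas) - math.log(f)) / sig_f             # error in f1,0
        dF = (math.log(F_meas) - math.log(icc(f, r1, r2))) / sig_F  # error in F1
        return df * df + dF * dF

    f_hat = golden_min(cost, 1e-6, 1.0 - 1e-6)
    return f_hat, cost(f_hat)


def evm_objective(data, r1, r2):
    """Sum of weighted squared distances over all (f1,0, F1) measurements."""
    return sum(evm_point(f, F, r1, r2)[1] for f, F in data)


# Synthetic, error-free data generated from hypothetical r1 = 1.5, r2 = 0.7.
true_r1, true_r2 = 1.5, 0.7
data = [(f, icc(f, true_r1, true_r2)) for f in (0.1, 0.3, 0.5, 0.7, 0.9)]
at_truth = evm_objective(data, true_r1, true_r2)
off_truth = evm_objective(data, 1.0, 1.0)
print(at_truth, off_truth)
```

An outer optimizer over (r1, r2) would minimize `evm_objective`; it is omitted here to keep the sketch short. The point of the example is that both the feed composition and the copolymer composition are allowed to move, each in proportion to its assumed error.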
Figure 9 indicates that the experimental data points were subject to some degree of error, especially for f1,0 = 0.3 and f1,0 = 0.7. Thus, experimental difficulties are likely the culprit here. The best way to mitigate this type of problem is to design experiments, replicate copolymerization runs and use EVM for parameter estimation. Additionally, using full conversion data with the cumulative model gives a more complete picture of copolymerization kinetics; researchers do not have to struggle with the experimental challenges of collecting low conversion data.


Exhibit C: What Happens when We Take Advantage of ALL Copolymerization Data?
Collecting low conversion data presents some challenges (as described in Exhibit B) but the instantaneous model is statistically valid if managed properly. That is, good control over conversion levels (typically <10%), statistical design of experiments, experimental replication, and/or a non-linear parameter estimation technique should be used.

Evaluation of BMA/BA Copolymerization Data: Low Conversion Analysis
Ren et al. [18] recently investigated the copolymerization kinetics of n-butyl methacrylate (BMA; monomer 1) and n-butyl acrylate (BA; monomer 2). Originally, the group collected copolymerization data at low conversion levels (<10%) so that the instantaneous model (Equation (1)) could be used for analysis. After estimating preliminary reactivity ratios using the RREVM program [1], additional (replicated) designed experiments were completed to improve the quality of the data. This low conversion data set has been re-analyzed using the new MATLAB-based EVM program. As with the original investigation, analysis was first performed with the preliminary data only (9 equidistant feed compositions). Then, the analysis was repeated with preliminary data supplemented by the designed replicates (feed compositions selected using the Tidwell-Mortimer criterion [4,19]). A comparison of reactivity ratio estimates is shown in Table 4 and the original data and program output are provided in Appendix C, Section C.3.
Good agreement is observed between reactivity ratio estimates, no matter what the amount (or type) of data used. This indicates well-behaved data; the low conversion analysis was done in a methodical and statistically correct manner.

Evaluation of BMA/BA Copolymerization Data: Medium-High Conversion Analysis
This type of low conversion analysis (as performed by Ren et al. [18]) would be sufficient for reactivity ratio estimation, especially since design of experiments was included in the investigation. However, in this case, what if one had also included additional (medium-high conversion) experimental data? Ren et al. [18] chose to run three feed compositions up to high conversion values; the full conversion experimental data were used to evaluate the prediction performance of the reactivity ratio estimates. The analysis showed good agreement between model predictions and experimental data, thus confirming the reactivity ratio estimates.
Let's now take this a step further for illustration purposes. We know that using the cumulative model and direct numerical integration provides us with the opportunity to "repurpose" this cumulative (medium-high conversion) data for improved reactivity ratio estimation. With a cumulative model, there is potential to obtain significantly more information (that is, more data points) from each experiment. Since researchers are not limited to low conversion, less experimental tedium is required to obtain the same degree of accuracy, as long as the experiments are well-designed.
A direct comparison of the preliminary analysis (9 feed compositions) and the cumulative analysis (3 feed compositions) results is provided in Figure 10. The same initial estimates were used in both cases to ensure that both of the RRE techniques had the same starting point. It is interesting to note that both the reactivity ratio estimates and the JCR areas are almost identical for the two data sets. This leads to two main conclusions: that the parameter estimation results using the instantaneous and cumulative models are in agreement, and that the degree of confidence in our results is approximately the same (regardless of which data set and/or model is being used). Therefore, in this case, 3 full conversion runs have approximately the same information content as 9 runs that are limited to low conversion levels.
This result should motivate researchers to think carefully about their preliminary experimental work. By strategically selecting feed compositions (using design of experiments techniques like Tidwell-Mortimer [4,19] or EVM [8,9]) and collecting copolymerization data up to medium or high conversion levels, it is possible to obtain sufficient information about a new system. The results shown herein suggest that preliminary experimental work can be reduced to almost one third of the original load, without any loss of information content. Therefore, researchers should be encouraged to make use of all copolymerization data by employing the cumulative copolymerization model.

Exhibit D: How Many Replicates Do We Really Need?
As demonstrated in Exhibit C, experimental replication is an important aspect of reactivity ratio estimation. Collection of data for reactivity ratio estimation, especially at low conversion, is subject to experimental error; replicating experiments helps researchers account for experimental error and potential lurking variables. In deciding on the number of required replicates, a number of aspects must be considered [20]. The goal of the current case study is not to do a statistical evaluation of the number of replicates required but rather to look at pre-existing data and demonstrate once more the importance of experimental replicates.
Many reactivity ratio estimation studies (including [18,21–23]) have used the Tidwell-Mortimer (T-M) criterion for design of experiments to select the feed compositions at which to run copolymerizations for reactivity ratio estimation. In most cases, two optimal feed compositions (according to T-M) are used and replicated four times each. To establish the importance of replication and symmetry, this case study will analyze subsets of an original data set from the copolymerization of styrene (Sty; monomer 1) and ethyl acrylate (EA; monomer 2) [21].
When experiments are statistically designed, it is possible to obtain accurate reactivity ratio estimates from reduced experimental effort. Replication ensures that experimental results are repeatable, gives us an estimate of the experimental error and increases the degree of confidence in the resulting reactivity ratio estimates. The data reported by McManus and Penlidis [21] included an initial run and 3 replicates at two feed compositions (8 data points total); the data set is available in Appendix C, Section C.4.

To create this collection of reactivity ratio estimates and JCRs (Figure 11), the original data set [21] was revisited with the MATLAB-based RREVM program (blue JCR and circle point estimate in Figure 11; 3 replicates). Then, one randomly selected replicate at each feed composition was removed and the data were re-evaluated (red JCR and square point estimate; 2 replicates). This process was repeated, thus leaving half of the original data set (green JCR and triangle point estimate; 1 replicate). Finally, only one data point at each feed composition was used in the analysis (black JCR and diamond point estimate; no replicates).
As shown in Figure 11, the reactivity ratio estimates are all similar for the copolymerization of styrene and ethyl acrylate, regardless of how many replicates are used (in this specific case, for this specific data set). The JCRs form a series of concentric ellipses and the slight skew (off-centeredness) depends on which of the replicates are randomly removed. Clearly, using the full data set (3 replicates) gives a much smaller JCR (and a much higher degree of confidence) compared to the other estimates. In general, the JCR size increases (that is, the uncertainty becomes greater) as the number of replicates decreases.
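The subsetting procedure described above (repeatedly removing one randomly chosen point at each feed composition) can be sketched as follows. The two feed compositions are the ones quoted in this case study; the copolymer compositions are made-up placeholders, not the data of McManus and Penlidis [21], and the function names are illustrative.

```python
import random


def feed_counts(points):
    """Number of data points available at each feed composition."""
    counts = {}
    for f10, _ in points:
        counts[f10] = counts.get(f10, 0) + 1
    return counts


def drop_one_per_feed(points, rng):
    """Remove one randomly selected point at every feed composition."""
    by_feed = {}
    for p in points:
        by_feed.setdefault(p[0], []).append(p)
    kept = []
    for group in by_feed.values():
        victim = rng.randrange(len(group))
        kept.extend(p for i, p in enumerate(group) if i != victim)
    return kept


def nested_subsets(points, seed=0):
    """Full data set, then successively smaller subsets down to one point per feed."""
    rng = random.Random(seed)
    subsets = [list(points)]
    while min(feed_counts(subsets[-1]).values()) > 1:
        subsets.append(drop_one_per_feed(subsets[-1], rng))
    return subsets


# Placeholder (f1,0, F1) pairs: 4 runs at each of the two feed compositions.
points = [(0.0788, 0.21), (0.0788, 0.22), (0.0788, 0.20), (0.0788, 0.23),
          (0.7193, 0.66), (0.7193, 0.67), (0.7193, 0.65), (0.7193, 0.68)]
subsets = nested_subsets(points)
print([len(s) for s in subsets])
```

Each subset would then be passed to the estimation program separately, giving one point estimate and one JCR per subset.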
At first glance, it seems as though the uncertainty in r1 is much greater than that in r2 (as demonstrated by the horizontal growth in the JCRs as the number of replicates decreases). However, this is partially due to the fact that r1 is approximately 6 times larger than r2. Thus, the same relative error will have a larger absolute value in r1 compared to r2. Another factor that should be considered is the feed compositions used to collect the experimental data (f1,0). The T-M criterion suggested f1,0 = 0.0788 and f1,0 = 0.7193, where monomer 1 is styrene [21]. While this is the statistically correct approach (and an excellent starting point), this means that the experimental data contain information only about a "monomer 2"-rich system (given f1,0 = 0.0788, f2,0 = 0.9212). While f1,0 = 0.7193 contains more styrene than ethyl acrylate, an even higher f1,0 would further improve the degree of confidence in r1. This will be discussed further (and proven through a case study) in Exhibit E, which demonstrates the effectiveness of a sequential design of experiments.
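The scaling argument above can be written out explicitly. As a simple sketch, assume that both estimates carry the same relative uncertainty, denoted here by ε (a symbol introduced only for this illustration):

```latex
\Delta r_i = \varepsilon \, r_i \quad (i = 1, 2)
\qquad \Longrightarrow \qquad
\frac{\Delta r_1}{\Delta r_2} = \frac{r_1}{r_2} \approx 6
```

Under that assumption, the JCR would extend roughly six times further along the r1 axis than along the r2 axis even if both parameters were known equally well in relative terms.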
In looking at the importance of replication, it is also worth examining the effect of symmetry on reactivity ratio estimates and JCRs. Specifically, the next step in the investigation looks at how estimation results are affected when replicates are only available for one of the two recipes. Again, the styrene/ethyl acrylate data set from McManus and Penlidis [21] was employed.
The reactivity ratio estimates and JCRs presented in Figure 12 tell an interesting story. When replicates are only included from the high f1,0 runs, the error in r2 (vertical error, in this case) increases (red JCR and square). The reverse is true when all of the replicates are from the low f1,0 (high f2,0) runs; uncertainty in r1 becomes much more substantial (green JCR and triangle). This observation is in agreement with the results of Figure 11, as the inclusion of "monomer 1"-rich data improves the degree of confidence in r1.
Both asymmetrical data sets give reasonably good estimates of reactivity ratios for the styrene and ethyl acrylate copolymerization. Ultimately, limited replication is better than no replication at all. However, it is no coincidence that the JCR from the fully replicated data set is located almost in the intersection of the other two curves (between the red and green curves in Figure 12). Including all of the experimental replicates in the analysis ensures that we have the highest degree of confidence in both r1 and r2, thus decreasing the JCR area as much as possible.
In both Exhibit C and Exhibit D, we observed that design of experiments (specifically, using the Tidwell-Mortimer criterion) is an important aspect of reactivity ratio estimation. Reactivity ratio estimates obtained using designed data are more accurate and more precise than those from preliminary experiments, because designed experiments use a combination of prior knowledge and statistical principles to increase confidence in the final estimates.
In the current case study, we will look at experimental data for the copolymerization of butyl acrylate (BA; monomer 1) and methyl methacrylate (MMA; monomer 2). The original investigation by Dubé and Penlidis [22] was a detailed, multi-step analysis but only data from the first step are used in the current exhibit.
In investigating the effect of experimental design on the confidence in our estimation results, there are two important pieces of information to consider.If the preliminary estimates of r1 are smaller than r2, then (1) uncertainty in r1 will seem much lower than uncertainty in r2 (as explained previously in Section 4.2.2, the same relative error will have a larger absolute value in r2 compared to r1) and ( 2) the Tidwell-Mortimer design will suggest recipes rich in monomer 1.As shown in Figure 13 (black In both Exhibit C and Exhibit D, we observed that design of experiments (specifically, using the Tidwell-Mortimer criterion) is an important aspect of reactivity ratio estimation.Reactivity ratio estimates obtained using designed data are more accurate and more precise than preliminary experiments, because they use a combination of prior knowledge and statistical principles to increase confidence in the final estimates.
In the current case study, we will look at experimental data for the copolymerization of butyl acrylate (BA; monomer 1) and methyl methacrylate (MMA; monomer 2). The original investigation by Dubé and Penlidis [22] was a detailed, multi-step analysis but only data from the first step are used in the current exhibit.
In investigating the effect of experimental design on the confidence in our estimation results, there are two important pieces of information to consider. If the preliminary estimate of r1 is smaller than that of r2, then (1) uncertainty in r1 will seem much lower than uncertainty in r2 (as explained previously in Section 4.2.2, the same relative error will have a larger absolute value in r2 compared to r1) and (2) the Tidwell-Mortimer design will suggest recipes rich in monomer 1. As shown in Figure 13 (black JCR; 2 feed compositions), r1 < r2 for the BA/MMA system and there is more uncertainty in r2 (that is, in the vertical direction). Generally speaking, we find that the JCR is "stretched" along the axis of the larger reactivity ratio estimate, which is due to both the absolute error and the selected feed compositions.
Since the absolute error is experiment-dependent, it must simply be accepted. Thus, this case study focuses on item (2) described above: how do the feed compositions (selected randomly or via design of experiments) affect the reactivity ratio estimates and associated JCRs? In the case of BA/MMA copolymerization (given preliminary estimates r1 = 0.51 and r2 = 2.38 from Grassie et al. [24]), the Tidwell-Mortimer criterion suggests the following feed compositions: f1,0 = 0.543 and f1,0 = 0.798, where monomer 1 is butyl acrylate [22]. Based on this criterion, all of the experimental data collected are rich in monomer 1, which provides us with more certainty in r1 (see again Figure 13; black JCR; 2 feed compositions).
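As a rough illustration of how the two suggested feed compositions arise, the following Python sketch evaluates the commonly quoted Tidwell-Mortimer approximations, f1' = 2/(2 + r1) and f1'' = r2/(2 + r2), for the preliminary BA/MMA estimates. It is independent of the MATLAB package described in this work; the function name `tidwell_mortimer_feeds` is hypothetical and the closed-form approximations are assumed rather than taken from this text.

```python
def tidwell_mortimer_feeds(r1, r2):
    """Approximate optimal initial feed fractions of monomer 1.

    Assumes the commonly quoted Tidwell-Mortimer approximations
    f1' = 2 / (2 + r1) and f1'' = r2 / (2 + r2).
    """
    if r1 <= 0 or r2 <= 0:
        raise ValueError("reactivity ratios must be positive")
    return 2.0 / (2.0 + r1), r2 / (2.0 + r2)


# Preliminary BA/MMA estimates quoted in the text (Grassie et al. [24]).
f_high, f_low = tidwell_mortimer_feeds(0.51, 2.38)
print(f"f1,0 = {f_low:.3f} and f1,0 = {f_high:.3f}")
```

With these inputs the sketch gives approximately 0.543 and 0.797. The second value differs in the third decimal from the 0.798 quoted above, which may simply reflect rounding of the preliminary estimates.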
At the next step, we can use EVM-based sequential design of experiments [8]. This allows for further refinement of the reactivity ratio estimates, a higher degree of certainty and therefore smaller JCRs. The procedure is described below:
(1) EVM is applied to instantaneous (low conversion) data (from [22]) to estimate reactivity ratios. Feed compositions are selected according to Tidwell-Mortimer design and four runs are done at each level: f1,0 = 0.543 and f1,0 = 0.798.
(2) Parameter estimation results from EVM are recorded (see Appendix C, Section C.5). Specifically, reactivity ratio estimates (r1 and r2) and the G matrix (Appendix A) are required for sequential design of experiments.
(3) The EVM-based sequential design of experiments program (using data from step (2), as well as the preliminary feed compositions from step (1)) is employed. Details on the design have been reported by Kazemi et al. [8].
From the sequential design of experiments, we find that the "next best" feed composition for analysis of the BA/MMA copolymerization is f1,0 = 0.100. This indication that more monomer 2-rich data are required is very reasonable, since all data collected to this point have been rich in monomer 1. By introducing experimental data rich in monomer 2, the uncertainty in r2 should decrease.
In the absence of experimental data for f1,0 = 0.100, data were simulated using the instantaneous copolymerization model and random error was added (based on the variance reported in the original study [21]). As was the case for the other feed compositions, four data points at f1,0 = 0.100 were added to the analysis. These new data points, along with the original data (shown in Appendix C, Section C.5), were then used to re-estimate the reactivity ratios with EVM. The results are shown alongside the original analysis in Figure 13 (red JCR; 3 feed compositions).
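The simulation step can be sketched in Python as follows. This is only an illustration, not the procedure implemented in the MATLAB package: it assumes the instantaneous model is the Mayo-Lewis equation and that the random error is multiplicative and Gaussian. The reactivity ratios (the preliminary values 0.51 and 2.38), the relative standard deviation `rel_sd` and all function names are placeholders, since the values actually used are not given in this section.

```python
import random


def mayo_lewis(f1, r1, r2):
    """Instantaneous copolymer composition F1 from the Mayo-Lewis equation."""
    f2 = 1.0 - f1
    num = r1 * f1**2 + f1 * f2
    den = r1 * f1**2 + 2.0 * f1 * f2 + r2 * f2**2
    return num / den


def simulate_replicates(f1, r1, r2, n, rel_sd, rng):
    """Return n noisy F1 values with multiplicative Gaussian error (assumed)."""
    true_f1 = mayo_lewis(f1, r1, r2)
    noisy = (true_f1 * (1.0 + rng.gauss(0.0, rel_sd)) for _ in range(n))
    # Clip to the physically meaningful range of a mole fraction.
    return [min(1.0, max(0.0, value)) for value in noisy]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# r1, r2 and rel_sd are placeholders, not the values used in the study.
data = simulate_replicates(f1=0.100, r1=0.51, r2=2.38, n=4, rel_sd=0.05, rng=rng)
print([round(value, 4) for value in data])
```

For the placeholder reactivity ratios, the noise-free copolymer composition at f1,0 = 0.100 is about 0.045, so the four simulated points scatter around that value.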
The inclusion of data rich in monomer 2 drastically improves the degree of certainty in our reactivity ratio estimates. While the point estimates are unaffected, the error in r2 is significantly reduced. This is as expected: when data rich in monomer 2 are available, we can have greater confidence in r2.
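The direction of this effect can be checked with a simplified calculation. The Python sketch below is not EVM: it assumes error-free feed compositions, multiplicative error in the copolymer composition, and a linearized information matrix built from the sensitivities of ln(F1) to r1 and r2 under the Mayo-Lewis equation. The preliminary values r1 = 0.51 and r2 = 2.38 stand in for the point estimates, and all function names are hypothetical.

```python
import math


def mayo_lewis(f1, r1, r2):
    """Instantaneous copolymer composition F1 from the Mayo-Lewis equation."""
    f2 = 1.0 - f1
    num = r1 * f1**2 + f1 * f2
    den = r1 * f1**2 + 2.0 * f1 * f2 + r2 * f2**2
    return num / den


def log_sensitivities(f1, r1, r2, h=1e-6):
    """Central-difference derivatives of ln(F1) with respect to r1 and r2."""
    d1 = (math.log(mayo_lewis(f1, r1 + h, r2))
          - math.log(mayo_lewis(f1, r1 - h, r2))) / (2.0 * h)
    d2 = (math.log(mayo_lewis(f1, r1, r2 + h))
          - math.log(mayo_lewis(f1, r1, r2 - h))) / (2.0 * h)
    return d1, d2


def approx_variances(feeds, r1, r2):
    """Diagonal of the inverse 2 x 2 information matrix (per unit error variance)."""
    m11 = m12 = m22 = 0.0
    for f1 in feeds:
        s1, s2 = log_sensitivities(f1, r1, r2)
        m11 += s1 * s1
        m12 += s1 * s2
        m22 += s2 * s2
    det = m11 * m22 - m12 * m12
    return m22 / det, m11 / det  # (variance of r1, variance of r2)


two_feeds = [0.543] * 4 + [0.798] * 4   # Tidwell-Mortimer design, 4 runs each
three_feeds = two_feeds + [0.100] * 4   # plus the sequentially designed feed
var2 = approx_variances(two_feeds, 0.51, 2.38)
var3 = approx_variances(three_feeds, 0.51, 2.38)
print("var(r2), 3 feeds relative to 2 feeds:", var3[1] / var2[1])
```

Under these assumptions the approximate variance of r2 falls by more than a factor of ten when the monomer 2-rich runs are added, while the variance of r1 decreases much less. This agrees qualitatively with the change from the black to the red JCR in Figure 13, although the JCRs themselves come from the full EVM analysis.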
This result demonstrates the importance of design of experiments for reactivity ratio estimation. We are able to maximize the information content from a minimal number of runs and to decrease the degree of uncertainty in our parameter estimates. Sequential designs are extremely useful and revealing, and they minimize the overall experimental effort.

Conclusions
The MATLAB-based program for reactivity ratio estimation (using the error-in-variables-model) is statistically correct, accurate and user-friendly. This ready-to-use software can easily be employed to evaluate conversion and composition data, providing researchers with precise reactivity ratio estimates within seconds. Using EVM for reactivity ratio estimation should provide polymer researchers with confidence, especially as they continue to use reactivity ratios to predict copolymer properties and microstructure for specific applications.
We hope that the case studies presented herein will motivate polymer chemists and engineers to think critically about which technique(s) they use for parameter estimation. Current "best practices" have serious shortcomings, especially as linear parameter estimation techniques continue to be used. Ignoring the limitations of low conversion data is imprudent; inappropriate statistical approaches can undermine even carefully collected data. The case studies have also shown that the error-in-variables-model can maximize information content from copolymerization data. Researchers can ensure the accuracy of their results (while minimizing required time and resources) by collecting and analyzing full conversion data. Also, as demonstrated by the case studies, replication and design of experiments are key to a high degree of confidence.
Ultimately, the computer program described and demonstrated herein provides both the ease-of-use required for introductory studies and the flexibility needed for extensions to complex and/or multi-component systems. Therefore, it should find use in both industry and academia. The program can be obtained by contacting the authors.
the magnitude of the matrix entries may be modified but the size of the matrix itself should not be changed (that is, it should remain a 2 × 2 matrix for the instantaneous case).
An alternative to the step-by-step prompts is to include all of the required estimation data in a single ".txt" data file. When a user chooses the data file input option, a pop-up window containing their files appears (that is, any files in the same folder as the "QuickStart" file). Once an appropriate file is selected, the program will access the data and run automatically.
As mentioned previously, a data file includes all of the same data as the prompts but it can be saved, modified and reused. It can be created either in Notepad or in MATLAB but should have the extension ".txt". The data file must be prepared prior to running the EVM program, since the program will access the data file "behind the scenes" to obtain the required information. A sample data file (here for the McManus and Penlidis [21] data set) is shown in Figure A5.
from Figure A10b). Notice here that both the reactivity ratio estimates (THETA) and the G matrices (G) are similar, which translates into similar JCRs, as observed in Figure 10.

4.1. Current "Best Practices" and Their Shortcomings
4.1.1. Exhibit A: Why Do Researchers Continue to Use Linear Parameter Estimation Techniques?
) the lack of composition drift considerations (RRE experiments should be performed at low conversion, or a cumulative model should be used); (3) the use of an outdated (and linear!) RRE
Processes 2018, 6, 8

Figure 6. Comparison of prediction performance for RR estimates obtained by (a) F-R and (b) EVM.

Figure 7. Prediction performance of RR estimates obtained by linear RRE techniques.

Figure 9. Prediction performance of RR estimates obtained using EVM.

Figure 10. Comparison of results for the copolymerization of BMA/BA using the instantaneous model (rBMA = 2.11; rBA = 0.49) and the cumulative model (rBMA = 2.11; rBA = 0.50).

Figure 11. Importance of replication for the copolymerization of Sty/EA.

Figure 12. Effect of symmetry in replication for the copolymerization of Sty/EA.

Figure 13. Effect of sequentially designed experiments for the copolymerization of BA/MMA.

Figure A8. Data files for the (a) instantaneous and (b) cumulative analysis of HEA/DCP.

Figure A9. Program output for the analysis of HEA/DCP using the cumulative model.

Figure A10. Data files for the (a) instantaneous and (b) cumulative analysis of BMA/BA.

Figure A11. Program output for the preliminary analysis of BMA/BA using the instantaneous model.

Figure A12. Program output for the analysis of BMA/BA using the cumulative model.

Figure A13. Data file for the preliminary analysis of BA/MMA.

Figure A14. Program output for the preliminary analysis of BMA/BA.

Figure A15. Data file for the sequential (DOE) analysis of BA/MMA.

Figure A16. Program output for the sequential (DOE) analysis of BMA/BA.
• Linear RRE technique used
• Inappropriate low conversion assumption (reactivity ratios are "different enough" that composition drift is possible)
• Low conversion (<20%) data not presented; cannot be re-evaluated with EVM
• Non-linear RRE technique used (good!)
• Controlled radical polymerization (RAFT) data used for RRE, therefore parameter estimates are "apparent" reactivity ratios (as per Feldermann et al. [14])
• Linear RRE technique used
• Inappropriate low conversion assumption (as with [10], reactivity ratios are "different enough" that composition drift is possible)
• Suggest that low MDO reactivity (compared to other MDO/VAc RRE results in the literature) is a result of low temperature; however, the effect of temperature on RRs is usually weak
Evaluation of MDO/VAc Copolymerization Data: Fineman-Ross vs. EVM