Processes 2018, 6(1), 8; doi:10.3390/pr6010008
Article
Computational Package for Copolymerization Reactivity Ratio Estimation: Improved Access to the Error-in-Variables-Model
Institute for Polymer Research (IPR), Department of Chemical Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada
* Author to whom correspondence should be addressed.
Received: 12 December 2017 / Accepted: 13 January 2018 / Published: 20 January 2018
Abstract
The error-in-variables-model (EVM) is the most statistically correct nonlinear parameter estimation technique for reactivity ratio estimation. However, many polymer researchers are unaware of the advantages of EVM and therefore still choose to use erroneous or approximate methods. The procedure is straightforward but it is often avoided because it is seen as mathematically and computationally intensive. Therefore, the goal of this work is to make EVM more accessible to all researchers through a series of focused case studies. All analyses employ a MATLAB-based computational package for copolymerization reactivity ratio estimation. The package builds on previous work in our group over many years. This version is an improvement, as it ensures wider compatibility and enhanced flexibility with respect to the copolymerization parameter estimation scenarios that can be considered.
Keywords:
copolymerization kinetics; copolymer composition; design of experiments; error-in-variables-model (EVM); parameter estimation; polymer reaction engineering; reactivity ratios
1. Introduction
In copolymerization kinetics, reactivity ratios are important parameters. Not only do reactivity ratio estimates specify the degree of incorporation of each comonomer into the copolymer (i.e., average copolymer composition) but they also provide information about other copolymer microstructural indicators (namely azeotropic point, sequence length distribution, triad fractions and so on). This knowledge of kinetics and microstructure can be useful in synthesizing copolymers with specific desirable properties for specific applications. Thus, polymer chemists and polymer reaction engineers require reliable reactivity ratio estimates.
Over the years, many different (and incorrect) methods have been implemented for reactivity ratio estimation. Linear parameter estimation techniques (such as the Mayo-Lewis method (method of intersections), the Fineman-Ross method and the Kelen-Tüdös method) were used previously due to lack of computational power. However, these techniques should not be used, as linear estimation techniques applied to nonlinear models result in faulty parameter estimates and a distorted error structure [1,2,3]. Other common sources of error in parameter estimation include poorly designed experiments (too few, usually unreplicated, data points chosen at random), inherent experimental difficulties (especially at low conversion levels) and inappropriate kinetic models. Ultimately, this has created a wide variety of reactivity ratios in the literature, even for similar copolymer systems (see, for example, reactivity ratios associated with the copolymer of 2-acrylamido-2-methylpropane sulfonic acid and acrylamide, as summarized by Scott et al. [4]).
The most statistically correct technique for reactivity ratio estimation is the error-in-variables-model (EVM). EVM is a nonlinear parameter estimation technique that considers the error present in all variables. The procedure is fairly straightforward but somewhat computationally intensive. As a result, researchers often revert to “historical” (and incorrect) linear parameter estimation techniques. It is speculated that researchers choose not to use EVM for two main reasons: (1) they are unaware of EVM and its advantages, and/or (2) they are intimidated by the complexity of the background mathematics required to use EVM. Therefore, the goal of the current work (based on past work as described in references [1,2,3]) is to make EVM more accessible to all researchers by analyzing a variety of copolymerization case studies using a ready-to-use computational package.
A series of five case studies presented herein revisits copolymerization data from the literature; each analysis has a specific goal in mind. Initially, we will look at current “best practices” and their shortcomings (Section 4.1) by exploring linear parameter estimation techniques (Exhibit A) and the limitations associated with low conversion data sets (Exhibit B). Next, we will demonstrate how to maximize and exploit information content from experimental data (Section 4.2). More specifically, case studies will exhibit the benefits of using cumulative copolymerization data (Exhibit C), the need for replicated experiments (Exhibit D) and the advantages of sequential design of experiments (Exhibit E).
2. A Brief Overview of EVM for Reactivity Ratio Estimation
2.1. Copolymerization Models
Monomer reactivity ratios (r_{1} and r_{2}) are parameters used to describe the potential for homopropagation relative to cross-propagation. Reactivity ratios can be estimated using experimental data and a copolymerization model, if the unreacted monomer composition in the polymerizing mixture and the cumulative copolymer composition are known [1,2,3,4,5].
The Mayo-Lewis equation (see Equation (1)), also called the instantaneous copolymer composition (ICC) equation, is the most widely used copolymerization model. Equation (1) can be used to determine the instantaneous mole fraction of monomer 1 incorporated into the copolymer (F_{1}) given the comonomer composition in the polymerizing mixture (as mole fractions of unbound monomer, f_{i}). It is important to note that the Mayo-Lewis equation provides the instantaneous copolymer composition, which means that the model is only applicable for low conversion data (typically <10%, where composition drift is minimal).
$${F}_{1}=\frac{{r}_{1}{f}_{1}^{2}+{f}_{1}{f}_{2}}{{r}_{1}{f}_{1}^{2}+2{f}_{1}{f}_{2}+{r}_{2}{f}_{2}^{2}}\tag{1}$$
where ${r}_{1}=\frac{{k}_{11}}{{k}_{12}}$ and ${r}_{2}=\frac{{k}_{22}}{{k}_{21}}$ (k_{ij} is the rate constant for each of the four possible propagation reactions, with active center i adding monomer j).
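As a minimal illustration of Equation (1), the instantaneous copolymer composition can be evaluated in a few lines. The sketch below is written in Python purely for illustration (it is not part of the MATLAB package), and the function name and the reactivity ratio values are hypothetical.

```python
def mayo_lewis_F1(f1, r1, r2):
    """Instantaneous mole fraction of monomer 1 in the copolymer, Equation (1)."""
    f2 = 1.0 - f1  # mole fraction of unbound monomer 2
    numerator = r1 * f1**2 + f1 * f2
    denominator = r1 * f1**2 + 2.0 * f1 * f2 + r2 * f2**2
    return numerator / denominator


# Hypothetical reactivity ratios, chosen only to show the shape of the curve.
for f1 in (0.1, 0.5, 0.9):
    print(f1, round(mayo_lewis_F1(f1, r1=0.5, r2=2.0), 4))
```

Two quick sanity checks follow from the equation itself: with r_{1} = r_{2} = 1 the copolymer composition equals the feed composition, and swapping the monomer labels (together with the reactivity ratios) gives the complementary mole fraction.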
In order to analyze copolymerization data for medium or high conversion levels, the cumulative form of the copolymer composition model becomes necessary. Direct numerical integration (DNI) requires combining and solving (simultaneously) an instantaneous mole balance and a cumulative mole balance (after reaching a certain molar conversion level, X_{n}). The instantaneous mole balance (Equation (2)) is an ordinary differential equation, from which f_{i} can be found at any conversion level (initial conditions f_{1} = f_{1,0} at X_{n} = 0). The cumulative mole fraction corresponding to X_{n} is given by the well-known Skeist equation (Equation (3)). DNI is a direct numerical approach and does not rely on model transformations or other potentially restrictive assumptions. This is a significant advantage over other estimation approaches with copolymerization models [3,6].
$$\frac{d{f}_{1}}{d{X}_{n}}=\frac{{f}_{1}-{F}_{1}}{1-{X}_{n}}\tag{2}$$
$${\overline{F}}_{1}=\frac{{f}_{1,0}-{f}_{1}\left(1-{X}_{n}\right)}{{X}_{n}}\tag{3}$$
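The DNI idea can be sketched as follows: integrate Equation (2) from X_{n} = 0 to the conversion of interest, then apply Equation (3). This is a Python illustration under stated assumptions, not the package's MATLAB implementation; it uses a fixed-step fourth-order Runge-Kutta scheme (the solver used by the package is not specified here), and the function names and step count are hypothetical.

```python
def mayo_lewis_F1(f1, r1, r2):
    """Instantaneous copolymer composition, Equation (1)."""
    f2 = 1.0 - f1
    return (r1 * f1**2 + f1 * f2) / (r1 * f1**2 + 2.0 * f1 * f2 + r2 * f2**2)


def cumulative_F1(f1_0, Xn, r1, r2, steps=1000):
    """Return (f1, F1_bar) at molar conversion Xn, for 0 < Xn < 1.

    f1 is obtained by integrating Equation (2) with classical RK4;
    F1_bar then follows from the mole balance in Equation (3).
    """
    def df1_dX(X, f1):
        return (f1 - mayo_lewis_F1(f1, r1, r2)) / (1.0 - X)

    h = Xn / steps
    X, f1 = 0.0, f1_0
    for _ in range(steps):
        k1 = df1_dX(X, f1)
        k2 = df1_dX(X + h / 2.0, f1 + h * k1 / 2.0)
        k3 = df1_dX(X + h / 2.0, f1 + h * k2 / 2.0)
        k4 = df1_dX(X + h, f1 + h * k3)
        f1 += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        X += h
    F1_bar = (f1_0 - f1 * (1.0 - Xn)) / Xn
    return f1, F1_bar


# Hypothetical system: monomer 1 is the less reactive comonomer (r1 < 1 < r2).
f1_end, F1_bar = cumulative_F1(f1_0=0.4, Xn=0.5, r1=0.5, r2=2.0)
print(f1_end, F1_bar)
```

In this hypothetical case the unreacted mixture becomes richer in monomer 1 as conversion increases, so the cumulative copolymer composition stays below the initial feed composition; this is the composition drift that the instantaneous model ignores.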
2.2. Error-in-Variables-Model (EVM)
A full statistical explanation of the error-in-variables-model (and enumeration of its benefits) has been presented previously; the interested reader should refer to Reilly and Patino-Leal [7] or Kazemi et al. [3,6,8]. Only the basics are presented herein, for the reader to have a brief overview before we tackle the case studies.
As mentioned previously, EVM forces the researcher to consider all sources of error, including the error associated with independent variables (such as feed composition) and (measured) cumulative copolymer composition. To obtain estimates of the “true” values of both the independent variables and the parameters, the EVM program uses a nested-iterative loop (this is represented schematically in Figure 1, with variables defined in the discussion below). The inner loop searches for “true” values of the independent variables, since there is inevitably some error associated with the measured values. Mathematically, we can relate the vector of measurements (x_{i}) to the vector of their unknown “true” values (ξ_{i}) and an error term (kε_{i}), according to Equation (4). In the error term, k is a constant that represents the magnitude of the error and ε (error) is a random variable that is typically uniformly distributed on the interval [−1, 1] (an additional explanation is included in Appendix A, for the interested reader). At the same time, the outer loop uses a copolymerization model (such as the ICC model, Equation (1)) to relate the “true” variables and the parameter (reactivity ratio) estimates, as shown in Equation (5).
$${\underline{x}}_{i}={\underline{\xi}}_{i}\left(1+k{\underline{\epsilon}}_{i}\right)\tag{4}$$
$$\underline{g}\left({\underline{\xi}}_{i},\underline{\theta}\right)=0\tag{5}$$
From a statistical perspective, the program uses this nested-iterative approach to minimize the sum of squares between the observed and predicted values, both in terms of the error in the independent variables and in terms of the parameter estimates. When the objective function (Equation (6)) is minimized, the program has found the best estimates for both the independent variables and the parameters (reactivity ratios).
$$\mathrm{\Phi}=\frac{1}{2}\sum_{i=1}^{n}{r}_{i}{\left({\overline{\underline{x}}}_{i}-{\widehat{\underline{\xi}}}_{i}\right)}^{\prime}{\underline{V}}^{-1}\left({\overline{\underline{x}}}_{i}-{\widehat{\underline{\xi}}}_{i}\right)\tag{6}$$
where n is the number of experimental trials (runs), r_{i} is the number of replicates for the ith trial, ${\overline{\underline{x}}}_{i}$ is the average of the r_{i} measurements (${\underline{x}}_{i}$), ${\widehat{\underline{\xi}}}_{i}$ is an estimate of the true values of the variables (${\underline{\xi}}_{i}$) and V is the variance-covariance matrix of the variables (which provides information about measurement error of the variables involved).
Alternatively, minimizing the objective function can be considered graphically, as in Figure 2. Given a model and some measured (independent) data, the inner loop minimizes the horizontal distances between the data points and the model (curve). At the same time, the outer loop minimizes the vertical distances between the data points and the model (that is, the outer loop attempts to reconcile model predictions and measurements).
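For readers who prefer code to notation, Equation (6) can be evaluated directly once the averaged measurements, the current estimates of the “true” values and the V matrix are available. The Python sketch below covers the two-variable case only and evaluates the objective function for given inputs; it does not reproduce the nested-iterative search itself, and all names and numbers are hypothetical.

```python
def invert_2x2(V):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = V
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def evm_objective(x_bar, xi_hat, replicates, V):
    """Objective function of Equation (6) for two measured variables per trial.

    x_bar      : list of [x1, x2] averaged measurements, one entry per trial
    xi_hat     : list of [xi1, xi2] current estimates of the true values
    replicates : list of replicate counts r_i
    V          : 2 x 2 variance-covariance matrix of the variables
    """
    V_inv = invert_2x2(V)
    phi = 0.0
    for xb, xh, r in zip(x_bar, xi_hat, replicates):
        d = [xb[0] - xh[0], xb[1] - xh[1]]
        quad = sum(d[p] * V_inv[p][q] * d[q] for p in range(2) for q in range(2))
        phi += 0.5 * r * quad
    return phi


# Hypothetical values: two trials, the second one run in duplicate.
phi = evm_objective(
    x_bar=[[0.30, 0.25], [0.60, 0.52]],
    xi_hat=[[0.30, 0.24], [0.61, 0.50]],
    replicates=[1, 2],
    V=[[1e-4, 0.0], [0.0, 1e-3]],
)
print(phi)
```

Because each squared deviation is scaled by the inverse of V, a variable with a small variance (for example, a carefully prepared feed composition) is penalized more heavily for a given deviation than a noisier one.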
The computational package described herein that employs EVM for reactivity ratio estimation is based on the RREVM program created by Dubé et al. [1] (in Fortran 77), which was later updated by Polic et al. [2]. The program version was further updated and converted to MATLAB by Kazemi et al. [3,6,8]. The new and improved software version in MATLAB ensures wider compatibility and allows for possible extensions to other multicomponent systems [9]. Also, using MATLAB as the program platform allows for open-source programming, which gives researchers the option of modifying and tailoring the program as needed.
3. Program Description
3.1. Overview
Although the technical aspects of EVM were kept to a minimum in Section 2 (since more details can be found in the references), several comments are now in order about the modifications made to the program to make it more user-friendly. The program has been equipped with a graphical user interface (GUI), so that very little knowledge of MATLAB (or programming, in general) is required to invoke the EVM algorithm. Once users open the QuickStart file and execute the program, they are presented with a series of instructions and user prompts. Details and program screenshots are presented in Appendix B for the interested reader, whereas a brief overview is presented in this section.
First, the user must choose their preferred method of data entry (Figure A1). Data input may be manual (with step-by-step prompts) or may employ a user-prepared data file. Next, the user must indicate whether the copolymerization information (data) for analysis is instantaneous (below 10% conversion) or cumulative (medium to high conversion).
Once these preliminary decisions have been made, the user is prompted to provide the copolymerization data required for analysis (Section 3.2). Finally, the program evaluates the data and presents the results (Section 3.3). The program typically converges within seconds for the instantaneous analysis and in under one minute for the cumulative analysis (usually in fewer than 10 to 20 iterations in both cases). The time (or number of iterations) required for convergence depends on the location of the initial estimates and the precision required in the estimates; better initial estimates will result in faster program convergence. The EVM program is also equipped with a “time-out” option, which is triggered if no solution has been found after a predetermined number of iterations.
3.2. Program Requirements
3.2.1. Instantaneous Model
Preliminary estimates of r_{1} and r_{2} act as “starting points” for EVM (Figure A2). Depending on how much is known about the copolymer system, this information may be acquired from either the literature or preliminary experiments. Good starting guesses can come from a prior study on the same copolymer system (or even from existing values for a similar system), or from simple trend analysis based on preliminary/screening experiments over a limited range of conditions. Prior knowledge can provide valuable information for both the design and estimation steps. If these preliminary estimates are far from the true values, convergence may simply take somewhat longer than the typical times given at the end of Section 3.1.
For the instantaneous case, the copolymerization data required are the feed composition (f_{1,0}) and cumulative copolymer composition (${\overline{F}}_{1}$), both in terms of monomer 1 (see sample program prompt in Figure A3). (${\overline{F}}_{1}$ is approximated by F_{1} for low conversion experiments treated by the instantaneous case). As mentioned previously, the ICC model assumes that composition drift does not occur. Therefore, it is recommended that only low conversion data (below 10% conversion) be included in this analysis.
The final prompt prior to parameter estimation is a review of the default settings (Figure A4). This window gives users the opportunity to check settings such as the error type (additive vs. multiplicative error), the error tolerance level and the variance-covariance matrix for the copolymerization system. Details regarding these input values are presented in the Default Settings section (Section 3.2.3).
If the user prefers to use a pre-made data file for program input (see again Figure A1), the same information is required: preliminary reactivity ratio estimates, experimental data and program settings. However, all of the data input is presented in a single ‘.txt’ file, which can be saved, modified (as necessary) and reanalyzed. This is particularly advantageous if a data set is being altered slightly between analyses, as in some of the case studies presented in Section 4. For the interested reader, a sample data file is presented in Appendix B (Figure A5).
3.2.2. Cumulative Model
The analysis with the cumulative model uses many of the same inputs as the instantaneous analysis but the direct numerical integration (DNI) requires additional information. After the user provides preliminary reactivity ratio estimates, one is prompted to input the molecular weights (MW_{i}) of both comonomers. This information is required to relate weight conversion data (X_{w}, which can be experimentally determined using gravimetry) to molar conversion data (X_{n}, which is used in the DNI as per Equations (2) and (3)). The program converts X_{w} to X_{n} according to Equation (7).
$${X}_{n}={X}_{w}\frac{M{W}_{1}{f}_{1,0}+M{W}_{2}{f}_{2,0}}{M{W}_{1}{\overline{F}}_{1}+M{W}_{2}{\overline{F}}_{2}}\tag{7}$$
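The conversion in Equation (7) is a one-line calculation; the Python sketch below illustrates it. The function name and the molecular weights are hypothetical and are not taken from the package.

```python
def weight_to_molar_conversion(Xw, f1_0, F1_bar, MW1, MW2):
    """Convert mass conversion Xw to molar conversion Xn, Equation (7)."""
    f2_0 = 1.0 - f1_0
    F2_bar = 1.0 - F1_bar
    avg_mw_feed = MW1 * f1_0 + MW2 * f2_0            # per mole of initial monomer
    avg_mw_copolymer = MW1 * F1_bar + MW2 * F2_bar   # per mole of bound monomer
    return Xw * avg_mw_feed / avg_mw_copolymer


# Hypothetical comonomers with molecular weights of 100 and 50 g/mol.
print(weight_to_molar_conversion(Xw=0.30, f1_0=0.5, F1_bar=0.25, MW1=100.0, MW2=50.0))
```

When the two comonomers have the same molecular weight, or when the copolymer has the same composition as the feed, the ratio equals one and the two conversion measures coincide.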
The next program requirement is the input of the copolymerization data. In this case, since medium or even high conversion data may be analyzed, the conversion values must be included for each point. Therefore, the user enters three arrays of data: X_{w} (measured mass conversion), f_{1,0} (known initial feed composition) and ${\overline{F}}_{1}$ (measured cumulative copolymer composition). Again, initial reactivity ratio estimates, monomer molecular weights and experimental data may be provided through this series of prompts or in a single data file.
3.2.3. Default Settings
A number of default settings are included in the program to ensure ease of implementation (see again Figure A4). However, settings such as the number of parameters, equations and/or variables involved should not be modified. Changing these settings (and modifying the associated source code) allows for expansion of the program to other applications, such as reactivity ratio estimation for multicomponent polymerizations (a program for ternary reactivity ratio estimation has already been developed and applied to experimental terpolymerization data) [9].
The only change that might be made to the default settings is to the variance-covariance matrix, V. The matrix dimension must not be modified but individual entries may be changed to incorporate prior knowledge. For example, since the instantaneous model uses two variables (f_{1,0} and ${\overline{F}}_{1}$), the variance-covariance matrix is a 2 × 2 matrix. Additional information about default entries in the V matrix can be found in Appendix A.
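To give a feel for how error levels might translate into entries of V, the Python sketch below builds a diagonal 2 × 2 matrix from two error magnitudes k. It rests on assumptions that should be checked against Appendix A: the error term kε of Equation (4) is taken with ε uniformly distributed on [−1, 1] (so that its variance is k²/3), and the two variables are treated as uncorrelated. The function names are hypothetical, and the program's actual default entries may be defined differently.

```python
def variance_from_uniform_error(k):
    """Variance of k * eps when eps is uniform on [-1, 1] (var(eps) = 1/3)."""
    return k**2 / 3.0


def diagonal_V(k_feed, k_copolymer):
    """Assumed 2 x 2 variance-covariance matrix with zero covariance."""
    return [
        [variance_from_uniform_error(k_feed), 0.0],
        [0.0, variance_from_uniform_error(k_copolymer)],
    ]


# Error levels of 1% (feed composition) and 5% (copolymer composition).
V = diagonal_V(k_feed=0.01, k_copolymer=0.05)
print(V)
```

If prior knowledge (for example, replicate measurements) suggests a different error level for either variable, only the corresponding diagonal entry needs to change; the matrix dimension stays at 2 × 2.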
3.3. Results & Diagnostics
Once the program has all necessary (input) data, EVM acquires the best possible estimates of the reactivity ratios ($\underline{\theta}$) and the independent variables (${\underline{\xi}}_{i}$).
Additional outputs are the objective function value (Φ, minimized as per Equation (6)) and G, which is the expected value of the second derivative of Φ with respect to the parameters. This is expressed mathematically in Equation (8).
$$\underline{G}=E\left[\frac{{d}^{2}\mathrm{\Phi}}{d{\theta}_{i}d{\theta}_{j}}\right]=\sum_{i=1}^{n}{r}_{i}{\underline{Z}}_{i}^{\prime}{\left({\underline{B}}_{i}\underline{V}{\underline{B}}_{i}^{\prime}\right)}^{-1}{\underline{Z}}_{i}\tag{8}$$
For the interested reader, more information about Equation (8) (and relevant variables) is given in Appendix A. However, since the program calculates the G matrix “behind the scenes”, the average user should focus on the fact that the G matrix gives valuable information about the parameters (θ, specifically r_{1} and r_{2}). In fact, the inverse of the G matrix provides an approximation of the variance-covariance matrix for the parameters. With this information, the MATLAB program can plot joint confidence regions (JCRs), which are discussed in what follows.
Joint Confidence Regions
JCRs are typically elliptical contours that quantify the level of uncertainty in the parameter estimates; smaller JCRs indicate higher precision and therefore more confidence in the estimation results.
In this program, the joint confidence region for parameter estimates can be visualized using an “error ellipse” (Equation (9)). This assumes that the error is normally distributed and that the variance is known.
$${\left(\underline{\theta}-\widehat{\underline{\theta}}\right)}^{\prime}\underline{G}\left(\underline{\theta}-\widehat{\underline{\theta}}\right)\le {\chi}_{p,\alpha}^{2}\tag{9}$$
where ${\chi}_{p,\alpha}^{2}$ is the critical value of the chi-squared distribution for p parameters at a confidence level of (1 − α). The program uses ${\chi}_{2,0.05}^{2}=5.991$ to plot JCRs at the 95% confidence level.
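The error ellipse of Equation (9) can be traced numerically from the point estimates and the G matrix. The Python sketch below does this for two parameters by factoring G (Cholesky) and mapping a unit circle onto the ellipse boundary; it also tests whether a candidate (r_{1}, r_{2}) pair lies inside the region. This is an illustration only: the function names, the point estimates and the entries of G are hypothetical, and G is assumed to be symmetric and positive definite.

```python
import math

CHI2_2_095 = 5.991  # chi-squared value quoted in the text (p = 2, 95% level)


def quad_form(G, d):
    """d' G d for a symmetric 2 x 2 matrix G and a 2-vector d."""
    return G[0][0] * d[0] ** 2 + 2.0 * G[0][1] * d[0] * d[1] + G[1][1] * d[1] ** 2


def inside_jcr(theta, theta_hat, G, chi2=CHI2_2_095):
    """True if theta satisfies the inequality in Equation (9)."""
    d = (theta[0] - theta_hat[0], theta[1] - theta_hat[1])
    return quad_form(G, d) <= chi2


def jcr_boundary(theta_hat, G, chi2=CHI2_2_095, n_points=100):
    """Points on the ellipse boundary, using the Cholesky factor G = L L'."""
    l11 = math.sqrt(G[0][0])
    l21 = G[0][1] / l11
    l22 = math.sqrt(G[1][1] - l21**2)
    s = math.sqrt(chi2)
    points = []
    for k in range(n_points):
        t = 2.0 * math.pi * k / n_points
        # Solve L' d = s * (cos t, sin t) by back-substitution.
        d2 = s * math.sin(t) / l22
        d1 = (s * math.cos(t) - l21 * d2) / l11
        points.append((theta_hat[0] + d1, theta_hat[1] + d2))
    return points


# Hypothetical point estimates and G matrix.
theta_hat = (0.90, 0.60)
G = [[400.0, 100.0], [100.0, 200.0]]
boundary = jcr_boundary(theta_hat, G)
print(inside_jcr((0.95, 0.65), theta_hat, G))
```

A nonzero off-diagonal entry in G tilts the ellipse, and unequal diagonal entries stretch it along one parameter axis, which corresponds to the shapes discussed for Figure 3.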
Perhaps the most useful information from the calculation of JCRs is the degree of precision (that is, the size and shape of the error ellipse). Ideally, the JCR will be small and round (as in Figure 3, ellipse “A”). A small JCR confirms that the parameter estimates are close to the “true” values and a round JCR indicates that the two parameter estimates have approximately the same amount of associated uncertainty. If, on the other hand, the JCR is long and narrow, it suggests that one parameter may be well-defined, whereas the other parameter may have a significant amount of associated uncertainty (see, for example, Figure 3, ellipse “B”).
Another important piece of information is the degree of parameter correlation, which can be evaluated according to the slope of the JCR. Parameter correlation is something that should be avoided as much as possible and can be minimized by using designed experiments. If there is a high degree of parameter correlation, the elliptical JCR will be at an angle, as in Figure 3, ellipse “C”. Well-behaved copolymerization systems should have reactivity ratios with similar degrees of uncertainty and minimal correlation.
4. Case Studies
4.1. Current “Best Practices” and Their Shortcomings
4.1.1. Exhibit A: Why Do Researchers Continue to Use Linear Parameter Estimation Techniques?
As mentioned in the introduction, linear parameter estimation techniques were originally used for reactivity ratio estimation (RRE) due to lack of computational power. However, since the required technology is now readily available, linear parameter estimation techniques should no longer be used for RRE. Linearizing or transforming the model distorts the error structure and may result in faulty parameter estimates. The statement may seem obvious but it is still worth emphasizing: linear parameter estimation techniques should not be used for the estimation of parameters in nonlinear models!
Although most polymer researchers know that linear techniques are inaccurate, these techniques are still taught in both introductory and graduate level polymer chemistry/science courses. Additionally, in perhaps the most commonly perused non-technical “reference,” Wikipedia, only the outdated linear techniques are mentioned. It is no wonder, then, that researchers continue to use incorrect parameter estimation techniques.
Exhibit A presents an overview of recent literature [10,11,12,13] regarding the copolymerization of 2-methylene-1,3-dioxepane (MDO; monomer 1) and vinyl acetate (VAc; monomer 2) (see also Table 1; RR stands for reactivity ratio). This copolymer has gained considerable attention in the past decade, largely due to its degradable properties. Researchers are especially interested in the reactivity ratios for the system, as reactivity ratios provide information about the copolymer microstructure. However, the reactivity ratio estimation (RRE) techniques used in this field are often incorrect. This case study will focus primarily on the issue of linear parameter estimation techniques, but invalid low conversion assumptions (that is, inappropriate use of the instantaneous copolymerization model) and error-prone data cannot be overlooked. Therefore, to demonstrate the advantages of EVM, select data from the literature will be reevaluated (properly) and comparisons will be conducted.
Evaluation of MDO/VAc Copolymerization Data: Fineman-Ross vs. EVM
In a recent study by Undin et al. [11], experimental data from six distinct feed compositions were used to estimate reactivity ratios for the MDO/VAc copolymerization. These six (batch) runs were allowed to continue until conversion did not change and the final conversion and composition measurements were reported. Finally, the reactivity ratios for the system were calculated using the Fineman-Ross (FR) method. However, as mentioned briefly in Table 1, the data on the x and y axes were unintentionally flipped in the analysis; thus, the reactivity ratio estimates originally reported are not representative of the experimental data collected.
Besides this unintended error, there are several other problems with the analysis, including (1) the use of undesigned data (that is, no design of experiments used for the selection of feed compositions); (2) the lack of composition drift considerations (RRE experiments should be performed at low conversion, or a cumulative model should be used); (3) the use of an outdated (and linear!) RRE technique. For the purposes of this discussion, we will focus on the use of the FR method for RRE but the other important points should also be noted and kept in mind.
As discussed by Hagiopol [15], the FR method is often justified by its simplicity. However, it has many shortcomings, including unequal weighting of experimental data and symmetry issues (i.e., calculation results depend on which monomer is selected as M_{1}). The data set presented in [11] is especially vulnerable to these shortcomings, largely due to the undesigned initial feed compositions (collection and use of undesigned data for parameter estimation also induce considerable correlation between the parameters, which is highly undesirable). As shown in Table 2, some of the data are obtained under fairly low M_{1} comonomer feed fraction; these conditions tend to have the greatest influence on the slope of a line, which ultimately affects reactivity ratio estimates obtained using the FR method [15].
The more pressing concern with the FR method (also described by Hagiopol [15]) is the lack of symmetry. Thus, values of r_{1} and r_{2} depend on which monomer is selected as M_{1}. To demonstrate this point, the data collected by Undin et al. [11] are evaluated with M_{1} = MDO (which was performed incorrectly in the original work; see Figure 4a) and with M_{1} = VAc (performed herein for the demonstration; see Figure 4b).
It is clear from Figure 4 that the reactivity ratio estimates depend on which comonomer is selected as M_{1}; the fact that two reactivity ratio pairs can be obtained from a single estimation technique is problematic. It is also interesting to note that both analyses give r_{1} > 1 and r_{2} > 1. While this is physically impossible, it is a side effect of experimental (and estimation) error. In reality, these results suggest that both reactivity ratios should be close to unity (which agrees with the findings of Undin et al. [11] and Hedir et al. [12]) but that at least one reactivity ratio is <1.
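The symmetry problem can be reproduced with a few lines of code. The Python sketch below applies the standard FR linearization (G = r_{1}H − r_{2}, with x = f_{1}/f_{2}, y = F_{1}/F_{2}, G = x(y − 1)/y and H = x²/y) to a synthetic data set, once with the original labeling and once with the monomer labels swapped. The data are not those of Undin et al. [11]: they are generated from Equation (1) with hypothetical reactivity ratios and a hypothetical, deterministic ±5% perturbation, and all function names are illustrative.

```python
def mayo_lewis_F1(f1, r1, r2):
    """Instantaneous copolymer composition, Equation (1)."""
    f2 = 1.0 - f1
    return (r1 * f1**2 + f1 * f2) / (r1 * f1**2 + 2.0 * f1 * f2 + r2 * f2**2)


def fineman_ross(f1_values, F1_values):
    """Least-squares line G = r1*H - r2; returns (r1, r2)."""
    H, G = [], []
    for f1, F1 in zip(f1_values, F1_values):
        x = f1 / (1.0 - f1)   # feed ratio M1/M2
        y = F1 / (1.0 - F1)   # copolymer ratio M1/M2
        H.append(x * x / y)
        G.append(x * (y - 1.0) / y)
    n = len(H)
    H_mean, G_mean = sum(H) / n, sum(G) / n
    slope = (sum((h - H_mean) * (g - G_mean) for h, g in zip(H, G))
             / sum((h - H_mean) ** 2 for h in H))
    intercept = G_mean - slope * H_mean
    return slope, -intercept


def fineman_ross_swapped(f1_values, F1_values):
    """Same fit with monomer 2 treated as M1; result reported as (r1, r2)."""
    f2_values = [1.0 - f for f in f1_values]
    F2_values = [1.0 - F for F in F1_values]
    r2_est, r1_est = fineman_ross(f2_values, F2_values)
    return r1_est, r2_est


# Synthetic example: hypothetical "true" values and a deterministic perturbation.
TRUE_R1, TRUE_R2 = 0.9, 0.6
feeds = [0.1, 0.3, 0.5, 0.7, 0.9]
exact = [mayo_lewis_F1(f, TRUE_R1, TRUE_R2) for f in feeds]
noisy = [F * (1.0 + e) for F, e in zip(exact, [0.05, -0.05, 0.05, -0.05, 0.05])]

print("M1 as labeled:", fineman_ross(feeds, noisy))
print("labels swapped:", fineman_ross_swapped(feeds, noisy))
```

With error-free data both labelings return the same pair, because the linearized equation is an exact rearrangement of Equation (1). Once the compositions carry error, the two fits weight the points differently and generally return different pairs, which is the behavior discussed above.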
The issue of symmetry (combined with the statistical inaccuracy of using linear parameter estimation to evaluate nonlinear models) highlights the need for a nonlinear parameter estimation technique like EVM. When using EVM for reactivity ratio estimation, the choice of which comonomer is defined as M_{1} has no impact on the parameter estimates. (If RR estimates are slightly different based on the choice of M_{1}, this is due to experimental error in the data.) As shown in Figure 5, reactivity ratio estimates are within the JCR, regardless of which monomer is identified as M_{1}. That is, slight discrepancies between EVM-obtained reactivity ratio estimates are well within the expected error (1% error in f_{i,0} and 10% error in ${\overline{F}}_{i}$; more on typical error levels in Appendix A). As expected, using measured/reported values as program inputs (in this case, f_{MDO,0} and ${\overline{F}}_{\mathrm{MDO}}$) provides us with a greater degree of confidence in our results; note that the JCR in Figure 5a is smaller than that in Figure 5b. There is also significantly more parameter correlation visible in Figure 5b, as evidenced by the diagonal nature of the (more elongated) JCR. This, again, is as expected; the VAc data set was calculated from the measured MDO composition data, so correlation is inevitable here. For the interested reader, data files used for this analysis are provided in Appendix C (Section C.1).
In using EVM to reanalyze the data, r_{1} > 1 and r_{2} > 1 is still observed (see again Figure 5). This outcome is likely a result of using cumulative composition data in an instantaneous model, since composition drift was not taken into account for this data set and conversion levels up to 80% are reported. Even the most statistically correct technique cannot reconcile cumulative experimental data with an instantaneous model (and, in this case, appropriate conversion data are unavailable for reanalysis with the cumulative EVM program).
Finally, we can visually evaluate the prediction performance of the reactivity ratio estimates, which involves comparing the experimental values to those predicted by the ICC equation (Equation (1)). As shown in Figure 6, the symmetry issues associated with the Fineman-Ross (FR) technique have a significant impact on the prediction performance (red curves, Figure 6a). In contrast, both of the predictions using EVM-obtained RR estimates (blue curves, Figure 6b) are in agreement with each other and with the experimental data. This is compelling evidence to choose nonlinear parameter estimation techniques like EVM over the statistically incorrect linear parameter estimation techniques.
4.1.2. Exhibit B: Are Researchers Addressing the Limitations of Low Conversion Data Analysis?
As described in Section 2.1, the instantaneous copolymer composition equation (Equation (1)) is only valid at low conversion levels (<10%). This limitation, though sometimes difficult to achieve experimentally, allows researchers to assume that composition drift does not occur in the samples being analyzed. Hence, the cumulative and instantaneous mole fractions in the copolymer are about the same. This experimental fact (and subsequent analysis) is considered “best practice,” and is used throughout the reactivity ratio estimation literature (see, for example, [10,12,13,14,16,17]).
However, limiting kinetic investigations to low conversion levels presents some fundamental challenges. In spite of our best efforts to validate the “lack of composition drift” assumption, there is almost inevitably some change in feed composition with increasing conversion. From a more practical perspective, collecting low conversion data presents experimental challenges and the collected data are extremely prone to error.
Researchers should be aware of these limitations and should act accordingly. One might choose to include conversion data in the analysis (using a cumulative model and DNI, as per Section 2.1) to account for composition drift. Alternatively (rather, in addition), researchers might use design of experiments and experimental replication to address the inevitable error associated with the data collected. If nothing else, parameter estimation using EVM considers the error present in all variables, which can account for some of the experimental error. Ultimately, though, even the most statistically correct technique cannot compensate for bad data collection!
Evaluation of HEA/DCP Copolymerization Data
Recent work by Suresh et al. [16] describes the synthesis and reactivity ratio estimation of photosensitive copolymers based on 4-(3-(2,4-dichlorophenyl)-3-oxoprop-1-enyl) phenylacrylate (DCP; monomer 2). In the study, DCP was copolymerized with hydroxyethyl acrylate (HEA; monomer 1) and with styrene, and reactivity ratios were determined to better understand copolymerization behavior. However, as established in Exhibit A (Section 4.1.1), researchers often revert to linear parameter estimation techniques, and the authors (incorrectly) used the Fineman-Ross (FR) and Kelen-Tüdös (KT) methods for parameter estimation (see Table 3).
Since the virtues of EVM over linear parameter estimation techniques have already been established, the goal of the current case study is to emphasize the limitations of low conversion data and demonstrate how they can be addressed using EVM. All experimental data that Suresh et al. [16] used for reactivity ratio estimation were kept below 15% conversion, so that the instantaneous copolymerization equation could be used for parameter estimation. But, is “below 15% conversion” enough? As mentioned previously, this is largely considered “best practice,” but does not account for composition drift even at low conversions nor for experimental error. As shown in Figure 7, only five (seemingly unreplicated) data points were collected for reactivity ratio estimation, with obvious discrepancies between the experimental data and the model predictions.
It is likely that these discrepancies are due to experimental error; this type of behavior is observed very often, especially for low conversion data. In order to address this, researchers should review potential sources of error; they are likely (1) unidentified composition drift and/or (2) experimental difficulties. This case study will demonstrate both of these sources of error (and how to handle them). This is another very important, yet implicit, contribution of EVM. EVM, if nothing else, forces one to think about the possible sources of variation (and quantify them). Relevant data, program screenshots and results are available in Appendix C, Section C.2.
To account for composition drift (even at low conversions), the cumulative copolymerization model (Equations (2) and (3)) should be used. Using direct numerical integration to solve this system of equations ensures that the feed composition (f_{1}) is considered as a function of conversion, thus taking any composition drift into account mathematically. To establish whether unidentified composition drift is the culprit in the current experimental data set, we can evaluate the data using both the instantaneous and cumulative models and compare the results (Figure 8, discussed below). In reality, using the cumulative model would also increase the amount of data available for analysis, as the copolymerization would be allowed to go to higher conversion levels (and data would continue to be collected); in addition, the experimental information is enhanced, anyway, since both conversion and copolymer composition data are included; see also Section 4.2.1. Ultimately, this would increase the degree of confidence in the reactivity ratio estimates and decrease the size of the JCR.
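As a rough illustration of why direct numerical integration captures composition drift, the following Python sketch integrates the standard drift equation df_1/dX = (f_1 − F_1)/(1 − X) with a hand-written fourth-order Runge-Kutta scheme and then recovers the cumulative composition from a mole balance. It is not the MATLAB package's code. It assumes molar conversion X (the package's cumulative model also uses monomer molecular weights), and the function names and the reactivity ratios r_1 = 0.5 and r_2 = 2.0 are hypothetical values chosen only for demonstration.

```python
def mayo_lewis(f1, r1, r2):
    """Instantaneous copolymer composition F1 from the Mayo-Lewis equation."""
    f2 = 1.0 - f1
    num = r1 * f1 * f1 + f1 * f2
    den = r1 * f1 * f1 + 2.0 * f1 * f2 + r2 * f2 * f2
    return num / den


def integrate_drift(f10, r1, r2, x_end, steps=1000):
    """Integrate df1/dX = (f1 - F1)/(1 - X) from X = 0 to x_end using RK4.

    Assumes X is molar conversion (an assumption of this sketch).
    Returns (f1 at x_end, cumulative copolymer composition F1_bar).
    """
    def rhs(x, f1):
        return (f1 - mayo_lewis(f1, r1, r2)) / (1.0 - x)

    h = x_end / steps
    x, f1 = 0.0, f10
    for _ in range(steps):
        k1 = rhs(x, f1)
        k2 = rhs(x + h / 2.0, f1 + h * k1 / 2.0)
        k3 = rhs(x + h / 2.0, f1 + h * k2 / 2.0)
        k4 = rhs(x + h, f1 + h * k3)
        f1 += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x += h
    # Mole balance on monomer 1: f10 = F1_bar * X + f1 * (1 - X)
    f1_bar = (f10 - f1 * (1.0 - x_end)) / x_end
    return f1, f1_bar


# Hypothetical example: r1 = 0.5, r2 = 2.0, feed f10 = 0.3, 15% conversion
f1_end, F1_bar = integrate_drift(0.3, 0.5, 2.0, 0.15)
F1_inst = mayo_lewis(0.3, 0.5, 2.0)
print(f1_end, F1_bar, F1_inst)
```

Comparing `F1_bar` with `F1_inst` for a given feed gives a quick sense of how much drift the instantaneous model would ignore at that conversion level.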
Figure 8 indicates that the two EVM-obtained reactivity ratio estimates are in good agreement and the JCR sizes and orientations are similar. Thus, in this case, the effect of composition drift is likely minimal. Therefore, we will continue our troubleshooting by investigating the second source of error: experimental difficulties. Since no replicate data are available, we cannot calculate the error associated with the composition measurements shown in Figure 7. However, as discussed in Section 2.2, EVM considers the error present in all variables throughout the parameter estimation process. The program default values of 1% error (associated with f_{1,0}) and 5% error (associated with ${\overline{F}}_{1}$) were therefore used in the analysis (see Appendix A for additional information).
The reactivity ratios calculated using the EVM program are as described in Table 3 and Figure 8. The converged program also provides the best possible estimates of “true” values of the variables, a feature which was described in Section 3.3 (see also Appendix C, Section C.2). Thus, in Figure 9, we can compare the experimental (measured) values to the “true” experimental values and the EVM model predictions (both instantaneous and cumulative).
Figure 9 indicates that the experimental data points were subject to some degree of error, especially for f_{1,0} = 0.3 and f_{1,0} = 0.7. Thus, experimental difficulties are likely the culprit here. The best way to mitigate this type of problem is to design experiments, replicate copolymerization runs and use EVM for parameter estimation. Additionally, using full conversion data with the cumulative model gives a more complete picture of copolymerization kinetics; researchers do not have to struggle with the experimental challenges of collecting low conversion data.
4.2. Maximizing and Exploiting Information Content
4.2.1. Exhibit C: What Happens when We Take Advantage of ALL Copolymerization Data?
Collecting low conversion data presents some challenges (as described in Exhibit B) but the instantaneous model is statistically valid if managed properly. That is, good control over conversion levels (typically < 10%), statistical design of experiments, experimental replication, and/or a nonlinear parameter estimation technique should be used.
Evaluation of BMA/BA Copolymerization Data: Low Conversion Analysis
Ren et al. [18] recently investigated the copolymerization kinetics of n-butyl methacrylate (BMA; monomer 1) and n-butyl acrylate (BA; monomer 2). Originally, the group collected copolymerization data at low conversion levels (<10%) so that the instantaneous model (Equation (1)) could be used for analysis. After estimating preliminary reactivity ratios using the RREVM program [1], additional (replicated) designed experiments were completed to improve the quality of the data.
This low conversion data set has been re-analyzed using the new MATLAB-based EVM program. As with the original investigation, analysis was first performed with the preliminary data only (9 equidistant feed compositions). Then, the analysis was repeated with preliminary data supplemented by the designed replicates (feed compositions selected using the Tidwell-Mortimer criterion [4,19]). A comparison of reactivity ratio estimates is shown in Table 4 and the original data and program output are provided in Appendix C, Section C.3.
Good agreement is observed between reactivity ratio estimates, no matter what the amount (or type) of data used. This indicates well-behaved data; the low conversion analysis was done in a methodical and statistically correct manner.
Evaluation of BMA/BA Copolymerization Data: Medium-High Conversion Analysis
This type of low conversion analysis (as performed by Ren et al. [18]) would be sufficient for reactivity ratio estimation, especially since design of experiments was included in the investigation. However, in this case, what if one had also included additional (medium-high conversion) experimental data? Ren et al. [18] chose to run three feed compositions up to high conversion values; the full conversion experimental data were used to evaluate the prediction performance of the reactivity ratio estimates. The analysis showed good agreement between model predictions and experimental data, thus confirming the reactivity ratio estimates.
Let’s now take this a step further for illustration purposes. We know that using the cumulative model and direct numerical integration provides us with the opportunity to “repurpose” this cumulative (medium-high conversion) data for improved reactivity ratio estimation. With a cumulative model, there is potential to obtain significantly more information (that is, more data points) from each experiment. Since researchers are not limited to low conversion, less experimental tedium is required to obtain the same degree of accuracy, as long as the experiments are well-designed.
A direct comparison of the preliminary analysis (9 feed compositions) and the cumulative analysis (3 feed compositions) results is provided in Figure 10. The same initial estimates were used in both cases to ensure that both of the RRE techniques had the same starting point. It is interesting to note that both the reactivity ratio estimates and the JCR areas are almost identical for the two data sets. This provides us with two main conclusions: that the parameter estimation results using the instantaneous and cumulative models are in agreement and that the degree of confidence in our results is approximately the same (regardless of which data set and/or model is being used). Therefore, in this case, 3 full conversion runs have approximately the same information content as 9 runs that are limited to low conversion levels.
This result should motivate researchers to think carefully about their preliminary experimental work. By strategically selecting feed compositions (using design of experiments techniques like Tidwell-Mortimer [4,19] or EVM [8,9]) and collecting copolymerization data up to medium or high conversion levels, it is possible to obtain sufficient information about a new system. The results shown herein suggest that preliminary experimental work can almost be reduced to 1/3 of the original load, without any loss of information content. Therefore, researchers should be encouraged to make use of all copolymerization data by employing the cumulative copolymerization model.
4.2.2. Exhibit D: How Many Replicates Do We Really Need?
As demonstrated in Exhibit C, experimental replication is an important aspect of reactivity ratio estimation. Collection of data for reactivity ratio estimation, especially at low conversion, is subject to experimental error; replicating experiments helps researchers account for experimental error and potential lurking variables. In deciding on the number of required replicates, a number of aspects must be considered [20]. The goal of the current case study is not to do a statistical evaluation of the number of replicates required but rather to look at pre-existing data and demonstrate once more the importance of experimental replicates.
Many reactivity ratio estimation studies (including [18,21,22,23]) have used the Tidwell-Mortimer (TM) criterion for design of experiments to select the feed compositions at which to run copolymerizations for reactivity ratio estimation. In most cases, two optimal feed compositions (according to TM) are used and replicated four times each. To establish the importance of replication and symmetry, this case study will analyze subsets of an original data set from the copolymerization of styrene (Sty; monomer 1) and ethyl acrylate (EA; monomer 2) [21].
When experiments are statistically designed, it is possible to obtain accurate reactivity ratio estimates from reduced experimental effort. Replication ensures that experimental results are repeatable, gives us an estimate of the experimental error and increases the degree of confidence in the resulting reactivity ratio estimates. The data reported by McManus and Penlidis [21] included an initial run and 3 replicates at two feed compositions (8 data points total); the data set is available in Appendix C, Section C.4.
To create this collection of reactivity ratio estimates and JCRs (Figure 11), the original data set [21] was revisited with the MATLAB-based RREVM program (blue JCR and circle point estimate in Figure 11; 3 replicates). Then, one randomly selected replicate at each feed composition was removed and the data were re-evaluated (red JCR and square point estimate; 2 replicates). This process was repeated, thus leaving half of the original data set (green JCR and triangle point estimate; 1 replicate). Finally, only one data point at each feed composition was used in the analysis (black JCR and diamond point estimate; no replicates).
As shown in Figure 11, the reactivity ratio estimates are all similar for the copolymerization of styrene and ethyl acrylate, regardless of how many replicates are used (in this specific case, for this specific data set). The JCRs form a series of concentric ellipses and the slight skew (off-centeredness) depends on which of the replicates are randomly removed. Clearly, using the full data set (3 replicates) gives a much smaller JCR (and a much higher degree of confidence) compared to the other estimates. In general, the JCR size increases (that is, the uncertainty becomes greater) as the number of replicates decreases.
At first glance, it seems as though the uncertainty in r_{1} is much greater than that in r_{2} (as demonstrated by the horizontal growth in the JCRs as the number of replicates decreases). However, this is partially due to the fact that r_{1} is approximately 6 times larger than r_{2}. Thus, the same relative error will have a larger absolute value in r_{1} compared to r_{2}. Another factor that should be considered is the feed compositions used to collect the experimental data (f_{1,0}). The TM criterion suggested f_{1,0} = 0.0788 and f_{1,0} = 0.7193, where monomer 1 is styrene [21]. While this is the statistically correct approach (and an excellent starting point), this means that the experimental data contain information only about a “monomer 2”-rich system (given f_{1,0} = 0.0788, f_{2,0} = 0.9212). While f_{1,0} = 0.7193 contains more styrene than ethyl acrylate, an even higher f_{1,0} would further improve the degree of confidence in r_{1}. This will be discussed further (and proven through a case study) in Exhibit E, which demonstrates the effectiveness of a sequential design of experiments.
In looking at the importance of replication, it is also worth examining the effect of symmetry on reactivity ratio estimates and JCRs. Specifically, the next step in the investigation looks at how estimation results are affected when replicates are only available for one of the two recipes. Again, the styrene/ethyl acrylate data set from McManus and Penlidis [21] was employed.
The reactivity ratio estimates and JCRs presented in Figure 12 tell an interesting story. When replicates are only included from the high f_{1,0} runs, the error in r_{2} (vertical error, in this case) increases (red JCR and square). The reverse is true when all of the replicates are from the low f_{1,0} (high f_{2,0}) runs; uncertainty in r_{1} becomes much more substantial (green JCR and triangle). This observation is in agreement with the results of Figure 11, as the inclusion of “monomer 1”-rich data improves the degree of confidence in r_{1}.
Both asymmetrical data sets give reasonably good estimates of reactivity ratios for the styrene and ethyl acrylate copolymerization. Ultimately, limited replication is better than no replication at all. However, it is no coincidence that the JCR from the fully replicated data set is located almost in the intersection of the other two curves (between the red and green curves in Figure 12). Including all of the experimental replicates in the analysis ensures that we have the highest degree of confidence in both r_{1} and r_{2}, thus decreasing the JCR area as much as possible.
4.2.3. Exhibit E: Can We Use Design of Experiments to Increase Confidence in Our Results?
In both Exhibit C and Exhibit D, we observed that design of experiments (specifically, using the Tidwell-Mortimer criterion) is an important aspect of reactivity ratio estimation. Reactivity ratio estimates obtained using designed data are more accurate and more precise than those obtained from preliminary experiments, because designed experiments use a combination of prior knowledge and statistical principles to increase confidence in the final estimates.
In the current case study, we will look at experimental data for the copolymerization of butyl acrylate (BA; monomer 1) and methyl methacrylate (MMA; monomer 2). The original investigation by Dubé and Penlidis [22] was a detailed, multistep analysis but only data from the first step are used in the current exhibit.
In investigating the effect of experimental design on the confidence in our estimation results, there are two important pieces of information to consider. If the preliminary estimates of r_{1} are smaller than r_{2}, then (1) uncertainty in r_{1} will seem much lower than uncertainty in r_{2} (as explained previously in Section 4.2.2, the same relative error will have a larger absolute value in r_{2} compared to r_{1}) and (2) the Tidwell-Mortimer design will suggest recipes rich in monomer 1. As shown in Figure 13 (black JCR; 2 feed compositions), r_{1} < r_{2} for the BA/MMA system and there is more uncertainty in r_{2} (that is, in the vertical direction). Generally speaking, we find that the JCR is “stretched” along the axis of the larger reactivity ratio estimate, which is due to both the absolute error and the selected feed compositions.
Since the absolute error is experiment-dependent, it is a fact of life that one has to live with. Thus, this case study focuses on item (2) described above: how do the feed compositions (selected randomly or via design of experiments) affect the reactivity ratio estimates and associated JCRs? In the case of BA/MMA copolymerization (given preliminary estimates r_{1} = 0.51 and r_{2} = 2.38 from Grassie et al. [24]), the Tidwell-Mortimer criterion suggests the following feed compositions: f_{1,0} = 0.543 and f_{1,0} = 0.798, where monomer 1 is butyl acrylate [22]. Based on this criterion, all of the experimental data collected are rich in monomer 1, which provides us with more certainty in r_{1} (see again Figure 13; black JCR; 2 feed compositions).
At the next step, we can use EVM-based sequential design of experiments [8]. This allows for further refinement of the reactivity ratio estimates, a higher degree of certainty and therefore smaller JCRs. The procedure is described below:
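The two designed feed compositions quoted above can be reproduced from the preliminary estimates with the commonly cited closed form of the Tidwell-Mortimer criterion, f_{1,0}' = 2/(2 + r_{1}) and f_{1,0}'' = r_{2}/(2 + r_{2}). The Python sketch below is only an illustration of that arithmetic; the function name is hypothetical and it is not part of the MATLAB package.

```python
def tidwell_mortimer(r1, r2):
    """Return the two Tidwell-Mortimer feed fractions of monomer 1.

    Uses the commonly cited closed form:
        f1' = 2 / (2 + r1)   and   f1'' = r2 / (2 + r2)
    """
    f1_first = 2.0 / (2.0 + r1)
    f1_second = r2 / (2.0 + r2)
    return f1_first, f1_second


# Preliminary BA/MMA estimates quoted in the text (r1 = 0.51, r2 = 2.38)
f1_first, f1_second = tidwell_mortimer(0.51, 2.38)
print(round(f1_first, 3), round(f1_second, 3))
```

With these inputs the sketch gives roughly 0.797 and 0.543, which agrees with the reported design points (0.798 and 0.543) to within rounding. For this system both values exceed 0.5, which is why the designed data are all rich in monomer 1.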
(1) EVM is applied to instantaneous (low conversion) data (from [22]) to estimate reactivity ratios. Feed compositions are selected according to Tidwell-Mortimer design and four runs are done at each level: f_{1,0} = 0.543 and f_{1,0} = 0.798.
(2) Parameter estimation results from EVM are recorded (see Appendix C, Section C.5). Specifically, reactivity ratio estimates (r_{1} and r_{2}) and the G matrix (Appendix A) are required for sequential design of experiments.
(3) The EVM-based sequential design of experiments program (using data from step (2), as well as the preliminary feed compositions from step (1)) is employed. Details on the design have been reported by Kazemi et al. [8].
From the sequential design of experiments, we find that the “next best” feed composition for analysis of the BA/MMA copolymerization is f_{1,0} = 0.100. This indication that more monomer 2-rich data are required is very reasonable, since all data collected to this point have been rich in monomer 1. By introducing experimental data rich in monomer 2, the uncertainty in r_{2} should decrease.
In the absence of experimental data for f_{1,0} = 0.100, data were simulated using the instantaneous copolymerization model and random error was added (based on the variance reported in the original study [22]). As was the case for the other feed compositions, four data points at f_{1,0} = 0.100 were added to the analysis. These new data points, along with the original data (shown in Appendix C, Section C.5) were then used to re-estimate the reactivity ratios with EVM. The results are shown alongside the original analysis in Figure 13 (red JCR; 3 feed compositions).
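The simulation step can be pictured with a short Python sketch: the instantaneous (Mayo-Lewis) model gives the "true" copolymer composition at the new feed, and multiplicative uniform error is applied to both variables. This is a minimal illustration under stated assumptions, not the procedure actually used in the study. It uses the preliminary estimates r_{1} = 0.51 and r_{2} = 2.38 rather than the EVM estimates, the program default error levels (1% and 5%) rather than the variance from the original study, and a hypothetical function name and random seed.

```python
import random


def mayo_lewis(f1, r1, r2):
    """Instantaneous copolymer composition F1 from the Mayo-Lewis equation."""
    f2 = 1.0 - f1
    num = r1 * f1 * f1 + f1 * f2
    den = r1 * f1 * f1 + 2.0 * f1 * f2 + r2 * f2 * f2
    return num / den


def simulate_replicates(f10, r1, r2, k_feed=0.01, k_comp=0.05, n=4, seed=1):
    """Simulate n noisy (f1, F1) points with multiplicative error x = xi*(1 + k*eps).

    eps is drawn uniformly from [-1, 1]; the k values are assumed defaults.
    """
    rng = random.Random(seed)
    F1_true = mayo_lewis(f10, r1, r2)
    data = []
    for _ in range(n):
        f_meas = f10 * (1.0 + k_feed * rng.uniform(-1.0, 1.0))
        F_meas = F1_true * (1.0 + k_comp * rng.uniform(-1.0, 1.0))
        data.append((f_meas, F_meas))
    return F1_true, data


# Four simulated points at the "next best" feed composition f10 = 0.100
F1_true, simulated = simulate_replicates(0.100, 0.51, 2.38)
for f_meas, F_meas in simulated:
    print(f"{f_meas:.4f}  {F_meas:.4f}")
```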
The inclusion of data rich in monomer 2 drastically improves the degree of certainty in our reactivity ratio estimates. While the point estimates are unaffected, the error in r_{2} is significantly reduced. This is as expected: when data rich in monomer 2 are available, we can have greater confidence in r_{2}.
This result demonstrates the importance of design of experiments for reactivity ratio estimation. We are able to maximize the information content from a minimal number of runs and we are able to decrease the degree of uncertainty in our parameter estimates. Sequential designs are extremely useful and revealing and minimize the overall experimental effort.
5. Conclusions
The MATLAB-based program for reactivity ratio estimation (using the error-in-variables-model) is statistically correct, accurate and user-friendly. This ready-to-use software can easily be employed to evaluate conversion and composition data, providing researchers with precise reactivity ratio estimates within seconds. Using EVM for reactivity ratio estimation should provide polymer researchers with confidence, especially as they continue to use reactivity ratios to predict copolymer properties and microstructure for specific applications.
We hope that the case studies presented herein will motivate polymer chemists and engineers to think critically about which technique(s) they use for parameter estimation. Current “best practices” have serious shortcomings, especially as linear parameter estimation techniques continue to be used. Ignoring the limitations of low conversion data is imprudent; inappropriate statistical approaches can undermine even carefully collected data. The case studies have also shown that the error-in-variables-model can maximize information content from copolymerization data. Researchers can ensure the accuracy of their results (while minimizing required time and resources) by collecting and analyzing full conversion data. Also, as demonstrated by the case studies, replication and design of experiments are key to a high degree of confidence.
Ultimately, the computer program described and demonstrated herein provides both the ease-of-use required for introductory studies and the flexibility needed for extensions to complex and/or multicomponent systems. Therefore, it should find use in both industry and academia. The program can be obtained by contacting the authors.
Acknowledgments
The authors wish to acknowledge financial support from the Natural Sciences and Engineering Research Council (NSERC) of Canada and the Canada Research Chair (CRC) program. In addition, thanks go to UWW/OMNOVA Solutions, Akron, OH, USA, for special support to A.J.S.
Author Contributions
A.J.S. made the EVM program “user-friendly” within MATLAB, researched and analyzed the case studies and wrote the paper. A.P. supervised the work and corrected several drafts of the paper.
Conflicts of Interest
The authors declare no conflict of interest.
An Overview of the Appendices
In Appendix A, information about relevant statistical principles is presented. This is intended to provide the interested reader with additional insight about “behind-the-scenes” mathematical details. Appendix B presents a general program description, including screenshots with default program values. Finally, Appendix C contains details and data for each of the specific case studies presented in the main text (Section 4).
Appendix A. Relevant Statistical Principles
The user need not have a detailed understanding of the statistical principles used in the error-in-variables-model. However, for the interested reader, some additional information is included in what follows.
Appendix A.1. Additive and Multiplicative Error
The magnitude (and type) of error associated with the variables can be determined through independent replication. Typically, the relationship between a variable and its error is either additive (absolute) or multiplicative (relative). Multiplicative error is typically assumed because error is presented as a percentage of the measurement (and is, therefore, relative in nature). However, if a user has insight about a system that indicates additive error, it is possible to modify the program accordingly.
The relationships between the “true” value of the variable (${\underline{\xi}}_{i}$) and the measured/recorded values (${\underline{x}}_{i}$) are shown in Equations (A1) and (A2) for additive and multiplicative error, respectively. Note that Equation (A2) has been shown previously as Equation (4) but is repeated here (more generally) for reference.
$$x=\xi +k\epsilon $$
$$x=\xi \left(1+k\epsilon \right)$$
As explained before, k is a constant that reflects the uncertainty of the variables (for example, if 5% error is assumed for the multiplicative case, k = 0.05). Error, ε, is a random variable that is typically uniformly distributed between −1 and 1 [25].
When multiplicative error is assumed, it becomes necessary to transform Equation (A2) so that the error term is additive. Taking the natural logarithm of both sides gives Equation (A3). Note that ln(1 + kε) can be replaced by kε, as long as the magnitude of the error does not exceed 10% (k ≤ 0.10).
$$\mathrm{ln}\left(x\right)=\mathrm{ln}\left(\xi \right)+k\epsilon $$
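The 10% limit quoted above can be motivated by a short series argument. The following worked step is an illustrative check added for the reader, using only the symbols k and ε already defined. Expanding the logarithm for |kε| < 1 gives

$$\mathrm{ln}\left(1+k\epsilon \right)=k\epsilon -\frac{{\left(k\epsilon \right)}^{2}}{2}+\frac{{\left(k\epsilon \right)}^{3}}{3}-\cdots $$

so the leading term dropped by the approximation is of order k²/2. For k = 0.10 and |ε| ≤ 1:

$$\left|\mathrm{ln}\left(1+k\epsilon \right)-k\epsilon \right|\approx \frac{{\left(k\epsilon \right)}^{2}}{2}\le \frac{{k}^{2}}{2}=0.005$$

The exact worst case occurs at ε = −1, where |ln(0.9) + 0.1| ≈ 0.0054, that is, roughly 5% of k itself. For smaller k the discrepancy shrinks quadratically.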
Regardless of error structure, the value of k (the degree of uncertainty) manifests itself in the same way in the variance-covariance matrix. This is shown in Equations (A4) through (A7). Equation (A4) gives the variance of x (for the additive case), whereas Equation (A5) gives the variance of ln(x) (for the multiplicative case).
$$V\left(x\right)=V\left(\xi +k\epsilon \right)={k}^{2}V\left(\epsilon \right)$$
$$V\left(\mathrm{ln}\left(x\right)\right)=V\left(\mathrm{ln}\left(\xi \right)+k\epsilon \right)={k}^{2}V\left(\epsilon \right)$$
Equations (A6) and (A7) are relevant to both error structures since V(x) = V(ln(x)), as shown above.
$$V\left(\epsilon \right)=E\left({\epsilon}^{2}\right)-{\left[E\left(\epsilon \right)\right]}^{2}={\displaystyle {\int}_{-1}^{1}}\frac{{\epsilon}^{2}}{2}d\epsilon =\frac{1}{3}$$
$$V\left(x\right)=V\left(\mathrm{ln}\left(x\right)\right)=\frac{{k}^{2}}{3}$$
The variance estimate shown in Equation (A7) is applied to different variables, which populate the variance-covariance matrix for the EVM program. The program’s default settings assume 1% error associated with feed composition (x_{1} = f_{1,0} and k_{1} = 0.01) and 5% error associated with cumulative copolymer composition (x_{2} = ${\overline{F}}_{1}$ and k_{2} = 0.05). Therefore, the variance-covariance matrix, V, for the instantaneous (low conversion) case is shown in Equation (A8).
$$\underline{V}=\left[\begin{array}{cc}V\left({x}_{1}\right)& 0\\ 0& V\left({x}_{2}\right)\end{array}\right]=\left[\begin{array}{cc}\frac{{k}_{1}^{2}}{3}& 0\\ 0& \frac{{k}_{2}^{2}}{3}\end{array}\right]=\left[\begin{array}{cc}\frac{{0.01}^{2}}{3}& 0\\ 0& \frac{{0.05}^{2}}{3}\end{array}\right]=\left[\begin{array}{cc}0.0000\overline{3}& 0\\ 0& 0.0008\overline{3}\end{array}\right]$$
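The default matrix entries can be checked numerically. The Python sketch below builds the diagonal matrix from the k values using the k²/3 result and compares one entry against a Monte Carlo estimate of the variance of kε with ε uniform on [−1, 1]. The function name, sample size and seed are illustrative choices, not part of the MATLAB package.

```python
import random


def variance_covariance(k_values):
    """Diagonal variance-covariance matrix with entries k^2/3 (zero off-diagonals)."""
    n = len(k_values)
    return [[(k_values[i] ** 2) / 3.0 if i == j else 0.0 for j in range(n)]
            for i in range(n)]


# Program defaults quoted in the text: 1% on feed composition, 5% on copolymer composition
V = variance_covariance([0.01, 0.05])

# Monte Carlo check of V(k * eps) for eps ~ U(-1, 1), using k = 0.05
rng = random.Random(0)
k = 0.05
samples = [k * rng.uniform(-1.0, 1.0) for _ in range(200_000)]
mean = sum(samples) / len(samples)
mc_var = sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)

print(V)
print(mc_var)
```

The sampled variance should land close to the second diagonal entry, which is a quick sanity check when the default k values are replaced with system-specific ones.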
Appendix A.2. Calculation of G as an EVM Program Output
Only the very basics are presented in what follows. A detailed discussion of the nested-iterative EVM algorithm has been presented by Reilly and Patino-Leal [7] and has more recently been described by Kazemi et al. [3,6].
Equation (A9) below is the definition of G, the second derivative of Φ with respect to the parameters, given earlier by Equation (8) (Section 3.3).
$$\underline{G}=E\left[\frac{{d}^{2}\mathrm{\Phi}}{d{\theta}_{i}d{\theta}_{j}}\right]={\displaystyle \sum}_{i=1}^{n}{r}_{i}{\underline{Z}}_{i}^{\prime}{\left({\underline{B}}_{i}\underline{V}{\underline{B}}_{i}^{\prime}\right)}^{-1}{\underline{Z}}_{i}$$
r_{i} and V are as defined in Equation (6) (recall that r_{i} is the number of replicates for the ith trial and V is the variance-covariance matrix of the variables). Z_{i} is the vector of partial derivatives of the function $\underline{g}\left({\underline{\xi}}_{i},\underline{\theta}\right)$ (that is, the model, e.g., see Equation (5)) with respect to the parameters for the mth element (see Equation (A10)) and B_{i} is the vector of partial derivatives of the function $\underline{g}\left({\underline{\xi}}_{i},\underline{\theta}\right)$ with respect to the variables.
$${\underline{Z}}_{i}=\left[\frac{\partial \underline{g}\left({\underline{\xi}}_{i},\underline{\theta}\right)}{\partial {\theta}_{m}}\right]$$
$${\underline{B}}_{i}=\left[\frac{\partial \underline{g}\left({\underline{\xi}}_{i},\underline{\theta}\right)}{\partial \left({\underline{\xi}}_{i}\right)}\right]$$
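To make the structure of G more concrete, the following Python sketch evaluates the sum above literally for a single-equation model, g = F_1 − F_1,model(f_1; r_1, r_2), with a diagonal V, so that B V B′ reduces to a scalar. It is a simplified illustration under several assumptions and not the package's implementation: derivatives are taken by central finite differences, the log transformation used for multiplicative error is ignored, the "true" variable values are taken to lie exactly on the model, and all function names are hypothetical. The inputs are the BA/MMA preliminary estimates and design points quoted in Section 4.2.3, with four runs at each feed composition.

```python
def mayo_lewis(f1, r1, r2):
    """Instantaneous copolymer composition F1 from the Mayo-Lewis equation."""
    f2 = 1.0 - f1
    num = r1 * f1 * f1 + f1 * f2
    den = r1 * f1 * f1 + 2.0 * f1 * f2 + r2 * f2 * f2
    return num / den


def g(xi, theta):
    """Model residual g(xi, theta) = F1 - MayoLewis(f1; r1, r2)."""
    f1, F1 = xi
    r1, r2 = theta
    return F1 - mayo_lewis(f1, r1, r2)


def gradient(func, point, h=1e-6):
    """Central-difference gradient of a scalar function at 'point'."""
    grad = []
    for j in range(len(point)):
        up, down = list(point), list(point)
        up[j] += h
        down[j] -= h
        grad.append((func(up) - func(down)) / (2.0 * h))
    return grad


def g_matrix(trials, theta, v_diag):
    """Sum of r_i * Z_i' (B_i V B_i')^-1 Z_i over trials, for diagonal V.

    trials: list of (xi, r_i), with xi = (f1, F1) and r_i the run count.
    """
    p = len(theta)
    G = [[0.0] * p for _ in range(p)]
    for xi, reps in trials:
        Z = gradient(lambda th: g(xi, th), theta)   # dg/dtheta
        B = gradient(lambda x: g(x, theta), xi)     # dg/dxi
        bvb = sum(b * b * v for b, v in zip(B, v_diag))  # scalar B V B'
        for a in range(p):
            for c in range(p):
                G[a][c] += reps * Z[a] * Z[c] / bvb
    return G


theta = (0.51, 2.38)                      # preliminary BA/MMA estimates
v_diag = (0.01 ** 2 / 3.0, 0.05 ** 2 / 3.0)  # default diagonal of V
trials = [((f, mayo_lewis(f, *theta)), 4) for f in (0.543, 0.798)]
G = g_matrix(trials, theta, v_diag)
det_G = G[0][0] * G[1][1] - G[0][1] * G[1][0]
print(G, det_G)
```

Because each trial contributes in proportion to r_i, adding replicates scales the corresponding term of G directly, which is consistent with the shrinking JCRs seen as replicates are added in Exhibit D.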
Appendix B. General Program Description
The following screenshots from the MATLAB-based EVM program are meant to supplement descriptions in the main text (especially Section 3.2). The analysis of instantaneous copolymerization data is presented herein but the same general information is relevant to the analysis of cumulative data.
We will start by demonstrating the manual data input option, then we will show the same information using the “data file” option. The prompts contain sample data from McManus and Penlidis [21]. If the user decides to input data using a pre-made data file, the same inputs are required (with slightly different formatting).
Running the program brings up the “QuickStart” menu (Figure A1), which was described in Section 3.1. Figure A2, Figure A3 and Figure A4 show the pop-up menus (that is, the required data) for manual input of instantaneous composition data. Typically, only the preliminary reactivity ratio estimates (Figure A2) and the copolymerization (composition) data (Figure A3) need to be modified by the user; the “form” of the data entry can be observed in the screenshots below.
As for Figure A4, the only “default” value that may require updating (at the user’s discretion) is the variance-covariance matrix. The default values were explained in Section A.1. If necessary, the magnitude of the matrix entries may be modified but the size of the matrix itself should not be changed (that is, it should remain a 2 × 2 matrix for the instantaneous case).
An alternative to the step-by-step prompts is to include all of the required estimation data in a single “.txt” data file. When a user chooses the data file input option, a pop-up window containing their files appears (that is, any files in the same folder as the “QuickStart” file). Once an appropriate file is selected, the program will access the data and run automatically.
As mentioned previously, a data file includes all of the same data as the prompts but it can be saved, modified and reused. It can either be created in Notepad or in MATLAB but should have the extension “.txt”. The data file must be prepared prior to running the EVM program, since the program will access the data file “behind the scenes” to obtain the required information. A sample data file (here for the McManus and Penlidis [21] data set) is shown in Figure A5.
In Figure A5, line 1 provides the “Default Settings” for the program, such as the number of parameters, equations and variables (the reader will notice that these values are the same as in Figure A4, with slightly different formatting). Line 2 contains preliminary reactivity ratio estimates (recall Figure A2), while lines 3 and 4 can be combined to form the variance-covariance matrix for the variables. Finally, lines 5 through 12 are the experimental copolymerization data (recall Figure A3). The first column represents the initial feed composition (f_{1,0}) and the second column represents the corresponding measured cumulative copolymer composition (${\overline{F}}_{1}$).
Appendix C. Data & Screenshots from Case Studies
In this section, additional data and program screenshots are presented for the interested reader (or potential program user). Data files are typically shown (rather than step-by-step prompts), which allows for easy reference and modification. However, the same data could be fed to the EVM program using the prompts described in Section 3 and Appendix B. The time required for parameter estimation (that is, parameter convergence) is listed for each case study below (as run on an Intel(R) Core™ i7-860 processor). However, one should note that execution time is dependent on a number of factors including preliminary parameter estimates, data sets (size and associated error), computer processor, background tasks, etc.
Appendix C.1. Screenshots from MDO/VAc Copolymerization Analysis
The additional information shown herein is relevant to Exhibit A (Section 4.1.1), the copolymerization of 2-methylene-1,3-dioxepane (MDO) and vinyl acetate (VAc). In spite of the fact that cumulative copolymer composition was measured and reported by Undin et al. [11], information from relevant conversion data was not included. Thus, the analysis conducted was based on the instantaneous model. Regardless of which monomer was selected as M_{1}, all calculations converged in less than five seconds.
Two data files (used to obtain the results shown in Figure 5) are presented in Figure A6. Both contain the same data set (for the MDO/VAc copolymerization from Undin et al. [11]) but differ in terms of which comonomer is identified as M_{1}. Here, Figure A6a is as reported (M_{1} = MDO) and Figure A6b is the reverse (M_{1} = VAc).
Figure A6.
Data files for the analysis of MDO/VAc using EVM assuming (a) M_{1} = MDO and (b) M_{1} = VAc.
Text output is shown in Figure A7 for M_{1} = VAc, as a sample program output (described in Section 3.3). THETA gives the reactivity ratio estimates (r_{1} and r_{2}), XI gives the best estimates of the “true” values of the variables (compare to lines 5 through 10 in Figure A6b), Phi is the objective function value (recall Equation (6)) and G is the expected value of the second derivative of Φ with respect to the parameters (recall Equation (8)).
Finally, the user is asked “Do you wish to continue with Joint Confidence Calculations?” By typing “Y,” the JCR shown in Figure 5b is obtained.
Appendix C.2. Screenshots from HEA/DCP Copolymerization Analysis
The additional information shown herein is relevant to Exhibit B (Section 4.1.2), the copolymerization of hydroxyethyl acrylate (HEA) and 4-(3-(2,4-dichlorophenyl)-3-oxoprop-1-enyl) phenylacrylate (DCP) (original data from Suresh et al. [16]).
The two data files (Figure A8) contain similar information, as discussed in Section 3.2.2. The additional data required by the cumulative model are the monomer molecular weights and the conversion values. The sample output shown below (Figure A9) is for the cumulative analysis; XI here is the best estimate for the cumulative copolymer composition $\left({\overline{F}}_{1}\right)$. JCRs for the two analyses are shown in Figure 8. The instantaneous analysis required less than five seconds to estimate reactivity ratios, whereas the cumulative model converged in less than twenty seconds.
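To illustrate why the cumulative model needs conversion values, the sketch below integrates the composition drift equation, $\mathrm{d}f_{1}/\mathrm{d}X = (f_{1} - F_{1})/(1 - X)$, and then applies the monomer mole balance ${\overline{F}}_{1} = \left(f_{1,0} - f_{1}(1 - X)\right)/X$. Several simplifications are assumed: conversion $X$ is taken on a molar basis (so no molecular weights are needed, unlike the program's data files), a fixed-step fourth-order Runge-Kutta scheme is used, and the function names, feed composition and conversion are hypothetical. The reactivity ratios are the cumulative estimates quoted in the Figure 8 caption, used only as inputs.

```python
def mayo_lewis(f1, r1, r2):
    """Instantaneous copolymer composition F1 for unreacted monomer fraction f1."""
    f2 = 1.0 - f1
    return (r1 * f1**2 + f1 * f2) / (r1 * f1**2 + 2.0 * f1 * f2 + r2 * f2**2)


def cumulative_composition(f10, r1, r2, conversion, steps=2000):
    """Return (f1, F1_bar) at a given molar conversion, 0 < conversion < 1.

    f1 is the mole fraction of monomer 1 left in the unreacted mixture and
    F1_bar is the cumulative mole fraction of monomer 1 in the copolymer.
    """
    if not 0.0 < conversion < 1.0:
        raise ValueError("conversion must lie strictly between 0 and 1")

    def drift(x, f1):
        # Composition drift: df1/dX = (f1 - F1) / (1 - X)
        return (f1 - mayo_lewis(f1, r1, r2)) / (1.0 - x)

    h = conversion / steps
    x, f1 = 0.0, f10
    for _ in range(steps):
        # Classical fourth-order Runge-Kutta step.
        k1 = drift(x, f1)
        k2 = drift(x + h / 2.0, f1 + h * k1 / 2.0)
        k3 = drift(x + h / 2.0, f1 + h * k2 / 2.0)
        k4 = drift(x + h, f1 + h * k3)
        f1 += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x += h

    # Mole balance on monomer 1: what left the monomer pool is in the polymer.
    f1_bar = (f10 - f1 * (1.0 - conversion)) / conversion
    return f1, f1_bar


# Hypothetical feed and conversion; reactivity ratios from the Figure 8 caption.
f1_left, f1_bar = cumulative_composition(0.5, 1.32, 0.55, conversion=0.4)
print(f"unreacted f_HEA = {f1_left:.3f}, cumulative F_HEA = {f1_bar:.3f}")
```

At very low conversion the cumulative composition approaches the instantaneous value, which is why low conversion data can be analyzed with the instantaneous model.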
Appendix C.3. Screenshots from BMA/BA Copolymerization Analysis
The additional information shown herein is relevant to Exhibit C (Section 4.2.1), the copolymerization of n-butyl methacrylate (BMA) and n-butyl acrylate (BA) (original data from Ren et al. [18]). The analysis of low conversion data (using the instantaneous model) is shown first, followed by the analysis of full conversion data (using the cumulative model). Only the preliminary (instantaneous) data analysis and the cumulative data analysis (rows 3 and 5 in Table 4; featured in Figure 10) are shown herein. However, the same general procedure was applied to the “Instantaneous EVM (preliminary data & designed replicates)” analysis (refer to row 4 in Table 4). Instantaneous analyses converged in under five seconds, whereas the cumulative analysis took less than twenty seconds.
Figure A10a shows the preliminary (low conversion) data set reported by Ren et al. [18]. Figure A10b shows the full conversion data (again reported by Ren et al. [18]), which were originally used to assess reactivity ratio prediction performance (see original work, Figure 5). Non-truncated data points are a result of plot digitization. Next, Figure A11 shows the program output for the preliminary analysis (data from Figure A10a) and Figure A12 shows the same for the analysis of full conversion data (data from Figure A10b). Notice here that both the reactivity ratio estimates (THETA) and the G matrices (G) are similar, which translates into similar JCRs, as observed in Figure 10.
Appendix C.4. Screenshots from Sty/EA Copolymerization Analysis
The additional information shown herein is relevant to Exhibit D (Section 4.2.2), the copolymerization of styrene (Sty) and ethyl acrylate (EA). Since all of the recorded data was for low conversion experiments, only the instantaneous model was used (and always converged in under five seconds). Several examples of low conversion data files have already been shown throughout the appendix (see, for example, Figure A6, Figure A8a and Figure A10a), so the information is not repeated herein.
However, as explained in Section 4.2.2, subsets of the original data (reported by McManus and Penlidis [21]) were analyzed to demonstrate the importance of experimental replicates. Specific data used at each stage of the analysis (and resulting reactivity ratio estimates) are shown in Table A1 and Table A2. As stated in the main text, runs were randomly selected for removal during this exercise. The investigation could be repeated with different runs removed (or, the same runs removed in a different order) and the general observations would be the same.
Table A1.
Experimental data for investigating the importance of replication (see Figure 11).
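| f_{1,0} (3 Replicates) | F_{1} (3 Replicates) | f_{1,0} (2 Replicates) | F_{1} (2 Replicates) | f_{1,0} (1 Replicate) | F_{1} (1 Replicate) | f_{1,0} (No Replicates) | F_{1} (No Replicates) |
|---|---|---|---|---|---|---|---|
| 0.079 | 0.296 | 0.079 | 0.296 | 0.079 | 0.296 |  |  |
| 0.079 | 0.308 |  |  |  |  |  |  |
| 0.079 | 0.303 | 0.079 | 0.303 | 0.079 | 0.303 | 0.079 | 0.303 |
| 0.079 | 0.286 | 0.079 | 0.286 |  |  |  |  |
| 0.719 | 0.716 |  |  |  |  |  |  |
| 0.719 | 0.736 | 0.719 | 0.736 | 0.719 | 0.736 | 0.719 | 0.736 |
| 0.719 | 0.736 | 0.719 | 0.736 |  |  |  |  |
| 0.719 | 0.732 | 0.719 | 0.732 | 0.719 | 0.732 |  |  |
| r_{1} = 0.717 |  | r_{1} = 0.746 |  | r_{1} = 0.740 |  | r_{1} = 0.750 |  |
| r_{2} = 0.128 |  | r_{2} = 0.132 |  | r_{2} = 0.127 |  | r_{2} = 0.124 |  |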
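The replicate-removal exercise can be mimicked with a much simpler estimator. The sketch below is not EVM: it swaps in ordinary nonlinear least squares on F_{1} only (Gauss-Newton with step halving), treating the feed composition as error-free, so its estimates need not coincide with the EVM values in Table A1. The function names and the starting guesses (0.7 and 0.1) are assumptions; the data are the “3 Replicates” and “No Replicates” subsets of Table A1.

```python
def mayo_lewis(f1, r1, r2):
    """Instantaneous copolymer composition F1 from the Mayo-Lewis equation."""
    f2 = 1.0 - f1
    return (r1 * f1 * f1 + f1 * f2) / (r1 * f1 * f1 + 2.0 * f1 * f2 + r2 * f2 * f2)


def sum_sq(data, r1, r2):
    """Sum of squared residuals in F1 (feed composition assumed exact)."""
    return sum((F1 - mayo_lewis(f1, r1, r2)) ** 2 for f1, F1 in data)


def fit_nlls(data, r1=0.7, r2=0.1, max_iter=100, tol=1e-12):
    """Gauss-Newton least squares fit of (r1, r2); NOT the EVM algorithm."""
    for _ in range(max_iter):
        a11 = a12 = a22 = b1 = b2 = 0.0
        for f1, F1 in data:
            f2 = 1.0 - f1
            num = r1 * f1 * f1 + f1 * f2
            den = r1 * f1 * f1 + 2.0 * f1 * f2 + r2 * f2 * f2
            d1 = f1 * f1 * (den - num) / den**2   # dF1/dr1
            d2 = -num * f2 * f2 / den**2          # dF1/dr2
            res = F1 - num / den
            a11 += d1 * d1
            a12 += d1 * d2
            a22 += d2 * d2
            b1 += d1 * res
            b2 += d2 * res
        # Solve the 2x2 normal equations for the Gauss-Newton step.
        det = a11 * a22 - a12 * a12
        step1 = (b1 * a22 - b2 * a12) / det
        step2 = (a11 * b2 - a12 * b1) / det
        # Halve the step while it leaves the positive quadrant or worsens the fit.
        scale, current = 1.0, sum_sq(data, r1, r2)
        while scale > 1e-6 and (
            r1 + scale * step1 <= 0.0
            or r2 + scale * step2 <= 0.0
            or sum_sq(data, r1 + scale * step1, r2 + scale * step2) > current
        ):
            scale *= 0.5
        r1 += scale * step1
        r2 += scale * step2
        if abs(scale * step1) + abs(scale * step2) < tol:
            break
    return r1, r2


# (f_{1,0}, F_1) pairs from Table A1.
three_replicates = [
    (0.079, 0.296), (0.079, 0.308), (0.079, 0.303), (0.079, 0.286),
    (0.719, 0.716), (0.719, 0.736), (0.719, 0.736), (0.719, 0.732),
]
no_replicates = [(0.079, 0.303), (0.719, 0.736)]

for label, subset in [("3 replicates", three_replicates),
                      ("no replicates", no_replicates)]:
    r1, r2 = fit_nlls(subset)
    print(f"{label}: r1 = {r1:.3f}, r2 = {r2:.3f}")
```

With only two distinct feed compositions and two parameters, the point estimates follow the mean composition at each feed; a least squares sketch of this kind says nothing about the size of the joint confidence region, which is where the effect of replication is most visible (Figure 11).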
Table A2.
Experimental data for investigating the effect of replicate symmetry (see Figure 12).
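| f_{1,0} (3 Replicates) | F_{1} (3 Replicates) | f_{1,0} (High f_{1,0} Replicates) | F_{1} (High f_{1,0} Replicates) | f_{1,0} (High f_{2,0} Replicates) | F_{1} (High f_{2,0} Replicates) |
|---|---|---|---|---|---|
| 0.079 | 0.296 |  |  | 0.079 | 0.296 |
| 0.079 | 0.308 |  |  | 0.079 | 0.308 |
| 0.079 | 0.303 | 0.079 | 0.303 | 0.079 | 0.303 |
| 0.079 | 0.286 |  |  | 0.079 | 0.286 |
| 0.719 | 0.716 | 0.719 | 0.716 |  |  |
| 0.719 | 0.736 | 0.719 | 0.736 | 0.719 | 0.736 |
| 0.719 | 0.736 | 0.719 | 0.736 |  |  |
| 0.719 | 0.732 | 0.719 | 0.732 |  |  |
| r_{1} = 0.717 |  | r_{1} = 0.715 |  | r_{1} = 0.752 |  |
| r_{2} = 0.128 |  | r_{2} = 0.123 |  | r_{2} = 0.129 |  |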
Appendix C.5. Screenshots from BA/MMA Copolymerization Analysis
The additional information shown herein is relevant to Exhibit E (Section 4.2.3), the copolymerization of butyl acrylate (BA) and methyl methacrylate (MMA). The original data set from Dubé and Penlidis [22] employed the Tidwell-Mortimer design of experiments; the data file is shown in Figure A13 and the program output (obtained in less than five seconds) is shown in Figure A14. The associated JCR was shown previously (recall Figure 13).
Given the output data from Figure A14, the EVM-based sequential design of experiments can be used to select the next optimal feed composition. Using EVM for sequential design of experiments has been discussed by Kazemi et al. [8]. From the design of experiments, the next logical feed composition is f_{1,0} = 0.1. Simulated data (shown in addition to the preliminary data set) used for subsequent analysis are shown in Figure A15, with the results shown in Figure A16. Again, the program converged in under five seconds.
References
1. Dubé, M.A.; Amin Sanayei, R.; Penlidis, A.; O’Driscoll, K.F.; Reilly, P.M. A microcomputer program for estimation of copolymerization reactivity ratios. J. Polym. Sci. Part A Polym. Chem. 1991, 29, 703–708.
2. Polic, A.L.; Duever, T.A.; Penlidis, A. Case studies and literature review on the estimation of copolymerization reactivity ratios. J. Polym. Sci. Part A Polym. Chem. 1998, 36, 813–822.
3. Kazemi, N.; Duever, T.A.; Penlidis, A. A powerful estimation scheme with the error-in-variables-model for nonlinear cases: Reactivity ratio estimation examples. Comput. Chem. Eng. 2013, 48, 200–208.
4. Scott, A.J.; Riahinezhad, M.; Penlidis, A. Optimal design for reactivity ratio estimation: A comparison of techniques for AMPS/acrylamide and AMPS/acrylic acid copolymerizations. Processes 2015, 3, 749–768.
5. Scott, A.J.; Penlidis, A. Copolymerization. In Elsevier Reference Module in Chemistry, Molecular Sciences and Chemical Engineering; Reedijk, J., Ed.; Elsevier: Waltham, MA, USA, 2017.
6. Kazemi, N.; Duever, T.A.; Penlidis, A. Reactivity ratio estimation from cumulative copolymer composition data. Macromol. React. Eng. 2012, 5, 385–403.
7. Reilly, P.M.; Patino-Leal, H. A Bayesian study of the error-in-variables model. Technometrics 1981, 23, 221–231.
8. Kazemi, N.; Duever, T.A.; Penlidis, A. Design of experiments for reactivity ratio estimation in multicomponent polymerizations using the error-in-variables approach. Macromol. Theory Simul. 2013, 22, 261–272.
9. Scott, A.J.; Kazemi, N.; Penlidis, A. AMPS/AAm/AAc terpolymerization: Experimental verification of the EVM framework for ternary reactivity ratio estimation. Processes 2017, 5, 9.
10. Agarwal, S.; Kumar, R.; Kissel, T.; Reul, R. Synthesis of degradable materials based on caprolactone and vinyl acetate units using radical chemistry. Polym. J. 2009, 42, 650–660.
11. Undin, J.; Illanes, T.; Finne-Wistrand, A.; Albertsson, A. Random introduction of degradable linkages into functional vinyl polymers by radical ring-opening polymerization, tailored for soft tissue engineering. Polym. Chem. 2012, 3, 1260–1266.
12. Hedir, G.; Bell, C.; Ieong, N.; Chapman, E.; Collins, I.; O’Reilly, R.; Dove, A. Functional degradable polymers by xanthate-mediated polymerization. Macromolecules 2014, 47, 2847–2852.
13. Ding, D.; Pan, X.; Zhang, Z.; Li, N.; Zhu, J.; Zhu, X. A degradable copolymer of 2-methylene-1,3-dioxepane and vinyl acetate by photo-induced cobalt-mediated radical polymerization. Polym. Chem. 2016, 7, 5258–5264.
14. Feldermann, A.; Toy, A.; Phan, H.; Stenzel, M.; Davis, T.; Barner-Kowollik, C. Reversible addition fragmentation chain transfer copolymerization: Influence of the RAFT process on the copolymer composition. Polymer 2004, 45, 3997–4007.
15. Hagiopol, C. Copolymerization: Toward a Systematic Approach; Plenum Publishers: New York, NY, USA, 1999.
16. Suresh, J.; Karthik, S.; Arun, A. Photocrosslinkable polymer based on 4-(3-(2,4-dichlorophenyl)-3-oxoprop-1-enyl) phenylacrylate: Synthesis, reactivity ratio, and crosslinking studies. Mater. Sci. Pol. 2016, 34, 834–844.
17. Zhang, G.; Zhang, L.; Gao, H.; Konstantinov, I.; Arturo, S.; Yu, D.; Torkelson, J.; Broadbelt, L. A combined computational and experimental study of copolymerization propagation kinetics for 1-ethylcyclopentyl methacrylate and methyl methacrylate. Macromol. Theory Simul. 2016, 25, 263–273.
18. Ren, S.; Hinojosa-Castellanos, L.; Zhang, L.; Dubé, M.A. Bulk free-radical copolymerization of n-butyl acrylate and n-butyl methacrylate: Reactivity ratio estimation. Macromol. React. Eng. 2017, 11, 1600050.
19. Tidwell, P.W.; Mortimer, G.A. An improved method of calculating copolymerization reactivity ratios. J. Polym. Sci. Part A Polym. Chem. 1965, 3, 369–387.
20. Cochran, W.G.; Cox, G.M. Experimental Designs; John Wiley & Sons, Inc.: New York, NY, USA, 1957.
21. McManus, N.; Penlidis, A. A kinetic investigation of styrene/ethyl acrylate copolymerization. J. Polym. Sci. Part A Polym. Chem. 1996, 34, 237–248.
22. Dubé, M.A.; Penlidis, A. A systematic approach to the study of multicomponent polymerization kinetics—The butyl acrylate/methyl methacrylate/vinyl acetate example: 1. Bulk copolymerization. Polymer 1995, 36, 587–598.
23. Zhang, Y.; Dubé, M.A. Copolymerization of n-butyl methacrylate and d-limonene. Macromol. React. Eng. 2014, 8, 805–812.
24. Grassie, N.; Torrance, B.; Fortune, J.; Gemell, J. Reactivity ratios for the copolymerization of acrylates and methacrylates by nuclear magnetic resonance spectroscopy. Polymer 1965, 6, 653–658.
25. Rossignoli, P.J.; Duever, T.A. The estimation of copolymer reactivity ratios: A review and case studies using the error-in-variables model and nonlinear least squares. Polym. React. Eng. 1995, 3, 361–395.
Figure 2.
Graphical representation of EVM (inspired by [2]).
Figure 4.
Fineman-Ross plots for the copolymerization of MDO/VAc with (a) M_{1} = MDO (r_{MDO} = 1.06; r_{VAc} = 1.83) and (b) M_{1} = VAc (r_{MDO} = 1.96; r_{VAc} = 2.03).
Figure 5.
EVM-obtained RR estimates and JCRs for the copolymerization of MDO/VAc with (a) M_{1} = MDO (r_{MDO} = 1.19; r_{VAc} = 1.87) and (b) M_{1} = VAc (r_{MDO} = 1.01; r_{VAc} = 1.72).
Figure 8.
EVM-obtained RR estimates and JCRs for the copolymerization of HEA/DCP using the instantaneous model (r_{HEA} = 1.28; r_{DCP} = 0.56) and the cumulative model (r_{HEA} = 1.32; r_{DCP} = 0.55).
Figure 10.
Comparison of results for the copolymerization of BMA/BA using the instantaneous model (r_{BMA} = 2.11; r_{BA} = 0.49) and the cumulative model (r_{BMA} = 2.11; r_{BA} = 0.50).
Table 1.
Summary of reactivity ratio estimation (RRE) studies for 2-methylene-1,3-dioxepane (MDO; monomer 1)/vinyl acetate (VAc; monomer 2) copolymerization.
| Ref. | RRE Technique | r_{1} | r_{2} | Comments |
|---|---|---|---|---|
| [10] | Kelen-Tüdös (KT) | 0.47 | 1.56 |  |
| [11] | Fineman-Ross (FR) | 0.93 | 1.71 |  |
| [12] | Non-Linear Least Squares (NLLS) | 1.03 | 1.22 |  |
| [13] | Fineman-Ross (FR) | 0.14 | 1.89 |  |
Table 2.
RRE data for the copolymerization of MDO (monomer 1)/VAc (monomer 2) [11].
| Sample | Monomer Feed f_{1,0} | Monomer Feed f_{2,0} | Copolymer Composition ${\overline{F}}_{1}$ | Copolymer Composition ${\overline{F}}_{2}$ |
|---|---|---|---|---|
| MDO70 | 0.70 | 0.30 | 0.66 | 0.34 |
| MDO50 | 0.50 | 0.50 | 0.42 | 0.58 |
| MDO30 | 0.30 | 0.70 | 0.23 | 0.77 |
| MDO10 | 0.10 | 0.90 | 0.06 | 0.94 |
| MDO5 | 0.05 | 0.95 | 0.03 | 0.97 |
| MDO1 | 0.01 | 0.99 | 0.005 | 0.995 |
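The sensitivity of the Fineman-Ross linearization to monomer labelling (Figure 4) can be checked directly from the Table 2 data. The sketch below is an independent illustration, not the authors' code, and the function name is hypothetical. It uses the standard Fineman-Ross form G = r_{1}H − r_{2}, with x = f_{1}/f_{2}, y = F_{1}/F_{2}, G = x(y − 1)/y and H = x²/y, fitted by ordinary linear regression, and applies it once with M_{1} = MDO and once with M_{1} = VAc.

```python
def fineman_ross(feed_m1, copolymer_m1):
    """Fineman-Ross estimates (r1, r2) from mole fractions of monomer 1.

    Fits G = r1 * H - r2 by ordinary least squares, where
    x = f1 / f2, y = F1 / F2, G = x * (y - 1) / y and H = x**2 / y.
    """
    xs = [f / (1.0 - f) for f in feed_m1]
    ys = [F / (1.0 - F) for F in copolymer_m1]
    G = [x * (y - 1.0) / y for x, y in zip(xs, ys)]
    H = [x * x / y for x, y in zip(xs, ys)]

    n = len(G)
    mean_h = sum(H) / n
    mean_g = sum(G) / n
    s_hh = sum((h - mean_h) ** 2 for h in H)
    s_hg = sum((h - mean_h) * (g - mean_g) for h, g in zip(H, G))

    slope = s_hg / s_hh                   # slope of the line is r1
    intercept = mean_g - slope * mean_h   # intercept of the line is -r2
    return slope, -intercept


# Table 2 data, expressed as mole fractions of MDO.
feed_mdo = [0.70, 0.50, 0.30, 0.10, 0.05, 0.01]
copolymer_mdo = [0.66, 0.42, 0.23, 0.06, 0.03, 0.005]

# (a) M1 = MDO
r_mdo_a, r_vac_a = fineman_ross(feed_mdo, copolymer_mdo)

# (b) M1 = VAc: same data, complementary mole fractions
feed_vac = [1.0 - f for f in feed_mdo]
copolymer_vac = [1.0 - F for F in copolymer_mdo]
r_vac_b, r_mdo_b = fineman_ross(feed_vac, copolymer_vac)

print(f"(a) M1 = MDO: r_MDO = {r_mdo_a:.2f}, r_VAc = {r_vac_a:.2f}")
print(f"(b) M1 = VAc: r_MDO = {r_mdo_b:.2f}, r_VAc = {r_vac_b:.2f}")
```

The two labellings should land close to the two sets of values quoted in the Figure 4 caption. The gap between them arises because the linearization weights the same six points very differently depending on which monomer is M_{1}: with M_{1} = VAc, the MDO1 sample dominates the regression.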
Table 3.
Summary of RRE results for hydroxyethyl acrylate (HEA; monomer 1)/4-(3-(2,4-dichlorophenyl)-3-oxoprop-1-enyl) phenylacrylate (DCP; monomer 2) copolymerization.
| Ref. | RRE Technique | r_{1} | r_{2} |
|---|---|---|---|
| [16] | Fineman-Ross (FR) | 1.53 ± 0.10 | 0.76 ± 0.16 |
| [16] | Kelen-Tüdös (KT) | 1.67 ± 0.13 | 0.58 ± 0.05 |
| [16] | Extended KT | 1.65 ± 0.13 | 0.60 ± 0.08 |
| Current Work | Instantaneous EVM | 1.28 * | 0.56 * |
| Current Work | Cumulative EVM | 1.32 * | 0.55 * |
* Note: For EVM-obtained reactivity ratio estimates, statistically correct JCRs are presented instead of approximate confidence intervals (derived on a linear hypothesis); see Figure 8.
Table 4.
Summary of RRE results for n-butyl methacrylate (BMA; monomer 1) and n-butyl acrylate (BA; monomer 2) copolymerization.
| Ref. | RRE Technique | r_{1} | r_{2} |
|---|---|---|---|
| [18] | Preliminary Estimates (RREVM) [1] | 2.100 | 0.489 |
| [18] | Estimates from Tidwell-Mortimer Designed Experiments (RREVM) [1,2] | 2.008 | 0.460 |
| Current Work | Instantaneous EVM (preliminary data) | 2.109 | 0.492 |
| Current Work | Instantaneous EVM (preliminary data & designed replicates) | 2.012 | 0.462 |
| Current Work | Cumulative EVM | 2.114 | 0.500 |
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).