A Machine Learning Approach for Phase-Split Calculations in n-Octane/Water and PASN/Water Systems

: Flash calculations, including phase split and phase classiﬁcation for both n-octane/water blends and parafﬁnic aromatic synthetic naphtha (PASN)/water blends present signiﬁcant computational challenges. Calculations to establish the two-phase and three-phase regions, as well as the transitions between regions, were addressed by a phase classiﬁcation method proposed in a recent contribution involving machine learning (ML). This work focusses on the phase-split calculations, considering (a) the lack of numerical convergence of the traditional calculations and their related numerical issues for water/n-octane and PASN/water systems based on the Rachford–Rice derived surfaces and (b) the successful implementation of an ML approach based on a K-nearest-neighbor (KNN) algorithm, which uses the abundant experimental data obtained in a CREC-VL cell. Abstract: Flash calculations, including phase split and phase classification for both n-octane/water blends and paraffinic aromatic synthetic naphtha (PASN)/water blends present significant computa- tional challenges. Calculations to establish the two-phase and three-phase regions, as well as the tran- sitions between regions, were addressed by a phase classification method proposed in a recent contri- bution (Lopez Zamora et al., 2021) involving machine learning (ML). This work focusses on the phase- split calculations, considering (a) the lack of numerical convergence of the traditional calculations and their related numerical issues for water/n-octane and PASN/water systems based on the Rachford – Rice derived surfaces and (b) the successful implementation of an ML approach based on a K-nearest- neighbor (KNN) algorithm, which uses the abundant experimental data obtained in a CREC-VL cell. liquid equilibrium;


Introduction
Phase equilibrium calculations deal with two main problems [1]: (i) phase stability or the number of phases present, and (ii) phase split or the establishment of the composition and amount of each phase present. For water/hydrocarbon mixtures, conventional approaches to determine phase equilibrium are computationally expensive. Furthermore, the various analyses are penalized by the lack of good initial estimates of equilibrium phase ratios [2,3].
While the number of phases at thermodynamic equilibrium is an unknown condition, as considered in our previous work [4][5][6], experimental data from the CREC-VL cell can be used to determine whether two-phase or three-phase regions are present. This allows for, in combination with ML, an effective "a priori" classification of the number of phases. This also reduces the computational cost and gives better initial estimates of the phase-split calculation once the model is trained.
Traditionally, the designated Rachford-Rice equations [7] can be used to solve the phase-split calculations for a flash unit (Figures 1 and 2). The equations for both two-phase and three-phase calculations are summarized in Equations (1)- (3).
where β corresponds to the vapor mole fraction, z i is the molar fraction of component i, K i stand for the equilibrium constants, and H i is an auxiliary variable.
where corresponds to the vapor mole fraction, zi is the molar fraction of component i, Ki stand for the equilibrium constants, and Hi is an auxiliary variable.
Calculations using the Rachford-Rice equations may involve different numerical approaches as reported in the technical literature, such as successive substitution, quasi-Newton, Newton, steepest-descend, and their modifications and combinations [2].
In the present study, two-phase and three -flash calculations are specifically developed for water/n-octane and PASN/water blends using the Soave-Redlich-Kwong-Kabadi-Danner equation of state (SRKKD EoS). Following this, convergence calculations and numerical issues for water/n-octane and for PASN/water systems are established by using the resulting Rachford-Rice derived surfaces. Once this is accomplished, phase-split computations are compared with those from the ML approach in terms of vapor pressure. The outcome of this highlights the importance of an ML approach for accurate predictions, which is developed quite effectively by using the abundant research data available from the CREC-VL Cell experiments [8].

Soave-Redlich-Kwong-Kabadi-Danner (SRKKD) Equation of State
Traditionally, the Peng-Robinson (PR) equation of state (EoS) is one of the most popular EoSs, for calculating hydrocarbon-based PVT behavior, including vapor pressures [9]. When using simulation software, such as HYSYS V9 or Aspen Plus V9, it is considered one of the most improved thermodynamic models available, given its large binary interaction parameter database.
However, the PR EoS displays limitations when it is considered for water or aqueous hydrocarbon mixtures [10]. In these cases, as suggested by previous research from our CREC-UWO group [8], the PR EoS does not describe well the system under study, and a different EoS must be used. In binary systems, such as n-octane/water mixtures, an activity coefficient model can be used, as we proposed in our previous work [4].
Nevertheless, classical activity coefficient models are limited to low pressures (≤10 bar) with no C7+ species included. In the case of water and heavy hydrocarbons, such as naphtha or bitumen (C7+), as in the present study, a cubic equation of state is strongly Calculations using the Rachford-Rice equations may involve different numerical approaches as reported in the technical literature, such as successive substitution, quasi-Newton, Newton, steepest-descend, and their modifications and combinations [2].
In the present study, two-phase and three -flash calculations are specifically developed for water/n-octane and PASN/water blends using the Soave-Redlich-Kwong-Kabadi-Danner equation of state (SRKKD EoS). Following this, convergence calculations and numerical issues for water/n-octane and for PASN/water systems are established by using the resulting Rachford-Rice derived surfaces. Once this is accomplished, phase-split computations are compared with those from the ML approach in terms of vapor pressure. The outcome of this highlights the importance of an ML approach for accurate predictions, which is developed quite effectively by using the abundant research data available from the CREC-VL Cell experiments [8].

Mathematical Model Soave-Redlich-Kwong-Kabadi-Danner (SRKKD) Equation of State
Traditionally, the Peng-Robinson (PR) equation of state (EoS) is one of the most popular EoSs, for calculating hydrocarbon-based PVT behavior, including vapor pressures [9]. When using simulation software, such as HYSYS V9 or Aspen Plus V9, it is considered one of the most improved thermodynamic models available, given its large binary interaction parameter database.
However, the PR EoS displays limitations when it is considered for water or aqueous hydrocarbon mixtures [10]. In these cases, as suggested by previous research from our CREC-UWO group [8], the PR EoS does not describe well the system under study, and a different EoS must be used. In binary systems, such as n-octane/water mixtures, an activity coefficient model can be used, as we proposed in our previous work [4].
Nevertheless, classical activity coefficient models are limited to low pressures (≤10 bar) with no C 7+ species included. In the case of water and heavy hydrocarbons, such as naphtha or bitumen (C 7+ ), as in the present study, a cubic equation of state is strongly suggested. In this work, the Soave-Redlich-Kwong (SRK) EoS with a Kabadi-Danner [11] modification is used. These authors suggested that the SRK EoS with Kabadi-Danner modification improves the VLLE calculations for water-hydrocarbon systems, particularly under highly diluted hydrocarbons in water, which is of great interest for this research. Given the reported advantages of using the Kabadi-Danner modification with the Soave-Redlich-Kwong EoS (SRKKD EoS) for VLL equilibrium calculations for hydrocarbonwater blends [12,13], the SRKKD EoS was used in the present study. The SRKKD EoS can be defined by Equations (4)-(8) as follows: where A = aP (RT) 2 and B = bP RT . The mixing rules required to determine the SRK EoS parameters are given by Equations (10) and (11), with the Kabadi-Danner modification being reported by Equations (9) and (12)-(14) [11].
With respect to G i , it can be calculated using group contribution methods, which account for the sum of the contributions of the different functional groups included in every hydrocarbon molecule. Values from various functional groups can be obtained from the table published by Kabadi-Danner (1985) [11]. The k ij parameters used for the calculations reported in this work were taken from HYSYS V9 software.
Furthermore, to compute the fugacity coefficient with the SRKKD EoS, Equations (15) and (16) can be used [14] as follows: x j a i a j 1 − k ij (16)

Materials
Distilled water was used with a pH of 7 ± 0.05 and a total organic carbon of less than 0.1 ± 0.02 ppm for all experimental studies. In the case for the alkane compounds, Sigma-Aldrich HPLC assay purity reagents were used. The purity of those hydrocarbons was represented in mole percent as follows: n-hexane >97%, n-heptane >96%, n-octane >99%, n-decane >99%, n-dodecane >99%. The water content of the n-alkanes was 0% for n-octane and n-dodecane, 0.01% for n-hexane and n-decane, and 0.02% for n-heptane. Toluene was obtained from Fisher Scientific with a purity >99% and 0.008% water content.

CREC Vapor-Liquid Equilibrium Cell
The Chemical Reactor Engineering Center (CREC) developed a CREC VL-Cell [15] that allows for the measurement of VLL equilibrium ( Figure 3). This unit uses a "dynamic method". The temperature of the cell increases progressively by using a thermal ramp of 1.2 • C/min. As a result, every run provides plenty of vapor-liquid equilibrium data at 10 Hz every 0.01 s. Additional explanations about the cell operation are reported in [8,15].
0.1 ± 0.02 ppm for all experimental studies. In the case for the alkane compounds, Sigma-Aldrich HPLC assay purity reagents were used. The purity of those hydrocarbons was represented in mole percent as follows: n-hexane >97%, n-heptane >96%, n-octane >99%, n-decane >99%, n-dodecane >99%. The water content of the n-alkanes was 0% for n-octane and n-dodecane, 0.01% for n-hexane and n-decane, and 0.02% for n-heptane. Toluene was obtained from Fisher Scientific with a purity >99% and 0.008% water content.

CREC Vapor-Liquid Equilibrium Cell
The Chemical Reactor Engineering Center (CREC) developed a CREC VL-Cell [15] that allows for the measurement of VLL equilibrium ( Figure 3). This unit uses a "dynamic method". The temperature of the cell increases progressively by using a thermal ramp of 1.2 C/min. As a result, every run provides plenty of vapor-liquid equilibrium data at 10 Hz every 0.01 s. Additional explanations about the cell operation are reported in [8,15].
The CREC VL-Cell uses a marine type of impeller (propeller). The unit propeller helps to ensure a well dispersed and homogeneous blend of non-miscible liquids. This provides a good heat distribution inside the CREC VL-Cell. Furthermore, the special cell design allows one to directly analyze a process sample. It also avoids losses of light volatile components due to sample transfers. The CREC VL-Cell has two thermocouples strategically located inside the CREC VL-Cell and connected to a temperature data acquisition unit. The thermocouples help to measure both the gas-and liquid-phase temperatures within the VL-Cell through a USB desktop computer port. As a result, experimental data can be stored and displayed in real time on a PC using Omega temperature data acquisition software.
In addition to these features, the CREC VL-Cell includes a pressure transducer placed alongside a pressure gauge. The combinations of two separate pressure instruments allows for simultaneous measurement and validation of the saturation vapor pressure. The pressure transducer is logged into a desktop USB port, allowing for observation and registration of the changes of pressure through the Omega software. Thus, one should note The CREC VL-Cell uses a marine type of impeller (propeller). The unit propeller helps to ensure a well dispersed and homogeneous blend of non-miscible liquids. This provides a good heat distribution inside the CREC VL-Cell. Furthermore, the special cell design allows one to directly analyze a process sample. It also avoids losses of light volatile components due to sample transfers.
The CREC VL-Cell has two thermocouples strategically located inside the CREC VL-Cell and connected to a temperature data acquisition unit. The thermocouples help to measure both the gas-and liquid-phase temperatures within the VL-Cell through a USB desktop computer port. As a result, experimental data can be stored and displayed in real time on a PC using Omega temperature data acquisition software.
In addition to these features, the CREC VL-Cell includes a pressure transducer placed alongside a pressure gauge. The combinations of two separate pressure instruments allows for simultaneous measurement and validation of the saturation vapor pressure. The pressure transducer is logged into a desktop USB port, allowing for observation and registration of the changes of pressure through the Omega software. Thus, one should note that the instrumentation implemented into the CREC VL-Cell delivers accurate temperature and pressure data [15]. In addition, the automatization of the CREC VL-Cell permits the collection of large amounts of vapor pressure data per experiment, which constitute a very valuable source of information for VLL equilibrium simulations and modeling.

Traditional Phase-Split Calculations
Traditionally, phase-split calculations are performed by solving Rachford-Rice (RR) equations involving phase equilibrium constants. Rachford-Rice equations are nonlinear functions obtained from the equal chemical potentials combined with species material balances [2].
In the case of the three-phase flash, the main equations (Equations (17)- (21)) are reported below, with the hydrocarbon phase used as a as reference [2,16]. Or In this respect, the root-finding calculation is a complex one, given these equations present discontinuities at their extremes (division by zero) and may have an almost flat slope near their roots [17].
Additionally, in solving the Rachford-Rice equations, the choice of numerical method is influenced by the independent variables that are selected, such as component mole fractions, equilibrium ratios, or the logarithm of equilibrium ratios [18].
When equilibrium constant ratios or logarithms of equilibrium constants are considered, a Newton-Raphson method is typically applied. In the case of mole fractions, either a Newton method or a minimization of Gibbs free energy can be considered [19]. Logarithms of K calculations are usually preferred, given the use of mole fractions may create an ill-defined Jacobian, and the natural logarithm stabilizes the Newton method when K values of different orders of magnitude are involved [18].
Regarding multiphase flash-split calculations with three or more phases, they typically involve an outer loop, where the equilibrium constants are calculated, and an inner loop, where the mass balances (Rachford-Rice equations) are evaluated. The end goal is to determine the phase mole fractions and the compositions for a given set of K values [20]. This inner loop is known as "constant-K" flash [21].
Given the above, a general algorithm to solve multiphase flash calculations is described in Figure 4. It can be observed that within each successive substitution step, the Rachford-Rice equations designated as "constant-K" flash are solved independently.
One should note that the solution of the two-phase constant-K flash calculation is a relatively easy one. However, and in the case where one has to account for a threephase flash, these calculations may become extensive and challenging. This is due to the non-linearity of the objective resulting functions [18].
The "constant-K" flash is discussed in Section 4.2. In this respect, in phase-split calculations, good initial estimates increase the probability of finding the global minimum Gibbs free energy, with an initial guess from stability testing or the previous simulation timestep being an option [2]. To accomplish this, constraints for the initial estimates are usually needed, as suggested previously by Okuno et al. [20] and Leibovici and Neoschil [22]. linearity of the objective resulting functions [18].
The "constant-K" flash is discussed in Section 4.2. In this respect, in phase-split calculations, good initial estimates increase the probability of finding the global minimum Gibbs free energy, with an initial guess from stability testing or the previous simulation timestep being an option [2]. To accomplish this, constraints for the initial estimates are usually needed, as suggested previously by Okuno et al. [20] and Leibovici and Neoschil [22].

Constant-K Flash Solution
The solution of the "constant-K flash" has been studied previously using two different approaches [24], which include (i) minimization techniques and (ii) direct solution of the RR equations. Tranenstein (1985) [19] proposed a constrained minimization of the Gibbs free energy to solve the two-phase problem, whereas Leibovicy and Jean (1995) [22] used a Newton procedure with a relaxation parameter to solve the multiphase problem. Furthermore, Okuno et al. (2010) [20] minimized the non-monotonic convex function using the independent-phase mole fractions. Haugen et al. (2011) [24] used a two-dimensional bisection method to provide good initial guesses for the Newton algorithm in the three-phase case. Li and Firoozabadi (2012) [2] employed stability testing as an initial guess for phase-split calculations with a two-dimensional bisection method for two and three phases. Yan and Stenby (2014) [25] proposed Householder's high-order iteration method together with a method to improve the initial estimate for the two-phase problem.

Constant-K Flash Solution
The solution of the "constant-K flash" has been studied previously using two different approaches [24], which include (i) minimization techniques and (ii) direct solution of the RR equations. Tranenstein (1985) [19] proposed a constrained minimization of the Gibbs free energy to solve the two-phase problem, whereas Leibovicy and Jean (1995) [22] used a Newton procedure with a relaxation parameter to solve the multiphase problem. Furthermore, Okuno et al. (2010) [20] minimized the non-monotonic convex function using the independent-phase mole fractions. Haugen et al. (2011) [24] used a two-dimensional bisection method to provide good initial guesses for the Newton algorithm in the threephase case. Li and Firoozabadi (2012) [2] employed stability testing as an initial guess for phase-split calculations with a two-dimensional bisection method for two and three phases. Yan and Stenby (2014) [25] proposed Householder's high-order iteration method together with a method to improve the initial estimate for the two-phase problem. More recently, Fernandez-Martinez et al. (2020) [17] applied an associated polynomial to obtain all the roots of a two-phase isothermal flash.
One of the most popular approaches to solve the "constant-K" flash problem is to use a Newton-Raphson method to solve Rachford-Rice equations (Equations (17) and (18) obtaining values for β V and β W . In that case, the Newton-Raphson method considered is given by Equations (23)- (27).
The Newton-Raphson method solution can converge to a non-desired root value, with this being a function of the initial guest. It can also lead to numerical divergence, with this being an inherent characteristic of the non-linear equations being solved [18]. As shown by Hinojosa-Gomez et al. (2012) [26], Newton's method fails to converge near the critical point and phase boundaries. Thus, good initial guesses are required for the phase fraction (β) calculations, with poor initial estimates leading to incorrect roots or being unable to find a numerical solution [24,26].
In this respect, the initial guess for β V , β W should be constrained within the proper solution domain. In this sense, when approaching the numerical solution of the constant-K flash, it is advantageous to consider this as an iterative constrained minimization calculation instead of being a root-finding problem [20]: F(β) refers to a convex function, as proven by Okuno et al., 2010 [20], with N linear constrains and with f representing the F gradient, with the condition of having a symmetric Jacobian matrix [20,27]. If this is the case, the F(β) scalar function involves a gradient vector, which represents the RR equations [20].
By integrating f j (Equations (20) and (21)) with respect to β j , one can obtain Equation (31), with the integration constant set to zero: Or, alternatively: When examining the possible mathematical solutions for a multiphase system at equilibrium with the number of phases larger than 2 (N p > 2), one can notice that the range of these solutions is wider than the space of the physical solutions [22]. To address this matter, Leibovicy and Neoschil (1995) [22] proposed that numerical solutions should be limited by hyperplanes defined by: In this respect, it is important to notice that in Equation (31), the region t i > 0 is unbounded if the following applies: (a) the function is monotonic, (b) it does not have a minimum, and (c) there is no solution to the constant-K flash with N p phases [20].
In that sense, Okuno et al. [20] proposed a feasible solution region based on the nonnegativity of phase-component mole fractions in a given phase, Then, from Equation (34) and with positive phase-component mole fractions in phase L, it results: And from Equations (35) and (36): Equations (40), (44), and (47) can be summarized as follows: Okuno et al. [20] proposed a general definition of these thermodynamic parameters The constant-K flash problem is solved by Equation (49).
In that sense, Equation (49) accounts for the "negative flash" case. One should note that when the iterative flash procedure yields β values in the β < 0 or β > 1 ranges, this leads to a "negative flash" [28]. These "negative flashes", although "not physically acceptable" roots, can be preserved for the next calculation step. This is the case, given the anticipated function continuity. It is interesting to mention that Okuno et al.'s resulting algorithm performs better with initial negative roots than when the condition 0 ≤ β ≤ 1 is complied from the very beginning in the first calculation step [28].

Algorithm to Solve the Flash Unit for Water/PASN Mixtures
After addressing the numerical solution of the constant-K flash problem, it is possible to complete the flash calculations as described in Figure 4. In this sense, the steps involved in the flash calculations are as follows: 1.
Input the operating and feed conditions: T, P, z i ; 2.
Provide an initial guess for the K-values; 3.
Calculate objective functions and compare with tolerance: The main problem with the proposed algorithm is that it may be computationally very expensive [1] as a function of the initial guesses chosen, as well as presenting both convergence and accuracy issues.
When applying the Newton-Raphson method (Equations (23)- (27)) using an initial estimate within the set boundaries (as presented in the following section), such as β V sup = 0.4 and β W sup = 0.2, for the water/PASN, employing SRKKD EoS model and Python, the result is β V = 0.10004025 and β W = 0.44494793 root with four iterations only. In this respect, the HYSYS V9 results in this case were β V = 0.1 and β W = 0.4449395, with the difference being much lower than 0.1%. In contrast and as expected, for the water/n-octane system, the calculation reaches the 10,000 maximum number of iterations with the obtained results not having physical meaning: β V = −10.9198 and β W = 2.6645 * 10 −15 .
Given the above, the function F(β) (Equation (49)) for the water/PASN mixture was minimized using different methods within the minimize functions available in the Python Optimize library. Tolerance was set in the 10 −8 range, with the percentage of difference for the calculated β values being lower than 0.3%. Table 1 reports the various methods tested. Best results were obtained using the constrained optimization by linear approximation (COBYLA) method.
However, in spite of this, none of the considered methods led us to meaningful physical solutions for water/n-octane blends, as reported in Table 2. Not even a genetic algorithm was able to numerically solve this case, arriving at a result of β = −20.7 14.2 . Additional explanations of this matter are described in Section 4.4. Table 2. Results for water/n-octane system using different methods for the minimize function of Scipy-Optimize package for Python. Regarding the VLLE for water/PASN, it can be calculated accordingly, as reported in Table 3. In this case, the bubble pressure (β V = 0), which is of interest for this research, can be calculated using both Python and Hysis V9.

Initial Estimate
However, when comparing VLLE results from HYSYS V9 and results from SRKKD EoS implemented with Python, the described Python algorithm for water/PASN works better to calculate pressure results than HYSYS V9, as shown in Figures 5 and 6, with SRKKD EoS implemented with Python achieving better agreement with experimental data from the CRE VL-Cell. However, when comparing VLLE results from HYSYS V9 and results from SRKKD EoS implemented with Python, the described Python algorithm for water/PASN works better to calculate pressure results than HYSYS V9, as shown in Figures 5 and 6, with SRKKD EoS implemented with Python achieving better agreement with experimental data from the CRE VL-Cell.  On the other hand, in the case of water/n-octane blends, the described algorithm presents convergence problems. To describe these issues, it is important to establish how the numerical Rachford-Rice equations (Equations (17)-(19)) influence these types of iterative calculations for both water/n-octane and water/PASN mixtures. To address this matter, the following section evaluates the approach proposed by Li and Firoozabadi [29] and the However, when comparing VLLE results from HYSYS V9 and results from SRKKD EoS implemented with Python, the described Python algorithm for water/PASN works better to calculate pressure results than HYSYS V9, as shown in Figures 5 and 6, with SRKKD EoS implemented with Python achieving better agreement with experimental data from the CRE VL-Cell.  On the other hand, in the case of water/n-octane blends, the described algorithm presents convergence problems. To describe these issues, it is important to establish how the numerical Rachford-Rice equations (Equations (17)-(19)) influence these types of iterative calculations for both water/n-octane and water/PASN mixtures. To address this matter, the following section evaluates the approach proposed by Li and Firoozabadi [29] and the On the other hand, in the case of water/n-octane blends, the described algorithm presents convergence problems. To describe these issues, it is important to establish how the numerical Rachford-Rice equations (Equations (17)- (19)) influence these types of iterative calculations for both water/n-octane and water/PASN mixtures. To address this matter, the following section evaluates the approach proposed by Li and Firoozabadi [29] and the boundaries set by Okuno et al. [20].

Issues with Constant-K Solution Calculations
Li and Firoozabadi [29] reported some examples of how RR y and RR w surfaces (Rachford-Rice surfaces) change, whereas β V and β w are varied, with β V and β w parameters representing the vapor and water fraction, respectively. A display of one of the examples reported by Li and Firoozabadi for a general case [29] is shown in Figure 7 for RR y and RR w intersecting the z = 0 plane. Li and Firoozabadi [29] reported some examples of how and surfaces (Rachford-Rice surfaces) change, whereas and are varied, with and parameters representing the vapor and water fraction, respectively. A display of one of the examples reported by Li and Firoozabadi for a general case [29] is shown in Figure 7 for and intersecting the z = 0 plane. The triangle defined in Figure 7a by the vertices (0,1), (0, 0), and (1, 0) represents the solution domain [29], with the solution at = 0 and = 0 plane shown with a red dot. In this regard, if one attempts to develop a similar Python-based calculation for an octane/water blend "constant-K" flash, one can observe that it is not possible to obtain a converging iterative solution. This is also true for a wide range of octane in water concentrations in the 0.5-99.75 wt.% range.
As a result, to provide a sound explanation of the findings, the following steps were followed: (a) The first step involves the SRKKD EoS model and HYSYS V9 with "constant-K" flash simulations. Equilibrium constants are approximated on this basis and used later for a thorough analysis of Rachford-Rice equations. (b) The second step considers a "constant-K" flash calculation using the equilibrium constant calculated in step 1. This helps to provide a good understanding of how the Rachford-Rice equations perform in such hydrocarbon/water mixtures.

Octane-Water Blends
To illustrate the calculations, a three-phase separator was specified in HYSYS V9 feeding a 100 kgmol/h blend with a 50% mol water/50% mol n-octane mixture for the first step. Working conditions for this three-phase separator were T = 80 °C and a vapor fraction of 0.1. In this case, the presence of air was not considered. Results obtained are reported in Table 4. Table 4. HYSYS V9 results for three-phase flash calculations at T = 80 °C using SRKKD. Figure 7. (a) RR y and RR w surfaces intersecting the z = 0 plane highlighted in grey (Adapted from [29]), (b) The β V and β w lines at z = 0 plane. Note: The "red dot" represents the solution at RR y = 0 and RR w = 0 plane. The triangle defined in Figure 7a by the vertices (0,1), (0, 0), and (1, 0) represents the solution domain [29], with the solution at RR y = 0 and RR w = 0 plane shown with a red dot.

Molar Fraction
In this regard, if one attempts to develop a similar Python-based calculation for an octane/water blend "constant-K" flash, one can observe that it is not possible to obtain a converging iterative solution. This is also true for a wide range of octane in water concentrations in the 0.5-99.75 wt.% range.
As a result, to provide a sound explanation of the findings, the following steps were followed: (a) The first step involves the SRKKD EoS model and HYSYS V9 with "constant-K" flash simulations. Equilibrium constants are approximated on this basis and used later for a thorough analysis of Rachford-Rice equations. (b) The second step considers a "constant-K" flash calculation using the equilibrium constant calculated in step 1. This helps to provide a good understanding of how the Rachford-Rice equations perform in such hydrocarbon/water mixtures.

Octane-Water Blends
To illustrate the calculations, a three-phase separator was specified in HYSYS V9 feeding a 100 kgmol/h blend with a 50% mol water/50% mol n-octane mixture for the first Processes 2022, 10, 710 13 of 26 step. Working conditions for this three-phase separator were T = 80 • C and a vapor fraction of 0.1. In this case, the presence of air was not considered. Results obtained are reported in Table 4. In developing the second calculation step, using the K V i and K W i constants obtained from HYSYS V9 in Python, the RR y and RR w values were in a low-level range, as shown in Figure 8a. The values of β V and β w also changed in a restricted domain. Furthermore, if the β V value was higher than 0.63 or β w was higher than 0.58, the solution for RR y and RR w did not converge. In developing the second calculation step, using the and constants obtained from HYSYS V9 in Python, the RRy and RRw values were in a low-level range, as shown in Figure 8a. The values of and also changed in a restricted domain. Furthermore, if the value was higher than 0.63 or was higher than 0.58, the solution for and did not converge. Additionally, when RRy and RRw were considered at z = 0, the obtained for different values of led to two parallel and z = 0 plane-superimposed RRy and RRw straight lines, as shown in Figure 9. In contrast, an HYSYS V9 solution was obtained, as identified with a red dot in Figure 9. Additionally, when RR y and RR w were considered at z = 0, the obtained β w for different values of β V led to two parallel and z = 0 plane-superimposed RR y and RR w straight lines, as shown in Figure 9. In contrast, an HYSYS V9 solution was obtained, as identified with a red dot in Figure 9. In developing the second calculation step, using the and constants obtained from HYSYS V9 in Python, the RRy and RRw values were in a low-level range, as shown in Figure 8a. The values of and also changed in a restricted domain. Furthermore, if the value was higher than 0.63 or was higher than 0.58, the solution for and did not converge. Additionally, when RRy and RRw were considered at z = 0, the obtained for different values of led to two parallel and z = 0 plane-superimposed RRy and RRw straight lines, as shown in Figure 9. In contrast, an HYSYS V9 solution was obtained, as identified with a red dot in Figure 9. As a result of this and under these conditions, one can understand why the iterative calculations trying to find an intersection of the RR y and RR w lines fail and the "constant-K" solution remains unknown.
Regarding these results, Haugen and Firoozabadi [24] advanced that algorithms of this type solving RR equations can fail when the lines at RR y = 0 and RR w = 0 are parallel in the domain of interest. In this respect, these authors designated these conditions as the result of a "bicritical point" where two of the phases have very similar compositions. They identified three different kinds of "bicritical regions" [24]: In this case, the K values for water/n-octane mixtures are as follows: K V i = 1.092510 2 , 3.369210 −1 and K W i = 1.642610 2 , 1.210510 −6 . As a result and given the conditions considered involving K V i ≈ K W i , they could be considered in partial agreement with case (iv) from the Haugen and Firoozabadi criteria [24].
For different temperatures (in this case, 110 • C) ( Table 5), the behavior of n-octane/water mixtures displays similar calculation challenges, as can be observed in Figures 10 and 11. As a result of this and under these conditions, one can understand why the it calculations trying to find an intersection of the RRy and RRw lines fail and the "co K" solution remains unknown.
Regarding these results, Haugen and Firoozabadi [24] advanced that algorit this type solving RR equations can fail when the lines at RRy = 0 and RRw = 0 are p in the domain of interest. In this respect, these authors designated these conditions result of a "bicritical point" where two of the phases have very similar composition identified three different kinds of "bicritical regions" [24]: (i) ≈ 1, (ii) ≈ 1 and ≈ . In this case, the K values for water/n-octane mixtures are as follows [1.092510 2 , 3.369210 −1 ] and = [1.642610 2 , 1.210510 −6 ]. As a result and giv conditions considered involving ≈ , they could be considered in partial agre with case (iv) from the Haugen and Firoozabadi criteria [24].
For different temperatures (in this case, 110°C) (Table 5), the behavior of n-octa ter mixtures displays similar calculation challenges, as can be observed in Figures 11.   Haugen et al. (2011) [24] described that the non-converging lines problem lea very large number of iterations, with the numerical solution becoming unacceptab pensive.
Thus, in the case of octane/water mixtures, the described shape of the and surfaces make it very challenging for a proposed algorithm, such as the SRKKD EoS with Python algorithm, to converge towards a "constant-K" flash solution.

PASN/Water Blends
In contrast with the "non-convergence" results described in Sections 4.3 and 4. n-octane/water blends, the PASN/water mixtures evaluated with the SRKKD EoS with Python can provide consistent "constant-K" flash-convergent simulations. Thi case when performing flash calculations for a 50%mole water/PASN mixture. Result convergence are presented in Table 6.  Figure 12 reports the calculated RRy and RRw, using Ki V and Ki W from the S model using the Python calculations. In this case, values of and are smalle 0.5, providing converging numerical solutions in all cases. Haugen et al. (2011) [24] described that the non-converging lines problem leads to a very large number of iterations, with the numerical solution becoming unacceptably expensive.
Thus, in the case of octane/water mixtures, the described shape of the RR y and RR w surfaces make it very challenging for a proposed algorithm, such as the SRKKD EoS model with Python algorithm, to converge towards a "constant-K" flash solution.

PASN/Water Blends
In contrast with the "non-convergence" results described in Sections 4.3 and 4.4.1 for n-octane/water blends, the PASN/water mixtures evaluated with the SRKKD EoS model with Python can provide consistent "constant-K" flash-convergent simulations. This is the case when performing flash calculations for a 50%mole water/PASN mixture. Results after convergence are presented in Table 6. Table 6. PASN/water mixture three-phase flash calculations at T = 80 • C and P = 83.16 kPa using SRKKD model, the Rachford-Rice equations and the Python calculations of the present study.  Figure 12 reports the calculated RR y and RR w , using K i V and K i W from the SRKKD model using the Python calculations. In this case, values of β V and β W are smaller than 0.5, providing converging numerical solutions in all cases.

Boundary Conditions
Regarding the boundary conditions, if one considers the water/n-octane mixture with Ki approximated by HYSYS V9 (Figure 11), the boundary conditions proposed by Okuno et al. [20] result in Equations (54) and (55), with the hyperplanes superimposed to the lines for = 0 and = 0, as presented in Figure 14.

Boundary Conditions
Regarding the boundary conditions, if one considers the water/n-octane mixture with Ki approximated by HYSYS V9 (Figure 11), the boundary conditions proposed by Okuno et al. [20] result in Equations (54) and (55), with the hyperplanes superimposed to the lines for = 0 and = 0, as presented in Figure 14.  Figure 13. Lines for RR y = 0 and RR w = 0. Note: red point represents Python solution, the blue line is related to RR w = 0, and the green line is related to RR y = 0.

Boundary Conditions
Regarding the boundary conditions, if one considers the water/n-octane mixture with K i approximated by HYSYS V9 (Figure 11), the boundary conditions proposed by Okuno et al. [20] result in Equations (54) and (55), with the hyperplanes superimposed to the lines for RR y = 0 and RR w = 0, as presented in Figure 14. Furthermore, in applying the boundary conditions proposed by Okuno et al. [20] for the water/PASN blend, the hyperplanes related to Equations (56)-(62) can be represented as in Figure 15. Furthermore, in applying the boundary conditions proposed by Okuno et al. [20] for the water/PASN blend, the hyperplanes related to Equations (56)-(62) can be represented as in Figure 15.

Remarks
On the basis of results from "constant-K" flash calculations for n-octane/water and PASN/water blends, one can conclude that the composition of hydrocarbon/water blends is a key factor in allowing for a viable numerical calculation using the Rachford-Rice equations. Thus, to address possible calculation uncertainty and ambiguity resulting for octane/water blends, an alternate methodology to calculate the molar fractions and mixture pressure is proposed in the upcoming sections.

Remarks
On the basis of results from "constant-K" flash calculations for n-octane/water and PASN/water blends, one can conclude that the composition of hydrocarbon/water blends is a key factor in allowing for a viable numerical calculation using the Rachford-Rice equations. Thus, to address possible calculation uncertainty and ambiguity resulting for octane/water blends, an alternate methodology to calculate the molar fractions and mixture pressure is proposed in the upcoming sections.

Liquid-Phase K-Values from Experimental Data for Water/n-Octane Mixtures
As discussed in the previous sections, the initial guess for K values affects the convergence of the flash calculation algorithm (Figure 4). Usually, Wilson correlation [30] (Equation (63)) is used as a first approximation for the K value in the hydrocarbon phase. For the K values in the water phase, Connolly et al. [1] suggested values based on the initial feed, as presented in Equation (64).
In this sense, another advantage of the CREC VL-Cell developed by CREC researchers is that it allows one to determine the solubility values based on the VLLE data obtained. The transition from the three-phase and two-phase regions defines the solubility limit. Figure 16 presents the solubility limit regions for a water/n-octane mixture at 110 • C. This region is characterized by the transition from three phases to two phases. The applicability of the CREC-VL-Cell for establishing solubility and solubility limits is not restricted to any type of hydrocarbon/water blend. Thus, the CREC-VL-Cell can be of special value in dealing with hydrocarbon/water blends involving intrinsic convergence uncertainties observed by octane-water mixtures while using the Rachford-Rice equations (refer to Section 4.4). Figure 17 shows that the solubility limit for an octane/water blend can be calculated from the intersection between the lines that define the three-phase and two-phase domain. The calculated solubilities at 110 °C are = 0.01810534 ± 2.5 × 10 −3 and = 0.00084875 ± 2.6 × 10 −4 , considering the 95% confidence interval defined by the blue region. Although the solubility of water in n-octane is in close agreement with = 0.016 calculated, the value using HYSYS V9 with SRKKD EoS and Rachford-Rice equations yields a solubility for n-octane in water two orders of magnitude higher compared with −6 The applicability of the CREC-VL-Cell for establishing solubility and solubility limits is not restricted to any type of hydrocarbon/water blend. Thus, the CREC-VL-Cell can be of special value in dealing with hydrocarbon/water blends involving intrinsic convergence uncertainties observed by octane-water mixtures while using the Rachford-Rice equations (refer to Section 4.4). Figure 17 shows that the solubility limit for an octane/water blend can be calculated from the intersection between the lines that define the three-phase and two-phase domain. The calculated solubilities at 110 • C are x L w = 0.01810534 ± 2.5 × 10 −3 and x W oct = 0.00084875 ± 2.6 × 10 −4 , considering the 95% confidence interval defined by the blue region. Although the solubility of water in n-octane is in close agreement with x L w = 0.016 calculated, the value using HYSYS V9 with SRKKD EoS and Rachford-Rice equations yields a solubility for n-octane in water two orders of magnitude higher compared with the x W oct = 2.06 × 10 −6 Hysis V9-predicted value.
is not restricted to any type of hydrocarbon/water blend. Thus, the CREC-VL-Cell can be of special value in dealing with hydrocarbon/water blends involving intrinsic convergence uncertainties observed by octane-water mixtures while using the Rachford-Rice equations (refer to Section 4.4). Figure 17 shows that the solubility limit for an octane/water blend can be calculated from the intersection between the lines that define the three-phase and two-phase domain. The calculated solubilities at 110 °C are = 0.01810534 ± 2.5 × 10 −3 and = 0.00084875 ± 2.6 × 10 −4 , considering the 95% confidence interval defined by the blue region. Although the solubility of water in n-octane is in close agreement with = 0.016 calculated, the value using HYSYS V9 with SRKKD EoS and Rachford-Rice equations yields a solubility for n-octane in water two orders of magnitude higher compared with the = 2.06 × 10 −6 Hysis V9-predicted value.
Assuming the vapor phase behaves as an ideal gas, then φ V i ≈ 1, and the obtained results are y w ≈ 0.7347 ± 2.38 × 10 −2 and K V i = 40.5772 0.2702 . The mean value obtained has a 16.04% difference from the value predicted by HYSYS V9 (y w ≈ 0.6799). However, the HYSYS V9 value is within the range of the 95% confidence interval. Figures 18 and 19 report the observed solubilities of water in n-octane and n-octane in water, respectively, as determined in the CREC VL-Cell and compared with the values reported by Maczynski et al. (2004) [31] for both water in n-octane and n-octane in water.
Furthermore, on the basis of the experimental data obtained in the CREC-VL-Cell, a correlation to obtain the K values for water/octane mixtures in the temperature range of interest and low pressure (1-3 atm) is proposed. This correlation is given by the following Equation (65), with the constants involved reported in Table 7. A comparison with calculated values from experimental results is also reported in Figure 20, showing the adequate fitting of the experimental values by this correlation.
The mean value obtained has a 16.04% difference from the value predicted by HYSYS V9 ( ≈ 0.6799). However, the HYSYS V9 value is within the range of the 95% confidence interval. Figures 18 and 19 report the observed solubilities of water in n-octane and noctane in water, respectively, as determined in the CREC VL-Cell and compared with the values reported by Maczynski et al. (2004) [31] for both water in n-octane and n-octane in water.  Furthermore, on the basis of the experimental data obtained in the CREC-VL-Cell, a correlation to obtain the K values for water/octane mixtures in the temperature range of interest and low pressure (1-3 atm) is proposed. This correlation is given by the following Equation (65), with the constants involved reported in Table 7. A comparison with calculated values from experimental results is also reported in Figure 20, showing the adequate fitting of the experimental values by this correlation.

ln
= ln + (65)  Figure 19. Solubility of n-octane in water in the temperature range of interest. Note: blue bands represent the 95% confidence interval.  Figure 20. K values calculated from experimental data in the CREC VL-Cell for water/n-octane mixtures: (a) K value for water in W phase, (b) K value for octane in W phase, (c) K value for water in V phase, (d) K value for octane in V phase.
Given the observed differences with the experimental points reported in Figure 6 and obtained in the CREC-VL-Cell, it is important to provide a more precise methodology to calculate the pressures at VLLE, accounting for the uncertainties related to the experimental values. This will be discussed in the following section.

Machine Learning Approach
As shown in the previous sections of this work (Sections 4.1-4.5), traditional thermodynamic models for multiphase systems based on Rachford-Rice equations overpredict vapor pressure in cases unable to provide converging and meaningful solutions.
Thus, to implement machine learning, linear regression, KNN, SVM, and decision tree regressor (DTR) are considered. On this basis, the prediction errors, coefficients of determination (R2), mean squared errors (MSE), and mean absolute errors (MAE) are established. To accomplish this for each experimental condition, the following is considered: (a) the predicted number of phases together with the feed molar fraction and temperature as input parameters [4,6] and (b) the system pressure as target variable. Additional details of the classification methodology can be found in a recent publication of our research group [4].
In the case of KNN, SVM, and DTR, the parameters were tuned using a grid search with cross validation (GridSearchCV). This method allows for determination of the best set of parameters in each model based on the coefficient of determination (R 2 ). The training set was selected by randomly splitting the sample into 80% for training and 20% testing (train_test_split function). Results for the grid search are summarized in Table 8, showing the best parameters for each method calculated by the GridSearchCV function by comparing R 2 .
Given the observed differences with the experimental points reported in Figure 6 and obtained in the CREC-VL-Cell, it is important to provide a more precise methodology to calculate the pressures at VLLE, accounting for the uncertainties related to the experimental values. This will be discussed in the following section.

Machine Learning Approach
As shown in the previous sections of this work (Section 4.1-4.5), traditional thermodynamic models for multiphase systems based on Rachford-Rice equations overpredict vapor pressure in cases unable to provide converging and meaningful solutions.
Thus, to implement machine learning, linear regression, KNN, SVM, and decision tree regressor (DTR) are considered. On this basis, the prediction errors, coefficients of determination (R2), mean squared errors (MSE), and mean absolute errors (MAE) are established. To accomplish this for each experimental condition, the following is considered: (a) the predicted number of phases together with the feed molar fraction and temperature as input parameters [4,6] and (b) the system pressure as target variable. Additional details of the classification methodology can be found in a recent publication of our research group [4].
In the case of KNN, SVM, and DTR, the parameters were tuned using a grid search with cross validation (GridSearchCV). This method allows for determination of the best set of parameters in each model based on the coefficient of determination (R 2 ). The training set was selected by randomly splitting the sample into 80% for training and 20% testing (train_test_split function). Results for the grid search are summarized in Table 8, showing the best parameters for each method calculated by the GridSearchCV function by comparing R 2 .  Table 9 reports the coefficient of determination (R 2 ), mean squared error (MSE), and mean absolute error (MAE) for the selected models (best score) for octane/water using the abundant testing dataset obtained in the CREC VL cell. Figure 21 describes a comparison for the pressures measured and the predicted values, showing that the KNN model provides the best approximation, with an R 2 value of 0.99 and MSE values as low as 43.78 and 4.91 parameters. The KNN method is selected as the best model to predict the pressure of water/n-octane mixtures in the range of interest.   The best ML model to describe the behavior of pressure for n-octane/water blends is KNN, with this model overcoming the issues of the traditional thermodynamic models involving the Rachford-Rice equations, significantly reducing the uncertainty of any theoretically thermodynamically based algorithm. The best ML model to describe the behavior of pressure for n-octane/water blends is KNN, with this model overcoming the issues of the traditional thermodynamic models involving the Rachford-Rice equations, significantly reducing the uncertainty of any theoretically thermodynamically based algorithm.
In the same way, if a pseudo-binary mixture is assumed for PASN/water mixtures, a KNN model based on the experimental results can also be applied with an R 2 = 0.9933 MSE = 34.24 and MAE = 4.41, as presented in Figure 22. The pseudo-component for the PASN is defined using the hydrocarbon blends reported in Table 6. In the same way, if a pseudo-binary mixture is assumed for PASN/water mixtures, a KNN model based on the experimental results can also be applied with an R 2 = 0.9933 MSE = 34.24 and MAE = 4.41, as presented in Figure 22. The pseudo-component for the PASN is defined using the hydrocarbon blends reported in Table 6. Although the ML model developed in the present study has the ability to predict vapor pressure efficiently, it requires a significant number of experimental data points allowing for both the training of the proposed model, as well as its validation. In this respect, the availability of laboratory equipment, such as the CREC VL-Cell, providing abundant phase equilibrium data, is critical for both ML model development and valida-