1. Introduction

Simulations of vapor-liquid equilibrium (VLE) are widely studied and used in the industry, given the impact that simulations have on the unit operations of purification and separation [^{1}]. The design of these unit operations is not possible without estimating equilibrium, where temperature, pressure, chemical nature, and concentrations are correlated by empirical, semi-empirical and/or phenomenological relationships. The equilibrium between phases is established once there are no observable variations in chemical, thermal, and force potentials between each phase under consideration. The chemical potential equality is normally expressed in terms of fugacities (f̂) and equalities of each component *i* in each phase under consideration [^{2}].

At low pressures, the gas phase and the liquid phase fugacities can be represented in function of fugacity coefficients (*(*). Both terms, *(*
_{
i
}
^{
V
} and *(*
_{i}
^{
L
} (fugacity coefficients in the vapor (*V*) and the liquid (*L*) phase, respectively) can be expressed by using a suitable Equation of States (EOS), such as Peng-Robinson, van der Waals, Soave-Redlich-Kwong (SRK), among others, and the representation of partial properties (Equation 1):

In Equation 1, *Z* is the compressibility factor; the derivative part represents the partial derivative of component *i* keeping the temperature, volume, and the moles of the other components constant.

Each EOS can hold important information about mixture behavior, such as the Peng-Robinson *a*
_{
i
} and *b*
_{
i
} coefficients. When mixtures are modelled, mixing rules, which account for molecular interactions, are needed. Equations 2 and 3 show the modified van der Waals mixing rules, where *a*
_{
m
} and *b*
_{
m
} are the EOS coefficients of mixture and *k*
_{
ij
} is a temperature dependent binary interaction parameter (BIP) [^{2}]. The BIP holds the dissimilar molecular contribution information needed for mixture representations.

Artificial Neural Networks (ANNs) are normally used as a predicting tool, where knowledge of phenomenological representations is not important. Nevertheless, due to the huge quantity of internal parameters that may be used in an ANN, over-fitting and poor generalization problems can be observed [^{3}]. A Hybrid Neural Network, also known as Gray-Box Neural Network Model (GNM), can be considered as an improved ANN, which incorporates phenomenological/empirical equations in its architecture. In fact, the GNM can be thought of as a hierarchical network of sub-networks. The sub-networks are assigned to perform small calculations restricted by some physical-chemical constraints, such as mass and energy balances [^{3}]. The incorporation of restrictions not only improves the general prediction of the modified network; it also improves the extrapolation capabilities.

GNMs have been used for a wide gamut of applications, such as fed-batch bioreactors [^{3}], Williams-Otto reactor [^{4}], maximization in methanol conversions [^{5}], wastewater systems [^{6}], and empirical improvement of equations of state [^{7}], among others.

Regarding VLE, classical ANNs have been applied to determine pressure (*P*) - temperature (*T*) - liquid (*x*
_{
i
} ) or vapor (*y*
_{
i
} ) molar fractions relationships at equilibrium [^{8}-^{12}]. A different approach was proposed by Urata et. al. [^{13}]. In their work, activity coefficients were estimated by separating the calculation in two independent ANNs and using pure component data as ANN input information: the normal boiling point divided by molecular weight, critical density, and dipole moment. Even though the estimations showed low deviations, thermodynamic inconsistencies were observed. Ramírez et al. [^{14}] developed an ANN capable of determining the four-suffix Margules parameters. The ANN training was performed using a group-contribution methodology where the ANN input variables were the number of interactions between different functional groups, the pressure, the temperature, the functional group quantities, the molecular weights, and the mole fractions. The proposed methodology represented the VLE correctly, but their results showed local minimum issues.

In the present work, a methodology that incorporates a suitable EOS (Peng-Robinson Equation of State; PR-EOS) in the formulation of a GNM is presented in order to predict the VLE. Specifically, the aim of this work is to determinate binary interaction parameters by using pure component data and an EOS represented in a neural-network-like structure. Additionally, in order to achieve the main objective, common activation functions were replaced by complex mathematical formulations (including iterative estimations). The last modification can be seen as a methodology to improve the applicability of GNMs. Simple hydrocarbons mixtures were considered for the GNM evaluation. VLE information was used for GNM training, VLE extrapolation, and binary interaction parameters prediction. The GNM predicted binary interaction parameters, extrapolated VLE information (with deviations on average lower than 5%) and, to a certain degree, determinate VLE of mixture not considered in the GNM training (deviations on average lower than a 20%).

2. Materials and methods

In the Present work, Matlab 7.1 was used as simulator. This version was used given its simplicity at the moment of modifying ANN/GNM functions and parameters (activation functions, transfer function, and training process methodologies). The embedded ANN toolbox was used in the programming. For a proper and simpler creation/evaluation of the GNM, VLE data from binary systems (Acetone-Butanol at 99.46 kPa [^{15}], Acetone-Ethanol at 101.33 kPa [^{16}], and Acetone-Methanol at 13.33 kPa [^{17}]) were used. The data was composed of *y*
_{
i
} -*x*
_{
i
} -P-T information and the boiling points, molecular weight, molecular diameter, critical pressure, critical temperature, and acentric factor of each component. Furthermore, since VLE information for the three systems is limited in the literature, interpolation of the *y*
_{
i
} -*x*
_{
i
} -*P*-*T* data was performed (by using the PR-EOS and a lineal correlation temperature dependency of the predicted BIP) in order to obtain 500 total data points useful for training and validation of the GNM.

3. Thermodynamic modeling

As previously specified, the PR-EOS (Equation 6, where *a* and *b* are adjustable parameters) was used in the formulation of the GNM. Different EOS could have been used; nevertheless, the PR-EOS was chosen for its simplicity (parameters determination from critical properties and acentric factor) and improved representation of some liquid properties [^{18}].

As extensively described in the literature, for a pure component, both parameters can be determined by using critical properties and temperature corrections.

By combining equations 1 and 4, the fugacity coefficient for the liquid phase can be calculated (Equation 5) where, *Z*
_{
liq
} is the compressibility factor of the liquid phase. The expression for the vapor phase fugacity coefficient is similar to Equation 5, but representative values of the vapor phase have to be used (i.e. *Z*
_{
vap
} , *V*
_{
vap
} , and *y*
_{
i
} ; [^{2}]).

The previous equation needs the gas or liquid phase compressibility factor, which is obtained from the PR-EOS cubic expression (Equation 6; [^{2}]). In Equation 6 *A* is equal to *a*
_{
m
}
*P /R*
^{
2
}
*T*
^{
2
} and *B* is equal to *b*
_{
m
}
*P/RT*.

4. Gray box neural modeling

The theory behind artificial neural networks will not be thoroughly discussed here; more information can be found in references [^{19}-^{20}]. As mathematically proven by Cybenko and Hornik et al. [^{21}-^{22}], a Feed-Forward-Neural-Network (FFNN) with a single hidden layer can approximate, with a high degree of accuracy, any non-linear function. Even though ANNs can correctly approximate non-linear functions, they are classified as data-based models, which imply that the prediction is based on specific data sets. The use of data sets allows good estimation of variables inside the same data set limits, but extrapolation cannot fully be trusted given the lack of boundary restrictions.

GNMs, on the other hand, have proved to reduce the previous problems by embedding phenomenological equations in a Neural-Network-like architecture. In this architecture, the ANN focuses on phenomenological parameter determination while the phenomenological equation helps in the extrapolation, interpolation and/or boundaries application (recognized as series configuration). Further benefits, such as the reduction of neural internal parameters, better starting points for training, reduction of over-fitting, improved network adaptation, and improvement in generalization have been observed [^{3}].

5. PR-EOS gray box neural model generation

The first step in the GNM creation is the architecture definition. Fig. 1 shows the PR-EOS GNM architecture. As defined in this work, and observed in Fig. 1, different artificial neurons (circles) are interconnected at different levels in order to create a neural architecture specific for the problem formulation. The dashed box in Fig. 1 represents an embedded ANN, which is a classical ANN that possesses one Input Layer, one Hidden Layer (HL), and one Output Layer (OL). The embedded ANN, which possesses parameters that have to be estimated during the training process, has a FFNN architecture. The number of parameters has to be optimized in order to produce a useful output; that means, to define the most suitable number of neurons in the hidden layer to generate the binary interaction parameters (10 neurons in the HL in the present work with Hyperbolic Tangent Sigmoid as transfer function; this number was obtained by finding the minimum number of neurons in the HL that produce an unaffected goodness-of-fit statistics). It must be highlighted that the training process is performed by the back propagation technique; therefore, no BIPs are needed in order to perform the training independently of the embedded ANN estimate output (the BIP). By considering the defined architecture, the VLE information (which is the global output) will be transformed as it is back propagated in the GNM to estimate the BIPs. The input information of the embedded ANN was chosen based on pure component information and the recognition that the embedded ANN outputs are a function of the temperature. The selection of different inputs can be modified easily. In fact, to estimate binary interaction parameters, an optimization process of the input information could be performed in order to determinate the most suitable pure component and/or group contribution data useful to obtain the best non-linear mathematical representation. In the present work, the embedded ANN Input information are the pressure, temperature, boiling points (*BP*
_{
i
} ), molecular weights (*MW*
_{
i
} ), and molecular diameters (*MD*
_{
i
} ) of each component *i* in the mixture.

The embedded ANN output is defined in the architecture by a proper representation of the phenomenological equation in a neural-network-like architecture. In a GNM, different mathematical operations must be used to fully represent the phenomenological equation. Mathematical operations, other than classical ANN operations (such as roots, logarithms, exponentials, among others), can easily be programmed by specifying the mathematical expression in a neuron transfer function or in a neuron activation function.

Following the Fig. 1 architecture, once the embedded ANN calculation is performed (*k*
_{
ij
} ), the information is forwarded to two neurons (*(*
_{i}
^{L} and *(*
_{i}
^{V}) followed by a third neuron (equal fugacity that estimate *y*
_{
i
} ). The most important characteristics of these neurons are:

Each input signal is unweighted (weight equal to 1). The weights are kept constant during the GNM creation by modifying the training algorithms (the values of the weights are restored after the modifications performed by the Levenberg-Marquardt Backpropagation training process).

The identity transfer functions are used in each neuron. Therefore, each neuron output is the evaluation of the activation function, where the bias was set to 0.

The activation functions are replaced by a set of equations rather than one equation. The set of equations allows the estimation of fugacity coefficients (as described from Equation 1 to Equation 6) or the vapor phase mole fraction (obtained by fugacity equality,

*(*_{i}^{L}x_{i}/*(*_{i}^{V}=*y*_{ i }). The inputs of a neuron, in the phenomenological representations, are useful variables for the full estimation of important parameters/variables of the VLE representation.

Both neurons *(*
_{i}
^{L} and *(*
_{i}
^{V} are representation of Equation 5, where the information of *k*
_{
ij
} , critical properties of each component (*P*
_{
Ci
} , *T*
_{
Ci
} ), acentric factor of each component (*(*
_{i}) , and mole fractions (*y*
_{
i
} or *x*
_{
i
} , depending on the phase that is intended to be represented) are used as feed to the neurons. Additionally, in the same neuron the mixing rules are used to estimate *a*
_{
m
} and *b*
_{
m
} , and the PR-EOS in its cubic form, to estimate *Z*
_{
liq
} or *Z*
_{
vap
} . The estimation of both compressibility factors was performed by the software by evaluating the cubic equation roots. For the estimations of fugacity coefficient, the higher root was used for the gas phase, while the lower root was used for the liquid phase. After the estimation of both fugacity coefficients, the VLE is calculated by the neuron *f*
_{
i
}
^{
V
}
*=f*
_{
i
}
^{
L
} which corresponds to the equilibrium representation. Specifically, in this neuron, the inputs are the liquid and vapor fugacity coefficients of component 1 (in a binary system) and the molar fraction of the liquid phase. By using the equilibrium representation, the molar fraction of component 1 in the gas phase is estimated. The *y*
_{
i
} information is used as feed and estimated as output in the GNM. Independent of this recursive architecture, it must be kept in mind that the main objective is to find an embedded ANN able to determinate binary interaction parameters with phenomenological restrictions (EOS).

Once the architecture was specified, the GNM training was performed together with the optimization of the number of neurons in the HL. The training process corresponds to the determination of the ANNs internal parameters (weights and

bias of each neuron) by performing sensitivity analyses on an objective function, which is optimized by different methods. After calculation of the new weights, the fixed GNM weights were restored to their values by using a modified version of the Matlab algorithm (*trainlm*).

As a methodological resume of the GNM creation, Fig. 2 is incorporated. This figure could help to better understand the creation process and to use it in the case that other VLE representations are intended to be used. The dashed line box represents the typical process of the classical ANN parameters selection, which could include selection of neural network type, number of hidden layers, training algorithm selection, activation function, and others [^{24}]. In the present work only the number of neurons in the hidden layer was modified as an improvement parameter.

6. Results

Fig. 3 and Fig. 4 shows the results obtained (estimations of *y*
_{
i
} values, represented in *y*
_{
i
} -*x*
_{
i
} -*T* graphs) by applying the GNM after being trained with the VLE data sets of different Acetone-Alcohol mixtures at different pressures. Fig. 3 shows the full data set used for training (bold curves) and, at the same time, the prediction capabilities of the *y*
_{
i
} values (fixed *x*
_{
i
} ) for conditions not used during training (dashed lines). It can be observed that the tendency of VLE acetone-ethanol mixture was properly estimated by the GNM in the range where the data set information was not included in the training process. Of course, the embedded ANN performs calculations of the binary interaction parameters which were fed to the PR-EOS and the different formulations previously described.

Fig. 4 focuses on the predicted values representation and compares specific experimental data (van Winkle, 1956) with those estimated by the presented methodology. The lines are used as references and do not represent experimental values. As observed in the figure, there is a tendency to obtain higher deviations of the gas phase representation (*y*
_{
i
} ) as the concentration is closer to the equimolar mixture range (i.e. at the higher contribution of the *a*
_{
ij
} parameter). Even though the error is considerable under these conditions, the error was lower than 10% and on average, the deviation was lower than 5%.

One of the main objectives of the present work was to predict, from pure component data, binary interaction parameters. Fig. 5 shows the obtained *k*
_{
ij
} parameters as a function of temperature, where the line is added to describe the effect of temperature on the BIP (slope of 0.0018). As observed in the figure, the binary interaction parameter (which ranges between -0.004 to 0.013) describes a linear functionality with temperature. The values describe a relatively similar interaction between even molecules and uneven molecules (i.e. relatively low *k*
_{
ij
} ).

The prediction capabilities were analyzed by the Index of Agreement (IA; [^{24}]); see Equation 7. In Equation 7 *O*
_{
i
} and *P*
_{
i
} are the observed and predicted output values; *n* is the total number of data; and *p*
_{
i
}
*’=p*
_{
i
}
*- o*
_{
m
} and *o*
_{
i
}
*’=o*
_{
i
}
*- o*
_{
m
} are the differences between the mean value of the observations (*o*
_{
m
} ) and the predicted and observed output values, respectively.

Once the previous equation was applied to the predicted values of known mixtures (Acetone-Ethanol; Dashed Line of Fig. 4) the Index of Agreement found was equal to 0.988, which implies a considerable agreement (higher than 95% of accuracy) for the total estimations performed for the system under consideration.

The prediction capability of the system was further analyzed by evaluating a mixture of components not used in the training data. The mixture Methanol (1) - Ethanol (2) at 101.33 kPa, 348.15 K, x_{1}=0.242 and *y*
_{
1
} =0.326 [^{16}], was compared with the predicted values. As evaluated, an error of approximately 18% was obtained with a predicted *k*
_{
ij
} value of -0.016. This number indicates that there is a lack of the GNM representing capability. Of course, there is no alcohol-alcohol pair used in the training set; therefore a reduction in the prediction capability was expected.

Once comparing the GNM results with other works that have used ANN for predicting VLE, a relatively similar performance of the present GNM was observed. Si-Maussa et al. [^{23}] focused in the VLE prediction of carbon dioxide-esters mixtures and reported Absolute Relative Deviations in the order of 4.95% and 0.19% for pressure and mole fraction estimations, respectively. Similarly, Karimi and Yousefi [^{25}] studied a VLE of four binary refrigerant systems. In their work, medium average errors as high as 5.15% were observed for mole fraction estimations. In both of the previously mentioned manuscripts, comparisons with EOS were performed. In both works it was observed that the ANN performed better than the EOS. In the present manuscript this comparison cannot be performed since the prediction is embedded in the EOS and therefore the errors that the EOS models have intrinsic in their representations are passed to the GNM results. Furthermore, in both mentioned works the focuses have been interpolating data. In the present manuscript extrapolations at other ranges of the system conditions and mixtures were performed. Is in these points that present work makes an important difference since, by predicting binary interaction parameters instead of VLE data the network could be used (if considerable input data is used during training) to not only interpolate within input data mixtures, it could be used to predict non input mixtures behaviors.

7. Conclusion

A methodology to estimate binary interaction parameters was implemented in this work. A GNM was used, in which the PR-EOS was formulated in a neural-network-like architecture in order to limit and improve the prediction capabilities of an embedded ANN. The embedded ANN focuses on estimate binary interaction parameters useful in the representation of VLE. As observed in the results, the generated GNM successfully represented the VLE of mixtures used in the GNM training (deviations on average were lower than 5%) and was able to estimate the temperature dependence of the binary interaction parameters.

Further evaluation of the GNM, with mixtures not used during training, showed a lower VLE prediction (i.e. *k*
_{
ij
} ). This lower prediction capability (deviations in the order of 20%) was explained by a lack of representability of the system (non-alcohol-alcohol pair used during training) and the need to define the most suitable parameters that should be used as input in the embedded ANN in order to improve the prediction capabilities of the GNM.

Additionally, an innovative methodology to incorporate non-linear evaluations in a neural-network-like architecture was developed. As far as the authors are aware, additive and multiplicative evaluations have mainly been used as transfer/input functions. The possibility of incorporating polynomial expressions and other mathematical expressions could improve and extend the use of GNM in more complex phenomenological/empirical representations.

In terms of the Peng-Robinson EOS, the present methodology will be limited by the same restrictions as the Peng-Robinson possesses. Nevertheless, a gamut of possible modifications can be foreseen that could improve the applicability of the presented work. For example, the methodology could be extended by using considerable data sets during the training; using an increased number of parameters as feed to the embedded ANN in order to improve the prediction of BIPs; incorporating group contribution methods in the estimation; considering activity coefficients for the liquid representations; using other EOS; using Gibbs Excess Free Energy representations in neural network-like architecture; and/or incorporating long- and medium-range interactions (ionic in nature), among others. In summary, a powerful and simple tool for the estimation of state variables and VLE information was developed. The authors hope that this developed tool generates a significant impact on decision-making and is well received by industry and academia.