Hide code cell content
###############################################################################
# The Institute for the Design of Advanced Energy Systems Integrated Platform
# Framework (IDAES IP) was produced under the DOE Institute for the
# Design of Advanced Energy Systems (IDAES).
#
# Copyright (c) 2018-2023 by the software owners: The Regents of the
# University of California, through Lawrence Berkeley National Laboratory,
# National Technology & Engineering Solutions of Sandia, LLC, Carnegie Mellon
# University, West Virginia University Research Corporation, et al.
# All rights reserved.  Please see the files COPYRIGHT.md and LICENSE.md
# for full copyright and license information.
###############################################################################

Parameter Estimation Using the NRTL State Block#

Author: Jaffer Ghouse
Maintainer: Andrew Lee
Updated: 2023-06-01

In this module, we use Pyomo’s parmest tool in conjunction with IDAES models for parameter estimation. We demonstrate these tools by estimating the parameters associated with the NRTL property model for a benzene-toluene mixture. The NRTL model has 2 sets of parameters: the non-randomness parameter (alpha_ij) and the binary interaction parameter (tau_ij), where i and j are the pure component species. In this example, we only estimate the binary interaction parameter (tau_ij) for a given dataset. When estimating parameters associated with the property package, IDAES provides the flexibility of doing the parameter estimation by just using the state block or by using a unit model with a specified property package. This module will demonstrate parameter estimation by using only the state block.

We will complete the following tasks:

  • Set up a method to return an initialized model

  • Set up the parameter estimation problem using parmest

  • Analyze the results

  • Demonstrate advanced features using parmest

Setting up an initialized model#

We need to provide a method that returns an initialized model to the parmest tool in Pyomo.

Inline Exercise: Using what you have learned from previous modules, fill in the missing code below to return an initialized IDAES model.
def NRTL_model(data):

    # Todo: Create a ConcreteModel object
    m = ConcreteModel()

    # Todo: Create FlowsheetBlock object
    m.fs = FlowsheetBlock(dynamic=False)

    # Todo: Create a properties parameter object with the following options:
    # "valid_phase": ('Liq', 'Vap')
    # "activity_coeff_model": 'NRTL'
    m.fs.properties = BTXParameterBlock(
        valid_phase=("Liq", "Vap"), activity_coeff_model="NRTL"
    )
    m.fs.state_block = m.fs.properties.build_state_block(defined_state=True)

    # Fix the state variables on the state block
    # hint: state variables exist on the state block i.e. on m.fs.state_block

    m.fs.state_block.flow_mol.fix(1)
    m.fs.state_block.temperature.fix(368)
    m.fs.state_block.pressure.fix(101325)
    m.fs.state_block.mole_frac_comp["benzene"].fix(0.5)
    m.fs.state_block.mole_frac_comp["toluene"].fix(0.5)

    # Fix NRTL specific parameters.

    # non-randomness parameter - alpha_ij (set at 0.3, 0 if i=j)
    m.fs.properties.alpha["benzene", "benzene"].fix(0)
    m.fs.properties.alpha["benzene", "toluene"].fix(0.3)
    m.fs.properties.alpha["toluene", "toluene"].fix(0)
    m.fs.properties.alpha["toluene", "benzene"].fix(0.3)

    # binary interaction parameter - tau_ij (0 if i=j, else to be estimated later but fixing to initialize)
    m.fs.properties.tau["benzene", "benzene"].fix(0)
    m.fs.properties.tau["benzene", "toluene"].fix(-0.9)
    m.fs.properties.tau["toluene", "toluene"].fix(0)
    m.fs.properties.tau["toluene", "benzene"].fix(1.4)

    # Initialize the flash unit
    m.fs.state_block.initialize(outlvl=idaeslog.INFO_LOW)

    # Fix at actual temperature
    m.fs.state_block.temperature.fix(float(data["temperature"]))

    # Set bounds on variables to be estimated
    m.fs.properties.tau["benzene", "toluene"].setlb(-5)
    m.fs.properties.tau["benzene", "toluene"].setub(5)

    m.fs.properties.tau["toluene", "benzene"].setlb(-5)
    m.fs.properties.tau["toluene", "benzene"].setub(5)

    # Return initialized flash model
    return m

Parameter estimation using parmest#

In addition to providing a method to return an initialized model, the parmest tool needs the following:

  • List of variable names to be estimated

  • Dataset

  • Expression to compute the sum of squared errors

In this example, we only estimate the binary interaction parameter (tau_ij). Given that this variable is usually indexed as tau_ij = Var(component_list, component_list), there are 2*2=4 degrees of freedom. However, when i=j, the binary interaction parameter is 0. Therefore, in this problem, we estimate the binary interaction parameter for the following variables only:

  • fs.properties.tau[‘benzene’, ‘toluene’]

  • fs.properties.tau[‘toluene’, ‘benzene’]

Inline Exercise: Create a list called `variable_name` with the above-mentioned variables declared as strings.
# Todo: Create a list of vars to estimate
variable_name = [
    "fs.properties.tau['benzene', 'toluene']",
    "fs.properties.tau['toluene', 'benzene']",
]

Pyomo’s parmest tool supports the following data formats:

  • pandas dataframe

  • list of dictionaries

  • list of json file names.

Please see the documentation for more details.

For this example, we load data from the csv file BT_NRTL_dataset.csv. The dataset consists of fifty data points which provide the mole fraction of benzene in the vapor and liquid phase as a function of temperature.

# Load data from csv
data = pd.read_csv("BT_NRTL_dataset.csv")

# Display the dataset
display(data)
temperature liq_benzene vap_benzene
0 365.500000 0.480953 0.692110
1 365.617647 0.462444 0.667699
2 365.735294 0.477984 0.692441
3 365.852941 0.440547 0.640336
4 365.970588 0.427421 0.623328
5 366.088235 0.442725 0.647796
6 366.205882 0.434374 0.637691
7 366.323529 0.444642 0.654933
8 366.441176 0.427132 0.631229
9 366.558824 0.446301 0.661743
10 366.676471 0.438004 0.651591
11 366.794118 0.425320 0.634814
12 366.911765 0.439435 0.658047
13 367.029412 0.435655 0.654539
14 367.147059 0.401350 0.604987
15 367.264706 0.397862 0.601703
16 367.382353 0.415821 0.630930
17 367.500000 0.420667 0.640380
18 367.617647 0.391683 0.598214
19 367.735294 0.404903 0.620432
20 367.852941 0.409563 0.629626
21 367.970588 0.389488 0.600722
22 368.000000 0.396789 0.612483
23 368.088235 0.398162 0.616106
24 368.205882 0.362340 0.562505
25 368.323529 0.386958 0.602680
26 368.441176 0.363643 0.568210
27 368.558824 0.368118 0.577072
28 368.676471 0.384098 0.604078
29 368.794118 0.353605 0.557925
30 368.911765 0.346474 0.548445
31 369.029412 0.350741 0.556996
32 369.147059 0.362347 0.577286
33 369.264706 0.362578 0.579519
34 369.382353 0.340765 0.546411
35 369.500000 0.337462 0.542857
36 369.617647 0.355729 0.574083
37 369.735294 0.348679 0.564513
38 369.852941 0.338187 0.549284
39 369.970588 0.324360 0.528514
40 370.088235 0.310753 0.507964
41 370.205882 0.311037 0.510055
42 370.323529 0.311263 0.512055
43 370.441176 0.308081 0.508437
44 370.558824 0.308224 0.510293
45 370.676471 0.318148 0.528399
46 370.794118 0.308334 0.513728
47 370.911765 0.317937 0.531410
48 371.029412 0.289149 0.484824
49 371.147059 0.298637 0.502318

We need to provide a method to return an expression to compute the sum of squared errors that will be used as the objective in solving the parameter estimation problem. For this problem, the error will be computed for the mole fraction of benzene in the vapor and liquid phase between the model prediction and data.

Inline Exercise: Complete the following cell by adding an expression to compute the sum of square errors.
# Create method to return an expression that computes the sum of squared error
def SSE(m, data):
    # Todo: Add expression for computing the sum of squared errors in mole fraction of benzene in the liquid
    # and vapor phase. For example, the squared error for the vapor phase is:
    # (float(data["vap_benzene"]) - m.fs.state_block.mole_frac_phase_comp["Vap", "benzene"])**2
    expr = (
        float(data["vap_benzene"])
        - m.fs.state_block.mole_frac_phase_comp["Vap", "benzene"]
    ) ** 2 + (
        float(data["liq_benzene"])
        - m.fs.state_block.mole_frac_phase_comp["Liq", "benzene"]
    ) ** 2
    return expr * 1e4
Note: Notice that we have scaled the expression up by a factor of 10000 as the SSE computed here will be an extremely small number given that we are using the difference in mole fraction in our expression. This will help in using a well-scaled objective to improve solve robustness when using IPOPT.

We are now ready to set up the parameter estimation problem. We will create a parameter estimation object called pest. As shown below, we pass the method that returns an initialized model, data, variable_name, and the SSE expression to the Estimator method. tee=True will print the solver output after solving the parameter estimation problem.

import logging

idaeslog.getIdaesLogger("core.property_meta").setLevel(logging.ERROR)
# Initialize a parameter estimation object
pest = parmest.Estimator(NRTL_model, data, variable_name, SSE, tee=True)

# Run parameter estimation using all data
obj_value, parameters = pest.theta_est()
WARNING: DEPRECATED: You're using the deprecated parmest interface
(model_function, data, theta_names). This interface will be removed in a
future release, please update to the new parmest interface using experiment
lists.  (deprecated in 6.7.2) (called from
/home/docs/.asdf/installs/python/3.8.18/lib/python3.8/functools.py:912)
WARNING: Params with units must be mutable.  Converting Param
'fs.properties.pressure_critical' to mutable.
WARNING: Params with units must be mutable.  Converting Param
'fs.properties.temperature_critical' to mutable.
WARNING: Params with units must be mutable.  Converting Param
'fs.properties.mw_comp' to mutable.
WARNING: Params with units must be mutable.  Converting Param
'fs.properties.dh_form' to mutable.
WARNING: Params with units must be mutable.  Converting Param
'fs.properties.ds_form' to mutable.
---------------------------------------------------------------------------
ApplicationError                          Traceback (most recent call last)
Cell In[9], line 8
      5 pest = parmest.Estimator(NRTL_model, data, variable_name, SSE, tee=True)
      7 # Run parameter estimation using all data
----> 8 obj_value, parameters = pest.theta_est()

File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/pyomo/contrib/parmest/parmest.py:888, in Estimator.theta_est(self, solver, return_values, calc_cov, cov_n)
    886 # check if we are using deprecated parmest
    887 if self.pest_deprecated is not None:
--> 888     return self.pest_deprecated.theta_est(
    889         solver=solver,
    890         return_values=return_values,
    891         calc_cov=calc_cov,
    892         cov_n=cov_n,
    893     )
    895 assert isinstance(solver, str)
    896 assert isinstance(return_values, list)

File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/pyomo/contrib/parmest/parmest.py:2042, in _DeprecatedEstimator.theta_est(self, solver, return_values, calc_cov, cov_n)
   2035     assert isinstance(
   2036         cov_n, int
   2037     ), "The number of datapoints that are used in the objective function is required to calculate the covariance matrix"
   2038     assert cov_n > len(
   2039         self._return_theta_names()
   2040     ), "The number of datapoints must be greater than the number of parameters to estimate"
-> 2042 return self._Q_opt(
   2043     solver=solver,
   2044     return_values=return_values,
   2045     bootlist=None,
   2046     calc_cov=calc_cov,
   2047     cov_n=cov_n,
   2048 )

File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/pyomo/contrib/parmest/parmest.py:1645, in _DeprecatedEstimator._Q_opt(self, ThetaVals, solver, return_values, bootlist, calc_cov, cov_n)
   1637     ef = sputils.create_EF(
   1638         scen_names,
   1639         _experiment_instance_creation_callback,
   (...)
   1642         scenario_creator_kwargs=scenario_creator_options,
   1643     )
   1644 else:
-> 1645     ef = local_ef.create_EF(
   1646         scen_names,
   1647         _experiment_instance_creation_callback,
   1648         EF_name="_Q_opt",
   1649         suppress_warnings=True,
   1650         scenario_creator_kwargs=scenario_creator_options,
   1651     )
   1652 self.ef_instance = ef
   1654 # Solve the extensive form with ipopt

File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/pyomo/contrib/parmest/utils/create_ef.py:111, in create_EF(scenario_names, scenario_creator, scenario_creator_kwargs, EF_name, suppress_warnings, nonant_for_fixed_vars)
    109 if scenario_creator_kwargs is None:
    110     scenario_creator_kwargs = dict()
--> 111 scen_dict = {
    112     name: scenario_creator(name, **scenario_creator_kwargs)
    113     for name in scenario_names
    114 }
    116 if len(scen_dict) == 0:
    117     raise RuntimeError("create_EF() received empty scenario list")

File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/pyomo/contrib/parmest/utils/create_ef.py:112, in <dictcomp>(.0)
    109 if scenario_creator_kwargs is None:
    110     scenario_creator_kwargs = dict()
    111 scen_dict = {
--> 112     name: scenario_creator(name, **scenario_creator_kwargs)
    113     for name in scenario_names
    114 }
    116 if len(scen_dict) == 0:
    117     raise RuntimeError("create_EF() received empty scenario list")

File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/pyomo/contrib/parmest/parmest.py:165, in _experiment_instance_creation_callback(scenario_name, node_names, cb_data)
    163 # at least three signatures are supported. The first is preferred
    164 try:
--> 165     instance = callback(experiment_number=exp_num, cb_data=cb_data)
    166 except TypeError:
    167     raise RuntimeError(
    168         "Only one callback signature is supported: "
    169         "callback(experiment_number, cb_data) "
    170     )

File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/pyomo/contrib/parmest/parmest.py:1596, in _DeprecatedEstimator._instance_creation_callback(self, experiment_number, cb_data)
   1594 else:
   1595     raise RuntimeError(f'Unexpected data format for cb_data={cb_data}')
-> 1596 model = self._create_parmest_model(exp_data)
   1598 return model

File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/pyomo/contrib/parmest/parmest.py:1523, in _DeprecatedEstimator._create_parmest_model(self, data)
   1519 def _create_parmest_model(self, data):
   1520     """
   1521     Modify the Pyomo model for parameter estimation
   1522     """
-> 1523     model = self.model_function(data)
   1525     if (len(self.theta_names) == 1) and (
   1526         self.theta_names[0] == 'parmest_dummy_var'
   1527     ):
   1528         model.parmest_dummy_var = pyo.Var(initialize=1.0)

Cell In[5], line 41, in NRTL_model(data)
     38 m.fs.properties.tau["toluene", "benzene"].fix(1.4)
     40 # Initialize the flash unit
---> 41 m.fs.state_block.initialize(outlvl=idaeslog.INFO_LOW)
     43 # Fix at actual temperature
     44 m.fs.state_block.temperature.fix(float(data["temperature"]))

File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/idaes/models/properties/activity_coeff_models/activity_coeff_prop_pack.py:614, in _ActivityCoeffStateBlock.initialize(blk, state_args, hold_state, state_vars_fixed, outlvl, solver, optarg)
    607 if (
    608     (k.config.has_phase_equilibrium)
    609     or (k.config.parameters.config.valid_phase == ("Liq", "Vap"))
    610     or (k.config.parameters.config.valid_phase == ("Vap", "Liq"))
    611 ):
    613     with idaeslog.solver_log(solve_log, idaeslog.DEBUG) as slc:
--> 614         res = solve_indexed_blocks(opt, [blk], tee=slc.tee)
    616 else:
    617     res = "skipped"

File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/idaes/core/util/initialization.py:285, in solve_indexed_blocks(solver, blocks, **kwds)
    278     tmp._ctypes[Block] = [  # pylint: disable=protected-access
    279         0,
    280         nBlocks - 1,
    281         nBlocks,
    282     ]
    284     # Solve temporary Block
--> 285     results = solver.solve(tmp, **kwds)
    287 finally:
    288     # Clean up temporary Block contents so they are not removed when Block
    289     # is garbage collected.
    290     tmp._decl = {}  # pylint: disable=protected-access

File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/pyomo/opt/base/solvers.py:534, in OptSolver.solve(self, *args, **kwds)
    531 def solve(self, *args, **kwds):
    532     """Solve the problem"""
--> 534     self.available(exception_flag=True)
    535     #
    536     # If the inputs are models, then validate that they have been
    537     # constructed! Collect suffix names to try and import from solution.
    538     #
    539     from pyomo.core.base.block import BlockData

File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/pyomo/opt/solver/shellcmd.py:140, in SystemCallSolver.available(self, exception_flag)
    138     if exception_flag:
    139         msg = "No executable found for solver '%s'"
--> 140         raise ApplicationError(msg % self.name)
    141     return False
    142 return True

ApplicationError: No executable found for solver 'ipopt'

You will notice that the resulting parameter estimation problem will have 1102 variables and 1100 constraints. Let us display the results by running the next cell.

print("The SSE at the optimal solution is %0.6f" % (obj_value * 1e-4))
print()
print("The values for the parameters are as follows:")
for k, v in parameters.items():
    print(k, "=", v)
The SSE at the optimal solution is 0.000507

The values for the parameters are as follows:
fs.properties.tau[benzene,toluene] = -0.8987624036283798
fs.properties.tau[toluene,benzene] = 1.4104861099366137

Using the data that was provided, we have estimated the binary interaction parameters in the NRTL model for a benzene-toluene mixture. Although the dataset that was provided was temperature dependent, in this example we have estimated a single value that fits best for all temperatures.

Advanced options for parmest: bootstrapping#

Pyomo’s parmest tool allows for bootstrapping where the parameter estimation is repeated over n samples with resampling from the original data set. Parameter estimation with bootstrap resampling can be used to identify confidence regions around each parameter estimate. This analysis can be slow given the increased number of model instances that need to be solved. Please refer to https://pyomo.readthedocs.io/en/stable/contributed_packages/parmest/driver.html for more details.

For the example above, the bootstrapping can be run by uncommenting the code in the following cell:

# Run parameter estimation using bootstrap resample of the data (10 samples),
# plot results along with confidence regions

# Uncomment the following code:
# bootstrap_theta = pest.theta_est_bootstrap(4)
# display(bootstrap_theta)