Show code cell content
###############################################################################
# The Institute for the Design of Advanced Energy Systems Integrated Platform
# Framework (IDAES IP) was produced under the DOE Institute for the
# Design of Advanced Energy Systems (IDAES).
#
# Copyright (c) 2018-2023 by the software owners: The Regents of the
# University of California, through Lawrence Berkeley National Laboratory,
# National Technology & Engineering Solutions of Sandia, LLC, Carnegie Mellon
# University, West Virginia University Research Corporation, et al.
# All rights reserved. Please see the files COPYRIGHT.md and LICENSE.md
# for full copyright and license information.
###############################################################################
Parameter Estimation Using Flash Unit Model#
Author: Jaffer Ghouse
Maintainer: Andrew Lee
Updated: 2023-06-01
In this module, we will be using Pyomo’s parmest
tool in conjunction with IDAES models for parameter estimation. We demonstrate these tools by estimating the parameters associated with the NRTL property model for a benzene-toluene mixture. The NRTL model has 2 sets of parameters: the non-randomness parameter (alpha_ij
) and the binary interaction parameter (tau_ij
), where i
and j
is the pure component species. In this example, we will be only estimate the binary interaction parameter (tau_ij
) for a given dataset. When estimating parameters associated with the property package, IDAES provides the flexibility of doing the parameter estimation by just using the state block or by using a unit model with a specified property package. This module will demonstrate parameter estimation by using the flash unit model with the NRTL property package.
We will complete the following tasks:
Set up a method to return an initialized model
Set up the parameter estimation problem using
parmest
Analyze the results
Demonstrate advanced features from
parmest
Key links to documentation:#
# Todo: import ConcreteModel from pyomo.environ
from pyomo.environ import ConcreteModel, value
# Todo: import FlowsheetBlock from idaes.core
from idaes.core import FlowsheetBlock
# Todo: import Flash unit model from idaes.models.unit_models
from idaes.models.unit_models import Flash
In the next cell, we will be importing the parameter block that we will be using in this module and the idaes logger.
from idaes.models.properties.activity_coeff_models.BTX_activity_coeff_VLE import (
BTXParameterBlock,
)
import idaes.logger as idaeslog
In the next cell, we import parmest
from Pyomo and the pandas
package. We need pandas
as parmest
uses pandas.dataframe
for handling the input data and the results.
import pyomo.contrib.parmest.parmest as parmest
import pandas as pd
Setting up an initialized model#
We need to provide a method that returns an initialized model to the parmest
tool in Pyomo.
def NRTL_model(data):
# Todo: Create a ConcreteModel object
m = ConcreteModel()
# Todo: Create FlowsheetBlock object
m.fs = FlowsheetBlock(dynamic=False)
# Todo: Create a properties parameter object with the following options:
# "valid_phase": ('Liq', 'Vap')
# "activity_coeff_model": 'NRTL'
m.fs.properties = BTXParameterBlock(
valid_phase=("Liq", "Vap"), activity_coeff_model="NRTL"
)
m.fs.flash = Flash(property_package=m.fs.properties)
# Initialize at a certain inlet condition
m.fs.flash.inlet.flow_mol.fix(1)
m.fs.flash.inlet.temperature.fix(368)
m.fs.flash.inlet.pressure.fix(101325)
m.fs.flash.inlet.mole_frac_comp[0, "benzene"].fix(0.5)
m.fs.flash.inlet.mole_frac_comp[0, "toluene"].fix(0.5)
# Set Flash unit specifications
m.fs.flash.heat_duty.fix(0)
m.fs.flash.deltaP.fix(0)
# Fix NRTL specific variables
# alpha values (set at 0.3)
m.fs.properties.alpha["benzene", "benzene"].fix(0)
m.fs.properties.alpha["benzene", "toluene"].fix(0.3)
m.fs.properties.alpha["toluene", "toluene"].fix(0)
m.fs.properties.alpha["toluene", "benzene"].fix(0.3)
# initial tau values
m.fs.properties.tau["benzene", "benzene"].fix(0)
m.fs.properties.tau["benzene", "toluene"].fix(-0.9)
m.fs.properties.tau["toluene", "toluene"].fix(0)
m.fs.properties.tau["toluene", "benzene"].fix(1.4)
# Initialize the flash unit
m.fs.flash.initialize(outlvl=idaeslog.INFO_LOW)
# Fix at actual temperature
m.fs.flash.inlet.temperature.fix(float(data["temperature"]))
# Set bounds on variables to be estimated
m.fs.properties.tau["benzene", "toluene"].setlb(-5)
m.fs.properties.tau["benzene", "toluene"].setub(5)
m.fs.properties.tau["toluene", "benzene"].setlb(-5)
m.fs.properties.tau["toluene", "benzene"].setub(5)
# Return initialized flash model
return m
Parameter estimation using parmest#
In addition to providing a method to return an initialized model, the parmest
tool needs the following:
List of variable names to be estimated
Dataset with multiple scenarios
Expression to compute the sum of squared errors
In this example, we only estimate the binary interaction parameter (tau_ij
). Given that this variable is usually indexed as tau_ij = Var(component_list, component_list)
, there are 2*2=4 degrees of freedom. However, when i=j, the binary interaction parameter is 0. Therefore, in this problem, we estimate the binary interaction parameter for the following variables only:
fs.properties.tau[‘benzene’, ‘toluene’]
fs.properties.tau[‘toluene’, ‘benzene’]
# Todo: Create a list of vars to estimate
variable_name = [
"fs.properties.tau['benzene', 'toluene']",
"fs.properties.tau['toluene', 'benzene']",
]
Pyomo’s parmest
tool supports the following data formats:
pandas dataframe
list of dictionaries
list of json file names.
Please see the documentation for more details.
For this example, we load data from the csv file BT_NRTL_dataset.csv
. The dataset consists of fifty data points which provide the mole fraction of benzene in the vapor and liquid phase as a function of temperature.
# Load data from csv
data = pd.read_csv("BT_NRTL_dataset.csv")
# Display the dataset
display(data)
temperature | liq_benzene | vap_benzene | |
---|---|---|---|
0 | 365.500000 | 0.480953 | 0.692110 |
1 | 365.617647 | 0.462444 | 0.667699 |
2 | 365.735294 | 0.477984 | 0.692441 |
3 | 365.852941 | 0.440547 | 0.640336 |
4 | 365.970588 | 0.427421 | 0.623328 |
5 | 366.088235 | 0.442725 | 0.647796 |
6 | 366.205882 | 0.434374 | 0.637691 |
7 | 366.323529 | 0.444642 | 0.654933 |
8 | 366.441176 | 0.427132 | 0.631229 |
9 | 366.558824 | 0.446301 | 0.661743 |
10 | 366.676471 | 0.438004 | 0.651591 |
11 | 366.794118 | 0.425320 | 0.634814 |
12 | 366.911765 | 0.439435 | 0.658047 |
13 | 367.029412 | 0.435655 | 0.654539 |
14 | 367.147059 | 0.401350 | 0.604987 |
15 | 367.264706 | 0.397862 | 0.601703 |
16 | 367.382353 | 0.415821 | 0.630930 |
17 | 367.500000 | 0.420667 | 0.640380 |
18 | 367.617647 | 0.391683 | 0.598214 |
19 | 367.735294 | 0.404903 | 0.620432 |
20 | 367.852941 | 0.409563 | 0.629626 |
21 | 367.970588 | 0.389488 | 0.600722 |
22 | 368.000000 | 0.396789 | 0.612483 |
23 | 368.088235 | 0.398162 | 0.616106 |
24 | 368.205882 | 0.362340 | 0.562505 |
25 | 368.323529 | 0.386958 | 0.602680 |
26 | 368.441176 | 0.363643 | 0.568210 |
27 | 368.558824 | 0.368118 | 0.577072 |
28 | 368.676471 | 0.384098 | 0.604078 |
29 | 368.794118 | 0.353605 | 0.557925 |
30 | 368.911765 | 0.346474 | 0.548445 |
31 | 369.029412 | 0.350741 | 0.556996 |
32 | 369.147059 | 0.362347 | 0.577286 |
33 | 369.264706 | 0.362578 | 0.579519 |
34 | 369.382353 | 0.340765 | 0.546411 |
35 | 369.500000 | 0.337462 | 0.542857 |
36 | 369.617647 | 0.355729 | 0.574083 |
37 | 369.735294 | 0.348679 | 0.564513 |
38 | 369.852941 | 0.338187 | 0.549284 |
39 | 369.970588 | 0.324360 | 0.528514 |
40 | 370.088235 | 0.310753 | 0.507964 |
41 | 370.205882 | 0.311037 | 0.510055 |
42 | 370.323529 | 0.311263 | 0.512055 |
43 | 370.441176 | 0.308081 | 0.508437 |
44 | 370.558824 | 0.308224 | 0.510293 |
45 | 370.676471 | 0.318148 | 0.528399 |
46 | 370.794118 | 0.308334 | 0.513728 |
47 | 370.911765 | 0.317937 | 0.531410 |
48 | 371.029412 | 0.289149 | 0.484824 |
49 | 371.147059 | 0.298637 | 0.502318 |
We need to provide a method to return an expression to compute the sum of squared errors that will be used as the objective in solving the parameter estimation problem. For this problem, the error will be computed for the mole fraction of benzene in the vapor and liquid phase between the model prediction and data.
# Create method to return an expression that computes the sum of squared error
def SSE(m, data):
# Todo: Add expression for computing the sum of squared errors in mole fraction of benzene in the liquid
# and vapor phase. For example, the squared error for the vapor phase is:
# (float(data["vap_benzene"]) - m.fs.flash.vap_outlet.mole_frac_comp[0, "benzene"])**2
expr = (
float(data["vap_benzene"]) - m.fs.flash.vap_outlet.mole_frac_comp[0, "benzene"]
) ** 2 + (
float(data["liq_benzene"]) - m.fs.flash.liq_outlet.mole_frac_comp[0, "benzene"]
) ** 2
return expr * 1e4
We are now ready to set up the parameter estimation problem. We will create a parameter estimation object called pest
. As shown below, we pass the method that returns an initialized model, dataset, list of variable names to estimate, and the SSE expression to the Estimator object. tee=True
will print the solver output after solving the parameter estimation problem.
# Initialize a parameter estimation object
pest = parmest.Estimator(NRTL_model, data, variable_name, SSE, tee=True)
# Run parameter estimation using all data
obj_value, parameters = pest.theta_est()
WARNING: Params with units must be mutable. Converting Param
'fs.properties.pressure_critical' to mutable.
WARNING: Params with units must be mutable. Converting Param
'fs.properties.temperature_critical' to mutable.
WARNING: Params with units must be mutable. Converting Param
'fs.properties.mw_comp' to mutable.
WARNING: Params with units must be mutable. Converting Param
'fs.properties.dh_form' to mutable.
WARNING: Params with units must be mutable. Converting Param
'fs.properties.ds_form' to mutable.
---------------------------------------------------------------------------
ApplicationError Traceback (most recent call last)
Cell In[9], line 5
2 pest = parmest.Estimator(NRTL_model, data, variable_name, SSE, tee=True)
4 # Run parameter estimation using all data
----> 5 obj_value, parameters = pest.theta_est()
File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/pyomo/contrib/parmest/parmest.py:915, in Estimator.theta_est(self, solver, return_values, calc_cov, cov_n)
908 assert isinstance(
909 cov_n, int
910 ), "The number of datapoints that are used in the objective function is required to calculate the covariance matrix"
911 assert cov_n > len(
912 self._return_theta_names()
913 ), "The number of datapoints must be greater than the number of parameters to estimate"
--> 915 return self._Q_opt(
916 solver=solver,
917 return_values=return_values,
918 bootlist=None,
919 calc_cov=calc_cov,
920 cov_n=cov_n,
921 )
File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/pyomo/contrib/parmest/parmest.py:518, in Estimator._Q_opt(self, ThetaVals, solver, return_values, bootlist, calc_cov, cov_n)
510 ef = sputils.create_EF(
511 scen_names,
512 _experiment_instance_creation_callback,
(...)
515 scenario_creator_kwargs=scenario_creator_options,
516 )
517 else:
--> 518 ef = local_ef.create_EF(
519 scen_names,
520 _experiment_instance_creation_callback,
521 EF_name="_Q_opt",
522 suppress_warnings=True,
523 scenario_creator_kwargs=scenario_creator_options,
524 )
525 self.ef_instance = ef
527 # Solve the extensive form with ipopt
File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/pyomo/contrib/parmest/utils/create_ef.py:111, in create_EF(scenario_names, scenario_creator, scenario_creator_kwargs, EF_name, suppress_warnings, nonant_for_fixed_vars)
109 if scenario_creator_kwargs is None:
110 scenario_creator_kwargs = dict()
--> 111 scen_dict = {
112 name: scenario_creator(name, **scenario_creator_kwargs)
113 for name in scenario_names
114 }
116 if len(scen_dict) == 0:
117 raise RuntimeError("create_EF() received empty scenario list")
File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/pyomo/contrib/parmest/utils/create_ef.py:112, in <dictcomp>(.0)
109 if scenario_creator_kwargs is None:
110 scenario_creator_kwargs = dict()
111 scen_dict = {
--> 112 name: scenario_creator(name, **scenario_creator_kwargs)
113 for name in scenario_names
114 }
116 if len(scen_dict) == 0:
117 raise RuntimeError("create_EF() received empty scenario list")
File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/pyomo/contrib/parmest/parmest.py:160, in _experiment_instance_creation_callback(scenario_name, node_names, cb_data)
158 # at least three signatures are supported. The first is preferred
159 try:
--> 160 instance = callback(experiment_number=exp_num, cb_data=cb_data)
161 except TypeError:
162 raise RuntimeError(
163 "Only one callback signature is supported: "
164 "callback(experiment_number, cb_data) "
165 )
File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/pyomo/contrib/parmest/parmest.py:469, in Estimator._instance_creation_callback(self, experiment_number, cb_data)
467 else:
468 raise RuntimeError(f'Unexpected data format for cb_data={cb_data}')
--> 469 model = self._create_parmest_model(exp_data)
471 return model
File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/pyomo/contrib/parmest/parmest.py:396, in Estimator._create_parmest_model(self, data)
392 def _create_parmest_model(self, data):
393 """
394 Modify the Pyomo model for parameter estimation
395 """
--> 396 model = self.model_function(data)
398 if (len(self.theta_names) == 1) and (
399 self.theta_names[0] == 'parmest_dummy_var'
400 ):
401 model.parmest_dummy_var = pyo.Var(initialize=1.0)
Cell In[5], line 42, in NRTL_model(data)
39 m.fs.properties.tau["toluene", "benzene"].fix(1.4)
41 # Initialize the flash unit
---> 42 m.fs.flash.initialize(outlvl=idaeslog.INFO_LOW)
44 # Fix at actual temperature
45 m.fs.flash.inlet.temperature.fix(float(data["temperature"]))
File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/idaes/core/base/unit_model.py:540, in UnitModelBlockData.initialize(blk, *args, **kwargs)
537 c.deactivate()
539 # Remember to collect flags for fixed vars
--> 540 flags = blk.initialize_build(*args, **kwargs)
542 # If costing block exists, activate and initialize
543 for c in init_order:
File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/idaes/core/base/unit_model.py:589, in UnitModelBlockData.initialize_build(blk, state_args, outlvl, solver, optarg)
585 opt = get_solver(solver, optarg)
587 # ---------------------------------------------------------------------
588 # Initialize control volume block
--> 589 flags = blk.control_volume.initialize(
590 outlvl=outlvl,
591 optarg=optarg,
592 solver=solver,
593 state_args=state_args,
594 )
596 init_log.info_high("Initialization Step 1 Complete.")
598 # ---------------------------------------------------------------------
599 # Solve unit
File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/idaes/core/base/control_volume0d.py:1490, in ControlVolume0DBlockData.initialize(blk, state_args, outlvl, optarg, solver, hold_state)
1487 init_log = idaeslog.getInitLogger(blk.name, outlvl, tag="control_volume")
1489 # Initialize state blocks
-> 1490 in_flags = blk.properties_in.initialize(
1491 outlvl=outlvl,
1492 optarg=optarg,
1493 solver=solver,
1494 hold_state=hold_state,
1495 state_args=state_args,
1496 )
1498 if state_args is None:
1499 # If no initial guesses provided, estimate values for states
1500 blk.estimate_outlet_state(always_estimate=True)
File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/idaes/models/properties/activity_coeff_models/activity_coeff_prop_pack.py:614, in _ActivityCoeffStateBlock.initialize(blk, state_args, hold_state, state_vars_fixed, outlvl, solver, optarg)
607 if (
608 (k.config.has_phase_equilibrium)
609 or (k.config.parameters.config.valid_phase == ("Liq", "Vap"))
610 or (k.config.parameters.config.valid_phase == ("Vap", "Liq"))
611 ):
613 with idaeslog.solver_log(solve_log, idaeslog.DEBUG) as slc:
--> 614 res = solve_indexed_blocks(opt, [blk], tee=slc.tee)
616 else:
617 res = "skipped"
File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/idaes/core/util/initialization.py:285, in solve_indexed_blocks(solver, blocks, **kwds)
278 tmp._ctypes[Block] = [ # pylint: disable=protected-access
279 0,
280 nBlocks - 1,
281 nBlocks,
282 ]
284 # Solve temporary Block
--> 285 results = solver.solve(tmp, **kwds)
287 finally:
288 # Clean up temporary Block contents so they are not removed when Block
289 # is garbage collected.
290 tmp._decl = {} # pylint: disable=protected-access
File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/pyomo/opt/base/solvers.py:534, in OptSolver.solve(self, *args, **kwds)
531 def solve(self, *args, **kwds):
532 """Solve the problem"""
--> 534 self.available(exception_flag=True)
535 #
536 # If the inputs are models, then validate that they have been
537 # constructed! Collect suffix names to try and import from solution.
538 #
539 from pyomo.core.base.block import BlockData
File ~/checkouts/readthedocs.org/user_builds/idaes-examples/envs/latest/lib/python3.8/site-packages/pyomo/opt/solver/shellcmd.py:140, in SystemCallSolver.available(self, exception_flag)
138 if exception_flag:
139 msg = "No executable found for solver '%s'"
--> 140 raise ApplicationError(msg % self.name)
141 return False
142 return True
ApplicationError: No executable found for solver 'ipopt'
You will notice that the resulting parameter estimation problem, when using the flash unit model, will have 2952 variables and 2950 constraints. This is because the unit models in IDAES use control volume blocks which have two state blocks attached; one at the inlet and one at the outlet. Even though there are two state blocks, they still use the same parameter block i.e. m.fs.properties
in our example which is where our parameters that need to be estimated exist.
Let us display the results by running the next cell.
print("The SSE at the optimal solution is %0.6f" % (obj_value * 1e-4))
print()
print("The values for the parameters are as follows:")
for k, v in parameters.items():
print(k, "=", v)
The SSE at the optimal solution is 0.000507
The values for the parameters are as follows:
fs.properties.tau[benzene,toluene] = -0.8987624039723903
fs.properties.tau[toluene,benzene] = 1.410486110660486
Using the data that was provided, we have estimated the binary interaction parameters in the NRTL model for a benzene-toluene mixture. Although the dataset that was provided was temperature dependent, in this example we have estimated a single value that fits best for all temperatures.
Advanced options for parmest: bootstrapping#
Pyomo’s parmest
tool allows for bootstrapping where the parameter estimation is repeated over n
samples with resampling from the original data set. Parameter estimation with bootstrap resampling can be used to identify confidence regions around each parameter estimate. This analysis can be slow given the increased number of model instances that need to be solved. Please refer to https://pyomo.readthedocs.io/en/stable/contributed_packages/parmest/driver.html for more details.
For the example above, the bootstrapping can be run by uncommenting the code in the following cell:
# Run parameter estimation using bootstrap resample of the data (10 samples),
# plot results along with confidence regions
# Uncomment the following lines
# bootstrap_theta = pest.theta_est_bootstrap(4)
# display(bootstrap_theta)