Appendix L — Parameter Estimation

This appendix provides a simplified overview of using experimental data to (a) estimate the values of the parameters in a model and (b) assess the accuracy of the resulting model. Students may be familiar with estimating parameter values in linear models using linear least-squares fitting. The parameter estimation procedure outlined here is not limited to linear models, and it is usually performed using computer software. The presentation here is not specific to any one computer program or language. Students seeking a more complete understanding should consult a statistical analysis textbook or take a course on statistical analysis.

L.1 Defining the Problem

Parameter estimation requires experimental data. In each experiment, the experimenter sets the values of one or more quantities. Herein, the variables used to represent those quantities are referred to as the adjusted input variables. After setting those values, the experimenter records the values of other quantities. The variables used to represent these other quantities are referred to as the experimental responses. In Reaction Engineering Basics, only experiments with a single experimental response are considered. Every experiment performed involves the same set of adjusted input variables and the same experimental response, but their values differ from experiment to experiment.

Often, the reason someone performs experiments like these is that they believe the response depends upon the values of the adjusted input variables. Based upon that belief, they propose a mathematical response model. The values of the adjusted input variables from any one experiment can be substituted into the response model, permitting the calculation of a model-predicted response for that experiment. Typically the response model contains one or more unknown constants, called the model parameters.
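
As an illustration, suppose a single input, \(x\), was adjusted in each experiment, and a hypothetical response model of the form \(y = k_1 x e^{-k_2/x}\) was proposed. Substituting the value of \(x\) from any one experiment into this expression yields the model-predicted response for that experiment; \(k_1\) and \(k_2\) are the model parameters, and their values are initially unknown.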

It is important to understand that even if the model is correct and the values of the model parameters are known exactly, the model-predicted response for any one experiment will not be exactly equal to the experimental response. Measured values, such as the experimental response, always include random error. Put differently, if the same experiment were performed 10 times, 10 slightly different experimental responses could be recorded. Given a response model (including the values of the parameters appearing in it), the difference between the experimental response and the model-predicted response for any one experiment is called the error or the residual for that one experiment.

Returning to the reason the experiments were performed, the person who performed them believes that the response depends upon the values of the adjusted input variables. However, they do not know what the correct response model is. Therefore, when they propose a response model, they don’t know whether the proposed model is correct and they don’t know the values of the parameters that the proposed model contains. The purposes of parameter estimation are, first, to calculate the “best” values of the parameters in the proposed model; second, to estimate the uncertainty in each parameter value; and third, to provide some measure of how accurately the model predicts the experimental responses.

This raises the question of how to define the “best” values of the model parameters. Under the most common definition, the error (i.e., the difference between the experimental response and the model-predicted response) for each experiment is squared, the resulting squares are added together, and the “best” parameter values are those that minimize this sum of the squares of the errors. The squares of the errors are used instead of the errors themselves so that positive and negative errors don’t cancel each other out. For this reason, parameter estimation is sometimes referred to as “least-squares fitting.” An alternative definition of the “best” parameter values uses the sum of the squares of the relative errors. The use of this definition is considered in the final section of this appendix.
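
Written out, letting \(y_{expt,i}\) and \(y_{pred,i}\) denote the experimental and model-predicted responses for experiment \(i\) out of \(n_{expt}\) experiments, the “best” parameter values under the most common definition are those that minimize

\[
SSE = \sum_{i=1}^{n_{expt}} \left( y_{expt,i} - y_{pred,i} \right)^2
\]

Under the alternative definition, each error is divided by \(y_{expt,i}\) before the squaring and summing.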

L.2 Items that Must be Provided to the Solver

Model parameters can be estimated numerically. While “estimator” might be more appropriate, for consistency with the other numerical methods appendices in Reaction Engineering Basics, the term “solver” is used herein to refer to the computer program or function that numerically estimates the parameters in a model. There are many different parameter estimation solvers available, and each has specific instructions on how to use it. This appendix makes no attempt to explain the details of using any of the available parameter estimation solvers.

Regardless of the specific solver being used, when performing parameter estimation, four things must be provided to the solver (the sketch following this list illustrates all four):

  1. A guess for the value of each model parameter.
  2. A set of values of the experimental adjusted input variables for every experiment as a vector, \(\underline x\), or a matrix, \(\underline{\underline x}\).
    1. If only one input was adjusted in the experiments, the column vector, \(\underline x\), will contain one row for each experiment.
    2. If more than one input was adjusted in the experiments, the matrix, \(\underline{\underline x}\), will contain one column for each adjusted input and one row for each experiment.
  3. A corresponding set of values of the experimental responses as a column vector, \(\underline{y_{expt}}\), with one row per experiment, and with the rows in the same order as in \(\underline x\) or \(\underline{\underline x}\).
  4. A function or subroutine that calculates the model-predicted response.
    1. This subroutine receives as arguments (a) the adjusted input variables, \(\underline x\), or \(\underline{\underline x}\), and (b) a set of possible values of the parameters that appear in the model.
    2. This subroutine uses the parameters provided to it in the model; it computes and returns a column vector, \(\underline{y_{pred}}\), containing the model-predicted responses, with one row per experiment, and with the rows in the same order as in \(\underline x\) or \(\underline{\underline x}\).
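
As a concrete illustration, the sketch below shows how these four items might be passed to one widely used solver, SciPy’s `curve_fit` function. This is a minimal sketch, assuming Python with NumPy and SciPy available; the model form and all data values are hypothetical, chosen only for illustration.

```python
# A minimal sketch, assuming Python with NumPy and SciPy available;
# the model form and all data values below are hypothetical.
import numpy as np
from scipy.optimize import curve_fit

# item 4: function that calculates the model-predicted responses,
# here using the hypothetical model y = k1 * x * exp(-k2 / x)
def predicted_responses(x, k1, k2):
    return k1 * x * np.exp(-k2 / x)

# item 2: adjusted input values, one element (row) per experiment
x = np.array([300.0, 325.0, 350.0, 375.0, 400.0])

# item 3: experimental responses, in the same order as x
y_expt = np.array([1.9, 3.5, 5.7, 9.1, 13.4])

# item 1: a guess for each model parameter, k1 and k2
guess = [1.0, 1000.0]

# the solver returns the best-fit parameters and their covariance matrix
beta, beta_cov = curve_fit(predicted_responses, x, y_expt, p0=guess)
```

Note that `curve_fit` expects the predicted-response function to take the adjusted inputs as its first argument and the parameters as separate additional arguments; other solvers use other conventions.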

L.3 A Simplistic Explanation of How the Parameters are Estimated

The intent here is to provide a sense of how the solver estimates the parameters; the explanation is highly simplified and skips over some details. Generally, numerical parameter estimation is an iterative process that involves the steps listed below, and a minimal sketch of the loop in code follows the list.

  1. Using the guess for the parameter values, calculate the model-predicted response for each experiment.
  2. Calculate the error (difference between the experimental and model-predicted responses) for each experimental data point and square it.
  3. Sum the resulting squares of the errors.
  4. On the second and later iterations, check the convergence criteria described in the next section.
    1. if converged
      i. the guess represents the best estimates of the parameter values
      ii. calculate a statistical measure of the uncertainty in each parameter, e.g. the 95% confidence interval
      iii. calculate some measure of the accuracy of the model, e.g. the coefficient of determination, \(R^2\)
      iv. return the results from i. through iii. and quit
    2. if this is the first iteration, or if not converged but making progress, generate a new (and hopefully improved) guess and return to the first step.
    3. if not converged and not making progress, warn that a solution was not obtained and quit.
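
The sketch below makes the loop concrete. It is deliberately naive, assuming each new guess is generated by stepping down a finite-difference approximation of the gradient of the sum of squared errors; production solvers use far more robust update rules (e.g. Levenberg-Marquardt), and every name here is illustrative rather than taken from any particular library.

```python
# A deliberately naive sketch of the loop above, assuming the new guess is
# generated by stepping down a finite-difference gradient of the sum of
# squared errors (SSE); real solvers use much more robust update rules.
import numpy as np

def estimate_parameters(predicted_responses, x, y_expt, guess,
                        step=1.0e-8, tol=1.0e-12, max_iter=100000):
    beta = np.asarray(guess, dtype=float)
    sse_old = np.inf
    for _ in range(max_iter):
        # steps 1 - 3: predict the responses, form the errors, square, sum
        errors = y_expt - predicted_responses(x, *beta)
        sse = float(np.sum(errors**2))
        # step 4: converged if the SSE is no longer changing appreciably
        if abs(sse_old - sse) < tol:
            return beta, sse
        # not making progress if the SSE is increasing instead of decreasing
        if sse > sse_old:
            raise RuntimeError("not making progress; no solution obtained")
        sse_old = sse
        # otherwise generate a new (hopefully improved) guess by stepping
        # down the finite-difference gradient of the SSE
        grad = np.empty_like(beta)
        for j in range(beta.size):
            perturbed = beta.copy()
            delta = 1.0e-6 * max(abs(beta[j]), 1.0)
            perturbed[j] += delta
            e = y_expt - predicted_responses(x, *perturbed)
            grad[j] = (float(np.sum(e**2)) - sse) / delta
        beta = beta - step * grad
    raise RuntimeError("iteration limit reached without converging")
```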

L.4 Convergence

The solver uses a set of convergence criteria to determine whether (a) it has converged to a minimum value of the sum of the squares of the errors, (b) it has not converged to a minimum but is making progress and should continue iterating, or (c) it has not converged to a minimum and is not making progress, so it should stop iterating. Typical convergence criteria are listed below. The solver usually provides a default for each of the “specified amounts” in this list, but the defaults can be overridden at the time the solver is started, as the sketch following the list illustrates.

  • the sum of the squares of the errors is smaller than a specified amount (converged).
  • the sum of the squares of the errors is smaller than it was for the previous iteration (making progress).
  • a (large) specified number of iterations has occurred (not making progress).
  • the sum of the squares of the errors has been calculated a specified number of times (not making progress).
  • the sum of the squares of the errors is increasing instead of decreasing (not making progress).
  • the sum of the squares of the errors is changing by less than a specified amount between iterations (not making progress).
  • the guess is changing by less than a specified amount between iterations (not making progress).
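
The sketch below shows how some of these defaults might be overridden, assuming SciPy’s `least_squares` solver and reusing the hypothetical model and data from the earlier sketch; the tolerance values shown are arbitrary.

```python
# A sketch of overriding the default "specified amounts", assuming SciPy's
# least_squares solver; the model, data, and tolerances are hypothetical.
import numpy as np
from scipy.optimize import least_squares

def residuals(beta, x, y_expt):
    k1, k2 = beta
    return y_expt - k1 * x * np.exp(-k2 / x)

x = np.array([300.0, 325.0, 350.0, 375.0, 400.0])
y_expt = np.array([1.9, 3.5, 5.7, 9.1, 13.4])
guess = [1.0, 1000.0]

result = least_squares(
    residuals, guess, args=(x, y_expt),
    ftol=1.0e-12,    # stop when the SSE changes by less than this between iterations
    xtol=1.0e-12,    # stop when the guess changes by less than this between iterations
    max_nfev=5000,   # give up after the SSE has been evaluated this many times
)
beta = result.x
```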

L.5 Assessing the Accuracy of the Model

Typically, if successful, the solver will return (a) the estimated values of the model parameters, (b) some measure of the uncertainty in the estimated values (e.g. a 95% confidence interval), and (c) the coefficient of determination, \(R^2\).
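
If a solver reports only the parameter estimates and their covariance matrix, the other two quantities can be computed from its outputs. The sketch below reuses the hypothetical model and data from the sketch in Section L.2 and shows one common way to form 95% confidence intervals and to compute \(R^2\).

```python
# A sketch of computing 95% confidence intervals and R^2 from a completed
# fit, assuming the hypothetical model and data used earlier.
import numpy as np
from scipy import stats
from scipy.optimize import curve_fit

def predicted_responses(x, k1, k2):
    return k1 * x * np.exp(-k2 / x)

x = np.array([300.0, 325.0, 350.0, 375.0, 400.0])
y_expt = np.array([1.9, 3.5, 5.7, 9.1, 13.4])
beta, beta_cov = curve_fit(predicted_responses, x, y_expt, p0=[1.0, 1000.0])

# 95% confidence half-widths from the parameter covariance matrix
dof = y_expt.size - beta.size            # degrees of freedom
t_val = stats.t.ppf(0.975, dof)          # two-sided 95% t statistic
ci = t_val * np.sqrt(np.diag(beta_cov))  # parameter i lies in beta[i] +/- ci[i]

# coefficient of determination
y_pred = predicted_responses(x, *beta)
ss_res = np.sum((y_expt - y_pred)**2)
ss_tot = np.sum((y_expt - np.mean(y_expt))**2)
r_squared = 1.0 - ss_res / ss_tot
```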

The estimated values of the model parameters can be used to compute the model-predicted response for each experiment. A “parity plot” can then be created wherein the model-predicted response is plotted versus the experimental response. If the model were perfect, every point in the parity plot would fall on a diagonal line passing through the origin. The farther the points are from the diagonal, the lower the accuracy of the model.

The errors, or residuals, for each experiment can also be calculated. A set of “residuals plots” can then be created wherein the residuals are plotted versus each of the experimental adjusted input variables. If the model were perfect, every residual would equal zero and every point in every residuals plot would fall on the horizontal axis. The magnitude of the deviations from the horizontal axis is not important (the parity plot is used for that purpose). The critical issue is whether the points scatter randomly above and below the horizontal axis as the value of the adjusted input variable increases. When the model is good, there should not be any trends in the deviations as the value of the adjusted input variable increases.
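
The sketch below shows how the two plots might be generated with matplotlib; the arrays are hypothetical stand-ins for the adjusted input values, experimental responses, and model-predicted responses from a completed fit.

```python
# A sketch of the parity and residuals plots, assuming matplotlib; the
# arrays are hypothetical stand-ins for values taken from a completed fit.
import numpy as np
import matplotlib.pyplot as plt

x = np.array([300.0, 325.0, 350.0, 375.0, 400.0])  # adjusted input
y_expt = np.array([1.9, 3.5, 5.7, 9.1, 13.4])      # experimental responses
y_pred = np.array([2.0, 3.4, 5.8, 9.0, 13.5])      # model-predicted responses

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8.0, 4.0))

# parity plot: model-predicted vs. experimental, with the diagonal shown
ax1.plot(y_expt, y_pred, "o")
lims = [min(y_expt.min(), y_pred.min()), max(y_expt.max(), y_pred.max())]
ax1.plot(lims, lims, "k--")
ax1.set_xlabel("experimental response")
ax1.set_ylabel("model-predicted response")

# residuals plot: residuals vs. the adjusted input, with the zero line shown
ax2.plot(x, y_expt - y_pred, "o")
ax2.axhline(0.0, color="k", linestyle="--")
ax2.set_xlabel("adjusted input")
ax2.set_ylabel("residual")

plt.tight_layout()
plt.show()
```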

Deciding whether the model is sufficiently accurate is a judgement call. The following criteria are satisfied by an accurate model.

  • The coefficient of determination, \(R^2\), is nearly equal to 1.
  • The uncertainty in each model parameter is small compared to its value.
  • The points in the parity plot are all close to a diagonal line passing through the origin.
  • In each residuals plot, the points scatter randomly about the horizontal axis, and no systematic deviations are apparent.