Back Numerical Data Fitting in Dynamical Systems - A Practical Introduction with Applications and Software
K. Schittkowski: Monography, Kluwer Academic Publishers (2002)
Real life phenomena of engineering, natural science or medical problems are often formulated by means of a mathematical model with the goal to simulate the behaviour of the system in computerized form. Advantages of mathematical models are their cheap availability, the possibility to simulate extreme situations that cannot be handled by experiments, or to simulate real systems during the design phase before constructing a first prototype. Moreover, they serve to verify decisions, to avoid expensive and time consuming experimental tests, to analyze, understand and explain the behaviour of systems, or to optimize design and production.
As soon as a mathematical model contains differential dependencies from an additional parameter, typically the time, we call it a dynamical model. There are now two key questions that always arise in a practical environment:
- Is the mathematical model correct?
- How can I quantify model parameters that cannot be measured directly?
In principle, both questions are easily answered as soon as some experimental data are available. The idea is to compare measured data with predicted model function values and to minimize the sum of their squared differences over the whole parameter space. We have to reject a model if we are unable to find a reasonably accurate fit.
To summarize, parameter estimation or data fitting, respectively, is extremely important in all practical situations, where a mathematical model and corresponding experimental data are available to describe the behaviour of a dynamical system.
The main goal of the book is to give an overview of numerical methods that are needed to compute parameters of a dynamical model by a least squares fit. The mathematical equations that must be provided by the system analyst, are explicit model functions or steady state systems in the most simple situations, or responses of dynamical systems defined by ordinary differential equations, differential algebraic equations, or one-dimensional partial differential equations. Many different mathematical disciplines must be combined, and the intention is to present at least some fundamental ideas of the numerical methods needed, so that available software can be applied successfully.
It must be noted that there are two alternative aspects that are not treated in this book. First we do not emphasize statistical analysis, which is also known as nonlinear regression or nonlinear parameter estimation. Moreover, we do not investigate the question whether parameters of a dynamical model can be identified at all, and under which mathematical conditions. Is is supposed that a user is able prepare a well-defined model, i.e., that the dynamical system is uniquely solvable and that the parameters can be identified by a least squares fit. For both topics there exist numerous very qualified text books that are recommended as secondary literature.
It is assumed that the typical reader is familiar with basic mathematical notation of linear algebra and analysis, as for example learned in elementary calculus lectures. Any additional knowledge about mathematical theory is not required. New concepts are presented in an elementary form and are illustrated by detailed analytical and numerical examples.
Extensive numerical results are included to show the efficiency of modern mathematical algorithms. But we also discuss possible pitfalls in form of warnings that even the most qualified numerical algorithms we know today, can fail or produce unacceptable responses. The practical progress of mathematical models and data fitting calculations is illustrated by case studies from pharmaceutics, mechanical, electrical or chemical engineering, and ecology.
To be able to repeat all numerical tests presented in the book, and to play with algorithms, data and solution tolerances, an interactive software system is included that runs under Windows 95/98/NT4.0/2000. The program contains the mathematical algorithms described in the book. The database contains 1,000 illustrative examples, to be used as benchmark test problems. Among them is large number of real life models (learning by doing).
The book is the outcome of my research in this area since the last 20 years with emphasis on the development of numerical algorithms for solving optimization problems. It would be impossible to design applicable mathematical algorithms and implement the corresponding software without intensive discussions, contact and cooperation with firms, e.g. Boehringer Ingelheim Pharma, BASF Ludwigshafen, Siemens Munich, Schloemann-Siemag Hilchenbach, DaimlerChrysler Aerospace (DASA/MBB) Munich, Bayer Sarnia, Dornier Satellite Systems Munich, and many research institutions at universities. In particular I would like to thank Dr. M. Wolf from Boehringer Ingelheim Pharma KG for providing many dynamical models describing pharmaceutical applications, and for encouraging the investigation of models based on partial differential equations. Part of my research was supported by projects funded by the BMBF research program Anwendungsbezogene Verbundprojekte in der Mathematik and the DFG research program Echtzeit-Optimierung grosser Systeme.
IntroductionParameter estimation plays an important role in many natural science, engineering and other disciplines. The key idea is to estimate unknown parameters of a mathematical model that describes a real life situation, by minimizing the distance of some known experimental data from theoretical fitting function values at certain time values. Thus, also model parameters that cannot be measured directly, can be identified by a least squares fit and analyzed subsequently in a quantitative way.) A typical dynamical system is given by differential equations that describe a time-dependent process, and that depend also on the parameters p. Instead of minimizing the sum of squares, we may apply alternative residual norms, e.g., to minimize the sum of absolute residual values or the maximum of absolute residual values.
Parameter estimation, also called parameter identification, nonlinear regression or data fitting, is extremely important for all practical situations, where a mathematical model and corresponding experimental data are available to analyze the behaviour of a dynamical system.
The main goal of the book is to introduce some numerical methods that can be used to compute parameters by a least squares fit in form of a tool box. The mathematical model that must be provided by the system analyst, has to belong to one of the following categories:
- explicit model functions
- steady state systems
- Laplace transforms of differential equations
- ordinary differential equations, initial value problems
- differential algebraic equations
- one-dimensional time-dependent partial differential equations
- one-dimensional partial differential algebraic equations
- nonlinear programming
- system identification
- numerical solution of ordinary differential equations
- discretization of partial differential equations
- sensitivity analysis
- automatic differentiation
- Laplace transforms
Thus, parameter estimation in dynamical systems covers a large variety of mathematical techniques and it is impossible to present a complete treatment of all topics. Instead we present an overview with the intention to understand at least some fundamental ideas of the numerical algorithms, so that available software can be applied successfully.
The general mathematical model to be investigated, contains certain features to apply the presented methods for a large set of different practical applications. Some of the most important items are:
- More than one fitting criterion can be defined, i.e. more than one experimental data set can be fitted within a model formulation.
- The fitting criteria are arbitrary functions depending on the parameters to be estimated, the solution of the underlying dynamical system, and the time variable.
- The model may possess arbitrary equality or inequality constraints subject to the parameters to be estimated, and upper and lower bounds for the parameters.
- Model equations may contain an additional independent parameter called concentration.s
- Differential-algebraic equations can be solved up to index 3. Consistent initial values for index-1-formulations are computed internally.
- In case of partial differential equations, also coupled ordinary differential equations and non-continuous transitions for solution and flux between different areas can be taken into account.
- Differential equation models may possess additional break or switching points where the model dynamics is changed and where integration is restarted, e.g. if a new doses is applied in case of a pharmacokinetic model
- The switching points mentioned, may become optimization variables allowing e.g. modeling of dynamical input functions for computing the optimal control of given state equations.
- The model functions may be defined by their Laplace transforms, where the back-transformation is performed numerically.
- Gradients can be evaluated by automatic differentiation, i.e. without additional round-off, truncation or approximation errors, and without compiling and linking of code.
- Ordinary differential equations may become stiff and large, i.e. we introduce explicit and implicit methods and exploit band structures.
- Parameter estimation problems based on unstable differential equations can be solved by the shooting method.
- Various types of one-dimensional partial differential equations are permitted, e.g. hyperbolic ones describing shock waves. A large number of different discretization techniques is introduced, especially upwind formulae for hyperbolic equations. Advection-diffusion, transport or related equations can be solved successfully by non-oscillatory discretization schemes, even with non-continuous initial or boundary conditions.
- Partial differential equations may be defined with Neumann and Dirichlet boundary or transitions conditions. Moreover, these conditions can be formulated in terms of algebraic equations coupled at arbitrary spatial positions.
- Algebraic partial differential equations may be added to the time-dependent ones.
- Data can be fitted with respect to the L2-, the L1-, or the maxumum-norm, i.e. with respect to sum of squares, sum of absolute values, or maximum of absolute values of the residuals.
Only for illustration purposes we denote the first independent model variable the time variable of the system, the second one the concentration variable and the dependent data as measurement values of an experiment. These words describe their probably most frequent usage in a practical application. On the other hand, the terms may have any other meaning within a model depending on the underlying application problem.
Due to the practical importance of parameter estimation, very many numerical codes have been developed in the past and are distributed within software packages. However, there is no guarantee that the mathematical algorithm chosen is capable to solve the problem under investigation. Possible traps preventing a solution in the desired way, are
- approximation of a local solution that is unacceptable,
- round-off errors for example because of an iterative solution of the dynamical system,
- narrow curved valleys where progress towards the solution is hard to achieve,
- very flat objective function in the neighborhood of a solution, for example in case of large noise in measurements,
- overdetermined models, i.e. too many model parameters to be estimated, leading to infinitely many solution vectors,
- bad starting values for parameters requiring a large number of steps,
- badly scaled model functions and, in particular, measurement values,
- non-differentiable model functions.
We have to know that all efficient optimization algorithms for the problem class we are considering, require differentiable fitting criteria and the availability of a starting point from which the iteration cycle is initiated. Additional difficulties arise in the presence of nonlinear constraints, in particular if they are badly stated, e.g. ill-conditioned, badly scaled, linearly dependent or, worst of all, contradictory.
Thus, users of parameter estimation software are often faced with the situation that the algorithm is unable to get a satisfactory answer subject to solution parameters selected, and that one has to restart the solution cycle by changing tolerances, internal algorithmic decisions of the code or at least the starting values to get a better result. To sum up, a black box approach to solve a parameter estimation problem, does not exist and a typical life cycle of a solution process consists of stepwise redesign of solution data. In the subsequent chapters we summarize the mathematical and numerical tools that are needed to set up a feasible model, to select a suitable algorithm, to perform a parameter estimation run and to analyze results in a practical situation from the viewpoint of a user.
First we give a very compact introduction into the mathematical background in Chapter 2. Only the most important topics needed for the design of the numerical methods discussed and for interpreting results, are outlined. Any special mathematical knowledge for instance about optimization theory or the theory of ordinary differential equations is not required to understand the chapter. However, some basic ideas about calculus and linear algebra are desirable. Topics treated, are optimality criteria, sequential quadratic programming methods, nonlinear least squares optimization, ordinary differential equations, differential-algebraic equations, one-dimensional time-dependent partial differential equations, Laplace transforms, automatic differentiation and statistical analysis. Readers mainly interested in solving a parameter estimation problem and with minor interest in the mathematical background of algorithms, may skip this chapter.
The mathematical parameter estimation model, alternative phrases are data fitting or system identification, is outlined in Chapter 3. Is is shown, how the dynamical systems have to be formulated and adapted to fit into the least squares formulation required for executing an optimization algorithm. Moreover, we show that there is a large variety of different alternatives to formulate a parameter estimation model that must be considered separately because of their practical importance.
In Chapter 4, we present numerical experiments. First we show a couple of possible pitfalls to outline that even qualified numerical algorithms can fail or can compute unacceptable answers. Possible remedies are proposed to overcome numerical instabilities and deficiencies in the model formulation. Also we discuss the important question, how to check the validity of a mathematical model. Then we present numerical results of comparative test runs of some least squares codes. To analyze these questions in a quantitative way, a large number of test examples is collected, many of them with some practical background and experimental data.
A couple of case studies are included in Chapter 5 to motivate the approach and to show the practical impact of the methods investigated. The application models discussed, are for
- linear pharmacokinetics,
- receptor-ligand binding,
- robot design,
- acetylene reactor,
- distillation column,
- multibody system of a truck,
- diffusion of drugs through skin,
- ground water flow,
- cooling a hot strip mill,
- drying maltodextrin in a convection oven,
- fluid dynamics of hydro systems,
- horn radiators for satellite communication.
To be able to repeat numerical tests outlined in the previous chapters and to play with algorithms, data and solution tolerances, an interactive software system is included running under Windows 95/98/NT4.0/2000. The codes contain the mathematical algorithms discussed in Chapter 2, and allow the numerical identification of parameters in any of the dynamical systems under investigation. All test examples mentioned in the book, are included among very many other test cases, and can be solved again either proceeding from the default values included, or from user-defined parameters, tolerances and experimental data.
The installation of the software system is outlined in Appendix A. After a successful installation, more organizational details on its usage can be retrieved directly from the interactive, context-sensitive help option. The user interface is developed in form of an MS-Access database, the numerical algorithms in form of stand-alone 32-bit-Fortran codes.
The availability of test examples is an important issue for implementing and testing numerical software. Thus, another appendix is included that collects some information about the test problems a unified format. The problems can be retrieved immediately from the database, and they serve also for a review on the complexity of the test examples used in Chapter 4.
Appendix C contains a brief description of the modeling language used for the implementation of the test examples. Especially all error messages are listed. A particular advantage is that derivatives can be evaluated automatically without approximation errors.
To be able to execute the test examples from own analysis software, a program is documented in Appendix D that generates Fortran code for function and gradient evaluation based on the backward mode of automatic differentiation. The code generator is also attached on the CD-ROM.
The book has 402 pages containing 493 references, 147 figures, 79 tables, and 1,000 test examples.
Kluwer Academic Publishers, Dordrecht,
Hardbound together with CD-ROM, ISBN 1-4020-1079-6,
EUR 148.00 / USD 145.00 / GBP 90.00