An information theoretic approach for combining neural network process models
Published in: Neural Networks, 1999-07, Vol. 12 (6), p. 915-926
Main authors:
Format: Article
Language: English
Subjects:
Online access: Full text
Abstract: Typically, neural network modelers in chemical engineering focus on identifying and using a single, hopefully optimal, neural network model. Using a single optimal model implicitly assumes that one neural network model can extract all the information available in a given data set and that the other candidate models are redundant. In general, there is no assurance that any individual model has extracted all relevant information from the data set. Recently, Wolpert (Neural Networks, 5(2), 241 (1992)) proposed the idea of stacked generalization to combine multiple models. Sridhar, Seagrave and Bartlett (AIChE J., 42, 2529 (1996)) implemented stacked generalization for neural network models by integrating multiple neural networks into an architecture known as stacked neural networks (SNNs). SNNs consist of a combination of the candidate neural networks and were shown to provide improved modeling of chemical processes. However, in Sridhar's work SNNs were limited to a linear combination of artificial neural networks. While a linear combination is simple and easy to use, it can exploit only those model outputs that are highly linearly correlated with the actual output; models that are useful in a nonlinear sense are wasted if a linear combination is used. In this work we propose an information theoretic stacking (ITS) algorithm for combining neural network models. The ITS algorithm identifies and combines useful models regardless of the nature of their relationship to the actual output. The power of the ITS algorithm is demonstrated through three examples, including application to a dynamic process modeling problem. The results demonstrate that SNNs developed using the ITS algorithm can achieve substantially improved performance compared to selecting and using a single, hopefully optimal, network or using SNNs based on a linear combination of neural networks.
ISSN: 0893-6080; 1879-2782
DOI: 10.1016/S0893-6080(99)00030-1
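The abstract contrasts linear stacking of candidate networks with selecting models by an information theoretic criterion. The sketch below illustrates that general idea only; it is not the published ITS algorithm. It assumes a crude histogram-based mutual information estimator and a least-squares linear combiner, and all names (estimate_mutual_information, select_models, stack_linear) and the selection threshold are illustrative choices, not taken from the paper.

```python
import numpy as np

def estimate_mutual_information(x, y, bins=16):
    """Plug-in histogram estimate of I(X; Y) in nats (illustrative only)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def select_models(candidate_outputs, target, threshold=0.05):
    """Keep candidate model outputs whose mutual information with the target
    exceeds a threshold, regardless of whether the dependence is linear."""
    scores = [estimate_mutual_information(out, target) for out in candidate_outputs]
    kept = [i for i, s in enumerate(scores) if s > threshold]
    return kept, scores

def stack_linear(candidate_outputs, target, kept):
    """Least-squares linear combination of the selected model outputs --
    the baseline stacking scheme the abstract contrasts against."""
    X = np.column_stack([candidate_outputs[i] for i in kept] + [np.ones(len(target))])
    weights, *_ = np.linalg.lstsq(X, target, rcond=None)
    return weights

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = rng.uniform(-1.0, 1.0, size=500)            # synthetic process output
    models = [
        y + 0.1 * rng.normal(size=y.size),          # linearly related candidate
        np.abs(y) + 0.1 * rng.normal(size=y.size),  # informative but nonlinear candidate
        rng.normal(size=y.size),                    # uninformative candidate
    ]
    kept, scores = select_models(models, y)
    print("MI scores:", [round(s, 3) for s in scores], "kept:", kept)
    print("linear combiner weights:", stack_linear(models, y, kept))
```

In this toy run the mutual information criterion retains the nonlinearly related candidate that a correlation-based screen would discard, which is the motivation the abstract gives for replacing purely linear stacking.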