Doctor of Philosophy
School of Mathematics and Applied Statistics
Ellem, Bernard A., Curvature measures for generalized linear models, Doctor of Philosophy thesis, School of Mathematics and Applied Statistics, University of Wollongong, 1999. http://ro.uow.edu.au/theses/2045
First addressed by Beale (1960), the use of curvature measures of nonlinearity in nonlinear regression has been elucidated most comprehensively by Bates and Watts (1980). They used differential geometric results that exploit features of the Euclidean space imposed by the Normality assumption. The partitioning of these measures into intrinsic effects (due to the model) and parameter effects (due to the form or parameterization of the model) allows a proper assessment of model departures from linearity. Indeed, the term 'linear' has become synonymous with a lack of both of these effects, since the commonly designated 'linear model' with Normal disturbance does not contain either effect. These curvature measures are used to unravel the effects of model reformulation on convergence of fitting procedures, and on the appropriateness of confidence regions based on the linearization assumption. For model criticism using residual analysis, the presence of intrinsic curvature in a nonlinear regression model can distort the visual assessment procedures borrowed from linear modelling, since the fundamental basis of these procedures can be undermined when the model is nonlinear.
When the disturbances are non-Normal, the consequent geometry is no longer Euclidean, necessitating a different approach, as outlined by Amari (1982a). The required approach generalizes the Euclidean inner product to a metric, and the ordinary derivative to an α-connection. These α-connections are fundamental to a proper understanding of the role of differential geometry in the investigation of estimator behaviour in the case of non-Normal errors. They provide the general method for comparing nearby points in the parameter space, for general classes of error distributions. In these cases, such a comparison is complicated by the fact that the neighbouring tangent spaces derived from the likelihood have different bases. The exception is the linear model with Normal errors, where no such difficulty arises.
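For orientation, the metric and the α-connection referred to here can be written in the standard form used in information geometry. This is a sketch in commonly used notation, not a quotation from the thesis: the symbols ℓ (log-likelihood), θ (parameter vector), g and Γ are assumptions introduced for illustration, and the thesis may adopt a different sign or scaling convention for α.

```latex
% Fisher information metric and alpha-connection for a parametric family
% with log-likelihood \ell(\theta); \partial_i denotes \partial/\partial\theta^i.
g_{ij}(\theta) = \mathrm{E}_\theta\!\left[\partial_i \ell \,\partial_j \ell\right],
\qquad
\Gamma^{(\alpha)}_{ij,k}(\theta)
  = \mathrm{E}_\theta\!\left[\left(\partial_i \partial_j \ell
      + \frac{1-\alpha}{2}\,\partial_i \ell \,\partial_j \ell\right)
      \partial_k \ell\right].
```

In this convention α = 1 gives the exponential connection, α = -1 the mixture connection, and α = 0 the Riemannian (Levi-Civita) connection of the metric g.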
Casting the generalization as being from Normal to non-Normal errors, the extension can be considered to cause an 'unbundling' of the statistical properties of estimators, which in the case of Normal errors can be enjoyed simultaneously by the same estimator. In the general non-Normal case, such behaviour can no longer be guaranteed, implying that all properties may need to be considered separately, since, in the general case, specific properties of the estimator are associated with particular values of α.
This thesis outlines the fundamentals of the generalization of curvature measures to models of exponential type, in particular curved exponential families for which generalized linear models are an important subclass. This approach is used to generate insights into the properties of generalized linear models, with particular reference to the canonical link function as the non-Normal generalization of a linear model with Normal errors.
Indeed, the underlying 'theme' of this study is the investigation of the generalization of 'linearity' for the Normal error linear model to the non-Normal error nonlinear model. The potential simultaneity of estimator properties for the Normal distribution does not carry over to the generalization from the Normal to the non-Normal, since now each property has to be investigated separately, for each particular value of α.
As shown in Chapter 2, this individual treatment involves the statistical interpretation of each α-connection to demonstrate how key values of α are associated with estimator properties such as unbiasedness, stability of variance, lack of skewness, 'normal' likelihood and sufficiency. In terms of data analysis, all of these investigations need to be performed on the regression coefficients rather than on the fitted value (expectation parameter) scale. This requires the use of curved exponential families involving an imbedding of the regression coefficients in the original expectation space.
One of the properties of Normal error linear models is estimator sufficiency, which for generalized linear models implies a canonical link function. The associated α-connection is the exponential or Efron connection. This connection could be considered as the springboard for the generalization of Normal error linear models to non-Normal error nonlinear models, since for generalized linear models it mimics the special case of Normal errors, by the conditions under which it vanishes. The investigation of this connection and its special relationship with generalized linear models has generated in Chapter 2 a test of adequacy for canonical link functions, based on the skewness of the regression coefficients.
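The link between sufficiency and the canonical link can be made explicit with the usual exponential-family form of a generalized linear model. The notation below (θ, b, φ, c, η, X, β) is the conventional textbook notation and is an assumption here, as is a dispersion parameter φ common to all observations; it is not taken from the thesis.

```latex
% Exponential-family density with natural parameter \theta_i and dispersion \phi
f(y_i;\theta_i,\phi) = \exp\!\left\{\frac{y_i\theta_i - b(\theta_i)}{\phi} + c(y_i,\phi)\right\},
\qquad \mu_i = \mathrm{E}(y_i) = b'(\theta_i).

% Canonical link: the linear predictor equals the natural parameter
g(\mu_i) = \eta_i = x_i^{\mathsf T}\beta, \qquad g = (b')^{-1}
\;\Longrightarrow\; \theta_i = x_i^{\mathsf T}\beta.

% Log-likelihood for \beta: the data enter only through X^{\mathsf T} y
\ell(\beta) = \frac{1}{\phi}\left[\beta^{\mathsf T} X^{\mathsf T} y
  - \sum_i b\!\left(x_i^{\mathsf T}\beta\right)\right] + \sum_i c(y_i,\phi).
```

With φ known, the last line shows that X^T y is sufficient for β under the canonical link, which is the sense in which this link mirrors the Normal error linear model.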
The generalization of curvature follows a similar path to the α-connections, being a function of them in terms of the expectation parameters. In line with the decomposition demonstrated by Bates and Watts (1980) for Normal errors, generalized α-curvature decomposes into intrinsic and parameter-effects curvature; now, each particular α-curvature is associated with individual properties of the model, depending on the value of α. The other main change from the curvature measures of Bates and Watts is that, in the general case, a contribution to curvature is made from the error distribution as well as from the model and its parameterization. A major new result in Chapter 3 has been the proof of the invariance of intrinsic α-curvature in the general case, using a coordinate-based system. A consequence of examining the generalization has been to define in Chapter 3 a class of models, generalized nonlinear models, having a non-Normal error distribution and a general nonlinear response function. The relationship of this class with classes of known models such as generalized linear models again raises the question of what is meant by 'nonlinearity' in general. Several related derivations such as the invariance of parameter-effects curvature in generalized linear models, and results involving exponential curvature, generalized linear models and generalized nonlinear models verify expected behaviour and highlight the generalizations that are possible.
The generalized curvature measures are shown in Chapter 4 to be related to quantities of statistical interest such as the bias and covariance of estimators for curved exponential families, mirroring the known situation for nonlinear regression. For generalized linear models, alternative link functions to the canonical can be chosen on the basis of properties such as variance stabilization, 'normal' likelihood and lack of skewness. As expected, these links have been shown in Chapter 4 to be associated with specific α-connections. A table is presented of those link functions that produce the required properties on the expected value scale for each error distribution in a generalized linear model.
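As a small numerical illustration of what variance stabilization means, the sketch below takes a Poisson error distribution and the classical square-root transformation, and computes the variance of Y and of sqrt(Y) exactly by summing the probability mass function. It is not taken from the thesis and makes no claim about the entries of its table; the function names poisson_sqrt_moments and variance_table are hypothetical, and the truncation point of the sum is an assumption chosen so that the neglected tail is negligible.

```python
import math


def poisson_sqrt_moments(mu):
    """Return (mean, variance) of sqrt(Y) for Y ~ Poisson(mu).

    The moments are obtained by direct summation of the pmf, truncated
    far into the upper tail (an assumption; the omitted mass is negligible).
    """
    upper = int(mu + 20.0 * math.sqrt(mu) + 50)
    first = 0.0   # accumulates E[sqrt(Y)]
    second = 0.0  # accumulates E[Y], the second moment of sqrt(Y)
    for k in range(upper + 1):
        # pmf evaluated on the log scale to avoid overflow in mu**k / k!
        log_p = -mu + k * math.log(mu) - math.lgamma(k + 1)
        p = math.exp(log_p)
        first += math.sqrt(k) * p
        second += k * p
    return first, second - first * first


def variance_table(mus=(20.0, 50.0, 100.0, 400.0)):
    """Map each mean mu to (Var(Y), Var(sqrt(Y))) for Y ~ Poisson(mu)."""
    return {mu: (mu, poisson_sqrt_moments(mu)[1]) for mu in mus}


if __name__ == "__main__":
    for mu, (var_raw, var_sqrt) in variance_table().items():
        print(f"mu={mu:6.1f}  Var(Y)={var_raw:6.1f}  Var(sqrt(Y))={var_sqrt:.4f}")
```

On the raw scale the variance grows in proportion to the mean, whereas on the square-root scale it stays close to 1/4 across the range of means used here.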
The special relationship between curvature measures, nonlinear regression and generalized linear models is further demonstrated in Chapter 5 by the use of a new method for nonlinear regression based on a second-order approximant to the nonlinear function by means of a special generalized linear model. As expected, such an approximation follows the true function more closely than linearization; this is demonstrated empirically from calculations of leverage, parameter estimates and corresponding interval estimation. All these effects are predicted from considerations based on curvature measures, both intrinsic and parameter effects.
The effect of replication on curvature is known empirically and theoretically in the case of nonlinear regression. In Chapter 5 it is shown that replication has two implications for the effects of curvature in a generalized nonlinear model. Firstly, the central limit theorem produces convergence to the Normal distribution, so that the error contribution to general α-curvature becomes zero asymptotically. The effect of replication on the model contribution is less clear, since the general limiting case is nonlinear regression if only the error component of α-curvature is considered. Locally, the generalized nonlinear model will be well approximated by a linear model. Secondly, under some conditions, a generalized nonlinear model will converge locally to a generalized linear model with canonical link. However, when the error component and the model component are considered, the overall effect of intense replication will be to produce locally a linear model with Normal errors.