Degree Name

Doctor of Philosophy


School of Mathematics and Applied Statistics


Tree methodology is a potentially powerful exploratory analysis tool for survival data, enabling identification of risk factors, comparison of treatments and prognosis for individual subjects. In survival trees, node membership is dynamic rather than static: sample sizes change as subjects drop out of the study. In the case of time-dependent covariates, subjects may also move within a tree, and delayed or intermittent treatment results in movement between treatment groups. In particular, this dynamic behaviour complicates the definition of sample size at each node site; more generally, it complicates the computation, growth, display and interpretation of survival trees.

To deal with dynamic tree structure, it is necessary to first develop some mathematical notation. A methodology is required for graphical and numerical display of the information extracted from survival trees. As an integral part of tree-based methods, a procedure is required for splitting and stopping.

In the context of survival analysis, the aim is usually to isolate the effect of treatment. Therefore a methodology is required for the comparison of treatment groups, allowing for the possibility of unbalanced design, dynamic group membership, and presence of interaction.

In this thesis, attention is focussed on transitions rather than subjects. The change in focus is necessary to view the whole procedure of tree growing as dynamic. A binary labelling scheme is used for dynamic updating. Both the notation and labelling scheme may also be potentially useful in general CART methodology. To accomplish the comparison of treatments, the transition approach is extended to parallel trees.
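The idea of recording transitions between binary-labelled nodes can be illustrated with a minimal sketch. This is not the thesis's own scheme, whose details are not given here. It assumes that each node is labelled by a string of bits ("0" for the left branch, "1" for the right, with the root as the empty string) and that a subject's covariates are observed at a sequence of times. The function names, the `splits` layout and the example covariates are all hypothetical.

```python
def node_label(splits, covariates):
    """Return the binary label of the leaf that `covariates` falls into.

    `splits` maps a node label to (covariate_name, cutpoint); any label
    not present in `splits` is treated as a leaf (an assumed layout).
    """
    label = ""  # root node
    while label in splits:
        name, cut = splits[label]
        # Append "0" for the left branch (value <= cutpoint), "1" for the right.
        label += "0" if covariates[name] <= cut else "1"
    return label


def transitions(splits, history):
    """List (time, from_label, to_label) each time a subject changes leaf.

    `history` is a list of (time, covariates) pairs sorted by time.
    """
    moves = []
    previous = None
    for time, covariates in history:
        current = node_label(splits, covariates)
        if previous is not None and current != previous:
            moves.append((time, previous, current))
        previous = current
    return moves


# Hypothetical tree: split on age at the root, then on transplant status.
example_splits = {"": ("age", 50), "1": ("transplant", 0)}
# Hypothetical subject who receives a transplant at day 30.
example_history = [
    (0, {"age": 55, "transplant": 0}),
    (30, {"age": 55, "transplant": 1}),
]
example_moves = transitions(example_splits, example_history)
```

Under these assumptions the subject starts in leaf "10" and moves to leaf "11" at day 30, so the unit of bookkeeping is the move between nodes rather than the subject.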

The Kaplan-Meier formula for the survival function is adjusted to handle movement from one group to another. S-Plus functions are developed that can be used in any survival setting to search over potential branching covariates and cutpoints. These functions are capable of dealing with fixed or time-dependent covariates.
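One common way to let a product-limit estimate accommodate movement between groups is to split each subject's follow-up into (start, stop] intervals, one for each spell in a group, and to count a subject as at risk in a group only during that spell. The sketch below uses this counting-process formulation in Python. It illustrates the general idea only and is not necessarily the adjustment developed in the thesis. The function name, data layout and example values are assumptions.

```python
from collections import defaultdict


def km_with_transitions(intervals):
    """Product-limit survival estimate per group with changing membership.

    `intervals` is a list of (start, stop, event, group) tuples with
    start < stop. A subject who changes group contributes one interval
    per spell; `event` is 1 only if death occurs at `stop`.
    Returns {group: [(event_time, survival), ...]}.
    """
    by_group = defaultdict(list)
    for start, stop, event, group in intervals:
        by_group[group].append((start, stop, event))

    curves = {}
    for group, rows in by_group.items():
        event_times = sorted({stop for _, stop, ev in rows if ev})
        surv = 1.0
        curve = []
        for t in event_times:
            # At risk at t: spells that began before t and have not ended before t.
            at_risk = sum(1 for s, e, _ in rows if s < t <= e)
            deaths = sum(1 for _, e, ev in rows if ev and e == t)
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        curves[group] = curve
    return curves


# Hypothetical data: subjects B and D move from "control" to "treated".
example_intervals = [
    (0, 5, 1, "control"),                          # A: dies at 5
    (0, 3, 0, "control"), (3, 8, 1, "treated"),    # B: switches at 3, dies at 8
    (0, 10, 0, "control"),                         # C: censored at 10
    (0, 2, 0, "control"), (2, 6, 1, "treated"),    # D: switches at 2, dies at 6
]
example_curves = km_with_transitions(example_intervals)
```

In this formulation a subject who leaves a group is treated like a censored observation in that group and like a delayed entry in the group they join, so the risk-set size at each event time reflects current rather than initial membership.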

Particular attention has been paid to the graphical and numerical display of survival trees, with the aim of enhancing communication between statisticians and medical researchers. The idea and implementation of adjusted sample size resolves the complication pertaining to the definition of sample size at each node site. To demonstrate the potential advantages of the new approach, data sets on heart transplantation and AIDS are analysed.



Unless otherwise indicated, the views expressed in this thesis are those of the author and do not necessarily represent the views of the University of Wollongong.