Accelerated failure time model
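In the statistical area of survival analysis, an accelerated failure time model (AFT model) is a parametric model that provides an alternative to the commonly used proportional hazards models. Whereas a proportional hazards model assumes that a covariate multiplies the hazard by a constant, an AFT model assumes that a covariate accelerates or decelerates the time to the event by a constant factor.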
Model specification
In full generality, the accelerated failure time model can be specified as[2]

$$\lambda(t \mid \theta) = \theta \lambda_0(\theta t)$$

where $\lambda_0$ is the baseline hazard function and $\theta$ denotes the joint effect of covariates, typically $\theta = \exp(-[\beta_1 X_1 + \cdots + \beta_p X_p])$. (Specifying the regression coefficients with a negative sign implies that high values of the covariates increase the survival time, but this is merely a sign convention; without a negative sign, they increase the hazard.)
This is satisfied if the probability density function of the event is taken to be $f(t \mid \theta) = \theta f_0(\theta t)$; it then follows for the survival function that $S(t \mid \theta) = S_0(\theta t)$. Since $P(T\theta > s) = S(s/\theta \mid \theta) = S_0(s)$, the moderated life time $T$ is distributed such that $T\theta$ and the unmoderated life time $T_0$ have the same distribution. Consequently, $\log(T)$ can be written as

$$\log(T) = -\log(\theta) + \log(T\theta) := -\log(\theta) + \epsilon$$

where the last term is distributed as $\log(T_0)$, i.e., independently of $\theta$. This reduces the accelerated failure time model to regression analysis (typically a linear model) where $-\log(\theta)$ represents the fixed effects and $\epsilon$ represents the noise. Different distributions of $\epsilon$ imply different distributions of $T_0$, i.e., different baseline distributions of the survival time.

Typically, in survival-analytic contexts, many of the observations are censored: we only know that $T_i > t_i$, not $T_i = t_i$. The former case represents survival beyond the last follow-up time (a right-censored observation), while the latter case represents an event such as death observed during the follow-up. Right-censored observations can pose technical challenges for estimating the model if the distribution of $T_0$ is unusual.
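The way censored and uncensored observations enter the likelihood can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the baseline life time is assumed to be exponential with unit rate, the model has one covariate plus an intercept, and the names `aft_loglik` and `toy_data` are hypothetical. An observed event at time t contributes log f(t | θ) = log θ + log f_0(θt), while a right-censored time contributes log S(t | θ) = log S_0(θt).

```python
import math


def aft_loglik(beta0, beta1, data):
    """Censored log-likelihood of an AFT model.

    Assumption for this sketch: the baseline life time T_0 is exponential
    with unit rate, so f_0(u) = S_0(u) = exp(-u).

    data: iterable of (time, event, x) triples, where event is True for an
    observed event and False for a right-censored time.
    """
    total = 0.0
    for t, event, x in data:
        # Negative sign convention: larger beta * x means longer survival.
        theta = math.exp(-(beta0 + beta1 * x))
        u = theta * t  # time rescaled to the baseline clock
        if event:
            total += math.log(theta) - u  # log f(t | theta)
        else:
            total += -u  # log S(t | theta)
    return total


# Hypothetical toy data: two events and one censored time, covariate fixed at 0.
toy_data = [(2.0, True, 0.0), (3.0, False, 0.0), (5.0, True, 0.0)]

for beta0 in (1.0, math.log(5.0), 2.0):
    print(round(beta0, 3), round(aft_loglik(beta0, 0.0, toy_data), 3))
```

With every covariate equal to zero, the log-likelihood reduces to 2 log θ − 10θ, which is maximised at θ = 0.2, that is, at an intercept of log 5. The middle value in the loop therefore gives the largest log-likelihood of the three.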
The interpretation of $\theta$ in accelerated failure time models is straightforward: $\theta = 2$ means that everything in the relevant life history of an individual happens twice as fast. For example, if the model concerns the development of a tumor, it means that all of the pre-stages progress twice as fast as for the unexposed individual, implying that the expected time until a clinical disease is 0.5 of the baseline time. However, this does not mean that the hazard function is always twice as high; that would be the proportional hazards model.
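The distinction between halving the time scale and doubling the hazard can be checked numerically. The sketch below assumes a log-logistic baseline with unit scale and a shape parameter of 2; both choices, and all function names, are illustrative assumptions rather than part of the model definition. It uses the relations S(t | θ) = S_0(θt) and λ(t | θ) = θ λ_0(θt).

```python
SHAPE = 2.0  # hypothetical log-logistic shape parameter, chosen for illustration


def baseline_survival(t):
    """Survival function of a unit-scale log-logistic distribution."""
    return 1.0 / (1.0 + t ** SHAPE)


def baseline_hazard(t):
    """Hazard function of the same baseline distribution."""
    return SHAPE * t ** (SHAPE - 1.0) / (1.0 + t ** SHAPE)


def aft_survival(t, theta):
    return baseline_survival(theta * t)


def aft_hazard(t, theta):
    return theta * baseline_hazard(theta * t)


theta = 2.0

# The baseline median is 1 (where S_0 = 0.5); with theta = 2 the median
# moves to 0.5, i.e. half the baseline time.
print(aft_survival(0.5, theta))

# The hazard ratio relative to baseline is not constant over time.
hazard_ratios = {
    t: aft_hazard(t, theta) / baseline_hazard(t) for t in (0.1, 1.0, 5.0)
}
for t, ratio in hazard_ratios.items():
    print(t, round(ratio, 3))
```

Under these assumptions the hazard ratio is roughly 3.9 at t = 0.1, 1.6 at t = 1 and 1.03 at t = 5, so it is never fixed at 2 even though every quantile of the survival time is exactly halved.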
Statistical issues
Unlike proportional hazards models, in which Cox's semi-parametric proportional hazards model is more widely used than parametric models, AFT models are predominantly fully parametric, i.e. a probability distribution is specified for $\log(T_0)$. (Buckley and James[3] proposed a semi-parametric AFT model, but its use is relatively uncommon in applied research; in a 1992 paper, Wei[4] pointed out that the Buckley–James model has no theoretical justification and lacks robustness, and reviewed alternatives.) This can be a problem if a degree of realistic detail is required for modelling the distribution of a baseline lifetime. Hence, technical developments in this direction would be highly desirable.
Unlike proportional hazards models, the regression parameter estimates from AFT models are robust to omitted covariates.
The results of AFT models are easily interpreted.[7] For example, the results of a clinical trial with mortality as the endpoint could be interpreted as a certain percentage increase in future life expectancy on the new treatment compared to the control. So a patient could be informed that they would be expected to live (say) 15% longer if they took the new treatment. Hazard ratios can prove harder to explain in lay terms.
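The percentage interpretation follows directly from a fitted coefficient. As a minimal sketch, assume a single covariate x with θ = exp(−βx), so that the survival time is scaled by 1/θ = exp(βx); the function name and the coefficient value below are hypothetical and chosen only to reproduce the 15% figure.

```python
import math


def percent_change_in_survival_time(beta, delta_x=1.0):
    """Percentage change in survival time when the covariate rises by delta_x.

    Assumes theta = exp(-beta * x), so survival time scales by exp(beta * x).
    """
    return (math.exp(beta * delta_x) - 1.0) * 100.0


# Hypothetical coefficient for a 0/1 treatment indicator.
beta_treatment = math.log(1.15)
print(round(percent_change_in_survival_time(beta_treatment), 1))  # 15.0
```

The quantity exp(β) is often called a time ratio or acceleration factor; a value of 1.15 corresponds to survival times that are 15% longer in the treated group.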
Distributions used in AFT models
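The log-logistic distribution is a widely used choice for AFT models. Unlike the Weibull distribution, it can exhibit a non-monotonic hazard function that increases at early times and decreases at later times. Its cumulative distribution function has a simple closed form, which matters computationally when fitting data with censoring, because censored observations contribute the survival function $S(t \mid \theta) = 1 - F(t \mid \theta)$ to the likelihood.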
The Weibull distribution (including the exponential distribution as a special case) can be parameterised as either a proportional hazards model or an AFT model, and is the only family of distributions to have this property. The results of fitting a Weibull model can therefore be interpreted in either framework. However, the biological applicability of this model may be limited by the fact that the hazard function is monotonic, i.e. either decreasing or increasing.
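The dual interpretation of the Weibull model can be verified numerically. The sketch below assumes a unit-scale Weibull baseline with hazard k t^(k−1); the shape value and function names are illustrative assumptions. Substituting into λ(t | θ) = θ λ_0(θt) gives θ^k · k t^(k−1), so the hazard ratio relative to baseline is the constant θ^k, which is exactly the proportional hazards property.

```python
def weibull_hazard(t, shape):
    """Hazard of a unit-scale Weibull distribution: shape * t**(shape - 1)."""
    return shape * t ** (shape - 1.0)


def weibull_aft_hazard(t, theta, shape):
    """AFT hazard theta * lambda_0(theta * t) with a Weibull baseline."""
    return theta * weibull_hazard(theta * t, shape)


shape, theta = 1.5, 2.0  # hypothetical values chosen for illustration

ratios = [
    weibull_aft_hazard(t, theta, shape) / weibull_hazard(t, shape)
    for t in (0.1, 1.0, 5.0)
]
print([round(r, 4) for r in ratios])
print(round(theta ** shape, 4))
```

Every ratio equals θ^k regardless of t. With θ = exp(−βx), the log hazard ratio is therefore −kβx, which gives the conversion between the AFT coefficient and the proportional hazards coefficient under this parameterisation.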
Any distribution on a multiplicatively closed group, such as the positive real numbers, is suitable for an AFT model. Other distributions include the log-normal, gamma, hypertabastic, Gompertz, and inverse Gaussian distributions, although they are less popular than the log-logistic, partly because their cumulative distribution functions do not have a closed form. Finally, the generalized gamma distribution is a three-parameter distribution that includes the Weibull, log-normal and gamma distributions as special cases.
References
Further reading
- Bradburn, MJ; Clark, TG; Love, SB; Altman, DG (2003), "Survival Analysis Part II: Multivariate data analysis - an introduction to concepts and methods", British Journal of Cancer, 89 (3): 431–436, PMID 12888808
- Hougaard, Philip (1999), "Fundamentals of Survival Data", Biometrics, 55 (1): 13–22, PMID 11318147
- Collett, D. (2003), Modelling Survival Data in Medical Research (2nd ed.), CRC Press, ISBN 978-1-58488-325-8
- ISBN 978-0-412-24490-2
- Marubini, Ettore; Valsecchi, Maria Grazia (1995), Analysing Survival Data from Clinical Trials and Observational Studies, Wiley, ISBN 978-0-470-09341-2
- Martinussen, Torben; Scheike, Thomas (2006), Dynamic Regression Models for Survival Data, Springer, ISBN 0-387-20274-9
- Bagdonavicius, Vilijandas; Nikulin, Mikhail (2002), Accelerated Life Models. Modeling and Statistical Analysis, Chapman & Hall/CRC, ISBN 1-58488-186-0