Accelerated failure time model

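In survival analysis, the accelerated failure time (AFT) model is an alternative to the commonly used proportional hazards models. Whereas a proportional hazards model assumes that the effect of a covariate is to multiply the hazard by some constant, an AFT model assumes that the effect of a covariate is to accelerate or decelerate the life course of a disease by some constant. There is strong basic science evidence from C. elegans experiments by Stroustrup et al.^[1] indicating that AFT models are the correct model for biological survival processes.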

Model specification

In full generality, the accelerated failure time model can be specified as^[2]

$$\lambda (t|\theta )=\theta \lambda _{0}(\theta t)$$

where $\theta$ denotes the joint effect of covariates, typically $\theta =\exp(-[\beta _{1}X_{1}+\cdots +\beta _{p}X_{p}])$ . (Specifying the regression coefficients with a negative sign implies that high values of the covariates increase the survival time, but this is merely a sign convention; without a negative sign, they increase the hazard.)
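The sign convention can be made concrete with a small sketch. The coefficients and covariate values below are hypothetical, chosen only for illustration, and `acceleration_factor` is an illustrative helper name rather than part of any library.

```python
import math

def acceleration_factor(beta, x):
    """theta = exp(-(beta_1 x_1 + ... + beta_p x_p)), negative-sign convention."""
    linear_predictor = sum(b * xi for b, xi in zip(beta, x))
    return math.exp(-linear_predictor)

# Hypothetical coefficients and two covariate patterns that differ
# only in the first covariate.
beta = [0.5, -0.2]
x_low = [0.0, 1.0]
x_high = [1.0, 1.0]

theta_low = acceleration_factor(beta, x_low)
theta_high = acceleration_factor(beta, x_high)

# A positive coefficient lowers theta, i.e. slows the life course down.
# Time scales are proportional to 1/theta, so the ratio of time scales
# between the two patterns is theta_low / theta_high = exp(beta[0]).
time_ratio = theta_low / theta_high
print(theta_low, theta_high, time_ratio)
```

With a positive first coefficient, the higher covariate value yields the smaller $\theta$ and hence the longer survival time scale.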

This is satisfied if the probability density function of the event is taken to be $f(t|\theta )=\theta f_{0}(\theta t)$; it then follows for the survival function that $S(t|\theta )=S_{0}(\theta t)$. From this it follows that the moderated life time $T$ is distributed such that $T\theta$ and the unmoderated life time $T_{0}$ have the same distribution, since $P(T\theta >u)=S(u/\theta |\theta )=S_{0}(u)$. Consequently, $\log(T)$ can be written as

$$\log(T)=-\log(\theta )+\log(T\theta ):=-\log(\theta )+\epsilon$$

where the last term is distributed as $\log(T_{0})$, i.e., independently of $\theta$. This reduces the accelerated failure time model to regression analysis (typically a linear model) where $-\log(\theta )$ represents the fixed effects, and $\epsilon$ represents the noise. Different distributions of $\epsilon$ imply different distributions of $T_{0}$, i.e., different baseline distributions of the survival time. Typically, in survival-analytic contexts, many of the observations are censored: we only know that $T_{i}>t_{i}$, not $T_{i}=t_{i}$. The former case means the individual survived beyond the observed time $t_{i}$ (a right-censored observation), while the latter case means the event (e.g. death) was observed at $t_{i}$ during follow-up. Right-censored observations can pose technical challenges for estimating the model if the distribution of $T_{0}$ is unusual.
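To illustrate how censoring enters estimation, the following is a minimal sketch of a right-censored log-likelihood for the log-linear form $\log(T)=\mu +\sigma \epsilon$ with $\mu =-\log(\theta )$. It assumes $\epsilon$ follows the standard minimum extreme-value distribution (which makes $T$ Weibull); the scale parameter $\sigma$, the function name and the data are assumptions made for this example and do not come from the text above.

```python
import math

def weibull_aft_loglik(times, observed, mu, sigma):
    """Log-likelihood of log(T) = mu + sigma * eps under right censoring.

    eps is assumed to follow the standard minimum extreme-value
    distribution, so T is Weibull. observed[i] is True for an event at
    times[i] and False when the observation is right-censored there.
    """
    total = 0.0
    for t, is_event in zip(times, observed):
        z = (math.log(t) - mu) / sigma
        if is_event:
            # Event: log density of T = log f_eps(z) - log(sigma * t),
            # with log f_eps(z) = z - exp(z).
            total += z - math.exp(z) - math.log(sigma * t)
        else:
            # Censored: only the survival function is needed,
            # log S_eps(z) = -exp(z).
            total += -math.exp(z)
    return total

# Hypothetical follow-up times and event indicators.
times = [2.0, 3.5, 1.2, 4.0, 4.0]
observed = [True, True, True, False, False]

# With sigma = 1 the model is exponential, and the maximum-likelihood
# estimate is exp(mu_hat) = total follow-up time / number of events.
mu_hat = math.log(sum(times) / sum(observed))
print(mu_hat, weibull_aft_loglik(times, observed, mu_hat, 1.0))
```

Events contribute a density term and censored observations contribute a survival term, which is why a tractable survival function matters in practice.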

The interpretation of $\theta$ in accelerated failure time models is straightforward: $\theta =2$ means that everything in the relevant life history of an individual happens twice as fast. For example, if the model concerns the development of a tumor, it means that all of the pre-stages progress twice as fast as for the unexposed individual, implying that the expected time until a clinical disease is half the baseline time. However, this does not mean that the hazard function $\lambda (t|\theta )$ is always twice as high; that would be the proportional hazards model.
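The distinction can be checked numerically. The sketch below assumes a log-logistic baseline with hypothetical parameters (`ALPHA`, `SHAPE`); with $\theta =2$ the median survival time halves, while the ratio of the accelerated hazard to the baseline hazard changes with $t$ instead of staying fixed at 2.

```python
ALPHA, SHAPE = 1.0, 2.0  # hypothetical log-logistic scale and shape

def baseline_survival(t):
    """S_0(t) for a log-logistic baseline; its median is ALPHA."""
    return 1.0 / (1.0 + (t / ALPHA) ** SHAPE)

def baseline_hazard(t):
    """lambda_0(t) = f_0(t) / S_0(t) for the same baseline."""
    u = (t / ALPHA) ** SHAPE
    return (SHAPE / t) * u / (1.0 + u)

def aft_survival(t, theta):
    """S(t | theta) = S_0(theta * t)."""
    return baseline_survival(theta * t)

def aft_hazard(t, theta):
    """lambda(t | theta) = theta * lambda_0(theta * t)."""
    return theta * baseline_hazard(theta * t)

theta = 2.0
median_baseline = ALPHA
median_accelerated = ALPHA / theta  # every quantile is divided by theta

# The hazard ratio relative to baseline is not constant in t.
hazard_ratios = [aft_hazard(t, theta) / baseline_hazard(t) for t in (0.1, 1.0, 10.0)]
print(median_baseline, median_accelerated, hazard_ratios)
```

For this particular baseline the hazard ratio starts near 4 at early times and approaches 1 at late times, so the proportional hazards assumption does not hold.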

Statistical issues

Unlike proportional hazards models, in which Cox's semi-parametric proportional hazards model is more widely used than parametric models, AFT models are predominantly fully parametric, i.e. a probability distribution is specified for $\log(T_{0})$. (Buckley and James^[3] proposed a semi-parametric AFT model, but its use is relatively uncommon in applied research; in a 1992 paper, Wei^[4] pointed out that the Buckley–James model has no theoretical justification and lacks robustness, and reviewed alternatives.) Having to specify a distribution can be a problem if a degree of realistic detail is required for modelling the distribution of a baseline lifetime. Hence, technical developments in this direction would be highly desirable.

Unlike proportional hazards models, the regression parameter estimates from AFT models are robust to omitted covariates. They are also less affected by the choice of probability distribution.^[5]^[6]

The results of AFT models are easily interpreted.^[7] For example, the results of a clinical trial with mortality as the endpoint could be interpreted as a certain percentage increase in future life expectancy on the new treatment compared to the control. So a patient could be informed that they would be expected to live (say) 15% longer if they took the new treatment. Hazard ratios can prove harder to explain in layman's terms.
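As a rough illustration of this kind of statement, a regression coefficient can be converted into a percentage change in survival time. The sketch assumes the negative-sign convention $\theta =\exp(-\beta x)$, so that survival times are multiplied by $\exp(\beta x)$; the coefficient and the control-arm median are hypothetical values, not trial results.

```python
import math

def percent_change_in_survival_time(beta):
    """Percent change in survival time per unit increase of a covariate.

    Assumes theta = exp(-beta * x), so survival times are multiplied
    by 1 / theta = exp(beta * x).
    """
    return 100.0 * (math.exp(beta) - 1.0)

beta_treatment = math.log(1.15)  # hypothetical coefficient of a 0/1 treatment indicator
baseline_median_years = 4.0      # hypothetical median survival on control

pct = percent_change_in_survival_time(beta_treatment)
treated_median_years = baseline_median_years * math.exp(beta_treatment)
print(f"{pct:.1f}% longer; median {baseline_median_years} -> {treated_median_years:.2f} years")
```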

Distributions used in AFT models

The log-logistic distribution provides the most commonly used AFT model. Unlike the Weibull distribution, it can exhibit a non-monotonic hazard function which increases at early times and decreases at later times. It is somewhat similar in shape to the log-normal distribution but it has heavier tails. The log-logistic cumulative distribution function has a simple closed form, which becomes important computationally when fitting data with censoring. For the censored observations one needs the survival function, which is the complement of the cumulative distribution function, i.e. one needs to be able to evaluate $S(t|\theta )=1-F(t|\theta )$.
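The closed forms are short enough to write out directly. The sketch below uses one common parameterisation of the log-logistic distribution, with scale `alpha` and shape `beta` (both values hypothetical), and locates the hazard peak on a grid; for shape greater than 1 the peak of this parameterisation lies at $\alpha (\beta -1)^{1/\beta }$.

```python
def loglogistic_cdf(t, alpha, beta):
    """F(t) = 1 / (1 + (t / alpha)^(-beta))."""
    return 1.0 / (1.0 + (t / alpha) ** (-beta))

def loglogistic_survival(t, alpha, beta):
    """S(t) = 1 - F(t), the term needed for censored observations."""
    return 1.0 - loglogistic_cdf(t, alpha, beta)

def loglogistic_hazard(t, alpha, beta):
    """h(t) = f(t) / S(t) = (beta / t) * u / (1 + u), u = (t / alpha)^beta."""
    u = (t / alpha) ** beta
    return (beta / t) * u / (1.0 + u)

alpha, beta = 3.0, 2.5  # hypothetical scale and shape (shape > 1)

# Evaluate the hazard on a grid from 0.05 to 20.0 and find its maximum.
grid = [0.05 * k for k in range(1, 401)]
hazards = [loglogistic_hazard(t, alpha, beta) for t in grid]
t_peak_grid = grid[hazards.index(max(hazards))]

# Analytic location of the peak for shape > 1.
t_peak_formula = alpha * (beta - 1.0) ** (1.0 / beta)
print(t_peak_grid, t_peak_formula)
```

The hazard rises to a single interior maximum and then declines, which a Weibull hazard cannot do.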

The Weibull distribution (including the exponential distribution as a special case) can be parameterised as either a proportional hazards model or an AFT model, and is the only family of distributions to have this property. The results of fitting a Weibull model can therefore be interpreted in either framework. However, the biological applicability of this model may be limited by the fact that the hazard function is monotonic, i.e. either decreasing or increasing.
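The dual interpretation can be verified in one line. Assuming a Weibull baseline hazard with shape $k$ and scale $s$ (a parameterisation chosen here for illustration), substituting into the AFT specification gives a hazard that is a constant multiple of the baseline:

```latex
\lambda_0(t) = \frac{k}{s}\left(\frac{t}{s}\right)^{k-1}
\quad\Longrightarrow\quad
\lambda(t|\theta) = \theta\,\lambda_0(\theta t)
= \theta\,\frac{k}{s}\left(\frac{\theta t}{s}\right)^{k-1}
= \theta^{k}\,\lambda_0(t).
```

Under this parameterisation an acceleration factor $\theta$ corresponds to a hazard ratio of $\theta ^{k}$, so with $\theta =\exp(-\beta x)$ the proportional hazards coefficient is $-k\beta$.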

Any distribution on a multiplicatively closed group, such as the positive real numbers, is suitable for an AFT model. Other distributions include the log-normal, gamma, hypertabastic, Gompertz, and inverse Gaussian distributions, although they are less popular than the log-logistic, partly because their cumulative distribution functions do not have a closed form. Finally, the generalized gamma distribution is a three-parameter distribution that includes the Weibull, log-normal and gamma distributions as special cases.

References

[1] PMC 4828198.

[2] Kalbfleisch & Prentice (2002). The Statistical Analysis of Failure Time Data (2nd ed.). Hoboken, NJ: Wiley Series in Probability and Statistics.

[3] JSTOR 2335161.

[4] PMID 1480879.

[5] PMID 15449337.

[6] PMID 9004393.

[7] doi:10.1177/009286150203600312.

Further reading

Bradburn, MJ; Clark, TG; Love, SB; Altman, DG (2003), "Survival Analysis Part II: Multivariate data analysis - an introduction to concepts and methods", British Journal of Cancer, 89 (3): 431–436, PMID 12888808

Hougaard, Philip (1999), "Fundamentals of Survival Data", Biometrics, 55 (1): 13–22, PMID 11318147

Collett, D. (2003), Modelling Survival Data in Medical Research (2nd ed.), CRC Press, ISBN 978-1-58488-325-8

ISBN 978-0-412-24490-2

Marubini, Ettore; Valsecchi, Maria Grazia (1995), Analysing Survival Data from Clinical Trials and Observational Studies, Wiley, ISBN 978-0-470-09341-2

Martinussen, Torben; Scheike, Thomas (2006), Dynamic Regression Models for Survival Data, Springer, ISBN 0-387-20274-9

Bagdonavicius, Vilijandas; Nikulin, Mikhail (2002), Accelerated Life Models. Modeling and Statistical Analysis, Chapman & Hall/CRC, ISBN 1-58488-186-0
