Chapter 3

(AST405) Lifetime data analysis

Author
Affiliation

Md Rasel Biswas

ISRT, University of Dhaka

3 Some Nonparametric and Graphical Procedures

3.1 Introduction

  • Graphs and simple data summaries are important for both description and analysis of data.

  • They are closely related to nonparametric estimates of distributional characteristics; many graphs are just plots of some estimate.

  • This chapter introduces nonparametric estimation and procedures for portraying univariate lifetime data.

  • Tools such as frequency tables and histograms, empirical distribution functions, probability plots, and data density plots are familiar across different branches of statistics.

  • For lifetime data, the presence of censoring makes it necessary to modify the standard methods.


  • To illustrate, let us consider one of the most elementary procedures in statistics, the formation of a relative-frequency table.

  • Suppose we have a complete (i,e., uncensored) sample of n lifetimes from some population.

  • Divide the time axis [0,) into k+1 intervals lj=[aj1,aj),j=1,,k+1, where 0=a0<a1<<ak<ak+1=; with ak being the upper limit on observation.

  • Let dj be the observed number of lifetimes that lie in Ij.

  • A frequency table is just a list of the intervals and their associated frequencies, dj, or relative fequencies, dj/n.

  • A relative-frequency histogram, consisting of rectangles with bases on [aj1,aj ) and areas dj/n(j=1,,k); is often drawn to portray this.


  • When data are censored, however, it is generally not possible to form the frequency table, because if a lifetime is censored, we do not know which interval, Ij, it lies in. As a result, we cannot determine the dj.

  • Section 3.6 describes how to deal with frequency tables when data are censored; this is referred to as life table methodology.

  • First, however, we develop methods for ungrouped data.

  • Section 3.2 discusses nonparametric estimation of distribution, survivor, or cumulative hazard functions under right censoring.

    • This also forms the basis for descriptive and diagnostic plots, which are presented in Section 3.3.
  • Sections 3.4 and 3.5 deal with the estimation of hazard functions and with nonparametric estimation from some other types of incomplete data.

3.2 Non-parametric Estimation of a Survivor Function and Quantiles

Recall: Parametric estimation of survivor function

This method assumes a parametric model (e.g., exponential distribution) of the data and we estimate the parameter first, then form the estimator of the survival function. In Parametric approach, we assume that we model the distribution as an exponential distribution with unknown parameter λ. Then we find an estimator of λ, which is λ^. Then we estimate the survival function using S^(t)=λ^eλ^t


Non-parametric estimation of a survivor function

  • As an example, consider the following sample of n complete observations {t1,,tn}

  • Empirical survivor function (ESF) for a specific value t>0 is defined as S^(t)=Pr^(Tt)=number of observations t n

    • S^(t) is a step function that decreases by (1/n) just after each observed lifetime if all observations are distinct

    • Generally, the ESF drops by (d/n) just past t if d lifetimes equal to t

  • For a specific value t>0, ESF can also be defined as S^(t+)=Pr^(T>t)=number of observations >t n


Acute myeloid leukemia (AML)

  • AML patients who reached a remission status after the treatment of chemotherapy were randomly assigned to one of the two treatments

    • maintenance chemotherapy

    • no-maintenance chemotherapy (control group)

  • Time of interest: Length of remission (in weeks)

    • maintained: 13, 161+, 9, 13+, 18, 28+, 31, 23, 34, 45+, 48

    • control: 5, 8, 12, 5, 30, 33, 8, 16+, 23, 27, 43, 45

Does maintenance chemotherapy prolong the time until relapse?


  • Estimate the survival function for the following sample of 11 complete observations of control group (n=11)
5 5 8 8 12 23 27 30 33 43 45

  • Estimates of survival function for t=0,4,5,8, S^(0+)=Pr^(T>0)=(11/11)=1S^(5+)=Pr^(T>5)=(9/11)=0.818S^(8+)=Pr^(T>8)=(7/11)=0.636S^(12+)=Pr^(T>12)=(6/11)=0.545and so on
    • Find S^(9) or S^(9+)

Sorted lifetimes 5,5,8,8,12,23,27,30,33,43,45

  • Estimated survivor function S^(tj+)=rjn

rj=i=1nI(ti>tj)number of observations>tjntotal number of observations


tj rj S^(tj+)
0 11 1.000
5 9 0.818
8 7 0.636
12 6 0.545
23 5 0.455
27 4 0.364
30 3 0.273
33 2 0.182
43 1 0.091
45 0 0.000

Nonparametric estimate of survivor function (Empirical Survivor Function - ESF)


Exercise

The following are life times of 21 lung cancer patients receiving control treatment (with no censoring): 1,1,2,2,3,4,4,5,5,8,8,8,8,11,11,12,12,15,17,22,23

  • Draw the ESF

  • How would we estimate S(10), the probability that an individual survives to time 10 or later?


Let’s get back to the AML example:

Sorted lifetimes: 5,5,8,8,12,23,27,30,33,43,45

tj nj dj S^(tj+)
0 11 0 1.000
5 11 2 0.818
8 9 2 0.636
12 7 1 0.545
23 6 1 0.455
27 5 1 0.364
30 4 1 0.273
33 3 1 0.182
43 2 1 0.091
45 1 1 0.000

nj=number of subjects alive (or ar risk) just before time tjdj=number of subjects failed at time tj


tj nj dj p^j S^(tj+)
0 11 0 1.000 1.000
5 11 2 0.818 0.818
8 9 2 0.778 0.636
12 7 1 0.857 0.545
23 6 1 0.833 0.455
27 5 1 0.800 0.364
30 4 1 0.750 0.273
33 3 1 0.667 0.182
43 2 1 0.500 0.091
45 1 1 0.000 0.000

p^j=Pr^(T>tj|Ttj)=1djnj


Relationship between p^j and S^(tj+)

tj nj dj p^j S^(tj+)
0 11 0 1.000 1.000 = 1.000
5 11 2 0.818 1.000*0.818 = 0.818
8 9 2 0.778 1.0000.8180.778 = 0.636
12 7 1 0.857 ’’ 0.545
23 6 1 0.833 ’’ 0.455
27 5 1 0.800 ’’ 0.364
30 4 1 0.750 ’’ 0.273
33 3 1 0.667 ’’ 0.182
43 2 1 0.500 ’’ 0.091
45 1 1 0.000 ’’ 0.000

  • Sorted unique lifetimes 5,8,12,23,27,30,33,43,45

P(T>8)=P(T>8|T8)P(T8)=P(T>8|T8)P(T>5)=P(T>8|T8)P(T>5|T5)P(T5)=P(T>8|T8)P(T>5|T5)P(T0)=0.778×0.818×1.0=0.636


  • Sorted unique lifetimes 5,8,12,23,27,30,33,43,45

P(T>10)=P(T>10|T10)P(T10)=P(T>10|T10)P(T>8)=P(T>10|T10)P(T>8|T8)P(T8)=P(T>10|T10)P(T>8|T8)P(T>5|T5)P(T0)=1.0×0.778×0.818×1.0=0.636


tj nj dj p^j S^(t+) Ij
0 11 0 1.000 1.000 = 1.000 [0, 5)
5 11 2 0.818 1.000*0.818 = 0.818 [5, 8)
8 9 2 0.778 1.0000.8180.778 = 0.636 [8, 12)
12 7 1 0.857 ’’ 0.545 [12, 23)
23 6 1 0.833 ’’ 0.455 [23, 27)
27 5 1 0.800 ’’ 0.364 [27, 30)
30 4 1 0.750 ’’ 0.273 [30, 33)
33 3 1 0.667 ’’ 0.182 [33, 43)
43 2 1 0.500 ’’ 0.091 [43, 45)
45 1 1 0.000 ’’ 0.000 [45, Inf)

Notations:

  • Observed times: t1,t2,,tn

  • Ordered observed unique time points: t1<t2<<tk

  • Intervals I1=[t1,t2)I2=[t2,t3)I3=[t3,t4)Ik=[tk,)

  • Intervals are constructed so that each of which starts at an observed lifetime and ends just before the next observed lifetime

    • E.g. Ij=[tj,tj+1)

  • Sorted unique lifetimes 5,8,12,23,27,30,33,43,45
Figure 3.1

  • Expressing S^(t) in terms of p^ S^(t+)=Pr^(T>t)=tjtp^jS^(t)=Pr^(Tt)=tj<tp^j This method is known as Kaplan-Meier or Product-limit estimator of survivor function.

    • We saw that this method is equivalent to the ESF approach: S^(t+)=Pr^(T>t)=number of observations >t n
  • But the advantage of Kaplan-Meier method is that it can handle censored observations too.

Censored sample

If we had censored data, then?

For the control group of AML example, now include the censored observation 16+.

5,8,12,5,30,33,8,16+,23,27,43,45

  • S^(t)=??

  • Censored sample: 5,8,12,5,30,33,8,16+,23,27,43,45
  • Sorted censored sample 5,5,8,8,12,16+,23,27,30,33,43,45
tj nj dj
5 12 2
8 10 2
12 8 1
16 7 0
23 6 1
27 5 1
30 4 1
33 3 1
43 2 1
45 1 1

p^j=Pr^(T>tj|Ttj)=1djnj

tj nj dj p^j
5 12 2 0.833
8 10 2 0.800
12 8 1 0.875
16 7 0 1.000
23 6 1 0.833
27 5 1 0.800
30 4 1 0.750
33 3 1 0.667
43 2 1 0.500
45 1 1 0.000

S^(t+)=j:tjtp^j

tj nj dj p^j S^(tj+)
5 12 2 0.833 0.833
8 10 2 0.800 0.667
12 8 1 0.875 0.583
16 7 0 1.000 0.583
23 6 1 0.833 0.486
27 5 1 0.800 0.389
30 4 1 0.750 0.292
33 3 1 0.667 0.194
43 2 1 0.500 0.097
45 1 1 0.000 0.000

A comparison between estimated survivor function with and without censored observations

Kaplan-Meier estimator

Kaplan-Meier estimator

  • Let (ti,δi) be a censored random sample of lifetimes i=1,,n

  • Suppose that there are k (kn) distinct lifetimes at which deaths (event) occurs t1<<tk


  • Define for jth time j=1,,k

    • dj=iI(ti=tj,δi=1)=i=1ndNi(tj) no. of deaths observed at tj

    • nj=iI(titj)=i=1nYi(tj) no. of individuals at risk at time tj, i.e. number of individuals alive an uncensored just prior time tj


  • A non-parametric estimator of survivor function S(t) S^(t)=j:tj<tp^j=j:tj<t(1djnj)=j:tj<tnjdjnj

  • It is known as Kaplan-Meier (KM) or Product-limit (PL) estimator of survivor function ()

  • Similarly S^(t+)=j:tjtp^j


  • The paper was published in the Journal of American Statistical Association in 1958

  • Number of citations 66,345 (Google Scholar, 02 November 2024)


Edward L Kaplan (1920–2006)

Paul Meier (1924–2011)

Kaplan-Meier estimator as an MLE

PL estimator as an MLE

  • Assume T1,,Tn have a discrete distribution with survivor function S(t) and hazard function h(t)

  • Without loss of generality, assume t=0,1,2,


  • The general expression of likelihood function (from Eq. 2.2.12) L=t=0i=1n[hi(t)]dNi(t)[1hi(t)]Yi(t)(1dNi(t))

    • ti lifetime of the ith individual

    • δi=I(ti is a lifetime)

    • Yi(t)=I(tit)

    • dNi(t)=I(ti=t,δi=1)


  • Since hi(t)=h(t) i L=t=0i=1n[h(t)]dNi(t)[1h(t)]Yi(t)(1dNi(t))=t=0[h(t)]dt[1h(t)]ntdt

    • dt=idNi(t) number of observed lifetimes equal to t, i.e. number of observed deaths at t

    • nt=iYi(t) number of subjects at risk (alive and uncensored) at time t


  • The parameters of the lifetime distribution h=(h(0),h(1),h(2),)

  • The likelihood function L(h)=t=0[h(t)]dt[1h(t)]ntdt

  • The log-likelihood function (h)=t=0{dtlogh(t)+(ntdt)log(1h(t))}


  • The MLE of h(0) (h)h(0)|h(0)=h^(0)=0

  • The score function evaluated at h^ d0h^(0)=n0d01h^(0)h^(0)=d0n0


  • In general h^(t)=dtnt,t=0,1,2,,τ
    • τ=max{t:nt>0}

  • The mle of S(t) S^(t)=s=0t1(1h^(s))=s=0t1(1dsns)

  • If dτ<nτ (which would happen if the largest observed lifetime is a censored observation) then S^(τ+)>0 and undefined beyond τ+,

  • If dτ=nτ then S(τ+)=0 and S(t)=0 for all t>τ

Standard error of h^(t)

  • The rth diagonal element of the information matrix I(h)

Irr(h)=E[2h(r)2]=E[drh(r)2+nrdr(1h(r))2]=nrh(r)[(1h(r))]

  • Using the assumption drBinomial(nr,h(r))

  • Off diagonal elements of I(h) are zero


  • The asymptotic variance of h^(r) Var^(h^(r))=Irr1(h)=h^(r)(1h^(r))nr

Standard error of the PL estimator S^(t)

  • Variance of log(S^(t)) Var^[log(S^(t))]=s=0t1Var^[log(1h^(s))]=s=0t11(1h^(s))2Var^[h^(s)]=s=0t1h^(s)[(1h^(s))]1ns

    • Using the delta method Var(S^) is obtained from Var(logS^)

  • Using the delta method Var^[log(S^(t))]=1S^(t)2Var^(S^(t))

Var^(S^(t))=S^(t)2Var^[log(S^(t))]=S^(t)2s=0t1h^(s)[(1h^(s))]1ns=S^(t)2s=0t1dsns(nsds)

  • This formula of variance of PL estimator is known as the Greenwood’s formula

  • Censored sample: 5,8,12,5,30,33,8,16+,23,27,43,45
tj nj dj S^(tj+)
5 12 2 0.833
8 10 2 0.667
12 8 1 0.583
16 7 0 0.583
23 6 1 0.486
27 5 1 0.389
30 4 1 0.292
33 3 1 0.194
43 2 1 0.097
45 1 1 0.000

Var^(S^(t+))=[S^(t+)]2j:tjtdjnj(njdj)

tj nj dj S^(tj+) djnj(njdj)
5 12 2 0.833 0.017
8 10 2 0.667 0.025
12 8 1 0.583 0.018
16 7 0 0.583 0.000
23 6 1 0.486 0.033
27 5 1 0.389 0.050
30 4 1 0.292 0.083
33 3 1 0.194 0.167
43 2 1 0.097 0.500
45 1 1 0.000 Inf

Var^(S^(t+))=[S^(t+)]2j:tjtdjnj(njdj)

tj nj dj S^(tj+) djnj(njdj) j:tjtdjnj(njdj)
5 12 2 0.833 0.017 0.017
8 10 2 0.667 0.025 0.042
12 8 1 0.583 0.018 0.060
16 7 0 0.583 0.000 0.060
23 6 1 0.486 0.033 0.093
27 5 1 0.389 0.050 0.143
30 4 1 0.292 0.083 0.226
33 3 1 0.194 0.167 0.393
43 2 1 0.097 0.500 0.893
45 1 1 0.000 Inf Inf

Var^(S^(t+))=[S^(t+)]2j:tjtdjnj(njdj)

tj nj dj S^(tj+) djnj(njdj) j:tjtdjnj(njdj) Var^(S^(t+))
5 12 2 0.833 0.017 0.017 0.012
8 10 2 0.667 0.025 0.042 0.019
12 8 1 0.583 0.018 0.060 0.020
16 7 0 0.583 0.000 0.060 0.020
23 6 1 0.486 0.033 0.093 0.022
27 5 1 0.389 0.050 0.143 0.022
30 4 1 0.292 0.083 0.226 0.019
33 3 1 0.194 0.167 0.393 0.015
43 2 1 0.097 0.500 0.893 0.008
45 1 1 0.000 Inf Inf NaN

survival package in R

library(survival)
dat <- tibble(
  time = c(5, 8, 12, 5, 30, 33, 8, 16, 23, 27,  43, 45),
  status = c(rep(1, 7), 0, rep(1, 4))
) 
surv_model <- survfit(Surv(time, status) ~ 1, data = dat)


print(summary(surv_model), digits = 2)
Call: survfit(formula = Surv(time, status) ~ 1, data = dat)

 time n.risk n.event survival std.err lower 95% CI upper 95% CI
    5     12       2    0.833   0.108        0.647         1.00
    8     10       2    0.667   0.136        0.447         0.99
   12      8       1    0.583   0.142        0.362         0.94
   23      6       1    0.486   0.148        0.268         0.88
   27      5       1    0.389   0.147        0.185         0.82
   30      4       1    0.292   0.139        0.115         0.74
   33      3       1    0.194   0.122        0.057         0.66
   43      2       1    0.097   0.092        0.015         0.62
   45      1       1    0.000     NaN           NA           NA

survminer::ggsurvplot(surv_model, data = dat, surv.median.line = "hv", conf.int = FALSE)

Nelson-Aalen estimator

Estimator of H(t)

  • Cumulative hazard function H(t)=0th(s)ds=0tdH(s)

    • dH(t) increment of cumulative hazard function over [t,t+dt)

Nelson-Aalen estimator

  • The following estimator of cumulative hazard function is known as Nelson-Aalen (NA) estimator (; ) H^(t)=0tdH^(s)=0tdN(s)Y(s),

    • Y(s)>0for0st
  • In the notations used for Kaplan-Meier, NA estimator looks like H^(t)=j:tjtdjnj


  • The variance of H^(t) can be obtained from the information matrix derived for the non-parametric likelihood function Var^[H^(t)]=j:tjtVar(djnj)=j:tjt(djnj)(1djnj)(1nj)=j:tjtdj(njdj)nj3

  • Censored sample 5,8,12,5,30,33,8,16+,23,27,43,45
tj nj dj
5 12 2
8 10 2
12 8 1
16 7 0
23 6 1
27 5 1
30 4 1
33 3 1
43 2 1
45 1 1

h^j=P(T=tj|Ttj)=djnj and H^(t)=j:tjth^j

tj nj dj h^j H^(tj)
5 12 2 0.167 0.167
8 10 2 0.200 0.367
12 8 1 0.125 0.492
16 7 0 0.000 0.492
23 6 1 0.167 0.658
27 5 1 0.200 0.858
30 4 1 0.250 1.108
33 3 1 0.333 1.442
43 2 1 0.500 1.942
45 1 1 1.000 2.942

Nelson-Aalen estimator

se(H^(t))=j:tjtdj(njdj)nj3

tj nj dj h^j H^(tj) se(H^(tj))
5 12 2 0.167 0.167 0.108
8 10 2 0.200 0.367 0.166
12 8 1 0.125 0.492 0.203
16 7 0 0.000 0.492 0.203
23 6 1 0.167 0.658 0.254
27 5 1 0.200 0.858 0.310
30 4 1 0.250 1.108 0.379
33 3 1 0.333 1.442 0.466
43 2 1 0.500 1.942 0.585
45 1 1 1.000 2.942 0.585

  • Both S^(t) and H^(t) are nonparametric m.l.e.’s, and are connected by the relationship between survivor and cumulative hazard function S(t)=P(Tt)=(0,t)[1dH(u)]S(t+)=P(T>t)=(0,t][1dH(u)]

  • Note S^(t) and H^(t) are discrete and don’t satisfy the relationship H(t)=logS(t), which is true for the continuous distributions S^NA(t)=exp(H^(t))H^KM(t)=logS^(t)

CIs for survival probabilities

  • Nonparametric methods can also be used to construct confidence intervals for different lifetime distribution characteristics, such as

    • Survival probabilities S(t)

    • Quantiles tp

  • The methods of constructing confidence intervals are based on the following property of MLE θ^ n(θ^θ)N(0,σ2)(θ^θ)N(0,var(θ^))

Plain CI

  • The PL estimator S^(t) is an MLE of S(t) (S^(t)S(t))N(0,σs2(t))

    • σ^s2(t)=Var^(S^(t)) Greenwood’s variance estimator
  • A pivotal quantity can be defined as Z1=S^(t)S(t)σ^s(t)N(0,1)


  • The 100(1α)% confidence interval for S(t+) can be obtained from the following expression P(aZ1b)=1α

    • b=a=z(1α/2)

    • zp pth quantile of the standard normal distribution, i.e. P(Z<zp)=p

  • 100(1α)% confidence interval for S(t+) S^(t)±z(1α/2)σ^s(t)


tj nj dj S^(tj+) Var^(S^(t+))
5 12 2 0.833 0.012
8 10 2 0.667 0.019
12 8 1 0.583 0.020
16 7 0 0.583 0.020
23 6 1 0.486 0.022
27 5 1 0.389 0.022
30 4 1 0.292 0.019
33 3 1 0.194 0.015
43 2 1 0.097 0.008
45 1 1 0.000 NaN
  • Find the 95% confidence interval of S(15+)

S^(15+)±z.975σ^s(15+)


  • 95% confidence interval of S(15+)

S^(15+)±z.975σ^s(15+)=0.583±(1.960)(0.020)=0.583±0.279=(0.304,0.862)


PL estimate and corresponding 95% CI of S(15+) using the censored sample

  • Z1-based confidence interval (???)S^(t)±z(1α/2)σ^s(t)

  • Limitations

    • When the number of uncensored lifetimes is small or when S(t) is close to 0 or 1, the distribution of Z1 may not be well approximated by N(0,1)

    • The expression (???) may contain values outside of the interval (0,1)

CI using transformation

  • Consider a function of S(t) that takes values on (,) ψ(t)=g[S(t)]

  • Examples of the function g(), for p(0,1) g(p)={log(log(p))log(p1p)log(p)


  • MLE of ψ(t) ψ^(t)=g[S^(t)]

    • S^(t) PL estimate of S(t)
  • Asymptotic variance of ψ^(t) Var^[ψ^(t)]=σ^ψ2(t)={g[S^(t)]}2Var^[S^(t)]

    • g(p)=dg(p)dp

  • We can define a pivotal quantity based on the distribution of the sampling distribution of ψ^(t) Z2=ψ^(t)ψ(t)σ^ψ(t)

    • Compare to Z1, Z2N(0,1) is closer to standard normal distribution

    • Confidence intervals based on Z2 are better performing compared to that of Z1


  • Using the distribution of Z2, confidence interval of S(t) can be obtained in two steps

  • Obtain the (1α)% CI of ψ(t) ψLψ(t)ψU

    • ψL=ψ^(t)z(1α/2)σ^ψ(t)

    • ψU=ψ^(t)+z(1α/2)σ^ψ(t)

  • Using inverse transformation, obtain the CI of S(t) from that of ψ(t)


  • Using inverse transformation, obtain the CI of S(t) from that of ψ(t)
    ψLψ(t)ψUψLg[S(t)]ψUg1(ψL)S(t)g1(ψU)

    • g1() inverse function of g()

Inverse functions

  • Log function g(p)=log(p)=ug1(u)=p=eu

  • Logit function g(p)=log(p1p)=ug1(u)=p=exp(u)1+exp(u)

  • Log-log function g(p)=log(log(p))=ug1(u)=p=exp(eu)


  • 95% CI of ψ(t)=g[S(t)]=log[log(S(t))] is ψ^(t)±σ^ψz.975

  • ψ^(t)=log[log(S^(t))]

  • σ^ψ(t)=[1S^(t)log(S^(t))]2σ^S2(t)


tj nj dj S^(tj+) Var^(S^(t+))
8 10 2 0.667 0.019
12 8 1 0.583 0.020
16 7 0 0.583 0.020

ψ^(15+)=log[log(S^(15))]=log[log(0.583)]=0.618σ^ψ(15+)=[S^(15)log(S^(15))]2σ^S2(15)=[(0.583)log(0.583)]2(0.020)=0.453


  • 95% CI of ψ(15+)=log[log(S(15+))] is

ψ^(15+)±σ^ψ(15+)z.9750.618±(0.453)(1.960)0.618±0.8871.505ψ(15+)0.269


  • 95% CI of S(15+) can be obtained as 1.505ψ(15+)0.2691.505log(log[S(15+)])0.269e0.269log[S(15+)]e1.505exp(e0.269)S(15+)exp(e1.505)0.270S(15+)0.801

  • 95% CI of S(15+)

    • Using the distribution of Z1 0.304S(15+)0.862

    • Using the distribution of Z2 0.270S(15+)0.801


Compariswon of 95% CI of S(15+) based of the distribution of Z1 and Z2

Homework

  • Obtain the 95% CI of S(15+) using the following transformations

    1. g(p)=log(p1p)

    2. g(p)=logp

Bootstrap CI

  • Nonparametric bootstrap methods can be used to obtain the sampling distributions of pivotal quantities Z1 and Z2

  • (α/2)- and (1α/2)-quantile of the sampling distribution constitutes a (1α)100% CI of S(t)


Steps for obtaining bootstrap CIs

  1. Observed data {(ti,δi),i=1,,n}, and S^(t) and σ^s(t) are MLE of S(t) and corresponding SE

  2. Generate a bootstrap sample {(ti,δi),i=1,,n} by sampling with replacement from {(ti,δi),i=1,,n}

  3. Obtain PL estimate S^(t) and the corresponding SE σ^s(t) from the bootstrap sample {(ti,δi),i=1,,n}

  4. Compute pivotal quantity Z1=S^(t)S^(t)σ^s(t)

  5. Repeat the steps 1–3 for B(>1000) number of times to obtain Z1,1,,Z1,B


  1. The q-quantile of Z1 is estimated by z(qB), where qB is an integer and z(qB) is (qB)th-smallest value among {Z1,1,,Z1,B}

  2. The (q2q1)100% CI of S(t), where (q2>q1) S^(t)z(q2B)σ^s(t)<S(t)<S^(t)z(q1B)σ^s(t)

  • E.g. For 95% CI, q2=.975 and q1=.025

Homework

  • Obtain bootstrap confidence interval for S(15+)

Example: Remission data

  • The following data are on lengths of remission for two groups (placebo and 6-MP) of leukemia patients

  • Objective was to examine whether the drug 6-MP is more effective than placebo


PL estimates and CIs (no transformation) for 6-MP group
tj nj dj S^(tj+) SE lower upper
6 21 3 0.857 0.076 0.707 1.000
7 17 1 0.807 0.087 0.636 0.977
10 15 1 0.753 0.096 0.564 0.942
13 12 1 0.690 0.107 0.481 0.900
16 11 1 0.627 0.114 0.404 0.851
22 7 1 0.538 0.128 0.286 0.789
23 6 1 0.448 0.135 0.184 0.712

PL estimates and CIs (no transformation) for placebo group
tj nj dj S^(tj+) SE lower upper
1 21 2 0.905 0.064 0.779 1.000
2 19 2 0.810 0.086 0.642 0.977
3 17 1 0.762 0.093 0.580 0.944
4 16 2 0.667 0.103 0.465 0.868
5 14 2 0.571 0.108 0.360 0.783
8 12 4 0.381 0.106 0.173 0.589
11 8 2 0.286 0.099 0.092 0.479
12 6 2 0.190 0.086 0.023 0.358
15 4 1 0.143 0.076 0.000 0.293
17 3 1 0.095 0.064 0.000 0.221
22 2 1 0.048 0.046 0.000 0.139
23 1 1 0.000 NaN NaN NaN


No transformation was used to obtain CIs

Logit transformation was used to obtain CIs

Log-log transformation was used to obtain CIs

Log transformation was used to obtain CIs

CIs for quantiles

  • For lifetime distribution, the quantiles tp are of more interest than mean of the distribution, e.g., the median t.50 is used as the measure of location for lifetime distribution

  • Median has some advantages over mean as a measure of location for lifetime distributions

  • Median always exist (provided S()<.5) and it is easier to estimate when data are censored


Estimates of quantiles

  • Nonparametric estimates of tp can be obtained from the PL estimates S^(t)

  • For a step function S^(t), the corresponding inverse function S^1(1p) is not uniquely defined

  • The estimate t^p could be either an intervals of times (t’s) or a specific value of times (t’s) depending on the point at which the line S^(t)=1p intersect the step function S^(t) t^p=min{ti:S^(ti)1p}


Estimated quartiles of the length of remission time for the two treatment groups

Estimated quartiles for two treatment groups
prob 6-MP placebo
0.25 13 4
0.50 23 8
0.75 NA 12

Confidence intervals of quantiles

  • 100(1α)% confidence interval of tp can be obtained by inverting the corresponding confidence interval of survival function S(t)

  • The lower limit tL(Data) of 100(1α)% confidence interval is defined as Pr(tptL(Data))=Pr(F(tp)F(tL(Data)))=Pr(S(tp)S(tL(Data)))=Pr(S(tL(Data))1p)

    • Similarly Pr(tptU(Data))=Pr(S(tU(Data))1p)

95% CI of median (t.5) remission time for placebo group of patients

Estimated two-sided 95% CI of different quartiles for two treatment groups
prob 6-MP placebo
0.25 (6,23) (2,8)
0.50 NA (4,11)
0.75 NA (8,17)
  • Confidence intervals for second and third quartiles cannot be estimated from this data set for the “6-MP” group because S^()<.5

Standard error of t^p

  • The expression of Var(t^p) can be obtained from the sampling distribution of S^(t) using delta method Var(S^(tp))=[dS^(tp)dtp]2|tp=t^pVar(t^p)=[f^(t^p)]2Var(t^p)Var(t^p)=Var(S^(tp))[f^(t^p)]2

    • Var(S^(tp)) is obtained using Greenwood’s formula

Confidence interval of tp

  • 100(1α)% CI for tp t^p±se(t^p)z1α/2 where se(t^p)=Var(S^(tp))[f^(t^p)]2

3.3 Descriptive and diagnostic plots

  • Plots of PL (S^(t)) or Nelson-Aalen (H^(t)) estimates can be used to provide good description of univariate lifetime data

  • These estimates are useful to assess the appropriateness of a parametric model

Plots of survivor functions

  • Data: {(ti,δi),i=1,,n}

  • S^(t) PL estimate of survivor function S(t)

  • Assume lifetimes follow a distribution with survivor function S(t;θ) and F(t;θ) is the corresponding distribution function, where θ is the parameter vector, e.g. for exponential distribution S(t;θ)=et/θt0,θ>0


  • Let θ^ is an estimate of θ and S(t;θ^) is the estimate of survivor function S(t;θ), e.g. for exponential distribution S(t;θ^)=et/θ^

  • If the model assumption (i.e. lifetimes follow a distribution with survivor function S(t;θ)) is appropriate then S(t;θ^) should not be very far from S^(t)

  • A comparison between S(t;θ^) and S^(t) can be used as a model assessment tool

  • A plot of S(t;θ^) and S^(t) on the same graph can be used to compare graphically


Example 3.3.1: Ball bearing data

The number of revolutions (in million) before failure for each of 23 ball bearings
17.88 41.52 48.40 54.12 68.64 84.12 105.12 128.04
28.92 42.12 51.84 55.56 68.64 93.12 105.84 173.40
33.00 45.60 51.96 67.80 68.88 98.64 127.92 NA

PL estimate of survival for ball bearing data

  • Assume Weibull and log-normal models for analyzing ball bearing data and want to assess which of these two models is appropriate for the data

    • Weibull model S(t;α,β)=e(t/α)βt0,α>0,β>0

    • Log-normal model S(t;μ,σ)=1Φ(logtμσ)t0,<μ<,σ>0


Estimated survivor functions

  • Weibull model S(t;α^,β^)=e(t/81.87)2.10

  • Log-normal model S(t;μ^,σ^)=1Φ(logt4.150.52)


Estimated survivor functions using Weibull and log-normal models

Figure 3.2: Comparison of Weibull and log-normal models with respect to PL estimates

  • shows that there is a good agreement between nonparametric PL estimates S^(t) and both the Weibull and log-normal models

  • Either Weibull or log-normal can be assumed to analyze ball bearing data


Probability plots

  • Let t1<<tk are k distinct failure times, and S(tj;θ^) and S^(tj) are corresponding parametric and non-parametric estimates of survivor function

  • Probability-probability (P-P) plot is defined as the scatter plot of {S(tj;θ^),S^(tj)},j=1,,k

  • If the assumed parametric model S(t;θ) is appropriate then all the points in the resulting scatter plot should lie around a straight line with slope one


Estimate of survivor function with nonparametric (PL) and parametric (Weibull, and log-normal) methods for smallest 10 failure times of ball bearing data
time PL Weibull Log-normal
17.88 0.978 0.960 0.992
28.92 0.935 0.894 0.934
33.00 0.891 0.862 0.895
41.52 0.848 0.787 0.792
42.12 0.804 0.781 0.784
45.60 0.761 0.747 0.737
48.40 0.717 0.718 0.698
51.84 0.674 0.682 0.651
51.96 0.630 0.681 0.649
54.12 0.587 0.658 0.620

P-P plot for comparing Weibull and log-normal models

Quantile-Quantile (Q-Q) plot

  • Quantile-quantile (Q-Q) plot compares observed quantiles and the corresponding quantiles estimated from the assumed parametric model S(t;θ)

  • Observed quantiles are observed distinct failure time tj corresponding to the PL estimates S^j


  • t(S^j,θ) are the quantiles corresponding to S^j obtained from the model S(t;θ) t(p,θ)={α[log(1p)]1/βWeibullexp[μ+σΦ1(p)]Log-normal

  • Q-Q plot is defined as the scatter plot of t(S^j,θ^) versus tj


Q-Q plot for comparing Weibull and log-normal models

Linearization method

  • This method is based on linearizing the survivor or distribution function of the assumed parametric model

  • There exists functions g1() and g2() such that g1[S(t;θ)] is a linear function of g2(t)

  • If the parametric model S(t;θ)] is appropriate, the plot of g1[S^(t)] versus g2(t) should be roughly linear, where S^(t) is the PL estimate

  • This method does not require to estimate the parameter θ


  • Exponential distribution S(t;α)=et/αlog[S(t;α]=t/α

    • log[S(t;α)] is a linear function of t, so a plot of logS^(t) versus t should be linear through the origin if the exponential model is appropriate

    • A graphical estimate of α can be obtained when the plot is roughly linear by fitting a straight line through the points


A plot of log(S^(t)) versus t to assess whether exponential model is appropriate for ball bearing data

  • Weibull distribution S(t;α,β)=e(t/α)βlog[logS(t;α,β)]=βlogtβlogα

    • A plot of log[logS^(t)] versus logt should roughly linear if a Weibull model is appropriate

    • When the plot is approximately linear, graphical estimates of α and β can be obtained by intercept and slope of the fitted straight line through the points


A plot of log[log(S^(t))] versus log(t) to asses whether Weibull model is appropriate for ball bearing data

Location-scale family

  • In general, linearization method of assessing the appropriateness of the assumed parametric model can be defined for location-scale family of models

  • Let transformed lifetime Y=g(T) has a location-scale distribution, e.g. g()=log()


  • Survivor function of Y can be defined as Pr(Yy)=Pr(Yubyub)=Pr(Zyub)=S0(yub)=Pr(Tt)=S(t)
    • t=g1(y), <u<, b>0

  • It can be shown that S0(yub)=S(t)S01[S(t)]=ybub is a linear of y=g(t)

  • A plot of S01[S^(t)] versus g(t) should be roughly linear if the assumed family of models is appropriate.


  • Expressions of S0() and S01() for log-location-scale family of models S0(z)={exp(ez)Extreme value(1+ez)1Logistic1Φ(z)NormalS01(p)={log(log(p))Extreme valuelog1ppLogisticΦ1(1p)Normal
  • For all these three distributions, y=g(t)=log(t)

  • In general, for distinct failure times, a plot of points (log(tj),S01(S^j)),j=1,,k can be used to assess whether the assumed location-scale model is appropriate

  • Graphical estimates of u and b can be obtained from the lines fitted to the points, where b1 and u are estimated by the slope and log(t)-intercept, respectively

  • Graphical estimates can be used as initial values of the optimization routines that are used for estimating model parameters

  • Since these plots are subject to sampling variations, these are often used for informal model assessment


Comparison of models of log-location-scale family

Hazard plots

  • The plotting procedures described for survivor functions S(t) can also be described for cumulative hazards function H(t)

    • Survivor function of Weibull distribution S(t;α,β)=e(t/α)βlog[logS(t;α,β)]=βlog(t)βlog(α)

    • Cumulative hazard function of Weibull distribution H(t;α,β)=(tα)βlogH(t;α,β)=βlog(t)βlog(α)


  • An alternative of plotting log[logS^(t)] versus log(t) would be to plot logH^(t) versus log(t) can be used to assess the appropriateness of the Weibull model for the data at hand

    • S^(t) PL estimate

    • H^(t) Nelson-Aalen estimate

  • The plots based on survivor function and cumulative hazard function could differ slightly, specially for a large t, because logS^(t)H^(t) for discrete data

Acknowledgements

This lecture is adapted from materials created by Mahbub Latif

References

Aalen, O. O. 1975. Statistical Inference for a Family of Counting Processes. Institute of Mathematical Statistics, University of Copenhagen. https://books.google.com.bd/books?id=sn1QAQAAMAAJ.
Kaplan, Edward L, and Paul Meier. 1958. “Nonparametric Estimation from Incomplete Observations.” Journal of the American Statistical Association 53 (282): 457–81.
Nelson, Wayne. 1969. “Hazard Plotting for Incomplete Failure Data.” Journal of Quality Technology 1 (1): 27–52.