Chapter 5B

(AST405) Lifetime data analysis

Author

Md Rasel Biswas

5 Inference Procedures for Log-location-scale Distributions

5.1 Log-normal and normal distributions

Log-normal distribution

  • T follows a log-normal distribution with location parameter μ and scale parameter σ if Y=logTN(μ,σ2)

  • The pdf and survivor function of log-normal distribution f(t;μ,σ)=1σt2πexp[12(logtμσ)2]S(t;μ,σ)=1Φ(logtμσ)

    • μ and σ are the parameters of both normal and log-normal distributions

    • Φ() cumulative distribution function of standard normal distribution


  • Log-normal distribution is a member of the log-location-scale family of distributions and the corresponding location-scale distribution is normal with S0(z)=1Φ(z)f0(z)=12πez2/2=ϕ(z)
    • ϕ() pdf of standard normal distribution

    • z=(yμ)/σ


  • Density function of log-lifetime f(y;μ,σ)=1σf0(yμσ)=1σ2πexp[12(yμσ)2]

  • Survivor function of log-lifetime S(y;μ,σ)=S0(yμσ)=1Φ(yμσ)

Likelihood function normal distribution

  • Data {(ti,δi),i=1,,n}

  • Log-likelihood function (μ,σ)=logi=1n[(1/σ)f0(zi)]δi[S0(zi)]1δi=rlogσ+i=1nδilogf0(zi)+i=1n(1δi)logS0(zi)=rlogσ12i=1nδizi2+i=1n(1δi)logS0(zi)

    • zi=(yiμ)/σ and yi=logti

    • r=i=1nδi


  • Elements of hessian matrix and score function depend on the followings logf0(z)z=z2logf0(z)z2=1logS0(z)z=f0(z)S0(z)2logS0(z)z2=zf0(z)S0(z)[f0(z)S0(z)]2

  • MLEs (μ^,σ^)=argmaxΘ(μ,σ)

    • Sampling distribution (μ^,σ^)N((μ,σ),V) where V^=[H(μ^,σ^)]1
  • Confidence intervals of parameters, quantiles, and survival probabilities can be obtained using the methods described for Weibull models


  • Estimate of survivor function (Log-normal distribution) S(t;μ^,σ^)=1Φ(logtμ^σ^)=1Φ(ψ^) where ψ^=Φ1(1S(t;μ^,σ^))=logtμ^σ^

    • Standard error of ψ^ se(ψ^)=aV^a where a=(1/σ^,ψ^/σ^)

Estimate of survivor function

  • (1α)100% CI of S(t) L<ψ<UL<Φ1(1S(t;μ,σ))<UΦ(L)<1S(t;μ,σ)<Φ(U)1Φ(U)<S(t;μ,σ)<1Φ(L) where L=ψ^z1α/2se(ψ^)U=ψ^+z1α/2se(ψ^)

  • LRT statistics based method of obtaining CI for survivor function is described with H0:S(y0)=S(logt0)=s0

  • The 100(1α)% CI for S(t) can be obtained from the values of s0 that satisfy Λ(s0)=2(μ^,σ^)2(μ~,σ~)χ(1),1α2


  • Unrestricted and unrestricted MLEs are obtained as unrestricted(μ^,σ^)=argmaxΘ(μ,σ)restricted(μ~,σ~)=argmaxΘ(y0σΦ1(1s0),σ) where under H0, we can show S(y0)=1Φ(y0μσ)=s0μ=y0σΦ1(1s0)

Quantiles

  • The expression of estimate of yp y^p=μ^+σ^wp where for normal distribution wp=S01(1p)=Φ1(p)
  • Standard error of y^p
    se(y^p)=aV^a where a=(1,wp)

Homework

  • Obtain the expressions of Wald-type and LRT based 100(1α)% confidence intervals of yp

Example 5.3.1

  • Data are available on lifetimes (in thousand miles) of 96 locomotive controls, of which were failed.

  • The test was terminated after 135K miles, so 59 lifetimes were censored at 135K.


dat_ex531
# A tibble: 96 × 2
    time status
   <dbl>  <int>
 1  22.5      1
 2  37.5      1
 3  46        1
 4  48.5      1
 5  51.5      1
 6  53        1
 7  54.5      1
 8  57.5      1
 9  66.5      1
10  68        1
# ℹ 86 more rows

dat_ex531 %>% 
  count(status)
# A tibble: 2 × 2
  status     n
   <int> <int>
1      0    59
2      1    37

Log-normal and normal model fit

mod_LN <- survreg(Surv(time, status) ~ 1, 
                  dist = "lognormal",
                  data = dat_ex531)
mod_N <- survreg(Surv(log(time), status) ~ 1, 
                dist = "gaussian",
                data = dat_ex531)

MLEs (μ^,logσ^) and corresponding standard errors

tidy(mod_LN)
# A tibble: 2 × 5
  term        estimate std.error statistic p.value
  <chr>          <dbl>     <dbl>     <dbl>   <dbl>
1 (Intercept)    5.19      0.129     40.3    0    
2 Log(scale)    -0.136     0.131     -1.04   0.297

Estimated variance of (μ^,logσ^)

mod_LN$var
            (Intercept) Log(scale)
(Intercept)  0.01657557 0.00983969
Log(scale)   0.00983969 0.01703353

Estimated variance V(μ^,σ^) from V(μ^,logσ^)

  • GV(μ^,logσ^)G
           [,1]       [,2]
[1,] 0.01657557 0.00858735
[2,] 0.00858735 0.01297359

G=[100exp(σ)]


95% Confidence intervals for μ and σ
par lower upper lower upper
μ 4.942 5.447 5.000 5.400
σ 0.676 1.127 0.709 1.109

  • Estimate of S(80) S(80;μ^,σ^)=1Φ(log80μ^σ^)=0.824

    • μ^=5.195andσ^=0.873

Comparison of the estimates of survivor function

`


Estimate and confidence interval of S(80)
parameter est lower upper
S(80) 0.824 0.667 0.924
  • Obtain LRT based 95% CI for S(80)

Quantiles

  • General expression of pth quantile of log-lifetime (μ^=5.195 and σ^=0.873) y^p=μ^+σ^wp
    • wp=Φ1(p)

Estimate and confidence intervals of different quantiles of locomotive controls lifetime (normal distribution)
p wp y^p se lower upper
0.25 -0.674 4.606 0.105 NA NA
0.50 0.000 5.195 0.129 NA NA
0.75 0.674 5.783 0.194 NA NA

5.2 Log-logistic and logistic distributions

Log-logistic distribution

  • T follows a log-logistic distribution with parameters α (scale) and β (shape) if Y=logT follows a logistic distribution with parameters u (location) and b (scale)

  • The pdf, survivor, and hazard function of log-logistic distribution f(t;α,β)=(β/α)(t/α)β1[1+(t/α)β]2S(t;α,β)=[1+(t/α)β]1h(t;α,β)=(β/α)(t/α)β1[1+(t/α)β]

Logistic distribution

  • Log-logistic distribution is a member of the log-location-scale family of distributions and the corresponding location-scale distribution is logistic with S0(z)=11+ezf0(z)=ez(1+ez)2

    • z=(yu)/b

  • Density function of log-lifetime f(y;u,b)=1bf0(yub)=(1/b)exp[(yu)/b]{1+exp[(yu)/b]}2

  • Survivor function of log-lifetime S(y;u,b)=S0(yub)=11+exp[(yu)/b]


  • Data: {(ti,δi),i=1,,n}

  • Log-likelihood function (μ,σ)=logi=1n[(1/b)f0(zi)]δi[S0(zi)]1δi=rlogb+i=1nδilogf0(zi)+i=1n(1δi)logS0(zi)=rlogb+i=1n[δi{zilog(1+ezi)}log(1+ezi)]

    • zi=(yiu)/b and yi=logti

    • r=i=1nδi


  • Elements of hessian matrix and score function depend on the followings logf0(z)z=12ez1+ez2logf0(z)z2=2f0(z)logS0(z)z=ez1+ez2logS0(z)z2=ez(1+ez)2

  • MLEs (u^,b^)=argmaxΘ(u,b)

  • Sampling distribution (u^,b^)N((u,b),V) where V^=[H(u^,b^)]1

  • Confidence intervals of parameters, quantiles, and survival probabilities can be obtained using the methods described for Weibull models


  • Estimate of survivor function (logistic distribution) S0(yu^b^)=S(y;u^,b^)=11+exp[(yu^)/b^]log[1S(y)S(y)]=yu^b^=ψ^=S01(S(y))

    • Standard error of ψ^ se(ψ^)=aV^a,where a=(1/b^,ψ^/b^)

(1α)100% CI of S(t)

L<ψ<UL<log1S(y)S(y)<Uexp(L)<1S(y)S(y)<exp(U)1+exp(L)<1+1S(y)S(y)<1+exp(L)11+exp(U)<S(y)<11+exp(L) where L=ψ^z1α/2se(ψ^)U=ψ^+z1α/2se(ψ^)

Estimate of survivor function

  • LRT statistics based method of obtaining CI for survivor function is described with H0:S(y0)=S(logt0)=s0

  • The 100(1α)% CI for S(t) can be obtained from the values of s0 that satisfy Λ(s0)χ(1),1α2 where Λ(s0)=2(u^,b^)2(u~,b~)


  • Unrestricted and unrestricted MLEs are obtained as unrestricted(u^,b^)=argmaxΘ(u,b)restricted(u~,b~)=argmaxΘ(y0blog{(1s0)/s0},b) where under H0, we can show S(y0)=s0u=y0blog1s0s0

Quantiles

  • The expression of estimate of yp y^p=u^+b^wp where for normal distribution wp=S01(1p)=logp1p
  • Standard error of y^p
    se(y^p)=aV^a where a=(1,wp)

Homework

  • Obtain the expressions of Wald-type and LRT based 100(1α)% confidence intervals of yp

Example 5.3.1

  • Data are available on lifetimes (in thousand miles) of 96 locomotive controls, of which were failed.

  • The test was terminated after 135K miles, so 59 lifetimes were censored at 135K.


dat_ex531
# A tibble: 96 × 2
    time status
   <dbl>  <int>
 1  22.5      1
 2  37.5      1
 3  46        1
 4  48.5      1
 5  51.5      1
 6  53        1
 7  54.5      1
 8  57.5      1
 9  66.5      1
10  68        1
# ℹ 86 more rows

dat_ex531 %>% 
  count(status)
# A tibble: 2 × 2
  status     n
   <int> <int>
1      0    59
2      1    37

Log-logistic and logistic model fit

mod_LL <- survreg(Surv(time, status) ~ 1, 
                  dist = "loglogistic",
                  data = dat_ex531)
mod_L <- survreg(Surv(log(time), status) ~ 1, 
                dist = "logistic",
                data = dat_ex531)

MLEs (u^,logb^)

[1]  5.1206418 -0.8266704

Estimated variance of (u^,logb^)

            (Intercept)  Log(scale)
(Intercept) 0.010490062 0.007837215
Log(scale)  0.007837215 0.022515937

MLEs of (u^,b^)

[1] 5.1206418 0.4375036

Estimated variance of (u^,b^)

            [,1]        [,2]
[1,] 0.010490062 0.003428809
[2,] 0.003428809 0.004309761

95% Confidence intervals for location and scale parameters
dist par est lower upper lower upper
Logistic u 5.121 4.920 5.321 5.000 5.300
NA b 0.438 0.326 0.587 0.360 0.559
Gaussian μ 5.195 4.942 5.447 5.000 5.400
NA σ 0.873 0.676 1.127 0.709 1.109

  • Estimate of S(80) (log-logistic distribution) S(80;u^,b^)=11+exp[(log80u^)/b^]=0.844

    • u^=5.121andb^=0.438

Comparison of the estimates of survivor function

`


Estimate and corresponding Wald-type confidence interval of the survival probability S(80)
dist est lower upper
Log-logistic 0.844 0.566 0.957
Log-normal 0.824 0.667 0.924
  • Obtain LRT based 95% CI for S(80)

Quantiles

  • General expression of pth quantile of log-lifetime (u^=5.121 and b^=0.438) y^p=u^+b^wp
    • wp=logp1p

Estimate and confidence intervals of different quantiles
dist p wp y^p se lower upper
Logistic 0.25 -1.099 4.640 0.143 NA NA
NA 0.50 0.000 5.121 0.102 NA NA
NA 0.75 1.099 5.601 0.234 NA NA
Gaussian 0.25 -0.674 4.826 0.101 NA NA
NA 0.50 0.000 5.121 0.102 NA NA
NA 0.75 0.674 5.416 0.177 NA NA

Homework

  • Analyze the locomotive control lifetimes using Weibull model and compare the results

5.3 Comparison of distributions

  • Let Tji be the lifetime of ith subject of the jth group (i=1,,nj, j=1,,m)

  • Assume Tji follows a distribution of log-location-scale family with parameters αj (scale) and βj (shape)

  • The corresponding distribution of log-lifetime Yji=logTji is of a location-scale family distribution with parameters uj (location) and bj (scale) uj=logαjandbj=(1/βj)

Survivor functions

  • The survivor function of Yji=logTji Sj(y)=S0(yujbj)

  • The survivor function of Tji Sj(t)=S0[(t/αj)βj]

    • S0(x)=S0(logx)

    • uj=logαj

    • bj=(1/βj)


  • Comparison of several normal populations is a well-known problem in statistics, where equal population variances are assumed, and the comparisons are performed on the basis of equality of population means

Quantile

  • General expression of the pth quantile of the jth population takes the form yjp=uj+bjwp,j=1,,m
    • wp=S01(1p)

Equality of two populations

  • When the scales are not equal (i.e. b1b2), the difference between the pth quantiles does depend on the probability p y1py2p=u1u2+wp(b1b2)

  • Under the assumption of equality of the scales (i.e. b1=b2), difference between pth (log-lifetime) quantile of a pair of populations (say 1 and 2) is constant, i.e. it does not depend on the probability p(0,1) y1py2p=u1u2


  • The difference between two log-lifetime quantiles can be expressed in terms of the ratio of lifetime quantiles y1py2p=u1u2logt1plogt2p=logα1logα2t1p/t2p=α1/α2

  • The ratio of the pth quantiles of two lifetime distributions does not depend on the probability p when the corresponding shape parameters are equal (β1=β2)


  • Equality of all quantiles of two distributions, i.e. y1p=y2pp(0,1), corresponds to equality of two distributions, i.e. S1(y)=S2(y)

  • Under the assumption of common scale (shape for lifetime) parameter, the null hypothesis of equality of two distributions can be expressed as H0:u1u2=0orH0:(α1/α2)=1


  • Equality of two populations with survivor functions (say S1 and S2) can be expressed in terms of survivor functions

  • Since y1p=y2p+u1u2ort1p=t2p(α1/α2), the corresponding survivor functions can be expressed as S1(y+u1u2)=S2(y)S1(t(α1/α2))=S2(t)

  • That is, the survivor functions for Y are translations of one another by an amount (u1u2) along the y-axis

Wald-type statistic

  • Data {(tji,δji),i=1,2} and yji=logtji

  • Two populations can be compared in terms of pth quantile H0:y1p=y2p

  • Corresponding pivotal quantity Zp=(y^1py^2p)(y1py2p)[var(y^1p)+var(y2p)]1/2N(0,1)under H0

    • The statistic Zp can be used to obtain confidence interval for (y1py2p)

  • To test H0:b1=b2, the following pivotal quantity can be considered Zb=(logb^1logb^2)(logb1logb2)[var(logb^1)+var(logb^2)]1/2N(0,1)under H0

    • The statistic Zb can be used to obtain confidence interval for (b1/b2)

  • When scales are equal, two populations can be compared with respect their location parameter H0:u1=u2

  • The corresponding pivotal quantity Zu=(u^1u^2)(u1u2)[var(u^1)+var(u^2)]1/2N(0,1)under H0

    • The statistic Zu can be used to obtain confidence interval for (u1u2)

  • Wald statistic cannot be used to test H0:u1=u2,b1=b2

LRT based inference

  • Data {(tji,δji),j=1,,m,i=1,,nj} and yji=logtji

  • Different tests and confidence intervals of interest

    1. H0:b1==bm

    2. Confidence interval for (b1/b2)

    3. Equality of several location parameters when scale parameters are equal H0:u1==um,b1==bmH1:all uj’s are not equal,b1==bm

    4. Confident interval for (u1u2) when b1=b2

    5. Confidence interval for (y1py2p) when b1b2

Case 1

  • Hypothesis of interest (5.1)H0:b1==bm=b(say)

  • Log-likelihood function (u1,,um,b1,,bm)=j=1mj(uj,bj)

  • Contribution to log-likelihood function for the jth population j(uj,bj)=rjlogbj+i=1nj[δilogf0(zji)+(1δji)logS0(zji)]

    • rj=iδji

  • LRT statistic Λ=2(u^1,,u^m,b^1,,b^m)2(u~1,,u~m,b~,,b~)

    • Λχ(m1)2 under the null hypothesis defined in
  • MLEs

    • (u^j,b^j)=argmaxΘj(uj,bj),j=1,,m

    • (u~1,,u~m,b~,,b~)=argmaxΘ(u1,,um,b,,b)

Case 2

  • To obtain confidence interval of (b1/b2), consider H0:(b1/b2)=aH0:b1=ab2

  • 100(1α)% confidence interval of (b1/b2) can be obtained from the range of a values that satisfy Λ(a)χ(1),1α2, where the LRT statistic Λ(a)=2(u^1,u^2,b^1,b^2)2(u~1,u~2,ab~2,b~2)

    • (u^j,b^j)=argmaxΘj(uj,bj),j=1,2

    • (u~1,u~2,b~2)=argmaxΘ(u1,u2,ab2,b2)

Case 3

  • Test equality of several location parameters when scale parameters are equal H0:u1==um,b1==bmH1:all uj’s are not equal,b1==bm

  • MLEs

    • under H0, (u,b)=argmaxΘ(u,,u,b,,b)

    • under H1, (u~1,,u~m,b~)=argmaxΘ(u1,,um,b,,b)

  • LRT statistic Λ=2(u~1,,u~m,b~,,b~)2(u,,u,b,,b)

    • Under the null hypothesis, Λ follows χ(m1)2 distribution

Case 4

  • To obtain a confidence interval of (u1u2) when b1=b2, consider the null and alternative hypothesis H0:u1u2=δ,b1=b2vsH1:u1u2δ,b1=b2

  • LRT statistic Λ(δ)=2(u~1,u~2,b~,b~)2(u2+δ,u2,b,b)

    • under H0, (u,b)=argmaxΘ(u,u,b,b)

    • under H1, (u~1,u~2,b~)=argmaxΘ(u1,u2,b,b)

  • 100(1α) confidence interval for (u1u2) can be obtained from the set of δ values that satisfy Λ(δ)χ(1),1α2

Case 5

  • When b1b2, to obtain confidence interval for (y1py2p) consider the following hypothesis H0:y1py2p=ΔH0:u1u2=Δ+(b2b1)wp
    • wp=S01(1p)
  • LRT statistic Λ(Δ)=2(u^1,u^2,b^1,b^2)2(u~1,u~2,b~1,b~2)

  • under H0 (u~1,u~2,b~1,b~2)=argmaxΘ(u2+Δ+(b2b1)wp,u2,b1,b2)

  • under H1 (u^1,u^2,b^2,b^1)=argmaxΘ(u1,u2,b1,b2)

  • 100(1α) confidence interval for (y1py2p) can be obtained from the set of Δ values that satisfy Λ(Δ)χ(1),1α2

Comparison of Weibull or extreme value distributions

  • Assume TjiWeibull(αj,βj) (j=1,,m,i=1,,nj)

    • Data {(tji,δji),j=1,,m,i=1,,nj}
  • Survivor function of Weibull distribution Sj(t)=exp[(t/αj)βj]

  • Survivor function of extreme value distribution Sj(y)=exp[e(yuj)/bj]

    • uj=logαj

    • bj=1/βj

Example 5.4.1

  • Data of the following table are on the time to breakdown of electrical insulating fluid subject to a constant voltage stress in a lifetest experiment


Estimate of voltage-specific extreme value models
voltage u^j±se(u^j) b^j±se(b^j)
26 6.862±1.104 1.834±0.885
28 5.865±0.486 1.022±0.474
30 4.351±0.302 0.944±0.303
32 3.256±0.486 1.781±0.254
34 2.503±0.315 1.297±0.211
36 1.457±0.309 1.125±0.221
38 0.001±0.273 0.734±0.367

Comparison of estimated survivor function

LRT (Case 1)

  • Null hypothesis H0:b1==b7

  • LRT statistic Λ=2(u^1,,u^7,b^1,,b^7)2(u~1,,u~7,b~,,b~)=2(132.181)2(136.578)=8.794

    • p-value Pr(χ(6)2Λ)=0.185 It does not provide enough evidence to reject the null hypothesis of equality of the scale parameters.

Confidence interval of (b1/b2) (Case 2)

  • Wald-type (logb^1logb^2)±z1α/2se(logb^1logb^2)(b^1/b^2)e±z1α/2se(logb^1logb^2)(1.834/1.022)e±(1.96)(0.624)0.529<(b1/b2)<6.095

    • Similarly confidence intervals for (bj/bj) j>j can be obtained

Estimate and 95% confidence interval of pair-wise comparisons of scale parameters (bj/bj)

Case 3

  • Equality of all location parameters when scales are equal H0:u1==um,b1==bm

  • LRT statistic Λ=2(u~1,,u~m,b~,,b~)2(u,,u,b,,b)=2(136.578)2(176.584)=80.013

    • p-value P(χ(1)2>80.013)<.001 There is a strong evidence against the assumption of equality of m location parameters

Case 4

  • Wald-type confidence interval of (u1u2) (u^1u^2)±z1α/2se(u^1u^2)(6.8624.351)±(1.96)(1.206)1.367<(u1u2)<3.361

    • There is no significant difference between u1 and u2

Estimate and 95% confidence interval of pair-wise comparisons of location parameters (ujuj)

Case 5

  • General expression of pth quantile of the group j yjp=uj+bjwp,j=1,,m

  • Difference of pth quantile between groups 1 and 2 y1py2p=u1u2+(b1b2)wp

    • 95% confidence interval for the difference of median between groups 1 and 2 y^1my^2m±z1α/2se(y^1my^2m)(6.195.491)±(1.96)(1.291)1.831<(y1my2m)<3.231

Estimate and 95% confidence interval of pair-wise comparisons of medians (yj,.5yj,.5)

Homework

  • Analyse the breakdown time data using log-logistic and log-normal distributions and compare the results with that of Weibull distribution