Chapter 7

(AST405) Lifetime data analysis

Author

Md Rasel Biswas

7 Semiparametric Multiplicative Hazards Regression Models

7.1 Methods for continuous multiplicative hazards model

  • Models in which covariates have a multiplicative effect on the hazard function play an important role in the analysis of lifetime data

  • Proportional hazard (PH) model is one of such models

  • Depending on whether baseline hazard function is left arbitrary or not, PH model could be either semiparametric or parametric

  • In this section, semiparametric PH models are discussed, where baseline hazard function is left arbitrary


  • The hazard function is modeled as (7.1)h(t|x)=h0(t)exp(xβ)=h0(t)exp(β1x1+β2x2++βpxp)

    • h(t|x) = hazard at time t for a person with covariates x
    • h0(t) = baseline hazard (unspecified)
    • β=(β1,,βp) vector of regression coefficients
    • Covariate vector x could include time-varying covariate
    • No intercept term is included in xβ
  • Model () is known as “Cox’s proportional hazards model” or simply “Cox model”

  • No distributional assumption is required for estimating the parameters of the Model ()


  • The cumulative baseline hazard function is defined as H0(t)=0th0(u)du

  • The baseline survivor function S0(t)=exp[H0(t)]

  • The survivor function of T given covariate vector x S(t|x)=[S0(t)]exp(xβ)

Estimation of model parameters

  • Data {(ti,δi,xi),i=1,,n}

  • Parameters of interest are h0(t) and β


Log-likelihood function (h0(t),β)=logi=1n[f(ti;xi)]δi[S(ti;xi)]1δi=i{δilog[h0(ti)exp(xiβ)]+exp(xiβ)logS0(ti)}=i{δi[logh0(ti)+xiβ]+exp(xiβ)logS0(ti)]}

  • No unique solutions of the parameters because the number of parameters to be estimated is greater than the number of observations

  • Complete likelihood function is not useful for estimating parameters of Cox’s proportional hazards model

  • There are a number of different likelihood functions defined for estimating parameters, of which Cox’s “partial likelihood function” is widely used for PH models

  • Log-partial-likelihood function is defined as 1(β)=logi=1n(exp(xiβ)k=1nYk(ti)exp(xkβ))δi

    • Yk(t)=I(tkt) indicates whether the kth subject is still in the risk set at time t or not

  • Partial likelihood function can be treated as a regular likelihood function for making statistical inference

  • For partial likelihood function, the parameters of interest is β and the estimated parameters β^=argmaxβΘ1(β) follow asymptotically normal distribution, similar to MLEs

  • The baseline hazard functions are estimated from the full likelihood function with regression parameters are assumed to be known, i.e. (h0(t),β^)


  • Obtain the expression of partial likelihood function for the following censored sample
time x
3 1
5 0
8 1
4+ 1
10 0

7.2 Comparison of two or more lifetime distributions

  • Let Sj(t) be the survivor function of lifetime Tj, j=1,2

  • Data available {(ti,δi,xi),i=1,,n}

    • xi=I(ith subject is from group 1)
  • Null hypothesis H0:S1(t)=S2(t)


  • Consider PH model h(t|x)=h0(t)exp(βx)S(t|x)=[S0(t)]exp(βx)

  • We can obtain S2(t)=S(t|x=0)=S0(t)S1(t)=S(t|x=1)=[S0(t)]exp(β)=[S2(t)]exp(β)


  • The null hypothesis under proportional model assumption H0:S1(t)=S2(t)H0:β=0

  • Large sample-based property of MLE β^ can be used to test the null hypothesis


  • Log-likelihood function (β)=logi=1n(eβxik=1nYk(ti)eβxk)δi=i=1n(δixiβδilogk=1nYk(ti)eβxk)

  • Score function U(β)=i=1n(δixiδik=1nYk(ti)eβxkxkk=1nYk(ti)eβxk)=i=1n(d1idin1ieβn1ieβ+n2i)

    • di=δi

    • d1i=δixi=I(ith subject from group 1)

    • n1i=k=1nYk(ti)xk number of group 1 subjects at risk at time ti

    • n2i=k=1nYk(ti)(1xk) number of group 2 subjects at risk at time ti


  • Information matrix I(β)=din1ieβn1ieβdi(n1ieβ+n2i)n1ieβ(n1ieβ+n2i)2=din1in2ieβ(n1ieβ+n2i)2

  • Confidence interval for β can be obtained from the following pivotal quantity Z(β)=U(β)[I(β)]1/2 which follows an asymptotic standard normal distribution

  • 100(1α)% confidence interval for β can be obtained from the set of values of β that satisfy Z(β)z1α


  • Under H0:β=0 U(0)=i=1n(d1idin1in1i+n2i)I(0)=i=1ndin1in2i(n1i+n2i)2

  • Test statistic Z=U(0)[I(0)]1/2N(0,1)

    • MLE of β does not require to test H0:β=0 using the statistic Z

  • The expression of U(0) can be considered as the difference between observed number of deaths from group 1, (d1i), at time ti and the corresponding expected number of deaths di×n1in1i+n2i

  • At time ti, there are ni=n1i+n2i subjects are at risk and di is either 0 or 1 (i.e. there is no ties in the lifetime)

group event alive at risk
1 d1i n1id1i n1i
2 d2i n2id2i n2i
di nidi ni

  • This score test for the Cox model to compare two groups is also known as log-rank test.

Example 7.1.1

Data below show remission times (in weeks) for 40 leukemia patients who were randomly assigned either treatment A or B

tab7_1_1
# A tibble: 40 × 3
    time status group
   <dbl>  <dbl> <chr>
 1     1      1 A    
 2     3      1 A    
 3     3      1 A    
 4     6      1 A    
 5     7      1 A    
 6     7      1 A    
 7    10      1 A    
 8    12      1 A    
 9    14      1 A    
10    15      1 A    
# ℹ 30 more rows

survdiff(Surv(time, status) ~ group, 
         data = tab7_1_1) 
Call:
survdiff(formula = Surv(time, status) ~ group, data = tab7_1_1)

         N Observed Expected (O-E)^2/E (O-E)^2/V
group=A 20       17     21.5     0.951      2.36
group=B 20       20     15.5     1.322      2.36

 Chisq= 2.4  on 1 degrees of freedom, p= 0.1 

coxph(Surv(time, status) ~ group, data = tab7_1_1) %>% 
    tidy()
# A tibble: 1 × 5
  term   estimate std.error statistic p.value
  <chr>     <dbl>     <dbl>     <dbl>   <dbl>
1 groupB    0.503     0.332      1.51   0.130

Example 7.2.1

  • Patients with cystic fibrosis are susceptible to an accumulation of mucus in lungs, which leads to pulmonary exacerbation and deterioration of lung function

  • A clinical trial was conducted to investigate the efficacy of the new drug DNase-1

    • Subjects are randomly assigned to a new treatment or a placebo
  • Time of interest is the time to first exacerbation after randomization and data on fev (forced expiatory volume at the time of randomization) are also measured



Creating the data from the R object rhDNase

tab1_4 <- as_tibble(rhDNase) %>% 
  filter(is.na(ivstart) | ivstart > 0) %>% 
  mutate(time0 = as.numeric(end.dt - entry.dt),
         status = as.numeric(!is.na(ivstart)),
         time = if_else(status == 1, ivstart, time0),
         fevm = fev - mean(fev)) %>% 
  group_by(id) %>% 
  mutate(visit = n()) %>% 
  ungroup()

Cox’s PH model h(t|trt, fevm)=h0(t)exp(β1trt+β2fevm) R code for fitting the model

mod1 <- coxph(Surv(time,  status) ~ trt + fevm, 
              data = tab1_4)

Estimates of regression coefficients

tidy(mod1)
# A tibble: 2 × 5
  term  estimate std.error statistic  p.value
  <chr>    <dbl>     <dbl>     <dbl>    <dbl>
1 trt    -0.352    0.106       -3.31 9.47e- 4
2 fevm   -0.0188   0.00226     -8.31 9.63e-17
  • Treatment group patients have lower hazard for time to first exacerbation

  • As FEV value increases the hazard of first exacerbation decreases

  • Effects of treatment and FEV are significant on the hazard of first exacerbation decreases


Estimates and corresponding confidence intervals of the parameters of Cox's PH model
term estimate p.value HR 2.5 % 97.5 %
trt -0.352 0.001 0.703 0.571 0.867
fevm -0.019 0.000 0.981 0.977 0.986
  • Treatment group patients have about 30% lower hazard of first exacerbation than that of the placebo group patients provided FEV value remains constant

  • For 1-unit increase of FEV value, hazard of first exacerbation decreases about 2% provided treatment group remains constant


survfit() provides estimate of survivor function and corresponding standard errors

tidy(survfit(mod1)) %>% 
    as_tibble()
# A tibble: 161 × 8
    time n.risk n.event n.censor estimate std.error conf.high conf.low
   <dbl>  <dbl>   <dbl>    <dbl>    <dbl>     <dbl>     <dbl>    <dbl>
 1     1    761       1        0    0.999   0.00138     1        0.996
 2     5    760       3        0    0.994   0.00277     1.00     0.989
 3     6    757       1        0    0.993   0.00311     0.999    0.987
 4     8    756       4        0    0.988   0.00420     0.996    0.979
 5     9    752       3        0    0.983   0.00489     0.993    0.974
 6    11    749       2        0    0.981   0.00530     0.991    0.971
 7    13    747       2        0    0.978   0.00569     0.989    0.967
 8    14    745       2        0    0.975   0.00606     0.987    0.964
 9    15    743       4        0    0.970   0.00675     0.983    0.957
10    16    739       2        0    0.967   0.00708     0.980    0.953
# ℹ 151 more rows