time | status |
---|---|
2 | 1 |
72 | 0 |
51 | 1 |
60 | 0 |
33 | 1 |
27 | 1 |
14 | 1 |
24 | 1 |
4 | 1 |
21 | 0 |
Chapter 4
(AST405) Lifetime data analysis
4 Inference procedures for parametric models
Introduction
-
Likelihood methods for lifetime data were introduced in Chapter 2, which includes derivation of likelihood function for different types of censored data
Maximum likelihood estimator
Inference about parameters (hypothesis testing and confidence intervals)
Likelihood function (Complete data)
Let
be a random sample from a distribution , where is a -dimensional vector of parametersLikelihood and log-likelihood function
Maximum likelihood estimator (MLE)
Statistical inference (Complete data)
-
Large sample property of MLE
Fisher expected information matrix
Observed information matrix
Likelihood function (Type I or random censoring)
-
Data:
is a sample realization oflifetime
follows a distribution with pdf and the corresponding survivor functioncensoring time
could be random or fixed depending on the censoring mechanism , censoring indicator
Data for type I or random censoring
Likelihood function
Likelihood function (Type II censoring)
Let lifetime
follows a distribution with pdf and the corresponding survivor functionThe experiment was terminated after observing
smallest lifetimesThe remaining
observations are considered as censored atLikelihood function
Statistical inference (censored sample)
For censored samples, the result “asymptotic distribution of MLE is normal” is still valid
The expression of the Fisher expected information matrix
is complex for censored data, observed information matrix is used in making inference with censored sample
4.1 Exponential distribution
The exponential distribution is the first lifetime model for which statistical methodology were extensively developed
Exact tests can be developed for exponential distribution for certain type of censoring mechanism
Exponential distribution assume constant hazard and its use is limited for analyzing real life problems
Probability density function
Hazard function
Survivor function
th quantile
Homework
- Estimation and related inference of exponential distribution when the sample has no censored observations
Type I or random censoring
Lifetime
,Data:
Likelihood function
Likelihood function
-
Log-likelihood function
-
the number of failures observed in the sample
-
Log-likelihood function
-
MLE
- Assuming
and no finite MLE exist for
- Assuming
- Information matrix
- Replacing the parameter
by its MLE , the observed information matrix becomes
Confidence interval (Method I)
Using
’s asymptotic distributionFor a small sample,
does not approximate the standard normal distribution very accuratelyFor a sample with a small number of uncensored observations,
tends to be asymmetric
Confidence interval (Method II)
-
Sprott (1980) showed that
is more closer to symmetric compared to , whereMLE
Observed information matrix
-
Pivotal quantity
- Approximation of
is quite accurate compared to that of
- Approximation of
Confidence interval (Method III)
- Likelihood ratio statistic
can also be used to obtain confidence interval for , where
LRT statistic
Two-sided
confidence intervals are obtained as the set of values that satisfy
Confidence interval of survivor function
Confidence interval for the parameter
can also be used to obtain confidence intervals for a monotone function of , such asLet
confidence interval for where
-
confidence interval for
Confidence interval of hazard function
-
confidence interval for
Example 4.1.1
- Lifetimes (in days) of 10 pieces of equipment
- Assume lifetimes follow exponential distribution with parameter, i.e.
MLE
95% confidence interval of
(Method I)
-
95% confidence interval of
(Method II)Confidence interval for
Confidence interval for
-
Likelihood ratio statistic
95% CI for
can be obtained from the set of ’s such that95% CI for
method | lower | upper |
---|---|---|
Normal approx. | 11.40 | 76.60 |
Sprott | 22.69 | 103.03 |
LRT | 22.80 | 102.40 |
Methods for CI
Normal approximation
Sprott’s method
LRT
Type II censoring plan
Assume lifetime
Let
be the smallest lifetimes observed from an experiment with subjectsThe joint distribution of
- The log-likelihood function
-
Maximum likelihood estimator
- Observed information matrix
- Observed information matrix
CI for (by Normal approximation)
Exact confidence interval
Exact confidence interval for
can be derived for uncensored and Type II censored samples, defineIn general
- It can be shown that
- The joint distribution of
where
- We can write
-
The joint distribution of
- Using Equation 4.1
-
Equation
shows that-
are independent and
-
Since
’ are independent and , so follows a gamma distribution with shape parameter and scale parameterThen using the relationship between gamma and chi-square distribution
Review
- Moment generating functions for different distributions
-
Since
, we can write-
th quantile of distribution
-
exact confidence interval for
Example 4.1.3
The first 8 observations in a random sample of 12 lifetimes (in hours) from an assumed exponential distribution are
-
Homework
- Obtain 95% exact and approximate confidence intervals for
- Obtain 95% exact and approximate confidence intervals for
4.1.3 Comparison of distributions
Comparison of two or more lifetime distributions is often of interest in practice
For comparing two or more independent exponential distributions, different methods of hypothesis tests and confidence intervals are available
Likelihood ratio tests
Let
be the lifetime corresponding to the th subject of the th groupAssume
and the null hypothesis of interestData:
The likelihood function for
The corresponding log-likelihood function
- The log-likelihood function
-
and
-
- Maximum likelihood estimator of
-
Under
log-likelihood function- Under
, the MLE of
- Under
-
Likelihood ratio test statistic
- Under
, for a large sample size
- Under
Example 4.1.4
Four independent samples of size 10 each had 7 failures
MLE under exponential model
-
Homework
- Test
- Test
Confidence intervals for
Comparison between two exponential distributions
It can be shown that mle
approximately follows normal distributionDistribution of
Null hypothesis
Test statistic
95% CI for
95% CI for
Confidence intervals can also be found by inverting the likelihood ratio test for a hypothesis of the form
where-
For Type I sample
, the log-likelihood function- MLE
and
- MLE
-
Under
, the log-likelihood function- MLE under
- MLE under
- The likelihood ratio test statistic
where
and
- The likelihood ratio test statistic
confidence interval for can be construed from the values that satisfyFor Type I censored sample, LRT statistics based confidence interval for
Equation 4.3 is more accurate than that of normal approximation Equation 4.2 for small samples
Example 4.1.5
A small clinical trial was conducted to compare the duration of remission achieved by two drugs used in the treatment of leukemia.
Duration of remission is assumed to follow an exponential distribution and two groups of 20 patients produced the followings under a Type I censoring mechanism
Obtain 95% approximate and exact CI for
Unrestricted MLEs
95% approximate CI for
- 95% approximate CI (normal distribution based) for
Is there any significant difference between
Likelihood ratio statistic based confidence interval for
requires estimate of the parameters under the null hypothesisEstimates under
Likelihood ratio statistic
0.525 | 3.953 | 3.185 | 3.911 |
0.560 | 3.424 | 3.150 | 3.819 |
0.595 | 2.958 | 3.115 | 3.726 |
0.630 | 2.549 | 3.080 | 3.633 |
0.665 | 2.187 | 3.045 | 3.541 |
0.700 | 1.869 | 3.010 | 3.448 |
0.735 | 1.589 | 2.975 | 3.356 |
0.770 | 1.341 | 2.940 | 3.263 |
- 95% of
is the range of values of , such that which is
method | lower | upper |
---|---|---|
Normal approximation | 0.54 | 3.114 |
LRT | 0.56 | 3.150 |
Type II censored sample (CI for )
Lifetime distributions
Type II censored sample for the group
, which has number of failures and subjects are censored at
- Likelihood function
MLE
We have already shown that
Then we can show
We can write
confidence interval for
4.2 Gamma distribution
-
The pdf of two-parameter gamma distribution
- Scale parameter
and shape parameter
- Scale parameter
Survivor function
Incomplete gamma function
Uncensored data
-
Let
be a random sample from , the log-likelihood function-
and
-
- Score functions
where is the digamma function (see Appendix B of the Textbook)
-
MLE
is a solution of the system of linear equationsTwo equations and two unknowns
Equations are non-linear in terms of the variables
No closed form solutions
1. Substitution method
Solve Equation 4.5 using a suitable optimization technique (e.g. graphical method, Newton-Raphson method, etc.) to obtain
Then
can be obtained from Equation 4.4
2. Newton-Raphson method
- Score functions
- Elements of Hessian matrix
Score vector
Hessian matrix
Score vector and information matrix are function of parameters and data
Initial values
are chosen so that elements of score vector and hessian matrix are finiteUpdated estimate
is obtained asThe estimate
can be obtained by using as input in Equation 4.6Repeating the procedure of evaluating the Equation 4.6 using the current estimate, the following sequence of estimates can be obtained
- A convergence criterion needs to be defined to obtain the MLE from the sequence of estimates
Convergence criteria are defined on the basis of two successive values of the parameters estimates
-
is considered as MLE if one of the following criteria is satisfied
- Estimated variance-variance matrix of the MLE
can be obtained from the inverse of the negative of the hessian matrix evaluated at the MLE
Newton-Raphson method: Pseudo code
1: theta0 <- `initial value of the parameter`
2: eps <- 1
3: while (eps > 1e-5) {
4: u0 <- U(theta0)
5: h0 <- H(theta0)
6: theat1 <- theta0 - inv(h0) * u0
7: eps <- max(abs(theta1 - theta0))
8: if (eps < 1e-5) break
9: else theta0 <- theta1
10: }
11: return (list(theta0, h0))
-
Statistical software have routines (such as
optim()
of R) that can optimize likelihood function to obtain MLESuch routines require providing a “function” of likelihood function as an argument
Different optimization algorithms, such as Newton-Raphson, Mead-Nelder, simulated annealing, etc. are implemented in optimization routines
Statistical inference
Asymptotic distributions of MLEs, i.e.
-
Since
and , confidence intervals based on the sampling distribution of and ensure non-negative lower limit of the confidence interval
-
confidence interval for
-
confidence interval for
Example 4.2.1
Following are survival time of 20 male rates that were exposed to a high level of radiation
rtime <- c(152, 152, 115, 109, 137, 88, 94, 77, 160, 165, 125, 40, 128, 123, 136, 101, 62, 153, 83, 69)
rtime
[1] 152 152 115 109 137 88 94 77 160 165 125 40 128 123 136 101 62 153 83
[20] 69
- Assume the lifetimes follow a gamma distribution with scale parameter
and shape parameter
From the data:
Expressions of the score function
- R function
uniroot()
can be used to obtain the value of by solving the Equation 4.7
Log-likelihood function of gamma distribution
- R function to calculate log-likelihood function of a gamma distribution for a given sample
par
is a vector with the parameterscale
as the first element andshape
as the second elementtime
is the observed failure timesgamma_loglk
function can be evaluated for any given valid values ofpar
andtime
- For given values of parameters, say
, the value of log-likelihood function can be obtained for the rat data
gamma_loglk(par = c(1, 2), time = rtime)
[1] -2175.531
- For another set of values
, the corresponding value of log-likelihood function
gamma_loglk(par = c(100, 80), time = rtime)
[1] -5392.711
optim()
function
- R function
optim()
is a general purpose optimization routine
par
initial values for the parameters to be optimizedfn
a function to be minimizedgr
a function to return the gradient (score function)method
a method to be used, e.g. “Nelder-Mead”, “BFGS”, “CG”, “L-BFGS-B”, etc.lower
andupper
lower and upper limit of the parameters to be optimizedcontrol
list of arguments (e.g.fnscale
, etc.) for controlling the iterations
Initial values
Initial values of the parameters can be selected from the exploratory plots of log-likelihood function
The log-likelihood function must provide finite values with the initial values of the parameters
- As an example, assume the initial values
and , and the corresponding value of log-likelihood function
gamma_loglk(par = c(12.9, 8.8), time = rtime)
[1] -100.48
- For another set of initial values,
and
gamma_loglk(par = c(1.0, 1.0), time = rtime)
[1] -2269
- Since both sets of initial values can provide finite values of log-likelihood function, we can use either of these two as an initial value for optimizing the log-likelihood function using the R function
optim()
Example 4.2.1 using optim()
- Fit of the gamma distribution to the rat data
- List of objects in
gamma_fit
names(gamma_fit)
[1] "par" "value" "counts" "convergence" "message"
[6] "hessian"
- Converged?
gamma_fit$convergence
[1] 0
- Estimate of scale and shape parameters
gamma_fit$par
[1] 12.896976 8.795312
-
and
- Estimated variance-variance matrix
cvar <- solve(-gamma_fit$hessian)
cvar
[,1] [,2]
[1,] 16.88167 -10.871356
[2,] -10.87136 7.416138
CI for and
Using the sampling distribution of
andUsing the sampling distribution of
and
Quantiles
-
th quantile-
is the th quantile function of one-parameter gamma distribution [R functionqgamma(p, scale=1, shape)
]
-
Estimate of the median for the rat data is
12.893951 * qgamma(.5, scale = 1, shape = 8.799089)
[1] 109.1872
qgamma(.5, scale = 12.893951, shape = 8.799089)
[1] 109.1872
Likelihood ratio statistic
Likelihood ratio statistic
Approximate
confidence region for can be obtained from the set of points satisfying
LRT statistic based CI for
Consider the following null hypothesis
MLEs (unrestricted and under restriction)
-
Likelihood ratio statistic
- Under
, approximately follows a distribution
- Under
An approximate two-sided
confidence interval for can be obtained from a set of values of satisfying
LRT statistic based CI for
Consider the following null hypothesis
MLEs (unrestricted and under restriction)
-
Likelihood ratio statistic
- Under
, approximately follows a distribution
- Under
An approximate two-sided
confidence interval for can be obtained from a set of values of satisfying
Summary of CI
For
For
Censored
Data
No closed form solutions are available for MLEs
andAlgebraic expressions of score function and information matrix for the log-likelihood function
are very complicatedScore functions and information matrix can be evaluated numerically
-
Optimization routines available in statistical software can be used to obtain MLEs
and , and the corresponding SEs- Different optimization algorithms such as Newton-Raphson, Nelder-Mead, etc. are available in such routines
As an example, we are going to use the same rat data that we have used for the analysis of complete data
-
For the rat data, assume a time
weeks is considered as censored, there are 5 censored observations- An R object
rtimec
is created, which has two columnstime
andstatus
- An R object
rtimec |> t()
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
time 152 152 115 109 137 88 94 77 160 165 125 40 128
status 0 0 1 1 1 1 1 1 0 0 1 1 1
[,14] [,15] [,16] [,17] [,18] [,19] [,20]
time 123 136 101 62 153 83 69
status 1 1 1 1 0 1 1
gamma log-likelihood function for censored sample
gamma_loglk <- function(par, time, status = NULL) {
#
if (is.null(status)) status <- rep(1, length(time))
#
llk_f <- sum(status * dgamma(time, scale = par[1],
shape = par[2], log = T))
#
llk_c <- sum((1 - status) * pgamma(time, scale = par[1],
shape = par[2], lower.tail = F,
log.p = T))
return(llk_f + llk_c)
}
Censored sample
- R codes to obtain MLE of parameters of a gamma distribution from a censored sample
Maximum likelihood estimators [
gamma_out_c$par
]Standard errors [
solve(-gamma_out_c$hessian)
]
CI for CI for
- Estimate of median
Acknowledgements
This lecture is adapted from materials created by Mahbub Latif