13  Interval estimation

Important

This chapter is currently undergoing conversion and substantial changes will be made without notice.

13.1 Outline and prerequisites

13.2 Confidence intervals

13.2.1 Standard intervals

13.2.2 Likelihood intervals

13.2.3 Robust intervals

13.3 Calibration

13.3.1 RAL

13.3.2 Bootstrapping

13.3.3 Wilks theorem

Exercises

Exercise 13.1 Assume that the sample space is \(\mathbb{R}^k\) and the parameter of interest is the mean \[ \gamma(\rho) = \int y \, \mathrm{d}\rho(y). \] Assume that we have observations \(Y_1, \ldots, Y_n\) and let \(K\) denote the convex hull spanned by these observations (the smallest convex subset containing the observations). Let \(\hat{\rho}\) be the NPMLE, for which \[ \gamma(\hat{\rho}) = \frac{1}{n} \sum_{i=1}^n Y_i. \]

  1. Argue that if the support of \(\rho'\) is contained in \(K\), then \(\gamma(\rho') \in K\), and show that in this case there exists a \[ \rho = \sum_{i=1}^n \rho_i \varepsilon_{Y_i} \] such that \(L(\rho) \geq L(\rho')\) and \(\gamma(\rho) = \gamma(\rho')\).

  2. Show that if the support of \(\rho'\) is not restricted then for all \(\gamma \in \mathbb{R}^k\) \[ \sup_{\rho': \gamma(\rho') = \gamma} L(\rho') = L(\hat{\rho}). \]

Exercise 13.2 The Gompertz distribution on \([0, \infty)\) has survival function \[ S(t) = \exp(- \eta (e^{\gamma t} - 1)), \quad t \geq 0, \] for parameters \(\eta, \gamma > 0\). We assume in the following that \[ (T_1, \delta_1), \ldots, (T_n, \delta_n) \] are \(n\) censored i.i.d. survival times following a Gompertz distribution. Introduce also the summary statistics \[ t = \sum_{i=1}^n T_i, \quad s = \sum_{i=1}^n \delta_i T_i \quad \textrm{and} \quad n_u = \sum_{i=1}^n \delta_i. \] The values of these summary statistics for a dataset are given in the following table together with some additional statistics.

| Statistic | \(n\) | \(n_u\) | \(t\) | \(s\) | \(\hat{\gamma}\) | \(\sum_{i=1}^n e^{\hat{\gamma} T_i}\) | \(\sum_{i=1}^n T_i^2 e^{\hat{\gamma} T_i}\) |
|-----------|-------|---------|-------|-------|------------------|---------------------------------------|---------------------------------------------|
| Value     | 50    | 32      | 1637  | 1384  | 0.09390          | 3396                                  | 9949983                                     |
  1. Compute the hazard function for the Gompertz distribution. Show that the log-likelihood is \[ \ell(\eta, \gamma) = \log(\eta \gamma) n_u + \gamma s + \eta\left(n - \sum_{i=1}^n e^{\gamma T_i}\right). \]

  2. Show that the MLE (provided it exists) satisfies the equations \[ \begin{align*} \sum_{i=1}^n e^{\gamma T_i} & = n + \frac{n_u}{\eta} \\ \sum_{i=1}^n T_i e^{\gamma T_i} & = \frac{s}{\eta} + \frac{n_u}{\eta \gamma}. \end{align*} \]

  3. With \(\mathcal{J}^{\mathrm{obs}}\) denoting the observed Fisher information show that
    \[ ((\mathcal{J}^{\mathrm{obs}})^{-1})_{22} = \left(\frac{n_u}{\hat{\gamma}^2} + \hat{\eta} \sum_{i=1}^n T_i^2 e^{\hat{\gamma} T_i} - \frac{1}{n_u} \left(s + \frac{n_u}{\hat{\gamma}}\right)^2\right)^{-1}. \]

  4. Using the summary statistics from the table above show that
    \[ \hat{\eta} = 0.009564 \] and compute an approximate likelihood interval for \(\gamma\).
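The computations can be sketched directly from the summary statistics in the table; the following is a minimal R sketch (all variable names are ad hoc), using the first score equation to compute \(\hat{\eta}\) and the quadratic approximation of the log-likelihood for the interval.

```r
## Summary statistics from the table
n <- 50; n_u <- 32; s <- 1384
gamma_hat <- 0.09390
sum_exp   <- 3396      # sum_i exp(gamma_hat * T_i)
sum_T2exp <- 9949983   # sum_i T_i^2 * exp(gamma_hat * T_i)

## First score equation rearranged: eta_hat = n_u / (sum_exp - n)
eta_hat <- n_u / (sum_exp - n)
round(eta_hat, 6)  # 0.009564

## ((J_obs)^{-1})_{22} from question 3
inv_J22 <- 1 / (n_u / gamma_hat^2 + eta_hat * sum_T2exp -
                  (s + n_u / gamma_hat)^2 / n_u)

## Approximate 95% likelihood interval for gamma based on the
## quadratic approximation of the log-likelihood
gamma_hat + c(-1, 1) * sqrt(qchisq(0.95, df = 1) * inv_J22)
```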

From here on it is assumed that the censoring distribution is exponential with rate parameter \(\lambda\), and that the censoring variables are i.i.d.

  1. Argue why \(\hat{\lambda} = (n - n_u) / t\) is a sensible estimator of \(\lambda\).

  2. Simulate a new data set, \((\tilde{T}_1, \tilde{\delta}_1), \ldots, (\tilde{T}_n, \tilde{\delta}_n)\), using the estimated parameters \(\hat{\eta}\) and \(\hat{\gamma}\) for the Gompertz survival distribution and \(\hat{\lambda}\) for the exponential censoring distribution. Compute and plot the Kaplan-Meier estimator of the survival function for the simulated data and compare it to the theoretical survival function.
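The simulation question above can be sketched as follows, by inverting the Gompertz survival function and using the survival package for the Kaplan-Meier estimator. The parameter values are those computed in the previous questions; all object names are ad hoc.

```r
library(survival)
set.seed(1)
n <- 50
eta_hat <- 0.009564; gamma_hat <- 0.0939
lambda_hat <- (n - 32) / 1637          # (n - n_u) / t

## Gompertz survival times by inversion of S(t) = exp(-eta (e^{gamma t} - 1)):
## solving S(t) = U gives t = log(1 - log(U) / eta) / gamma for U uniform
T_surv <- log(1 - log(runif(n)) / eta_hat) / gamma_hat
C_cens <- rexp(n, rate = lambda_hat)   # exponential censoring times
T_tilde <- pmin(T_surv, C_cens)
delta   <- as.numeric(T_surv <= C_cens)

## Kaplan-Meier estimator compared with the theoretical survival function
fit <- survfit(Surv(T_tilde, delta) ~ 1)
plot(fit, mark.time = FALSE)
curve(exp(-eta_hat * (exp(gamma_hat * x) - 1)), add = TRUE, col = "red")
```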

The implementation of the simulation is the first step in the construction of a parametric bootstrap algorithm, which, for instance, can be used to calibrate the likelihood interval.

  1. Implement a parametric bootstrap algorithm¹ to obtain a bootstrap approximation of the distribution of the statistic \[ \frac{(\gamma - \hat{\gamma})^2}{((\mathcal{J}^{\mathrm{obs}})^{-1})_{22}}. \] Compare it with a \(\chi^2_1\)-distribution.

¹ Remember the parameter constraints!

Hint: The implementation requires numerically solving the score equation to compute the MLE. Only a univariate equation needs to be solved numerically, and you can implement your own Newton algorithm or use the R function uniroot(). Alternatively, you can use the R function optim() for direct minimization of the negative log-likelihood.
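One possible sketch of the parametric bootstrap follows the optim() route from the hint, parametrizing by \((\log \eta, \log \gamma)\) to respect the positivity constraints. This is a sketch under the stated assumptions, not the only implementation; the simulation reuses the inversion of the Gompertz survival function, and all names are ad hoc.

```r
set.seed(2)
n <- 50
eta_hat <- 0.009564; gamma_hat <- 0.0939
lambda_hat <- (n - 32) / 1637

## Negative log-likelihood in (log eta, log gamma); the log-parametrization
## keeps eta, gamma > 0 during the optimization
negloglik <- function(par, T, delta, n) {
  eta <- exp(par[1]); gamma <- exp(par[2])
  -(sum(delta) * log(eta * gamma) + gamma * sum(delta * T) +
      eta * (n - sum(exp(gamma * T))))
}

B <- 500
W <- numeric(B)
for (b in 1:B) {
  ## Simulate from the fitted Gompertz and exponential censoring models
  T_surv <- log(1 - log(runif(n)) / eta_hat) / gamma_hat
  C_cens <- rexp(n, rate = lambda_hat)
  T_b <- pmin(T_surv, C_cens); d_b <- as.numeric(T_surv <= C_cens)
  ## MLE for the bootstrap sample
  opt <- optim(log(c(eta_hat, gamma_hat)), negloglik,
               T = T_b, delta = d_b, n = n)
  e_b <- exp(opt$par[1]); g_b <- exp(opt$par[2])
  nu <- sum(d_b); s_b <- sum(d_b * T_b)
  ## ((J_obs)^{-1})_{22} evaluated in the bootstrap sample (question 3)
  inv_J22_b <- 1 / (nu / g_b^2 + e_b * sum(T_b^2 * exp(g_b * T_b)) -
                      (s_b + nu / g_b)^2 / nu)
  W[b] <- (gamma_hat - g_b)^2 / inv_J22_b
}

## Compare with the chi^2_1-distribution, e.g. via a QQ-plot
qqplot(qchisq(ppoints(B), df = 1), W)
abline(0, 1)
```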

Exercise 13.3 This exercise investigates the effect of model selection on the standard confidence interval constructions.

The exercise uses the cholost data set from the R package RwR, and you will consider models of the effect of compliance on cholesterol decrease. The data are from a trial in which 164 men took cholestyramine for a number of years. Their decrease in blood-level cholesterol from the beginning to the end of the trial is the response variable (y in the data set). Not all of them took the intended dose, and the variable c in the data set measures compliance; it is a monotone transformation of the proportion of the dose actually taken.

The conditional expectation of the decrease in cholesterol level \(Y\) as a function of the compliance \(C\) is denoted \[ E(Y \mid C) = f(C), \] and it is assumed throughout that \(V(Y \mid C) = \sigma^2\).

  1. Fit the model \[ f(c) = \beta_0 + \beta_1 c \] of the cholesterol decrease as a function of compliance. Investigate the model fit².

² This is a small data set, so don’t use hexagonal binning for plotting the data points!
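A minimal sketch of the linear fit could look as follows, assuming the cholost data frame from RwR has columns c and y as described above. If RwR is not installed, the sketch falls back on a clearly labeled synthetic placeholder with the same shape, so that the code runs as is.

```r
if (requireNamespace("RwR", quietly = TRUE)) {
  data(cholost, package = "RwR")
} else {
  set.seed(1)  # synthetic stand-in, NOT the real cholost data
  cholost <- data.frame(c = rnorm(164))
  cholost$y <- 10 + 15 * cholost$c + rnorm(164, sd = 20)
}

## Simple linear regression of cholesterol decrease on compliance
lm_chol <- lm(y ~ c, data = cholost)
summary(lm_chol)

## Model fit: data with fitted line, plus residual diagnostics
plot(y ~ c, data = cholost)
abline(lm_chol)
plot(lm_chol, which = 1:2)
```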

In the following \(f\) is expanded using natural cubic splines. Using the R terminology, an expansion with \(\mathrm{df}\) degrees of freedom is a model with \(\mathrm{df} - 1\) knots, \(\mathrm{df}\) basis functions and \(\mathrm{df} + 1\) parameters (an intercept and one parameter for each of the basis functions).

  1. Fit \(f\) as a natural cubic spline with 3 degrees of freedom.
    Give a graphical presentation of the resulting model. Compare the model to the linear regression model above.

  2. In the remaining part of the exercise the focus is on the estimation of \(f(c)\) for \[ c \in I := \{-2, -1, 0, 1, 2\}. \] That is, the focus is on the computation of the estimators \(\hat{f}(c)\) and their distribution for \(c\) taking these five particular values.

  3. Use nonparametric bootstrapping to estimate standard errors of \(\hat{f}(c)\) for \(c \in I\) when fitting the natural cubic spline model with 3 degrees of freedom. Compare with the theoretical values of the standard errors.

  4. Consider the 6 models where \(f\) is a natural cubic spline with 1–6 degrees of freedom. Compare the models and use the adjusted \(R^2\)-statistic to select one of the 6 models.
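The model comparison can be sketched as follows, fitting natural cubic splines with 1-6 degrees of freedom via splines::ns() and selecting by adjusted \(R^2\). As above, the cholost data frame is assumed, with a clearly labeled synthetic placeholder as fallback if RwR is unavailable.

```r
library(splines)
if (requireNamespace("RwR", quietly = TRUE)) {
  data(cholost, package = "RwR")
} else {
  set.seed(1)  # synthetic stand-in, NOT the real cholost data
  cholost <- data.frame(c = rnorm(164))
  cholost$y <- 10 + 15 * cholost$c + rnorm(164, sd = 20)
}

## Natural cubic spline fits with df = 1, ..., 6 basis functions
fits <- lapply(1:6, function(df) lm(y ~ ns(c, df = df), data = cholost))
adj_r2 <- sapply(fits, function(fit) summary(fit)$adj.r.squared)

rbind(df = 1:6, adj_r2 = round(adj_r2, 4))
which.max(adj_r2)  # degrees of freedom selected by adjusted R^2
```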

The bootstrap above does not take into account any model selection such as the selection of degrees of freedom of the cubic spline expansion of \(f\). The last question in this problem explores the effect of model selection on the confidence intervals.

  1. Use nonparametric bootstrapping to estimate standard errors of \(\hat{f}(c)\) for \(c \in I\) when selection of the degrees of freedom in the range from 1 to 6 is part of the estimation procedure. Compare with the previously estimated standard errors.
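A sketch of this final bootstrap, where the adjusted-\(R^2\) selection of degrees of freedom is redone inside every bootstrap sample, could look as follows; the cholost data frame is again assumed, with a labeled synthetic placeholder as fallback.

```r
library(splines)
if (requireNamespace("RwR", quietly = TRUE)) {
  data(cholost, package = "RwR")
} else {
  set.seed(1)  # synthetic stand-in, NOT the real cholost data
  cholost <- data.frame(c = rnorm(164))
  cholost$y <- 10 + 15 * cholost$c + rnorm(164, sd = 20)
}

I_grid <- c(-2, -1, 0, 1, 2)
B <- 200
f_hat <- matrix(NA, B, length(I_grid))
set.seed(2)
for (b in 1:B) {
  ## Nonparametric bootstrap: resample rows with replacement
  boot <- cholost[sample(nrow(cholost), replace = TRUE), ]
  ## Model selection is part of the estimation procedure
  fits <- lapply(1:6, function(df) lm(y ~ ns(c, df = df), data = boot))
  best <- fits[[which.max(sapply(fits,
                                 function(fit) summary(fit)$adj.r.squared))]]
  f_hat[b, ] <- predict(best, newdata = data.frame(c = I_grid))
}

apply(f_hat, 2, sd)  # bootstrap standard errors of f_hat(c) for c in I
```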