Standard error using the Fisher Information Matrix

Purpose

The variance of the maximum likelihood estimate (MLE) $\hat{\theta}$, and thus standard errors and confidence intervals, can be estimated from the inverse of the observed Fisher information matrix (FIM), which is itself derived from the observed likelihood (i.e., the pdf of the observations $y$):

$I_{y}(\hat{\theta})\triangleq -\frac{\partial^2}{\partial\theta^2}\log({\cal L}_y(\hat{\theta}))$
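As an illustration of how standard errors follow from the observed FIM, here is a minimal sketch on a toy model: an i.i.d. Gaussian sample with unknown mean and standard deviation. It is not the algorithm used by Monolix (neither linearization nor stochastic approximation). The simulated data, the function names `neg_log_lik` and `hessian_2x2`, and the finite-difference step `h` are all assumptions made for this example. The Hessian of the negative log-likelihood is evaluated at the MLE, inverted, and the square roots of the diagonal are compared with the known closed-form standard errors $\hat{\sigma}/\sqrt{n}$ and $\hat{\sigma}/\sqrt{2n}$.

```python
import math
import random

def neg_log_lik(theta, y):
    """Negative log-likelihood of an i.i.d. Gaussian sample; theta = (mu, sigma)."""
    mu, sigma = theta
    n = len(y)
    ss = sum((yi - mu) ** 2 for yi in y)
    return n * math.log(sigma) + 0.5 * n * math.log(2.0 * math.pi) + ss / (2.0 * sigma ** 2)

def hessian_2x2(f, theta, h=1e-3):
    """Central finite-difference Hessian of f at a two-parameter point theta."""
    def f_shifted(i, di, j, dj):
        t = list(theta)
        t[i] += di
        t[j] += dj
        return f(t)
    return [[(f_shifted(i, h, j, h) - f_shifted(i, h, j, -h)
              - f_shifted(i, -h, j, h) + f_shifted(i, -h, j, -h)) / (4.0 * h * h)
             for j in range(2)] for i in range(2)]

# Simulated data (hypothetical example, not the theophylline data set)
random.seed(0)
y = [random.gauss(1.0, 2.0) for _ in range(200)]
n = len(y)

# Closed-form MLE of the Gaussian mean and standard deviation
mu_hat = sum(y) / n
sigma_hat = math.sqrt(sum((yi - mu_hat) ** 2 for yi in y) / n)

# Observed FIM = Hessian of the negative log-likelihood at the MLE
fim = hessian_2x2(lambda t: neg_log_lik(t, y), (mu_hat, sigma_hat))

# Invert the 2x2 FIM to obtain the estimated covariance matrix of the MLE
det = fim[0][0] * fim[1][1] - fim[0][1] * fim[1][0]
cov = [[fim[1][1] / det, -fim[0][1] / det],
       [-fim[1][0] / det, fim[0][0] / det]]

# Standard errors are the square roots of the diagonal of the covariance matrix
se_mu = math.sqrt(cov[0][0])
se_sigma = math.sqrt(cov[1][1])
print("s.e.(mu):   ", se_mu, " closed form:", sigma_hat / math.sqrt(n))
print("s.e.(sigma):", se_sigma, " closed form:", sigma_hat / math.sqrt(2 * n))
```

For this toy model the Hessian is available in closed form, so the numerical result can be checked directly. For nonlinear mixed effects models the likelihood has no closed form, which is why approximate methods are needed.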

Two algorithms are available for computing the FIM: linearization and stochastic approximation. With “linearization”, the structural model is linearized and the full statistical model is approximated by a Gaussian model. With “stochastic approximation”, the exact model is used and the FIM is approximated stochastically. The final estimates are displayed in the command window (Matlab or DOS) together with the population parameters:

the estimated fixed effects, their absolute and relative standard errors, and the p-values obtained from the Wald test (only for the coefficients of the covariates),
the estimated variances (or standard deviations) and their standard errors,
the estimated residual error parameters and their standard errors,
the estimated correlation matrix of the random effects if the covariance matrix is not diagonal,
the correlation matrix of the fixed effect estimates, with the smallest and largest eigenvalues,
the correlation matrix of the variance component estimates, with the smallest and largest eigenvalues.
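The relative standard error and the Wald p-value listed above can be sketched as follows. This sketch assumes a two-sided Wald test based on the standard normal distribution and a 95% confidence interval of the form estimate ± 1.96 s.e.; the function name `wald_summary` is hypothetical. The inputs are the rounded values reported for beta_V_t_WEIGHT in the theophylline example (estimate -0.471, s.e. 0.3), so the results can only approximately match the reported r.s.e. of 63% and p-value of 0.11.

```python
import math

def wald_summary(estimate, se):
    """Relative s.e. (%), two-sided Wald p-value and 95% confidence interval."""
    rse = 100.0 * se / abs(estimate)
    z = estimate / se
    # Two-sided p-value under N(0, 1): 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    ci = (estimate - 1.96 * se, estimate + 1.96 * se)
    return rse, p_value, ci

# Rounded values for beta_V_t_WEIGHT taken from the example output
rse, p_value, ci = wald_summary(-0.471, 0.3)
print("r.s.e. (%):", rse)
print("p-value:   ", p_value)
print("95% CI:    ", ci)
```

Under these assumptions the confidence interval contains zero, which is consistent with a p-value above 0.05.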

All of this information is appended to the file pop_parameters.txt. The following output shows an example for the theophylline_project.

******************************************************************
*      theophylline_project.mlxtran
*      November 26, 2015 at 16:17:01
*      Monolix version: 4.4.0
******************************************************************

Estimation of the population parameters

                  parameter     s.e. (lin)   r.s.e.(%)   p-value 
ka_pop          :     1.57         0.31          20              
V_pop           :    0.454        0.019           4              
beta_V_t_WEIGHT :   -0.471          0.3          63        0.11  
Cl_pop          :   0.0399       0.0034           9              

omega_ka        :    0.641         0.14          22              
omega_V         :    0.113        0.035          31              
omega_Cl        :    0.272        0.066          24              

a               :    0.735        0.056           8              

______________________________________________
correlation matrix of the estimates(linearization)

ka_pop               1          
V_pop             0.15       1       
beta_V_t_WEIGHT      0    0.09       1    
Cl_pop           -0.06   -0.15    0.01       1 

Eigenvalues (min, max, max/min): 0.79  1.3  1.6

omega_ka      1          
omega_V   -0.02       1       
omega_Cl      0   -0.01       1    
a         -0.02   -0.11   -0.05       1 

Eigenvalues (min, max, max/min): 0.87  1.1  1.3

Numerical covariates
    t_WEIGHT = log(WEIGHT/70)

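Note that the chosen method (here, linearization) is indicated in the file, both in the header of the standard error column, “s.e. (lin)”, and in the title of the correlation matrix of the estimates.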
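The eigenvalues reported under each correlation matrix can be checked with a short sketch. The matrix below is the correlation matrix of the fixed effect estimates, copied from the example output with its two-decimal rounding. The extreme eigenvalues are computed by power iteration in pure Python so that the example is self-contained. This is an assumption of the sketch, not a description of how Monolix computes them, and in practice a linear algebra library routine would normally be used. The helper names `mat_vec` and `dominant_eigenvalue` are hypothetical.

```python
import math

def mat_vec(m, v):
    """Product of a square matrix (list of rows) and a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def dominant_eigenvalue(m, iterations=500):
    """Power iteration: largest eigenvalue of a symmetric positive semi-definite matrix."""
    v = [1.0 / (i + 1) for i in range(len(m))]  # arbitrary starting vector
    for _ in range(iterations):
        w = mat_vec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the (unit-norm) converged vector
    return sum(a * b for a, b in zip(v, mat_vec(m, v)))

# Correlation matrix of (ka_pop, V_pop, beta_V_t_WEIGHT, Cl_pop), rounded values
R = [[1.00, 0.15, 0.00, -0.06],
     [0.15, 1.00, 0.09, -0.15],
     [0.00, 0.09, 1.00, 0.01],
     [-0.06, -0.15, 0.01, 1.00]]

lam_max = dominant_eigenvalue(R)

# The matrix lam_max * I - R has eigenvalues lam_max - lambda_i >= 0,
# so its largest eigenvalue is lam_max - lam_min.
shifted = [[(lam_max if i == j else 0.0) - R[i][j] for j in range(4)]
           for i in range(4)]
lam_min = lam_max - dominant_eigenvalue(shifted)

# Should be close to the reported values (0.79, 1.3, 1.6), up to rounding
print("min:", lam_min, "max:", lam_max, "max/min:", lam_max / lam_min)
```

A max/min ratio close to 1 indicates weakly correlated estimates; a very large ratio would suggest that some parameters are poorly identified.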

Best practices: when to use “linearization” and when to use “stochastic approximation”

First, the linearization algorithm can only be used for continuous data. In that case, it is generally much faster than stochastic approximation and still gives good estimates of the FIM. The FIM obtained by model linearization is generally able to identify the main features of the model. More precise, and more time-consuming, estimation procedures such as stochastic approximation and importance sampling will have very limited impact on decisions concerning these most obvious features. Precise results are required for the final runs, where it becomes more important to rigorously defend the decisions made in choosing the final model and to provide precise estimates and diagnostic plots. For the theophylline example, the table below compares the results and the CPU time of the two methods.

Method	s.e. $(ka_{pop}, V_{pop}, \beta_{V,t_{WEIGHT}}, Cl_{pop})$	s.e. $(\omega_{ka}, \omega_{V}, \omega_{Cl})$	Eigenvalues (min, max, max/min) of the correlation matrices of the estimates (fixed effects, then variance components)	CPU time [s]
Linearization	(0.31, 0.019, 0.3, 0.0035)	(0.14, 0.035, 0.066)	(0.8, 1.3, 1.6, 0.87, 1.1, 1.3)	0.1
Stochastic Approximation	(0.31, 0.019, 0.3, 0.0035)	(0.15, 0.035, 0.068)	(0.8, 1.2, 1.5, 0.79, 1.2, 1.5)	3.4

One can see that stochastic approximation is much more costly in computation time (3.4 s versus 0.1 s in this example), but the result is more precise.