Fisher Information Matrix
The matrix J^T J, where J is the Jacobian of the model with respect to the parameters, has a special name in the context of least squares problems: it is called the Fisher information matrix, named after the statistician Ronald A. Fisher.
Imagine measuring how sharp the peak of a mountain is. The sharper the peak, the easier it is to determine the exact location of the peak. Similarly, the Fisher information matrix provides a measure of how well we can determine the optimal parameters.
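A minimal sketch of this idea, using an assumed straight-line model f(t; a, b) = a + b*t as a stand-in example: the Jacobian of the residuals has rows [1, t_i], and the Fisher information matrix is J^T J.

```python
import numpy as np

# Hypothetical example: straight-line model f(t; a, b) = a + b*t.
# The Jacobian J of the residuals with respect to (a, b) has rows [1, t_i],
# and the Fisher information matrix is F = J^T J.
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
J = np.column_stack([np.ones_like(t), t])  # d f / d(a, b) at each t_i
F = J.T @ J                                # Fisher information matrix

# Large eigenvalues of F correspond to parameter directions that are
# sharply determined by the data (a "sharp peak"); small eigenvalues
# correspond to poorly determined directions.
eigvals = np.linalg.eigvalsh(F)
print(F)
print(eigvals)
```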
Parameter Covariance Matrix
The matrix C = (J^T J)^{-1} is the covariance matrix of the parameter estimator θ̂. This holds under the assumption that the measurement errors ε_i, for i = 1, ..., m, are independent and standard normally distributed.
With this assumption, the estimator approximately follows a multivariate normal distribution, θ̂ ~ N(θ*, (J^T J)^{-1}), where the unknown true parameter θ* is the expected value and C = (J^T J)^{-1} is the covariance matrix.
The diagonal elements C_ii describe the variances of the parameters, measuring how far the parameter estimates can deviate from their true values. From these values, confidence intervals for the parameters can be calculated. The off-diagonal elements C_ij with i ≠ j are covariances that show how the uncertainties of two parameters are related. From these covariances, correlations between parameters can be obtained.
What matters in parameter estimation is not only the estimator θ̂ itself, but also its statistical significance as described by the covariance matrix C. This is like a doctor who not only reports test results, but also explains how much confidence to place in them. In statistics courses, these concepts are discussed in more detail.
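The diagonal and off-diagonal elements can be sketched concretely. The snippet below continues the assumed straight-line example: it computes C = (J^T J)^{-1}, reads standard errors off the diagonal, and a parameter correlation off the off-diagonal.

```python
import numpy as np

# Minimal sketch (assumed straight-line model f(t; a, b) = a + b*t with
# standard normally distributed errors): the parameter covariance matrix
# is C = (J^T J)^{-1}.
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
J = np.column_stack([np.ones_like(t), t])
C = np.linalg.inv(J.T @ J)

# Diagonal: parameter variances -> standard errors; approximate 95%
# confidence intervals are theta_hat +/- 1.96 * std_err under normality.
std_err = np.sqrt(np.diag(C))
# Off-diagonal: covariance -> correlation between the two parameters.
corr = C[0, 1] / (std_err[0] * std_err[1])
print(std_err, corr)
```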
QR Decomposition
The covariance matrix C can be calculated using the reduced QR decomposition of J. If J = QR, then, since Q has orthonormal columns, J^T J = R^T R, and it holds that C = (J^T J)^{-1} = (R^T R)^{-1} = R^{-1} R^{-T}.
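A sketch of this computation, again with the assumed straight-line Jacobian: only the triangular factor R needs to be inverted, and the result matches the direct inverse of J^T J.

```python
import numpy as np

# Sketch: computing C = (J^T J)^{-1} via the reduced QR decomposition
# J = Q R. Since Q has orthonormal columns, J^T J = R^T R, so
# C = (R^T R)^{-1} = R^{-1} R^{-T}; only the triangular R is inverted.
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
J = np.column_stack([np.ones_like(t), t])  # assumed example Jacobian

Q, R = np.linalg.qr(J, mode="reduced")
R_inv = np.linalg.inv(R)        # in practice: solve a triangular system
C_qr = R_inv @ R_inv.T          # C = R^{-1} R^{-T}

C_direct = np.linalg.inv(J.T @ J)
print(np.allclose(C_qr, C_direct))
```

Working with R instead of forming J^T J explicitly is also numerically preferable, since squaring J squares its condition number.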
Weighted Least Squares
To account for differing measurement errors and give appropriate weights to the measurement data, the weighted least squares problem is commonly used: minimize over θ the sum over i = 1, ..., m of (y_i − f(t_i; θ))² / σ_i².
This problem can be transformed into an ordinary least squares problem by defining scaled data ŷ_i = y_i / σ_i and a scaled model f̂_i(θ) = f(t_i; θ) / σ_i. Here σ_i² is the variance of the measurement error ε_i, and the errors are assumed independent and normally distributed. Additionally, it is assumed that the measurement errors have expected value E[ε_i] = 0, so there are no systematic errors. Thus ε_i / σ_i is standard normally distributed.
In the weighted least squares objective, measurements with large measurement errors receive weaker weights than measurements with small measurement errors. The analogy: when we gather opinions from various sources, we give greater weight to the more reliable sources and less weight to the less accurate ones.
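The transformation above can be sketched numerically. The example below uses made-up data for the assumed straight-line model, with one deliberately noisy measurement: dividing each data row and model row by σ_i turns the weighted problem into an ordinary least squares problem, and the fit is dominated by the low-noise points.

```python
import numpy as np

# Sketch of the weighted -> ordinary least squares transformation
# (assumed straight-line model a + b*t and made-up data).
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])
sigma = np.array([0.1, 0.1, 1.0, 0.1, 0.1])  # third point is noisy

J = np.column_stack([np.ones_like(t), t])   # Jacobian of a + b*t
Jw = J / sigma[:, None]                     # rows scaled by 1/sigma_i
yw = y / sigma                              # data scaled by 1/sigma_i

# Ordinary least squares on the scaled problem = weighted least squares.
theta, *_ = np.linalg.lstsq(Jw, yw, rcond=None)
print(theta)  # fit dominated by the four low-noise points
```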