Preliminaries of KF – Deterministic and stochastic least (mean) squares
Least squares is an old, classic problem, but it is essential for understanding the Kalman filter. In this post, we revisit:
- deterministic least-squares
- stochastic least-mean-squares
1. Deterministic least-squares
Problem statement: Let $H \in \mathbb{R}^{N \times n}$ and $y \in \mathbb{R}^{N}$. Find $\hat{x} \in \mathbb{R}^{n}$ to minimize $J(x) = \|y - Hx\|^2$.
Solution: I won’t show the derivation in detail; see ‘Linear estimation’ for a good discussion. Here I merely want to highlight some points.
- The optimal estimate $\hat{x}$ is a solution to the normal equation $H^T H \hat{x} = H^T y$, whether or not $H$ has full rank.
- Usually least-squares is used for an inconsistent overdetermined system $Hx = y$ with $N > n$. If, in this case, $H$ has full column rank, then $H^T H$ is invertible and the solution is unique: $\hat{x} = (H^T H)^{-1} H^T y$.
- More importantly, keep in mind that this problem is not restricted to inconsistent overdetermined systems. The matrix $H$ can be arbitrary. Whatever $H$ is (full rank or rank deficient), the optimal estimate is always a solution to the normal equation. The only difference when $H$ does not have full column rank is that there are infinitely many minimizers $\hat{x}$, but all of them give the same minimum $J$. Interestingly, among these infinitely many solutions there is exactly one with minimum norm, namely $\hat{x} = H^{\dagger} y$, where $H^{\dagger}$ is the pseudo-inverse of $H$.
- To prove the normal equation, two methods can be used: one is to set the derivative of $J$ with respect to $x$ to zero; the other is to complete the square.
- Why is it called the normal equation? It can be rewritten as $H^T (y - H\hat{x}) = 0$. That means the residual $y - H\hat{x}$ is orthogonal (normal) to the range space of $H$.
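To make the points above concrete, here is a small NumPy sketch. It assumes the problem is written as minimizing $\|y - Hx\|^2$; the matrices `H_full` and `H_def`, the vector `y`, and the helper `cost` are made-up illustrations, not taken from ‘Linear estimation’. It checks that the residual is orthogonal to the columns of $H$, and that in the rank-deficient case two different minimizers give the same cost while the pseudo-inverse solution has the smaller norm.

```python
import numpy as np

def cost(H, x, y):
    """Least-squares cost J(x) = ||y - H x||^2."""
    r = y - H @ x
    return float(r @ r)

# Hypothetical data: an inconsistent overdetermined system (3 equations, 2 unknowns).
y = np.array([1.0, 2.0, 4.0])

# Case 1: H has full column rank, so H^T H is invertible and the solution is unique.
H_full = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])
x_full = np.linalg.solve(H_full.T @ H_full, H_full.T @ y)

# The residual is orthogonal (normal) to the range space of H.
residual_full = y - H_full @ x_full

# Case 2: H is rank deficient (second column = 2 * first column).
H_def = np.array([[1.0, 2.0],
                  [2.0, 4.0],
                  [3.0, 6.0]])
x_min = np.linalg.pinv(H_def) @ y          # minimum-norm minimizer

# Adding any null-space vector of H_def gives another minimizer.
null_vec = np.array([2.0, -1.0])           # H_def @ null_vec == 0
x_other = x_min + 0.5 * null_vec

print("H^T residual (full rank):", H_full.T @ residual_full)
print("costs:", cost(H_def, x_min, y), cost(H_def, x_other, y))
print("norms:", np.linalg.norm(x_min), np.linalg.norm(x_other))
```

In practice one would usually call `np.linalg.lstsq` rather than forming $H^T H$ explicitly; the explicit form is used here only because it mirrors the normal equation.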
2. Stochastic least-mean-squares
Problem statement: Let $x$ be a random vector and $y$ a vector of measurements (take both to be zero-mean). We need to find a gain matrix $K$ to minimize the mean-square error $E\|x - Ky\|^2$.
Solution: Refer to ‘Linear estimation’ P80 for details. Here I want to highlight some points:
- Despite the similar names, stochastic least-mean-squares is quite different from deterministic least-squares: here the unknown $x$ is random and the cost is an expectation.
- The optimal gain $K$ is a solution to the normal equation $K R_y = R_{xy}$, where $R_y = E[yy^T]$ and $R_{xy} = E[xy^T]$.
- When we say we want to find $K$, we are really looking for an estimator, $\hat{x} = Ky$. This is a linear estimator because it is a linear function of $y$. Of course, in general an estimator can be a nonlinear function of $y$.
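As a rough illustration of the stochastic case, the sketch below solves a normal equation of the form $K R_y = R_{xy}$ (with $R_y = E[yy^T]$ and $R_{xy} = E[xy^T]$) for a made-up model. The linear measurement model $y = Hx + v$, the covariances `R_x` and `R_v`, and the helper `mean_square_error` are all assumptions chosen for illustration; everything is taken to be zero-mean, with $v$ uncorrelated with $x$.

```python
import numpy as np

# Hypothetical zero-mean model (an assumption for illustration): y = H x + v.
H = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])
R_x = np.array([[2.0, 0.5],
                [0.5, 1.0]])        # E[x x^T]
R_v = 0.1 * np.eye(3)               # E[v v^T], v uncorrelated with x

# Second-order statistics implied by the model.
R_y = H @ R_x @ H.T + R_v           # E[y y^T]
R_xy = R_x @ H.T                    # E[x y^T]

# Solve K R_y = R_xy. Since R_y is symmetric, this is R_y K^T = R_xy^T.
K_opt = np.linalg.solve(R_y, R_xy.T).T

def mean_square_error(K):
    """E||x - K y||^2 = trace(E[(x - K y)(x - K y)^T])."""
    P = R_x - K @ R_xy.T - R_xy @ K.T + K @ R_y @ K.T
    return float(np.trace(P))

# Any other gain should give a larger mean-square error.
K_perturbed = K_opt + 0.01 * np.ones_like(K_opt)

print("K_opt =\n", K_opt)
print("MSE at K_opt:      ", mean_square_error(K_opt))
print("MSE at K_perturbed:", mean_square_error(K_perturbed))
```

Note that the estimator $\hat{x} = Ky$ needs only the second-order statistics $R_y$ and $R_{xy}$, not the full distributions of $x$ and $y$.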