
Preliminaries of KF – Deterministic and stochastic least (mean) squares


Least squares is an old, classic problem, and it is essential background for understanding the Kalman filter. In this post, we revisit:

  1. deterministic least-squares
  2. stochastic least-mean-squares

1. Deterministic least-squares

Problem statement: Let x\in \mathbb{R}^{n} be an unknown vector, y\in \mathbb{R}^m, A\in \mathbb{R}^{m\times n}. Find \hat{x} to minimize

J(\hat{x})=\|y-A\hat{x}\|^2=(y-A\hat{x})^T(y-A\hat{x}).

Solution: I will not show the derivation in detail; see ‘Linear estimation’ for a good discussion of it. Here I only want to highlight some points.

  1. The optimal estimate \hat{x} is a solution of the normal equation A^TA\hat{x}=A^Ty, whether or not A has full rank.
  2. Least squares is usually applied to an inconsistent overdetermined system Ax\cong y with m>n. If, in addition, A has full column rank, then A^TA is invertible and the solution is unique: \hat{x}=(A^TA)^{-1}A^Ty.
  3. More importantly, keep in mind that the problem is not restricted to inconsistent overdetermined systems. The matrix A can be arbitrary. Whatever A is (full rank or rank deficient), the minimizers are exactly the solutions of the normal equation, which always has at least one solution because A^Ty lies in the range of A^TA. The only difference when A does not have full column rank is that there are infinitely many minimizers \hat{x}. All of these minimizers give the same minimum J. Among them, exactly one has minimum norm: \hat{x}=(A^TA)^{+}A^Ty, where ^{+} denotes the pseudoinverse.
  4. The normal equation can be derived in two ways: set the derivative of J with respect to \hat{x} to zero, or use completion of squares.
  5. Why A^TA\hat{x}=A^Ty is called the normal equation: it can be rewritten as A^T(A\hat{x}-y)=0, which means the residual A\hat{x}-y is orthogonal (normal) to the range space of A.
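
The points above can be checked numerically. The following is a minimal NumPy sketch, not taken from ‘Linear estimation’: the random matrices, the seed, the explicit rcond tolerance, and the names x_normal, x_min, x_other and cost are all assumptions made for illustration. It solves the normal equation in the full column rank case, then builds a rank-deficient matrix and compares the minimum-norm minimizer with another minimizer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Full column rank case: 6 equations, 3 unknowns, generically inconsistent.
A = rng.standard_normal((6, 3))
y = rng.standard_normal(6)

# Solve the normal equation A^T A x = A^T y directly.
x_normal = np.linalg.solve(A.T @ A, A.T @ y)
# Reference solution from NumPy's least-squares routine.
x_lstsq = np.linalg.lstsq(A, y, rcond=None)[0]

# The residual A x - y is orthogonal to the range (column space) of A.
orth_full = A.T @ (A @ x_normal - y)

# Rank-deficient case: force the third column to be the sum of the first two.
B = A.copy()
B[:, 2] = B[:, 0] + B[:, 1]
rank_B = np.linalg.matrix_rank(B)

# Minimum-norm minimizer x = (B^T B)^+ B^T y.
# The explicit rcond is an assumption to make the rank cut-off robust.
x_min = np.linalg.pinv(B.T @ B, rcond=1e-10) @ B.T @ y

# (1, 1, -1) is in the null space of B, so adding any multiple of it
# yields another minimizer with the same cost but a larger norm.
null_vec = np.array([1.0, 1.0, -1.0])
x_other = x_min + 2.0 * null_vec


def cost(M, x, rhs):
    """Least-squares cost J = ||rhs - M x||^2."""
    return float(np.sum((rhs - M @ x) ** 2))


print("cost at x_min:  ", cost(B, x_min, y))
print("cost at x_other:", cost(B, x_other, y))
print("norms:", np.linalg.norm(x_min), np.linalg.norm(x_other))
```

Both minimizers of the rank-deficient problem satisfy the normal equation and give the same J; only their norms differ.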

2. Stochastic least-mean-squares

Problem statement: Let x\in \mathbb{R}^{n} be a random vector and let y\in \mathbb{R}^m be a random vector of measurements. Find a linear estimator \hat{x}=Ky, with K\in \mathbb{R}^{n\times m}, to minimize

E\|x-Ky\|^2.

Solution: See ‘Linear estimation’, p. 80, for details. Here I want to highlight some points:

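  1. Despite the similar names, stochastic least-mean-squares is a different problem from deterministic least-squares: here x is random, the cost is an expectation, and the unknown is the gain K rather than x itself.
  2. The optimal gain K^* is a solution of the normal equation K^*R_y=R_{xy}, where R_y=E[yy^T] and R_{xy}=E[xy^T]. These are the covariance matrices when x and y are zero mean. If R_y is invertible, K^*=R_{xy}R_y^{-1}.
  3. When we say we want to find \hat{x}=Ky, we are looking for an estimator, that is, a function of the measurements y. This particular estimator is linear because it is a linear function of y. In general, of course, an estimator can be a nonlinear function of y.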
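
To make the stochastic normal equation concrete, here is a small NumPy sketch. It assumes a hypothetical measurement model y=Hx+v with zero-mean x and v that are uncorrelated with each other. This model, the matrices H, R_x, R_v, and the helper mse are illustrative assumptions and do not come from the text above. Under that model R_y=HR_xH^T+R_v and R_{xy}=R_xH^T, so the gain can be computed and compared with a perturbed gain.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 2, 3

# Hypothetical model (an assumption): y = H x + v, with zero-mean x and v
# that are uncorrelated with each other.
H = rng.standard_normal((m, n))
R_x = np.array([[2.0, 0.5],
                [0.5, 1.0]])   # covariance of x
R_v = 0.1 * np.eye(m)          # covariance of the noise v

R_y = H @ R_x @ H.T + R_v      # E[y y^T]
R_xy = R_x @ H.T               # E[x y^T]

# Solve K R_y = R_xy. Since R_y is symmetric, transposing gives
# R_y K^T = R_xy^T, which np.linalg.solve handles directly.
K_opt = np.linalg.solve(R_y, R_xy.T).T


def mse(K):
    """E||x - K y||^2, written in terms of the second-moment matrices."""
    P = R_x - K @ R_xy.T - R_xy @ K.T + K @ R_y @ K.T
    return float(np.trace(P))


# Any other gain gives a larger mean-square error.
K_perturbed = K_opt + 0.05 * rng.standard_normal((n, m))

print("mse at K_opt:      ", mse(K_opt))
print("mse at K_perturbed:", mse(K_perturbed))
print("mse with K = 0:    ", mse(np.zeros((n, m))))
```

The error is evaluated in closed form from the second moments rather than by sampling, so the comparison does not depend on Monte Carlo noise.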