Kalman Filters (Tracking)
- It is a very popular technique to estimate the state of a system. Kalman filters estimate continuous states; as a result, a Kalman filter gives us a uni-modal distribution.
- Monte Carlo Localization works over a discrete state space, so it gives us a multi-modal distribution.
Gaussian Function: a continuous function over a space of locations, where the area underneath integrates to 1. A 1D Gaussian is characterized by its mean and variance. It has an exponential drop on both sides (symmetrical) and a single peak (called uni-modal). A bimodal function has two peaks, so it is not a Gaussian.
- If we evaluate the Gaussian at its mean (x = μ), we get the maximum, which is the peak of the Gaussian.
- The task in Kalman filtering is to obtain the μ and σ² that best estimate the location of the object we are trying to track. The variance σ² is a measure of uncertainty: the larger the variance, the more uncertain we are about the actual state.
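As a sketch, the 1D Gaussian described above can be written as a small function (the name `gaussian` and the test values are my own):

```python
from math import exp, pi, sqrt

def gaussian(x, mu, sigma2):
    # Value of a 1D Gaussian with mean mu and variance sigma2 at x.
    return exp(-0.5 * (x - mu) ** 2 / sigma2) / sqrt(2.0 * pi * sigma2)

# The peak is at x = mu; any other x gives a smaller value,
# and the drop-off is symmetrical around the mean.
print(gaussian(10.0, 10.0, 4.0))  # the maximum
print(gaussian(8.0, 10.0, 4.0))   # smaller, equals gaussian(12.0, 10.0, 4.0)
```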
Please check Lesson 1 Localization in AI for Robotics course to better understand the following parts
Let's start with the Measurement Update!
- The more certain we are (less variance), the more we pull the mean in the direction of the certain answer.
- More measurements means greater certainty. 🙂
- The new mean is a weighted sum of the old means, where each mean is weighted by the other Gaussian's variance, normalized by the sum of the variances. The new variance is unaffected by the means of the previous Gaussians.
- Notice that the new mean is between the previous two means and the new variance is LESS than either of the previous variances.
Since the Gaussians have the same width (which means the same certainty), their product will be a Gaussian with a mean that is right in the middle.
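A minimal sketch of the measurement update described above (the function name is mine): multiplying two Gaussians gives a mean weighted by the variances and a variance smaller than either input.

```python
def measurement_update(mu1, var1, mu2, var2):
    # Multiply two 1D Gaussians (prior x measurement) and return
    # the mean and variance of the resulting Gaussian.
    new_mu = (var2 * mu1 + var1 * mu2) / (var1 + var2)
    new_var = 1.0 / (1.0 / var1 + 1.0 / var2)
    return new_mu, new_var

# Equal variances: the new mean lands exactly in the middle,
# and the new variance is smaller than either input.
print(measurement_update(10.0, 4.0, 12.0, 4.0))  # (11.0, 2.0)
```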
Step 2: Motion Update (Predictions)
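In 1D, this prediction step can be sketched in two lines, assuming the motion itself is Gaussian with variance `motion_var` (names are mine): the means add, and since motion introduces noise, the variances add too — the opposite of the measurement update, where certainty grows.

```python
def predict(mu, var, motion, motion_var):
    # Shift the mean by the commanded motion; uncertainty grows.
    return mu + motion, var + motion_var

print(predict(11.0, 2.0, 1.0, 4.0))  # (12.0, 6.0)
```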
- When the Gaussian is tilted, this means that the uncertainties of x and y are correlated.
- The covariance defines the spread of the Gaussian as indicated by the contour lines. You may have small uncertainty in one dimension and large uncertainty in the other dimension (like the Gaussian on the right of the previous picture).
- In higher dimensions (multivariate Gaussians), the mean is replaced by a vector and the variance is replaced by a covariance matrix.
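As an illustration of the covariance idea (all numbers here are hypothetical), a positive off-diagonal term in the covariance matrix is exactly what makes the contour lines "tilted", i.e. makes the two dimensions correlated:

```python
import numpy as np

# Hypothetical 2D state (x, x_dot): mean vector and covariance matrix.
mu = np.array([5.0, 1.0])
# Positive off-diagonal term: the x and x_dot uncertainties are
# correlated, which shows up as tilted Gaussian contours.
P = np.array([[4.0, 1.8],
              [1.8, 1.0]])

rng = np.random.default_rng(0)
samples = rng.multivariate_normal(mu, P, size=10_000)

# Sample correlation should be close to 1.8 / sqrt(4 * 1) = 0.9.
print(np.corrcoef(samples[:, 0], samples[:, 1])[0, 1])
```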
- Since we only know the position from a camera image, we are certain about x but extremely uncertain about x dot (velocity) → this gives us the blue Gaussian above.
- All the possibilities on the blue Gaussian propagate forward to form the red Gaussian. This is a very interesting 2D Gaussian. It is clear that if we project the red Gaussian onto the space of possible locations, you can NOT predict a thing 😂 — because you don't know the velocity. The same holds if you project onto the space of possible velocities: you can NOT predict anything. A single observation is insufficient to estimate both.
- However, what we do know from it is that our location is correlated with the velocity. This tilted Gaussian tells us a lot about the relationship between these two quantities (x, x dot).
- To feel how powerful this is: let's take the 2nd observation (@ t=2). This observation tells us nothing about the velocity, only something about the location → we draw the green Gaussian. Now, multiply the prior (red, from the prediction step) with the measurement probability (green). You get the black Gaussian 😮, which has a really good estimate of the velocity and a really good estimate of where I am. If you take this Gaussian and predict one step forward, you find yourself at t=3. This is a deep insight into how the Kalman filter works: you were only able to observe one variable, yet from multiple observations we were able to infer the other variable. This was possible because there is a set of physical equations which say that my location after a time step is my old location plus my velocity.
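The whole story above can be sketched as a tiny 2D Kalman filter with state (x, x dot), fed position-only measurements 1, 2, 3 at t = 1, 2, 3 (the matrix values are illustrative assumptions, not from the course): the filter ends up with a good estimate of a velocity it never measured directly.

```python
import numpy as np

dt = 1.0
F = np.array([[1.0, dt],      # x' = x + x_dot * dt (the physical equation)
              [0.0, 1.0]])    # x_dot' = x_dot
H = np.array([[1.0, 0.0]])    # we only observe position
R = np.array([[1.0]])         # measurement noise (assumed)

x = np.array([[0.0],          # initial position guess
              [0.0]])         # velocity guess
P = np.diag([1000.0, 1000.0])  # huge initial uncertainty in both

# True motion: position 1, 2, 3 at t = 1, 2, 3, i.e. velocity = 1.
for z in [1.0, 2.0, 3.0]:
    # Measurement update: multiply prior by measurement probability.
    y = np.array([[z]]) - H @ x
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ y
    P = (np.eye(2) - K @ H) @ P
    # Prediction: push the state one time step forward.
    x = F @ x
    P = F @ P @ F.T

print(x[1, 0])  # estimated velocity, close to the true 1.0
```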
- A radar gives a distance estimate and a noisy velocity estimate, while a lidar gives a distance estimate. Combining the data from both sensors with a Kalman filter, we can get an accurate estimation of the environment around us. 😉