I recently learned about the Kalman filter and finally got to play around with it a little bit. Since I had a hard time figuring out how to get it to work, here’s a practical (yet general) introduction with examples:
A Kalman filter works by matching a simulation model and measured data. For each data point, an estimate of the simulation model’s internal state is computed based on the estimate of the previous state. This works with noisy data and limited measurement signals (e.g. a model with 10 state variables but only 2 measurement signals), although there are obvious limitations here: the more and the better the sensor data, the better the results should be, and observability sets a limit as well.
So we have this interesting tool which does all these different things:
- filter noisy data, while taking knowledge (or assumptions) on the underlying dynamics into account
- merge data from several different sensors into one signal (typical application: combine GPS and accelerometer data into one accurate position signal)
- offer a prediction of a system’s future state
- estimate internal parameters of a system (say a spring stiffness based on measured oscillations)
Another interesting use is that we might try two different simulation models on the same measurement and check which one does a better job at synchronizing to the measurement (I’ll do this in a very simple example below).
While I had a tough time figuring this out, the main concept of a Kalman filter is rather simple. You provide the filter with your system’s behavior (in the form of a transition matrix F) and the information on how your measurement relates to the system’s internal state (in the form of a matrix H). Then throw in some information on how noisy your measurement is (the measurement noise covariance R, a matrix in general and a single variance if there is only one measurement signal) and how much you trust your model to calculate accurate results (the process noise covariance matrix Q). With these, each filter step consists of two parts:
- Prediction: calculate the next state x_predicted and the covariance (read: uncertainty) P_predicted based on the previous estimate of the system’s state.
- Update: Match the current measurement value with the prediction and correct the internal state based on the results.
In Matlab / GNU Octave code, this looks like this:
The variables y, S and K are only used to simplify the equations; they are also used in the Wikipedia article. The “official formulas” also contain a term B * u to account for external inputs (say, the thrust from the rotors if you model a quadrocopter), which I’m leaving out here to keep things simple.
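For reference, the two steps written out in the Wikipedia notation, without the B * u term. The symbol z for the current measurement is borrowed from that article, and x and P on the right-hand side of the first two lines are the previous estimate and its covariance:

```latex
\begin{aligned}
x_{\text{predicted}} &= F\,x \\
P_{\text{predicted}} &= F\,P\,F^{T} + Q \\
y &= z - H\,x_{\text{predicted}} \\
S &= H\,P_{\text{predicted}}\,H^{T} + R \\
K &= P_{\text{predicted}}\,H^{T}\,S^{-1} \\
x &= x_{\text{predicted}} + K\,y \\
P &= (I - K\,H)\,P_{\text{predicted}}
\end{aligned}
```

The first two lines are the prediction, the remaining five the update.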
Very simple example model
So to get started, I used a very simple signal: a sine sweep with added noise. The simplest model we can use here is a constant model, which simplifies our system to:
- x as a scalar variable (current state)
- F = 1 (x_predicted = 1 * x_previous)
- H = 1 (measured signal = x + noise)
Still we are left without values for R and Q, so let’s just use some guesses:
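A minimal sketch of this constant model, written in Python rather than Octave. The numbers for R, Q and the starting values are placeholders picked for illustration, not necessarily the guesses behind the plots, and the test signal is a made-up sine sweep:

```python
import math
import random


def kalman_constant(measurements, R=0.5, Q=0.01, x0=0.0, P0=1.0):
    """Scalar Kalman filter with F = 1 and H = 1 (constant model).

    R, Q, x0 and P0 are illustrative guesses, not tuned values.
    Returns the list of state estimates and the list of variances P.
    """
    x, P = x0, P0
    xs, Ps = [], []
    for z in measurements:
        # Prediction: the state stays the same, the uncertainty grows by Q.
        x_pred = x
        P_pred = P + Q
        # Update: blend prediction and measurement via the Kalman gain K.
        y = z - x_pred          # measurement minus prediction
        S = P_pred + R          # variance of y
        K = P_pred / S          # gain, between 0 and 1
        x = x_pred + K * y
        P = (1.0 - K) * P_pred
        xs.append(x)
        Ps.append(P)
    return xs, Ps


# Hypothetical test signal: a sine sweep with Gaussian noise added.
random.seed(0)
t = [k * 0.01 for k in range(1000)]
clean = [math.sin(2.0 * math.pi * (0.2 + 0.1 * tk) * tk) for tk in t]
noisy = [c + random.gauss(0.0, 0.5) for c in clean]
estimate, variance = kalman_constant(noisy)
```

A larger R (or smaller Q) makes the gain K smaller, so the estimate follows the measurement more slowly and looks smoother. The opposite choice makes it track the noisy signal more closely.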
The estimate curve in red also includes x + P and x – P to give an idea of the model’s internal uncertainty. To get an idea of how this depends on the R and Q values, see the following comparison:
As I already said, the Kalman filter allows us to try different models on the same measurement and see how they perform. So instead of the constant model, we might also include an integrating part in our model:
signal(k) = signal(k-1) + integrating_value
Since the Kalman filter should determine the previously unknown integrating value, let's place it in the x vector:
- x = (signal_value; integrating_value)
- F = [1 1; 0 1] (signal = signal + integrating_value, integrating value remains constant)
- H = [1 0] (measured_signal = H * x + noise)
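As a rough Python sketch of this integrating model, the 2x2 matrix products can be written out by hand, which avoids a matrix library. The values for R, the two diagonal entries of Q (called q_s and q_v here) and the initial covariance are assumptions for illustration only:

```python
def kalman_integrating(measurements, R=0.5, q_s=0.01, q_v=0.01):
    """Kalman filter for x = (signal_value; integrating_value).

    F = [1 1; 0 1] and H = [1 0] are written out by hand. R, q_s, q_v and
    the initial values are placeholder guesses.
    Returns a list of (signal_value, integrating_value) estimates.
    """
    s, v = measurements[0], 0.0               # start at the first measurement
    p00, p01, p10, p11 = 1.0, 0.0, 0.0, 1.0   # initial covariance P
    estimates = []
    for z in measurements:
        # Prediction: x_predicted = F * x, P_predicted = F * P * F' + Q
        s = s + v
        p00, p01, p10, p11 = (
            p00 + p01 + p10 + p11 + q_s,
            p01 + p11,
            p10 + p11,
            p11 + q_v,
        )
        # Update: only the signal value is measured (H = [1 0])
        y = z - s
        S = p00 + R
        k0, k1 = p00 / S, p10 / S             # Kalman gain K = P * H' / S
        s = s + k0 * y
        v = v + k1 * y
        # P = (I - K * H) * P_predicted
        p00, p01, p10, p11 = (
            (1.0 - k0) * p00,
            (1.0 - k0) * p01,
            p10 - k1 * p00,
            p11 - k1 * p01,
        )
        estimates.append((s, v))
    return estimates
```

Fed with a clean ramp, the second state variable settles at the slope of the ramp, which is the "previously unknown integrating value" the filter is supposed to find.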
This leads to results like this one (this time plotted without the covariance):
It’s interesting to see that the estimate now includes some kind of “overshoot” behavior due to the integrating part. Whether this is desired depends on the target, but it’s really interesting to see how easily I could tune the estimation behavior with the Kalman filter.
Application example: averaging polling results
At this point, I decided to grab some real data and put my Kalman filters to use on a set of polls from the US 2016 election. As raw data, I used all national polls on Clinton vs. Trump from the fivethirtyeight.com 2016 election forecast (you can download them at the bottom of this page). So let’s see how both the constant and the integrating model perform with the poll data:
The thick line is the constant model tuned to a more conservative behavior and the thin line is the integrating model with a more aggressive behavior. Since the polls are not available on an equidistant time scale, I also had to modify my Kalman filter sequence: either performing several update steps without a prediction in between (several polls for the same time step) or several prediction steps before one update (time steps without polls).
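One way to sketch this modified sequence in Python, using the scalar constant model: run one prediction per calendar day and one update per poll. The (day, value) input format with integer day numbers and the values of R, Q and P0 are assumptions for this example, not the layout of the fivethirtyeight file:

```python
from collections import defaultdict


def filter_irregular(polls, R=4.0, Q=0.05, P0=10.0):
    """polls: list of (day, value) pairs with integer days, in any order.

    Runs one prediction per calendar day and one update per poll, so a day
    with three polls gets three updates and a day without polls gets none.
    Returns a dict mapping each day to (estimate, variance).
    """
    by_day = defaultdict(list)
    for day, value in polls:
        by_day[day].append(value)

    first, last = min(by_day), max(by_day)
    x = by_day[first][0]    # start from the first poll value
    P = P0
    result = {}
    for day in range(first, last + 1):
        # Prediction (constant model): state unchanged, uncertainty grows.
        P = P + Q
        # Zero, one or several updates, depending on the polls of that day.
        for z in by_day.get(day, []):
            K = P / (P + R)
            x = x + K * (z - x)
            P = (1.0 - K) * P
        result[day] = (x, P)
    return result
```

On days without polls the estimate stays put while its variance grows by Q, so the next poll after a long gap automatically gets more weight.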
Let’s also zoom into the second part of 2016, where more data is available:
It’s interesting to note how similarly both models behave, despite the very different tuning parameters. This is probably due to the large number of polls available: for the early months with fewer polls, the difference between both models is a lot larger. Also, I’ve added 10 days of prediction for the end state of each model to illustrate the prediction behavior of the Kalman filter (the dotted line at the end).
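Such a prediction amounts to running the prediction step repeatedly without any update. A minimal Python sketch for the integrating model follows; the function name, the nested-list layout of P and the Q entries q_s and q_v are choices made for this example:

```python
def forecast(s, v, P, q_s=0.0, q_v=0.0, steps=10):
    """Run prediction-only steps of the integrating model F = [1 1; 0 1].

    s, v: last estimated signal value and integrating value
    P:    last 2x2 covariance as nested lists [[p00, p01], [p10, p11]]
    Returns a list of (predicted_signal, predicted_signal_variance).
    """
    (p00, p01), (p10, p11) = P
    out = []
    for _ in range(steps):
        s = s + v                       # x_predicted = F * x
        # P_predicted = F * P * F' + Q, written out for the 2x2 case
        p00, p01, p10, p11 = (
            p00 + p01 + p10 + p11 + q_s,
            p01 + p11,
            p10 + p11,
            p11 + q_v,
        )
        out.append((s, p00))
    return out
```

The constant model is the special case v = 0 with only p00 and q_s nonzero: the forecast is then a flat line whose variance grows by q_s per step. With the integrating model the forecast continues along the last estimated slope.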
There are also some obvious downsides here to my approach:
- Currently there is no weight applied to the polls – each counts the same, no matter how good the pollster is or how many people were asked (actually I’ve just thrown away the pollster ratings and adjusted poll values from the fivethirtyeight dataset). This could probably be modified by varying the R or Q values depending on the data quality.
- The resulting graph turned out to be a lot less smooth than I expected, because the model is updated at each data point instead of using a weighted approach (say, averaging all polls for each week). This adds visual clutter to the results, which was not really intended here.
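For the first point, one possible way to vary R per poll is to derive it from the sample size, assuming each poll comes with one. The Python sketch below uses the textbook sampling variance of a proportion, p * (1 - p) / n. It is not what I did for the plots above, and the function names are made up for this example:

```python
def poll_variance(share_percent, sample_size):
    """Sampling variance of a poll result, in percentage points squared.

    Uses the binomial approximation p * (1 - p) / n and ignores pollster
    quality and any other error source; those would need extra terms.
    """
    p = share_percent / 100.0
    return 1e4 * p * (1.0 - p) / sample_size


def update_weighted(x, P, share_percent, sample_size):
    """One scalar update step (H = 1) with a per-poll R."""
    R = poll_variance(share_percent, sample_size)
    K = P / (P + R)
    return x + K * (share_percent - x), (1.0 - K) * P
```

A poll with 2000 respondents then gets a smaller R than one with 500 and pulls the estimate more strongly towards its value. A pollster rating could be folded in as an additional factor on R, but how to scale it would be a separate tuning question.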
After I did all this, I also tried to google for polls kalman filter and it turned out that I’m basically just doing what most official poll tools also do – see here, here and here for examples in “official” poll models and here (I’ve recently started reading one of Gelman’s books – he’s amazing! Also, read the comments), here and here for more technical blog posts with a lot more information.
The Kalman filter turns out to be really interesting. With regard to multibody dynamics, I’d like to do some applications focused on parameter estimation and model comparisons (is there a general way to evaluate model quality based on the amount of correction the filter performs? I’m not sure yet). Overall, this fits in the general topic of combining measurements and simulation models more tightly.
The code I’ve used can be found here. Please keep in mind that this was just built to get a general idea of the concept – if you do anything serious with it, don’t blame me if it goes wrong.