This is a review I wrote for the SlateStarCodex / AstralCodexTen book review contest. For whatever reason, it never arrived in the pile of contest entries, so I’m posting it here instead. That’s fine by me: I mostly wrote it for myself and really did enjoy the process. Nonetheless, I’d like others to read it and get some feedback on it. A few more announcements:
The essay was originally intended for readers in some proximity to the rationalist community (which explains a few references I make along the way)
I’m not a native English speaker, so feedback on wording and writing style is highly appreciated
I made some very minor edits (fixed wordings and links in a few places) in the first 24h after posting (including this line here)
I stumbled upon the Google mobility dataset and decided to plot it against my case estimates and covid reproduction numbers. It looks as if the mobility data can serve as a very crude proxy for comparing lockdown compliance and effectiveness. Oh, and we can go to parks as much as we want.
Update Jan. 26th: I have now also added plots for NPIs (non-pharmaceutical interventions) as described in the ACAPS dataset. I picked only a few (potentially interesting) ones, and even this makes the plots really confusing, so I don’t know what to make of it yet. Also, I now generally smooth mobility data over two weeks and reproduction rates R over 10 days.
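For reference, the smoothing itself is nothing fancy: just a trailing moving average over the window. A minimal sketch (the data values below are illustrative, not real mobility numbers):

```python
def rolling_mean(values, window):
    """Trailing moving average: each output is the mean of the last
    `window` values; None until enough data has arrived."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window : i + 1]) / window)
    return out

# Illustrative daily mobility changes (percent vs. baseline),
# smoothed over a 14-day window as in the plots:
daily = [0, -5, -10, -20, -30, -35, -40, -42, -41, -40, -38, -39, -40, -41, -42]
smoothed = rolling_mean(daily, 14)
```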
In my covid infection models, Russia is a country where case and death data appears to match reasonably well. So I was surprised by news accounts that Russia might have 3x higher death counts than previously reported (sources: 1, 2). More generally, the question is: can we trust official death numbers or is each country counting covid-related deaths in a different way?
Our world in data has a dataset on excess mortality and while Russia is not included, I was able to produce this graph (featuring all the countries included):
So long story short, yes, the official covid deaths mostly appear to be “people that actually died because of covid-19”.
Instead, I’m interested in estimating actual infection counts: we test only a small part of each country’s population, and depending on how much we test, we miss a large share of all infections. This matters for many questions, like comparing countries or estimating cases when testing changes. So I implemented two models for estimating actual infection numbers. Both have limitations, but when their results match, I’m reasonably confident that we’re seeing the right picture.
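To give a flavor of the general idea (this is a generic textbook-style sketch, not one of my two models): infections can be roughly back-estimated from reported deaths using an assumed infection fatality rate (IFR) and an infection-to-death delay. Both parameter values below are placeholders:

```python
def infections_from_deaths(daily_deaths, ifr=0.005, delay_days=21):
    """Back-estimate daily infections from daily deaths, assuming that
    deaths on day t stem from infections on day t - delay_days and that
    a fraction `ifr` of all infections is fatal. Both parameter values
    are illustrative placeholders, not fitted numbers."""
    estimates = {}
    for day, deaths in enumerate(daily_deaths):
        infection_day = day - delay_days
        if infection_day >= 0:
            estimates[infection_day] = deaths / ifr
    return estimates

# 5 deaths on day 21 imply ~1000 infections back on day 0 (with IFR = 0.5 %)
print(infections_from_deaths([0] * 21 + [5]))
```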
Please note that this is a statistics- and data-driven picture. I’m running a lot of numbers because it helps me become less confused, but if you’ve lost someone in events related to the pandemic or are directly affected in other ways (e.g. economically), this is probably not helpful.
I’ll first show my plots for Germany and the United States, then describe how the models work and finally provide a gallery with plots for ~20 different countries. Plots will be updated once per week (possibly less often over summer 2021 unless something very surprising happens).
This is sooo fascinating. I mainly wrote this down for myself after reading many, many Wikipedia pages and added many, many links. Please assume that I got a few things wrong and check the links for details.
100 years ago, humanity did not know if there was anything beyond our galaxy. In telescopes, you could see small muddy blobs (nebulae), and various folks, including Immanuel Kant, had long speculated that those might well be separate galaxies, but you couldn’t be sure. And there were good reasons to doubt it – see the absolutely fascinating 1920 Great Debate for pro and con arguments.
It appears our universe is very old, but our knowledge on it is rather young. And both are expanding.
Most of what I’m doing is managing complexity – not in the business sense but in the actual problem sense.
Engineering involves problems like “why are these parts breaking?” (or, considering the future, “will they break, and what can we do to prevent it?”). Our job as engineers is to try to break these problems down into causal effect chains and lists of options.
It appears to me that most engineers, developers and basically every other profession on earth are doing just the same. This made me look at other disciplines to see what works for which kind of problem.
Let’s define simulation as a prediction of real-world events (think of something you cannot or don’t want to measure or something you’d like to explore before actually building it). So why bother when there’s already reality to look at?
Here are a few reasons I can come up with:
We want something that feels like reality (special effects in a movie or a physics engine in a video game)
We want to test a physical system before actually building it (e.g. design simulations in engineering)
There’s something we’d like to know but cannot easily measure (think of the inside of reactors and high-power turbines – or complex systems like sociology or medicine, where we try to infer hidden variables to estimate treatment effects)
We’d like to understand something about reality and do so by building a model of it.
The last point above is a little obscure, so I’d like to elaborate. This is actually my favorite reason for spending free time on simulation (in contrast to paid time at work, which is all about the first three reasons): I’d like to learn something about reality!
I remember that as a first-year student, I had a hard time figuring out what exactly torque is. My first year of mechanics was all about static calculations, and I feel torque is so much easier to explain when you consider a dynamic system.
So let’s try this: Consider a body floating in free space with rocket boosters attached to it.
Force: If a rocket fires, it exerts a force on the body. This makes the body move (acceleration = force / mass).
The yellow rocket fires; this makes the body move to the left
Torque: If two rockets fire in opposite directions, the forces cancel out. Depending on their positions relative to each other, however, a torque is created. This makes the body rotate (rotational acceleration = torque / inertia).
Orange and red rockets fire (on the already moving body). Forces cancel out, but the body starts to rotate.
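The two cases above can be reproduced with a few lines of explicit-Euler integration. This is a toy sketch (mass, inertia, forces and rocket offsets are made up), but it shows both relations at work:

```python
def simulate(forces, mass=2.0, inertia=0.5, dt=0.01, steps=100):
    """Explicit-Euler integration of a planar rigid body.
    Each force is (fx, fy, offset_x, offset_y), with offsets measured
    from the center of mass. All parameter values are made up."""
    vx = vy = omega = 0.0                 # linear and angular velocity
    for _ in range(steps):
        fx = sum(f[0] for f in forces)
        fy = sum(f[1] for f in forces)
        # torque of a planar force about the center: tau = x * fy - y * fx
        tau = sum(ox * fy_ - oy * fx_ for fx_, fy_, ox, oy in forces)
        vx += fx / mass * dt              # acceleration = force / mass
        vy += fy / mass * dt
        omega += tau / inertia * dt       # rot. acceleration = torque / inertia
    return vx, vy, omega

# One rocket firing at the center: net force, no torque -> the body translates
translating = simulate([(-1.0, 0.0, 0.0, 0.0)])
# Opposite rockets at offset positions: forces cancel, torque remains -> it spins
spinning = simulate([(0.0, 1.0, 0.5, 0.0), (0.0, -1.0, -0.5, 0.0)])
```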
I’m a huuuuge cola enthusiast, with an appetite for non-mainstream versions (most major brands just taste boring to me). So I had wanted to make OpenCola for a long time and finally got around to it.
It took me several attempts to get the recipe right, but once I did it turned out both very delicious and very easy to make. I ended up modifying the original opencola recipe for my needs and trying several versions, so here’s my summary of what did and didn’t work.
Processing is a programming language / IDE built on top of Java that’s intended for simple and visual programming, making it a great tool for WYSIWYS (what you see is what you simulate; not a real term, I just made it up).
I’m not only a simulation nerd, I’m also a visualization nerd. My interest in formatting, layout and displays has proven to be extremely helpful in my daily work, where finding the right visual is often key to analyzing and communicating large measurement and simulation datasets. Sometimes, I also find the time to participate in fun events like the recent storytelling with data visualization challenge – which also is a good excuse to write this post on plots and visualization techniques.
So here are some simple tips to get better result plots and graphs. Most of my advice is focused on visuals for simulation results, especially in the context of large datasets and use-cases where you have to plot results frequently (like multibody simulations). This often boils down to getting the workflow right – the most beautiful visuals won’t help when creating them takes more time than what’s available.
I’ve used a lot of different configurations of Arduino-related gear as datalogging utilities. So here’s a comprehensive guide on what’s possible, how to set up stuff and on what you can expect regarding accuracy, battery life and logging speed.
(However, please keep in mind that I’m neither particularly skilled with electronics nor with programming, and others know a lot more about this than I do. I’m particularly grateful that Ed Mallon provided a link to this paper he coauthored with Patricia Beddows in the comments – the work and knowledge they put into it is just amazing.)
My Arduino Uno with the datalogger shield and both temperature and brightness sensors connected. I used a normal smartphone charger to power it for more than 3 days and placed it in this fireproof baking tray, since I felt somewhat unsure about leaving it running unattended.
I recently learned about the Kalman filter and finally got to play around with it a little bit. Since I had a hard time figuring out how to get it to work, here’s a practical (but yet general) introduction with examples:
A Kalman filter works by matching a simulation model to measured data. For each data point, an estimate of the simulation model’s internal state is computed based on the estimate of the previous state. This works with noisy data and limited measurement signals (e.g. a model with 10 state variables but only 2 measurement signals), although there are obvious limitations: the more and the better the sensor data, the better the results should be, and there is also a limit set by observability.
So we have this interesting tool which does all these different things:
filter noisy data while taking knowledge (or assumptions) about the underlying dynamics into account
merge data from several different sensors into one signal (typical application: combine GPS and acceleration sensor data into one accurate position signal)
offer a prediction of a system’s future state
estimate internal parameters of a system (say a spring stiffness based on measured oscillations)
Another interesting use: we might try two different simulation models on the same measurement and check which one does a better job at synchronizing to it (I’ll do this in a very simple example below).
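To make the predict/update cycle concrete, here is a deliberately minimal scalar example: a constant hidden value observed through noisy measurements. The noise variances are made-up illustration values, not tuned:

```python
def kalman_1d(measurements, process_var=1e-4, meas_var=0.5,
              x0=0.0, p0=1.0):
    """Scalar Kalman filter for a constant hidden value observed through
    noisy measurements. All noise/initial values are illustrative."""
    x, p = x0, p0                       # state estimate and its variance
    estimates = []
    for z in measurements:
        p = p + process_var             # predict: state unchanged, uncertainty grows
        k = p / (p + meas_var)          # update: Kalman gain
        x = x + k * (z - x)             # blend prediction and measurement
        p = (1 - k) * p
        estimates.append(x)
    return estimates

# Noisy readings around a true value of 5.0; the estimates move toward 5.0
est = kalman_1d([4.8, 5.3, 4.9, 5.1, 5.2, 4.7, 5.0, 5.1])
```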
I got a small display compatible with my Due for Christmas. And since I really wanted to see some Arduino-in-the-loop simulations, I decided to use it for exactly this: real-time multibody simulations on the Arduino, with the results displayed on the TFT.
It’s been more than a year since I published my post on numerical integration on an Arduino. Since then, the post has been quite popular, receiving a steady stream of visitors (mostly via Google). When I originally wrote it, I only had an Arduino Uno at hand – since then I’ve added a couple of Nanos and lately an Arduino Due to my inventory and decided it would be interesting to do a couple of speed tests to see how they perform. The latest addition to my growing circus of microcontroller boards is a Teensy 3.5 board. (Update May 2019: Added an ESP32 dev board)
As I pointed out in the original post, numerical integration relies heavily on floating-point math – which is something the Arduino’s 8-bit processor is not particularly good at. The Due features a 32-bit processor, a clock frequency of 84 instead of 16 MHz and the possibility to use double (64 bit) instead of float (32 bit) as a data type – so I was curious to see how it would compare to the Arduino Uno. The Nano is supposed to have more or less the same characteristics as an Uno, but is a lot smaller and cheaper – see below for details.
Now added to the comparison, the Teensy 3.5 features a 32-bit processor with a 120 MHz clock speed and an FPU for speedier floating-point math.