I’m not only a simulation nerd, I’m also a visualization nerd. My interest in formatting, layout and displays has proven to be extremely helpful in my daily work, where finding the right visual is often key to analyzing and communicating large measurement and simulation datasets. Sometimes, I also find the time to participate in fun events like the recent storytelling with data visualization challenge, which is also a good excuse to write this post on plots and visualization techniques.
So here are some simple tips to get better result plots and graphs. Most of my advice is focused on visuals for simulation results, especially in the context of large datasets and use-cases where you have to plot results frequently (like multibody simulations). This often boils down to getting the workflow right – the most beautiful visuals won’t help when creating them takes more time than what’s available.
Tip #1: Automate your plotting workflow
Multibody models are supposed to simulate reasonably fast, and working with them often involves many simulation runs – maybe because you’re tuning your model to fit a measurement or because you’re simulating a full variation matrix with many variants. Comparable situations apply to data analysis or processing tasks, where you often have many sets of comparable data (or want to rerun an analysis several times). There is a very simple trick to make this tremendously easier:
Whenever you produce results data (e.g. your simulation model is run, or you process measurement data), make your toolchain automatically save a plot to an image file.
- You process 20 new measurement sets? Make sure your analysis tool automatically saves 20 result plots to your project directory.
- You simulate a 50×20 variants parameter matrix? Make sure 1000 small plots are created alongside the raw simulation data.
- You spend 1 hour tuning your model to a measurement? There should be a trail of plot files in your folder, showing your progress.
Achieving this is actually very simple: Tools that feature interactive plots and a scripting language also provide means for automatically exporting plots to png or jpg files. Matlab, GNU Octave, R and Excel (via VBA) all have functions for this – you just have to add a small plotting script. That’s it.
This will allow you to cycle through your datasets just like an image deck: you can easily compare variants (place two images next to each other or overlay them in a graphics tool), and finding errors in your data or model becomes much easier. Do not worry about having many plot files: If your smartphone can hold thousands of photos from your last vacation, then your result folder can do the same (and you can cycle through the plots much faster since there is less variation)!
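To make this concrete, here is a minimal sketch in Python with matplotlib. That choice is my assumption for illustration; the tools named above have equivalent export functions. The function `run_simulation` is a hypothetical stand-in for your real model run, and the file naming scheme is just one possible convention.

```python
import math
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: render to files, no window needed
import matplotlib.pyplot as plt


def run_simulation(damping, n=200):
    """Hypothetical stand-in for a real model run: a damped oscillation."""
    t = [i * 0.05 for i in range(n)]
    y = [math.exp(-damping * ti) * math.cos(2 * math.pi * ti) for ti in t]
    return t, y


def save_result_plot(t, y, out_file):
    """Write one result plot to an image file and release the figure."""
    fig, ax = plt.subplots(figsize=(6, 3))
    ax.plot(t, y)
    fig.savefig(out_file, dpi=100)
    plt.close(fig)  # avoids piling up open figures when looping over many runs


def run_variants(dampings, out_dir):
    """Run every variant and leave one PNG per run in out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    files = []
    for i, damping in enumerate(dampings):
        t, y = run_simulation(damping)
        out_file = out_dir / f"run_{i:03d}_damping_{damping:.2f}.png"
        save_result_plot(t, y, out_file)
        files.append(out_file)
    return files


if __name__ == "__main__":
    saved = run_variants([0.1, 0.5, 1.0], tempfile.mkdtemp(prefix="plots_"))
    print(f"saved {len(saved)} plots to {saved[0].parent}")
```

The zero-padded run index keeps the files sorted in run order, so flipping through them in an image viewer follows the order of your experiments.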
Tip #2: Automatically include documentation info in your plots
When you follow tip #1, make sure you automatically include documentation in your plot. This is one of the most important rules I can communicate:
Documentation only happens when it doesn’t require any work at all. If metadata information is not added automatically, there will one day be a situation where you’re screwed because it’s missing.
- Use the plot title or subtitle to show the data set source or your simulation parameters.
- Add axis labels and a legend automatically. It doesn’t have to be perfect, but the image itself must be self-explanatory (i.e. you can mail it to someone without further explanation and there is enough context provided to interpret the image file).
Combining tips 1 and 2 will lead to an automatic documentation of your work:
- You need to document your model tuning? Just collect all plot files from your folder.
- You think there was a variant 20 iterations ago that was better than what you have now? Just check the plots, and if this was the case, all information to go back to it should be there in the title!
- You find a printout of results from three weeks ago, on a presentation you prepared in a hurry? It has axis labels, a legend and a data source.
- Your analysis covers many thousands of variants, and there may be errors in the data or your algorithm? Check a random selection of your result plots to see how well you are doing.
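A plotting helper along these lines could take care of the documentation part. Again, this is only a sketch under assumptions: Python with matplotlib is my choice for illustration, and the names `make_title` and `build_documented_figure` as well as the title format are hypothetical.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: render to files, no window needed
import matplotlib.pyplot as plt


def make_title(source, params):
    """Build a one-line title from the data source and the run parameters."""
    param_text = ", ".join(f"{key}={value}" for key, value in sorted(params.items()))
    return f"{source} | {param_text}"


def build_documented_figure(t, y, source, params, xlabel, ylabel):
    """Create a plot that always carries title, axis labels and a legend."""
    fig, ax = plt.subplots(figsize=(6, 3.5))
    ax.plot(t, y, label=source)
    ax.set_title(make_title(source, params), fontsize=9)
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    ax.legend(loc="best")
    fig.tight_layout()
    return fig, ax


if __name__ == "__main__":
    # Made-up example data and parameters, for illustration only.
    t = [i * 0.1 for i in range(50)]
    y = [ti ** 2 for ti in t]
    fig, ax = build_documented_figure(
        t, y, "run_A.csv", {"stiffness": 1200, "damping": 0.5},
        xlabel="time [s]", ylabel="position [m]",
    )
    fig.savefig(Path(tempfile.mkdtemp()) / "run_A.png", dpi=100)
    plt.close(fig)
```

Because labels and parameters are required arguments, there is no way to create a plot without them, which is the whole point: the documentation happens without extra work.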
Tip #3: Don’t worry about rare plot features. Add them manually afterwards.
Here’s the result plot I prepared for the storytelling with data visualization challenge:
I spent some time trying different captions, arrangements and layouts and finally decided to use a version that highlights five of the datapoints. So did I program my visualization tool to color them separately, and add the individual legend, captions etc.? This probably would have been possible… but very time-consuming! Instead, I imported my plots into a graphics tool (LibreOffice Draw in this case) and did the highlighting, annotating and layout tuning there. You should do the same:
If you want to add annotations and comments to your data or highlight certain parts, you can always do so afterwards.
- Use the annotation tool in your PDF viewer to easily add additional labels, comments and markers, or any graphics tool of your choice (including your presentation software).
- If your original labels are not sufficient (say, you want prettier axis labels or a complex legend), you can always crop your plot and add nicer ones. If you include many variants on one page, you probably don’t need the legend on each of them – but it’s way easier to remove it than to worry about what exactly it was originally (see tip #2).
- A great tool for annotations is… handwriting! Just print your plot on a sheet of paper and bring it to a meeting, then discuss it together and add comments and interpretations to it manually.
Seriously, you have to look at your results and think about their meaning anyway. While you’re at it, why not add your thoughts as annotations?
Tip #4: Beautiful plots are easy – remove everything not needed and use contrast depending on importance.
If you follow tips 1-3, you save a lot of time in your daily work. Use a little bit of what you’ve gained on making your plots beautiful. It’s very easy:
Remove all elements from your plot that are not serving a particular purpose (Edward Tufte refers to this approach as maximizing the data-ink ratio).
- In many cases, instead of removing elements altogether, you can also reduce their contrast: In the example above, I actually cared about the blue line (high contrast), while the original data stays in the background at low contrast.
- The same advice also applies to other secondary elements like gridlines – if you want to keep them, make them barely noticeable (I’ve seen so many horrible plots with thick black gridlines as the most striking element).
Again, this is very easy to implement… just increase your plot line thickness for the main data and decrease it for secondary elements… and you’re ready to go!
One restriction: Don’t try to cram everything into one graph! In the example above, I could have tried to combine both graphs into one by adding a second axis. In almost all cases I know, this leads to horribly cramped displays. If your data has two different dimensions (= signals with different units you care about), show them as two separate subplots (see here for another example of this principle).
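Here is how the contrast and subplot advice might look in practice, sketched in Python with matplotlib (my choice for illustration, not the tool used for the plots in this post). The moving average is only an assumed example of a "main" line drawn over raw data, and the signal names, units and values are made up.

```python
import math
import random
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: render to files, no window needed
import matplotlib.pyplot as plt


def moving_average(values, window):
    """Centered moving average; the window shrinks near the edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def declutter(ax):
    """Drop the top/right frame lines and keep only a faint grid."""
    ax.spines["top"].set_visible(False)
    ax.spines["right"].set_visible(False)
    ax.grid(True, color="0.9", linewidth=0.5)  # light gray, thin
    ax.set_axisbelow(True)  # grid stays behind the data


def contrast_plot(t, force, speed, out_file):
    """Two signals with different units go into two stacked subplots."""
    fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True, figsize=(6, 5))
    for ax, raw, ylabel in ((ax1, force, "force [N]"), (ax2, speed, "speed [m/s]")):
        # Secondary element: thin, light gray line in the background.
        ax.plot(t, raw, color="0.75", linewidth=0.6, label="raw data")
        # Main element: thick, saturated line on top.
        ax.plot(t, moving_average(raw, 15), color="tab:blue",
                linewidth=2.0, label="moving average")
        ax.set_ylabel(ylabel)
        declutter(ax)
    ax1.legend(frameon=False)
    ax2.set_xlabel("time [s]")
    fig.tight_layout()
    fig.savefig(out_file, dpi=100)
    plt.close(fig)
    return out_file


if __name__ == "__main__":
    random.seed(1)
    t = [i * 0.05 for i in range(300)]
    force = [100 * math.sin(ti) + random.gauss(0, 15) for ti in t]
    speed = [2 * math.cos(ti) + random.gauss(0, 0.4) for ti in t]
    print(contrast_plot(t, force, speed, Path(tempfile.mkdtemp()) / "contrast.png"))
```

The shared x-axis keeps both signals aligned in time, so you get the comparison a second y-axis would offer without the clutter.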
Tip #5: Use transparency to add density information to scatterplots
Scatterplots come in handy in many cases, especially where noisy relationships between two variables are of interest. However, if data points overlap or match, most of the density information is lost… unless you activate transparency:
Plot of all hurricanes that hit the US between 1850 and 2016 (from the example above). Overlapping points appear darker due to transparency settings.
This is sooo easy, and yet missing from many dense scatterplots. This plot was created in R, but even the newer versions of MS Excel include transparency as an option for plot points.
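For reference, a minimal sketch of the same idea in Python with matplotlib (an assumption on my side, since the plot above was made in R). The data is synthetic, not the hurricane dataset. The small helper `stacked_opacity` is a hypothetical name; it shows why overlaps look darker under standard alpha blending: n identical points with opacity alpha combine to 1 - (1 - alpha)^n.

```python
import random
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: render to files, no window needed
import matplotlib.pyplot as plt


def stacked_opacity(alpha, n):
    """Resulting opacity when n points with the same alpha overlap exactly."""
    return 1.0 - (1.0 - alpha) ** n


def density_scatter(x, y, out_file, alpha=0.2):
    """Scatterplot with semi-transparent markers so dense regions appear darker."""
    fig, ax = plt.subplots(figsize=(5, 4))
    ax.scatter(x, y, s=30, color="tab:blue", alpha=alpha, edgecolors="none")
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    fig.tight_layout()
    fig.savefig(out_file, dpi=100)
    plt.close(fig)
    return out_file


if __name__ == "__main__":
    random.seed(0)
    # Synthetic example data: one dense cluster plus a sparse background.
    x = [random.gauss(0, 0.3) for _ in range(800)] + [random.uniform(-3, 3) for _ in range(200)]
    y = [random.gauss(0, 0.3) for _ in range(800)] + [random.uniform(-3, 3) for _ in range(200)]
    print(density_scatter(x, y, Path(tempfile.mkdtemp()) / "scatter.png"))
    for n in (1, 3, 10):
        print(n, "overlapping points ->", round(stacked_opacity(0.2, n), 2))
```

A lower alpha leaves more headroom before dense regions saturate, so for very large datasets it usually pays to go well below the value used here.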
Tip #6: Find your own style
There may be a lot of things still left to add, but this post is already long enough, and in any case, there is often no single correct solution in visualization. It’s easy to point out bad visualization examples but almost impossible to agree on the best visualization for any task.
- Feel free to experiment. Try different things and see what works.
- Invite others to give feedback on your visualizations and have a look at what they are doing themselves.
- Don’t get overly obsessed with details. Plots are a tool for communicating data… often a very beautiful tool, but still a tool.
If you think something is missing on this list or disagree, please write a comment!