You’ll be surprised how important this is. Remember how you learned to use Excel in junior high (or maybe earlier)? It was probably for a science project. Well, don’t worry, because as a researcher, you’ll never have to make another Excel graph in your life!
…because we use things like Origin and Igor instead.
…which are complex enough to have their own programming languages.
Why? Well, once you pull super-cool results from your research (which I’m starting to now, by the way, and it’s awesome), you’re presented with a problem: people don’t have much time, and you have a lot of information. Often it’s tens of thousands of data points, from different tests and techniques over months of narrowly focused work, with layers of data analysis in between, that you somehow have to compile in a way that these busy, tired people will understand.
And they don’t want to deal with lazily made graphs.
If you think I’m making a mountain out of a molehill, think again – it’s this kind of stuff that makes papers take years to publish. But that’s why it’s good to get a handle on it now, when you’re just starting out in your career!
It’s easy to get mired in the details of each individual graphing program. You can probably use any of them, as long as it offers enough flexibility, so instead I’ll keep my advice general. The predominant choice here is Igor, which is the source of the plots in the “good” column. The “bad” column is a combination of MATLAB and Excel.
This wisdom comes from the EUREKA program, and my research mentor.
Disclaimer: none of the “good” column graphs have titles. This is only because I like to interchange titles based on what I use the graphs for, without having to edit the graph image over and over. Always, always include a title.
You’ll probably see a lot of this. This is fresh out of MATLAB, and yes, something I actually sent my mentor at some point. It looks great at the time, since you’re buried in your process and don’t take a step back to look at it! But in reality:
- The lines are too thin
- The labels are tiny
- Most of the labels are unnecessary, at least for a presentation
- Two graphs are overplotted for no particular reason
- It’s not cropped, and the axis range is poorly selected to include extraneous regions (including regions in which my filtering technique had some unprofessional hiccups!)
- Image quality is low
- Almost everything is whitespace or noise (extra “ink on the page” that doesn’t contribute to meaning) – which is a guaranteed way to make your graph confusing
But lastly, and most importantly, the purpose of the graph is not clear. Unless you need it to portray something, why would you show it?
Better (but not perfect)
This is the same analysis, but a version of it which could conceivably be used for professional work. It’s not quite what you would use for a paper (which is more formal), but it’s pretty decent for a presentation. And certainly it’s leagues ahead of the other one. The reasons why turn out to be good general tips:
- Thicker lines
- Bigger labels
- Only the important and necessary things are labelled
- Nicer colors* (this happens to coincide with color coding elsewhere in the presentation, which is another good strategy)
- Overplotting can be fine, but if data sets cross each other, it can get confusing
- The most important point on the graph is clearly identified
- High image quality – never screenshot; any good graphics program will have ways to export high-resolution figures
* Never use red and green to distinguish between different data sets on the same graph. Red-green colorblindness is surprisingly common!
** Only important in a presentation – in papers, complicated graphs are often the only way to go.
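If you’d like to see those tips as code, here is a minimal sketch. It assumes Python with matplotlib, which is not what I used (my “good” plots came from Igor), and every name and number in it is a placeholder: the function `presentation_plot`, the colors, the font sizes, the axis labels, and the choice of the highest point as “the most important point”. Treat it as one way to do it, not the way.

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen so no display is needed
import matplotlib.pyplot as plt


def presentation_plot(x, y, xlim, out, dpi=300):
    """Plot y against x with slide-friendly styling and export it as a PNG.

    `out` can be a file name or a writable binary file object.
    Assumes at least one data point lies inside `xlim`.
    """
    fig, ax = plt.subplots(figsize=(6, 4))

    # Thicker line, and a color that is neither red nor green
    ax.plot(x, y, linewidth=3, color="#4477AA")

    # Crop the x-axis to the region that matters
    ax.set_xlim(*xlim)

    # Bigger labels, and only the necessary ones (placeholder text here)
    ax.set_xlabel("x (placeholder)", fontsize=18)
    ax.set_ylabel("y (placeholder)", fontsize=18)
    ax.tick_params(labelsize=14)

    # Mark the most important point; as an assumption for this sketch,
    # that is the highest y value inside the cropped range
    inside = [(xi, yi) for xi, yi in zip(x, y) if xlim[0] <= xi <= xlim[1]]
    x_peak, y_peak = max(inside, key=lambda p: p[1])
    ax.plot([x_peak], [y_peak], "o", markersize=12, color="#EE6677")

    # Export at high resolution instead of taking a screenshot
    fig.tight_layout()
    fig.savefig(out, format="png", dpi=dpi)
    plt.close(fig)
    return ax
```

Calling `presentation_plot(x, y, (2, 8), "figure.png")` would write a 6 in × 4 in figure at 300 dpi, which is 1800 × 1200 pixels.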
Ah, the other ancient enemy of clear presentation: the default graphs from Excel. This is something I had for personal use, to organize the outputs of my data analysis. Don’t assume that you can just put it in your slides like this!
- Even if you’re going to explain what it means, there should at minimum be some written clues
- Both in the graph area and in the legend, there is duplicate data! That becomes noise to people who are trying to read your graph – or, in this case, it could make people think “Wow, that’s a really good fit!” when in fact the red points are what’s supposed to correspond to the trend line
- Don’t use “e” or “E” for scientific notation. Write out *10^x, if you need to. Whatever takes up the fewest characters is usually the best choice, so in this case, it would be better to just use decimals!
- There are some extra pieces which are legacies from analysis work, some points which are unverified, etc. – you will be expected to explain every single item on your graphs, and that can take time away from the good science!
- More of the same stylistic issues as before
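The “fewest characters” rule for number labels in the list above is easy to automate. Here is a small sketch in plain Python; the function name `shortest_label` and the default of three significant figures are my own placeholders, not part of any graphing package.

```python
from decimal import Decimal


def shortest_label(value, sig=3):
    """Return whichever is shorter: a plain decimal or a ×10^n label.

    The value is first rounded to `sig` significant figures.
    Ties go to the plain decimal.
    """
    if value == 0:
        return "0"

    # Round to `sig` significant figures, e.g. 0.0015 -> "1.50e-03"
    mantissa_text, exponent_text = f"{value:.{sig - 1}e}".split("e")
    exponent = int(exponent_text)

    # Plain decimal without trailing zeros, e.g. "0.0015"
    plain = format(Decimal(f"{mantissa_text}e{exponent}").normalize(), "f")

    # Written-out scientific notation, e.g. "1.5×10^-3" (no "e" or "E")
    mantissa = format(Decimal(mantissa_text).normalize(), "f")
    written_out = f"{mantissa}×10^{exponent}"

    return min((plain, written_out), key=len)
```

For example, `shortest_label(0.0015)` gives `0.0015`, while `shortest_label(1.5e9)` gives `1.5×10^9`.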
Better (but not perfect)
So again, this is meant for a slide presentation. Readability and simplicity are key, as before. Here are some more things to point out:
- Color choice is key. Notice that the red here is neither a primary color nor a default color – going slightly towards pastel from the brightest colors available to you often works well. (It comes across more clearly if you have a graph with many different colors, and makes you look like quite the professional.)
- Credit is given to prior work – that fit line was obtained by a former group member, and previously published. If you don’t attribute work to those who did it, the assumption will be that you did – which can constitute plagiarism, with quite serious ramifications if it makes its way into a publication. Be overcautious!
- Be creative with your axis labels. That doesn’t mean making them polka-dotted. That means choosing the format that works best for the graph you’re working with. You’ll notice that the x-axis variable is dimensionless. But I still had to convey what it meant! So rather than putting the units in parentheses, I included the chemical formula. I stopped short of explicitly pointing out the x in the formula because it’s expected that people will digest your graphs to some degree, and at some point more text just becomes noise.
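On the color tip above: “going slightly towards pastel” can be approximated by blending a color toward white. This is only a sketch of that idea in plain Python, not how I actually picked my colors; the function name `soften` and the default blend amount of 0.3 are assumptions you should tune by eye.

```python
def soften(hex_color, amount=0.3):
    """Blend a '#RRGGBB' color toward white.

    amount = 0 leaves the color unchanged; amount = 1 gives pure white.
    """
    # Split '#RRGGBB' into its red, green, and blue components (0 to 255)
    channels = [int(hex_color[i:i + 2], 16) for i in (1, 3, 5)]

    # Move each channel part of the way toward 255 (white)
    mixed = [round(c + (255 - c) * amount) for c in channels]

    return "#{:02X}{:02X}{:02X}".format(*mixed)
```

For example, `soften("#FF0000")` turns pure red into the softer `#FF4C4C`.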
I’ll leave you with a thinking point: How to deal with data that isn’t what you expected. Sometimes there are just deadlines, and you don’t have time to conduct the tests you would have to in order to correct your mistakes. What can you do? Be honest. Still choose the most effective way to present the data, even if it’s presenting your mistakes, or things you’re still seeking to understand. Most of your work is complicated. People won’t blame you for setbacks. But, as you’ll see time and time again, they’ll nail you for hiding them.
Can you see the setback?