Scaling the brain: Is it dishonest to truncate your y-axis?

So, the other day I responded to a tweet by Felix Schönbrodt. He called out a tweet by GESIS – Leibniz-Institut für Sozialwissenschaften that showed data on life satisfaction in Germany from 2010 to 2016 without a y-axis (below left). He added a picture of a tweet (below right) that suggests it is dishonest to truncate your y-axis, and I completely disagree with the message of that tweet! Importantly, I do agree with Felix that the GESIS graph was a prime example of how not to visualize data. Still, I want to use this post to point out some considerations for choosing the range of your y-axis.

Left: graph posted by GESIS – Leibniz-Institut für Sozialwissenschaften on twitter. Right: picture of tweet posted by Felix Schönbrodt on twitter.

 

First of all, let’s make the case for including “0” in bar graphs (a more detailed account can be found here) and then move on to situations where it is dishonest not to exclude it. I believe Felix’s main point is that in a bar graph the height of a bar represents something meaningful about the data. One case where this is inarguably true is height itself: below, I have graphed the body heights of my family. In the left panel, I have included “0” and you can roughly judge that I am twice as tall as my daughter Sophie. If the y-axis is truncated, as in the right panel, this same interpretation would make me ten times as tall as Sophie and, in fact, in that graph Liah is infinitely smaller than me. Nonsense.

Left: heights of the Feld family with y-axis starting at 0. Right: heights of the Feld family with y-axis starting at 1.

Of course, this was an exaggeration to make a point. However, since people expect the height of a bar to be relevant to its interpretation, violating that expectation will distort how they interpret more relevant and realistic data. And this expectation only holds if the data have a meaningful “0”, which is directly connected to the scale of measurement. In statistics, a scale that has a true “0” is called a ratio scale and it shares the feature of our above left bar graph that ratios can be interpreted, i.e., we can say the value 4 is twice as large as 2 (just as we said I am twice as tall as Sophie).
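To put a number on this distortion, here is a minimal sketch that compares the ratio of the drawn bar lengths for two values under different baselines. The function name `apparent_ratio` and the two heights are made up for illustration; they are not the values behind the family graph.

```python
def apparent_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar lengths for a and b when the y-axis starts at `baseline`."""
    drawn_a = a - baseline
    drawn_b = b - baseline
    if drawn_b <= 0:
        # The second bar has no visible length, so the first looks infinitely larger.
        return float("inf")
    return drawn_a / drawn_b


# Hypothetical heights in metres (assumed, not taken from the figure).
parent, child = 1.90, 1.09

honest = apparent_ratio(parent, child, baseline=0.0)
truncated = apparent_ratio(parent, child, baseline=1.0)

print(f"axis starts at 0 m: bars differ by a factor of {honest:.1f}")
print(f"axis starts at 1 m: bars differ by a factor of {truncated:.1f}")
```

Only the version with the baseline at 0 reproduces the true ratio of the two values; any other baseline changes the ratio of the bar lengths.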

To contrast this, let’s move to some other data: the forecast for temperature in degrees Celsius in St. Albans/UK on 7th April (see below top left and know that I am not very happy about this forecast). 0°C may seem like a true “0”, since it is the temperature at which water freezes. Thus, when comparing 7h and 11h (in red) you might be tempted to say that the temperature has doubled from 7°C to 14°C in four hours. However, a quick look at the same data transformed to degrees Fahrenheit (below lower left) reveals this to be untrue (57.2°F at 11h is not twice as high as 44.6°F at 7h). The reason for this is that these scales are interval scales, which means that only ratios of differences can be compared. For example, the difference in temperature between 13h and 7h is eight times the difference in temperature between 13h and 11h, regardless of whether we measure in °C or °F (below right).

Temperature forecast for St. Albans on 7th April 2018. Top left: in degrees Celsius. Bottom left: in degrees Fahrenheit. Top right: difference in temperature in degrees Celsius between 13h and 11h as well as 13h and 7h. Bottom right: difference in temperature in degrees Fahrenheit between 13h and 11h as well as 13h and 7h.
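The interval-scale argument can be checked with a few lines of arithmetic. In this sketch the 7h and 11h values are the ones quoted above; the 13h value of 15°C is my assumption, chosen only because it is consistent with the "eight times" statement, and the helper names are made up for illustration.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


# 7h and 11h are from the text; 13h = 15.0 is an assumed value.
temps_c = {"7h": 7.0, "11h": 14.0, "13h": 15.0}
temps_f = {hour: c_to_f(t) for hour, t in temps_c.items()}


def value_ratio(temps):
    """Ratio of the raw values at 11h and 7h (not meaningful on an interval scale)."""
    return temps["11h"] / temps["7h"]


def difference_ratio(temps):
    """Ratio of two temperature differences (meaningful on an interval scale)."""
    return (temps["13h"] - temps["7h"]) / (temps["13h"] - temps["11h"])


print(f"ratio of values:      {value_ratio(temps_c):.2f} (°C) vs {value_ratio(temps_f):.2f} (°F)")
print(f"ratio of differences: {difference_ratio(temps_c):.2f} (°C) vs {difference_ratio(temps_f):.2f} (°F)")
```

The ratio of values changes with the unit, whereas the ratio of differences stays the same, because the additive offset cancels in each difference and the multiplicative factor cancels in the ratio.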

We can get around the shortcoming of interval scales in this case, as temperature can be expressed in kelvin, a scale which has a true “0” (nothing can be colder than absolute zero). However, if you look at the graph in kelvin (below) you see that this is not very helpful. While we could now validly interpret the ratio of temperatures, including absolute zero has scaled the graph so that the small differences of interest to me are no longer discernible. This is probably why we still hang on to our outdated use of °C or °F, as they nicely scale with the range relevant to our daily lives.

Temperature forecast for St. Albans on 7th April 2018 in kelvin.
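A quick sketch of why the kelvin graph is valid but unhelpful, using the 7°C and 14°C values from above. The assumption here is an axis that runs from absolute zero up to the 11h value; the real graph may extend a little further, which would only make the visible share smaller.

```python
def c_to_k(celsius):
    """Convert degrees Celsius to kelvin."""
    return celsius + 273.15


temp_7h_k = c_to_k(7.0)
temp_11h_k = c_to_k(14.0)

# On a scale with a true zero this ratio is meaningful.
ratio = temp_11h_k / temp_7h_k

# Share of an axis from 0 K to the 11h value that the morning warming occupies.
share = (temp_11h_k - temp_7h_k) / temp_11h_k

print(f"11h is {ratio:.3f} times as warm as 7h")
print(f"the change fills {share:.1%} of the axis height")
```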

Note that if a scale does not have a true “0”, or if including it would make the interesting differences indiscernible, we should not use bar graphs, so as to convey that ratios of values cannot be interpreted. A simple alternative that does not imply this is the line graph, which works nicely for time series data like our temperatures. However, excluding “0” – as it is not informative – introduces bias, since we need to decide which arbitrary range the y-axis should cover. This can make a huge visual difference! For our temperature data, a simple approach could be for the y-axis to start at the minimum and end at the maximum value measured for that day (below top), or you could use the minimum and maximum temperatures measured on earth in general (below middle). However, you might agree that the first exaggerates the differences across the day and the second marginalizes them, so like me you might prefer to use the average minimum and maximum temperatures in St. Albans measured over the year (below bottom). Importantly, your choice will depend on what you want to convey and on the expectations of your audience. In other words, you should use a meaningful range.

Temperature forecast for St. Albans on 7th April 2018 in degrees Celsius. Top: y-axis range 5° to 16°. Middle: y-axis range -80° to 60°. Bottom: y-axis range 1° to 22°.
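How much the choice of range matters can be expressed as the share of the axis height that a given change occupies. This sketch uses the three y-axis ranges from the caption and the warming from 7°C to 14°C mentioned earlier; the function name and the dictionary labels are made up for illustration.

```python
def share_of_axis(low, high, y_min, y_max):
    """Fraction of the y-axis height covered by the change from `low` to `high`."""
    return (high - low) / (y_max - y_min)


# y-axis ranges in °C as given in the caption; the labels are only descriptive.
axis_ranges = {
    "min/max of the day": (5, 16),
    "extremes on earth": (-80, 60),
    "yearly averages in St. Albans": (1, 22),
}

# Visual size of the warming from 7°C (7h) to 14°C (11h) under each range.
shares = {
    label: share_of_axis(7, 14, y_min, y_max)
    for label, (y_min, y_max) in axis_ranges.items()
}

for label, share in shares.items():
    print(f"{label:>30}: {share:.0%} of the axis height")
```

The same 7°C change fills well over half of the first graph, a twentieth of the second and a third of the last.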

The definition of what is meaningful sometimes seems a bit arbitrary and, if you have read How to Lie with Statistics by Darrell Huff, you may think the goal of any data visualization is to deceive instead of conveying meaning. And for sure, as we have seen, by choosing the range of your y-axis you can make very small differences look huge or very large differences look irrelevant. However, it is dangerous to suggest that there are no guidelines on how to visualize data and that anything is up to interpretation. In the next paragraph, I will try to make the case for using the variance of the data as a meaningful tool to fix your y-axis range.

Below, I have simulated some IQ values (mean = 100, SD = 15) in three different groups (let’s say orange-, purple- and red-eyed people) and made three graphs with different y-axis ranges. On the left, it looks like having orange eyes is related to a much lower IQ than having red eyes, whereas in the middle the differences look negligible and this is – as we know – only because I have chosen the y-axis to make this impression. Luckily, in science, a relatively unbiased way to choose the scale of your y-axis is to include the variance of your variable. On the right, I have added the standard deviation of each group as error bars and changed the range of the y-axis to roughly the mean plus or minus twice the standard deviation. Now differences no longer need to be interpreted in relation to the y-axis range, but can be related to the variance of the variable. The gold standard, of course, is to define the range of your y-axis with respect to the variance expected from the literature, as this is not biased by the level of noise in your measurements. Importantly, if the error bars in a graph look very large or very small, this may indicate that somebody is trying to deceive you and make you believe differences are smaller or larger than they really are. If it is your graph, you may be deceiving yourself.

Simulated IQ data in three groups (mean = 100, SD = 15). Left: y-axis range 96 to 108. Middle: y-axis range 0 to 200. Right: y-axis range 70 to 130.
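The variance-based range can be sketched in a few lines. This is not the code behind the figure: the group size of 50, the seed, the pooling across groups and the helper name `sd_based_ylim` are all assumptions made for illustration.

```python
import random
import statistics


def sd_based_ylim(values, k=2.0):
    """Return (lower, upper) y-axis limits as mean ± k sample standard deviations."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return mean - k * sd, mean + k * sd


rng = random.Random(1)  # arbitrary fixed seed so the sketch is repeatable

# Simulated IQ values (mean 100, SD 15); 50 people per group is an assumption.
groups = {
    colour: [rng.gauss(100, 15) for _ in range(50)]
    for colour in ("orange", "purple", "red")
}

# Pooling all groups before computing the limits is one possible choice.
pooled = [value for values in groups.values() for value in values]
lower, upper = sd_based_ylim(pooled)

print(f"y-axis from {lower:.0f} to {upper:.0f}")
```

To follow the gold standard described above, you would replace the sample estimates with the mean and standard deviation expected from the literature, which for IQ gives 100 ± 2 × 15, i.e. 70 to 130.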

In conclusion, I believe it is not dishonest if your y-axis does not include “0”. However, it is very important that the range of your y-axis covers a meaningful interval, which is often defined by the sample variance of your variable or by a priori consideration of typical variance in that variable. Coming back to the GESIS graph, it is clear that its y-axis does not cover a meaningful range, as it overemphasizes changes of only 0.4 on a 10-point Likert scale and does not provide error bars.

Posted in Data visualization, Methods, Statistics

Deceived brain – Can twitter followers differentiate real and false memories?

Currently, I am curating the German version of the Real Scientist twitter account and this is a lot of fun. At Real Scientist, real scientists get to tweet about their work and benefit from the following of the account, which is usually larger than their own. During the week I have used twitter’s survey function to find out more about the followers’ sleep habits, me being a sleep and memory researcher and all. This was helpful to break down some of the more complicated information I wanted to relay. Here is how I tried to use twitter to also collect some memory data. Here is a link to the associated twitter thread.

After one of the sleep surveys, I got into a little bit of a discussion with one of the followers about how to analyse these data and of course the restrictions of twitter surveys don’t really allow a lot of flexibility. This prompted me to use an outside survey service for my little experiment. Essentially I wanted to use the Deese–Roediger–McDermott paradigm (Deese 1959, Roediger & McDermott 1995) to demonstrate to the followers how easy it is to induce false memories.

List of German words that can be grouped by the category words “süß”, “Mann”, “Spinne” and “schwarz”. Note that the category words were not used in the lists and participants were not told that there were categories.

To this end, I showed a list of German (this was German Real Scientist after all) words that pertained to four categories and were randomly mixed. I asked the twitter followers to memorize the words and told them I would delete the tweet in the evening. The next day, I provided a survey using a third-party service, which contained old words (from the list), new words and lure words. For each word, it asked participants whether it was part of the originally learned list. New words were not related to the learned list in any meaningful way, but lure words were associated with the four categories of the original list. In the end, 22 followers (of 3000) actually filled in the survey.

Luckily for me, the data showed exactly what could be expected from the literature (I used simple two-sided paired t-tests for the p-values). 1. Trivially, hits (correctly identifying an old word as being part of the learned list) were more frequent than false alarms for new words (incorrectly identifying a new word as being part of the learned list). 2. Importantly, false alarms for lures (incorrectly identifying a lure as being part of the learned list) were more frequent than false alarms for new words. 3. Reassuringly, hits were still more frequent than false alarms for lures. These data nicely demonstrated that our brain’s memory system does not work like a hard drive and retrieval is a reconstructive process prone to error.

Data from the experiment. A) Relative frequency (mean and standard deviation) of hits and false alarms for new words and lures. B) Absolute frequency for each item being identified as an item from the learning list by the 22 participants.
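For readers who want to see what such an analysis looks like, here is a minimal sketch of the two ingredients: the rate at which a participant endorses items of one type, and the t statistic of a paired comparison. The numbers are invented and are not the survey data; the function names are made up, and turning the t statistic into a p-value would additionally need the t distribution (e.g. from a statistics package), which is left out here.

```python
import math
import statistics


def endorsement_rate(responses):
    """Share of items a participant marked as 'was on the list' (True = yes)."""
    return sum(responses) / len(responses)


def paired_t(a, b):
    """t statistic of a paired t-test on the per-participant differences a - b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


# Invented false alarm rates of four hypothetical participants.
false_alarms_lures = [0.50, 0.75, 0.25, 0.50]
false_alarms_new = [0.00, 0.25, 0.00, 0.25]

t_value = paired_t(false_alarms_lures, false_alarms_new)
print(f"t({len(false_alarms_lures) - 1}) = {t_value:.2f}")
```

For hits the responses to old words go into `endorsement_rate`; for false alarms the responses to new words or lures do.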

All in all, I must say that I am quite happy that this little demonstration worked out and it nicely showed that you can collect more complex data via twitter. However, I was a bit shocked that I only got 22 respondents to the survey. When running a survey internally on twitter, I received about 200-300 respondents using this account. The third-party service was optimized for mobile devices, so I don’t think that was an issue. It would have been so cool to run such an experiment on a couple of hundred people just by posting it on twitter. However, it seems leaving twitter is a much higher hurdle than I had anticipated.

Posted in Experiment, Memory

Continuity of self: Was the world put into place five minutes ago?

In my first ever blogpost I speculated whether uploading your brain would result in potentially eternal life. And I concluded it was possible. However, I also concluded that what we experience as our self is not as continuous as we believe. I kind of glossed over the reasons I believe that to be true. Since elaborating this allows me to talk about two of my favourite thought experiments, I decided to write this short add-on.

The malicious demon thought experiment by Descartes supposes there is a being that can produce a complete illusion of the external world:

I shall think that the sky, the air, the earth, colours, shapes, sounds and all external things are merely the delusions of dreams which he has devised to ensnare my judgement. I shall consider myself as not having hands or eyes, or flesh, or blood or senses, but as falsely believing that I have all these things.

This idea, of course, has been used many times. It is the plot of the movie The Matrix and is also used for the simulated reality or the brain in a vat arguments. It is perfect to demonstrate that we are unable to tell apart reality and illusion, if the illusion is good enough.

vision by Kaitlin M is licensed under CC BY 2.0

The “five minute hypothesis” by Bertrand Russell works in a similar way, but goes even further. It assumes that everything in the universe, including human memory, sprang into existence five minutes ago. In this scenario an event that you remember does not necessarily have to have happened, as long as the molecular structures of the memory were also put in place five minutes ago and thereby create the illusion that it happened. It is impossible to prove that this hypothesis is wrong. Of course, it is also not possible to prove that it is correct and entertaining it as a real possibility is mostly meaningless. Rather, you may use it to scrutinize your feeling of continuity of self. To this end, if neuroscience were able to prove that a continuity of the self exists outside of plastic brain changes (i.e., the molecular structures of the memory), it would disprove this hypothesis.

So far I am unaware of such data and I believe it is highly unlikely they will ever exist.

Posted in Neurophilosophy

Theseus Brain: Can you prolong your life by uploading your brain to a computer?

The idea of living forever in a digitalised form has become popular and recently even featured in the Doctor Who Christmas Special 2017. [SPOILER] An advanced civilisation kidnaps individuals at the very last second of their lives, extracts all their memories and returns them for their deaths. It is revealed that this enables the individual to come back to life using an avatar resembling their body shortly before death. [/SPOILER]

Most people will intuitively answer the title question with a strict ‘no’. After thinking about it for a while, the answer may change towards ‘yes, but’. Assume a process will someday be engineered by which the exact physiological properties of your brain can be emulated in software running on a powerful computer. This emulation would by definition include all your memories and adapt to experiences in the exact same way as your brain would have. Most people will agree that we are far from discovering this technology and in fact the laws of quantum mechanics may even prevent it. For the sake of argument in this post, I will assume that it will someday be possible. Will that mean you can live forever?

In a way this idea is not very new, as it entails the classical problem of identity. The most ancient thought experiment in this domain is the Ship of Theseus or the Theseus paradox. The Grandfather’s Axe is a modern version of this conundrum: imagine an axe is passed down in the family over generations and over time both head and handle are replaced. Is it still the same axe? (Of course the axe would claim to be the same axe, if you asked it, but that’s a whole other story.)

In a similar way, you can ask what would be the consequence of replacing one single biological neuron in a human’s brain with one single artificial neuron that mimics all of the original neuron’s connections and functions. Would this make the respective human a different individual? Over time, you could exchange one neuron, then another, and another until the whole brain is made up of artificial neurons. At what point is the person no longer the original? Most people would agree that this gradual change, if it takes place over a period of, say, years, would mean the person stays the same individual, whereas making an artificial copy of the original brain (which is in all important ways equivalent to uploading it to a machine) would mean the new individual is not the same as the original. (This is essentially a rephrasing of the teleportation paradox.)

This paradoxical differentiation is due to a strong subjective feeling of continuity that is violated by the idea of uploading or copying people’s brains. For example, you feel this continuity when you wake up after a night of sleep and are convinced that you are the same person as when you went to bed (I stole this great example from Giulio Tononi’s Phi). As far as we know, this feeling is merely a product of sampling the memories of being yourself and realizing you have not changed. Put more drastically, the only thing that keeps your experience of self continuous are the constant plastic changes that are occurring in the brain and producing congruent memories that can be sampled. Unless one argues that there exists something like a non-material soul, all these memories can potentially be part of a brain upload.

In conclusion, there is no difference between the experience of uploading your brain to a computer and experiencing the flow of time in your dedicated biological body. Not because you remain yourself when you are uploaded, but because there is no continuous self. In this sense, continuity of self is an illusion and an upload of all your memories would in essence enable eternal life. However, this absence of actual continuity of existence also means that your life only lasts a moment. In each new moment a fleeting individual is born that inherits all your memories and only lasts a moment before it is again replaced.

If this sounds a bit depressing, you can console yourself with this: when you are out partying and have that one drink too many, it’s not you who suffers that hangover but future-you. And who cares about her?

Posted in Neurophilosophy

Is there anybody out there?

So finally I have gotten around to activating this site. It has been sitting around now for about two months and as the dust settles I just wanted to send out a quick hello in case anybody is listening. This page is intended for my personal musings and I hope I can regularly post things about neuroscience, philosophy and current developments. Of course, this page is merely a sly trick to advance my career, so don’t expect any actual original content. That’s it for now, watch out for my first actual post though…

Posted in Blog news