How large a difference is there between mouse and human inflammatory responses?

Attention conservation notice: Post about a 2014 PNAS paper discussing a 2013 PNAS paper (cited 449 times already!) for a journal club I was required to go to, which you’ve probably already heard of if you’re in the field, and probably don’t care about if you’re not. Also, I am not a geneticist.

The crux of both these papers is the following: let’s compare the genome-wide gene expression of humans in situations in which their inflammatory systems are likely to be going haywire (trauma, burns, sepsis with endotoxins) to the genome-wide gene expression of mice that are also in stressful situations in which their gene expression is likely to be pretty haywire-ish as well.

Here’s the problem: they come to exactly opposite conclusions. The 2013 paper says that the correspondence between the human and mouse inflammatory signatures isn’t very good, and the 2014 paper says that it is pretty good. And they use similar data sets.

So why the conflicting results?

1) One of the major differences is that the 2013 paper compared human genes that had been thresholded for significance in the condition of interest to all mouse genes, whereas the 2014 paper thresholded both gene sets for significance prior to calculating the correlation. But why threshold for significance and then do correlations at all? It might be nice to threshold for some other characteristic of the genes, such as high variance, that is at least somewhat orthogonal to the actual correlation.

2) The 2013 paper uses Pearson correlations, whereas the 2014 paper uses Spearman rank correlations, making quite a brouhaha about how this is necessary because a KS test shows the data are not normally distributed, and the Spearman measure is better in the case of non-normality. If this is so important, why not Kendall’s tau? But I am not convinced that it is: even in the most extreme cases on Wikipedia, the difference between Pearson and Spearman is only ~ 0.15 – 0.2, whereas the two studies found differences of ~ 0.4 – 0.6. I bet #1 is more key.
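To get a feel for how much each of these two choices can move a correlation, here is a minimal sketch on purely synthetic data, not the data from either paper. Everything in it is my own assumption: the made-up "fold changes" are drawn from a bivariate normal with a true correlation of 0.5, a cutoff on absolute value stands in for a significance threshold, and the names `simulate` and `compare` are hypothetical. It reports Pearson and Spearman for all genes, for genes thresholded on the human side only, and for genes thresholded on both sides.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def simulate(n_genes=5000, true_rho=0.5, seed=0):
    """Synthetic human/mouse values (assumption: bivariate normal)."""
    rng = random.Random(seed)
    human, mouse = [], []
    for _ in range(n_genes):
        h = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)
        human.append(h)
        mouse.append(true_rho * h + math.sqrt(1 - true_rho ** 2) * noise)
    return human, mouse


def compare(human, mouse, cutoff=1.0):
    """Correlations under three gene-selection rules.

    Assumption: |value| > cutoff is a crude stand-in for "significant".
    """
    pairs = list(zip(human, mouse))
    human_only = [(h, m) for h, m in pairs if abs(h) > cutoff]
    both = [(h, m) for h, m in human_only if abs(m) > cutoff]
    results = {}
    for name, subset in (("all", pairs), ("human_only", human_only), ("both", both)):
        hs, ms = zip(*subset)
        results[name] = {
            "n": len(subset),
            "pearson": pearson(hs, ms),
            "spearman": spearman(hs, ms),
        }
    return results


results = compare(*simulate())
for name, row in results.items():
    print(f"{name:>10}  n={row['n']:5d}  "
          f"pearson={row['pearson']:.2f}  spearman={row['spearman']:.2f}")
```

In this toy setup the underlying relationship never changes, yet the reported correlation climbs as the selection rule gets stricter. That says nothing about which paper is right; it only shows that the thresholding step by itself can shift the number, which is why I suspect it matters more than the choice of correlation measure.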

Bottom line: The cynical take is that comparing mouse to human gene expression patterns poses a large number of analysis quandaries, offering many free parameters for the researchers to draw the conclusions they prefer. The more idealistic take is that authors and reviewers must be very careful in ensuring standard, robust methods in the analyses. Overall, I come down in favor of the 2014 paper, because they threshold mouse and human genes in the same way, which is more like comparing apples to apples.