I really question Denise's maths...

I read part of Denise's piece on the China Study, and I quickly came to her quick note on stats:

A quick note on stats

Not a math whiz? No problem. The stuff here should be pretty straightforward, but in case you’re rusty or simply allergic to numbers, here’s a refresher on some of the statistics terminology I’ll
be using.

A positive correlation means two variables increase together and decrease together. For example, the more food Garfield scarfs down, the more Garfield weighs;
the less food Garfield eats, the less Garfield weighs.

A negative correlation means one variable increases while the other decreases, and vice versa. For example, the more Garfield exercises, the less Garfield weighs; the
less Garfield exercises, the more Garfield weighs.


Well... From this, I can see that Denise may be a math whiz, but she doesn't seem to know how to use stats, or which stats techniques apply to what.
After reading more, I saw her using Excel to draw her data plots. I'm pretty sure she used it to calculate her "correlations" as well.
So where did she go wrong?
First, my understanding is that in epidemiology you don't look at the full range of a variable when you look for an association between a factor and an effect. You draw a 2x2 table with the factor in the rows and the effect in the columns. The first row is below the exposure threshold you define, the second row above it. The first column is no effect, the second column is with effect. Then you fill in the number of cases in each of the four cells and calculate the risks and so on.
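As a rough sketch of the 2x2 approach described above: the counts below are invented purely for illustration (they are not from the China Study data), and the names `table` and `risk` are my own.

```python
# Hypothetical counts, made up only to show the 2x2 layout.
# Rows: exposure below / above the chosen threshold.
# Columns: no effect / effect.
table = {
    "below": {"no_effect": 90, "effect": 10},
    "above": {"no_effect": 70, "effect": 30},
}


def risk(row):
    """Proportion of people in one exposure group who show the effect."""
    return row["effect"] / (row["effect"] + row["no_effect"])


risk_below = risk(table["below"])
risk_above = risk(table["above"])

# Common summary measures computed from the four cells.
relative_risk = risk_above / risk_below
risk_difference = risk_above - risk_below
odds_ratio = (table["above"]["effect"] * table["below"]["no_effect"]) / (
    table["above"]["no_effect"] * table["below"]["effect"]
)

print(f"risk below threshold: {risk_below:.2f}")
print(f"risk above threshold: {risk_above:.2f}")
print(f"relative risk:        {relative_risk:.2f}")
print(f"risk difference:      {risk_difference:.2f}")
print(f"odds ratio:           {odds_ratio:.2f}")
```

A relative risk above 1 would suggest the effect is more common above the threshold, whatever the shape of the relationship inside each group.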
From what I can see, Denise didn't take this approach, but went straight for the CORREL function in Excel, plotting the data in XY plots and basically drawing lines through the data points. The CORREL function returns the linear (Pearson) correlation coefficient for all the points in your plot: it measures how tightly the data points scatter around the straight line that fits them best.
First, this is not the stats technique used in the field, from all I could gather in the literature.
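For reference, here is a minimal sketch of the quantity CORREL reports, the Pearson correlation coefficient, written out by hand. The function name `pearson_r` and the sample points are my own, chosen only for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up points for illustration only.
xs = [1, 2, 3, 4, 5]
on_a_line = [2 * x + 1 for x in xs]   # points exactly on a straight line
scattered = [3, 1, 4, 1, 5]           # points with no clear straight-line trend

r_line = pearson_r(xs, on_a_line)
r_scattered = pearson_r(xs, scattered)

print(f"points on a line: r = {r_line:.3f}")
print(f"scattered points: r = {r_scattered:.3f}")
```

The coefficient only rewards closeness to a straight line, which is why it can come out low even when one variable always rises with the other.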
Second, let's take a quick example of how this CORREL technique does not work for this purpose. Let's take a simple exponential function with two different rates of rise, one almost linear (purple), the other really steep at one point (red).

You will agree that in both cases Y increases when X increases (the more Garfield eats, the more Garfield weighs, as she put it). But the result of the CORREL function shows a strong correlation for the purple case (0.991669, or 99%) and a far weaker one for the red case (0.534017, or 53%), even though Y rises with X in both! That is the conclusion you are driven to if you use Denise's approach to correlations between factors and incidence.
With this simple example, you can see that the technique she used doesn't work. It doesn't work on the simplest data set, one your eyes can analyze easily, so imagine what it can do on more complicated data sets.
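Here is a small sketch of the same kind of experiment. The growth rates (0.05 and 2.0) and the X range are my own assumptions, not the parameters behind the plot above, so the numbers will not match the ones quoted exactly. For comparison it also computes a rank-based (Spearman) coefficient, which is a technique I am swapping in here to show that Y rises with X in both cases; it is not something used in the plot above.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient (what Excel's CORREL returns)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Rank of each value (1 = smallest). Assumes no ties, which holds here."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def spearman_r(xs, ys):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    return pearson_r(ranks(xs), ranks(ys))


# Assumed parameters, for illustration only.
xs = [i / 10 for i in range(101)]            # 0.0, 0.1, ..., 10.0
gentle = [math.exp(0.05 * x) for x in xs]    # almost linear over this range
steep = [math.exp(2.0 * x) for x in xs]      # flat at first, then shoots up

r_gentle = pearson_r(xs, gentle)
r_steep = pearson_r(xs, steep)
rho_gentle = spearman_r(xs, gentle)
rho_steep = spearman_r(xs, steep)

print(f"gentle curve: Pearson {r_gentle:.3f}, Spearman {rho_gentle:.3f}")
print(f"steep curve:  Pearson {r_steep:.3f}, Spearman {rho_steep:.3f}")
```

Both curves are strictly increasing, so the rank-based coefficient treats them the same, while the linear coefficient drops for the steep one.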
In conclusion, I don't think there is much to discuss in what she writes. It is all based on a wrong data analysis, so none of her conclusions can be valid. That will actually save me a lot of time reading it.
I bet if her work had been peer reviewed, it would have never made it out of her kitchen.
So, Denise, if you ever come across this, don't take it personally, but before jumping on your soapbox and spreading falsehoods, just open a couple of books and learn about what you want to do first.
Then, there are other aspects of her analyses I'm not confident about, like using mortality as a measure of disease incidence. I'm no epidemiologist, but I think for chronic disease or cancer that would not be the best indicator, since you can live quite some time with one, and you can die from other causes. But that may be what was in the original study, I don't know, I haven't seen it.
So I'll just stay on the math side, because she is so wrong in my opinion that none of what she writes has any value.

Replies

  • well this is fantastic Stephane! Great to have you here assisting :)
    The flaws are really showing up now.
  • this is a very clear indication of the nature of the problem, stephane!

    while i think it is decent of denise to provide 'intro tutorials' for the benefit of others, it seems she may have followed them too 'literally' herself. i think you are the 3rd phd (i know you are too modest to state such things) who has criticized her stats understanding.

    it would be a good idea for us to take the accumulated consensus and follow through on this idea:
    http://www.30bananasaday.com/xn/detail/2684079:Comment:629872
    (this should not be done as an attack on denise, but as an emphasis that there are serious errors of application and interpretation - which is what campbell has been saying back in 2006)

    in friendship,
    prad
    understanding.it
  • thanks for this insightful post stephane. yes many thoughtful and intelligent people like yourself are questioning Denise's stats skills, the more we look into it the less likely it seems like she knew what she was doing.


    Then, there are other aspects of her analyses I'm not confident about, like using mortality as a measure of disease incidence. I'm no epidemiologist, but I think for chronic disease or cancer that would not be the best indicator, since you can live quite some time with one, and you can die from other causes.

    This is an excellent point. basically all she's looking at are the uncorrected correlations between various factors, including diet and disease, and the mortality rates for various diseases. not being dead does not mean the absence of disease, as you rightly pointed out. this must be factored into our formal rebuttal.
    • I think using mortality rates is really misleading. If I remember correctly, for breast cancer, they consider a woman survives cancer if she doesn't die within 5 years of surgery. So all women who manage to live 5 years after their surgery and breast removal for cancer would not appear in her stats! That doesn't sound right.
    • there's still the chance that a woman who survived >5 years after diagnosis might be coded as having died of breast cancer on her death certificate and therefore contribute to breast cancer mortality... but, at least for breast cancer, a big issue is that the women you capture in your mortality rates will have had much more aggressive disease with possibly a different natural history and etiology (much related to the very point you made above).

      and i completely agree - incidence and mortality are not necessarily related. and there are many other factors that play a role in mortality that are irrelevant to incidence (like treatment).

      btw, did you use R for that graph?! i love R... :-)