bananas stats course

UPDATE: bananalysis is here!

coming soon to your own banana neighborhood!
here's your chance to learn the stats you slept through in school!!

there will be a course in statistical analysis offered right here with the expert guidance of veganmama (and anyone else who wants to pitch in) specifically geared to the interpretation of the china study data.

why should you do this course?

because:

1. you played hooky
2. you are keen on understanding how studies are designed
3. you want to know how to interpret data - how the numbers all come together to make sense
4. you want to understand the inner workings of the china study better
5. you want to convincingly refute the nay-sayers
6. you have always wanted to have some quantitative expertise, but thought you just couldn't do it

here you will be able to do all these things, because not only will things be explained, but you will also be able to ask questions and have people personally assist you. there are several mathematically skilled individuals in this group who will be happy to share their expertise.

the mechanisms for the course are presently being formulated - you are welcome to join the process.
details should be available by next week - so stay tuned!

in friendship,
prad

=======
here are some temp links to contemplate while we put things together.

R software
http://www.r-project.org/
getting started with R
simpleR pdf minicourse using R: http://cran.r-project.org/doc/contrib/Verzani-SimpleR.pdf

mcdonald intro course
http://udel.edu/~mcdonald/statintro.html

3 online courses
http://onlinestatbook.com/
http://davidmlane.com/hyperstat/
http://oli.web.cmu.edu/openlearning/forstudents/freecourses/statistics

Replies

pradtf July 29, 2010 at 3:34pm
this post is just experimental and may be removed. we are trying to find the best way to represent code in a post.

in friendship,
prad

======

bold
#identify the url
dataurl <- "http://www.ctsu.ox.ac.uk/~china/monograph/CH83PRU.CSV"

#get the data from the url
chinadata <- read.csv (dataurl, na.strings = ".", strip.white = TRUE)

#then pull out the xiang 3 and sex T
chinadata1 <- subset ( chinadata, Xiang == 3 & Sex == "T")

italic
#identify the url
dataurl <- "http://www.ctsu.ox.ac.uk/~china/monograph/CH83PRU.CSV"

#get the data from the url
chinadata <- read.csv (dataurl, na.strings = ".", strip.white = TRUE)

#then pull out the xiang 3 and sex T
chinadata1 <- subset ( chinadata, Xiang == 3 & Sex == "T")

using the pre tag
```
#identify the url

dataurl <- "http://www.ctsu.ox.ac.uk/~china/monograph/CH83PRU.CSV"



#get the data from the url

chinadata <- read.csv (dataurl, na.strings = ".", strip.white = TRUE)



#then pull out the xiang 3 and sex T

chinadata1 <- subset ( chinadata, Xiang == 3 & Sex == "T")
```
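
here is a minimal sketch of what the read and subset steps do, using a tiny made-up table instead of the real file so it runs without a download. only the Xiang and Sex column names come from the code above; the place and value columns and all the numbers are invented for illustration.

```r
# tiny made-up stand-in for the real csv; "." marks a missing value
csvtext <- "place,Xiang,Sex,value
AA, 1, M, 10
AA, 3, T, 12
BB, 3, M, .
BB, 3, T, 15"

# same options as above: "." is read as NA, stray spaces are stripped
toydata <- read.csv(text = csvtext, na.strings = ".", strip.white = TRUE)

# keep only the rows where Xiang is 3 and Sex is "T"
toydata1 <- subset(toydata, Xiang == 3 & Sex == "T")

print(toydata1)
```

strip.white matters here: without it the Sex values would keep their leading space and the comparison with "T" would not match.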
- veganmama18 > pradtf July 29, 2010 at 3:39pm
  
  i like the last one (using the pre tag). how did you do that?
- pradtf > veganmama18 July 29, 2010 at 4:45pm
  
  you can enclose things between <pre> and </pre>
  
  i like it too - but there seem to be serious issues with it on ning
  for instance, it is putting extra lines when we don't want them.
  also, if your line is too long, it will run off the edge.
  
  i'm inclined to think that bold or italic will likely be best.
  
  you can play with it here with some code you have and see what the output is like - we haven't done more than in the last post because we went for dindin, but based on what we did, i have a feeling it's not going to work well here.
  
  in friendship,
  prad
pradtf July 23, 2010 at 5:58pm

bananalysis is here!
where?
right here:
http://www.30bananasaday.com/group/debunkingthechinastudycritics/fo...

in friendship,
prad
pradtf July 23, 2010 at 9:52am

ok we are just about ready to start!

below are some of the details of the course which will be conducted in its own separate thread with an explanation of both content and process. we will maintain a set pace for it, though people are free to go at whatever speed they wish to.

we will be using the very nicely done carnegie-mellon university free online course as the trunk with various branches growing out of it directly relating to the china study data.

if you want to get a headstart go here:
http://oli.web.cmu.edu/openlearning/forstudents/freecourses/statistics

the software of choice is R which you can get from here:
http://www.r-project.org/
(though if you are on a linux or bsd system, it is likely your distribution has its own install process for it.)
we prefer R because it is open source, gnu software and exceptionally good - and was strongly recommended by rayna!
however, if some of you want to use excel, you can, since the course offers that opportunity too.

if you need help with installation, just ask in this thread.

the course thread location will be announced in another post.

in friendship,
prad

=======
the cm course is done very clearly with plenty of illustrative examples
and interactive practice exercises. the presentation is more practical
than mathematical in orientation, so instead of, say, working through
derivations or proofs, you get to work with the data right away
and understand what you are producing.

datasets for exercises can be downloaded and used in a free analysis
program such as R.

there are even explanatory videos!

there are 4 main units after a good introduction to the course:

unit 2: exploratory data analysis
techniques for summarizing and making sense of the data

module 1 examining distributions
- categorical and quantitative variables

module 2 examining relationships
- 4 types of relations (three are used in the course: cases I, II, III)
- causation, with a nice treatment of confounders
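
to give a feel for unit 2, here is a rough sketch of summarizing one categorical and one quantitative variable and then looking at a relationship between two quantitative ones. the variable names (group, x, y) and the numbers are made up for illustration and are not from the course or the china study data.

```r
# made-up data, purely for illustration
group <- c("a", "b", "a", "a", "b", "a")
x <- c(127, 181, 140, 133, 175, 150)
y <- c(34, 12, 30, 33, 15, 27)

# categorical variable: count how often each category occurs
group_counts <- table(group)
print(group_counts)

# quantitative variable: min, quartiles, median, mean and max
print(summary(x))

# relationship between two quantitative variables: correlation coefficient
r <- cor(x, y)
print(r)
```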

unit 3: producing data
deals with sampling methods and different study designs

module 3 sampling
- sampling plans: random and non-random

module 4 designing studies
- observational studies, experiments, surveys
- role of causation
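
as a small illustration of random versus non-random sampling, here is a sketch in R. the "population" is just the numbers 1 to 100, made up for the example, and the seed value is arbitrary.

```r
# a made-up "population" of 100 numbered units
population <- 1:100

# fix the random seed so the same sample comes up on every run
set.seed(1)

# simple random sample of 10 units, drawn without replacement
srs <- sample(population, size = 10)

# non-random (convenience) sample: just take the first 10 units
convenience <- population[1:10]

print(srs)
print(convenience)
```

in the random sample every unit has the same chance of being picked, while the convenience sample can only ever contain the first ten units.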

unit 4: probability
- preparatory groundwork for drawing inferences

module 5: introduction
- basic concepts

module 6: finding probabilities
- frequencies, outcomes, various rules

module 7: conditional probabilities
- conditional probability and independence, multiplication rule, trees

module 8: random variables
- discrete, continuous

module 9: sampling distributions
- parameters, behaviors
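
here is a rough sketch of the conditional probability and independence ideas from module 7, worked from a two-way table of counts. the counts and the event names A and B are invented for illustration.

```r
# made-up counts for two yes/no events A (rows) and B (columns)
counts <- matrix(c(20, 30,
                   10, 40),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(A = c("yes", "no"), B = c("yes", "no")))

total <- sum(counts)                        # 100 observations
p_a <- sum(counts["yes", ]) / total         # P(A) = 50/100
p_b <- sum(counts[, "yes"]) / total         # P(B) = 30/100
p_a_and_b <- counts["yes", "yes"] / total   # P(A and B) = 20/100

# conditional probability: P(A | B) = P(A and B) / P(B)
p_a_given_b <- p_a_and_b / p_b

# A and B are independent only if P(A and B) equals P(A) * P(B)
independent <- isTRUE(all.equal(p_a_and_b, p_a * p_b))

print(p_a_given_b)
print(independent)
```

the multiplication rule is the same formula rearranged: P(A and B) = P(B) * P(A | B).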

unit 5: inference
- using sample data to draw conclusions

module 10: introduction
- forms of statistical inference

module 11: inference for one variable
- estimations, hypothesis testing

module 12: inference for relationships
- cases I, II, III
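
for a taste of module 11, here is a minimal sketch of estimation and hypothesis testing for one variable using R's built-in t.test. the sample values and the hypothesized mean of 5 are made up for illustration.

```r
# made-up sample of 8 measurements
x <- c(5.1, 4.9, 5.6, 5.8, 5.2, 5.5, 4.7, 5.4)

# one-sample t test of the null hypothesis that the true mean is 5;
# by default this also gives a 95% confidence interval for the mean
result <- t.test(x, mu = 5)

print(result$estimate)   # the sample mean (point estimate)
print(result$conf.int)   # the 95% confidence interval
print(result$p.value)    # the p-value for the test
```
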
- sparrowrose > pradtf July 23, 2010 at 1:39pm
  
  "which will be conducted in its own separate thread"
  
  Will you link to that thread from this one to be sure that no one accidentally misses it?
  
  Thank you!
- pradtf > sparrowrose July 23, 2010 at 1:53pm
  
  we certainly will sparrowrose!
  it'll likely be up a bit later tonight.
  
  in friendship,
  prad
  
Frugivore Freelee July 13, 2010 at 3:35pm

Wow this is very exciting Prad! Thanks to you and veganmama for organising this :)
