Course description:
|
Statistics, especially biostatistics, aims to describe nature by reducing it to a set of variables, modelling these variables to quantify their relationships,
and choosing the best model among possible alternatives to facilitate decision-making. There are many different ways to formulate models, to characterize the parameters
of the models, and to evaluate alternative models based on different criteria. These approaches reflect different modes of statistical thinking, which have led to the development of different
statistical protocols and significance tests. This course examines the statistical thinking that leads to the least-squares, maximum likelihood, and Bayesian inference
methods that are the foundation of modelling, model selection, and significance tests. I illustrate statistical thinking with various statistical methods and their
applications. Particular features of the course are that
- you will be guided to do all hand calculations (with the help of a spreadsheet), from simple linear regression to more complicated methods including generalized linear models and multivariate statistics such as PCA, canonical correlation, etc.
- you will write R scripts to implement the statistical methods you have learned
- you will then compare your implementation against existing R packages
- you are encouraged to develop new methods not available in R or not implemented well in R
Through these protocols, students will gain a sound understanding of statistical rationales in the framework of likelihood methods and Bayesian inference,
and develop the potential to contribute to biostatistics. The course is intended for highly motivated students and is designed to be highly interactive.
|