Online – 20 hours; 16-20 December 2024
The course’s first session is a reminder of linear regression and analysis of variance, the two statistical techniques that constitute the basis of the most important frames of statistical models. These statistical models (GLM, GLZ and GAM) will be the content of the next four sessions.
GLM (General Linear Model) will be studied in the second and third sessions. Due to its robustness and relative simplicity, this is the most widely used statistical model frame. For this reason, in the course we spend more time with GLM than with the others modelling frameworks (GLZ, GAM). Through worked examples will be explained the following topics: I) GLM assumptions, how to test them and how to solve problems when certain of these assumptions are not met (data transformation for non-normality, alternative fitting by generalized least squares for variance heteroscedasticity and residual autocorrelation); II) contrast of hypothesis for the main effects and post-hoc tests for the simple effects; III) visualization of model predictions.
In the GLM will be explained with particular detail the inclusion of random factors (simple mixed effects models, hierarchical designs, and random slopes designs).
GLZ (Generalized Linear Models) are an extension of GLM for error distributions that could be different than Gaussian. Three worked examples will be shown for different distributions (binomial Poisson and negative binomial, for over-dispersed data), and a simple example including random factors.
GAM is a statistical model frame that could incorporate any features of GLM and GLZ. But, in addition, and considering additional restrictions, they could incorporate smoothed functions (non-linear but non-parametric) to describe the relationship between the response variable and one or several predictor variables. This statistical model frame has been gaining popularity in different fields of science due to the increase in computer power over the last 20 years.
At least a beginner’s level in R language is recommended for this course, and also some familiarity with basic statistical techniques (regression, contrast of hypothesis, ANOVA).
The course is open to any level from undergraduate students to senior researchers. It is considered as the “second part” of another course that is regularly taught in CIIMAR entitled Introductory Statistics for Biological Sciences.
The course contents are short theoretical introductions to each topic followed by worked examples in R language, throughout which all features of the specific topic are shown. At the end of each section, the students will be provided data to perform exercises that will be revised and corrected during the lessons.
The theoretical explanations, as well as the worked examples are fully developed in a pdf, with R scripts with the examples as additional support. Access to all this material will be provided days before the course through a link to the Open Science Framework platform. Solutions to the exercises will also be delivered to the students by the end of the course.
Overview of the main statistical model frames to understand the differences between them and consequently, their specific contexts of applicability.
Check the course’s programme here.
Price: 250 € (200 € for CIIMAR/U.Porto members/CCMAR)
Registration: after announcement, up to fill 25 available positions.
Steps to register:
Aldo Barreiro Felpeto is a researcher at the Interdisciplinary Centre of Marine and Environmental Research (CIIMAR) associated to the University of Porto (Porto, Portugal). His research career has focused in plankton ecology. He defended his Ph.D. dissertation in 2007 in the Department of Ecology at the University of Vigo (Vigo, Spain) about interactions between zooplankton and toxic phytoplankton species from the Spanish NW Atlantic coast, southern Baltic sea and southern Tirreno coast. In 2008-2010, he performed a post-doctorate in the Department of Ecology and Evolutionary Biology at Cornell University (Ithaca, New York, USA). Since 2011 he is a researcher at CIIMAR.
He developed a strong background in statistics and dynamic modelling with R software, attending 10 courses in the period 2006-2018 and since 2013, organizing 14 editions of courses about different aspects of statistics and programming with R, mostly in CIIMAR, but also in the University of Vigo (Spain) and the University of Magallanes (Chile). He co-authored two books about statistics and programming: Tratamiento de Datos (Ed. Díaz de Santos, Madrid, 2006) and Tratamiento de Datos con R, SPSS y ESTATISTICA (Ed. Díaz de Santos, Madrid, 2010).
Due to his expertise in statistics and programming, he has developed collaborations in different fields of ecology, but also environmental sciences and molecular biology. He has published 60 articles, accounting for an h index of 27 and an i10 index of 48.
Website by: Glitz Design