Course description

Block 1

Principles of Biostatistics - M. Pagano (Harvard T.H. Chan School of Public Health)

Introduces the fundamental principles of statistics applied to biomedicine. The topics to be covered include: descriptive statistics, measures of central tendency, probability, diagnostic testing, population and sample, comparison of proportions. At the end of the course, students will be able to understand the descriptive statistical methodologies which are used in clinical and epidemiological studies and to utilize the estimates obtained from suitably selected samples, in order to draw statistical inferences.

Linear Regression for Medical Research - R. Bellocco (University of Milano-Bicocca and Karolinska Institutet)

The course introduces students to the practice and application of regression modeling. Through the use of Stata®, students will learn how to fit a regression model and how to estimate and test regression coefficients. Particular emphasis will be placed on the interpretation of the regression coefficients of continuous and categorical predictors. Analysis of variance models and their correspondence with regression models will also be covered together with procedures and issues in model selection, including confounding and interaction. Model building, goodness of fit, residual analysis, and appropriate regression diagnostics will be discussed.

Causal Inference in Epidemiology - Stijn Vansteelandt (Ghent University)

Within fields spanning drug testing, epidemiology and social sciences, researchers are often faced with the challenge of assessing the effect of an exposure on an outcome. Standard statistical methods are commonly used for this purpose, but they are often not targeted towards the causal question of interest, and can even be misleading. In this course we will introduce modern causal inference theory for inferring causal effects from data. In particular, the course will introduce popular tools such as causal diagrams, standardisation, propensity score methods, instrumental variables methods and techniques for estimating the effects of exposures that vary over time. All theoretical concepts will be set in the context of real-life research problems taken from medicine and epidemiology. Lab sessions in Stata throughout the course will ensure that participants actively use the concepts just taught.

Block 2

Principles of Epidemiology - E. Mostofsky (Harvard T.H. Chan School of Public Health)

This course provides an introduction to the skills needed by public health professionals and clinicians to critically interpret the epidemiologic literature. It will provide participants with the basic principles and practical experience needed to develop these skills. This will be accomplished by covering the basic principles and methods of the design, conduct and interpretation of epidemiologic studies, including descriptive studies, observational analytic studies (case-control and cohort), and randomized clinical trials. In addition, the course will address the calculation and interpretation of measures of disease frequency and association; the assessment of association versus causation in the interpretation of study results; and an introduction to issues related to the evaluation of chance, bias, confounding, and effect modification. Lectures will be complemented by seminars devoted to case studies, exercises, or critiques of relevant examples of epidemiologic studies.

Logistic Regression for Medical Research - D. Wypij (Harvard T.H. Chan School of Public Health)

This course introduces students to the practice and application of logistic regression modeling for binary outcomes. Students will fit, evaluate, and interpret binary data models arising from epidemiological studies, clinical trials, or other application areas. Topics include assessment of confounding and effect modification, use of indicator variables, model building methods, goodness-of-fit assessment, presentation of logistic regression models for reports and publications, and an introduction to conditional and ordinal logistic regression. Data sets from the medical and public health literature will be used as case studies to be analyzed using the Stata® statistical package.

Mediation Analysis - A. Bellavia (Harvard T.H. Chan School of Public Health)

The course will introduce traditional and new methods for mediation analysis. These methods are commonly used to assess social and biological pathways by which causal effects operate. Fundamentals of mediation analysis will be presented for dichotomous, continuous, and time-to-event outcomes, and we will discuss when the standard approaches to mediation analysis are or are not valid. The relationship between traditional methods for mediation in the biomedical and the social sciences and new methods in causal inference will be discussed. The course will also introduce some of the recent developments in the field, including extensions to incorporate multiple mediators and interactions. Stata macros and commands to implement these techniques will be presented, and several applications from epidemiology and the social sciences will be illustrated and discussed. Basic knowledge of linear and logistic regression is recommended.

Block 3

Statistical methods for population-based cancer survival analysis - P. Dickman (Karolinska Institutet) & P. Lambert (University of Leicester and Karolinska Institutet)

The course will address the principles, methods, and application of statistical methods to studying the survival of cancer patients using data collected by population-based cancer registries. We cover central concepts, such as how to estimate and model relative survival, as well as recent methodological developments including cure models, flexible parametric models, loss in expectation of life, and estimation in the presence of competing risks. Comparison of alternative methodological approaches (e.g., to estimating and modeling relative/net survival) will be a focus of the course and participants will get the opportunity to apply and contrast a range of methods to real data. A large amount of time will be devoted to exercise sessions where Drs Lambert and Dickman along with 3 other experienced faculty members will be available to work with participants individually or in small groups. The exercise sessions will also provide an opportunity for participants to discuss their own research projects with the faculty (and with each other). We encourage potential participants to read the detailed course description at

Block 4

Research Methods in Health: Biostatistics - M. Bonetti (Bocconi University)

This course is designed to provide the student with an understanding of the foundations of biostatistics and of the various statistical techniques that have been developed to answer research questions in the health sciences. Students will be introduced to methods for the comparison of outcomes between two groups (t-test and nonparametric tests), as well as the extension to the comparison of outcomes across several groups (ANOVA); methods for the study of association between two continuous variables (correlation and linear regression); the analysis of contingency tables; and the study of survival (time-to-event) data. The afternoon sessions are devoted to discussion and to learning how to use Stata® to implement the material covered in the morning lectures.

Longitudinal Data Analysis - G. Fitzmaurice (Harvard T.H. Chan School of Public Health)

This course focuses on methods for analyzing longitudinal and repeated measures data. The defining feature of longitudinal studies is that measurements of the same individuals are taken repeatedly through time, thereby allowing the direct study of change over time. This type of study design encompasses epidemiological follow-up studies as well as clinical trials. The course covers many well-established methods for the analysis of longitudinal data when the response variable is continuous. Methods for discrete response variables (e.g., repeated binary responses and counts) are introduced, but not emphasized. An introductory course in biostatistics and a good background in linear regression analysis are prerequisites for this course.

Block 5

Research Methods in Health: Epidemiology - M. Mittleman (Harvard T.H. Chan School of Public Health)

This course will explore in greater depth the fundamental epidemiologic concepts introduced in Principles of Epidemiology (Week 1). The course will be taught with an emphasis on causal inference in epidemiologic research. Topics will mainly focus on chronic disease epidemiology, with a special emphasis on practical study design. Epidemiologic examples from major chronic diseases/conditions (e.g. heart disease and cancer) will be discussed. Students will revisit the issues of confounding, selection bias, effect modification, and generalizability in the context of these topics. Lectures will be augmented by workshops to illustrate practical examples in the epidemiologic literature. Students entering this course are assumed to be familiar with the material covered in Principles of Epidemiology.

Survival Analysis - N. Orsini (Karolinska Institutet)

The course introduces statistical methods for survival analysis, that is, the analysis of studies where the outcome is a time-to-event. Measures covered are survival probabilities, survival percentiles, and rates. The methods include the Kaplan-Meier survival curve, Poisson regression, Cox regression, and Laplace regression. Modelling strategies include flexible modelling of quantitative predictors and interaction analysis. The concepts and methods are illustrated through real-life examples taken from medical, epidemiological, and public-health research. The emphasis is placed on interpretation and practical relevance. Guided, hands-on computer activities enable the participants to utilize the presented statistical methods.

Joint Modelling of Longitudinal and Survival Data - M. Crowther (University of Leicester)

The joint modelling of longitudinal and survival data has been an area of growing interest in recent years, with the benefits of the approach becoming recognised in ever widening fields of study. The models can provide an effective way of conducting an analysis of a survival endpoint (e.g. time to death) influenced by a time-varying covariate measured with error, or alternatively can correct for non-random dropout in the analysis of a longitudinal outcome (e.g. a biomarker such as blood pressure). This week-long course will provide an introduction to joint modelling through real applications to both clinical trial data and electronic health records, using examples in cancer, liver cirrhosis and cardiovascular disease. We will study the methodological framework, underlying assumptions, estimation, model building and predictions. We will also consider current developments in the field, looking at some of the many extensions of the standard framework, such as the ability to model multiple biomarkers and competing risks. The course will consist of lectures, classroom exercises, and computing exercises making use of the stjm and merlin packages in Stata, written by the course lecturer.

An Introduction to Sample Surveys for Health - M. Pagano (Harvard T.H. Chan School of Public Health)

Information obtained from surveys provides the basis for measuring, monitoring and advancing public health. It is thus critical that we utilize high-quality surveys. Over the past decade, the use of online surveys and mobile data collection has skyrocketed. We can now conduct research for a fraction of the cost and time it used to take, but the principles of a good survey remain the same. We will spend the week learning what makes a good survey and how to analyze the results properly.

Block 6

Public health emergency preparedness and response - E. Savoia (Harvard T.H. Chan School of Public Health)

This course provides an introduction to emergency preparedness and response to health threats including natural disasters, infectious diseases, acts of terrorism, biological, chemical, nuclear, and radiological events. The course presents the role of various sectors within the emergency preparedness response system including public health, healthcare providers, emergency management and civil protection personnel. Risk sciences, public health practice and public health surveillance as they relate to natural and manmade disasters will be discussed. Emergency risk communication guidelines will be presented as well as examples of community engagement approaches and crisis management. Current and past disasters in Europe, Africa and North America will be examined in the form of case studies. Participants will apply the course content to a simulation of a disaster and learn how to develop an after action report. Through lectures, discussion, and case studies, participants in this course will develop a broad theoretical and practical understanding of the field of public health emergency preparedness. Participants will be exposed to various ways to think about, measure, assess and compare crises, as well as ways to prepare public health systems for major emergencies. We encourage potential participants to read the detailed course description at

Stata 1

Basics of Stata® - B. Pongiglione (Institute of Education, University College London)

This course is designed to introduce students to the basics of Stata. It will focus on the minimum set of commands everyone should know to organize their own work. Specific topics include data management, data reporting, graphics and basic use of do-files. By the end of this one-day course, the student should be capable of using Stata independently.

Meta-analysis using Stata® - L. Ciccolallo (European Food Safety Authority)

The aim of this course is to provide an overview of methods to perform meta-analysis. We will cover the following topics: data preparation and imputation, fixed-effect and random-effects models, forest plots, heterogeneity across studies, publication bias, sensitivity analysis, meta-regression models and dose-response meta-analysis.

Analysis of prospective studies with Stata® - F. Ghilotti (Karolinska Institutet)

This course is designed to introduce students to the analysis of cohort studies: managing person-time, estimating counts and incidence rate ratios for both fixed and time-varying exposures, and fitting count regression models. By the end of the course, the student will be familiar with these epidemiological techniques using Stata.

Stata 2

Basics of Stata® - F. Gallo (Local Health Authority of Cuneo, Epidemiology Division)

This course is designed to introduce students to the basics of Stata. It will focus on the minimum set of commands everyone should know to organize their own work. Specific topics include data management, data reporting, graphics and basic use of do-files. By the end of this one-day course, the student should be capable of using Stata independently.

Data Visualization with Stata® - G. Capelli (University of Cassino and Southern Lazio)

The course introduces students to the logic and the strategies for visualizing data in Stata. Topics include how to choose the most appropriate type of graph (distributional, compositional or correlational) for different data and aims, and tips and tricks for preparing data for different graphical schemes. In particular, the power and flexibility of multiple "layers" in twoway Stata panels will be exploited. By the end of this one-day course, students will be able to produce Stata graphs and export them to JPG, TIFF or PDF formats for further applications.

Epi tables using Stata® - A. Discacciati (Karolinska Institutet)

This course is designed to introduce students to basic Stata commands useful in epidemiological research: descriptive statistics to estimate the incidence of a binary response and to characterize the demographic information supplied by study participants; statistical tests to identify univariate predictors associated with the binary response; graphs of the incidence of a binary response as a function of a predictor; and tables of standardized means and proportions.

Multiple Imputation using Stata® - N. Orsini (Karolinska Institutet)

The course provides a practical overview of methods to estimate missing data. The course will introduce the basics of multiple imputation, in particular imputation by chained equations. By the end of this one-day course, participants should be capable of analysing data by multiple imputation in Stata. Students should have a background in linear regression methods prior to taking this course.