An 8-week course

2020’s outbreak of COVID-19 has affected many things, and one of them is the development of HSCI 3117 – due to funding limitations, no major overhaul of the course will occur before it is offered in Spring 2021. However, COVID-19 has re-invigorated my commitment to the importance of a statistically literate citizenry. Therefore, I hope to push forward with a bare-bones version of a new statistical-literacy-focused course.

One other bombshell was dropped – the course will now be an 8-week course. I understand why the university may prefer the 8-week format, and I understand why many students do as well, but I’m not entirely convinced the format is in the best interests of students’ learning. Asking students to spend 14 hours a week never seems to pay off quite twice as much as asking them to spend 7 hours a week – there are diminishing returns.

After reviewing the original 15-week format, I can’t bring myself to cut any material, even though I know I am going to be asking a lot of my students – this is a literacy course, and as such, is reading intensive. The new weeks 2 and 3 require reading approximately 100 pages, which may startle statistics students but is typical for social science students – I don’t know how health science students will react. My goal is for students to acquire a general understanding of principles and terms, and I will, via discussion activities, help them navigate the nuances of the key concepts. To help them understand why these statistical concepts are important, I plan to include readings from Stephen Stigler’s Seven Pillars of Statistical Wisdom in addition to Harvey Motulsky’s Intuitive Biostatistics as the main textbooks.

The plan is to rely heavily on the readings for each week, readiness quizzes to ensure basic comprehension, and discussion activities in which I, as an active participant, will be able to refine my students’ understanding.

hsci-3117-spring-2021-draft-syllabus Download

hsci-3117-s2021-proposed-schedule-8wk-syllabus Download

assigned-readings Download

Textbook Choice

When I last contemplated the curriculum I wished to adopt, there was one textbook I didn’t consider, because it was being used for a Master’s level course in the department – Harvey Motulsky’s Intuitive Biostatistics (IB). However, that course has since switched textbooks, leaving me free to consider IB for 3117.

I requested an exam copy of the book to assess its suitability for 3117. The chapters are short, introduce important vocabulary with examples, avoid mathematical formulas, and, most importantly, cover nearly all of the content I was hoping to cover in 3117.

Of the 46 chapters in the book, 40 appeared to fit my original conception for 3117. Furthermore, the only things I intend to supplement are resources focusing on measurement error, randomization in experiments, simulation-based tests, and perhaps Poisson regression.

One added advantage is that the IB curriculum has been adopted for biostatistics literacy courses at three universities by expert statistics educators. Choosing IB for 3117 therefore gives me a small yet excellent community of instructors to seek advice from.

IB-notes Download

hsci-3117-s2021-proposed-syllabus-ib Download

Weekly Learning Objectives

Having settled on content and course learning objectives, I began to plan out weekly topics and learning objectives. I searched several textbooks for ideas on how to sequence topics, with an ulterior motive of perhaps selecting one for students.

However, none of the textbooks aligned with my combination of context and approach. I then recalled research in statistics education stating that the order of topics probably doesn’t matter all that much.

Although I wanted to make evidence-based decisions in course design, I decided to simply rely on my own experience and intuition to sequence topics. In hindsight, I might have reached out to other biostatistics instructors I know for their courses’ weekly learning objectives.

Once I finished sequencing topics, I used Bloom’s Taxonomy to help me create weekly learning objectives. Because of the literacy focus, I used the verbs Identify, Explain, Describe, and Evaluate repeatedly.

As a final check of alignment, I mapped each of my weekly objectives to the four course objectives. There was relative balance, with each course objective mapped to between 14 and 17 of the 47 total weekly learning objectives.

HSCI 3117 Proposed Syllabus Download

Learning objectives: Literacy or computing?

I recently met two colleagues to solicit their advice on course creation, and the first thing they asked me was ‘are your students expected to run the models themselves?’

My first gut answer was ‘no’ – I want to focus on developing students’ statistical reasoning and literacy. What did the original course creators have in mind? The course description is:

Biostatistics for health science professionals. Concepts and methods, including confidence intervals, ANOVA, multiple and logistic regression, and non-parametric analyses. Scientific literature is used to provide a comprehensive context in which analytical evidence is employed to support practices in the health sciences.

The last sentence seems to imply a focus on literacy, but does ‘methods’ imply computation? The course objectives are that students in the course will learn to:

Apply biostatistical concepts, including: probability, distribution, confidence intervals, inference, hypothesis testing, P-value and confidence interval.
Apply numerical, tabular, and graphical descriptive techniques to health sciences data.
Conduct appropriate statistical procedures to test null hypotheses.
Appraise statistical results in health science research articles and reports.

Again, ‘appraise’ strikes me as a desire for literacy, but ‘conduct’ seems a clear indicator that computing is also an expectation. However, the weekly learning objectives from units focusing on ANOVA, multiple linear regression, and logistic regression tell a different story:

Interpret estimates from a one-way and two-way analysis of variance.
Appraise procedures used to address the problem of multiple comparisons.
Explain prediction models that are grounded in the population regression line.
Interpret regression coefficients from a multiple regression model.
Calculate and interpret chi-square test statistics.
Interpret results from unadjusted logistic regression models.
Identify the reasons for conducting a non-parametric test.
Interpret estimates from non-parametric tests including the Wilcoxon Signed-Rank test, the Wilcoxon Rank Sum test and Kruskal-Wallis test.

‘Interpret’, ‘Appraise’, ‘Identify’, and ‘Explain’ all seem like literacy skills to me, and the only computation that is being asked is a chi-squared test statistic, as well as (in other weeks) odds ratios, risk ratios, relative risks, and other descriptive statistics.

Perhaps this is the right place to draw the line between computation and literacy – I will ask my students to ‘compute’ descriptive statistics, but focus on literacy with regard to inference, modelling, and hypothesis testing.
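To make the ‘compute’ side of that line concrete, here is a minimal sketch of the hand calculations mentioned above (relative risk, odds ratio, and a Pearson chi-squared statistic) for a single 2x2 table. Python is only an assumption for illustration, not a decision about course software; the function name and the counts are hypothetical and do not come from any study.

```python
# Hypothetical 2x2 table (illustrative counts only):
#                outcome   no outcome
# exposed           a          b
# unexposed         c          d

def two_by_two_measures(a, b, c, d):
    """Return the relative risk, odds ratio, and Pearson chi-squared
    statistic (no continuity correction) for a 2x2 table."""
    n = a + b + c + d

    # Risk of the outcome within each exposure group.
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)

    relative_risk = risk_exposed / risk_unexposed
    odds_ratio = (a * d) / (b * c)

    # Chi-squared: sum over cells of (observed - expected)^2 / expected,
    # where expected = row total * column total / grand total.
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    chi_squared = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi_squared += (observed[i][j] - expected) ** 2 / expected

    return {
        "relative_risk": relative_risk,
        "odds_ratio": odds_ratio,
        "chi_squared": chi_squared,
    }


measures = two_by_two_measures(a=30, b=70, c=15, d=85)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Everything here is arithmetic on counts, which is roughly the level of computation the existing weekly objectives ask for; interpreting the resulting numbers is where the literacy work begins.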

I am curious, though, whether it is possible to focus on both literacy and computation in the same course. I imagine students becoming stressed and annoyed with debugging in whichever software we might use. Whichever software I do choose ought to be able to do all the calculations I expect students to do in 2117 and 3117, and should be used in both courses.

A two-course sequence

I’ve increasingly come to think of HSCI 2117 in terms of the statistical reasoning I wish to develop in my students and the specific tools I wish to equip my students with.

I think of the reasoning goals as focusing successively on reasoning about data, reasoning about variability, reasoning about covariation, and finally reasoning about statistical inference.

I think of the content of HSCI 2117 in terms of descriptive and inferential tools for different combinations of types of variables in the univariate and bivariate case. For example:

frequency tables, bar charts, and binomial and multinomial tests for a single categorical variable
summary tables, histograms, and T tests for a single quantitative variable
joint and marginal frequency tables, stacked bar charts and line charts, and chi-squared tests for two categorical variables
marginal summary tables, side-by-side boxplots, and ANOVA with multiple comparisons for one categorical variable and one quantitative variable
summary tables, scatterplots, and T tests for correlation or ANOVA for a model for two quantitative variables
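
One way to picture this framework is as a lookup from the types of variables involved to the tools that apply. The sketch below is only an illustration in Python: the dictionary layout and the names `TOOLS_BY_VARIABLES` and `tools_for` are hypothetical, while the entries themselves are taken from the list above.

```python
# Keys are sorted tuples of variable types; values group the tools
# from the list above into tables, graphs, and inferential tests.
TOOLS_BY_VARIABLES = {
    ("categorical",): {
        "tables": ["frequency tables"],
        "graphs": ["bar charts"],
        "tests": ["binomial test", "multinomial test"],
    },
    ("quantitative",): {
        "tables": ["summary tables"],
        "graphs": ["histograms"],
        "tests": ["T test"],
    },
    ("categorical", "categorical"): {
        "tables": ["joint and marginal frequency tables"],
        "graphs": ["stacked bar charts", "line charts"],
        "tests": ["chi-squared test"],
    },
    ("categorical", "quantitative"): {
        "tables": ["marginal summary tables"],
        "graphs": ["side-by-side boxplots"],
        "tests": ["ANOVA with multiple comparisons"],
    },
    ("quantitative", "quantitative"): {
        "tables": ["summary tables"],
        "graphs": ["scatterplots"],
        "tests": ["T test for correlation", "ANOVA for a model"],
    },
}


def tools_for(*variable_types):
    """Look up the tools for one or two variables, in any order."""
    key = tuple(sorted(variable_types))
    return TOOLS_BY_VARIABLES[key]


# The order of the arguments does not matter.
print(tools_for("quantitative", "categorical")["tests"])
```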

What is the best way to extend this foundation in HSCI 3117, a ‘second course’ in statistics? Most of the research I’m familiar with in the statistics education literature focuses on the first course in statistics. It seems we haven’t gotten around to second courses yet because the first course is still an open problem and we’re not yet satisfied with our solutions.

Using the framework of reasoning and content, I think reasoning about models and modelling and reasoning about sampling variability are the two most obvious pieces to develop in addition to a re-emphasis on reasoning about covariation and reasoning about statistical inference.

In terms of content, we would then cover combinations of categorical and quantitative variables as independent and dependent variables in the tri-variate case. This would include multiple linear regression (including multicollinearity and interaction terms), 2-way ANOVA, ANCOVA (and regression with indicator variables), logistic regression (including adjusted odds ratios), and Cochran-Mantel-Haenszel procedures.
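
As a small illustration of what the tri-variate case adds, the sketch below computes a Mantel-Haenszel pooled odds ratio across the levels of a third (stratifying) variable and compares it with the crude odds ratio from the collapsed table. This is one of the Cochran-Mantel-Haenszel procedures named above, written in Python purely for illustration; the function names are hypothetical and the counts are invented.

```python
# Each stratum is a 2x2 table (a, b, c, d):
#   a = exposed with outcome,    b = exposed without outcome
#   c = unexposed with outcome,  d = unexposed without outcome

def mantel_haenszel_odds_ratio(strata):
    """Pooled odds ratio: sum(a*d/n) / sum(b*c/n) over the strata."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


def crude_odds_ratio(strata):
    """Odds ratio after collapsing all strata into a single 2x2 table."""
    a = sum(s[0] for s in strata)
    b = sum(s[1] for s in strata)
    c = sum(s[2] for s in strata)
    d = sum(s[3] for s in strata)
    return (a * d) / (b * c)


# Hypothetical counts for two levels of a third variable.
strata = [
    (10, 40, 5, 45),   # stratum 1
    (40, 10, 30, 20),  # stratum 2
]

adjusted = mantel_haenszel_odds_ratio(strata)
crude = crude_odds_ratio(strata)
print(f"adjusted (Mantel-Haenszel) OR: {adjusted:.2f}")
print(f"crude OR: {crude:.2f}")
```

With these made-up counts the adjusted and crude odds ratios differ, which is the kind of contrast that motivates moving from two variables to three.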

In a previous post, I discuss common methods in public health literature. Therefore, in addition to the methods above, I feel I should include Cox Proportional Hazards Regression, non-parametric methods, and survival analysis.

There is a paucity of research on the second course in statistics. I do not know what is typical. However, this approach essentially means that HSCI 2117 will hopefully build a foundation of literacy and reasoning while HSCI 3117 will build students’ statistical skills.

References and further reading:

Garfield, J., & Ben-Zvi, D. (2008). Developing students’ statistical reasoning: Connecting research and teaching practice. Springer Science & Business Media.

Rao, V.N.V. (2019, March 31). Deciding course content [Blog post]. Retrieved from https://statisticaljourneys.home.blog/2019/03/31/deciding-course-content/

Deciding course content

Although I view 2117 as a statistics class, its title, Introduction to Statistics for the Health Sciences, reminds me that it is statistics situated within a specific context. 3117’s title, Principles of Biostatistics, makes this even clearer.

As such, I want the specific statistical skills students walk out of the course with to be well-aligned with the tasks they might be asked to do in their careers. How should I decide on content for the course and how do I anticipate what students will need?

I have professional experience as a statistician in the health sciences, and largely drew upon that experience in the past to decide on relevant content for this course. However, I wanted a stronger evidence base for my decisions. Luckily, a recent study by Hayat et al. (2017) sampled published papers in public health journals. I could use this to decide the typical basket of tools my students might need to be familiar with.

Hayat et al. (2017) found the following epidemiological terms common:

Prevalence
Relative Risk
Odds Ratio
Incidence
Mortality
Hazard Ratio

Hayat et al. (2017) found the following statistical tests common (including p-values and confidence intervals):

T-test
Chi-squared Test / Exact Test (presumably for contingency tables)
Correlation tests
Non-parametric tests

Hayat et al. (2017) found the following statistical models common:

ANOVA
Linear Regression
Logistic Regression
Poisson Regression
Cox Proportional Hazards Regression
Generalized Linear Mixed Models

These results gave me a frame from which I could choose content for 2117 and 3117. I will introduce prevalence, relative risk, odds ratios, and incidence in 2117, and introduce mortality and hazard ratios in 3117. I will introduce p-values, confidence intervals, the T-test, the Chi-squared test, and correlation tests in 2117, and introduce non-parametric tests in 3117. I will introduce ANOVA and linear regression in 2117, and generalized linear models in 3117.

While I certainly don’t want to automatically perpetuate the status quo, if I wish my students to be statistically literate, especially in the field of the health sciences, then they must be familiar with these methods. However, I struggle to balance this traditional base against exposing my students to more modern methods. It seems unlikely that many will take more statistics courses in their lives. Is it my responsibility to seize this opportunity to expose them now, at the risk of leaving them without the common skills prevalent in their field?

Nevertheless, with these content learning objectives in place, I could now move on to the specific statistical reasoning learning objectives and content sequencing.

References and further reading:

Hayat, M. J., Powell, A., Johnson, T., & Cadwell, B. L. (2017). Statistical methods used in the public health literature and implications for training of public health professionals. PLoS ONE, 12(6), e0179032.

A second journey begins

Welcome to my blog! I hope to use this space to document my thoughts, deliberations, and decisions as I update a statistics course I teach. This page will focus on my experiences teaching and designing HSCI 3117 Principles of Biostatistics for the Health Sciences.

The course: HSCI 3117 is an online second course in statistics offered in the Department of Clinical Research and Leadership at George Washington University’s School of Medicine and Health Sciences. It was first offered in 2016 and is scheduled to run every fall term. HSCI 2117 is a pre-requisite for students taking HSCI 3117.

My experience: I began teaching the course in the fall of 2017, the second term in which it was offered. In 2018 I was hired as the course director, and that same year we began to consider revamping the course in order to better align it with HSCI 2117 and to create a two-course sequence as opposed to parallel courses at different levels. This has led, bringing us to the present day, to a near-total re-conceptualization that I am now beginning to design and implement.