# Learning objectives: Literacy or computing?

I recently met two colleagues to solicit their advice on course creation, and the first thing they asked me was ‘are your students expected to run the models themselves?’

My first gut answer was ‘no’ – I want to focus on developing students’ statistical reasoning and literacy. What did the original course creators have in mind? The course description is:

Biostatistics for health science professionals. Concepts and methods, including confidence intervals, ANOVA, multiple and logistic regression, and non-parametric analyses. Scientific literature is used to provide a comprehensive context in which analytical evidence is employed to support practices in the health sciences.

The last sentence seems to imply a focus on literacy, but does ‘methods’ imply computation? The course objectives are that students in the course will learn to:

• Apply biostatistical concepts, including: probability, distribution, confidence intervals, inference, hypothesis testing, P-value and confidence interval.
• Apply numerical, tabular, and graphical descriptive techniques to health sciences data.
• Conduct appropriate statistical procedures to test null hypotheses.
• Appraise statistical results in health science research articles and reports.

Again, ‘appraise’ strikes me as a desire for literacy, but ‘conduct’ seems a clear indicator that computing is also an expectation. However, the weekly learning objectives from units focusing on ANOVA, multiple linear regression, and logistic regression tell a different story:

• Interpret estimates from a one-way and two-way analysis of variance.
• Appraise procedures used to address the problem of multiple comparisons.
• Explain prediction models that are grounded in the population regression line.
• Interpret regression coefficients from a multiple regression model.
• Calculate and interpret chi-square test statistics.
• Interpret results from unadjusted logistic regression models.
• Identify the reasons for conducting a non-parametric test.
• Interpret estimates from non-parametric tests including the Wilcoxon Signed-Rank test, the Wilcoxon Rank Sum test and Kruskal-Wallis test.

‘Interpret’, ‘Appraise’, ‘Identify’, and ‘Explain’ all seem like literacy skills to me. The only computation being asked for is a chi-squared test statistic, along with (in other weeks) odds ratios, relative risks (risk ratios), and other descriptive statistics.

Perhaps this is the right place to draw the line between computation and literacy – I will ask my students to ‘compute’ descriptive statistics, but focus on literacy with regard to inference, modelling, and hypothesis testing.
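As a rough illustration of the ‘compute’ side of that line, the sketch below works through a relative risk, an odds ratio, and a Pearson chi-squared statistic for a single 2×2 table. The counts are made up, the function name `two_by_two_summary` is hypothetical, and Python is used only for illustration; none of this reflects a software decision for the course.

```python
# Hypothetical 2x2 table (made-up counts, for illustration only):
#                 outcome   no outcome
# exposed           a = 30      b = 70
# unexposed         c = 15      d = 85

def two_by_two_summary(a, b, c, d):
    """Return the relative risk, odds ratio, and Pearson chi-squared
    statistic (no continuity correction) for a 2x2 table of counts."""
    n = a + b + c + d

    # Relative risk: risk of the outcome among exposed / among unexposed.
    relative_risk = (a / (a + b)) / (c / (c + d))

    # Odds ratio: cross-product ratio of the table.
    odds_ratio = (a * d) / (b * c)

    # Chi-squared: sum over cells of (observed - expected)^2 / expected,
    # where expected = row total * column total / grand total.
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    chi_squared = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi_squared += (observed[i][j] - expected) ** 2 / expected

    return {
        "relative_risk": relative_risk,
        "odds_ratio": odds_ratio,
        "chi_squared": chi_squared,
    }


summary = two_by_two_summary(30, 70, 15, 85)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

For these made-up counts the relative risk is 2.0, the odds ratio is about 2.43, and the chi-squared statistic is about 6.45. Interpreting those numbers, rather than producing them, is where the literacy objectives take over.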

I am curious, though, whether it is possible to focus on both literacy and computation in the same course. I imagine students becoming stressed and annoyed by debugging in whichever software we use. Whatever software I choose ought to handle all the calculations I expect of students in HSCI 2117 and HSCI 3117, and should be used in both courses.

# A two-course sequence

I’ve increasingly come to think of HSCI 2117 in terms of the statistical reasoning I wish to develop in my students and the specific tools I wish to equip my students with.

I think of the reasoning goals as focusing successively on reasoning about data, reasoning about variability, reasoning about covariation, and finally reasoning about statistical inference.

I think of the content of HSCI 2117 in terms of descriptive and inferential tools for different combinations of types of variables in the univariate and bivariate case. For example:

• frequency tables, bar charts, and binomial and multinomial tests for a single categorical variable
• summary tables, histograms, and T tests for a single quantitative variable
• joint and marginal frequency tables, stacked bar charts and line charts, and chi-squared tests for two categorical variables
• marginal summary tables, side-by-side boxplots, and ANOVA with multiple comparisons for one categorical variable and one quantitative variable
• summary tables, scatterplots, and T tests for correlation or ANOVA for a regression model, for two quantitative variables

What is the best way to extend this foundation in HSCI 3117, a ‘second course’ in statistics? Most of the statistics education research I’m familiar with focuses on the first course. It seems we haven’t gotten around to second courses yet because the first course is still an open problem and we’re not yet satisfied with our solutions.

Using the framework of reasoning and content, I think reasoning about models and modelling and reasoning about sampling variability are the two most obvious pieces to develop in addition to a re-emphasis on reasoning about covariation and reasoning about statistical inference.

In terms of content, we would then cover combinations of categorical and quantitative variables as independent and dependent variables in the trivariate case. This would include multiple linear regression (including multicollinearity and interaction terms), two-way ANOVA, ANCOVA (and regression with indicator variables), logistic regression (including adjusted odds ratios), and Cochran-Mantel-Haenszel procedures.

In a previous post (Rao, 2019), I discussed common methods in the public health literature. Therefore, in addition to the methods above, I feel I should include non-parametric methods and survival analysis, including Cox proportional hazards regression.

There is a paucity of research on the second course in statistics. I do not know what is typical. However, this approach essentially means that HSCI 2117 will hopefully build a foundation of literacy and reasoning while HSCI 3117 will build students’ statistical skills.

Garfield, J., & Ben-Zvi, D. (2008). Developing students’ statistical reasoning: Connecting research and teaching practice. Springer Science & Business Media.

Rao, V.N.V. (2019, March 31). Deciding course content [Blog post]. Retrieved from https://statisticaljourneys.home.blog/2019/03/31/deciding-course-content/

# Deciding course content

Although I view 2117 as a statistics class, its title, Introduction to Statistics for the Health Sciences, reminds me that it is statistics situated within a specific context. 3117’s title, Principles of Biostatistics, makes this even clearer.

As such, I want the specific statistical skills students walk out of the course with to be well-aligned with the tasks they might be asked to do in their careers. How should I decide on content for the course and how do I anticipate what students will need?

I have professional experience as a statistician in the health sciences, and in the past I largely drew upon that experience to decide on relevant content for this course. However, I wanted a stronger evidence base for my decisions. Luckily, a recent study by Hayat et al. (2017) sampled published papers in public health journals. I could use this to identify the typical basket of tools my students might need to be familiar with.

Hayat et al. (2017) found the following epidemiological terms to be common:

• Prevalence
• Relative Risk
• Odds Ratio
• Incidence
• Mortality
• Hazard Ratio

Hayat et al. (2017) found the following statistical tests to be common (including p-values and confidence intervals):

• T-test
• Chi-squared Test / Exact Test (presumably for contingency tables)
• Correlation tests
• Non-parametric tests

Hayat et al. (2017) found the following statistical models to be common:

• ANOVA
• Linear Regression
• Logistic Regression
• Poisson Regression
• Cox Proportional Hazards Regression
• Generalized Linear Mixed Models

These results gave me a frame from which I could choose content for 2117 and 3117. I will introduce prevalence, relative risk, odds ratios, and incidence in 2117, and introduce mortality and hazard ratios in 3117. I will introduce p-values, confidence intervals, the T-test, the Chi-squared test, and correlation tests in 2117, and introduce non-parametric tests in 3117. I will introduce ANOVA and linear regression in 2117, and generalized linear models in 3117.

While I certainly don’t want to automatically perpetuate the status quo, if I wish my students to be statistically literate, especially in the health sciences, then they must be familiar with these methods. However, I struggle to balance this traditional base against exposing my students to more modern methods. It seems unlikely that many will take more statistics courses in their lives. Is it my responsibility to seize this opportunity to expose them now, at the risk of leaving them without skills that are common in their field?

Nevertheless, with these content learning objectives in place, I could now move on to the specific statistical reasoning learning objectives and content sequencing.