My favourite number

On this blog’s homepage I state “I’ve been in love with numbers for as long as I can remember.” Even when I was a toddler I never wanted to practice reciting the alphabet – I preferred reciting numbers.

Yet, one number has always stood above the rest. It is my favourite number – 7.

Origins

I think I decided that 7 ought to be my favourite number when I was relatively young. I was 7 years old when my sister was born (technically, 6 years, 10 months, 23 days, 17 hours, and 39 minutes – and yes, I did do that calculation the very day she was born). 7 was the first jersey number I had for soccer. 7 is the sum of the digits of my birth date.

However, I remember recognizing quite early on, no later than the age of 8, that the reciprocal of 7 was the most interesting reciprocal of all the numbers up to 12.

  • 1/2 = 0.5
  • 1/3 = 0.333333…
  • 1/4 = 0.25
  • 1/5 = 0.2
  • 1/6 = 0.166666…
  • 1/7 = 0.142857142857…
  • 1/8 = 0.125
  • 1/9 = 0.1111111…
  • 1/10 = 0.1
  • 1/11 = 0.090909…
  • 1/12 = 0.083333…

The reciprocals of 2, 4, 5, 8, and 10 all have finite decimal expansions. The reciprocals of 3, 6, 9, and 12 all end with a single digit repeating ad infinitum while the reciprocal of 11 ends with a repeating two-digit sequence. Yet 1/7 was in a class of its own. I didn’t know why 7 should have such a unique decimal expansion at the time, but I was captivated by it.
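These expansions are easy to explore past a calculator's display with a few lines of long division (a sketch; `decimal_expansion` is my own name, not from any library):

```python
def decimal_expansion(numerator, denominator):
    """Decimal expansion of numerator/denominator by long division.

    Returns (digits, repetend_start), where repetend_start is the index
    at which the digits begin repeating, or None if the expansion terminates.
    """
    digits = []
    seen = {}            # remainder -> position where it first appeared
    remainder = numerator % denominator
    while remainder != 0 and remainder not in seen:
        seen[remainder] = len(digits)
        remainder *= 10
        digits.append(remainder // denominator)
        remainder %= denominator
    return digits, seen.get(remainder)  # None when the division terminated

print(decimal_expansion(1, 7))   # ([1, 4, 2, 8, 5, 7], 0)
print(decimal_expansion(1, 12))  # ([0, 8, 3], 2)
```

Tracking remainders is the key trick: the expansion repeats exactly when a remainder recurs, which also reveals where the repetend starts.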

A growing fancy

A few years later I realized that the pattern went deeper. Comparing 1/3 (0.333…) to 2/3 (0.666…), the decimal expansions have the same form – a single repeating digit – but the digit itself differs. The same holds for the fractions of 6, 9, and 12, and for the fractions of 11 with their repeating two-digit blocks. The fractions of 7, however, do something entirely different:

  • 1/7 = 0.142857142857…
  • 2/7 = 0.2857142857…
  • 3/7 = 0.42857142857…
  • 4/7 = 0.57142857…
  • 5/7 = 0.7142857…
  • 6/7 = 0.857142857…

I’ve aligned the decimal expansions to help identify the pattern. Each of the fractions has the same six digits in the same cyclic order, just with a different starting point! 7 was simply outlapping the other numbers in terms of mystique. Why were the same digits repeating for each fraction, and why were they in the same order?
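That cyclic behaviour can be verified mechanically (a small sketch, assuming nothing beyond integer arithmetic):

```python
# The repetends of k/7 are all rotations of the same six-digit string.
repetend = "142857"
rotations = {repetend[i:] + repetend[:i] for i in range(len(repetend))}

for k in range(1, 7):
    # The first six digits of k/7 after the decimal point.
    digits = str(k * 10**6 // 7).zfill(6)
    assert digits in rotations
    print(f"{k}/7 = 0.{digits}...")
```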

I began searching for other numbers with this pattern, but pen and paper, and calculators that displayed only 8 to 10 digits, proved limiting.

Getting serious with number theory

I held on to this intrigue with the number 7 into college, and approached the professor of my number theory class with the question ‘why does 1/7 have such unique patterns?’. He then showed me a whole new dimension to the number 7. It was as if I was peering through Lewis Carroll’s looking glass into a hitherto unknown world of exotic beauty.

He explained that the pattern occurs because we work in base 10, and 10 is a primitive root modulo 7. That means a run of six 9’s is the shortest run of 9’s divisible by 7 (important since 6 = 7 − 1): 999999/7 is an integer, while 9/7, 99/7, and so on are not. Any prime with this property is called a full repetend prime, and all full repetend primes exhibit the same properties I described above for the number 7. The first five full repetend primes are 7, 17 (meaning a run of sixteen 9’s is the shortest run of 9’s divisible by 17), 19, 23, and 29.
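In other words, the repetend length of 1/p is the multiplicative order of 10 mod p, and a full repetend prime is one where that order is p − 1. A rough sketch of the check (the function names are my own):

```python
def multiplicative_order(base, p):
    """Smallest n >= 1 with base**n congruent to 1 (mod p); assumes gcd(base, p) == 1."""
    n, x = 1, base % p
    while x != 1:
        x = (x * base) % p
        n += 1
    return n

def is_full_repetend_prime(p):
    """True when 10 is a primitive root mod p, i.e. the order of 10 mod p
    is p - 1 (equivalently, p - 1 nines is the shortest run of 9's
    divisible by p)."""
    return multiplicative_order(10, p) == p - 1

primes = [7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
print([p for p in primes if is_full_repetend_prime(p)])  # [7, 17, 19, 23, 29, 47]
```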

He then showed me the property of 9’s, also called Midy’s Theorem. Recall the repetend of 1/7 – the part that repeats – 142857:

  • 1+4+2+8+5+7 will be divisible by 9 (it equals 3*9)
  • 14+28+57 will be divisible by 99 (it equals 99)
  • 142+857 will equal 999

Similarly, for the repetend of 1/17, 0588235294117647:

  • 0+5+8+8+2+3+5+2+9+4+1+1+7+6+4+7 will be divisible by 9 (it equals 8*9)
  • 05+88+23+52+94+11+76+47 will be divisible by 99 (it equals 4*99)
  • 0588+2352+9411+7647 will be divisible by 9999 (it equals 2*9999)
  • 05882352+94117647 will equal 99999999
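These block sums can be checked mechanically for any full repetend prime (a sketch; `midy_sums` is my own name):

```python
def midy_sums(repetend):
    """Split the repetend into blocks of each length d dividing its length,
    sum the blocks, and report each sum as a multiple of the d-digit
    number 99...9."""
    L = len(repetend)
    results = {}
    for d in (d for d in range(1, L) if L % d == 0):
        blocks = [int(repetend[i:i + d]) for i in range(0, L, d)]
        nines = 10**d - 1
        assert sum(blocks) % nines == 0   # the property of 9's
        results[d] = sum(blocks) // nines
    return results

print(midy_sums("142857"))            # {1: 3, 2: 1, 3: 1}
print(midy_sums("0588235294117647"))  # {1: 8, 2: 4, 4: 2, 8: 1}
```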

All full repetend primes have this property. We went on to discuss many other things, including discrete logarithms and other properties of cyclic numbers and prime reciprocals, and I went on to discover and play with subclasses of the full repetend primes, but I never lost any love or interest for my favourite number, 7.

Learning objectives: Literacy or computing?

I recently met two colleagues to solicit their advice on course creation, and the first thing they asked me was ‘are your students expected to run the models themselves?’

My first gut answer was ‘no’ – I want to focus on developing students’ statistical reasoning and literacy. What did the original course creators have in mind? The course description is:

Biostatistics for health science professionals. Concepts and methods, including confidence intervals, ANOVA, multiple and logistic regression, and non-parametric analyses. Scientific literature is used to provide a comprehensive context in which analytical evidence is employed to support practices in the health sciences.

The last sentence seems to imply a focus on literacy, but does ‘methods’ imply computation? The course objectives are that students in the course will learn to:

  • Apply biostatistical concepts, including: probability, distribution, confidence intervals, inference, hypothesis testing, P-value and confidence interval.
  • Apply numerical, tabular, and graphical descriptive techniques to health sciences data. 
  • Conduct appropriate statistical procedures to test null hypotheses.
  • Appraise statistical results in health science research articles and reports. 

Again, ‘appraise’ strikes me as a desire for literacy, but ‘conduct’ seems a clear indicator that computing is also an expectation. However, the weekly learning objectives from units focusing on ANOVA, multiple linear regression, and logistic regression paint a different story:

  • Interpret estimates from a one-way and two-way analysis of variance. 
  • Appraise procedures used to address the problem of multiple comparisons. 
  • Explain prediction models that are grounded in the population regression line.
  • Interpret regression coefficients from a multiple regression model. 
  • Calculate and interpret chi-square test statistics.
  • Interpret results from unadjusted logistic regression models.
  • Identify the reasons for conducting a non-parametric test.
  • Interpret estimates from non-parametric tests including the Wilcoxon Signed-Rank test, the Wilcoxon Rank Sum test and Kruskal-Wallis test. 

‘Interpret’, ‘Appraise’, ‘Identify’, and ‘Explain’ all seem like literacy skills to me, and the only computation being asked for is a chi-square test statistic, as well as (in other weeks) odds ratios, relative risks, and other descriptive statistics.

Perhaps this is the right place to draw the line between computation and literacy – I will ask my students to ‘compute’ descriptive statistics, but focus on literacy with regards to inference, modelling, and hypothesis testing.

I am curious, though, whether it is possible to focus on both literacy and computation in the same course. I imagine students becoming stressed and annoyed with debugging whichever software we use. Whatever software I choose ought to be able to do all the calculations I expect students to do in 2117 and 3117, and should be used in both courses.

Teaching sampling variability

To my mind, the most important foundational concept in statistical inference is an appreciation of sampling variability. Chance, del Mas, & Garfield (2004) lay out a vision of what students need to understand and what they should be able to do with that understanding. However, I wouldn’t go as far as they did. I believe the core understanding rests on only a small subset of their list.

The main understanding students should have, in my opinion, is that given a population parameter, some values of a sample statistic are more likely than others to result from a sample from that population. This should manifest in a student’s ability to make statements about how far a sample statistic is likely to vary from a population parameter, and vice versa.

Developing such an understanding in students is no trivial matter. There seems to be consensus in the statistics education research community that the use of simulations can help develop students’ understanding of sampling variability (Garfield et al., 2008).

I particularly like an activity designed by Scheaffer et al. (1996) called What is a confidence interval anyway?. The instructor resources present a scatterplot relating population proportions to their likely sample proportions (Figure 8, page 274).

Printed below is an adaptation of this scatterplot, demonstrating how a student might use it to determine that the likely values of the population proportion lie between approximately 65% and 75% after finding a sample proportion of 0.70 from a sample of size 100.

I particularly like this tool as I believe it helps to frame the idea of inference quite nicely. We never know what the true population parameters are. However, the theory of sampling distributions tells us something about how sample statistics behave in relation to those parameters.

Each of the vertical bars represents the likely sample proportions we might get when we sample from a population with the given population proportion. When we take only one sample, we can never know for sure the exact value of the population parameter, but certain options begin to look increasingly unlikely. Use of this scatterplot may guide students toward a more multiplicative conception of a sample (Saldanha & Thompson, 2002).
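The chart’s logic can be mimicked with a quick simulation (a sketch of the idea, not the activity’s actual materials; the names and the 95% cutoff are my own choices): for each candidate population proportion, simulate many samples of size 100 and keep those whose middle 95% of sample proportions covers the observed 0.70.

```python
import random

def likely_population_proportions(observed_phat, n=100, trials=1000, seed=1):
    """Return the smallest and largest candidate population proportions
    whose middle 95% of simulated sample proportions covers observed_phat."""
    rng = random.Random(seed)
    plausible = []
    for pct in range(1, 100):                  # candidate p = 0.01 .. 0.99
        p = pct / 100
        phats = sorted(sum(rng.random() < p for _ in range(n)) / n
                       for _ in range(trials))
        lo, hi = phats[int(0.025 * trials)], phats[int(0.975 * trials)]
        if lo <= observed_phat <= hi:
            plausible.append(p)
    return min(plausible), max(plausible)

print(likely_population_proportions(0.70))  # roughly (0.61, 0.78)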

I believe such an activity can help improve students’ ability to make statements about how far sample statistics are likely to vary from a population parameter, and vice versa. However, by focusing on only this one learning objective, as opposed to the full list of recommendations by Chance et al. (2004), would I be doing a disservice to our students in their future work and studies in statistics, or will this indeed provide a sufficient foundation for them to become statistically literate?

References and further reading:

Chance, B., del Mas, R., & Garfield, J. (2004). Reasoning about sampling distributions. In The challenge of developing statistical literacy, reasoning and thinking (pp. 295-323). Springer, Dordrecht.

Garfield, J. B., Ben-Zvi, D., Chance, B., Medina, E., Roseth, C., & Zieffler, A. (2008). Learning to Reason About Statistical Inference. In Developing Students’ Statistical Reasoning (pp. 261-288). Springer, Dordrecht.

Saldanha, L., & Thompson, P. (2002). Conceptions of sample and their relationship to statistical inference. Educational studies in mathematics, 51(3), 257-270.

Scheaffer, R.L., Watkins, A., Gnanadesikan, M., Witmer, J.A. (1996). What Is a Confidence Interval Anyway? In Activity-Based Statistics: Instructor Resources (pp. 274-278). Springer, New York, NY.

A two-course sequence

I’ve increasingly come to think of HSCI 2117 in terms of the statistical reasoning I wish to develop in my students and the specific tools I wish to equip my students with.

I think of the reasoning goals as focusing successively on reasoning about data, reasoning about variability, reasoning about covariation, and finally reasoning about statistical inference.

I think of the content of HSCI 2117 in terms of descriptive and inferential tools for different combinations of types of variables in the univariate and bivariate case. For example:

  • frequency tables, bar charts, and binomial and multinomial tests for a single categorical variable
  • summary tables, histograms, and T tests for a single quantitative variable
  • joint and marginal frequency tables, stacked bar charts and line charts, and chi-squared tests for two categorical variables
  • marginal summary tables, side-by-side boxplots, and ANOVA with multiple comparisons for one categorical variable and one quantitative variable
  • summary tables, scatterplots, and T tests for correlation or ANOVA for a model for two quantitative variables
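For my own planning, the mapping above can be kept as a simple lookup table (a sketch; the shorthand labels are my own):

```python
# The univariate/bivariate curriculum map, keyed by tuples of variable types.
curriculum_map = {
    ("categorical",): ("frequency table", "bar chart",
                       "binomial/multinomial test"),
    ("quantitative",): ("summary table", "histogram", "t-test"),
    ("categorical", "categorical"): ("joint/marginal frequency tables",
                                     "stacked bar chart", "chi-squared test"),
    ("categorical", "quantitative"): ("marginal summary tables",
                                      "side-by-side boxplots",
                                      "ANOVA with multiple comparisons"),
    ("quantitative", "quantitative"): ("summary tables", "scatterplot",
                                       "t-test for correlation"),
}

table, chart, test = curriculum_map[("categorical", "quantitative")]
print(table, chart, test, sep=" | ")
```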

What is the best way to extend this foundation in HSCI 3117, a ‘second course’ in statistics? Most of the research I’m familiar with in the statistics education literature focuses on the first course; it seems we haven’t gotten around to second courses because the first course remains an open problem whose solutions still leave us unsatisfied.

Using the framework of reasoning and content, I think reasoning about models and modelling and reasoning about sampling variability are the two most obvious pieces to develop in addition to a re-emphasis on reasoning about covariation and reasoning about statistical inference.

In terms of content, we would then cover combinations of categorical and quantitative variables as independent and dependent variables in the tri-variate case. This would include multiple linear regression (including multicollinearity and interaction terms), 2-way ANOVA, ANCOVA (and regression with indicator variables), logistic regression (including adjusted odds ratios), and Cochran-Mantel-Haenszel procedures.

In a previous post, I discussed common methods in the public health literature. Therefore, in addition to the methods above, I feel I should include survival analysis (including Cox proportional hazards regression) and non-parametric methods.

There is a paucity of research on the second course in statistics. I do not know what is typical. However, this approach essentially means that HSCI 2117 will hopefully build a foundation of literacy and reasoning while HSCI 3117 will build students’ statistical skills.


References and further reading:

Garfield, J., & Ben-Zvi, D. (2008). Developing students’ statistical reasoning: Connecting research and teaching practice. Springer Science & Business Media.

Rao, V.N.V. (2019, March 31). Deciding course content [Blog post]. Retrieved from https://statisticaljourneys.home.blog/2019/03/31/deciding-course-content/

Teaching data collection

I recently attended a presentation on the Island and its use in a statistics class. The Island was developed by Dr Michael Bulmer, and is a simulated environment where students can practice data collection and study design.

We want our students to be able to use the statistical tools we teach in order to make data-informed decisions, which means they will need to collect data first. Unfortunately, in our classrooms we typically just hand students datasets. An appreciation of the data creation process can help improve students’ reasoning when applying statistical analyses to the data (McClain & Cobb, 2001).

In order to develop this appreciation for data collection, I have begun brainstorming activities for students using the Island, such as the following activity:

  • Select a sample of 10 people – one person chosen randomly from each of 10 different cities (any 10 cities you wish). To select your sample, choose an island, then choose a city, then randomly choose a house, and randomly choose a person in that house.
  • Record which island they live on, which city they live in, their house number, their name, their age, their gender, their systolic blood pressure, and their cholesterol level.
  • Summarize the distribution of the following traits of people in your sample: age, gender, systolic blood pressure, cholesterol level
  • Share your dataset and your summaries with your classmates.

In having students share their results, we also create a window of opportunity to discuss sampling variability, one of the cornerstone topics of any course in statistics. I am looking forward to experimenting with this resource and evaluating its efficacy in developing students’ appreciation of a dataset.

References:

McClain, K., & Cobb, P. (2001). Supporting students’ ability to reason about data. Educational Studies in Mathematics, 45(1-3), 103-129.

Visualizing Data

I’m often asked by colleagues and students, “what makes a good data visualization?”. I believe data visualizations (and any data analysis) are storytelling tools. As such, I have two criteria when I create or evaluate them:

  • (1) The visualization should only have one main pattern it tries to convey. There is a tendency to create very complex and multi-faceted visualizations. However, stories are told one plot point at a time. Similarly, visualizations should each focus on one pattern at a time within a larger narrative.
  • (2) Ignoring all labels and text, the pattern should be immediately noticeable. The value of the visualization is in serving as an aid that highlights an otherwise obscure pattern. Thus, a good aid should make the pattern strikingly obvious and universally evident.

This may seem limiting, but I believe that even complex patterns can be elicited from good data visualizations within a few seconds. TED talks by Hans Rosling and David McCandless excellently demonstrate this power.

One of my favourite examples, and one of the most famous early examples of a data visualization, is by Florence Nightingale (read more here). Her visualization, printed below, helped showcase the need for hospital sanitation.

Let’s examine Florence Nightingale’s graph under my two criteria:

Florence Nightingale’s Data Visualization

Ignoring all text, I immediately see a lot of blue. That blue area represents the number of deaths from preventable or mitigable diseases. This indeed is the pattern Florence Nightingale wished to highlight – a lot of soldiers were dying unnecessarily. Thus, this is an effective visualization of the data and helps tell a story.

Florence Nightingale’s Data Table

Although we could have also examined a data table to arrive at the same conclusion, it is much less dramatic and perhaps a harder pattern to see. This is the power of a good data visualization – it presents patterns in numbers with colours and shapes, broadening prospective audiences’ ability to consume the information at hand.

We owe a great debt to innovators like Florence Nightingale and can strive to exemplify her creativity when we present the stories contained in our data sets.

I like to imagine the conversation Florence Nightingale had after she created this visualization went something like this:

  • General: I don’t see why we need to waste time and money cleaning the hospital
  • Florence: Do you see all that blue?
  • General: Yea…
  • Florence: That’s how many soldiers we could have saved if we had clean hospitals
  • General: We’ve got to do something about that blue! How much time and money do you need to clean the hospital?

Choosing a curriculum – Part 1

With the green light to make as dramatic a proposal as I desired, I set to considering a new syllabus accompanied by general learning objectives in February 2019. My guiding principles were based on what I had recently learnt about statistics education research, my experiences teaching HSCI 2117, and my experience working as a statistician in the health sciences. I knew I wanted to make decisions supported by academic research, but before I went hunting for recommendations in the research literature, I sat down to plot out where my current visions lay.

New statistical reasoning course objectives

I wanted my students to walk out of my course with knowledge and skills that would allow them to use data to inform decisions. This meant I needed them to be comfortable asking a question, collecting relevant data, summarizing and describing data, and using data to make inferences. Specific modules could focus on:

  • measurement – focusing on validity and data collection
  • types of variables (categorical, ordinal, discrete quantities, continuous ratio scales) – the backbone to future instruction, as each of the tools I would introduce can be mapped to a subset of these four types in univariate and bivariate cases
  • summary statistics and data visualizations
  • modelling – focusing on ‘naming’ identified patterns from summaries and descriptions
  • inference – focusing on sampling variability, likelihood as a measure, and interval estimation
  • hypothesis testing – introducing both null hypothesis testing and also comparing two hypotheses or models in significance testing

While this helps me decide on content, the specific reasoning skills I would stress are reasoning about data, reasoning about variability, and reasoning about inference. While we do touch on modelling, I feel that models and modelling require too much attention to include in addition to setting a solid foundation for inference in general.

Out with the old and in with the new

Furthermore, the manner in which I wanted to teach these materials was vastly different from the old standard curriculum. It is 2019 – I do not need my students to know how to use a normal probability table. No one does that any more. As such, I decided on the following changes in content:

  • Exclude by-hand calculations – Although one could argue that working out equations is the only way to truly understand them, I wanted this course to be a statistical literacy course. What real value is there in a student being able to calculate the standard deviation of a dataset by hand? I believe the answer is ‘none’.
  • Exclude probability theory and probability distributions – I knew this was a recommendation of GAISE 2016, and it was something that students often struggled with. In the consensus curriculum, I think it’s introduced in order to facilitate calculating test statistics by hand, a need now obviated.
  • Exclude the critical value approach and test statistics – although understanding sampling distributions is arguably an essential aspect of statistical literacy, I do not believe it is worth the trouble in an undergraduate introductory course. It is more important to develop students’ understanding of sampling variability, which can be done without specifically addressing sampling distributions at all. Since practicing statisticians rely on p-values and confidence intervals, I decided to go with a combination of the p-value and confidence interval approaches, further diminishing the value of introducing critical values and test statistics.
  • Exclude the one-sample Z-test for a proportion and the two-sample Z-test for a difference in proportions – these tests are approximations of exact tests and were prevalent in an age before computers. If we are adopting a p-value and confidence interval approach, then the choice of the test we instruct is irrelevant. Why not just teach the exact tests?
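To illustrate that last point, here is a rough comparison of the exact binomial test and its normal approximation (a sketch; the two-sided exact p-value uses the minimum-likelihood convention, and the function names are my own):

```python
from math import comb, erf, sqrt

def exact_binom_p_value(k, n, p0):
    """Two-sided exact binomial test: sum the probabilities of all outcomes
    no more likely than the observed count k under H0: p = p0."""
    probs = [comb(n, i) * p0**i * (1 - p0)**(n - i) for i in range(n + 1)]
    return sum(q for q in probs if q <= probs[k] * (1 + 1e-9))

def z_test_p_value(k, n, p0):
    """Two-sided one-sample Z-test for a proportion (normal approximation)."""
    z = (k / n - p0) / sqrt(p0 * (1 - p0) / n)
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

# With 60 successes out of 100 against H0: p = 0.5, the two agree closely.
print(exact_binom_p_value(60, 100, 0.5))  # ~0.057
print(z_test_p_value(60, 100, 0.5))       # ~0.046
```

With a computer the exact test costs nothing extra, which is the heart of the argument for teaching it directly.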

Preparing for the hunt for a new textbook

Satisfied with my proposed changes, I thought about how I would operationalize instruction for each of my major learning objectives and began to search for a new textbook that would align with these values.

References and further reading:

GAISE (2016). Guidelines for assessment and instruction in statistics education. College report. Alexandria, VA: American Statistical Association

Garfield, J., & Ben-Zvi, D. (2008). Developing students’ statistical reasoning: Connecting research and teaching practice. Springer Science & Business Media.

Deciding course content

Although I view 2117 as a statistics class, its title, Introduction to Statistics for the Health Sciences, reminds me that it is statistics situated within a specific context. 3117’s title, Principles of Biostatistics, makes this even clearer.

As such, I want the specific statistical skills students walk out of the course with to be well-aligned with the tasks they might be asked to do in their careers. How should I decide on content for the course and how do I anticipate what students will need?

I have professional experience as a statistician in the health sciences, and in the past largely drew upon that experience to decide on relevant content for this course. However, I wanted a stronger evidence base for my decisions. Luckily, a recent study by Hayat et al. (2017) sampled published papers in public health journals. I could use this to decide the typical basket of tools my students might need to be familiar with.

Hayat et al. (2017) found the following epidemiological terms to be common:

  • Prevalence
  • Relative Risk
  • Odds Ratio
  • Incidence
  • Mortality
  • Hazard Ratio

Hayat et al. (2017) found the following statistical tests to be common (including p-values and confidence intervals):

  • T-test
  • Chi-squared Test / Exact Test (presumably for contingency tables)
  • Correlation tests
  • Non-parametric tests

Hayat et al. (2017) found the following statistical models to be common:

  • ANOVA
  • Linear Regression
  • Logistic Regression
  • Poisson Regression
  • Cox Proportional Hazards Regression
  • Generalized Linear Mixed Models

These results gave me a frame from which I could choose content for 2117 and 3117. I will introduce prevalence, relative risk, odds ratios, and incidence in 2117, and introduce mortality and hazard ratios in 3117. I will introduce p-values, confidence intervals, the T-test, the chi-squared test, and correlation tests in 2117, and introduce non-parametric tests in 3117. I will introduce ANOVA and linear regression in 2117, and generalized linear models in 3117.

While I certainly don’t want to automatically perpetuate the status quo, if I wish my students to be statistically literate, especially in the field of the health sciences, then they must be familiar with these methods. However, I struggle between balancing this traditional base and exposing my students to more modern methods. It seems unlikely that many will take more statistics courses in their lives. Is it my responsibility to seize this opportunity to expose them now, at the risk of leaving them without common skills prevalent in their field?

Nevertheless, with these content learning objectives in place, I could now move on to the specific statistical reasoning learning objectives and content sequencing.

References and further reading:

Hayat, M. J., Powell, A., Johnson, T., & Cadwell, B. L. (2017). Statistical methods used in the public health literature and implications for training of public health professionals. PloS one, 12(6), e0179032.

The initial call for change

In 2018, the research core curriculum director (my boss at GWU) and I decided we wanted to attempt a course re-design. She had for some time wanted to develop a course that would scaffold students through material to better guide and assess student learning and development through each module.

We conceived splitting each module (one per week) into three phases:

  • 1) introduction – reinforcing previously learnt material, and ensuring students were comfortable with the base ideas that would be developed in the module.
  • 2) knowledge development – guided learning through a series of lectures and activities.
  • 3) reinforcement of the main ideas and review

Each of the phases would be made up of a different balance of lessons, collaborative activities, assigned readings, and assessments (mini-quizzes). This initial conception kept the status quo curriculum in place and was simply a change to the learning environment.

However, neither of us were perfectly happy with the current curriculum. I had been making small changes to the course with increasing frequency – rewriting assignments, rerecording lessons, and rewriting test questions – as I gained experience and became more familiar with statistics education research literature. This resulted in the fall 2018 version of the course being almost unrecognizable compared to the fall 2016 version.

Fueled by the creative opportunity provided by our re-design initiative, as a thought exercise we decided to imagine what we wanted to teach with as close to a carte blanche as feasible.

The p-value controversy

I was first introduced to the p-value controversy by an epidemiologist in 2014. The controversy is about the use and practice of null hypothesis significance testing (Trafimow, 2014; Wasserstein, 2016; Wellek, 2017). It is essentially due to the prevalence of mindless hypothesis testing procedures, also known as the null ritual (Gigerenzer, 2004).

Why does the controversy exist? Ultimately, it’s because statistical reasoning and inference, as well as the hypothesis testing procedure, are difficult to understand (del Mas, 2004). So widespread is the confusion that the null ritual has even been labelled tyrannical (England, 1991; Stang, Poole, & Kuss, 2010).

There’s so much confusion that even the controversy itself can be misunderstood as an indictment of p-values and the hypothesis testing procedure, whereas it is simply a recommendation that one should be thoughtful and not use statistical tools blindly (Wasserstein, Schirm, & Lazar, 2019).

Surely, one part of the problem, and one part of any solution, is the statistics classroom. If people fail to understand statistics, is it not the responsibility of statisticians, as stewards of the field, to help remedy the situation?

I think understanding the evolution and origins of the p-value in hypothesis testing can go a long way in helping. Most people would find it surprising to hear that the three people credited with its development would likely balk at its current practice (Gigerenzer, 2004).

There are many articles that discuss the origins of the procedure, but one of my favourites is by Biau, Jolles, & Porcher (citation listed below). I strongly recommend that everyone read that paper (and I require it of my students) – it is short and very accessible, and at the very least should prove illuminating.

References and Further Reading:

Biau, D. J., Jolles, B. M., & Porcher, R. (2010). P value and the theory of hypothesis testing: an explanation for new researchers. Clinical Orthopaedics and Related Research®, 468(3), 885-892.

del Mas, R. C. (2004). A comparison of mathematical and statistical reasoning. In The challenge of developing statistical literacy, reasoning and thinking (pp. 79-95). Springer, Dordrecht.

England, C. (1991). On the tyranny of hypothesis testing in the social sciences. Contemporary psychology, 36(2), 102-105.

Gigerenzer, G. (2004). Mindless statistics. The Journal of Socio-Economics, 33(5), 587-606.

Trafimow, D. (2014). Editorial. Basic and Applied Social Psychology, 36(1), 1-2.

Stang, A., Poole, C., & Kuss, O. (2010). The ongoing tyranny of statistical significance testing in biomedical research. European journal of epidemiology, 25(4), 225-230.

Wasserstein, R. L. (2016). ASA statement on statistical significance and P-values.

Wasserstein, R. L., Schirm, A. L., & Lazar, N. A. (2019). Moving to a World Beyond “p < 0.05”. The American Statistician, 73(S1), 1-19.

Wellek, S. (2017). Author response to the contributors to the discussion on “A critical evaluation of the current ‘p‐value controversy’”. Biometrical Journal, 59(5), 897-900.