SWBAT understand and use the Chi-square Goodness of Fit and the Chi-square Contingency Table to evaluate categorical data.

What do you do when your data isn't quantitative? Chi-square to the rescue!

15 minutes

I begin this class by asking my students for help with a problem I have. I tell them that my family was playing Risk and my daughter rolled a six eight times in a row with the same die. I've been challenged to figure out whether that is a reasonable occurrence or whether the die is somehow rigged. Setting the lesson up with this question helps ensure that my students are engaged. They get hooked by helping me solve the problem, by trying to prove the die is rigged, or both. Whatever the reason, the personal touch of my family playing a game seems to resonate with my students. *Some of my students will try to move this discussion toward what games I play or who won, but I just redirect them back to my problem.*

I ask them to pair-share some ideas, then ask for suggestions. **(MP1)** Usually someone suggests rolling the same die several more times to see what happens, at which point I explain that the die is at home, then ask how that would help. I might ask "How many sixes in a row is too many to be reasonable?", followed by "How do we know what number is acceptable?" Another common suggestion is to somehow figure out the mean and standard deviation or do a t-test (two recent lessons). To those suggestions I might reply with "What data are we calculating the mean of?" or "What data sets are we comparing for the t-test?" All these questions get my students thinking about how to determine what's reasonable.

I tell them that today we will be learning about ways to evaluate categorical data. I remind them that they are welcome to take notes if they choose *(I have an assortment of graphic organizers in my strategies folder)* and say that some of what we'll be doing is not in their textbook.

The first test, the Chi-square Goodness of Fit, is used for problems like the one I've posed about the die. I walk my students through that application, beginning with setting a significance level of 0.05 a priori. *(This is from a previous lesson.)* You can see an example of this in my educreations video "Goodness of Fit" in my resources. **(MP2)** The terms I clarify with my students as we go through the lesson are degrees of freedom, expected value, and observed value. *I show them how to solve these problems using a table first because I want them to see where the comparison is occurring, then later in the lesson I demonstrate using the graphing calculator.*
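For anyone who wants to check the table work away from the graphing calculator, the following is a minimal Python sketch of a Goodness of Fit calculation. The roll counts are hypothetical, invented purely for illustration (they are not data from the Risk game), and the function name `goodness_of_fit` is likewise just a label chosen for this example. The critical values are the usual 0.05 column of a standard Chi-square table.

```python
# Chi-square Goodness of Fit done "by table", the way it is done on paper.
# The roll counts below are HYPOTHETICAL, invented only for illustration.

# Critical values at the 0.05 significance level, copied from a standard
# chi-square table and keyed by degrees of freedom.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def goodness_of_fit(observed, expected):
    """Return (chi-square statistic, degrees of freedom)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need the same number of categories")
    # Each category contributes (observed - expected)^2 / expected.
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Degrees of freedom = number of categories minus one.
    degrees_of_freedom = len(observed) - 1
    return statistic, degrees_of_freedom


# Hypothetical data: 60 rolls of one die, counts for faces 1 through 6.
observed = [5, 8, 9, 8, 10, 20]
# A fair die should show each face about 60 / 6 = 10 times.
expected = [sum(observed) / 6] * 6

statistic, df = goodness_of_fit(observed, expected)
critical = CRITICAL_05[df]

print(f"chi-square = {statistic:.2f}, df = {df}, critical value = {critical}")
if statistic > critical:
    print("Reject the fair-die hypothesis at the 0.05 level.")
else:
    print("The rolls are consistent with a fair die at the 0.05 level.")
```

With these made-up counts the statistic exceeds the critical value for 5 degrees of freedom, so the sketch would flag the die as suspicious. Different counts could easily lead to the opposite conclusion.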

When we've completed this analysis as a class, I tell my students that the second kind of Chi-square test, the contingency table, is used to compare two groups. I give the example of the number of boys compared to the number of girls who have had discipline referrals at our school in the past month. Since there are different numbers of boys and girls and these are count values, my students can understand why we need a new kind of comparison. *I choose this data because it engages my students to figure out if boys or girls get in trouble more often.* I walk them through this lesson (see my educreations video "Chi-square") and ask if there are any questions. **(MP2)** After answering questions, I tell my students that they will now have an opportunity to work with Chi-square independently.
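The contingency table comparison can be sketched in Python along the same lines. Everything specific here is an assumption for illustration: the referral counts are hypothetical (not our school's actual numbers), the two-by-two layout (referral or no referral, by boys and girls) is one plausible way to set up the table, and `contingency_test` is a name made up for this example. The sketch uses the plain Chi-square formula with no continuity correction.

```python
# Chi-square contingency table test done by hand.
# All counts are HYPOTHETICAL, invented only for illustration.

# Critical values at the 0.05 significance level from a standard
# chi-square table, keyed by degrees of freedom.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def contingency_test(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if group and outcome were unrelated.
            expected = row_totals[i] * column_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    # Degrees of freedom = (rows - 1) * (columns - 1).
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


# Rows: boys, girls. Columns: had a referral, had no referral.
referrals = [
    [30, 170],  # hypothetical: 200 boys, 30 with a referral
    [15, 235],  # hypothetical: 250 girls, 15 with a referral
]

statistic, df = contingency_test(referrals)
critical = CRITICAL_05[df]

print(f"chi-square = {statistic:.2f}, df = {df}, critical value = {critical}")
if statistic > critical:
    print("Referral rates differ between the groups at the 0.05 level.")
else:
    print("No significant difference in referral rates at the 0.05 level.")
```

Because the expected counts come from the row and column totals, the unequal group sizes are handled automatically, which is the reason a new kind of comparison is needed here.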

35 minutes

**Team work** *15 minutes: You will need copies of the Chi-square handout and Chi-square table for this part of the lesson.* I begin by asking my students to suggest categorical data we can collect to work with, writing their suggestions on the board. **(MP2)** (If you need some ideas you might consider the following. By gender: number of shoes, number of hats, books read for pleasure in the past 2 months, elective classes taken. By grade level: parent occupation, GPA, co-curricular activities, books read over the summer. For Goodness of Fit: the number of boys vs. girls in certain classes like physics or music, male vs. female teachers, number of times a coin turns up heads or tails, chance of the World Series going to 7 games...) I generally narrow it down to about half a dozen, including at least two Goodness of Fit type data sets. I tell my students they will be working in teams to complete today's challenge, and give them the handout "Chi-square". When they've had a chance to review the assignment, I ask if there are any questions, then tell them to post any data they need on the front board ASAP. **(MP1, MP4)** As they work I walk around offering encouragement and assistance as necessary.

**Student Presentations** *20 minutes: There is a video narrative in my resources for this section that explains why I have my students do a class presentation and critique at this point in the lesson.* As my teams are finishing their calculations, I remind them to be ready to share their work with the class. When everyone is done, I randomly select teams to present their results for one of their three tests, following the guidelines in their handout. I encourage my students to critique each presentation, reminding them of our rules about appropriate comments. **(MP3)** I allow approximately 3 minutes per team (6 teams), with a little extra time for transitions and clarification if needed.

5 minutes

To close this lesson I have my students each do a compare-contrast of the Chi-square Goodness of Fit test and the Chi-square Contingency Table test, including when to use which and WHY. This may sound simple after all the time we've just spent on it, but I've found that my students are perfectly happy to plug numbers into boxes or their calculator and never really stop to consider why they're doing it. I have several graphic organizers *(see my strategies folder)* that my students can choose from, or they can create their own system for comparing. I'm not looking for everyone to have the same answers; I'm looking to see if my students truly understand the similarities and differences between the Chi-square tests. I'm hoping they will say something about comparing two sets of data versus comparing one set of data to a known or expected value, and that they'll comment on the fact that both tests work with qualitative data. **(MP2)**