Update testing practice #6

Open

d-alex-hughes wants to merge 2 commits into master from update-testing-practice

Contributor

d-alex-hughes commented Feb 24, 2022

This PR creates an activity that gives students a chance to run four different hypothesis tests against some real, interesting data.

If these tests look good, then

- Save this file into a new file called *_answers.Rmd
- Remove the answers from this document, and then save.
- Update the README link to create a nbgitpuller link that will bring this data into the ischool.datahub.berkeley.edu environment.

d-alex-hughes added 2 commits

February 23, 2022 11:47


          ignore mac files

489813a


          adds testing exercise

ac4ac44

d-alex-hughes requested a review from paul-laskowski

February 24, 2022 21:38

paul-laskowski approved these changes

Contributor

paul-laskowski left a comment

Some really great practice here! My only big comment is that this is a LOT of work. We could call it a homework, but to help the most students I think it's worth thinking of how to make it go a lot faster.

unit_07/hypothesis_test_practice_activity.Rmd

               ---
+              # Hypothesis Test Practice Activity
+              In this short activity, you're going to write, and execute a short series of hypothesis tests using the `R` estimating framework.

Contributor

paul-laskowski Feb 24, 2022

not sure what the R estimating framework is. just use R? I don't think short is accurate at the moment :D

Suggested change

      
-              In this short activity, you're going to write, and execute a short series of hypothesis tests using the `R` estimating framework.
+              In this activity, you're going to execute a series of hypothesis tests using R.

unit_07/hypothesis_test_practice_activity.Rmd

+              - `salary_potential`
+              - `diversity_school`
+              We are going to ask as series of questions that can be answered with the constrained set of tests that we have available to use from the course.

Contributor

paul-laskowski Feb 24, 2022

just trying to clarify a bit

Suggested change

      
-              We are going to ask as series of questions that can be answered with the constrained set of tests that we have available to use from the course.
+              We are going to ask a series of questions that can be answered with one of the hypothesis tests presented in the course.

unit_07/hypothesis_test_practice_activity.Rmd


		For each of the questions, the data is available, but you might have to join a table or two, recode a variable or two, or otherwise do a little bit of data work to get the data ready to run the test.

		For each test that you conduct, please (a) evaluate the assumptions of the test to see if the data satisfy these assumptions; (b) state the null hypothesis that is being evaluated; (c) state the criteria that would lead you to reject the null hypothesis; (d) conduct and interpret the test; and (e) tell us whether the difference you observe between the groups is an important one.

Contributor

paul-laskowski Feb 24, 2022

"state the criteria that would lead you to reject the null hypothesis" is it valid for a student to write "p<.05" for every test? perhaps that's a reason to remove this component and test it more directly another way.

unit_07/hypothesis_test_practice_activity.Rmd

		```

		Now then, using data that is available to you, please test whether public school tuition has changed from 1985 to the present. In order to do you, you will have to select the appropriate rows and columns of data, and conduct the appropriate test for this data.

Contributor

paul-laskowski Feb 24, 2022

The big issue I see is that you don't know the 1985 tuition level - you only have an estimate, but your one-sample test below treats this as a true value.

To fix this, you could instead create a single variable to be the difference in tuition from 1985 to the present, then test if it's zero. Students might not initially recognize this as a place to use a one sample test, but it could be good learning for them.
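For illustration, here is a minimal sketch of the difference-variable approach. It assumes each row is a unit observed in both periods; the data frame `tuition`, its columns `tuition_1985` and `tuition_now`, and all of the numbers are hypothetical and are not taken from the activity's data.

```r
# Hypothetical data: one row per unit, with tuition recorded in both periods.
tuition <- data.frame(
  school       = c("A", "B", "C", "D", "E", "F"),
  tuition_1985 = c(3200, 2900, 4100, 3600, 3000, 3800),
  tuition_now  = c(9800, 9100, 12500, 10400, 9900, 11200)
)

# A single variable: the change in tuition for each unit.
tuition$change <- tuition$tuition_now - tuition$tuition_1985

# One-sample t-test of the null hypothesis that the mean change is zero.
result <- t.test(tuition$change, mu = 0)
print(result)
```

Because both periods enter through the same variable, neither one is treated as a known constant; this is numerically the same as a paired t-test on the two columns.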

unit_07/hypothesis_test_practice_activity.Rmd

+. What is the appropriate test?
+. What are the assumptions for this test?
+. Are these assumptions for the test satisfied in this case?

Contributor

paul-laskowski Feb 24, 2022

I worry that this phrasing makes it sound like we're really interested in a final yes or no answer. I prefer the way we did it on the lab, something like "evaluate each assumption, based on your background knowledge, data visualizations, and numerical summaries."
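As a rough sketch of what "data visualizations and numerical summaries" could look like for a single variable, the following uses base R only. The vector `x` is simulated and hypothetical; it stands in for whatever variable the test uses, and the skewness formula is one simple moment-based choice rather than anything prescribed by the course.

```r
# Hypothetical, simulated sample standing in for the variable being tested.
set.seed(1)
x <- rlnorm(200, meanlog = 9, sdlog = 0.5)

# Numerical summaries: five-number summary, sample size, and a simple
# moment-based skewness measure.
num_summary <- summary(x)
n <- length(x)
skew <- mean((x - mean(x))^3) / sd(x)^3
print(num_summary)
print(c(n = n, skewness = skew))

# Visual checks: a histogram for the overall shape, and a normal Q-Q plot
# to see how far the sample departs from normality.
hist(x, breaks = 30, main = "Distribution of x", xlab = "x")
qqnorm(x)
qqline(x)
```

None of these outputs gives a yes or no answer on its own; they are evidence to weigh alongside background knowledge about how the data were generated.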

unit_07/hypothesis_test_practice_activity.Rmd

		```

		1. Are the tuition costs different?

Contributor

paul-laskowski Feb 24, 2022

Is this a gotcha question? if students answer yes or no, I would call that wrong - you just have evidence, you haven't proved the hypothesis. perhaps something like the following?

Suggested change

      
-              1. Are the tuition costs different?
+              1. Have you found evidence that the tuition costs are different?

unit_07/hypothesis_test_practice_activity.Rmd

               ```
+              1. Are the tuition costs different?
+              2. Is this a big difference or a little difference? What makes you think this?

Contributor

paul-laskowski Feb 24, 2022

importance feels more clear to me.

Suggested change

      
-              2. Is this a big difference or a little difference? What makes you think this?
+              2. Is this an important difference or an unimportant difference? What makes you think this?
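One possible way for students to reason about importance, sketched here as an assumption rather than as the activity's intended answer, is to report the difference in original units alongside a standardized effect size such as Cohen's d. The vectors `public` and `private` and their values are hypothetical.

```r
# Hypothetical tuition values (in dollars) for two groups of schools.
public  <- c(9200, 10100, 9800, 11000, 10400, 9500)
private <- c(31000, 35500, 29800, 38200, 33400, 36100)

# Difference in means, in dollars: often the most interpretable quantity.
raw_diff <- mean(private) - mean(public)

# Cohen's d: the difference in means scaled by the pooled standard deviation.
n1 <- length(public)
n2 <- length(private)
pooled_sd <- sqrt(((n1 - 1) * var(public) + (n2 - 1) * var(private)) /
                    (n1 + n2 - 2))
cohens_d <- raw_diff / pooled_sd

print(c(raw_difference = raw_diff, cohens_d = cohens_d))
```

Whether a given dollar difference matters is still a judgment call that depends on context, which is why the question asks "What makes you think this?".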

unit_07/hypothesis_test_practice_activity.Rmd

+                geom_point(alpha = 0.3) +
+                facet_grid(cols = vars(type))
+              ```
+              Given what you have seen, how would you recommend proceeding with your test? Proceed in the way that you think is most reasonable. State the assumptions for your test, evaluate whether they are satisfied, and conduct the test, describing what you have learned about the statistics, and also what the practical meaning of these statistics are.

Contributor

paul-laskowski Feb 24, 2022

the learning is about the effect or the model parameters or the difference in tuition, not the statistics, right?
