
A simple random sample is a sample chosen in a way that every set of n units in the population has an equal chance of being in the sample that is actually selected.  Giving each individual an equal chance to be chosen is a necessary but not sufficient condition for a simple random sample.
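The "necessary but not sufficient" point can be checked by brute force on a tiny population. The sketch below is only an illustration: the six-unit population and the "pick one of three fixed pairs" method are hypothetical, not from the notes. It compares a true simple random sample of size 2 with a method that also gives every individual the same chance but can only ever produce three of the fifteen possible samples.

```python
from itertools import combinations
from fractions import Fraction

# Hypothetical population of six units and sample size n = 2.
population = ["A", "B", "C", "D", "E", "F"]
n = 2

# Simple random sample: every set of n units is equally likely.
srs_samples = list(combinations(population, n))

# Hypothetical alternative: choose one of three fixed pairs at random.
pair_samples = [("A", "B"), ("C", "D"), ("E", "F")]

def inclusion_chance(samples, unit):
    """Chance that `unit` is selected when each listed sample is equally likely."""
    return Fraction(sum(unit in s for s in samples), len(samples))

srs_chances = {u: inclusion_chance(srs_samples, u) for u in population}
pair_chances = {u: inclusion_chance(pair_samples, u) for u in population}

print(len(srs_samples), "possible samples under SRS")
print(len(pair_samples), "possible samples under the fixed-pair method")
print(srs_chances == pair_chances)  # same individual chances either way
```

Each individual has a 1/3 chance of selection under both methods, yet the fixed-pair method can never produce a sample such as (A, C), so it is not a simple random sample.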

There is a difference between voluntary response bias and non-response bias.  In non-response bias, the researcher picks the sample and some respondents refuse to answer or provide only partial answers.  In voluntary response bias, the researcher does not pick the sample.  Rather, the researcher issues an invitation for people to take the survey, and the subjects decide for themselves whether to be in the sample.

Simple random sampling does not reduce bias from poorly worded questions, undercoverage, non-response, or other non-sampling errors.

In an experiment, the reason for the random assignment of treatments is to create groups that are as similar as possible before the treatments are applied, so that fair comparisons can be made between the groups.
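One way to carry out random assignment is sketched below. The 20 numbered subjects, the treatment names "A" and "B", and the function name `randomly_assign` are assumptions made up for this example. The idea is to shuffle the subject labels and then deal them out to the treatments in turn.

```python
import random

def randomly_assign(subjects, treatments, seed=None):
    """Shuffle the subjects, then deal them out to the treatments in turn."""
    rng = random.Random(seed)      # seed only makes the example repeatable
    shuffled = list(subjects)      # copy so the original order is untouched
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for position, subject in enumerate(shuffled):
        groups[treatments[position % len(treatments)]].append(subject)
    return groups

# Hypothetical experiment: 20 subjects labeled 1 to 20, two treatments.
subjects = list(range(1, 21))
groups = randomly_assign(subjects, ["A", "B"], seed=1)
print(groups)
```

Because chance alone decides which subject lands in which group, no characteristic of the subjects can systematically favor one treatment.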

When asked to describe how you would randomize, students need to provide details.  When describing the “hat method,” be sure to say that the slips of paper are identical and that you shake the hat to randomize.

The results of an experiment or sample are called statistically significant if the observed effect is unlikely to be the result of random assignment alone.  Of course, student answers must include context.

Here are some wrong interpretations of statistical significance:

  • The subjects that were given treatment A responded differently than the subjects who were given treatment B.  Dr. Johnson’s comment:  A difference in response is not enough.

  • The subjects that were given treatment A responded differently than the subjects who were given treatment B, and the difference was not due to chance. Dr. Johnson’s comment:  Statistical significance does not require certainty that the observed effect was not due to chance; it just requires that the observed effect was unlikely to be the result of chance.   Typically, if the probability of getting the observed effect by chance is less than 5%, we say it is statistically significant.
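The phrase "unlikely to be the result of chance" can be made concrete with a re-randomization simulation. This is a minimal sketch, not a required method: the response values are invented for illustration, and the function name, the 10,000 trials, and the use of the difference in means are all assumptions. It asks how often random assignment alone would produce a difference at least as large as the one observed.

```python
import random

# Hypothetical response values, invented purely for illustration.
treatment_a = [12, 15, 14, 16, 13, 17, 15, 14]
treatment_b = [10, 11, 9, 12, 10, 11, 13, 9]

def mean(values):
    return sum(values) / len(values)

def rerandomization_p_value(group_a, group_b, trials=10_000, seed=0):
    """Estimate the chance of a difference in means at least as large as the
    observed one if the treatment labels were assigned purely at random."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(trials):
        rng.shuffle(pooled)  # re-do the random assignment
        simulated = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if simulated >= observed - 1e-12:  # small tolerance for rounding
            at_least_as_extreme += 1
    return at_least_as_extreme / trials

p_value = rerandomization_p_value(treatment_a, treatment_b)
is_significant = p_value < 0.05  # the 5% cutoff mentioned above
print(p_value, is_significant)
```

A small estimated probability says the observed difference would rarely arise from the random assignment alone; it does not say that chance has been ruled out with certainty.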

The purpose of giving one group a placebo is to provide a basis of comparison to measure the real effect of the treatment.

A confounding variable is a variable that (1) is related to an independent variable and (2) might influence the response variable, so that it may create a false perception of an association between the independent variable and the response variable.  To argue that a variable is a confounding variable, students need to argue that the variable satisfies both (1) and (2).

Here are the four principles of a good experimental design:

  1. Random assignment:  A good experiment randomly assigns treatments to subjects.

  2. Control:  A good experiment controls for outside factors (confounding variables).

  3. Compare:  A good experiment compares the results (the response variable) for two or more treatments.

  4. Replication:  A good experiment uses enough experimental units in each treatment group to reduce chance variation.  Also, replication occurs when the experiment is repeated with other subjects in different settings.

We can make an inference about cause-and-effect if we conducted a well-designed experiment with random assignment of treatments and obtained statistically significant results.

On FRQs, many students are too wordy; they write too much.  Get to the point.  Don’t be vague.  Avoid using the word “it.”  Avoid ambiguous phrases like “skew the results.”

Question:  Suppose there are 2800 students at CHS.  Describe how you could select a simple random sample of 50 students using a random number generator.

Answer #1:  Give everyone a number, generate 50 random numbers.

Dr. Johnson’s comment:  I am looking for precise number assignment and clear instructions on randomization.  Simply saying “use a random number generator” or “flip a coin” is not sufficient for credit as a description of the randomization process.

Answer #2:  Every student can be associated with a number, and then the generator can decide the students based on the number it randomly picks.

Dr. Johnson’s comment:  The calculator doesn’t pick the students.  The researcher does.

Best answer:  Give each student a distinct integer label from 1 to 2800.  Use a random number generator to generate 50 unique integers (repeats are ignored) between 1 and 2800.  The 50 students whose numbers are selected are surveyed.

Dr. Johnson’s comment:  In the student’s answer, the procedure for randomization must be fully described so that two knowledgeable statistics users would, following the student’s instruction, use the same method to randomize.
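The procedure in the best answer can be sketched in code. This is one possible implementation under stated assumptions: the function name `select_srs` and the seed are hypothetical, and Python's `random` module stands in for whatever random number generator is used. Labels run from 1 to 2800, and a repeated integer is ignored until 50 distinct labels have been generated.

```python
import random

def select_srs(population_size=2800, sample_size=50, seed=None):
    """Generate random integer labels from 1 to population_size, ignoring
    repeats, until sample_size distinct labels have been selected."""
    rng = random.Random(seed)      # seed only makes the example repeatable
    selected = set()
    while len(selected) < sample_size:
        label = rng.randint(1, population_size)  # both endpoints included
        selected.add(label)        # adding a repeat leaves the set unchanged
    return sorted(selected)

# The students whose labels appear in this list are the ones surveyed.
chosen_labels = select_srs(seed=2024)
print(chosen_labels)
```

The standard-library call `random.sample(range(1, 2801), 50)` draws 50 distinct labels in a single step and would serve the same purpose.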

Stratifying vs. blocking.  Use the term stratification in sampling and blocking in experiments.  Use the correct terminology in each setting. If you suggest a design that involves stratified samples or blocked experiments, you should explain the reason why.  There is no need to stratify by gender if we don't think males and females might respond differently.  In a survey at school asking "should seniors have special parking privileges?", there may not be gender-based differences, but I bet it would be important to stratify by grade level.  Similarly, there's no reason to block in an experiment unless we think subjects' responses may be related to some characteristic we can't otherwise control.

Question:  Rather than using a completely randomized design, one group of CHS administrators proposes stratifying by political affiliation (Democrat or Republican).  Under what circumstances would stratifying by political affiliation be appropriate?

Answer #1:  They need to create a homogenous group, so there are not drastic differences.

Dr. Johnson’s note:  This answer lacks context.  You need to reference back to the facts of the problem.

Answer #2:  Stratifying by political affiliation will create groups where the people in them will most likely have similar views to each other.

Dr. Johnson’s note:  Stratified random sampling works best when the individuals within each stratum are similar with respect to what is being measured and when there are large differences between strata.  Both conditions should be met.  Also, when discussing stratified random sampling, I prefer that you use the word “strata,” not “groups.”

My answer:  The goal of stratification is to create homogeneous strata.  Stratification by political affiliation would work best if the individuals within each stratum (political party) are similar with respect to their support for voting by mail and there are large differences between the political parties, i.e., there is homogeneity within the strata and there are differences between the strata.
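A stratified random sample takes a separate simple random sample within each stratum. The sketch below assumes a hypothetical roster of 200 people whose party labels simply alternate, and a hypothetical choice of 10 people per stratum; none of these numbers come from the question.

```python
import random

def stratified_sample(roster, stratum_of, n_per_stratum, seed=None):
    """Split the roster into strata, then take an SRS within each stratum."""
    rng = random.Random(seed)
    strata = {}
    for unit in roster:
        strata.setdefault(stratum_of(unit), []).append(unit)
    return {
        name: rng.sample(members, n_per_stratum[name])
        for name, members in strata.items()
    }

# Hypothetical roster of (id, party) pairs, invented for illustration.
roster = [(i, "Democrat" if i % 2 else "Republican") for i in range(1, 201)]

sample = stratified_sample(
    roster,
    stratum_of=lambda person: person[1],            # stratum = party label
    n_per_stratum={"Democrat": 10, "Republican": 10},
    seed=7,
)
print(sample)
```

Sampling within each stratum guarantees that both parties are represented, which pays off most when opinions are similar within a party and differ between parties.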
