results of a study consistently overestimate or consistently underestimate the value of the parameter we want to know
convenience sampling
taking a sample consisting of individuals that are easy for the experimenter to reach
Why is convenience sampling biased?
The individuals that are easiest to reach will likely differ from the population at large in some systematic fashion
Voluntary response sample
consists of individuals who choose to respond to a survey
Why is a voluntary response sample biased?
The people who choose to respond most likely have stronger opinions than the average individual, so their responses cannot be generalized to the population at large
Random sampling
requires a chance process
Simple random sample
a sample of size n chosen so that every possible group of n individuals in the population has an equal chance of being selected for the sample
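A minimal sketch of drawing a simple random sample, assuming the sampling frame is just a list of ID numbers. The frame, the function name `simple_random_sample`, and the seed are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical sampling frame: ID numbers for a population of 500 individuals.
sampling_frame = list(range(1, 501))

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct individuals; every group of size n is equally likely."""
    rng = random.Random(seed)      # seeded only so the example is repeatable
    return rng.sample(frame, n)    # sampling without replacement

srs = simple_random_sample(sampling_frame, n=10, seed=1)
print(sorted(srs))
```

`random.sample` selects without replacement, so no individual can appear in the sample twice.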
sampling frame
a list of individuals from whom the sample is drawn
downsides of simple random sampling (SRS)
1. sampling frames are difficult to construct accurately (especially for large populations); 2. selecting subjects takes a long time
stratified random sampling
population is divided into groups of similar individuals (strata); a simple random sample is then taken within each stratum, and the combined results form the actual sample (SIMILAR WITHIN, DIFFERENT BETWEEN)
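A rough sketch of stratified random sampling, assuming grade level is the stratifying variable. The population of students, the function name `stratified_sample`, and the choice of 5 students per stratum are all made up for illustration.

```python
import random

# Hypothetical population: each student is tagged with a grade level (the stratum).
population = [(f"student_{i}", grade) for i, grade in enumerate([9, 10, 11, 12] * 50)]

def stratified_sample(pop, per_stratum, seed=None):
    """Take a simple random sample of size per_stratum inside every stratum."""
    rng = random.Random(seed)
    strata = {}
    for name, grade in pop:
        strata.setdefault(grade, []).append(name)   # group individuals by stratum
    sample = []
    for grade in sorted(strata):
        sample.extend(rng.sample(strata[grade], per_stratum))  # SRS within the stratum
    return sample

stratified = stratified_sample(population, per_stratum=5, seed=1)
print(len(stratified))
```

Every stratum is guaranteed to be represented, which a single simple random sample of the whole population does not guarantee.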
Cluster sample
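population is divided into groups of individuals located near each other (clusters); a simple random sample of clusters is then chosen, and every individual in the chosen clusters is included in the sample (DIFFERENT WITHIN, SIMILAR BETWEEN). Each cluster should resemble a small-scale version of the population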
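A minimal sketch of cluster sampling, assuming the usual procedure in which whole clusters are chosen at random and everyone in a chosen cluster is surveyed. The homerooms, the function name `cluster_sample`, and the number of clusters chosen are hypothetical.

```python
import random

# Hypothetical population: 20 homerooms (clusters) of 25 students each.
clusters = {
    f"room_{r}": [f"room_{r}_student_{s}" for s in range(25)]
    for r in range(20)
}

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters, then include everyone in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)      # SRS of cluster labels
    sample = [person for c in chosen for person in clusters[c]]
    return chosen, sample

chosen_rooms, cluster_members = cluster_sample(clusters, n_clusters=3, seed=1)
print(chosen_rooms, len(cluster_members))
```

The random step happens at the level of clusters, not individuals, which is what separates this design from stratified sampling.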
larger samples...
give more precise estimates because they reduce sampling variability
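One way to see this is a small simulation, sketched below under the assumption of a made-up population of 10,000 roughly normal scores. The function name `spread_of_sample_means`, the population parameters, and the sample sizes are illustrative choices, not part of the original notes.

```python
import random
import statistics

def spread_of_sample_means(population, n, trials=1000, seed=0):
    """Standard deviation of the sample mean across many random samples of size n."""
    rng = random.Random(seed)
    means = [statistics.mean(rng.sample(population, n)) for _ in range(trials)]
    return statistics.stdev(means)

# Hypothetical population of scores (values are simulated, not real data).
pop_rng = random.Random(42)
population = [pop_rng.gauss(70, 10) for _ in range(10_000)]

spread_small = spread_of_sample_means(population, n=10)
spread_large = spread_of_sample_means(population, n=100)
print(spread_small, spread_large)
```

The sample means from the larger samples should cluster more tightly around the population mean. Exact values depend on the seed, but the spread shrinks roughly in proportion to 1/sqrt(n).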
selection bias (undercoverage)
some members of the population cannot be chosen for the sample (ex: a survey mailed to homes does not cover homeless people)
nonresponse bias
an individual who is chosen for the sample cannot be contacted or refuses to participate
wording of question bias
questions that are leading, loaded, or poorly written can lead to markedly different results
response bias
the gender, age, race, ethnicity, or behavior of the interviewer affects the responses in some systematic way
response variable
a measured outcome of a study
explanatory variable
a variable that may explain or predict changes in the response variable
observational study
individuals are observed and variables of interest are measured; no treatment is imposed (NO CAUSE AND EFFECT CAN BE DETERMINED)
experiment
deliberately imposes a treatment on the subjects (CAN BE USED TO DETERMINE CAUSE AND EFFECT)
controlled experiment
some subjects are given a treatment and others are given a placebo (used as a comparison group)
confounding
two or more variables are associated in such a way that their effects on the response cannot be distinguished from each other
experimental units
smallest collection of individuals to which treatments are applied
4 principles of good experimental design
comparison, random assignment, control, replication
principle of comparison
a good experimental design compares 2 or more treatments
principle of random assignment
units are to be assigned to treatments by some chance process (purpose of random assignment is to CREATE APPROX EQUAL GROUPS FOR COMPARISON)
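A minimal sketch of random assignment, assuming 20 hypothetical subjects and two treatment labels. The function name `randomly_assign` and the labels are illustrative only.

```python
import random

def randomly_assign(units, treatments, seed=None):
    """Shuffle the units, then deal them out to the treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)                      # the chance process
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

subjects = [f"subject_{i}" for i in range(1, 21)]   # hypothetical subjects
groups = randomly_assign(subjects, ["treatment", "placebo"], seed=1)
print({t: len(members) for t, members in groups.items()})
```

Because chance alone decides who lands in each group, other variables should be spread about evenly across the groups.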
principle of control
keeping outside variables that might affect the response constant (as much as possible). This REDUCES VARIABILITY in the response variable
principle of replication
using enough experimental units in each group to distinguish the effects of the treatment from chance differences between the groups
statistically significant
the observed effect is so large that it would rarely occur by chance alone
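One common way to check "rarely occur by chance alone" is to simulate the chance process itself. The sketch below uses a permutation (re-randomization) test, which is one possible technique and not necessarily the one these notes have in mind. The scores are made up, and `permutation_p_value` is a hypothetical helper.

```python
import random
import statistics

def permutation_p_value(group_a, group_b, reps=2000, seed=0):
    """Estimate how often random reshuffling gives a difference at least as large as observed."""
    rng = random.Random(seed)
    observed = statistics.mean(group_a) - statistics.mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)                    # re-do the random assignment
        diff = statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / reps

# Made-up scores for illustration only.
treatment_scores = [88, 92, 95, 91, 89, 94, 93, 90]
control_scores = [80, 85, 83, 79, 84, 82, 86, 81]
observed_diff, p_value = permutation_p_value(treatment_scores, control_scores)
print(observed_diff, p_value)
```

A small estimated proportion means that chance reshuffling alone rarely produces a difference as large as the observed one.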
practical importance
Something can be statistically significant but not practically important (Ex: an avg test score rising from 90% to 92% due to a test prep site can be statistically significant but not practically important)
block
a group of experimental units that are known before the experiment begins to be similar in some way that is thought to affect the response (purpose: TO REDUCE VARIABILITY THAT MIGHT ARISE FROM RANDOM ASSIGNMENT)
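A rough sketch of a randomized block design, assuming garden plots blocked by sun exposure. The plots, the blocking variable, the treatment labels, and the function name `randomized_block_assignment` are all hypothetical.

```python
import random

def randomized_block_assignment(units, block_of, treatments, seed=None):
    """Form blocks first, then randomly assign treatments separately inside each block."""
    rng = random.Random(seed)
    blocks = {}
    for unit in units:
        blocks.setdefault(block_of(unit), []).append(unit)
    assignment = {}
    for block in sorted(blocks):
        members = blocks[block][:]
        rng.shuffle(members)                   # random assignment within the block
        for i, unit in enumerate(members):
            assignment[unit] = treatments[i % len(treatments)]
    return assignment

# Hypothetical units: 12 plots, 6 sunny and 6 shady.
plots = [(f"plot_{i}", "sunny" if i < 6 else "shady") for i in range(12)]
assignment = randomized_block_assignment(
    plots, block_of=lambda p: p[1], treatments=["fertilizer", "none"], seed=1
)
print(assignment)
```

Each block ends up with the same number of units per treatment, so the blocking variable cannot be unevenly split between treatments by bad luck.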
matched pairs design
pairs of similar experimental units are matched up, and one unit in each pair is randomly assigned the treatment (the other gets the control). Alternatively, both treatment and control can be given to the same unit, with the order in which they are received randomized
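A minimal sketch of the first version of a matched pairs design, assuming the pairs have already been formed. The twin labels and the function name `assign_matched_pairs` are hypothetical.

```python
import random

def assign_matched_pairs(pairs, seed=None):
    """Within each pair, a coin flip decides which unit gets the treatment."""
    rng = random.Random(seed)
    result = []
    for a, b in pairs:
        if rng.random() < 0.5:                 # the "coin flip"
            result.append({"treatment": a, "control": b})
        else:
            result.append({"treatment": b, "control": a})
    return result

# Hypothetical pairs of similar units.
pairs = [("twin_1a", "twin_1b"), ("twin_2a", "twin_2b"), ("twin_3a", "twin_3b")]
pair_assignments = assign_matched_pairs(pairs, seed=1)
print(pair_assignments)
```

For the second version, in which each unit receives both treatments, the same coin flip could decide the order of the treatments instead.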
random selection
if the individuals in a study were randomly selected, you can make an inference about the population