population
the entire group of individuals we want information about
convenience sample
chooses the individuals who are easiest to reach (leads to bias)
sample
a subset of individuals in the population from which we actually collect data
simple random sample
SRS = every individual, and every possible group of n individuals, has an equal chance of being selected // label individuals, randomize, select
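A minimal Python sketch of label, randomize, select, assuming a hypothetical population labeled 1 to 20 and a sample size of 5 (both made up for illustration):

```python
import random

def simple_random_sample(population, n, seed=None):
    """Pick n individuals so that every group of size n is equally likely."""
    rng = random.Random(seed)          # seed only makes the example repeatable
    return rng.sample(population, n)   # sampling without replacement

labels = list(range(1, 21))            # hypothetical labels 1..20
chosen = simple_random_sample(labels, 5, seed=1)
print(chosen)
```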
voluntary response
people choose themselves by responding to a general invitation (bias due to strong opinions)
stratified random sample
population is classified into groups of similar individuals (strata), then a separate SRS is taken from each stratum
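A rough Python sketch of stratified sampling, assuming two hypothetical strata ("A" and "B") and the same sample size from each stratum; real designs may sample strata in proportion to their size:

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Take a separate simple random sample from every stratum."""
    rng = random.Random(seed)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

# Hypothetical strata of similar individuals
strata = {
    "A": ["A1", "A2", "A3", "A4", "A5", "A6"],
    "B": ["B1", "B2", "B3", "B4"],
}
picked = stratified_sample(strata, 2, seed=3)
print(picked)
```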
cluster sampling
population is classified into groups that are located near each other (clusters); an SRS of clusters is chosen and all individuals in the chosen clusters are selected
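A small Python sketch of cluster sampling, assuming hypothetical clusters named by city block; whole clusters are chosen at random and everyone in a chosen cluster is included:

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose clusters, then keep every individual in them."""
    rng = random.Random(seed)
    chosen_names = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen_names:
        sample.extend(clusters[name])   # all individuals in the cluster
    return chosen_names, sample

# Hypothetical clusters (for example, households grouped by block)
clusters = {
    "block1": ["p1", "p2", "p3"],
    "block2": ["p4", "p5"],
    "block3": ["p6", "p7", "p8"],
    "block4": ["p9"],
}
names, people = cluster_sample(clusters, 2, seed=5)
print(names, people)
```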
systematic random sample
choose a random starting point, then select individuals at equal intervals
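A short Python sketch of a systematic random sample, assuming a hypothetical ordered list of 20 individuals and an interval of 5:

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k positions, then every k-th individual."""
    rng = random.Random(seed)
    start = rng.randrange(k)       # random start: 0, 1, ..., k-1
    return population[start::k]    # equal interval of k

roster = list(range(1, 21))        # hypothetical ordered list 1..20
every_fifth = systematic_sample(roster, 5, seed=2)
print(every_fifth)
```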
undercoverage
some members of the population cannot be chosen for the sample
nonresponse
an individual chosen for the sample can't be contacted or refuses to participate
response bias
a systematic pattern of inaccurate answers in a survey
strata
groups of similar individuals (a sample is taken from every group)
cluster
a group of individuals located near each other; ideally each cluster contains a mix of different individuals, like the population
wording of question
the most important influence on the answers given to a sample survey
observational study
observes individuals and measures variables of interest but does not attempt to influence the responses.
experiment
deliberately imposes some treatment on individuals to measure their responses.
Confounding
occurs when two variables are associated in such a way that their effects on a response variable cannot be distinguished from each other
treatment
A specific condition applied to the individuals in an experiment
experimental units
the smallest collection of individuals to which treatments are applied
subjects
experimental units that are human beings
Fix confounding
perform a comparative experiment in which some units receive one treatment and similar units receive another
Basic principles for designing experiments
Comparison, random assignment, control, replication
Comparison
Use a design that compares two or more treatments
Random assignment
Use chance to assign experimental units to treatments. Doing so helps create roughly equivalent groups of experimental units by balancing the effects of other variables among the treatment groups.
Control
Keep other variables that might affect the response the same for all groups
Replication
Use enough experimental units in each group so that any differences in the effects of the treatments can be distinguished from chance differences between the groups
completely randomized design
the treatments are assigned to all the experimental units completely by chance
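A minimal Python sketch of a completely randomized design, assuming ten hypothetical units and two treatment names chosen only for illustration; shuffling first and then dealing units out keeps the group sizes equal:

```python
import random

def completely_randomized(units, treatments, seed=None):
    """Assign every unit to a treatment purely by chance, with balanced groups."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

units = [f"unit{i}" for i in range(1, 11)]   # hypothetical units
groups = completely_randomized(units, ["treatment", "control"], seed=4)
print(groups)
```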
control group
receives an inactive treatment or an existing baseline treatment
placebo effect
The response to a dummy treatment
double-blind experiment
neither the subjects nor those who interact with them and measure the response variable know which treatment a subject received
block
group of experimental units that are known before the experiment to be similar in some way that is expected to affect the response to the treatments
randomized block design
the random assignment of experimental units to treatments is carried out separately within each block (like stratification)
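A rough Python sketch of a randomized block design, assuming two hypothetical blocks and two made-up treatment names; the shuffle is done separately inside each block:

```python
import random

def randomized_block(blocks, treatments, seed=None):
    """Randomly assign treatments separately within each block."""
    rng = random.Random(seed)
    assignment = {}
    for units in blocks.values():
        shuffled = list(units)
        rng.shuffle(shuffled)            # randomize within this block only
        for i, unit in enumerate(shuffled):
            assignment[unit] = treatments[i % len(treatments)]
    return assignment

# Hypothetical blocks of units known to be similar before the experiment
blocks = {
    "block_x": ["x1", "x2", "x3", "x4"],
    "block_y": ["y1", "y2", "y3", "y4"],
}
assignment = randomized_block(blocks, ["new", "old"], seed=6)
print(assignment)
```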
matched pairs design
for comparing two treatments: a randomized block design in which each block consists of a matched pair of similar experimental units.
Sometimes, a “pair” in a matched-pairs design consists of a single unit that receives both treatments. Since the order of the treatments can influence the response, chance is used to determine which treatment is applied first for each unit.
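A small Python sketch of using chance to set the treatment order when each unit receives both treatments, assuming hypothetical subjects and treatments labeled "A" and "B":

```python
import random

def treatment_orders(units, treatments=("A", "B"), seed=None):
    """For each unit, let chance decide which treatment comes first."""
    rng = random.Random(seed)
    orders = {}
    for unit in units:
        order = list(treatments)
        rng.shuffle(order)
        orders[unit] = tuple(order)
    return orders

subjects = ["s1", "s2", "s3", "s4"]      # hypothetical subjects
orders = treatment_orders(subjects, seed=7)
print(orders)
```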
statistically significant
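An observed effect so large that it would rarely occur by chance. Significance alone does not imply causation; a cause-and-effect conclusion also requires random assignment in an experiment.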
unlikely to happen by chance
would occur less than 5% of the time by chance alone (p-value < 0.05, the usual cutoff)
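One way to see what "rarely occur by chance" means is a permutation (re-randomization) simulation. This is a sketch with made-up response values, and the technique is an illustration rather than something from these notes: shuffle the responses between the two groups many times and count how often chance alone gives a difference at least as large as the observed one.

```python
import random

def permutation_p_value(group_a, group_b, reps=5000, seed=0):
    """Estimate how often chance alone gives a difference in means this large."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)                        # re-randomize the groups
        a, b = pooled[:n_a], pooled[n_a:]
        diff = sum(a) / len(a) - sum(b) / len(b)
        if abs(diff) >= abs(observed):
            extreme += 1
    return extreme / reps

# Made-up responses for two treatment groups
treated = [12, 15, 14, 16, 13, 17]
control = [8, 9, 7, 10, 9, 8]
p_value = permutation_p_value(treated, control)
print(p_value, "significant at 5%" if p_value < 0.05 else "not significant at 5%")
```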
sampling variability
different random samples from the same population give different results
margin of error
how far, at most, we expect the estimate to be from the true population value
how to get a more precise estimate, and why?
take a bigger sample, because the sampling distribution is narrower and has less variability
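A quick Python simulation of why bigger samples vary less, assuming a hypothetical population proportion of 0.5 and sample sizes of 25 and 400 (all made up): it draws many samples of each size and compares the spread of the sample proportions.

```python
import random
from statistics import pstdev

def spread_of_sample_proportions(p, n, reps=1000, seed=0):
    """Standard deviation of the sample proportion across many samples of size n."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(reps):
        successes = sum(rng.random() < p for _ in range(n))
        proportions.append(successes / n)
    return pstdev(proportions)

small = spread_of_sample_proportions(0.5, 25)
large = spread_of_sample_proportions(0.5, 400)
print(f"n=25 spread: {small:.3f}   n=400 spread: {large:.3f}")
```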
Inferences in sample surveys and experiments
if individuals were randomly assigned to groups: inference about cause and effect
if individuals were randomly selected from the population: inference about the population
criteria for causation when no experiment is possible
the association is strong
the association is consistent
larger values of the explanatory variable are associated with stronger responses
the alleged cause precedes the effect in time
the alleged cause is plausible
confidence interval
estimate (such as a sample mean) ± margin of error (a range of plausible values)
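A rough Python sketch of mean ± margin of error, assuming made-up data and a large-sample approximation with a critical value of 1.96 for about 95% confidence; a small sample like this one would normally call for a t critical value instead:

```python
import math
from statistics import mean, stdev

def approx_confidence_interval(data, z_star=1.96):
    """Return (low, high) = mean - margin, mean + margin."""
    n = len(data)
    center = mean(data)
    margin = z_star * stdev(data) / math.sqrt(n)   # margin of error
    return center - margin, center + margin

# Made-up measurements
data = [68, 72, 75, 70, 69, 74, 71, 73, 70, 72]
low, high = approx_confidence_interval(data)
print(f"{mean(data):.1f} is inside ({low:.2f}, {high:.2f})")
```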