Probability Distribution
Gives all possible events (or values) a certain variable (typically denoted by X) can produce, along with the probabilities for those events/values. The variable can be quantitative or categorical. All probabilities must be between 0 and 1 (inclusive), and all probabilities must sum to 1.
Discrete Random Variables
Each value has an explicit corresponding probability; this also applies to categorical variables.
Continuous Random Variables
Can take on infinitely many values. We look at intervals instead of single values; for a continuous variable there is no difference between ≤ and <.
Normal Distribution
Provides a good mathematical model for many symmetric biological variables, such as blood pressure, bone mineral density, height of plants, and weight of animals. The standard normal distribution has µ=0 and σ=1.
Mean (µ)
Describes the central peak around which the values are clustered (in a normal distribution)
Standard Deviation (σ)
Describes the spread of values around the mean (µ)
68-95-99.7 Rule
Data can be described by Normal distributions following this.
In the normal distribution with mean (µ) and standard deviation (σ):
Approximately 68% of the observations fall within σ of µ
Approximately 95% of the observations fall within 2σ of µ
Approximately 99.7% of the observations fall within 3σ of µ
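The three percentages above can be verified directly from the normal CDF. A minimal sketch using Python's standard-library `NormalDist` (no external packages assumed):

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1
Z = NormalDist(mu=0, sigma=1)

# P(mu - k*sigma < X < mu + k*sigma) for k = 1, 2, 3
for k in (1, 2, 3):
    coverage = Z.cdf(k) - Z.cdf(-k)
    print(f"within {k} sigma: {coverage:.4f}")
```

The printed values come out to approximately 0.6827, 0.9545, and 0.9973, matching the rule.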
Probability
Can be represented by the “area under the curve” and only intervals will have non-zero probability
Z-Score
Tells us how many standard deviations an observation falls from the mean. Standardizes any distribution so that it can be directly compared with others. Shows a value's relative position within a distribution. Can also be used to find the probability of a randomly selected value falling above or below a value, or between two values, in the distribution.
Positive Z-Score
Indicates the observation is above the mean
Negative Z-Score
Indicates the observation is below the mean
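The z-score computation is just one line; the numbers below (heights with µ = 170 cm and σ = 10 cm) are hypothetical, chosen only to illustrate the sign convention:

```python
def z_score(x, mu, sigma):
    """Number of standard deviations x lies from the mean mu."""
    return (x - mu) / sigma

# Hypothetical heights: mu = 170 cm, sigma = 10 cm
print(z_score(185, mu=170, sigma=10))  # 1.5  -> above the mean
print(z_score(160, mu=170, sigma=10))  # -1.0 -> below the mean
```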
Parameter
Is a numerical value summarizing the population data (mean, median, mode, range, maximum). It is typically unknown.
Statistic
Is a numerical value summarizing the sample data (mean, median, mode, range, maximum, etc.). If we have the data, we can calculate it. It can serve as an estimate for the population parameter.
Symbol for Sample Mean
x̄
Symbol for Population Mean
µ
Symbol for Sample Proportion
p̂
Symbol for Population Proportion
p
Symbol for Sample Standard Deviation
s
Symbol for Population Standard Deviation
σ
Symbol for Sample Variance
s²
Symbol for Population Variance
σ²
Symbol for Sample Size
n
Symbol for Population Size
N
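The sample statistics above map directly onto Python's standard-library `statistics` module. A sketch with a hypothetical sample of 8 measurements:

```python
import statistics

# Hypothetical sample of n = 8 measurements
sample = [4.2, 5.1, 3.8, 4.9, 5.4, 4.0, 4.7, 5.0]

n = len(sample)                   # n, the sample size
x_bar = statistics.mean(sample)   # x-bar, the sample mean
s = statistics.stdev(sample)      # s, the sample SD (divides by n - 1)
s2 = statistics.variance(sample)  # s squared, the sample variance

print(n, x_bar, s, s2)
```

Note that `stdev`/`variance` are the sample versions (dividing by n − 1), while `pstdev`/`pvariance` are the population versions (σ, σ², dividing by N).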
Sampling Distribution
Is the collection of all possible sample statistics we could obtain from samples of the same size from a population
Center of Sampling Distribution
Will ALWAYS be the value of the population’s parameter, regardless of sample size
Randomization and Independence Assumption
The sampled values must have been randomly selected or resulted from an experiment with random assignment, and they should be independent of each other
Sufficient Sample Size Assumption
The sample size, n, must be large enough that the sampling distribution is not truncated at either end. This is generally satisfied by the 15 successes and failures condition
Success/Failure Condition
The sample size must be big enough so that both the number of successes, np, and the number of failures, nq, are at least 15
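The success/failure condition is a two-part check; a minimal sketch (the sample sizes and proportions below are made up for illustration):

```python
def success_failure_ok(n, p):
    """Both expected successes (np) and failures (nq) must be at least 15."""
    return n * p >= 15 and n * (1 - p) >= 15

print(success_failure_ok(100, 0.5))   # True: 50 successes, 50 failures expected
print(success_failure_ok(100, 0.05))  # False: only 5 successes expected
```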
Sampling Distribution of a Sample Proportion
Is approximately normal, centered at the population proportion p, with standard error √(p(1−p)/n), provided the randomization and success/failure conditions are met
Standard Error
Is the standard deviation of the sampling distribution
If we predict the SE using σ (the population standard deviation), then the sampling distribution of x-bar is approximately normal as long as…
If the population has a normal distribution
If the data (from our sample) are approximately bell-shaped, we can infer that the population from which they are drawn is also approximately bell-shaped
For large n (n≥30), the sampling distribution of x-bar is approximately a normal distribution regardless of the distribution of the population. This is called the Central Limit Theorem (CLT)
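The CLT can be illustrated by simulation. A sketch, assuming a uniform (non-normal) population on [0, 1], whose mean is 0.5 and whose SD is 1/√12 ≈ 0.2887; with n = 30, the means of repeated samples should pile up near 0.5 with spread near σ/√n ≈ 0.0527:

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Population: Uniform(0, 1) -- not a normal distribution
n, reps = 30, 2000

# Draw 2000 samples of size 30 and record each sample mean
x_bars = [statistics.mean(random.random() for _ in range(n)) for _ in range(reps)]

print(statistics.mean(x_bars))   # near the population mean, 0.5
print(statistics.stdev(x_bars))  # near sigma / sqrt(n), about 0.0527
```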
Sample Average (x-bar)
Is a statistic with a sampling distribution of predictable shape, center, and variation
Summary of the sampling distributions
Confidence Interval
Interval estimate of a population parameter (based on the sampling distribution)
Test of Significance
Assessing the evidence for or against a claim about the population parameter (based on the sampling distribution)
“Plus Four” Adjustment
Add two observations to the number of successes and four observations to the total sample size (two successes and two failures). For large sample sizes it makes very little difference, but it can improve accuracy for smaller sample sizes. Use the adjusted counts throughout the whole formula, not just in the SE.
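The plus-four adjustment can be sketched as a small function; the counts below (18 successes in n = 25) are hypothetical, and `inv_cdf` from Python's standard-library `NormalDist` supplies the critical value z*:

```python
from statistics import NormalDist

def plus_four_interval(successes, n, confidence=0.95):
    """Plus-four CI for a proportion: add 2 successes and 2 failures."""
    p_tilde = (successes + 2) / (n + 4)          # adjusted proportion
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    se = (p_tilde * (1 - p_tilde) / (n + 4)) ** 0.5  # adjusted n in SE too
    return p_tilde - z_star * se, p_tilde + z_star * se

lo, hi = plus_four_interval(successes=18, n=25)
print(f"({lo:.3f}, {hi:.3f})")  # roughly (0.521, 0.858)
```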
Changing the confidence level to a higher percentage
The interval would be wider
Pros of choosing a higher confidence level
We can be more confident that the interval contains the true population parameter
Cons of choosing a higher confidence level
The interval is wider, so the estimate is less precise
How would a smaller sample affect the width of the interval?
It would widen the interval (the standard error is larger)
How would a larger sample affect the width of the interval?
It would make the interval narrower (the standard error is smaller)
Hypothesis Testing
Another type of statistical inference; used to determine whether changes in the observed data are bigger than we would expect by chance; provides a formal structure for testing a specific value of the population parameter without measuring the whole population (based only on the sample results)
Hypothesis Testing Steps
State the hypothesis
Check the conditions and calculate the test statistic
Find the p-value
State the conclusion based on the context
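The four steps above can be sketched end-to-end as a one-proportion z-test. The scenario is hypothetical (H₀: p = 0.50 vs Hₐ: p > 0.50, with 34 successes in n = 50), and only Python's standard library is assumed:

```python
from statistics import NormalDist

# Step 1: state the hypotheses. H0: p = 0.50, Ha: p > 0.50 (one-sided).
p0, n, successes = 0.50, 50, 34
p_hat = successes / n

# Step 2: check conditions (np0 and nq0 at least 15), then the test statistic.
assert n * p0 >= 15 and n * (1 - p0) >= 15
se = (p0 * (1 - p0) / n) ** 0.5   # SE under H0
z = (p_hat - p0) / se             # standard errors from the hypothesized value

# Step 3: p-value is the tail area beyond z in the direction of Ha.
p_value = 1 - NormalDist().cdf(z)

# Step 4: conclusion at alpha = 0.05.
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```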
Null Hypothesis (H₀)
Is a statement of "no effect" or "no difference"; a hypothesis test is designed to test the strength of the evidence against this (it refers to the true population parameter, not the sample p-hat)
H₀: p = hypothesized value
Never “accept” the null hypothesis; rather we speak in terms of rejecting or failing to reject
Alternative Hypothesis (Hₐ)
The more specific claim about the population that we are trying to find evidence for (refers to the true population parameter) (not the sample p-hat); the direction is determined by the comparison and will be informed by the context of the problem
Hₐ: p <, >, or ≠ hypothesized value
Hypothesized Value
Is the value being tested, often the value previously believed to be true
One-Sided
The alternative hypothesis is ________ if it states that a parameter is larger than or that it is smaller than the null hypothesis value
Two-Sided
It is _______ if it states that the parameter is different from the null value (it could be either smaller or larger)
Test Statistic
Measures how much evidence we have in favor of Hₐ; measures how many standard errors the current sample proportion p-hat is from the hypothesized value; it is denoted by z since it follows the standard N(0,1) distribution
P-Value
Quantifies exactly how unlikely a particular sample proportion is; it is defined as the probability of getting a sample proportion at least as extreme as the current p-hat (in the direction given by Hₐ), under the assumption that H₀ is true; graphically, it is the tail area of the normal distribution beyond the test statistic
Small p-value
The observation would be really unlikely if the null hypothesis were true, so we should conclude the null hypothesis is false and go with the reality presented in the alternative hypothesis
Large p-value
The observation would happen fairly often just by chance when the null hypothesis is correct, so we have no real evidence that it is wrong
Significance Level (α)
The formal cutoff between a “small” and “large” p-value
If P-value<α
Reject H₀; there is sufficient evidence to reject the null hypothesis at this significance level; there is sufficient evidence, at significance level α, that Hₐ is true
If P-value ≥ α
Fail to reject H₀; there is insufficient evidence to reject the null hypothesis at this significance level; there is insufficient evidence, at significance level α, that Hₐ is true
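The decision rule in the two cards above reduces to a single comparison; the p-values passed in below are hypothetical:

```python
def decision(p_value, alpha=0.05):
    """Compare the p-value to the significance level alpha."""
    if p_value < alpha:
        return "Reject H0"       # sufficient evidence for Ha at this level
    return "Fail to reject H0"   # insufficient evidence for Ha at this level

print(decision(0.012))  # Reject H0
print(decision(0.270))  # Fail to reject H0
```

Note the boundary: a p-value exactly equal to α falls in the "fail to reject" case, matching "If P-value ≥ α" above.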
Interpretation of P-value
The chance of observing ______ (more or less) ________ in a random sample of ______ (n), if the true proportion of ________ is _______%.
Alternative hypothesis and larger sample size
If the alternative hypothesis is true, a larger sample size provides greater evidence for it
Large Test Statistic
Means that in the Normal curve, the tail area beyond it becomes smaller
Larger sample size and P-value
Often, if we have a P-value slightly above 0.05 (implying insufficient evidence of a change), taking a larger sample might bring the P-value below 0.05 (implying sufficient evidence of a change).
Collecting more data yielded the additional evidence needed to rule out randomness and conclude that the value has changed
Confidence interval and hypothesis testing
This can be used to predict the outcome of a two-sided hypothesis: simply check if the hypothesized value is contained inside the interval or not
If hypothesized value is outside of the confidence interval
This is clearly not a plausible value and should be rejected. That is, there is sufficient evidence to conclude the value has changed
If the hypothesized value is inside the interval
It is a plausible value, and there is insufficient evidence to reject it
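The interval-based shortcut in the cards above is a simple containment check; the interval (0.52, 0.86) below is a hypothetical 95% CI for a proportion:

```python
def two_sided_outcome(hypothesized, interval):
    """Predict a two-sided test outcome from a confidence interval."""
    lo, hi = interval
    if lo <= hypothesized <= hi:
        return "fail to reject H0"  # plausible value: inside the interval
    return "reject H0"              # implausible value: outside the interval

# Hypothetical 95% CI for a proportion: (0.52, 0.86)
print(two_sided_outcome(0.50, (0.52, 0.86)))  # reject H0
print(two_sided_outcome(0.60, (0.52, 0.86)))  # fail to reject H0
```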
Hypothesis tests
__________ are not definitive proof of anything
Type I Error (α)
If H₀ is really true, but we reject it (false positive)
Type II Error (β)
If Hₐ is really true, but we continue to assume H₀ (false negative)
Error Chart
H₀ true and we fail to reject H₀ → correct decision; H₀ true and we reject H₀ → Type I error (α); Hₐ true and we fail to reject H₀ → Type II error (β); Hₐ true and we reject H₀ → correct decision
Minimizing chance of error
Decreasing the chance of one type of error (Type I or Type II) increases the chance of the other, unless you increase the sample size
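The meaning of α as the Type I error rate can be checked by simulation. A sketch: we make H₀: p = 0.5 actually true, run many hypothetical two-sided tests at α = 0.05, and count how often we wrongly reject:

```python
import random
from statistics import NormalDist

random.seed(2)  # fixed seed so the run is repeatable

# Suppose H0: p = 0.5 is actually TRUE, and we test at alpha = 0.05.
alpha, p0, n, reps = 0.05, 0.5, 100, 4000
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided cutoff, about 1.96

rejections = 0
for _ in range(reps):
    successes = sum(random.random() < p0 for _ in range(n))
    z = (successes / n - p0) / (p0 * (1 - p0) / n) ** 0.5
    if abs(z) > z_crit:
        rejections += 1  # a Type I error: H0 is true but we rejected it

print(rejections / reps)  # close to alpha = 0.05
```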