Scientifically backed study method

Review terms and definitions

Study with MC, T/F, and other questions

Take a test on your terms and definitions

138 Terms

😃 Not studied yet (138)

graph

a set of edges connected by vertices

vertex/vertices

the points that connect each edge together

edges

the "links" between vertices

On a city map, ____ are vertices and ____ are edges

islands; bridges

valence/degee

the number of edges attached to a vertex

path

a sequence of vertices
such that there is an edge
between each vertex and the
next one.

circuit

a path that starts and ends at the same vertex

eulerian circuit

a circuit that crosses each edge exactly once

A graph has a Eulerian circuit if and only if .....

the degree of all its vertices
is even.

eulerian path

a path that crosses each edge exactly once.

A graph has a Eulerian path if and only if...

• either the degree of all its vertices is even (then it is even better: it has a Eulerian circuit);
• or the degree of all its vertices except 2 of them is even (then it has a Eulerian path, but no Eulerian circuit)

hamiltonian circuit

a circuit that visits each vertex of the graph exactly once

hamiltonian path

a path that visits each vertex of the graph exactly once.

complete graph

a graph where there is exactly one edge between different pairs of vertices

weighted graph

a graph where there is a number attached to each edge

The weight of a path is the ....

sum of the weights on all edges encountered on this path (counted once for every passage)

traveling salesman problem

consists in finding a path / circuit that visits all
vertices of a weighted graph (a Hamiltonian path / circuit); and that has the smallest
weight.

brute force solution (for TSP)

1. Look at all possible Hamiltonian paths.
2. Check the weight of each of these paths.
3. Choose the one with the smallest weight.

n factorial

In a complete graph with n vertices, there are n! Hamiltonian paths
Even if n is small, n! is humongous

nearest-neighbor algorithm

1. Start from any vertex.
2. Jump to the closest vertex (that is, the edge to this vertex has the smallest weight).
3. Then go to the closest vertex which has not been visited yet.
4. Continue until you have seen all the vertices.
5. If you want a circuit, eventually return to the starting vertex.

The nearest-neighbor algorithm provides a path with the following properties:

• It is usually “short”.
• Its length can depend on the vertex that we start from.
• Its length can depend of the choices that we make (if there are edges with the same length).
• It usually does NOT provide a solution to the TSP (meaning, it is not always the shortest path).

sorted-edge algorithm

1. List the weight of all edges in increasing order.
2. Choose the first edge (the one with the smallest weight).
3. As long as not all vertices are included, add the next edge in the list (except if it makes three chosen edges meet at a vertex or it closes a circuit that does not include all the vertices)
4. When a Hamiltonian circuit has been created, we are done

The sorted-edge algorithm provides an approximate solution to the TSP. However...

-it is usually not the shortest path;
-its length can depend of the choices that we make (if there are edges with the same length);
- it only works to find a circuit

tree

a graph which contains no circuits

spanning tree

a tree that has only the edges of the original graph (but not always all the edges) and has all the vertices of the original graph

applications for a tree problem are:

Electrical grid: connect every home to the electrical grid at the
minimum cost.
Broadcasting: transmit one message to all recipients in the fastest way.
Circuit board design: connect all parts at minimum cost.

How to obtain a spanning tree

• Start from the graph.
• Remove edges one by one until no circuit remains.

The weight of a spanning tree is....

the sum of the weights of all its edges

minimum spanning tree

a spanning tree whose weight is minimum amongst all the possible spanning trees for a graph

kruskal's algorithm (MST)

1. List all the edges in increasing order of their weights.
2. At each step, add the next edge of the list if it does not create a circuit.
3. Stop when we get a spanning tree.

Prim's algorithm (MST)

1. Choose any vertex.
2. Look at all edges attached to this vertex, and choose the one with the minimum weight.
3. At each step, look at all the vertices that we saw, and all edges connected to these vertices, and choose the one with minimum weight, except if it creates a circuit.
4. Stop when we get a spanning tree

Prim’s algorithm is conceptually more _______ than Kruskal’s, but Kruskal’s needs to compare the weight of all the edges at the _______.

complicated; same time

coloring problem

consists in giving a color to each vertex,
in such a way that two vertices that are connected by an edge have different colors.

chromatic number

the minimum number of colors
needed to solve the vertexcoloring problem

how do you find the chromatic number?

1. Start from a vertex and give it a color.
2. Give a different color to a neighboring vertex.
3. Continue by going from neighboring vertex to neighboring vertex.
4. Try to use as few colors as possible: reuse colors as much as possible

planar graph

a graph that can be drawn without any edges intersecting

According to the four-color theorem, the chromatic number of a planar graph is at most ___

4

statistics

the art and science of
• collecting data;
• summarizing and analyzing data;
• presenting data;
• interpreting data

population

all the individuals for a problem

sample

all the individuals that we will actually gather data from

variables

the information that we gather from the experiment (ex: intended vote, grades, size, opinion, …)

discrete variable

an integer/whole number (1, 2, 3)
ex: number of pets, number of classes taken

continuous variable

can be any number, not only whole; like 2.5, 3.9, etc
ex: weight, height, price, size

how to find the frequency distribution of a variable:

1. Choose some non-overlapping intervals of values (for instance 0, 1 –2, 3 – 5, 6 – 10, more than 10).
2. Count how many times the variables takes such values.

how to find the relative frequency distribution of a variable:

1. Choose some non-overlapping intervals of values (for instance 0, 1 –2, 3 – 5, 6 – 10, more than 10).
2. Count the percentage of times the variables take such values.

histogram

a graphical representation of the distribution of a
variable using bars of different heights.

outlier

an individual or data that does not fit the overall pattern

symmetric distribution

A distribution is symmetric if we can draw a vertical line on the histogram, and both sides are approximate mirror images of each other

skewed right distribution

A distribution is skewed right if the longer tail of the histogram is on the right side

skewed-left distribution

A distribution is skewed left if the longer tail of the histogram is on the left side

mean

the average of all the values

median

a list of values is a number such that half the values are higher than this number and half the values are lower than this number.

mode

the most frequently occurring value in a set of data

standard deviation

the average amount a value deviates from the mean; the square root of the variance

variance

The square of the standard deviation

how to find the variance and standard deviation

1. Compute the mean
2. Compute the deviations (x1-mean), (x2-mean),...
3. Compute the squared deviations (x1-mean)^2, (x2-mean)^2,....
4. Sum all these quantities.
5. Divide the result by n− 1 to get the variance.
6. Take the square root of the result to get the standard deviation

normal distribution

When the shape of a distribution is more or less
"regular".
• It looks like a bell curve.
• The curve is an idealized version of the distribution

A normal distribution is characterized by two parameters:

1. The mean is where the curve reaches its maximum.
2. The standard deviation is
the width of the curve, between the two points where
it changes curvature (inflection points).

The 68-95-99.7 rule

For a normal distribution with a mean µ and a standard deviation σ:
68% of the values are between µ-σand µ+σ
95% of the values are between µ– 2σ and µ+ 2σ
99.7% of the values are between µ– 3σ and µ+ 3σ

For a normal distribution, less than ___% of the values are more than 2σ from the mean and less than ____% of the values are more than 3σ from the mean.

5%; 0.3%

distributions

we study one variable, and wish to
analyze it: average value, spread of the values, etc.

relationships

we study two variables and wish to know whether they are related

response variable

measures the outcome of a study; is typically
directly observed.

explanatory variable

a variable that explains the changes of a response variable; is typically hidden / not obvious.

correlation coefficient

a numerical value that measures the strength and direction of the relationship between two variables (usually denoted by r)

correlation

a relationship between two or more things

If r> 0, then it is a ________ association, if r< 0, then it is a ___________ association

positive; negative

The closer r is to 1 or -1, the more _______ associated/ correlated the variables are.

strongly

Correlation only makes sense for _______ values and ________ relationships

numerical; straight-line

If two variables are highly correlated, then....

• when one goes up, the other one goes up (r close to 1)
• or when one goes up, the other one goes down (r close to -1)

correlation does not imply _______

causation

the equation of a line is

y=mx+b

m in a regression line equation is the _____

slope (rise over run; (y2-y1/x2-x1)

b is the ______ or a regression line equation

y-intercept

slope

rise over run
(y2-y1/x2-x1)

(least-square)regression line

the line that fits the closest to all the points

how to find the least-square regression line

1. Take a straight line.
2. Compute the vertical distances
from each point to the line.
3. Square these distances.
4. Sum all these quantities.
5. The line that makes this sum as
small as possible is the least-square regression line

The regression line allows to make predictions only if the correlation is ______ (r close to -1 or +1) and only works for ______ correlations

high; linear

proportion

the number of times something happens out of the total outcomes

occam's razor

the principle that entities should not be multiplied needlessly; the simplest of two competing theories is to be preferred

statistical inference

the method of drawing conclusion about an
entire population based on data from a sample

parameter

a fixed quantity that describes some characteristic of
the population (for instance average height, proportion of women, result of a vote, opinion about hippos, etc.)

statistic

a quantity that describes some characteristic of a
sample

central limit theorem

the larger the population, the closer the results will be to a normal distribution

confidence interval

the amount of confidence we have that a certain range of values will fall within 95% of the possible outcomes for that data

how to find the 95% confidence interval

1. find the approximate standard deviation (square root of p(1-0)/n)
2. take the percentage/probability you have and subract and add that by 2 times the approximate standard deviation
ex: if your percentage is 56% and your approximate standard deviation is 5%, then you would calculate for 56%-(2*5%) and 56%+(2*5%)

Each poll comes with a _______, which is given by the standard deviation

Each poll comes with a margin of error

Sample Space (S)

the set of all possible outcomes of a
random phenomenon

probability of an outcome (an element of the sample space)

the proportion of times the outcome occurs in an infinitely long series of
repetitions

Probabilities can be expressed as....

decimals, percentages, or fractions

The probability of an outcome is between __ and ___ (inclusive)

Probability of a coin flip

For a coin flip: the probability of H or T is 1/2 = 0.5 = 50%

Probability of picking a card

the probability of each card (1S, 1H, …) is 1/52

event (E)

any set of outcomes of a random phenomenon; an event is the occurrence of something.

probability of events

For an event (E) we denote by P(E) the probability of E, that is, the proportion of times that the event E occurs when we repeat the same experiment infinitely many times

The goal of probability is to compute the _________: the
proportion of time this event will occur when we repeat the _______ experiment infinitely many times

probability of events; same

property

The probability of an event is the sum of the individual probabilities of each outcome that it contains

The sum of the probabilities of all outcomes must be ____

1

The complement of an
event A

-the event that happens when A does
not occur; denoted by A^c
-It contains all the outcomes of
the sample space that; does not
contain

The probability of the complement of an event is 1 minus the probability of the ______

event

disjoint events

two events that have no outcome in common; they events cannot occur simultaneously

Intersection

the event that both A and B occur
ex: “Get heads at least once and tails at least twice”

Union

For two events A and B, the union is that the event that either A or B (or both) occur

If two events are disjoint, the probability of their union is the ____ of their probabilities

sum

The probability of the _______ of two events is the sum of their probabilities minus the probability of their intersection

union

independent events

the occurrence of one has no influence on the probability of occurrence of the other; knowing that one event occurs does not change the probability that the other event occurs.

conditional probability

the probability that A occurs given that B occurred; can also be switched to the event B occurring if event A occurred, etc

false positive

nothing is present, but the test is positive

false negative

something is present, but the test is negative

two-player total-conflict game

-There are two players.
• When one player loses, the other one wins.
• More generally, what is better for one player is worse for the other
player.
• Cooperation is never beneficial

Examples of a non total-conflict

• Cooperative games
• Business decisions
• Arms race
• Cold war
• Fighting a pandemic

Example of total-conflict

• Chess
• Poker
• Gambling
• Sports

A two-player game can be represented by a _________, that is, a table of numbers representing the pay-off in each case, from the point of view of one of the players

pay-off matrix

Some possible meanings of a pay-off

• Amount of money.
• Quantity of something.
• Worth.
• Utility (happiness, satisfaction)

maximin strategy

1. finding the minimum along each row
2. finding the maximum of all these values.This value is called the maximin (the maximum of the minimum).
-Choosing the corresponding row is playing the maximin strategy

minimax strategy

1. finding the maximum along each column;
2. finding the minimum of all these values.This value is called the minimax (the minimum of the maximum).
- Choosing the corresponding column is playing the minimax strategy.

The minimax and maximin strategies are both ___________ strategies

conservative

The maximin strategy allows Player 1 to obtain a guaranteed minimal gain, meaning that....

Player 1 is sure to obtain the maximin, and could obtain
more

The minimax strategy allows Player 2 to sustain a guaranteed maximum loss, meaning that....

Player 2 is sure to pay at most the minimax, and could
pay less

saddle point

a position in the pay-off matrix that is both a row minimum, and a column maximum

Steps to find a saddle point

1. Circle the minimum on each row.
2. Box the maximum in each column.
3. The value that are both circled and boxed are the saddle points

All saddle points have the same ______

value

If there is a saddle point, then the maximin, the minimax, and the saddle points all have the _________

same value

If there is a saddle point, then the value of the saddle point is called the ___________

value of the game

Interpretations of total-conflict games

-If Player 1 plays their maximin strategy, Player 1 is guaranteed to
obtain at least the maximin regardless of what Player 2 does.
- If Player 2 plays their minimax strategy, Player 1 is guaranteed to
obtain at most the minimax regardless of what Player 1 does.
-If there is a saddle point, then maximin = minimax, so none of them can do better. So the maximin strategy is an optimal strategy for Player 1, and the minimax strategy is an optimal strategy for Player 2.

If two perfectly rational players
play a game with a saddle point,
then in the long run, it will
stabilize, meaning that....

• Player 1 will always play their maximin strategy
• Player 2 will always play their minimax strategy
-Player 1 will receive the value of the game.

If there is no saddle point, then the maximin is strictly ______ than the minimax

smaller; maximin < minimax

If Player 1 plays their maximin strategy and Player 2 plays their minimax strategy, then Player _____ gets a pay-off between the maximin and the minimax

one; maximin≤ pay-off≤ minimax

pure strategy

If a player uses a fixed (not random) strategy (for instance maximin or minimax)

mixed strategy

If a player uses a (random) mix of strategies, (for instance kick left or right with probability 1/2)

An optimal strategy for Player 1 is....

a (pure or mixed) strategy that guarantees the highest average pay-off for Player 1, regardless of what Player 2 does

An optimal strategy for Player 2 is.....

a (pure or mixed) strategy that guarantees the lowest average pay-off Player 1, regardless of what Player 1 does

When both players play their _______ strategies, then Player 1 will get an expected pay-off of exactly Ek=Eg. This is called the value of the
game.

optimal

partial-conflict game

A game that is not a total-conflict game; players can benefit from cooperation

Examples of partial-conflict games

• Business decisions.
• Choice of a restaurant.
• Social norms.
• Laws

pay-off matrix for a partial-conflict game

In a partial-conflict game, gain/loss for one player can also mean gain/loss for the other player, meaning that he pay-off matrix needs to include the gain for each player in
each of the situations, such as (Gain for P1, Gain for P2)l

Nash equilibrium

a strategy such that no player can benefit by changing their strategy, while the other player’s strategy remains unchanged

How to check that a position in the matrix is a Nash equilibrium

• Is any other strategy better for Player 1? In other words, is any of the first numbers in the same column larger?
• Is any other strategy better for Player 2? In other words, is any of the second numbers in the same row larger?
• If no to both questions: it is a Nash equilibrium