EPPS Math and Coding Camp

Probability

Instructor: Prajyna Barua

9.1.2 Classical Probability

Classical probability is the theoretical analysis of events in the absence of data. The three key concepts are outcome, event, and sample space.

Outcomes are anything that might happen in the world.

Events are composed of one or more outcomes.

Events can be divided into two groups with respect to their cause, those that will happen with some probability given certain conditions and those that will happen (or not happen) with certainty given certain conditions.

A sample space is the set of all possible outcomes: it is a list of every outcome we might observe.

If we define the event of interest as the sample space, \(S\), then the probability of that event is the sum of the probabilities of each outcome, which is 1: \(\Pr(S) = \Pr(o_1) + \Pr(o_2) + \Pr(o_3) + \ldots + \Pr(o_n) = 1.0.\)

With outcome, event, and sample space defined, we can define the classical probability of an event: \[\Pr(e) = \frac{\text{No. of outcomes in event } e}{\text{No. of outcomes in the sample space}}.\]
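The counting definition above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function name `classical_probability` and the die example are assumptions, and the calculation presumes that every outcome in the sample space is equally likely.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """Share of equally likely outcomes in the sample space that belong to the event."""
    event = set(event)
    sample_space = set(sample_space)
    if not event <= sample_space:
        raise ValueError("every outcome in the event must be in the sample space")
    return Fraction(len(event), len(sample_space))

# Hypothetical example: one roll of a fair six-sided die.
die = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}

p_even = classical_probability(even, die)
p_sample_space = classical_probability(die, die)
print(p_even)          # 1/2
print(p_sample_space)  # 1
```

Treating the whole sample space as the event returns 1, in line with \(\Pr(S) = 1\).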

9.1.3 Independence, Mutual Exclusivity and Collective Exhaustivity

Two events are independent if the probability that one occurs does not change as a consequence of the other event’s occurring.

Two events are mutually exclusive when one cannot occur if the other has occurred.

Collective exhaustivity refers to ensuring that a set of outcomes or events cover the entire sample space, i.e. all possible outcomes are included and accounted for.
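If events are represented as sets of outcomes, mutual exclusivity and collective exhaustivity become simple set checks. The sketch below is one possible way to write them; the helper names and the die example are illustrative assumptions rather than anything defined in the notes.

```python
from itertools import combinations

def mutually_exclusive(events):
    """True if no outcome belongs to more than one of the events."""
    return all(set(a).isdisjoint(b) for a, b in combinations(events, 2))

def collectively_exhaustive(events, sample_space):
    """True if the events together cover every outcome in the sample space."""
    covered = set().union(*events)
    return set(sample_space) <= covered

# Hypothetical example: one roll of a fair six-sided die.
die = {1, 2, 3, 4, 5, 6}
low, high, even = {1, 2, 3}, {4, 5, 6}, {2, 4, 6}

print(mutually_exclusive([low, high]), collectively_exhaustive([low, high], die))  # True True
print(mutually_exclusive([low, even]), collectively_exhaustive([low, even], die))  # False False
print(mutually_exclusive([{1, 2}, {5, 6}]), collectively_exhaustive([{1, 2}, {5, 6}], die))  # True False
```

The three cases show that the two properties are separate: a set of events can have both, neither, or only one of them.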

Joint and Conditional Probabilities

A joint probability is the probability of a compound event. If the simple events of a compound event are independent, then their joint probability is the product of the probabilities of each simple event.

The joint probability of independent events is the probability of both events occurring, and we calculate it as the product of the probabilities of each individual event.

For two mutually exclusive events, the joint probability that one or the other occurs is not the product of the simple probabilities. Instead, it is the sum of the simple probabilities: given that \(p(y_D) = 0.4\) and \(p(y_R) = 0.5\), \(p(y_D \text{ or } y_R) = 0.4 + 0.5 = 0.9\).
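Both rules can be checked by counting outcomes. The sketch below uses a hypothetical example that does not appear in the notes: two rolls of a fair die, with all 36 ordered pairs assumed equally likely. The helper `pr` is an illustrative name.

```python
from fractions import Fraction
from itertools import product

# Hypothetical example: two rolls of a fair die, 36 equally likely outcomes.
rolls = list(product(range(1, 7), repeat=2))

def pr(condition):
    """Probability that `condition` holds, found by counting outcomes."""
    hits = sum(1 for roll in rolls if condition(roll))
    return Fraction(hits, len(rolls))

# Independent events: the first roll is even; the second roll is 5 or 6.
p_first_even = pr(lambda r: r[0] % 2 == 0)
p_second_high = pr(lambda r: r[1] >= 5)
p_both = pr(lambda r: r[0] % 2 == 0 and r[1] >= 5)
print(p_both == p_first_even * p_second_high)  # True

# Mutually exclusive events: the rolls sum to 2; the rolls sum to 12.
p_sum_2 = pr(lambda r: sum(r) == 2)
p_sum_12 = pr(lambda r: sum(r) == 12)
p_either = pr(lambda r: sum(r) in (2, 12))
print(p_either == p_sum_2 + p_sum_12)  # True
```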

When the probability of one event depends on whether another event occurs, we call it a conditional probability and write, for example, \(p(y|x,z)\), which is read “the probability of \(y\) given \(x\) and \(z\).”

9.2 COMPUTING PROBABILITIES

9.2.1 Notation and Some Rules

First, consider the probability that an event \(A\) occurs. We denote this \(Pr(A)\).

All probabilities lie between zero and one, so \(Pr(A) \in [0,1]\). We say \(A\) is a deterministic event if \(Pr(A) \in \{0,1\}\) and a random, probabilistic, or stochastic event otherwise. If \(S\) is the sample space containing all events that might happen, then \(Pr(S) = 1\). If \(Pr(A) = 0\), then \(A\) cannot happen.

\(Pr(A|B)\) is the conditional probability of \(A\) given \(B\). In other words, it is the probability that \(A\) occurs given that \(B\) has already occurred. If \(A\) and \(B\) are independent events, then the fact that \(B\) has already occurred doesn’t influence the probability that \(A\) will occur. So, for independent events, \(Pr(A|B) = Pr(A)\).
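When all outcomes are equally likely, \(Pr(A|B)\) can be found by restricting attention to the outcomes in \(B\) and counting how many of them are also in \(A\). The following sketch applies that idea to a hypothetical die roll; the function name `conditional` and the chosen events are assumptions made for illustration.

```python
from fractions import Fraction

def conditional(a, b):
    """Pr(A|B) for equally likely outcomes: the share of B's outcomes that are also in A."""
    a, b = set(a), set(b)
    if not b:
        raise ValueError("cannot condition on an event with no outcomes")
    return Fraction(len(a & b), len(b))

# Hypothetical example: one roll of a fair die.
sample_space = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}
at_most_four = {1, 2, 3, 4}
at_least_four = {4, 5, 6}

p_even = conditional(even, sample_space)              # Pr(even): 3 of 6 outcomes
p_even_given_low = conditional(even, at_most_four)    # 2 of 4 outcomes
p_even_given_high = conditional(even, at_least_four)  # 2 of 3 outcomes

print(p_even_given_low == p_even)   # True: these two events are independent
print(p_even_given_high == p_even)  # False: these two events are not independent
```

Learning that the roll is at most four leaves the probability of an even roll at 1/2, whereas learning that it is at least four raises it to 2/3.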

The symbols for set union (\(\cup\)) and intersection (\(\cap\)) also apply to events. \(A \cup B\) is the compound event where either \(A\) or \(B\) happens, or both. Thus, we read \(A \cup B\) as “\(A\) or \(B\).” So the compound event happens if any of the events in \(A\) or \(B\) happen.

\(A \cap B\) is the compound event where both \(A\) and \(B\) happen. Thus, we read \(A \cap B\) as “\(A\) and \(B\).”

The rule for “and” looks like this:

\[Pr(A \cap B) = Pr(B|A)Pr(A) = Pr(A|B)Pr(B)\]

When \(A\) and \(B\) are independent, \(Pr(A|B) = Pr(A)\) and \(Pr(B|A) = Pr(B)\), and this rule reduces to \(Pr(A \cap B) = Pr(A)Pr(B)\).

The rule for “or” looks like this:

\[Pr(A \cup B) = Pr(A) + Pr(B) - Pr(A \cap B)\]

When \(A\) and \(B\) are mutually exclusive, they cannot both happen, so \(Pr(A \cap B) = 0\) and the rule reduces to \(Pr(A \cup B) = Pr(A) + Pr(B)\).
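The general rules matter most when events are neither independent nor mutually exclusive. As a rough check, the sketch below evaluates both sides of each rule for a hypothetical pair of such events on one roll of a fair die; the helper names `pr` and `pr_given` are illustrative.

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair die.
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # the roll is even
B = {1, 2, 3}   # the roll is 3 or less

def pr(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

def pr_given(event, condition):
    """Conditional probability, counted within the outcomes of `condition`."""
    return Fraction(len(event & condition), len(condition))

# Rule for "and": Pr(A and B) = Pr(B|A) Pr(A) = Pr(A|B) Pr(B)
p_and = pr(A & B)
print(p_and, pr_given(B, A) * pr(A), pr_given(A, B) * pr(B))  # 1/6 1/6 1/6

# Rule for "or": Pr(A or B) = Pr(A) + Pr(B) - Pr(A and B)
p_or = pr(A | B)
print(p_or, pr(A) + pr(B) - p_and)  # 5/6 5/6
```

Here \(Pr(A)Pr(B) = 1/4\) differs from \(Pr(A \cap B) = 1/6\), so the shortcut for independent events would give the wrong answer, and simply adding \(Pr(A)\) and \(Pr(B)\) would give 1 rather than 5/6.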

9.2.3 Bayes’ Rule

Let \(B\) and \(A\) be two events of interest, and let \(\sim B\) (read “not \(B\)”) and \(\sim A\) represent the absence of those events.

We can write Bayes’ theorem in this simple case as follows:

\[ Pr(B|A) = \frac{Pr(A|B)Pr(B)}{Pr(A|B)Pr(B) + Pr(A|\sim B)Pr(\sim B)} \]

One can read this equation as follows: the posterior probability of \(B\) given \(A\) is the product of the prior probability of \(B\) and the probability of \(A\) given \(B\), divided by the sum of that same product and the product of the prior probability of not \(B\) and the probability of \(A\) given not \(B\).
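The two-event form of Bayes’ theorem translates directly into a small function. The sketch below is one way to write it; the name `bayes_posterior` and the input values are hypothetical, chosen only to show how the formula behaves.

```python
from fractions import Fraction

def bayes_posterior(prior_b, pr_a_given_b, pr_a_given_not_b):
    """Pr(B|A) computed from Pr(B), Pr(A|B), and Pr(A|~B)."""
    pr_not_b = 1 - prior_b
    numerator = pr_a_given_b * prior_b
    denominator = numerator + pr_a_given_not_b * pr_not_b  # this sum is Pr(A)
    if denominator == 0:
        raise ValueError("Pr(A) is zero, so Pr(B|A) is undefined")
    return numerator / denominator

# Hypothetical prior belief in B.
prior = Fraction(3, 10)

# A is equally likely with or without B: observing A leaves the belief unchanged.
uninformative = bayes_posterior(prior, Fraction(1, 2), Fraction(1, 2))
# A is four times as likely under B as under not-B: the belief in B rises.
informative = bayes_posterior(prior, Fraction(4, 5), Fraction(1, 5))

print(uninformative)  # 3/10
print(informative)    # 12/19
```

The first case illustrates a useful sanity check: evidence that is equally likely whether or not \(B\) holds should leave the posterior equal to the prior.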

Example: Stokes provides a table of Latin American leaders between 1982 and 1995.

She records whether, once in office, the politician adopted a security-oriented or an efficiency-oriented policy. The data reveal that 33 of 43 Latin American leaders elected between 1982 and 1995 adopted efficiency policies: \(Pr(e) = \frac{33}{43} = 0.77\). Given \(Pr(e)\), we can calculate the empirical probability that a politician adopted a security-oriented policy: \(Pr(s) = 1 - Pr(e) = 1 - 0.77 = 0.23\).

These prior beliefs suggest that the typical voter in Latin America will expect 77% of candidates to implement efficiency-oriented policies and 23% of candidates to implement security-oriented policies.

To use Bayes’ rule to update those beliefs in response to a campaign in which candidates make promises, we need to define the situation.

Consider a contest with two candidates, one of whom campaigns on an efficiency-oriented platform, the other of whom campaigns on a security-oriented platform.

The question is whether Bayes’ rule leads to the conclusion that the voter ought to revise his beliefs about the probability that each candidate will implement the policy on which he campaigned.

In other words, we want to know, for example, whether campaigning on an efficiency-oriented platform increases voters’ beliefs that the candidate will adopt an efficiency policy in office.

If we let \(\epsilon\) indicate a campaign promise of efficiency-oriented economic policy and \(e\) indicate the adoption of an efficiency policy in office, then this belief is \(Pr(e|\epsilon)\). Bayes’ rule lets us calculate this, if we know \(Pr(\epsilon|e)\), \(Pr(\epsilon)\), and \(Pr(e)\).

We already know that \(Pr(e) = 0.77\). Next we need the conditional probability that a candidate campaigned on an efficiency-oriented platform given that he adopted an efficiency policy in office: \(Pr(\epsilon|e)\).

Stokes’s table reveals that 16 of the 33 candidates who adopted efficiency policies also campaigned on an efficiency platform: \(Pr(\epsilon|e) = \frac{16}{33} = 0.48\). Finally, consider \(Pr(\epsilon)\). We expand this as above to get \(Pr(\epsilon) = Pr(\epsilon|e)Pr(e) + Pr(\epsilon|\sim e)Pr(\sim e)\).

We know the first term in the sum already, and also that \(Pr(\sim e) = 0.23\). This leaves \(Pr(\epsilon|\sim e)\): the conditional probability that a candidate campaigned on an efficiency-oriented platform given that he adopted a security-oriented (i.e., “not efficiency”) policy in office.

It turns out that none of the ten Latin American politicians who enacted security-oriented policies once in office campaigned on efficiency: \(Pr(\epsilon|\sim e) = \frac{0}{10} = 0\).

Plugging these into Bayes’ rule yields:

\[ Pr(e|\epsilon) = \frac{Pr(\epsilon|e)Pr(e)}{Pr(\epsilon|e)Pr(e) + Pr(\epsilon|\sim e)Pr(\sim e)} \]

\[ = \frac{0.48(0.77)}{0.48(0.77) + 0(0.23)} = \frac{0.37}{0.37 + 0} = 1 \]

The voter’s posterior belief is 1.0, which is a considerable increase from 0.77, the voter’s prior belief. Thus, the campaign has a substantial impact: on knowing that the candidate is promising to implement efficiency-oriented policies, the voter shifts from being confident that the candidate will do so (0.77 probability) to being certain that the candidate will do so (1.0 probability).
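The same calculation can be reproduced from the raw counts quoted in the text, which avoids rounding the intermediate probabilities. The sketch below uses exact fractions; the variable names are illustrative, and the counts are the ones reported above from Stokes’s table.

```python
from fractions import Fraction

# Counts reported in the text from Stokes's table.
leaders = 43
adopted_efficiency = 33
adopted_security = leaders - adopted_efficiency   # 10
promised_eff_adopted_eff = 16
promised_eff_adopted_sec = 0

pr_e = Fraction(adopted_efficiency, leaders)                               # Pr(e)
pr_not_e = 1 - pr_e                                                        # Pr(~e)
pr_eps_given_e = Fraction(promised_eff_adopted_eff, adopted_efficiency)    # Pr(eps|e)
pr_eps_given_not_e = Fraction(promised_eff_adopted_sec, adopted_security)  # Pr(eps|~e)

numerator = pr_eps_given_e * pr_e
pr_eps = numerator + pr_eps_given_not_e * pr_not_e   # Pr(eps)
posterior = numerator / pr_eps                       # Pr(e|eps)

print(round(float(pr_e), 2), round(float(pr_eps_given_e), 2))  # 0.77 0.48
print(round(float(numerator), 2))                              # 0.37
print(posterior)                                               # 1
```

The posterior equals exactly 1 because \(Pr(\epsilon|\sim e) = 0\): in these data, no leader who adopted a security-oriented policy had campaigned on efficiency.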

Next we turn to the issue of whether a candidate will implement a security-oriented policy, given that he campaigned on a security-oriented platform. Letting \(\sigma\) be a security-oriented campaign and \(s\) be a security-oriented implementation in office, this conditional probability is \(Pr(s|\sigma)\).

To compute this we’ll need to know the conditional probability that a candidate campaigned on a security-oriented policy given that he adopted a security-oriented policy in office, which is \(Pr(\sigma|s)\), as well as \(Pr(s) = 0.23\) and \(Pr(\sigma) = Pr(\sigma|s)Pr(s) + Pr(\sigma|\sim s)Pr(\sim s)\).

Stokes’s data indicate that all ten of the leaders who implemented a security-oriented policy also campaigned on it: \(Pr(\sigma|s) = \frac{10}{10} = 1.0\), where \(\sigma\) (sigma) represents a candidate who campaigns on a security-oriented policy.

The final piece of information we need is the number of leaders who campaigned on a security platform but adopted an efficiency-oriented policy. The data reveal that 12 of the 33 leaders who adopted an efficiency-oriented policy campaigned on a security platform: \(Pr(\sigma|\sim s) = \frac{12}{33} = 0.36\).

Plugging these into Bayes’ rule yields

\[ Pr(s|\sigma) = \frac{Pr(\sigma|s)Pr(s)}{Pr(\sigma|s)Pr(s) + Pr(\sigma|\sim s)Pr(\sim s)} \]

\[ = \frac{1.0(0.23)}{1.0(0.23) + 0.36(0.77)} = \frac{0.23}{0.23 + 0.28} = \frac{0.23}{0.51} = 0.45 \]

The voter’s posterior belief after observing the campaign is 0.45, roughly double the prior belief of 0.23.
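One way to cross-check this result is to count directly: among the leaders who campaigned on a security platform, what share adopted a security-oriented policy? The sketch below compares that count with Bayes’ rule evaluated on unrounded fractions. The variable names are illustrative; the counts are the ones quoted in the text.

```python
from fractions import Fraction

# Counts reported in the text: leaders who campaigned on a security platform,
# split by the policy they adopted in office.
promised_sec_adopted_sec = 10
promised_sec_adopted_eff = 12

promised_sec = promised_sec_adopted_sec + promised_sec_adopted_eff
posterior = Fraction(promised_sec_adopted_sec, promised_sec)   # Pr(s|sigma) by counting

# The same quantity from Bayes' rule with unrounded inputs.
pr_s = Fraction(10, 43)
pr_not_s = Fraction(33, 43)
pr_sigma_given_s = Fraction(10, 10)
pr_sigma_given_not_s = Fraction(12, 33)
bayes = (pr_sigma_given_s * pr_s) / (
    pr_sigma_given_s * pr_s + pr_sigma_given_not_s * pr_not_s
)

print(posterior, bayes)            # 5/11 5/11
print(round(float(posterior), 2))  # 0.45
```

Both routes give 5/11, which rounds to the 0.45 obtained above from the rounded inputs.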

9.3.1 Odds and the Odds Ratio

The odds of an event are defined as the ratio of the probability that the event occurs to the probability that it does not occur: \(\frac{Pr(y)}{Pr(\sim y)}\). The odds ratio of two events, \(x_1\) and \(x_2\), then, is the ratio of the individual odds:

\[\frac{Pr(x_1)/Pr(\sim x_1)}{Pr(x_2)/Pr(\sim x_2)}\]
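As a brief illustration, the two definitions can be written as functions and applied to the prior probabilities from the Stokes example, \(Pr(e) = 33/43\) and \(Pr(s) = 10/43\). The function names are assumptions, and pairing these two particular events is only meant to show the arithmetic.

```python
from fractions import Fraction

def odds(p):
    """Odds of an event with probability p: Pr(y) / Pr(~y)."""
    if p == 1:
        raise ValueError("odds are undefined when the event is certain")
    return p / (1 - p)

def odds_ratio(p1, p2):
    """Ratio of the odds of x1 to the odds of x2 (the odds of x2 must be nonzero)."""
    return odds(p1) / odds(p2)

# Prior probabilities from the Stokes example, kept as exact fractions.
pr_e = Fraction(33, 43)
pr_s = Fraction(10, 43)

print(odds(pr_e))              # 33/10
print(odds(pr_s))              # 10/33
print(odds_ratio(pr_e, pr_s))  # 1089/100
```

Odds of 33/10 mean that an efficiency-oriented policy was 3.3 times as likely as not in these data.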

Problems

Problem 1

Characterize the following as independent, mutually exclusive, and/or collectively exhaustive:

33-year-old, middle income, Asian American, male.
Strongly disagree, neutral, agree.
Vote share, size of the economy, education level.
War, not war.
Less, same, more.

Problem 2

If \(a\) and \(b\) are independent events, are the following true or false?

\(\Pr(a \cap b) = \Pr(a)\Pr(b)\)
\(\Pr(a|b) = \Pr(a)+\Pr(a)\Pr(b)\)
\(\Pr(b|a) = \Pr(b)\)

Problem 3

If \(a\), \(b\), \(c\), and \(d\) are mutually exclusive and collectively exhaustive, and \(Pr(a) = 0.23\), \(Pr(b) = 0.15\), and \(Pr(c) = 0.46\), then what is the joint probability of (\(a\) or \(d\))?

Problem 4

Let \(P(A) = 0.4\) and \(P(A \cup B) = 0.7\). Find \(P(B)\), assuming the two events are independent.

Any Questions?
