# Basics of Probability: Unions, Intersections, and Complements

On January 15, 2020 by Raul Dinwiddie

Let’s look at intersections, unions, and complements in a probability context. We’ll discuss the basic concepts, and then work through examples.

The intersection of events A and B is the event that both A and B occur. It is typically denoted by A intersect B, with the intersection symbol (A ∩ B). But you might also see A and B, or possibly just AB, representing the intersection. We are free to switch the ordering of the intersection of events: B intersect A (B and A) is the same event as A intersect B (A and B).

Here is a Venn diagram representation. The rectangle represents the sample space, the circle on the left represents event A, and the circle on the right represents event B. The intersection of A and B is this green region, which is in both A and B. It is where both A and B occur.

Here’s another visual representation of events A and B. Now let’s suppose we start pushing the circles apart. A little further, a little further, there we go. Let’s look at this scenario, where A and B do not overlap. Here, A and B do not share any common ground. They share no part of the sample space, and they cannot both occur. We might casually say that they don’t have an intersection, but mathematicians prefer to say that their intersection is the empty set (their intersection contains no elements). In this situation, we say that A and B are mutually exclusive. The intersection of mutually exclusive events contains no sample points, so the probability of their intersection is 0. Mutually exclusive events are sometimes called disjoint events.

The union of events A and B is the event that either A or B or both occurs. In set notation, the union is denoted by this
union symbol, A union B (A ∪ B). But you might also simply see A or B. When we use the term A or B in probability, we are referring to their union, and using the word “or” in the inclusive sense: A or B means A or B or both. We are also free to switch the ordering in the union of events: B union A (B or A) is the same event as A union B (A or B).

Here’s a Venn diagram representation of events A and B. The green region represents the union: everything that is in either A or B or both.

The probability of the union of two events can be found with the addition rule. The probability of the union of A and B is the sum of the individual probabilities, minus the probability of the intersection: P(A ∪ B) = P(A) + P(B) - P(A ∩ B).

Why is that? Well, let’s take a look at the Venn diagram. If we add the probability of A to the probability of B, then we’ve added the probability of this intersection *twice*, because the intersection occurs in both A and B. The probability of the intersection should only be included once, so we need to subtract one of those. And that end result is called the addition rule.

Recall that if A and B are mutually exclusive, then they don’t share any sample points, they don’t share any common ground, and the probability of their intersection is 0. So if A and B are mutually exclusive events, then the addition rule simplifies to P(A ∪ B) = P(A) + P(B), where the probability of the union is equal to the sum of the individual probabilities. This is sometimes referred to as the “special” addition rule.

The complement of an event A is the event
that A does not occur. A little more formally, A complement is the set of all sample points in the sample space that are not in A. The complement of event A is often denoted by A bar, but you may also see A with a superscript C, or A prime. All 3 of these are very commonly used to denote the complement, so it’s not a bad idea to get comfortable with all of them.

Here’s a Venn diagram representation of the sample space. The black circle represents event A, and the green region represents A complement. A complement is everything in the sample space that is not in A. A and A complement are mutually exclusive events, and together they make up the entire sample space.

Take note of 3 things. The union of A and A complement is the entire sample space. And, since A and A complement are mutually exclusive, and the probability of the sample space is 1, this implies that the sum of the probabilities of A and A complement is 1. And finally, the probability of A complement is 1 minus the probability of A.

We’ll often use this complement rule in probability problems, although sometimes it’s so natural that we won’t even consciously realize we’re using it.

Let’s work through a simple example to illustrate
some of these concepts. Suppose we are about to roll an ordinary six-sided die once, and observe the number on the top face. Here is a natural way to define the sample space: the set of the 6 possible outcomes, {1, 2, 3, 4, 5, 6}. If it’s an ordinary die, and we’re rolling it fairly, then it’s reasonable to think that these 6 sample points are equally likely. That won’t be perfectly true in practice, but it’s a reasonable approximation to reality in this type of scenario.

And suppose we define the following events: E is the odd numbers (1, 3, and 5), F is the values greater than 3 (4, 5, and 6), and G is made up of the numbers 2 and 6. Since the outcomes are equally likely, we know that the probability of E is 3/6 or 1/2, the probability of F is also 3/6 or 1/2, and the probability of G is 2/6, or 1/3.

What are the complements of events E, F, and G? E complement is made up of everything in the sample space that is not in E. So the event E complement is the even numbers: 2, 4, and 6. What is the probability of E complement? E complement is made up of 3 numbers (2, 4, and 6), and there are 6 equally likely possibilities in the sample space, so the probability of E complement is 3/6. Note that this is equal to 1 minus the probability of event E.

Event F is made up of the numbers that are greater than 3, so F complement is made up of the numbers that are less than or equal to 3: the sample points 1, 2, and 3. The probability of F complement is thus 3/6, and this also equals 1 - P(F).

G is made up of the numbers 2 and 6, so G complement is made up of the other numbers in the sample space: 1, 3, 4, and 5. The probability of G complement is thus 4/6, and of course, this equals 1 minus the probability of G.

Now suppose we’re interested in the pairwise intersections. The intersection of E and F is the set of sample points that are in both E and F, and here that’s just the number 5, the only sample point in the sample space that occurs in both E and F. Since the intersection is made up of 1 out of the 6 equally likely outcomes, the probability of the intersection is 1/6.

The intersection of E and G is the set of sample points that are in both E and G. But there are no elements of the sample space that occur in both, so the intersection of E and G is the empty set, which is sometimes written with the symbol ∅. E and G are mutually exclusive (they cannot both occur on the same roll), and the probability of their intersection is 0.

The F and G intersection contains just the number 6, and the probability of that intersection is 1 out of 6.

Now suppose we’d like to find the union of events E and F. The union of events E and F is the set of sample points that are in either E or F or both. 1, 3, and 5 are in E, and 4, 5, and 6 are in F, so E union F is made up of the numbers 1, 3, 4, 5, and 6. E union F is made up of 5 of the 6 equally likely sample points, so the probability of that union is 5 out of 6.

Alternatively, to find the probability of their union, we could have used the addition rule: the probability of the union of E and F is the sum of the individual probabilities, minus the probability of the intersection. E and F each have a probability of occurring of 3 out of 6, and we found earlier that their intersection has a probability of 1/6, so by the addition rule the probability of their union is 3/6 + 3/6 - 1/6 = 5/6. This is of course the same as what we found above using arguments based on the sample space and our knowledge of what a union is.

And if, say, we were interested in the complement of E union F, that would simply be the number 2, as that is the only point in the sample space that is not in E or F. The probability of that complement is 1 out of 6.

Suppose that in a certain population of adults, 10% have diabetes, 30% have high blood pressure (or, a little more formally, hypertension), and 7% have both. And suppose that a person is randomly selected from this population. Let event D represent the event that the randomly selected person has diabetes, and event H represent the event that the person has hypertension. If we are randomly selecting from this population, then the probability the randomly selected
person has diabetes is 0.10, since 10% of the population has diabetes. The probability they have hypertension is 0.30, and the probability they have both is 0.07.

The complement of event D is the event the person does not have diabetes. By the complement rule, the probability the person does not have diabetes is 1 minus 0.10, or 0.90. The complement of event H is the event the person does not have hypertension. By the complement rule, the probability the person does not have hypertension is 1 minus 0.30, or 0.70.

Now suppose we want to know the probability the person has diabetes or hypertension (or both). That’s the probability of the union of events D and H (D or H), and we can find that with the addition rule. The probability of D union H is the probability of D, plus the probability of H, minus the probability of the intersection of D and H. All 3 of those probabilities are given here, and in the end we find that the probability that the person has diabetes or hypertension is 0.10 + 0.30 - 0.07 = 0.33. And again that’s “or” in the inclusive sense: the probability they have diabetes, or hypertension, or both.

It’s often helpful to illustrate the events and the probabilities that relate to them in a Venn diagram. Let’s look at how we might do that here. Here are the 4 probabilities that we’ve been given or we’ve worked out so far. And here’s a Venn diagram illustrating events D and H. In a typical Venn diagram, like the one we are using here, the sizes of the various regions, such as the size of circle H, don’t have any meaning. The Venn diagram simply illustrates the various regions. So try not to read into the sizes of the regions in a typical Venn diagram.

Now let’s fill in the various regions and their probabilities of occurring. The event the person has both diabetes and hypertension is represented by D intersect H, and that’s this green region. That probability was given to us as 0.07, so I’m going to put that value here.

The event that the person has diabetes but not hypertension is represented by D intersect H complement, and that’s this green region. Since the entire event D has a probability of occurring of 0.10, the probability associated with this green region, representing diabetes but not hypertension, must be 0.10 - 0.07 = 0.03.

The event that the person has hypertension but not diabetes can be represented by D complement intersect H. Or I could flip those two around if I wanted to, to be more consistent with the wording, as the ordering of events in an intersection doesn’t matter. And that’s this region in green. The probability of H in its entirety is 0.30, so the probability of this green region alone must be 0.30 - 0.07 = 0.23.

The event that the person has diabetes or hypertension is the union of D and H, represented by this green region. We’ve already found the probability of the union of those events to be 0.33, using the addition rule. But we could also find it here by adding up the 3 probabilities of the 3 mutually exclusive regions that we see here. 0.03 + 0.07 + 0.23 is 0.33.

How about the event that the person has neither diabetes nor hypertension? One way of writing that is by recognizing that this event is the complement of the union. The union is that they have either one or both, so the complement of the union is that they have neither. That’s this green region. And since the probability of the entire sample space is 1, the probability associated with this event must be 1 - 0.33 = 0.67.

Setting up a Venn diagram like this often
helps us visualize probability problems and makes them much easier to solve.

The concepts of unions, intersections, and complements can be extended to more than two events. For example, here’s a Venn diagram representing 3 events, A, B, and C. The union of events A, B, and C is this green region, where A, or B, or C occurs. The complement of that union is this green region, where neither A, nor B, nor C occurs. The intersection of A, B, and C is the event that A and B and C occur, represented by this green region. And the complement of that three-way intersection is this green region, where the 3 events do not all occur.

Sometimes these combinations of unions, intersections, and complements can be a little difficult to think about, so I’ll work through some more complicated problems involving these concepts in another video.
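
The die example and the diabetes and hypertension example can both be checked with a few lines of Python. This is only a rough sketch, not something from the video: the helper names `prob` and `complement` and the `p_...` variable names are illustrative labels, and the die part assumes the 6 faces are equally likely, as in the example. Python's built-in set operators `&`, `|`, and `-` play the roles of intersection, union, and set difference.

```python
from fractions import Fraction

# --- Die example: assumes the 6 faces are equally likely ---
S = {1, 2, 3, 4, 5, 6}             # sample space
E = {x for x in S if x % 2 == 1}   # odd numbers: {1, 3, 5}
F = {x for x in S if x > 3}        # values greater than 3: {4, 5, 6}
G = {2, 6}


def prob(event):
    """P(event) = (sample points in event) / (sample points in S)."""
    return Fraction(len(event), len(S))


def complement(event):
    """All sample points in S that are not in the event."""
    return S - event


# Complement rule: P(not E) = 1 - P(E)
assert prob(complement(E)) == 1 - prob(E) == Fraction(1, 2)
assert complement(G) == {1, 3, 4, 5}

# Intersections: E and G share no sample points (mutually exclusive)
assert E & F == {5} and prob(E & F) == Fraction(1, 6)
assert E & G == set() and prob(E & G) == 0
assert F & G == {6}

# Counting the union directly agrees with the addition rule
assert prob(E | F) == prob(E) + prob(F) - prob(E & F) == Fraction(5, 6)
assert complement(E | F) == {2}

# --- Diabetes / hypertension example: the three given probabilities ---
p_D = Fraction(10, 100)
p_H = Fraction(30, 100)
p_D_and_H = Fraction(7, 100)

p_D_or_H = p_D + p_H - p_D_and_H   # addition rule
p_D_only = p_D - p_D_and_H         # diabetes but not hypertension
p_H_only = p_H - p_D_and_H         # hypertension but not diabetes
p_neither = 1 - p_D_or_H           # complement of the union

# The three mutually exclusive regions add up to the union
assert p_D_only + p_D_and_H + p_H_only == p_D_or_H

print(float(p_D_or_H), float(p_D_only), float(p_H_only), float(p_neither))
```

Using `Fraction` instead of floating-point numbers keeps the arithmetic exact, so the region probabilities add up to exactly 33/100 with no rounding error.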
