# 9 Conditional Probability

[latex]\newcommand{\pr}[1]{P(#1)}[/latex]

We use [latex]\pr{A|B}[/latex] to denote the **conditional probability** of [latex]A[/latex] occurring if [latex]B[/latex] has occurred. This may be conditional in time, such as the probability that the second person in a sample is female given the first was male, or might be different attributes of the same person or object, such as the probability of green eyes given that a person is female.

We have already seen how to calculate conditional proportions in Chapter 6 and the methods for conditional probabilities are the same. However, we can now be more explicit about the rules by using our probability notation. Conditional probabilities are calculated using the rule

\[ \pr{A | B} = \frac{\pr{A \mbox{ and } B}}{\pr{B}}. \]

That is, to find [latex]\pr{A|B}[/latex] we look at the proportion of time [latex]B[/latex] occurs and then how often [latex]A[/latex] occurs as well, taking the ratio to obtain the conditional probability. Note that we could have used [latex]\pr{B \mbox{ and } A}[/latex] instead of [latex]\pr{A \mbox{ and } B}[/latex] in the formula, since it also denotes the probability that [latex]A[/latex] and [latex]B[/latex] occur together. We will show how conditional probabilities can be used through two examples.
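As a minimal sketch of the rule above, the following Python snippet computes a conditional probability from two proportions. The counts are invented purely for illustration (they are not from the survey data), and the function name `conditional_probability` is our own.

```python
from fractions import Fraction


def conditional_probability(p_a_and_b, p_b):
    """Return P(A | B) = P(A and B) / P(B)."""
    if p_b == 0:
        raise ValueError("P(B) must be positive to condition on B")
    return p_a_and_b / p_b


# Hypothetical counts, invented for illustration only:
# 200 people, 104 of them female, 13 of them female with green eyes.
total = 200
p_female = Fraction(104, total)
p_green_and_female = Fraction(13, total)

p_green_given_female = conditional_probability(p_green_and_female, p_female)
print(p_green_given_female)         # 1/8
print(float(p_green_given_female))  # 0.125
```

Note that the total of 200 cancels in the ratio, so the conditional probability is simply 13 out of the 104 females.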

# Sensitive Questions

From the survey data, 43/60 = 71.7% of the Islanders answered ‘Yes’ to the sensitive question “Do you approve of kissing on the first date?” (Fidler & Kleinknecht, 1977). However, this group of 60 contained some who answered ‘Yes’ or ‘No’ by chance, according to the rules of the survey. We need to estimate the proportion who said ‘Yes’ because they actually did approve of kissing on the first date.

The figure above summarises the process involved in this question in a **tree diagram**. This diagram shows the different choices that can be made, giving the associated probabilities on the branches.

The question begins with a coin being tossed twice. We assume

\[ \pr{\mbox{first toss heads}} = \pr{\mbox{first toss tails}} = 0.5. \]

If the first toss was tails then the student answered ‘Yes’ if the second toss was heads, otherwise they answered ‘No’. Thus

\[ \pr{\mbox{Yes } | \mbox{ first toss tails}} = 0.5. \]

To work out what proportion of ‘Yes’ responses came this way we want

\[\pr{\mbox{Yes and first toss tails}}.\]

The conditional probability formula can be rearranged to give the **multiplication rule**

\[ \pr{A \mbox{ and } B} = \pr{B}\pr{A | B}, \]

so

\begin{eqnarray*}
\pr{\mbox{Yes and first toss tails}} & = & \pr{\mbox{first toss tails}} \pr{\mbox{Yes } | \mbox{ first toss tails}} \\
& = & 0.5 \times 0.5 \\
& = & 0.25.
\end{eqnarray*}

Now we need to find [latex]\pr{\mbox{Yes and first toss heads}}[/latex]. We can do this if we know [latex]\pr{\mbox{Yes } | \mbox{ first toss heads}}[/latex]. If they got heads on the first toss then they were asked to answer the question truthfully. So [latex]\pr{\mbox{Yes } | \mbox{ first toss heads}}[/latex] is actually the probability we want to know: the proportion of the population who would say ‘Yes’ if they answered the question truthfully. We will call this unknown [latex]p[/latex], so

\begin{eqnarray*}
\pr{\mbox{Yes and first toss heads}} & = & \pr{\mbox{first toss heads}} \pr{\mbox{Yes } | \mbox{ first toss heads}} \\
& = & 0.5p.
\end{eqnarray*}

So the total probability of ‘Yes’ is [latex]0.25 + 0.5p[/latex]. We have an estimate for this, the sample proportion 0.717, so we can equate these to estimate [latex]p[/latex]. That is,

\[ 0.25 + 0.5p = 0.717, \]

so [latex]p = 2(0.717 - 0.25) = 0.934[/latex]. Thus our estimate of the proportion who would actually approve of kissing on the first date is about 93%. Of course we would really like to know how accurate this estimate might be, since it came from a sample of only 60 Islanders. We will return to that issue in Chapter 17.
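The estimate can be reproduced with a short Python sketch. The function name and its parameters `p_truth` and `p_forced_yes` are our own labels for the two branch probabilities of the tree (0.5 and 0.25 for this survey design, assuming fair coin tosses).

```python
def estimate_true_proportion(yes_count, n, p_truth=0.5, p_forced_yes=0.25):
    """Solve P(Yes) = p_forced_yes + p_truth * p for p.

    The observed proportion yes_count / n is used as the estimate of P(Yes).
    """
    observed_yes = yes_count / n
    return (observed_yes - p_forced_yes) / p_truth


estimate = estimate_true_proportion(43, 60)
print(round(estimate, 3))  # 0.933
```

The result differs in the third decimal place from the 0.934 above only because the hand calculation rounds the sample proportion to 0.717 before solving for [latex]p[/latex].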

# Diagnostic Testing

A frequent need with conditional probabilities is to reverse the direction of the events, finding [latex]\pr{B|A}[/latex] when we know [latex]\pr{A|B}[/latex]. Wild and Seber (2000) give an extensive case study of using this method to analyse error rates in diagnostic testing, which we will summarise here.

A common test for HIV has been to screen for the antibodies to the virus using an enzyme-linked immunosorbent assay (ELISA). Such a test is not perfect and there is always a tradeoff between having a high **sensitivity**, the probability of correctly diagnosing a person with HIV, and a high **specificity**, the probability of correctly diagnosing a person without HIV. An experiment which compared the results of the ELISA test to a more accurate, but more expensive, test suggests that

\[ \pr{\mbox{test positive } | \mbox{ HIV}} = 0.98 \]

and

\[ \pr{\mbox{test negative } | \mbox{ no HIV}} = 0.926. \]

Thus the sensitivity is 0.98 and the specificity is 0.926. What a patient who tests positive would like to know is the probability that they have HIV. This is [latex]\pr{\mbox{HIV } | \mbox{ test positive}}[/latex], the opposite of what we have above.

Now

\[ \pr{\mbox{HIV } | \mbox{ test positive}} = \frac{\pr{\mbox{HIV and test positive}}}{\pr{\mbox{test positive}}}. \]

The numerator is

\[ \pr{\mbox{HIV and test positive}} = \pr{\mbox{HIV}}\pr{\mbox{test positive } | \mbox{ HIV}}, \]

but what is [latex]\pr{\mbox{HIV}}[/latex]? This depends a lot on the population you are testing. For example, the Australian Bureau of Statistics reports that up to the year 2000 there had been 20995 HIV cases in Australia and 6017 AIDS deaths so we will suppose there are about 15000 people living with HIV in Australia. From a population of 19.2 million at that time this gives a probability

\[ \pr{\mbox{HIV}} = \frac{15000}{19200000} = 0.00078125. \]

Thus

\[ \pr{\mbox{HIV and test positive}} = 0.00078125 \times 0.98 = 0.00076563. \]

Similarly,

\[ \pr{\mbox{no HIV and test positive}} = (1 - 0.00078125) \times (1 - 0.926) = 0.073942. \]

Since someone either has HIV or does not have HIV we can add these two to find

\[ \pr{\mbox{test positive}} = 0.00076563 + 0.073942 = 0.074708. \]

Together this gives

\[ \pr{\mbox{HIV } | \mbox{ test positive}} = \frac{0.00076563}{0.074708} = 0.0102. \]

Thus, even though the test is positive 98% of the time for someone with HIV, the chance that you have HIV if you get a positive result is only about 1 in 100. Look at the numbers closely to see why this happens.
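One way to look at the numbers is to convert the probabilities into expected counts. The sketch below does this in Python for a hypothetical group of one million people tested; the group size is our own assumption, chosen only to make the counts easy to read, while the prevalence, sensitivity and specificity are the values used above.

```python
population = 1_000_000             # hypothetical number of people tested
prevalence = 15000 / 19200000      # P(HIV) for Australia, as above
sensitivity = 0.98                 # P(test positive | HIV)
specificity = 0.926                # P(test negative | no HIV)

with_hiv = population * prevalence
without_hiv = population - with_hiv

true_positives = with_hiv * sensitivity
false_positives = without_hiv * (1 - specificity)

# Share of all positive results that come from people with HIV.
share_true = true_positives / (true_positives + false_positives)

print(round(true_positives), round(false_positives))  # 766 73942
print(round(share_true, 4))                           # 0.0102
```

The roughly 766 true positives are swamped by about 74,000 false positives, simply because the group without HIV is so much larger than the group with it.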

There are important political implications of this result. Many groups, such as insurance companies and employers, are keen to introduce mass testing of customers and employees to assess their medical risk. Whether this is a good or bad thing is a separate issue, but we have seen above that large scale diagnostic testing for relatively rare conditions will lead to a great number of **false positives**, people testing positive when they do not have the condition. This can easily be obscured by a claim that a test is “98% accurate”.

Contrast these calculations for Australia with those for South Africa where [latex]\pr{\mbox{HIV}} = 0.261[/latex] for people aged 20-24 in 1998. If you follow the steps you will find

\[ \pr{\mbox{HIV } | \mbox{ test positive}} = \frac{0.25578}{0.31047} = 0.8239. \]

That is, with the much higher rate of infection, the chance that a positive result is a false positive is now much smaller.
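The steps of the calculation can be collected into a single function so that the two populations are easy to compare. This is a sketch only; the function name `prob_condition_given_positive` is our own, and it assumes the same sensitivity and specificity apply in both populations.

```python
def prob_condition_given_positive(prevalence, sensitivity, specificity):
    """Return P(condition | test positive) by reversing the conditioning."""
    # P(condition and test positive), by the multiplication rule
    true_pos = prevalence * sensitivity
    # P(no condition and test positive)
    false_pos = (1 - prevalence) * (1 - specificity)
    # P(test positive) is the sum of the two ways of testing positive
    return true_pos / (true_pos + false_pos)


australia = prob_condition_given_positive(15000 / 19200000, 0.98, 0.926)
south_africa = prob_condition_given_positive(0.261, 0.98, 0.926)

print(round(australia, 4))     # 0.0102
print(round(south_africa, 4))  # 0.8239
```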

At a similar level, the estimated HIV infection rate for people in South Africa under 20 years old in 1998 was 21.0%, up from 12.7% in 1997. These statistics are hotly debated because they have profound implications for the country. A government must decide whether to spend limited resources on areas such as medical treatment for those with AIDS, research into possible vaccinations, education programs, or on welfare for those who lose family members. Such **official statistics** and their uses are an important part of statistical research but will not be discussed directly in this book.

# Independence

We say that two events, [latex]A[/latex] and [latex]B[/latex], are **independent** if the outcome of one tells us nothing about the probability distribution of the other. That is, [latex]\pr{A|B} = \pr{A}[/latex] and [latex]\pr{B|A} = \pr{B}[/latex]. In most cases you can decide whether two events are independent by thinking about the physical way in which they happen. An important assumption we will make later is that consecutive samples from a population are independent. This can usually be achieved if the samples are chosen at random.

If events are independent then the multiplication rule in the previous section becomes

\[ \pr{A \mbox{ and } B} = \pr{A} \pr{B}. \]

This is an idea you are probably familiar with. If you toss a coin twice the probability of getting two heads in a row is 0.25 since 50% of the time the first will be heads and 50% of those times the second will also be heads, and 50% of 50% is 25%. Similarly, if we pick two people from a population with female proportion [latex]p[/latex] then the probability of getting two females is [latex]p^2[/latex].
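The coin example can be checked by listing the outcomes directly. The following Python sketch assumes a fair coin, so that the four outcomes of two tosses are equally likely; the helper `prob` is our own.

```python
from fractions import Fraction
from itertools import product

# All outcomes of two tosses: HH, HT, TH, TT (assumed equally likely).
outcomes = list(product("HT", repeat=2))


def prob(event):
    """Probability of an event as the fraction of outcomes satisfying it."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


p_first_heads = prob(lambda o: o[0] == "H")
p_second_heads = prob(lambda o: o[1] == "H")
p_both_heads = prob(lambda o: o == ("H", "H"))

print(p_both_heads)                                    # 1/4
print(p_both_heads == p_first_heads * p_second_heads)  # True
```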

An important use of this independence rule will be for testing whether there is an association between two categorical variables. For example, we previously saw a two-way table of counts of Islanders from Ironbard categorised by pizza preference and sex. The marginal proportions showed that 0.49 of the Islanders were male and 0.14 of the Islanders preferred mushroom pizza. If pizza preference and sex were independent then the proportion of Islanders who are male and prefer mushroom should be

\[ 0.49 \times 0.14 = 0.0686. \]

Out of 200 Islanders we would then expect there to be [latex]0.0686 \times 200 = 13.7[/latex] mushroom-liking males. In fact we observed 18 mushroom-liking males in the data, a difference of 4.3 from what was expected. In Chapter 22 we will measure these differences and see whether they give evidence that the variables are not independent. If they were not independent then this would suggest an association between them.
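The expected count under independence is a one-line calculation, sketched here in Python using the marginal proportions and the observed count quoted above. The variable names are our own.

```python
p_male = 0.49      # marginal proportion of males
p_mushroom = 0.14  # marginal proportion preferring mushroom
n = 200            # number of Islanders in the table
observed = 18      # observed count of mushroom-liking males

# Under independence, P(male and mushroom) = P(male) * P(mushroom).
expected = p_male * p_mushroom * n

print(round(expected, 1))             # 13.7
print(round(observed - expected, 1))  # 4.3
```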

Summary

- Conditional probability gives the probability of an outcome occurring conditional on another outcome having occurred.
- Conditional probability can be used to analyse the results of sensitive questions on surveys that involve randomness.
- Conditional probability shows the trade-off between sensitivity and specificity in diagnostic testing.
- Events are independent if the conditional probability of one event is unchanged by the outcome of the other.

Exercise 1

A group of 60 students were asked the sensitive question “Have you ever cheated on an exam?” using a random response technique. Before answering the question each student secretly rolled a die. If the die showed 1 or 2 then they were asked to answer the question honestly. If the die showed 3 or 4 then they were asked to answer ‘yes’ regardless of their true answer. If the die showed 5 or 6 then they were asked to answer ‘no’ regardless of their true answer. Of the 60 students, a total of 33 answered ‘yes’ using this technique. Based on this sample, estimate the true proportion of students who have cheated on an exam.

Exercise 2

Think of other ways of using randomness to obtain anonymity in a similar way to the method earlier in this chapter. (A common approach is to use a deck of shuffled cards which have directions on them, such as ‘Answer truthfully’ or ‘Answer Yes’.) For each method, show how to estimate the true proportion based on the one found in your sample.

Exercise 3

For a particular population group it is estimated that the prevalence of cervical cancer is 8.6%. A new diagnostic procedure for this disease has a sensitivity of 0.927, correctly diagnosing an individual with cervical cancer 92.7% of the time. The specificity of the procedure is 0.854, so that an individual without the disease is correctly diagnosed 85.4% of the time.

Suppose the diagnostic procedure indicates an individual from this population group has the disease. What is the probability that they actually do have cervical cancer?

Exercise 4

Find additional data, such as that from the Bureau of Statistics website, so that you can repeat the calculations for the Diagnostic Testing example for different groups.

Challenge: Exercise 5

Two people, Alice and Victor, agree to play the following game: A barrel is filled with 1000 balls, 650 labelled ‘Alice’ and 350 labelled ‘Victor’. The players take it in turns to draw a ball from the barrel. If it is labelled with their name they win and the game ends. Otherwise they replace their ball, shake the barrel, and the other player draws. If Alice goes first, what is the probability that she wins the game?

Exercise 6

Testing a hypothesis at the 5% level means that there is a 0.05 probability of rejecting the null hypothesis when in fact it is true. Suppose 12 research teams around the world independently carry out trials of the same chemical compound to see whether it is effective against HIV. If the compound is actually not effective against HIV, what is the probability that at least one research team will find significant evidence of an effect at the 5% level?

Exercise 7

The first randomised response methodology was presented by Warner (1965) and was based on a different process to the example in the section on sensitive questions. For example, suppose 200 university students are asked “if you or your girlfriend accidentally got pregnant would you seriously consider the possibility of an abortion?” Before answering, each student secretly tosses a fair coin twice. If both tosses show heads then they answer the truth but for all other outcomes they answer the **opposite** of the truth. If 114 students answer ‘yes’, estimate the true proportion of students who would consider an abortion.

Exercise 8

Explain why Warner’s method in Exercise 7 requires a probability of answering truthfully, [latex]\theta[/latex], that is not equal to [latex]\frac{1}{2}[/latex].

Exercise 9

Alice is planning on using a randomised response method to ask the sensitive question “Have you ever used marijuana?” She will have each subject toss a fair coin twice before answering the question. As with the example earlier in this chapter, they will be asked to answer truthfully if the first toss shows heads. If the first toss shows tails then they are asked to answer ‘yes’ if the second toss shows heads and ‘no’ if the second toss shows tails.

Suppose [latex]p[/latex] is the true proportion of people who have used marijuana in the population that Alice is sampling. If a subject says ‘yes’ for the randomised response, what is the probability that they have used marijuana?