SKEDSOFT

Design Analysis Of Algorithm

Probabilistic analysis and further uses of indicator random variables: This advanced section further illustrates probabilistic analysis by way of four examples. The first determines the probability that in a room of k people, some pair shares the same birthday. The second example examines the random tossing of balls into bins. The third investigates "streaks" of consecutive heads in coin flipping. The final example analyzes a variant of the hiring problem in which you have to make decisions without actually interviewing all the candidates.

The birthday paradox: Our first example is the birthday paradox. How many people must there be in a room before there is a 50% chance that two of them were born on the same day of the year? The answer is surprisingly few. The paradox is that it is in fact far fewer than the number of days in a year, or even half the number of days in a year, as we shall see.

To answer this question, we index the people in the room with the integers 1, 2, ..., k, where k is the number of people in the room. We ignore the issue of leap years and assume that all years have n = 365 days. For i = 1, 2, ..., k, let bi be the day of the year on which person i's birthday falls, where 1 ≤ bin. We also assume that birthdays are uniformly distributed across the n days of the year, so that Pr {bi = r} = 1/n for i = 1, 2, ..., k and r = 1, 2, ..., n.

The probability that two given people, say i and j, have matching birthdays depends on whether the random selection of birthdays is independent. We assume from now on that birthdays are independent, so that the probability that i's birthday and j's birthday both fall on day r is

Pr {bi = r and bj = r}

=

Pr{bi = r}Pr{bj = r}

 

=

1/n2.

Thus, the probability that they both fall on the same day is

More intuitively, once bi is chosen, the probability that bj is chosen to be the same day is 1/n. Thus, the probability that i and j have the same birthday is the same as the probability that the birthday of one of them falls on a given day. Notice, however, that this coincidence depends on the assumption that the birthdays are independent.

We can analyze the probability of at least 2 out of k people having matching birthdays by looking at the complementary event. The probability that at least two of the birthdays match is 1 minus the probability that all the birthdays are different. The event that k people have distinct birthdays is

where Ai is the event that person i's birthday is different from person j's for all j < i. Since we can write Bk = AkBk-1, we obtain from the recurrence

where we take Pr{B1} = Pr{A1} = 1 as an initial condition. In other words, the probability that b1, b2, ..., bk are distinct birthdays is the probability that b1, b2, ..., bk-1 are distinct birthdays times the probability that bkbi for i = 1, 2, ..., k - 1, given that b1, b2, ..., bk-1 are distinct.

If b1, b2, ..., bk-1 are distinct, the conditional probability that bkbi for i = 1, 2, ..., k - 1 is Pr {Ak | Bk-1} = (n - k 1)/n, since out of the n days, there are n - (k - 1) that are not taken. We iteratively apply the recurrence (5.8) to obtain