## June 2010

—————————

In my June diary I posed the following brain-teaser.

Gary Foshee, a collector and designer of puzzles from Issaquah near Seattle, walked to the lectern to present his talk. It consisted of the following three sentences: "I have two children. One is a boy born on a Tuesday. What is the probability I have two boys?"

The event was the Gathering for Gardner earlier this year, a convention held every two years in Atlanta, Georgia, uniting mathematicians, magicians and puzzle enthusiasts. The audience was silent as they pondered the question.

—————————

*Solution*

With these counter-intuitive probability problems — the Monty Hall Problem is the archetype — I tend to go round in a circle, from the rational to the empirical then back round to the rational. Like this:

• **Reasoning** First I switch on my reasoning faculties and try to think my way through to a solution. Let's see … yeah … so it follows that … and then it must be the case that … no, wait a minute, that's wrong … or was it the previous step that was wrong? … start again: assume *this* … wait a minute … oh, no … maybe? … uh … er … oh, sod it.

• **Experiment** Pure reason having failed me, I summon up my inner empiricist. He tells me to write a program simulating the problem.

I do so, modeling *N* two-child families (*N* some large number), with each child assigned a sex at random and a day of the week
at random.

I asterisk those families that have a boy born on a Tuesday, either as first child or second child or both.

*Within* the asterisked families, I double-asterisk those with two boys.

This generates a list like this:

G-We B-Sa

G-Su G-We

G-Mo G-Tu

G-Sa G-Fr

G-Mo B-Fr

G-Th G-Th

B-Fr G-Fr

G-Th B-Tu *

B-Su B-Fr

G-Th B-Mo

B-Tu G-Fr *

B-Sa G-Th

B-Su B-Th

B-Sa B-We

B-We G-Fr

G-Tu B-Mo

G-Tu B-We

B-We G-Th

G-Su G-We

G-Th B-Su

B-Tu B-We * **

G-Th B-Tu *

B-Su G-Th

G-Th B-Tu *

G-Sa G-Su

B-Sa G-Mo

B-Sa B-Tu * **

B-We B-Th

B-Sa B-Mo

B-Mo G-We

B-Mo B-Mo

G-Th B-Th

G-Th B-Th

G-Tu B-We

G-Fr G-Th

G-Th B-Su

G-Th B-Fr

B-Fr B-Fr

G-Th G-Th

B-We G-Tu

G-Tu G-Mo

B-Sa B-Su

G-Tu G-Tu

B-We G-Fr

G-Su B-We

B-Th G-We

G-Tu G-Sa

G-Fr B-Su

The problem now resolves to: What proportion of the single-asterisked families are double-asterisked?

I ran ten trials with *N* = 1,000,000. Results for the proportion were:

0.483139868313116

0.479990471653168

0.483349522304536

0.483050047557159

0.480316681112026

0.480332209045080

0.481120909891315

0.480373003711155

0.480537214994952

0.481898312915545

That's an average of 0.481410824, agreeably close to 13/27. (The first few convergents of its simple continued fraction are 1/2, 12/25, 13/27, 246/511, 505/1049.) Looks like that could be the right answer. But why?

• **Back to reasoning** Now I fire up my reasoning faculties again, using that list as a visual aid.

First consider the single-asterisked families, those who have a B-Tu. Since the chance of a B is 1/2 and the chance of a Tu is 1/7, in each of the two columns (first child, second child) one in 14 entries is a B-Tu.

*However*, one in 196 (that's 14×14) families has *both* kids B-Tu. Treating those families separately, we have:

1 family in 196 has B-Tu B-Tu.

13 families in 196 (1/14 minus 1/196) have B-Tu [other], where [other] is any child not B-Tu.

13 families in 196 (same logic) have [other] B-Tu.

So 27/196 of families have one or two B-Tu children — i.e. get a single asterisk.

Now consider the double-asterisked families, those who have B-Tu B-Xx or B-Xx B-Tu, where Xx can be any day of the week, including Tuesday.

Again we can separate out the B-Tu B-Tu families, who are 1/196 of all families. What about the rest, those with B-Tu [other] or [other] B-Tu? How many of those [other] siblings are boys?

At first glance you might think half of them are boys. Remember, though, that [other] *excludes* B-Tu kids, who we've already accounted
for. There are only 13 possibilities for [other], not 14. Six of those possibilities are boys, seven are girls.

Bottom line for double asterisks:

1 family in 196 has B-Tu B-Tu.

13/196 have B-Tu [other], of which 6/13 have a boy [other] — 6/196 of all families (because 6/13 × 13/196 = 6/196)

By the same logic, 6/196 have [other] B-Tu with a boy [other]

So altogether 1+6+6 out of 196 families get a double asterisk.

Proportion of double asterisks to single asterisks: 13/27.
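The counting argument above can be checked without any random numbers by listing all 196 equally likely families. This is a minimal sketch in Python rather than the VB used for the simulation; the names `CHILDREN`, `FAMILIES`, `starred` and `double_starred` are illustrative, and it assumes, as the argument does, that sex and weekday are independent and uniformly distributed.

```python
from fractions import Fraction
from itertools import product

SEXES = ("B", "G")
DAYS = ("Mo", "Tu", "We", "Th", "Fr", "Sa", "Su")

# All 14 equally likely (sex, day) combinations for one child.
CHILDREN = list(product(SEXES, DAYS))

# All 14 x 14 = 196 equally likely (first child, second child) families.
FAMILIES = list(product(CHILDREN, repeat=2))


def is_boy_tuesday(child):
    """True if this child is a boy born on a Tuesday."""
    return child == ("B", "Tu")


# Single asterisk: at least one child is a boy born on a Tuesday.
starred = [f for f in FAMILIES if any(is_boy_tuesday(c) for c in f)]

# Double asterisk: a starred family in which both children are boys.
double_starred = [f for f in starred if all(c[0] == "B" for c in f)]

proportion = Fraction(len(double_starred), len(starred))
print(len(FAMILIES), len(starred), len(double_starred), proportion)
# prints: 196 27 13 13/27
```

Because every family is listed exactly once, the B-Tu B-Tu family contributes one entry to each tally, which is the "1 family in 196" of the argument.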

[Here's my VB code for the modeling:

Option Explicit

Option Base 1

' Test siblings probability problem

Sub Main()

' Declarations:

Dim iFileNo, iV, iSex(2), iDay(2) As Integer

Dim lU, lMax, lStar, lStars As Long

Dim dRat As Double

Dim sFilePath, sFileName, sSex(2), sDay(2), sStar, sStars As String

' Set up and open the output file:

sFilePath = "C:\Documents and Settings\Owner\My Documents\Visual Basic\Data files\"

sFileName = sFilePath & "Siblings.txt"

iFileNo = FreeFile

Open sFileName For Output As iFileNo

' Initialize:

lStar = 0

lStars = 0

Randomize

' How many families?

lMax = 1000000

' Do the business:

For lU = 1 To lMax

For iV = 1 To 2

' Assign random sex:

iSex(iV) = Int(2 * Rnd() + 1)

Select Case iSex(iV)

Case 1

sSex(iV) = "B"

Case 2

sSex(iV) = "G"

End Select

' Assign random day of week:

iDay(iV) = Int(7 * Rnd() + 1)

Select Case iDay(iV)

Case 1

sDay(iV) = "Mo"

Case 2

sDay(iV) = "Tu"

Case 3

sDay(iV) = "We"

Case 4

sDay(iV) = "Th"

Case 5

sDay(iV) = "Fr"

Case 6

sDay(iV) = "Sa"

Case 7

sDay(iV) = "Su"

End Select

Next iV

' Assign single asterisk:

If (sSex(1) = "B" And sDay(1) = "Tu") Or (sSex(2) = "B" And sDay(2) = "Tu") Then

sStar = "*"

lStar = lStar + 1

Else

sStar = " "

End If

' Assign double asterisk:

sStars = " "

If sStar = "*" And sSex(1) = "B" And sSex(2) = "B" Then

sStars = "**"

lStars = lStars + 1

End If

' Show one family:

' Print #iFileNo, sSex(1) & "-" & sDay(1) & " " & sSex(2) & "-" & sDay(2) & " " & sStar & " " & sStars

Next lU

' Print stats:

Print #iFileNo, " "

Print #iFileNo, "Single asterisks: " & lStar

Print #iFileNo, " "

Print #iFileNo, "Double asterisks: " & lStars

Print #iFileNo, " "

dRat = lStars / lStar

Print #iFileNo, "Ratio is " & dRat

Close #iFileNo

End Sub

Quick'n'dirty, but does the job.]

[Thursday morning, 7/8/2010]

I got so much feedback from readers on this — over 200 emails so far — I thought I'd better post a
follow-up.

I have not yet had time to read all responses, but I've read enough to pick out a few common themes.

• **Lawyering the words** Here once again is the problem as stated:

I have two children. One is a boy born on a Tuesday. What is the probability I have two boys?

To a person in the mathematical-logical frame of mind, that is perfectly straightforward. We got some bare facts about the guy's family.
Where we didn't get facts, everything is open. Is the boy his first child or his second? He doesn't tell us, so we don't know. Could be either.
Might the other child *also* be a boy born on Tuesday? Sure: he didn't say anything to the contrary. And so on.

Now, in the context of everyday human relations, that's a bit abstract. Suppose Sam tells you: "I have two cars. One's a red
Toyota." The normal implication you will take away is that Sam's other car is *not* a red Toyota. If it was, then surely Sam would have
said: "I have two cars. They're both red Toyotas."

In the math-logic frame, though, we don't take that implication. No mathematician would; no logician would; no ordinary citizen accustomed to
amusing himself with math-logic brainteasers would. Where we're not given information, we have no knowledge. And after all, Sam *might*, for
all we know to the contrary, have been intending to say "I have two cars. One's a red Toyota. Come to think of it, so's the other
one," but dropped dead from a massive coronary infarction, or been felled by a meteorite, after that second period.

Some of us are irrepressibly verbal, though; and some, even among very bright people, just can't get into the math-logic frame of thinking at all. The story I told here illustrates the point.

• **Arguing relevance** "The kid's date of birth is irrelevant! It has no bearing on the
problem!"

I disagree. It's highly relevant.

The problem concerns the speaker's family, in the context of all possible families, of which there is some large number *F*. He tells us he has two children. That's nontrivial information. It reduces the universe of families we need to consider to *T*, all possible two-child families, *T* < *F*. He then tells us that one child is a boy, reducing the universe still further to $T_b = 3T/4$. He then tells us that the boy was born on a Tuesday, reducing the universe of consideration still further, to $T_{bt} = 9T_b/49$. This is all nontrivial information helping to constrain the quantities we're obliged to think about in order to solve the problem.

It is of course possible to add irrelevant information when stating a problem. I might toss two coins, then tell you: "One of them came
up heads, and my Aunt Maisie has a boil on her backside. What's the chance that both coins are heads?" Aunt Maisie's boil, though of course very
regrettable, is not germane to the status
of my tossed coins. Here, however, the lad's having been born on a Tuesday *is* germane. It gives us information about the speaker's family.
The speaker's family is what we are interested in.

• **Day of week is arbitrary information** If you're given day of month (or some other datum)
instead of day of week, it changes your answer.

Duh. Of course it does. If you change the given facts, you change the output probabilities. That's what Bayesian analysis is all about!

If, instead of day of week (pick one from seven), the speaker had given us odd or even year (pick one of two), the problem's solution would be 3/7. If he'd said: "One is a boy born on a leap day" (i.e. a February 29), the solution would be 2921/5843 (assuming the kid is less than 110 years old). So what?
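The three answers quoted here can be reproduced with one small function. This is a sketch in Python rather than VB, under the assumption that the extra datum is independent of sex and holds for any one child with probability `p` (1/7 for a given weekday, 1/2 for an odd year, 1/1461 for a leap day). The function name is made up for illustration.

```python
from fractions import Fraction


def two_boys_given_boy_with_datum(p):
    """P(two boys | at least one child is a boy who has the extra datum).

    p is the chance that any one child has the datum, independent of sex.
    """
    p = Fraction(p)
    boy_with_datum = p / 2  # one child is a boy AND has the datum
    # Chance that at least one of the two children is such a boy.
    at_least_one = 1 - (1 - boy_with_datum) ** 2
    # Both are boys (chance 1/4) and at least one of them has the datum.
    two_boys_and_one = Fraction(1, 4) * (1 - (1 - p) ** 2)
    return two_boys_and_one / at_least_one


print(two_boys_given_boy_with_datum(Fraction(1, 7)))     # 13/27
print(two_boys_given_boy_with_datum(Fraction(1, 2)))     # 3/7
print(two_boys_given_boy_with_datum(Fraction(1, 1461)))  # 2921/5843
```

Exact fractions are used instead of floating point so that the results can be compared directly with the figures in the text.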

• **The double-counting issue** "You're forgetting to double-count the "B-Tu B-Tu"
cases."

Possibly I am; but why is Visual Basic *also* forgetting to double-count them?

Here is the logic of my simulation, whose VB code I have shown above.

Make a random pick from the set {B, G}

Make a random pick from the set {Mo, Tu, We, Th, Fr, Sa, Su}

That's your first child.

Make a random pick from the set {B, G}

Make a random pick from the set {Mo, Tu, We, Th, Fr, Sa, Su}

That's your second child.

That's your two-child family.

If either (or both*) of those pairs of picks resulted in a B-Tu, flag the family as being of interest.

If you just flagged the family *and* both picks from {B, G} gave a B, double-flag the family.

Repeat all the above some very large number of times, tallying flags and double-flags.

Compute the proportion of flagged families that are double-flagged.

What is not being counted there? How should that logic be changed to deal with this double-counting issue?
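For readers without VB to hand, the steps listed above translate almost line for line into Python. This is a sketch, not the original program: the function name `simulate`, the seed and the sample size are arbitrary choices made for illustration.

```python
import random

DAYS = ("Mo", "Tu", "We", "Th", "Fr", "Sa", "Su")


def simulate(n_families, seed=None):
    """Return (flagged, double_flagged) counts for n_families random families."""
    rng = random.Random(seed)
    flagged = 0
    double_flagged = 0
    for _ in range(n_families):
        # Two children, each a random pick from {B, G} and from the weekdays.
        family = [(rng.choice("BG"), rng.choice(DAYS)) for _ in range(2)]
        # Inclusive test: true if either child, or both, is a B-Tu.
        if ("B", "Tu") in family:
            flagged += 1
            if family[0][0] == "B" and family[1][0] == "B":
                double_flagged += 1
    return flagged, double_flagged


flagged, double_flagged = simulate(200_000, seed=2010)
print(flagged, double_flagged, double_flagged / flagged)
```

Each family passes through the flag test once, so a B-Tu B-Tu family adds one to each tally, exactly as in the VB version.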

Suppose I were to write this program:

Make a random pick from the set {H, T}

Make a random pick from the set {H, T}

That's your two-coin toss.

If either of those picks resulted in an H, flag the two-coin toss as being of interest.

If you just flagged the two-coin toss *and* the other pick was a T, double-flag the two-coin toss.

Repeat all the above some very large number of times, tallying flags and double-flags.

Compute the proportion of flagged two-coin tosses that are double-flagged.

Would that be failing to double-count something? If I coded that up and simulated a few million two-coin tosses, would my result be wrong?
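With only four equally likely outcomes, the coin program can be settled by listing them rather than simulating. The sketch below is in Python rather than VB and reads "the other pick was a T" as "the flagged toss contains a T"; that reading is an assumption. The both-heads proportion is shown alongside for comparison.

```python
from fractions import Fraction
from itertools import product

# The four equally likely two-coin tosses: HH, HT, TH, TT.
tosses = list(product("HT", repeat=2))

# Flag every toss that shows at least one head.
flagged = [t for t in tosses if "H" in t]

# Among flagged tosses, those that also contain a tail, and those with two heads.
with_tail = [t for t in flagged if "T" in t]
both_heads = [t for t in flagged if t == ("H", "H")]

print(Fraction(len(with_tail), len(flagged)))   # 2/3
print(Fraction(len(both_heads), len(flagged)))  # 1/3
```

HH appears once in the list of outcomes and is therefore counted once, just as B-Tu B-Tu is counted once in the family program.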

* In VB, as in every other computer language known to me, the plain "Or" is inclusive (Latin *vel*). If it's an
exclusive "or" you want (Latin *aut*) the VB operator is "Xor."

—————————

This is as far as I can take things right now. I'll come back to the issue later today. And yes, I'm dealing with the easy stuff first. (And ignoring the rude stuff. To the reader who says my calculations can't be worth much as I believe in "evolutionism": Yes, and I also believe in heliocentrism, atomic-theory-of-matter-ism, circulation-of-the-blood-ism, laws-of-motion-ism, statistical-mechanics-ism, continental-drift-ism, and all sorts of other weird notions. My credulity is unbounded.)

As things stand at present, though (mid-morning Thursday, July 8), I am unconvinced by any counter-arguments I've seen.

If I were to find myself in a room with two persons, one a bookmaker, the other a public figure of impeccable integrity — a justice
of the U.S. Supreme Court, say … No, scratch that: make it a *National Review* editor — if, in this situation, the
integrity guy were to say: "I have two children. One is a boy born on a Tuesday," and then leave the room; and if, after he'd left and
closed the door, the bookie were to turn to
me and say "Thirteen'll get ya fourteen the other one's a boy too," I'd hesitate. On any longer odds, though — 15 to 13, 1401
to 1300, 5 to 4, 4 to 3, … — I'd take the bet. Why would I be wrong to do so?
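The arithmetic behind that betting rule can be sketched as follows, in Python rather than VB. It assumes "thirteen'll get ya fourteen" means staking 13 to win 14, and likewise for the other odds; that reading, and the helper name `expected_profit`, are assumptions for illustration. The chance of two boys is taken to be 13/27.

```python
from fractions import Fraction

P_TWO_BOYS = Fraction(13, 27)


def expected_profit(stake, winnings, p_win=P_TWO_BOYS):
    """Expected profit from risking `stake` to win `winnings` with chance p_win."""
    return p_win * winnings - (1 - p_win) * stake


# (winnings, stake) pairs taken from the odds quoted in the text.
for winnings, stake in [(14, 13), (15, 13), (1401, 1300), (5, 4), (4, 3)]:
    print(f"{winnings} to {stake}: {expected_profit(stake, winnings)}")
```

At 14 to 13 the expected profit is exactly zero, which is why that offer is the point of hesitation; every longer price in the list comes out positive.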

The emails are still coming in. Most re-argue points I've dealt with up above, though perhaps I didn't express myself with sufficient clarity.

Hi Derb,

Same family, same 2 kids, same son, but different probabilities of your other kid being a son, depending on *how* you describe your son? i.e.,

- He's just a boy. Odds: 50/50. [NB: This is wrong — see below. It's an extraordinarily common error, though. — JD]
- He was born on Tuesday. Odds: 13/27.
- He was born in an odd year. Odds: 3/7.

Three different bets on one family? They can't all be right. Maybe you have a quantum family, you know, reality changes depending on the manner in which it's observed. :-)

I like the quip; but three different bets *can* all be right. (Or "right" — what exactly constitutes a
"right" bet?) We're not ascertaining sure true facts, we're estimating probabilities based on information supplied. If you change
the input information, *of course* the estimated probability will change! *That's what probability theory is all about*.

What makes this particular result so counterintuitive is the fact of the extra input information seeming so irrelevant to the main matter at hand — the sexes of the two children.

And you can even pump up the counterintuitiveness by removing the human element. As one reader pointed out, the problem as stated is isomorphic to this one:

I just tossed two fair coins. One came up heads; it was minted on a Tuesday. What is the probability both coins came up heads?

Now my result is *really* counterintuitive! (And it's an interesting sidebar issue why this should be any more counterintuitive than
the original. Why should we be any more tolerant of weirdness in human affairs than in the physical world? Discuss among yourselves.)

Math is of course full of counterintuitive results, some of them *much*
weirder than this one.

Those weirdnesses are in pure math, though, a world of aery abstraction where anything might be possible. The probability theory we're using here is a branch of applied math. We really don't like a counterintuitive result in applied math. It suggests that somebody's bridge might fall down.

I may as well state my **final** conclusion right here. I think that a fair use of the word
"probability" is bound to encompass
some oddities like this, just as a fair use of the word "measure" is bound to include weirdnesses like the Banach-Tarski Paradox.

The best counter I can think of to my own conclusion is a sort of efficient-market argument. If I'm right, then surely in all these millennia of human beings practicing gambling, some ingenious entrepreneur would have gotten rich by mining some such anomaly.

I'd counter the counter by observing that markets are only efficient *eventually*. It took human beings eight hundred years
to devise a plow-horse collar that did not strangle the horse.

—————————

Herewith some common categories of reader response, other than those dealt with above.

• **Agreers** A lot of readers — around a quarter — just agree with me. Yes, the answer to the
problem as stated is 13/27. There's nothing wrong with either my logic or my empirical simulation.

Most of these readers went on to add interesting qualifications, of the sort very ably summarized in
this *Science News* article
on the problem. The qualifications overlap to some degree with the "lawyering" approach I noted up above.
*Science News*:

Everything depends … on why I decided to tell you about the Tuesday-birthday-boy. If I specifically selected him because he was a boy born on Tuesday (and if I would have kept quiet had neither of my children qualified), then the 13/27 probability is correct. But if I randomly chose one of my two children to describe and then reported the child's sex and birthday, and he just happened to be a boy born on Tuesday, then intuition prevails: The probability that the other child will be a boy will indeed be 1/2. The child's sex and birthday are just information offered after the selection is made, which doesn't affect the probability in the slightest.

I'm not sure about that, but perhaps my hesitation is just temperamental. I couldn't care less about why the guy is saying what he's saying, whether he's a chronic liar, or which side of the bed he got out of that morning. Surely we should just take the facts as given, apply logic, and see what comes out.
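The distinction *Science News* draws can be made concrete by enumerating both selection procedures over the same 196 families. This is a sketch in Python rather than VB; the labels "protocol 1" and "protocol 2" and all variable names are illustrative, and it assumes sex and weekday are independent and uniform.

```python
from fractions import Fraction
from itertools import product

DAYS = ("Mo", "Tu", "We", "Th", "Fr", "Sa", "Su")
CHILDREN = list(product("BG", DAYS))
FAMILIES = list(product(CHILDREN, repeat=2))
TARGET = ("B", "Tu")


def both_boys(family):
    return family[0][0] == "B" and family[1][0] == "B"


# Protocol 1: the speaker mentions a Tuesday boy whenever he has one.
p1_cases = [f for f in FAMILIES if TARGET in f]
protocol_1 = Fraction(sum(both_boys(f) for f in p1_cases), len(p1_cases))

# Protocol 2: the speaker picks child 0 or child 1 at random and describes
# that child, who happens to be a Tuesday boy. Each (family, index) pair
# is equally likely.
p2_cases = [(f, i) for f in FAMILIES for i in (0, 1) if f[i] == TARGET]
protocol_2 = Fraction(sum(both_boys(f) for f, _ in p2_cases), len(p2_cases))

print(protocol_1, protocol_2)  # 13/27 1/2
```

The facts reported are identical in both cases; only the assumed procedure that generated the statement differs, and that is what moves the answer between 13/27 and 1/2.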

Even that 1/2 is problematic, as the article points out elsewhere. If you apply my original logic to the following problem:

I have two children. One is a boy. What is the probability I have two boys?

… you actually get 1/3. Having told you that one of my children is a boy, I have placed my family in the subset of all possible two-child families — it constitutes three-quarters of them — that are B-B, B-G, or G-B with equal probability. The B-B families make up one-third of this subset.

Additional input data, no matter how apparently irrelevant, moves the probability estimate from this 1/3 to the more intuitive 1/2. In fact, as my examples above with odd-even years and leap-days show, the more off-the-wall the extra data is, the closer it pushes the estimated probability to 1/2; but even what looks like the most minimal data (odd-even year) pushes the probability four-sevenths of the way from 1/3 to 1/2.
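The "four-sevenths of the way" figure is easy to check. The sketch below, in Python rather than VB, uses the formula (2N − 1)/(4N − 1) that this post gives further on for a datum with N equally likely values. The function names are illustrative, and N = 1 stands for no extra datum at all.

```python
from fractions import Fraction


def answer(n):
    """Probability of two boys when the extra datum has n equally likely values."""
    return Fraction(2 * n - 1, 4 * n - 1)


def fraction_of_the_way(n):
    """How far answer(n) lies along the interval from 1/3 to 1/2."""
    low, high = Fraction(1, 3), Fraction(1, 2)
    return (answer(n) - low) / (high - low)


for n in (1, 2, 7, 1461):
    print(n, answer(n), fraction_of_the_way(n))
# prints:
# 1 1/3 0
# 2 3/7 4/7
# 7 13/27 8/9
# 1461 2921/5843 5840/5843
```

So the weekday datum already carries the estimate eight-ninths of the way to 1/2, and the leap-day datum all but closes the gap.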

• **Deniers** Of the people who answered my challenge to find fault with my empirical result, all the thoughtful
ones fell back on the double-counting issue, which
I dealt with above, at any rate to my own satisfaction.

The ones who got closest to persuading me were those who supplied their own code. Sample:

' MY VERSION Test siblings probability problem

Sub Main_v2()

' Declarations:

Dim iFileNo, iV, iSex(2), iDay(2) As Integer

Dim lU, lMax, lStar, lStars As Long

Dim dRat As Double

Dim sFilePath, sFileName, sSex(2), sDay(2), sStar, sStars As String

' Set up and open the output file:

sFilePath = "C:\Documents and Settings\"

sFileName = sFilePath & "Siblings_v2.txt"

iFileNo = FreeFile

Open sFileName For Output As iFileNo

' Initialize:

lStar = 0

lStars = 0

Randomize

' How many families?

lMax = 1000000

' Do the business:

For lU = 1 To lMax

For iV = 1 To 2

' Assign random sex:

iSex(iV) = Int(2 * Rnd() + 1)

Select Case iSex(iV)

Case 1

sSex(iV) = "B"

Case 2

sSex(iV) = "G"

End Select

' Assign random day of week:

iDay(iV) = Int(7 * Rnd() + 1)

Select Case iDay(iV)

Case 1

sDay(iV) = "Mo"

Case 2

sDay(iV) = "Tu"

Case 3

sDay(iV) = "We"

Case 4

sDay(iV) = "Th"

Case 5

sDay(iV) = "Fr"

Case 6

sDay(iV) = "Sa"

Case 7

sDay(iV) = "Su"

End Select

Next iV

' Assign single asterisk: First Born

If (sSex(1) = "B" And sDay(1) = "Tu") Then

sStar = "*"

lStar = lStar + 1

' Assign double asterisk:

sStars = " "

If sStar = "*" And sSex(2) = "B" Then

sStars = "**"

lStars = lStars + 1

End If

Else

sStar = " "

End If

' Assign single asterisk: Second Born

If (sSex(2) = "B" And sDay(2) = "Tu") Then

sStar = "*"

lStar = lStar + 1

' Assign double asterisk:

sStars = " "

If sStar = "*" And sSex(1) = "B" Then

sStars = "**"

lStars = lStars + 1

End If

Else

sStar = " "

End If

' Show one family:

' Print #iFileNo, sSex(1) & "-" & sDay(1) & " " & sSex(2) & "-" & sDay(2) & " " & sStar & " " & sStars

Next lU

' Print stats:

Print #iFileNo, " "

Print #iFileNo, "Single asterisks: " & lStar

Print #iFileNo, " "

Print #iFileNo, "Double asterisks: " & lStars

Print #iFileNo, " "

dRat = lStars / lStar

Print #iFileNo, "Ratio is " & dRat

Close #iFileNo

End Sub

The result is this:

Single asterisks: 142525

Double asterisks: 71436

Ratio is 0.501217330292931

I'm on the side of 50/50 that the 2nd child is a boy.

Thanks. Now get out of my head!

I'd be glad to oblige, Sir, but it's *your* head — you have to kick me out.

That's just double-counting, and my former objections apply. We don't double-count on coin tosses, saying there are six possibilities for a double toss: TT (counted twice), TH, HT, HH (counted twice). Why do it here? Especially since, as I have just noted, the original problem is isomorphic to a coin-tossing one!
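Where the reader's 0.50 comes from can be seen by applying his tallying rule to each of the 196 equally likely families once, instead of to a million random ones. This is a sketch in Python rather than VB; the variable names are illustrative and the two `if` tests mirror the "First Born" and "Second Born" blocks of his code.

```python
from fractions import Fraction
from itertools import product

DAYS = ("Mo", "Tu", "We", "Th", "Fr", "Sa", "Su")
CHILDREN = list(product("BG", DAYS))
FAMILIES = list(product(CHILDREN, repeat=2))
TARGET = ("B", "Tu")

stars = 0
double_stars = 0
for first, second in FAMILIES:
    # "First Born" block: tally if the first child is a B-Tu.
    if first == TARGET:
        stars += 1
        if second[0] == "B":
            double_stars += 1
    # "Second Born" block: tally again if the second child is a B-Tu.
    if second == TARGET:
        stars += 1
        if first[0] == "B":
            double_stars += 1

print(stars, double_stars, Fraction(double_stars, stars))  # 28 14 1/2
```

The B-Tu B-Tu family passes both tests, so it adds two to each tally: 28 and 14 rather than 27 and 13. Those proportions, 28/196 and 14/196, agree with the reader's counts of 142525 and 71436 per million.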

You can always jiggle code to get the result you want. What I am unconvinced of is that any of the code offered to me is a better translation than mine into logic (as instantiated in the coding language) of the statements offered by the speaker. You must at least allow that my code is in better compliance with the Occam's Razor principle!

[The opposite of that principle is referred to in Chinese by the idiom "drawing feet on a snake" (畫蛇添足). I just mention this because I like the idiom, and it kept coming to mind when I was looking at the code offered. Occam's Razor is, in my opinion, always the better way to go.]

• **Vituperators** Quite a lot of people take a counterintuitive result in mathematics personally. It makes them mad.
They blame the messenger.

So you admit that the day of the week you are born on, the month, the direction of the wind, the mother's favorite color — whatever — are all irrelevant to the baby's gender, yet you go on to argue that those things have a discernible relevance to the outcome? [NB: Not only did I not admit any such thing, I plainly said the OPPOSITE! — JD] You admit the irrelevancy, then say the day of the week creates a 13/27 fraction, the month creates a 23/47 fraction, the direction of the wind (four ways) creates a 7/15 fraction — all giving different probabilities, yet all somehow irrelevant!

At this point it is safe to say that you are either a man with no character who cannot admit when he makes a mistake — or you are just as stupid as a box of rocks.

Well, it's always salutary to be reminded that we are really — all of us some of the time, and some of us all the time — just smart apes, snarling at each other and trying to pick fights.

Also that a great many people — perhaps the previous parenthetical even applies again — are unable to face the fact that the universe and its laws are whatever they are, and don't give a fig whether we like them or not.

The emails are *still* coming in — four more just today. I'm up well over 400. I spent time at the weekend combing
through them, but couldn't respond to more than about one in ten. Apologies to the others; many, many thanks to all for taking the trouble to give me
a view.

I shall be a dogged empiricist till I die; but I must say, as a result of reading through those emails, my view of
what I called
the "lawyering" issue has softened somewhat. (It helps that while engaged in this, I was reading James Gould
Cozzens' fine lawyer novel
*The Just and the
Unjust*, which left me with a lot of respect for lawyers — more respect, at any rate, than Cozzens'
*The Last
Adam* left me with for doctors, or
*Men and
Brethren* for clergymen.)

Several readers thanked me for the link to
the *Science News* article on
the puzzle, which, they said, explained the 13/27 result better than I had. Hmph. It looks to me just like
my July 7 posting on The Corner …

I think I can get a majority vote for the proposition that this is a *very* weird result. That, of course, is why it's created such a
stir.

There are three plausible answers to the problem as stated, two intuitive and one counterintuitive.

- 1/3. The Tuesday business is perfectly irrelevant; we have B-B, B-G, or G-B with equal probability, the speaker having excluded G-G.
- 1/2. *All* the other givens are irrelevant: the chance that any particular child is male is one in two.
- 3/7, 5/11, 7/15, 9/19, 11/23, 13/27, 15/31, … and an infinity of other answers with the probability (2*N* − 1) / (4*N* − 1) for any integer *N* > 1. The value of *N* is just the number of equal-probability values the "extra" datum might assume in the data set it implies. "Tuesday" implies the data set "days of the week," so for that datum *N* = 7. These answers approach 1/2 asymptotically as *N* → ∞; so the more rarefied the "extra" datum you're given, the closer my empirical logic will get you to the limiting answer 1/2.
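The general formula in the third item follows from the same counting used for the Tuesday case. A short derivation, assuming the extra datum takes each of its N values with equal probability and is independent of sex, so that any one child is a boy with the stated datum with probability 1/(2N):

```latex
\begin{align*}
P(\text{at least one boy with the datum})
  &= 1 - \left(1 - \frac{1}{2N}\right)^{2}
   = \frac{4N - 1}{4N^{2}}, \\[4pt]
P(\text{two boys, at least one with the datum})
  &= \frac{1}{4}\left[1 - \left(1 - \frac{1}{N}\right)^{2}\right]
   = \frac{2N - 1}{4N^{2}}, \\[4pt]
P(\text{two boys} \mid \text{at least one boy with the datum})
  &= \frac{(2N - 1)/(4N^{2})}{(4N - 1)/(4N^{2})}
   = \frac{2N - 1}{4N - 1}.
\end{align*}
```

Setting N = 7 gives 13/27, and letting N grow without bound gives the limit 1/2.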

My final conclusion remains as stated above. If you take a problem of this kind and

- reduce the given statements to logical assertions, then
- instantiate the logic in code, then
- run the code;

and if the result you get is counterintuitive, then your intuition is probably wrong. That may be a naïvely empirical approach, and I'll agree that there are lawyerly cases to be made for the other two solutions noted above. (It would be interesting, in fact, to see something like this argued before a jury.) Invited to choose between the empirical and the intuitive, though, I'll go with empiricism every time.

It's a weird result, sure enough. Still, we live in a weird universe, as Martin Gardner noted in
*The Night
is Large* (index references to Chesterton, Gilbert K.). Mathematics captures some of that weirdness, as Gardner was very much aware.

Since this
whole thing started at a conference paying tribute to Gardner, it's fitting to quit the topic with a nod to him. My personal obituary notice for
Martin Gardner is here, with another one upcoming
in *Focus*, the very excellent monthly newsletter of the Mathematical Association of America.