## Girls only, part 1

Posted by Chris on October 10, 2011 – 8:09 pm

I know this is well known and almost certainly has been posted here before; you’ll see why I’m posting it when it comes to part 2.

I have two children, at least one of them is a girl. What is the probability that they are both girls?

As a few of you have brought it up, assume that each child is, independently, equally likely to be a boy or a girl.

October 10th, 2011 at 9:13 pm

I think that since at least one is a girl, the other could be a boy or a girl, so it will be 50% that it’s a girl.

October 10th, 2011 at 9:24 pm

I have not looked back, and I don’t remember this. The unknown sex would be 50/50, although an odd thing is that in reality slightly more females than males are born. I’ll ignore that and say 100% for the known female and 50% for the unknown. So I suppose that means a 50% probability that they are both girls… Oh, and 50% for a girl and a boy… (oh, and 0% that they are both boys? LOL, just having fun.) So what’s part 2 for this… Oh, another on the way? Try an ultrasound? …No, don’t, let it be a surprise.

October 10th, 2011 at 9:30 pm

Hey Ragknot, I’m FireBall but on a different account. I remember something about 51% for girls and 49% for boys, but I might be wrong. [KnightWizard AKA FireBall]

October 10th, 2011 at 9:31 pm

I don’t know if I’m correct about the boy and girl thing.

October 10th, 2011 at 9:35 pm

I wonder what part two is?

October 10th, 2011 at 11:03 pm

I remember getting into a tangle with this one before. If I recall correctly the arguments went something like this . . .

Your two children can be BB, BG, GB or GG, all equally likely. (BG means boy born first, girl second). If at least one is a girl then BB is eliminated. Out of the remaining equally likely outcomes there is just a one in three chance of getting GG, so the answer is 1/3.

The other argument is that regardless of the sex of the known child (whether first or second) there is a 50% chance of the other one being a girl.

I seem to remember that I favoured the second argument and, as usual, I was declared to be wrong.

Contrary to what Ragknot says I believe that more boys are born than girls (about 52% to 48%). It may be different down Texas way, in Ragknot County. But as they get older more boys die off through wars and doing stupid things. So retirement homes end up with many more women than men.

October 11th, 2011 at 12:15 am

1/3. Possibilities are GG,BG,and GB, all equally likely.

October 11th, 2011 at 2:46 am

Assuming probability of either are both 1/2.

October 11th, 2011 at 5:47 am

Are there actually more boys born though? Does anyone have a reference?

I did hear about there being more boys than girls in China. Girls are killed after birth because they won’t be able to provide for them in the future.

October 11th, 2011 at 5:55 am

1/3 it is. Assuming the children can be identified, eldest/youngest or largest/smallest, smartest/dimmest or any such labelling (as long as it doesn’t correlate to the sex [I know, I should say "gender"] of the child), then we can write the children’s sexes as an ordered pair. So BB,BG,GB,GG are the four equally likely possibilities for a two child family. But as we know that at least one is a girl, we eliminate BB leaving 1/3 of families being GG.

If I had said e.g. “the tallest is a girl”, that’s a different problem. If I use the ordering tallest/shortest (assuming that boys and girls have the same average heights), then before using the height info, the four equally likely possibilities are BB,BG,GB,GG but as the tallest is a girl, we eliminate BB and BG leaving GB,GG and then we get Wiz’s preferred answer of 1/2. Also, in this case it is irrelevant whether or not I state that at least one is a girl, as that can only eliminate BB, and that has already been eliminated.
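The two counting arguments above can be checked by brute-force enumeration. What follows is a minimal Python sketch, not code from this thread; the helper name `prob_both_girls` is made up for illustration, and it assumes the four ordered families are equally likely.

```python
from fractions import Fraction
from itertools import product

# All ordered two-child families: BB, BG, GB, GG.
# Assumption: each is equally likely (independent 50/50 children).
FAMILIES = list(product("BG", repeat=2))

def prob_both_girls(condition):
    """P(both girls | condition), by counting equally likely families."""
    kept = [f for f in FAMILIES if condition(f)]
    both = [f for f in kept if f == ("G", "G")]
    return Fraction(len(both), len(kept))

# "At least one is a girl": only BB is eliminated.
at_least_one = prob_both_girls(lambda f: "G" in f)

# "The tallest is a girl": write each pair tallest-first, so BB and BG go.
tallest_is_girl = prob_both_girls(lambda f: f[0] == "G")

print(at_least_one, tallest_is_girl)
```

The only difference between the two calls is the filter, which is the whole point: the extra information about *which* child is a girl removes one more family from the list.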

I’ve now posted part 2 of the problem.

October 11th, 2011 at 7:15 am

As an argument against the 1/3 solution:

You are forgetting that one of the children is known and one is not. If such is the case, the possibilities are the following:

K = known child, regardless of gender, but we know it is a girl

B = undeclared child who turns out to be a boy

G = undeclared child who turns out to be a girl

Permutations: KB, BK, KG, GK

With this shown, we see that if both children are girls, we must eliminate two possibilities (KB and BK), and we are left with two possibilities.

Therefore, the intuitive answer of 1 in 2, can be proven correct.

October 11th, 2011 at 8:24 am

Hi BearSprite. I am not forgetting that one of the children is known to be a girl. I can’t see what I’ve said to cause you to say that. Neither have I forgotten that I don’t know which one is the girl.

I’ve also shown how to arrive at 1/2, but I had to provide further information to do that.

You’ve baffled me with your KB etc. stuff. I assume that the K in KB,BK,KG,GK could be b or g (I’m using lower case to show that’s really the K). So expanding the list you have bB,gB,Bb,Bg,bG,gG,Gb,Gg; but bB=BB, gB=GB etc, and so you only really have BB,BG,GG,GB. That BB,BG,GG,GB represents all the possible families (before saying at least one is a girl) in a very straightforward fashion, whereas your introduction of K is far from being well-defined.

For instance, you say K is actually a girl in your declarations – so why have you eliminated KB and BK? In fact you cannot safely eliminate any of KB,BK,KG,GK because K might be a girl, and the latter two definitely have at least one girl.

I believe that I’ve cast enough doubt on the validity of your argument, can you identify a specific flaw in mine?

I will knock up a Monte Carlo simulation for this one.

October 11th, 2011 at 8:34 am

Here’s a link to some VBA source code for you to try.

http://trickofmind.com/wp-content/uploads/2011/10/girls-only-part-1.txt

You can load it into Excel. On the main screen press Alt+F11, then in the VB IDE (that should have opened), choose Insert then Module. Paste the code into the new blank module.

October 11th, 2011 at 11:02 am

In order to avoid another Al Gelman incident, it is important to realise that the question is logically equivalent to “I have two children, they are not both boys. What is the probability that they are both girls?”. In that form, it is clearer that you cannot say that you know that [a particular] one of them is a/the girl.

This should make it clear that BearSprite’s analysis isn’t consistent with the problem, as there is no known child.

October 11th, 2011 at 1:28 pm

Hi again BearSprite. I’ve just re-read your post. You say “With this shown, we see that if both children are girls, we must eliminate two possibilities (KB and BK), and we are left with two possibilities”. If both are girls you are left with one possibility: GG. Altogether, from KB, BK, KG, GK we definitely must include KG and GK, but half the time we can exclude KB (when it’s bB) and half the time we can exclude BK (when it’s Bb), but we can’t exclude them when they’re gB or Bg.

By symmetry, KB,BK,KG,GK are equally likely and must have a probability of 1/4. Assuming K = g or b half the time, then gB,bB,Bg,Bb, gG,bG,Gg,Gb each have a probability of 1/8 (this is getting rather tortuous). As at least one must be a girl, then we eliminate bB,Bb and are left with gB,Bg,gG,bG,Gg,Gb, each being equally likely. As two of those six (gG and Gg) are girls only, the probability of two girls given at least one girl is 2/6 = 1/3.
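The eight-case expansion can be listed mechanically. This is only an illustrative Python sketch of the counting in the paragraph above; the tuple layout and the `label` helper are assumptions made for the example, and all eight cases are taken to be equally likely.

```python
from fractions import Fraction
from itertools import product

# Each case: (does K come first?, sex of K, sex of the other child).
# Assumption: all eight combinations are equally likely (1/8 each).
cases = list(product([True, False], "bg", "BG"))

def label(k_first, k, other):
    """Write the pair in order, lower case marking K (e.g. 'gB' or 'Bg')."""
    return k + other if k_first else other + k

# Keep families with at least one girl: this drops only bB and Bb.
kept = [label(*c) for c in cases if "g" in c[1] + c[2].lower()]
girls_only = [s for s in kept if s.lower() == "gg"]

print(sorted(kept))
print(Fraction(len(girls_only), len(kept)))
```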

October 11th, 2011 at 1:50 pm

The problem I see with this argument, which carries on to part 2, is that we know, from what has been stated in the question, that K is a girl. It is not a variable. Whether she was born first or second is a variable (as shown in the permutations), but the chance of both children being boys is absolutely zero. This means that any bG, Gb, bB, and Bb can be thrown out.

I am seeing your question in the light of saying

“I have a daughter named Alice. I have two children. What is the gender of the other child.”

We know who Alice is, she is not an ‘unknown’ to be thrown in as a variable. You cannot analyze the siblings and grant the possibility that Alice may be a boy. You also cannot say it makes no difference if Alice was born first or second if the other child is a girl.

That last sentence deserves an extra bit of explanation. We must remember the information granted in your original puzzle. A name or identity was not given. I do not feel this changes the puzzle using the following additional example as proof:

A girl states she has one sibling. What are the chances her sibling is a girl?

The above question is also no different than your original. The difference lies only in the fact that there is not even a need to examine the order of birth. The fact that there’s a 50/50 chance is all that’s important here.

October 11th, 2011 at 1:59 pm

And I won’t post this in the second part

With regards to part 2, again, she might have been born on a Friday, or have a tentacle coming out of the top of her head. But she cannot be a variable based on these facts, and the question does not ask what is her chance of being born, or if her sibling has the same trait, only what is the gender of her unknown sibling.

Your program analyzes the possibility of both children having random genders. If you re-write the program to check for the gender of only one sibling (since the other one is known), what will you find?

October 11th, 2011 at 4:01 pm

Hi BearSprite. You say: ‘I am seeing your question in the light of saying “I have a daughter named Alice. I have two children. What is the gender of the other child.”’

That is not the same as my question. Proof: Of all the possible two child families with a daughter named Alice, you could have AB,BA,AG,GA, where the ordering could be based on age or length of tentacle. Each of those is equally likely. If I now replace A with g, then I get gB,Bg,gG,Gg and each of those is equally likely. So P(GB) = P(BG) = 1/4 and because we have two occurrences of GG, P(GG) = 1/2. For the posted problem, P(GB) = P(BG) = P(GG) = 1/3.

October 11th, 2011 at 4:23 pm

Hi again. My program simulates a population of random two child families, examines them for the criterion of at least one child being a girl, and then determines the fraction of such families that have two girls. That’s precisely what the question implies has to be done.

It doesn’t examine families whose first child, say, is a girl, because the question isn’t e.g. “I have two children. The eldest/smartest/tallest/most tentacled (or whatever) is a girl. What is the probability that they are both girls?”

If that was the question, I would have said that the equally likely families are GG and GB, and concluded 1/2. I would have rejected BB and BG as the eldest (or whatever) wasn’t a girl.

I’ll try this in a more visualisable way. If we examined 4 000 000 two child families, we’d find that approximately 1 000 000 were BB, 1 000 000 were BG, 1 000 000 were GB and 1 000 000 were GG. 3 000 000 of those have at least one girl and of those, 1 000 000 have two girls. So the ratio of two girl families to the total number of families with at least one girl is 1 000 000 / 3 000 000 => 1/3.
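For readers without Excel, here is a rough Python sketch of the kind of simulation described above. It is not the linked VBA source; the function name, the seed and the trial count are arbitrary choices for illustration, and each child is assumed to be a girl with probability exactly 1/2.

```python
import random

def simulate(trials, seed=None):
    """Return the fraction of at-least-one-girl families that are girls only."""
    rng = random.Random(seed)
    at_least_one_girl = 0
    both_girls = 0
    for _ in range(trials):
        # c1 and c2 are labelled arbitrarily (age, height, ...); each is
        # independently a girl with probability 1/2 (an idealisation).
        c1 = rng.choice("BG")
        c2 = rng.choice("BG")
        if c1 == "G" or c2 == "G":
            at_least_one_girl += 1
            if c1 == "G" and c2 == "G":
                both_girls += 1
    return both_girls / at_least_one_girl

ratio = simulate(200_000, seed=2011)
print(ratio)
```

Families with two boys are generated like any other; they simply do not count towards either total, which mirrors filtering a real population rather than manufacturing families that already contain a girl.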

October 11th, 2011 at 4:50 pm

You may note that in post 18, you are ignoring that two possibilities include GG. You then conveniently eliminate one. That was my proof in the first place.

October 11th, 2011 at 4:56 pm

As for post 19, take a family with only one child, a girl. What is the chance that the second child is a girl? Then include its opposite (where we only know the gender of the second child).

The point is that while order counts for something if the genders are different, it also counts for something if we know about one of the children. That known child is not a random girl, she is a known one. Further proof is found in my second example in post 16.

October 11th, 2011 at 5:12 pm

Edit my last post: The order alters the possibilities if we don’t know about one of the children.

Otherwise we must include the known child in the order.

October 11th, 2011 at 5:40 pm

Hi BearSprite. Re post 20. I haven’t eliminated anything in post 18. I may have skipped some fill-in wording. I got to 4 equally likely cases gB,Bg,gG,Gg. As one of them must occur, each must have a probability of 1/4. But as two of them have the signature GG, that means P(GG) = 1/2, and that is precisely what you wanted it to be (and I agree with you, that if I had mentioned Alice, that the probability of two girls would be 1/2). [I have just improved that post (but only a little bit), so this paragraph may now seem superfluous.]

Re post 21. If we examine one child families that consist of one girl, then there is no second child to consider. So P(two girls) = 0.

The order observance is crucial if we want the great convenience of using phrases like “equally likely [possibilities]”. Let’s examine the consequences of ignoring the order. I’ll assume that one of the two children is older than the other, and that any combination of gender is equally likely. So it’s 50-50 that the first born is a boy or a girl, and 50-50 that the second born is a boy or a girl. So in order oldest/youngest we could have BB,BG,GB,GG and each is equally likely. As we must have one of those, each has a probability of 1/4 of occurring. If we now only consider the number of children of a given gender we have 2b0g,1b1g,0b2g. As 2b0g must correspond to BB, we must have P(2b0g) = 1/4. As 0b2g must correspond to GG, we must have P(0b2g) = 1/4. That leaves 1b1g, and as we must have one of the three cases, we must have that P(1b1g) = 1/2. So the conclusion is that if you want to ignore the order, you must recognise that a mixed gender two child family is twice as likely as a two boy family, and twice as likely as a two girl family.

You refer to your proofs. I have shown that they are flawed.

You have frequently said that one of the children is “known”. What exactly do you mean by that? Can you define that in such a way that it’s possible to write an ordered pair? I hope that if you try to answer those questions, that you’ll realise that you don’t really know what you mean by “known”. I know you had a go at that in post 11, but your subsequent analysis wasn’t consistent with your declarations.

In what way do you think that my program is not “solving” the posted problem? I assume you have run it and found that it => 1/3. What do you think the program has calculated? Can you see some sneaky trick that I’ve done to bias the answer to make me right and you wrong?

What flaw do you see in the last paragraph of post 19?

Thanks for continuing to come back.

October 11th, 2011 at 5:42 pm

Hi BearSprite. Re post 23. I see that you acknowledge that the order is important. Now you’ve realised that, you should be nearly there. Realising that the order is important, and being able to say e.g. that BB,BG,GB,GG are equally likely possibilities, can greatly reduce the effort required to discuss and solve probability problems. When the states of a system aren’t equally likely, the calculations can be very laborious. See Ragknot’s last few problems for demonstrations of that.

Using the 2b0g etc. stuff, I’d say that P(two girls given that they have at least one girl) = P(0b2g)/(P(1b1g) + P(0b2g)) = (1/4)/((1/2) + (1/4)) = 1/3.
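The unordered version of the calculation can be reproduced with exact fractions. This is a small Python sketch for illustration only; it derives the 2b0g/1b1g/0b2g probabilities from the four equally likely ordered families and then applies the ratio above.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count the girls in each of the four equally likely ordered families.
girl_counts = Counter(pair.count("G") for pair in product("BG", repeat=2))
total = sum(girl_counts.values())

# Unordered distribution, keyed by number of girls:
# p[0] = P(2b0g), p[1] = P(1b1g), p[2] = P(0b2g).
p = {girls: Fraction(n, total) for girls, n in girl_counts.items()}

# P(two girls | at least one girl) = P(0b2g) / (P(1b1g) + P(0b2g)).
answer = p[2] / (p[1] + p[2])
print(p[0], p[1], p[2], answer)
```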

October 11th, 2011 at 7:55 pm

I do understand where your (and most everybody else’s) argument lies. I also feel I know where the mistake is. I could be wrong. But I will make another attempt at explaining myself.

In your program you create a pair of siblings randomly, then eliminate the pairs (BB) that are not possible due to the constraints of the problem. If the original question asked what is the percentage of people who have these combinations, this would be a correct method to solve the problem.

In my method I have called the previously stated girl a “known”. This refers to her as a constant. She is not a variable.

In your program you create two binary variables, assigning each to the gender of the child, where each variable has a pre-assigned order.

In my method I would have also used two binary variables. However, the first would determine the birth order of the constant child (who is a girl). The second variable is the gender of the variable child.

This is what I was referring to in post 16, where I reiterate:

“A girl states she has one sibling. What are the chances her sibling is a girl?”

I would love to hear how this question is different from your original statistically. I have a very hard time believing that the preceding girl is less likely to have a sister than a brother when each sibling has a 50/50 chance of being either gender.

The error I was referring to in your post 18 is the following (*denoted like this*):

“That is not the same as my question. Proof: Of all the possible two child families with a daughter named Alice, you could have AB,BA,AG,GA, where the ordering could be based on age or length of tentacle. Each of those is equally likely. If I now replace A with g, then I get gB,Bg,gG,Gg (*notice both gG and Gg, these are separate occurrences*) and each of those is equally likely. So P(GB) = P(BG) = 1/4 and because we have occurrences of GG (*we have 2 occurrences, not 1*), P(GG) = 1/2. For the posted problem, P(GB) = P(BG) = P(GG) = 1/3.”

As a caveat that I may have further errors in my argument, I would be lying if I said I ran the program. In fact I don’t even know what language I’d compile it in. My limited understanding of computer programming is interpreting what I am reading, where you very much solve the problem as those preceding this.

October 12th, 2011 at 12:32 am

Hi BearSprite. In my code, I have arbitrarily assigned a sequence to the children (c1 and c2). The assignment is arbitrary, it could be age order, weight order, length of tentacle order, etc. But whatever that order is, it introduces no bias whatsoever. Each child is independently being assigned a gender. Child 1 will be a boy half the time etc. I cannot see what your objection to my assignment is, other than you think you have another [better] assignment scheme. You don’t seem to object to using age as a means of keeping track of things; you yourself had intended to do so with a more elaborate scheme. Furthermore, age seems to be a very natural way of grouping things together. So again I ask you why you have a problem with any of the analyses that I’ve made? Where is the flaw in any of my arguments? I’m discounting your misreadings of my rebuffs of your analyses. I repeat/amplify one explanation at the end of this post.

—

I have now understood your scheme and know where the flaw is too. You say that you would create a variable that would determine the birth order of the constant child. So that variable would say that the known girl is the eldest or youngest child. This causes a coding problem because you have assumed that there is always a girl and you only flip a coin to see if she is the eldest or youngest child, but if the family is boys only, she cannot be either. There is no simple way to generate the BB family.

You may counter that you could choose two genders, and choose the first (or second if the first is B) and then flip again to decide if she is the eldest or youngest. But then the age decision is no longer relevant, as you have already determined the two children’s sexes: they are both girls or they aren’t. Perhaps you could say that you keep flipping until you get a girl, then assign the age. That would guarantee that half the families are girls only, but it would be equivalent to terminating all your initially first born [male] children, until you get a girl. Then when you get your next child, you could then use a time machine to swap the birth order (which again is irrelevant) as you either have two girls or you don’t.

I’ve gone on a bit there. My point is that your elaborate randomizing scheme has many problems. The scheme that I used has no problems that I can think of, and you haven’t identified a problem with it either; you merely say that you have another scheme. Why invent another scheme, especially if it is more complex, when there is already a satisfactory one available? Although the program I used inherently was in age order, c1 being the eldest child because that code was executed first, I could say that c1 was actually the tallest child, and I can see no way to show that isn’t a perfectly sensible interpretation of the code. The only important thing is that the process mustn’t introduce a bias. I cannot see a straightforward way to make your scheme be transparently devoid of bias (at least, not without begging the answer). BearSprite, as I’ve gone on a bit, please realise that I’m not having a pop at you, I love these challenges.

—

Aha. I’ve just thought of a compromise scheme. I first flip to decide if I’m going to decide the sex of the eldest or youngest child next. If I flip heads then I’ll choose the gender for c1 then c2, but if I flip tails, I’ll choose the gender for c2 then c1. But I have no need to write such code, as I don’t care about which child is which. c1= BOY And c2=BOY has the same truth value as c2=BOY And c1=BOY (the only basis for eliminating a family), c1=GIRL And c2=GIRL has the same truth value as c2=GIRL And c1=GIRL (the only case for chalking up a score for girls only), the other two cases simply bump up the total number of families that have at least one girl. It doesn’t matter which order I assigned the gender.

But I have a much better solution. I flip a coin, if it’s heads I have 1 girl, if it’s tails I have 1 boy. I then flip again, and add 1 boy or girl. At each flip I could flip to decide what the next flip means, but that won’t make anything become more random, so I won’t bother. I have updated my code source. It includes the previous code and my new code. The new code also => 1/3. I think you’ll find it very difficult to have even the faintest doubt about its fairness.

http://trickofmind.com/wp-content/uploads/2011/10/girls-only-part-1.txt

—

You say, “A girl states she has one sibling. What are the chances her sibling is a girl?”. I say 1/2 and am pretty confident that you would too. Using g for the child you met, the possible families (in age order) are gB, gG, Bg, Gg. Those possibilities are equally likely and so you get your half.
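The “a girl states she has one sibling” version samples a child rather than a family, and that can be enumerated too. The following Python sketch is only an illustration; it assumes that each ordered family is equally likely and that you are equally likely to meet either child.

```python
from fractions import Fraction
from itertools import product

# Each case: (sex of elder, sex of younger, index of the child you met).
# Assumption: all eight cases are equally likely.
cases = list(product("BG", "BG", [0, 1]))

# Keep the cases where the child you met is a girl: gB, Bg, gG, Gg.
met_girl = [(e, y, i) for e, y, i in cases if (e, y)[i] == "G"]

# Of those, count the cases where the other child is also a girl.
sibling_girl = [c for c in met_girl if c[1 - c[2]] == "G"]

p_sister = Fraction(len(sibling_girl), len(met_girl))
print(p_sister)
```

Here the GG family is counted twice, once for each daughter you might have met, which is why this question gives 1/2 while the posted one gives 1/3.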

—

Re the last part of your post 25. I had accidentally missed the word “two” in my post 18 (I’ve now added it for posterity). You’ll note the “s” on occurrences, and that I had calculated the probability of P(GG) as P(gG)+P(Gg) = 1/4 + 1/4 = 1/2, so I hadn’t overlooked that there are two separate GG cases. This is also the result that you are seeking, and (obviously) I agree with, but only because we’ve new information about the girl being named Alice. It has significantly changed the problem.

October 12th, 2011 at 2:05 am

I could choose in a slightly different way. Pseudo code:

r = Int(4*Rnd()) ' generate 0,1,2,3

if r = 0 then GG => boys=0 : girls = 2

if r = 1 then GB => boys=1 : girls = 1

if r = 2 then BG => boys=1 : girls = 1

if r = 3 then BB => boys=2 : girls = 0
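A runnable rendering of that pseudo code might look like the Python sketch below. It is a translation for illustration rather than the original VBA; `randrange(4)` stands in for `Int(4*Rnd())`, and the counter names, seed and trial count are made up.

```python
import random

# (boys, girls) for r = 0, 1, 2, 3, i.e. GG, GB, BG, BB.
OUTCOMES = [(0, 2), (1, 1), (1, 1), (2, 0)]

def simulate(trials, seed=None):
    """Fraction of at-least-one-girl families that have two girls."""
    rng = random.Random(seed)
    at_least_one_girl = 0
    both_girls = 0
    for _ in range(trials):
        r = rng.randrange(4)  # plays the role of Int(4*Rnd())
        boys, girls = OUTCOMES[r]
        if girls >= 1:
            at_least_one_girl += 1
        if girls == 2:
            both_girls += 1
    return both_girls / at_least_one_girl

ratio = simulate(200_000, seed=1)
print(ratio)
```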

October 12th, 2011 at 3:08 am

Hi Chris,

I believe the problem I am viewing lies in the question itself. It specifically states that at least one child is a girl, then that “each” child has a 50/50 chance at each gender. These two facts contradict one another. If we know that at least one of them is a girl, then only one of the children has a 50/50 chance at their gender.

Still, we assume that the other child will have a 50/50 probability for its gender.

On the other hand, I completely agree that if the question were “What are the statistics of genders where there is at least one girl in a two children scenario?”, your answer is correct. The problem being, that the question specifically states the statistics on the only missing variable right in the question.

The issue with your program, where you basically flip two coins and then eliminate all the possibilities where they both come up heads (as an example), is that this does not correspond with the initial question. It does correspond to actual pairs of children, but that is not a good random method for finding pairs where one is already known. For that type of statistic, you have to use a constant for one of the coins. You can still randomize the order, but not simply eliminate the results you don’t like. I am aware of the fact that the order ceases to be important here, but if one insists on it being important, my previous proofs show that it can be.

You state that a flaw of my method is that there is no possibility for creating a boy-boy pair. I say this is not the flaw, but the proof. It should be impossible to create a boy-boy pair, as the question forbids it, and simply ignoring those occurrences lends itself to a solution for a different question.

In your original quote from post 18, it wasn’t the poor pluralization that was faulty, it was when you stated “If I now replace A with g, then I get gB,Bg,gG,Gg and each of those is equally likely. So P(GB) = P(BG) = 1/4 and because we have two occurrences of GG, P(GG) = 1/2. For the posted problem, P(GB) = P(BG) = P(GG) = 1/3.”

P(GG) = 1/2

P(GB) = 1/4

P(BG) = 1/4

It is a clear mistake to say P(GB) = P(BG) = P(GG)

This is what set me off on that argument.

However, you may be correct that giving one of the girls an identity alters the original question too much. I am less convinced, however, in the example of “A girl states she has one sibling. What are the chances her sibling is a girl?” being different from the original question. Indeed I challenge you to find how that is different from “I have two children, at least one of them is a girl. What is the probability that they are both girls?”

Again, your solution would be correct to analyze certain statistics had it been worded differently. But I feel the question is worded in a way that favors my solution.

October 12th, 2011 at 7:32 am

Hi BearSprite. I apologise in advance, the following probably is in a higgledy-piggledy order. I’ll admit that my add-on about equal likelihood of boy/girl could be misleading. Because I’ve become quite used to probability problems, I semi-automatically know how to interpret them. My intention was to avoid too much being posted about the fact that you aren’t equally likely to get boys and girls, but to pretend that it’s true. Also if I didn’t mention it, some people might somehow contrive a completely weird probability table so that P(BB) = 0, in order to force the population to not have two boys.

Upon meeting someone who tells you that s/he’s got two children and that at least one is a girl (or less misleadingly, that they aren’t both boys), then if you made the reasonable guess that s/he wasn’t unusual, in my opinion, a sensible way to examine the situation is as follows: Imagine that you had access to statistics on millions of families. You could filter the list using the initial criterion that you are only interested in families with two children. You’d expect that if you examined them, you’d find 1/4 had two girls, 1/4 had two boys and 1/2 had one boy and one girl. If you now filtered out the two boy families, you’d be left with 1/3 that had two girls and 2/3 that had one boy and one girl. So if you now assume that the particular individual whose family you’re considering is “normal”, then the probability that s/he has two girls must be 1/3, because that’s what the examination of a large number of families indicates.

Note that I didn’t consider ordered pairs at all. I really only used them previously because it seemed convenient at the time, and so I got into the rut of wanting to distinguish BG and GB so that I had 4 equally likely possible families and in turn that led to the very simple argument that was first posted by Wiz and then Kel9902. That’s because for more complex systems, where there are many more possibilities, non equal likelihoods make doing calculations very laborious. But I wouldn’t be surprised that if I were to ask, “is it more likely that boys are born before girls?”, that I’d find that the answer was “no”.

So I’m sorry if I’ve given out mixed importance to whether ordering matters. It matters when it’s appropriate.

—

In view of the above, in my opinion, very simple analysis, I hope you can see why I find the introduction of the constant/known girl an unnecessary complication. I had begun to analyse how to do it, but almost immediately hit the problem that quite unintentionally showed that the probability of there being two girls was 1/3. Having found that, unless my maths or probability is untamable, it doesn’t matter what scheme you follow, you can’t magically turn 1/3 into something else. Furthermore, I only really wrote the code for a diversion, but expected that if it really came to it, I could use it to demonstrate experimental evidence in favour of 1/3. But for that to work, the other person has to either trust that I’m not cheating, or be sufficiently proficient at coding to check that the code is modelling the situation. I can only say that I have little interest in fabricating results. I doubt that I’m smart enough to cheat the code or to cheat on the various analyses that I’ve made.

A good way to ensure that there is no cheating is to use coins to simulate the children. The problem is that you really need to do 10 000 or more flips to get 2 significant figures. I usually do a million trials (with code) to aim for about 3 significant figures. Fortunately, the only two serious candidate results are 0.5 and 0.3333, so 100 flip pairs should be enough to decide which number is more likely to be right (assuming one of them is right).
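The sample-size remarks can be made a little more concrete with a standard-error estimate. This Python sketch is an illustration under stated assumptions: about 3/4 of the flip pairs contain at least one girl, and the estimated ratio behaves like a binomial proportion with true value 1/3. On those assumptions the standard error for 100 flip pairs is roughly 0.05, so 0.5 and 1/3 sit about three standard errors apart.

```python
import math

def standard_error(flip_pairs, p=1/3):
    """Approximate standard error of the estimated ratio.

    Assumption: about 3/4 of the simulated families have at least one girl,
    and the estimate behaves like a binomial proportion with true value p.
    """
    families_with_girl = 0.75 * flip_pairs
    return math.sqrt(p * (1 - p) / families_with_girl)

for n in (100, 10_000, 1_000_000):
    se = standard_error(n)
    # Gap between the two candidate answers, measured in standard errors.
    print(n, round(se, 4), round((0.5 - 1/3) / se, 1))
```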

Here are results with 100 flip pairs:

0.378378378378378, 0.3698630136986, 0.329411764705882

0.310810810810811, 0.324675324675325, 0.347222222222222

0.394736842105263, 0.381578947368421, 0.328571428571429

1 000 000 flip pairs

0.332900197509935, 0.334201792831076, 0.333697851167368

0.333601865140126, 0.332930835715666, 0.33383709757167

0.33311480128241, 0.333407950454232, 0.332985791216073

You have again said that one is already known, and use that to dispute the validity of the code. All that you know is that the family contains at least one girl. The only way you can do that for a real family is if that family already exists. My code creates families that closely resemble the same random distribution that (our idealised) real families have. It can only then test the family to see if it has at least one girl.

I think the code you want to see goes something like this: Generate a child, if it’s a girl we can bump the atleastoneisagirl counter up, decide whether it’s the eldest child. Then generate another child (which is automatically the other age), but we needn’t bother with that; if the second child is a girl, we can bump the botharegirls counter up regardless of the age assignments. In either case, we then start a new family.

If the first child is a boy, then generate another child. If the second child is also a boy, then start a new family. If the second child is a girl, then it doesn’t matter about the ages, we simply bump the atleastoneisagirl counter up and start on a new family.

So in all cases, the age assignment is irrelevant, so why bother to code for it if it cannot affect the result.

Now note that in all scenarios we always generate two children and the ages of the children are irrelevant. If we generate both children before doing the test, but then do the test in the same order that we would have done, it cannot make any difference to what we do with the counters. That’s because VB’s Rnd() function will produce the exact same result regardless of the time interval between calls to it i.e. it doesn’t matter (as far as bumping counters is concerned) if we test one child at a time or both together. Either way we test the same children and make the same conclusion. i.e. regardless of the detailed sequence (or order come to that) of doing the test, we will make the same adjustments (if any) to the counters.

I therefore say that my first code is logically equivalent to the code that you suggest, i.e. it will produce exactly the same results (to 16 digits), but with much cleaner code. So I’ll use Occam’s Razor.
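To make the equivalence claim concrete, here is a minimal sketch in Python (not the VB used earlier in the thread). The function names `batch_counts` and `sequential_counts` are made up for illustration, and Python's seeded `random.Random` stands in for VB's Rnd(). Both functions draw two random numbers per family in the same order, so with the same seed they should end up with identical counters.

```python
import random


def batch_counts(n_families, seed):
    """Generate both children first, then test the whole family."""
    rng = random.Random(seed)
    at_least_one_girl = 0
    both_girls = 0
    for _ in range(n_families):
        first_is_girl = rng.random() < 0.5
        second_is_girl = rng.random() < 0.5
        if first_is_girl or second_is_girl:
            at_least_one_girl += 1
            if first_is_girl and second_is_girl:
                both_girls += 1
    return at_least_one_girl, both_girls


def sequential_counts(n_families, seed):
    """Test each child as soon as it is generated (the step-by-step scheme)."""
    rng = random.Random(seed)
    at_least_one_girl = 0
    both_girls = 0
    for _ in range(n_families):
        first_is_girl = rng.random() < 0.5
        if first_is_girl:
            at_least_one_girl += 1
            # Second child: only matters for the both-girls counter.
            if rng.random() < 0.5:
                both_girls += 1
        else:
            # First child is a boy: a girl now still counts the family.
            if rng.random() < 0.5:
                at_least_one_girl += 1
    return at_least_one_girl, both_girls


if __name__ == "__main__":
    a = batch_counts(100_000, seed=1)
    b = sequential_counts(100_000, seed=1)
    print(a == b, a[1] / a[0])
```

The ratio `both_girls / at_least_one_girl` should land near 1/3 in either version, which is the point: the order of testing does not touch the counters.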

—

Alice: Considering the full families again, I hope that you accept that P(2 boys) = P(2 girls) = 1/4 and P(1 boy and 1 girl) = 1/2. Because of where I’m going, I’ll ask you to accept that by ordering the families by age (or some other non-gender correlated sequence), that P(BG) = P(GB). Then I can write that P(GG) = P(GB) = P(BG) = P(BB) = 1/4, because they must add up to 1. I now eliminate the two boy families from my consideration, and note that that has no effect on the remaining families. I’ll use lower case “p” to denote probabilities for the reduced set of families that I’m considering. Because the remaining families are the same, we must have that p(GG) = p(BG) = p(GB). Again, these must add up to 1, and so each = 1/3.

So with the Alice scenario, P(GG) = 1/2, P(GB) = P(BG) = 1/4 is correct,

and with the posted problem scenario, p(GG) = p(GB) = p(BG) = 1/3 is also correct.
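The posted-problem figures can be checked by exact enumeration. This is only a sketch in Python (not the VB from earlier in the thread); the variable names are mine, with `P` for the full set of families and `p` for the reduced set, matching the upper and lower case used above. It does not model the Alice scenario.

```python
from fractions import Fraction
from itertools import product

# All four ordered families (elder, younger), each with probability 1/4.
families = list(product("BG", repeat=2))
P = {"".join(f): Fraction(1, 4) for f in families}

# Eliminate the two-boy families and renormalise what is left.
remaining = {name: prob for name, prob in P.items() if "G" in name}
total = sum(remaining.values())
p = {name: prob / total for name, prob in remaining.items()}

for name, prob in p.items():
    print(name, prob)
```

Removing BB leaves three equally weighted families with total weight 3/4, so dividing each 1/4 by 3/4 gives 1/3 apiece.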

As for your challenge to discern the difference, that’s precisely what those results demonstrate. At the moment, I can’t think of a slick sentence to sum it up.

—

The wording of the question. For safety, I clarified that in post 14. I cannot reasonably see how you can ignore the “at least” clause in the problem. Even if I had used the very dodgy grammar that some published versions of this problem use, e.g. “I have two children. One of them is a girl. What is the probability that they are both girls?”, it’d be reasonable to say “0” as there is [only] one girl. However, the question suggests that having two girls is admissible. But I still cannot clearly see that a particular girl has been identified, as no separate reference property such as age or name etc. has been provided.

How about, “I have two children. One of them is a girl. What is the probability that the other one is a girl?”. In this form, the word “other” seems to change the meaning of the question. If I was sensible, I’d point out that the question was semantically ambiguous, and offer both answers with the reasoning.

So I concur that the wording is important. But I don’t concur that my form of the question is ambiguous.

October 12th, 2011 at 8:47 am

Hi again. I realise that I have misinterpreted your programming scheme (as I choose to call it). I think you really want me to keep on generating children until one is a girl, then assign an age to her, then create another child.

That is seriously flawed. You have guaranteed that one child is a girl and then it’s 50-50 that the other is. At a philosophical level, the problem is that the first girl that I generated is in fact the first born; it doesn’t matter that you forge her birth certificate to say otherwise. The process is clearly biased, because you’ve in fact cherry-picked families whose first born is a girl.

In terms of the true birth order as created by the program, we’d have gG or gB and they’d be equally likely. I’m using g to represent the known girl. If you say that I must randomly swap the order, then I’d say that (assuming I swap half of the time) I’d then get gG, gB, Gg, Bg and each is equally likely. So P(gG) = P(Gg) = P(gB) = P(Bg) = 1/4 and P(two girls) = 1/2, P(one boy + one girl) = 1/2. So swapping the ages hasn’t changed anything.

Your strategy (if I’ve understood you correctly) is equivalent to starting with families that currently only have one girl, then doing the tests when the second child is born (and for the heck of it, pretending that she is younger than she really is). This means that you have rejected all families that currently only have one boy, even though half of them will end up with a girl and so shouldn’t have been rejected.
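For anyone who wants to see the bias, here is a rough Python sketch of that scheme as I have described it (it is my reading of the scheme, not anyone's actual code, and `girl_first_scheme` is a made-up name). It keeps generating children until a girl turns up, generates one more child, and then randomly swaps the recorded birth order.

```python
import random


def girl_first_scheme(n_families, seed=0):
    """Fraction of two-girl families when the first kept child is forced to be a girl."""
    rng = random.Random(seed)
    two_girls = 0
    for _ in range(n_families):
        # Keep generating children until one is a girl; boys are thrown away.
        while rng.random() >= 0.5:
            pass
        known = "g"
        other = "G" if rng.random() < 0.5 else "B"
        # "Forge the birth certificate": swap the recorded order half the time.
        family = (known, other) if rng.random() < 0.5 else (other, known)
        if "B" not in family:
            two_girls += 1
    return two_girls / n_families


if __name__ == "__main__":
    print(girl_first_scheme(100_000))
```

The swap only changes which slot the known girl sits in, so the fraction of two-girl families should stay close to 1/2 rather than 1/3.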

October 12th, 2011 at 11:09 am

Hi Chris,

Your analysis in post 30 states what I have been doing, and unfortunately (for me) does indeed show the problem I have been running with. As you stated, in my method I am counting pairs that have one girl already, but I count each GG pair twice since I am counting each girl. It is indeed the lack of identity in one that alters the question in your favor. I was aware of permutation/combination math, but it struck me that in stating that at least one was a girl, we are only calculating for the missing child. That applies identity to the girl, which she does not have.

My method where I take a constant would work if I had simply halved the results where the variable equals the constant. However, it had not occurred to me that this was possible – again because I had not realized that it was possible for the variable and the constant to be equal.
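The double counting can be shown by exact enumeration. This Python sketch is only one reading of the method described above: it treats each girl in turn as the "constant" and the other child as the "variable", and all names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Ordered families with at least one girl: BG, GB, GG.
families = [f for f in product("BG", repeat=2) if "G" in f]

per_girl_total = 0  # one entry per girl treated as the "constant"
per_girl_hits = 0   # entries where the other child (the "variable") is also a girl
for family in families:
    for i, child in enumerate(family):
        if child == "G":
            per_girl_total += 1
            if family[1 - i] == "G":
                per_girl_hits += 1

# Counting per girl visits the GG family twice.
naive = Fraction(per_girl_hits, per_girl_total)

# Halve the entries where the variable equals the constant.
halved_hits = Fraction(per_girl_hits, 2)
corrected = halved_hits / (per_girl_total - per_girl_hits + halved_hits)

print(naive, corrected)
```

Counting per girl gives 2 hits out of 4 entries, i.e. 1/2. Halving the GG entries leaves 1 hit out of 3 entries, i.e. 1/3, which matches counting per family.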

October 12th, 2011 at 11:31 am

Hi BearSprite. I think I can also deal with the Alice case a bit better (but I’m seriously short of sleep and getting a bit gaga). Alice doesn’t obey the same rules as B and G. We haven’t considered that when there are two girls, they could both be named Alice. That sort of consideration must change the statistical behaviour.

Thank you for keeping up the pressure. It was a real challenge working out what the flaws were. But I really like doing it.

And thank you for not throwing an Al Gelman wobblie on me