# How Much Is .1% Difference In Our Dna?

124 replies to this topic

Veteran Member

• Veteran Member
• 1,196 posts
• Age: 43
• Christian
• Creationist
• Rio de Janeiro

Posted 29 May 2008 - 03:04 AM

I haven't had a chance to check everyone's statistics (nor do I want to), but this approach is flawed. To calculate the probability of no events you should use a Poisson distribution. If the mutation rate is 100 per person, the proper calculation is:

P = e^-100 = 3.72*10^-44.

I agree with James the rhetoric needs to tone down in this thread, especially when talking about statistics since the odds everyone will take a turn at flubbing is p=.999982. It would be P=1, but since I've involved myself it can't possibly be 1.

Fred

Using a Poisson distribution, you are calculating the probability that the event does not happen at all. The goal is to calculate the probability that the mutation does not happen in a specific position in a specific way. In other words, if I want the mutation to happen in position 10 with a specific base, and it happens in position 9 with the right base, then it's counted as a failure. It will only be counted as a success if it happens in position 10 with the right base.

What James can't understand is that the probability that a mutation happens has nothing to do with the probability that it happens in a specific position with a specific base. When he uses a formula like p = 1 - (1 - 10^-8)^3000000000, he is not calculating the probability that a mutation happens in one of the 3000000000 positions of the genome; he's calculating the probability that one mutation happens in 3 billion tries in any location. Because 10^-8 is the probability that it happens, it's not the probability that it happens in a specific location.

### #42 jamesf

jamesf

Member

• Veteran Member
• 317 posts
• Age: 47
• no affiliation
• Theistic Evolutionist
• syracuse

Posted 29 May 2008 - 07:40 AM

What James can't understand is that the probability that a mutation happens has nothing to do with the probability that it happens in a specific position with a specific base.

This is false. The probability of at least one mutation somewhere in the genome can be determined by the equation I provided.

When he uses a formula like p = 1 - (1 - 10^-8)^3000000000, he is not calculating the probability that a mutation happens in one of the 3000000000 positions of the genome; he's calculating the probability that one mutation happens in 3 billion tries in any location.

This is false. Assuming independent base pair mutations (and a uniform distribution), the probability of at least one mutation in the entire genome in a single generation is precisely
p = 1 - (1 - 10^-8)^3000000000

with a genome size of 3 billion base pairs and a probability of mutation at each site of 10^-8.

Because 10^-8 is the probability that it happens, it's not the probability that it happens in a specific location.

This is false. 10^-8 is the approximate probability that "it happens in a specific location" in a single generation. This should be clear from the papers cited above. The mutation rate averaged across several diseases was 1.8*10^-8 per nucleotide per generation.
http://www.ncbi.nlm....pubmed/12497628

I am still curious about your algebra. What do you get for p in the following equation? You were quite certain I made an error but seemed shy about giving your own answer. Have you worked out who made the error yet?

p =1 - (1 - 10^-8)^3000000000

Fred

I haven't had a chance to check everyone's statistics (nor do I want to), but this approach is flawed. To calculate the probability of no events you should use a Poisson distribution. If the mutation rate is 100 per person, the proper calculation is:

P = e^-100 = 3.72*10^-44.

I agree with James the rhetoric needs to tone down in this thread, especially when talking about statistics since the odds everyone will take a turn at flubbing is p=.999982. It would be P=1, but since I've involved myself it can't possibly be 1.

From Fred

Hi, Fred
What do you think is "flawed"? The Poisson distribution is an approximation to the true distribution we are after (it converges as the number of samples goes to infinity). And if we were to calculate the probability of "at least 100" mutations, I would certainly have to resort to the Poisson distribution, since I would never want to enumerate all the possible ways that one can get more than 100 mutations (there would be a huge number of permutations).

However, when calculating the probability of "at least one mutation" the answer is very simple. It is just 1-p(no mutations). And p(no mutations) is simply calculated by multiplying the probability of no mutation at every point

which for 3 billion points is just (1-P(mutation at a single point))^3000000000

so P(at least one mutation in the genome)=1 - (1 - 10^-8)^3000000000
assuming a uniform distribution with p(of a mutation at any point)=10^-8.
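
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the "at least one mutation" calculation next to the Poisson approximation mentioned above. The function names are illustrative only, and the sketch assumes the same independent, uniform per-site probability (p=10^-8) and 3 billion sites used in this post. It goes through `log1p`/`expm1` to limit floating-point rounding error when a number very close to 1 is raised to a huge power.

```python
import math

def p_none(p, k):
    """Probability of no event in k independent trials, each with probability p."""
    # (1 - p)**k computed as exp(k * log(1 - p)) to limit rounding error
    return math.exp(k * math.log1p(-p))

def p_at_least_one(p, k):
    """Exact probability of at least one event: 1 - (1 - p)**k."""
    return -math.expm1(k * math.log1p(-p))

def p_at_least_one_poisson(p, k):
    """Poisson approximation with mean k*p: 1 - exp(-k*p)."""
    return -math.expm1(-k * p)

p, k = 1e-8, 3_000_000_000
print(p_none(p, k))                  # roughly 9.4e-14
print(p_at_least_one(p, k))
print(p_at_least_one_poisson(p, k))
```

With these inputs the exact value and the Poisson approximation agree to many decimal places, which is why either approach gives the same practical answer here.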

I really do hope that helps. And thanks for the comment regarding the rhetoric. Not sure why it has to get so heated over a math problem. We certainly all make mistakes. I will happily admit mine if someone can find one here. However, I suspect the probability that Deadlock will ever admit an error is p<.001
James

Veteran Member

• Veteran Member
• 1,196 posts
• Age: 43
• Christian
• Creationist
• Rio de Janeiro

Posted 29 May 2008 - 11:29 AM

This is False. 10^-8 is the approximate probability that "it happens in a specific location" in a single generation

If the location is specific, please tell me what it is.

1. Three people each have a deck of cards (52 cards) and each person draws one card at random from their deck. What is the chance that at least one person will draw an ace of spades? (k=3, p=1/52)

2 - What is the chance that only the first person draws an ace of spades?

### #44 jamesf

jamesf

Member

• Veteran Member
• 317 posts
• Age: 47
• no affiliation
• Theistic Evolutionist
• syracuse

Posted 29 May 2008 - 12:41 PM

These are all basic probability questions. You can get this in the link that Kega provided in a couple posts up if you are interested.

If the location is specific, please tell me what it is.

Not quite sure of your question. For the calculation, I made the assumption that the distribution was uniform. That is, at every point the probability of a mutation was the same, p=10^-8.

1. Three people each have a deck of cards (52 cards) and each person draws one card at random from their deck. What is the chance that at least one person will draw an ace of spades? (k=3, p=1/52)

As previously mentioned, the equation is just
P(of at least one)=1-(1-p)^k

P(of at least one)=1-(1-1/52)^3

2 - What is the chance that only the first person draws an ace of spades?

That is easy, assuming they each have an independent deck like in the first question.
Probability of the first person getting an ace of spades is p1=1/52
Probability of the second person not getting an ace of spades is p2=(1-1/52)=51/52
Probability of the third person not getting an ace of spades is p3=(1-1/52)=51/52

So the probability of A and B and C is just

p(first person drawing an ace of spades and second two not drawing it)=
p1*p2*p3 = (1/52)*(51/52)*(51/52)
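
The two card answers can be checked with exact fractions. This is only a small sketch; the variable names are mine, and it assumes three independent 52-card decks as in the question.

```python
from fractions import Fraction

p = Fraction(1, 52)   # chance one person draws the ace of spades
k = 3                 # number of people, each with their own deck

at_least_one = 1 - (1 - p) ** k          # question 1
only_first = p * (1 - p) ** (k - 1)      # question 2

print(at_least_one, float(at_least_one))
print(only_first, float(only_first))
```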

Notice that p1+p2+p3 does not equal 1. Also, as I noted before with your coin flip example, if you looked at all possible permutations (there are 8 permutations in this 3 draw case) then the probabilities of all permutations must sum to 1.

However, there is no reason to presume that k(number of samples)*p=1.
That would be like saying that if I watch 10 people over a given day, the probability that one will get in an accident must be 1/10, but if I have 20 people the probability must be 1/20 (assuming a uniform distribution). Of course, that is just a bit silly. The distribution can be uniform, but the per-person probability can take on any value between 0 and 1. Each of the 10 people could have a 1/1000 chance of getting in an accident.

So the probability of a mutation at any single position (p) is completely unrelated to the number of positions in the genome (k). Of course, we are ignoring mutations other than single nucleotide mutations (e.g., gene duplications or deletions), and this is critical if you want to perform a full comparison of two different genomes (e.g., chimp versus human): one must be clear whether one is using just single nucleotide mutations or all forms of mutations. This is the cause of a lot of confusion when someone in the press says we are 99% similar (and why this thread started).

So why do you have so much difficulty admitting you made a minor error? For me, winning means learning. If I make an error and learn from it, I win. Is there some problem with admitting to an "evolutionist" that you may have made any kind of error? Even a simple algebra question? Could we even come to an agreement that if we discover that we made an error, we will admit it? I will agree to that.

I don't understand having any kind of discussion with you, when you feel the goal is to never admit even a minor error no matter how the evidence stacks up. Just seems very odd to me.

Can you address this point please? If you simply made a simple error, admit it and my respect for you will increase. If I was wrong, then just tell me what the right answer is since you have been a bit shy regarding your answer.

Explain to me how (0.99999999)^(3,000,000,000) could be equal to 9.3*10^-14?

The correct answer is (99999999/10^8)^(3*10^9) = ? Put it in Excel and see the answer it'll give you.

The thing is worse than I thought. The problem begins in algebra.

Thanks, James

Veteran Member

• Veteran Member
• 1,196 posts
• Age: 43
• Christian
• Creationist
• Rio de Janeiro

Posted 29 May 2008 - 01:33 PM

What a long answer, James. It seems to me you got the point but you don't want to admit it.

You gave the correct answer p = 1/52*51/52*51/52; it's the probability that it happens in a specific position (person 1). But according to your formula it would be

p = 1-(1-1/52)^3 = 0.05659. How could that be?

Veteran Member

• Veteran Member
• 1,196 posts
• Age: 43
• Christian
• Creationist
• Rio de Janeiro

Posted 29 May 2008 - 02:50 PM

notice that p1+p2+p3 does not equal 1. Also, as I noted before with your coin flip example, if you looked at all possible permutations (there are 8 permutations in this 3 draw case) then the probabilities of all permutations must sum to 1.

Of course it's not 1, because it's only one of all the possibilities.

1/52*51/52*51/52 = 2601/140608 ( only the first person draws )
1/52*1/52*51/52 = 51/140608 ( the first and second persons draw )
1/52*1/52*1/52 = 1/140608 ( all three persons draw )
1/52*51/52*1/52 = 51/140608 ( the first and the third persons draw )
51/52*1/52*1/52 = 51/140608 ( the second and the third persons draw )
51/52*1/52*51/52 = 2601/140608 ( only the second person draws )
51/52*51/52*1/52 = 2601/140608 ( only the third person draws )
51/52*51/52*51/52 = 132651/140608 ( no one draws )

(2601+51+1+51+51+2601+2601+132651)/140608 = 140608/140608 = 1
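
The same table can be generated by brute force. The sketch below is one illustrative way to do it, not anything posted in the thread: it enumerates every draw/no-draw combination for three people and confirms that the eight probabilities sum to 1. Note that Python prints each fraction in lowest terms.

```python
from fractions import Fraction
from itertools import product

p = Fraction(1, 52)  # chance a given person draws the ace of spades

def outcome_probability(outcome):
    """Multiply p for each person who draws the ace and (1 - p) for each who does not."""
    prob = Fraction(1)
    for drew_ace in outcome:
        prob *= p if drew_ace else 1 - p
    return prob

# Every combination of (person 1, person 2, person 3) drawing or not drawing the ace
table = {o: outcome_probability(o) for o in product([True, False], repeat=3)}
total = sum(table.values())

for outcome, prob in table.items():
    print(outcome, prob)
print("total:", total)
```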

Veteran Member

• Veteran Member
• 1,196 posts
• Age: 43
• Christian
• Creationist
• Rio de Janeiro

Posted 29 May 2008 - 03:00 PM

However, there is no reason to presume that k(number of samples)*p=1
That would be like saying that if I watch 10 people over a given day, the probability that one will get in an accident must be 1/10. But if I have 20 people the probability must be 1/20th (assuming a uniform distribution). Of course, that is just a bit silly. The distribution can be uniform but they can take on any probability between 0 and 1. Each of the 10 people could have a 1/1000 chance of getting in an accident.

If the probability is measuring accidents, then you must use the number of accidents, not the number of persons; in this case the number of persons would be like the number of tries. It's like you have 10 dice with a probability of 1/1000 instead of the traditional 1/6.

Look, when scientists say that the mutation rate is 10^-8, they are saying that in every 10^8 base duplications a copying error will happen.

If the human genome has 3*10^9 bases, then it is expected that 10^-8*(3*10^9) mutations happen. In other words, 30 mutations per cell division. As there are about 4 cell divisions, 120 mutations per person are expected. But that has nothing to do with where those mutations will happen.

Veteran Member

• Veteran Member
• 1,196 posts
• Age: 43
• Christian
• Creationist
• Rio de Janeiro

Posted 30 May 2008 - 01:52 AM

So the probability of a mutation at any single position (p), is completely unrelated to the number of positions in the genome (k).

Have you noticed that you used the word ANY instead of SPECIFIC?

Can you address this point please? If you simply made a simple error, admit it and my respect for you will increase. If I was wrong, then just tell me what the right answer is since you have been a bit shy regarding your answer.

QUOTE(deadlock @ May 28 2008, 04:22 AM)
Explain to me how (0.99999999)^(3,000,000,000) could be equal to 9.3*10^-14?

The correct answer is (99999999/10^8)^(3*10^9) = ? Put it in Excel and see the answer it'll give you.

The thing is worse than I thought. The problem begins in algebra.

I have to admit, I didn't configure the precision of my Excel correctly, so it calculated (10^8-1)/(10^8) = 1. So I have to apologize and admit you are not so stupid about algebra, but I still think you are very stupid about statistics.

I'm still waiting for an explanation for: p = 1-(1-1/52)^3 = 1/52*51/52*51/52

### #49 jamesf

jamesf

Member

• Veteran Member
• 317 posts
• Age: 47
• no affiliation
• Theistic Evolutionist
• syracuse

Posted 30 May 2008 - 08:28 PM

I have to admit, I didn't configure the precision of my Excel correctly, so it calculated

thank you

But I still think you are very stupid about statistics.

I guess we will find out, since there is a complete record of everything I have said. Personally, I prefer not to call anyone foolish even when they make mistakes, as long as they are willing to learn and admit their error. It is only the fool that will not admit an error when faced with it.

So far, I don't think there is an error in any of my statements on probability, but happy to admit it, if you can find one. Feel free to quote any statement I have said if you question whether it is accurate. Also feel free to ask anyone with a background in statistics as well.

I'm still waiting for an explanation for: p = 1-(1-1/52)^3 = 1/52*51/52*51/52

Afraid I am traveling over the next week, so this will have to be quick. As for the source of the last equation above, you were interested in calculating what is sometimes called an exclusive or conditional probability. This is what I call P3 below and was the answer to the question you requested in question 2 (and quoted above).

The following should address all your problems. If you are still in doubt, just point to any statement I have made that you think might be wrong, and I will explain it. There may be typos, but I am pretty confident in any of the equations.

kn should read K sub n

P1=probability of event/mutation at specific position kn = probability of an event/mutation at any specific location
(i.e., a uniform distribution will be assumed for this example)
P2=probability of at least one event occurs anywhere in the system among all k's
P3=probability of event at position kn AND at no other location

For the card example
P1=1/52

For the gene example (e.g., hemophilia) or for any other single gene mutation
P1=10^-8 (this is an approximation - it is a bit larger)

P2=1-(1-P1)^k
P3=P1*(1-P1)^(k-1)

In discussing genetics, no one usually is interested in P3. This is the probability that a person gets a mutation precisely at site k but has no other mutation anywhere in their genome. This is a very strange conditional probability. For the hemophilia example, P1=10^-8. The probability that someone has that specific mutation and no other mutation (P3) is extremely small. The probability that such an event occurs is
P3=10^-8 * (1-10^-8)^2999999999 which is somewhere around 10^-21.
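
As a rough check on the order of magnitude, here is a short Python sketch of P2 and P3 as defined above. The function names are hypothetical, and it assumes P1=10^-8 at every site with k=3 billion independent sites.

```python
import math

def p2_at_least_one(p1, k):
    """P2 = 1 - (1 - P1)**k: at least one event anywhere among k sites."""
    return -math.expm1(k * math.log1p(-p1))

def p3_only_at_site(p1, k):
    """P3 = P1 * (1 - P1)**(k - 1): event at one chosen site and nowhere else."""
    return p1 * math.exp((k - 1) * math.log1p(-p1))

p1, k = 1e-8, 3_000_000_000
print("P1 =", p1)
print("P2 =", p2_at_least_one(p1, k))
print("P3 =", p3_only_at_site(p1, k))   # on the order of 1e-21
```

The ordering P3 < P1 < P2 holds for any k greater than 1, since P3 adds the extra condition that nothing else mutates and P2 relaxes the condition to any site.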

In the history of humans as a species, I think it is a good bet that such an event has never happened (you will need to wait a few billion years). I am also not sure why you think such a number is relevant to a discussion of evolution or creation. Every person that has any particular mutation (either detrimental or beneficial) almost certainly has other mutations (they will typically have another 99 or so mutations). Therefore, I do not see why you think P3 is relevant to anything. It is certainly not a number mentioned in discussions of genetics (in my memory).

Veteran Member

• Veteran Member
• 1,196 posts
• Age: 43
• Christian
• Creationist
• Rio de Janeiro

Posted 31 May 2008 - 04:53 AM

First, I think you must admit that I was very clear about the fact that I was calculating the probability of a mutation happening in a specific position, and you presented an equation that is used to calculate a mutation in any place. You insisted that it was the probability of a mutation happening in a specific position, which is not true. I used the example of it happening only in that specific position and not in any other position because it was the only way I could make you understand. But that was not what I was calculating. I was calculating the probability that at least one of the mutations happens in that specific position. That is vanishingly small too, but it's greater than the probability of it happening in that location and not in any other. To make it clear I will use the cards example:

1) 1/52*51/52*51/52 = 2601/140608 ( only the first person draws )
2) 1/52*1/52*51/52 = 51/140608 ( the first and second persons draw )
3) 1/52*1/52*1/52 = 1/140608 ( all three persons draw )
4) 1/52*51/52*1/52 = 51/140608 ( the first and the third persons draw )
5) 51/52*1/52*1/52 = 51/140608 ( the second and the third persons draw )
6) 51/52*1/52*51/52 = 2601/140608 ( only the second person draws )
7) 51/52*51/52*1/52 = 2601/140608 ( only the third person draws )
8) 51/52*51/52*51/52 = 132651/140608 ( no one draws )

a) p = 1/52*51/52*51/52 = 2601/140608 = 0.0184982 ( only the first person draws )

Now, I want at least the first person to draw. So I have to sum (1, 2, 3, 4):

b) p = (2601+51+1+51)/140608 = 2704/140608 = 0.01923

At last, I want at least one person to draw. So I have to sum (1, 2, 3, 4, 5, 6, 7):

c) p = (2601+51+1+51+51+2601+2601)/140608 = 7957/140608 = 0.05659

Your equation: p = 1 - (1 - 1/52)^3 = 0.05659.

Notice that your equation is equal to the result of letter 'c'. I was calculating letter 'b'. That's very important to evolution.

Another thing I want to consider is that the kind of study that uses hemophilia to confirm mutation rates has a serious problem.

They are trying to equate the probabilities of two different things.

The mutation rate measures the frequency of mutations, and it's related to the number of base duplications. The probability that a mutation happens in a specific position is related to the number of positions, so if we have 3*10^9 positions then the sum of the probabilities of each position must be 1. If hemophilia has a probability of 10^-8, it means that the probability of a mutation happening in the position where it can cause hemophilia is greater than in the other positions. So there have to be positions with probability lower than 1/(3*10^9). It would be very interesting to discover which positions they are.

### #51 numbers

numbers

Troll

• Banned
• 228 posts
• Age: 37
• no affiliation
• Agnostic
• Houston

Posted 31 May 2008 - 11:54 AM

Jumping in here, james is correct in saying the odds of a mutation in a specific place are the same 10^-8 as the odds of a mutation happening to any specific nucleotide.

Your own math shows this if you simply reconvert back to fractions. Convert your b ) value back to fractions and you'll see that it's the original 1/52 value of person 1 drawing an ace. The reason this is true is because each trial is independent. It doesn't matter what else happens in the genome if we are only concerned about whether a mutation happens in a specific spot. If the odds of each individual nucleotide mutating is 10^-8 then the odds of a mutation happening in a specific spot are also 10^-8.

Think of it this way. Imagine a row of 1 million coins. What are the odds that flipping the 456,087th coin will give heads? It's 1/2. Summing the probabilities for a million coins doesn't give 1; it gives the expected number of heads or tails. It doesn't matter how many coins are being flipped, the odds of heads at a specific position will always be the odds of a single coin being heads. Similarly, the odds of a mutation happening in the 456,087th nucleotide will always be the odds of a single nucleotide mutating. In this case we are using the 10^-8 value for the likelihood of an individual nucleotide being copied incorrectly. If you presented a different value for individual nucleotides I missed it in the thread.
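
To make that concrete, here is a small brute-force sketch (the helper name `marginal` is just for illustration). It sums the probability of every outcome in which one chosen position has the event, whatever happens at the other positions, and compares that with n*p, the expected number of events. It assumes independent trials with the same probability at every position.

```python
from fractions import Fraction
from itertools import product

def marginal(p, n, position):
    """P(event at `position`), summed over every outcome of the other n-1 trials."""
    total = Fraction(0)
    for outcome in product([True, False], repeat=n):
        if outcome[position]:
            prob = Fraction(1)
            for hit in outcome:
                prob *= p if hit else 1 - p
            total += prob
    return total

p = Fraction(1, 52)
for n in (1, 2, 3, 6):
    # The marginal stays at 1/52 for every n; n*p is the expected count, not a probability
    print(n, marginal(p, n, 0), n * p)
```

The enumeration is exponential in n, so it is only practical for small groups, but the result does not depend on n at all.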

Veteran Member

• Veteran Member
• 1,196 posts
• Age: 43
• Christian
• Creationist
• Rio de Janeiro

Posted 31 May 2008 - 03:45 PM

Excuse me, numbers, but it seems that your knowledge of statistics is lower than jamesf's. I spent a lot of time making him understand, and he's already convinced of that; I won't waste my time trying to convince you too. But I'll give you a list of things that, if you understand them, may make you realise why you are wrong.

1 - Difference between number of possibilities and number of tries.
2 - Difference between probability 'a priori' and probability 'a posteriori'
3 - Bayes Theorem.

Good Luck

### #53 numbers

numbers

Troll

• Banned
• 228 posts
• Age: 37
• no affiliation
• Agnostic
• Houston

Posted 31 May 2008 - 10:18 PM

Rather than try to argue about who understands less about statistics, let's try to figure out where the errors are.

P(mutation)=#mutations/size of genome
I think your error is in treating mutation rate as if # mutations did not vary proportionately to genome size. There is absolutely no evidence that this is the case and you have provided no reason to think this is true. Instead evidence suggests that P(mutation) is more or less constant in the realm of 1 mistake for every 200-300 million nucleotides. James provided links to two studies showing this. The below example is the simplest I can think of to try and show where you are going wrong.

Chance of picking a specific card=1/52
Chance of not picking a specific card=1-(1/52)
Assuming each draw is independent.

What are the odds of person #1 in a group of 1 picking an ace of spades from a deck of cards?
1/52.

What are the odds of person #1 in a group of 2 picking an ace of spades from a deck of cards?
1/52.

What are the odds of person #1 in a group of 3 picking an ace of spades from a deck of cards?
1/52.

What are the odds of person #1 in a group of 3,000,000,000 picking an ace of spades from a deck of cards?
1/52.

What are the odds of person #1,304,086,057 in a group of 3,000,000,000 picking an ace of spades from a deck of cards?
1/52.

DNA Example (exactly the same as your card example):
Chance of copy error (aka mutation). 1.8x10^-8
Chance of not making a copy error. 1-(1.8x10^-8)
Assuming each nucleotide is independent.

What are the odds of nucleotide #1 in a genome with size=1 making a copy error?
1.8x10^-8.

What are the odds of nucleotide #1 in a genome with size=2 making a copy error?
1.8x10^-8.

What are the odds of nucleotide #1 in a genome with size=3 making a copy error?
1.8x10^-8.

What are the odds of nucleotide #1 in a genome with size=3,000,000,000 making a copy error?
1.8x10^-8.

What are the odds of nucleotide #1,304,086,057 in a genome with size=3,000,000,000 making a copy error?
1.8x10^-8.

You should be able to see that the card example is correct. Hopefully it clarifies why you're wrong in thinking position matters to mutation probability. If it isn't clear, start with the genome size=1 example and figure out what the probability is for a mutation. Then work up to genome size=2. See if the probabilities match; if they don't, please show your math so I can see what you're actually trying to calculate.
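For anyone who wants to try the numbers, here is a minimal sketch assuming independent sites and the 1.8x10^-8 figure above. The helper names are hypothetical. It keeps three different quantities apart: the chance of an error at one given site, the expected number of errors in the whole genome, and the chance of at least one error anywhere.

```python
import math

P_COPY_ERROR = 1.8e-8  # per-nucleotide copy-error probability quoted in the post


def per_site_probability(site, genome_size, p=P_COPY_ERROR):
    """P(copy error at one given site); with independent sites, genome_size drops out."""
    if not 1 <= site <= genome_size:
        raise ValueError("site must lie inside the genome")
    return p


def expected_mutations(genome_size, p=P_COPY_ERROR):
    """Expected number of copy errors across the whole genome."""
    return genome_size * p


def p_at_least_one(genome_size, p=P_COPY_ERROR):
    """P(at least one copy error anywhere) = 1 - (1 - p)**genome_size."""
    # log1p/expm1 keep precision when p is tiny.
    return -math.expm1(genome_size * math.log1p(-p))


for size in (1, 2, 3, 3_000_000_000):
    print(size, per_site_probability(1, size), expected_mutations(size), p_at_least_one(size))
```

Only the last two columns change with genome size; the per-site column does not.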

Veteran Member

• Veteran Member
• 1,196 posts
• Age: 43
• Christian
• Creationist
• Rio de Janeiro

Posted 01 June 2008 - 04:50 AM

Rather than try to argue about who understands less about statistics, let's try to figure out where the errors are.

P(mutation)=#mutations/size of genome
I think your error is in treating mutation rate as if # mutations did not vary proportionately to genome size.  There is absolutely no evidence that this is the case and you have provided no reason to think this is true.  Instead evidence suggests that P(mutation) is more or less constant in the realm of 1 mistake for every 200-300 million nucleotides.  James provided links to two studies showing this.  The below example is the simplest I can think of to try and show where you are going wrong.

Chance of picking a specific card=1/52
Chance of not picking a specific card=1-(1/52)
Assuming each draw is independent.

What are the odds of person #1 in a group of 1 picking an ace of spades from a deck of cards?
1/52.

What are the odds of person #1 in a group of 2 picking an ace of spades from a deck of cards?
1/52.

What are the odds of person #1 in a group of 3 picking an ace of spades from a deck of cards?
1/52.

What are the odds of person #1 in a group of 3,000,000,000 picking an ace of spades from a deck of cards?
1/52.

What are the odds of person #1,304,086,057 in a group of 3,000,000,000 picking an ace of spades from a deck of cards?
1/52.
DNA Example (exactly the same as your card example):
Chance of copy error (aka mutation). 1.8x10^-8
Chance of not making a copy error. 1-(1.8x10^-8)
Assuming each nucleotide is independent.

What are the odds of nucleotide #1 in a genome with size=1 making a copy error?
1.8x10^-8.

What are the odds of nucleotide #1 in a genome with size=2 making a copy error?
1.8x10^-8.

What are the odds of nucleotide #1 in a genome with size=3 making a copy error?
1.8x10^-8.

What are the odds of nucleotide #1 in a genome with size=3,000,000,000 making a copy error?
1.8x10^-8.

What are the odds of nucleotide #1,304,086,057 in a genome with size=3,000,000,000 making a copy error?
1.8x10^-8.
You should be able to see that the card example is correct.  Hopefully it clarifies why you're wrong in thinking position matters to mutation probability.  If it isn't clear, start with the genome size=1 example and figure out what the probability is for a mutation.  Then work up to genome size=2. See if the probabilities match; if they don't, please show your math so I can see what you're actually trying to calculate.

I didn't say that position matters to mutation probability. I said that the probability that a mutation happens is different from the probability that a mutation happens in a specific location.

1 - We have a mutation 'a priori' = 10^-8
2 - We have a mutation 'a posteriori' = 3 * 10^-9 (positions of genome)

Imagine you have 10 people with 10 decks of cards. If three aces of spades were drawn, what is the probability that those cards were drawn by persons #1, #5 and #8?

Bayes' theorem

### #55 numbers

numbers

Troll

• Banned
• 228 posts
• Age: 37
• no affiliation
• Agnostic
• Houston

Posted 01 June 2008 - 10:17 PM

I didn't say that position matters to mutation probability. I said that the probability that a mutation happens is different from the probability that a mutation happens in a specific location.

You keep claiming this but never actually bother to provide any math to back it up. Trying to use Bayes' theorem with independent variables like genome size and mutation probability will simply result in P(B) cancelling out, leaving the rather trivial result P(A|B)=P(A).

1 - We have a mutation 'a priori' = 10^-8
2 - We have a mutation 'a posteriori' = 3 * 10^-9 (positions of genome)

Imagine you have 10 people with 10 decks of cards. If three aces of spades were drawn, what is the probability that those cards were drawn by persons #1, #5 and #8?

Bayes' theorem

P(A|B)=P(B|A)*P(A)/P(B)

A=select 3 out of 10
B=3 mutations

In your example P(A|B) will simply be 3/10, which is the same as P(A).

This relationship P(A|B)=P(A) is the condition that indicates statistical independence. When this condition is true the two variables A and B are independent of each other. Furthermore, because of this statistical independence it's also true that P(mutation|position)=P(mutation), which is what you have been trying unsuccessfully to deny. Using Bayes' theorem is just a more complicated way of showing the same thing I've pointed out in my earlier posts.

To put it in your own terms "the probability of a mutation happens is NOT different from the probability that mutation happens in a specific location". If you disagree please try and calculate P(mutation|position) in such a way that you don't get P(mutation).
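One way to make the independence condition concrete is to write out a small joint table and read both conditionals off it. The sketch below rests on assumptions made purely for illustration: one of 10 positions is picked uniformly, and the event (a specific card, or a mutation) has probability 1/52 whichever position is picked. All names are hypothetical.

```python
from fractions import Fraction

N_POSITIONS = 10
P_EVENT = Fraction(1, 52)  # assumed to be the same at every position

# Joint distribution over (position, event happened?) under independence.
joint = {}
for pos in range(1, N_POSITIONS + 1):
    p_pos = Fraction(1, N_POSITIONS)
    joint[(pos, True)] = p_pos * P_EVENT
    joint[(pos, False)] = p_pos * (1 - P_EVENT)


def marginal_event():
    """P(event), summed over all positions."""
    return sum(p for (_, hit), p in joint.items() if hit)


def marginal_position(pos):
    """P(position), summed over event / no event."""
    return joint[(pos, True)] + joint[(pos, False)]


def event_given_position(pos):
    """P(event | position) = P(position, event) / P(position)."""
    return joint[(pos, True)] / marginal_position(pos)


def position_given_event(pos):
    """P(position | event) = P(position, event) / P(event)."""
    return joint[(pos, True)] / marginal_event()


print(event_given_position(10), marginal_event())       # both equal P(event)
print(position_given_event(10), marginal_position(10))  # both equal P(position)
```

Under these assumptions each conditional collapses to its own marginal, which is the P(A|B)=P(A) condition written out for both directions.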

Veteran Member

• Veteran Member
• 1,196 posts
• Age: 43
• Christian
• Creationist
• Rio de Janeiro

Posted 02 June 2008 - 02:45 AM

You keep claiming this but never actually bother to provide any math to back it up. Trying to use Bayes' theorem with independent variables like genome size and mutation probability will simply result in P(B) cancelling out, leaving the rather trivial result P(A|B)=P(A).
P(A|B)=P(B|A)*P(A)/P(B)

A=select 3 out of 10
B=3 mutations

In your example P(A|B) will simply be 3/10, which is the same as P(A).

The correct formula is:

P(Ai) = 1/10 and P(B) = 3/10

P(Ai|B) = P(Ai) * P(B|Ai) / [ P(A1) * P(B|A1) + P(A2) * P(B|A2) + ... + P(An) * P(B|An) ]

In the above case the result will be 1/10, not 3/10.

This relationship P(A|B)=P(A) is the condition that indicates statistical independence. When this condition is true the two variables A and B are independent of each other. Furthermore, because of this statistical independence it's also true that P(mutation|position)=P(mutation), which is what you have been trying unsuccessfully to deny. Using Bayes' theorem is just a more complicated way of showing the same thing I've pointed out in my earlier posts.

To put it in your own terms "the probability of a mutation happens is NOT different from the probability that mutation happens in a specific location". If you disagree please try and calculate P(mutation|position) in such a way that you don't get P(mutation).

P(c) = 1/52
P(p) = 1/10

P(p|c) = (1/10 * 1/52) / ((1/10 * 1/52) * 10)
P(p|c) = (1/520) / ((1/520) * 10)
P(p|c) = (1/520) / (10/520)
P(p|c) = (1/520) / (1/52)
P(p|c) = 52/520 = 1/10

As you can see, the result is the probability that it happens in a specific position, not the probability that it happens. If you replace 1/52 with 10^-8 and 1/10 with (3*10^-9), the result is obvious.
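The arithmetic above can be reproduced with exact fractions. This is a minimal sketch, assuming (as the calculation does) that P(c|p) is the same 1/52 for each of the 10 positions; the variable names are made up for the example.

```python
from fractions import Fraction

p_c = Fraction(1, 52)            # P(c): chance of the specific card
n_positions = 10
p_p = Fraction(1, n_positions)   # P(p): prior for each position

# Bayes' theorem with the law of total probability in the denominator.
# P(c|p_i) is taken to be the same 1/52 for every position i.
numerator = p_p * p_c
denominator = sum(p_p * p_c for _ in range(n_positions))
p_p_given_c = numerator / denominator

print(numerator, denominator, p_p_given_c)  # 1/520 1/52 1/10
```

The denominator comes out as 1/52, i.e. P(c) itself, so the posterior P(p|c) equals the prior P(p) = 1/10.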

### #57 numbers

numbers

Troll

• Banned
• 228 posts
• Age: 37
• no affiliation
• Agnostic
• Houston

Posted 02 June 2008 - 10:03 AM

The correct formula is:

P(Ai) = 1/10 and P(B) = 3/10

I did screw up the odds of the group {1,5,8}; it'll actually be 1/(10C3)=1/120=P(A). If you want to use 1/10 (the odds for 10C1), that's fine too, and much easier, because it's just the odds of picking {1}. I'll use P(A) or P(p) and you can put in whatever number you want.

P(Ai|B) = P(Ai) * P(B|Ai) / [ P(A1) * P(B|A1) + P(A2) * P(B|A2) + ... + P(An) * P(B|An) ]

In the above case the result will be 1/10, not 3/10.
P(c) = 1/52
P(p) = 1/10

P(p|c) = (1/10 * 1/52) / ((1/10 * 1/52) * 10)
P(p|c) = (1/520) / ((1/520) * 10)
P(p|c) = (1/520) / (10/520)
P(p|c) = (1/520) / (1/52)
P(p|c) = 52/520 = 1/10

That's the odds of 1 mutation occurring in 1 of 10 spots. You specified 3 aces, which means the prob=3*P(p|c)=3/10.
Notice that you are calculating P(p|c), whereas the odds of a mutation occurring given a specific position is P(c|p). Also notice that you arrived at the same conclusion I did, P(p|c)=P(p), and that therefore P(c|p)=P(c).

As you can see, the result is the probability that it happens in a specific position, not the probability that it happens. If you replace 1/52 with 10^-8 and 1/10 with (3*10^-9), the result is obvious.

Your answer of 1/10 is the odds of a specific position containing the only mutation that occurs. You can convert by P(p|c)*#mutations=P(c|p). If there are 3 mutations the odds are 3/10 for one of the 3 mutations getting placed in a specific position.

P(p|c)=P(p)=1/genome size

P(c|p)=P(c)=#mutations/genome size
#mutations=P(c)*genome size
P(p)*#mutations=P(c|p)=P(c)

It also doesn't help that your example had 3 mutations in 10 tries even though the expected number of mutations was ~0.2.

If you want to see things clearly you should use a #mutations that is consistent with the probability of mutation. I.e. if P(c) is 1/52, use a number of players=52 and number of aces=1. You'll get P(c) = (number of c) * P(p). It might help to switch to coin flips so you only have to use multiples of 2.
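Here is a rough sketch of that consistency check with exact fractions. The helper names `expected_hits` and `p_hit_at_position` are made up for the example, and it assumes every player draws independently with the same 1/52 chance.

```python
from fractions import Fraction


def expected_hits(n_players, p_hit):
    """Expected number of hits (aces, mutations) among n_players independent tries."""
    return n_players * p_hit


def p_hit_at_position(n_players, n_hits):
    """P(p) * #hits, with P(p) = 1/n_players."""
    return Fraction(1, n_players) * n_hits


p_ace = Fraction(1, 52)

# 10 players: the expected number of aces is well below 1, nowhere near 3.
print(float(expected_hits(10, p_ace)))

# 52 players: the expected number is exactly 1, and P(p) * #hits recovers P(c).
print(expected_hits(52, p_ace))
print(p_hit_at_position(52, expected_hits(52, p_ace)))

# Forcing 3 aces onto 10 players gives 3/10, which no longer matches 1/52.
print(p_hit_at_position(10, 3))
```

With a hit count that matches the per-try probability, P(p)*#mutations lands back on P(c); with an arbitrary count such as 3 in 10 it does not.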

genome size=4 coins

Veteran Member

• Veteran Member
• 1,196 posts
• Age: 43
• Christian
• Creationist
• Rio de Janeiro

Posted 02 June 2008 - 10:14 AM

That's the odds of 1 mutation occurring in 1 of 10 spots. You specified 3 aces, which means the prob=3*P(p|c)=3/10.
Notice that you are calculating P(p|c), whereas the odds of a mutation occurring given a specific position is P(c|p). Also notice that you arrived at the same conclusion I did, P(p|c)=P(p), and that therefore P(c|p)=P(c).

Numbers, what you are saying is: knowing that if a mutation happens it must happen in position 10, what is the probability that a mutation happens in position 10?

Of course, your reasoning is completely flawed. The correct question is:

If a mutation happens, what is the probability that it happens in position 10?

### #59 numbers

numbers

Troll

• Banned
• 228 posts
• Age: 37
• no affiliation
• Agnostic
• Houston

Posted 02 June 2008 - 03:51 PM

Numbers, what you are saying is: knowing that if a mutation happens it must happen in position 10, what is the probability that a mutation happens in position 10?

Of course, your reasoning is completely flawed. The correct question is:

If a mutation happens, what is the probability that it happens in position 10?

Assuming 10 positions and 1 mutation, the probability of position 10 is 1/10. However you are arbitrarily picking numbers of mutations that are inconsistent with the existing probabilities. In the example of 1 mutation in 10 positions the calculated probability of mutation will also be 1/10.

Here's a formal proof that P(prob at location)=P(prob), no extraneous numbers, just logic.

1.) P(mutation)=#mutations/genome size

2.) #mutations=genome size*P(mutation)

3.) P(object at position X) = available objects/available positions
example: 2 pegs 3 holes. P(peg at hole 3)=2/3

4.) available objects=# mutations

5.) available positions= genome size

6.) P(object at position X) = #mutations/genome size

7.) P(object at position X) = P(mutation)

QED

The P(mutation) in the genome is 1 mutation every 200-300 million or ~2x10^-8, therefore the odds of a mutation occurring at position 10 is also ~2x10^-8.
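Step 3 can be checked by brute force on small cases. This is a sketch under two assumptions that the steps above leave implicit: each position holds at most one object (so the number of objects cannot exceed the number of positions), and every placement is equally likely. The function name `p_occupied` is made up for the example.

```python
from fractions import Fraction
from itertools import combinations


def p_occupied(target, n_objects, n_positions):
    """Exact P(position `target` holds an object) over all equally likely placements."""
    if not 0 <= n_objects <= n_positions:
        raise ValueError("need 0 <= n_objects <= n_positions")
    # Each placement is a set of distinct occupied positions, numbered from 1.
    placements = list(combinations(range(1, n_positions + 1), n_objects))
    hits = sum(1 for placed in placements if target in placed)
    return Fraction(hits, len(placements))


print(p_occupied(3, 2, 3))    # 2/3, the pegs-and-holes example
print(p_occupied(10, 1, 10))  # 1/10, one mutation in ten positions
print(p_occupied(10, 3, 10))  # 3/10, three mutations in ten positions
```

In each case the result equals available objects / available positions, which is what step 3 states.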

Veteran Member

• Veteran Member
• 1,196 posts
• Age: 43
• Christian
• Creationist
• Rio de Janeiro

Posted 02 June 2008 - 03:56 PM

Assuming 10 positions and 1 mutation, the probability of position 10 is 1/10. However you are arbitrarily picking numbers of mutations that are inconsistent with the existing probabilities. In the example of 1 mutation in 10 positions the calculated probability of mutation will also be 1/10.
Here's a formal proof that P(prob at location)=P(prob), no extraneous numbers, just logic.

1.) P(mutation)=#mutations/genome size

2.) #mutations=genome size*P(mutation)

3.) P(object at position X) = available objects/available positions
example: 2 pegs 3 holes. P(peg at hole 3)=2/3

4.) available objects=# mutations

5.) available positions= genome size

6.) P(object at position X) = #mutations/genome size

7.) P(object at position X) = P(mutation)

QED

The P(mutation) in the genome is 1 mutation every 200-300 million or ~2x10^-8, therefore the odds of a mutation occurring at position 10 is also ~2x10^-8.

It's irrelevant what the probability of a mutation is. The probability that it happens in a specific position is always 1/(number of positions).

What happens first? The mutation or the position? It's impossible to know the position before it happens. So the probability is: 'given that a mutation happened', what is the probability that it happened in position x? I can't understand how you cannot follow such obvious reasoning.

Look at your equation 6.). If we use the example I gave with the deck of cards, it would be 52 cards and 10 persons (positions), so:

P(object at position X) = 52/10.

That would be absurd, because every probability must be a number between 0 and 1. As I said, you must use Bayes' theorem to solve that.
