World Families Forums - some basic statistics

ironroad41
« on: May 29, 2012, 08:04:42 AM »

I read over on rootsweb a discussion where a person had 4 somewhat rare mutations.  He ascertained that by looking at the percent of persons having that allele value in his Hg.  They then said that since there was a 1% occurrence of each of three of those, the probability of all three simultaneously is .001.

I don't agree with that conclusion.  What you have to do, I believe, is form a probability table from the mutation rates.  For 67 DYS loci, for example, you sum all the rates, and that sum is the probability of a mutation.  Defining 1 - sum as the probability of no mutation completes the set.  See Schaum's Outline of Probability and Statistics for examples.

Therefore, in our case, the probabilities will run from about 10^-2 to 10^-4 for each DYS locus.  For slow, rare mutations like 426, 388, 454, 455, etc., I would then estimate the probability of three of the slower mutations as 10^-12, not 10^-3.

By the way, this is my reasoning for discounting GD: not all DYS loci mutations are equal!
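
A minimal sketch of the probability table described above. The per-generation rates are placeholder values chosen only to span the 10^-2 to 10^-4 range mentioned in the post; they are not measured rates, and the function name is hypothetical. Independence between loci is assumed.

```python
# Hypothetical per-generation mutation rates (placeholders, not measured values).
rates = {"DYS426": 1e-4, "DYS388": 1e-4, "DYS454": 1e-4, "CDYa": 1e-2}


def probability_table(rates):
    """Return P(mutation at each locus) plus the leftover P(no mutation)."""
    table = dict(rates)
    # The post's approximation: P(no mutation) = 1 - sum of the rates.
    table["none"] = 1.0 - sum(rates.values())
    return table


table = probability_table(rates)

# Joint probability that three specific slow loci all mutate in the same
# generation, assuming the loci mutate independently.
joint_slow = rates["DYS426"] * rates["DYS388"] * rates["DYS454"]

print(table["none"])   # close to 0.9897
print(joint_slow)      # about 1e-12
```

With three slow loci at 10^-4 each, the joint per-generation probability comes out near 10^-12, which is the figure quoted in the post.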
« Last Edit: May 29, 2012, 08:05:43 AM by ironroad41 »
Mike Walsh
« Reply #1 on: May 29, 2012, 09:40:42 AM »

I'm not following along with this Rootsweb discussion, but I agree with you that Genetic Distance counting between pairs of haplotypes can be misleading, particularly when looking at only a small number of haplotypes.  When we look at just one person, like you are in the case above, we are in danger of drowning while crossing the river with an average depth of 3 feet.  Where I am from we have a river described as a mile wide and foot deep, but I can tell you first hand that there are dangerous channels where the current is swift and the depth can be up to 10-15 feet. I don't think statistically calculated probabilities can be successfully applied to a sample of one.

However, I absolutely agree with you that all STRs are not equal. In some cases I treat them differently.  On all of the multi-copy STRs I use forms of modified infinite-allele counting rather than straight one-increment-equals-one-step counting. In this way, I do make some effort at normalizing between STRs.  Still, you are quite correct: some STRs rarely mutate in comparison to others.  The problem is that no matter how slowly an STR mutates, the mutation could have happened in the last generation.
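
To make the two counting rules concrete, here is a rough sketch. The marker names, the choice of which markers count as multi-copy, the single value per multi-copy marker, and the function names are all illustrative assumptions; this is not the exact rule set described in the post.

```python
# Markers treated as multi-copy in this sketch (an assumption for illustration).
MULTI_COPY = {"DYS464", "CDY"}


def stepwise_distance(a, b):
    """Stepwise counting: each repeat of difference counts as one step."""
    return abs(a - b)


def infinite_allele_distance(a, b):
    """Infinite-allele counting: any difference counts as a single step."""
    return 0 if a == b else 1


def hybrid_gd(h1, h2):
    """Sum per-marker distances, using infinite-allele counting on multi-copy markers."""
    total = 0
    for marker in h1:
        if marker in MULTI_COPY:
            total += infinite_allele_distance(h1[marker], h2[marker])
        else:
            total += stepwise_distance(h1[marker], h2[marker])
    return total


h1 = {"DYS393": 13, "DYS390": 24, "CDY": 36}
h2 = {"DYS393": 13, "DYS390": 25, "CDY": 39}

print(hybrid_gd(h1, h2))  # 2: one step at DYS390, one event at CDY
```

Straight stepwise counting on the same pair would give 4, because the three-repeat difference at the multi-copy marker would count as three steps.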

This is why I think it is important to look at STR off-modal signatures for haplogroups, predicted deep ancestral varieties and predicted clusters. You can get some idea of the sequence of the mutations. Sometimes it is obvious that a very slow STR mutated recently and should be discounted.  For example, say you have ten people with the same last name who match on some high number of STRs, say 63 or 64 out of 67. If just one of the ten has a very slow marker that is off-modal from the rest and from the haplogroup, it is reasonable to predict that it is probably a recent mutation.
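
The ten-person example can be sketched as follows. The haplotypes, marker values and function names are made up for illustration, and only three markers are shown instead of 67.

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """Return the most common value at each marker across the group."""
    markers = haplotypes[0].keys()
    return {m: Counter(h[m] for h in haplotypes).most_common(1)[0][0] for m in markers}


def off_modal_singletons(haplotypes, slow_markers):
    """List (index, marker) pairs where exactly one member is off-modal at a slow marker."""
    modal = modal_haplotype(haplotypes)
    hits = []
    for m in slow_markers:
        off = [i for i, h in enumerate(haplotypes) if h[m] != modal[m]]
        if len(off) == 1:
            hits.append((off[0], m))
    return hits


# Ten hypothetical surname matches; only person 7 differs at slow marker DYS426.
group = [{"DYS426": 12, "DYS388": 12, "DYS576": 18} for _ in range(10)]
group[7]["DYS426"] = 13
group[2]["DYS576"] = 19

print(off_modal_singletons(group, slow_markers=["DYS426", "DYS388"]))  # [(7, 'DYS426')]
```

A lone off-modal value at a slow marker inside an otherwise tight group is the pattern the post suggests treating as a probable recent mutation.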

I think that while GD calculations can be misleading, they can also be helpful as another tool to evaluate groups of people. We don't need to throw the baby out with the bath water, but we should cross-check outcomes with other methods.

Other tools that can provide cross-checking for possible errors or anomalies include deep clade testing to the terminal SNP and testing out to more STRs.
« Last Edit: May 29, 2012, 09:45:06 AM by Mikewww »

R1b-L21>L513(DF1)>L705.2
ironroad41
« Reply #2 on: May 29, 2012, 10:41:14 AM »

If possible I would like to extend this line of thinking.  The mutation data is usually per generation, so the probabilities are also per gen.  Further, the reference for the distance is usually the modal value of the set of data which can be misleading?  Especially so, if a "bottleneck" occurred in the Hg under analysis.

At least two events occur in a bottleneck: a. the diversity of the hg is severely decreased. b.  It takes a long time for the population to recover and begin to grow at a reasonable rate.  This can result in the Hg before the "disaster" having a different set of modal values than after.

Another issue that I have observed is that there appears to be a discontinuity in the modal values from Hg E to Hg R?  I will be discussing Hgs E3a, E3b, G, I, J2, R1a, R1b.  From the table of modal values vs. Hg at rootsweb.ancestry.com, I observe the following:

Hg:      E3a  E3b  G      I   J2  R1a  R1b
DYS393:  13   13   14     13  12  13   13
DYS388:  12   12   12/13  14  15  12   12

I have cherry-picked these patterns, as there are many different patterns also.  But sometimes I ask myself how the Hg tree evolved, and whether the time between the occurrence of Hgs is the same.  I don't think so; it appears more like there was Hg E, and then G to R occurred almost in parallel?

What I'm trying to come to grips with here is how Hgs evolved and how do we understand what outlier haplotypes tell us about the evolution of the haplogroup?

Extending the analysis of my first post in this series, I am saying that the probability of a haplotype having non-modal values at 3 or 4 "rare" DYS loci is extremely small, which suggests that a large period of time has passed since the Hg began?  Note: in this discussion, clade/sub-clade can be substituted for Hg.
« Last Edit: May 29, 2012, 10:43:15 AM by ironroad41 »
ironroad41
« Reply #3 on: May 29, 2012, 10:50:45 AM »

However, I absolutely agree with you that all STRs are not equal. In some cases I treat them differently.  On all of the multi-copy STRs I use forms of modified infinite-allele counting rather than straight one-increment-equals-one-step counting. In this way, I do make some effort at normalizing between STRs.  Still, you are quite correct: some STRs rarely mutate in comparison to others.  The problem is that no matter how slowly an STR mutates, the mutation could have happened in the last generation.



I have observed in the Ian Cam data a mutation at 426 which is extremely rare.  So your last comment is correct.  That said, we are trying to understand combinations of rare allele values.  In that context, and assuming independent events, the probability of two or more rare mutations goes as the product of the independent probabilities.  It would appear to increase in probability as we consider more generations, but as soon as we consider 100 or 1000 generations we are talking about long periods of time.
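
A small sketch of how that product behaves as generations accumulate. The three rates are placeholders for slow loci, the function names are hypothetical, and back-mutations are ignored; the loci are assumed independent, as in the post.

```python
def p_mutated(rate, generations):
    """P(at least one mutation at a locus over the given number of generations)."""
    return 1.0 - (1.0 - rate) ** generations


def p_all_mutated(rates, generations):
    """P(every listed locus has mutated at least once), assuming independence."""
    p = 1.0
    for r in rates:
        p *= p_mutated(r, generations)
    return p


slow = [1e-4, 1e-4, 1e-4]  # placeholder rates for three slow loci

for n in (1, 100, 1000):
    print(n, p_all_mutated(slow, n))
```

Under these assumptions the joint probability climbs from about 10^-12 for a single generation to a little under 10^-3 at 1000 generations, so it does grow with time but stays small.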
ironroad41
« Reply #4 on: May 30, 2012, 01:09:26 PM »

I am continuing a discussion about diversity on this board, because it seems more appropriate.  I have, in the past, on other forums, proposed a model for the mutational process.  I can't comment on the apparent memory requirement or how knowledge of what mutations have occurred in a line is retained; that is the subject more appropriate to a different forum.

I suggest that the mutational process can be viewed as a preprogrammed slot machine.  It has 25, 37, 67, 111 or what have you different channels (equivalent to the DYS loci we study).  The arm of the machine is pulled at every meiosis and a similar or new pattern is observed.  Each channel (DYS locus) has a different rate of mutation, and the range of rates across the channels exceeds 100:1 or so.  All of this is consistent with father/son observations and analysis of haplotypes.

As Mike commented above, it is possible to observe a very slow mutation at any time.  The better comment is when will the next one occur?  The gambler, who plays slot machines, knows that the probability of two big payoffs in a row is minuscule.  If it weren't, the house would be broke.  So, most pulls of the arm result in no change and the faster mutators occur more frequently, which, again, is consistent with observations.
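
The slot-machine picture can be put into numbers with a short sketch. The five channels and their rates are placeholders (a real panel has 25 to 111 markers), and the channels are assumed to be independent of each other and from one pull to the next.

```python
import math

# Placeholder rates for a handful of "channels"; real panels have 25 to 111.
channel_rates = {"CDYa": 1e-2, "DYS576": 1e-2, "DYS439": 5e-3, "DYS388": 1e-4, "DYS426": 1e-4}

# One pull of the arm = one meiosis. Assuming channels are independent,
# the chance that nothing changes is the product of (1 - rate).
p_no_change = math.prod(1.0 - r for r in channel_rates.values())

# Chance of hitting the same channel on two successive pulls.
p_twice = {name: r * r for name, r in channel_rates.items()}

print(p_no_change)
print(p_twice["CDYa"] / p_twice["DYS388"])  # fast vs slow: about 10,000 times more likely
```

With these made-up rates, roughly 97 to 98 pulls in 100 show no change, and a repeat hit on a fast channel is about four orders of magnitude more likely than a repeat hit on a slow one.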

I think a direct conclusion from this model is that outlier haplotypes are very old and rare, and that the GD model for estimating separation in time is flawed.  JMHO.
« Last Edit: May 30, 2012, 01:10:54 PM by ironroad41 »
Jdean
« Reply #5 on: May 30, 2012, 02:00:56 PM »

I think a direct conclusion from this model is that outlier haplotypes are very old and rare

and conversely those closest to the modal very young and common ?

Y-DNA R-DF49*
MtDNA J1c2e
Kit No. 117897
Ysearch 3BMC9

ironroad41
« Reply #6 on: May 30, 2012, 02:11:29 PM »

That's a good comment.  Off the top, I think I agree, but I'll mull it over.
razyn
« Reply #7 on: May 30, 2012, 02:21:09 PM »

I think a direct conclusion from this model is that outlier haplotypes are very old and rare

and conversely those closest to the modal very young and common ?

Isn't that "by definition," in that these "modals" of which we speak are constructed from the values most prevalent -- statistically -- in the modern population?  Which could be paraphrased as young, and common... although that rarely is mentioned.

This contrast set also highlights the inherent meaninglessness of two of Gioiello's Golden Rules: that marker values cluster around the modal, except when they head for the tangent, and don't.  Mike W pointed this out a few weeks ago, a little more subtly.  Something to the effect that "you've covered all the bases."  Mike is from a baseball family.

R1b Z196*
Jdean
« Reply #8 on: May 30, 2012, 02:27:28 PM »

no I wasn't being very subtle

I suppose I should say I was throwing a googly :)
ironroad41
« Reply #9 on: May 30, 2012, 04:02:34 PM »

Sorry, I don't appear to be as good at games as razyn and yourself.  I don't think Gioiello's rules are broken.  What is important in answering your "subtle" question is also to ask what the mutation rate is of the DYS loci you are referring to.

Being off-modal at a fast mutator carries very little information.  Being off-modal at a very slow mutator does.  You can blithely use the word random and say it occurred by normal chance, or you can find a smaller subset with an odd set of values at the slower mutators.  What I think Gioiello is getting at is that many of the slower mutators only have a modal +/- 1 set of values.  If they have had two mutations over the time of the haplotype, they null out and no change is observed (hidden mutation).  This reduces the apparent diversity, which is your measure of time?
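
As a rough check on the hidden-mutation point, the sketch below computes the chance that k single-step mutations at one locus cancel out exactly. It assumes every mutation is a single step, up with probability p_up and down otherwise, independently each time; the function name and the default of equal up/down chances are assumptions.

```python
from fractions import Fraction
from math import comb


def p_net_zero(k, p_up=Fraction(1, 2)):
    """P(k single-step mutations cancel out exactly), each step up with probability p_up."""
    if k % 2:
        return Fraction(0)  # an odd number of steps can never cancel
    half = k // 2
    return comb(k, half) * p_up**half * (1 - p_up)**half


print(p_net_zero(2))  # 1/2
print(p_net_zero(4))  # 3/8
```

Under this simple model, half of all two-mutation histories at a locus leave no visible change, which is the sense in which hidden mutations lower the apparent diversity.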

My point is what is your diversity trying to show?  Age.  No way.  Relative age, possibly, but until you understand the rules for mutations, you have to be careful.

I now better understand why Marko H. dropped off these forums.  I think I'll join him.  good luck in your "studies"!!!
Jdean
« Reply #10 on: May 30, 2012, 04:20:54 PM »

I'm not referring to one locus, that would be pointless

case in point

in the Z18 project the closest person to the extended WAMH modal at 111 loci has almost half the GD from that modal as the person who's the furthest.

The closest is Z18+, Z14+, Z372- whilst the furthest is Z18+, Z14+, Z372+, L257+

Claiming one of these has a younger or older haplotype is meaningless, and in fact they are just as rare as each other. What they are, however, is opposite ends of a normal statistical distribution, and most people are somewhere in the middle, as you should expect.

BTW the GDs were measured using the hybrid mutation model but almost all of the values were only one step from the modal value anyway.
« Last Edit: May 30, 2012, 04:23:16 PM by Jdean »
Mike Walsh
« Reply #11 on: May 30, 2012, 08:56:59 PM »

j....
My point is what is your diversity trying to show?  Age.  No way.  Relative age, possibly, but until you understand the rules for mutations, you have to be careful. ...

You are contradicting yourself to some extent. If diversity can show relative age with some consistency, then all we need is the right scale to calibrate to and voilà! We have real time... or we can run multiple scenarios, depending on the mutation rates you like.
« Last Edit: May 30, 2012, 08:57:48 PM by Mikewww »
ironroad41
« Reply #12 on: May 31, 2012, 10:00:01 AM »

In the context that you think of GD, it would be meaningless and wrong.  I suppose you think the mutational process is a random process.  If you do, please explain your reasoning.  I don't know if you have done any work with random number generators, but it is a tough problem to generate a completely random data set.  What kills many of the PRNGs is correlation.  Well, I may seem dumb to you, but I can flat out tell you that the mutational process is correlated!

E.g., your boss likes to hand out "attaboys" and "aw craps".  Now in his mind 10 attaboys = 1 aw crap.  Substitute CDYa,b for attaboys and 426, 388, 454, etc. for aw craps.  That's how much sense GD makes.

I don't believe we have a random process: each mutational event is not equally likely, and the probability of two similar events in succession is much higher for faster mutators than for slower ones.
ironroad41
« Reply #13 on: May 31, 2012, 11:06:56 AM »

This is the first break in the ISOGG tree: http://www.scirp.org/journal/PaperInformation.aspx?paperID=19566
 
Our haplogroups C and ff. are not descended from out of Africa.  Let's have a rewrite, shall we??

edit:  A subsequent post by L.G.Mayka, clarifies this issue and negates the above assertion.  It is also negated by a subsequent poster.  The succession appears to be: Adam, A0, BT (and this is where R lies.).  Note:  this was a paper by Klyosov!
« Last Edit: May 31, 2012, 11:37:26 AM by ironroad41 »
Jdean
« Reply #14 on: May 31, 2012, 12:52:46 PM »


Just because we have a mixed bag of loci with different mutation rates doesn’t stop the process from being random; neither does the apparent fact that the chance of an upward mutation is slightly greater than that of a downward one.

I wasn't talking about genetic distance but rather following on from your conversation regarding outlier haplotypes and your conviction that they were therefore old, I think I made my point very well.
ironroad41
« Reply #15 on: May 31, 2012, 03:27:18 PM »

Just give me your definition of a random process and show me how a sequence of mutations is random?
ironroad41
« Reply #16 on: May 31, 2012, 04:43:25 PM »

I can provide an estimate of the number of mutations in the Ian Cam of Clan Gregor.  The founder was born c. 1350 AD.  There are about 75 entries in the set as of now.  Here are some WAGs at the number of mutations at each of the first 37 DYS loci, excluding the paired loci such as CDY a, b (which are almost uncountable).  Note: I have tried to identify unique mutational events as defined by C. Kerchner.

1. 1 mutation at: 389i, 448, 460, 442, 393, 426, 455.

2. 2 mutations: 439.

3. 4 mutations: 607, 389ii.

4. Many mutations: 6 @ 458, 9 @ 449, 7 @ H4, 16 @ 456, 21 @ 576 and 8 @ 570.

NO way is this the output of a random process!  As you might expect 388, 454 and other slow mutators have had no apparent mutations.

If I were a gambling man, I would first pick no mutation, then CDYa,b and then 576 etc. to lay my money on. The house would love it if you bet on 388 or similar dys loci.
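
A quick tally of the counts listed above, as a sketch. Two readings are assumptions: the post's "2389ii" is taken to mean 389ii, and "4 mutations" is taken to mean four at each of the two listed markers.

```python
# Estimated mutation counts copied from the post (first 37 markers, CDY excluded).
counts = {
    "389i": 1, "448": 1, "460": 1, "442": 1, "393": 1, "426": 1, "455": 1,
    "439": 2,
    "607": 4, "389ii": 4,
    "458": 6, "449": 9, "H4": 7, "456": 16, "576": 21, "570": 8,
}

total = sum(counts.values())
# Rank markers by their share of all observed mutations.
shares = sorted(((n / total, m) for m, n in counts.items()), reverse=True)

print(total)       # 84
print(shares[0])   # (0.25, '576')
```

On this reading, marker 576 alone accounts for a quarter of the estimated events, and the six markers in the "many mutations" group account for 67 of the 84.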
Jdean
« Reply #17 on: May 31, 2012, 06:18:24 PM »

If you were a gambling man, how would you be betting on a system that wasn't random, and who would you go to to place your bet?

PS what odds would you accept :)

PPS and why do you have a list of preferences? If the process wasn't random there would be no choice.
« Last Edit: May 31, 2012, 06:57:03 PM by Jdean »
ironroad41
« Reply #18 on: June 01, 2012, 06:58:15 AM »

Well, it appears to be argumentative?  Here is a post by Didier supporting Klyosov's work:

From: didier.vernade@Safe-mail.net
Subject: Re: [DNA] "Out of Africa" and R1b new papers published
Date: Thu, 31 May 2012 19:02:54 -0400

I read the 2 papers and here is my output.
 
Reminder :
 1 - Re-Examining the "Out of Africa" Theory and the Origin of Europeoids (Caucasoids) in Light of DNA Genealogy
 http://www.scirp.org/journal/PaperInformation.aspx?paperID=19566
 
2 - Ancient History of the Arbins, Bearers of Haplogroup R1b, from Central Asia to Europe, 16,000 to 1500 Years before Present
 http://www.scirp.org/journal/PaperInformation.aspx?paperID=19567
 
Paper 1 is a very interesting paper. Figure 3, in my opinion, is a breakthrough and probably the major breakthrough brought by A. Klyosov. I do have criticisms but I want first to make sure that my later criticisms will not be taken as dismissing this paper.
My main "difficulty" with this paper is about the search for an alternative geographical origin for "Adam" (or whatever name you give to this MRCA). Best would be not to propose any alternative and only to point out that the shape of the tree in Figure 3 and the timing suggest that Alpha and Beta had different geographical localizations. I also do not see the point of the (too) long discussion of the SNPs. I understand that many people still have the ancient M91 origin in mind, but it doesn't add anything to the paper. Last, there might be some discussion of how, from the 4 haplogroup A clusters, acknowledged to be very different, a base haplotype was "found"; a difference in 1 or 2 values might affect the timing, but I do admit that it wouldn't change it very much.
 
Paper 2 is very difficult to read. Probably because A. Klyosov had to update a story presented many times and he wanted to include new data from several different sources, often a few haplotypes here and there, to the global picture.
 Let me get to the point. I never accepted as established the migration of R1b-M269 by a north African route, and I really think that the data presented do not support this view. First, I would like to point out that I don't think the map in Figure 10 is fair. As everyone knows, there are plenty of R-L23 and R-L51 in eastern Europe and in the Balkans, and this map more or less minimizes this fact. I understand that it's unwanted, but the result is disturbing as it favors the north African hypothesis. Why? Well, I would like first to remind people on this list that R-L23 and R-L51 were the clades reported early (with the RFLP p49a,f assay) as "ht35". Several groups looked for "ht35" (as opposed to "ht15" for the western type) and they localized them in eastern Europe and the Balkans; in the Middle East to some extent. The picture has changed but, roughly, it's clear that L51 (a SNP known to be upstream of L11) is rare in western Europe as compared to eastern Europe. So (I go straight to the point) if the route through north Africa was taken by R1b-M269 (+ some R-L23 impossible to find anymore), the geographical localization of R-L51 is hard to explain.

I am not going to produce an alternative explanation out of my hat. Let's say that I posted that R-M269 came up to Italy but were stopped, it seems, and possibly changed there from a terrestrial move to a sailing one. From Italy it's possible to reach north Africa near the Iberian coast. The difference is that the group reaching Iberia was probably a mix including R-M269, R-L51 and possibly R-L11.

Here is my two cents on this question.  I thought the basic premise of the Hg subdivisions was a series of SNPs showing descent.  If a person doesn't have an SNP, what does that mean?  I would assume that it means you are not part of that lineage?  Mayka says differently.  I'm not sure what is correct at this time?

PS: the second paper is also reviewed and commented on in a manner probably not appreciated by this board.  However, it follows if paper one is correct.  Out of Africa doesn't make sense if we are not descendants of Hgs A and B.  So then the question is where did M269 originate, and when?  Asia or Europe?
« Last Edit: June 01, 2012, 07:01:55 AM by ironroad41 » Logged
Mike Walsh
Guru
*****
Posts: 2963
« Reply #19 on: June 01, 2012, 07:23:18 AM »

Here is my two cents on this question. I thought the basic premise of the Hg subdivisions was a series of SNPs showing descent.
Yes, I think that is fairly clear.

If a person doesn't have an SNP, what does that mean? I would assume that it means you are not part of that lineage. Mayka says differently. I'm not sure what is correct at this time.

I don't see the confusion about being + (derived) or - (ancestral) for an SNP. Someone who is + for SNP X is more closely related to everyone else who is + for SNP X than to a person who is - for SNP X.
« Last Edit: June 01, 2012, 07:46:37 AM by Mikewww » Logged

R1b-L21>L513(DF1)>L705.2
Mike Walsh
Guru
*****
Posts: 2963
« Reply #20 on: June 01, 2012, 07:29:23 AM »

...As Mike commented above, it is possible to observe a very slow mutation at any time. The better question is when will the next one occur? The gambler who plays slot machines knows that the probability of two big payoffs in a row is minuscule. If it weren't, the house would be broke. So most pulls of the arm result in no change, and the faster mutators occur more frequently, which, again, is consistent with observations. ...

I'm not quite following your whole train of thought here, but I disagree with a nuance of how a pragmatic gambler views probabilities. Assuming the game, whatever it is, is not fixed, then the odds of a particular outcome are no different from one play to the next. There is no change in the "expected" probability. The best example is if red comes up on roulette: the next spin has no less chance of being red than the last spin. All that we can really use to predict or estimate is the "expected" probability.
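
For anyone who wants to check the roulette example rather than take it on faith, here is a minimal sketch. It assumes a fair European wheel (18 red, 18 black, 1 green) and independent spins; the function name `red_frequencies` and the seed are just choices made for this illustration.

```python
import random

def red_frequencies(n_spins, seed=0):
    """Simulate a fair European wheel (18 red, 18 black, 1 green).

    Returns (overall frequency of red,
             frequency of red on spins that directly follow a red spin).
    """
    rng = random.Random(seed)
    # randrange(37) < 18 is True with probability 18/37 (assumed fair wheel).
    spins = [rng.randrange(37) < 18 for _ in range(n_spins)]
    overall = sum(spins) / n_spins
    # Keep only the spins whose previous spin was red.
    after_red = [cur for prev, cur in zip(spins, spins[1:]) if prev]
    return overall, sum(after_red) / len(after_red)

overall, after_red = red_frequencies(200_000)
print(f"P(red) overall    ~ {overall:.3f}")
print(f"P(red | prev red) ~ {after_red:.3f}")
```

Both numbers should land close to 18/37 (about 0.486), which is what "no change in the expected probability" means in practice.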
« Last Edit: June 01, 2012, 07:45:39 AM by Mikewww » Logged

R1b-L21>L513(DF1)>L705.2
Mike Walsh
Guru
*****
Posts: 2963
« Reply #21 on: June 01, 2012, 07:38:14 AM »

... At least two events occur in a bottleneck: a. the diversity of the hg is severely decreased. b.  It takes a long time for the population to recover and begin to grow at a reasonable rate.  This can result in the Hg before the "disaster" having a different set of modal values than after....

Sure, a bottleneck would decrease diversity as long as branches of the haplogroup's prior tree were lopped in some uneven fashion.  If the bottleneck pruned the prior tree in some systematic fashion to "thin" the tree evenly I'm not so sure the diversity would decrease. Probably, in nature, the bottleneck's pruning of the tree would be uneven.

However, when people estimate the age of a haplogroup they are not really estimating the age of the birth of the SNP, they are estimating the age of the Most Recent Common Ancestor (MRCA) with the same SNP. They are two different things and may well have two different haplotypes, providing different STR starting points for future modals of descendant populations to be calculated.

Given that, if branches of the pre-bottleneck tree are lopped off unevenly all that is happening is that the MRCA is changing to a more recent man than before the branch cutting took place. That MRCA may still be before the bottleneck, it just depends on which branches were cut.
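
The effect of uneven versus even pruning on the MRCA can be sketched with a toy tree. This is only an illustration under stated assumptions: the names (`founder`, `a`, `b1`, and so on) and the tree shape are hypothetical, not real genealogy, and `ancestors` and `mrca` are helpers written just for this example.

```python
def ancestors(parent, node):
    """Return the path from node up to the root, node first."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path

def mrca(parent, living):
    """Most recent common ancestor of the given living men."""
    paths = [ancestors(parent, m) for m in living]
    common = set(paths[0]).intersection(*paths[1:])
    # Walking up from one man, the first shared node is the most recent.
    return next(n for n in paths[0] if n in common)

# Hypothetical toy tree: child -> father. "founder" is the pre-bottleneck MRCA.
parent = {
    "a": "founder", "b": "founder",
    "a1": "a", "a2": "a",
    "b1": "b", "b2": "b",
    "a1x": "a1", "a1y": "a1",
}

before = mrca(parent, ["a1x", "a1y", "a2", "b1", "b2"])
after_uneven = mrca(parent, ["a1x", "a1y", "a2"])  # whole b branch lopped off
after_even = mrca(parent, ["a1x", "b1"])           # thinned, both branches survive
print(before, after_uneven, after_even)  # founder a founder
```

In this toy case the uneven cut moves the MRCA forward from `founder` to `a`, while the even thinning leaves it where it was.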

At some point, if lucky, a haplogroup's tree becomes so big or is spread out so geographically, culturally or autosomally (marrying into other groups) that it becomes almost impossible to be severely pruned by a bottleneck.  Many, many haplogroups (seedlings) are gone.  We are just remnants of the lucky ones.

All of this illustrates the need and usefulness for interclade TMRCA calculations, as Ken Nordtvedt has developed.  This separates a pair of clearly related groups (as known by their SNP status) and allows us to compare them.  We know that their interclade TMRCA sets a pretty solid maximum for the actual births of the two different SNPs that mark the two respective groups.   Their interclade MRCA man could NOT have had either of the descendant SNPs so the SNPs have to be younger than him.  I don't want to get teary eyed but this is a beautiful concept in terms of its potential application.
« Last Edit: June 01, 2012, 07:53:13 AM by Mikewww » Logged

R1b-L21>L513(DF1)>L705.2
ironroad41
Old Hand
****
Posts: 219
« Reply #22 on: June 01, 2012, 08:03:25 AM »

Here is my two cents on this question. I thought the basic premise of the Hg subdivisions was a series of SNPs showing descent.
Yes, I think that is fairly clear.

If a person doesn't have an SNP, what does that mean? I would assume that it means you are not part of that lineage. Mayka says differently. I'm not sure what is correct at this time.

I don't see the confusion about being + (derived) or - (ancestral) for an SNP. Someone who is + for SNP X is more closely related to everyone else who is + for SNP X than to a person who is - for SNP X.
  The big issue I see here is what this implies.  If A and B are separate, distinct lineages from C and subsequent, then that is a big deal (in my opinion).

Shades of Velikovsky, von Däniken, Sitchin and Roddenberry!!  Beam me up, Scotty.
Logged
ironroad41
Old Hand
****
Posts: 219
« Reply #23 on: June 01, 2012, 08:09:09 AM »

...As Mike commented above, it is possible to observe a very slow mutation at any time. The better question is when will the next one occur? The gambler who plays slot machines knows that the probability of two big payoffs in a row is minuscule. If it weren't, the house would be broke. So most pulls of the arm result in no change, and the faster mutators occur more frequently, which, again, is consistent with observations. ...

I'm not quite following your whole train of thought here, but I disagree with a nuance of how a pragmatic gambler views probabilities. Assuming the game, whatever it is, is not fixed, then the odds of a particular outcome are no different from one play to the next. There is no change in the "expected" probability. The best example is if red comes up on roulette: the next spin has no less chance of being red than the last spin. All that we can really use to predict or estimate is the "expected" probability.
I'll admit I have never designed a slot machine. I do know it is designed such that the house doesn't lose. More than that is speculation, but I would see no problem in designing a machine where two "grand prizes" in a short period of time cannot occur. The machines do pay off and the astute gambler knows that. That's why the gamers sit in the houses and watch the machines for hours.

Here is an excerpt from designing a slot machine:

Establish payout schedules. While slot machine gaming is completely random, the payback percentage and hit frequencies are programmed into the machine before it even hits the  floor. Some are looser than others, as these schedules vary from one machine to the next. Having a high hit frequency can also contribute to the game's success, as those near-wins can keep the player in the seat.


Read more: How to Design Slot Machines | eHow.com http://www.ehow.com/how_7566975_design-slot-machines.html#ixzz1wXj7zFWQ

So the winning is not random, it's preprogrammed.  The machine always wins, it's rigged!
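
The two quantities named in the excerpt, payback percentage and hit frequency, can be worked out from a paytable. The sketch below is only that, a sketch: the paytable numbers are made up for illustration, and it assumes each spin is drawn independently with fixed probabilities, which is an assumption of this example and not something the excerpt states.

```python
from fractions import Fraction

# Hypothetical paytable: outcome -> (probability per spin, payout per 1 unit bet).
paytable = {
    "jackpot": (Fraction(1, 1000), 500),
    "medium":  (Fraction(1, 50), 10),
    "small":   (Fraction(1, 5), 1),
}

def hit_frequency(table):
    """Fraction of spins that pay anything at all."""
    return sum(p for p, _ in table.values())

def payback(table):
    """Expected amount returned per unit bet."""
    return sum(p * pay for p, pay in table.values())

def two_in_a_row(table, outcome):
    """Chance of the same outcome on two given consecutive independent spins."""
    p, _ = table[outcome]
    return p * p

print("hit frequency:", float(hit_frequency(paytable)))            # 0.221
print("payback:", float(payback(paytable)))                        # 0.9
print("two jackpots in a row:", float(two_in_a_row(paytable, "jackpot")))  # 1e-06
```

With these made-up numbers the machine returns 90% of what is bet on average, and two jackpots back to back come out at one in a million, all under the independent-spin assumption.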

edit:  I still believe the basic mutational process is not random; its outcomes prove that to me.  As you illustrated, a random outcome is an equally likely outcome process. You like to refer to Ken Nordtvedt for your source of "modelling".  My source is Elwyn R. Berlekamp, whom I worked for off and on for 18 years, mostly as a consultant of the firm he founded after I left Kodak.  Here's his resume from Wikipedia:

Berlekamp was born in Dover, Ohio. While an undergraduate at the Massachusetts Institute of Technology (MIT), he was a Putnam Fellow in 1961. He completed his Bachelor's and Master's degrees in electrical engineering in 1962. Continuing his studies at MIT, he finished his Ph.D. in electrical engineering in 1964; his advisors were Claude Shannon, Robert G. Gallager, Peter Elias and John Wozencraft. Berlekamp taught at the University of California, Berkeley from 1964 until 1966, when he became a researcher at Bell Labs. In 1971, Berlekamp returned to Berkeley where, as of 2010, he is a Professor of the Graduate School.

He is a member of the National Academy of Engineering (1977) and the National Academy of Sciences (1999). He was elected a Fellow of the American Academy of Arts and Sciences in 1996. He received in 1991 the IEEE Richard W. Hamming Medal, and in 1998 the Golden Jubilee Award for Technological Innovation from the IEEE Information Theory Society.

Berlekamp is one of the inventors of the Welch-Berlekamp and Berlekamp-Massey algorithms, which are used to implement Reed-Solomon error correction. In the mid-1980s, he was president of Cyclotomics, Inc., a corporation which developed error-correcting code technology. With John Horton Conway and Richard K. Guy, he co-authored Winning Ways for your Mathematical Plays, leading to his recognition as one of the founders of combinatorial game theory. He has studied various games, including Fox and Geese and other fox games, dots and boxes, and, especially, Go. With David Wolfe, Berlekamp co-authored the book Mathematical Go, which describes methods for analyzing certain classes of Go endgames.

Outside of mathematics and computer science, Berlekamp is active in money management. In 1986, on behalf of Axcom Trading Advisors, a futures trading company, Berlekamp began information-theoretic studies of commodity and financial futures. In 1989, Berlekamp owned the largest interest in Axcom. After the firm's futures trading algorithms were rewritten, Axcom's flagship fund had a return (in 1990) of 55%, net of all management fees and transaction costs. Today, this fund is known as the Medallion Fund and is managed by James Harris Simons and his Renaissance Technologies Corporation.
 
Berlekamp and his wife Jennifer have two daughters and a son and live in Piedmont, California.
« Last Edit: June 01, 2012, 08:53:10 AM by ironroad41 » Logged
Richard Rocca
Old Hand
****
Posts: 523
« Reply #24 on: June 01, 2012, 08:31:45 AM »

Well, it appears to be argumentative?  Here is a post by Didier supporting Klyosov's work:
From: didier.vernade@Safe-mail.net
Subject: Re: [DNA] "Out of Africa" and R1b new papers published
Date: Thu, 31 May 2012 19:02:54 -0400

I read the 2 papers and here is my output.
 
Reminder :
 1 - Re-Examining the "Out of Africa" Theory and the Origin of Europeoids (Caucasoids) in Light of DNA Genealogy
 http://www.scirp.org/journal/PaperInformation.aspx?paperID=19566
 
2 - Ancient History of the Arbins, Bearers of Haplogroup R1b, from Central Asia to Europe, 16,000 to 1500 Years before Present
 http://www.scirp.org/journal/PaperInformation.aspx?paperID=19567
 
Paper 1 is a very interesting paper. Figure 3, in my opinion, is a breakthrough and probably the major breakthrough brought by A. Klyosov. I do have criticisms, but I want first to make sure that my later criticisms will not be taken as dismissing this paper.
My main "difficulty" with this paper is about the search for an alternative geographical origin for "Adam" (or whatever name you give to this MRCA). It would be best not to propose any alternative and only to point out that the shape of the tree in Figure 3 and the timing suggest that Alpha and Beta had different geographical localizations. I also do not see the point of the (too) long discussion on the SNPs. I understand that many people still have the ancient M91 origin in mind, but it doesn't add anything to the paper. Last, there might be some discussion of how, from the 4 A haplogroup clusters, acknowledged to be very different, a base haplotype was "found"; a difference in 1 or 2 values might affect the timing, but I do admit that it wouldn't change very much.
 
Paper 2 is very difficult to read, probably because A. Klyosov had to update a story presented many times and wanted to fit new data from several different sources, often a few haplotypes here and there, into the global picture.
Let me go to the point. I never accepted as established the migration of R1b-M269 by a north African route, and I really think that the data presented do not support this view. First, I would like to point out that I don't think the map in Figure 10 is fair. As everyone knows, there are plenty of R-L23 and R-L51 in eastern Europe and in the Balkans, and this map more or less minimizes this fact. I understand that this is unintended, but the result is disturbing, as it favors the north African hypothesis. Why? Well, I would first like to remind people on this list that R-L23 and R-L51 were the clades reported early on (with the RFLP p49a,f assay) as "ht35". Several groups looked for "ht35" (as opposed to "ht15" for the western type) and they localized them in eastern Europe and the Balkans, and to some extent in the Middle East. The picture has changed but, roughly, it's clear that L51 (a SNP known to be upstream of L11) is rare in western Europe as compared to eastern Europe. So (I go straight to the point) if the route through north Africa was made by R1b-M269 (+ some R-L23 impossible to find anymore), the geographical localization of R-L51 is hard to explain.

I am not going to produce an alternative explanation out of my hat. Let's say that I posted that R-M269 came up to Italy but was stopped there, it seems, and possibly changed from a terrestrial move to a sailing one. From Italy it's possible to reach north Africa near the Iberian coast. The difference is that the group reaching Iberia was probably a mix including R-M269, R-L51 and possibly R-L11.

Here is my two cents on this question. I thought the basic premise of the Hg subdivisions was a series of SNPs showing descent. If a person doesn't have an SNP, what does that mean? I would assume that it means you are not part of that lineage. Mayka says differently. I'm not sure what is correct at this time.

ps. The second paper is also reviewed and commented on in a manner probably not appreciated by this board. However, it follows if paper one is correct. The out-of-Africa theory doesn't make sense if we are not descendants of hg A and B. So then the question is where did M269 originate, and when? Asia or Europe?

While I'm not in disagreement about Didier's critique of Klyosov's fiction, L51* is not more common in Eastern Europe than Western Europe.
« Last Edit: June 01, 2012, 08:32:14 AM by Richard Rocca » Logged

Paternal: R1b-U152+L2*
Maternal: H