World Families Forums - STR Wars: Is diversity meaningful? more meaningful than Hg frequency?

JeanL
« Reply #50 on: April 14, 2012, 12:59:19 PM »


Like it or not, hobbyists who invested the most time and discussion (and I think are most credible) on TMRCAs and STR diversity - Ken, Marko, Anatole (Klyosov), Tim Janzen, Vince, etc. all come out with R-M269 being fairly young, like the 4-8k ybp age.  The Chief Scientist at FTDNA, Michael Hammer, says R-M269 is "4-8k years" old.

Klyosov's method is different than Nordtvedt as is Heinila's.  Still their results are similar.  They do explain their methods and they are available.

The methodologies are not overtly different. I can't speak for Heinila's or Janzen's, or say much about Nordtvedt's. But I can say that most of Klyosov's work draws on FTDNA projects, not on randomly collected data. May I also remind readers that Klyosov, and Nordtvedt at least in what I’ve seen, also get extremely young TMRCAs for almost all European haplogroups, including I1 (I-M253), I-M26, etc. The only thing I would add is that, from what I have observed, Klyosov appears very closed-minded when it comes to criticism of his methodology and resorts to all sorts of logical fallacies, whereas Ken Nordtvedt appears more open-minded, and even willing to modify his methodology if he believes something is wrong.
« Last Edit: April 14, 2012, 01:01:13 PM by JeanL » Logged
Mike Walsh
« Reply #51 on: April 14, 2012, 01:05:41 PM »

Like it or not, hobbyists who invested the most time and discussion (and I think are most credible) on TMRCAs and STR diversity - Ken, Marko, Anatole (Klyosov), Tim Janzen, Vince, etc. all come out with R-M269 being fairly young, like the 4-8k ybp age.  The Chief Scientist at FTDNA, Michael Hammer, says R-M269 is "4-8k years" old.

Klyosov's method is different than Nordtvedt as is Heinila's.  Still their results are similar.  They do explain their methods and they are available.

I should add, a non-STR diversity based method, counting SNPs on branch lengths, was used by Karafet et al in 2008 to estimate the age of R1 (not R1b or R1a but their common ancestor) as 18.5k ybp.  FTDNA's Michael Hammer was in that author group.
Logged

R1b-L21>L513(DF1)>S6365>L705.2(&CTS11744,CTS6621)
Jdean
« Reply #52 on: April 14, 2012, 01:26:48 PM »

Believe me, I get lost plenty.

If only 1/2 an SNP occurs per generation, then it would take only 6 generations to get 3, but these are just averages anyway.

however as you pointed out most male lines die out, which is presumably what happened to L21-,L459+ or  L21+,Z245- (and any of the other possible combinations), but I thought reduced extinction rates was one of the facets of the wave surfing idea ?

We don't know if L459 or Z245 is upstream of L21 or not.  Unfortunately, NOT many P312* folks have tested for those.

.. but you just said it, most lineages die out.  Everyone on the wave of an expansion does not prosper. As far as paternal lineages, it looks like those who prosper are fairly limited in number, it's just they do a ton of damage (so to speak.)

I don’t know if the order in which the SNPs occurred makes much difference to my point, apart from the more limited testing under P312, but L21- people are testing L459 & Z245. According to Ymap, 274 people have tested for L459, of which 94 were positive. I’m assuming the other 180 were P312+, L21- folk, but I suppose Thomas could have been adding tests from WTY as well.

These SNPs could have occurred in 6 generations (or more if we include the untested Z260 & Z290), but what are the chances that all of them were then discovered in the 1000 Genomes Project? At 1 SNP every 2 generations there are presumably many yet to be unearthed!
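
As a rough check on the "6 generations" figure, here is a minimal Python sketch. It assumes SNPs arrive as a Poisson process at the thread's working rate of 0.5 per generation (1 SNP every 2 generations); that rate is just the figure used in this discussion, not a measured value, and the function names are made up for illustration.

```python
import math

# Assumed rate: the thread's "1 SNP every 2 generations", not a measured value.
SNP_RATE = 0.5

def expected_generations(n_snps, rate=SNP_RATE):
    """Average number of generations needed to accumulate n_snps."""
    return n_snps / rate

def prob_at_least(n_snps, generations, rate=SNP_RATE):
    """Poisson probability of n_snps or more mutations within `generations`."""
    lam = rate * generations
    below = sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(n_snps))
    return 1.0 - below

print(expected_generations(3))        # 6.0
print(round(prob_at_least(3, 6), 3))  # 0.577
```

Under these assumptions, six generations is only the average wait for three SNPs: a little over half of all lines would have three or more by then, and the rest would take longer.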
Logged

Y-DNA R-DF49*
MtDNA J1c2e
Kit No. 117897
Ysearch 3BMC9

seferhabahir
« Reply #53 on: April 14, 2012, 01:32:05 PM »

Out of 12 L21** members, 2 are Irish, 3 English, 1 Welsh, 1 German, 1 Belarussian, and 4 unknown who might be at least British Isles.  The origin could be British Isles, but not necessarily Ireland.  More testing needed, of course.

I should remind you that I'm really not a Belarussian, and that my Belarussian (or perhaps even my "continental") status is likely an accident of historical migration of the 1111EE cluster, due to various expulsions over time. If there were a lot more U.S. testers whose Ashkenazi great-grandparents lived in modern day Belarus or Ukraine, it would really skew the R-L21 frequency maps. Of course, I suffer from a huge case of Male Haplogroup Disorder, which makes me believe the origin of L21 is nowhere near the British Isles. But if any non-Ashkenazi person ever shows up in 1111EE, I'm open to considering myself to be a pre-proto-Celt just for fun.
Logged

Y-DNA: R-L21 (Z251+ L583+)

mtDNA: J1c7a

JeanL
« Reply #54 on: April 14, 2012, 01:33:04 PM »

I should add, a non-STR diversity based method, counting SNPs on branch lengths, was used by Karafet et al in 2008 to estimate the age of R1 (not R1b or R1a but their common ancestor) as 18.5k ybp.  FTDNA's Michael Hammer was in that author group.

Indeed, that was done by fixing the age of CT at 70 kya, and then interpolating how far downstream each haplogroup was located. In fact here is the quote about it:

Quote from: Karafet et al(2008)
The time to the most recent common ancestral Y chromosome and the estimated ages of 11 major clades are presented in Table 2. To provide estimates of the age of the nodes, we chose to fix the time to the most recent common ancestor of CT (defined by P9.1, M168, and M294) at 70 thousand years ago (Kya), which is consistent with previous estimates from genetic and archaeological data (Lahr and Foley 1998; Hammer and Zegura 2002; Macaulay et al. 2005), and is the chronological approximation given in Jobling et al. (2004) (p250) for the first major human out-of-Africa dispersals. We estimated the times for intermediate nodes by using a linear interpolation. The age estimates in years should be viewed with caution because we do not know if the calibration date chosen above is accurate.

Moreover, per Table 2 of that study, the age of R1 is 18,500 ybp (95% CI 12,500-25,700) and the age of I is 22,200 ybp (95% CI 15,300-30,000), so it seems I is about 1.2 times as old as R1. Also, because the intermediate nodes are dated by linear interpolation from CT, moving the TMRCA of CT upward moves the R1 and I TMRCAs upward too, and moving it downward lowers them both.
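
To make the dependence on the calibration concrete, here is a minimal Python sketch. It assumes that "linear interpolation" means every node age scales in proportion to the age fixed for CT; that is my reading of the quoted passage, and the paper's exact procedure may differ. The `rescale` helper is hypothetical; the three ages are the point estimates quoted above.

```python
CT_AGE_KYA = 70.0  # calibration fixed by Karafet et al. (2008)
R1_AGE_KYA = 18.5  # Table 2 point estimate quoted above
I_AGE_KYA = 22.2   # Table 2 point estimate quoted above

def rescale(age_kya, new_ct_kya, old_ct_kya=CT_AGE_KYA):
    """Node age under a different CT calibration, assuming proportional scaling."""
    return age_kya * new_ct_kya / old_ct_kya

for ct in (60.0, 70.0, 80.0):
    r1 = rescale(R1_AGE_KYA, ct)
    hg_i = rescale(I_AGE_KYA, ct)
    print(f"CT={ct:.0f} kya: R1={r1:.1f} kya, I={hg_i:.1f} kya, I/R1={hg_i / r1:.2f}")
```

Under that assumption the absolute ages move with the calibration, but the I/R1 ratio of about 1.2 does not.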
« Last Edit: April 14, 2012, 01:50:37 PM by JeanL » Logged
seferhabahir
« Reply #55 on: April 14, 2012, 02:09:00 PM »

Out of 12 L21** members, 2 are Irish, 3 English, 1 Welsh, 1 German, 1 Belarussian, and 4 unknown who might be at least British Isles.  The origin could be British Isles, but not necessarily Ireland.  More testing needed, of course.
I should remind you that I'm really not a Belarussian, and that my Belarussian (or perhaps even my "continental") status is likely an accident of historical migration of the 1111EE cluster, due to various expulsions over time. If there were a lot more U.S. testers whose Ashkenazi great-grandparents lived in modern day Belarus or Ukraine, it would really skew the R-L21 frequency maps. Of course, I suffer from a huge case of Male Haplogroup Disorder, which makes me believe the origin of L21 is nowhere near the British Isles. But if any non-Ashkenazi person ever shows up in 1111EE, I'm open to considering myself to be a pre-proto-Celt just for fun.

And one of the Irish L21** just now came back as DF41+ so is no longer in the L21** list.
Logged

Y-DNA: R-L21 (Z251+ L583+)

mtDNA: J1c7a

Dubhthach
« Reply #56 on: April 14, 2012, 04:23:59 PM »

Out of 12 L21** members, 2 are Irish, 3 English, 1 Welsh, 1 German, 1 Belarussian, and 4 unknown who might be at least British Isles.  The origin could be British Isles, but not necessarily Ireland.  More testing needed, of course.
I should remind you that I'm really not a Belarussian, and that my Belarussian (or perhaps even my "continental") status is likely an accident of historical migration of the 1111EE cluster, due to various expulsions over time. If there were a lot more U.S. testers whose Ashkenazi great-grandparents lived in modern day Belarus or Ukraine, it would really skew the R-L21 frequency maps. Of course, I suffer from a huge case of Male Haplogroup Disorder, which makes me believe the origin of L21 is nowhere near the British Isles. But if any non-Ashkenazi person ever shows up in 1111EE, I'm open to considering myself to be a pre-proto-Celt just for fun.

And one of the Irish L21** just now came back as DF41+ so is no longer in the L21** list.

That would be me! :-)
Logged
seferhabahir
« Reply #57 on: April 14, 2012, 04:29:26 PM »

And one of the Irish L21** just now came back as DF41+ so is no longer in the L21** list.
That would be me! :-)

Yes, very cool. Congratulations on getting into a probable new son of L21...
Logged

Y-DNA: R-L21 (Z251+ L583+)

mtDNA: J1c7a

Mike Walsh
« Reply #58 on: April 14, 2012, 10:54:46 PM »

Believe me, I get lost plenty.

If only 1/2 an SNP occurs per generation, then it would take only 6 generations to get 3, but these are just averages anyway.

however as you pointed out most male lines die out, which is presumably what happened to L21-,L459+ or  L21+,Z245- (and any of the other possible combinations), but I thought reduced extinction rates was one of the facets of the wave surfing idea ?

We don't know if L459 or Z245 is upstream of L21 or not.  Unfortunately, NOT many P312* folks have tested for those.

.. but you just said it, most lineages die out.  Everyone on the wave of an expansion does not prosper. As far as paternal lineages, it looks like those who prosper are fairly limited in number, it's just they do a ton of damage (so to speak.)

I don’t know if it makes that much difference to my point which order the SNPs occurred in apart from the more limited testing under P312, but L21- people are testing L459 & Z245. According to Ymap 274 people have tested for L459 of which 94 were positive. I’m assuming the other 180 were P312+, L21- folk but I suppose Thomas could have been adding tests from WTY as well.

The point is that we don't know if L459, L21 and Z245 all happened at about the same time. L459, for instance, could have occurred many generations upstream of L21 in some P312* lineage that is now mostly extinct except for the L21 sub-element of it.

I don't think you can assume the other 180 were P312xL21. There aren't nearly that many P312* guys in WTY, not even close. Outside of WTY, only a few P312* have tested for L459.

These SNPs could have occurred in 6 generations (or more if we include the untested Z260 & Z290) but what are the chances that all of them were then discovered in the 1000 genome project, at 1 SNP every 2 generations there are presumably many yet to be unearthed !!!

I don't know the odds and you don't know the odds, and neither of us has much idea of the number of generations between these three SNPs.  You are making assumptions about those three SNPs to support your objections.

We should be careful about what we assume.
« Last Edit: April 14, 2012, 10:56:41 PM by Mikewww » Logged

R1b-L21>L513(DF1)>S6365>L705.2(&CTS11744,CTS6621)
Mike Walsh
« Reply #59 on: April 14, 2012, 11:01:52 PM »

I should add, a non-STR diversity based method, counting SNPs on branch lengths, was used by Karafet et al in 2008 to estimate the age of R1 (not R1b or R1a but their common ancestor) as 18.5k ybp.  FTDNA's Michael Hammer was in that author group.

Indeed, that was done by fixing the age of CT at 70 kya, and then interpolating how far downstream each haplogroup was located. In fact here is the quote about it:

Quote from: Karafet et al(2008)
The time to the most recent common ancestral Y chromosome and the estimated ages of 11 major clades are presented in Table 2. To provide estimates of the age of the nodes, we chose to fix the time to the most recent common ancestor of CT (defined by P9.1, M168, and M294) at 70 thousand years ago (Kya), which is consistent with previous estimates from genetic and archaeological data (Lahr and Foley 1998; Hammer and Zegura 2002; Macaulay et al. 2005), and is the chronological approximation given in Jobling et al. (2004) (p250) for the first major human out-of-Africa dispersals. We estimated the times for intermediate nodes by using a linear interpolation. The age estimates in years should be viewed with caution because we do not know if the calibration date chosen above is accurate.

Moreover, per Table-2 of that study the age of R1 is 18,500 ybp (95% CI 12,500-25,700), and the age of I is 22,200 ybp (95% CI 15,300-30,000), so it seems I is about 1.2 times older than R1. Also, if we move TMRCA of CT upwards, then R1 TMRCA goes upward, and I TMRCA goes up, the same happens if we move it downwards.


This is no proof. These are just estimates.

Nevertheless, the "most likely" case for an R1 TMRCA estimate using a totally non-STR based (SNP counting) method aligns very nicely with our top scientist-hobbyist TMRCA estimates for R1b and its subclades, and our top scientist-hobbyists are using at least three different methods - Nordtvedt's Gen7, Klyosov's, and Heinila's "most probable outcome."

The net is we have an SNP based method that supports STR variance based methods and three of those methods generally agree.
Logged

R1b-L21>L513(DF1)>S6365>L705.2(&CTS11744,CTS6621)
JeanL
« Reply #60 on: April 14, 2012, 11:09:24 PM »

This is no proof. These are just estimates.

Nevertheless, the "most likely" case for an R1 TMRCA estimate using a totally non-STR based (SNP counting) method aligns very nicely with our top scientist-hobbyist TMRCA estimates for R1b and its subclades, our our top scientist-hobbyists are using at least three different methods - Nordtvedt's Gen7, Klyosov's, and Heinila's "most probable outcome."

The net is we have an SNP based method that supports STR variance based methods and three of those methods generally agree.


What do you mean by "this is no proof"? The net from Karafet et al (2008) is that R1 is 18,500 ybp if CT is 70,000 ybp, and under that assumption I is 22,200 ybp. If you use three different methods, none of which takes into account the effects of microsatellite choice, you would still get similar age estimates, because all three would underestimate the age of the haplogroup, so nothing new there.
« Last Edit: April 14, 2012, 11:11:42 PM by JeanL » Logged
Jdean
« Reply #61 on: April 15, 2012, 07:01:59 AM »


The point is that we don't know if L459, L21 and Z245 all happened at about the same time. L459, for instance could have occurred many generations upstream of L21 in some P312* lineage that is now mostly extinct except the L21 sub-element of it.

I don't think you can assume the other 180 were P312xL21. There aren't nearly that many P312* guys in WTY, not even close. Outside of WTY, a only a few P312* have tested for L459.

These SNPs could have occurred in 6 generations (or more if we include the untested Z260 & Z290) but what are the chances that all of them were then discovered in the 1000 genome project, at 1 SNP every 2 generations there are presumably many yet to be unearthed !!!

I don't know the odds and you don't know the odds, but neither us has much idea of the generations between these three SNPs.  You are making assumptions about those three SNPs to support your objections.

We should be careful about what we assume.



I’m not assuming anything, simply wondering if this data can be used to some effect :)

Whether or not Thomas included WTY results in the Ymap data is an important consideration (that’s why I mentioned it). The best way to know for sure would of course be to ask him, but he can be a little erratic with his replies; then again, he’s a busy man. However, I think it’s more likely than not that he does, and since there are 122 negative results in WTY for L459, of which 117 are something other than P312*, I think we can reasonably comfortably remove those 117 from the 180 negative results reported at Ymap.

That still leaves 63, and unless we can think of another source of random testing Thomas could be using (I can’t, other than the 1000 Genomes Project, which seems unlikely), I think it’s reasonable to assume (bum, am I allowed one? :) these 63 are P312*.

That would leave a roughly guessed 63 P312+, L459- against 94 L21+, L459+ results which is probably still a little light on numbers to draw concrete conclusions from but at least gives enough detail to say if an L21+, L459- or L21-, L459+ fellow did turn up he would be quite lonely.
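
For anyone who wants to follow the sums, here is a minimal Python sketch of the arithmetic, plus a rough idea of how "light on numbers" 63 is. The upper bound assumes the 63 are independent, randomly sampled P312xL21 men, which is exactly the guess being made here, so treat it as illustrative only; the function name is hypothetical.

```python
tested_l459 = 274    # Ymap total quoted in the thread
positive_l459 = 94   # L459+ results
wty_non_p312 = 117   # WTY negatives that are something other than P312*

negatives = tested_l459 - positive_l459   # 180
guessed_p312 = negatives - wty_non_p312   # 63

def upper_bound_zero_hits(n, confidence=0.95):
    """Largest true frequency still compatible with 0 hits in n independent tests."""
    return 1.0 - (1.0 - confidence) ** (1.0 / n)

print(negatives, guessed_p312)                        # 180 63
print(round(upper_bound_zero_hits(guessed_p312), 3))  # 0.046
```

So, if the guess holds, zero L459+ results among 63 men is still compatible with L459+ turning up in a few percent of P312xL21 men.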

Without going through all that again I think it’s fair to say the results for Z245 are going to be roughly in line with those for L459, we at least know with certainty that L21+, Z245- or L21-, Z245+ hasn’t been found.

So getting back to the original question

We have a reasonably good idea that L21, L459 & Z245 are pretty much the same thing, but we don’t know the order they arrived (and probably never will).

What we can say is this suggests there was a reasonable time frame between the P312* grandfather  and the first L21+, L459+, Z245+ fellow, unless these three SNPs happened right on top of each other which sounds less likely.

But from interclade calculations we know there wasn’t that much time between P312, U152 and L21 (or L459 / Z245, whichever was last)

This tells us that L21 (or whatever) split from P312 earlier than interclade calculations can tell us, and in my opinion it raises questions about the idea that L21’s spread is due to it being born on the crest of a wave. Of course this idea (and it’s only that) doesn’t completely quash the ‘surfing the wave’ idea, but to my mind it at least suggests L21 sat around somewhere fairly sedentarily for a reasonable time (possibly building up numbers) before surfing out.
Logged

Y-DNA R-DF49*
MtDNA J1c2e
Kit No. 117897
Ysearch 3BMC9

Mike Walsh
« Reply #62 on: April 15, 2012, 11:01:12 AM »


The point is that we don't know if L459, L21 and Z245 all happened at about the same time. L459, for instance could have occurred many generations upstream of L21 in some P312* lineage that is now mostly extinct except the L21 sub-element of it.

I don't think you can assume the other 180 were P312xL21. There aren't nearly that many P312* guys in WTY, not even close. Outside of WTY, a only a few P312* have tested for L459.

These SNPs could have occurred in 6 generations (or more if we include the untested Z260 & Z290) but what are the chances that all of them were then discovered in the 1000 genome project, at 1 SNP every 2 generations there are presumably many yet to be unearthed !!!

I don't know the odds and you don't know the odds, but neither us has much idea of the generations between these three SNPs.  You are making assumptions about those three SNPs to support your objections.

We should be careful about what we assume.
I’m not assuming anything, simply wondering if this data can be used to some effect :)
Your counter-arguments are based on assumptions, whether you call them wondering or whatever. They apparently are constructed to argue that you've found an exception to statistically researched methods like what Ken Nordtvedt or Marko Heinila have constructed.

Whether or not Thomas included WTY results in the Ymap data is an important consideration (that’s why I mentioned it), the best way to know for sure of course would be to ask him but he can be a little erratic with his replies but of course he’s a busy man. However I think it’s at least more than likely that he does and since there are 122 negative results in WTY for L459 of which 117 are something other than P312*, I think we can reasonably comfortably remove them from the 180 neg results reported at Ymap.

That still leaves 63 and unless we can think of another source of random testing Thomas could be using (I can’t other than the 1000 genome which seems unlikely) I think it’s reasonable to assume (bum, am I allowed one ? :) these 63 are P312*
You can assume all you want, but your counter-arguments carry little weight if your assumptions are false or unknown, which is the case.

That would leave a roughly guessed 63 P312+, L459- against 94 L21+, L459+ results which is probably still a little light on numbers to draw concrete conclusions from but at least gives enough detail to say if an L21+, L459- or L21-, L459+ fellow did turn up he would be quite lonely....

Maybe you missed it. Do you agree most Y lineages go extinct?  If so, then it is very conceivable that a P312* lineage had the L459 mutation and then, many generations later, had the L21 mutation, but all the L459+ L21- lineages died off

.... or have not been found and tested yet. A lot can happen in 4000 years. What % of the population do you think we have tested for P312, L21 and L459?
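
The extinction point can be illustrated with a minimal Python sketch of a branching process. It assumes each man leaves a Poisson-distributed number of sons with a mean of 1 (a stable population); that model, the 50-generation horizon and all names below are illustrative assumptions, not anything taken from the methods discussed in this thread.

```python
import math
import random

def poisson(rng, lam):
    """Draw one Poisson(lam) sample using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def lineage_survives(rng, generations, mean_sons=1.0):
    """Follow one founder's male line; True if any descendant is alive at the end."""
    alive = 1
    for _ in range(generations):
        alive = sum(poisson(rng, mean_sons) for _ in range(alive))
        if alive == 0:
            return False
    return True

def extinct_fraction(founders=2000, generations=50, mean_sons=1.0, seed=1):
    """Share of founders whose male line has died out after `generations`."""
    rng = random.Random(seed)
    survivors = sum(lineage_survives(rng, generations, mean_sons) for _ in range(founders))
    return 1.0 - survivors / founders

print(f"extinct after 50 generations: {extinct_fraction():.1%}")
```

With these settings the large majority of founder lines are gone within 50 generations, which is the sense in which an L459+ L21- line could easily have existed and left no tested descendants.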
« Last Edit: April 15, 2012, 11:07:09 AM by Mikewww » Logged

R1b-L21>L513(DF1)>S6365>L705.2(&CTS11744,CTS6621)
Jdean
« Reply #63 on: April 15, 2012, 11:15:16 AM »


Oh well, never mind. I thought this was a potentially interesting observation, but I'll crawl back under my rock.

BTW, I'm not sure which statistical evidence you think I'm trying to overturn, and yes, I'm extremely aware of extinction rates.
Logged

Y-DNA R-DF49*
MtDNA J1c2e
Kit No. 117897
Ysearch 3BMC9

Mike Walsh
« Reply #64 on: April 15, 2012, 11:34:24 AM »

This was part of the R-P312 Basque discussions but since we are getting into STR variance issues I copied it over here.

Quote from: Mikewww

Just three or ten STRs are just not enough. Of course, with such a limited experiment you are going to get erratic results.  Remember Sandy Paterson's simulations where it was determined you need a minimum of 50 STRs to have any precision?

Also, I think you are doing comparisons of R-L23xM412. This is NOT a haplogroup. It is a paragroup. Since it is not really a single group with a single common ancestor I'm not sure that is a valid comparison to make between geographies.

I'm working with what I've got. The data from Myres et al (2010) was sampled using 10 STRs, and the point of the exercise I did was to show how much the variance changed when one used more linear vs. less linear STRs. As you can see, when all of the STRs are used, Turkey turns out to have a higher variance than Western Europe, but when the slowest, most linear ones are used, Western Europe turns out to have the higher variance. It’s not about the numbers but about the choice. Of course 10-20 slow STRs trump 3 slow STRs; however, to say that a set of 50 STRs, regardless of their mutation rate, is better than a set of 10-20 slow STRs is just not logical to me.
I think it is fine to work with what you have, but that doesn't mean you've demonstrated your point well.  Three or ten STRs are just not enough. Of course you can get erratic results when working with such small data sets.

Yeah, I’m doing comparisons on the L23(xM412) data from Myres et al (2010), and yes, maybe the folks from the Caucasus have some SNP that the folks from Europe do not have, but that doesn’t change the fact that they both descend from an L23 man, so yeah, it is a group with a single common ancestor.
Yes, but this is not representative of the whole group. All of the M412 (L51) folks are excluded and that's nine tenths of the group, or probably more.
Logged

R1b-L21>L513(DF1)>S6365>L705.2(&CTS11744,CTS6621)
JeanL
« Reply #65 on: April 15, 2012, 12:05:05 PM »

I think it is fine to work with what you have but that doesn't mean you've demonstrated your point well.  Three or ten STRs is just not enough. Of course, you can get erratic results with working with such small data sets.

Yes, but this is not representative of the whole group. All of the M412 (L51) folks are excluded and that's nine tenths of the group, or probably more.

First off, my point was to show that the relative variance in populations varies as a function of the microsatellite choice. I accept that 3 STRs is rather few, but if you can find me a scientific paper that uses more than 10-15 STRs, I’ll be more than happy to work with it. As for the inclusion of the L51 folks, per Table S3 of the Myres et al (2010) study, R-L23 has 214 samples, whereas M412 (L51) only has 14; that is nowhere near nine tenths of the group, and they are in fact a small minority. Moreover, 13/14 M412 are found in Europe, with the additional one coming from Turkey, so it would be kind of pointless to work with a single Turkish haplotype.
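
To show what "varies as a function of the microsatellite choice" means in practice, here is a minimal Python sketch with invented repeat counts. None of these numbers come from Myres et al (2010); the two populations, the locus names and the split into "fast" and "slow" markers are all hypothetical, chosen only so that the ranking flips.

```python
from statistics import pvariance

# Hypothetical repeat counts: locus name -> one value per sampled man.
POP_A = {
    "fast1": [12, 14, 16, 13, 15],
    "fast2": [22, 24, 26, 23, 25],
    "slow1": [10, 10, 11, 10, 10],
    "slow2": [11, 11, 12, 11, 11],
}
POP_B = {
    "fast1": [13, 14, 15, 14, 14],
    "fast2": [23, 24, 25, 24, 24],
    "slow1": [10, 11, 10, 11, 10],
    "slow2": [11, 12, 11, 12, 11],
}
ALL_LOCI = ["fast1", "fast2", "slow1", "slow2"]
SLOW_LOCI = ["slow1", "slow2"]  # assumed to be the slow, more linear markers

def mean_variance(pop, loci):
    """Average per-locus variance of repeat counts over the chosen loci."""
    return sum(pvariance(pop[locus]) for locus in loci) / len(loci)

for label, loci in (("all loci", ALL_LOCI), ("slow loci only", SLOW_LOCI)):
    a = mean_variance(POP_A, loci)
    b = mean_variance(POP_B, loci)
    print(f"{label}: A={a:.2f}, B={b:.2f}, higher={'A' if a > b else 'B'}")
```

In this toy case population A looks more diverse when all four loci are used, while B looks more diverse on the slow loci alone. Whether the real data behave this way is what the disagreement in this thread is about.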

PS: Not to get into gossip, but here is what Dr.Klyosov just told Sandy Paterson over at Rootsweb:

Quote
There are several problems with your approach, and it would be good if you listen if you really want to understand where the problems are.

Some of those problems are technical, but they show that you are not "in" as yet. Some of them are fundamental.

Let me explain. What you do, you pick something such as "the sum of variance" which can always be picked for any senseless series of numbers, you divide by something which is highly uncertain, you get something, which you would always get when you divide something by something, and you say - voila, I got it. What is the worst in all of it, you do not even read papers which explain all what you do, you ignore them, you do not take the data into account.

This is a recipe for a complete disaster which you get.

You insist that the mutation rate constant for the 111 marker haplotypes is 0.41. Have you seen HOW it was "obtained"? It was obtained for only 34 markers from those 111 markers, and only 9 of them were in the 68-111 marker row. Do you call it "science"?
http://archiver.rootsweb.ancestry.com/th/read/GENEALOGY-DNA/2012-04/1334407313

Here is what he said very recently in regards to John Chandler’s mutation rates:
Quote

For the record, Chandler's estimates are good only for the 12 marker panel. For the 25 and 37 marker panel they are grossly incorrect. They do not in agreement with his own data on the 12 marker haplotypes.
http://archiver.rootsweb.ancestry.com/th/read/GENEALOGY-DNA/2012-04/1334423409

It seems to me there isn’t much harmony in the hobbyist community.
« Last Edit: April 15, 2012, 12:51:57 PM by JeanL » Logged
Mike Walsh
« Reply #66 on: April 15, 2012, 04:46:29 PM »

I think it is fine to work with what you have but that doesn't mean you've demonstrated your point well.  Three or ten STRs is just not enough. Of course, you can get erratic results with working with such small data sets.

Yes, but this is not representative of the whole group. All of the M412 (L51) folks are excluded and that's nine tenths of the group, or probably more.

First off, my point was to show that the relative variance in populations varies as a function of the microsatellite choice. I accept that 3 STRs are rather small, but well if you can find me a scientific paper that uses more than 10-15 STRs I’ll be more than happy to work with it.
I'm not going to defend the scientific papers' use of 10-15 STRs. I think it is too small a number.  We know that for matching people in R1b, using 67 markers is highly beneficial even versus 37. I don't know why anyone thinks 3, for sure, is even worth looking at. I don't think anyone in the hobbyist community is claiming to do valid TMRCAs with 3 markers or 12 or 15. FTDNA has come out with 111 STRs and has given this reasoning: to do more precise TMRCAs.

As for the inclusion of the L51 folks, well, per Table-S3 of the Myres et al(2010) study, R-L23 has 214 samples, whereas M412(L51) only has 14, that is nowhere near nine tenths of the group, they are in fact a small minority. Moreover 13/14 M412 are found in Europe, with the additional one coming from Turkey, so it would be kind of pointless to work with a single Turkish haplotype.

All of L51 includes all of L11 and all of U106 and all of P312. Paragroups do not represent a group of people with a single common ancestor. Paragroups may be missing large chunks of the data that is available for the total group. In your example, you are missing the bulk of L23 and/or L51.

PS: Not to get into gossip, but here is what Dr.Klyosov just told Sandy Paterson over at Rootsweb:...

Here is what he said very recently in regards to John Chandler’s mutation rates:
....
For the record, Chandler's estimates are good only for the 12 marker panel.
For the 25 and 37 marker panel they are grossly incorrect.
....
It seems to me there isn’t much harmony in the hobbyist community.

If you are certain Chandler's estimates are grossly incorrect you should go on to Rootsweb and make your case. I know that Leo Little's rates are also used on certain panels, but unfortunately, he is no longer with us.

I agree there is not harmony in the hobbyist community, but regardless of methodology, prognostication and personal and communications differences, that makes the following even more significant.

Nevertheless, the "most likely" case for an R1 TMRCA estimate using a totally non-STR based (SNP counting) method aligns very nicely with our top scientist-hobbyist TMRCA estimates for R1b and its subclades, and our top scientist-hobbyists are using at least three different methods - Nordtvedt's Gen7, Klyosov's, and Heinila's "most probable outcome."

The net is we have an SNP based method that supports STR variance based methods and three of those methods generally agree.

No matter how they do it, R-M269 comes out about the same age. I should add Tim Janzen to the list, but he is just using a variant of Nordtvedt's methods. We could add Vince Vizachero as well, but he may also use Nordtvedt's stuff too.
« Last Edit: April 15, 2012, 05:21:17 PM by Mikewww » Logged

R1b-L21>L513(DF1)>S6365>L705.2(&CTS11744,CTS6621)
JeanL
Old Hand
****
Offline Offline

Posts: 425


« Reply #67 on: April 15, 2012, 05:43:28 PM »

I'm not going to defend the scientific papers' use of 10-15 STRs. I think it is too small a number.  We know that for matching people in R1b, using 67 markers is highly beneficial even versus 37. I don't know why anyone thinks 3, for sure, is worth even looking at. I don't think anyone in the hobbyist community is claiming to do valid TMRCAs with 3 markers or 12 or 15. FTDNA has come out with 111 STRs and has used this reasoning: to do more precise TMRCAs.

Well, 37-, 67-, and 111-marker sets are good for getting personal matches; to find TMRCA in population databases, a good panel of 10-15 linear STRs would do. Again, the use of the 3 slowest/more linear STRs vs. the 4 faster/less linear STRs was to show how the relative variance changed, and how places that had a greater variance overall ended up having less variance than other places when the most linear markers were used.

All of L51 includes all of L11 and all of U106 and all of P312. Paragroups do not represent a group of people with a single common ancestor. Paragroups may be missing large chunks of the data that is available for the total group. In your example, you are missing the bulk of L23 and/or L51.

Well the L23 samples I analyzed were L23(xL51), so again it wasn’t a paragroup, or at least not in the way you are describing it. So no, I’m not missing the bulk of L23, because I only included the L23(xL51) samples, not the L23+ samples.

If you are certain Chandler's estimates are grossly incorrect you should go on to Rootsweb and make your case. I know that Leo Little's rates are also used on certain panels, but unfortunately, he is no longer with us.

I didn’t say that, that was Anatole Klyosov who said that, did you even care to check the links I provided?

I agree there is not harmony in the hobbyist community, but regardless of methodology, prognostication and personal and communications differences, that makes the following even more significant.

Nevertheless, the "most likely" case for an R1 TMRCA estimate using a totally non-STR based (SNP counting) method aligns very nicely with our top scientist-hobbyist TMRCA estimates for R1b and its subclades, and our top scientist-hobbyists are using at least three different methods - Nordtvedt's Gen7, Klyosov's, and Heinila's "most probable outcome."

The net is we have an SNP based method that supports STR variance based methods and three of those methods generally agree.

No matter how they do it, R-M269 comes out about the same age. I should add Tim Janzen to the list, but he is just using a variant of Nordtvedt's methods. We could add Vince Vizachero as well, but he may also use Nordtvedt's stuff too.

Well, as long as a bunch of less linear and a few more linear STRs are being thrown together, you are going to get TMRCAs that are saturated by the inherent loss of linearity of most of the STRs that were used.  What’s the point of using 111 markers if most of them lose linearity in less than 5000 years? No wonder they get TMRCAs that are between 4000-8000 ybp.
« Last Edit: April 15, 2012, 05:44:24 PM by JeanL » Logged
Mike Walsh
« Reply #68 on: April 16, 2012, 04:54:11 PM »


All of L51 includes all of L11 and all of U106 and all of P312. Paragroups do not represent a group of people with a single common ancestor. Paragroups may be missing large chunks of the data that is available for the total group. In your example, you are missing the bulk of L23 and/or L51.

Well the L23 samples I analyzed were L23(xL51), so again it wasn’t a paragroup, or at least not in the way you are describing it. So no, I’m not missing the bulk of L23, because I only included the L23(xL51) samples, not the L23+ samples.

It looks like we have a disagreement on terminology or understanding.
Quote from: wikipedia
Paragroup is a term used in population genetics to describe lineages within a haplogroup that are not defined by any additional unique markers. In human Y-chromosome DNA haplogroups, paragroups are typically represented by an asterisk (*) placed after the main haplogroup[1].
[1] The Y Chromosome Consortium, T. Y C. (2002). "A Nomenclature System for the Tree of Human Y-Chromosomal Binary" Genome Research
http://en.wikipedia.org/wiki/Paragroup

R-L23xL51 or call it R-L23* if you wish, is a paragroup.

There may be people who are R-L23xL51 in your sample who are more closely related to R-L51 people than to all of the other R-L23xL51 people in the sample. Make sense? We don't know how many subclades there are hidden in R-L23xL51. Their Most Recent Common Ancestor cannot be determined by SNP knowledge, other than to say they have the same Most Recent Common Ancestor as all of R-L23, including all of the L51 (on down to P312, U106) guys.
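
To make the paragroup point concrete, here is a minimal Python sketch on a made-up tree. Only L23 and L51 come from the discussion; the nodes X, A and C are hypothetical stand-ins for not-yet-discovered SNPs, and the tree shape is an assumption chosen purely for illustration. It shows how two samples that both test as L23xL51 can share only the L23 ancestor, while one of them is closer to L51 than to the other.

```python
# Toy phylogeny as child -> parent links. X, A and C are hypothetical
# undiscovered SNPs; samples in A or C would both test as "L23xL51".
PARENT = {
    "L23": None,
    "X": "L23",   # hypothetical branch shared by L51 and A
    "L51": "X",
    "A": "X",     # hypothetical hidden subclade (tests L23+, L51-)
    "C": "L23",   # hypothetical hidden subclade (tests L23+, L51-)
}

def lineage(node):
    """Return the path from a node up to the root, nearest ancestor first."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path

def mrca(nodes):
    """Most recent common ancestor node of the given nodes in the toy tree."""
    nodes = list(nodes)
    shared = set(lineage(nodes[0]))
    for other in nodes[1:]:
        shared &= set(lineage(other))
    # Walking up from the first node, the first shared node is the most recent one.
    for candidate in lineage(nodes[0]):
        if candidate in shared:
            return candidate

print(mrca(["A", "C"]))    # the two paragroup samples only meet at the root
print(mrca(["A", "L51"]))  # A meets L51 lower down the tree
```

In this toy tree the paragroup members A and C have the same MRCA as all of L23, which is why a TMRCA computed on the paragroup alone cannot be assigned to a distinct younger ancestor.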
« Last Edit: April 16, 2012, 04:57:55 PM by Mikewww » Logged

Mike Walsh
« Reply #69 on: April 16, 2012, 05:05:36 PM »

Here is what he said very recently in regards to John Chandler’s mutation rates:
Quote
For the record, Chandler's estimates are good only for the 12 marker panel.
For the 25 and 37 marker panel they are grossly incorrect. They do not in
agreement with his own data on the 12 marker haplotypes.
http://archiver.rootsweb.ancestry.com/th/read/GENEALOGY-DNA/2012-04/1334423409

If you are certain Chandler's estimates are grossly incorrect you should go on to Rootsweb and make your case. I know that Leo Little's rates are also used on certain panels, but unfortunately, he is no longer with us.

I didn’t say that, that was Anatole Klyosov who said that, did you even care to check the links I provided?

I apologize; I thought the way you referred to him meant that you agreed with him. Why quote someone unless you are explicit in your agreement or disagreement?  I guess you are just saying there is not harmony in the hobbyist community.
Logged

JeanL
« Reply #70 on: April 16, 2012, 05:59:45 PM »

R-L23xL51 or call it R-L23* if you wish, is a paragroup.

There may be people who are R-L23xL51 in your sample who are more closely related to R-L51 people than to all of the other R-L23xL51 people in the sample. Make sense? We don't know how many subclades there are hidden in R-L23xL51. Their Most Recent Common Ancestor cannot be determined by SNP knowledge, other than to say they have the same Most Recent Common Ancestor as all of R-L23, including all of the L51 (on down to P312, U106) guys.

Ok let’s put it this way: the yet-to-be-discovered SNPs under L23 that aren’t L51 appear to be older in Western Europe than in Eastern Europe. At the same time, Western European variance appears to be slightly younger than the Caucasus one in all instances, and older than Turkey when the most linear markers are used; otherwise, when using the four less linear STRs, Turkey appears to be the oldest.

But here is something interesting, if we assume that L23 was born outside of Europe, and that the clades that entered Europe were either L11, or P312/U106, then one would expect the European L23 to be relatively scarce, and very young; because any L23 in Europe would be newcomers from outside very recently, yet Western Europe has L23 that have a TMRCA as old as Turkey, and almost as old as the Caucasus. So does that mean that L23 was part of the initial wave of colonization, and that L11 was born along the way?

Ok, mind you, everyone, this is all based on the data provided by Myres et al. (2010); it is nowhere near conclusive about the European genetic panorama. I want to make sure everyone understands that I am advancing these observations based on very limited data.
« Last Edit: April 16, 2012, 06:04:55 PM by JeanL » Logged
JeanL
« Reply #71 on: April 16, 2012, 06:01:26 PM »


I apologize; I thought the way you referred to him meant that you agreed with him. Why quote someone unless you are explicit in your agreement or disagreement?  I guess you are just saying there is not harmony in the hobbyist community.


You got it.

Logged
Mike Walsh
« Reply #72 on: April 16, 2012, 06:15:28 PM »

I think it is fine to work with what you have but that doesn't mean you've demonstrated your point well.  Three or ten STRs is just not enough. Of course, you can get erratic results with working with such small data sets....

First off, my point was to show that the relative variance in populations varies as a function of the microsatellite choice. I accept that 3 STRs are rather small, but well if you can find me a scientific paper that uses more than 10-15...

I've actually played with "microsatellite choice" in the past, because of concern about your point.  I ran through the R-L21 file of long haplotypes and tried 12, 25, 37, 67 length haplotypes and, after throwing out the multi-copy and null STRs, I would run variance calculations adding or subtracting an STR or two.   What I found was that the variance relationships between the subclades of L21 were fairly stable once you use more than 15-20 STRs.

Generally, I find very little jostling of the relationships in R1b subclades when you start using 25 or so markers and get up to about 30 haplotypes.

Here is a "test" run for you on R-L21's major subclades based on different sets of markers.

Relative variance with the 49 mixed speed, non-multicopy, non-null STRs from FTDNA's 1st 67:
L21__________:  Var=0.99 (N=2590)
DF21_________:  Var=0.80 (N=116)
L513_________:  Var=0.75 (N=157)
Z253_________:  Var=0.61 (N=145)
M222_________:  Var=0.49 (N=540)
Z255_________:  Var=0.39 (N=102)

Relative variance with the 36 best* linear duration, non-multicopy, non-null STRs from FTDNA's 1st 67:
L21__________:  Var=1.02 (N=2590)
DF21_________:  Var=0.73 (N=116)
L513_________:  Var=0.64 (N=157)
Z253_________:  Var=0.60 (N=145)
M222_________:  Var=0.45 (N=540)
Z255_________:  Var=0.35 (N=102)

Relative variance with the 24 mixed speed, non-multicopy, non-null STRs from FTDNA's 1st 37:
L21_________:  Var=0.95 (N=3234)
DF21________:  Var=0.95 (N=125)   
L513________:  Var=0.73 (N=166)
Z253________:  Var=0.64 (N=170)   
M222________:  Var=0.57 (N=734)
Z255________:  Var=0.45 (N=128)

Relative variance with the 16 best* linear duration, non-multicopy, non-null STRs from FTDNA's 1st 37:
L21_________:  Var=0.95 (N=3234)
DF21________:  Var=0.92 (N=125)
L513________:  Var=0.64 (N=166)
Z253________:  Var=0.63 (N=170)
M222________:  Var=0.54 (N=734)
Z255________:  Var=0.43 (N=128)

* Linear durations greater than 7000 years according to Marko Heinila's analysis.


See how stable the order of the above haplogroups stays?  The percentage differences between the different haplogroups do change depending on the STRs used. I am not trying to say that STR variance is precise. It isn't, but the more data you have, the more you can improve precision.

Generally, what I've found is that the linear 36 STR (most of which are slower) and the 49 STR mixed speed marker calculation runs rarely change the positioning of haplogroups.

Most variance relationships between R1b haplogroups work well at 16 or 24 markers on 37 length haplotypes. M222 did not flip-flop with Z255 for us on the low marker runs above. However, the notable exception is that U198 looks quite old (high variance compared to U106 or Z381) with the 37 length haplotypes.  But if you ratchet up the U198 analysis to 36 or 49 markers on 67 length haplotypes, everything seems to fit back into place (younger than Z381).

I just think it is the law of large numbers at work and the value of having more STR "experiments."
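
The kind of comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: "variance" here is taken to be the per-marker population variance of repeat counts averaged over the chosen markers (the exact formula behind the tables above is not given in the post), and the clade names, marker names and repeat values are invented toy data.

```python
from statistics import pvariance

def mean_str_variance(haplotypes, markers):
    """Average, over the chosen markers, of the variance of repeat counts."""
    per_marker = [pvariance([h[m] for h in haplotypes]) for m in markers]
    return sum(per_marker) / len(markers)

def rank_order(groups, markers):
    """Group names sorted from highest to lowest variance for a marker set."""
    scores = {name: mean_str_variance(haps, markers) for name, haps in groups.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical toy haplotypes (repeat counts per marker), not real data.
GROUPS = {
    "clade_A": [
        {"m1": 12, "m2": 10, "m3": 20, "m4": 15},
        {"m1": 13, "m2": 11, "m3": 22, "m4": 15},
        {"m1": 14, "m2": 10, "m3": 21, "m4": 16},
        {"m1": 13, "m2": 11, "m3": 23, "m4": 15},
    ],
    "clade_B": [
        {"m1": 12, "m2": 10, "m3": 20, "m4": 15},
        {"m1": 12, "m2": 10, "m3": 21, "m4": 17},
        {"m1": 13, "m2": 10, "m3": 20, "m4": 14},
        {"m1": 12, "m2": 11, "m3": 21, "m4": 16},
    ],
}

print(rank_order(GROUPS, ["m1", "m2", "m3", "m4"]))  # ordering with all markers
print(rank_order(GROUPS, ["m4"]))                    # ordering with a single marker
```

With these toy numbers the ordering under a single marker differs from the ordering under all four, which is the small-sample instability being discussed; with more markers, one unusual marker has less influence on the ranking.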
« Last Edit: April 16, 2012, 06:19:31 PM by Mikewww » Logged

Mike Walsh
« Reply #73 on: April 16, 2012, 06:25:48 PM »

R-L23xL51 or call it R-L23* if you wish, is a paragroup.

There may be people who are R-L23xL51 in your sample who are more closely related to R-L51 people than to all of the other R-L23xL51 people in the sample. Make sense? We don't know how many subclades there are hidden in R-L23xL51. Their Most Recent Common Ancestor cannot be determined by SNP knowledge, other than to say they have the same Most Recent Common Ancestor as all of R-L23, including all of the L51 (on down to P312, U106) guys.

Ok let’s put it this way, the yet-to-be-discovered SNPs under L23 that aren’t L51 appear to be older in Western Europe than in Eastern Europe. ...

You can't really say that.  There may be some not yet discovered SNPs under L23* that are older than L51, but we really don't know.  There could just be a well "balanced" distribution of four or five major SNPs (A, B, C, D, & E) under L23* that are each younger than L51, and one or two (A & B perhaps) could actually be more closely related to L51 than to C, D and E.

If some L23* not yet discovered SNPs are older in Western Europe versus Eastern Europe I don't think that means STR diversity is not meaningful, which is the topic of this thread.  

Regardless, I don't think we have enough long Western European R-L23* haplotypes to be very conclusive in comparing with the East, do we?...  well that question belongs on another thread.
« Last Edit: April 16, 2012, 07:12:41 PM by Mikewww » Logged

Mike Walsh
« Reply #74 on: April 16, 2012, 07:21:03 PM »

I'm just adding this reply FYI because of the comments from Vincent Vizachero on Rootsweb related to weighting STRs.

.... I've also tried to weight each STR against its maximum variance so that no STR would have more weight than another. That didn't work out so well. I received some crazy results. I think it goes back to using a calendar to measure hours, and every now and then even the slowest STRs have fairly quick successive mutations. It's like the calendar page turned on that STR when I'm only trying to measure 10 or 12 hours worth of time....

Quote from: Rootsweb question
> Is it reasonable to downweight the GDs for specific markers as a function of their mutation rates (analogous to what is done in Ken's Generations spreadsheets)?

Quote from: Vincent Vizachero
If there was a time-efficient way adjust reweight the markers for each pair of haplotypes, I suppose it might be helpful. But the benefit would be far outweighed by the cost of developing the algorithm and computer code to make it happen, I suspect. Theoretically, yes, this would be an improvement but the magnitude would be small.
http://archiver.rootsweb.ancestry.com/th/read/GENEALOGY-DNA/2012-04/1334616624

Despite the lack of harmony among hobbyist researchers, what Vincent posted on this specific topic is supportive of positions Anatole Klyosov has taken, where he uses the average rate for a set of STRs in his TMRCA calculations rather than applying each rate individually, which would be needed if you wanted to weight STRs.
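
As a rough illustration of the difference between pooling rates and treating each STR individually, here is a small Python sketch. It is not Klyosov's, Nordtvedt's or anyone else's actual method; it assumes the simplest linear model, in which a marker's variance grows in proportion to its mutation rate times the number of generations. The marker names, variances and rates are hypothetical numbers made up for the example.

```python
# Hypothetical per-marker observed variance and assumed mutation rate
# (mutations per generation). None of these values come from a real dataset.
MARKERS = {
    "fast":   {"variance": 0.40, "rate": 0.004},
    "medium": {"variance": 0.20, "rate": 0.002},
    "slow":   {"variance": 0.05, "rate": 0.0002},  # a chance mutation on a slow marker
}

def pooled_generations(markers):
    """Estimate generations from total variance divided by total rate.

    Equivalent to using the average variance and the average rate of the set.
    """
    total_variance = sum(m["variance"] for m in markers.values())
    total_rate = sum(m["rate"] for m in markers.values())
    return total_variance / total_rate

def per_marker_generations(markers):
    """Estimate generations per marker, then average with equal weight."""
    estimates = [m["variance"] / m["rate"] for m in markers.values()]
    return sum(estimates) / len(estimates)

print(pooled_generations(MARKERS))
print(per_marker_generations(MARKERS))
```

In this toy case the equal-weight estimate is pulled upward by the slow marker, because dividing a small chance variance by a very small rate gives a large number. That is one way to picture the "calendar page" effect described above, and why a pooled average rate can behave more stably on modest data.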
« Last Edit: April 16, 2012, 07:22:02 PM by Mikewww » Logged
