World Families Forums - Calculated Variance of R1b-L23(xL51) from academic studies.

JeanL

« on: May 10, 2012, 03:40:52 AM »

So, I ran my mutation calculator country by country on the Myres et al. (2010) dataset, and region by region on the Herrera et al. (2011) dataset, which only includes Armenian samples. I discarded samples with fewer than 5 haplotypes; even so, I think the Pakistan/Tabassaran/Bagvalal results ought to be taken with a grain of salt, since those samples are still very small.

The non-multicopy variance for Myres et al. (2010) excludes DYS389I and DYS389II. For Herrera et al. (2011) it excludes DYS385a and DYS385b along with the DYS389I/DYS389II combo.
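
For anyone who wants to reproduce this kind of figure, here is a minimal sketch of a mean per-locus variance with some markers excluded. It is not the calculator used for the table: the haplotypes are made-up toy values, the function name `mean_variance` is hypothetical, and it assumes the population variance of repeat counts about the mean at each locus, averaged over the loci that are kept. A calculator could just as well measure variance about the modal haplotype.

```python
from statistics import pvariance

def mean_variance(haplotypes, exclude=()):
    """Average the per-locus variance of repeat counts over the loci kept.

    haplotypes: list of dicts mapping marker name -> repeat count.
    exclude:    marker names to leave out (assumed here: the DYS389 pair).
    """
    loci = [m for m in haplotypes[0] if m not in exclude]
    per_locus = {m: pvariance([h[m] for h in haplotypes]) for m in loci}
    return sum(per_locus.values()) / len(per_locus), per_locus

# Toy haplotypes, invented for illustration only.
toy = [
    {"DYS19": 14, "DYS389I": 13, "DYS389II": 29, "DYS390": 24, "DYS391": 11},
    {"DYS19": 14, "DYS389I": 13, "DYS389II": 30, "DYS390": 23, "DYS391": 11},
    {"DYS19": 15, "DYS389I": 12, "DYS389II": 28, "DYS390": 24, "DYS391": 10},
    {"DYS19": 14, "DYS389I": 13, "DYS389II": 29, "DYS390": 25, "DYS391": 11},
]

all_loci, _ = mean_variance(toy)
filtered, per_locus = mean_variance(toy, exclude=("DYS389I", "DYS389II"))
print(f"all loci: {all_loci:.4f}  without DYS389I/II: {filtered:.4f}")
```

Dropping a marker changes both the sum and the number of loci in the average, so figures computed over different marker sets are not directly comparable.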

Something very weird is that the modal value for DYS389II is 28-29 in the Armenian samples from Herrera et al. (2011), whereas the value observed for DYS389II in the sole Armenian sample from Myres et al. (2010) Table S3 is 15. Maybe I missed something, but that is quite a difference.

Thanks to Jdean for pointing it out. See below for the new table; it now makes sense, since DYS389 was being expressed as the total number of repeats. Sorry, I was working on this very late yesterday, so I did not catch that. I also fixed a minor computing bug, which was causing the variance to be slightly off.
For more information, refer to Table S2 from Herrera et al. (2011) and Table S3 from Myres et al. (2010).


In any case, the non-multicopy variance excludes those STRs so those anomalies should not affect the non-multicopy variance.



http://i1133.photobucket.com/albums/m582/jeanlohizun/Myresetal2010Herreraetal2010L23xL51data.jpg

See post below for an explanation of the table.
« Last Edit: May 10, 2012, 11:17:45 AM by JeanL »
Jdean

« Reply #1 on: May 10, 2012, 05:04:01 AM »

So, I ran my mutation calculator on a country by country basis for the Myres.et.al.2010 dataset, and on a regional basis for the Herrera.et.al.2011 dataset which only includes Armenian samples. I discarded samples with sizes smaller than 5, which is why I still think that the Pakistan results ought to be taken with a grain of salt since it is 5 haplotypes. The non-multicopy variance in the case of Myres et al.2010 excludes DYS389I and DYS389II. The non-multicopy in the case of Herrera et al.2011 excludes DYS385a and DYS385b along with the DYS389I/DYS389II combo.

Something very weird is that the modal for DYS389II=28-29 in the Armenian samples from Herrera et al.2011, however the value observed for DYS389II in the sole Armenian sample from Myres et al(2010) Table-S3 is DYS389II=15. Maybe I missed something but that is quite some difference there.

For more information refer to Table-S2 from Herrera.et.al.2011 and Table-S3 from Myres.et.al.2010.

In any case, the non-multicopy variance excludes those STRs so those anomalies should not affect the non-multicopy variance.



http://i1133.photobucket.com/albums/m582/jeanlohizun/MyresHerreraL23xL51data.jpg


389 isn't a multi-copy marker but rather two STRs next to each other.

Some companies report 389II as the total of the first and second STR (FTDNA for instance), whilst others count each STR independently.

If you look at Mike's spreadsheets you'll see he separates the values out by subtracting 389I from 389II, which is necessary for variance calculations since a mutation in 389I will also get reported in 389II.

Edit: Ken handles this differently in his interclade spreadsheet; he only looks at 389II and uses a combined mutation rate. I think the reason for this is that he doesn't want to go through his I1 data (which is a little large) and change these values.

I changed my copy of his spreadsheets to allow both values to be used.
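
A minimal sketch of the subtraction described above, for data where 389II is reported as the total of both segments. The function name `split_dys389`, the label `DYS389II-I` for the second segment and the toy values are all made up for illustration; this is not taken from Mike's or Ken's spreadsheets.

```python
from statistics import pvariance

def split_dys389(haplotype):
    """Return a copy in which the second DYS389 segment stands on its own.

    Assumes DYS389II in the input is the combined count (first + second
    segment), as in FTDNA-style reporting.
    """
    out = dict(haplotype)
    out["DYS389II-I"] = out.pop("DYS389II") - out["DYS389I"]
    return out

# Two toy haplotypes that differ by a single one-step mutation in 389I.
pair = [
    {"DYS389I": 13, "DYS389II": 29},
    {"DYS389I": 14, "DYS389II": 30},
]

# With combined reporting the same mutation shows up at both markers.
raw_var_389ii = pvariance([h["DYS389II"] for h in pair])
# After the subtraction the second segment shows no variation at all.
split_var = pvariance([split_dys389(h)["DYS389II-I"] for h in pair])
print(raw_var_389ii, split_var)
```

Without the subtraction, that one mutation would be counted at both DYS389I and DYS389II and would inflate the summed variance.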
« Last Edit: May 10, 2012, 05:43:28 AM by Jdean »

Y-DNA R-DF49*
MtDNA J1c2e
Kit No. 117897
Ysearch 3BMC9

JeanL

« Reply #2 on: May 10, 2012, 11:12:45 AM »

Here is the new Table I made:

  • In the Myres et al. (2010) section, the 10-STR data include all STRs used by Myres et al. (2010); the 8-STR variance is the variance over the STRs that the Myres et al. (2010) and Herrera et al. (2011) datasets have in common. Those 8 STRs are: DYS19, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS439.
  • In the Herrera et al. (2011) section, the 15-STR data exclude DYS385a and DYS385b and include all other STRs used by Herrera et al. (2011). As previously mentioned, the 8-STR variance was calculated for the purpose of comparing the two datasets.
  • Lastly, both datasets are combined and sorted from highest to lowest variance over the 8 STRs they have in common. It is worth noting that while the combined Armenian sample has the greatest variance, this is mostly due to Sasun (n=16), which, as the regional samples show, has a far greater variance than all the others.
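
The comparison in the bullets above can be sketched roughly as follows: find the markers both studies typed, drop populations below a minimum sample size, and sort by mean variance over the shared markers. Everything here is illustrative. The population names, marker lists and repeat counts are toy values, the function names are hypothetical, and the variance is assumed to be the population variance about the mean at each locus.

```python
from statistics import pvariance

def haps(markers, rows):
    """Build haplotype dicts from rows of repeat counts."""
    return [dict(zip(markers, row)) for row in rows]

def common_markers(*datasets):
    """Markers typed in every dataset, in the order of the first dataset."""
    shared = set.intersection(*(set(d["markers"]) for d in datasets))
    return [m for m in datasets[0]["markers"] if m in shared]

def mean_variance(haplotypes, markers):
    return sum(pvariance([h[m] for h in haplotypes]) for m in markers) / len(markers)

def rank_populations(datasets, min_n=5):
    """Return (population, n, variance) rows, highest variance first."""
    markers = common_markers(*datasets)
    rows = [
        (pop, len(hs), mean_variance(hs, markers))
        for d in datasets
        for pop, hs in d["populations"].items()
        if len(hs) >= min_n  # discard samples smaller than min_n
    ]
    return sorted(rows, key=lambda r: r[2], reverse=True)

# Toy data only; these are not values from either study.
m_a = ["DYS19", "DYS390", "DYS391"]
m_b = ["DYS19", "DYS390", "DYS385a"]
study_a = {"markers": m_a, "populations": {
    "PopA": haps(m_a, [(14, 24, 11), (14, 24, 11), (14, 23, 10), (15, 24, 11), (14, 25, 11)]),
    "PopTiny": haps(m_a, [(14, 24, 11), (15, 23, 10)]),
}}
study_b = {"markers": m_b, "populations": {
    "PopB": haps(m_b, [(14, 24, 11), (14, 24, 11), (14, 24, 12), (14, 24, 11), (14, 23, 11)]),
}}

shared = common_markers(study_a, study_b)
ranked = rank_populations([study_a, study_b])
for pop, n, var in ranked:
    print(f"{pop:8s} n={n:2d} variance={var:.3f}")
```

Restricting both datasets to the same markers is what makes the ranking meaningful, since each population's average is then taken over identical loci.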



http://i1133.photobucket.com/albums/m582/jeanlohizun/Myresetal2010Herreraetal2010L23xL51data.jpg


« Last Edit: May 10, 2012, 11:19:29 AM by JeanL »
alan trowel hands.

« Reply #3 on: May 10, 2012, 02:12:51 PM »

Well, it does seem that Turkey, Armenia and Romania keep popping up as high-variance L23*.  Pity the sample from Romania and adjacent areas is not larger.  It seems to me that L23* first moved around somewhere in one of these areas, in whatever direction.  If the IE element among the Anatolians and later the Armenians did enter from SE Europe, then Romania would not be too far from where you would expect it to have come from.  I can't really see past L23* having been the IE element, given the lack of any other realistic options in that area.  Could L23* be the early split-off from IE represented by Anatolian?
JeanL

« Reply #4 on: May 10, 2012, 03:00:08 PM »

    Pity the sample from Romania and adjacent areas is not larger.

    Yeah, it is a pity, but the thing is that the study (Myres et al.) wasn't focused on R1b-L23(xL51) but on R1b-M269+ in general, and not just on Romania but on all of Europe plus adjacent areas. So from the authors' point of view the Romanian sample size of 330 was probably good enough. The Armenian study focused just on Armenian populations and got much better resolution. Hopefully, in the near future we shall see more studies focusing on populations with more SNP resolution and more STRs. In any case, there is a definite need for more sampling with a higher number of STRs.
    Mike Walsh
    « Reply #5 on: May 10, 2012, 03:08:58 PM »

    Thanks for doing this, JeanL.


    What do you think about the high variance in the Ararat Valley?  If I read that correctly, it has a sample of 36 haplotypes with 15 STRs. That is not a bad data set.

    Yes, it would be great to have more data from Romania. That could be a critical area, and it does look like the variance is decent there.

    Over in the Assyrian Heritage thread, Palamede is supposing that the R-L23* in Anatolia came from the Balkans.  I'm not sure what the full reasoning is.
    « Last Edit: May 10, 2012, 03:17:58 PM by Mikewww »

    R1b-L21>L513(DF1)>S6365>L705.2(&CTS11744,CTS6621)
    alan trowel hands.

    « Reply #6 on: May 10, 2012, 04:10:11 PM »

    The Hittite empire?  I think it included at least the northern half of Syria at one point.
    alan trowel hands.

    « Reply #7 on: May 10, 2012, 04:12:11 PM »

    In fact, most of the early Assyrian empire, if not all of it, was also once part of the Hittite empire.
    razyn

    « Reply #8 on: May 10, 2012, 04:29:20 PM »


    Yes, it would be great to have more data from Romania. That could be a critical area, and it does look like the variance is decent there.

    Looks like another job for Alexandromir, if he finds us.  Humanist has sent him a link to the other recent thread, "R1b-L51 from the West."  If he shows up there, remind him to read this one too.  I think he expects to be in Romania collecting DNA samples in the next month or two.

    R1b Z196*
    palamede

    « Reply #9 on: May 11, 2012, 02:02:38 PM »

    I discarded samples with sizes smaller than 5, which is why I still think that the Pakistan/Tabassaran/Bagvalals results ought to be taken with a grain of salt since they are so small.

    I am surprised. According to http://en.wikipedia.org/wiki/Y-DNA_haplogroups_by_populations_of_the_Caucasus (with references):

    Tabassarans (Lezgic linguistic group) R1b=17/43 39.5%
    Lezgis 5/31 16.1%

    Bagvalals (Andic linguistic group) 19/28 67.9%
    Andis 17/115 14.8%

    Ossetes-Digor (Iranian family) 21/127 16.5%
    Abkhaz (NW Caucasian linguistic family) 6/58 12.1%
    Balkars (Turkic family) 5/38 13.2%
    Kumyks (Turkic family) 15/76 19.7%

    Georgians 6/66 9.1%
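
As a quick arithmetic check on the list above, here is a small sketch that recomputes each percentage from its count. The counts and percentages are copied from the list; the function name `mismatches` and the rounding tolerance are illustrative choices. On these figures every row reproduces except the Abkhaz one, where 6/58 works out to 10.3% rather than 12.1%, so either the count or the percentage appears to have been miscopied at some point.

```python
# (population, R1b count, sample size, percentage as listed)
rows = [
    ("Tabassarans", 17, 43, 39.5),
    ("Lezgis", 5, 31, 16.1),
    ("Bagvalals", 19, 28, 67.9),
    ("Andis", 17, 115, 14.8),
    ("Ossetes-Digor", 21, 127, 16.5),
    ("Abkhaz", 6, 58, 12.1),
    ("Balkars", 5, 38, 13.2),
    ("Kumyks", 15, 76, 19.7),
    ("Georgians", 6, 66, 9.1),
]

def mismatches(rows, tol=0.05):
    """Rows whose listed percentage differs from count/size by more than tol."""
    return [
        (name, round(100 * k / n, 1), listed)
        for name, k, n, listed in rows
        if abs(100 * k / n - listed) > tol
    ]

bad = mismatches(rows)
for name, computed, listed in bad:
    print(f"{name}: computed {computed}%, listed {listed}%")
```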

    Addition: I see now that you were speaking of the Myres samples and the calculated variance.

    Although I have limited confidence in the variance figures, because it is difficult to account for everything about the populations and how they formed, I note the low variance of R1b-L23* among the Bashkirs of the Urals, which could be of either Tripolye or Caucasian origin. Near the Bashkirs there are Kazakh tribes whose names suggest a fairly recent Caucasian origin (Tcherkesses), and the Argyn group of tribes (more than half a million members) with 70% G1, which points to an old origin on the Iranian Plateau (before the Aryans). There were population movements not documented by history, in the reverse direction of what we are used to thinking.
    « Last Edit: May 11, 2012, 02:24:44 PM by palamede »

    Y=G2a3b1a2-L497 Wallony-Charleroi; Mt=H2a2a1 Normandy-Bray
    Dodecad-DiY: E Eur 9,25% W Eur 48,48% Med 28,46% W Asia 11,70%
    World9: Atl-Balt 67,61% Southern 13,23% Cauc-Gedr 12,73%
    K12a: North-E 39,71% Med 37,9% Cauc 12,55% Gedr 5,78% SW Asia 2,13%