World Families Forums - TMRCA calculations

Author Topic: TMRCA calculations  (Read 8449 times)
stoneman
Old Hand
****
Offline Offline

Posts: 141


« Reply #150 on: July 28, 2012, 02:44:10 PM »

I am 100% sure that the SNPs aren't private.



They may be personal SNPs.  I am not an expert in SNP nomenclature; D. Reynolds et al. may be better able to comment.  My comment re: STRs is that I have some very unusual allele values for several of my DYS loci.  Who knows has a thread on U106 right now that may be of interest to you?
Logged
spanjool
Member
**
Offline Offline

Posts: 38


« Reply #151 on: September 10, 2012, 05:06:38 AM »

I compared mean pairwise mismatches (pwmm) for DF27xL176.2 and its subclades with some MRCA calculations. The third column gives the mean pairwise mismatches calculated with the Arlequin program; the fourth column gives the MRCA based on Nordtvedt’s calculator.
#   SNP    Mean pwmm   MRCA
1   DF27   20.7        3.7k
2   Z196   18.8        3.5k
3   Z209   18          2.7k
4   Z220   17.6        4.0k
5   Z278   17          2.1k
6   M153   15          2.4k
Plotting the numeric SNP order against the mean pwmm produces a third-degree polynomial; the time intervals between the SNPs, as well as the rise in the effective population, will cause a deviation from linearity.
Producing a similar plot with the MRCAs gives a fifth-degree polynomial, indicating that inherent assumptions in the MRCA calculations are inconsistent.
The pattern changes if one uses the Nordtvedt method to calculate a SNP together with its downstream SNPs, but the polynomial is still of the 6th degree!
Clade                          MRCA
DF27xL176.2 and subclades      3.9k
Z196 and subclades             3.5k
Z209 and subclades             3.2k
Z220 and subclades             3.2k
Z278 and subclades             2.4k
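
As a rough cross-check on the first table above, one can ask how well a plain straight line relates the mean pwmm column to the Nordtvedt MRCA column. The sketch below is only an illustration: it uses a simple least-squares R2, not the polynomial fits described in the post, and the function name r_squared is made up for this example.

```python
# Values copied from the first table in this post (SNPs without their subclades).
pwmm = [20.7, 18.8, 18.0, 17.6, 17.0, 15.0]   # mean pairwise mismatches
mrca_k = [3.7, 3.5, 2.7, 4.0, 2.1, 2.4]       # Nordtvedt-based MRCA, in thousands of years


def r_squared(xs, ys):
    """Squared Pearson correlation, i.e. R2 of a straight-line least-squares fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


print(round(r_squared(pwmm, mrca_k), 2))
```

A value well below 1 here would simply mean that, for SNPs taken without their subclades, the two measures do not line up on a straight line.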
Further, I would like to present the FST differences between the subpopulations of the DF27 metapopulation; FST values represent distances between subpopulations and are widely used in population genetics:
FST differences between DF27 and:
Z196      0.091
Z209      0.097
L176.2    0.014
SRY2627   0.048
L165      0.141

It looks as if the mean pairwise mismatches and the FST calculations are more reliable for dating subpopulations associated with a defining SNP. The Nordtvedt-based MRCA gives an acceptable age for a clade when its subclades are included.
When I plot the mean pairwise mismatch of DF27xL176.2 and its subclades (each time including the subclades) against the associated MRCA, I get a straight line.

The formula is MRCA = 1000 x (0.23 x mean pairwise mismatch - 0.65) ybp (R2 = 0.9718).

The Z220 subpopulation then becomes 3.4K and not 4.0K, and the DF27 subpopulation 4.1K. The L484 subpopulation is aged 1.3K. A small cluster inside a phylogenetic tree, including Bob Bjorkman, Nik Okkels and myself, has an MRCA of 0.9K.
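
To make the arithmetic explicit, here is a minimal sketch that applies the linear formula to the mean pairwise mismatch values from the first table. The function name mrca_from_pwmm is hypothetical; the coefficients 0.23 and 0.65 are the ones quoted in the formula.

```python
def mrca_from_pwmm(mean_pwmm):
    """Age estimate in years before present from a mean pairwise mismatch value."""
    return 1000 * (0.23 * mean_pwmm - 0.65)


# Mean pairwise mismatches taken from the first table in this post.
for snp, mean_pwmm in [("DF27", 20.7), ("Z220", 17.6)]:
    age_k = round(mrca_from_pwmm(mean_pwmm) / 1000, 1)
    print(snp, age_k, "K")
# DF27 4.1 K
# Z220 3.4 K
```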

I have no expertise in the field of the Klyosov or Dieneke calculations. Maybe there is someone who could provide them for the data used here, which came from Mike Walsh's spreadsheet.

May I end with a quote from a recent article in PLoS by Rocca et al.:
The paucity of haplogroup defining genetic markers has meant that these microsatellite-derived dating calculations have to be conducted without regard to lower level phylogenetic relationships, and therefore erroneously compare populations that may be phylogenetically distant. By identifying the lower level branches of the R1b1a2 phylogenetic tree, more accurate dating of truly related haplogroups will be possible.

Hans

PS: My Word file looks slightly distorted here.
« Last Edit: September 10, 2012, 05:19:36 AM by spanjool » Logged

R1b-Z220*
razyn
Old Hand
****
Offline Offline

Posts: 406


« Reply #152 on: September 10, 2012, 08:40:57 AM »

There is a lot to ponder here, and I'd like to know better what I'm pondering.

I think the fourth column in the first little table would be TMRCA values by using the Nordtvedt (Gen7?) calculator AS IS, and that that doesn't include either pairwise mismatches or FST values.  Is that right?  Then the "formula" part, in boldface, I totally don't understand.  But, whatever, I get that the corrected (or compensated) values for the TMRCA of several SNPs (happily, including my own L484) have been obtained by plugging in the values found (somehow) in this formula... right?  There is just too much missing for me to understand what is visible.

It would be useful to calculate the TMRCA for the same set of SNPs all three ways (or four, or five ways, if e.g. Klyosov's or other techniques are applied).  My vote would be for a longer set (than the six in your first table).  Since you are getting your data (for the pairwise mismatches) from Mike's spreadsheet, IMO his tables of descent would be a good place to start, for picking a useful, longer set of SNPs.  His tables are anyway much more legible than the ISOGG tree, or Thomas Krahn's Draft Tree.

Also, I continue to suspect that the perceived difference between Z209 and Z220 is imaginary, and that only one should be used (preferably Z220) -- unless and until someone tests both SNPs and is negative for one of them.  This is especially problematic in the first table, wherein "the MRCA based on Nordtvedt's calculator" for Z220 differs from Z209 by 1,300 years (and Z220 is older than its gr-gr-grandparent DF27).  But even disregarding the absurd result, they show a mean pairwise mismatch difference of .4, and I'm not convinced that is real.  I think it's an artifact of testing patterns, since people started skipping the Z209 test.
Logged

R1b Z196*
spanjool
Member
**
Offline Offline

Posts: 38


« Reply #153 on: September 14, 2012, 12:37:12 AM »

I tried to weed out the anomalies that occur if one uses Nordtvedt's MRCA calculation (or any calculator based upon this system) on a subpopulation in a clade identified by a SNP without also including its subclades.
I found that such an anomaly did not occur in the mean pairwise mismatches of the same subpopulations.
Next I gauged the mean pairwise mismatches of a subpopulation identified by a SNP against the Nordtvedt calculation for the same SNP, in both instances also including its subclades.
This gave a straight line when the two were plotted against each other.
The formula is derived from this line, which has a high R2 (indicating its correctness).
It is now possible to calculate a correct age estimate indirectly by using the mean pairwise mismatches and this formula.
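
The anomaly described above can be checked mechanically: a downstream SNP should not come out older than the SNP directly upstream of it. The sketch below assumes that the order of the first table (DF27, Z196, Z209, Z220, Z278, M153) is a parent-to-child chain, which is an assumption made for illustration, and the helper name inversions is made up.

```python
# Assumed parent-to-child order, copied from the first table of the earlier post.
chain = ["DF27", "Z196", "Z209", "Z220", "Z278", "M153"]
mrca_k = [3.7, 3.5, 2.7, 4.0, 2.1, 2.4]       # Nordtvedt-based MRCA, thousands of years
pwmm = [20.7, 18.8, 18.0, 17.6, 17.0, 15.0]   # mean pairwise mismatches


def inversions(names, values):
    """Return (parent, child) pairs where the child's value exceeds the parent's."""
    return [
        (names[i], names[i + 1])
        for i in range(len(values) - 1)
        if values[i + 1] > values[i]
    ]


print(inversions(chain, mrca_k))  # [('Z209', 'Z220'), ('Z278', 'M153')]
print(inversions(chain, pwmm))    # []
```

Under that assumed ordering, the MRCA column shows two inversions while the mean pairwise mismatch column shows none.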
Logged

R1b-Z220*
spanjool
Member
**
Offline Offline

Posts: 38


« Reply #154 on: September 14, 2012, 12:39:59 AM »

Regarding the assumption that Z209 and Z220 should be pooled together:
When combining them I still find an anomalous MRCA with the Nordtvedt calculator and not with the mean pairwise mismatches.
Also observe that the MRCAs of Z209 and Z220 become normal again when calculated with their subclades.

Logged

R1b-Z220*
razyn
Old Hand
****
Offline Offline

Posts: 406


« Reply #155 on: September 14, 2012, 01:26:56 AM »

I responded in a lot more detail on the Yahoo R-P312 group, where you posted the same message Monday morning.  But to simplify matters -- I got your results by using this version of your formula.  Let M=the mean pairwise mismatch (for these DF27 SNPs, your column 3 in the first table of that post).  Then the formula is (0.23M - 0.65) = time to most recent common ancestor, expressed in Kybp.

And about those other numbers, R2=0.9718 (usually written r2=0.9718) is not part of the formula itself, but the coefficient of determination (the squared correlation coefficient), indicating how closely the straight line fits the data.  I think.

Now, if I just knew how to generate those M numbers myself...
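
For what it's worth, a mean pairwise mismatch can be sketched in a few lines. This is a simplified illustration, not Arlequin's implementation: it assumes the distance between two haplotypes is just the number of STR loci at which they differ, and the sample haplotypes and the function name are invented.

```python
from itertools import combinations


def mean_pairwise_mismatch(haplotypes):
    """Average number of differing loci over all pairs of haplotypes."""
    if len(haplotypes) < 2:
        raise ValueError("need at least two haplotypes")
    pairs = list(combinations(haplotypes, 2))
    total = sum(
        sum(a != b for a, b in zip(h1, h2))  # loci at which this pair differs
        for h1, h2 in pairs
    )
    return total / len(pairs)


# Three invented four-marker haplotypes (repeat counts per locus).
sample = [
    [13, 24, 14, 11],
    [13, 24, 14, 10],
    [13, 25, 15, 11],
]
print(mean_pairwise_mismatch(sample))  # 2.0
```

Other distance definitions (for example, counting the size of each repeat difference rather than just whether a locus differs) would give different M values, so the numbers from this sketch are not directly comparable with the table unless the same definition is used.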
Logged

R1b Z196*