World Families Forums - TMRCA calculations

Author Topic: TMRCA calculations  (Read 9119 times)
JeanL
Old Hand
****
Offline Offline

Posts: 425


« Reply #75 on: May 08, 2012, 01:15:07 PM »


At ftdna.com/public, there are more than 60,000 unique haplotypes at 37 level, and almost 35,000 with 67. The question is just what happens if more data is used with the same estimation method that produced the estimates in the 111T calculator.  This spreadsheet contains a collection of mutation rate estimates based on various datasets.
-Marko H.



What is the oldest TMRCA (i.e. a clan with a known TMRCA, where everyone is presumed to be equally removed from the MRCA) in the subsets of 37 STRs or 67 STRs from which mutation rates have been calibrated?
Logged
MarkoH
Member
**
Offline Offline

Posts: 20


« Reply #76 on: May 08, 2012, 01:49:20 PM »

What is the oldest TMRCA(i.e. Clan with known TMRCA, where everyone presumed to be equally removed from the MRCA) in the subsets of 37 STRs, or the 67 STRs from which mutation rates have been calibrated?

The rate calculation uses haplotypes to get an idea about the relative mutation rates; that is, what fraction of total mutations is expected to happen in each specific STR locus.

Given that, the calculation sets the sum of the estimated mutation rates so that the expected number of mutations in the transmissions covered by the YHRD dataset equals the number of mutations actually observed in the YHRD father/son transmission dataset. (This is sometimes called "calibration"; it yields absolute per-generation mutation rate estimates.)

This does not directly consider deep genealogical TMRCAs. (For rate estimation, the clusters of close haplotypes from surname projects are especially useful.) However, I would expect that most of the well-known deep genealogies are covered by the dataset, since it includes data from many projects.
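
The calibration step described above can be sketched roughly as follows. This is only an illustration of the scaling idea, not the actual calculation behind the 111T calculator; the function name, the locus list and all numbers are hypothetical.

```python
def calibrate_rates(relative_rates, transmissions_per_locus, observed_mutations):
    """Scale relative rates so the expected mutation count matches the observed one.

    relative_rates: dict locus -> relative rate (any positive scale)
    transmissions_per_locus: dict locus -> father/son transmissions typed at that locus
    observed_mutations: total mutations observed across those transmissions
    """
    # Expected mutations if the relative rates were taken as absolute rates.
    expected_unscaled = sum(
        relative_rates[locus] * transmissions_per_locus[locus]
        for locus in relative_rates
    )
    scale = observed_mutations / expected_unscaled
    return {locus: rate * scale for locus, rate in relative_rates.items()}


# Hypothetical inputs, for illustration only.
relative = {"DYS393": 1.0, "DYS390": 3.0, "DYS19": 2.0}
transmissions = {"DYS393": 10000, "DYS390": 10000, "DYS19": 10000}
absolute = calibrate_rates(relative, transmissions, observed_mutations=60)
print(absolute)
```

The relative rates fix the shape (which loci are fast or slow) and the father/son data fix only the overall level.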

Logged
JeanL
Old Hand
****
Offline Offline

Posts: 425


« Reply #77 on: May 08, 2012, 03:12:05 PM »


The rate calculation uses haplotypes to get an idea about the relative mutation rates; that is, what fraction of total mutations is expected to happen in each specific STR locus.

Well, in order to be able to count the mutations, they have to be based off a modal allele value. Therefore this modal value has a great effect on the mutation rates. How do you deal with that?


Given that, the calculation sets the sum of estimated mutation rates so that the expected number of mutations in the transmissions covered by the YHRD dataset equals to the actually observed number of mutations in the YHRD father/son transmission dataset. (This is sometimes called "calibration" to get absolute per generation mutation rate estimates.)

This does not directly consider deep genealogical TMRCA's.  (For rate estimation, the clusters of close haplotypes from surname projects are especially useful.)  However, I would expect that probably most of the well-known deep genealogies are covered by the dataset since it covers data from many projects.

I understand what you are saying about father/son transmission rates. However, what I was referring to was the process where X mutations are observed in locus Y, and everyone sampled at locus Y is presumed to descend from ancestor A, who lived XX generations ago. I have seen that process done quite often by people like Anatole Klyosov. I was wondering what is the deepest MRCA that has been analyzed on a known dataset to calibrate mutation rates?


Logged
ironroad41
Old Hand
****
Offline Offline

Posts: 219


« Reply #78 on: May 08, 2012, 04:56:37 PM »


At ftdna.com/public, there are more than 60,000 unique haplotypes at 37 level, and almost 35,000 with 67. The question is just what happens if more data is used with the same estimation method that produced the estimates in the 111T calculator.  This spreadsheet contains a collection of mutation rate estimates based on various datasets.
-Marko H.



What is the oldest TMRCA(i.e. Clan with known TMRCA, where everyone presumed to be equally removed from the MRCA) in the subsets of 37 STRs, or the 67 STRs from which mutation rates have been calibrated?

In Clan Gregor everyone is not equally removed. That's how you get a range in the number of mutations. In Clan Gregor the modal is found by majority vote. Then mutations from the modal can be estimated for each entry. The oldest entries that I've seen are in Clan Donald (R1b) and the Campbells. The Campbells have several unique, interesting mutations, e.g. 19 and 20 at 458.

Clan Gregor is different in that it starts with the 11 to 10 mutation at 391. By definition this excludes anyone with an 11, which is the normal value for Dal Riadic Scots. (I can imagine a person with an 11 who might be considered if the rest of his haplotype was very close to the modal?)

As always, trying to establish mutation rates from datasets is very difficult, if not impossible, for the slower mutators, since they don't mutate often enough to provide enough data.

The faster mutators can also be difficult to estimate: it is hard to count the number of mutations that have occurred at CDYa,b after a few hundred years.
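
For anyone who wants to try the "modal by majority vote, then count mutations from the modal" step, here is a minimal Python sketch. The haplotypes are made up (four loci only), and ties at a locus are broken by whichever value is seen first, which is an assumption on my part.

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """Most common allele at each locus (majority vote)."""
    n_loci = len(haplotypes[0])
    return [
        Counter(h[i] for h in haplotypes).most_common(1)[0][0]
        for i in range(n_loci)
    ]


def mutations_from_modal(haplotype, modal):
    """Number of loci where the entry differs from the modal value."""
    return sum(1 for allele, m in zip(haplotype, modal) if allele != m)


# Hypothetical entries, four loci each.
haplotypes = [
    [13, 24, 14, 10],
    [13, 24, 14, 10],
    [13, 25, 14, 10],
    [13, 24, 15, 10],
    [12, 24, 14, 10],
]
modal = modal_haplotype(haplotypes)
counts = [mutations_from_modal(h, modal) for h in haplotypes]
print(modal, counts)
```

This counts each differing locus once, so a multistep change or a back mutation is invisible to it, which is the same counting problem mentioned above for fast markers such as CDYa,b.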
« Last Edit: May 08, 2012, 05:05:54 PM by ironroad41 » Logged
Mark Jost
Old Hand
****
Offline Offline

Posts: 707


« Reply #79 on: May 08, 2012, 08:32:24 PM »

Quote

I copy the data into excel as text and then use the replace function in notepad to change the hyphens to tabs, normally works quite well.
You're right. That would be a little easier on small groups. But ColinF sent me Michael Hébert's macro spreadsheet. I tested it and it worked most excellently.

MJost
Logged

148326
Pos: Z245 L459 L21 DF13**
Neg: DF23 L513 L96 L144 Z255 Z253 DF21 DF41 (Z254 P66 P314.2 M37 M222  L563 L526 L226 L195 L193 L192.1 L159.2 L130 DF63 DF5 DF49)
WTYNeg: L555 L371 (L9/L10 L370 L302/L319.1 L554 L564 L577 P69 L626 L627 L643 L679)
Mark Jost
Old Hand
****
Offline Offline

Posts: 707


« Reply #80 on: May 08, 2012, 08:39:18 PM »

Welcome Marko. Thanks for posting as I am sure your knowledge will be valued here.

MJost
Logged

148326
Pos: Z245 L459 L21 DF13**
Neg: DF23 L513 L96 L144 Z255 Z253 DF21 DF41 (Z254 P66 P314.2 M37 M222  L563 L526 L226 L195 L193 L192.1 L159.2 L130 DF63 DF5 DF49)
WTYNeg: L555 L371 (L9/L10 L370 L302/L319.1 L554 L564 L577 P69 L626 L627 L643 L679)
MarkoH
Member
**
Offline Offline

Posts: 20


« Reply #81 on: May 08, 2012, 09:01:16 PM »

Well, in order to be able to count the mutations, they have to be based off a modal allele value. Therefore this modal value has a great effect on the mutation rates. How do you deal with that?

The calculation considers close haplotype pairs. These are then modeled as independent random draws from a theoretical distribution. This would not work with longer haplotype distances, since the haplotype pairs could no longer be considered independently drawn.

The benefit of this is that the use of relatively small genealogical datasets is avoided. Estimation of trees, on the other hand, is much harder than estimation of the mutation rates. (Mutation rates can actually be used to verify large phylogenies rather than the other way around.)

The repeat number that changes is probably most commonly the modal value, and the estimated mutation rate might typically reflect that in the case of a very dominant modal value.
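
To make the "close haplotype pairs" idea concrete, here is a rough Python sketch. It only selects pairs under a distance cutoff and tallies which loci differ; the real estimator is a statistical model, and the cutoff, the distance measure and the sample haplotypes here are all assumptions for illustration.

```python
from itertools import combinations


def genetic_distance(h1, h2):
    """Sum of absolute repeat differences across loci."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def close_pairs(haplotypes, max_distance):
    """Index pairs whose genetic distance is at most max_distance."""
    return [
        (i, j)
        for (i, h1), (j, h2) in combinations(enumerate(haplotypes), 2)
        if genetic_distance(h1, h2) <= max_distance
    ]


def locus_difference_fractions(haplotypes, pairs):
    """Fraction of all observed differences that fall on each locus."""
    n_loci = len(haplotypes[0])
    diffs = [0] * n_loci
    for i, j in pairs:
        for k in range(n_loci):
            if haplotypes[i][k] != haplotypes[j][k]:
                diffs[k] += 1
    total = sum(diffs)
    return [d / total if total else 0.0 for d in diffs]


# Hypothetical haplotypes; the last one is far from the others.
haplotypes = [
    [13, 24, 14, 10],
    [13, 25, 14, 10],
    [13, 24, 14, 11],
    [15, 22, 16, 12],
]
pairs = close_pairs(haplotypes, max_distance=2)
fractions = locus_difference_fractions(haplotypes, pairs)
print(pairs, fractions)
```

The distant haplotype drops out of every pair, so only the small, recent differences feed the relative rates.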

Quote
I was wondering what is the deepest MRCA that has been analyzed on a known dataset to calibrate mutation rates?

Perhaps the split of certain Scottish vs. Scandinavian R1a1 related to Vikings. Or perhaps Vladimir Monomakh case in N1.

« Last Edit: May 08, 2012, 09:15:54 PM by MarkoH » Logged
JeanL
Old Hand
****
Offline Offline

Posts: 425


« Reply #82 on: May 08, 2012, 09:52:14 PM »


Calculation considers close haplotype pairs. These are then modeled as independent random draws from a theoretical distribution.   This would not work with longer haplotype distances, since haplotype pairs would not be considered independently drawn any longer.   

The benefit of this is that the use of relatively small genealogical datasets is avoided. Estimation of trees, on the other hand, is much harder than estimation of the mutation rates. (Mutation rates can actually be used to verify large phylogenies rather than other way around.)

The repeat number that changes is probably most commonly the modal value, and the estimated mutation rate might typically reflect that in the case of a very dominant modal value.

So let’s see if I understand you correctly:

When you say close haplotype pairs, do you mean a set of haplotypes that differ by 1 mutation at most in each locus? I gotta say, it sounds like an interesting idea. So you model the probability of getting something else other than the presumed modal value as the mutation rate? So how many times do you do the random sampling, and how wide are the confidence intervals in the mutation rates relative to the mean mutation rate values you get for each locus?

Also, how do you handle variability in the mutation rate that is beyond the ±1 mutations off modal?
« Last Edit: May 08, 2012, 10:06:22 PM by JeanL » Logged
spanjool
Member
**
Offline Offline

Posts: 38


« Reply #83 on: May 09, 2012, 01:59:03 AM »


An example (MRCA/Coalescence in generations)

M153              36/30        6


The difference between MRCA and Coalescence has to be taken into account.
The latter points more to neutral mutations (mutation pressure); the former relates to strong effects like founders or bottlenecks.

The smaller the difference the more the population is stable and the more trustable the MRCA (bell shaped pair wise mismatches in the STR alleles).

A bigger difference points to a non settled MRCA because of a relative high peripheral selection; together with a ragged shaped pair wise mismatch indicative for a subpopulation in harsh circumstances and in isolation (no gene flow).

The medium differences shows a subpopulation with reasonable growth and wealth but affected by gene flow; less isolated.
Hans

So what was  the sigma that you got for R-M153, right now I see the MRCA was 36 generations or 900 ybp(using 25 year/gen) or 1080 ybp (using 30 years/gen), would you say that it is an accurate estimate of the age of R-M153?


Interesting reaction but not to the point!
What about the essence of the whole post; do you have anything to say there?
Hans
Logged

R1b-Z220*
JeanL
Old Hand
****
Offline Offline

Posts: 425


« Reply #84 on: May 09, 2012, 02:50:42 AM »

Interesting reaction but not to the point!
What about the essence of the whole post; do you have any to say there.
Hans

I want to know the sigmas for R1b-M153 to get an estimate of how far from the mean TMRCA of 900 ybp the estimate extends. According to your calculation what is the max TMRCA that R1b-M153 can have? That is, what is the 97.5% confidence interval value for the TMRCA?
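
To show what I am asking for, here is the arithmetic in Python. The 36 generations and the 25 and 30 years per generation come from the posts above; the sigma of 6 generations is purely hypothetical (no sigma has been posted), and the upper bound assumes a normal approximation, where mean + 1.96 sigma is the 97.5th percentile.

```python
def generations_to_years(generations, years_per_generation):
    """Convert a TMRCA in generations to years before present."""
    return generations * years_per_generation


def upper_bound_generations(mean_generations, sigma_generations, z=1.96):
    """97.5th percentile under a normal approximation (assumption)."""
    return mean_generations + z * sigma_generations


mean_gen = 36
for ypg in (25, 30):
    print(ypg, "years/gen ->", generations_to_years(mean_gen, ypg), "ybp")

hypothetical_sigma = 6  # placeholder value, not from anyone's calculation
upper = upper_bound_generations(mean_gen, hypothetical_sigma)
print("upper bound:", upper, "generations")
```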
« Last Edit: May 09, 2012, 02:51:05 AM by JeanL » Logged
MarkoH
Member
**
Offline Offline

Posts: 20


« Reply #85 on: May 09, 2012, 05:23:12 AM »

When you say close haplotype pairs, do you mean a set of haplotypes that differ by 1 mutation at most in each locus? I gotta say, it sounds like an interesting idea. So you model the probability of getting something else other than the presumed modal value as the mutation rate? So how many times do you do the random sampling, and how wide are the confidence intervals in the mutation rates relative to the mean mutation rate values you get for each locus?

Also, how do you handle variability in the mutation rate that is beyond the ±1 mutations off modal?

The method has been outlined here by J. Chandler.

The calculation assumes that mutation rate does not depend on repeat number. The estimated rate is the sum of rates for STR reductions and additions for each locus. The amount of change is not estimated (it can be more than one repeat in the case of multistep or copy overwrite).

It is not possible to estimate accurately what mutated without knowing the phylogeny. The same applies to error limits. However, comparing estimates for different datasets (35,000 x 67 vs. 4,000 x 111) gives an idea of the uncertainty. Direct methods for estimating errors (the jackknife method or Bayesian limits) produce underestimates.
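
For reference, a delete-one jackknife looks roughly like this in Python. The per-cluster rate values are hypothetical, and the estimator here is just a mean. The jackknife treats its inputs as independent, which is plausibly one reason it understates the error when haplotypes share history.

```python
import math
import statistics


def jackknife_se(samples, estimator):
    """Delete-one jackknife standard error of estimator(samples)."""
    n = len(samples)
    leave_one_out = [estimator(samples[:i] + samples[i + 1:]) for i in range(n)]
    mean_loo = sum(leave_one_out) / n
    return math.sqrt((n - 1) / n * sum((t - mean_loo) ** 2 for t in leave_one_out))


def mean(xs):
    return sum(xs) / len(xs)


# Hypothetical rate estimates from five clusters, for illustration only.
per_cluster_rates = [0.0021, 0.0025, 0.0019, 0.0030, 0.0024]
se = jackknife_se(per_cluster_rates, mean)

# For a plain mean the jackknife reproduces the usual s / sqrt(n).
classic_se = statistics.stdev(per_cluster_rates) / math.sqrt(len(per_cluster_rates))
print(se, classic_se)
```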

Logged
MarkoH
Member
**
Offline Offline

Posts: 20


« Reply #86 on: May 09, 2012, 05:34:46 AM »

For the deep genealogical question, I pulled out this kind of approximate 111 STR dated tree for R-L176.1. It shows Scottish and Scandinavian branches parting ways in one case at 700 years and in another case at 1,200 years.  This assumes both correct tree and mutation rates, that is, a lot is assumed.  

This case has been studied by various authors with very different results. It is seemingly held that either a much lower (Klyosov) or a much higher (Banks, McDonald) rate than the roughly 0.3 per generation for 111 STRs used here is needed to explain it.
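
As a rough feel for the numbers, the expected count of mutations separating two men can be sketched as below. The 0.3 per generation total and the 700 and 1,200 year splits are from this post; the 30 years per generation is an assumption (25 has also been used in this thread), and the formula ignores back mutations and multisteps.

```python
def expected_pairwise_mutations(years_since_split, total_rate_per_generation,
                                years_per_generation):
    """Expected mutations between two lines that split years_since_split ago.

    Both lines accumulate mutations independently, hence the factor of 2.
    """
    generations = years_since_split / years_per_generation
    return 2 * total_rate_per_generation * generations


for years in (700, 1200):
    print(years, "years ->", expected_pairwise_mutations(years, 0.3, 30))
```

At 30 years per generation this gives about 14 and 24 expected differences over 111 markers for the two splits.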
« Last Edit: May 09, 2012, 05:36:35 AM by MarkoH » Logged
JeanL
Old Hand
****
Offline Offline

Posts: 425


« Reply #87 on: May 09, 2012, 10:30:43 AM »


The method has been outlined here by J. Chandler.

The calculation assumes that mutation rate does not depend on repeat number. The estimated rate is the sum of rates for STR reductions and additions for each locus. The amount of change is not estimated (it can be more than one repeat in the case of multistep or copy overwrite).

But that would certainly be a problem, because from what I have observed mutation rate does depend on the repeat number. In fact, it varies widely with the repeat number. 

It is not possible to estimate accurately what mutated without knowing the phylogeny. The same applies to error limits. However, comparing estimates for different datasets (35,000 x 67 vs. 4,000 x 111) gives an idea of the uncertainty. Direct methods for estimating errors (the jackknife method or Bayesian limits) produce underestimates.


OK, but my concern is whether those 35,000 67-STR haplotypes are all members of clan projects, or simply 35,000 67-STR haplotypes. If they are all members of clan projects, their MRCAs would be on the order of 0-2000 ybp at most, so some effects specific to that time frame would apply. Even if not all of them are members of clan projects, a vast majority being so would still skew the results.
« Last Edit: May 09, 2012, 10:30:59 AM by JeanL » Logged
ironroad41
Old Hand
****
Offline Offline

Posts: 219


« Reply #88 on: May 09, 2012, 12:20:48 PM »

All members of clan projects are not young. Check my haplotype: z5hg3 on YSearch. There is a range of entries. Yes, in Clan Gregor the Ian Cam are about 600+ years old, but the Gregorys and Griegs/Greggs can go back into BC. People like myself are much older than most of the entries. I will admit that a lot of the Dal Riada go back to c. 200 BC, and possibly were an immigrant Belgian tribe.

I would also mention that, in addition to the asymmetric up/down rates, it appears that for some DYS loci mu changes with the modal value?

I would be interested in an explanation of your comment "skew the results"?  I thought you had dismissed my suggestion that including younger SNP's in a set of data made it look younger.
« Last Edit: May 09, 2012, 12:23:06 PM by ironroad41 » Logged
JeanL
Old Hand
****
Offline Offline

Posts: 425


« Reply #89 on: May 09, 2012, 01:08:56 PM »

I would be interested in an explanation of your comment "skew the results"?  I thought you had dismissed my suggestion that including younger SNP's in a set of data made it look younger.

If most of the 35000 results from the 67 STRs come from projects with relatively young MRCA, then the mean observed mutation rate  for any given locus is going to be skewed towards the observed mutation rate in that time frame.
« Last Edit: May 09, 2012, 01:09:17 PM by JeanL » Logged
ironroad41
Old Hand
****
Offline Offline

Posts: 219


« Reply #90 on: May 09, 2012, 01:20:58 PM »

I agree, and I have shown how slower or faster mutators can skew a TMRCA estimate, e.g. Ian Cam of Clan Gregor. My SNP suggestion was broader. Suppose you have four sons and they all are successful in starting families. Down the line quite a ways, one son begins a dynasty defined by a SNP (think M222). 1K to 2K years later we want to estimate the TMRCA back to the original ancestor of the four sons, say some 5K years ago. The 3 other sons may still have lines, but they are few in number compared to the dynasty. It is my belief that the "dynasty" entries will skew the estimate toward the dynasty founder, not the original ancestor.

I'm not sure how we get around this issue other than by selecting "independent" entries as I did in the Ian Cam, i.e., selecting entries with few common mutations.
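
A toy Python illustration of the dynasty effect at a single locus. All alleles, the ancestral value and the mutation rate are invented, and the age formula (mean squared deviation from the ancestral value divided by mu) is the simple symmetric one-step model, so treat it as a sketch only.

```python
def mean_squared_deviation(alleles, ancestral_value):
    """Average squared distance of the sampled alleles from the ancestral value."""
    return sum((a - ancestral_value) ** 2 for a in alleles) / len(alleles)


ANCESTRAL = 14  # hypothetical ancestral repeat count
MU = 0.002      # hypothetical per-generation rate at this locus

# Six entries from the three old, thinly sampled lines.
old_lines = [12, 16, 14, 13, 15, 14]
# Thirty entries from the young dynasty, tightly clustered.
dynasty = [14] * 24 + [13] * 3 + [15] * 3

age_old = mean_squared_deviation(old_lines, ANCESTRAL) / MU
age_pooled = mean_squared_deviation(old_lines + dynasty, ANCESTRAL) / MU
print(round(age_old), round(age_pooled))
```

With these made-up numbers the pooled estimate comes out at roughly a quarter of the estimate from the old lines alone, simply because the dynasty supplies most of the entries.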
Logged
Mike Walsh
Guru
*****
Offline Offline

Posts: 2964


WWW
« Reply #91 on: May 09, 2012, 02:03:04 PM »

....  1K to 2K years later we want to estimate the TMRCA back to the original ancestor of the four sons some say 5K years ago. ...

I think there is value in interclade TMRCA calculations here. Perhaps Marko can explain this better, but the separate calculations for each of the clades in the pair help filter out intraclade issues as long as the pair of clades are of the same general age.

(EDIT 05/09: "filter" is not the right word, but a reduction in Sigma (error range) can be reached.)
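
One simple variance-style form of an interclade estimate, sketched in Python. This is not necessarily the exact calculation I use; it assumes a symmetric one-step model in which the summed squared difference between two lines grows as 2 x t x (sum of rates). The haplotypes and rates are hypothetical.

```python
def interclade_generations(clade_a, clade_b, rates):
    """Generations to the common ancestor of two clades.

    Uses the mean squared difference over all cross-clade haplotype pairs,
    divided by twice the summed per-locus mutation rate.
    """
    total_sq = 0.0
    for ha in clade_a:
        for hb in clade_b:
            total_sq += sum((a - b) ** 2 for a, b in zip(ha, hb))
    mean_sq = total_sq / (len(clade_a) * len(clade_b))
    return mean_sq / (2 * sum(rates))


# Hypothetical two-locus haplotypes and rates.
clade_a = [[13, 24], [13, 25]]
clade_b = [[14, 24], [14, 23]]
rates = [0.002, 0.003]
generations = interclade_generations(clade_a, clade_b, rates)
print(generations)
```

Only cross-clade pairs enter the sum, so an oversampled young cluster inside one clade does not pull the estimate toward its own founder in the way it does for an intraclade calculation.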
« Last Edit: May 10, 2012, 08:58:55 AM by Mikewww » Logged

R1b-L21>L513(DF1)>S6365>L705.2(&CTS11744,CTS6621)
ironroad41
Old Hand
****
Offline Offline

Posts: 219


« Reply #92 on: May 09, 2012, 02:17:14 PM »

I'm not sure. It would seem that the variance/diversity is lower in a line that has a lot of entries converging on one man who is younger than the clade. It would bias the estimate, in my opinion. I would, however, appreciate Marko's thoughts on this issue.
Logged
MarkoH
Member
**
Offline Offline

Posts: 20


« Reply #93 on: May 10, 2012, 11:19:01 AM »

But that would certainly be a problem, because from what I have observed mutation rate does depend on the repeat number. In fact, it varies widely with the repeat number.  

I have also estimated such differences in the past. This was discussed at length at dna-forums.org, for example. A 67-marker dataset is considered here vs. the assumption that mutation rates have a geometric dependence on the repeat number.

The number estimated by ignoring the repeat number dependence would follow the average mutation rate for the dataset's repeat number distribution. This is a question of the level of detail in modeling. To capture repeat number dependence, significantly more complex methods are needed. The same applies to the use of such model parameters. (At the same time, some even use only one average mutation rate for everything.)

There is no reason to expect that any particular subgroup would have systematically lower or higher mutation rates than the average. (Exceptions are the very slow loci, which do not usually contribute much to TMRCA estimates.) Refinements with repeat number dependence did not change results dramatically from the basic model.
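
A small Python sketch of what "average rate over the dataset's repeat number distribution" means under a geometric dependence. The base rate, the ratio of 2 per repeat and the repeat counts are all hypothetical values chosen for illustration.

```python
def geometric_rate(repeat, base_rate, base_repeat, ratio):
    """Rate that changes by a constant factor per extra repeat (assumed model)."""
    return base_rate * ratio ** (repeat - base_repeat)


def dataset_average_rate(repeat_counts, base_rate, base_repeat, ratio):
    """Average rate weighted by how often each repeat number occurs."""
    total = sum(repeat_counts.values())
    weighted = sum(
        count * geometric_rate(repeat, base_rate, base_repeat, ratio)
        for repeat, count in repeat_counts.items()
    )
    return weighted / total


# Hypothetical distribution at one locus: most samples sit on the modal value.
repeat_counts = {12: 100, 13: 800, 14: 100}
average = dataset_average_rate(repeat_counts, base_rate=0.001, base_repeat=13, ratio=2.0)
print(average)
```

With a dominant modal value the average (0.00105 here) stays close to the rate at the modal repeat number, which is why the single-rate estimate changes little.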


Quote
Ok but my concern would be that if those 35000 67 STRs haplotypes are all members of Clan Projects, or if they are simply 35000 67 STRs haplotype. The difference arises that if they are all members of clan projects, then it would mean that their MRCA would be in the order of 0-2000 ybp at most, therefore there would be some effects that would apply in such scenario. Even if all of them aren’t members of Clan Projects, if a vast majority is, it will still skew the results.  

Many distantly related sample pairs are "correlated" in the sense that the same long branches of evolution contribute to a large number of such pairs. If, say, two samples are as distant as R1a and R-L21, all such pairs include exactly the same mutations that happened during the timespan from the R1b mutation to the R-L21 mutation. Including this evolution many times over would distort the results by giving a large overweight to one particular period of haplotype evolution.

In closely related groups, however, the small differences in haplotypes mostly arise independently through different evolutionary pathways, and rate estimation model approximations ("statistically independent draws") are quite accurate. (The issue of shared evolution "correlation" with more distant haplotype pairs can be addressed with a refined weighting derived from approximate phylogenies.)
« Last Edit: May 10, 2012, 11:46:20 AM by MarkoH » Logged
JeanL
Old Hand
****
Offline Offline

Posts: 425


« Reply #94 on: May 10, 2012, 11:37:52 AM »


I have also estimated such differences in the past. This was discussed in length at dna-forums.org, for example. A 67 level dataset is considered here vs. assumption that mutation rates would have geometric dependence on the repeat number.

The number estimated by ignoring the repeat number dependence would follow average mutation rate.
This is a question of level in detail in modeling.  To capture repeat number dependence, significantly more complex methods are needed.  Same applies to the use of such model parameters. (At the same time, some even use only one average mutation rate for everything.)

If I understand the graphs correctly, it seems that once the repeat number departs from the normal STR repeats (i.e. 12-14 for DYS393, or 22-25 for DYS390, etc.) the error bars become significant. The only thing I could think of that would tackle the problem of repeat dependence would be to measure mutation rates for a given locus at different repeat values, then use some sort of curve fit to model the variation as a function of repeat number. Of course, such an approach would require a good number of data points before the curve fit becomes valid.
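
Something like the following is what I have in mind for the curve fit: a least-squares line through log(rate) against repeat number, which corresponds to a geometric dependence. The measured rates here are invented, and real data would need weighting by how many observations sit behind each point.

```python
import math


def fit_geometric_rate(rates_by_repeat):
    """Least-squares fit of log(rate) = intercept + slope * repeat.

    Returns (ratio, intercept), where ratio = exp(slope) is the factor
    by which the rate changes per extra repeat.
    """
    xs = list(rates_by_repeat)
    ys = [math.log(rates_by_repeat[x]) for x in xs]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    intercept = y_mean - slope * x_mean
    return math.exp(slope), intercept


def predicted_rate(repeat, ratio, intercept):
    """Rate predicted by the fitted curve at a given repeat number."""
    return math.exp(intercept) * ratio ** repeat


# Hypothetical measured rates at three repeat values of one locus.
measured = {12: 0.0005, 13: 0.001, 14: 0.002}
ratio, intercept = fit_geometric_rate(measured)
print(ratio, predicted_rate(15, ratio, intercept))
```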

There is no reason to expect that any particular subgroup would have systematically lower or higher mutation rates than the average. (Exception are the very slow loci that do not usually contribute much to tmrca estimates.)  Refinements with repeat number dependence rarely changed the results from the basic model. 

Well, the variation of mutation rate as a function of repeat number should present no problem when estimating the MRCA in time frames which are relatively recent (i.e. 0-2500 ybp). Even the calibration process using genealogical trees appears to work modestly well. However, I believe that once we are in a time frame that is double that, the effects would start to show. I mean, you can check that repeat number dependence rarely changes the results for a set that has a historically known TMRCA. But for sets that have no known TMRCA, I’m guessing that when you say repeat number dependence doesn’t change the results, it is because they did not differ from the model where mutation rate was independent of repeat number. How, then, does one make sure that the mutation rate dependence on repeat number in these longer time frames is being modeled correctly?


Many distantly related sample pairs are "correlated" in the sense that same long branches of evolution contributed to large number of such pairs. As an example, if two samples are as distant as being from say R1a and R-L21 , all such pairs have exactly the same mutations that happened from the timespan from R1b mutation to R-L21 mutation, for example: the inclusion of this evolution in many times would distort the results by giving a large overweight to one particular period of haplotype evolution.

In the closely related groups, however, the small differences in haplotypes mostly arise independently through different evolutionary pathways, and rate estimation model approximations ("statistically independent draws") are quite accurate.  (The issue of shared evolution "correlation" can be addressed with a refined weighting for the haplotype pairs, however.)


I’m not sure I understand your R1a, R-L21 example, could you clarify that a bit?
Logged
MarkoH
Member
**
Offline Offline

Posts: 20


« Reply #95 on: May 10, 2012, 11:55:43 AM »

If I understand the graphs correctly it seems once the mutation rate departs from the norm STR repeats (i.e. 12-14 for DYS393, or 22-25 for DYS390, etc) the error bars become significant. The only thing I could think off that would somehow tackle the problem of repeat dependence would be measured mutation rates for a given locus for different STR repeats. Then try to use some sort of curve fit, to try to model the variation as a function of repeat. Of course such scenario would require a good amount of data points before the curve fit approaches a valid fit.

Error bars get larger since there is not much data for very deviant repeat numbers. Surely this is not relevant for TMRCA applications since such cases are quite rare in the datasets.


Quote
Well, the variation of mutation rate as a function of repeat should present no problem when estimating MRCA in time frames which are relatively recent(i.e. 0-2500 ybp). Even the calibration process using genealogical trees appears to work modestly, however I believe than once we are in a time frame that it is double, and then the effects would start taking place.

Saturation is not the only effect; there are also multisteps that work in a different direction. At a very detailed level there cannot be a general answer.

Quote

Many distantly related sample pairs are "correlated" in the sense that same long branches of evolution contributed to large number of such pairs. As an example, if two samples are as distant as being from say R1a and R-L21 , all such pairs have exactly the same mutations that happened from the timespan from R1b mutation to R-L21 mutation, for example: the inclusion of this evolution in many times would distort the results by giving a large overweight to one particular period of haplotype evolution.

In the closely related groups, however, the small differences in haplotypes mostly arise independently through different evolutionary pathways, and rate estimation model approximations ("statistically independent draws") are quite accurate.  (The issue of shared evolution "correlation" can be addressed with a refined weighting for the haplotype pairs, however.)


I’m not sure I understand your R1a, R-L21 example, could you clarify that a bit?


The close clusters are what is needed for rate estimation purposes. They represent an extension to father/son observations.  


« Last Edit: May 10, 2012, 11:56:51 AM by MarkoH » Logged
ironroad41
Old Hand
****
Offline Offline

Posts: 219


« Reply #96 on: May 11, 2012, 12:01:52 PM »

You are correct.  There are no downstream/subsequent SNP's after M269.  This implies that the haplotype will be very diverse since its TMRCA is when the M269 mutation occurred.

The logic is as I stated above, the haplotype of that entry will have had a long time to experience STR mutations and will therefore reflect the time back to its founder.  It's becoming clearer to me that including younger SNP's will reduce the diversity, since the founder existed a briefer period of time.

On another thread Mike asked the importance of diversity.  I think I can now answer his question: the haplotype has to have the SNP of interest and no subsequent SNP mutations to reflect the diversity in his haplotype for the age of the SNP of interest.

Certainly, you will agree that time zero for a SNP is the founder haplotype and subsequent descendants reflect diversity from that haplotype only?

That's not what I meant.

I'm afraid I don't understand what you are saying and struggle to find any logic in your explanations.

 It appears to be quite simple.  If I take a set of 21 entries and 23 dys loci (out of the first 37) and estimate the TMRCA for a group who is Z253+ and L226- I get c. 1000 BC.  If I do the same type of estimate on a group of Z253+,  L226+  I usually get a TMRCA c. 400 AD.  It seems pretty clear to me?  Note L226 is a subclade of Z253.
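
In Python terms the counting estimate is roughly the following. The 21 entries and 23 loci match what I described; the mutation count and the per-locus rates are placeholders, not my actual data, and the 30 years per generation is an assumption.

```python
def count_based_generations(total_mutations, n_entries, rates):
    """Generations to the founder from a simple mutation count.

    Assumes every entry descends independently from the founder and that
    each counted difference from the modal is a single one-step mutation.
    """
    return total_mutations / (n_entries * sum(rates))


n_entries = 21
rates = [0.06 / 23] * 23  # hypothetical: 23 loci summing to 0.06 per generation
total_mutations = 63      # hypothetical count of differences from the modal

generations = count_based_generations(total_mutations, n_entries, rates)
years = generations * 30  # assumed years per generation
print(generations, years)
```

Undercounting at the fast loci lowers total_mutations and therefore the estimated age.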
« Last Edit: May 11, 2012, 12:02:31 PM by ironroad41 » Logged
Jdean
Old Hand
****
Offline Offline

Posts: 678


« Reply #97 on: May 11, 2012, 12:33:49 PM »

 It appears to be quite simple.  If I take a set of 21 entries and 23 dys loci (out of the first 37) and estimate the TMRCA for a group who is Z253+ and L226- I get c. 1000 BC.  If I do the same type of estimate on a group of Z253+,  L226+  I usually get a TMRCA c. 400 AD.  It seems pretty clear to me?  Note L226 is a subclade of Z253.

So L226 is younger than Z253, I'll alert the press :)

Actually I think I worked out what you were on about above. I think you were referring to M269* not M269 which would make the rest of your statements make more sense and which I should have realised at the time (apologies)

However it's feasible for a group of people to be negative for all known SNPs below an old SNP and still have low diversity when compared with each other.
Logged

Y-DNA R-DF49*
MtDNA J1c2e
Kit No. 117897
Ysearch 3BMC9

ironroad41
Old Hand
****
Offline Offline

Posts: 219


« Reply #98 on: May 11, 2012, 02:25:25 PM »

No apologies required. The 1K BC is fairly uncertain; 3 of the DYS loci I used (458, 449 and 576) were probably undercounted/incorrectly counted. I don't use variance/ASD but simply count all mutations as one-step. A different set of samples might give a much different estimate. I don't know, especially if there were more outliers like myself (z5hg3 in YSearch).

Do you have any opinion about the work Jean M referred to which estimates L226 as maybe as much as 2K BC? It's a long paper, using a different technique, and I haven't spent much time on it. Maybe MarkoH could give us his opinion.
« Last Edit: May 11, 2012, 03:11:05 PM by ironroad41 » Logged
Jdean
Old Hand
****
Offline Offline

Posts: 678


« Reply #99 on: May 11, 2012, 02:48:24 PM »

No apologies required.  The 1K BC if fairly uncertain, 3 of the dys loci I used: 458, 449  and 576 were probably undercounted/incorrectly counted.  I don't use Variance/ASD but simply count all mutations as one-step.  A different set of samples might give a much different estimate?  I don't know, especially, if there were more outliers like myself.  z5hg3 in Y Search.

Do you have any opinion about the work Jean M referred to which estimates M226 as maybe as much as 2K BC?  Its a long paper, using a different technique, and I haven't spent much time on it.  Maybe MarkoH could give us his opinion.

I'm reasonably sure M226 is a typo but I can't decide if you mean M222 or L226, I would be surprised if either were as old as that though.
Logged

Y-DNA R-DF49*
MtDNA J1c2e
Kit No. 117897
Ysearch 3BMC9
