World Families Forums - STR Wars: Is diversity meaningful? more meaningful than Hg frequency?

Jdean
« Reply #200 on: April 29, 2012, 12:20:25 PM »


I understand what you are showing and I applaud the effort, but I still think an analysis of only one or two STRs is not enough to rely on.  There could be aberrations in any one STR in any one haplogroup. An example is U106's L1 subclade which has a lot of DYS439=null.

... but I think your point above for P312 and L22 is still valid.  Even though the modals for P312 and L22 are different, the dispersion is similar.

This is just my perspective, like looking at a glass of water and saying it is half full versus half empty, but I think that what people on this forum call "convergence around the modal" is really just divergence from the ancestral for scattered, extant, fairly young branches on the Y DNA tree.

Of course null values are a different story and shouldn't be included in variance calculations, since they have nothing to do with STRs; however, I understand and agree with your point.
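
The point about leaving nulls out of variance calculations can be sketched in a few lines. This is only an illustration, assuming a null is stored as `None` and that variance is measured about the modal value; the function name and the sample values are made up.

```python
from collections import Counter

def variance_about_modal(values):
    """Mean squared distance from the modal allele, skipping null (None) reads."""
    # A null is a failed read, not a repeat count, so it is dropped first.
    observed = [v for v in values if v is not None]
    modal = Counter(observed).most_common(1)[0][0]
    return sum((v - modal) ** 2 for v in observed) / len(observed)

# Hypothetical DYS439 values; None marks a null result.
dys439 = [12, 12, None, 13, 12, 11]
print(variance_about_modal(dys439))
```

Treating the null as a number (for example 0) would instead add a huge squared distance that has nothing to do with STR mutation.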
Logged

Y-DNA R-DF49*
MtDNA J1c2e
Kit No. 117897
Ysearch 3BMC9

Maliclavelli
« Reply #201 on: April 29, 2012, 01:08:24 PM »

Mikewww says: “This is just my perspective at looking at a glass of water and saying it is half full versus half empty, but I think that what people on this forum call "convergence around the modal" is really just divergence from the ancestral for scattered extant fairly young branches on the Y DNA tree”.

Actually I haven’t spoken of “convergence around the modal” but of “mutations around the modal” and “convergence to the modal”.
For instance: hg. R1b1* etc: DYS426= 12, 11, 12, 13 ,12…. And these mutations are the modals of R1b1, R-M269, R-L23, R-L51, R-P312 etc.
Logged

Maliclavelli


YDNA: R-S12460


MtDNA: K1a1b1e

Mike Walsh
« Reply #202 on: April 29, 2012, 07:04:42 PM »

...
Actually I haven’t spoken of “convergence around the modal” but of “mutations around the modal” and “convergence to the modal”.
For instance: hg. R1b1* etc: DYS426= 12, 11, 12, 13 ,12…. And these mutations are the modals of R1b1, R-M269, R-L23, R-L51, R-P312 etc.
I confess I don't understand the finer points of "mutations around" and "convergence to" the modal. 

For the whole hypothesis of using Y STRs as molecular clocks, I think what is important is that there is some kind of relatively consistent divergence from the ancestral allele. I mean, that is an important premise for the hypothesis. Do you agree?
Logged

R1b-L21>L513(DF1)>L705.2
Maliclavelli
« Reply #203 on: April 30, 2012, 12:11:57 AM »

Oh yes, I agree, and it could be useful to add some caveats:
1)   mutations aren’t linear: there are back mutations and also multi-step ones, though the latter are very rare, I think
2)   every line is new and begins a new cycle, i.e. R-L51+ happened in a man with DYS426=13 (or all the R-L51-s known so far descend from a similar man), so 13 is the modal value and from there we should calculate the mutation rate etc. To treat all of R-L11 as a single unit is wrong
3)   I’d say that every family line is a new beginning
4)   How are the mutations of DYS426 in hg. R1b counted without taking into consideration what I have said?
Logged

Maliclavelli


YDNA: R-S12460


MtDNA: K1a1b1e

Maliclavelli
« Reply #204 on: April 30, 2012, 04:38:57 AM »

Indeed it is. The MRCA of the Z220 six is 4590!! At the same time the MRCA of Z196xL176xM153xZ209-xZ220 (n=29) is 2733!!
Z196xL175 starts as a group with a MRCA of 3606; after exclusion of M153 (MRCA 1068) it is 3888; without Z209 (MRCA 4511) it changes to 3256; leaving Z220 out, it becomes 2733.
All these MRCA do not include RecLOH events etc.
Hans van Vliet

Here is a Median Joining analysis, after Star Contraction and Maximum Parsimony, of the Z220 six. We really need more Z209++ to test for Z220.
I'm very unsure how the old age of Z220 will connect easily with M153 through the Z278 and Z214 nodes.
Looking at Marko H's analysis it may come as a surprise.
Hans van Vliet
http://dl.dropbox.com/u/74936451/z220.pdf

The pairwise mismatch analysis of 67 STR markers of the Z196xL176 group shows a beautiful bell shape, indicative of a fast-growing population.
Excluding the M153 subpopulation makes the profile more rugged, showing that growth had reached a steadier state.
The Z220 clade itself has a very rugged pairwise mismatch profile: a stabilising society?
Age and growth patterns hint at a slightly more horizontal development in the downstream clades of Z196, with M153 way down the stream.
Hans

Mikewww, look at all these interesting analyses by Hans van Vliet in another thread and you’ll get a picture of all my theories (and doubts).
Logged

Maliclavelli


YDNA: R-S12460


MtDNA: K1a1b1e

Maliclavelli
« Reply #205 on: April 30, 2012, 04:54:19 AM »

This reasoning is probably the same as Klyosov's about haplogroup A0: that it has nothing to do with A1b etc., but probably comes from a hominid that mixed with the incomers from Eurasia, as with the Neanderthals or Denisovans, even though it is probably less old.
The same happens also amongst closer lines that have some discontinuities, i.e. many lines are missing because they are extinct. The same of course happens also in the mtDNA; hence many mistakes also in the latest classification of Behar et al.
And what is an outlier if not the survival of a line distant from the others that survived? Perhaps the analysis of Zhivotovsky is right because he takes into consideration people of the Austro-Asian migrants, who didn’t meet again after the separation. These are pure lines, if I may say so.
« Last Edit: April 30, 2012, 04:55:39 AM by Maliclavelli » Logged

Maliclavelli


YDNA: R-S12460


MtDNA: K1a1b1e

ironroad41
« Reply #206 on: April 30, 2012, 09:47:09 AM »

I think that your comments are correct and timely.  As I said on my thread, I will try to present data showing that a large number of STRs are bounded in their range, which implies that hidden mutations are prevalent.  I think that hidden mutations will produce the same observable effect you cite, an apparent decrease in mutation rate over longer time spans.  I know the Maori and Gypsy data Zhiv used covered not more than a thousand or so years, but if he was using the faster mutators such as CDYa,b, then I would expect that a similar observation would be made, and apparently that's what he did?

I think it's at least reasonably clear that loci don't mutate equally up or down; also, it's likely that loci with different values will have slightly (probably very slightly) different mutation rates.

However, how big is this issue and do we need to worry about it that much, especially when talking about L11 and its offspring?

DYS492 offers a convenient way to see if loci are likely to behave in a dramatically different way after a mutation.

Looking at all the values for this locus in the P312 project, I found

Allele    Share
9          0.10%
10         0.21%
11         0.94%
12        95.60%
13         1.68%
14         1.47%

whereas in the U106 project we see

Allele    Share
10         0.12%
12         1.06%
13        95.89%
14         2.82%
15         0.12%


Both sets of data look remarkably similar apart from the obvious fact that 12 is modal in the first set and 13 in the second.

There definitely doesn’t seem to be any particular tendency for U106 to try and get back to a value of 12, and I think it's reasonable to conclude that if somebody who was P312 had a value of 13 at this locus it would behave in exactly the same way as in somebody who had 13 and was U106.
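
One way to put numbers on the DYS492 comparison above is to measure each distribution relative to its own modal. The sketch below uses the percentages quoted in this post; the choice of summary statistics (share off the modal, and mean squared distance from the modal) and the function name are assumptions made for illustration, not something taken from either project.

```python
# Percent of entries at each DYS492 value, as quoted above.
p312 = {9: 0.10, 10: 0.21, 11: 0.94, 12: 95.60, 13: 1.68, 14: 1.47}
u106 = {10: 0.12, 12: 1.06, 13: 95.89, 14: 2.82, 15: 0.12}

def spread_about_modal(percent_by_allele):
    """Return (modal allele, share off the modal, mean squared distance from the modal)."""
    total = sum(percent_by_allele.values())  # normalise away rounding (100.00 vs 100.01)
    modal = max(percent_by_allele, key=percent_by_allele.get)
    share = {allele: pct / total for allele, pct in percent_by_allele.items()}
    off_modal = 1.0 - share[modal]
    mean_sq = sum(s * (allele - modal) ** 2 for allele, s in share.items())
    return modal, off_modal, mean_sq

for name, data in (("P312", p312), ("U106", u106)):
    modal, off_modal, mean_sq = spread_about_modal(data)
    print(f"{name}: modal={modal} off-modal={off_modal:.4f} mean-squared-distance={mean_sq:.4f}")
```

With these figures the off-modal shares come out close to each other, while the squared-distance measure is noticeably larger for P312, mostly because of the entries at 14, two steps from the modal.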

 The distribution about the modal is close in both sets of data.  Whether 5% of the mutations are multisteps is hard to discern.  Look at the behavior of 388 in R1b, I and J.  I think you'll be surprised?

An additional comment would be that this is a very slow mutator, and the near-equal values for 13 and 14 in the P312 data set are interesting.  In general, there aren't a lot of mutations at this locus.  Two options might be: a. a mutation from 12 to 13 and then from 13 to 14; once at 14, there may be descendants who carry that unusual value; or b. a multistep with the same scenario as above, i.e. we don't know if the proliferation of 14 is a family event or a random event?  I've seen data where an apparent multistep of about 4 steps occurred and then a population built up around that value.

In the case of 388, we observe a modal of 12 in the E Hgs, and then successively higher values for G (50% 12 and 50% 13), I (14) and J (15), and then a return to 12 for R1a and R1b, as if they had evolved directly from E3a and E3b. This data is obtained from the dataset I referenced earlier. Note the width of the distribution for E3a and b is essentially 3, only two states for G, 4 for I and 6 for J.  R1a is unusual in that the modal is 12 (.83) and the allele state 10 has .15.  Clearly a multistep occurred and then a population growth occurred?

I have begun to develop observations re: the properties of the data set R - Z253+.  My initial work is with 74 entries and 23 of 37 dys loci.  Initially, I am calculating the TMRCA dys locus by dys locus, using the Burgarella mutation rates.  I assumed all mutations were single step, regardless of step size.  In my experience, it takes a while to become familiar with a data set and there may be some incorrect observations/results?

Note:  I included all except for Z253-.  So 226, 554 and 895, which are younger than 253, are included. From what I can gather from Maliclavelli/van Vliet this will decrease the overall TMRCAs.

The results are highly variable.  Six dys loci give TMRCAs of approximately 8K to 10K BP.  These are: 391, 392, 456, 576 and 442.  392 is the cleanest; it has 6 mutations, with one more than +/-1 from the modal.  The others are almost all bimodal and I used the allele value with the highest number of apparent mutations as the modal.  In the case of 391 there are 45 at 11, 25 at 10 and 1 at 12.  456 has 1 at 12, 1 at 14, 33 at 15, 29 at 16 and 8 at 17.  The other two are similar to these two.  391 has always been interesting to me: it is as if a man had a mutation from 11 to 10 and then he had two sons, one of whom had a mutation from 10 to 11 (or vice versa).  These two brothers then began two dominant lines which we still have today??

Contrary to the above we have TMRCA's as follows:  390 = 1655; 19 = 1250; 388  = 3464; 426 = 0; 455 and 454 = 600 BP etc.
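
The locus-by-locus counting described above can be sketched as follows. This is one plausible reading of the method, not a reconstruction of the actual spreadsheet: every off-modal entry is counted as one mutation regardless of step size, and the mutation count is divided by (entries x rate) to get generations. The rate and the 25 years per generation are placeholder assumptions, not the Burgarella values.

```python
def single_step_estimate(allele_counts, rate, years_per_generation=25):
    """Return (modal allele, off-modal entries counted as mutations, rough years to MRCA)."""
    entries = sum(allele_counts.values())
    modal = max(allele_counts, key=allele_counts.get)
    # Every entry away from the modal counts as exactly one mutation,
    # whether it is one step away or several.
    mutations = entries - allele_counts[modal]
    generations = mutations / (entries * rate)
    return modal, mutations, generations * years_per_generation

# Allele counts quoted in the post for two of the bimodal loci.
dys391 = {10: 25, 11: 45, 12: 1}
dys456 = {12: 1, 14: 1, 15: 33, 16: 29, 17: 8}

ASSUMED_RATE = 0.002  # hypothetical mutations per generation, for illustration only

for name, counts in (("DYS391", dys391), ("DYS456", dys456)):
    modal, mutations, years = single_step_estimate(counts, ASSUMED_RATE)
    print(name, "modal", modal, "mutations", mutations, "years", round(years))
```

Counting this way treats every off-modal entry as an independent event, so a single early mutation inherited by a large family (the two-brothers scenario for 391) inflates the count and the age.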

Comments:  I did not include, 385a,b; 389i,ii; 459a,b; 464a,b,c,d; YCAIIa,b and CDYa,b.  Also, I have no means, at present, to count "hidden mutations" at the modal.  The results may reflect their occurrence but the variability of mutation rates and other assumptions may affect the data results also.  My next step is to only include Z253+ and M226- entries to observe this effect.
« Last Edit: April 30, 2012, 09:48:21 AM by ironroad41 » Logged
ironroad41
« Reply #207 on: April 30, 2012, 03:58:48 PM »

I have started another thread which discusses this issue from a different point of view, but to specifically answer the original question: diversity appears to be more important.  But we have to understand that diversity restarts, in some sense, with each SNP.  So when we are looking at a set of data, we have to use entries that all have the oldest SNP, and only the oldest SNP, that we are trying to estimate the time to.  Using entries with downstream SNPs will bias the estimate toward a smaller value.  Specifically, if you want to estimate the time of R-L21, then we have to use entries that have that as their last SNP.
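
The selection rule proposed here (keep only entries whose last SNP is the one being dated) amounts to a simple filter. A minimal sketch, assuming each record lists its positive SNPs from upstream to downstream; the kit ids, the record layout and the function name are hypothetical.

```python
# Hypothetical records: positive SNPs ordered from upstream to downstream.
entries = [
    {"kit": "A1", "positive_snps": ["P312", "L21"]},
    {"kit": "A2", "positive_snps": ["P312", "L21", "Z253"]},
    {"kit": "A3", "positive_snps": ["P312", "L21", "M222"]},
    {"kit": "A4", "positive_snps": ["P312", "L21"]},
]

def terminal_snp_only(records, snp):
    """Keep records whose most downstream positive SNP is `snp`."""
    return [r for r in records
            if r["positive_snps"] and r["positive_snps"][-1] == snp]

l21_only = terminal_snp_only(entries, "L21")
print([r["kit"] for r in l21_only])
```

In practice an entry may look terminal for L21 only because no downstream SNP has been tested, so the filter is only as good as the testing depth of the records.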
Logged
Mike Walsh
« Reply #208 on: April 30, 2012, 06:14:53 PM »

I have started another thread which discusses this issue from a different point of view, but to specifically answer the original question: diversity appears to be more important.

I agree on that point, but I'll just add the caveat that diversity (and frequency) should not be considered stand-alone. As many of the pieces of the puzzle as you can gather should be considered in context.

Quote from: ironroad41
But we have to understand that diversity restarts, in some sense, with each SNP.

What do you mean by "diversity restarts?"   If you are implying that there is a biological link between Y SNPs and Y STRs, I don't think bio-chemists think that is true. I think L1 and DYS439=null might be an exception, but I've asked this question multiple times and I always get a "no" answer without any counter-arguments. (I mean I've asked Chandler, et al., including Klyosov, who actually is a bio-chemist.)

I don't see anything in the academic studies that point to this. Is there such study?

This is where I made the point on another thread that we are all "homo sapiens sapiens", a subspecies of hominids.  We are more alike than different and everything I read is that most SNPs used are the ones that are searched for and found in the "junk" DNA between the genes.  In other words, they impact nothing.
« Last Edit: April 30, 2012, 06:50:50 PM by Mikewww » Logged

R1b-L21>L513(DF1)>L705.2
Jdean
« Reply #209 on: April 30, 2012, 06:49:48 PM »


What do you mean by "diversity restarts?"   If you are implying that there is a biological link between Y SNPs and Y STRs, I don't think bio-chemists think that is true. I think L1 and DYS439=null might be an exception, but I've asked this question multiple times and I always get a "no" answer without any counter-arguments.


Mike

Since this is the second time you've mentioned this I thought to post this simplistic explanation (simplistic because that's the extent of my knowledge) of what a null reading is.

One of the steps in reading an STR is finding it.

This is done by employing a 'dyed' chemical that bonds to the DNA near the STR.

If there is an alteration in this section of DNA then the chemical dye doesn’t bond and the relevant section can't be found.

When that happens the result is reported as a null result, but it really means the area wasn't read and could be anything within the bounds of probability for that STR.

In the case of the L1 DYS439=null the disruption is in fact the L1 SNP, but a null reading at this locus could (I assume) easily be produced by a private SNP.

As I said I'm no expert in this sort of thing but I'm sure if you asked somebody like Vince Tilroe or Thomas Krahn they would be delighted to explain in-depth all the ins and outs.
Logged

Y-DNA R-DF49*
MtDNA J1c2e
Kit No. 117897
Ysearch 3BMC9

ironroad41
« Reply #210 on: April 30, 2012, 07:01:43 PM »

What do you mean by "diversity restarts?"

What I mean is that, simply, the founder of a line, the person who had an SNP, starts the mutational process over.  All descendants of that founder show diversity from him.  The diversity after him is from his haplotype.
Logged
Mike Walsh
« Reply #211 on: April 30, 2012, 07:02:29 PM »

I understand. This is the deal that Leo Little uncovered. I'm just pointing to this as the only case I'm aware of where standard STR  "reading" has a direct physical link to an SNP.  I guess, technically, the null is a "no read" so that is what you are being diligent in pointing out.  I agree with you - a null is a testing "no read."

The point I'm trying to make is there is no direct cause-effect tie between an STR value and a SNP based Y DNA tree haplogroup.  Any associations of STR values to SNPs are just coincidental. The reason different SNP marked haplogroups probably have different STR ancestral values is that they are just the remnants or scattered surviving branches of the human Y DNA family tree.   There were many, many more Y branches but most have died out leaving us with what we see today. R-P312 has the WAMH modal... U106 has something slightly different, Hg I has something different, etc.

I just felt that I had to mention the L1 correlation with the DYS439=null "reading" to be clear, because this is a case that doesn't really fit the point; but as you articulate, it really is not an exception, just not applicable to the point I was making.
« Last Edit: April 30, 2012, 07:07:48 PM by Mikewww » Logged

R1b-L21>L513(DF1)>L705.2
Jdean
« Reply #212 on: April 30, 2012, 07:09:00 PM »

Sorry I wasn't intending to drive it home with a sledge hammer, I just wasn't sure you knew what a null result was :)
Logged

Y-DNA R-DF49*
MtDNA J1c2e
Kit No. 117897
Ysearch 3BMC9

Mike Walsh
« Reply #213 on: April 30, 2012, 07:11:09 PM »

What I mean is that, simply, the founder of a line, the person who had an SNP, starts the mutational process over.  All descendants of that founder show diversity from him.  The diversity after him is from his haplotype.

I agree with you, although I don't really think of it as "starting over" but rather as just a snapshot in time....  like this is the mile marker on the highway as we drive by.  Early in this thread, I tried to call this "divergence from the ancestral."   Other people use terms like "convergence to the modal" or "mutations around the modal" or the like, and I'm not exactly sure what they mean, so I will describe my perspective: this is just "divergence from the ancestral."
« Last Edit: April 30, 2012, 07:18:01 PM by Mikewww » Logged

R1b-L21>L513(DF1)>L705.2
Mike Walsh
« Reply #214 on: April 30, 2012, 07:13:04 PM »

Sorry I wasn't intending to drive it home with a sledge hammer, I just wasn't sure you knew what a null result was :)

No, I was being sloppy and that could confuse folks, so I appreciate that.  In fact, I learned a little more about the process from your description.
Logged

R1b-L21>L513(DF1)>L705.2
Jdean
« Reply #215 on: April 30, 2012, 07:16:46 PM »

No, I was being sloppy and that could confuse folks, so I appreciate that.  In fact, I learned a little more about the process from your description.

Well it's not the full story of course, apparently they employ lasers and other fancy gizmos as well.
Logged

Y-DNA R-DF49*
MtDNA J1c2e
Kit No. 117897
Ysearch 3BMC9

Mike Walsh
« Reply #216 on: April 30, 2012, 07:37:37 PM »

I'm just cataloging this quote here as it is pertinent to the overall thread.

Sandy Paterson is an M222 researcher who is an actuary by profession. Here are his comments from Rootsweb on Ken Nordtvedt's model.

Quote from: Sandy Paterson
Ken Nordtvedt, erstwhile Emeritus Professor of Physics at Montana State has proved that

E(v) = mG

where E(v)= expected marker variance
m=mutation rate
G=number of generations

So the number of generations taken to reach a given level of dispersion of
marker scores can be estimated as (observed variance)/(mutation rate).
Obviously, the more markers the better. This means it's quite natural to
divide the observed sum of variance of one haplogroup by that of another in
order to get a feel for the age of one haplogroup relative to another.
That's what Mike did, and he did so in order to avoid arguments about poorly
researched mutation rates. I think that's perfectly valid.
http://archiver.rootsweb.ancestry.com/th/read/dna-r1b1c7/2012-03/1333010242

There is a secondary point in that he is agreeing with me that by comparing (relative) variance rather than TMRCAs I'm avoiding the evolutionary versus germ-line mutation rate controversy.
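
The relation quoted above, E(v) = mG, turns into two small calculations. The sketch below is only an illustration: it assumes variance is summed over the same markers in both haplogroups and that the rates simply add across markers (so G is estimated as sum of variances over sum of rates). All numbers and function names are hypothetical.

```python
def estimate_generations(variances, rates):
    """G from E(v) = m*G summed over markers: sum(v) / sum(m)."""
    return sum(variances) / sum(rates)

def relative_age(variances_a, variances_b):
    """Age of A relative to B over the same markers; the mutation rates cancel."""
    return sum(variances_a) / sum(variances_b)

# Hypothetical per-marker figures for two haplogroups over the same three markers.
rates = [0.002, 0.003, 0.001]
variance_a = [0.30, 0.45, 0.15]
variance_b = [0.20, 0.30, 0.10]

print(estimate_generations(variance_a, rates))  # depends on the assumed rates
print(relative_age(variance_a, variance_b))     # does not use the rates at all
```

The second function is the point being made in this post: the ratio never touches the mutation rates, so the argument over which rates to use drops out of a relative comparison.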
« Last Edit: April 30, 2012, 07:38:08 PM by Mikewww » Logged

R1b-L21>L513(DF1)>L705.2
Mike Walsh
« Reply #217 on: April 30, 2012, 08:00:35 PM »

...  I have begun to develop observations re: the properties of the data set R - Z253+.  My initial work is with 74 entries and 23 of 37 dys loci.  Initially, I am calculating the TMRCA dys locus by dys locus, using the Burgarella mutation rates.  I assumed all mutations were single step, regardless of step size.  In my experience, it takes a while to become familiar with a data set and there may be some incorrect observations/results?

Note:  I included all except for Z253-.  So 226, 554 and 895, which are younger than 253, are included. From what I can gather from Maliclavelli/van Vliet this will decrease the overall TMRCAs.

The results are highly variable.  Six dys loci give TMRCAs of approximately 8K to 10K BP.  These are: 391, 392, 456, 576 and 442.  392 is the cleanest; it has 6 mutations, with one more than +/-1 from the modal.  The others are almost all bimodal and I used the allele value with the highest number of apparent mutations as the modal.  In the case of 391 there are 45 at 11, 25 at 10 and 1 at 12.  456 has 1 at 12, 1 at 14, 33 at 15, 29 at 16 and 8 at 17.  The other two are similar to these two.  391 has always been interesting to me: it is as if a man had a mutation from 11 to 10 and then he had two sons, one of whom had a mutation from 10 to 11 (or vice versa).  These two brothers then began two dominant lines which we still have today??

Contrary to the above we have TMRCA's as follows:  390 = 1655; 19 = 1250; 388  = 3464; 426 = 0; 455 and 454 = 600 BP etc....
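
As a rough illustration of the locus-by-locus counting described above, here is a minimal Python sketch that takes the allele counts quoted for DYS391 and DYS456 and reports the modal value, the number of entries off the modal, and the variance about the modal. It assumes the modal is simply the most frequent allele and, as in the post, treats every off-modal entry as one mutation. The function name `locus_summary` is made up for this example, and no mutation rate is applied because the per-locus rates are not given in the thread.

```python
def locus_summary(allele_counts):
    """Summarise one STR locus from a {allele_value: number_of_entries} mapping.

    Assumption: the modal is the most frequent allele value.
    """
    n = sum(allele_counts.values())
    modal = max(allele_counts, key=allele_counts.get)
    # Entries that differ from the modal; counted as one mutation each,
    # regardless of step size (the simplification used in the post).
    off_modal = n - allele_counts[modal]
    # Mean squared distance from the modal, in repeat units.
    variance = sum(c * (a - modal) ** 2 for a, c in allele_counts.items()) / n
    return {"n": n, "modal": modal, "off_modal": off_modal, "variance": variance}


# Allele counts as quoted in the post above.
dys391 = {11: 45, 10: 25, 12: 1}
dys456 = {12: 1, 14: 1, 15: 33, 16: 29, 17: 8}

if __name__ == "__main__":
    for name, counts in (("DYS391", dys391), ("DYS456", dys456)):
        print(name, locus_summary(counts))
```

Dividing either figure by a per-locus mutation rate would give a per-locus age in generations, which is where the large locus-to-locus swings come from.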

I think what you are seeing is why Ken Nordtvedt has been suggesting all along that more STRs is better.

Here is the actuarist's view on Ken's recommendation.
Quote from: Sandy Paterson
However, KN does indeed suggest that more is better....
What I've found is that you start getting reasonable results as n approaches 50. Anything less is dicey.
http://archiver.rootsweb.ancestry.com/th/read/dna-r1b1c7/2012-03/1332498888

That's why I quit doing STR variance on 16 and 25 markers. I can see in my own haplogroup comparisons that the results are not consistent when picking out markers amongst a low number. Sandy is saying that his simulations showed we should be using at least 50 STRs. This all makes me cringe knowing the academic studies typically use a number like 6, 10 or 15 STRs at the most in their diversity calculations.

Ken recently made the following comment on using just one STR for TMRCA estimations. The question was posed and Ken answered.
Quote
why not use DYS724, in a C14 dating sort of way, as a very simple and rough indicator of time to a MRCA within one (sub)clade of a haplogroup?

Quote from: Ken Nordtvedt
Very Very Rough Indicator. Consider the sigma on tmrca using just this one STR.

But theory is fine: the underlying assumption is that each individual STR is a crude clock. We want a better clock, so we compose such by combining behaviors of many, many individual STR clocks.

Remember: C14 clock is the composite result of millions of radioactive atoms doing their thing.
http://archiver.rootsweb.ancestry.com/th/read/y-dna-haplogroup-i/2012-04/1335729772

So, even the idea of Carbon-14 dating is based on aggregating many, many crude clocks together in a mathematically sound way.  That's all Ken is doing with STRs.
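
To make the "many crude clocks" point concrete, here is a small simulation sketch in Python. Everything in it is an assumption chosen for illustration, not anyone's published method: a strict single-step mutation model, the same hypothetical rate `mu` for every STR, lineages that descend independently from the founder (a star-shaped tree), and a founder value that is known exactly. It repeats a variance-based age estimate many times and reports how much the estimates scatter when 1 STR is used versus many.

```python
import random
import statistics


def simulate_allele(generations, mu, rng):
    """Net repeat-count change along one lineage for one STR (single-step model)."""
    offset = 0
    for _ in range(generations):
        if rng.random() < mu:
            offset += rng.choice((-1, 1))
    return offset


def estimate_age(n_markers, n_lineages, generations, mu, rng):
    """Variance-based age estimate, in generations, averaged over n_markers STRs."""
    per_marker_variance = []
    for _ in range(n_markers):
        alleles = [simulate_allele(generations, mu, rng) for _ in range(n_lineages)]
        # Variance measured about the founder's value (offset 0), assumed known.
        per_marker_variance.append(sum(a * a for a in alleles) / n_lineages)
    return statistics.mean(per_marker_variance) / mu


def spread_of_estimates(n_markers, replicates=30, n_lineages=20,
                        generations=40, mu=0.01, seed=1):
    """Mean and standard deviation of repeated age estimates (hypothetical parameters)."""
    rng = random.Random(seed)
    estimates = [estimate_age(n_markers, n_lineages, generations, mu, rng)
                 for _ in range(replicates)]
    return statistics.mean(estimates), statistics.stdev(estimates)
```

Calling `spread_of_estimates(1)` and `spread_of_estimates(50)` should show a similar average but a much tighter scatter for the 50-marker case, which is the sense in which the composite clock is better than any single STR.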

If you like math and want the math theory discussion, you should probably read these posts from a couple of years ago on the Central Limit Theorem in action with Ken Nordtvedt, John Chandler and James Heald, who is also pretty sharp.
http://archiver.rootsweb.ancestry.com/th/read/GENEALOGY-DNA/2008-03/1204573397

The aggregation of STR clocks is very clearly not perfected, but Heald agrees.
Quote from: James Heald
I suspect Ken is quite right, that with enough markers, P(T | t) rapidly becomes approximately Gaussian, because of the Central Limit Theorem; with the mean of T = mean no of steps = mu t
http://archiver.rootsweb.ancestry.com/th/read/GENEALOGY-DNA/2008-03/1204573397

Please don't misinterpret things out of context. There are many disagreements among the mathematicians. However the thoughtful aggregation of STR clocks to estimate the relative age of clades is useful, no doubt.  I expect to see new breakthroughs over the next couple of years... maybe JeanL has one for us.  I think Heinila developed some new forms of analysis.
« Last Edit: May 01, 2012, 12:10:59 PM by Mikewww » Logged

R1b-L21>L513(DF1)>L705.2
ironroad41
Old Hand
****
Offline Offline

Posts: 219


« Reply #218 on: May 01, 2012, 07:26:08 AM »

What do you mean by "diversity restarts?"   If you are implying that there is a biological link between Y SNPs and Y STRs, I don't think bio-chemists think that is true. I think L1 and DYS439=null might be an exception, but I've asked this question multiple times and I always get a "no" answer without any counter-arguments.

Since this is the second time you've mentioned this I thought to post this simplistic explanation (simplistic because that's the extent of my knowledge) of what a null reading is.

One of the steps in reading an STR is finding it.
This is done by employing a 'dyed' chemical that bonds to the DNA near the STR.
If there is an alteration in this section of DNA then the chemical dye doesn’t bond and the relevant section can't be found.
When that happens the result is reported as a null result, but it really means the area wasn't read and could be anything within the bounds of probability for that STR.

In the case of the L1 DYS439=null the disruption is in fact the L1 SNP, but a null reading at this locus could (I assume) easily be produced by a private SNP.

As I said I'm no expert in this sort of thing but I'm sure if you asked somebody like Vince Tilroe or Thomas Krahn they would be delighted to explain in-depth all the ins and outs.

I understand. This is the deal that Leo Little uncovered. I'm just pointing to this as the only case I'm aware of where standard STR  "reading" has a direct physical link to an SNP.  I guess, technically, the null is a "no read" so that is what you are being diligent in pointing out.  I agree with you - a null is a testing "no read."

The point I'm trying to make is there is no direct cause-effect tie between an STR value and a SNP based Y DNA tree haplogroup.  Any associations of STR values to SNPs are just coincidental. The reason different SNP marked haplogroups probably have different STR ancestral values is that they are just the remnants or scattered surviving branches of the human Y DNA family tree.   There were many, many more Y branches but most have died out leaving us with what we see today. R-P312 has the WAMH modal... U106 has something slightly different, Hg I has something different, etc.

 I just felt that I had to mention the L1 correlation with the DYS439=null "reading" to be clear, because this is a case that doesn't really fit the point; but as you articulate, it really is not an exception, just not applicable to the point I was making.
I can't prove that there is no direct relationship between an STR pattern and a SNP.  Are you arguing that they are independent?  Then how can we show groups of subclades with similar STR signatures and all having a common SNP?

The whole premise has been that a modal reflects the STR pattern of one man who both has an SNP value and an STR pattern. I don't think it can be proven that the SNP and the defining STR mutation occurred simultaneously, but they were close in time.
  All this said, the coincidence of an SNP and an STR modal pattern may be a "red herring".  What is important is that using entries from younger SNPs is wrong. As an example take M226, which is not a real old SNP; c. 400 AD is the estimate.  All men with that SNP are descended from one man and we can infer his modal haplotype from his descendants.  All members of M226 reflect diversity beginning at 400 AD to the present time, depending on when their line had a mutation from the ancestor's set of values.

This brings me to my point that when you are trying to estimate Z253, you should not include M226 entries.  They only have diversity to the time of the M226 defining mutation and will reduce the estimate from those entries of Z253 whose diversity started much earlier in time, the time when the defining mutations for Z253 occurred, be that an SNP or STR?

I'll repeat what I said before: SNPs form a hierarchical set of data.  If I take the set of entries with a younger SNP, I get a younger TMRCA.  When trying to determine the time of a modal value of an older SNP, only entries with just that SNP should be used; no entries having a subsequent SNP should be used because it will reduce the TMRCA.
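
The arithmetic behind this argument can be sketched with a toy example in Python. The allele values below are entirely made up and stand for a single STR. The "older" list is meant to represent entries that carry only the older SNP, and the "young" list a recently founded subclade whose members have had little time to drift from the shared modal. Whether real Z253 and M226 data behave this way is exactly what is being debated in the thread; the sketch only shows how pooling low-diversity entries pulls the variance down.

```python
def mean_sq_dev(alleles, modal):
    """Mean squared deviation of allele values from a given modal value."""
    return sum((a - modal) ** 2 for a in alleles) / len(alleles)


# Hypothetical repeat counts at one STR; the modal is taken to be 11 for both groups.
MODAL = 11
older_entries = [11, 12, 10, 11, 13, 11, 10, 12]   # entries with only the older SNP
young_subclade = [11, 11, 11, 11, 11, 12, 11, 11]  # entries from a young subclade

older_only = mean_sq_dev(older_entries, MODAL)
pooled = mean_sq_dev(older_entries + young_subclade, MODAL)

if __name__ == "__main__":
    print("older entries only:", older_only)
    print("pooled with young subclade:", pooled)
```

Since a variance-based age is proportional to this figure, the pooled set would come out younger than the older entries taken alone, under these made-up numbers.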
« Last Edit: May 01, 2012, 07:31:47 AM by ironroad41 » Logged
ironroad41
Old Hand
****
Offline Offline

Posts: 219


« Reply #219 on: May 01, 2012, 07:53:52 AM »

I thought I might expand on my "red herring" comment. 

In the clan Gregor, I have made numerous estimates of the TMRCA of the "founder" of the clan.  But there always is an uncertainty.  We don't know precisely when the mutation occurred.  It could have occurred with ggf, gf, f or possibly son.  There is no way of knowing.  Our SD's are always such that we cannot be sure which person had the mutation.  I think the same level of uncertainty exists with the timing between an SNP and the STR modal.  The modal reflects the most frequent value across all the entries, dys locus by dys locus. (Note: if we include subsequent SNP entries we may bias the modal.)

Our TMRCA  estimate is our best guess when the modal occurred and the time to the SNP defining that modal can't be too far off.  It might not be O'neill of the nine hostages, but it sure was probably a close relative (re M226).
Logged
Jean M
Guru
*****
Offline Offline

Posts: 1253


« Reply #220 on: May 01, 2012, 09:12:23 AM »

Our TMRCA  estimate is our best guess when the modal occurred and the time to the SNP defining that modal can't be too far off.  It might not be O'neill of the nine hostages, but it sure was probably a close relative (re M226).

I think that you mean M222 and in fact there is no evidence that Niall of the Nine Hostages carried M222. If he is anything more than fiction, he was from a part of Ireland which is actually low on M222. The whole idea that he was the ancestor of the men in Donegal carrying M222 rested on genealogies which were tampered with c. 700 AD to make this famous person the ancestor of various families of Donegal, who then claimed to be the Northern Uí Néill. See Irish Surnames and y-DNA: Uí Néill

You may be chasing a similar will-o'-the-wisp with the founder of Clan Gregor. I can't really tell from what you have written. In general men of the same surname who are actually related should turn out to have a common ancestor at around the period that surnames developed, and that has been found to be the case for some  that have been investigated. However if you have been looking at everyone with a McGregor surname, including those not known to be related by paper trail to the clan chiefs, that could be throwing you out. As you know, not all McGregors will be descended from the same Gregor. Some may not be descended from a Gregor at all. Though we can expect some not descended from the clan founder to be at least in the same haplogroup, since R1b-L21 is so common in Scotland. (I take it that the chief's line is L21.)  
« Last Edit: May 01, 2012, 09:40:42 AM by Jean M » Logged
ironroad41
Old Hand
****
Offline Offline

Posts: 219


« Reply #221 on: May 01, 2012, 11:47:13 AM »

I thought that was Trinity College's opinion re: Niall?

The Ian Cam are distinguished by a mutation to 10 at 391.  This is common to all the entries to date.  The Clan Gregor has 600+ entries: Grier, Grieg, Gregory etc.  The Ian Cam are one set of the entries but assert they are descendants of the founder of the Clan Gregor name.  So, if you go to the FtDNA Clan Gregor website and observe the entries, you will see that the Ian Cam are a pretty homogeneous group.  2124 is a direct descendant of the Clan founder and has had no observable mutations.
Logged
chris1
Senior Member
***
Offline Offline

Posts: 66


« Reply #222 on: May 01, 2012, 11:58:38 AM »

Our TMRCA  estimate is our best guess when the modal occurred and the time to the SNP defining that modal can't be too far off.  It might not be O'neill of the nine hostages, but it sure was probably a close relative (re M226).

I think that you mean M222 and in fact there is no evidence that Niall of the Nine Hostages carried M222. If he is anything more than fiction, he was from a part of Ireland which is actually low on M222. The whole idea that he was the ancestor of the men in Donegal carrying M222 rested on genealogies which were tampered with c. 700 AD to make this famous person the ancestor of various families of Donegal, who then claimed to be the Northern Uí Néill. See Irish Surnames and y-DNA: Uí Néill

You may be chasing a similar will-o'-the-wisp with the founder of Clan Gregor. I can't really tell from what you have written. In general men of the same surname who are actually related should turn out to have a common ancestor at around the period that surnames developed, and that has been found to be the case for some  that have been investigated. However if you have been looking at everyone with a McGregor surname, including those not known to be related by paper trail to the clan chiefs, that could be throwing you out. As you know, not all McGregors will be descended from the same Gregor. Some may not be descended from a Gregor at all. Though we can expect some not descended from the clan founder to be at least in the same haplogroup, since R1b-L21 is so common in Scotland. (I take it that the chief's line is L21.)  
If I've got it right, the 'Ian Cam' McGregor group is a younger subclade branching off the very large 'Scots Modal' cluster around 600 years ago. 'Scots Modal' is L21+ but I think it has yet to discover its defining SNP downstream of L21.

Regarding the O'Neill/UiNeill, this is what I picked up: One group is the L21+, M222+ (Ui Neill), a very large cluster known as the 'Nial cluster'/'NW Irish' and contains many surnames.

Another cluster, one that I don't think you mention in the link (with a number of O'Neill and MacShane surnames), is possibly P312* (named 'O'Neill Variety' or 'O'Neill Variant') and has not yet discovered its defining SNP downstream of P312 (L21-, U152- and Z196- so far).
Logged
Mike Walsh
Guru
*****
Offline Offline

Posts: 2963


WWW
« Reply #223 on: May 01, 2012, 12:23:28 PM »

... Please don't misinterpret things out of context. There are many disagreements among the mathematicians. However the thoughtful aggregation of STR clocks to estimate the relative age of clades is useful, no doubt.  I expect to see new breakthroughs over the next couple of years... maybe JeanL has one for us.  I think Heinila developed some new forms of analysis.

I can see Ken continues to work on enhancements.  I currently use his Generations7 methodology in conjunction with the Haplotype_Data spreadsheets I maintain for R1b deep clade tested people. I'll let initial testing settle a little and then I'll incorporate the 111T version.

Quote from: Ken Nordtvedt
I have upgraded my excel program for estimating intra and inter clade variance based age estimates for y haplotypes. Generations111T now takes haplotypes which include all the 111 standard FTDNA STRs (although 11 of the multi-copy ones are not used). But haplotype collections of mixed STR numbers can be used. I like to think the upgrade program is also more user friendly than the Generations7 it replaces.

Generations111T can be downloaded from link below. Please report any glitches, etc. The “T” stands for “test model”. Read instructions.
Single haplotypes can be entered into both clade A and clade B row spaces to obtain TMRCA for haplotype pair, or up to 400 haplotypes of each clade can be used. Or just one clade can be entered to obtain both coalescence age and TMRCA age estimates.
http://archiver.rootsweb.ancestry.com/th/read/Y-DNA-HAPLOGROUP-I/2012-05/1335879324

BTW - good news!  I already have about 900 111 STR length haplotypes for just R-L21.  FTDNA developed the 68-111 panel to enhance TMRCA estimations. I don't think there are any multi-copy STRs in this panel.

Now is the time to consider upgrading to 111 STRs if you haven't already.
Logged

R1b-L21>L513(DF1)>L705.2
Jean M
Guru
*****
Offline Offline

Posts: 1253


« Reply #224 on: May 01, 2012, 12:59:58 PM »

I thought that was Trinity College's opinion re: Niall?

Yes indeed. They put out the study claiming Niall as the daddy of M222 just months before an historian undermined the whole idea that Niall was the founder of the Northern Ui Neill. Then later testing blew more holes in the idea. But let us not digress. McGregor is your interest.
Logged