World Families Forums - Posting a combined R-L21 data Excel spreadsheet

Author Topic: Posting a combined R-L21 data Excel spreadsheet  (Read 2423 times)
Mike Walsh
Guru
*****
Offline Offline

Posts: 2964


WWW
« on: October 11, 2009, 09:39:56 PM »

Okay, I am finally about ready to post the spreadsheet I have of R-L21* folks from the R-L21 Plus project (which is what I recommend joining), Ysearch, and the various FTDNA surname and geographic projects that we stumble upon.

Is there any way to post it on this forum in a place where I can update it in place periodically and we can all get to it without multiple clicks or searches?

I will also try to insert it into the Google tools (with some cool analytical features) that Dr. Krahn suggested, but I do want to get this out there in an Excel format. I've got all kinds of sorting and viewing macros to make it easier to navigate, as well as Mode, Mean, Median and GD calculations.
« Last Edit: October 11, 2009, 09:42:43 PM by Mike » Logged

R1b-L21>L513(DF1)>S6365>L705.2(&CTS11744,CTS6621)
Nolan Admin - Glenn Allen Nolen
Project Coordinator
Old Hand
*****
Offline Offline

Posts: 292


WWW
« Reply #1 on: October 11, 2009, 09:50:32 PM »

You might ask Terry Barton, the forum owner, for a separate page to display your information then provide a link to that file. That's my only suggestion.
Logged
Mike Walsh
Guru
« Reply #2 on: October 12, 2009, 02:08:21 PM »

Quote
You might ask Terry Barton, the forum owner, for a separate page to display your information then provide a link to that file. That's my only suggestion.

Terry was kind enough to set up a special page, which he called a geographic project page, named R-L21. I guess we can use it for whatever we want as long as it is relevant. Is anyone familiar with how these work? Can one "attach" documents and create a menu of links?
Logged

R1b-L21>L513(DF1)>S6365>L705.2(&CTS11744,CTS6621)
Nolan Admin - Glenn Allen Nolen
Project Coordinator
Old Hand
« Reply #3 on: October 12, 2009, 10:55:26 PM »

Ask Terry for instructions if there is not an Administrative help index on the left hand side of the page. 
Logged
Mike Walsh
Guru
« Reply #4 on: October 15, 2009, 12:45:49 AM »

I posted the data I have under the "Files" section at   http://tech.groups.yahoo.com/group/RL21Project/

There are a lot of calculations and sorting built-in.  Also a bunch of speculations on R-L21* varieties.
Logged

R1b-L21>L513(DF1)>S6365>L705.2(&CTS11744,CTS6621)
Mark Jost
Old Hand
****
Offline Offline

Posts: 707


« Reply #5 on: October 15, 2009, 01:40:58 AM »

Quote
I posted the data I have under the "Files" section at   http://tech.groups.yahoo.com/group/RL21Project/

There are a lot of calculations and sorting built-in.  Also a bunch of speculations on R-L21* varieties.

Wow, the more improved model... can I get an instruction manual or a cheat sheet? Just kiddin'. Thank goodness I have a quad-core Intel, 4 GB, 2.4 GHz machine to run it... :)
Logged

148326
Pos: Z245 L459 L21 DF13**
Neg: DF23 L513 L96 L144 Z255 Z253 DF21 DF41 (Z254 P66 P314.2 M37 M222  L563 L526 L226 L195 L193 L192.1 L159.2 L130 DF63 DF5 DF49)
WTYNeg: L555 L371 (L9/L10 L370 L302/L319.1 L554 L564 L577 P69 L626 L627 L643 L679)
Mike Walsh
Guru
« Reply #6 on: October 24, 2009, 06:21:58 PM »

This has been updated with the latest Ysearch R1b1b2a1b5 folks (thanks to Vince T), recent R-L21Plus project folks, and miscellaneous others (particularly the null 425 Clan O'Collas folks).

There is also a summary table of STR distributions, variance, standard deviations, etc. at the bottom. The summary table is built into the spreadsheet, but I also have it attached as a .pdf file. Both new files are version 6b under the FILES section at
http://tech.groups.yahoo.com/group/RL21Project/

You have to join the group, but you don't have to receive any emails from the group or anything like that.  Just "Edit Membership" and pick the "Web only - no emails" option.

Logged

R1b-L21>L513(DF1)>S6365>L705.2(&CTS11744,CTS6621)
Mark Jost
Old Hand
« Reply #7 on: October 25, 2009, 01:37:31 PM »

Mike,
What is the significance of the Red/Pink cell background above or below the WAMH on each DYS in the Summary Table?
Logged

148326
Pos: Z245 L459 L21 DF13**
Neg: DF23 L513 L96 L144 Z255 Z253 DF21 DF41 (Z254 P66 P314.2 M37 M222  L563 L526 L226 L195 L193 L192.1 L159.2 L130 DF63 DF5 DF49)
WTYNeg: L555 L371 (L9/L10 L370 L302/L319.1 L554 L564 L577 P69 L626 L627 L643 L679)
Mike Walsh
Guru
« Reply #8 on: October 25, 2009, 11:39:23 PM »

Quote
Mike,
What is the significance of the Red/Pink cell background above or below the WAMH on each DYS in the Summary Table?

I'm not positive I am addressing your question, but I'll give it a shot.

The Summary Table rows 1011 through 1020 have counts for each STR column. As an example, the ">= +4 WAMH" row gives, for each STR column, the count of people whose value at that STR is four or more above the WAMH value.

For visual effect, every cell that has a count greater than or equal to 20% of the total count is RED. Every cell that has a count greater than or equal to 10% but less than 20% is VIOLET. Every cell that has a count that is less than 10% of the total but at least 1 is BLUE.

Example: There are 313 people with a value of 10 at DYS393. That 313 is greater than 20% of the total count of 876 people who had a test result of any kind at DYS393. Of course, since R-L21*'s modal is basically WAMH, the whole WAMH row is RED (all greater than 20%).
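
For anyone who wants to reproduce the shading outside Excel, the rule can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet's actual macro: the function name `shade_for_count` and the "NONE" label for a zero count are assumptions made for the example.

```python
def shade_for_count(count: int, total: int) -> str:
    """Return the summary-table shade for one cell.

    count: number of people with a given value at one STR
    total: number of people with any test result at that STR
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if count < 1:
        return "NONE"        # assumption: a zero count is left unshaded
    if count * 5 >= total:   # count is at least 20% of the total
        return "RED"
    if count * 10 >= total:  # at least 10% but under 20%
        return "VIOLET"
    return "BLUE"            # at least 1 but under 10%


# The DYS393 example from the post: 313 of 876.
print(shade_for_count(313, 876))  # RED
```

Integer comparisons are used instead of percentages so that counts sitting exactly on the 10% or 20% boundary are not affected by floating-point rounding.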
Logged

R1b-L21>L513(DF1)>S6365>L705.2(&CTS11744,CTS6621)
Mark Jost
Old Hand
« Reply #9 on: October 26, 2009, 10:04:44 AM »

That is the exact answer I was looking for. Thanks!

It is interesting to see where there is a larger distribution of allele values, with both negative and positive mutations.

How can each DYS be interpreted in reference to ancestral or derived allele values?
Logged

148326
Pos: Z245 L459 L21 DF13**
Neg: DF23 L513 L96 L144 Z255 Z253 DF21 DF41 (Z254 P66 P314.2 M37 M222  L563 L526 L226 L195 L193 L192.1 L159.2 L130 DF63 DF5 DF49)
WTYNeg: L555 L371 (L9/L10 L370 L302/L319.1 L554 L564 L577 P69 L626 L627 L643 L679)
Mike Walsh
Guru
« Reply #10 on: October 26, 2009, 11:39:16 AM »

Quote
That is the exact answer I was looking for. Thanks!

It is interesting where there are larger distribution of Allele values both negative and positive mutations.

How can each DYS be interpreted in reference to ancestral or derived allele values?

I'm not the right guy to answer that question. I would assume that the modal has a connection to the ancestral values, but perhaps it should be the L21+ modal for a restricted geography where L21+'s father and brothers are found. I don't know.
Logged

R1b-L21>L513(DF1)>S6365>L705.2(&CTS11744,CTS6621)
Mark Jost
Old Hand
« Reply #11 on: October 26, 2009, 05:14:18 PM »

As I understand it, markers at low allele values mutate upward about twice as often as downward. When the allele value is high (twenty or higher), they generally mutate downward.

Looking at the sheet, I notice that there are several exceptions.
DYS markers trending downwards:
Fast439, Med459b, Fast464*a & c, Med460, Fast456, Med413a, Slow565

*This may be due to changes in interpretation

Information on these values may be important for use in deeper ancestry connections.
Logged

148326
Pos: Z245 L459 L21 DF13**
Neg: DF23 L513 L96 L144 Z255 Z253 DF21 DF41 (Z254 P66 P314.2 M37 M222  L563 L526 L226 L195 L193 L192.1 L159.2 L130 DF63 DF5 DF49)
WTYNeg: L555 L371 (L9/L10 L370 L302/L319.1 L554 L564 L577 P69 L626 L627 L643 L679)
cmblandford
Senior Member
***
Offline Offline

Posts: 85


« Reply #12 on: October 27, 2009, 02:24:24 AM »

Mike, if you change group 9919 to 459a<=9; you only pick up one more guy - me.  (-:

Logged

Y-DNA:  R-DF13*


Surname Project:  Blandford

Kit:  ft115893   Ysearch:  EYSPZ


Earliest Known Ancestor:  Thomas Blanford; Dorset, England; born 1648


Mike Walsh
Guru
« Reply #13 on: October 27, 2009, 08:53:37 AM »

Chuck,

I can do that easily, but my variety assignments are very speculative anyway. Do you (or does anyone) know much about the nature of DYS459? I'm not knowledgeable about the multi-copy markers, generally speaking.

I've read the stuff on DYS464 and have implemented the modified infinite allele formula for it. I think I understand it based on FTDNA's and Bruce Walsh's guidance. I understand that 389-1 and 2 can be treated separately by just using the "389ii-i" difference method for reporting the second. I understand and have implemented the YCAIIa and YCAIIb modified infinite allele formula.
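
As a small illustration of the "389ii-i" method, the Python sketch below subtracts DYS389i from DYS389ii before taking the step-wise difference, so that one change in the first segment is not counted twice. The function name and the haplotype values are made up for the example; this is not the spreadsheet's own formula.

```python
def gd_389(a_389i: int, a_389ii: int, b_389i: int, b_389ii: int) -> int:
    """Step-wise GD over DYS389 using the "389ii-i" difference method."""
    first = abs(a_389i - b_389i)
    # The reported DYS389ii value includes the DYS389i repeats,
    # so only the remainder (389ii minus 389i) is compared.
    second = abs((a_389ii - a_389i) - (b_389ii - b_389i))
    return first + second


# Hypothetical pair: 13/29 versus 14/30.
print(gd_389(13, 29, 14, 30))  # 1 (comparing the raw values would give 2)
```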

I don't treat DYS385a and DYS385b nor DYS459a and DYS459b nor CDYa and CDYb nor 395s1 and 395s2 any differently than any other step-wise individual marker.  That may be okay but I'm looking for help.  If DYS459a and DYS459b are each individual step-wise model STR markers, then it makes a lot of sense to call you in that group because you are on the "far" side of WAMH.

Does anyone have any comments on the nature of DYS459a and DYS459b?

Same question for CDYa and CDYb?  It is such a fast mover it seems logical that some modified infinite allele method should be used like it is for DYS464.  The logic I read for DYS464 is that it is so fast moving the infinite allele method is needed.   CDYa and CDYb are even faster.
« Last Edit: October 27, 2009, 08:55:52 AM by Mike » Logged

R1b-L21>L513(DF1)>S6365>L705.2(&CTS11744,CTS6621)
vtilroe
Project Coordinator
Old Hand
*****
Offline Offline

Posts: 150


« Reply #14 on: October 29, 2009, 02:39:00 AM »

Quote
Same question for CDYa and CDYb?

I've looked at several of the chromat traces for the CDYa & b regions from WTY data, and I have to say that the structure of that region is actually really screwy. The best way I can explain it is that it seems to behave like a zipper missing a bunch of teeth. It doesn't seem to be a simple case of plus 1 or minus 1 repeat.
Logged

YSearch & MitoSearch: 2GXWW


yDNA: R-U106*


mtDNA: U5a1a1 (Genbank# GQ368895)


R-P312-WTY Project Admin http://tinyurl.com/daertg

Mike Walsh
Guru
« Reply #15 on: October 29, 2009, 10:45:40 AM »

Quote
Same question for CDYa and CDYb?
I've looked at several of the chromat traces for the CDYa & b regions from WTY data, and I have to say that the structure of that regions is actually really screwy.  The best way I can explain it is that it seems to behave like a zipper missing a bunch of teeth.  It doesn't seem to be a simple case of plus 1 or minus 1 repeat.

V Tilroe,
Thanks for your help. Your finding confirms the way I was leaning. In the R-L21* spreadsheet I've been using the pure "step-wise" model for CDYa & b, but that appears to be overstating GDs.

I just found this quote of FTDNA on another web site although I couldn't find it on FTDNA directly:
"Some markers have shown themselves to be more volatile then others and the population geneticists have created a model to account for these ‘aberrations’. For markers that fall into this category, despite the fact that two people could be separated by 2 (or 3) mutations, the scientific assumption is that the change took place in a single generation (between a father and a son) and therefore it is treated as a single step, despite the fact that more then one ‘point’ separates two samples.  We classify DYS 464a-d and CDYa-b as following this model."

It looks like CDYa and CDYb should have 2 or 3 differences in alleles counted as 1 and 4 to 6 counted as 2.
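
Read literally, that rule collapses a difference of up to 3 repeats into one step and a difference of 4 to 6 into two. A minimal Python sketch follows. It assumes that a 1-repeat difference still counts as 1 and that the pattern continues in blocks of three beyond 6, neither of which is stated in the FTDNA quote; `cdy_steps` is a hypothetical name.

```python
def cdy_steps(allele_a: int, allele_b: int) -> int:
    """Collapse a raw CDY allele difference into counted steps.

    0 -> 0, 1..3 -> 1, 4..6 -> 2; larger differences continue in
    blocks of three (an assumption, not covered by the quote).
    """
    diff = abs(allele_a - allele_b)
    return (diff + 2) // 3  # integer ceiling of diff / 3


# Raw differences of 0 through 6 repeats from a CDYa value of 36.
print([cdy_steps(36, 36 + d) for d in range(7)])  # [0, 1, 1, 1, 2, 2, 2]
```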
« Last Edit: October 29, 2009, 11:33:12 AM by Mike » Logged

R1b-L21>L513(DF1)>S6365>L705.2(&CTS11744,CTS6621)
Mike Walsh
Guru
« Reply #16 on: October 29, 2009, 01:22:11 PM »

Quote
Mike, if you change group 9919 to 459a<=9; you only pick up one more guy - me.  (-:

Yes, your 459a=6 is unique indeed. You might be a very early branch off this bunch, but the lot you seem to fit with is the 9919 640=12 folks, which you don't have.

This is all speculative, so just take this with a grain of salt:
The progenitor with YCAII=19,19  459b=9,9 lived a long time ago and he was 640=11 (WAMH). Your branch split away, and somewhere a long time ago 459a made the mutation(s) down to the very rare 6. Meanwhile, another branch of the 9919 progenitor developed the 640=12 mutation that seems to be more frequent.

On the other hand, you might want to examine the folks below closely.  Perhaps your YCAIIb move to 19 was the recent mutation, not the 459a.  In that case these guys look closer related to you:
6QTNH - zzzunk - England
8NQXQ - Dew
« Last Edit: October 29, 2009, 01:24:12 PM by Mike » Logged

R1b-L21>L513(DF1)>S6365>L705.2(&CTS11744,CTS6621)
cmblandford
Senior Member
« Reply #17 on: November 07, 2009, 08:10:50 PM »

Quote
It doesn't seem to be a simple case of plus 1 or minus 1 repeat.

Vince, does 459ab show the same irregularity as CDYab, or is it more "regular"?

Any thoughts on the likely direction and increment or decrement of mutations for 459ab?

Thanks
« Last Edit: November 07, 2009, 08:36:42 PM by cmblandford » Logged

Y-DNA:  R-DF13*


Surname Project:  Blandford

Kit:  ft115893   Ysearch:  EYSPZ


Earliest Known Ancestor:  Thomas Blanford; Dorset, England; born 1648


vtilroe
Project Coordinator
Old Hand
« Reply #18 on: November 08, 2009, 01:28:14 AM »

Quote
It doesn't seem to be a simple case of plus 1 or minus 1 repeat.

Vince, does 459ab show the same irregularity as CDYab or is it more "regular"

Any thoughts on the likely direction and increment or decrement of mutations for 459ab?

Thanks
459a and 459b look a lot more stable.  The trick with these two is that they have highly similar patterns on each arm of a very large palindromic region.

http://ymap.ftdna.com/cgi-bin/gbrowse/hs_chrY/?name=Sequence:DYS459_1
http://ymap.ftdna.com/cgi-bin/gbrowse/hs_chrY/?name=Sequence:DYS459_2
http://www.smgf.org/ychromosome/marker_details.jspx?marker=DYS459

The way they mutate basically depends on how the palindromes fold up and stick together.
http://www.dna-fingerprint.com/static/PalindromicRegion-V2.pdf

(DYS459 is not being sequenced in WTY.)
« Last Edit: November 08, 2009, 01:34:57 AM by vtilroe » Logged

YSearch & MitoSearch: 2GXWW


yDNA: R-U106*


mtDNA: U5a1a1 (Genbank# GQ368895)


R-P312-WTY Project Admin http://tinyurl.com/daertg

Mark Jost
Old Hand
« Reply #19 on: November 09, 2009, 06:38:40 PM »

Quote
... Fast464*a & c, Med460, Fast456, Med413a, Slow565

*This maybe due to changes in interpetation

Have the DYS 464a,b,c,d test parameters changed at FTDNA in the last year, or is it still the same methodology?
Logged

148326
Pos: Z245 L459 L21 DF13**
Neg: DF23 L513 L96 L144 Z255 Z253 DF21 DF41 (Z254 P66 P314.2 M37 M222  L563 L526 L226 L195 L193 L192.1 L159.2 L130 DF63 DF5 DF49)
WTYNeg: L555 L371 (L9/L10 L370 L302/L319.1 L554 L564 L577 P69 L626 L627 L643 L679)
Mike Walsh
Guru
« Reply #20 on: December 16, 2009, 08:27:50 PM »

I've updated the spreadsheet out on the Yahoo Groups "Files".  http://tech.groups.yahoo.com/group/RL21Project/

There are now 916 confirmed R-L21* folks.   I have the latest info I could find on the new SNP's.  It's out in column BZ.  You can just go to the column heading and click on the SNP you want or do your own custom autofiltering.
Logged

R1b-L21>L513(DF1)>S6365>L705.2(&CTS11744,CTS6621)
Mike Walsh
Guru
« Reply #21 on: February 24, 2010, 12:12:31 PM »

Quote
I've updated the spreadsheet out on the Yahoo Groups "Files".  http://tech.groups.yahoo.com/group/RL21Project/

There are now 916 confirmed R-L21* folks.   I have the latest info I could find on the new SNP's.  It's out in column BZ.  You can just go to the column heading and click on the SNP you want or do your own custom autofiltering.

Here are the latest R-L21* statistics for haplotypes that I've found in the R-L21Plus project, Ysearch and various surname and other FTDNA projects. This is understated, as I haven't checked a lot of the surname and other projects for a while. There are just too many. Please encourage L21+ (including L226, L159, L193 and M222) people to join the project.
http://www.familytreedna.com/group-join-request.aspx?group=R-L21Plus

I plan to start including M222+ haplotypes and stats, but I'm just running out of spreadsheet and computer space.

The total number of confirmed R-L21+ M222- haplotypes that I have found: 1022
Of the 1022:
7 are L193+, although FTDNA says there are another 3 out there
38 are L159+
46 are L226+

There are also a few L69 and L144's and at least one L195 but I'm not sure how stable these are or how they fit.  For instance there is an L144+ person who is also L195+.

Of the 1022, 683 are 67 markers long, which is great, but in my opinion we need everyone to upgrade to 67 markers (unless your brother or male cousin already has.)

Of the 1022, 578 are from the British Isles, 22 are from Scandinavia, 17 are from Eastern Europe, 90 are from Western Continental Europe (I'm including Iberia in that.)  Keep in mind that the testing rates for British Isles folks are much higher than elsewhere.  I do divide up the categories a little different than on the RL21Plus project screen so look at the details if you are doing an analysis.

The modal for R-L21* is the same as the Western Atlantic Modal (WAMH) for ALL 67 markers.  The median is the same as the WAMH except at the fast moving CDYa, where L21* is 37, rather than 36.  The mean (average) for L21* is WAMH also except L21* is 449=30, 464c=16 and CDYa=37.

The Sum of the Variance for all 67 markers for R-L21* folks is 21.
« Last Edit: February 24, 2010, 12:14:27 PM by Mikewww » Logged

R1b-L21>L513(DF1)>S6365>L705.2(&CTS11744,CTS6621)
cmblandford
Senior Member
« Reply #22 on: February 25, 2010, 03:34:10 PM »

Quote
The Sum of the Variance for all 67 markers for R-L21* folks is 21

Mike, could you clarify how this calculation is made? Is this the maximum absolute variance from the WAMH for each marker summed over all 67 markers?
« Last Edit: February 25, 2010, 03:35:53 PM by cmblandford » Logged

Y-DNA:  R-DF13*


Surname Project:  Blandford

Kit:  ft115893   Ysearch:  EYSPZ


Earliest Known Ancestor:  Thomas Blanford; Dorset, England; born 1648


Mike Walsh
Guru
« Reply #23 on: February 25, 2010, 04:19:00 PM »

Quote
The Sum of the Variance for all 67 markers for R-L21* folks is 21
Mike, could you clarify how this calculation is made.  Is this the maximum absolute variance from the WAHM for each marker summed over all 67 markers?

There has been a prior discussion on this, and someone much smarter than I indicated that the simple calculation, as Excel performs it, is exactly what we should use. No adjustments.

The Excel VARP function is used on all 67 STR columns and then the results of all 67 are summed.

This is NOT a variance or distance from WAMH.   This is just the Variance for the haplotypes in the spreadsheet.   

We could calculate an average GD from WAMH but I'm not sure if that is meaningful.
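
For anyone checking the number outside Excel, the same calculation can be sketched in Python. `statistics.pvariance` computes the population variance, which is what VARP returns. The column layout, the function name and the sample values below are hypothetical, and skipping missing results is meant to mirror VARP ignoring blank cells.

```python
from statistics import pvariance
from typing import Dict, List, Optional


def sum_of_variance(columns: Dict[str, List[Optional[int]]]) -> float:
    """Sum the population variance of every STR column."""
    total = 0.0
    for values in columns.values():
        present = [v for v in values if v is not None]  # skip untested markers
        if present:
            total += pvariance(present)
    return total


# Hypothetical two-marker, three-person example (not real project data).
sample = {
    "DYS393": [13, 13, 14],
    "DYS390": [24, 24, None],
}
print(round(sum_of_variance(sample), 4))  # 0.2222
```

Note that this measures spread around each column's own mean, not around WAMH.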
Logged

R1b-L21>L513(DF1)>S6365>L705.2(&CTS11744,CTS6621)
vtilroe
Project Coordinator
Old Hand
« Reply #24 on: February 26, 2010, 12:24:45 AM »

Coalescence Age in Generations = Sum of Variance divided by Sum of Mutation Rates.

Assume average mutation rate per micro-satellite = 0.0024
Sum of Mutation Rates = 67*0.0024 = 0.1608

Therefore G = 21/0.1608 = 130.6 generations

Multiply 130.6 generations by 30 years/generation gives an intra-clade coalescence age of about 3,918 years (with the appropriate confidence interval), which seems a bit high compared to other numbers I've seen, but not unreasonable.
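
The arithmetic above is easy to rerun with other inputs. A short Python sketch, using the same assumed figures (0.0024 average mutation rate per marker, 30 years per generation); the function name is made up, and no confidence interval is computed.

```python
def coalescence_age(sum_of_variance: float,
                    n_markers: int,
                    mean_rate: float = 0.0024,
                    years_per_generation: float = 30.0):
    """Return (generations, years) = variance sum / rate sum, scaled to years."""
    generations = sum_of_variance / (n_markers * mean_rate)
    return generations, generations * years_per_generation


gens, years = coalescence_age(21, 67)
print(f"{gens:.1f} generations, about {years:.0f} years")
# 130.6 generations, about 3918 years
```

Both the mutation rate and the generation length are assumptions, so changing either one moves the estimate proportionally.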

Logged

YSearch & MitoSearch: 2GXWW


yDNA: R-U106*


mtDNA: U5a1a1 (Genbank# GQ368895)


R-P312-WTY Project Admin http://tinyurl.com/daertg
