World Families Forums - Try out the new Russian "hobbyist" YDNA search function

Author Topic: Try out the new Russian "hobbyist" YDNA search function  (Read 960 times)
razyn
« on: December 23, 2011, 12:17:46 PM »

I'm cross-posting a paraphrase of something I said this morning on a DNA-Forums thread (that is effectively buried, under some verbiage about maps).

On the start page http://www.semargl.me/dna/ydna/ the word Найти in the left menu column means "Find," and it's a pretty powerful search function. I was only working in the R1b files, which were recently expanded to over 19,000 samples (very few of which are Russian). If you type in your own Ysearch number or your FTDNA kit number, below a somewhat rudimentary map (with a table on the right of the mapped haplotypes) it produces a much more complete list of your best matches -- not limited to "genetic distance of four," or the like, and by no means limited to the ones they could map. My closest match (thus far tested to 67 markers) is at a GD of 10 -- and beyond him, this engine gives me a whole page of matches, most of whom have a GD between 17 and 22 from me. The haplotypes and accompanying genealogical information appear in English -- or, in the case of the DYS markers, in non-Cyrillic alpha characters.

This engine is not at all limited to R1b haplotypes; but a very respectable sample of them (tested to 67 markers) has been added to it within the past month -- and it does a very good job of mining them.  These guys really know how to write programming code... and I really don't.  Power to them.
Logged

R1b Z196*
Mark Jost
« Reply #1 on: December 23, 2011, 02:51:50 PM »

Now this is "one-stop shopping"! Whoever set this up is on my favorites list.

I used Bing Translate to read more of the links. The page

http://www.semargl.me/dna/ydna/hg-maps/

is an 'Outline map of haplogroups and subclades (branches)' on which the clusters are selectable. I was placed in R1b-Temp-27 [494].

Do you or does anyone else know how these Temp groups were clustered? Also, is there a way to get IDs with these maps?

Maybe they could add a link to an English translation of the page in the future?

Logged

148326
Pos: Z245 L459 L21 DF13**
Neg: DF23 L513 L96 L144 Z255 Z253 DF21 DF41 (Z254 P66 P314.2 M37 M222  L563 L526 L226 L195 L193 L192.1 L159.2 L130 DF63 DF5 DF49)
WTYNeg: L555 L371 (L9/L10 L370 L302/L319.1 L554 L564 L577 P69 L626 L627 L643 L679)
Mark Jost
« Reply #2 on: December 23, 2011, 03:21:49 PM »

http://www.semargl.me/dna/ydna/map-snp/355/

L21 SNP Map
Logged

148326
Pos: Z245 L459 L21 DF13**
Neg: DF23 L513 L96 L144 Z255 Z253 DF21 DF41 (Z254 P66 P314.2 M37 M222  L563 L526 L226 L195 L193 L192.1 L159.2 L130 DF63 DF5 DF49)
WTYNeg: L555 L371 (L9/L10 L370 L302/L319.1 L554 L564 L577 P69 L626 L627 L643 L679)
razyn
« Reply #3 on: December 23, 2011, 05:56:25 PM »

The search page has Demo 1 and Demo 2 buttons (at the bottom, after you enter the kit number but before you execute your search) that seem to address separate databases, with considerable overlap.  The url calls one "ydna-nearest-neighbors" and the other "molgen-nearest-neighbors."  In my opinion, one would do well to use both demos.

A feature I really like is that it starts with your selected haplotype as the norm (corresponding to "modal" on an FTDNA project display), and the colorized result shows blue or pink as divergence from your selected sample.  Not from the P312 modal, or whatever.  One column of the display called "Dist. Markers" shows the number of DYS markers on which there is any difference from yours (pink or blue, up or down).  The next column, "Dist. Step," shows the GD, or total number of steps by which the haplotype differs from yours.  The fact that all of the displayed haplotypes are 67 marker samples gives a very "apples to apples" feel to the overall display.  A lot of scrolling from side to side, up and down, is involved; and I suppose if you are looking at a relatively modal haplotype you might get an unwieldy data set, crash your browser or something.  But for me, it's just fine.
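
For anyone who wants to see the arithmetic behind those two columns, here is a minimal Python sketch. It assumes a plain stepwise count (one step per repeat of difference on each marker); the site's real code is not public here, and multi-copy markers may well be handled differently. The function name and the marker values are made up for illustration.

```python
def compare_haplotypes(reference, other):
    """Return (markers_differing, total_steps) for two haplotypes.

    Each haplotype is a dict mapping a marker name to its repeat count.
    Only markers present in both haplotypes are compared.
    """
    shared = reference.keys() & other.keys()
    diffs = [abs(reference[m] - other[m]) for m in shared]
    markers_differing = sum(1 for d in diffs if d)  # like "Dist. Markers"
    total_steps = sum(diffs)                        # like "Dist. Step" (GD)
    return markers_differing, total_steps


# Hypothetical four-marker haplotypes, for illustration only.
mine = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11}
match = {"DYS393": 13, "DYS390": 23, "DYS19": 14, "DYS391": 13}

print(compare_haplotypes(mine, match))  # (2, 3)
```

Two markers differ (DYS390 by one step, DYS391 by two), so the pair would show 2 under "Dist. Markers" and 3 under "Dist. Step" on this simple model.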

I think this display could be very helpful in identifying or refining "clusters" of the type Ken Nordtvedt and several others have been describing, for five or six years, on the basis of shared "off-modal" STR patterns.  Specifically, it can visually reveal any cluster with which your selected sample has marked affinities... if there are any.
Logged

R1b Z196*
razyn
« Reply #4 on: December 26, 2011, 11:23:43 AM »

The site is now up in English (and German) as well as Russian.  There is also a handy tutorial, in English, posted early this morning on DNA-Forums -- again, on a thread captioned "Google Maps of SNPs distribution."  If you are able to sign in there, look at each of the new thumbnails for instructions on using the separate functions on their menu.

http://dna-forums.org/index.php?/topic/16821-google-maps-of-snps-distribution/page__view__findpost__p__295302

There is so much more here than the mapping function -- and some of it is more useful, since their Google maps (like Maciamo Hay's maps on Eupedia, and most everybody else's) are based on the geographical information supplied by individual customers to the testing company.  It's a good effort, but the source of the geographical information can be both sketchy and untrustworthy.  That's something we just have to live with, whether we are using Ysearch, or this nifty Russian site, or some other approach.

I did find several British "earliest known ancestors" clustered together in Kyrgyzstan... didn't quite know what to make of that.
Logged

R1b Z196*
rms2
Board Moderator
« Reply #5 on: December 26, 2011, 02:16:51 PM »

The Kyrgyzstan map pins are a consequence of neglecting the minus sign before the number for longitude. They have my map pin, which should be stuck in Wheeling, West Virginia, out there someplace, too.
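
A tiny Python sketch of that kind of slip, assuming the sign is simply lost somewhere between the customer's data and the map. The function name is hypothetical and the coordinates for Wheeling are approximate.

```python
def drop_longitude_sign(lat, lon):
    """Simulate the suspected bug: the minus sign on the longitude is lost."""
    return lat, abs(lon)


# Approximate coordinates for Wheeling, West Virginia (illustrative only).
wheeling = (40.06, -80.72)

print(drop_longitude_sign(*wheeling))  # (40.06, 80.72)
```

The latitude is untouched, but about 80 degrees west becomes about 80 degrees east, which puts the pin deep in Asia instead of North America.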

I looked at that site for matches but didn't get much out of it. The matches that count I already knew about, and all those that are much further away I don't regard as of much use.
Logged

razyn
« Reply #6 on: December 26, 2011, 08:50:39 PM »

Quote
The Kyrgyzstan map pins are a consequence of neglecting the minus sign before the number for longitude.
Excellent observation, I don't think that would ever have crossed my mind.

Quote
The matches that count I already knew about, and all those that are much further away I don't regard as of much use.
Well, yes, but -- count toward what?  I don't think a GD of 17 counts for much in genealogy; but it might point in the right direction (as distinguished from some more random guess) for defining clades, subclades, clusters or whatever.  Testing the people of interest beats testing everybody... with the goal being better informed discussion of the R1b phylogeny, population movements before there were written records, and other more anthropological interests.

There have been very specific laments voiced on several forums, including FTDNA's in-house one, about the inability of Ysearch to find (or maybe just the disinclination of FTDNA to display) matches at a GD greater than, let's see -- 7, at 67 markers.

http://forums.familytreedna.com/showpost.php?p=334290&postcount=3

This analogous search program doesn't care whether it's finding our actual kinfolks.  It's just looking for the minimum mismatch in alleles.  Playing with the results, or opting not to do so, is up to us.
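
Roughly what I imagine such a search does, as a Python sketch: score every haplotype in the database against the target and keep the closest few, with no GD cutoff at all. This is an assumption about the general technique (a brute-force nearest-neighbor ranking), not the site's actual code; the kit labels and marker values are invented.

```python
import heapq


def genetic_distance(a, b):
    """Sum of absolute repeat differences over the markers both samples share."""
    return sum(abs(a[m] - b[m]) for m in a.keys() & b.keys())


def nearest_neighbors(target, database, k=3):
    """Return the k closest (distance, kit_id) pairs, smallest distance first."""
    scored = ((genetic_distance(target, hap), kit) for kit, hap in database.items())
    return heapq.nsmallest(k, scored)


# Hypothetical three-marker database, for illustration only.
target = {"DYS393": 13, "DYS390": 24, "DYS19": 14}
database = {
    "kit-A": {"DYS393": 13, "DYS390": 25, "DYS19": 14},  # GD 1
    "kit-B": {"DYS393": 12, "DYS390": 22, "DYS19": 15},  # GD 4
    "kit-C": {"DYS393": 13, "DYS390": 24, "DYS19": 16},  # GD 2
}

print(nearest_neighbors(target, database, k=2))  # [(1, 'kit-A'), (2, 'kit-C')]
```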
Logged

R1b Z196*
rms2
Board Moderator
« Reply #7 on: December 27, 2011, 09:12:24 AM »

You can easily find distant matches at Ysearch. Just enter the number of markers you want checked, then, instead of using the little "maximum genetic distance of" box, use the box below it, "maximum genetic distance of 1 per marker compared above __ markers", and enter the number of markers that will give you the difference that is the distance you want to check. For example, if you want to see those who are 15 away at 67 markers, just enter 52 in that second little box.
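
The arithmetic is just the number of markers compared minus the distance you want. As a quick Python sketch (the function name is mine, not anything Ysearch uses):

```python
def ysearch_second_box(markers_compared, desired_gd):
    """Value to type into the 'compared above __ markers' box."""
    if not 0 <= desired_gd <= markers_compared:
        raise ValueError("desired_gd must be between 0 and markers_compared")
    return markers_compared - desired_gd


print(ysearch_second_box(67, 15))  # 52
```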

One of the several reasons I don't find distant matches (out beyond, say, 10 away at 67 markers) all that useful is that I start picking up folks in other subclades of R1b. My haplotype is "pseudo-Frisian" (best way I can think to describe it), so I tend to get U106 matches as I go further out. At 12 and 25 markers, of course, I don't have to go out far at all. There they are, all crowded in close.
Logged

Mike Walsh
« Reply #8 on: December 27, 2011, 01:35:09 PM »

Quote
...
A feature I really like is that it starts with your selected haplotype as the norm (corresponding to "modal" on an FTDNA project display), and the colorized result shows blue or pink as divergence from your selected sample.  Not from the P312 modal, or whatever.  One column of the display called "Dist. Markers" shows the number of DYS markers on which there is any difference from yours (pink or blue, up or down).  The next column, "Dist. Step," shows the GD, or total number of steps by which the haplotype differs from yours.  ....
I think this display could be very helpful in identifying or refining "clusters" of the type Ken Nordtvedt and several others have been describing, for five or six years, on the basis of shared "off-modal" STR patterns.  Specifically, it can visually reveal any cluster with which your selected sample has marked affinities... if there are any.
This is what I've been doing in the R-L21, P312xL21 (U152, Z196 here), R-M269xU106xP312, R-U106 spreadsheets posted at the L21, P312 and U106/S21 Yahoo group file sections.
I don't have any reason for you to use what I have versus what others have. It's up to you to use whatever works for you.
I've got GD calculations (67 length only) based on any target you select in the spreadsheet, including your own, the "Base" modal (usually WAMH), or the modal for a particular set of haplotypes selected with MS Excel autofiltering. There are probably more colors, views, filters and sorts available than you need. I particularly like the "off-modal" view, where you are just looking at GD steps from the target haplotype, sorted from slowest to fastest marker.
I've also got the terminal and immediately downstream SNP results posted in a column, and I calculate short (but hierarchical) haplogroup labels that make it easier to sort and filter by the resolution of SNP testing.
I've got a 111-length-only tab/worksheet now too. I will post the 111-marker GD calculator columns for L21 later this week.
I've got the formatting set up to do easy copy/pastes from the spreadsheets into Ken N's Gen 7 methodology.
I tried to drive mapping from this data, but will punt on that for a while. Frequency or absolute count mapping can be misleading anyway. I've found it more useful to have strict regional and local geographic designations that I can sort or calculate variance by. If I can figure out how to drive variance maps from this, I'll proceed on that.
These spreadsheets work best on Excel 2010 with some decent RAM in your PC.
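
For those who would rather script it than use Excel, here is a rough Python sketch of the "modal for a selected set of haplotypes" idea: take the most common repeat value on each marker, then show one sample's steps away from that modal. It is only an illustration under my own assumptions, not the spreadsheet's formulas; the function names and values are made up, and ties simply go to whichever value was seen first.

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """Most common repeat value per marker across the selected haplotypes."""
    markers = set().union(*(h.keys() for h in haplotypes))
    modal = {}
    for m in sorted(markers):
        values = [h[m] for h in haplotypes if m in h]
        modal[m] = Counter(values).most_common(1)[0][0]
    return modal


def off_modal(sample, modal):
    """Signed steps from the modal, listing only markers that differ."""
    return {m: sample[m] - modal[m]
            for m in modal if m in sample and sample[m] != modal[m]}


# Hypothetical two-marker group, for illustration only.
group = [
    {"DYS393": 13, "DYS390": 24},
    {"DYS393": 13, "DYS390": 23},
    {"DYS393": 12, "DYS390": 24},
]

modal = modal_haplotype(group)
print(modal)                       # {'DYS390': 24, 'DYS393': 13}
print(off_modal(group[1], modal))  # {'DYS390': -1}
```

Sorting the markers from slowest to fastest would need a table of mutation rates, which I have left out rather than guess at.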
Logged

R1b-L21>L513(DF1)>S6365>L705.2(&CTS11744,CTS6621)