World Families Forums - Haploblock Dataset Match - What Next?

Author Topic: Haploblock Dataset Match - What Next?  (Read 2572 times)
ssandifer
« on: February 25, 2009, 04:45:38 PM »

GHOSTX just updated the haploblock datasets with my X-Chromosome data Id#36 (ssandifer) and in the first haploblock (Didier Vernade) dataset I looked at I matched with 3 others #8, #73 FamilyPast-1, and #132 Turner,Ann-1.

My question is: what next? What do the matches mean? What should I research next based on this input? Are these matches meaningful in any way?

SidSandifer
Logged
Seán MacGorman Powell
X-chromosome Project Administrator
Board Moderator
« Reply #1 on: February 25, 2009, 09:43:30 PM »

Hi Sid,

The haploblock candidates in that results chart were chosen because they tend to be highly-stable SNP sequences over long periods of time (potentially many thousands, tens of thousands, or even hundreds of thousands of years).  The fact that you are a perfect match to other people for any given haploblock does not imply that you are closely related to them--though it does imply that you shared a common ancestor with them at some point in the distant past.  You may notice that even if you are a perfect match with somebody for one haploblock, you are a mismatch to that same person on other haploblocks.  This is due to the fact that the X chromosome recombines every generation or two, and you will end up being a mixture of X-DNA from many different ancestors.
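To make the "match on one haploblock, mismatch on another" point concrete, here is a minimal sketch of a per-block comparison. The block names, SNP ids, and genotype calls are all made up for illustration; they are not taken from the results chart.

```python
from typing import Dict, List

# Hypothetical block definitions: block name -> ordered list of SNP ids.
BLOCKS: Dict[str, List[str]] = {
    "block_A": ["snp1", "snp2", "snp3"],
    "block_B": ["snp4", "snp5"],
}


def matching_blocks(person_a: Dict[str, str],
                    person_b: Dict[str, str],
                    blocks: Dict[str, List[str]] = BLOCKS) -> List[str]:
    """Return the blocks where both people have identical calls at every SNP."""
    matches = []
    for name, snps in blocks.items():
        if all(
            snp in person_a and snp in person_b and person_a[snp] == person_b[snp]
            for snp in snps
        ):
            matches.append(name)
    return matches


# Made-up calls: identical across block_A, different at snp5 in block_B.
person_1 = {"snp1": "G", "snp2": "C", "snp3": "T", "snp4": "A", "snp5": "G"}
person_2 = {"snp1": "G", "snp2": "C", "snp3": "T", "snp4": "A", "snp5": "C"}

print(matching_blocks(person_1, person_2))
```

A perfect match on one block alongside a mismatch on another is what you would expect from recombination, so a single matching block says little about close relatedness.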

We're still trying to figure out the significance of these X haploblocks as far as their application to genealogy and deep ancestry is concerned, so the results chart is more for scientific research purposes at this point than for people trying to identify specific ancestors... though you might be able to get some clues as to where certain haploblocks potentially originated, geographically, by comparing the ancestral origins that matching members have posted (as shown in the blue legend box on the right margin of that chart).  There's not really enough data yet though to jump to any firm conclusions from the ancestries that people have been posting there.

As for your next step, aside from submitting your results here, as you've done, and to Adriano Squecco's Y-DNA spreadsheet... unless you have a lot of scientific data analysis skills, it's basically a waiting game to see if anybody else turns up who matches you.  There are some DNA analysis programs out there that can help analyze your data, but they are rather complicated for a layman to use (and some are rather expensive). 

You can do searches at 23andMe for any members who have listed relevant ancestral origins on their profiles (e.g., you could search for the term "England," as I see that you are mostly English).  People who have any Asian blocks in their ancestry painting diagram, and who also have X-DNA ancestry in the Americas, could try searching for "Native American," "Amerind," "Amerindian," etc.  People with Asian blocks who do not have ancestry in the Americas can try searching for "Asian," "Chinese," "Japanese," etc.  There are many search strategies, and you just have to tailor them to your own ancestry.  When you find somebody's profile that looks promising, send them a sharing invitation, and then compare your results with theirs after they accept (if they accept).  If you're very lucky, you may find that they match you in certain comparisons.

Feel free to post back with any other questions.

Sean
Logged

a.k.a., GhostX
ssandifer
« Reply #2 on: February 27, 2009, 12:21:36 PM »

Thanks for the input GhostX.

Still trying to understand some things. For instance:

One, if I look at my raw data for the Y-Chromosome and find rs9341274 (I believe M227), click on the plus box, and then click on dbSNP Lookup, I get the NCBI Single Nucleotide Polymorphism page. If I scroll down this page to Population Diversity, I see Genotype Detail and Alleles, with the letters C and G under both; the G is colored green under Genotype and blue under Alleles. What do the Genotype Detail and the Alleles tell me? (What are they representing?)

Two, I believe both KathyJ and David said they discovered haploblocks representing a particular African block or North American block. How did they identify that block was of a particular race, origin, culture, etc. not sure what the right term is?

Sorry, I guess I should have used an X-Chromosome example in the first paragraph since this forum is about X-Chromosomes, but hope the example I gave will work for my question.
« Last Edit: February 27, 2009, 12:24:28 PM by ssandifer » Logged
Seán MacGorman Powell
X-chromosome Project Administrator
Board Moderator
« Reply #3 on: February 28, 2009, 12:49:49 AM »

Sid,

Those bars you see at dbSNP show the frequencies that a given allele shows up in the population being referenced.  In your example, the population appears to be a random sampling of Y-DNA contributors from all over the world, if I'm reading that correctly.  It's typical in other chromosomes than the Y for there to instead be several different bars representing more specific populations (e.g., Asian, Sub-Saharan African, European).  In your example, it's telling you that 99.1% of the people sampled have a C allele for that SNP, and 0.9% have a G.

One thing you need to pay attention to in the "Browse Raw Data" window for any given SNP is what the "dbSNP Orientation" says.  If it says "plus," then the results are presented the same way at 23andMe as at dbSNP.  If it says "minus," then you have to convert your results to the complementary base (e.g., C to G, A to T, etc.) before you can compare with dbSNP.
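The orientation rule above can be sketched in a few lines. The function name and the idea of passing the orientation as a string are assumptions made for this example, not anything 23andMe or dbSNP provides.

```python
# Watson-Crick complements for the four bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def to_dbsnp_orientation(genotype: str, orientation: str) -> str:
    """Convert a genotype call so it can be compared with dbSNP.

    genotype:    a call such as "C" or "CT".
    orientation: the "dbSNP Orientation" value, "plus" or "minus".
    Characters that are not A/C/G/T (e.g. a no-call marker) pass through unchanged.
    """
    key = orientation.strip().lower()
    if key == "plus":
        return genotype
    if key == "minus":
        return "".join(COMPLEMENT.get(base, base) for base in genotype.upper())
    raise ValueError(f"unexpected orientation: {orientation!r}")


print(to_dbsnp_orientation("C", "minus"))
print(to_dbsnp_orientation("CT", "minus"))
print(to_dbsnp_orientation("CT", "plus"))
```

Only the complement is taken, with no reversal, because a genotype is a pair of alleles at a single position rather than a stretch of sequence.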

The bar under "Genotype Detail" shows the different possible combinations of alleles for a chromosome that has two copies (i.e., the X chromosome in females, or the autosomes).  The Y chromosome does not have two copies in normal human males, so that's why the "Genotype Detail" column just shows the same thing as shown for the "Alleles" column.  You may want to check out an X-chromosome (or any of the autosomes) SNP to see what I am talking about.
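As a rough illustration of how the "Alleles" bar relates to the "Genotype Detail" bar, the sketch below tallies allele frequencies from genotype counts. The counts are invented; the haploid example is only chosen so that it reproduces the 99.1% / 0.9% split mentioned above.

```python
from collections import Counter
from typing import Dict


def allele_frequencies(genotype_counts: Dict[str, int]) -> Dict[str, float]:
    """Turn genotype counts into allele frequencies.

    Keys are genotypes such as "C" (one copy) or "C/G" (two copies);
    each slash-separated allele is counted once per individual.
    """
    alleles: Counter = Counter()
    for genotype, n in genotype_counts.items():
        for allele in genotype.split("/"):
            alleles[allele] += n
    total = sum(alleles.values())
    return {allele: count / total for allele, count in alleles.items()}


# One copy per person (e.g. Y chromosome): genotype and allele bars coincide.
haploid = allele_frequencies({"C": 991, "G": 9})
print(haploid)

# Two copies per person: three genotype classes collapse into two allele frequencies.
diploid = allele_frequencies({"A/A": 30, "A/G": 40, "G/G": 30})
print(diploid)
```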

I'll let KathyJ or David speak to how they deduced that a particular block is for a particular geographic origin, as there are several ways to come to that sort of conclusion.  One way is to just check out the dbSNP entries for all the SNPs in a block of interest, and if your results for each of the SNPs in that block consistently show alleles that are more common to one population (e.g., Asian) than to the other populations (e.g. Sub-Saharan African or European), then you might be able to conclude that this block originated in that geographic region (though possibly tens or hundreds of thousands of years ago).
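The "check every SNP in the block against each population" idea could be sketched as a simple score. Everything here is hypothetical: the SNP ids, the population labels, and the frequencies are placeholders you would replace with values read off dbSNP. Note that summing log-frequencies treats the SNPs as independent, which is not really true inside a haploblock, so this is only a rough ranking.

```python
import math
from typing import Dict


def score_block(my_alleles: Dict[str, str],
                pop_freqs: Dict[str, Dict[str, Dict[str, float]]],
                floor: float = 0.001) -> Dict[str, float]:
    """Sum the log-frequency of my allele at each SNP, per population.

    A higher (less negative) score means my alleles are more typical of that
    population. `floor` avoids log(0) for alleles never observed in a sample.
    """
    scores = {}
    for pop, freqs in pop_freqs.items():
        total = 0.0
        for snp, allele in my_alleles.items():
            p = freqs.get(snp, {}).get(allele, 0.0)
            total += math.log(max(p, floor))
        scores[pop] = total
    return scores


# Placeholder frequencies, NOT real dbSNP values.
pop_freqs = {
    "African": {"snpA": {"T": 0.95, "C": 0.05}, "snpB": {"G": 0.90, "A": 0.10}},
    "Asian": {"snpA": {"T": 0.02, "C": 0.98}, "snpB": {"G": 0.10, "A": 0.90}},
}
my_alleles = {"snpA": "T", "snpB": "G"}

scores = score_block(my_alleles, pop_freqs)
best = max(scores, key=scores.get)
print(scores, best)
```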
Logged

a.k.a., GhostX
geneticgenie
« Reply #4 on: March 01, 2009, 06:44:15 PM »

Exactly right.    If you want to see the HapMap characteristics between Asian, African and European, go to the raw data at 23andMe, click on the SNP. As an example, the most distinct geographical haploblock seems to be in the 66 million block. Find the following position:
66414536  (rs5918737), click on the + and you should see dbSNP Lookup   

The row to look at is the one labeled “ref_assembly,” where the position is listed as above.

Below that you see that 97.7% of Sub-Saharan African (Yoruba) have T and nearly 100% of the Asians in the Chinese Han and Japanese population will have C.
A minority of Europeans will have T and a majority will have C.

A newer dataset would be at this site:
http://ftp.hapmap.org/genotypes/2009-02_phaseII+III/forward/non-redundant/

genotypes_chrM_CHB_r27_nr.b36_fwd.txt.gz will have some distinct  “Asian”
haplotypes

genotypes_chrX_YRI_r27_nr.b36_fwd.txt.gz  will have some distinct “African”
haplotypes

Anthropologists get nervous when we start assigning races to data, so all I can say is that if you start at position 66404187 or so and look at the next 75 positions you will see distinct haplotypes within the block. I match all the Africans in this block, but most Europeans will match the Asians. If you use mutations as a way to count back to the most recent common ancestor, Sean and I won’t be related to each other in this block until probably going back 2 million or more years, well before Homo sapiens.  So what did our common ancestor look like then?
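For anyone who wants to pull out "position 66404187 and the next 75 positions" from one of those HapMap files, here is a minimal sketch. It assumes the files are whitespace-delimited with one header row and 11 metadata columns (rs#, alleles, chrom, pos, strand, ...) before the per-sample genotypes; check that against the actual file before relying on it. The demo rows and sample names are invented.

```python
from typing import Iterable, List, Tuple

# Assumed layout: 11 metadata columns, then one genotype column per sample.
N_META_COLUMNS = 11

Row = Tuple[str, int, List[str]]


def block_window(lines: Iterable[str],
                 start_pos: int,
                 n_snps: int = 75) -> Tuple[List[str], List[Row]]:
    """Return (sample ids, first n_snps rows at or after start_pos, by position)."""
    header = None
    rows: List[Row] = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if header is None:
            header = fields  # first non-empty line holds the column names
            continue
        pos = int(fields[3])  # 4th column is assumed to be the position
        if pos >= start_pos:
            rows.append((fields[0], pos, fields[N_META_COLUMNS:]))
    rows.sort(key=lambda row: row[1])
    samples = header[N_META_COLUMNS:] if header else []
    return samples, rows[:n_snps]


# Invented demo data in the assumed layout.
demo = [
    "rs# alleles chrom pos strand assembly# center protLSID assayLSID panelLSID QCcode SAMPLE1 SAMPLE2",
    "rsX1 C/T chrX 66404000 + ncbi_b36 c p a pl QC+ CC CT",
    "rsX2 C/T chrX 66404187 + ncbi_b36 c p a pl QC+ TT CT",
    "rsX3 A/G chrX 66404500 + ncbi_b36 c p a pl QC+ AA AG",
]
samples, rows = block_window(demo, 66404187)
print(samples)
print(rows)

# With a downloaded file, something like this should work:
# import gzip
# with gzip.open("genotypes_chrX_YRI_r27_nr.b36_fwd.txt.gz", "rt") as fh:
#     samples, rows = block_window(fh, 66404187)
```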

At the above HapMap site you can follow the same person throughout the data set and even identify family groups, if you are so inclined, by finding the ID number and looking up the person at Genbank.  There are other nationalities too, but I don’t know exactly where to find the Native American results that David and Anders have been discussing so that we can compare with the other groups.  It would be nice to compare exact SNPs.  I know that deCODEme does ancestry painting and some of the software can do this as well, but where do we find the haplotypes of the Pima for the above block?

Kathy J.





Logged

Kathy J.
X Chromosomes: 75% English, 12.5% German, 6.25% Dutch, 3.125% Irish, 3.125% Scottish;
from Father's X: 43.75% English, 6.25% Dutch;
from Mother's X: 31.25% English, 12.5% German, 3.125% Irish, 3.125% Scottish
ssandifer
« Reply #5 on: March 11, 2009, 02:11:33 PM »

Thanks Sean & Kathy for providing input to my question. It helps a lot.

Sid
Logged
ssandifer
« Reply #6 on: March 25, 2009, 10:00:30 AM »

Kathy & Sean

Still trying to understand how all this works. I match Kathy on one of the blocks she identified as having very little crossover recombination and as possibly very ancient, starting at position 68,455,063 (rs7888054).

I started to look at each position and the second position (rs5937206) caught my eye. How would you interpret the information at Population Diversity under the Genotype Details and Alleles for this position?

It seems that Genotype A for this position is very rare: under Genotype Details it only shows up in a few individuals under CHMJ, which I believe is Japanese, and under Alleles it seems to be very much the minority. If you were trying to pin a geographic region on this particular Genotype A at this position, would you say European or Japanese or Asian?

If you look at the next position, again, you only see Genotype A showing up in the CHMJ category. However, under Hap-Map-YRI it shows A/A being dominantly African. What do you make of this? I thought from your previous responses that I understood this a little better but I am afraid I am not as far along as I thought.

Kathy, have you been able to determine the geographic region for this whole haploblock for your data?

Logged
geneticgenie
« Reply #7 on: March 26, 2009, 01:35:35 AM »

I am not showing that HapMap YRI has predominately AA.  It shows mostly GG at this SNP rs5937206.
I think A at this SNP could be a relatively recent mutation. 
We are still trying to sort out some of the blocks in terms of geography.  In many of the blocks, it  may not be possible to identify a particular region but there could still be a founder present that we share during very ancient times that could point to a specific Haplogroup, similar to the founders in mtDNA. We are still learning.
Kathy J.
Logged

ssandifer
« Reply #8 on: March 26, 2009, 07:37:52 AM »

Sorry Kathy, when discussing the HapMap YRI  ( my comment "However, under Hap-Map-YRI it shows A/A being dominantly African. ") I was talking about the next position SNP rs17302855. Where it shows A as 73% Asian again in the Japanese sampling and A/A 83% African. I guess I thought that A and A/A would represent the same thing.

Again in this position A only shows up in the CHMJ sample. So maybe as you say it is a recent mutation here as well.

And on the following position (rs2209420) there is no C showing, only C/C in the Genotype Detail. There is a similar situation with G at position rs914284. Okay, why does Genotype Detail sometimes have only C/C or A/A, and other times have C/C and C, and A/A and A? Is this because of the sampling technique or is there something meaningful there?

Can I ask the above questions in a different way? On the first position in this haploblock the Genotype (for me) is G. When you look at the Genotype Detail it has A  A/A  A/G  G  G/G  N in the heading. The only place the G shows in the detail is the line for CHMJ. Does the G genotype represent only the males, whereas the G/G would represent females? Is the Genotype Detail saying men who have a G are only showing in CHMJ?
« Last Edit: March 26, 2009, 10:55:49 AM by ssandifer » Logged
geneticgenie
« Reply #9 on: March 26, 2009, 05:15:04 PM »

The CHMJ group you are looking at isn't really a population of people. It is more of a tumor line: each sample comes from a pre-fetus that never became a person and instead grew like a tumor.

According to NCBI:

Population Detail
Submitter Population Handle: KYUGEN
Submitter Population ID: CHMJ
Population Text:
"DNA was extracted from complete hydatidiform mole (CHM), a benign tumor formed by the fertilization of an empty ovum by a single haploid sperm, that later duplicates its chromosomes to give a diploid (duplicated haploid) cell mass. CHMs offer a unique opportunity for determining long-range definitive haplotypes at a genome-wide level. The 74 CHM samples were collected by the nation-wide effort in Japan (cooperated with the Japan Association of Obstetricians & Gynecologists). Both the female donors of the CHM tissues and the male partners were Japanese. The project has been approved by the Ethical Committee of Kyushu University."

Normally in real populations, the results will show A/A or G/G even for a male.  The technology cannot distinguish between a male with one chromosome and a female with two chromosomes.

The only way I can tell immediately if it is a female is if it is heterozygous (e.g., reported as A/G or AG). It is time consuming to go into the ID numbers and pedigrees to determine if it is a male or female, but until somebody can figure out a faster way, that is what we have to go through.
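The heterozygosity shortcut can be written as a quick check. This is only a sketch: the function name is made up, and it ignores genotyping errors, which can produce the occasional spurious heterozygous call.

```python
from typing import Iterable


def has_heterozygous_call(x_genotypes: Iterable[str]) -> bool:
    """True if any X-chromosome genotype shows two different bases.

    Accepts calls written as "AG" or "A/G". A True result suggests two X
    copies (a female sample); False is inconclusive, since both a male and a
    female who happens to be homozygous everywhere would look the same.
    """
    for genotype in x_genotypes:
        alleles = [base for base in genotype.upper() if base in "ACGT"]
        if len(alleles) == 2 and alleles[0] != alleles[1]:
            return True
    return False


print(has_heterozygous_call(["AA", "A/G", "CC"]))
print(has_heterozygous_call(["AA", "GG", "NN"]))
```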
Kathy J.

 
Logged

geneticgenie
« Reply #10 on: March 27, 2009, 02:01:03 AM »


Now that I have a chance to clarify the hydatidiform mole: according to MedicineNet.com, it is the development of the germ cell without equal contributions from the male and female DNA. It means that all chromosomes come from the woman's male partner.  That makes me wonder what kind of DNA was in my own dermoid teratoma when I was 11 years old.  Was that like a clone of my own DNA? A bit off topic, but I wish I still had my tumor to test just for fun.  It was about the size of a volleyball.
Logged
