You may have heard the term haplotree. For Y-DNA, this is the tree of humankind that traces all paternal lines back to a common male ancestor who lived an estimated 200,000 to 300,000 years ago. “Y-chromosomal Adam,” as this man is sometimes called, was not the only man alive in his time. However, he is the only one whose unbroken line of male descendants has carried his Y chromosome to the present day.
This line has many, many branches in it as various descendants migrated to different areas of the globe. These branches each developed unique genetic markers that help us determine when they split apart and how.
We call each of these branches a haplogroup. A haplogroup is a group of people whose genetic signatures, called haplotypes, share a common set of markers.
Haplotypes
When we compare the entire genomes of different people, no two sets of DNA are exactly identical. Each unique set is called a haploid genotype, usually shortened to haplotype.
Each person’s haplotype is unique in much the same way that each individual’s written signature is unique. Because of this, haplotypes are also sometimes referred to as DNA signatures or genetic signatures.
Haplotypes can refer to the entire genome but can also refer to any portion of the genome, all the way down to a single nucleotide. Regardless of the size of the portion, the genetic signature found within is the haplotype for that portion. To better understand this, let’s use the idea of a name as an analogy.
Let’s say your name is John Doe, and you want to find your name in a database. If you type only the letter J into a search, you will likely find many people whose names begin with J. To narrow it down, you next search for JO. There will be fewer people, but Joel, Jordan, Joseph, and many other names will still appear. Next, you type in JOHN, and the search narrows even more. The complete collection of all the letters in your name (J O H N D O E) is unlikely to be shared by many people in the database, if any. In other words, the more detailed you make your search criteria, the fewer people will match them.
Now let’s think of this name as your haplotype. If you define your haplotype as the complete JOHN DOE, you will probably be the only person in the database with that haplotype. Instead, if you define your haplotype as JOHN, you will now share that haplotype with other people. Regardless of the size of the haplotype you define, all of the people who share the same haplotype are grouped together to form a haplogroup.
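To make the analogy concrete, here is a minimal Python sketch. It assumes a small, made-up database of names (written without spaces) and treats the first few letters of each name as its haplotype. The list NAMES and the function group_by_haplotype are invented for illustration and are not part of any real genetic tool.

```python
from collections import defaultdict

# Hypothetical database of names, written without spaces for simplicity.
NAMES = ["JOHNDOE", "JOHNSMITH", "JOELDOE", "JORDANLEE", "MARYDOE"]


def group_by_haplotype(names, length):
    """Group names by their first `length` letters (the toy 'haplotype')."""
    groups = defaultdict(list)
    for name in names:
        groups[name[:length]].append(name)
    return dict(groups)


# A short haplotype is shared by many people...
broad = group_by_haplotype(NAMES, 1)
# ...a longer one by fewer...
narrow = group_by_haplotype(NAMES, 4)
# ...and the full name by only one person in this database.
full = group_by_haplotype(NAMES, 7)

print(broad["J"])       # everyone whose name starts with J
print(narrow["JOHN"])   # only the two Johns
print(full["JOHNDOE"])  # just John Doe
```

The longer the haplotype we choose, the smaller each resulting group becomes, which is exactly the narrowing effect of the search described above.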
Haplogroups
Continuing our name analogy, let’s think about the entire alphabet. If your haplogroup is simply defined as everyone whose name starts with J, then it will exclude the majority of possible names, but your haplogroup will still be very, very large.
You can narrow this haplogroup to JOHN. Everyone who is in haplogroup JOHN will also be in haplogroup J, but not everyone in haplogroup J will be in haplogroup JOHN. By using this method, we can define large, broad haplogroups with many smaller haplogroups, or subgroups, inside them. For example, JOHN would be a subgroup of JO, which in turn would be a subgroup of J. We could write out this haplogroup as J > JO > JOHN.
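The nesting of subgroups can be sketched in a few lines of Python. This is only an illustration of the name analogy: the set KNOWN_HAPLOGROUPS and the helpers lineage and is_subgroup are made up for this example, and a group counts as a subgroup of another simply when its name starts with the other's name.

```python
# Hypothetical set of haplogroups that have already been defined.
KNOWN_HAPLOGROUPS = {"J", "JO", "JOHN", "JOEL", "M"}


def is_subgroup(child, parent):
    """A group is a subgroup of any shorter group its name starts with."""
    return child != parent and child.startswith(parent)


def lineage(haplogroup, known=KNOWN_HAPLOGROUPS):
    """Write out a haplogroup from broadest to narrowest, e.g. J > JO > JOHN."""
    ancestors = [g for g in known if haplogroup.startswith(g)]
    # Shorter names are broader groups, so sorting by length orders the path.
    return " > ".join(sorted(ancestors, key=len))


print(lineage("JOHN"))          # J > JO > JOHN
print(is_subgroup("JOHN", "J"))  # True
print(is_subgroup("J", "JOHN"))  # False
```

Note that the path only lists groups that have actually been defined, which is why JOH does not appear between JO and JOHN.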
As genealogists, we look exclusively at Y-DNA and mtDNA haplogroups. When we take all the known haplogroups and arrange them from the very broad down to the very narrow, we get a chart that looks very similar to a family tree. We call this tree of haplogroups the haplotree.
Y-DNA Haplotree
The Y-DNA haplotree consists of branches that are each defined by a specific single nucleotide polymorphism (SNP), a one-letter change in the DNA sequence. The many branches of the tree are organized through a process of elimination. To use our name analogy, haplotype JOHN consists of the SNPs J, O, H, and N. This can also be described as J > JO > JOHN. When two or more people share haplotype JOHN, they form haplogroup JOHN, and haplogroup JOHN is added to the haplotree. But where do we place it in the haplotree?
Haplogroup JOHN is part of the larger haplogroup J. Haplogroup J is defined by SNP J and would also contain subgroups like JOEL, JORDAN, and JOSEPH. Because these all contain the SNP J, we can consider haplogroup J to be the “parent” of these subgroups. Whenever a person tests, we compare their haplotype to the larger haplotree to see to which subgroup they belong.
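Comparing a tester's haplotype to the haplotree can be pictured as a walk down the tree from the broadest group to the narrowest one that still matches. The sketch below is an assumption-laden toy: HAPLOTREE and place are invented names, the intermediate group JO is borrowed from the J > JO > JOHN example, and matching is done on name prefixes instead of real SNP results.

```python
# Toy haplotree: each haplogroup maps to its direct subgroups.
HAPLOTREE = {
    "J": ["JO"],
    "JO": ["JOHN", "JOEL", "JORDAN", "JOSEPH"],
    "JOHN": [],
    "JOEL": [],
    "JORDAN": [],
    "JOSEPH": [],
}


def place(haplotype, tree=HAPLOTREE, root="J"):
    """Return the path from the root to the narrowest matching subgroup."""
    if not haplotype.startswith(root):
        return None  # the tester does not belong to this tree at all
    path = [root]
    current = root
    while True:
        # Look for a direct subgroup whose markers the tester also carries.
        match = next((c for c in tree[current] if haplotype.startswith(c)), None)
        if match is None:
            return path  # no narrower subgroup fits, so stop here
        path.append(match)
        current = match


print(place("JOHNDOE"))  # ['J', 'JO', 'JOHN']
print(place("JOSHUA"))   # ['J', 'JO']
print(place("MARY"))     # None
```

A tester such as JOSHUA stops at JO because no narrower subgroup has been defined for him yet.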
As more people test, markers that were once unique only to a specific person’s haplotype are found in others, and these shared markers create new haplogroups. These haplogroups are fitted into the tree as subgroups of the broader haplogroup they all share. With this method, the tree grows as more and more subgroups are added.
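The way the tree grows with each new tester can also be sketched in Python. In this simplified model, which is an assumption made for illustration and not how any testing company actually builds its tree, a haplogroup exists wherever at least two testers share the same leading letters. The helpers shared_prefix and haplogroups are hypothetical.

```python
from itertools import combinations


def shared_prefix(a, b):
    """Return the leading letters that two haplotypes have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return a[:n]


def haplogroups(testers):
    """Collect every non-empty prefix shared by at least one pair of testers."""
    groups = set()
    for a, b in combinations(testers, 2):
        shared = shared_prefix(a, b)
        if shared:
            groups.add(shared)
    return groups


testers = ["JOHNDOE", "JOELDOE"]
before = haplogroups(testers)  # only JO is shared so far

# A new person tests, and markers once unique to JOHNDOE are now shared.
testers.append("JOHNSMITH")
after = haplogroups(testers)

print(after - before)  # the newly created subgroup: {'JOHN'}
```

Before the third test, the letters H and N were unique to one person. Once a second person carries them, JOHN becomes a subgroup under JO.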