Glossary Terms



Aboriginal means relating to a group of people native to a geographic region. These are the original inhabitants of a region.


Adenine is one of the four bases that make up our DNA. It is abbreviated "A."

The other bases are thymine (T), guanine (G), and cytosine (C). Adenine always pairs with thymine.


An administrator (a.k.a. Group Administrator) is someone who is in charge of Group Projects. The Group Projects at FamilyTreeDNA are run by unpaid volunteer administrators.


Admixture refers to ancestry from more than one recent population group. Many people today have ancestry from more than one population and/or location.


An allele is a genetic variant at a specific point (locus) in our genetic code.


DNA amplification is the production of many DNA copies from one or a few copies or fragments.


An ancestor is someone from whom you descend. For example, your grandparents are your ancestors.

Ancestral Haplotype

In genetic genealogy, the ancestral haplotype is the set of marker values of your ancestor.

Ancestral Signature

The ancestral signature is the oldest known or suspected haplotype for a lineage.

See: Modal Haplotype

Ancestral State

The ancestral state of an allele is the assumed initial condition (value) of the allele and is often represented by the sequence reference.

See: Derived State


Anthropology is the study of human origins and culture.


Ashkenazi is the branch of the Jewish population that settled in Germany and then Eastern Europe during the Jewish Diaspora.

Atlantic Modal Haplotype

See: Western Atlantic Modal Haplotype (WAMH)


An autosome is a non-sex chromosome. Humans have 22 pairs of autosomes, which are numbered 1 through 22, and one pair of sex chromosomes (the X and Y).

Autosomal DNA (atDNA)

Autosomal DNA is DNA from one of our chromosomes located in the cell nucleus. It generally excludes the sex chromosomes. Humans have 22 pairs of autosomal chromosomes and a pair of sex chromosomes.

Source: “autosome”. Oxford Dictionaries. April 2010. Oxford University Press. 09 February 2013.


Back Mutation

A back mutation is when a marker value changes back to its original value.


A base is a unit or building block of DNA. Adenine (A), cytosine (C), guanine (G), and thymine (T) are the four primary bases in DNA. The order of bases is the sequence of DNA.

Base Pair

In genetics, nucleotides are called bases. A base pair (bp) is two complementary nucleotides on opposite strands of DNA. Base pairs are measured using metric units.

  • 1 base pair = 1 base pair (bp)
  • 1,000 base pairs = 1 kilobase (kb)
  • 1,000,000 base pairs = 1 megabase (Mb)
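
As a rough illustration of these units, the sketch below formats a base-pair count as bp, kb, or Mb. The function name format_bp is hypothetical and is not part of any FamilyTreeDNA tool.

```python
def format_bp(base_pairs: int) -> str:
    """Format a base-pair count using the units bp, kb, or Mb."""
    if base_pairs >= 1_000_000:
        return f"{base_pairs / 1_000_000:g} Mb"
    if base_pairs >= 1_000:
        return f"{base_pairs / 1_000:g} kb"
    return f"{base_pairs} bp"


# 16,569 is the mitochondrial genome length given in this glossary.
print(format_bp(16_569))
```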

Buccal Cell

A buccal cell is a type of cell found in cheek tissue inside the mouth.


Cambridge Reference Sequence (CRS)

The Cambridge Reference Sequence (CRS) is the mitochondrial DNA sequence first sequenced in 1981. It was used as a basis for comparison with mtDNA test results. It was later corrected as the revised Cambridge Reference Sequence (rCRS), which in turn was replaced with the Reconstructed Sapiens Reference Sequence (RSRS).

See: Revised Cambridge Reference Sequence (rCRS)


A catalyst is a substance that starts or speeds up a chemical reaction without being affected by that reaction.

centiMorgan (cM)

A centiMorgan (cM) is a unit of measurement for DNA based on the likelihood that a segment will recombine from one generation to the next. A single centiMorgan is considered equivalent to a 1% (1/100) chance that a segment of DNA will cross over or recombine within one generation.

For humans, one million base pairs (bp) average about one centiMorgan. However, the rate of recombination is highly variable.


A centromere is one of the parts of each chromosome. It is a dense area that joins together the two chromatids (arms) of each chromosome.


A chromatid is one of two strands of a chromosome.


A chromosome is a structure found in the nucleus of a cell that contains genetic material. Humans have 23 pairs of chromosomes: 22 pairs of autosomes and one pair of sex chromosomes.


A clade is a group of related individuals.

Coding Region

A coding region is DNA that contains genes. In genetic genealogy, this most often refers to the part of the mitochondrial genome that contains genes.

Cohanim Modal Haplotype (CMH)

The Cohanim Modal Haplotype (CMH) is the Y-chromosome (paternal) profile most frequently found in men with an oral tradition of Cohen ancestry. It is a Y-chromosome DNA STR (Short Tandem Repeat) haplotype.


Cohen is the Hebrew word for priest, which refers to a direct male descendant of Aaron, the brother of Moses. The plural is Cohanim.

Combined DNA Index System (CODIS)

The CODIS system uses marker locations in the autosomal DNA. In the United States, the FBI maintains a CODIS test result database to identify people and solve crimes.

Complementary Sequences

Complementary sequences are opposing strands of DNA. They bond together to form the double helix. The bases always complement one another. Adenine and thymine pair together. Cytosine and guanine pair together.
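
The pairing rule above can be sketched in a few lines of code. This is only an illustration; the names PAIRS and complement are hypothetical.

```python
# Pairing rule from the entry above: A with T, and C with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base that pairs with each base of the given strand."""
    return "".join(PAIRS[base] for base in strand.upper())


print(complement("ATGC"))
```

Each letter of the output is the base found opposite the corresponding input base on the other strand.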


Convergence is the process of two genetically distant haplotypes changing over time to resemble one another.


Cytosine is the “C” of the four bases that make up DNA. Cytosine always pairs with guanine.

The other bases are adenine (A), guanine (G), and thymine (T).



A deletion is when one or more of the letters (nucleotides) of your genetic code is deleted.

Deoxyribonucleic Acid (DNA)

DNA, deoxyribonucleic acid, is the genetic code that makes each of us a unique individual. Humans inherit about one half of their genetic code from each of their parents. Our genetic code then holds the story of our heritage that has been passed down through the generations.

Derived State

The derived state of an allele is the changed (mutated) condition of the allele that differs from the ancestral state.

See also: Ancestral State


A descendant is someone who descends from a specific ancestor. For example, your children and grandchildren are your descendants.


A diaspora is the permanent displacement of a population from one location to a different location or locations.

DNA Amplification

DNA amplification is the production of many DNA copies from one or a few copies or fragments.

DNA Replication

DNA replication is the process by which the DNA double helix makes a copy of itself. It uses the old DNA as a template for the synthesis of new DNA strands. In humans, replication occurs in the cell nucleus.

DNA Segment

A DNA segment is any continuous run or length of DNA. It is described by the place where it starts and the place where it stops.

DNA Sequencing

DNA sequencing is the process of determining the exact order of the nucleotide bases in a segment of DNA.

Double Helix

A double helix is the twisted shape DNA forms when its two strands bond together. It looks like a twisting or rotating ladder.


Earliest Known Ancestor

Your earliest known ancestor is the most distant ancestor whom you have documented on a specific genealogical line. In genetic genealogy, it usually refers to someone on a direct maternal line (the mother, her mother, her mother’s mother, etc.) or on a direct paternal line (the father, his father, his father’s father, etc.).


An endogamous population is one where the members usually only marry within the population group. The bases for endogamy may be geography, ethnic identity, social class, or religion. Long periods of intermarriage have left many endogamous populations with lower than average levels of genetic diversity. Examples of historically endogamous populations are the Amish, the Basque, and the various sub-populations of the Jewish Diaspora.


An enzyme is a protein that facilitates a specific chemical reaction by working as a catalyst.

Exact Match

An exact match is when two people have exactly the same results for all markers or regions compared.


Exogamy is marriage outside of a cultural or population group.


FamilyTreeDNA Time Predictor (FTDNATiP™)

FamilyTreeDNA Time Predictor (FTDNATiP™) is a program used to calculate estimates of Time to the Most Recent Common Ancestor (TMRCA) for paternal lineages. It is the world’s first calculator that incorporates mutation rates specific to each marker. This increases the power and precision of estimates.

Full Mitochondrial Sequence (FMS)

This is your complete mitochondrial DNA. The complete mitochondrial genome consists of a single circular DNA sequence that contains 16,569 base pairs. The FamilyTreeDNA FMS test is called the mtFull Sequence test. This test will return results for all three parts of your mitochondrial DNA as follows:

  • HVR1 – 16001 to 16569
  • HVR2 – 00001 to 00574
  • Coding Region – 00575 to 16000
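
The position ranges above can be turned into a simple lookup. The sketch below is only an illustration of the three ranges listed; the function name mt_region is hypothetical.

```python
MT_LENGTH = 16_569  # total base pairs in the mitochondrial genome


def mt_region(position: int) -> str:
    """Return the region containing a mitochondrial position (1 to 16569)."""
    if not 1 <= position <= MT_LENGTH:
        raise ValueError(f"position must be between 1 and {MT_LENGTH}")
    if position <= 574:
        return "HVR2"
    if position <= 16_000:
        return "Coding Region"
    return "HVR1"


print(mt_region(16_189))
```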



Genes are fundamental units of heredity that are made up of sequences of DNA. Genes are passed from parents to their children.  

Genealogical Time Frame

The genealogical time frame is the most recent one to fifteen generations. Recent genealogical times are the last one to five generations.

Genealogical Data Communication (GEDCOM)

A Genealogical Data Communication (GEDCOM) file is a special file format that was developed to provide a standard for encoding genealogical data. Most family tree software packages do not use it as their own working format, but most can import from and export to GEDCOM format. Because of this, many genealogists use it today to exchange pedigree data files.


Genealogy is the study of family history.


A generation is the number of years between the birth of the parents and the birth of their children. Different studies use different numbers of years per generation. At FamilyTreeDNA, we use 25 years. However, for Time to the Most Recent Common Ancestor (TMRCA) calculations, it is the number of generations that is important.

Genetic Cousins

A genetic cousin is someone who meets the criteria to be a genetic match in genetic genealogical testing. A genetic cousin may or may not be a known cousin.

Genetic Distance

There are two meanings for Genetic Distance:

  1. Genetic Distance is the number of differences, or mutations, between two sets of results. A genetic distance of zero means there are no differences in the results being compared against one another, i.e., an exact match. This is the meaning when comparing Y-chromosome DNA or mitochondrial DNA.
  2. For autosomal DNA comparisons, genetic distance may refer to the size of a DNA segment. The genetic distance is then the length of the segment in centiMorgans (cM).
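
The first meaning can be sketched as a count of differing markers. This is a simplified illustration that counts each differing marker once; an actual matching tool may count differences in another way. The function name genetic_distance and the marker values are hypothetical.

```python
def genetic_distance(a: dict[str, int], b: dict[str, int]) -> int:
    """Count the markers tested in both results whose values differ."""
    shared = a.keys() & b.keys()
    return sum(1 for marker in shared if a[marker] != b[marker])


# Hypothetical results for three Y-chromosome STR markers.
person_a = {"DYS393": 13, "DYS390": 24, "DYS19": 14}
person_b = {"DYS393": 13, "DYS390": 25, "DYS19": 14}

print(genetic_distance(person_a, person_b))
```

A result of zero for two sets of results corresponds to an exact match.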

Genetic Drift

Genetic drift is when a subset of a population moves to a different location and becomes genetically less like the main population. This happens over many generations.

Genetic Genealogy

Genetic genealogy is the use of your DNA to solve genealogy puzzles.


Genetics is the study of genes and heredity; the study of DNA.


A genome is the entire complement of an organism’s genetic material. This may refer to the DNA of a gamete, organelles (mitochondria and chloroplasts), organism, or species.

The human nuclear genome is composed of 46 chromosomes (23 pairs). They contain a total of 3 billion base pairs.

The human mitochondrial genome is composed of a single circular DNA sequence that contains 16,569 base pairs.


A genotype is the genetic makeup of an individual organism.

Glacial Maximum

Glacial Maximum is the scientific term for the peak of an ice age.

Group Administrator Page

The Group Administrator Page (GAP) is the user interface that FamilyTreeDNA Project Administrators use to manage Group Projects. The term GAP is also often used for the Project Administrators themselves.


Guanine is the “G” of the four bases that make up DNA. Guanine always pairs with cytosine.

The other bases are adenine (A), cytosine (C), and thymine (T).



A haplogroup is a branch on either the maternal or paternal tree of humankind. Haplogroups are associated with early human migrations. Today they can be associated with a geographic region or regions.

Note: Though maternal and paternal haplogroups may have similar naming systems, their definitions are different.


A haplotype is a set of genetic markers inherited together from one parent.  

Two individuals that match exactly on all markers have the same haplotype.


Heredity is the transmission of genetic material from parents to offspring.


Heterozygous means that the two genetic code values (alleles) at a point in the genetic code are different.


Homozygous means that the two genetic code values (alleles) at a point in the genetic code are identical.


Human Genome Organization (HUGO)

The Human Genome Organization (HUGO) is the scientific entity to which, among other things, researchers submit new short tandem repeat (STR) markers for number assignment.

Hypervariable Region (HVR)

A hypervariable region (HVR) is a part of the mitochondrial genome. There are two human hypervariable regions: HVR1 and HVR2. They do not contain genes. Therefore, they have a faster change (mutation) rate than the coding part of the mitochondrial genome.


Identical By Descent (IBD)

IBD stands for identical by descent. This means the DNA matches because it comes from a common ancestor. IBD can refer to a single mutation or to a segment of DNA. If a mutation or segment of DNA is IBD among a group of people, it comes from a common ancestor.

The Family Finder™ relationship predictions require a minimum number of results in a row to be identical in order to identify that the segment is likely to be IBD.

Identical By State (IBS)

IBS stands for identical by state, meaning the DNA matches by coincidence. When two individuals share numerous individual results without being related, those results are IBS.


In genetics, inbreed refers to someone whose parents are related. It most often refers to cases where the relationship is within five generations.


An indel is a type of mutation where genetic code is lost or gained. These are insertions and deletions.


An insertion is when one or more of the letters (nucleotides) of the genetic code is added.


An Israelite is a descendant of the Hebrew tribes.


Junk DNA

Junk DNA is a popular term for DNA that does not contain genes. This is non-coding DNA. Most of the genome consists of non-coding DNA. Because it does not code for specific function, it was long thought to be “junk.” However, scientists have found that in addition to containing markers that are helpful for genetic genealogy, parts of these non-coding regions have regulatory and other functions.



A Levite is a descendant of the Hebrew tribe of Levi. There are strict historic guidelines for who is considered a Levite.


A lineage is all descendants of a specific ancestor.


A locus is a specific location in your genetic code. In a genetic map of our DNA, the locus tells us where to find any base. Each locus is named sequentially so that on chromosome 15 locus 26039212 comes after locus 26039211. The plural of locus is loci.

Longest Block

The Longest Block in the autosomal Family Finder™ test refers to the longest continuous segment of autosomal DNA that is shared between two individuals.



A marker is a physical location (locus) on the chromosome. The term is often used colloquially in genetic genealogy to refer to a short tandem repeat (STR). For example, “The Y-DNA67 test is a panel of 67 markers.”


Meiosis is the stage in the reproductive process in which sperm and egg cells are formed. During meiosis, the autosomal chromosomes recombine and mutations may occur.


A micro-allele is when part of a repeat for a short tandem repeat (STR) is lost.

Microarray Chip

A microarray or SNP chip is a high-density DNA test that is able to test many thousands of single nucleotide polymorphisms (SNPs) at once. The microarray chip is able to capture much of the diversity in someone’s genetic code by sampling known polymorphic loci.


Mitochondria are specialized subunits (organelles) within cells.

In humans, mitochondria are responsible for cell respiration and for producing energy. They evolved into their current state from separate organisms that formed a mutually beneficial (symbiotic) relationship with the larger cell. Because they were once independent, they have their own mitochondrial DNA (mtDNA) genome.

This genome is passed from human mother to child.

Mitochondrial DNA (mtDNA)

The genetic material found in mitochondria. It is passed down from females to both sons and daughters, but sons do not pass down their mother’s mtDNA to their children.


Mizrahi is the branch of the Jewish population that settled in Middle Eastern, North African, and Caucasus countries. This may include Sephardi Jews who moved to these places.

Modal Haplotype

The most common result for each marker tested in a group of results.

See: Ancestral Signature
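
As a minimal sketch of the modal haplotype definition, the code below takes the most common value for each marker across a group of results. The function name modal_haplotype and the marker values are hypothetical, the sketch assumes every result covers the same markers, and ties are resolved in favor of the value seen first.

```python
from collections import Counter


def modal_haplotype(haplotypes: list[dict[str, int]]) -> dict[str, int]:
    """Return the most common value for each marker in a group of results."""
    markers = haplotypes[0].keys()
    return {
        marker: Counter(h[marker] for h in haplotypes).most_common(1)[0][0]
        for marker in markers
    }


# Hypothetical results for three members of a group.
group = [
    {"DYS393": 13, "DYS390": 24},
    {"DYS393": 13, "DYS390": 25},
    {"DYS393": 12, "DYS390": 25},
]

print(modal_haplotype(group))
```

Note that the modal haplotype here does not match any single member of the group exactly; it is built marker by marker.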

Most Recent Common Ancestor (MRCA)

In genetic genealogy, the Most Recent Common Ancestor (MRCA) is the ancestor shared most recently between two individuals.


A mutation is a heritable change that occurs in genetic material. It may lead to a different number of repeats of a certain sequence or a change in one of the bases in a sequence.

Mutation Rate

The frequency with which random mutations occur.


Named Variant 

Named Variants are the Y-DNA SNPs (single nucleotide polymorphisms) that are on the list of 600,000+ known SNPs against which Big Y data is compared.

No Call

A no call occurs when a particular single nucleotide polymorphism (SNP) being analyzed has insufficient data to be confidently given a genotype value.

Non-Coding DNA

Non-coding DNA is DNA that does not contain genes. It may have other functions.

Non-Recombining Y (NRY)

The non-recombining Y (NRY) is the part of the Y chromosome that does not recombine with the X chromosome.

Nuclear DNA

Nuclear DNA is the genetic code that is found inside of the cell’s nucleus. Our autosomal and sex chromosomes are nuclear DNA.

Nucleic Acids

Nucleic acids are the molecules that carry our genetic code. DNA is a nucleic acid made up of four types of nucleotides, which are named for their bases: adenine (A), cytosine (C), guanine (G), and thymine (T).


Nucleotides are structural components of our genetic code. Each nucleotide is composed of one base plus a sugar molecule and a phosphate molecule. The bases are adenine, thymine, guanine, and cytosine, normally represented as A, T, G, and C, respectively.


The nucleus is the membrane-bound organelle containing the chromosomes.


A null value occurs when a mutation prevents a reading from being obtained for a Y-chromosome Short Tandem Repeat (STR). It may be an actual deletion of the entire STR. It may also be caused by a change in DNA values such that the primer does not work.



An organelle is a part of a cell that performs a specialized function. Examples are the nucleus and the mitochondria.


Outbreed is when an individual’s parents’ common ancestry was more than ten generations in the past.


P Arm

The P Arm is the shorter of the two sides (short arm) of a chromosome.

See: Q Arm


A palindrome is something that reads the same way in either direction. In genetic genealogy, it refers to sections of DNA that read the same way in either direction. It is most significant for Y-chromosome DNA because palindromes may be copied over each other.

Parallel Mutation

A parallel mutation is when the same genetic change happens in completely unrelated lineages.


For STRs, a plot that shows the length of a fragment of DNA. This allows its allele value to be measured.


Phylogenetics is the study of how genetics can be used to show how people are related.

Phylogenetic Tree

A phylogenetic tree is a reconstruction through genetics of a lineage.

See also: Y Chromosome Consortium (YCC) Tree


Polymerase is the enzyme that starts the process of making nucleic acids or assembling RNA or DNA.

Polymerase Chain Reaction (PCR)

A technique allowing the production of multiple copies of extremely small amounts of DNA fragments using DNA polymerase and specific primers.


A polymorphism is a change in genetic code (mutation) that has reached a frequency greater than 1% in a local or global population. In genetic genealogy, we most often use it to describe the mutations that define backbone branches. These are related to backbone haplogroups.

Note: The terms polymorphism and mutation in this sense do not refer to anything medical.


A population is a group of people who inhabit a geographic region or share a common origin.

Population Bottleneck

A population bottleneck is when a population is greatly reduced in size.


A primer is a short DNA sequence used in the polymerase chain reaction to initiate DNA synthesis at a particular location.

Private Variant

Private variants are the Y-DNA SNP (single nucleotide polymorphism) markers that are not on the list of 600,000+ known SNPs. These markers may or may not be unique to you as an individual. Men in related lineages may share some Private Variants. As men from distantly related lineages test, SNP markers may be moved from Private Variants to Named Variants. 


A protein is the main building block of our cells. Each protein has a specific function.

Principal Component Analysis (PCA)

Principal Component Analysis is a mathematical method that attempts to separate an admixed data set (here a genetic profile) into one or more contributing groups. It was invented by Karl Pearson in 1901 and is sometimes called the Karhunen-Loève transform or proper orthogonal decomposition. 


Pseudonymization is a form of data masking that makes it more difficult for your data to be misused.

We mask the personal identifying information that is shared with any of the Group Projects that you have joined.


Q Arm

The Q Arm is the longer of the two sides (long arm) of a chromosome.

See: P Arm


Recombinational Loss of Heterozygosity (recLOH)

Recombinational Loss of Heterozygosity (recLOH) is a process by which one copy of genetic code is copied over others. The result is identical values. In genetic genealogy, this is most significant for the Y chromosome. Palindromic STR (short tandem repeat) markers may be copied over each other.

For example, DYS385 may have a,b values of 12,19 for a father. His son may have values of 12,12. This is a single recLOH event.


Recombination is the mixing of the DNA on each chromosome that you receive from your mother and father. Different chromosomes and different parts of each chromosome are more or less likely to recombine in a single generation.

Reconstructed Sapiens Reference Sequence (RSRS)

The Reconstructed Sapiens Reference Sequence (RSRS) is a mitochondrial DNA (mtDNA) reference sequence that uses both a global sampling of modern human samples and samples from ancient hominids. It was introduced in early 2012 as a replacement for the rCRS (revised Cambridge Reference Sequence). Because it is based on the likely modal haplotype of the common ancestor to both modern humans and such ancient groups as the Neanderthals, it shows an unbiased path back from any one modern mtDNA sequence to our distant common maternal ancestor.

Source: Behar, D. M., van Oven, M., Rosset, S., Metspalu, M., Loogväli, E.-L., Silva, N. M., Kivisild, T., Torroni, A., and Villems, R. (2012). A “Copernican” reassessment of the human mitochondrial DNA tree from its root. The American Journal of Human Genetics, 90(4):675-684.


See: DNA Replication

Restriction Enzyme

A protein that recognizes a certain sequence of DNA and cuts the DNA at that site.

Revised Cambridge Reference Sequence (rCRS)

The revised Cambridge Reference Sequence (rCRS) is the revised version of the Cambridge Reference Sequence (CRS), the first mitochondrial DNA sequence completed, which was sequenced in 1981. The rCRS was used as a basis for comparison with mtDNA test results until it was replaced with the RSRS.



Sephardic is the branch of the Jewish population that settled in Spain during the Jewish Diaspora.


See: DNA Sequencing

Sex Chromosome

The X or Y chromosome. Normally males have one X and one Y and females have two Xs.

Short Tandem Repeat (STR)

A short tandem repeat (STR) is a short DNA motif (pattern) repeated in tandem. For example, the motif ATGC repeated eleven times in a row would give the marker a value, or allele, of 11.
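
The repeat count can be illustrated with a short sketch that finds the longest run of back-to-back copies of a motif. The function name count_tandem_repeats and the flanking letters in the sample sequence are hypothetical; laboratory STR testing measures fragment length rather than scanning text this way.

```python
def count_tandem_repeats(sequence: str, motif: str) -> int:
    """Return the longest run of back-to-back copies of motif in sequence."""
    if not motif:
        raise ValueError("motif must not be empty")
    best = 0
    for start in range(len(sequence)):
        run = 0
        pos = start
        # Keep stepping forward one motif length while the motif repeats.
        while sequence.startswith(motif, pos):
            run += 1
            pos += len(motif)
        best = max(best, run)
    return best


# Eleven copies of ATGC with a few hypothetical flanking bases on each side.
sample = "GG" + "ATGC" * 11 + "TT"
print(count_tandem_repeats(sample, "ATGC"))
```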

Single Nucleotide Polymorphism (SNP)

A single nucleotide polymorphism (SNP) is a change in your DNA code at a specific point.

Sister Clade

A sister clade is one of two haplogroups or subclades that are at the same level on a phylogenetic tree. For Y-chromosome research, this is sometimes a brother clade.

For example, on the maternal tree, H6a and H6b are sister clades.


A subclade is a subgrouping in the haplogroups of the human genetic trees. This may be either the Y-chromosome tree or the mitochondrial tree. Subclades are more specific to a location or population group than the major branches (haplogroups).


A surname is a last name or family name. In many Western European countries, it is traditionally passed down from a father to his children.



A telomere is the end of a DNA chromosome. Each of our autosomal and sex chromosomes has two telomeres.


Thymine is the “T” of the four bases that make up DNA. The other bases are adenine (A), cytosine (C), and guanine (G). Thymine always pairs with adenine.

Time To the Most Recent Common Ancestor (TMRCA)

The amount of time or number of generations since individuals have shared a common ancestor. Since mutations occur at random, the estimate of the TMRCA is not an exact number (i.e., seven generations) but rather a probability distribution. As more information is compared, the TMRCA estimate becomes more refined.


A transition is a type of change in the genetic code (mutation). Examples are A <-> G and C <-> T.

Transmission Event

The passage of genetic material from one generation to the next.


A transversion is a type of change in the genetic code (mutation). Examples are A <-> C and G <-> T.
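
The difference between a transition and a transversion can be sketched in code. The sketch relies on the standard grouping of bases into purines (A, G) and pyrimidines (C, T), which this glossary does not define: a change within one group is a transition, and a change between groups is a transversion. The function name classify_substitution is hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base change as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or not {ref, alt} <= PURINES | PYRIMIDINES:
        raise ValueError("expected two different bases from A, C, G, T")
    same_group = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_group else "transversion"


print(classify_substitution("A", "G"))
print(classify_substitution("G", "T"))
```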


Western Atlantic Modal Haplotype (WAMH)

The most common Y-DNA haplotypes found in Europe’s most common Y-DNA haplogroup, R-M269.


X Chromosome

One of the two sex chromosomes, X and Y. Males receive a single X chromosome from their mother, while females receive an X chromosome from both their mother and their father. X is the sex chromosome that is present in both sexes, singly in males and doubly in females.

X Match

A person who matches you on the X chromosome.


Y Chromosome

One of the two sex chromosomes, X and Y. The Y chromosome passes down from father to son. Females do not receive it. As the Y chromosome is passed on through the paternal line, it is valuable for surname-based genealogy studies.

Y Chromosome Consortium (YCC) Tree

A graphic representation of the Y-DNA haplogroups according to the Y Chromosome Consortium (YCC) classification. Haplogroup names and major clades are labeled and mutation names are given along the branches of the trees.
