DNA is composed of two parallel strands connected by four chemicals, called bases. These bases (A, C, T, and G) occur in pairs, with one-half of each pair attached to each strand. These pairs are called nucleotides and connect the two strands like rungs on a ladder. The bases always make pairs of AT and GC.
Short Tandem Repeats (STRs) are sequences of identical nucleotides that repeat themselves multiple times within a continuous stretch of the Y chromosome. We call these sequences markers. The number of times these sequences repeat is called the value of that STR marker.
In some cases, the same STR occurs in more than one location on the Y chromosome. We call these multi-copy markers. Even though the locations are different, the identical pattern of nucleotides in each location tells us this is the same STR. When this occurs, we report the values separated by hyphens, with each number representing the number of repeats in one location.
For example, DYS385 is a repeat of the nucleotide sequence GAAA that occurs in at least two different locations. If there are 14 repeats in each location, we will report the value as DYS385=14-14. The individual repeat values are sometimes referred to as DYS385a and DYS385b.
Only certain markers will appear in multiple locations, and this is due to a unique structure called a palindrome.
In some places, a pinch in one strand of the Y chromosome causes a bulge shaped like a hairpin. This formation causes the nucleotides of a single strand to attach to each other instead of attaching to nucleotides on the other strand. On the two sides of the hairpin, the nucleotides connecting each other are mirror images. Because A always pairs with T and C always pairs with G, when the hairpin is straightened out, the strand will be the same read forward or backward. In other words, they are palindromes.
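The mirror-image pairing described above can be checked mechanically. The following Python sketch is only an illustration: the sequences are invented for the example (they are not real Y-chromosome data), and the helper names `reverse_complement` and `arms_pair_up` are hypothetical. It shows that the two arms of a hairpin can attach to each other when one arm is the complement of the other read in the opposite direction.

```python
# Base-pairing rules from the text: A pairs with T, and C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the bases that pair with `seq`, read in the opposite direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def arms_pair_up(left_arm: str, right_arm: str) -> bool:
    """Return True if the two arms of a hairpin can attach base by base."""
    return right_arm == reverse_complement(left_arm)


# Invented example: one arm holds two GAAA repeats.
left = "GAAAGAAA"
right = reverse_complement(left)
print(right)                      # TTTCTTTC
print(arms_pair_up(left, right))  # True
```

Note that the same number of repeats appears on both arms here, which corresponds to a value such as 14-14 rather than 13-14.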
When an STR occurs in a palindrome, we count the number of repeats on each side. In many cases, this will occur in equal numbers on each side, such as DYS385=14-14. However, sometimes a repeat can be gained or lost on one side of the palindrome but not the other. This will result in a value such as DYS385=13-14.
It is also common to see the entire sequence occurring more than once on each side of the palindrome. One example is DYS464. In this palindromic STR, you will usually see four values, like 15-15-17-17.
Sometimes not just a single repeat but an entire sequence is added or removed, resulting in a value like 15-15-15-17-17-17.
These two distinct types of changes within a palindrome are called marker changes and copy changes.
Marker changes refer to changes in the values of individual marker copies rather than in the number of copies. We count the total number of copies whose values differ, regardless of the order in which the values appear. Let’s look at some values of DYS464 that we might see in two people.
Person A: DYS464=15-15-15-16
Person B: DYS464=15-15-16-16
In this example, the two people would have a Genetic Distance (GD) of 1, because the value differs on one marker copy.
Person A: DYS464=15-15-15-16
Person B: DYS464=15-15-16-15
In this example, the two people would have a GD of 0. Even though the values are not in the same order, there are still 3 copies with a value of 15 and one with a value of 16.
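The order-independent comparison in these two examples can be sketched in Python. This is a minimal illustration, not a description of any actual matching software: the function names are hypothetical, it assumes both people have the same number of copies, and it assumes each differing copy counts as 1 no matter how many repeats apart the values are.

```python
from collections import Counter


def parse_value(value: str) -> list[int]:
    """Split a reported value such as '15-15-15-16' into repeat counts."""
    return [int(part) for part in value.split("-")]


def marker_changes(value_a: str, value_b: str) -> int:
    """Count the copies whose values differ, ignoring the order of the copies.

    Assumption: both values contain the same number of copies.
    """
    a = Counter(parse_value(value_a))
    b = Counter(parse_value(value_b))
    # Counter subtraction keeps only the copies in `a` that have no match in `b`.
    return sum((a - b).values())


print(marker_changes("15-15-15-16", "15-15-16-16"))  # 1
print(marker_changes("15-15-15-16", "15-15-16-15"))  # 0
```

Treating each value as a multiset (a `Counter`) rather than a list is what makes the second comparison come out as 0 even though the copies are listed in a different order.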
Copy changes refer to changes in the number of copies, regardless of their values. When calculating Genetic Distance (GD) for copy changes, we count any difference in the number of copies as 1. Because these markers sit in palindromes, if more than one copy is lost or gained, it is still assumed to have occurred in a single mutation event.
Person A: DYS464=15-15-15-16-16-16
Person B: DYS464= 15-15-16-16
In this example, the GD would still be 1. Even though there are two additional copies of the marker in Person A, it is probable that both were added at the same time. In addition, it would not make sense to add together the total number of repeats in the extra copies, because those copies aren’t different in Person B, they simply do not exist.
In other words, we would not add together the repeats in the two extra copies and say these two people have a GD of 31 (15+16). We would still count this difference as a GD of 1. This applies regardless of how many more copies of a marker one person has than another.
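The copy-change rule can be written as a very small Python function. This is a sketch with a hypothetical function name; it only encodes the rule stated above, that any difference in the number of copies counts as 1.

```python
def copy_changes(value_a: str, value_b: str) -> int:
    """Return 1 if the two values have a different number of copies, else 0."""
    copies_a = len(value_a.strip().split("-"))
    copies_b = len(value_b.strip().split("-"))
    # The size of the difference is ignored: two extra copies still count as 1.
    return 0 if copies_a == copies_b else 1


print(copy_changes("15-15-15-16-16-16", "15-15-16-16"))  # 1
print(copy_changes("15-15-15-16", "15-15-16-16"))        # 0
```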
Calculating Genetic Distance
For multi-copy markers, the marker changes and copy changes between two individuals at a specific marker are combined to produce the GD for that marker. Because two types of mutation can affect multi-copy markers, their GD tends to be higher than that of other markers. For this reason, they are often referred to as fast-mutating markers. When comparing GD on fast-mutating markers, keep this increased rate of change in mind when estimating the time to the most recent common ancestor (TMRCA).
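As a rough illustration of combining the two kinds of change, the Python sketch below adds a marker-change count to a copy-change count for a single multi-copy marker. The function name is hypothetical, and one detail is an assumption that the text does not spell out: when the number of copies differs, the extra copies are not also counted as marker changes, so only values that cannot be matched on either side contribute.

```python
from collections import Counter


def multi_copy_gd(value_a: str, value_b: str) -> int:
    """Estimate the GD for one multi-copy marker (illustrative sketch only)."""
    a = Counter(int(part) for part in value_a.split("-"))
    b = Counter(int(part) for part in value_b.split("-"))

    # Copy change: any difference in the number of copies counts as 1.
    copy_change = 0 if sum(a.values()) == sum(b.values()) else 1

    # Marker changes: unmatched values, ignoring order. Taking the smaller
    # side is an assumption that keeps extra copies from being counted twice.
    unmatched_a = sum((a - b).values())
    unmatched_b = sum((b - a).values())
    marker_change = min(unmatched_a, unmatched_b)

    return marker_change + copy_change


print(multi_copy_gd("15-15-15-16", "15-15-16-16"))        # 1
print(multi_copy_gd("15-15-15-16", "15-15-16-15"))        # 0
print(multi_copy_gd("15-15-15-16-16-16", "15-15-16-16"))  # 1
```

The three calls reproduce the DYS464 examples given earlier; results for other combinations depend on the assumption stated above and may not match how a particular testing company reports GD.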