Pedigree collapse occurs in a person’s family tree when the two members of one of their ancestral couples are related to each other. This causes the same ancestors to be repeated in their tree. For example, if a person’s parents are second cousins through their great-grandparents Joseph Dyer and Anna Smith, then Joseph Dyer and Anna Smith appear in the test taker’s pedigree twice. Instead of having 32 unique third-great-grandparents, this test taker has 28 unique third-great-grandparents. These diagrams illustrate how the pedigree collapses:
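The 32-versus-28 count in the second-cousin example can be checked with a short sketch. The path encoding is an assumption made for illustration: each ancestor is a string of "F" (father) and "M" (mother) steps from the test taker, and the particular lines on which Joseph Dyer and Anna Smith fall are chosen arbitrarily.

```python
from itertools import product

# Hypothetical encoding: each ancestor is a path of "F" (father) / "M" (mother)
# steps from the test taker. The father's great-grandparents sit at FFFF/FFFM
# and the mother's at MFFF/MFFM; which lines they fall on is arbitrary here.
SAME_PERSON = {
    "MFFF": "FFFF",  # Joseph Dyer, reached again through the mother
    "MFFM": "FFFM",  # Anna Smith, reached again through the mother
}

def canonical(path: str) -> str:
    """Map a path that runs through a repeated ancestor onto a single path."""
    for duplicate, original in SAME_PERSON.items():
        if path.startswith(duplicate):
            return original + path[len(duplicate):]
    return path

def unique_ancestors(generations: int) -> int:
    """Count distinct ancestors a given number of generations back."""
    paths = ("".join(p) for p in product("FM", repeat=generations))
    return len({canonical(p) for p in paths})

print(unique_ancestors(4))  # second-great-grandparents: 14 instead of 16
print(unique_ancestors(5))  # third-great-grandparents: 28 instead of 32
```

Everyone above the repeated couple is repeated too, which is why the loss grows from 2 ancestors at one generation to 4 at the next.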
In a previous blog post, Diana defined endogamy, pedigree collapse and multiple relationships. Since this case study has both pedigree collapse and multiple relationships, I’m repeating those definitions here, along with how each one shows up in a person’s family tree:
Multiple relationships occur when you are related to someone on more than one of your family lines, so you have multiple common ancestral couples. Your family tree looks normal, but when comparing it with a DNA match, you find more than one set of ancestors on different lines who could have contributed shared DNA.
Pedigree collapse occurs when cousins reproduce, so you have the same ancestors in more than one place in your family tree. Pedigree collapse can become endogamy if it repeatedly occurs over many generations. However, two to three instances do not qualify the situation as endogamy.
Endogamy occurs when related people reproduce within the same population for hundreds of years – generations and generations. If your ancestry is endogamous, you will have multiple distant ancestors repeated in your family tree over and over.
If you haven’t seen the other posts in our endogamy series, you can read them here:
- Endogamy, Pedigree Collapse, and Multiple Relationships: What’s the Difference and Why Does it Matter? by Diana Elder
- How Multiple Relationships Affect DNA Match Analysis by Nicole Dyer
- Multiple Relationships in an African-American Case Study by Allison Kotter
How to Know if a Test Taker Has Pedigree Collapse
Before we get to the case study, let’s talk about how to know if a DNA test taker has pedigree collapse in their tree. One way to check if a DNA test taker or one of their matches has pedigree collapse or endogamy in their pedigree is to use the “Are Your Parents Related?” tool at GEDmatch. This free tool can help identify Runs of Homozygosity (ROH). ROH indicate segments of DNA where the test taker’s parents contributed identical DNA at the same location. This tips you off that the test taker’s parents are related, and there is probably pedigree collapse or endogamy in their family tree. However, if the pedigree collapse occurs on only one parent’s side, the tool won’t detect it.
A DNA test taker whom I will call Frank has some pedigree collapse in his tree. Running the “Are Your Parents Related?” (AYPR) tool at GEDmatch on Frank’s results revealed two short runs of homozygosity. This indicated that his parents are distantly related.
After carefully reviewing Frank’s pedigree, I found that his parents were third cousins through one set of common ancestors, as well as third-cousins-once-removed through another set. I also noticed an instance of pedigree collapse further back on his paternal line. One way to visualize pedigree collapse is the Exploring Family Trees visualization tool at LearnForeverLearn.com/ancestors. I uploaded a GEDCOM file to visualize Frank’s tree there. As you can see, Frank has at least three instances of pedigree collapse. Two of the collapses cross over between Frank’s maternal and paternal lines, which explains the results from the AYPR tool.
How does pedigree collapse affect DNA match analysis? We’ll explore that in a case study about the father of Glen Hopper. The test taker for the Hopper case does not have known pedigree collapse in their own family tree. However, a close DNA match turned out to have pedigree collapse on the relevant line, which called for additional analysis.
Glen Hopper was born in 1902. He was the son of Susan Hopper and an unknown father. Family lore stated Glen’s father was Dan Daniel, a coal miner. Y-DNA evidence pointed to Glen’s father’s surname being Daniel. However, Glen’s social security application stated his father’s name was Robert Daniel, not Dan Daniel.
Figure 1: Glen’s Child’s Paternal Pedigree
To determine if the father of Glen Hopper was Robert Daniel or some other man, the autosomal DNA results of one of Glen’s children were analyzed. This revealed a close DNA match, whom I will call Joseph for the sake of privacy, who shares 239 cM with Glen’s child. In Joseph’s tree, shown in figure 2, several ancestors with the surnames Hopper and Daniel jumped out.
Figure 2: Joseph’s Hopper and Daniel Ancestors
The most obvious set of common ancestors is Robert Anderson Hopper and Susan Pollard. Glen’s mother is the grandchild of Robert Anderson Hopper and Susan Pollard. In figure 3, you can see that Glen’s child and Joseph are third-cousins-once-removed (3C1R) through Glen’s mother, Susan Hopper.
Figure 3: Relationship between Glen’s Child and Joseph is 3C1R
According to the Shared cM Project, 3C1R share an average of 48 cM, with a range of 0-192 cM. The amount shared with Joseph, 239 cM, is outside of this range. This tipped us off that more relationships likely exist between Glen’s child and Joseph, contributing to an inflated amount of shared DNA.
Our hypothesis is that Glen’s father was Robert Daniel – which would be an additional relationship contributing to an inflated amount of DNA. Joseph’s grandmother was a daughter of Robert Daniel. If Robert Daniel was the father of Glen, then Glen’s child and Joseph are also half-first-cousins-once-removed (half 1C1R), as shown in figure 4.
Figure 4: Possible Relationship Between Glen’s Child and Joseph: H1C1R
The Shared cM Project says that half 1C1R share an average of 224 cM, with a range of 62-469 cM. This relationship seems more likely to describe the 239 cM shared between Joseph and Glen’s child. If you add the 3C1R average to the half 1C1R average, you get 48 + 224 = 272 cM. This is a strategy you could use to estimate the expected amount of shared DNA for those with multiple relationships (not pedigree collapse or endogamy).
After reviewing Joseph’s pedigree further, it appears that Joseph’s great-grandparents, Robert Daniel and Etta Hopper, are first cousins – resulting in pedigree collapse in Joseph’s tree. His 3rd-great-grandparents, Marcus Daniel and Martha Mayes (green boxes in figure 2) are in his tree twice. Joseph received extra DNA from them, possibly causing him to share even more DNA with Glen’s child if Glen is a descendant of Marcus Daniel as we have hypothesized. So, if Robert Daniel was Glen’s father, Glen’s child has three distinct relationship paths with Joseph:
- 3C1R through Robert Anderson Hopper and Susan Pollard (up: Glen’s child > Glen > Susan Hopper > George Hopper > Robert Anderson Hopper/Susan Pollard; down from there: John Hopper > Etta Hopper > Elsie Daniel > Nell > Joseph)
- Half 1C1R through Robert Daniel (up: Glen’s child > Glen > Robert Daniel; down from there: Elsie Daniel > Nell > Joseph)
- 3C1R through Marcus Daniel and Martha Mayes (up: Glen’s child > Glen > Robert Daniel > Marcus Daniel [Jr.] > Marcus Daniel/Martha Mayes; down from there: Martha Daniel > Etta Hopper > Elsie Daniel > Nell > Joseph)
As if these three relationship paths were not enough, there is a possibility for additional relationship paths between Glen’s child and Joseph through Joseph’s ancestor, Jane Hopper. So far, Jane’s parentage is unknown, but it appears that she was not a daughter of Robert Anderson Hopper and Susan Pollard. Jane was probably related to these Hoppers somehow, because they were all living in the same area, but the exact connection is not known. Therefore, additional pedigree collapse further back in Joseph’s tree is possible. Also, if Robert Daniel was the father of Glen, it’s possible for Glen’s child to have pedigree collapse further back in their family tree as well.
To use the match with Joseph as evidence for or against the hypothesis, we first need to compare the combined expected amount of shared DNA for these relationships with the actual amount of shared DNA. If Robert Daniel was the father of Glen Hopper, the closest relationship between Glen’s child and Joseph would be half first cousin once removed. What is the best way to estimate the expected amount of shared DNA when there is pedigree collapse in one of the test taker’s trees? Can you simply add up the average amounts from the Shared cM Project to get an estimate? 48 cM (3C1R) + 224 cM (H1C1R) + 48 cM (3C1R) = 320 cM?
No, we can’t do that with pedigree collapse. One issue with that method is that the Shared cM Project is biased to be high. Some 3C1R won’t match at all, but people who are self-reporting their matches in the Shared cM Project are not likely to report cousins that don’t match. Also, the volunteers who submit data are more likely to figure out the relationship with cousins who share more DNA.
Rather than adding up the averages from the Shared cM Project like you would do with a case of multiple relationships (e.g., double cousins), a better option for pedigree collapse is to use the coefficient of relationship. This helps you break down each path to all the common ancestors.
Coefficient of Relationship
To estimate the expected amount of shared DNA between two people, geneticists use a mathematical formula called the “coefficient of relationship.” Sewall Wright defined it in 1922, building on his earlier coefficient of inbreeding. 1 The coefficient ranges from 0 to 1, with 1 being the most related and 0 being the least related.
First, identify each individual who is a common ancestor of the two people. A straightforward example is first cousins. They share two most recent common ancestors: their grandfather and grandmother. Counting the generations separating the cousins through their grandfather gives 2 steps up from cousin A and 2 steps down to cousin B, for a total of 4 steps. The same count applies to their grandmother. Each path contributes 1/2 raised to the number of steps, and the contributions are added together. Their calculation looks like this:
(1/2)^4 + (1/2)^4 = 1/16 + 1/16 = 1/8 = 0.125
So, first cousins have a coefficient of relationship of 0.125 or 12.5% (the equivalent of about 875 cM). This is how genetic genealogists estimate how much DNA a person might share with their relatives. The actual amount first cousins share ranges from 396 to 1397 cM, as observed by the Shared cM Project.
It’s important to remember that each person is considered a separate relationship path, rather than ancestral couples being grouped together in the relationship paths. We often think of common ancestral couples in genetic genealogy, but when calculating the coefficient of relationship, we need to think of each ancestor separately.
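As a minimal sketch of the per-ancestor counting described above, the calculation can be written as a sum of (1/2)^n over the paths, one path per individual common ancestor. The function name and the `GENOME_CM` constant are illustrative choices; 7000 cM is the total genome size this article uses for conversion.

```python
def coefficient_of_relationship(path_lengths):
    """Sum (1/2)**n over every path through a single common ancestor.

    Each entry in path_lengths is the number of generational steps up from
    one person to the ancestor plus the steps down to the other person.
    """
    return sum(0.5 ** n for n in path_lengths)

GENOME_CM = 7000  # total genome size assumed for converting to centiMorgans

# First cousins: one 4-step path through the grandfather, one through the grandmother.
first_cousins = coefficient_of_relationship([4, 4])
print(first_cousins)              # 0.125
print(first_cousins * GENOME_CM)  # 875.0

# Half first cousins share only one grandparent, so there is a single path.
half_first_cousins = coefficient_of_relationship([4])
print(half_first_cousins)         # 0.0625
```

Listing the grandfather and grandmother as two separate entries, rather than one entry for the couple, is what keeps half relationships and repeated ancestors easy to handle.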
With the Hopper example, we need to list each common ancestor between Glen’s child and Joseph, including those who are included in the pedigree twice due to pedigree collapse. Then we will count the number of generations separating them. I used Lucidchart to do this, adding small blue numbers on each connecting line (see figure 5).
- Robert Daniel – the hypothesis is that he was the father of Glen. Elsie was his daughter through a different woman (not Glen’s mother). So there is only one common ancestor for that relationship, which is why it is a half relationship (H1C1R). There are 5 generations separating Glen’s child and Joseph, counting up two steps and down three steps.
- Robert Anderson Hopper and Susan Pollard – Glen’s child and Joseph are 3C1R through these two common ancestors. From Glen’s child up to Robert Anderson Hopper is 4 steps, and down to Joseph is 5, for a total of 9 generations of separation. This is the same for Susan Pollard – 9 generations of separation.
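- Marcus Daniel and Martha Mayes – although they already appear in Joseph’s pedigree as the grandparents of Robert Daniel, they also appear in another spot in Joseph’s pedigree, as the grandparents of Etta Hopper. We need to include this path in our calculation to account for the extra DNA they passed through Etta to Joseph. From Glen’s child up through Robert Daniel to each of them is 4 steps, and down through Etta Hopper to Joseph is 5, for a total of 9 generations of separation each.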
Figure 5: Distinct relationship paths between Glen’s child and Joseph
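Now that we have determined each of the most recent common ancestors and the number of generations separating Glen’s child and Joseph through those ancestors, we can plug those numbers into the coefficient of relationship calculation: one 5-step path through Robert Daniel, two 9-step paths through Robert Anderson Hopper and Susan Pollard, and two 9-step paths through Marcus Daniel and Martha Mayes.
(1/2)^5 + 2 × (1/2)^9 + 2 × (1/2)^9 = 1/32 + 1/256 + 1/256 = 0.0390625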
The coefficient of relationship between Glen’s child and Joseph is 0.039, or 3.9%. To convert this into centiMorgans, we can multiply 0.039 x 7000 cM, which equals 273 cM. This coefficient of relationship formula predicts that Glen’s child should share about 273 cM with Joseph, if Robert Daniel is the father of Glen. The actual amount that they share is 239 cM. The difference between the estimated expected amount and the actual amount is 34 cM. That’s pretty close! Maybe our hypothesis is correct.
In the past, many have used 6800 cM as the total genome size when converting percentages of shared DNA into cM.2 However, Leah Larkin graciously reviewed this article and suggested using 7000 cM instead. The 6800 cM figure was based on an old map by FamilyTreeDNA. She says that the genome size varies by testing company, but most are slightly over or under 3500 cM. When doubled for paternal and maternal chromosomes, the total genome size is about 7000 cM.
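The estimate for the hypothesis that Robert Daniel was Glen’s father can be reproduced with a short sketch that lists each common ancestor and step count from the paths described above, then converts the result using both genome sizes mentioned. The variable names are illustrative only.

```python
# Each entry: (common ancestor, steps up from Glen's child + steps down to Joseph)
paths_if_robert_is_father = [
    ("Robert Daniel", 2 + 3),
    ("Robert Anderson Hopper", 4 + 5),
    ("Susan Pollard", 4 + 5),
    ("Marcus Daniel (second appearance, via Etta Hopper)", 4 + 5),
    ("Martha Mayes (second appearance, via Etta Hopper)", 4 + 5),
]

# Add (1/2)**steps for every individual common ancestor.
coefficient = sum(0.5 ** steps for _, steps in paths_if_robert_is_father)
observed_cm = 239  # cM actually shared between Glen's child and Joseph

print(f"coefficient of relationship: {coefficient}")
for genome_cm in (7000, 6800):
    expected_cm = coefficient * genome_cm
    print(f"{genome_cm} cM genome: expected {expected_cm:.0f} cM, "
          f"observed {observed_cm} cM, "
          f"difference {expected_cm - observed_cm:.0f} cM")
```

The unrounded coefficient is 0.0390625, so the 7000 cM estimate comes out a fraction above the 273 cM obtained by rounding to 0.039 first. Either genome size leaves the observed 239 cM within a few dozen cM of the estimate.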
Another Hypothesis – Glen’s Father was a Brother of Robert
Perhaps Robert Daniel was not the father of Glen Hopper. If this is the case, we need another hypothesis to test. Perhaps one of Robert Daniel’s older brothers (James, George, Luther, or Jesse) was the father of Glen. In that case, we would need to adjust one part of our coefficient of relationship estimate. Instead of having a half 1C1R relationship with Joseph, Glen’s child would be 2C1R through the parents of Robert Daniel. Robert’s parents were Marcus Daniel [Jr.] and Jane Hopper.
The coefficient of relationship calculation would now include both Marcus Daniel [Jr.] and Jane Hopper.
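A rough sketch of how the two hypotheses compare is given here. The step counts for the second hypothesis are an assumption derived from the relationships described in this article, not figures taken from the original calculation: a 2C1R path is 3 steps up (Glen’s child > Glen > Robert’s brother > Marcus Daniel [Jr.] or Jane Hopper) plus 4 steps down (Robert Daniel > Elsie Daniel > Nell > Joseph), and the four 9-step paths are assumed to stay the same. Any undiscovered connection through Jane Hopper’s unknown parents is ignored.

```python
def coefficient(path_lengths):
    """Sum (1/2)**n over each path through one individual common ancestor."""
    return sum(0.5 ** n for n in path_lengths)

GENOME_CM = 7000  # total genome size assumed for converting to centiMorgans

# Hypothesis 1: Robert Daniel is Glen's father.
# One 5-step path (Robert Daniel) plus four 9-step paths.
hypothesis_1 = coefficient([5, 9, 9, 9, 9])

# Hypothesis 2 (assumed step counts): a brother of Robert is Glen's father.
# Two 7-step paths (Marcus Daniel [Jr.] and Jane Hopper) replace the 5-step path.
hypothesis_2 = coefficient([7, 7, 9, 9, 9, 9])

for label, value in (("Hypothesis 1", hypothesis_1), ("Hypothesis 2", hypothesis_2)):
    print(f"{label}: coefficient {value}, about {value * GENOME_CM:.0f} cM")
```

Under these assumptions the second hypothesis gives a coefficient of 0.0234375, or about 164 cM, compared with 0.0390625 (about 273 cM) for the first.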
We have to remember that the coefficient of relationship is an estimate. The actual amount of shared DNA will vary due to random recombination. People don’t always inherit the average amount of DNA from each common ancestor. Since we only have one data point in this example, it’s hard to see if the amount of shared DNA we observe between Glen’s child and Joseph is an outlier. If we had several data points, and they all pointed to the shared cM amounts being what’s expected if Robert is Glen’s father, we would have a stronger case. We would also need to eliminate Robert’s brothers as potential fathers. One way to do this is by testing descendants of Robert’s siblings.
Our intermediate conclusion is that both hypothesis 1 and hypothesis 2 are possible. More test-takers and matches are needed to confirm or reject the hypothesis that Robert Daniel was the father of Glen Hopper. The best test-takers to prioritize are those whose ancestors did not intermarry with this same population, whom Paul Woodbury calls genetic pioneers.
Coefficient of Relationship Calculations Chart
As I was working on these calculations, I made a chart in Google Sheets that helped me visualize raising 1/2 to the power of 4, 5, 6, etc. If this is helpful for your calculations as well, you can copy the Google Sheet here. Are you wondering why the relationships in the chart are all half relationships? It’s because when calculating the coefficient of relationship, we are breaking down each relationship path to focus on one common ancestor at a time. When you say first cousins, that means you have two common ancestors (grandma and grandpa). When you say half 1C, that means you share only one common grandparent.
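For readers who prefer code to a spreadsheet, a similar table of powers of 1/2 can be generated with a few lines. This is only a sketch: the layout and the relationship labels are illustrative and are not copied from the Google Sheet, and the cM column assumes the 7000 cM genome size used in this article.

```python
GENOME_CM = 7000  # total genome size assumed for converting to centiMorgans

# Illustrative labels for some single-ancestor (half) relationships,
# keyed by the total number of steps up and down.
LABELS = {
    2: "half sibling",
    4: "half 1C",
    5: "half 1C1R",
    6: "half 2C",
    7: "half 2C1R",
    8: "half 3C",
    9: "half 3C1R",
}

def half_power_rows(max_steps=10):
    """Return (steps, fraction, cM, label) for 1/2 raised to each power."""
    rows = []
    for steps in range(1, max_steps + 1):
        fraction = 0.5 ** steps
        rows.append((steps, fraction, fraction * GENOME_CM, LABELS.get(steps, "")))
    return rows

for steps, fraction, cm, label in half_power_rows():
    print(f"{steps:>2}  1/{2 ** steps:<5} {fraction:.6f}  {cm:7.1f} cM  {label}")
```

To estimate a full relationship, add the row for each common ancestor; first cousins, for example, are two 4-step rows.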
Another Example of the Coefficient of Relationship
In the chapter on endogamy and pedigree collapse in Advanced Genetic Genealogy, Kimberly Powell gives an example of calculating the coefficient of relationship. Her example includes a test taker and DNA match who both have pedigree collapse on the line they are related through, as well as multiple relationships. Her example includes ten distinct relationship paths and results in a coefficient of relationship of 0.017578 or 1.76%.3 I like how she used a genetics pedigree diagram with males as squares and females as circles and each common ancestor labeled A-J to succinctly identify relationship paths through distinct individuals.
Kimberly’s example uses two methods of calculating the expected amount of shared DNA – the coefficient of relationship, and also adding up the expected percentages of shared DNA for relationships based on the Autosomal DNA Statistics page at the ISOGG Wiki. For this method, Kimberly used a matrix showing the relationships between the test taker and the match and the expected percentages of DNA. The percentages for each relationship were then added up. The answer was similar to her coefficient of relationship calculation.
Takeaways for Working with Pedigree Collapse
If you are working with DNA results that include pedigree collapse, remember the following tips.
- Check for runs of homozygosity using the GEDmatch tool, “Are Your Parents Related?”
- Visualize pedigree collapse with the Exploring Family Trees visualization tool at https://learnforeverlearn.com/ancestors/.
- Carefully review the pedigrees of both the test taker and the DNA match to look for pedigree collapse and/or multiple relationships.
- If pedigree collapse occurs in one of your DNA matches’ trees, and the MRCA couple is in their tree twice, you may share an inflated amount of DNA.
- Calculate the coefficient of relationship to estimate expected amounts of shared DNA for each relationship path to a common ancestor. If an ancestral couple is in the tree twice due to pedigree collapse, they could appear as a separate path.
- Incorporate more test-takers to better understand unknown relationships when pedigree collapse is involved. Test genetic pioneers whose ancestors left the population. Consider Y-DNA and mitochondrial DNA tests as they are often very helpful in untangling trees with pedigree collapse or endogamy.
Also, be sure to reach out to those who have worked with pedigree collapse before for advice! I did that with this article, and am very grateful for the feedback I received. Thank you to Yvonne Fenster, Leah Larkin, and Kimberly Powell.
Resources for Pedigree Collapse
To learn more about pedigree collapse, visit the following websites and resources. All websites were accessed 5 November 2022.
“Are Your Parents Related” Tool. GEDmatch.com, https://gedmatch.com.
“Coefficient of Relationship.” International Society of Genetic Genealogy Wiki. https://isogg.org/wiki/Coefficient_of_relationship.
Cooper, Kitty. “When the DNA says your parents are related.” Blog post. 25 July 2018. Kitty Cooper’s Blog. http://blog.kittycooper.com/2018/07/when-the-dna-says-your-parents-are-related/.
Ekins, Jane. “Calculating Pedigree Collapse on DNA Matches.” Blog post. Your DNA Guide. https://www.yourdnaguide.com/ydgblog/calculating-the-pedigree-collapse-effect-in-your-dna-matches.
Ekins, Jane. “Pedigree Collapse and Your DNA Matches.” Blog post. Your DNA Guide. https://www.yourdnaguide.com/ydgblog/pedigree-collapse-and-genetic-relationships.
Estes, Roberta. “What’s the Difference Between Pedigree Collapse and Endogamy?” Blog post. 23 July 2021. DNA Explained. https://dna-explained.com/2021/07/23/whats-the-difference-between-pedigree-collapse-and-endogamy/.
Larkin, Leah. “The Endogamy Files: Visualizing Endogamy in Your Tree.” Blog post. 26 March 2022. The DNA Geek. https://thednageek.com/the-endogamy-files-visualizing-endogamy-in-your-tree/.
Larkin, Leah. “The Endogamy Files: What Is Endogamy?” Blog post. 5 May 2020. The DNA Geek. https://thednageek.com/the-endogamy-files-what-is-endogamy/.
Lyon, B.F. Exploring Family Trees. https://learnforeverlearn.com/ancestors/.
“Pedigree Collapse.” International Society of Genetic Genealogy Wiki. https://isogg.org/wiki/Pedigree_collapse.
Powell, Kimberly T. “The Challenge of Endogamy and Pedigree Collapse.” Debbie Parker Wayne, editor. Advanced Genetic Genealogy: Techniques and Case Studies. Cushing, Texas: Wayne Research, 2019. Pages 127-153.
Woodbury, Paul. “Dealing with Endogamy, Part I: Exploring Amounts of Shared DNA.” Blog post. Legacy Tree Genealogists. https://www.legacytree.com/blog/dealing-endogamy-part-exploring-amounts-shared-dna.
We also encourage you to take Diahan Southard’s online endogamy class. Learn more here: Start Untangling Your Family Tree | Endogamy & DNA Course.
To learn how to incorporate DNA into your documentary research, join our online, independent study course, Research Like a Pro with DNA. In this course, we take you step by step through organizing and analyzing your matches and applying DNA evidence to focused research questions.
- “Coefficient of relationship,” Wikipedia (https://en.wikipedia.org/wiki/Coefficient_of_relationship : last edited 17 October 2022, at 15:50 (UTC)).
- “Autosomal DNA statistics,” International Society of Genetic Genealogy Wiki (https://isogg.org/wiki/Autosomal_DNA_statistics#Distribution_of_shared_DNA_for_given_relationships : last edited on 17 October 2022, at 17:29), section titled “Distribution of shared DNA for given relationships.” See also “CentiMorgan,” International Society of Genetic Genealogy Wiki (https://isogg.org/wiki/CentiMorgan#Converting_centiMorgans_into_percentages : last edited on 14 December 2020, at 22:50), section titled “Converting centiMorgans into percentages.”
- Kimberly T. Powell, “The Challenge of Endogamy and Pedigree Collapse,” Debbie Parker Wayne, editor, Advanced Genetic Genealogy: Techniques and Case Studies (Cushing, Texas: Wayne Research, 2019), 137-138. I reached out to Kimberly and she shared that there are two errors on pp. 137-138: in image 6.7 there should be a connecting line from H to parents A and B, and on p. 138, paths 5 and 6 should be 1/2 to the 10th power = 1/1024.