What is the value of taking an immersive DNA course from an expert in the field? New perspectives, information, and practical applications to genetic genealogy. I completed the Advanced DNA Evidence course coordinated by Blaine Bettinger as part of the Genealogical Research Institute of Pittsburgh. The course exceeded my expectations, and I took copious notes on the latest and greatest advances in DNA as it relates to proving and confirming our ancestors. I highly recommend the course to anyone interested in learning more about the science of genetic genealogy from someone who can explain it in layman’s terms. Blaine will be teaching the course in person in June 2023 in Pittsburgh.
About 60 classmates joined me each day for the weeklong virtual course. Blaine informed us the class material would be heavy on science, which was exactly what I wanted. Learning the science behind how we use DNA for genealogy can seem intimidating, but if we want to use it correctly, we need at least a high-level understanding. At best, we’ll dig deeper and seek to stay abreast of new developments. With that introduction, here are five of my many takeaways. I’m sharing these because questions about these topics often come up among members of our Research Like a Pro study groups and courses.
Learn about Y-DNA Testing at Family Tree DNA
With so many Y-DNA tests available at Family Tree DNA, you may wonder where to start. A good approach is to start with an STR (Short Tandem Repeat) test at 37 or 111 markers, then move to the Big Y-700 if the results look promising. If there are no good matches at 111 markers, the Big Y-700 won’t offer additional matches. The Big Y-700 is best for testing hypotheses. For example, if you have researched a possible ancestor and hypothesized he belongs to your family, you can trace his known male descendants and seek a candidate for targeted testing.
Blaine gave us an excellent analogy for comparing the tests. The Y-37 test only compares DNA across 37 STRs on the Y chromosome. This test is like an old black and white TV that is grainy and doesn’t give a great picture. The Y-111 test compares the DNA across 111 STRs and can be compared to a nice color TV with a clear picture. The Big Y-700 compares across 700 STRs and is like a high-definition TV with incredible clarity.
The Big Y-700 is a big upgrade from the previous Big Y-500. It discovers new SNPs (single nucleotide polymorphisms) through sequencing and places you on the Y-DNA Haplogroup Family Tree. If you and close relatives such as a father, son, uncle, or cousin all do the Big Y-700 testing, you’ll be able to see your family group on the Y-DNA Haplogroup Family Tree. You can also create a surname project and test a variety of males to see how they fit on the tree. Read more of the science behind the Big Y-700 in the FTDNA white paper. Blaine’s blog post “Thinking About a BigY Test at Family Tree DNA?” adds more details.
Be Wary of Small Segments
When viewing a small shared segment, say under 20 cM (centimorgans), we simply can’t tell if it is IBS (identical by state) or IBD (identical by descent). The ISOGG wiki states:
In genetic genealogy the term IBS is generally used to describe segments which are not identical by descent and therefore do not share a recent common ancestor. IBS is also used in genetic genealogy to describe small IBD segments which are shared by many people both within and between populations and which have no genealogical relevance. 1
Why is it important to know if a segment we share with a DNA match is IBD or IBS? If we tie our hypothesis of a common ancestor to a segment of only 8 cM, our conclusion could be completely wrong, even if we triangulate on that segment with two individuals. The FamilyTree DNA – Family Finder Matching 5.0 White Paper includes a table showing that the danger of false positives (segments that are IBS and not inherited from a recent common ancestor) grows as segment size decreases. The table reveals that for segments less than 6 cM, the percentage of IBS segments increases exponentially.
For this reason, each testing company assigns a different threshold to maximize IBD segments and minimize IBS segments, but it is up to us to do our own analysis. These thresholds are not perfect and still result in false positives.
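As a minimal sketch of doing that analysis ourselves, the snippet below sorts a match's shared segments into larger ones and smaller ones that deserve extra caution. The `Segment` class, the `split_by_size` function, and the sample segments are hypothetical and made up for illustration. The 20 cM default simply echoes the "say under 20 cM" figure above; it is an adjustable assumption, not a rule from any testing company.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One shared segment as it might appear in a chromosome browser export."""
    chromosome: str
    start: int
    end: int
    cm: float


def split_by_size(segments, min_cm=20.0):
    """Return (larger, smaller): segments at or above min_cm, and those below it."""
    larger = [s for s in segments if s.cm >= min_cm]
    smaller = [s for s in segments if s.cm < min_cm]
    return larger, smaller


# Hypothetical segments, invented for this example only.
segments = [
    Segment("1", 10_000_000, 45_000_000, 32.5),
    Segment("7", 5_000_000, 12_000_000, 8.0),
    Segment("12", 60_000_000, 90_000_000, 21.1),
]

larger, smaller = split_by_size(segments)
for s in smaller:
    print(f"Treat with caution: chr {s.chromosome}, {s.cm} cM")
```

A segment landing in the smaller group is not automatically IBS. The split only flags which segments should not carry the weight of a conclusion on their own.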
Learn more in this blog post by Ann Turner, “Anatomy of an IBS Segment.”
Explore Outliers on The Shared cM Project
To analyze a DNA match and the hypothesized relationship we share, we rely heavily on the Shared cM Project that Blaine created based on thousands of submissions from DNA test-takers. Jonny Perl hosts the project on his website, DNA Painter. When analyzing the amount of DNA you share with another person, an excellent practice is to view the histogram for the hypothesized relationship and see where the amount of DNA falls. The histogram is a bar graph showing the distribution of shared DNA for a reported relationship. You can view the histogram by clicking on the relationship.
For example, in the image below, you see the histogram for 1C1R (first cousin once removed). You can see that most people reported matching their 1C1R at between 400 and 600 cM (the three tallest blue lines). What if you think you are related to a 1C1R and you share in the range of 1000 cM, which appears on the far right of the histogram? This could be an outlier, or there could be a different relationship.
Exploring the outliers is important if we want to avoid erroneous conclusions!
Realize the Danger of Confirmation Bias
Because genetic genealogy is based on science, we need to use the scientific method of trying in every way possible to disprove our hypothesis. If we can’t disprove it, we can conclude that the hypothesis is confirmed. When we see a DNA match and think we know the common ancestor, we may assume the hypothesis is correct and not look for any evidence that would overturn it. Here are some of the ways we can avoid confirmation bias and test our hypotheses.
Check the size of the segment
If we avoid small segments, we can go a long way toward ensuring we are looking at segments or DNA matches where we truly share a recent common ancestor and not common ancestry from hundreds of years ago.
Ask: Does the proposed relationship and amount of shared DNA make sense?
We can use the Shared cM Project to see if the relationship we have hypothesized makes sense. If the total is larger than the maximum, we need to look more closely at the DNA match. Could there be more than one relationship that we share or a different, closer relationship? If the total is smaller than the minimum, we can return to our analysis and decide if the hypothesized relationship is correct.
How can we know if we share DNA with a match on more than one line if our tree or their tree is not very complete? We can’t! For this reason, we need to always assume the possibility that we share on a different line than we’ve hypothesized. We also need to verify the paper trail for any line of descent that is important to our conclusion.
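The range check described in this section can be sketched in a few lines. This is only an illustration: `assess_shared_cm` is a hypothetical helper, and the minimum and maximum in the example are placeholder numbers, not values taken from the Shared cM Project. Look up the real range for your hypothesized relationship on DNA Painter before drawing any conclusion.

```python
def assess_shared_cm(total_cm, reported_min, reported_max):
    """Compare a match's total shared cM to the reported range for a relationship."""
    if total_cm > reported_max:
        # More DNA than anyone reported: consider a closer relationship
        # or more than one shared line of descent.
        return "above range"
    if total_cm < reported_min:
        # Less DNA than anyone reported: revisit the hypothesized relationship.
        return "below range"
    # Consistent with the hypothesis, but not proof of it.
    return "within range"


# Placeholder range for illustration only (not real Shared cM Project values).
placeholder_min, placeholder_max = 100, 900

print(assess_shared_cm(1000, placeholder_min, placeholder_max))  # above range
print(assess_shared_cm(450, placeholder_min, placeholder_max))   # within range
print(assess_shared_cm(40, placeholder_min, placeholder_max))    # below range
```

Even a "within range" result only means the hypothesis survives this one test. The histogram and the paper trail still need checking.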
Understand Endogamy, Pedigree Collapse, and Multiple Relationships
When we encounter a situation where we suspect intermarriage between our ancestral lines, we may assume that is pedigree collapse or endogamy, but could it be simply multiple relationships? Understanding more about these terms will help you as you analyze your DNA results.
We can define endogamy as marrying or reproducing within the same population for hundreds of years or many generations. This is typically due to cultural or religious reasons, as in Ashkenazi or Amish populations, or due to geographic reasons, as with islanders. Endogamy affects genetic genealogy because people will share many small IBD or population segments that are very old. The total amount of shared DNA will be inflated as a result, and a match may appear to be more closely related than it actually is.
Pedigree collapse occurs when cousins reproduce, so that a person has the same set of ancestors in more than one place in the tree. Pedigree collapse over many generations can result in endogamy, but just two or three instances do not. This scenario can affect matching with closer relatives, but the effect fades quickly because of segment loss. It will not have a huge effect on downstream matching.
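If your tree is stored as data, one way to spot pedigree collapse is to look for ancestors who are reached by more than one path. The sketch below is a rough illustration under assumed conventions: the `parents` dictionary, the placeholder names, and both functions are hypothetical, and the tiny tree describes a child whose parents are first cousins.

```python
from collections import Counter


def ancestor_counts(person, parents):
    """Count how many times each ancestor appears in a person's pedigree."""
    counts = Counter()
    stack = [person]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent is not None:
                counts[parent] += 1
                stack.append(parent)
    return counts


def repeated_ancestors(person, parents):
    """Return the ancestors who fill more than one slot in the pedigree."""
    return {name for name, n in ancestor_counts(person, parents).items() if n > 1}


# Hypothetical tree: each person maps to (father, mother).
# "pgf" and "mgf" are brothers, so "dad" and "mom" are first cousins.
parents = {
    "child": ("dad", "mom"),
    "dad": ("pgf", "pgm"),
    "mom": ("mgf", "mgm"),
    "pgf": ("ggf", "ggm"),
    "mgf": ("ggf", "ggm"),
}

print(sorted(repeated_ancestors("child", parents)))  # ['ggf', 'ggm']
```

A repeated ancestor only shows that the tree collapses at that point. Whether it noticeably inflates shared DNA with a given match still depends on how recent the collapse is.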
We may see cases of multiple relationships when we discover that we are related to a DNA match through more than one line. If siblings marry siblings, the descendants of couple 1 could share DNA with descendants of couple 2 through both lines. This only affects the descendants of the two couples and their shared matching. With the loss of segments in each generation, the effect will be reduced, so look for this in fairly recent ancestry. This scenario occurred in my family tree, where siblings of my great-grandparents married. I share more DNA with the descendants of that marriage than usual because they connect through two of my lines. Exploring this inflated amount of shared DNA led me to discover the correct connection.
I shared just a few nuggets from the Advanced DNA course in this post, but if you’re interested in learning more, consider taking the course! You can also study blog posts, watch webinars, and, best of all, try your hand at working with your DNA in a project.
Best of luck in all your genealogy endeavors!