This guest article was provided courtesy of Legacy Tree Genealogists, Inc., which holds all rights to the article.
NOTE: For some reason, the images in this article did not appear in the email version. Please click the link http://www.geneamusings.com/2018/02/guest-article-going-beyond-ethnicity.html to see the images in context with the text.
------------------------------------------------------------------------------
Going Beyond Ethnicity Estimates in DNA Testing
by Paul Woodbury, (c) 2018, Legacy Tree Genealogists, Inc.
As a specialist in genetic genealogy, I find that one of the most
frequent topics I address in conversations with others is ethnicity
estimates. Someone might say something like: “I’m
not really sure how much to trust those genetic tests since my
grandmother was Italian, and I only came back with 15% Italian in my
results. If they can’t even get the ethnicity right, then what use
are they?”
In
reality, there are two parts of genetic genealogy test results:
ethnicity admixture and genetic matches. Ethnicity admixture results
analyze the mutations and segments of DNA and determine in which
populations those mutations and segments are most often found.
Genetic cousin match lists calculate the number, location and size of
segments of DNA that different individuals share in common. Based on
the number, size, and location of segments, the relationships between
a test subject and their genetic cousins are estimated. While
ethnicity results can be helpful in some specific situations, genetic
cousin match lists are the most useful element of DNA test results.
Each
individual inherits half of their autosomal DNA from each of their
parents. Beyond that, the amount of DNA shared in common is only
approximate due to a random process called recombination, which
shuffles the DNA each generation. Each individual will inherit about
25% from each grandparent, 12.5% from each great-grandparent and
approximately half the previous amount for each subsequent
generation. Although two first cousins will have both inherited 25%
of their DNA from each of their common grandparents (50% in total),
each will have inherited a different 25%. Therefore, first cousins
will typically share only about 12.5% of their DNA in common. Because
descendants along distinct lines inherit different portions of their
common ancestors’ DNA, it is important to test as many people from
distinct family lines as possible.
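The halving pattern described above can be sketched in a few lines of Python. This is a minimal illustration of the averages only; the function names are hypothetical, and because of recombination the amounts actually inherited will vary around these figures.

```python
def ancestor_share(generations_back: int) -> float:
    """Average share of autosomal DNA inherited from one ancestor.

    generations_back: 1 = parent, 2 = grandparent, 3 = great-grandparent.
    """
    return 0.5 ** generations_back


def full_cousin_share(degree: int) -> float:
    """Average share of DNA two full cousins have in common.

    degree: 1 = first cousins, 2 = second cousins, and so on.
    Each cousin is (degree + 1) generations below the common ancestral
    couple, and the couple contributes two shared ancestors.
    """
    steps_between_cousins = 2 * (degree + 1)
    return 2 * 0.5 ** steps_between_cousins


for steps, label in [(1, "parent"), (2, "grandparent"), (3, "great-grandparent")]:
    print(f"{label}: {ancestor_share(steps):.2%}")

for degree, label in [(1, "first cousins"), (2, "second cousins")]:
    print(f"{label}: {full_cousin_share(degree):.3%}")
```

For first cousins this reproduces the 12.5% figure given above.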
Every
individual in your DNA match list shares at least one segment of DNA
with you that you likely inherited from a recent common ancestor.
Based on the number of segments you share, the length of those
segments, the position of those segments, and the likelihood of
inheriting those segments over multiple generations, DNA testing
companies estimate how closely related you are to different
individuals in your match list. Closer relationships are marked by
distinct, characteristic amounts of shared DNA. For example, the amount of DNA
shared between siblings will be very different from the amount of DNA
shared between first cousins, which in turn is distinct from the
amount of DNA shared between second cousins. More distant
relationships, however, are slightly harder to differentiate. The
amount of DNA shared between fourth cousins could be the same as the
amount of DNA shared between fifth or sixth cousins. Some more
distant cousins may not share any DNA at all. Even though both may
have inherited DNA from their common ancestors, they may have
inherited different segments that do not overlap.
So
why are DNA match lists more useful than ethnicity estimates? While
your ethnicity admixture results may report a general region of the
world where your ancestors may have lived 300-1000 years ago, match
lists give valuable clues regarding genealogical relationships to
other individuals. Most of your genetic cousins are related to you
within a genealogically relevant time frame. Even if you are not able
to determine the exact common ancestor between you and your genetic
cousins, their test results and their pedigrees may offer clues
regarding specific towns and places of origin for your own ancestors.
Through analysis and correlation of the trees, origins, and ancestors
of members of your DNA match list, you may be able to identify
previously unknown ancestors, uncover likely relatives, connect with
lost branches of your family tree, and break through genealogy brick
walls.
To
make the most of your DNA match lists, consider the following four
principles:
1. Collaboration
Genetic
cousin match lists can be overwhelming. Where to start? How to begin?
I recommend starting with what is closest to you. Who are your
closest cousins? The more DNA a genetic cousin shares with you, the
more likely it is that you will be able to identify a common ancestor
with that individual. Even if you can already see how you might be
related to someone, collaboration can still be helpful. Just as they
will have inherited different DNA than you have from your common
ancestors, they will also have inherited different stories,
information, and documents that may be helpful for your search.
When
collaborating with genetic cousins, make your communication brief,
clear, and to the point. If it is your first attempt at contact,
briefly introduce yourself. Briefly explain your research interests
and explain why you are contacting them. Make 1-3 specific requests
of them, offer to provide assistance or information in return, and
provide direct contact information if desired.
For example, an
attempt at collaboration might look like this:
“My
name is [your name here] and it appears that we are genetic cousins.
I have been doing genealogy research for the past five years and I am
particularly interested in learning more about my maternal
grandmother’s French ancestry. Based on our shared DNA and shared
relatives, it appears that you may be related to my maternal
grandmother. Do you have ancestry from Southern France? Do you have a
family tree you can share with me? If not, could you share the names
of your grandparents or great-grandparents? I would love to
collaborate with you to determine the nature of our shared
relationship. I have performed thorough research on my French family
and have several hundred documents relating to that side of my
family. If we can determine our relationship, I would be happy to
share the documents, sources, and information pertinent to your
family tree. Feel free to contact me through this messaging system or
directly via email [email here] or by phone at [phone number here].”
Some
common requests you might make while collaborating with genetic
cousins include the following:
• Request
access to a family tree.
• Request the names of the individual’s
ancestors, keeping in mind that typically it is better to ask for the
names of grandparents or great-grandparents rather than parents.
Asking for information regarding living individuals may make some
individuals feel uncomfortable and may prevent them from responding
to your request.
• Request that they transfer their test results
to Gedmatch.com or another website so you can explore your
relationship further.
• Request that they share their ethnicity
report or their match list with you.
• Request contact
information for other relatives who may know more regarding their
family history.
• Request information about the amount of DNA
and the known relationships they may have with genetic cousins you
share.
• Ask if they have any close genetic cousins who have
also tested. Knowing which of their close relatives you do not match
may help to narrow down how you are related.
• If their
relationship is already known, request information that they may have
regarding your shared ancestor and collateral relatives.
In
one recent case we performed at Legacy Tree, we were attempting to
locate information regarding an individual’s biological father whom
she had never met. She had a name and an occupation and that was all
we had to start with. When we reviewed her test results, we found
that she had a close genetic cousin who was an estimated second
cousin. Based on her relationships to the client’s other matches,
and based on her ethnicity, we knew that she was a paternal relative
of the client, but did not know exactly how. We could have spent more
than 20 hours documenting each of her great-grandparents and all of
their descendants, but instead we contacted her to ask for additional
information on her family tree. In mentioning the name of the
client’s biological father, the match knew exactly who we were
talking about and gave us information regarding his later family, his
immigration to Puerto Rico, and his death – thus pointing us to the
exact family of interest and saving us and the client a great deal of
effort.
2. Identification
The
main goal of most collaboration is to identify the source of shared
DNA with a genetic cousin. But what happens when they never respond
to your request? Even for non-responsive matches, it is frequently
possible to determine how they are related to your family. The key to
successful identification is to use every piece of evidence afforded.
Some
common pieces of evidence frequently included as part of DNA profiles
and which might help your search include the following:
• Username: If the username is unique or resembles a real name, use it to
guide searches in public records, white pages, published email lists,
and social media accounts. Numbers in usernames often refer to
important dates like birth or marriage. Many people use the same
username with their email and social media accounts. They may also
use that same username to publish queries in online genealogy forums
relating to their ancestors.
• Profile picture: Use this to
compare against yearbooks, newspapers, obituaries, and Facebook. You
can perform reverse image searches using Tineye and Google.
• Age, birth date, birthplace, and residence: Use this information to
guide searches in newspapers and online directories. Consider
searching databases of yearbooks. You can also use this data to
search for more recent and updated contact information.
• Small, limited, and private family trees: If they have a tree attached to
their test results or to their member profile, use all information it
provides. Extend their ancestry for them. If the tree is private, as
is frequently the case at Ancestry.com, perform searches of your
known family names to see if any of them appear in your genetic
cousins’ private trees. Also remember that the default naming
pattern for trees at Ancestry.com is the surname of the
user followed by “Family Tree.” Other websites follow a similar
naming pattern, and the name of a private tree could provide clues
regarding your shared ancestry.
• Names of most distant known
ancestors, research interests, and lists of surnames: Use this
information to perform searches of combinations of surnames in
databases of compiled family trees and genealogical records. Once
several ancestors of a genetic match have been identified, trace
their descendants until you are able to narrow down to the match
themselves.
• Centimorgans, percentages, and number of segments
shared: Some amounts of shared DNA are unique to specific levels of
relationship. You can estimate the likelihood of different levels of
relationship using the data published at the Shared cM Project as
well as data published in the AncestryDNA help menus and at
ISOGG.org.
• Shared DNA matches: Though it may not be possible
to identify how a match is related to you specifically, it may be
possible to determine their likely relationship based on how they are
related to your other known matches.
In
general, social
media, newspapers,
obituaries, and public record databases are excellent sources for
locating information on living people. As you perform these searches,
however, remember to respect the privacy and wishes of those who may
not want to be contacted.
In
a recent case we were able to identify the father of a woman born in
Melanesia by extending the ancestry of several close genetic cousins
using some of the strategies listed above. Even though these genetic
cousins did not respond to requests for collaboration, and even
though they provided very little information regarding their family
trees on their respective DNA profiles, we were able to reconstruct
this woman’s British ancestry using the trees we constructed
through public records for her close genetic cousins and searching
for connections between collateral relatives of each cousin. Once we
had reconstructed her tree we were able to trace descendants of each
of her likely ancestors and identify her father.
3. Organization
Dealing
with a huge number of autosomal DNA matches can be overwhelming and
confusing. I recommend organizing matches based on their known
relationships to the test subject and to each other. Organization of
DNA evidence follows some of the same principles as organization of
traditional research. Just as good genealogy researchers will keep
logs of their searches and their correspondence, genetic genealogists
should also keep logs of their research and correspondence. These
“logs” often take form as notes and commentary on genetic
matches. Each DNA testing company offers means of annotating DNA
matches, but frequently these notes are not searchable, making it
somewhat difficult to locate “that one match who was related to
so-and-so.”
Several
third party tools can assist in organizing your DNA matches and your
notes on those matches. The AncestryDNA Helper Chrome add-on by Jeff
Snavely enables automated scans of AncestryDNA data and will add
buttons to your interface at Ancestry.com. Included in these buttons
is an option to search your results by user, reported surnames or
notes. Another third party tool is the DNAGedcomClient by Rob
Warthen. This subscription app enables researchers to perform
automated scans of DNA test results at Ancestry.com and 23andMe. The
outputs of these scans are spreadsheets with information on shared
DNA, ethnicity estimates, in-common-with matches and notes on genetic
matches. As spreadsheets, they are searchable and can enable easy
location of any notes that have been added to specific matches in the
subject’s account.
Spreadsheets
are an excellent way of organizing DNA matches. Each line in the
spreadsheet can be dedicated to a different genetic cousin or match.
We recommend keeping separate spreadsheets for different tests
or different subjects. In spreadsheets, researchers can comment on
shared segments, known relationships, potential relationships, shared
surnames, shared ancestral origins, and shared genetic cousins
between a subject and a match. These notes can then be used during
the analysis and correlation stages of the genealogical proof
standard.
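As a rough sketch of the spreadsheet approach described above, the following Python keeps one row per match in CSV form and searches the notes. The column names and the sample matches are hypothetical placeholders, not the export format of any testing company or third-party tool.

```python
import csv
import io

# Hypothetical column layout: one row per genetic cousin.
FIELDS = ["match_name", "shared_cm", "segments", "known_relationship", "notes"]


def search_notes(rows, term):
    """Return rows whose notes or known relationship mention the term."""
    term = term.lower()
    return [
        row for row in rows
        if term in row["notes"].lower()
        or term in row["known_relationship"].lower()
    ]


# An in-memory file stands in for a spreadsheet saved on disk.
sheet = io.StringIO()
writer = csv.DictWriter(sheet, fieldnames=FIELDS)
writer.writeheader()
writer.writerows([
    {"match_name": "Match A", "shared_cm": "212", "segments": "11",
     "known_relationship": "second cousin",
     "notes": "Descends from the Jones family; replied to message"},
    {"match_name": "Match B", "shared_cm": "64", "segments": "4",
     "known_relationship": "unknown",
     "notes": "Private tree; shares matches with Match A"},
])

sheet.seek(0)
rows = list(csv.DictReader(sheet))
for row in search_notes(rows, "jones"):
    print(row["match_name"], row["shared_cm"], row["notes"])
```

Because the log is plain tabular data, "that one match who was related to so-and-so" becomes a one-line search rather than a scroll through annotations.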
Another
popular program for organizing DNA matches is Genome Mate Pro. As
professional genealogists, we rarely utilize this program for clients
since it requires a significant amount of input before meaningful
results can be organized. Nevertheless, it is very useful as a
database and organization tool for personal research and
investigation.
Organizing
your DNA matches is a daunting task not only because there may be a
large number of them, but also because they are constantly changing.
Developing a strong organization structure can seem like an attempt
to hit a moving target. It can be even more daunting if there are
multiple moving targets. The purpose of organization is to enable
genealogical discovery, and genealogical discovery is most often
achieved when pursued through the lens of a narrow and specific
focus. Technology is meant to serve as a tool to enable a
researcher’s goals and purposes, but sometimes it can become the
end in and of itself. This is as true of genetic genealogy testing as
it is of any other type of technology. Without clear goals and
research objectives, the tools genetic genealogy offers can end up
being your task masters. Rather than letting your DNA test results
dictate the direction of your research, use genetic genealogy test
results as a tool to make genealogical discoveries. Instead
of attempting to document your relationship to each genetic
cousin in your match list (an increasingly impossible task as more
and more people test), seek to identify your relationship to your
closest matches and then use that information to guide your
prioritization and investigation of other more distant matches.
Choose a specific research objective and then use your test results
to narrow down to a pool of matches which are most pertinent to your
genealogy research questions. This will make organization of your
matches much more manageable and much more useful.
We
recommend focusing on your closest matches and matches that appear to
be pertinent to the specific research questions you are exploring. If
a genetic cousin shares more than 50 cMs with you, there is about a
50% chance they are related within 9-10 generational steps and there
is a much higher chance you will be able to identify a common
ancestor. Once close genetic cousins have been identified, you can
then search for other more distant cousins who are likely related
through the same ancestral lines by identifying genetic cousins who
match at least two known descendants of an ancestor of interest. You
can also eliminate other genetic cousins from consideration in your
research if they match known relatives from your other family lines.
If you are attempting to extend unknown ancestry, document which
relatives belong to your known family and then prioritize
investigation of those who also match them and who have unknown
relationships. Use relationships of genetic cousins to each other to
identify which genetic cousins are most pertinent to a research
question.
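The filtering strategy described above can be sketched with simple set operations. This is an illustration under stated assumptions: the function name, the data layout (each genetic cousin mapped to the known relatives they also match), and the sample names are all hypothetical.

```python
def prioritize(shared_matches, descendants_of_interest, other_line_relatives,
               min_shared=2):
    """Pick genetic cousins most pertinent to one ancestral line.

    shared_matches: dict mapping each genetic cousin to the set of your
        known relatives they also match.
    descendants_of_interest: known descendants of the ancestor of interest.
    other_line_relatives: known relatives from your other family lines.
    """
    keep = []
    for cousin, also_matches in shared_matches.items():
        # Set aside cousins who match known relatives from other lines.
        if also_matches & other_line_relatives:
            continue
        # Keep cousins who match at least two known descendants.
        if len(also_matches & descendants_of_interest) >= min_shared:
            keep.append(cousin)
    return sorted(keep)


shared_matches = {
    "Cousin 1": {"Descendant A", "Descendant B"},
    "Cousin 2": {"Descendant A"},
    "Cousin 3": {"Descendant A", "Descendant B", "Maternal Aunt"},
    "Cousin 4": {"Descendant B", "Descendant C"},
}
print(prioritize(
    shared_matches,
    descendants_of_interest={"Descendant A", "Descendant B", "Descendant C"},
    other_line_relatives={"Maternal Aunt"},
))
```

Keep in mind that a cousin set aside this way may still be related through more than one line, so the result is a prioritized starting list rather than a final verdict.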
Chromosome
mapping is
a type of organizational strategy that can be helpful in some
situations and can guide collaboration with genetic cousins. For
chromosome mapping, focus on identifying your relationship to known
second cousins and more distant relatives. Then identify the segments
of DNA you share in common. Individuals who share those same segments
of DNA with you are likely related through the same ancestral lines.
Though chromosome mapping is useful as an organizational strategy, it
can also easily become an end in and of itself. Remember that your
main objective will typically be to make genealogical discoveries and
extend ancestral lines. In our experience, chromosome
mapping is a last resort for making genealogical discoveries.
Analysis of relationships, evaluation of shared DNA, and extension of
family trees between genetic cousins is the most useful approach for
genealogical discovery.
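The core step of chromosome mapping, finding other matches who share the same segment you share with a known relative, can be sketched as an overlap check. The Segment type, the base-pair threshold, and the sample positions are hypothetical. An overlap by position alone cannot tell whether two matches fall on your maternal or paternal copy of the chromosome, so it is worth confirming that the matches also share DNA with each other.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    chromosome: str
    start: int  # start position in base pairs
    end: int    # end position in base pairs


def overlap(a: Segment, b: Segment) -> int:
    """Length in base pairs of the region two segments have in common."""
    if a.chromosome != b.chromosome:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def matches_on_mapped_segments(mapped, candidates, min_overlap):
    """Names of candidates overlapping any mapped segment by min_overlap."""
    hits = []
    for name, segments in candidates.items():
        if any(overlap(m, s) >= min_overlap for m in mapped for s in segments):
            hits.append(name)
    return sorted(hits)


# Segment shared with a known second cousin (illustrative positions).
mapped = [Segment("7", 10_000_000, 45_000_000)]
candidates = {
    "Match A": [Segment("7", 30_000_000, 60_000_000)],
    "Match B": [Segment("7", 50_000_000, 70_000_000)],
    "Match C": [Segment("8", 10_000_000, 45_000_000)],
}
print(matches_on_mapped_segments(mapped, candidates, min_overlap=5_000_000))
```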
In
a recent case performed by Legacy Tree, one of our clients was
attempting to extend the ancestry of her great-grandfather who was
born in the Southern U.S. in about 1840 with the very common name of
John Jones. Several candidate ancestral couples had been identified
as possible parents, but exhaustive traditional research had not
provided conclusive evidence for any of the candidates. We
constructed a “genetic network” of her 500 estimated 4th cousins
and identified all of the genetic cousins to whom each of them was
related. Using this information, we applied proprietary technology to
quickly identify groups of related individuals among the client’s
genetic matches. We eliminated from consideration those genetic
cousins who were related through her maternal ancestry and identified
several genetic cousins who were related through the ancestor of
interest. Using these genetic cousins as a “search query” we next
identified all genetic cousins who were related to at least two
descendants of the client’s great-grandfather or who fit as part of
their genetic network group. Using this strategy we identified common
ancestors between more distant relatives. As a result, we were able
to connect the client’s great-grandfather to his ancestral family
and extend his ancestry an additional four generations.
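The proprietary technology used in this case is not described here, but the general idea of grouping related individuals in a "genetic network" can be illustrated with a simple stand-in: treat each pair of matches who share DNA with each other as a link, and collect the connected groups. The function name and sample data below are hypothetical, and real match networks usually call for more refined clustering than plain connected components.

```python
from collections import defaultdict


def cluster_matches(shared_pairs):
    """Group matches into clusters linked by shared-match relationships.

    shared_pairs: iterable of (match, match) pairs who share DNA with
        each other as well as with the test subject.
    """
    graph = defaultdict(set)
    for a, b in shared_pairs:
        graph[a].add(b)
        graph[b].add(a)

    seen = set()
    clusters = []
    for node in sorted(graph):
        if node in seen:
            continue
        # Walk outward from this match to everyone reachable from it.
        group = set()
        stack = [node]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            group.add(current)
            stack.extend(graph[current] - seen)
        clusters.append(sorted(group))
    return clusters


pairs = [("Match A", "Match B"), ("Match B", "Match C"), ("Match D", "Match E")]
for group in cluster_matches(pairs):
    print(group)
```

Each resulting group is a candidate set of cousins related through the same ancestral line, which can then be checked against known relatives.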
4. Evaluation
Successful
genetic genealogists apply DNA inheritance patterns and probabilities
of relationship to specific research problems. Once you have
identified a likely relationship between yourself and a genetic
cousin, determine if your proposed relationship fits with the
observed amount of DNA you share with each other. Some questions you
might consider include the following:
1. Does the amount of DNA you share with your genetic cousin fit with
what you would expect given your documented relationship? In other
words does your documented second cousin share an appropriate amount
of DNA to be a full second cousin, or is it possible he may be a half
relative or may be related in some other way?
2. Are there other
ancestral lines that you share in common with your match which could
provide alternative explanations for your shared DNA?
3. Are there
other ancestral lines that match 1 shares with match 2 independent of
your relationship to either of them? In other words, does your
maternal first cousin also share ancestry with your paternal first
cousin independent of their respective relationships to you?
4. Do
you share other types of DNA that you would expect given your proposed
relationship? If your proposed genealogical relationship indicates
that you share common direct-line paternal ancestry, do you share a
common Y-DNA signature? If not, there may be a case of misattributed
parentage. If your proposed genealogical relationship indicates that
you share common direct-line maternal ancestry, do you share a common
mtDNA signature? If not, again there may be a case of misattributed
parentage. If you share common ancestors who could have contributed
DNA to both of your X-chromosomes, do you share DNA on the
X-chromosome, and if not, what is the likelihood of that scenario
given your proposed relationship?
5. Are there any ancestral lines
that are not well represented in your DNA match list? Are there close
genetic cousins with known relationships to each other, but no known
relationship to you?
6. Are there known relatives whom you might
invite to test who could represent ancestral lines that are not
represented in your match list? Once they have tested, do they share
the amount of DNA that would be expected given their relationship? Do
they share DNA with other individuals from the family of interest who
do not share DNA with you?
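The first question in the list above, whether the shared DNA fits the documented relationship, can be given a rough first pass by comparing an observed percentage of shared DNA with the theoretical average for several candidate relationships. This is a minimal sketch: the averages follow from the halving pattern of inheritance, the function name is hypothetical, and real shared amounts vary widely around these averages, so published probability data should be consulted before drawing conclusions.

```python
import math

# Theoretical average share of autosomal DNA for each relationship.
EXPECTED_SHARE = {
    "first cousin": 0.125,
    "half first cousin": 0.0625,
    "first cousin once removed": 0.0625,
    "second cousin": 0.03125,
    "half second cousin": 0.015625,
    "second cousin once removed": 0.015625,
    "third cousin": 0.0078125,
}


def rank_relationships(observed_share):
    """Order candidate relationships from best to worst fit.

    Fit is measured on a ratio (log) scale because the expected
    amounts halve with each additional step of separation.
    """
    def distance(item):
        return abs(math.log(observed_share / item[1]))

    return [name for name, _ in sorted(EXPECTED_SHARE.items(), key=distance)]


# A documented second cousin who shares only about 1.5% of their DNA.
for name in rank_relationships(0.015)[:3]:
    print(name)
```

If a documented full second cousin ranks closest to a half relationship, that is a prompt to look for an alternative explanation rather than proof of one.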
When
evaluating your DNA test results, it is possible to determine the
probabilities of likely relationships based on the number of segments
and the number of centimorgans shared. Centimorgans (cMs) are a
measure of genetic recombination, and communicate the likelihood that
two points on a single chromosome will be separated in one
generation. Some ranges of shared centimorgans are more likely for
specific levels of relationship than they are for others. For example
if an individual shares 255 cMs with a test subject there is more
than a 50% chance that they are related at the level of second
cousins and nearly a 100% chance that they are related within the
range of first cousins once removed to second cousins once removed —
or some equivalent combination of relationships. The following chart
from the AncestryDNA Matching White Paper shows the probabilities of
different levels of relationship given an observed amount of shared
DNA:
In
addition to this resource, we also recommend reviewing information
from the Shared Centimorgan Project hosted by Blaine Bettinger, and
the autosomal DNA statistics pages available through the
International Society of Genetic Genealogy wiki (isogg.org).
Considering the likelihood of proposed relationships given shared
amounts of DNA strengthens the traditional and genetic evidence
for genealogical proof.
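The definition of a centimorgan given above can be made concrete with a short calculation. The sketch below assumes Haldane's map function, one common model that ignores interference between crossovers; it is offered only as an illustration, and testing companies may rely on different genetic maps and models.

```python
import math


def recombination_probability(centimorgans: float) -> float:
    """Chance that two points this far apart are separated in one generation.

    Uses Haldane's map function, with the distance converted from
    centimorgans to Morgans (100 cM = 1 Morgan).
    """
    morgans = centimorgans / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * morgans))


for distance in (1, 10, 50, 200):
    print(f"{distance} cM: {recombination_probability(distance):.1%}")
```

For short distances, 1 cM corresponds to roughly a 1% chance of separation per generation, while for very long distances the probability levels off near 50%.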
In
a recent case at Legacy Tree, we were assisting an individual to
document the relationship between herself and a genetic cousin with
an unknown relationship. Based on the amount of DNA they shared in
common, they should have been related at the level of third cousins.
Nevertheless, comparison of their two trees revealed that they
shared no common surnames, ancestors, or locations in their quite
extensive family trees. Additional investigation into their shared
matches showed that the match held several genetic cousins in common
with the subject, all of whom descended from a specific ancestral
couple who lived in the 1880s in Tennessee. Consultation of the
client’s match list revealed that she had no genetic cousins from
the ancestry of her paternal grandfather, and additional analysis
revealed that her father was likely not the biological son of the man
he assumed was his father. In another case, we discovered that one
genetic cousin shared DNA with a client through their common fourth
great-grandfather, and that both of them matched several other
descendants of the same man. However, the match’s brother did not
share DNA with the client and did not match any of the descendants of
the common ancestor of interest. Additional investigation revealed
that the match’s brother was in fact a half-sibling. These stories
highlight the fact that DNA testing can result in unexpected
discoveries that may change the way you view your family, so it is
important to tread carefully and be respectful of the feelings of the
individuals involved.
Conclusion
Though
ethnicity results can be helpful in some cases of genealogical
research, there is so much more that can be done with your test
results beyond the dinner-conversation topics of your ethnicity
admixture. Collaborate with your genetic cousins to connect with
living family members and learn information about your shared
heritage. Identify your relationships to genetic cousins and document
your relationships to each other. Organize your DNA matches to better
analyze your test results. Evaluate your shared DNA with your known
relatives and determine if your proposed relationships fit with what
you would expect. By following the basic principles of collaboration,
identification, organization, and evaluation you will be well on your
way to making genealogical discoveries using your DNA test results.
Paul Woodbury is a Senior Genealogist with Legacy Tree
Genealogists, a genealogy research firm with extensive expertise in
genetic genealogy and DNA analysis. To learn more about Legacy Tree
services and its research team, visit the Legacy Tree website at
https://www.legacytree.com
===================================================
My thanks to Legacy Tree Genealogists for offering to provide this guest article on DNA testing and analysis.
Copyright (c) 2018, Legacy Tree Genealogists, Inc.
Please comment on this post on the website by clicking the URL above and then the "Comments" link at the bottom of each post. Or contact me by email at randy.seaver@gmail.com.