Wednesday, February 13, 2019

A Reader's Take on Problems - Part IV: The Ancestry Big Tree

In response to Dear Are You Fixing These Problems? (posted 17 January 2019), I received 38 comments on the post, and several via email.  

One of the email correspondents was a person who has Ancestry and IT experience, and offered knowledge, experience and wisdom for users of, especially on the trees and the search engines.  The comments are detailed and ring true to me based on my experience (and that of others) working on  

A)  In this post, I want to concentrate on the Ancestry Big Tree (I don't know what they call it!  I commented on it in points 6) and 7) of the Dear Are You Fixing These Problems? post).  My reader's response was:
The Big Tree 
To mangle a quote, "Yes, Virginia. There is a big tree." But no, Virginia, it is NOT based on the One World Tree abomination.
The origin of the Big Tree is a stitch of the member trees. Better algorithms, pre-cleaning processes, and management (both product and development) have made the Big Tree much better than One World Tree. But, as you know, there are many bad trees in the Ancestry ecosystem. When 200 users copy a wrong fact into their tree, that doesn't make the fact correct. There is certainly an element of voting in the assembly of the Big Tree, as well as looking at the number and quality of the sources for profiles added to the Big Tree.
Ancestry also took the Big Tree a step farther than just stitching the member tree data together. They also started stitching families from the record collections. By tracing families through census collections, travel collections, and marriage records, families and individuals could be created and/or validated in the Big Tree, making it more accurate.
Is it perfect? No. It is easy to find errors in any automated process, which is why the Big Tree is not available as a viewable resource. It can provide some good hints (it powers suggested parent hints). As with any resource, data should be presented as a hint or suggestion, not as the truth. Sadly, too many armchair family historians treat all hints as facts.
B)  Randy's Notes and Comments:

An article about human longevity and lifespans, titled "Researching the Genetics of Human Lifespan," was published in November 2018 on the blog, and several articles about it appeared in the news media.  The article linked to a scientific paper titled "Estimates of the Heritability of Human Longevity Are Substantially Inflated due to Assortative Mating."

The section on "Pedigree aggregation and deduplication" notes that they used 54 million Ancestry Member Trees with 6.4 billion profiles to coalesce the information to over 831 million "unique" profiles, which was reduced to 406 million profiles with parent-child relationships to perform the longevity study.  So the Ancestry Big Tree may contain 831 million unique profiles.
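
To put those figures in proportion, here is a minimal arithmetic sketch in Python.  The three counts are the rounded numbers quoted above (the paper says "over 831 million"), and the variable names are mine, so treat the results as approximate ratios rather than exact statistics.

```python
# Rounded counts quoted from the paper's "Pedigree aggregation and deduplication" section.
member_tree_profiles = 6_400_000_000   # profiles across 54 million Ancestry Member Trees
unique_profiles = 831_000_000          # "over 831 million" profiles after deduplication
linked_profiles = 406_000_000          # unique profiles with parent-child relationships

# How many member-tree profiles were merged, on average, into each unique profile.
profiles_per_unique = member_tree_profiles / unique_profiles

# Share of the unique profiles that kept a parent-child relationship.
linked_share = linked_profiles / unique_profiles

print(f"Member-tree profiles per unique profile: {profiles_per_unique:.1f}")
print(f"Share of unique profiles with parent-child links: {linked_share:.0%}")
```

In other words, on average each unique profile stands in for several copies of the same person across member trees, and roughly half of the unique profiles carried a parent-child link into the study.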

My opinion is that this collection of related tree profiles comprises the "Ancestry Big Tree."  However, I may be wrong!  If so, I'm sure someone will tell me...and I will tell my readers.

As I mentioned in Dear Are You Fixing These Problems?, apparently the Ancestry Big Tree is used for:

*  the Ancestry We're Related mobile app to connect the user to cousins also in the Big Tree

*  the "Possible Ancestors" feature in users' Ancestry Member Trees

*  the "Life Story Overview" feature in AncestryDNA Circles

*  a summary ancestor profile found using Google search - see for an example.

Are there other uses for the Ancestry Big Tree?

A potential use for the Ancestry Big Tree is to suggest common ancestors for DNA matches who have a small or incomplete tree.   

The resulting "Ancestry Big Tree" has inaccuracies in it, as I pointed out in my referenced post.  The parent-child relationships may be wrong; the spouse relationships may be wrong; the event dates and places may be wrong; and more, I'm sure.  Why are some of them wrong?  Because every Ancestry Member Tree has errors, and many tree relationships and events are copied from another person's tree.

How accurate is the Ancestry Big Tree?  I don't have any idea of the true number.  I know that I make relationship mistakes and event mistakes - perhaps on 0.5% to 1% of my tree profiles.  Pobody's nerfect, as we all know.  In my blog post How Accurate Is an Ancestry Quick and Dirty Tree?, I noted that using the "Possible Ancestors" feature resulted in 5 errors in my 4th great-grandparents or closer - a 4% error rate.

If only 1% of the relationships are wrong because of user error, that is about 8 million wrong parent-child relationships among the 831 million unique profiles in the Ancestry Big Tree.  Everyone has 126 ancestors from their parents back through their 4th great-grandparents, so a 1% error rate means that about one of your (and my) 4th great-grandparents or closer ancestors might be wrong in the Ancestry Big Tree.  If one of your recent x-great-grandparents is wrong, then your ancestry back from that person is wrong.  If 4% are wrong, then a significant portion of your ancestry on the Ancestry Big Tree might be wrong.

Of course, not everyone can trace their ancestry back to before the 4th great-grandparents.
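
For readers who want to check the arithmetic, here is a small Python sketch of the estimate above.  It rests on my own simplifying assumptions: every ancestor slot is filled, every parent-child link has the same chance of being wrong, and errors are independent of each other.  The function names are made up for illustration, and none of this reflects how Ancestry actually measures accuracy.

```python
def ancestors_through(generation: int) -> int:
    """Count ancestor slots from parents (generation 1) through the given generation."""
    return sum(2 ** g for g in range(1, generation + 1))


def expected_errors(slots: int, error_rate: float) -> float:
    """Expected number of wrong ancestors if each slot is wrong at the given rate."""
    return slots * error_rate


def chance_line_is_clean(links: int, error_rate: float) -> float:
    """Chance that every link on one direct line is right (assumes independent errors)."""
    return (1 - error_rate) ** links


# 4th great-grandparents are 6 generations back from the starting person.
GENERATIONS = 6
slots = ancestors_through(GENERATIONS)
print(f"Ancestors through 4th great-grandparents: {slots}")

for rate in (0.01, 0.04):
    wrong = expected_errors(slots, rate)
    clean = chance_line_is_clean(GENERATIONS, rate)
    print(f"Error rate {rate:.0%}: about {wrong:.1f} wrong ancestors; "
          f"chance one line to a 4th great-grandparent is fully right: {clean:.0%}")
```

The second number is the one that matters for the point about x-great-grandparents: one wrong link breaks everything behind it, so the chance that a whole line is right shrinks with every generation added.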

C)  See earlier posts on this general subject:

*  A Reader's Take on Problems - Part I - Ancestry Member Tree Indexing (posted 23 January 2019)


Disclosure:  I have had a paid subscription to since 2000, and use the site every day.  I have received material considerations from in years past, but that does not affect my objectivity in writing about their products and services.

The URL for this post is:

Copyright (c) 2019, Randall J. Seaver

Please comment on this post on the website by clicking the URL above and then the "Comments" link at the bottom of each post.  Share it on Twitter, Facebook, or Pinterest using the icons below.  Or contact me by email at
