Monday, August 26, 2013

Puzzling Over How Ancestry.com Finds Suggested Records

One of the features I really like on Ancestry.com is the "Suggested Records" list shown after a searcher has found one record for a person.  Let me use George Knapp, born in 1848 in New Jersey, as an illustration.  He's my first cousin, three times removed (a first cousin of my second great-grandmother, Sarah (Knapp) Auble).  He is in my database, and I want to add his wife and children, and perhaps find male descendants for a Knapp Y-DNA connection.

1)  I did a search for George Knapp, born 1848 (plus/minus 2 years) in New Jersey, with the exact match box checked, on Ancestry.com:


I think that Ancestry.com changed the relative size of their search screen here, and added some color background to dropdown fields.  This is, I think, part of their "coming changes."

2)  Here are the search results:



There are matches shown for the 1850, 1860, 1870, 1900, 1910, 1920 and 1930 U.S. census;  note that there is no 1880 U.S. census match.  There are also matches in the Iowa State Census collection, 1836-1925, and in the Iowa State Census, 1895.

There were no matches in any other record collection except for the Public and Private Member Trees on Ancestry.com.

3)  I started with the 1850 U.S. census, and there is George as a 2 year-old male in the home of Charles and Sarah Knapp, with 3 other young Knapps and one Smith:


There were no "Suggested Records" to the right of the image thumbnail.

4)  The 1860 U.S. Census record shows George S. Knapp, age 12, with Charles and Sarah Knapp, 5 other Knapp persons, and a Shrader:



Over on the right side of the screen are suggested records for:  1930, 1920, 1910, 1870 U.S. census collections, and two Iowa State Census Collection matches.

5)  The 1870 U.S. Census record shows George S. Knapp, age 23, with Charles and Sarah Knapp, and 4 other Knapp persons:



Over on the right side of the screen are suggested records for:  1930, 1920, 1910, and 1860 U.S. census collections, and two Iowa State Census Collection matches.

6)  The 1880 U.S. Census record (which was not found on the Matches list in the second screen above) shows George S. Knapp, age 32 born in New York, with Mary A. and three Knapp children:


Over on the right side of the screen are suggested records for:  1930, 1900, and two other 1880 possible matches.

7)  The 1900 U.S. Census record shows George I. Knapp, age 52, with Mary A. and two Knapp children:




Over on the right side of the screen are suggested records for:  1930, 1920, and 1910 U.S. census possible matches.

8)   The 1910 U.S. Census record shows George G. Knapp, age 62, with Mary A. and one Knapp child:


Over on the right side of the screen are suggested records for:  1930, 1920, 1900, 1880, 1870 and 1860 U.S. census collections, and two different Iowa State Census matches.

9)  The 1920 U.S. Census record shows George G. Knapp, age 72, with Mary A.:


Over on the right side of the screen are suggested records for:  1930, two 1910, 1900, 1880, 1870 and 1860 U.S. census, and three different Iowa State Census matches.

10)  The 1930 U.S. Census record shows George S. Knapp, age 82, with Mary A.:


Over on the right side of the screen are suggested records for:  1920, two 1910, two 1900, 1880, and 1860 U.S. census, and three different Iowa State Census matches.

11)  Do you see the inconsistencies here?  What I noticed was:

*  The 1850 U.S. census entry was not found in any of the other "Suggested Records."

*  The 1860 U.S. Census entry was not found on the 1880 or 1900 list of "Suggested Records."

*  The 1870 U.S. Census entry was not found on the 1880, 1900 or 1930 list of "Suggested Records."

*  The 1880 U.S. census record was not found in the "Results list" (because his birthplace was listed as New York and not New Jersey), but was found in the "Suggested Records" for 1910, 1920 and 1930.

*  The 1900 U.S. census record was not found on the "Suggested Records" for 1860 and 1870 (it did appear on the 1880 list).

*  The 1910 U.S. Census entry was not found on the 1880 list of "Suggested Records."

*  The 1920 U.S. Census entry was not found on the 1880 list of "Suggested Records."

*  The 1930 U.S. Census entry was found on all of the lists of "Suggested Records" (except for 1850).
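
The bookkeeping behind the list above is easy to get wrong by eye, so here is a minimal sketch that tabulates it.  The data is transcribed from the lists reported in steps 4) through 10) (the 1850 page showed no list at all); the names `suggested` and `missing_from` are made up for this illustration and have nothing to do with Ancestry's own software.

```python
# Census years seen on each record page's "Suggested Records" list,
# transcribed from steps 4) through 10).  Iowa State Census matches and
# duplicate-year suggestions are left out to keep the table small.
suggested = {
    1860: {1930, 1920, 1910, 1870},
    1870: {1930, 1920, 1910, 1860},
    1880: {1930, 1900},
    1900: {1930, 1920, 1910},
    1910: {1930, 1920, 1900, 1880, 1870, 1860},
    1920: {1930, 1910, 1900, 1880, 1870, 1860},
    1930: {1920, 1910, 1900, 1880, 1860},
}


def missing_from(target):
    """Return the record pages whose suggestion list omits the target year."""
    return sorted(
        page for page, years in suggested.items()
        if page != target and target not in years
    )


for year in (1850, 1860, 1870, 1880, 1900, 1910, 1920, 1930):
    print(year, "is missing from the lists for:", missing_from(year))
```

A table like this also makes it obvious that the suggestions are not symmetric: the 1870 page suggests 1930, but the 1930 page does not suggest 1870.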

I did look at the Iowa State Census collections, and there are entries for 1885, 1895, 1915 and 1925.  The 1925 collection provides the parents' names for George S. and Mary A. Knapp.

12)  What does all of this mean to me?  My conclusions are:

*  A searcher needs to check all of the matches on each "Suggested Records" list (s/he might make a list to save time).

*  If the target person has a somewhat different name (or middle initial), a different residence, or a different birthplace, that target person may not be on some "Suggested Records" list.

*  This "Suggested Records" list is not foolproof, but it works pretty well.  I estimate that 80% to 90% of the records it suggests really are for the target person.  That estimate says nothing about the records it misses; for whatever reason, it doesn't always find every record for the target person.

*  I believe, but I don't know for sure, that these "Suggested Records" are created dynamically from an actual search with the given search parameters, rather than looking up records attached to one or more Ancestry Member Trees.  I may be wrong!

*  You cannot trust these "Suggested Records" to find every match possible in all of Ancestry's databases, but it really helps pick off the "low hanging fruit."  Many databases don't have a birth year or a birth place, and an exact search will not find those entries.  A searcher should do a name only search, and perhaps a name only plus a state residence search, in order to find the "hiding" entries in the Ancestry databases.  For instance, I did a "George Knapp" exact name and an "Iowa, USA" exact residence search and found his marriage record in 1874 to Mary A. Blessing.  I haven't found his, or Mary's, death record yet.
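
To see why an exact search can hide entries, here is a toy model of the behavior described above.  It is only a sketch under a stated assumption: that an exact search rejects any record whose field is blank or different from the search value.  The `Record` class and `exact_search` function are hypothetical, the three records are simplified from this post (birth years are approximated from the census ages), and none of this reflects Ancestry's actual search code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    title: str
    name: str
    birth_year: Optional[int] = None   # None means the database has no such field
    birthplace: Optional[str] = None
    residence: Optional[str] = None


def exact_search(records, name, birth_year=None, tolerance=0,
                 birthplace=None, residence=None):
    """Return titles of records matching every criterion that was supplied."""
    hits = []
    for r in records:
        if r.name != name:
            continue
        # A blank or out-of-range birth year fails an exact birth-year search.
        if birth_year is not None and (
            r.birth_year is None or abs(r.birth_year - birth_year) > tolerance
        ):
            continue
        if birthplace is not None and r.birthplace != birthplace:
            continue
        if residence is not None and r.residence != residence:
            continue
        hits.append(r.title)
    return hits


records = [
    Record("1850 U.S. census", "George Knapp", 1848, "New Jersey"),
    Record("1880 U.S. census", "George Knapp", 1848, "New York"),
    Record("1874 marriage record", "George Knapp", residence="Iowa, USA"),
]

narrow = exact_search(records, "George Knapp", birth_year=1848,
                      tolerance=2, birthplace="New Jersey")
by_residence = exact_search(records, "George Knapp", residence="Iowa, USA")
name_only = exact_search(records, "George Knapp")
```

In this toy data, `narrow` picks up only the 1850 census, `by_residence` picks up only the marriage record, and `name_only` returns all three, which is the reason for running the broader searches as well.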

13)  What are your experiences with the "Suggested Records?"

The URL for this post is:  http://www.geneamusings.com/2013/08/puzzling-over-how-ancestrycom-finds.html

Copyright (c) 2013, Randall J. Seaver


9 comments:

Sara Gredler said...

Hmmm, my impression (in my experience and reading your workflow above) was that those Suggested Records *are* based on what has been attached to Ancestry.com trees, which is why the 1880 census record showed up as a suggested record on some census hits but not others. I have always taken the "Suggested Records" as a collective.

Geolover said...

I agree with Sara's impression. Sometimes I see items for completely wrong names that would never be retrieved in a search. Since there is no link directly to others' savings, this is hard to check.

Anonymous said...

I have found that some other records are suggested when looking at a certain record and then, when looking at one of the suggested records, other appropriate records appear. And, as Geolover said, I also find completely wrong names.

Alice Allen said...

At least the improved search feature seems to get rid of totally irrelevant results, like SSDI entries for someone born in the 1700's. I haven't done as thorough of an analysis as you have, but I do like the new search MUCH better than the previous one.

Cousin Russ said...

Randy,

I had George in an Ancestry Member Tree (Private, not indexed), and pretty much have the SAME results as you. At first, I missed the Suggested hints, as I don't normally search from within an AMT. I did the 1850 Census, Saved it to my AMT, did the 1860 Census, Saved it to my AMT. When I was looking at the 1860 I saw something on the right part of the screen, but didn't look at it, until it was too late. I slowed down for the 1870 Census and saw the hints. My observation on why 1850 wasn't listed, as you noted: he is George in 1850 and George S. in the following censuses.

What was NOT helpful was hints showing up when the Data was SAVED in a previous transaction. I had Saved the 1850 and 1860 Census, but as I progressed they showed up in subsequent Suggested Records (hints).

Back in the AMT Hints, after the 1870 Save, I had 4 Hints listed (for George), 1880, 1900, 1920, 1930, NOT the 1910 and only 1 Iowa hint, where the Suggested Records showed 2 in Iowa.

Interesting observations. Thank you,

Russ

Taco said...

As far as I can tell, it's some algorithm that displays records that are attached to one individual within other Ancestry Member Trees, together with the record that you have currently selected. If a large selection of AMT's have the same set of records attached to an individual, Ancestry can offer up that set as 'suggested records'. The reason why it will show you wrong records sometimes, is the rampant, unverified copying of information from other AMT's by other Ancestry users, including wrongly sourced individuals. When enough people copy that wrong information to their own tree, it will show up as a suggested record at some point or other. Just my two cents.

Cousin Russ said...

Taco - I am not sure that the AMT's play a role here IF you Ignore AMT hints. I did access my AMT that had Randy's person in it, but I did NOT look at any of the AMT hints. Each of the hints that I looked at, were valid. I didn't look at the two (or one) Iowa state hint, but each of the US Census Hints were for this person.

What I DO think happened in my case is that I had the Family Maker up for George (parents and siblings) in my AMT. As I added the hints (Save to George), the Search screen was updated with the family members in my AMT.

I don't know if Randy included George's family members or not in his search, but IF I clicked on Edit Search when I entered George the first time, there was very little information, but after I did the Save, more people were added to the search. I didn't manually add them. The AMT to Hint feature made changes to the Search Screen.

Russ

Taco said...

Russ - the reason why I think it's based on AMT's is that some suggested records will be way down the list of regular search results, or not show up at all. This tells me Ancestry gets its suggested records from elsewhere, and the only logical source would be the AMT's. Whether or not you've set up Ancestry to ignore AMT hints has no bearing on suggested records being shown.

Suggested records also change for each individual record. For instance, 100 AMT's might have the same 1850 and 1870 census record attached to one individual, but not the 1860 census. When you attach that 1850 census record to an individual in your tree, Ancestry will suggest the 1870 census record, seeing that 100 other AMT's have it attached to an individual in conjunction with that 1850 census record, but not an 1860 census record.

Now say that 100 other AMT's have that same 1870 census record attached to an individual, but not the aforementioned 1850 census record; instead, these trees have an 1860 census record attached to that same individual. When you attach the 1870 census record to an individual in your tree, Ancestry will give you both the 1850 and the 1860 census records as suggestions, as there are 100 trees where an individual has that 1870 census record attached as well as an 1850 census record, and there are another 100 trees where an individual has both the 1870 census record attached as well as an 1860 census record.
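
The hypothesis in this comment amounts to counting which records are attached alongside the current one across many trees. A minimal sketch of that idea follows, using the 100-tree example above. It illustrates the commenter's guess only; nothing here is confirmed to be how Ancestry works, and the `suggest` function and its `min_trees` threshold are invented for the example.

```python
from collections import Counter


def suggest(trees, current_record, min_trees=1):
    """Suggest records attached alongside current_record in at least min_trees trees.

    Each tree is modeled as the set of records attached to one individual.
    """
    counts = Counter()
    for attached in trees:
        if current_record in attached:
            # Count every other record that shares a person with the current one.
            counts.update(attached - {current_record})
    return sorted(rec for rec, n in counts.items() if n >= min_trees)


# 100 trees pair the 1850 and 1870 census records; another 100 pair 1860 and 1870.
trees = [{"1850 census", "1870 census"}] * 100 + [{"1860 census", "1870 census"}] * 100

from_1850 = suggest(trees, "1850 census", min_trees=50)
from_1870 = suggest(trees, "1870 census", min_trees=50)
```

With this data, `from_1850` contains only the 1870 census, while `from_1870` contains both the 1850 and 1860 census records, matching the scenario described in the comment. A wrong record copied into enough trees would pass the same threshold, which fits the remark about wrong names being suggested.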

Sven-Ove Westberg said...

I think you are right that they pick information from the member trees. I have seen suggestions to Swedish church records that are NOT indexed.