Wednesday, July 30, 2014

Do Search Engines Provide What You Request?

I've been adding facts and source citations for the Seaver, Carringer and Vaux persons in my RootsMagic database, hoping that the information might help somebody else and enrich my database.  In earlier years, I added extractions from census records, along with source notes, to the Notes, but I don't have many of them entered as specific Facts in my database (either Residence or Census - I usually use Census, because it's more than a Residence).

In the process, I've been searching for persons in U.S. census records on Ancestry.com so that I can provide a decent source citation for an entered Fact.  But, there are times when an Ancestry.com search does not find what I expect to find, and that is frustrating.  Why does this happen?

The simple answer is that "What I searched for didn't match what was indexed."  That obviously applies for names (since spelling was highly variable), but I just ran across a problem that explains a lot of what I experience, and will change how I search for records.

Here's an example:  I wanted to find and source the 1850, 1860, 1870 and 1880 U.S. census records for Sampson Seaver (1830-????), born in New York, who resided in Day, Saratoga County, New York.  

My usual practice for someone like this is to go to the "Search" tab, with the "Advanced Search" fields, and use wild cards for names, add the birth year with a range, and add the state as a birth place, all restricted to Exact.  Here is my initial search screen:


When I searched for this person, I got these results:


It has the 1850, 1870 and 1880 U.S. census records, but not the 1860 census.  I wonder why?

I clicked on the 1850 U.S. Census match, and the record summary shows me:


Look there - the "Suggested Records" list on the right shows links to the 1860, 1870 and 1880 U.S. census records.  I clicked on the 1860 U.S. census link, and saw the record summary:


The spelling of the name is correct, and the birth year is within the range I specified.  Why wasn't it in the Search results?

I clicked to "View original image" and the census page with the Sampson Seaver entry opened.  I also clicked on the "Index" link at the bottom left of the image page to see the indexed information:


There is Sampson Seaver, spelled correctly, with 1830 as an estimated birth year, but no birth place was indexed for him.  When you look at the census page, you can see that there is no entry in the Birth place field for Sampson Seaver.  Therefore, the index has no birth place for Sampson, or for anyone else on the page without a positive indicator in the Birth place field, and a search that requires an Exact birth place match skips the record.

So, the search I performed provided some of the correct matches - they just didn't meet all of my expectations.  If I had not checked the Birth location as "Exact," then my search would have found the 1860 record.
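
A toy model makes the behavior easier to see.  The Python sketch below is only an assumption about how an "Exact" filter could treat an empty index field; it is not Ancestry.com's actual code.  The record layout, the `matches` and `search` functions, and the birth years and places for the 1850, 1870 and 1880 entries are hypothetical, chosen to mirror the example above.

```python
from typing import Optional

# Hypothetical index entries for Sampson Seaver. Only the 1860 entry
# (estimated birth year 1830, no birth place) reflects the index shown above;
# the other values are illustrative.
RECORDS = [
    {"census": 1850, "birth_year": 1830, "birth_place": "New York"},
    {"census": 1860, "birth_year": 1830, "birth_place": None},
    {"census": 1870, "birth_year": 1830, "birth_place": "New York"},
    {"census": 1880, "birth_year": 1830, "birth_place": "New York"},
]


def matches(record: dict, birth_place: Optional[str], exact_place: bool) -> bool:
    """Return True if the record passes the birth place filter."""
    if birth_place is None:
        # No birth place entered in the search form: nothing to filter on.
        return True
    indexed = record["birth_place"]
    if exact_place:
        # "Exact" requires the indexed value to equal the search value,
        # so an empty field can never match.
        return indexed == birth_place
    # Not "Exact": assume an empty field is allowed through.
    return indexed is None or indexed == birth_place


def search(birth_place: Optional[str], exact_place: bool) -> list:
    """Return the census years whose entries pass the filter."""
    return [r["census"] for r in RECORDS if matches(r, birth_place, exact_place)]


print(search("New York", exact_place=True))
print(search("New York", exact_place=False))
```

In this model the "Exact" search drops the 1860 entry because its birth place is empty, while the relaxed search keeps all four census years.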

Perhaps I need to not use the Birth location field, or mark it as not "Exact."  That works, but returns more matches, most of them not the right person.

An alternative would be to mark the Event field as "Any Event," and remove the "Exact" check for the year and the Location fields.  When I do that, I get all four U.S. census records, plus an 1870 Agricultural Census record, and more matches than I had before.  Because I removed the "Exact" restriction on the birth year, the search also found records without a birth year.  The wild cards in the name fields worked - they reduced the number of matches.



So, you can teach an old genealogist new tricks.  I think that I will change my "usual" search methods and perhaps I'll find more pertinent records for persons in my database.

I do like and appreciate the "Suggested Records" feature - it made this an easy way to find the 1860 U.S. census record, and it sparked my curiosity as to why the search didn't find the 1860 record.

This indicates, once again, that it really matters what the user puts in the search fields, and how the different filter boxes (the "Exact" year, the location filter, etc.) are selected.  My earlier conclusion was correct - the simple answer is that "What I searched for didn't match what was indexed."

The URL for this post is:  http://www.geneamusings.com/2014/07/do-search-engines-provide-what-you.html

Copyright (c) 2014, Randall J. Seaver


4 comments:

Diane Gould Hall said...

Very interesting post Randy. First of all, I rarely use the wildcard feature. And I cannot even remember the last time I checked the "exact" box in my search. This may be why I get too many hits that don't match. Of course I can then narrow it down by adding more information.
It's interesting how we all search. I'm more of a broad net first, then zoom in by adding fields. Guess I have the infamous FOMO (fear of missing out). LOL!
I too appreciate the suggestions in the right hand column and find many records that way.
Thank you for sharing this as it has given me a new thought process on my own searches.
Happy hunting,
Diane

Jimcat said...

I had one on FamilySearch that way, just this week. Anna B Anderson, born in Illinois, circa 1899, shows up in the 1900 and 1910 census in Illinois, but I could not find her in the 1920 census at all. Did she pass on? Marry? I took out the Anna B and just did a search with Anderson as the surname and Frank as her father; still nothing for Anna. I took out born in Illinois, added the gender as female, and then found on page 3 Bernice Anderson, born in Illinois in 1899, now in Texas, with Frank and the rest of the family. Why would it not have populated this with Illinois as the birth state and no gender? I do not know, but I now change many variables if the first search criteria do not produce results.

Cousin Russ said...

Randy,

I just did a blog post to see if I saw what you saw

http://ftmuser.blogspot.com/2014/07/re-do-search-engines-provide-what-you.html

Russ

Dr D said...

Randy,
There is always a trade-off between PRECISION and RECALL. Recall is finding everything you want. Precision is finding nothing you don't want. As one is enhanced by your search strategy/search engine, the other tends to suffer. We all need to learn to adjust our search strategies to find the happy medium. Actually, we need to decide if we want to retrieve everything possible or only something very specific.
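
To put rough numbers on that trade-off, here is a minimal Python sketch using the standard definitions: precision is the share of returned results that are wanted, and recall is the share of wanted records that are returned. The `precision_recall` function and the result lists are hypothetical; in particular, the 16 unrelated matches in the broad search are a made-up count, not a figure from the post.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for a set of search results."""
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# The four census records wanted for Sampson Seaver.
wanted = ["1850", "1860", "1870", "1880"]

# Narrow "Exact" search: assume it returns only three wanted records.
narrow = ["1850", "1870", "1880"]

# Broad search: all four wanted records plus 16 unrelated matches (made up).
broad = wanted + [f"other-{i}" for i in range(16)]

print(precision_recall(narrow, wanted))
print(precision_recall(broad, wanted))
```

With these made-up counts, the narrow search has perfect precision but misses a quarter of the wanted records, while the broad search finds everything at the cost of mostly unwanted matches.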