Monday, February 23, 2009

Questions and Answers - Comments about Genealogy Search Engines series

Many Genea-Musings readers know that I'm in the middle of a series of posts about Genealogy Search Engines. I'm trying to compare the search process on Ancestry.com, Footnote.com, FamilySearch.org Record Search and HeritageQuestOnline and assess them against my own evaluation criteria.

To date, I've written two introductory posts and four Ancestry.com posts. These posts have raised some interesting questions and comments, including:

1) In Ancestry.com Post 4 - reader 1976lib noted:

"You know that the information exists in the database when you start the search for these tests. In most cases a researcher is just hoping there is some information available on the person being searched. Choosing the most appropriate search, continuing through additional kinds of searches and persisting when you initially get no returns (or an overwhelming number) has to be learned behavior. "

You are correct - I do know that the 1860 census database has the information I'm seeking. I'm trying to start off the Search process, and the posts, as if I don't know that the record is available, for the purposes of assessing each web site and search engine against my evaluation criteria. At some point, I will add some discussion and recommendations for doing the additional searches required when the record you are seeking is "not found" or is really "hard to find." Ah, "learned behavior" is how we all use Search engines, isn't it? I'm still learning, and I hope that these posts help my readers learn faster.

2) In Ancestry.com Post 4 - reader Ancestry Insider offered some background about the development of Ancestry.com's New Search Fuzzy Matches. The most interesting parts:

"In the upper-right corner of the list of results is a "View" setting that can be set to "Summarized by category" or "Sorted by relevance." Ancestry.com gave users greater control by allowing either view to be used with either search method. Viewing by category inherently adds an additional click. An argument can be made, that it makes no sense to use category summaries with ranked search results. One aspect of Ancestry.com's New Search is that it blurs the distinction between exact and ranked searching. That begs the questions, does it ever make sense to view results by category rather than by how well the results match? In my opinion, Ancestry.com added the option just to make a transition to New Search easier to old users. I believe it is one of several ways in which old exact search is inferior to new exact search."

There are several excellent observations there - by someone who was on the inside not so long ago. Thanks! The "Summarized by Category" view sounds a lot more useful to me than "Sorted by Relevance." I note that I'm using "Summarized by Category" first (I don't remember picking it, so it must have been set earlier!) and then within a specific database the Search goes to "Sorted by Relevance." I really dislike getting results "Sorted by Relevance" early in the Search -- I can never find what I want. Perhaps I'll do another post just to test that.

3) In Ancestry.com Post 4 - reader geolover commented:

"Regarding this search, if looking for an 1860 US Federal Census entry, why are you using a global search page instead of the search page for the 1860 enumeration? It is easy as pie to bookmark the Census listings, much easier than specifying such a search via the homepage/global search. A lot faster loading, too."

I agree totally ... but I was trying to do the searches as if I were a fairly new user (and that's hard sometimes!) not familiar with all of the nuances of Searching that come from years of experience. I've watched newbies do this...believe me, they don't know about all of the nuances (but I'm not sure I do either)! I'm also going to compare the Ancestry.com process with other web sites, which may not permit that type of Search, so I need to start at the beginning, so to speak. Great tip - thanks!

4) In Ancestry.com Post 2 - reader Anne Mitchell commented:

"...Just one note on the fuzzy vs exact: you don't have to do all fuzzy or all exact. One of my favorite tricks to find who I am looking for is to fill in the most exact value I can in the "Lived In (Residence)" field and click exact."

Anne has many favorite tricks, I think! She could write many blog posts about favorite tricks. Read her whole comment for an example. This is a great tip - you can specify which Search parameters are Exact or Fuzzy so that you can limit the Search results. As I noted above, I'm trying to be pretty simple here, so that I don't get too confused (or put hundreds of screen captures up!).

5) In Ancestry.com Post 2 - reader Familytreeservice commented:

"...one problem I have always had is putting on exact information for say, Mabel Molesworth (this is a serious name, honest), born 1890, Redditch, Worcestershire, England. Fairly specific details and name I think you will agree. However, as her name is recorded as 'Mable' in one census, her result comes after about 5 pages, behind every Molesworth name in the whole of England let alone in Worcestershire. I've noticed this happens for William (Wllm) and Thomas (Thos) a lot. Any tips on avoiding that?"

The census indexes are full of "serious names" done wrong...that's the problem with informants, enumerators and indexers, but not the Search algorithm! Your example makes the case for using a "wild card" for names that can be easily "messed up" by someone who heard it wrong or did not ask for the correct spelling. I'm guessing that you did a Fuzzy Match search and, since the indexed "Mable" did not equal "Mabel," the star ranking penalized the entry and it ended up on the 5th page. If you had done an Exact Match search, you would not have found her at all, but your list of match results would have been shorter!

A Search using wild cards for the names would have found your dear Mable easily - e.g., use First Name = "mab*" and Last name = "moles*" in the Search box and you would probably see your Mabel near the top of the search result list. I use wild cards for almost all of my Exact and Fuzzy match requests. If I get too many matches, then I add a birth year (if known, with a variance of about +/- 2 years), a birthplace (if known), and a residence (State/County, if known).
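
For readers who like to see the idea spelled out, here is a minimal sketch in Python of how a trailing-asterisk wild card catches both spellings, and how a birth year with a variance of 2 years trims the list. I have no knowledge of how Ancestry.com implements its Search, so everything here is an assumption for illustration only: the sample index entries, the birth years and the function names are made up.

```python
from fnmatch import fnmatchcase

# Hypothetical index entries: (first name, last name, birth year).
# These are invented for illustration, not taken from any real database.
INDEX = [
    ("Mabel", "Molesworth", 1890),
    ("Mable", "Molesworth", 1891),
    ("Mabel", "Moles", 1862),
    ("Mary", "Molesworth", 1890),
    ("Thos", "Molesworth", 1889),
]


def matches(pattern, value):
    """Case-insensitive wild card test; '*' stands for any run of letters."""
    return fnmatchcase(value.lower(), pattern.lower())


def search(first, last, birth_year=None, variance=2):
    """Return entries whose names fit the wild card patterns.

    If birth_year is given, keep only entries within +/- variance years.
    """
    hits = []
    for entry in INDEX:
        first_name, last_name, year = entry
        if not (matches(first, first_name) and matches(last, last_name)):
            continue
        if birth_year is not None and abs(year - birth_year) > variance:
            continue
        hits.append(entry)
    return hits


# Wild cards alone: both "Mabel" and "Mable" are found.
broad = search("mab*", "moles*")

# Too many matches? Add a birth year of 1890, +/- 2 years.
narrow = search("mab*", "moles*", birth_year=1890)

for entry in narrow:
    print(entry)
```

In this toy index the wild card search finds the misspelled "Mable" along with "Mabel," and adding the birth year drops the unrelated 1862 entry while keeping both spellings.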

I'm going to discuss using "wild cards" and specific Search strategies in subsequent posts.

Thank you all for taking the time to comment and question. I will try to keep up with all of the Comments and post responses to those that have questions, tips and suggestions.
