Wednesday, September 26, 2012

NewspaperARCHIVES Records in MyHeritage Record Matches

MyHeritage announced their Record Matching technology last week (see MyHeritage Releases Record Matching Technology).  

One of the record collections highlighted in the announcement was the NewspaperARCHIVES collection (a searcher needs a subscription or a MyHeritage Data Subscription to access this collection).

I've been looking through the 982 NewspaperARCHIVES matches for persons in my MyHeritage tree.  Quite a few are excellent matches, but many are not for the person in my tree; they are for someone with a similar name.

Here is my MyHeritage Record Matches page listing NewspaperARCHIVES.  I clicked on the "Filtering options" link to make sure I had all of the possible matches:

I could include or exclude Pending, Rejected or Confirmed matches, and could specify how many "Stars" to consider.  I clicked "Apply."

After clicking on the NewspaperARCHIVES link to see the 982 matches, here is the top of the list (the user can choose 20, 50 or 100 matches on the screen):

The first match shown above is a 5-star match (the highest possible), which means there are several matching points, such as names and dates.  However, the match is not for the person in my MyHeritage tree: the article announces the 1982 marriage of a daughter of a couple in my tree who married in 1982 themselves.  It is one generation off, but it's still a 5-star match.  That's a problem with any text matching program, whether it uses OCR or not, and is to be expected.

Of course, Record Matches also found many 1-star and 2-star matches for persons in my tree.  Further down the list is a 2-star match for Joseph Carringer:

On the screen above, the match is highlighted in yellow in the OCR rendering of the text.  To see the actual article, I clicked on the blue "Review Match" button below the transcription:

The newspaper name, publication date and OCR text rendering are provided at the top of the page, but the page image appears only after several seconds (typically 5 to 10 seconds on my computer - remember, it goes from MyHeritage to WorldVitalRecords to NewspaperARCHIVES and back in order to show the image).

Scrolling down, I can see the newspaper article in the relatively small window, and can zoom into the image so that I can read the article of interest.  The matching text is highlighted in yellow to help the searcher find the information of interest:

If the searcher wants to save the newspaper page, s/he can click on the Download icon (below the page image on the right - the down arrow icon).  Or the searcher can Print the full page by clicking on the Print button next to the Scribd logo above the page image on the left.

I found another really interesting article by scrolling through the matches - the transcription of this one said that Solomon Sovereign died at age 915:

Alas, the OCR got it wrong; he was only age 95.
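
For readers who keep their own notes on transcriptions, an OCR slip like this one can be caught with a simple sanity check.  The sketch below is my own illustration in Python, not anything MyHeritage or NewspaperARCHIVES provides; the function name and the age ceiling of 122 (roughly the oldest verified human age) are assumptions.

```python
# Hypothetical helper: flag OCR-transcribed ages that deserve a look at the page image.
MAX_PLAUSIBLE_AGE = 122  # assumption: roughly the oldest verified human age


def flag_implausible_age(age_text):
    """Return True when a transcribed age looks wrong and should be checked by eye."""
    try:
        age = int(age_text)
    except ValueError:
        # Not a clean number (for example "9l5"), so the OCR may have garbled it.
        return True
    return age < 0 or age > MAX_PLAUSIBLE_AGE


print(flag_implausible_age("915"))  # the transcription above is flagged
print(flag_implausible_age("95"))   # the age on the page image is not
```

A check like this only says which transcriptions to look at; the page image is still the evidence.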

After each match is reviewed, I decide if I want to add the information to my database.  I also click on the "Confirm" or "Reject" button so that I don't have to see the match again if I choose to limit my searches in the database to "Pending."

For the NewspaperARCHIVES collection, it is probably most time efficient to sort by "Last Name" rather than "Confidence" or "Status" so that the matches for one person appear together.

This system of Record Matches is excellent - especially when there are too many matches to consume in one sitting (I can only take two hours or so at a time).  Since I "Confirm" or "Reject" each reviewed match, I can search only the "Pending" matches the next time I review matches.
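
The review workflow described above (keep only the "Pending" matches, then group them by last name) can be sketched in a few lines of Python.  This is only an illustration of the idea, not MyHeritage's code or data format: the `Match` record, its field names, and the star counts and statuses in the sample list are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Match:
    """Hypothetical record for one match; not a MyHeritage data structure."""
    last_name: str
    first_name: str
    stars: int               # confidence, 1 (lowest) to 5 (highest)
    status: str = "Pending"  # "Pending", "Confirmed" or "Rejected"


def pending_by_last_name(matches, min_stars=1):
    """Keep unreviewed matches at or above min_stars, grouped by person."""
    pending = [m for m in matches if m.status == "Pending" and m.stars >= min_stars]
    # Sort by name so one person's matches sit together, best confidence first.
    return sorted(pending, key=lambda m: (m.last_name, m.first_name, -m.stars))


# Illustrative sample only; the stars and statuses here are made up.
matches = [
    Match("Sovereign", "Solomon", 3),
    Match("Carringer", "Joseph", 2),
    Match("Carringer", "Joseph", 4, status="Confirmed"),
    Match("Carringer", "Joseph", 5),
]

for m in pending_by_last_name(matches):
    print(m.last_name, m.first_name, m.stars)
```

Raising `min_stars` would mimic the "Stars" setting in the filtering options, at the cost of hiding low-confidence matches that are sometimes correct.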

As MyHeritage pointed out in their press release, no other provider of record searches provides a similar service for newspaper articles - this one is unique to date.

Disclosure:  I have a complimentary subscription to both and courtesy of MyHeritage, for which I am grateful.  However, this does not influence my objective opinions in reviews of these websites and their products. 

The URL for this post is:

Copyright (c) 2012, Randall J. Seaver

1 comment:

Margel said...

I am a big fan of Newspaper Archives. I remember just "getting lost" in it for over four hours one night when I first discovered it. The stories that can be found in newspapers are often what makes them real. I get my access through my local library card - St. Joseph County Public Library in South Bend, IN. They have a long list of databases that are available from your computer at home for individuals with a SJPL library card. Don't live in the area? They have library cards available for purchase by individuals who live outside the county - anywhere outside the county! Another great database they have accessible from home is the Historical Chicago Tribune. Always check the libraries.