Thursday, October 28, 2010

FamilySearch Bloggers Day - FamilySearch Indexing

One of the first presentations on the FamilySearch Bloggers Day (21 October) was by Jim Ericson (Software Community Manager) speaking on "Volunteer Indexing: Unlocking the World's Records One Name at a Time."

FamilySearch has been digitizing images from historical record collections in their Granite Mountain microfilm stash since 1998, and has been indexing some of those collections using volunteer indexers since 2006.  You can see the FamilySearch Indexing information - including current, completed and future project lists - at http://indexing.familysearch.org/.  This presentation covered the FamilySearch Indexing history, accomplishments and goals for the future.

Here are the notes I made for myself and my Twitter audience during this presentation (each with a time stamp; #FSBlogDay is the Twitter hashtag for collecting the tweets; times shown are PDT, so the local MDT times were one hour later):

#FSBlogDay Jim Ericson next about FamilySearch Indexing - started with 11 donated volumes in 1894 8:34 AM Oct 21st 

#FSBlogDay target is 200 million names in 2010, over 750 million online now from over 100 nations 8:35 AM Oct 21st

#FSBlogDay FS records in vault, can't just put it all online. Renegotiate digital rights for many collections 8:35 AM Oct 21st

#FSBlogDay microfilming started in 1938 to capture records, now scanning and digitizing and indexing before publishing online 8:36 AM Oct 21st 

#FSBlogDay over 2.4 million rolls of film, over 1 million microfiche, over 3.5 billion images, getting 40,000 rolls per year (equivalent) 8:38 AM Oct 21st

#FSBlogDay over 1/3 images digitized 1.1% of images online, 2.6% of indexes published. 500 million new records online in 2010 8:39 AM Oct 21st

#FSBlogDay some collections are not images, only indexes. Some collections are easy to navigate, some aren't 8:42 AM Oct 21st 

#FSBlogDay now almost 500 million names index by over 375,000 volunteers 8:43 AM Oct 21st

#FSBlogDay doing 1930 census now - adding more fields to index to improve index. YAY! Needed A and B teams 8:45 AM Oct 21st 

#FSBlogDay Ideal Partner Profile: partner IDs collection that fits need of many users, recruits local volunteers with expertise 8:46 AM Oct 21st

#FSBlogDay promote project through society and media, outside volunteers can help complete project 8:47 AM Oct 21st

#FSBlogDay listing upcoming indexing projects - probably on FamilySearch Indexing project site 8:48 AM Oct 21st

#FSBlogDay Current needs/challenges: about 10 years to complete digitization of Vault. would take 300 years to index at present rate 8:50 AM Oct 21st

#FSBlogDay expanding internationally, but need people with language skills (only 5% of records being indexed are non-English) 8:51 AM Oct 21st

#FSBlogDay Why index? new skills - paleography, transcription, genealogy research, meaningful service, pay it forward, add content 8:54 AM Oct 21st

#FSBlogDay don't need language skills to index many records - there are helps, word lists, etc. 8:56 AM Oct 21st

#FSBlogDay Q: how will redirection work to new FamilySearch website - not enough functionality on Beta site yet, need rigor! 8:58 AM Oct 21st

#FSBlogDay All released records are on Beta, not on Record Search, a "new collection" indicator will soon appear YAY! 8:59 AM Oct 21st

Some of the issues that Jim shared in his presentation that were of interest to me:

1.  The start of the FamilySearch record collection was in 1894 with the donation of 11 books to the Genealogical Society of Utah.  The collection grew in order to fulfill the genealogy needs of the Church of Jesus Christ of Latter-day Saints, and is funded by church donations.

2.  The Online Record Access process includes acquisition of the record collection, scanning and digitizing it, indexing it, and publishing it online.

3.  The Granite Mountain Records Vault currently has over 2.4 million rolls of microfilm, over 1 million microfiche.  The Vault is adding the equivalent of over 40,000 rolls of microfilm each year.  All of the films and fiche contain over 3.5 billion pages (images) of family history information. 

4.  Digitization of the Vault collection began in 1998, and 12 years later about one-third of the images have been digitized (so, about 1.2 billion images).  However, only 1.1% of the total number of images are published online (so, about 38 million images) from the collection.  About 2.6% of the total number of images have indexes (so about 90 million images) available online.  More than 750 million records are now on the FamilySearch.org Beta site [What is a record?  Probably a record for a person, a name along with other indexed information for that person].

5.  The FamilySearch Indexing program has indexed more than 300 million names [I thought he said 500 million in the presentation, but the handout says 300 million] since 2006, by more than 375,000 volunteers (church members and general public).  There were 139 million names indexed in 2009, and 148 million to date in 2010.  The goal for 2010 is 200 million names indexed.  All of this is very impressive!  There are more than 750 million records available online from more than 100 nations [Why the difference between 750 and 300 million?  I think it's because FamilySearch already had millions of indexed names in their International Genealogical Index (IGI) collection, and these are being added to the FamilySearch Beta collection].

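6.  FamilySearch estimates that it will take about 10 years to complete digitization of the Vault, but about 300 years to index everything at the present rate.  This is one reason that many collections are being added to the FamilySearch Beta site as images only, without indexes or with way-pointing and ongoing indexing.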

My view is that we may see much more of this type of historical record collection - let the users browse the collection from home online rather than have to go to the FHC to order a film and go back to read the film (there are financial consequences for this, though - the film rental fees).

7.  FamilySearch would love to have more indexing partners - persons or genealogical societies with expertise and manpower to promote and complete indexing projects.  Ideally, a local society would index a collection that is a FamilySearch priority and fills the needs of both the local group and the genealogy world.

8.  Since only 5% of the indexed records to-date are non-English, they are seeking indexers with language skills beyond English.  They can help with word lists and help lists for English speakers indexing records in other languages.

9.  There are benefits to volunteer indexing - the indexer gains knowledge about genealogy, experience in paleography (reading handwriting), and transcription skills, and provides meaningful service to the genealogy community, all on a flexible and voluntary basis.

10.  All released FamilySearch Indexing projects are on the FamilySearch Beta site.  Some of the collections have been removed from the Record Search Pilot site, and some are redirected from the Record Search Pilot site to the Beta site.
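The image counts in item 4 above can be checked with some quick arithmetic. Here is a minimal Python sketch, assuming the rounded figures from the presentation (3.5 billion total images, one-third digitized, 1.1% online, 2.6% indexed); the names `TOTAL_IMAGES` and `share_of_total` are made up for illustration and are not anything FamilySearch provides:

```python
# Rounded figures quoted in the presentation, so every result is approximate.
TOTAL_IMAGES = 3_500_000_000  # "over 3.5 billion images" in the Vault


def share_of_total(fraction, total=TOTAL_IMAGES):
    """Return the number of images that a given fraction of the total represents."""
    return round(fraction * total)


digitized = share_of_total(1 / 3)  # about one-third of the images digitized
online = share_of_total(0.011)     # 1.1% of the images published online
indexed = share_of_total(0.026)    # 2.6% of the images with indexes online

for label, count in [
    ("digitized", digitized),
    ("images online", online),
    ("images with indexes", indexed),
]:
    print(f"{label}: about {count / 1e6:,.1f} million")
```

The results land close to the rounded numbers in item 4 (roughly 1.2 billion, 38 million and 90 million); the small differences come from rounding "over 3.5 billion" and the percentages themselves.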

Disclosure: I am not an employee, contractor or affiliate of FamilySearch. FamilySearch paid my way to this Bloggers Day in Salt Lake City, including airfare, hotel, some meals and incidental expenses. I am trying to be as objective as possible. I really appreciate FamilySearch's efforts to inform the genealogy community about their products and capabilities.

3 comments:

Martin said...

"Since only 5% of the indexed records to-date are non-English, they are seeking indexers with language skills beyond English. They can help with word lists and help lists for English speakers indexing records in other languages."

That is very dangerous and completely unscholarly. You need someone fluent in a language. With just word lists how are people to know the difference between a's and o's in a foreign language in 19th century (or earlier) handwriting? That part sounds like a recipe for disaster.

I also don't see where you talk about any mechanism for reporting errors and having those errors corrected.

Aylarja said...

Thanks for your recent series of posts on FamilySearch Bloggers Day. I sat in on a similar session at the recent California Family History Expo, which got me thinking about a possible strategy for prioritizing the digitization of such an enormous volume of existing microfilm and microfiche records. One approach occurred to me:

Currently, if a patron of a local Family History Center wants to view a roll of microfilm that is stored in Salt Lake City, that person can pay $5.50 to have that roll of microfilm shipped to the FHC for review. Would it be feasible to at least offer the option to instead digitize the roll, using the borrower's fee to partially offset the expense of digitizing? And how about if the patron would be able to view the digitized microfilm online via the FamilySearch web site, from the comfort of home? Perhaps in return for the fee, and as a sweetener for encouraging digitization, the patron would have exclusive access to the resulting images for a specified period of time. At some later time, the digitized roll would be added to a publicly accessible pool, available only in a raw, unindexed state until it could be added to an indexing project.

Crazy? Perhaps. Cost-effective? Maybe not - although if FamilySearch plans to digitize everything anyway, this would be a fairly effective means of prioritizing (only requested rolls would be digitized in this way, so by definition digitizing them would add value), and could provide some offset to the cost (though no doubt nowhere near the actual cost). The benefits to the person requesting the digitizing seem clear: submit the request from home via FamilySearch.org, pay online by credit card, access the images from home, possibly having limited exclusive access, all for a small fee roughly equivalent to the current borrower's fee. Whether a workflow could be devised that could handle the volume of requests that would almost certainly result, only FamilySearch could attempt to answer. No doubt copyright issues or use agreements would apply to some records. But I would imagine that FamilySearch already categorizes its records accordingly, so that it would be relatively easy to determine which rolls of microfilm could be managed in this way, and which ones could not.

I, for one, would gladly pay for this option. Thoughts?