Wednesday, August 29, 2007

More on Ancestry's "Internet Biographical Collection"

I think that nearly all of the genealogy blogosphere has weighed in on the controversy about Ancestry.com's use of cached web sites in their "Internet Biographical Collection." The furor over this issue boiled over while Ancestry had the database behind their subscription firewall and the primary link to the subject web pages was through a cache on Ancestry's servers.

Ancestry made some changes yesterday - they made the database free to registered users (not just subscribers) and put two links on the summary page for each search result, adding "View Live Web Page" below "View Cached Web Page."

There are several more excellent posts by genea-bloggers today, including (and my apologies if I left someone out):

* Kimberly Powell on the About.com:Genealogy blog - "The Legality of Caching"

* Steve Danko on Steve's Genealogy Blog - "Thoughts on Ancestry's Internet Biographical Collection"

* Craig Manson at Geneablogie - "Ancestry.com: Thieves, Blunderers, Hypocrites or Fair Users?"

* Dick Eastman at Eastman's Online Genealogy Newsletter - "Internet Biographical collection is Free at Ancestry.com" -- there are many comments here also.

* Juliana Smith (I assume) at the 24/7 Family History Circle - "Internet Biographical Collection is Free at Ancestry" -- this is Ancestry's blog, and there are many comments to a rather bland posting.

* Susan Kitchens at the Family Oral History Using Digital Tools blog - "Ancestry.com scrapes websites; places harvested content behind membership walls."

* Leland Meitzler at the Everton Publishers genealogy blog - "The Generations Network Continues to Tarnish Their Image" -- this has the most complete list of blog posts to date.

I commented on the 24/7 blog that one problem I see with this database is that if I search on my name "Randy Seaver" or my blog "Genea-Musings," I get hundreds of hits - because many bloggers have put a link to my blog using one or both names. The numbers may increase as they add blog sites.

I am still sort of upset that they haven't crawled my Genea-Musings blog information yet - I just checked!

I do have some more comments about this issue:

* Caching appears to be common for search engines and Internet users. When I save a web page to my hard drive, I am caching it. When I save an Ancestry image to my hard drive, I am caching it (but only the image, not the Ancestry material around it).

* I think that Ancestry should do away with the "View Cached Web Page" link and just show the "View Live Web Page" link.

* The major complaint was that Ancestry was "profiting" from freely created and provided data. I sincerely doubt that Ancestry has "profited" from adding these web pages to their cache. The database was put online on 22 August 2007 and Ancestry must have invested some amount of money in creating it. No one who subscribed before that date knew it was coming online, so no one subscribed just for this collection. In the future, I sincerely doubt that anyone will subscribe, or will renew their subscription, to Ancestry just because of this database - I wouldn't.

* The reason that so many genealogy researchers create freely-viewed web sites, transcribe data, write blogs, etc. is to disseminate genealogy information for other researchers to use. Most of this data is searched for using a search engine - whether it's Google, Yahoo, or Ancestry. The links to the web pages are provided, as are cached web pages.

* Not everyone uses Ancestry - you can get the same result using the search engines, at least as it applies to the information in the Internet Biographical Collection.

* There are means to prevent Ancestry from capturing the web pages, but webmasters have to know how to do it. See Susan Kitchens' blog post above (which relied on a comment by AnnieGMS to my earlier post).

* Will Ancestry add items from Rootsweb.com, Genealogy.com, USGenWeb.org or WorldGenWeb.org to this database? There is a wealth of data on these web sites, often the only mention of individuals and families. It would be a shame if contributors deleted their material from these free web sites, as some have suggested.
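On the point above about keeping a crawler away from your pages: one widely used mechanism is a robots.txt file at the root of the web site. I have not confirmed that this is the method the linked post describes, and it only works for crawlers that choose to honor it. Here is a rough sketch of what such a file might look like and how to check it with Python's standard library. The crawler name "ExampleGenealogyBot" and the example.com URL are made up for illustration; I don't know the name Ancestry's crawler actually uses.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named crawler, allow everyone else.
# "ExampleGenealogyBot" is an invented name, not Ancestry's real crawler.
ROBOTS_TXT = """\
User-agent: ExampleGenealogyBot
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if these robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://example.com/family/smith.html"  # placeholder URL
    for agent in ("ExampleGenealogyBot", "SomeOtherBot"):
        print(agent, "allowed:", is_allowed(ROBOTS_TXT, agent, page))
```

A well-behaved crawler reads this file before fetching anything, so the named crawler would skip the whole site while the ordinary search engines could still index it.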

I admit that I've been fairly neutral on this issue - my data wasn't captured by Ancestry (yet) and hidden behind a firewall. I didn't have "hits" taken away from me by the caching of pages. I am an Ancestry subscriber. I do want to hear from Ancestry about the legal issues (although Kimberly, Craig and Steve covered them pretty well, I thought). I wish that Ancestry would carefully consider the reaction to adding a database like this before they do it.

UPDATED: 29 August, 2:30 PM, minor editorial corrections, plus adding Leland Meitzler's post to my list.

UPDATED: 29 August, 10:30 PM: Well, I just got home from going out to dinner with my Angel and attending the play "Susan and God" and found out that Ancestry has pulled the plug on access to the Internet Biographical Collection. Good. That doesn't mean that they have deleted the images - just cut the access to the collection. We'll see if they put it back at some time with appropriate credits and good links to the actual web sites, à la Google.

For a spot of humor, check out Tim Agazio's post - "So, What's New in the Genealogy world?" Thanks for the chuckle, Tim. And welcome back.

2 comments:

Jessica's thoughts said...

Hi Randy,

I just read on Ancestry's blog that Ancestry has decided to remove the Internet Biographical Collection.

I've also commented on the article on my blog: http://jessicagenejournal.blogspot.com/

Tim Agazio said...

Randy,

Although my post wasn't as insightful as the rest of them, I really appreciate the mention.

Tim