Friday, September 21, 2007

The Mother of All Genealogy Databases

Drew Smith sent me a link to this article, Coming Soon: The Mother of All Genealogy Databases by Mike Elgan in ComputerWorld magazine. There are two pages to it - be sure to read both pages.

The author apparently is bored by current genealogy methods and databases, and he projects out 10 years and predicts that Google, or some other outfit, will come up with a database with features such as:

"* Enter your unique ID info (probably your Gmail username) and that of any other person, and the site would trace you both back to the most recent common shared ancestor.

"* Follow a timeline that shows the locations and migrations of ancestors all leading up to the descendant that is you.

"* Track down every living relative.

"* Pick any year in history and see just how many of your ancestors were alive at the time.

"* Genetic family relationships could be combined with Linked-In-type social networking friendships or business relationships to render the most direct connection with anyone else ("Hey, you're the brother-in-law of my former boss's wife!").

"* And many other cool tricks nobody can even think of right now.

"Such a public database would also have profound social implications. For example, it would probably render meaningless the concept of race because we would see before our eyes that we're all a lot more like Tiger Woods than any of us thought.

"Second, the privacy implications would be enormous and obvious.

"Third, such knowledge might have legal implications, calling into question various people's right to inherit property or titles.

"The prospects of such a massive, public database are staggering, scary and, yes, exciting.

"Genealogy is boring. But tomorrow's genealogy -- with a million times better results with a fraction of the effort -- well, that's something our ancestors couldn't even have imagined."

I took quite a bit there, but I thought it was the meat of the article. It is an interesting set of premises and predictions.

So -- is it possible that this is where the art and science of genealogy and family history will be in 10 years?

I have the following observations:

* I think that genealogy will be really boring if the above predictions come true in 10 years or even 100 years. The thrill of genealogy for me is the hunt for ancestors and their stories, not simply knowing names, dates and places.

* Logically, this isn't going to happen because there is a dearth of records in many places over long periods of time. While some people can get back to 17th century Europe with one ancestral line or many, nobody can get back to 17th century Europe (or Asia, Africa, Latin America) with EVERY ancestral line. The records just are not available. Even the best researchers have many brick wall ancestors that are nearly impossible to find. The exception, and the wonderful example, is Icelandic ancestry, of course.

* Two of these items can be accomplished now if someone has a fairly well filled-out family tree in a database back into the 19th century or earlier. You could create a timeline that shows your ancestors, their locations and family situation. Likewise, you could create a list or map that shows where your ancestors lived in a certain year.

* DNA tests and results cannot deal with every member in a person's ancestry yet, and probably will never be able to. The Y-DNA study traces only the patrilineal line, and determines if the person matches some other person who has been tested; the test doesn't identify the actual ancestors - it only allows a person to determine if they share a common male ancestor some generations earlier. The mtDNA study traces only the matrilineal line, and does not define the ancestral line, only the genetic haplogroup that the female line comes from. If a person has cousins, aunts, uncles, etc. tested, s/he may be able to define up to 8 ancestral lines, but not the other hundreds of ancestral lines. Perhaps if every person's genome were determined this dream scenario could be realized, but that's probably really expensive.

* If all of the world's genealogy research is completed in TMOAGD, what will the genealogy industry (commercial companies, databases, LDS, magazines, writers, editors, readers, speakers, conference planners) do in their spare time? I know - they'll be memorizing their own genome coding. After all, it's only several billion characters!
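To make the timeline and "alive in a certain year" observation above concrete, here is a minimal sketch in Python of how both could be pulled out of a family tree that is already filled in. Everything in it is an assumption for illustration: the `Person` class, the function names, and the sample people and years are made up, and no particular genealogy program works this way.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    name: str
    born: int
    died: Optional[int]              # None means still living
    place: str
    father: Optional["Person"] = None
    mother: Optional["Person"] = None


def ancestors(person):
    """Yield (generation, ancestor) for every known ancestor.

    An unknown parent (None) simply ends that line, like a brick wall.
    An ancestor reachable by two paths would be yielded twice.
    """
    stack = [(1, p) for p in (person.father, person.mother) if p is not None]
    while stack:
        generation, ancestor = stack.pop()
        yield generation, ancestor
        for parent in (ancestor.father, ancestor.mother):
            if parent is not None:
                stack.append((generation + 1, parent))


def alive_in(person, year):
    """Sorted names of known ancestors who were alive in the given year."""
    return sorted(
        a.name
        for _, a in ancestors(person)
        if a.born <= year and (a.died is None or year <= a.died)
    )


def timeline(person):
    """Known ancestors ordered by birth year, as (born, name, place)."""
    return sorted((a.born, a.name, a.place) for _, a in ancestors(person))


# Made-up sample tree; the names, years and places are placeholders.
ggf = Person("Great-grandfather A", 1840, 1910, "Town A")
gf = Person("Grandfather B", 1870, 1945, "Town B", father=ggf)
gm = Person("Grandmother C", 1875, 1950, "Town B")
father = Person("Father D", 1900, 1980, "Town C", father=gf, mother=gm)
me = Person("Me", 1940, None, "Town C", father=father)

print(alive_in(me, 1905))
print(timeline(me))
```

The sketch only reports ancestors that are actually in the tree, which is the point made above: the output is limited by the records, not by the software.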

Those are my "off the cuff" thoughts and analyses - what do you think? Please read the article and blog about it, or comment here with your thoughts on TMOAGD (The Mother of All Genealogy Databases). Is it feasible? Will it ever happen?

Thanks, Drew, for pointing me to an interesting and challenging article.


cheekygnome said...

Even if I weren't a cynic I wouldn't see this happening within the next 50 years. Where is all of this data supposed to come from? GEDCOMs? Don't make me laugh. Who gets to decide which ones are accurate and which ones get pulled from [insert favorite source of mass inaccurate unsourced family tree CDs here]?

Anonymous said...

I think he was a little ambitious for a ten year time frame.

I wrote my own thoughts on where I expect things to go in the next few years at:

As I explain on the above page, I agree with Greg that this won't be accomplished by mashing together GEDCOM files. What is needed is collections of genealogical evidence instead of conclusions, and then a way to organize the evidence by person instead of by source. It is very doable with current technology. I hope that ten years from now we won't be individually searching out census entries one by one; we will have collectively gone through that process. Instead we'll be dealing with much richer and more interesting information gleaned from a wide variety of sources (newspapers being an obvious first step).
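As a rough illustration of what "evidence organized by person instead of by source" might look like, here is a small Python sketch. The `Evidence` record, its fields, the person IDs and the sample entries are all hypothetical and invented for this example. They are not taken from any existing site or file format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    source: str      # where the record was found (census, newspaper, ...)
    claim: str       # what the record itself says
    person_id: str   # hypothetical ID for the person the record is judged to be about


def by_person(records):
    """Regroup records collected source by source into one list per person."""
    index = defaultdict(list)
    for record in records:
        index[record.person_id].append(record)
    return dict(index)


# Made-up records, listed the way they are usually found: one source at a time.
records = [
    Evidence("census (sample)", "head of household, age 40", "p1"),
    Evidence("census (sample)", "daughter, age 12", "p2"),
    Evidence("newspaper (sample)", "obituary naming a daughter", "p1"),
]

people = by_person(records)
for person_id, evidence in sorted(people.items()):
    print(person_id, [e.source for e in evidence])
```

Each record keeps its source and its original claim, so the evidence is still there to check even after it has been filed under a person.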

As far as Google is concerned, I think the biggest potential for Google is to incorporate in their search engine the ability to understand tags on web pages that specify relationships between people. Some work has been done in this area with regard to creating standards for "open" social networking, but it's clear it could be adapted to genealogy. Once there is a standard way to publish information about a person and about the relationships between people the search engines will be able to build some of the functionality described in the article.
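The comment does not name a specific standard. One existing example of relationship tags on web pages is XFN (XHTML Friends Network), which puts values such as `parent`, `child`, `sibling`, `spouse` and `kin` in the `rel` attribute of a link. As a sketch only, assuming pages were marked up that way, a crawler could collect the family links with Python's standard `html.parser` module. The sample page and its URLs are made up.

```python
from html.parser import HTMLParser

# Family-related rel values from XFN; treating them as genealogy tags is an assumption.
FAMILY_RELS = {"parent", "child", "sibling", "spouse", "kin"}


class RelationshipLinks(HTMLParser):
    """Collect (relationship, href) pairs from <a rel="..."> links."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, or be missing entirely.
        rels = set((attributes.get("rel") or "").split()) & FAMILY_RELS
        for rel in sorted(rels):
            self.links.append((rel, attributes.get("href")))


# Made-up page fragment for illustration.
page = (
    '<p><a href="https://example.org/mary" rel="parent">Mary</a> '
    '<a href="https://example.org/tom" rel="sibling kin">Tom</a> '
    '<a href="https://example.org/" rel="nofollow">home</a></p>'
)

parser = RelationshipLinks()
parser.feed(page)
print(parser.links)
```

This only gathers the stated relationships. Deciding whether two pages describe the same person would still be a separate and much harder problem.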