Monday, June 1, 2009

Big GEDCOMs and other genealogy software things

Tamura Jones reviewed the definition of "big" and "large" when it comes to GEDCOMs in his recent article "My Large is Smaller Than Yours" on his web site. Valerie C. posted about her file size on her Begin with Craft blog in "What is 'Large'." She wondered "how large is the average GEDCOM of a hobbyist researcher?"

Some family tree companies think that a GEDCOM with 7 generations or more is "large" while Tamura points out that could be only 7 persons or thousands. His opinion is that a "small" GEDCOM has about 5,000 persons in it, and a "medium" GEDCOM has about 25,000 persons in the file.
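Counting persons rather than generations is easy to do, because each person in a GEDCOM file is a level-0 record tagged INDI (for example, "0 @I1@ INDI"). Here is a minimal Python sketch of that count. The function name, the sample lines and the names in them are made up for illustration, and the pattern assumes the file follows the usual line layout.

```python
import re

# A level-0 individual record looks like "0 @I1@ INDI".
INDI_RECORD = re.compile(r"^0\s+@[^@]+@\s+INDI\b")


def count_individuals(lines):
    """Count level-0 INDI records in an iterable of GEDCOM lines."""
    # Strip a possible byte-order mark and leading whitespace before matching.
    return sum(1 for line in lines if INDI_RECORD.match(line.lstrip("\ufeff \t")))


# Made-up sample data: two persons and one family record.
sample = [
    "0 HEAD",
    "0 @I1@ INDI",
    "1 NAME Example /Person/",
    "0 @I2@ INDI",
    "1 NAME Another /Person/",
    "0 @F1@ FAM",
    "1 HUSB @I1@",
    "1 WIFE @I2@",
    "0 TRLR",
]

print(count_individuals(sample))
```

To try it on a real file, pass an open file object instead of the sample list, for example `count_individuals(open("my_tree.ged", encoding="utf-8", errors="replace"))`, where `my_tree.ged` is a placeholder name.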

I checked my own family tree files, and found:

* My "Master" database has 23,000 persons in it. This is the one that takes:

** On Family Tree Maker 16: 13 seconds to open, 18.4 MB file size
** On Family Tree Maker 2009: 45 seconds to open, 48.7 MB file size
** On Legacy Family Tree 7: 14 seconds to open, 47.8 MB file size
** On RootsMagic 4: 5 seconds to open, 15.7 MB file size

* My "Seaver" database has 10,480 persons in it

* My "Vaux" database has 2,700 persons in it

* My "Dill" database has 1,820 persons in it

* My "Buck" database has 910 persons in it

* My "Richmond" database has 610 persons in it

* I have about 20 more small client databases with 200 to 1,000 persons in each.

If I added all of those up, then I have upwards of 40,000 persons total that I've researched and tried to organize.
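As a rough check on that total, this short Python sketch adds up the rounded counts from the list. It is only an estimate: it assumes exactly 20 client databases, uses the 200-to-1,000 range to get a low and a high figure, and assumes no person appears in more than one database. The variable names are just for illustration.

```python
# Rounded person counts for the named databases listed above.
named = {
    "Master": 23_000,
    "Seaver": 10_480,
    "Vaux": 2_700,
    "Dill": 1_820,
    "Buck": 910,
    "Richmond": 610,
}
named_total = sum(named.values())

# Assumption: exactly 20 client databases of 200 to 1,000 persons each.
client_databases = 20
low_estimate = named_total + client_databases * 200
high_estimate = named_total + client_databases * 1_000

print(named_total, low_estimate, high_estimate)
```

The named databases alone come to just under 40,000 persons, so the client databases are what push the total past that mark.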

In the numbers provided above for each program, the time to load was estimated from my double-click of the desktop icon to having a useful screen (your mileage may vary!). I rounded off the number of persons in each database and the file sizes. I have very few images attached to persons in my database files, and I'm not sure which programs save them in their database files or in a separate media file.

But I consider only my "Master" ancestral database to be "Large" and it is close to Tamura's criterion of 25,000 persons. My "Seaver" database is "Medium" sized to me, and the others are all "Small" in my humble opinion.

My comparison of loading times for my one "Large" database shows the large disparity between FTM 2009 and the other programs. Of the four I tested, only RootsMagic 4 loads in what I consider a "quick" time. It also has the smallest file size for an equivalent file. The load times for FTM 16 and LFT 7 are "acceptable" to me, but LFT 7 has a fairly large file size.

I'm not smart enough to know what all of that means, but as a "frequent user" of genealogy software, I want my genealogy program to load quickly, have a relatively small file size, create a standard GEDCOM file, be easy to navigate, be easy to add and edit data, use standard Source templates, offer quick and simple media uploads, have a quick mapping feature, have a useful Web Search feature, make great colorful charts, make ahnentafelly correct genealogy lists and reports, make editable books with embedded word processing field codes and indexes, etc. Hmm, there's probably another blog post or two based on those criteria!

All of this information and a dollar will get you a nice donut at your local bakery! Make mine cream-filled glazed, please!

What do you think? How big is "large?" What other features do YOU want in your genealogy software?

Thank you, Valerie, for blog fodder this morning.

This may be the only redeeming feature of this post: Chris Dunham says that he likes girls with big GEDCOMs. I agree completely - I'm a big GEDCOM fan!

I'm off to teach my third (of four) adult education classes on "Beginning Computer Genealogy" which has a genealogy software component to it. The ten students have asked "which program is best to use" and my response so far is "start with a free one and wait for the software industry to sort it all out." I recommended Legacy Family Tree 7 Standard Edition because it is free and has the most functionality of the free programs available. I know some experts will disagree. I have told them that a program with a standard GEDCOM export capability can create a file that any other commercial program can read and use.


Unknown said...

I have a cousin with over 300,000 entries in his GEDCOM. There's actually a reason for this... he's building a "regional" family tree. It's been very useful for me researching family interconnections between soldiers in Civil War regiments. I think he's had some problems with indexing in the program he uses (which I don't remember now, sorry) but I've used an older version of his GEDCOM with about 200,000 entries in RootsMagic 3 with no problems.

Taneya said...

OMG - mine is so small compared to the numbers you've shared! :-)

Geoff said...

Randy - GENViewer is an add-on program that will read the GEDCOM in mere seconds. It doesn't import the whole GEDCOM into a new database, rather it reads it so you at least know what's in it. Great program from You can download a free version as well.

J. Moore said...

Alt-F-T in paf nets me the following:

152,527 Individuals, 1274 Sources, 111,527 Citations, and 274,511 Notes.

Size on disk is 138.58 MB.

That's all hand-entered over the course of about 10 years. No importations from other people's gedcoms whatsoever.

Cites are low in comparison to notes (though that figure obviously refers to edits or something, because it would be impossible to have 120,000+ more notes than I have individuals) because for several years I collected and transcribed sources into notes rather than citing them gedcom style. I'm slowly working on translating all those notes into proper gedcom cites, but as you might imagine... it is a stupidly gargantuan task.
