Monday, July 7, 2008

Ancestry Census Indexing Errors

On the APG mailing list, Neal Underwood just posted a note about Ancestry census indexing errors, particularly "India" indexed as a birthplace instead of Indiana. Neal's counts of "India" birthplaces by census year:

1850 - 16
1860 - 9,023
1870 - 27,984 (9,603 in the state of Indiana; 7,303 in Iowa)
1880 - 1,291 (indexed by LDS volunteers)
1900 - 17,953
1910 - 13,284
1920 - 20,283
1930 - 5,291

A search for "India" finds West India, East India, India and other anomalies. A search for "Ind" also finds "Indies" and "Indian"; there are 157,865 entries with a birthplace that includes "Indian."

In the 1900 census, 2,508,257 persons are listed with "Indiana" as their birthplace, and another 551 with "Ind." Taking the 17,953 "India" and 551 "Ind" entries together, the error rate is about 0.74%, or about 1 in 135. That sounds small, but if your family is that one, you won't find them by specifying "Indiana" as the birthplace.
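The arithmetic behind that rate can be checked with a few lines of Python. This is only a sketch: the function name `error_rate` is made up for illustration, and it assumes the denominator is the number of correctly indexed "Indiana" entries, which is how the 0.74% figure appears to work out.

```python
def error_rate(misindexed, correct):
    """Return (percent, one_in_n) for mis-indexed entries relative to correct ones."""
    fraction = misindexed / correct
    return fraction * 100, 1 / fraction

# 1900 census counts quoted in this post: 17,953 "India" plus 551 "Ind"
misindexed_1900 = 17_953 + 551
indiana_1900 = 2_508_257

percent, one_in = error_rate(misindexed_1900, indiana_1900)
print(f"{percent:.2f}% of the Indiana count, roughly 1 in {one_in:.1f}")
```

The same function can be pointed at any other pair of counts in this post, such as "Kenya" against "Kentucky."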

** I had heard that "Kenya" was also indexed in place of Kentucky, and I found these entries (though not nearly as many as for India!):

1850 - 0
1860 - 1,592
1870 - 908
1880 - 0
1900 - 460
1910 - 239
1920 - 393
1930 - 25

Most of the "Kenya" listings are really for Kentucky, although there may be some for the African country, but it's nearly impossible to sort them out.

There are 2,422,891 people with a birthplace of "Kentucky" in the 1900 census, so the error rate is much lower: the 460 "Kenya" entries amount to only about 0.02%, or about 1 in 5,300.

** There are some listings for "Africa" as well - 1,558 in the 1900 census.

** There are also some listings for "Old" in the census indexes:

1850 - 18
1860 - 37
1870 - 479
1880 - 1,580
1900 - 785
1910 - 437
1920 - 1,460
1930 - 652 (plus 58 indexed as "O")

Many of the later "Old" entries were for "Old Mexico" (as opposed to "New Mexico"); many of the earlier ones were for "Old Deu" or similar.

** The Ancestry census indexes are not always consistent. For instance, in the 1930 census there are:

Washington - 174,417 (including some for District of Columbia)
Wash - 165 (including "Wash DC" and "Wash Terry", the latter meaning "Territory," I think)
Was - 10 (including "this child was" - huh?)

Using a wild card for Washington ("Was*") resulted in 174,629 entries, or 212 more than just using "Washington".
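To see where the extra wild card hits come from, the 1930 counts above can be tallied. This is a rough check that assumes the "Wash" and "Was" counts do not overlap with each other or with "Washington" (the search results do not say either way); the variable names are only for illustration.

```python
# 1930 census birthplace counts quoted in this post
washington = 174_417
wash = 165
was = 10
wildcard_total = 174_629  # result of searching for "Was*"

extra_hits = wildcard_total - washington        # entries the exact search misses
listed_variants = wash + was                    # the two variants itemized above
other_spellings = extra_hits - listed_variants  # hits not itemized in the post

print(extra_hits, listed_variants, other_spellings)  # 212 175 37
```

Under that assumption, a few dozen of the wild card hits are presumably other spellings beginning with "Was" that are not itemized here.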

** The same thing happens with two-word state names that can be abbreviated, like "North Dakota":

North Dakota - 128,633
ND - 10
South Dakota - 198,035
SD - 6
Dakota - 333,838

Several thousand people just listed Dakota or Dakota Territory for the birthplace.
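That figure is consistent with the counts, if one assumes that a search for "Dakota" also returns every "North Dakota" and "South Dakota" entry. That is an assumption on my part; the results do not show how the search matched.

```python
# 1930 census birthplace counts quoted in this post
north_dakota = 128_633
south_dakota = 198_035
dakota_any = 333_838  # search for "Dakota"

# Assumes the "Dakota" search includes both full state names
dakota_only = dakota_any - north_dakota - south_dakota
print(dakota_only)  # 7170
```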

Obviously, if you cannot find an ancestor using a birthplace fully written out, you should also try a wild card and the common abbreviation for the state.
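That advice can be turned into a checklist of search terms to try by hand. The sketch below is hypothetical: `birthplace_search_terms` and `KNOWN_VARIANTS` are names invented for illustration, the helper does not talk to Ancestry at all, and the variant list contains only the spellings mentioned in this post.

```python
# Variant birthplace spellings reported in this post (not an exhaustive list)
KNOWN_VARIANTS = {
    "Indiana": ["India", "Ind"],
    "Kentucky": ["Kenya"],
    "Washington": ["Wash", "Was"],
    "North Dakota": ["ND", "Dakota"],
    "South Dakota": ["SD", "Dakota"],
}

def birthplace_search_terms(state, prefix_len=3):
    """Return search terms to try, in order: full name, wild card, known variants."""
    terms = [state]
    # A short prefix plus "*" mirrors the "Was*" search above; it can also
    # match unrelated places, as "Ind" did with "Indies" and "Indian".
    terms.append(state[:prefix_len] + "*")
    for variant in KNOWN_VARIANTS.get(state, []):
        if variant not in terms:
            terms.append(variant)
    return terms

for state in ("Indiana", "Kentucky", "Washington"):
    print(state, "->", birthplace_search_terms(state))
```

The order is a judgment call: the full name first, then the broad wild card, then the specific mis-indexed spellings, so the narrowest searches come last.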

1 comment:

Annette said...

I've noticed a different systematic error for Indiana. In the earlier censuses, it was not uncommon for Indiana to be abbreviated as Ia. This always seems to be interpreted as Iowa by Ancestry.com, even in records from before Iowa existed.