Thursday, June 29, 2023

Rabbit Holes With Randy - Scanning Typed Text to Digital Text

 I'm in pretty deep with this one.  

When I started doing genealogy research in 1988, one of the first things I did was to ask my father's four living siblings for their memories of their family life in Massachusetts.  Of course, I immediately forgot what they told me because I made no notes.  

In 1990, I attended the 50th wedding anniversary of my aunt and uncle (Edward and Janet Seaver) in Leominster, Massachusetts, with the aunts, uncles, and cousins in attendance, and we discussed the family ancestry.  By then I had enough information to claim that their mother was descended from William White of the Mayflower, which everyone was excited about.  I asked everyone there to write down their family memories and their own life story.  

I received several letters from two of my aunts during the next year, and then, the big prize, four audio cassette tapes from my wonderful aunt, Geraldine (Seaver) Remley.  Aunt Gerry was her mother's caregiver for the last 20 years of her mother's life and was the family historian at the time.

I typed the letters that I received from the siblings and transcribed the audio tapes from Gerry into my word processor on my 1983 IBM PC (DOS, two floppy drives, no hard drive, Easywriter software), and printed them out (single space on a dot matrix printer), put them in a binder, and put the binder on the bookshelf.  Transcription was difficult because Gerry had a wonderful New England accent, interspersed her phrases with ums and ers, and she spoke too fast for my hunt-and-peck fingers.  It took many hours over many days to transcribe them - 40 pages!

In the 30 years since then, I have read through these documents several times, and always said to myself "I need to find these files and get the information in a digital format."  I don't have the files, or the software, so I needed to find a way to digitize them, hopefully not by re-typing them.  

I knew that there were mobile apps that could do this job - take a photo of a page, transcribe the text, and put it in a PDF file or an email message.  I could then copy and paste the text into a word processing document and edit it.  

I looked in the Apple App Store and there were plenty of apps to do this.  But which ones are free to use, will do what I want, and will not take a lot of time to learn?  I already had CamScanner on my iPhone, and it said it could transcribe text, so I tried that.  It never was able to transcribe the 40 lines of single-spaced text in Gerry's stories.  

I Googled for other mobile apps and found one called TextScanner.  I am using that, but it lets me do only three pages a day for free.  I can do another page if I watch a 30-second video and wade through several popup ads with every tap of the screen.

Here is the first page of Gerry's memories in the binder:

When I convert this to digital text using the TextScanner app and send the text transcription to myself in an email, it looks like this (top two paragraphs):

I have found that I need to edit the transcription after I copy and paste it into a word processing file.  Some letters are not transcribed correctly, the left margin is very tight, and I have edited some words with a pen.  On many pages, there are "loose" words or phrases that are out of order (there were not any of those on the page above).  Using the app does save me typing time.  I've done 20 pages so far over a week's time.

Then there are other documents to do - the letters from the siblings, our early Christmas letters, and other typed transcriptions that I typed before 1995.  

When I have these tasks completed, I will put the PDF of the transcribed documents into my family history archives, my Google Drive account, and on my FOREVER account.  I will probably distribute them to the "next generation" of the family - there are 11 first cousins and 20 second cousins descended from my grandparents.  They may not be interested now, but they may be later.

I'm sure that there are other mobile apps that can do this job, and do it better, but I find this TextScanner app to be useful and I've learned how to use it.  If you know of a better and free mobile app to do this task, please let me know.

Time for my carrots to keep my energy up for the next round of scanning and editing.


The URL for this post is:

Copyright (c) 2023, Randall J. Seaver



Marian B. Wood said...

I have a similar set of documents and really appreciate this blog post!

Rand said...

Google Docs is another option. Photograph the pages, copy the images into Google Drive, and then right-click each file and select "Open with" > "Google Docs." This creates a new Google Doc with the image on page 1 and the transcription on page 2.

Dot matrix printouts will OCR well, but other things like newspapers with tight kerning can produce transcriptions that need a little extra help (e.g., f1o0d instead of flood). ChatGPT is great at cleaning up stuff like that. (It's often quicker to do it by hand, though.)
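
For anyone comfortable with a little scripting, the "f1o0d instead of flood" kind of error can also be caught with a short Python script. This is only a rough sketch, not a tested tool: the function name `fix_digit_confusions` and the two-entry confusion table are made up for illustration, and it assumes that a 1 or 0 with a letter directly on each side is really a misread "l" or "o". It leaves years and ordinals like 1936 and 20th alone, and it will miss doubled digits such as "he11o".

```python
import re

# Assumed digit-for-letter confusions (hypothetical table; extend for your own scans).
CONFUSIONS = {"1": "l", "0": "o"}

# Match a confusable digit only when a letter sits directly on each side,
# as in "f1o0d". Digits inside numbers ("1936") or ordinals ("20th") do not match.
_PATTERN = re.compile(r"(?<=[A-Za-z])[10](?=[A-Za-z])")


def fix_digit_confusions(text: str) -> str:
    """Replace digits that look like misread letters inside words."""
    return _PATTERN.sub(lambda m: CONFUSIONS[m.group(0)], text)


if __name__ == "__main__":
    sample = "The f1o0d of 1936 reached the 20th house."
    print(fix_digit_confusions(sample))
```

The text still needs a read-through afterward, since a script like this only handles the one pattern it was told about.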