Welcome back to my series on Isabella Weatherford! In part one, I used AI to explore research questions and write the research objective:
The objective of this research phase is to examine the economic and social conditions in Dallas County, Texas, in the early 1870s and their influence on Isabella D Weatherford’s life and marriage prospects. Isabella was born on 4 Mar 1858 in Missouri; she first married John H. Carpenter on 16 Jan 1874 in Dallas County, Texas, then later married Robert Cisnie Royston on 16 Jan 1877 in Van Zandt County, Texas, and died on 9 May 1942 in Tucumcari, Quay County, New Mexico.
In this blog post, we’ll explore how to use AI tools to create a timeline of Isabella’s life and analyze the sources. This approach can significantly streamline this part of the Research Like a Pro process. The AI models used for this project were Transkribus’s Text Titan I (Super Model), ChatGPT 4o, and Claude 3.5 Sonnet.
Using AI to Build the Timeline
Artificial Intelligence can be a powerful ally in genealogical research. Some of the challenges we might face when working with documents about our ancestors are transcribing, abstracting, summarizing, and analyzing. This is where AI shines—it takes an existing document and the information it holds, then transforms it for us to use as we assemble the timeline. I used AI to help build Isabella’s timeline.
1. Gathering Data: I started by collecting all the documents and information I had about Isabella, including census records, marriage certificates, and her widow’s pension file.
2. Transcribing Documents: Previously, I had read the 57-page pension file and extracted key dates, but I decided this was a good opportunity to use AI to fully transcribe the file. I uploaded images of Isabella’s 57-page pension application to Transkribus, ChatGPT, and Claude with the prompt:
You are an expert genealogist. Please transcribe this document accurately, preserving all names, dates, and places exactly as written.
This file was important because it contained a variety of documents spanning 58 years, with names, dates, and places from Isabella’s life. Transcription was a challenge, though, because the file mixed typed and handwritten letters, applications, and forms.
Which AI model worked the best? I tried Transkribus first. It identified some words but struggled with most of the documents. Next, I tried ChatGPT and Claude, which were fairly accurate, but the pension application itself was a challenge for both because of its mix of typed and handwritten words. Both hallucinated some dates and places. For particularly challenging forms like the application, I gave ChatGPT or Claude both the original document image and Transkribus’s transcription, which had the correct line breaks. That worked well.
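Because both models made up some dates and places, it can help to compare two transcriptions of the same page and look closely at every spot where they disagree. Here is a minimal sketch of that idea in Python using the standard `difflib` module. The function name `differing_words` and the two sample sentences are invented for illustration; they are not taken from the pension file.

```python
import difflib

def differing_words(transcription_a, transcription_b):
    """Return (words_in_a, words_in_b) pairs wherever two transcriptions disagree."""
    words_a = transcription_a.split()
    words_b = transcription_b.split()
    matcher = difflib.SequenceMatcher(None, words_a, words_b, autojunk=False)
    differences = []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag != "equal":
            differences.append(
                (" ".join(words_a[a_start:a_end]), " ".join(words_b[b_start:b_end]))
            )
    return differences

# Hypothetical output from two different models for the same line of a page.
model_one = "married on the 16th day of January 1877 in Texas"
model_two = "married on the 16th day of January 1871 in Texas"

disagreements = differing_words(model_one, model_two)
for from_one, from_two in disagreements:
    print(f"Check the original image: '{from_one}' vs '{from_two}'")
```

A disagreement only tells you where to look. The original image still decides which reading, if either, is correct, and two models can also agree on the same wrong reading.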
Since I was transcribing one image at a time, I created a Google Doc and pasted in each image with a title giving the part, page number, and description. Then, I added the corrected transcription below it. This lets me see the original, and I can also use tools like Ctrl+F to quickly find names, dates, and places.
3. Extracting Information: After transcription, I wanted to be able to ask the AI to extract key events and dates from the entire text. I made a copy of the Google Doc and deleted all the images to make it easier for the AI model to extract the information. I saved the Google Doc as a PDF file and uploaded it to ChatGPT with the prompt:
Please create a chronological list of events mentioned in this transcription, including dates and places where available.
4. Creating a Structured Timeline: I then used AI to convert the extracted information into a structured format. I prompted:
Create a CSV file with columns for Date, Place, Person, Event or Activity using the information from the previous list.
Here’s a sample of what the AI-generated timeline looked like in the .csv format.
Date,Event,Place,Source
1858-03-04,Birth,Springfield Missouri,Pension Application
1874-01-16,Marriage to John H. Carpenter,Dallas County Texas,Pension Application
1877-01-16,Marriage to Robert C. Royston,Decatur Texas,Pension Application
1915-05-02,Death of Robert C. Royston,Duncan Oklahoma,Pension Application
1929-05-06,Applied for widow’s pension,Oklahoma,Pension Application
I wanted to check where the specific data point came from in the Google Doc, so I added a column for the reference and I also directed the AI to change the date format to be consistent with day/month/year. The AI gave me instructions on how to get the .csv file into Excel, and my final spreadsheet has all of the dates, places, persons, events, and references in an organized format.
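If you would rather script the date change and the extra column than ask the AI to do it, the step can be sketched in a few lines of Python. This is only an illustration under some assumptions: it uses the five sample rows shown above, treats the dates as year-month-day, writes them as day month year (for example, 4 Mar 1858), and adds an empty column I have called `Reference` to be filled in by hand from the Google Doc. The function name `reformat_timeline` is made up for this example.

```python
import csv
import io

# Sample rows copied from the AI-generated CSV shown earlier in this post.
SAMPLE_CSV = """Date,Event,Place,Source
1858-03-04,Birth,Springfield Missouri,Pension Application
1874-01-16,Marriage to John H. Carpenter,Dallas County Texas,Pension Application
1877-01-16,Marriage to Robert C. Royston,Decatur Texas,Pension Application
1915-05-02,Death of Robert C. Royston,Duncan Oklahoma,Pension Application
1929-05-06,Applied for widow’s pension,Oklahoma,Pension Application
"""

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def reformat_timeline(csv_text):
    """Return rows with day month year dates and an empty Reference column."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        year, month, day = (int(part) for part in row["Date"].split("-"))
        # Build a date like "4 Mar 1858" with no leading zero on the day.
        row["Date"] = f"{day} {MONTHS[month - 1]} {year}"
        # Assumed column name: fill in the part and page from the Google Doc by hand.
        row["Reference"] = ""
        rows.append(row)
    return rows

timeline = reformat_timeline(SAMPLE_CSV)
for event in timeline:
    print(event["Date"], "|", event["Event"], "|", event["Place"])
```

Spelling out the month avoids the ambiguity of all-numeric dates, where 5/6/1929 could be read as either 5 June or 6 May.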
With the dates and events clearly laid out in the spreadsheet, I easily added pertinent information to my Airtable research log. Using the 2024 template, I first created an entry for the pension file in the research log table, then selected that source for each event in Isabella’s timeline. I copied and pasted the transcription from my Google Doc into the Event Details field and added the specific location in the notes field. Because the source citation is for the general pension, this will let me add that detail to the citation in my report for a specific data item.
Tips for Using AI with Transcribing Timeline Documents
- Try multiple LLMs
- Start with a good prompt:
  - Role > Goal > Text > Task > Flask (terms courtesy of Steve Little)
- Check names, dates, and places carefully
- Keep refining until the result is accurate
- Use your imagination to think of new and better prompts
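One way to act on the tip about checking dates is a quick sanity check: any event in Isabella’s own timeline should fall between her birth and her death, and every date should be a real calendar date. Below is a minimal sketch, assuming dates in year-month-day form. The birth and death dates come from the research objective at the top of this post; the function name `find_suspect_dates` and the last two events in the list are invented to show what a flagged entry looks like.

```python
from datetime import date

# Birth and death dates come from the research objective at the top of this post.
BIRTH = date(1858, 3, 4)
DEATH = date(1942, 5, 9)

def find_suspect_dates(events, birth=BIRTH, death=DEATH):
    """Return (date_text, event, reason) tuples for entries to recheck by hand."""
    suspects = []
    for date_text, event in events:
        try:
            when = date.fromisoformat(date_text)
        except ValueError:
            suspects.append((date_text, event, "not a valid calendar date"))
            continue
        if when < birth:
            suspects.append((date_text, event, "before birth"))
        elif when > death:
            suspects.append((date_text, event, "after death"))
    return suspects

events = [
    ("1874-01-16", "Marriage to John H. Carpenter"),
    ("1929-05-06", "Applied for widow's pension"),
    ("1847-01-16", "Invented example: year typed wrong"),
    ("1877-02-30", "Invented example: impossible date"),
]

suspects = find_suspect_dates(events)
for suspect in suspects:
    print(suspect)
```

A check like this catches only the obvious slips. A hallucinated date that falls inside Isabella’s lifetime will pass, so comparing each entry against the original image is still necessary.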
Analyzing Sources with AI
An important part of the research process is to analyze each source and information item. I was curious to see if AI could help me with this. First, I asked it to describe original, derivative, and authored sources according to genealogy experts. Its answer was fairly good, but I corrected a few things. For example, as genealogists, we recognize that an image copy can be classified as an original record, not a derivative record.
As I discussed the source analysis with AI, it corrected its analysis. When I was happy with the analysis, I copied and pasted it into my Airtable Timeline. This was a fun exercise and saved me a lot of time writing out the source analysis.
In the example below, you can see that the record had two parts. The first was the photocopy of the marriage record, and below it was a typed transcription certified by the county clerk. I had to teach ChatGPT about the two parts, and then it generated a correct response. I copied and pasted the analysis straight into my Airtable research log.
Analyzing Information with AI
Could the Large Language Models also analyze the information within a source? When I first asked the model to classify information as primary, secondary, or undetermined, I had to explain that the source is the container for the information, and that one source can hold various pieces of information. I tested it by uploading a death certificate and asking it to analyze the information. ChatGPT recognized that there was both primary and secondary information, identified the informants, and gave me details and its reasoning. Some of the names and dates weren’t correct, so I gave that feedback. I was happy with the finished product and added the information analysis to my timeline in Airtable.
I created a Custom GPT titled “Diana’s Genealogy Source Expert” that you can try out. Simply copy and paste in your document and give the following prompt:
Analyze this document for the type of source and information.
Refine the results as needed. Remember, you are the human in the loop! If you try it out, let me know in the comments below if AI got the analysis right.
Tips for Using AI with Source Analysis
- KEY: Have a good working understanding of source analysis
- Use AI to check your assumptions
- Question AI’s assumptions
- Start with a document where you understand the source and information analysis
- Correct the LLM to make the answers more reliable
- Copy and paste the final answer into the timeline
Conclusion
Using AI tools dramatically streamlined the process of creating Isabella’s timeline and analyzing the sources. However, it’s crucial to remember that AI is a tool to assist our research, not replace our critical thinking. Always verify AI-generated information and use your genealogical expertise to interpret the results.
In our next post, we’ll explore the historical context of Dallas County in the 1860s and 1870s and how it might have influenced Isabella’s life choices.
Best of luck in all your genealogical endeavors!
Research Like a Pro with AI Series
Using AI to Find Research Questions and Write Objectives: Isabella Weatherford Project Part 1
Using AI in Timeline Creation and Source Analysis: Isabella Weatherford Project Part 2
Using AI in Locality Research: Isabella Weatherford Project Part 3
Using AI in Research Planning: Isabella Weatherford Project Part 4
Learn more about using AI tools in our hands-on workshop, Research Like a Pro with AI.