
I recently had the opportunity to discuss Leo, a groundbreaking handwritten text recognition (HTR) platform, with its creator, Jon Cooper. Jon gave me a hands-on demonstration using a variety of documents that I selected specifically to see how Leo handles some common challenges genealogists face in their research. Scroll to the end to watch the video!
This blog post was written with the assistance of AI based on the transcript of the video.
The Origins and Meaning of Leo
The platform was born out of necessity. As a PhD candidate at Stanford, Jon Cooper faced the daunting task of manually transcribing complex Elizabethan government manuscripts for his dissertation. Existing off-the-shelf transcription tools failed to provide the accuracy required for such historical documents. Together with his friend Jack, a machine learning expert from Cambridge, Jon developed what is now known as Leo.
The name “Leo” carries significant meaning:
- It originated from “Paleo,” short for paleography—the study of ancient handwriting.
- In Spanish, leo translates to “I read”.
- In Latin, leo means lion, representing a strong and dependable mascot for the platform.
Core Abilities and Features for Genealogists

Leo offers several specialized features that distinguish it from general-purpose AI models like ChatGPT or Gemini:
- Dedicated HTR Model: While Leo allows users to utilize general models like GPT-4, Gemini, or Claude, its proprietary state-of-the-art model is specifically trained for handwritten text recognition. This specialized training reduces the “over-correction” common in general models, where they might guess a plausible word rather than reflecting the actual characters on the page.
- Tabular Data Recognition: One of Leo’s standout strengths is its ability to recognize and maintain tabulated data, even in documents without clear lines. It can export these transcriptions to Word or Excel, preserving the structure for easier analysis.
- Faithful Transcription: Unlike some LLMs that automatically expand abbreviations (e.g., turning “Feby” into “February”), Leo prioritizes a faithful transcription of the original text. If requested, it can add editorial square brackets for known abbreviations, following standard genealogical conventions.
- In-App Image Manipulation: The platform includes built-in tools for cropping and rotating images. This is particularly useful for focusing on specific sections of a document to improve transcription accuracy.
- Advanced Refinement Tools: Features like “Interpolate” allow users to run multiple models on a single document and then reconstruct a “best fit” transcript from the various inputs.
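To make the “best fit” idea behind Interpolate a little more concrete, here is a minimal sketch of one general way to merge several transcripts of the same line: align each one to the first, then take a word-by-word majority vote. This is only an illustration of the concept and is not how Leo itself is implemented. The function names and the sample lines are invented for the example; only the “Keaton”/“Hecton” confusion comes from the demo.

```python
from collections import Counter
from difflib import SequenceMatcher


def align_to_base(base, other):
    """For each word in `base`, return the word `other` has at that spot, or None."""
    aligned = [None] * len(base)
    matcher = SequenceMatcher(a=base, b=other, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        same_length = (i2 - i1) == (j2 - j1)
        # Use matching runs, plus substitutions that swap word-for-word.
        if tag == "equal" or (tag == "replace" and same_length):
            for k in range(i2 - i1):
                aligned[i1 + k] = other[j1 + k]
    return aligned


def consensus_transcript(transcripts):
    """Hypothetical helper: word-level majority vote across transcripts."""
    tokenized = [t.split() for t in transcripts]
    base = tokenized[0]
    votes = [[word] for word in base]  # one list of candidate words per position
    for other in tokenized[1:]:
        for i, word in enumerate(align_to_base(base, other)):
            if word is not None:
                votes[i].append(word)
    return " ".join(Counter(v).most_common(1)[0][0] for v in votes)


# Invented outputs from three imaginary models reading the same line.
model_outputs = [
    "John Hecton born Feby 1820",
    "John Keaton born Feby 1820",
    "John Keaton born Feby 1826",
]
print(consensus_transcript(model_outputs))
```

A vote like this can only be as good as the majority of its inputs, so a human check of names and dates is still needed.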
Beyond Transcription: Analyzing Sources with Leo’s Transform Tools

One of the most exciting aspects of the Leo platform is that it goes beyond simple transcription to help us truly analyze and understand our historical documents. The Transformations feature acts as a powerful “text-to-text” tool, allowing you to take your completed transcription and run it through a Large Language Model (LLM) for further processing. This is incredibly useful for several common genealogy tasks:
- Summarizing and Named Entity Recognition: For those of us working with lengthy folders or complex documents, Leo can generate summaries and identify “named entities”—like specific people, places, or dates—helping you quickly grasp the content of a document before diving into the full text.
- Translation: Jon Cooper highlighted a fascinating use case for researchers working with non-English sources. He demonstrated using the transformation tool to translate a 500-page Latin treatise into modern English, making it instantly readable for modern researchers.
- Fixing and Interpreting Tricky Text: While Leo is trained not to “over-correct” historical spellings during the initial transcription, you can use the transform tools to get a “push” from the AI if you are stuck on a particular word or phrase. It can suggest plausible corrections or even add editorial marks (like square brackets) to expand abbreviations or clarify meanings.
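For readers curious what bracketed expansion looks like mechanically, the sketch below applies the editorial convention of putting supplied letters in square brackets to a small, hand-made list of abbreviations. It is an assumption-laden illustration, not Leo's transform tool: the abbreviation table, the function names, and the sample sentence are all hypothetical.

```python
import re

# Hypothetical lookup table; a real project would maintain its own list.
KNOWN_ABBREVIATIONS = {"Feby": "February", "Wm": "William", "Jno": "John"}


def bracket_expand(abbr, full):
    """Show omitted letters in brackets, e.g. Feby -> Feb[ruar]y."""
    # Length of the shared beginning.
    p = 0
    while p < len(abbr) and p < len(full) and abbr[p] == full[p]:
        p += 1
    # Length of the shared ending, not overlapping the beginning.
    s = 0
    while s < len(abbr) - p and s < len(full) - p and abbr[-1 - s] == full[-1 - s]:
        s += 1
    if p + s != len(abbr):
        # The abbreviation is not a simple "start + end" of the full word,
        # so keep the original and add the expansion after it.
        return f"{abbr} [{full}]"
    omitted = full[p:len(full) - s]
    if not omitted:
        return abbr
    return f"{abbr[:p]}[{omitted}]{abbr[len(abbr) - s:]}"


def annotate(text, abbreviations=KNOWN_ABBREVIATIONS):
    """Replace each known abbreviation in `text` with its bracketed form."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, abbreviations)) + r")\b")
    return pattern.sub(
        lambda m: bracket_expand(m.group(1), abbreviations[m.group(1)]), text
    )


print(annotate("Wm Keaton died 3 Feby 1820"))
```

The original letters are never altered; anything the editor supplies stays visibly inside brackets, so the transcription remains faithful.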
Ultimately, these tools allow us to move from just having a digital copy of a record to having a searchable, translated, and summarized research asset.
Understanding Limitations
While highly advanced, Leo—like all AI—has limitations that genealogists must keep in mind:
- Resolution and Spread Challenges: Double-page spreads can be difficult because the text becomes very small relative to the overall image size when down-sampled for processing. Accuracy is typically higher on single-page crops.
- The Problem of Proper Names: Names are inherently “improbable” data points, making them harder for probabilistic AI models to guess correctly compared to common language. For example, in the demo, the surname “Keaton” was initially transcribed as “Hecton”.
- Marginalia: Squeezed or tiny text in margins can sometimes be missed or omitted by the AI.
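The double-page problem comes down to simple arithmetic. The sketch below assumes, purely for illustration, that an image is shrunk so its longest side is at most 2048 pixels before processing. That cap, the scan dimensions, and the 40-pixel handwriting height are all made-up numbers, not Leo's actual settings.

```python
def text_height_after_downsampling(width_px, height_px, text_height_px, max_side_px=2048):
    """Estimate how tall the handwriting is once the image has been shrunk.

    Assumes the image is scaled so its longest side is at most `max_side_px`
    (a hypothetical cap) and is never enlarged.
    """
    scale = min(1.0, max_side_px / max(width_px, height_px))
    return text_height_px * scale


# Same imaginary scan: first as a full two-page spread, then cropped to one page.
spread = text_height_after_downsampling(6000, 4000, 40)
single_page = text_height_after_downsampling(3000, 4000, 40)
print(f"spread: {spread:.1f} px, single page: {single_page:.1f} px")
```

Under these assumptions, cropping to one page leaves the handwriting about one and a half times as tall after resizing, which is why a single-page crop tends to give the model more to work with.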
Benefits of Leo over Other LLMs
The primary benefit of Leo over standard LLMs is its accuracy on historical documents. It is designed by researchers for researchers. Because its model is trained to reflect the characters on the page rather than substitute plausible-sounding text, its errors are easier for a human to spot and correct. Furthermore, planned updates include confidence metrics that highlight in red or green where the model is less or more sure of its work.
In conclusion, Leo is not intended to replace the skilled eye of a genealogist or paleographer. Instead, it serves as a powerful “force multiplier,” helping us work through mountains of handwritten records more quickly and accurately than ever before.
---
To learn more about Leo and set up your free account, go to https://www.tryleo.ai/
Watch the Full Demo
The Meet Leo video is also available on YouTube.



