[Please create a] Transkribus addon

csam · August 18, 2025, 3:04pm

Feature Request number: 13951

Transkribus add-on - AI based transcription of handwritten text

Transkribus is an AI-based service for transcribing images of handwritten text into digital text (UTF-8).
To use the Transkribus service, you must create an account. The service is not free: each transcription costs one ‘credit’, but private users get 50 free ‘credits’ each month, which will probably cover most genealogists’ needs.

The transcription is a multiple step process:
• log in to the user’s account
• upload an image
• select a model (e.g. “German 17th century gothic handwriting”)
• run the transcription
• correct the text manually, if needed
• save the transcribed text
• log off
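
The steps above could be sketched in Python roughly as follows. This is only an illustration of the call order: the `TranscriptionClient` interface and all of its method names are hypothetical and are not the actual Transkribus API, and `FakeClient` is an offline stand-in so the flow can be tried without an account.

```python
from typing import Callable, Optional, Protocol


class TranscriptionClient(Protocol):
    """Hypothetical interface; these are NOT real Transkribus API calls."""

    def login(self, username: str, password: str) -> None: ...
    def upload(self, image_path: str) -> str: ...
    def transcribe(self, image_id: str, model: str) -> str: ...
    def logout(self) -> None: ...


def transcribe_image(
    client: TranscriptionClient,
    username: str,
    password: str,
    image_path: str,
    model: str,
    correct: Optional[Callable[[str], str]] = None,
) -> str:
    """Run the whole workflow and return the (optionally corrected) text."""
    client.login(username, password)
    try:
        image_id = client.upload(image_path)
        text = client.transcribe(image_id, model)
        if correct is not None:
            text = correct(text)  # manual correction step, if needed
        return text  # the caller saves this text
    finally:
        client.logout()  # always log off, even if a step fails


class FakeClient:
    """Offline stand-in that only records the order of the calls."""

    def __init__(self) -> None:
        self.calls = []

    def login(self, username: str, password: str) -> None:
        self.calls.append("login")

    def upload(self, image_path: str) -> str:
        self.calls.append("upload")
        return "image-1"

    def transcribe(self, image_id: str, model: str) -> str:
        self.calls.append("transcribe")
        return "  sample text  "

    def logout(self) -> None:
        self.calls.append("logout")


if __name__ == "__main__":
    demo = FakeClient()
    print(transcribe_image(demo, "user", "secret", "page.jpg", "some-model", str.strip))
    print(demo.calls)
```

Putting the log-off in a `finally` block is a design choice: the session is closed even when the upload or the transcription fails.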

Transkribus offers a web interface, where users can upload images and do the transcription. A Java-based application is available for installing on a PC, but processing still happens on Transkribus’s servers. A REST-based API is also available for integration into other software.

The Gramps Transkribus add-on
The Transkribus add-on should be available wherever an image can be added (“Gallery”).
After adding an image, it should be possible to activate the Transkribus add-on, which will then upload the image and run the transcription. The transcribed text will be stored in a Gramps note. For example, if the image is added to an event, the created note will also be added to the same event.
The add-on should use keyring to store the credentials for the Transkribus account, so it will only be necessary to enter the username and password once.
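
A minimal sketch of how the credentials could be stored, assuming the third-party Python `keyring` package is what is meant here. `keyring.get_password` and `keyring.set_password` are real functions of that package; the service name, the `ask` callback and the `backend` parameter are assumptions made for this example.

```python
from typing import Callable

SERVICE_NAME = "gramps-transkribus-addon"  # hypothetical service name


def get_password(username: str, ask: Callable[[str], str], backend=None) -> str:
    """Return the stored password, asking the user only the first time."""
    if backend is None:
        import keyring  # third-party package: pip install keyring

        backend = keyring
    password = backend.get_password(SERVICE_NAME, username)
    if password is None:
        # Nothing stored yet: ask once (e.g. via a dialog), then remember it.
        password = ask(username)
        backend.set_password(SERVICE_NAME, username, password)
    return password
```

The `backend` parameter exists only so the function can be exercised with an in-memory stand-in; in normal use it is left out and the system keyring is used.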
The add-on should list available models for the Gramps user to choose from. If a model is marked as preferred, it should not be necessary to select a model every time an image is processed.
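
The preferred-model behaviour could look something like this. It is a sketch under assumptions: the `settings` dictionary, its `preferred_model` key and the `ask` callback are invented for illustration, and the model names in a real add-on would come from the service.

```python
from typing import Callable, Dict, Sequence


def mark_preferred(settings: Dict[str, str], model: str) -> None:
    """Remember a model so later transcriptions can skip the selection."""
    settings["preferred_model"] = model


def choose_model(
    available: Sequence[str],
    settings: Dict[str, str],
    ask: Callable[[Sequence[str]], str],
) -> str:
    """Use the preferred model if it is still offered, otherwise ask."""
    preferred = settings.get("preferred_model")
    if preferred in available:
        return preferred
    # No preference, or the preferred model is no longer listed.
    return ask(available)
```

Falling back to the selection dialog when the preferred model has disappeared from the list avoids failing on a stale setting.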
When a transcription is started, the job is put into a processing queue at Transkribus, so it can take a while before the result is ready. The Gramps Transkribus add-on should not wait for the result, but should have a background process that periodically checks the processing status at Transkribus. When the transcription is ready, it should automatically be retrieved and stored in a note, and a notification should be displayed.
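
A rough sketch of such a background check, using a plain Python thread. The status strings (`"FINISHED"`, `"FAILED"`) and the three callbacks are assumptions for this example and are not taken from the Transkribus API. In a GTK application like Gramps, the `on_ready` callback would probably need to hand its work to the main loop (for example with `GLib.idle_add`) before touching the database or the UI.

```python
import threading
from typing import Callable, Optional


def start_polling(
    job_id: str,
    get_status: Callable[[str], str],
    fetch_text: Callable[[str], str],
    on_ready: Callable[[str, Optional[str]], None],
    interval_s: float = 60.0,
    stop_event: Optional[threading.Event] = None,
) -> threading.Thread:
    """Check a queued job periodically without blocking the caller."""
    stop = stop_event or threading.Event()

    def worker() -> None:
        while not stop.is_set():
            status = get_status(job_id)
            if status == "FINISHED":  # hypothetical status value
                on_ready(job_id, fetch_text(job_id))
                return
            if status == "FAILED":  # hypothetical status value
                on_ready(job_id, None)
                return
            # Sleep until the next check, but wake up early if stopped.
            stop.wait(interval_s)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

Here `on_ready` is where the add-on would create the note and show the notification; passing `None` as the text signals that the job failed.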

romjerome · August 18, 2025, 4:01pm

Note: looking at 100 publicly available AI models for Transkribus, you will see that the underlying engine is PyLaia. So they might also be run locally or via the command line.

There is also kraken:

romjerome · August 21, 2025, 3:43pm

I just saw that Transkribus no longer provides a desktop application client. Did you try Arkindex or eScriptorium?

They should also be able to use the same models as Transkribus.

csam · August 21, 2025, 4:34pm

When I am doing my genealogy research and come across a handwritten document that I cannot read, I typically use the Transkribus web interface.
But I’m also working as a volunteer in a nearby local archive, where we are transcribing between 35,000 and 40,000 handwritten pages. They are parish council protocols ranging from 1841 to 1950. For this project we’re using the Java application Transkribus Expert Client v. 1.29.0 from December 2024, and until further notice we’re continuing to use the Java client.
There have been discussions over the last couple of years about whether to continue or drop the Java client, but so far nothing has been decided.
As for running models created in Transkribus outside Transkribus, I’m not sure that is possible. In my local archive we create our own models and mark them public, but I don’t think you can download such a model. Furthermore, a transcription requires not only the HTR model, but also models for text region analysis and for layout/line analysis.

romjerome · August 21, 2025, 5:08pm

Sure, deprecated does not mean stopped or no longer available, but I suppose it means that future updates to their API might block transcription?

I was just wondering how to use such services without internet… LLMs as a local service make sense if you have limits on web traffic or simply do not have access to the web.

TrOCR (Microsoft) seems to share some ‘generic’ models

but you can find some others more specific, like:

Looking at this video (YouTube), we can see that transcribing more than 30,000 documents could be really expensive via specialized services like Transkribus. It looks like the open-source ecosystem, LLMs and APIs might fit better?

As for running models created in Transkribus, I do not know. I just see that Arkindex can import Transkribus data and metadata. I was thinking of public models, like maybe the German 17th-century one. Most of them are PyTorch-compatible (e.g., the PyLaia and public model references above).

romjerome · August 21, 2025, 5:20pm

I did not make a large migration, but I suppose it is what I mean by “metadata”?

StoltHD · August 22, 2025, 8:41am

I have a question about using this if you can’t read the text… How can you then do a quality control that the transcription is correct and not just a hallucination of the AI…?

This is a serious concern. If no one is able to read the original handwritten text, then there is no way to verify whether the transcription is accurate or simply a plausible guess generated by the AI. Without human oversight, errors can easily slip through unnoticed, and over time these mistakes may be accepted as fact.

In the worst case, this leads to unintentional document falsification—where the AI’s version replaces the original meaning—and even historical distortion, as future researchers rely on flawed transcriptions. The integrity of archival work depends on transparency and verifiability, and using AI without proper validation risks undermining both.

csam · August 22, 2025, 9:42am

First of all, you should always do a comparison of the original text (the image) and the transcribed text (generated). Mistranscriptions and hallucinations are usually easily detected.
Regarding 17th and 18th century gothic handwriting, which for me is really difficult to read, the use of AI transcription is a great help. It may not be 100 percent correct, but you get enough transcribed to understand the content.

system · September 21, 2025, 9:43am

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.
