Gramps 6.0.8
Gramps Web API 3.12.2
Gramps Web Frontend 26.5.0
Gramps QL 0.4.0
Sifts 1.3.1
locale: en
multi-tree: false
task queue: true
OCR: true
chat: false
I was playing with the text recognition for the first time today. I looked for documentation but couldn’t find anything, either in the Gramps Web User Guide or on GitHub. Info on which OCR library is used would be much appreciated.
I did 3 test runs:
Antiqua (printed Latin) seems quite OK.
Fraktur (printed Gothic) is so-so, but can easily be corrected manually.
Gothic handwriting - forget it.
Saving the recognised text as a note is fine, but it would be nice to be able to edit the result before saving, since both the original image and the OCR’ed text are shown on the same page.
The feature uses Tesseract. Actually, we could try the deu_latf language for Tesseract as well.
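For anyone who wants to compare languages outside Gramps Web first, here is a minimal sketch that only builds a Tesseract command line. It assumes the `tesseract` CLI and the `deu_latf` traineddata file are installed; the function name is made up for illustration, and this is not how Gramps Web calls Tesseract internally.

```python
import shlex


def build_tesseract_command(image_path, languages=("deu_latf",)):
    """Return the argument list for a Tesseract CLI call that prints text to stdout.

    Hypothetical helper for experiments; not part of Gramps Web.
    """
    if not languages:
        raise ValueError("at least one language is required")
    # Tesseract accepts several traineddata names joined with '+'.
    lang_arg = "+".join(languages)
    return ["tesseract", image_path, "stdout", "-l", lang_arg]


# Combine the modern German model with the Fraktur one.
cmd = build_tesseract_command("scan.png", ["deu", "deu_latf"])
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, capture_output=True, text=True)` to get the recognised text.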
You’re right, editing the note before saving it would be useful.
It would also be fairly easy, when AI chat is enabled, to route the text through the LLM to fix typos. That’s what I do in practice anyway, just manually.
The Transkribus API documentation is a bit confusing. On some pages they talk about the Legacy API, which is a RESTful API but is no longer supported.
They also mention MetaGrapho, which seems to be outdated as well. According to the Plans & Pricing – Transkribus page, MetaGrapho appears to be available only to large paying organisations.
In the documentation for the Legacy API, they have said for a very long time that they will release a new, better API. Last year the page said during 2025; now they have changed the text to 2026.
I have had some talks with people from Transkribus. Two weeks ago there was a meeting in Copenhagen, where Transkribus promised to release the new API this summer (2026). So hopefully we will see this API very soon.
I have discussed ideas for developing an addon for Gramps with Transkribus, and they would really like us to do that.
On GitHub there is a TranskribusPyClient that I have played with a bit. It’s based on the Legacy API, but it works; at least I was able to connect to and disconnect from the Transkribus server.
I’d be happy to help with implementing Transkribus into Gramps Web/Gramps Desktop. I have quite a lot of experience with Transkribus.
In Transkribus every task is queued. E.g. if you copy an image from one collection to another, the task goes into a job queue. So everything you do should be asynchronous.
Many tasks have a cost, which is “paid” by using ‘credits’. Running a layout analysis on an image has a cost of 1/4 credit, running text recognition on an image has a cost of 1 credit. The good thing is that you can have a free account where you get 50 ‘credits’ each month. You cannot save credits from one month to the next. Alternatively, if you need more than 50 credits per month, you can buy credits. When using paid credits you have higher priority in the job queues. This means that on busy days using free credits can result in long response times.
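To put the free tier into numbers, here is a small sketch using the costs quoted above (1/4 credit for layout analysis, 1 credit for text recognition, 50 free credits per month). Whether every page needs a separate layout analysis run is an assumption on my side, so it is a switch.

```python
from fractions import Fraction

# Costs and allowance as quoted in the post above.
LAYOUT_COST = Fraction(1, 4)
RECOGNITION_COST = Fraction(1)
FREE_CREDITS_PER_MONTH = 50


def pages_per_month(credits=FREE_CREDITS_PER_MONTH, with_layout=True):
    """Number of whole pages that can be processed with the given credits."""
    cost_per_page = RECOGNITION_COST + (LAYOUT_COST if with_layout else 0)
    return int(credits // cost_per_page)


print(pages_per_month())                   # layout analysis + recognition
print(pages_per_month(with_layout=False))  # recognition only
```

With layout analysis each page costs 1.25 credits, so the free allowance covers 40 pages per month; with recognition only it covers 50.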
Therefore a Transkribus integration should have a background process that periodically checks the job status and, when a job has finished, raises a flag and/or copies the transcription into a Gramps note.
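A minimal sketch of such a polling loop follows. The status names, the `JobResult` type and the `fetch` callable are assumptions for illustration only; they do not mirror any real Transkribus API, and in Gramps Web this would presumably run as a task in the existing task queue rather than as a blocking loop.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class JobResult:
    # Hypothetical status values: "queued", "running", "finished", "failed".
    status: str
    text: Optional[str] = None


def wait_for_job(
    fetch: Callable[[], JobResult],
    on_finished: Callable[[str], None],
    interval: float = 60.0,
    max_polls: int = 100,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll a job until it finishes, fails, or max_polls is reached.

    Returns True if the job finished and on_finished was called.
    """
    for _ in range(max_polls):
        result = fetch()
        if result.status == "finished":
            # E.g. copy the transcription into a Gramps note here.
            on_finished(result.text or "")
            return True
        if result.status == "failed":
            return False
        sleep(interval)
    return False


# Demo with a fake job that finishes on the third poll.
states = iter([JobResult("queued"), JobResult("running"),
               JobResult("finished", "Anno 1780")])
notes = []
done = wait_for_job(lambda: next(states), notes.append, sleep=lambda s: None)
print(done, notes)
```

Passing `sleep` in as a parameter keeps the loop testable without real waiting; on busy days with free credits, a generous `interval` and `max_polls` would be needed.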
Transkribus models
It is important for the user to be able to choose the best model for the task. There is a large number of public models available. A model is typically trained for a specific language and for either handwritten or printed text. Furthermore, a model can be trained for Gothic or Latin characters. There are models (often called supermodels) that cover multiple languages, and there are models that are specialised in a specific time interval for a single language.
Most of the public models have a good description and statistics (e.g. CER, the Character Error Rate); see e.g. the German Giant I model. It would be nice if, when the integration presents a list of available models, clicking on a model in the list took the user to the model description/specification.
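For readers not familiar with CER: in its usual definition it is the edit distance between the recognised text and the correct text, divided by the length of the correct text. The sketch below implements that standard definition; exactly how Transkribus computes the figures on its model pages is not something I can confirm.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution (free if equal)
            ))
        prev = cur
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate of hypothesis measured against reference."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# One wrong character out of six.
print(cer("Kirche", "Kirohe"))
```

A model with a CER of 5% would, by this definition, get roughly one character in twenty wrong on comparable material.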
In my module the user can specify the script (e.g. Sütterlin or Fraktur) and the primary language used for a source image. Based on that information and the date relevant for the source, I plan to offer a list of Transkribus models.
My module will support several transcription agents. Transkribus will be only one of them. Another one will use crowdsourcing via Discourse („Lesehilfe“). So working with asynchronous answers is required.
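The selection by script, language and date could look roughly like the sketch below. The `ModelInfo` fields and all catalogue entries are invented for illustration; they are not real Transkribus models or metadata, and the ranking rule (narrowest time interval first) is just one possible choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    # Hypothetical metadata; real model descriptions may be structured differently.
    name: str
    scripts: frozenset
    languages: frozenset
    year_from: int
    year_to: int


def candidate_models(models, script, language, year):
    """Return models matching script, language and year, most specialised first."""
    matches = [
        m for m in models
        if script in m.scripts
        and language in m.languages
        and m.year_from <= year <= m.year_to
    ]
    # A narrower time interval is treated as "more specialised".
    return sorted(matches, key=lambda m: m.year_to - m.year_from)


# Made-up catalogue for the demo.
catalogue = [
    ModelInfo("example-general", frozenset({"Kurrent", "Sütterlin"}), frozenset({"de"}), 1600, 1950),
    ModelInfo("example-narrow", frozenset({"Kurrent"}), frozenset({"de"}), 1750, 1800),
    ModelInfo("example-print", frozenset({"Fraktur"}), frozenset({"de", "da"}), 1500, 1900),
]

for model in candidate_models(catalogue, "Kurrent", "de", 1780):
    print(model.name)
```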
Each transcription is organized in steps which are documented in revisions. Several users can contribute to a transcription in internal collaboration.
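As a rough illustration of steps documented in revisions, here is a minimal data-structure sketch. The class and field names are assumptions made up for this example and are not taken from the actual module.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Revision:
    step: str        # e.g. "machine transcription", "manual correction"
    author: str      # user or agent that produced this revision
    text: str
    created: datetime


@dataclass
class Transcription:
    source_id: str
    revisions: list = field(default_factory=list)

    def add_revision(self, step, author, text):
        """Append a new revision; earlier revisions are never modified."""
        revision = Revision(step, author, text, datetime.now(timezone.utc))
        self.revisions.append(revision)
        return revision

    @property
    def current_text(self):
        """Text of the latest revision, or an empty string if there is none."""
        return self.revisions[-1].text if self.revisions else ""


t = Transcription("S0001")
t.add_revision("machine transcription", "transkribus", "Anno 178o")
t.add_revision("manual correction", "alice", "Anno 1780")
print(t.current_text, len(t.revisions))
```

Keeping revisions append-only means several users can contribute without losing the history of who changed what.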
In future versions I plan to support named entity recognition. Finally this results in claims about the persons, events, and places. The claims can then be correlated with the existing objects in the tree or generate new objects. So for me transcription is only one step in a process to be supported by the genealogical application.
@hartenthaler I looked at the pricing for the new Transkribus API.
If I understand correctly, they structured this in a way to make it IMPOSSIBLE for apps like Gramps Web or Webtrees to use their service on a self-hosted app with a simple pay-as-you-go subscription, as it used to be.
You would have to buy an expensive “Organization” plan to use the API, except as a developer sandbox, which is certainly heavily rate limited to make it unusable except for developer experiments.
Unless I’m misunderstanding something, this means Transkribus is dead for me.
Last year I asked Transkribus about the status of the upcoming new API, and described how I imagined a Gramps integration.
This is the reply I got from Transkribus:
Thank you for sharing this idea - it’s a strong use case and makes a lot of sense.
At the moment we have two APIs: the REST API and our metagrapho API, which is currently used for processing. The REST API is being phased out, and metagrapho will also be replaced in the future. Right now, metagrapho is only available as part of our organisation plans, which limits its accessibility.
We are working on introducing a developer platform over the next months. This platform will provide APIs designed exactly for these kinds of integrations, with clear documentation and direct availability for developers.
Well, this answer is now 8 months old, and the new API hasn’t been released yet. The good news is that Transkribus released a new major version this week, so hopefully they will now have time to finish the long-awaited new API and developer platform.