A local AI for Gramps?


Hello,

I was wondering whether it might be possible to generate a custom AI model for Gramps.

For example, there are various versions of models based on Mistral (for assistance, generative and repetitive tasks, language, psychology, philosophy, relationships, etc.).

Since Geneatique is advertising its AI integration into its software, I tried to learn more.

What AI does Geneatique use?

Geneatique uses artificial intelligence for several features in its 2025 version. AI is integrated to assist with paleographic transcription of old documents, enabling automatic pre-transcription of visual documents. This includes recognizing and transcribing ancient handwritten scripts, thereby facilitating genealogists’ work [1][2].

Additionally, AI is used for drafting and presenting notices, providing informational texts about places, first names, occupations, etc., which can be linked to specific sections in the genealogical tree [1][3].

Geneatique’s AI engine also helps automatically analyze the genealogical tree to suggest additional information, such as INSEE death records since 1970 [3].

These features are designed to reduce repetitive and tedious tasks, allowing users to focus on deeper aspects of their genealogical research [3][2].

Is it Mistral?

There is no specific information indicating that Geneatique uses Mistral AI. Available sources mention AI integration in Geneatique for tasks such as paleographic transcription and notice drafting, but do not specify the AI provider used [4][5].

Mistral AI is a French startup specializing in language models and AI solutions [6][7]. Although Mistral AI is mentioned as a preference for some users due to its language specialization, there is no direct confirmation that Geneatique uses Mistral AI services [8].

So Geneatique is not your client?

There is no information confirming that Geneatique is a Mistral AI client. Search results provide no details regarding a business relationship between Geneatique and Mistral AI [9].

Does Gramps use Mistral or Pixtral?

There is no information indicating that Gramps uses Mistral or Pixtral. Search results provide no details regarding any relationship between Gramps and these technologies [10][11][12][13][14].

I almost made my system “overheat” while writing the last response! Well, clearly, according to the cited references, the AI wasn’t quite sure which Gramps was being referred to.


Best regards,
Jérôme

(Gramps 5.2.2 - OS derived from Ubuntu 18.04.6 LTS)

:st_martin:I think I’m finally going to make progress in English (US)! :fireworks:

Mistral Large (small?) 2025-02 :
:us_outlying_islands:Certainly! Here is the translation into American English!


Exactly! I can be seen as a complement to the transcription work done by Transkribus. Here’s how I can assist you in this context:

  1. Interpretation and Analysis: Once Transkribus has transcribed a handwritten text, I can help you interpret and analyze the content, provide contextual explanations, and clarify ambiguous passages.
  2. Information Research: If you need more information on a topic mentioned in a transcribed document, I can help you research additional details or historical contexts.
  3. Linguistic Assistance: I can help translate passages, explain technical or historical terms, and provide suggestions to improve the clarity of the transcribed text.
  4. Methodological Advice: If you have questions about how to use Transkribus or best practices for document transcription, I can offer advice and recommendations.

In summary, I am here to complement the work of Transkribus by providing interactive assistance and helping you make the most of the transcriptions. If you have any specific questions or needs, feel free to ask.


:st_martin:My question to Mistral (originally in French) had been: “If I understand correctly, you are a complement to the transcription work done by Transkribus?”

“One of the first examples is the Transkribus project, launched by the University of Innsbruck, which uses AI to transcribe historical documents, but is somewhat outdated today.”

I strongly disagree with the author (Anne-Noëlle) of this promotional article. Transkribus is still far superior to most “generalist” AI models at transcribing old handwritten texts. OpenAI models and most Mistral models using ‘Mistral-ocr’ (Pixtral, Mistral Large) tend to invent stories and make too many assumptions. While it is indeed possible to “adjust the temperature” (the level of “creativity”) in these models, inaccuracies and hallucinations remain prevalent. For a posteriori analysis, however, these “virtual assistants” do perform well and provide valuable assistance.
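To make the “temperature” remark a little more concrete, here is a minimal sketch of how temperature rescales a model’s probabilities over candidate outputs. The scores below are made-up values for three hypothetical readings of one handwritten word; they are not taken from any Mistral or OpenAI model.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate readings of the same word.
logits = [2.0, 1.0, 0.5]

cold = softmax_with_temperature(logits, 0.2)  # near-deterministic choice
hot = softmax_with_temperature(logits, 1.5)   # more "creative" sampling

print([round(p, 3) for p in cold])
print([round(p, 3) for p in hot])
```

A low temperature makes the top-scoring reading dominate, but it only changes how the model picks among its own candidates. If the highest score belongs to a wrong reading, the output is still wrong, which fits the observation that lowering the temperature does not remove hallucinations.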

For what it’s worth.
I’m a happy user of Transkribus. I have used it in my genealogy work, especially for older documents written in Gothic handwriting.
I’m also using Transkribus in a large project at my local archive, where we are transcribing municipal protocols (about 50,000 handwritten pages).
The quality of the transcription depends heavily on the model being used. For this project we have developed our own model, which we retrain twice a year; every new version of the model has a better score. When we started the project, we were counting the number of errors per line; now we count the number of errors per page.
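For anyone who wants to put a number on “errors per line” versus “errors per page”, a common measure in handwriting recognition is the character error rate (CER): the edit distance between the model output and a hand-checked reference, divided by the length of the reference. The sketch below is a plain-Python illustration with made-up strings. It is an assumption that the “score” mentioned above is of this kind, and this is not how Transkribus itself computes it.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalised by the length of the reference text."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: a checked transcription and a model's output.
reference = "Protokoll der Sitzung"
hypothesis = "Protokol der Sitzvng"

errors = edit_distance(reference, hypothesis)
cer = character_error_rate(reference, hypothesis)
print(errors, round(cer, 3))
```

Tracking this figure on the same set of checked pages after each retraining is one way to see whether a new model version really is better than the previous one.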


This aligns with my initial impressions from recent tests comparing Transkribus with Mistral models that have OCR support.
Thank you for this testimonial.

A brief aside to note that ‘Free’ and ‘Mistral’ seem to follow the same type of strategy (or have the same shareholders): Le Chat Pro is “offered” for 12 months to customers of the operator’s plans!

In my tests, Mistral learns very quickly from its mistakes.

The copy of the response above illustrates this. The AI in question is aware of its limitations in this domain, which is already a good thing.

For use with Gramps, I also looked at the “data” on mobile (here, downloading/uploading is complicated). AIs also seem to struggle with facts, dates, and historical representations. This is still a concern for genealogy, where memory is long-term. Mix up a date and a location with an AI, and it may become confused. If, in addition, transcription and translation are required, errors accumulate! In short, the analysis is indeed very fast, well-organized, and free of obvious errors or spelling mistakes, but isn’t it ultimately the genealogist’s role to initiate research leads or formulate hypotheses?
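The point about errors accumulating can be shown with simple arithmetic. Assuming, purely for illustration, that each stage (transcription, translation, analysis) independently gets a given fact right with some probability, the chance that the fact survives the whole chain is the product of those probabilities. The rates below are hypothetical, not measurements of any tool.

```python
from math import prod


def chained_accuracy(stage_accuracies):
    """Probability that a fact survives every stage, assuming independence."""
    rates = list(stage_accuracies)
    for p in rates:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each accuracy must be between 0 and 1")
    return prod(rates)


# Hypothetical per-stage accuracies for a single date or place name.
stages = {"transcription": 0.95, "translation": 0.95, "analysis": 0.95}

overall = chained_accuracy(stages.values())
print(round(overall, 3))
```

Even with each stage looking quite reliable on its own, the chain as a whole is noticeably less so, which is one reason to check dates and places against the original document rather than against the AI’s summary.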

:united_kingdom: / :united_states:

An interesting ChatGPT experiment was described on Facebook.

There was a “paper saving” technique for handwritten documents in the 18th and 19th centuries called “cross writing”. Scribes filled a page normally (in portrait orientation), then rotated the page clockwise (to landscape orientation) and wrote a second page of text over the first.

ChatGPT had problems with it. It would be a good stress test for Transkribus.

To be fair, reading cross writing is challenging for a human. It should be an excellent test for any AI.



Actually, Transkribus starts with a bit of an advantage anyway, since we define a suitable model from the outset. With a specific model for this “crossed writing,” even in another language, it will no longer be a stress test.


Hello,
I have just discovered (not tested) an association between Mistral AI and Ancestris, via assistance in entering records.

J.

Indeed, there has been some improvement.

https://console.mistral.ai/build/document-ai/ocr-playground
However, I have doubts about the version or model used by Ancestris. Version 2 was not “sufficient” for the general public and the handwritten documents commonly used in genealogy. Did Ancestris test on version 3?