Tree vivisection experiments with the Isotammi SuperTool

This is the start of a “How do I…?” article for the wiki on Approaching Automated Data Harmonization.

Manually cleaning data to make it consistent and fill in omissions is pure misery. There are excellent tools for cleaning messy data outside of Gramps: you could use OpenRefine or manipulate CSV data in a spreadsheet like Excel. But that means taking the data out of the Tree structure of Gramps, losing the benefit of Gramps understanding how the data elements interrelate, and risking data loss during the export/import cycle.

Some issues have arisen that affect broad swathes of the Gramps community, and the grumbles have led some talented people to create a number of special-purpose tools and gift them to the community.

These sorts of Harmonizing operations tend to be relatively simple as add-on Tools. But continually asking for more bespoke Single Purpose Tools seems like a poor use of their energy. They could be doing more creative things.

The traditional alternative (when trying to do it yourself) is to get the model of data structures in a Gramps tree, determine which elements have the data needed to confirm the issue exists, then write a Python script (with the appropriate field names) to run through the data, check the conditions and update records as needed.

It makes me tired to just write that out!

On the other hand, the SuperTool already includes a dictionary of the data, fields and expressions. It enables a mildly technical person to hack on the data.

I’ve wanted to try out some of the capabilities of the Isotammi Supertool for a while. But it was too intimidating to a basic user to just jump in and start experimenting. And the potential for mangling a Tree is … frightening. It’s more likely to turn a Tree into mulch than a bonsai masterpiece.

So I’ve been looking for a good (but small enough to be digestible) Real World target in data harmonization that would exercise the add-on: something basic enough to be used as an introduction.

Suddenly, a task loomed that I was dreading: harmonizing the origin Types for Surnames and multiple surnames.

I discussed a couple particular scenario variations with Kari Kujansuu (the creator of SuperTool) and he cobbled together some example scripts. One was very basic and the second is a bit more complex.

Now we have a tiny chunk of code which is a solution that can be reverse-engineered to a known objective. It is not pseudocode and it is an issue that impacts everyone. So we may be able to learn how to re-create the progression from “problem statement” to working script using the tool.

Kari notes that if Gramps is launched from the Console (Command Line), then the SuperTool also prints all changes to the console.

The basic objective, simply stated

I want the blank Origin types for surnames that match the Father’s surname to be marked as “Patrilineal”. I don’t want to overwrite ANY existing Origin types. If I don’t have a father, I don’t want to ‘assume’ the origin type is ‘probably’ Patrilineal.
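The three conditions above can be written down as a small decision rule. This is only a plain-Python sketch of the logic, not the SuperTool script itself: the `Surname` class and the `UNKNOWN` and `PATRILINEAL` values are hypothetical stand-ins for what Gramps would supply.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in values; in Gramps these would come from NameOriginType.
UNKNOWN = ""
PATRILINEAL = "Patrilineal"


@dataclass
class Surname:
    """Hypothetical stand-in for a surname record (not the Gramps class)."""
    surname: str
    origin: str = UNKNOWN


def harmonize_origin(child: Surname, father: Optional[Surname]) -> bool:
    """Mark a blank origin as Patrilineal when the surname matches the father's.

    Returns True only if the child's record was changed.
    """
    if father is None:              # no father recorded: do not guess
        return False
    if child.origin != UNKNOWN:     # never overwrite an existing origin type
        return False
    if child.surname != father.surname:
        return False
    child.origin = PATRILINEAL
    return True


father = Surname("Garner")
children = [
    Surname("Garner"),              # blank origin, same surname
    Surname("Garner", "Taken"),     # origin already set
    Surname("Smith"),               # different surname
]
changed = [harmonize_origin(child, father) for child in children]
orphan = Surname("Garner")
orphan_changed = harmonize_origin(orphan, None)
print(changed, orphan_changed)
```

Only the first child is changed. The refinements mentioned next (blended families and so on) would be extra checks added before the final assignment.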

I know that I’ve overlooked some outlying conditions. (Naturally, blended family situations could affect the determination.) But we can come back to add those refinements.

The problem arises from depending too blindly on the built-in Guessing

Gramps has 3 Surname Guessing options to offer. You select the option in the Display tab of Preferences.

The default is a Patrilineal variant, labeled “Father’s surname”, which will fill in a Surname when a Father (or his offspring) is added. But this feature overlooks setting the Origin to ‘Patrilineal’ to match the guess type.

Preparations

Experiments in data harmonization should always be tried on expendable sample data. Do NOT test on your real data.
  • Copy the script below to a text file on your system
  • Install the SuperTool from Isotammi
  • Quit Gramps
  • Restart Gramps from the Command Line (Console) to enable access to a change log
  • Create a new Tree and import the Example.gramps file
  • Select a subset of rows in the Families view
  • Choose Tools > Isotammi Tools > SuperTool…
  • Click the Load button and open the saved Script. (This will only load, not execute.)
  • Warning: the following step modifies the Tree. To process the selected Families, click the Execute button
  • Review the change log that was written to the Console.

The first script

This Families view oriented SuperTool script is named set-surname-origin-to-patrilineal.script. This script processes ONLY the Families currently selected in the Families category view.

Two tests but difficulties too: une petite mine d'or - Forums Geneanet [fr]

SuperTool and SuperFilter too !

Creation and sharing of a new attributes filter for sources or citations, filtering is based on attributes name or value.

Very easy (to create and to share) with ST.

Note/Idea: It would be cool to be able to use a path (defined in preferences?) for includes, something like media path (relative to the preferences path).

There is now a new version at isotammi-addons/source/SuperTool at master · Taapeli/isotammi-addons · GitHub which has this feature.

Thanks for the suggestion.

Kari

Hi Kari.

I had misplaced the Deep Connections Graph Gramplet before ever trying it. (I rarely add anything to the Dashboard.) I had to look at the .gpr.py file to discover that it was a “Dashboard only” gramplet. So I’ve finally had a chance to try it.

It was interesting. It did NOT like my big Tree – it kept saying that trying again might help. But, after a few attempts, the Gramplet worked with the PseudonymTree.gramps file. But that is a simple tree for testing Graphs. It is without any pedigree collapse… so it does not exercise the tools very well.

https://gramps-project.org/wiki/index.php/PseudonymTree.gramps

There were 4 items that didn’t have English translations: the “Uudelleenyritys voi auttaa” (“Trying again can help”?) error message and the “child/sibling/parent” labels. Could you make those translatable?

Those “child/sibling/parent” labels overwrite the vertical line connecting the boxes. Perhaps you could move them to the right about 15 pixels? (Or you could just add a non-breaking space to the front of each string.)

And would you consider feeding the current “Home Person” to the Person1 selection and the “Active Person” to Person2 when the browser is started? I would happily close and re-launch the browser each time to avoid the Person selection dialogs on my 40,000 person tree!

I posted a feature request to make the Add-On list server location more friendly to 3rd parties. It looks like the idea will meet a lot of resistance.

https://gramps-project.org/bugs/view.php?id=12363

Thanks!

Thanks. Again good suggestions - I hope I am able to make the changes. It might be difficult to move the labels since IIRC the complete image is generated by Graphviz. The gramplet needs other improvements also… Maybe I will also move it to the Tools menu

I thought that moving the labels might be a pain. Perhaps padding those strings with a non-breaking space at the left (& on the right, to support Right-to-Left localization) would do the same thing with little effort?

19 Nov 2021 @kku announced an update fixing an issue with incrementing variables in the Isotammi SuperTool version 1.1.5

Since the Isotammi updates are NOT included in the standard Gramps Add-on Manager updates check, people may want to install the Isotammi Configuration add-on. Or, manually switch the Preferences between checking the standard Gramps add-on list:

https://raw.githubusercontent.com/gramps-project/addons/master/gramps51

and the Isotammi add-ons list:

https://raw.githubusercontent.com/Taapeli/isotammi-addons/master/addons/gramps51

Remember to switch back afterwards!
Note that the experimental FilterParams add-on hasn’t been migrated to the Isotammi add-on list yet. You don’t want to miss out on this one!

This looks interesting. I’ll try making precise the things I want to do with my GRAMPS, first with some pseudocode for my own benefit.

Then to get real, where do I find a dictionary of all the objects which will be known to the Supertool? Like db, children, and the methods and attributes of these and anything I might pass to a function I define?

I realise this is part of learning GRAMPS coding basics, but I’m not there yet!

The SuperTool Help has that dictionary… well, actually it is more of a vocabulary list than a dictionary.

I like to use the add-on Filter gramplets, Custom Filter Rules and the Built-in rules as sample code for the different object categories. They cover the key areas.

I’m not there yet either.

Note: try dir() in the “Expression to display:” field.

It was not obvious to me that launching SuperTool with a specific Category View active implicitly initializes the primary object of that category.
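The dir() tip can be tried in plain Python first. The sketch below filters the output down to the public names, which is usually the interesting part. `Sample` and `public_names` are hypothetical names used only for illustration; inside SuperTool you would presumably pass one of the pre-defined objects to dir() instead.

```python
def public_names(names):
    """Keep only the names that do not start with an underscore."""
    return sorted(n for n in names if not n.startswith("_"))


class Sample:
    """Hypothetical object with one public attribute and one method."""

    def __init__(self):
        self.gramps_id = "E0001"
        self._cache = None      # private by convention, filtered out below

    def get_type(self):
        return "Residence"


names = public_names(dir(Sample()))
print(names)
```

This prints the method and attribute names without the underscore-prefixed internals, which makes a long dir() listing much easier to scan.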

OK - I got your sample Supertool script to work, and I can see how to write a script to do some of the things I want. As I see things currently, I will set up a series of citations manually, e.g. one for each census, one for parish burials, ++, use the import text tool to bring in people mentioned, who I will want to have that citation, and then ‘correct’ all the spurious new citations created with the import text tool. I’ll want to set up residence events for census imports, as well as the (highly approximate) birth years as birth events.

I’m stuck here, trying what I guess is the simplest possible use of the SuperTool: changing the type of the selected events from Residence to Census. By poking around in the code for ImportText I worked out how to import residence events, but strictly these should be Census events; I changed two of them to that value manually.

What is it with ‘EventProxy’ and ‘_Event__type’?

That sample code had this line

surnameobj.set_origintype(NameOriginType.PATRILINEAL)

but I can’t get something like this for events to work.

The simplest syntax would be

type = EventType.CENSUS

which I tried first. It doesn’t generate an error … but doesn’t do anything either.

Try this:

event.set_type(EventType.CENSUS)

And in initialization statements you could just use:

from gramps.gen.lib import EventType

(and check Commit changes)
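A plain-Python sketch of why the two attempts behave differently. The `Event` and `EventType` classes here are hypothetical stand-ins, not the Gramps classes, and the sketch assumes the real event object keeps its type in a double-underscore attribute, which would explain the `_Event__type` name seen earlier (Python stores `__type` inside a class named `Event` under that mangled name).

```python
class EventType:
    """Hypothetical stand-in for gramps.gen.lib.EventType."""
    RESIDENCE = "Residence"
    CENSUS = "Census"


class Event:
    """Hypothetical stand-in for a Gramps event object."""

    def __init__(self, event_type):
        # Two leading underscores: Python stores this as _Event__type.
        self.__type = event_type

    def get_type(self):
        return self.__type

    def set_type(self, event_type):
        self.__type = event_type


def try_plain_assignment(event):
    # This only binds a local name called `type`; the event is not touched.
    type = EventType.CENSUS
    return event.get_type()


event = Event(EventType.RESIDENCE)
after_assignment = try_plain_assignment(event)

# Calling the setter changes the object itself.
event.set_type(EventType.CENSUS)
after_setter = event.get_type()

attribute_names = list(vars(event))
print(after_assignment, after_setter, attribute_names)
```

The plain assignment raises no error because it is perfectly valid Python; it simply never reaches the event. Only the setter call modifies the object, and in SuperTool the change is then saved only when Commit changes is checked.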

Thanks. This worked. I’ll now work on setting up more tables of data to bulk import / tidy up with applications of the Supertool. In the meantime, can you send me notes or an overview of what code to use with Supertool? I understand you, @PLegoux, are French - I should be able to read French text. I’d like to produce a few pages on this in English. Maybe I could try Google translate with something in Finnish from @kku ? I also have a nephew who is married to a Finn - she might help me too!

Here is what I’ve produced, in case it is useful to you.

Generally I start by publishing what I’ve found/tried there:

Then I integrate it into my library (I’ve begun translating the comments into English, but not all are done):

And sometimes I write a “what’s new” post:

(I don’t know why Discourse published Notion ads instead of my post header, but there it is - I don’t have that issue on other social networks)

And a lot of tweets:
https://twitter.com/search?q=from%3A%40plegoux%20supertool&t=Ox0MxPb-vJe13P6yPRlRuA&s=09

Some of them are summarized or expanded in these articles:

(Same issue again)

The README in English, from @kku here

isotammi-addons/README.md at master · Taapeli/isotammi-addons · GitHub

seems pretty good to me - certainly for me to work from for a while. I’ll keep on with the ImportText tool for initial bulk uploads, and use SuperTool for refinements and corrections of what I can’t do with ImportText.

I’d offer to write up what I do as an example of using these tools, although I’m likely to see better ways as I go, so organising it clearly will not be easy. Maybe when I’ve finished would be better.
