Tree vivisection experiments with the Isotammi SuperTool

This is a start of a “How do I…?” article for the wiki on Approaching Automated Data Harmonization.

Cleaning data by hand to make it consistent and fill in omissions is pure misery. There are excellent tools for cleaning messy data outside of Gramps: you could use OpenRefine, or manipulate CSV data in a spreadsheet like Excel. But that means taking the data out of the Tree structure of Gramps, losing the benefit of Gramps understanding how the data elements interrelate, and risking data loss during the export/import cycle.

Some issues have arisen that affect broad swathes of the Gramps community, and the grumbles have led some talented people to create a number of special-purpose tools and gift them to the community.

These sorts of Harmonizing operations tend to be relatively simple to build as add-on Tools. But continually asking developers for more bespoke Single Purpose Tools seems like a poor use of their energy. They could be doing more creative things.

The traditional alternative (when trying to do it yourself) is to get the model of data structures in a Gramps tree, determine which elements have the data needed to confirm the issue exists, then write a Python script (with the appropriate field names) to run through the data, check the conditions and update records as needed.

It makes me tired to just write that out!

On the other hand, the SuperTool is already a dictionary of the data, fields and expressions. It enables a mildly technical person to hack on the data.

I’ve wanted to try out some of the capabilities of the Isotammi SuperTool for a while. But it was too intimidating for a basic user to just jump in and start experimenting. And the potential for mangling a Tree is … frightening. It’s more likely to turn a Tree into mulch than a bonsai masterpiece.

So I’ve been looking for a good (but small enough to be digestible) Real World target in data harmonization that would exercise the add-on: something basic enough to be used as an introduction.

Suddenly, a task loomed that I was dreading: harmonizing the origin Types for Surnames and multiple surnames.

I discussed a couple of particular scenario variations with Kari Kujansuu (the creator of SuperTool) and he cobbled together some example scripts. One is very basic and the second is a bit more complex.

Now we have a tiny chunk of code which is a solution that can be reverse-engineered to a known objective. It is not pseudocode and it is an issue that impacts everyone. So we may be able to learn how to re-create the progression from “problem statement” to working script using the tool.

Kari notes that if Gramps is launched from the Console (Command Line), then the SuperTool also prints all changes to the console.

The basic objective, simply stated

I want the blank Origin types for surnames that match the Father’s surname to be marked as “Patrilineal”. I don’t want to overwrite ANY existing Origin types. If I don’t have a father, I don’t want to ‘assume’ the origin type is ‘probably’ Patrilineal.
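The rule stated above can be written down as a small, Gramps-free sketch before worrying about SuperTool syntax. Everything here is a hypothetical stand-in for illustration: `Surname`, `mark_patrilineal` and the string constants are not Gramps classes or values, they only mirror the three conditions (blank origin, surname matches the father's, and a father actually exists).

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical stand-ins for illustration only; these are NOT Gramps objects.
NONE = ""                    # a blank (unset) origin
PATRILINEAL = "Patrilineal"


@dataclass
class Surname:
    surname: str
    origin: str = NONE


def mark_patrilineal(child_surnames: List[Surname],
                     father_surname: Optional[str]) -> int:
    """Set blank origins to PATRILINEAL where the surname matches the father's.

    Returns the number of surnames changed. Existing origins are never
    overwritten, and nothing is assumed when there is no father.
    """
    if not father_surname:
        return 0
    changed = 0
    for s in child_surnames:
        if s.origin == NONE and s.surname == father_surname:
            s.origin = PATRILINEAL
            changed += 1
    return changed


names = [Surname("Garner"), Surname("Garner", "Taken"), Surname("Smith")]
print(mark_patrilineal(names, "Garner"))   # 1
print([s.origin for s in names])           # ['Patrilineal', 'Taken', '']
```

Only the first surname changes: the second already has an origin, and the third does not match the father.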

I know that I’ve overlooked some outlying conditions. (Naturally, blended family situations could affect the determination.) But we can come back to add those refinements.

The problem arises from depending too blindly on the built-in Guessing

Gramps has 3 Surname Guessing options to offer. You select the option in the Display tab of Preferences.

The default is a Patrilineal variant, labeled “Father’s surname”, which will fill in a Surname as a Father (or his offspring) is added. But this feature overlooks setting the Origin to ‘Patrilineal’ to match the guess type.

Preparations

Experiments in data harmonization should always be tried on expendable sample data. Do NOT test on your real data.
  • Copy the script below to a text file on your system
  • Install the SuperTool compressed archive (SuperTool.addon.tgz) from Isotammi
    [see How To download from Isotammi’s 5.1 addon list]
  • Quit Gramps
  • Restart Gramps from the Command Line (Console) to enable access to a change log
  • Create a new Tree and import the Example.gramps file
  • Select a subset of rows in the Families view
  • Choose Tools > Isotammi Tools > SuperTool…
  • Click the Load button and open the saved Script. (This will only load, not execute.)
  • Warning: the following step modifies the Tree. To process the selected Families, click the Execute button
  • Review the change log that was written to the Console.

The first script

This Families view oriented SuperTool script is named set-surname-origin-to-patrilineal.script. It processes ONLY the Families currently selected in the Families category view.

[title]
set-surname-origin-to-patrilineal

[description]
Goes through all children in the family and sets their surname type to PATRILINEAL if the surname matches with father's surname. See https://gramps.discourse.group/t/tree-vivisection-experiments-with-the-isotammi-supertool/1621

[category]
Families

[initial_statements]
from gramps.gen.lib import NameOriginType
counter = [0]

def get_primary_surname(p):
	return p.obj.primary_name.get_primary_surname().surname

def get_surnames(p):
	for nameobj in p.nameobjs:
		for surnameobj in nameobj.get_surname_list():
			yield surnameobj

[statements]
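# Surname of the family's father, used as the comparison value
father_surname = get_primary_surname(father)
for c in children:
	for surnameobj in get_surnames(c):
		# Only touch blank origins whose surname matches the father's
		if surnameobj.get_surname() == father_surname and surnameobj.get_origintype() == NameOriginType.NONE:
			surnameobj.set_origintype(NameOriginType.PATRILINEAL)
			print(c.name,": set to PATRILINEAL:",surnameobj.get_surname())
			# Commit the child explicitly: it is the child, not the selected family, that changed
			db.commit_person(c.obj, trans)
			counter[0] += 1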



[filter]

[expressions]
"names updated:",counter[0]


[scope]
selected

[unwind_lists]
False

[commit_changes]
True

[summary_only]
True
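Since a saved script is just bracketed section headers followed by their contents, it can be inspected before it is loaded. The sketch below is a guess at the layout based only on the listing in this post, not on SuperTool's own loader; `parse_script_sections` and `modifies_tree` are hypothetical helper names.

```python
import re

# A header is assumed to be a line holding only a lowercase name in
# square brackets, e.g. "[scope]". This is inferred from the listing above.
HEADER = re.compile(r"^\[([a-z_]+)\]\s*$")


def parse_script_sections(text):
    """Split script text into a {section_name: body} dictionary."""
    sections = {}
    current = None
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    # Drop blank lines around each body but keep inner indentation
    return {name: "\n".join(lines).strip("\n")
            for name, lines in sections.items()}


def modifies_tree(sections):
    """True when the script's commit_changes section is set to True."""
    return sections.get("commit_changes", "").strip() == "True"


sample = """[title]
set-surname-origin-to-patrilineal

[category]
Families

[scope]
selected

[commit_changes]
True
"""

info = parse_script_sections(sample)
print(info["category"], info["scope"], modifies_tree(info))
# prints: Families selected True
```

A quick check like this shows which category view a script expects and whether it will write changes, before anything touches the Tree.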

A more complex variation relates to the social convention of “coverture”, where women take their husband’s surname.

This can make recognizing daughters difficult in documents like obituaries and family histories. Genealogy programs tend to list everyone by their birth surname and only store the Married Name as an alias. Although that follows guidelines, it seems backwards… since women tend to use the birth surname for less than a couple of decades and live under the married surname for 3-5 times as long. So I prefer the Preferred Name to be the one they used most of their life… their Married name, in the form that includes the ‘patrilineal’ surname as well as the ‘taken’ one.

So, rather than merely use the Gramps “Married Name” type in the Names tab, I like to use the multiple surnames feature. The 1st surname is Patrilineal in origin (using “U.L.N.” if the last name is unknown), and subsequent surname(s) are of ‘Taken’ origin, derived from their husband(s). So this script is run after the Patrilineal script.

So I’d like to set the null-value Origins of Females’ 2nd-10th multiple surnames to “Taken”. (Ideally, it would be nice to only set those that have a spouse with a matching surname. But this seems to be too ambitious. We can add that as a later refinement.)
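As with the first objective, the rule can be sketched without Gramps to make the conditions explicit. This is only an illustration: `Surname`, `mark_taken` and the constants are hypothetical names, not Gramps objects, and the optional `spouse_surnames` argument is one possible shape for the "matching spouse" refinement mentioned above, not something the SuperTool script does.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

# Hypothetical stand-ins for illustration only; these are NOT Gramps objects.
NONE = ""          # a blank (unset) origin
TAKEN = "Taken"


@dataclass
class Surname:
    surname: str
    origin: str = NONE


def mark_taken(gender: str,
               names: List[List[Surname]],
               spouse_surnames: Optional[Set[str]] = None) -> int:
    """Set blank origins to TAKEN on every surname after the first.

    `names` holds one surname list per name of the person. When
    `spouse_surnames` is given, only surnames found in that set are
    changed (the possible later refinement). Returns the number changed.
    """
    if gender != "F":
        return 0
    changed = 0
    for surname_list in names:
        for s in surname_list[1:]:          # skip the first surname
            if s.origin != NONE:
                continue                     # never overwrite
            if spouse_surnames is not None and s.surname not in spouse_surnames:
                continue
            s.origin = TAKEN
            changed += 1
    return changed


names = [[Surname("Garner", "Patrilineal"), Surname("Smith"), Surname("Jones")]]
print(mark_taken("F", names, {"Smith"}))    # 1
print([s.origin for s in names[0]])         # ['Patrilineal', 'Taken', '']
```

Without the `spouse_surnames` argument, the function behaves like the simpler objective and marks every blank surname after the first.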

While collating a [list of SuperTool Scripts](https://www.gramps-project.org/wiki/index.php/Addon:Isotammi_addons#SuperTool_Scripts) to link in the wiki, I discovered that the 2nd of two scripts was never added to this thread.

The original idea was to adapt the Discourse posting into a MediaWiki article. But it seems that a Discourse plugin offers a rollover “copy” button for “code blocks” that is not available in our wiki. That button simplifies copying the code to a new .script file that can be loaded by SuperTool. (Eliminating the need to transcribe one text box at a time. Yay!)

Copy the script below, create a new text file and save it as “set-surname-origin-to-taken.script”. Save it to a folder you can find easily when it is time to load it into SuperTool. (Or make finding the folder in the future easier. Copy the path in the breadcrumbs of the Save dialog. And set a bookmark in the GTK File Chooser.)

So here is the SuperTool script written by Kari Kujansuu during the July 2021 email exchange: “Can this be done with SuperTool?”

The lead-in will be refined in this posting over the next couple days.

[title]
set-surname-origin-to-taken

[description]
Goes through all female persons and sets their surname type to TAKEN for all except the first surname within each name

[category]
People

[initial_statements]
from gramps.gen.lib import NameOriginType
counter = [0]

[statements]
if gender == "F":
	for nameobj in nameobjs:
		for surnameobj in nameobj.get_surname_list()[1:]:
			if surnameobj.get_origintype() == NameOriginType.NONE:
				surnameobj.set_origintype(NameOriginType.TAKEN)
				print(name,": set to TAKEN:",surnameobj.get_surname())
				counter[0] += 1

[filter]

[expressions]
"names updated:", counter[0]

[scope]
selected

[unwind_lists]
False

[commit_changes]
True

[summary_only]
True