Diversely transliterated (sur)names and statistics

Gharnatix · July 20, 2024, 3:52pm

Hi! I’m facing a challenge with the (sur)names in my database.

Due to the unavoidable complications of transliterating Arabic names into the Latin alphabet, and also due to poor administration in Morocco a century ago, I have many transliterated versions of the same (sur)name.

For example: Mohamed, Mohammed, Muhammad or Nooh, Nuh, Noah.

This gives an inaccurate representation in my dashboard’s (sur)name statistics and also makes searching for a (sur)name harder.

One solution would be to change all names to their Arabic versions, leaving the transliterated versions as alternative names under the name editor tab. The problem with that is that many of my younger family members don’t read Arabic anymore, so it would make my graph unreadable to them.

Another solution would be to choose one uniform transliteration (e.g. Mohammed) and stick to it, entering the variations as (family) nicknames. I was about to opt for this but then thought of first asking for advice here.

Do you see a better way in which I can fix the statistics whilst still displaying transliterated (sur)names in my graph?

emyoulation · July 20, 2024, 4:06pm

In the Gramps Glossary, the term “anglicisation” describes this problem for surnames. (But it does not apply to grouping Given names.)

The glossary term also links to the workflow to use ‘grouping’ in the Name Editor to manage such spelling variations.

DaveSch · July 20, 2024, 6:40pm

In the Name Editor, there is the Group As function. This allows spelling variations of the same surname to be grouped together in the Grouped People view (and some reports). However, this function does not alter the person’s record or surname.

Tools like the Top Surnames and Surname Cloud gramplets work only from the surname as entered in the Preferred Name.

An alternative, if things like the top surname and the cloud are important to you, is to enter multiple name entries for people as found. This is done in the person’s edit window’s Names tab.

The name in the Preferred Name slot would be a consistent anglicized version of the surname. And you can use the Name Type “Anglicized”. Other name entries could be for other variations of the anglicized name. Another entry could be for the name in Arabic (or several if variations are found). Again, you could put these under an “Arabic” name type.

One thing to keep in mind: most reports, etc., use only the Preferred Name. However, a custom filter for a surname will search all name entries found in a person’s record.

emyoulation · July 20, 2024, 7:06pm

It seems like the Top Surnames and Surname cloud could be adapted to key off Groupings rather than Surnames.

But it is more important to be able to filter Dashboard gramplets to a particular subset of people. For instance, I might only want a Surname Cloud for the living descendants of one line’s progenitor. Or the Age Stats for my ancestors and their siblings. Nick has suggested that a Filter for Dashboards was an option.

Gharnatix · July 20, 2024, 7:37pm

Thank you both!

@DaveSch (ah - Dutch roots I suspect? Leuk!), I’ll do just that.

For the Preferred Name I will choose consistent Latinised given names and surnames, and give them the name type ‘Latinised’ (makes more sense than Anglicised in my case).

As alternative names I will add the names in their official Latinised spelling and in their official Arabic script.

This will solve both my statistics and searching issue.

@emyoulation the adaptation of these two gramplets sounds interesting. Do you think that’s something a non-programmer like me could pull off easily? If so, that might be a simpler solution after all…
Edit: I did just realise that such an adaptation might only be a solution for surnames however, since there is no ‘group as’ option for given names.

emyoulation · July 20, 2024, 9:01pm

It looks like Top Surnames might be a good starter. The biggest learning curve is in cloning the built-in as a unique addon. (That’s so you can experiment without risk to the original Top Surnames gramplet.)

TopSurnamesGramplet already looks for the name.get_group_name() so you don’t have to learn how to do something new. It just has to be cut back to ignore the NOT grouped surname when there is an override. Looks like it might be as simple as changing a variable name on 2 lines.

https://github.com/gramps-project/gramps/blob/d7b5fd967f0f4f09acaaa84db7a0637423746c20/gramps/plugins/gramplet/topsurnamesgramplet.py#L78-L81

          allnames = [person.get_primary_name()] + person.get_alternate_names()
          allnames = set([name.get_group_name().strip() for name in allnames])
          for surname in allnames:
              surnames[surname] += 1
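
One possible reading of that change, sketched outside Gramps so it can be run on its own. The `Name` and `Person` classes here are hypothetical stand-ins that model only the three methods used in the excerpt above, and `count_primary_only` is an assumption about what "cutting back" could look like, not the actual patch.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Name:
    """Hypothetical stand-in for a Gramps name; only the group name is modelled."""
    group_name: str

    def get_group_name(self):
        return self.group_name


@dataclass
class Person:
    """Hypothetical stand-in for a Gramps person."""
    primary: Name
    alternates: list = field(default_factory=list)

    def get_primary_name(self):
        return self.primary

    def get_alternate_names(self):
        return self.alternates


def count_all_names(people):
    """Mirror the excerpt: each distinct group name on a person counts once."""
    surnames = Counter()
    for person in people:
        allnames = [person.get_primary_name()] + person.get_alternate_names()
        for surname in {name.get_group_name().strip() for name in allnames}:
            surnames[surname] += 1
    return surnames


def count_primary_only(people):
    """Assumed variant: count only the group name of the preferred name."""
    surnames = Counter()
    for person in people:
        surnames[person.get_primary_name().get_group_name().strip()] += 1
    return surnames
```

With one person whose preferred name is grouped as Mohammed and whose alternate names are Mohamed and Muhammad, the first function adds a count to all three spellings, while the second adds a count only to Mohammed.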

DaveSch · July 20, 2024, 9:10pm

I could not get it to display more than the top ten names, even after changing the value to 100. The line does carry a comment saying that the number will be overwritten.

self.top_size = 10 # will be overwritten in load

But I could not see the programming logic that does that.

Nick-Hall · July 20, 2024, 9:33pm

The on_load and on_save methods are used to store the value between sessions. I would expect the gramplet to define a configuration option so that the user can change the value, but for some reason this doesn’t appear to have been done.

DaveSch · July 21, 2024, 1:22am

Thanks, I now see where the size is supposed to actually be set. This is a quick hack to increase the number of surnames returned in the list.

def on_load(self):
    # Quick hack: ignore the saved value and force a larger list size.
    if len(self.gui.data) > 0:
        # self.top_size = int(self.gui.data[0])  # original line, disabled
        self.top_size = 150

DaveSch · September 30, 2024, 7:06pm

Patch to allow users to set the number of names between 10 and 1000

PR #1780
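
For readers following along, here is a minimal sketch of how a saved size could be read back and kept within that 10 to 1000 range. It is not the code from the PR: the function name, the constants, and the fallback behaviour are all assumptions, and the only detail taken from the thread is that the saved data is a list whose first item holds the size (as in `self.gui.data[0]` above).

```python
# Hypothetical limits, taken from the range mentioned in the post above.
MIN_TOP_SIZE = 10
MAX_TOP_SIZE = 1000
DEFAULT_TOP_SIZE = 10


def load_top_size(saved_data):
    """Return a list size from saved gramplet data, clamped to the allowed range."""
    if not saved_data:
        # Nothing was saved yet, so fall back to the default.
        return DEFAULT_TOP_SIZE
    try:
        value = int(saved_data[0])
    except (TypeError, ValueError):
        # The saved entry is not a number; fall back to the default.
        return DEFAULT_TOP_SIZE
    # Clamp the value into [MIN_TOP_SIZE, MAX_TOP_SIZE].
    return max(MIN_TOP_SIZE, min(MAX_TOP_SIZE, value))
```

Clamping on load means a hand-edited or corrupted saved value cannot leave the gramplet with an empty or enormous list.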
