Extended surname search considering historical variations and typos

During genealogical research, it’s quite common to encounter situations where surnames might change their spelling over time. This can be due to historical, cultural, or linguistic reasons. Typos in old records, migration and assimilation, or simply inconsistency in spelling can lead to one surname having several variations.

For instance, the surname “Smith” might have variations like “Smithe” or “Smyth”. If a researcher is only aware of one of these spellings, they might miss out on crucial details or connections associated with the other variations.

My proposal is to develop a feature that allows grouping such surname variants together. When searching for one variant, the system would automatically consider the other associated spellings, helping in retrieving more relevant results.

Example: If we group “Smith”, “Smithe”, and “Smyth” as related spellings, then searching for “Smith” would also yield results for “Smithe” and “Smyth”.
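
To make the idea concrete, here is a minimal Python sketch, assuming the groups are kept as plain lists of spellings. The names `VARIANT_GROUPS`, `expand_surname`, and `search` are hypothetical and are not part of Gramps:

```python
# Hypothetical data: each inner list is one group of related spellings.
VARIANT_GROUPS = [
    ["Smith", "Smithe", "Smyth"],
]

def expand_surname(surname, groups=VARIANT_GROUPS):
    """Return the set of spellings to search for, including the query itself."""
    wanted = surname.casefold()
    for group in groups:
        if any(wanted == variant.casefold() for variant in group):
            return set(group)
    return {surname}  # not grouped: search only the spelling as typed

def search(people, surname):
    """Return the (surname, given) records whose surname is in the expanded set."""
    variants = {v.casefold() for v in expand_surname(surname)}
    return [p for p in people if p[0].casefold() in variants]

people = [("Smith", "John"), ("Smyth", "Mary"), ("Jones", "Ann")]
print(search(people, "Smith"))  # [('Smith', 'John'), ('Smyth', 'Mary')]
```

The point of the sketch is that the expansion happens inside the search itself, so the user types one spelling and never has to build a filter per group.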

Let’s discuss this idea. I’m curious about the community and developers’ thoughts. Do you think such a feature could be beneficial? What technical challenges might arise during its implementation?

This feature (or rather, features) already exists.

The Alternate Names tab (a tab in the Person Editor) allows recording of aliases and typos. The Name Editor allows grouping of variants.

Searching names also searches a person’s alternate names at the same time.

I know about Alternate Names and I use them, but not for searching via filters. It doesn’t work there, or maybe I don’t understand how it works.

@emyoulation, a simple example:

[Screenshot from 2023-09-26 23-08-16]

As you can see, I use grouping.
When I search for “Носенко”, I don’t see the blue row in the search results.

One thing to remember: creating a Group As override does NOT alter the underlying information in a person’s record.

And there are two major decisions.

  1. Which names should be grouped together?
    Simmon, Simmons?

  2. Which name do you display as the primary?
    Smith (Smithe, Smyth, Smythe) or
    Smyth (Smythe, Smith, Smithe)

The other thing to consider is how you sort your database.

The default is Surname, Given. But in this scenario, “Smith, John” and “Smyth, John” will not be sorted next to each other. Within the overall “Smith (Smithe, Smyth, Smythe)” grouping, all the Smiths will sort first and then all the Smyths.

In this case, setting your default display name to Given Surname becomes a more desirable option all around.
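
To illustrate the difference, here is a small Python sketch that sorts one grouped set of names both ways. It uses plain tuples rather than Gramps code, and the sample people are made up:

```python
# Made-up sample: (surname, given) pairs that all belong to one Group As group.
people = [
    ("Smyth", "Adam"),
    ("Smith", "John"),
    ("Smyth", "John"),
    ("Smith", "Zoe"),
]

# Surname, Given: every Smith comes before every Smyth.
by_surname_given = sorted(people, key=lambda p: (p[0], p[1]))

# Given first: the two Johns end up next to each other.
by_given = sorted(people, key=lambda p: (p[1], p[0]))

print(by_surname_given)
print(by_given)
```

With the first key, “Smith, John” and “Smyth, John” are separated by “Smith, Zoe” and “Smyth, Adam”. With the second key they are adjacent.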

An example:

Not for all the surnames in your database, and not for the display, but only for the sort order of these grouped surnames. On the same grouping tab there is a sort order field that you can use to sort that specific group in Given Surname order.

See here and here.

It would be nice to have that default sorting option for grouped surnames in Preferences, used when no order is defined in the sort field of the Name Editor.

Ideally Gramps should have a default Display name and a different default Sort. I have not filed a feature request.

In the Name editor, you can override the default sort order and the display name independently.

My actual display name setting is Title Given "Nickname" Surname, Suffix. This covers most records. When needed, individual records get their own display name.

For sort, I am giving all records the override option Given. This is easier than you may think. With father’s surname guessing, the sort override carries over to all children when they are added. The only times I need to set the override manually are when adding a spouse and when adding another name record to a person. So far, I have not needed to give a name record a sort other than Given.

I did: 0011788: Sorting of people with grouped surnames (Gramps bug tracker)

What I envision is totally separating the Sort function from the Display Name.

In Preferences, similar to the Display Name editor, there would be a Sort Name preference.

Getting back to what @Urchello was talking about…

Could this be done by creating another rule? A Grouping search?

The Smith/Smythe/Smithe/Schmidt is not the best example because that grouping is simply handled with a SoundEx phonetic algorithm filter. (Although, as an international tool, we really need a Double Metaphone matching feature.)
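
To see why the Smith variants are a weak test case, here is a minimal Python sketch of the classic American Soundex coding. It is written from the textbook rules, not taken from Gramps, so the SoundEx filter in Gramps may differ in details. `CODES` and `soundex` are names chosen for this example:

```python
# Letter -> digit table of American Soundex; vowels, H, W, and Y have no code.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        CODES[letter] = digit

def soundex(name):
    """Return the four-character Soundex code of a Latin-alphabet name."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]                 # the first letter is kept as-is
    prev = CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            continue                    # H and W do not separate equal codes
        code = CODES.get(c, "")
        if code and code != prev:
            result += code
        prev = code                     # a vowel resets prev, so codes can repeat
    return (result + "000")[:4]         # pad with zeros, truncate to 4 characters

for surname in ["Smith", "Smythe", "Smithe", "Schmidt", "Zimmermann", "Carpenter"]:
    print(surname, soundex(surname))
```

All four Smith spellings collapse to the same code, S530, while Zimmermann (Z565) and Carpenter (C615) do not, so a phonetic filter cannot connect a translated surname.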

Let’s use Zimmermann/Carpenter instead.

So, perhaps a similar Rule to the Soundex one … excluding the First name, Call name, and Nickname; but extending to include Group name?

Another option would be to add “Grouped Surname” to the list of optional inputs in the “People with the <name>” filter rule. The rule allows for partial entry, as well as regex.
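
Because the rule accepts regular expressions, a single pattern can already stand in for several OR-ed rules. Here is a minimal Python sketch of building such a pattern. The helper name `group_pattern` is hypothetical, and how Gramps applies the pattern internally (anchoring, case sensitivity) is an assumption that is not verified here:

```python
import re

def group_pattern(surnames):
    """Build one anchored alternation that matches any spelling in the list."""
    alternatives = "|".join(re.escape(s) for s in surnames)
    return f"^({alternatives})$"

pattern = group_pattern(["Zimmermann", "Carpenter"])
print(pattern)  # ^(Zimmermann|Carpenter)$

# Assumption: case-insensitive matching, chosen here only for the demonstration.
regex = re.compile(pattern, re.IGNORECASE)
for surname in ["Zimmermann", "carpenter", "Zimmer"]:
    print(surname, bool(regex.match(surname)))
```

The pattern still has to be written by hand for every group, which is the limitation a “Grouped Surname” input would remove.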

I’m not sure I fully understand you, guys, but it seems that to solve my problem I have to create filter rules for each surname group. For example, if I want to find all the people who are grouped under Smith/Smythe/Smithe/Schmidt, I would need to create a person filter with a separate rule for each surname, using OR conditions between these rules.

I think I expected something that could conduct a similar search without the need to create such filters for each surname group. Imagine that the system, by default, creates filters for all groups with the rules I’ve described above, but these filters aren’t listed in the filter dropdown; they are simply applied by default :thinking:. I understand this might sound a bit strange, but it’s another way to explain precisely what I’m expecting and what I find lacking in the search functionality.

The information for the global Group As is stored in the database. If you open a Gramps XML file, it is stored as the last entries.

Using my Smith grouping, these are the entries:

<map type="group_as" key="Smith" value="Smith (Smyth, Smythe)"/>
<map type="group_as" key="Smyth" value="Smith (Smyth, Smythe)"/>
<map type="group_as" key="Smythe" value="Smith (Smyth, Smythe)"/>

I do not know how you would create a single filter with this information.
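
One possible starting point, as a sketch only: the entries can be read with Python’s standard XML parser and inverted, so that each group name lists its surnames. The `<maps>` wrapper element and the function name `surnames_by_group` are made up for this example. A real Gramps XML file has its own root element and an XML namespace, which this sketch does not handle:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# The three entries quoted above, wrapped in a made-up root element so that
# the fragment parses on its own.
FRAGMENT = """
<maps>
  <map type="group_as" key="Smith" value="Smith (Smyth, Smythe)"/>
  <map type="group_as" key="Smyth" value="Smith (Smyth, Smythe)"/>
  <map type="group_as" key="Smythe" value="Smith (Smyth, Smythe)"/>
</maps>
"""

def surnames_by_group(xml_text):
    """Invert key -> value into group name -> list of surnames."""
    groups = defaultdict(list)
    for entry in ET.fromstring(xml_text).iter("map"):
        if entry.get("type") == "group_as":
            groups[entry.get("value")].append(entry.get("key"))
    return dict(groups)

groups = surnames_by_group(FRAGMENT)
print(groups)  # {'Smith (Smyth, Smythe)': ['Smith', 'Smyth', 'Smythe']}
```

The list of surnames for one group is exactly what a single filter would need to test against.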

And there are the singleton Group As overrides. These are stored in the Name’s group_as field.

In the database, the table containing this information is called “name_group”.

For example, with a filter that asks you to select a group name from a dropdown list of all group values.
After you select a group, the filter would search for all names that are keys for that group.
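
A rough Python sketch of how such a filter could behave, under stated assumptions: `NAME_GROUP` is a hypothetical stand-in for the name_group table, `Person` is a simplified class and not the Gramps one, and both function names are invented for illustration:

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for the name_group table: surname -> group name.
NAME_GROUP = {
    "Smith": "Smith (Smyth, Smythe)",
    "Smyth": "Smith (Smyth, Smythe)",
    "Smythe": "Smith (Smyth, Smythe)",
}

@dataclass
class Person:  # simplified, not the Gramps Person class
    given: str
    surname: str
    alternate_surnames: list = field(default_factory=list)

def dropdown_choices(name_group):
    """Distinct group names, as they would be offered in the dropdown."""
    return sorted(set(name_group.values()))

def people_in_group(people, name_group, selected_group):
    """Keep people whose primary or alternate surname is a key of the group."""
    keys = {k for k, g in name_group.items() if g == selected_group}
    return [p for p in people if keys & {p.surname, *p.alternate_surnames}]

people = [
    Person("John", "Smith"),
    Person("Mary", "Taylor", ["Smythe"]),
    Person("Ann", "Jones"),
]
selected = dropdown_choices(NAME_GROUP)[0]
print([p.given for p in people_in_group(people, NAME_GROUP, selected)])
# ['John', 'Mary']
```

Checking alternate surnames as well as the primary one is a design choice of this sketch; it means a person recorded as Taylor with the alternate name Smythe is still found.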
