Why are First Name and Surname separated?

When you look at JSON schemas in name.py and surname.py of gramps/gen/lib, you notice a strange distribution of name data between objects.

  • All given names are grouped into a single string in first_name, i.e. when a person has several given names they are all in a single item, separated by spaces.
  • Name is the support object for various “attributes” such as title, nickname, grouping, proofs (citations and notes) … and anchors the array for the surname but singles out given names.
  • Surname gives a detailed structure to components of a name flagging their nature or origin in an array.

I find this inconsistent and culturally European-oriented (in a broad sense, including North America). The design does not easily accommodate other cultures. As an example, a former colleague of mine of Indian origin had no given name, and his surname had two components, one of which was frequently misused as a given name (first name). In most reports, his full name would be prefixed with something like <unknown> because of the missing given name, which is an error.

If each individual given name is considered a component of the Surname object (with a dedicated origin of “given” or “call”) and, by convention, given names precede all non-given components, we gain versatility.

Coupled with a small modification of the name format with new descriptor codes, we could then manipulate first and middle names separately (as in US culture), as was asked in a question on this site, or handle the Indian case mentioned above more easily.

This would also simplify general name handling as all components would then reside in a single object.
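To make the idea concrete, here is a minimal sketch of a single object holding all components. It is not the current Gramps data model: the class names `Component` and `FlatName`, the origin strings, and the sample names are all invented for illustration.

```python
from dataclasses import dataclass, field

# Origins that mark a component as a given name (hypothetical values,
# following the "given" / "call" proposal above).
GIVEN_ORIGINS = {"given", "call"}


@dataclass
class Component:
    text: str
    origin: str  # e.g. "given", "call", "inherited" (illustrative only)


@dataclass
class FlatName:
    # By convention, given components come before all non-given ones.
    components: list = field(default_factory=list)

    def given(self):
        return [c.text for c in self.components if c.origin in GIVEN_ORIGINS]

    def surname(self):
        return [c.text for c in self.components if c.origin not in GIVEN_ORIGINS]

    def first(self):
        names = self.given()
        return names[0] if names else ""

    def middle(self):
        return self.given()[1:]

    def full(self):
        # No placeholder is emitted when there is no given name.
        return " ".join(c.text for c in self.components)


# A US-style name with first and middle names handled separately.
us_name = FlatName([
    Component("Mary", "given"),
    Component("Ann", "given"),
    Component("Smith", "inherited"),
])

# A made-up name with two surname components and no given name at all.
no_given = FlatName([
    Component("Kumar", "inherited"),
    Component("Rao", "inherited"),
])
```

With this layout, `us_name.first()` and `us_name.middle()` give the US-style split, while `no_given.full()` simply joins the two surname components without any `<unknown>` prefix.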

I hesitate about Suffix, Title, Nickname and Family Nickname. Are they person attributes (closely associated with the person) or name attributes (liable to vary across time and name variants)? Presently they are associated with the name and must be repeated if they are constant for the person. And even for nicknames: wouldn’t they be better treated as name components with a specific connector, such as a dash, “called” or “aka”, and a specific origin?

A possible benefit would be automatic hinting for Scandinavian or Spanish naming schemes, where components could be added automatically when creating the name, perhaps with a menu selection to pre-fill items.

This could make the whole feature more consistent.

Last, as I am working on a true relational database schema, the present state of affairs makes deleting a person extremely tricky. Reorganising things a bit would allow using ON DELETE CASCADE on the Name and Surname tables.
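As a rough illustration of the cascade, here is a sketch using Python's built-in `sqlite3` module. The table and column names are assumptions made up for this example; they are not the schema I am working on, nor anything Gramps ships.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys (and cascades) when this is enabled.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: each name belongs to a person, each surname
# component belongs to a name.
conn.executescript("""
CREATE TABLE person (
    id INTEGER PRIMARY KEY
);
CREATE TABLE name (
    id        INTEGER PRIMARY KEY,
    person_id INTEGER NOT NULL REFERENCES person(id) ON DELETE CASCADE
);
CREATE TABLE surname (
    id       INTEGER PRIMARY KEY,
    name_id  INTEGER NOT NULL REFERENCES name(id) ON DELETE CASCADE,
    position INTEGER NOT NULL,
    origin   TEXT NOT NULL,
    text     TEXT NOT NULL
);
""")

conn.execute("INSERT INTO person (id) VALUES (1)")
conn.execute("INSERT INTO name (id, person_id) VALUES (10, 1)")
conn.executemany(
    "INSERT INTO surname (name_id, position, origin, text) VALUES (?, ?, ?, ?)",
    [(10, 0, "given", "Mary"), (10, 1, "inherited", "Smith")],
)

# Deleting the person removes its names, which in turn removes the components.
conn.execute("DELETE FROM person WHERE id = 1")

names_left = conn.execute("SELECT COUNT(*) FROM name").fetchone()[0]
components_left = conn.execute("SELECT COUNT(*) FROM surname").fetchone()[0]
```

After the single `DELETE`, both `names_left` and `components_left` are zero, with no hand-written cleanup code.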

A related question is about name “share-ability”. Separating Surname from Name suggests that a family name could be entered only once and shared among family members. This is not possible at present because Surname is not a “primary object”, i.e. it is not individualised in the database, as it lacks a handle. Does sharing a family name improve person management? Sharing makes sense in modern times, where legal authorities are extremely picky about name inheritance, but is less meaningful when dealing with old records, as the spelling of a name may vary widely across a person’s life. Name sharing would also rule out merging given names with the surname as proposed above.

What is a name in the most general sense when we abstract from our native culture? How can we deal with this in the most “neutral” way?

Your opinion?

Good question, and my 1st thought is that a name is a list of elements, where even titles can be inserted anywhere. And I say that, because I have a few dozen people where a title like baron or count is placed between the first and the last name. This means that for the former chairman of the Amsterdam stock exchange, I can use the sequence:

  1. Boudewijn
  2. baron
  3. van Ittersum

I can even separate van and Ittersum, but I prefer not to do that.

In this case, baron is a title, and before the introduction of family names, the title would probably be the full ‘baron van Ittersum’, but today ‘van Ittersum’ is the registered family name (surname). In other cases, there may be a suffix, like ‘, heer van Amersfoort’, which is another title that can be used by the same person, where ‘van Amersfoort’ is not a family name, unlike the ‘van Ittersum’ part. Also note the comma here, which is used for display purposes: I really don’t like the idea that I must put it in the suffix and can’t register it as a separator.

This example is still European, but I had a grand uncle named ‘Surendra Nihal Singh’, a late journalist, born in India. For him, I have always been told that Surendra was his given name, and the Nihal Singh part is a bit difficult. I’ve been told that Singh means lion, and tend to think that it is sort of a family nickname. When I ignore that, his name still fits the European ‘standard’.

We need to throw that whole European idea overboard for Vietnam, where the given name comes at the end, and what we see as the family name also has two components, I’ve been told. I had two fellow students from Vietnam and they told me that.

And since I can’t ask them, I asked ChatGPT:

Yes, of course! Vietnamese names have a specific structure, which consists of three parts: the family name (ho), the middle name (ten dem), and the given name (ten).

Family name (ho): The family name appears first in Vietnamese names and is passed down from the father to all of his children. Vietnamese family names are often one syllable and can be quite common, with some surnames being shared by a large number of people.

Middle name (ten dem): The middle name is also known as the "secondary name" and is used to differentiate individuals who share the same family name. It appears after the family name and can be one or two syllables long. The middle name is usually chosen to have a specific meaning, such as a virtue or a desirable quality.

Given name (ten): The given name appears last and is chosen by the individual or their parents. It is often two syllables long and can be chosen for its meaning, sound, or to honor a family member or historical figure.

So, for example, in the name “Nguyen Van Minh,” “Nguyen” is the family name, “Van” is the middle name, and “Minh” is the given name.

This suggests that the list approach is fine for all names and cultures that I know, if we can mark individual parts as a family name, middle name, clan name, etc., and use separators.
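Here is a small sketch of that list approach, with each part carrying a marker and its own separator. The `Part` class, the `kind` values, and the helper functions are invented for illustration and are not Gramps code. The third sample name is a made-up combination (including the given name “Jan”) that only serves to show the comma stored as a separator instead of inside the suffix.

```python
from dataclasses import dataclass


@dataclass
class Part:
    text: str
    kind: str             # e.g. "given", "middle", "family", "title"
    separator: str = " "  # text placed before this part, unless it is first


def render(parts):
    """Join the parts in order, using each part's own separator."""
    out = ""
    for index, part in enumerate(parts):
        out += part.text if index == 0 else part.separator + part.text
    return out


def of_kind(parts, kind):
    """Return the text of every part marked with the given kind."""
    return [part.text for part in parts if part.kind == kind]


# Title placed between the given name and the family name.
dutch = [
    Part("Boudewijn", "given"),
    Part("baron", "title"),
    Part("van Ittersum", "family"),
]

# Family name first, given name last.
vietnamese = [
    Part("Nguyen", "family"),
    Part("Van", "middle"),
    Part("Minh", "given"),
]

# Made-up combination: the comma is a separator, not part of the title text.
with_suffix = [
    Part("Jan", "given"),
    Part("van Ittersum", "family"),
    Part("heer van Amersfoort", "title", separator=", "),
]
```

The order of the list is the display order, so nothing in the structure assumes that the given name comes first; `of_kind(vietnamese, "given")` still finds “Minh” at the end.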

It might be nice to let users choose a different set of name components for sorting vs. display. For example, if my display format is “Surname Suffix, Title Given” I might prefer to sort as “Surname, Given”.

I’m not sure if it would. I currently use the feature for grouping surnames, with a patch for the sorting within the group.

In each name’s Name Editor window, the default Display and Sort can be individually altered.

Except that, apart from the built-in options, the Default format (defined by Gramps preferences) is shared by Sort and Display. There is only one name format setting in the Display tab of Gramps Preferences. As such, you can have only one customised format, while you’d need two if you want to separate display from sorting.

True. Ideally, I would love to see independent defaults for Display and Sort. But in the person’s Name Editor, the single default option can be overridden to use a different display name option and/or a different sort option. These overrides accept either the built-in name options or custom name options built by the user.

My default set in Preferences is “Title Given "Nickname" Surname, Suffix”. This works well for most relatives except for people with a Title. For these records I manually override the sort option to “Given Surname, Suffix, Title”. The same is true for displayed names where the Title needs to precede just the Surname. These records will receive both a new sort name and a new display name.
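To show what independent defaults with per-name overrides could look like, here is a minimal sketch. It uses Python `str.format` templates rather than the Gramps name format codes, and the `DEFAULTS` dictionary and `NameRecord` class are hypothetical. It also assumes every field is filled in; empty fields would need extra cleanup of leftover punctuation.

```python
from dataclasses import dataclass
from typing import Optional

# Two separate defaults, one for display and one for sorting (hypothetical).
DEFAULTS = {
    "display": "{surname} {suffix}, {title} {given}",
    "sort": "{surname}, {given}",
}


@dataclass
class NameRecord:
    fields: dict
    display_override: Optional[str] = None
    sort_override: Optional[str] = None

    def display(self):
        template = self.display_override or DEFAULTS["display"]
        return template.format(**self.fields)

    def sort_key(self):
        template = self.sort_override or DEFAULTS["sort"]
        return template.format(**self.fields).casefold()


fields = {"title": "Dr.", "given": "John", "surname": "Smith", "suffix": "Jr."}

# Uses both defaults.
plain = NameRecord(fields)

# Same person, but with the sort format overridden for this one name.
overridden = NameRecord(fields, sort_override="{given} {surname}, {suffix}, {title}")
```

Here `plain.display()` and `plain.sort_key()` come from different defaults, and `overridden` changes only its sort key while keeping the default display.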

I wonder if there’s a way to append birthyear-deathyear to the sort key?

It’d be helpful to sort all the namesakes by generation.
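I don't know of a way to do this with the current name formats, but as a sketch of the idea, a sort key could be a tuple with the years appended after the name. The function and the sample data below are made up for illustration; unknown years are pushed to the end of each group of namesakes.

```python
UNKNOWN_YEAR = 9999  # sentinel so that people without dates sort last


def sort_key(surname, given, birth_year=None, death_year=None):
    """Build a key that orders namesakes by birth year, then death year."""
    return (
        surname.casefold(),
        given.casefold(),
        birth_year if birth_year is not None else UNKNOWN_YEAR,
        death_year if death_year is not None else UNKNOWN_YEAR,
    )


# Four made-up namesakes: (surname, given, birth year, death year).
people = [
    ("Smith", "John", 1850, 1910),
    ("Smith", "John", 1790, 1851),
    ("Smith", "John", None, None),
    ("Smith", "John", 1820, 1880),
]

ordered = sorted(people, key=lambda person: sort_key(*person))
birth_years = [person[2] for person in ordered]
```

Sorting on the tuple keeps the namesakes together and lists them oldest first, which approximates ordering by generation.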

While we’re discussing names, I’d like to mention nicknames: currently a nickname is tied to a name and there can be only one (per name). I know a few people who have different nicknames depending on the context, for instance one family nickname and one used in scouting.
So defining a name as a list of components, each with its qualifier, would definitely help here, with the context given by a note as in the current model, for instance.