Consistent Database Method Names

Gramps has a number of methods for getting data from the database. The available methods are a combination of

  • record identifier
    • handle
    • gramps_id
  • result type
    • object
    • DataDict
    • iterator of object, handle or data

but there is little consistency in the naming of these methods. This was previously discussed in late 2024 but no consensus was reached. Personally, I find this lack of consistency frustrating so I’m picking up the baton in the hope that we can reach a consensus this time.

Proposal

Current                      | Proposed                       | Alternatives
get_person_from_handle       | get_person                     |
get_person_from_gramps_id    | get_person_from_gramps_id      |
get_raw_person_data          | get_person_data                |
_get_raw_person_from_id_data | get_person_data_from_gramps_id |
has_person_handle            | has_person                     |
has_person_gramps_id         | has_person_from_gramps_id      |
get_number_of_people         | get_person_count               | count_of_person, person_count
iter_person_handles          | iter_person_handles            |
iter_people                  | iter_person                    |
_iter_raw_person_data        | iter_person_data               |

with similar changes for the other object types.

Summary of changes

  1. Two internal methods are made public (_get_raw_person_from_id_data, _iter_raw_person_data)
    There does not appear to be a reason for these methods to remain internal
  2. The singular form is used in method names (person rather than people).
    Hopefully making it easier for non-native English speakers
  3. handle is the primary record identifier and so _from_handle is dropped from method names.
    Less typing is always a good thing
  4. _raw is dropped from methods.
    It is superfluous given _data is also present in the method name.

Proposed timeline

Gramps version | Status
6.1            | Proposed methods implemented. Current methods remain available. Codebase left as-is.
6.2            | Codebase updated to use proposed methods. Current methods deprecated but still available. All addons updated.
6.3            | Deprecated methods removed.

6.1, 6.2, 6.3 just represent future releases. The actual version numbers may vary.
A staggered implementation will make it easier for addon developers who support multiple Gramps versions from a single code base. I welcome proposals to move more quickly if this is not a concern.
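The deprecation stage of the timeline (6.2 in the table) could be sketched as follows. This is only an illustration: PersonDb and its in-memory dictionary are hypothetical stand-ins, not the Gramps database classes, and the only method names used are the ones from the table above.

```python
import warnings


class PersonDb:
    """Hypothetical stand-in for a Gramps database class (illustration only)."""

    def __init__(self):
        # Toy storage keyed by handle; the real database backends differ.
        self._people = {"h1": "Person h1"}

    def get_person(self, handle):
        """Proposed name: look up a person by handle."""
        return self._people[handle]

    def get_person_from_handle(self, handle):
        """Current name, kept as a deprecated alias of get_person."""
        warnings.warn(
            "get_person_from_handle is deprecated; use get_person",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not this alias
        )
        return self.get_person(handle)
```

An addon that supports several Gramps versions keeps working with the old name during this stage and only sees a DeprecationWarning, which gives its author one release to switch.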

Previous

We could look to combine the handle and gramps_id forms of each method, either via @singledispatch or via parameter type inspection, i.e. if isinstance(key, PersonHandle): ... elif isinstance(key, PersonGrampsID): ... Personally, I’d prefer to make some improvement now.
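A minimal sketch of the dispatch idea, using functools.singledispatchmethod from the standard library. It assumes PersonHandle and PersonGrampsID are distinct runtime classes (modelled here as str subclasses); the PersonDb class and its two dictionaries are hypothetical and only stand in for real storage.

```python
from functools import singledispatchmethod


class PersonHandle(str):
    """Hypothetical handle type (assumption: a distinct str subclass)."""


class PersonGrampsID(str):
    """Hypothetical Gramps ID type (assumption: a distinct str subclass)."""


class PersonDb:
    """Toy database, for illustration only."""

    def __init__(self):
        self._by_handle = {PersonHandle("abc123"): "Jane Doe"}
        self._handle_by_id = {PersonGrampsID("I0001"): PersonHandle("abc123")}

    @singledispatchmethod
    def get_person(self, key):
        # Fallback for any key type that has no registered implementation.
        raise TypeError(f"unsupported key type: {type(key).__name__}")

    @get_person.register(PersonHandle)
    def _(self, key):
        # Handle lookup: direct access to the record.
        return self._by_handle[key]

    @get_person.register(PersonGrampsID)
    def _(self, key):
        # Gramps ID lookup: resolve the ID to a handle first.
        return self._by_handle[self._handle_by_id[key]]
```

With this shape, get_person(PersonHandle("abc123")) and get_person(PersonGrampsID("I0001")) return the same record, while a plain str is rejected with a TypeError, so the caller still has to state which kind of identifier it holds.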

Your comments sought.

Where does the Simple Access fit into this discussion of harmonizing method naming?

Great question @emyoulation . The good news is that the SimpleAccess API will not be impacted at all. Internally, the SimpleAccess class is using the “Current” methods from my table. In my timeline, during 6.2, this would be updated to use the “Proposed” methods. That’s a purely internal change and any reports using SimpleAccess should continue to work with no change.

I’ve updated the proposal adding the following

Current              | Proposed
has_person_gramps_id | has_person_from_gramps_id
has_person_handle    | has_person
get_number_of_people | count_of_person

Open questions

  1. Instead of get_person_from_gramps_id we could use the shorter get_person_gramps_id. The same for has_person_from_gramps_id
  2. I’ve proposed count_of_person but I can’t say I particularly like it.
    get_number_of_person might be slightly better.

I heartily support better method names! It is a mess right now, and any variation that you propose would be 100x better.

I much prefer the handle or from_handle suffix.

Explicit is much better than implicit.

Definitely not number_of_person, that sounds as though it would return something like the gramps_id, which is like a number.

I did make a suggestion about regularising DataDict and raw data, some time ago, but I can’t quickly find that now.

This pull request implements my initial proposal
Add aliases for consistent method names in the DB public API by stevenyoungs · Pull Request #2221 · gramps-project/gramps

It does not currently reflect any comments on this thread.

@kulath can you expand a bit on why you would prefer to keep the _handle or from_handle suffix please?
My longer term hope is that we can combine the current get_person_from_handle and get_person_from_gramps_id methods into a single method, get_person, which “does the right thing” when called with either a PersonHandle or PersonGrampsID argument. Dropping the “_from_handle” suffix was the first step in that direction.

Ideally I’d like a form of words that uses person in the name and kind of makes sense. I agree that number_of_person or count_of_person isn’t quite there :frowning:

Because when I read the code I want it to be explicit and clear whether I am using a handle or a gramps_id.

Maybe I am tracing back through code or something, and if I see get_person_from_handle, then I know that what I have is a handle and I can analyse the code accordingly. Some things work with handles, and some with IDs, for example IDs are user facing while handles are not etc.

How about person_count? or get_person_count perhaps?

Of the two I personally prefer get_person_count. I’ve updated the table in the initial post and added an Alternatives column.

OK. If you see a call to get_person_from_handle you expect (hope :slight_smile: ) the author is passing in a handle.

Fully agree. Handles are internal, are unique and, with recent changes, exist.
Gramps IDs are user-facing, not guaranteed to be unique, and the record may not actually exist.

Don’t we have a unique index on Gramps IDs?

No, Gramps IDs can be repeated and are not unique.

What is the reason for this? In common programming practice, IDs are usually meant to be unique; otherwise you cannot distinguish one record from another.

All Gramps trees default to the same ID pattern. With the pattern I%04d, every tree starts with Person I0001, so merging based on IDs (which might be duplicates that refer to totally different people) is not viable.

However, the Handle is generated in a way that it is very likely to be unique.
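To make the collision concrete, here is a small sketch. The pattern I%04d is the one quoted above; next_gramps_id is a hypothetical helper, and uuid4 is only a stand-in for a "very likely unique" generator, not how Gramps actually creates handles.

```python
import uuid

ID_PATTERN = "I%04d"  # default Person ID pattern quoted in the post above


def next_gramps_id(existing_count):
    """Hypothetical helper: format the next ID for a tree with N people."""
    return ID_PATTERN % (existing_count + 1)


# Two unrelated, empty trees both hand out the same first ID.
tree_a_first = next_gramps_id(0)
tree_b_first = next_gramps_id(0)

# Stand-in for handle generation (assumption: any random, wide identifier).
handle_a = uuid.uuid4().hex
handle_b = uuid.uuid4().hex

print(tree_a_first, tree_b_first)  # the same ID appears in both trees
print(handle_a != handle_b)        # the handles differ
```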

In practice, a Gramps ID is the same as a GEDCOM ID. At best, it is only unique within the dataset it belongs to.