Update the default ID Formats for 6.1?

The default index limits are too small for the most highly populated object types.

Please offer your opinion about changing the defaults for Events, Citations, People and Places to 5 digits instead of 4. (This would also underline that the ID Formats are not limited to all being the same number of digits.) Or maybe change all the ID Formats to 5 digits except for Repositories?

At 10,000 objects Gramps exhausts the Leading Zeros sorting capacity. Then you have to add to the ID Formats size preference and run the Reorder Gramps IDs tool.

In the Example.gramps sample tree, the 4 largest object categories are at 13% to 35% of capacity. And that is not a terribly large tree.

3,500 Events
2,900 Citations
2,200 People
1,300 Places
  800 Families
   30 Notes
   10 Media
   10 Sources
   10 Repositories
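
For anyone who wants to check those percentages, here is a minimal sketch using the rounded counts from the list above. The `capacity_used` helper is only for illustration and is not a Gramps function; it assumes a zero-padded format of N digits holds 10**N IDs before the width grows.

```python
# Rounded object counts taken from the Example.gramps list above.
counts = {
    "Events": 3500,
    "Citations": 2900,
    "People": 2200,
    "Places": 1300,
    "Families": 800,
}


def capacity_used(count, digits):
    """Fraction of a zero-padded ID space of `digits` digits that is in use.

    Illustrative helper, not part of Gramps. IDs 0 .. 10**digits - 1 keep
    their width, so the space is assumed to hold 10**digits values.
    """
    return count / 10 ** digits


for name, count in counts.items():
    print(f"{name:<10} {capacity_used(count, 4):6.1%} of 4 digits, "
          f"{capacity_used(count, 5):6.1%} of 5 digits")
```

With 5 digits, the same tree would sit at roughly a tenth of those percentages.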

It might also be worth considering whether to start automatic numbering at ‘1’, reserving ‘0’?

The logic being that we could use the zeroth record for special purposes. For instance, the zeroth person could inherit the Active and Home. (Assuming that a person was not specified as being the home person and no person is currently active.) Or maybe that person could be used for the Tree Owner.

And the Zeroth Note might be used as a To Do for the whole tree, or as a Note introducing the Tree.

The Zeroth Place could be the focus if this Tree is a “One-Place Study”. Or used as the Home Place reference.

The Zeroth Source or Repository might be the foundation for the tree.

I have no objections to changing the defaults if there is support for this.


I long ago gave up the notion that things begin at one (1). I gave the I000000 ID to the person record that I use to hold other objects (notes and media I use for the narrweb) so that they are never deleted as “unused”. It took editing the XML backup and changing its handle to all zeros (0). When I reorder the IDs, it stays at I000000 and the first person in the database gets I000001.


I think these kinds of hidden, special-meaning values make systems harder to use. We can use tags (for example, or something else) for things like what you suggest. That makes it more explicit, and easier to understand.


You’re right, Easter eggs generally do make software seem arbitrary.

But it seemed like changing the starting line might be a global way to make some of the idiosyncratic Dashboard Gramplets follow more of a pattern.

For instance, the To Do gramplet seems to default unpredictably. I’ve taken to letting it show me which Note it chooses to display in a Tree. Then I move the contents of that “To Do” to a new Note (including the references) and co-opt the original for the “whole tree” To Do note.

The record with ID zero may have existed at one point in time, but also may now have been deleted by the user. So we still need to handle all scenarios in the code.
As a user I’d be upset if I changed my home person and the ID suddenly changes to zero. There’s also the problem of what to change the ID of the previous home person to - since it can’t be zero anymore.

Don’t take this negatively, it’s always good to explore alternative solutions.


I have grown to resent the way Tags are implemented.

One of the good uses for them is worklists or as sorting/colorcoding metadata. But using them that way revises the “Last Modified” metadata for a record. Which is bad.

(It is worse with the tag-on-import preference, since that obscures all the history of which records have lain fallow the longest. Normally, the import preserves the “last modified” data… but not when the tagging preference is enabled.)

It would be better if Tags pointed at objects rather than objects pointing at Tags.

So glad when others share this way of thinking.

And I do not mind when my proposals have holes shot in them. It is so much better to do that early… rather than discovering (after huge effort) that the work was for naught.

I’ve no objection - I increased the formats to 5 digits for Person and Event in my own tree.

If we go ahead, there’s quite a bit of work to do to “fix” the test suite, which currently assumes 4 digits for all Gramps IDs :frowning:


I have always set it to 6 digits, and for my test with the Norwegian 1910 census import I set it to 8 digits, just to be sure…
And I set all of them to the same number of digits regardless…

As long as they can be changed, I think 5 is a good middle way; most users end up with 10-15K or more entities in their database after some research and a few GEDCOMs…

If the system of Gramps IDs were to be reworked, I would be in favor of calling them “labels” instead, since they are not necessarily unique (which is what “ID” implies to me at least), and can be overridden with descriptions (I mean, labels) like “Grandma Jo” or “Cousin Max”.

But anyway, users can already just change the format from “I%04d” to “I%05d” to get 5 digits.

Personally I hope I never get to the point where I need 5 digits; I have enough to do already! :slightly_smiling_face:

Why bother about IDs?
My understanding is that IDs are remnants of GEDCOM.
IDs are not guaranteed to be unique, and they are not used as primary keys in Gramps; the primary keys are called handles.
IDs can be edited, they are just text strings and you can write anything you like - I have done this a couple of times, when I thought I was writing in another field. As strings you get funny sorting - e.g. ‘P12345’ is sorted before ‘P2003’. Yes, in my settings I have ‘P%04d’ for place ID, but that doesn’t limit the max ID to ‘P9999’.
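
To make the “funny sorting” concrete, here is a small sketch with made-up Place IDs. It only shows plain Python string comparison; how Gramps itself sorts its ID columns is not shown here.

```python
# Made-up Place IDs: three fit the 'P%04d' width, one has outgrown it.
ids = ["P2003", "P12345", "P0007", "P9999"]

# Plain string sort: compared character by character, so "1" < "2"
# puts 'P12345' ahead of 'P2003'.
as_text = sorted(ids)
print(as_text)      # ['P0007', 'P12345', 'P2003', 'P9999']

# Sorting on the numeric part gives the order a person would expect.
as_numbers = sorted(ids, key=lambda gramps_id: int(gramps_id[1:]))
print(as_numbers)   # ['P0007', 'P2003', 'P9999', 'P12345']
```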
So from my point of view, we could just drop the IDs, or as @GeorgeWilmes suggests rename ID to Label.


The automatic ID generation runs out of leading zeros. And that screws up the sorting.

However, you can increase the P%04d to P%05d and then run the Reorder Gramps IDs tool. After that, Gramps will be able to correctly sort up to 100,000 Places. (Or, if you were talking about Individuals/Persons, the ID would be I%05d )
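
As a rough illustration of why widening the format and re-padding restores the sort order, here is a sketch. The `repad` helper is hypothetical and much simpler than the Reorder Gramps IDs tool, whose actual implementation is not reproduced here.

```python
import re


def repad(gramps_id, new_format):
    """Re-render the trailing number of an ID with a wider format.

    Hypothetical helper for illustration only. It keeps the number and
    takes the prefix from `new_format`, ignoring the old prefix.
    """
    match = re.search(r"(\d+)$", gramps_id)
    if match is None:
        return gramps_id  # hand-typed label with no trailing number
    return new_format % int(match.group(1))


old_ids = ["P0007", "P2003", "P9999", "P12345"]
new_ids = [repad(gramps_id, "P%05d") for gramps_id in old_ids]

print(new_ids)                     # ['P00007', 'P02003', 'P09999', 'P12345']
print(sorted(new_ids) == new_ids)  # True: text order now matches numeric order
```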

The IDs are not an artifact of GEDCOM. They are a ‘convenience system’. Which is to say that they are a more human-friendly identifier than the hexadecimal internal Handles. Plus, they are shorter… so they consume less screen space or space in Charts blocks.

e.g., in the example.gramps sample tree:
The Home Person is ID I0044 which stands for handle GNUJQCL9MD64AM56OH


So the handles are also strings, how sad and inefficient. An unsigned 32-bit integer would be much better (it represents more than 4 billion values) and the indexes would be much smaller.


This illustrates @emyoulation’s reasoning to increase the default size from 4 to 5 digits.

It seems like this could be solved in a different way. The database knows what the last assigned ID was for each type of data (Person, Family, etc). Couldn’t we make a dynamic system that automatically expands to whatever size is necessary?

So, when an ID reaches 9999, Gramps would change the setting to %05d and then pad out existing IDs with a leading zero (0). In an ideal world! It would probably be easier to send up an alert that the user should take the appropriate action. After all, they could choose to let it ride.

Exactly! Here is Claude Code’s plan (I’d put this in GitHub discussions but it was never enabled):

Gramps ID Overflow — Investigation & Plan

Background

Each Gramps object has an ID like "P00023". The number of digits and leading zeros
is controlled by a config setting (e.g., preferences.iprefix = "I%04d"). When a
database grows large enough that the numeric portion overflows the format width (e.g.,
"I%04d" % 10000 silently produces "I10000"), the IDs go out of order and the user
must run Reorder Gramps IDs to fix them.

The goal is to detect this overflow and show a non-disruptive warning with a link to
the Reorder IDs dialog — without interrupting imports or data entry.


Current State (from code investigation)

ID format config keys

Object      Config key           Default  Index field
Person      preferences.iprefix  I%04d    pmap_index
Place       preferences.pprefix  P%04d    lmap_index
Event       preferences.eprefix  E%04d    emap_index
Media       preferences.oprefix  O%04d    omap_index
Citation    preferences.cprefix  C%04d    cmap_index
Source      preferences.sprefix  S%04d    smap_index
Family      preferences.fprefix  F%04d    fmap_index
Repository  preferences.rprefix  R%04d    rmap_index
Note        preferences.nprefix  N%04d    nmap_index

Key files

  • ID generation: gramps/gen/db/generic.py:1194, _find_next_gramps_id()
  • Format validation: gramps/gen/db/generic.py:1027, _validated_id_prefix()
  • Format→closure: gramps/gen/db/generic.py:1042, __id2user_format()
    • Regex: r"(.*)%[0 ](\d+)[diu]$" — only matches zero-padded formats
    • If no width specifier, closure just passes ID through unchanged
  • Reorder tool: gramps/plugins/tool/reorderids.py
    • Regex: r"(^[^\d]*)%(0[3-9])d([^\d]*$)" — requires 3–9 digit zero-padded format
    • Already detects overflow internally (index > index_max) but never surfaces it
    • There is commented-out overflow detection code at generic.py:1057–1065
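
To see what the two regexes quoted above accept, here is a small sketch that runs them against a few sample formats. The patterns are copied from the list; calling them with `match()` is an assumption, since the plan does not say how Gramps applies them.

```python
import re

# Patterns copied verbatim from the plan above.
ID2USER_RE = re.compile(r"(.*)%[0 ](\d+)[diu]$")
REORDER_RE = re.compile(r"(^[^\d]*)%(0[3-9])d([^\d]*$)")


def groups_or_none(pattern, fmt):
    """Return the captured groups, or None when the format is rejected."""
    match = pattern.match(fmt)
    return match.groups() if match else None


for fmt in ["I%04d", "FAM-%04d", "P%d", "I%02d", "I%010d"]:
    print(f"{fmt:<10} id2user={groups_or_none(ID2USER_RE, fmt)} "
          f"reorder={groups_or_none(REORDER_RE, fmt)}")
```

Two things stand out when this is run: the `[0 ]` class in the first pattern also admits a space as the pad character, and the second pattern rejects widths outside 3 to 9, so a format such as "I%010d" passes the first check but not the second.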

Current overflow behavior

_find_next_gramps_id() uses Python’s % operator which silently expands past the
format width: "I%04d" % 10000 → "I10000". No warning is emitted. This happens
identically whether the record was created via the UI, an import, or any other path.
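
The silent expansion is standard Python behavior: the width in a %-format is a minimum, not a limit. A minimal sketch, independent of any Gramps code:

```python
fmt = "I%04d"

# The width in "%04d" is a minimum: shorter numbers are zero-padded,
# longer numbers are printed in full with no error or warning.
rendered = {n: fmt % n for n in (44, 9999, 10000)}

for number, gramps_id in rendered.items():
    print(number, "->", gramps_id, f"({len(gramps_id)} characters)")
```

Here 44 and 9999 both render as five characters ("I0044", "I9999"), while 10000 renders as six ("I10000").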


Plan

Constraint: never interrupt an import or data entry

The approach is:

  1. Compute overflow on the fly from data already in memory — no metadata storage needed.
  2. Show a non-modal warning bar in the main window, with a button to open Reorder IDs.

Step 1 — Overflow check helper on the db class

File: gramps/gen/db/generic.py

Add a method get_id_overflow_types() that iterates the 9 (prefix_fmt, map_index)
pairs and returns the set of object type names where the index has exceeded the format’s
maximum value.

The check for each object type is simply:

# e.g. for "I%04d", width=4, max=9999
if map_index > 10 ** width - 1:
    overflowed.add(object_name)

Only applies when the format contains a fixed-width specifier (%0Nd). If the format
has no width (e.g., "P%d"), skip it — there is no fixed maximum to overflow.

No writes to the database, no metadata flags, no startup scan. The result is always
computed fresh from the current in-memory state.
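
A minimal standalone sketch of that helper, under stated assumptions: it takes the formats and indexes as plain dictionaries instead of reading them from the db object, and it reuses the regex shape quoted earlier to extract the width. The real method would live on the db class and use its own attributes.

```python
import re

# Same shape as the __id2user_format regex quoted above: prefix, pad, width.
WIDTH_RE = re.compile(r"(.*)%[0 ](\d+)[diu]$")


def get_id_overflow_types(formats, indexes):
    """Return the object type names whose index exceeds the format width.

    Standalone sketch. `formats` maps a type name to its prefix format and
    `indexes` maps the same name to its current map index; both arguments
    are assumptions made so the example runs without a Gramps database.
    """
    overflowed = set()
    for name, fmt in formats.items():
        match = WIDTH_RE.match(fmt)
        if match is None:
            continue  # no fixed width (e.g. "P%d"): nothing to overflow
        width = int(match.group(2))
        if indexes.get(name, 0) > 10 ** width - 1:
            overflowed.add(name)
    return overflowed


formats = {"Person": "I%04d", "Place": "P%d", "Event": "E%05d"}
indexes = {"Person": 10234, "Place": 50000, "Event": 20000}
print(get_id_overflow_types(formats, indexes))  # {'Person'}
```

Place is skipped because "P%d" has no width, and Event still fits in 5 digits, so only Person is reported.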


Step 2 — Warning bar in the main window

File: gramps/gui/mainwin.py (or equivalent)

Connect to the database-changed signal. After the database loads, call
db.get_id_overflow_types(). If non-empty, show a non-modal Gtk.InfoBar at the
top of the content area:

Some Gramps IDs have exceeded their format width (Persons, Families).
[Reorder Gramps IDs]

The button fires the existing Reorder IDs tool directly. The bar is dismissible and
recomputed on every database-changed event — so it disappears automatically after a
successful reorder without any explicit “clear” call.

Because the check is based on the current map_index values (which are updated on
every write), this works for overflow caused by normal data entry, imports, or any
other source.


Step 3 (future / optional) — Dynamic format widening

Longer term: when overflow is detected, automatically promote the format from %04d
to %05d (or wider) and update the config. This changes user-visible behavior and
affects exports and merges, so it warrants a separate discussion and should not be
bundled with the warning feature.
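
If that discussion ever happens, the format promotion itself could be as small as the sketch below. `widen_format` is a hypothetical name, not an existing Gramps function, and the sketch deliberately leaves out updating the config and re-padding existing IDs.

```python
import re


def widen_format(fmt, index):
    """Return `fmt` with its zero-padded width raised just enough for `index`.

    Hypothetical helper. Formats without a zero-padded width (e.g. "P%d")
    are returned unchanged, as are formats that are already wide enough.
    """
    match = re.match(r"(.*)%0(\d+)d$", fmt)
    if match is None:
        return fmt
    prefix, width = match.group(1), int(match.group(2))
    needed = len(str(index))
    if needed <= width:
        return fmt
    return f"{prefix}%0{needed}d"


print(widen_format("I%04d", 9999))       # I%04d (still fits)
print(widen_format("I%04d", 10000))      # I%05d
print(widen_format("FAM-%04d", 123456))  # FAM-%06d
print(widen_format("P%d", 10 ** 6))      # P%d (no width to widen)
```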


Edge cases

Custom prefix formats

  • Custom prefix, standard width — e.g., "FAM-%04d" → fully supported; overflow
    check works the same way on the numeric portion.
  • No width specifier — e.g., "P%d". The ID grows unboundedly by design.
    __id2user_format’s regex won’t match, and the Reorder tool falls back to defaults.
    No overflow concept applies; skip these object types in the check.
  • Manually entered IDs — e.g., user types "Smith-001" directly. These are stored
    as-is; _find_next_gramps_id is not involved. Out of scope for this feature.

Warning precision

When the format has a known width, the warning message should be specific:

Person IDs have exceeded the 4-digit format (I%04d). Run Reorder Gramps IDs to fix.
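
One way to build that message from the format string is sketched below. The `overflow_message` function is an illustrative name, and in Gramps the text would also need to go through the translation machinery, which is omitted here.

```python
import re


def overflow_message(type_name, fmt):
    """Build the specific warning text for one object type.

    Illustrative helper. Returns None when the format has no fixed width,
    since no overflow concept applies in that case.
    """
    match = re.match(r"(.*)%[0 ](\d+)[diu]$", fmt)
    if match is None:
        return None
    width = int(match.group(2))
    return (f"{type_name} IDs have exceeded the {width}-digit format ({fmt}). "
            "Run Reorder Gramps IDs to fix.")


print(overflow_message("Person", "I%04d"))
print(overflow_message("Place", "P%d"))  # None: no width, nothing to report
```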


Files to touch

File                                   Change
gramps/gen/db/generic.py               Add get_id_overflow_types() helper method
gramps/gui/mainwin.py (or equivalent)  Show Gtk.InfoBar warning with Reorder IDs button when overflow detected

For those audience members wondering: “How does the AI figure all of this out?” I will say that it took me a few minutes to write a very specific prompt outlining exactly what it needs to pay attention to, followed by some additional revisions to its plan. You still have to have some developer skills to get this level of behavior from any AI.