Understanding Gramps dates

(AIO64-5.2.3-r1-aa03f5a under Windows 10)

I am writing a parser for the Gramps XML format and I had a hard time wrapping my head around the various date types Gramps uses. According to my understanding and to this Gramps wiki page, there are 4 different date types, encoded in the exported XML:

  • daterange: for an event that happened within a range;
  • datespan: for an event that existed within a range;
  • dateval: for an event that happened at a rather specific point in time;
  • datestr: for purely textual dates, uncomputable;
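
For anyone writing a similar parser, here is a minimal Python sketch of dispatching on these four elements. The attribute names (`val`, `start`, `stop`, `type`, `quality`) are assumptions based on my recollection of the Gramps XML DTD and should be checked against the DTD of the version that produced the export. `GrampsDate`, `parse_date`, and the `<dates>` wrapper in the sample are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional

DATE_TAGS = {"daterange", "datespan", "dateval", "datestr"}


@dataclass
class GrampsDate:
    kind: str                       # one of DATE_TAGS
    start: Optional[str] = None     # single value for dateval, lower bound otherwise
    stop: Optional[str] = None      # upper bound (daterange / datespan only)
    modifier: Optional[str] = None  # assumed "type" attribute: before, after, about, ...
    quality: Optional[str] = None   # assumed "quality" attribute, if present
    text: Optional[str] = None      # free text (datestr only)


def local_name(tag: str) -> str:
    """Drop a '{namespace}' prefix, since real exports are namespaced."""
    return tag.rsplit("}", 1)[-1]


def parse_date(elem: ET.Element) -> Optional[GrampsDate]:
    """Turn one date element into a GrampsDate, or None for other elements."""
    kind = local_name(elem.tag)
    if kind not in DATE_TAGS:
        return None
    if kind == "datestr":
        return GrampsDate(kind, text=elem.get("val"))
    if kind == "dateval":
        return GrampsDate(kind, start=elem.get("val"),
                          modifier=elem.get("type"),
                          quality=elem.get("quality"))
    # daterange and datespan share the same two-boundary shape
    return GrampsDate(kind, start=elem.get("start"), stop=elem.get("stop"),
                      quality=elem.get("quality"))


# Made-up fragment, not taken from a real export.
SAMPLE = """<dates>
  <dateval val="1850-03-12" type="about"/>
  <datespan start="1939" stop="1940"/>
  <datestr val="during the great flood"/>
</dates>"""

dates = [parse_date(child) for child in ET.fromstring(SAMPLE)]
```

Note that daterange and datespan end up with the same fields here; only `kind` tells them apart, which is the point raised further down in this thread.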

Now, the standard date (dateval) can have specifiers:

  • from, to: Span a range with an indefinite upper or lower boundary. So they refer to an event that happened (or spanned a range) from a point in time onward, or up to a point in time.
  • before, after, about: By default span a range of 50 years before (-), after (+) or around (±) a given point date (the range can be set in preferences).
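
To make my reading of these specifiers concrete, here is a small Python sketch that turns a year plus a specifier into a pair of bounds. It only encodes the interpretation described above (a 50-year default window, open-ended from/to); it is not Gramps's actual comparison logic, and `year_bounds` and `DEFAULT_WINDOW_YEARS` are hypothetical names.

```python
from typing import Optional, Tuple

DEFAULT_WINDOW_YEARS = 50  # assumed default, as described above; configurable in Gramps


def year_bounds(year: int, modifier: Optional[str],
                window: int = DEFAULT_WINDOW_YEARS
                ) -> Tuple[Optional[int], Optional[int]]:
    """Return (earliest, latest) year; None means the side is open-ended."""
    if modifier is None:
        return (year, year)                    # plain point date
    if modifier == "before":
        return (year - window, year)
    if modifier == "after":
        return (year, year + window)
    if modifier == "about":
        return (year - window, year + window)
    if modifier == "from":
        return (year, None)                    # no upper boundary
    if modifier == "to":
        return (None, year)                    # no lower boundary
    raise ValueError(f"unknown modifier: {modifier!r}")
```

With these assumptions, `year_bounds(1900, "about")` gives `(1850, 1950)` and `year_bounds(1900, "to")` gives `(None, 1900)`.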

I find this scheme confusing, for the following reason:

I assume that a standard dateval encodes a specific point in time (e.g. a birth event). Now, only dateval has the from, before, etc. specifiers. This suggests that these specifiers apply only to events that happened at a specific point in time, not to events that span a range. Moreover, these specifiers are meant to indicate a level of uncertainty. The fact that they are not available for daterange and datespan therefore suggests that such dates are known exactly (i.e. there is no uncertainty). Thus, while a from or after encodes uncertainty for a dateval, a datespan cannot encode uncertainty at all. Here is an example to illustrate my problem: how would one encode a continuous event with a spanning date (e.g. X living at Y) whose actual range we do not know, only its upper and lower boundaries? Storing a simple datespan does not tell the user about any uncertainty in the boundary dates.

Of course, we can always add a note for clarity, but the issue at hand is conceptual, and requires some clarification. In my understanding, it stems from a conflation of two concepts:

  1. whether an event is a point event in time (e.g. birth, death, coded by dateval, but also by daterange) or a continuous event spanning a range (e.g. living somewhere, having a job, etc., coded by datespan). This is about the real property of an event, independent of whether we know its correct date or not.
  2. whether we know when exactly the event happened in time or there is uncertainty.
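
One way to picture keeping these two concepts apart, at least in a parser's own data model, is sketched below in Python. This is not how Gramps stores dates; `Extent`, `Certainty`, `Boundary` and `EventDate` are hypothetical names, and the certainty levels are simply borrowed from the dateval specifiers.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Extent(Enum):
    """Concept 1: what kind of event this is, regardless of what we know."""
    POINT = "point"            # birth, death, ...
    CONTINUOUS = "continuous"  # residence, occupation, ...


class Certainty(Enum):
    """Concept 2: how well a given boundary is known."""
    EXACT = "exact"
    ABOUT = "about"
    BEFORE = "before"
    AFTER = "after"


@dataclass(frozen=True)
class Boundary:
    value: str                             # e.g. "1939" or "1901-02-03"
    certainty: Certainty = Certainty.EXACT


@dataclass(frozen=True)
class EventDate:
    extent: Extent
    start: Boundary
    stop: Optional[Boundary] = None        # None for a single known point

    def is_uncertain(self) -> bool:
        """True if any boundary that is present is not known exactly."""
        return any(b is not None and b.certainty is not Certainty.EXACT
                   for b in (self.start, self.stop))


# "X living at Y": a continuous event of which only outer limits are known.
residence = EventDate(Extent.CONTINUOUS,
                      Boundary("1939", Certainty.AFTER),
                      Boundary("1945", Certainty.BEFORE))
# A birth with a fully known date.
birth = EventDate(Extent.POINT, Boundary("1901-02-03"))
```

Here uncertainty attaches to each boundary rather than to the date type, so a continuous event can carry it just as a point event can.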

Can anyone please clarify the issue and point out if I am mistaken in my interpretation? Thanks in advance!

Read the date section in the Gedcom 7.0 specification for details.

Indicating uncertainty in a datespan is not well supported in Gramps.

So I take this as confirmation of my assessment above, and of the fact that this issue was already known to most of you. Is GEDCOM 7 support in Gramps coming anytime soon?

I was also puzzled by the ambiguity about dates. Depending on the record, they designate either an “instant” event or the duration of a condition. And these two concepts are encoded in exactly the same way, using two elementary data elements whose significance changes depending on whether you target an “instant” or a duration.

I went through some thinking and made notes for myself. I attach this document. Read the chapter about dates. I’d be glad to hear your opinion.

GrampsIdeas-5x-en-0.pdf (267.9 KB)

Hi

Dates are a real minefield in GRAMPS. If you must have GEDCOM support, then you will have to use (or wait for) GRAMPS as it develops; if, like me, you are prepared to face the consequences of going your own way, then do so.
My particular issue is that “between 1939 and 1940” takes up far too much screen real estate in a graphical view (so I use 1939 >< 1940).
The other issue, much discussed previously, is the semantic difference between range and span (a term I personally would never use with regard to a date).
Last point: I also have the advantage that I consider all dates subject to change (approx 1940 might become 3rd Qtr 1939, which might become 25 Sep 1939), so I do not use the certainty/confidence term with regard to dates. Instead I use Tags, for either an individual or an event, to express a level of certainty, which is much more visual.
phil

Thanks @comeng, your comment pushed me in a direction where I am not looking for a perfect solution but for one that fits my needs.

I have nothing against range and span semantically (one encodes a pointlike event, the other a continuous one). What I object to is the fact that point events can be encoded with uncertainty but continuous events cannot.

I also agree with your assessment that dates are fluid and subject to change. On this note, I’d cautiously go one step further: all input data in Gramps (or in any genealogy software, for that matter) should be considered to have some level of uncertainty. I do not know why only certain data (like dates) have the means to store uncertainty in Gramps, but I am intrigued by these choices.

Thanks for the document, I will scan through it.
What is the purpose of this pdf though? Will any of the suggested changes be implemented in the next version? Or is this for public debate? Can we suggest solutions to particular problems (like for dates)?

Since I only sporadically poke into Gramps code, this document is first of all a record of my own reflections. I have started to switch to a full SQL data description (in a somewhat different direction from the recently described one using JSON). While at it, I try to “sanitize” the various records. Dates are one of those where I’d like to have a more regular implementation covering all “numeric” cases (with text representation as a fallback for corner cases like “simultaneous with the battle of the Five Armies”).

Unfortunately, I have been very busy with other projects and I haven’t yet implemented my ideas about dates (the SQL version of the other records works well, though it is a bit messy because I want to keep maximum reuse of Gramps code).

If you feel some of my ideas make sense, I am ready to debate.

EDIT:
Forgot to mention that my next target will be Names. I feel the current implementation is a bit “mixed”, hesitating between a detailed “tagged” description (like family name decomposition) and global grouping (all given names are grouped together in a single string). Also, its underlying model is too Western-European-oriented.

My preliminary design would be to simplify names to an array of components, each made of a “name” (given name, family name, nickname, …), a connector, a prefix, a suffix, a tag (telling whether it is a given name or a family name and, why not, whether it is inherited from the mother or father, so that cultural conventions, e.g. Spanish or Scandinavian, can be managed automatically, …) and an origin.
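
As a rough illustration of such an array of components, here is a Python sketch. The field names follow the list above, but their exact meaning is an assumption: in particular, `connector` is taken to be a word joining a component to the previous one, and the `origin` values are invented. `NameComponent` and `display_name` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NameComponent:
    name: str            # the name part itself
    tag: str             # e.g. "given", "family", "nickname"
    connector: str = ""  # assumed: word linking this part to the previous one
    prefix: str = ""
    suffix: str = ""
    origin: str = ""     # e.g. "patrilineal", "matrilineal" (invented values)


def display_name(components: List[NameComponent]) -> str:
    """Join the components in order, skipping empty pieces."""
    parts: List[str] = []
    for c in components:
        parts.extend(p for p in (c.connector, c.prefix, c.name, c.suffix) if p)
    return " ".join(parts)


# A Spanish-style name with two family names joined by "y".
example = [
    NameComponent("Pablo", tag="given"),
    NameComponent("Ruiz", tag="family", origin="patrilineal"),
    NameComponent("Picasso", tag="family", connector="y", origin="matrilineal"),
]
```

Here `display_name(example)` gives "Pablo Ruiz y Picasso", and filtering on `tag` or `origin` is what would let a cultural convention pick out the parts it needs.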

For me, if you are uncertain or lacking confidence about anything with regard to an individual, you have to be upfront about it (on display immediately, whatever the View), hence my use of Tags.
I operate with 4 Tags, somewhat based on the Scottish legal system:

  • Qual. Pure Speculation (Typically information found in another’s tree or literature)
  • Qual. Not Proven (Typically one or two pieces of information with little supporting evidence)
  • Qual. Balance of Probabilities (Typically multiple pieces of information with some supporting evidence)
  • Qual. Beyond Reasonable Doubt (Typically the same as Balance of Probabilities, but with added personal knowledge of individuals or events, DNA matches and/or other solid evidence)

But I suspect there are as many ways of dealing with this as there are GRAMPS users.
phil

Right!

I settled on a somewhat similar strategy, and I assigned a colour to each tag. As a result, each record is displayed in the colour of its first tag in lists (Persons, Families, Events, …). Since several tags can be assigned to a record, careful ordering is needed to get a colour relevant to the intended goal of drawing attention to problematic records.

  • a confident and complete record has no tag. Therefore it displays black (standard display)
  • a family or person record without children is tagged “no posterity” and displays gray (indicating end of lineage)
  • “skeleton” records (just created as placeholders) are tagged “new” and display red
  • I have various other tags for progressive improvement in accuracy. A special tag is used for lack of documents, such as archive destruction, and for legal constraints, like privacy protection for recent archives and living people.
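
The "first tag wins" behaviour described above can be sketched in a few lines of Python. The tag names and colours come from the list above, but putting "new" ahead of "no posterity" is only an assumption for the example. `TAG_PRIORITY` and `display_colour` are hypothetical names, not Gramps API.

```python
from typing import Iterable, List, Tuple

# Highest priority first. The relative order is an assumption for illustration.
TAG_PRIORITY: List[Tuple[str, str]] = [
    ("new", "red"),            # skeleton / placeholder records
    ("no posterity", "gray"),  # end of lineage
]


def display_colour(record_tags: Iterable[str], default: str = "black") -> str:
    """Return the colour of the highest-priority tag carried by the record."""
    tags = set(record_tags)
    for tag, colour in TAG_PRIORITY:
        if tag in tags:
            return colour
    return default  # untagged: confident and complete record
```

With this ordering, a record tagged both "no posterity" and "new" displays red, which is why the order of the tags matters as much as the tags themselves.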

I find the tag colour feature particularly nice to give a quick “intuitive” visual clue about the current state of the research.