I am exploring an alternative way of storing Gramps data into an SQL database (a replacement for the present BSDDB emulation over a relational DB).
Dates are complex objects. They are described by their own JSON schema. Presently this schema is imported into various “primary” records and dates are “local” to the record (included inside it).
However, due to the complexity of the date structure, it is not practical to duplicate the sub-structure into “primary” TABLEs, leading to an excessive number of columns, not speaking of the needed column name disambiguation when several dates coexist in a record. It would be more practical to isolate dates into their own TABLE and reference them through a “handle”.
But this immediately raises issues.
- Ideally, date records should be unique
How is this uniqueness defined? The Julian Date Number (JDN) is not enough because a date can be expressed in various calendars and calendar is part of date semantics. - If dates are shared, what happens when an event requires a modification?
The modification is valid only for this event and must not change others references to this date. Consequently, a modification should imply the creation of a new date record. - What to do when a date is created?
To maintain uniqueness, the TABLE should be queried to see if an existing date record is identical. Would this query have a negative impact on performance? - How can we avoid “ghost dates”, i.e. dates no longer referenced by another record?
A common precaution is to associate a reference counter. When the counter goes down to zero, the record can be deleted. Would this be enough?
I’m looking for implementation ideas.
EDIT
I didn’t mention the difficulty stemming from “incomplete” dates where a JDN can’t be computed, like Spring 1750 or 5th year of king Ottokar which end up as text. Therefore the access key to a date record must handle both cases (perhaps converting JDN to text).
EDIT 2022-05-20
Experimenting with a TABLE representation for dates (automatically generated from the JSON schema), I changed Date objects to Date(PrimaryObject). This allows dates to be potentially tagged or flagged as private (independent from their reference in address, citation, event, ldsord, media, name, placename and placeref records.
Does it make sense to attach tags or mark a date as private on its own?
If no, I can configure as Date(TableObj)) to eliminate privacy and flags.