Representing uncertain source assertions and identities

I mainly use Gramps Web, but this seems to me to be fundamentally a Gramps data model / evidence modeling question rather than only a UI question.

While reading the earlier discussions around assertions and personas (this post), I realized that several situations I encounter during genealogy research seem closely related to those ideas.

One issue I run into is that many sources do not directly establish facts, but instead contain partial, ambiguous, or indirect evidence that I would like to preserve separately from my final conclusions.

Examples include:

  • census ages implying approximate birth years,
  • uncertain handwriting/transcriptions,
  • variant spellings of names,
  • tentative identification of persons,
  • inferred residences or relationships.

Currently I mostly store these things in citation notes, but that has several limitations:

  • difficult to compare assertions across citations,
  • difficult to revise interpretations later,
  • not queryable,
  • and the distinction between observation, interpretation, and conclusion becomes blurred.

One workflow I encounter occasionally:

A record contains a difficult-to-read name. Initially I may only be comfortable transcribing it as something like:

“Niels Jenss.”

At that stage I do not yet know exactly which Person object this refers to.

Later, after finding additional records, I may conclude that this is actually “Niels Jensen” and link the assertion to a specific person.

What I would ideally like to preserve is:

  • the original transcription/observation,
  • my interpreted reading,
  • the confidence level,
  • and the eventual person linkage,
    without losing the uncertainty or research history.
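
A minimal sketch of such a record, in plain Python rather than any existing Gramps API. The class names, the confidence scale, and the identifiers `C0001` and `I0042` are all hypothetical; the point is only that readings are appended instead of overwritten, so the research history survives the final linkage.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Reading:
    """One interpretation of the transcribed text, kept even when superseded."""
    text: str
    confidence: str  # hypothetical scale: "low", "medium", "high"
    reason: str


@dataclass
class NameAssertion:
    """A name as observed in one citation, kept apart from the conclusion."""
    citation_id: str                 # hypothetical identifier
    transcription: str               # what is literally legible in the record
    readings: list[Reading] = field(default_factory=list)
    person_id: Optional[str] = None  # stays None until linked to a person

    def interpret(self, text: str, confidence: str, reason: str) -> None:
        # Append rather than overwrite, so earlier readings remain visible.
        self.readings.append(Reading(text, confidence, reason))

    @property
    def current_reading(self) -> Optional[Reading]:
        return self.readings[-1] if self.readings else None


a = NameAssertion(citation_id="C0001", transcription="Niels Jenss.")
a.interpret("Niels Jens[en?]", "low", "abbreviated surname, ending unclear")
a.interpret("Niels Jensen", "high", "matches later records")
a.person_id = "I0042"  # linked only once the identification is settled
```

After these steps the original transcription, both readings, and the person link are all still present on the same object.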

Similarly for dates:

  • a census age may imply birth year 1837/1838,
  • a marriage record may imply 1838/1839,
  • a death record may state 1836,
  • while a baptism gives a direct date.

I would like all of these source-derived assertions to remain visible and structured, while still allowing one preferred genealogical conclusion.
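
The arithmetic behind this can be sketched in plain Python (again not Gramps code). The record years, the ages, and the baptism year below are invented for illustration and chosen only so that they reproduce the ranges listed above; the function and class names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BirthYearAssertion:
    source: str
    earliest: int
    latest: int
    direct: bool = False  # True when the source states the birth itself


def from_age(source: str, record_year: int, age: int) -> BirthYearAssertion:
    # Someone aged `age` on some day in `record_year` was born either in
    # record_year - age - 1 (birthday still to come that year) or in
    # record_year - age (birthday already passed).
    return BirthYearAssertion(source, record_year - age - 1, record_year - age)


def overlap(assertions: list[BirthYearAssertion]) -> Optional[tuple[int, int]]:
    """Years compatible with every assertion, or None if they conflict."""
    earliest = max(a.earliest for a in assertions)
    latest = min(a.latest for a in assertions)
    return (earliest, latest) if earliest <= latest else None


def disagreeing(assertions: list[BirthYearAssertion], year: int) -> list[str]:
    """Sources whose implied range excludes the preferred year."""
    return [a.source for a in assertions if not a.earliest <= year <= a.latest]


assertions = [
    from_age("census", 1860, 22),                     # implies 1837/1838
    from_age("marriage", 1865, 26),                   # implies 1838/1839
    BirthYearAssertion("death record", 1836, 1836),   # states 1836
    BirthYearAssertion("baptism", 1838, 1838, direct=True),
]

print(overlap(assertions[:2]))        # census and marriage agree on 1838
print(overlap(assertions))            # None: the death record conflicts
print(disagreeing(assertions, 1838))  # ['death record']
```

With a structure like this, the preferred conclusion (here 1838) can be stored once, while each source's own claim, including the one that contradicts it, stays visible.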

One thing that especially interests me in the earlier “persona” discussions is the idea of representing source-level or provisional entities before linking them to a final genealogical conclusion.

Sometimes a record clearly refers to a person or event, but I am not yet confident enough to connect it to an existing Person or Event object in my tree.

In those situations, it would be useful to preserve:

  • the source-specific representation,
  • the associated assertions,
  • and the uncertainty,
    before deciding whether it matches an existing person/event or represents a different one.
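
A possible shape for such a provisional entity, sketched in plain Python under the assumption that a persona belongs to exactly one citation and is later either matched to a person or declared distinct. Nothing here is an existing Gramps object; the names, fields, and IDs are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Resolution(Enum):
    UNRESOLVED = "unresolved"  # not yet compared with the tree
    MATCHED = "matched"        # identified with an existing person
    DISTINCT = "distinct"      # concluded to be somebody else


@dataclass
class Persona:
    """A person as described by one source, before any conclusion."""
    citation_id: str
    label: str
    assertions: dict[str, str] = field(default_factory=dict)
    resolution: Resolution = Resolution.UNRESOLVED
    person_id: Optional[str] = None
    note: str = ""

    def match(self, person_id: str, note: str) -> None:
        self.resolution = Resolution.MATCHED
        self.person_id = person_id
        self.note = note

    def mark_distinct(self, note: str) -> None:
        self.resolution = Resolution.DISTINCT
        self.person_id = None
        self.note = note


def unresolved(personas: list[Persona]) -> list[Persona]:
    """The personas that still need a decision."""
    return [p for p in personas if p.resolution is Resolution.UNRESOLVED]


p = Persona("C0001", "Niels Jenss.", {"role": "father of the bride"})
todo_before = unresolved([p])
p.match("I0042", "same farm and same wife as in a later census")
todo_after = unresolved([p])
```

The persona keeps its own label and assertions after matching, so the decision can be reviewed or undone without re-reading the source.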

I am curious:

  • do others encounter similar workflows?
  • are there existing Gramps practices/plugins for this?

Your post raises many issues, and they are ones that everyone meets sooner or later.

Before showing how I handle them, let me mention I use the “standard” Gramps, not Gramps Web.

Names

Gramps lets you assign several names to a person. I therefore cope with the clerk’s “literacy” by storing the spelling found in the document as one name, with a citation.

When I have trouble reading a name (poor handwriting, a faded or damaged document, …), I synthesise whatever I can and usually add a question mark (to draw my attention in lists).

If it appears later that this person is the same as one already in my tree, I transfer secondary objects (events, notes, …) to the other person and delete the record.

One thing I handle with difficulty is homonymy. During some periods, christening rules easily produced people with the same name (same given name, same surname). If the age difference was relatively small, it is extremely difficult to attribute a later event to the correct person.

Since I always make a transcript of records in a Note (needs less disk space than an image), I add a line with my doubts. This line is prefixed with a conventional “sentinel” I can query for in the Notes list.
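
The same idea outside the Gramps interface, as a small plain-Python sketch: it assumes the note texts have been exported into a dictionary keyed by note ID, and it uses `DOUBT:` as the sentinel. Both the marker and the sample notes are invented for illustration.

```python
SENTINEL = "DOUBT:"  # hypothetical marker; any rare, fixed prefix works


def find_doubts(notes: dict[str, str]) -> list[tuple[str, str]]:
    """Return (note_id, doubt_text) for every line starting with the sentinel."""
    hits = []
    for note_id, text in notes.items():
        for line in text.splitlines():
            stripped = line.strip()
            if stripped.startswith(SENTINEL):
                hits.append((note_id, stripped[len(SENTINEL):].strip()))
    return hits


notes = {
    "N0001": "Baptism of Niels, son of Jens ...\n"
             "DOUBT: surname abbreviated, could be Jensen or Jenssen",
    "N0002": "Marriage transcript, no open questions.",
}

for note_id, doubt in find_doubts(notes):
    print(note_id, "->", doubt)
```

Matching only at the start of a line keeps ordinary uses of the word inside a transcript from being picked up.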

Dates

Again, Gramps has provision for tagging dates with uncertainty (Before, After and About). But don’t confuse them with durations (From, To and Between); IMHO this part is badly designed, because the conceptual difference is not emphasised enough.

However, if you don’t use “span” dates for life events such as residence, you can show your guess more explicitly with a From … to … duration, here abused to represent uncertainty.

Sources and Citations

Sources have no confidence level, and this is good (at least in my workflow). A source sometimes spans a long period of time with different contributors, each with his or her own culture, education, reference system, … Assigning the confidence level to Citations therefore seems the right approach.

To save time later, I always attach to a Source what could be called a reading helper note (containing pointers to important “break points” in the source and indexes to tables). I frequently add a rating of the source on criteria such as readability, image quality (e.g. LDS microfilms are recorded with 1 bit per pixel, making ink and bleed-through from the reverse page indistinguishable), direct indexing possibility, … Any other useful information about the source can go in this note.

General assessment

Since my tree is becoming quite large, I can no longer rely on memory alone or on Todo Notes to give me clues on what to do next. I therefore designed a set of flags to show the progress of my research visually. Flags can be applied to any Gramps record, so they can be assigned to people, families and events, but also to notes, sources, citations, media, …

There is some “progress logic” in my flags:
New → Partial → “final result”
where “final result” can be no flag (all possible data has been gathered and confidence is quite high), Lapse (records have been lost to war, fire, destruction, … and no further progress seems possible) or Restriction (the record is too recent and the law restricts access). These flags are mutually exclusive in my workflow.

I added a generic Check flag, taking precedence over all others, to draw my attention to records I think are faulty and that await further examination.

For the record, I also have another “shortcut” flag, with lowest priority, No Posterity, to give a “dimmed” colour to people or families without descendants.
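
The precedence rules described above can be written down as a small plain-Python sketch. It is only a model of the logic (Check first, then the mutually exclusive progress flags, then No Posterity), not code that talks to Gramps; the function names are hypothetical.

```python
from typing import Optional

# Progress flags are mutually exclusive; "no flag" means research is complete.
PROGRESS_FLAGS = {"New", "Partial", "Lapse", "Restriction"}

# Highest priority first. The relative order of the progress flags is
# irrelevant because at most one of them may be present.
PRIORITY = ["Check", "New", "Partial", "Lapse", "Restriction", "No Posterity"]


def validate(flags: set[str]) -> None:
    unknown = flags - set(PRIORITY)
    if unknown:
        raise ValueError(f"unknown flags: {sorted(unknown)}")
    progress = flags & PROGRESS_FLAGS
    if len(progress) > 1:
        raise ValueError(f"mutually exclusive flags: {sorted(progress)}")


def display_flag(flags: set[str]) -> Optional[str]:
    """The single flag that should determine how the record is shown."""
    validate(flags)
    for name in PRIORITY:
        if name in flags:
            return name
    return None  # no flag at all: nothing left to do on this record


print(display_flag({"Partial", "No Posterity"}))  # Partial
print(display_flag({"Check", "New"}))             # Check
print(display_flag(set()))                        # None
```

Keeping the priority in one ordered list makes it easy to insert a new flag later without touching the selection logic.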

I don’t know if I have addressed all your concerns, but this is the present state of my approach to uncertainty and doubt.