Form Actions - User Editing of Action Detail

I’ve reached a bit of an impasse on the editing capability I proposed for Form Actions (see PR#267).
After nearly three weeks of thought I am still unsure how to move it forward.
I’ve put my thoughts below in the hope that others can suggest ways forward and / or persuade me that editing is not required.

In the options below, I consider a simplified census form with just Name and Age columns.
One action is provided: create a birth event, with citation, and reference it from the relevant person on the form. To keep it simple, the date of the birth event is set to “date_of_census - Age”.
Age is measured in years but, for young children on the form, may be specified in months, weeks or days in non-standardised ways, e.g. “9m”, “9 mnths”, “9 months”, “9/12”, “6 weeks”, “6 wks”, “6w” etc.
Let’s assume we cannot reliably, automatically, convert all non-integer text to a time span.
Even if for Age this might be technically possible:

  1. forms may have other fields where this may not be possible
  2. the user may have reasons for wanting to edit the object(s) being created or modified. e.g. to add notes
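To make the Age problem concrete, here is a minimal best-effort sketch of parsing the examples above. The function name `parse_age` and the unit table are my own assumptions, not anything in the Forms code, and months and years are only approximated in days. Anything it does not recognise returns None, which is exactly the case where the user would still need to edit by hand.

```python
import re
from datetime import timedelta
from typing import Optional

# Hypothetical unit table: abbreviation -> approximate length in days.
_UNIT_DAYS = {
    "d": 1, "day": 1, "days": 1,
    "w": 7, "wk": 7, "wks": 7, "week": 7, "weeks": 7,
    "m": 30, "mth": 30, "mths": 30, "mnth": 30, "mnths": 30,
    "month": 30, "months": 30,
}


def parse_age(text: str) -> Optional[timedelta]:
    """Return an approximate time span for an Age cell, or None if unrecognised."""
    text = text.strip().lower()
    if text.isdigit():
        # Plain integer: age in years (approximated as 365 days each).
        return timedelta(days=365 * int(text))
    match = re.fullmatch(r"(\d+)\s*/\s*12", text)
    if match:
        # "9/12" style: months expressed as twelfths of a year.
        return timedelta(days=30 * int(match.group(1)))
    match = re.fullmatch(r"(\d+)\s*([a-z]+)\.?", text)
    if match and match.group(2) in _UNIT_DAYS:
        return timedelta(days=_UNIT_DAYS[match.group(2)] * int(match.group(1)))
    return None  # leave it to the user
```

For example, `parse_age("6 wks")` gives a span of 42 days, while `parse_age("infant")` gives None.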

Possible solutions:

  1. No action editing capability
    User picks the actions to run. Run the actions.

    Restricts the range of actions which could be provided for a given form. To mitigate, consider
    a. only offering to create birth events where the age is an integer value
    b. for non-integer values, create a birth event with no date set.
    User would need to manually find and edit the event afterwards.
    Harder for user to add additional information to modified objects (e.g. notes) because they have to subsequently locate the object(s) to edit.

  2. Post-action editing
    User picks the actions to run. Run the actions. Present user with a list of modified objects, organised by action. Allow user to edit objects until OK clicked.

    May restrict the range of actions which could be provided for a given form. To mitigate
    a. do not offer birth event action for non-integer ages
    b. for non-integer ages, create a birth event with no date set.
    Could indicate which events should be manually edited.

  3. In-action editing
    User picks the actions to run and indicates which they wish to edit. Run the actions. Show required edit windows, sequentially, during action run.

    Compel the user to edit the event date, for non-integer ages, before event is created and referenced.
    Allow optional editing for integer ages.
    If multiple actions are run and edited, there may be a long sequence of edit windows displayed.
    May need supplementary UI so the user knows which action is currently being run.

  4. Pre-action editing
    Possible actions are shown to the user. User can, optionally, edit the detail of each action. User picks the actions to run. Run the actions.

    Don’t want to modify the db before the user chooses which actions to run. Therefore need to store some kind of object (the “diff”) for each action step. When the action is run, the “diff” is applied to the current db object.
    Diff may create a new object in the db.
    On more complex forms, multiple actions may edit the same object (e.g. Person). Changes made by earlier actions must be preserved.

Option 3 is currently coded in PR#267.

Option 4 sounds appealing but presents a number of technical challenges:

  • how to create the diff?
    One idea is to store the delta between the pre- and post-modification JSON representations of the object.
    When the action is run, this partial JSON representation needs to be merged into the object being created / modified.
  • we do want to use the standard Edit windows (for consistency; the user is familiar with these windows) but they currently add / commit to the db when OK is clicked
  • from an edit window, you can edit linked objects too. These should really be captured in the diff as well.
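A rough sketch of the delta idea, purely to illustrate it: objects are treated as plain nested dicts, and `make_diff` / `apply_diff` are hypothetical helper names. The field names in the usage note are made up and are not the real Gramps JSON schema. Lists are replaced wholesale and removed keys are ignored, both of which a real implementation would have to deal with.

```python
from typing import Any, Dict


def make_diff(before: Dict[str, Any], after: Dict[str, Any]) -> Dict[str, Any]:
    """Collect only the keys whose values changed between before and after."""
    diff: Dict[str, Any] = {}
    for key, new in after.items():
        old = before.get(key)
        if key in before and isinstance(old, dict) and isinstance(new, dict):
            # Recurse so that only the changed leaves are recorded.
            sub = make_diff(old, new)
            if sub:
                diff[key] = sub
        elif key not in before or old != new:
            diff[key] = new
    return diff


def apply_diff(target: Dict[str, Any], diff: Dict[str, Any]) -> Dict[str, Any]:
    """Merge a stored diff into the current object, keeping untouched fields."""
    result = dict(target)
    for key, value in diff.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = apply_diff(result[key], value)
        else:
            result[key] = value
    return result
```

So if the user’s edit changed only a nested date text, the diff holds just that field, and applying it later to an object whose description was changed by an earlier action leaves that description intact.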



I’d like to move towards a source-based data entry framework with the following steps:

  1. Transcription - The user records exactly what they see in a document.
  2. Interpretation - The user adds interpretation based on their knowledge and experience.
  3. Object linking and creation - The user links people in the source to people in their family tree and creates new objects where appropriate.

Forms provide an easy way to capture information from structured documents that could be used to generate a transcription. Because of the way they store information, they also link to people in the family tree.

My suggestion would be to extend the form definitions so that users can add interpretation. For example, when recording the name “Smith, John” the user should be able to tag “Smith” as a surname and “John” as a given name. The user should also be able to associate a date range with an age. Of course the form definition could contain a hint as to how this could be derived, but it would ultimately be up to the user to decide the date range.

This idea could easily be adapted to work with unstructured transcriptions in the future.

Creating objects from a form or transcription with added interpretation should then be fairly easy. So the solution is probably closer to your option 4 - pre-action editing.


It always seemed like the 9/12 or 6w Census entries were a validation on a Zero age entry.

So why not transcribe them that way? Don’t even try to actually parse that verification text. That is, for the 20 June 1860 census, calculate a Birth Event date of “calculated about 1860” with an Event Description of “age ‘9/12’ logged in 20 June 1860 Census”.

To my mind transcription should preserve exactly what you see in the source document / image. You, the transcriber, may believe the information to be wrong, (you may know it is wrong) but it does not alter the fact that in this particular document, that is what it says.
As a user I want “9/12” converting to a date (or date range) in the event so I can, for example, sort events chronologically.
I prefer to have a single Birth event per Person. So using the Event Description would be problematic. You could put the information as an event reference note though, albeit at the cost of duplicating information already entered into the Form. Personally I prefer to avoid duplication of data - sooner or later I always end up with discrepancies!

Lots to think about there. Thanks.

My initial thought is that the form is, in the main, the Transcription today (or at least can be used that way by the user). There are exceptions, for example:

  • the “Name” column which is partly an interpretation by virtue of the user having to choose a Person object.
  • UK census form definitions have a single Age column whereas the source documents have Male and Female age columns

I wonder if having a second “interpretation” table (in many cases, likely to be similar to the original form) would work? The interpretation table could be interleaved with the transcription table (row by row, say), or it might be cleaner to have a separate window (and automatically show the transcription window to help the user). The software can pre-populate the interpretation table based on rules. The user can adjust the pre-populated interpretation data as required and use their knowledge to add additional interpretation data. When the user is ready, they can apply the interpretation to create / update objects in the db. An empty cell in the interpretation table would mean “don’t do anything”.

Do you think the interpretation data should be stored in the db, or could it be temporary and discarded once used to update the db?

Could you clarify the problem? Each Event has a Description so it wouldn’t necessitate creating multiple Birth events. Wouldn’t you just skip creating a separate Birth Event and merely Share a household Census Event if a Birth Event already exists for a person?

Actually, every American census that logs ages instead of dates means Date transcriptions have to be approximations. (The Swedish censuses turn Americans green with envy.) So storing the Birth event with a calculated Date (including the Quality markers ‘calculated’ & ‘about’) and explicitly including a Description with the original Age entry, accompanied by an explanation, seems reasonable.

I agree. Although as you mention the age/sex columns in the UK census definitions don’t really allow this. It is possible to edit the name and it will be stored as an attribute.

We should store the interpretation alongside the transcription.

For example, we may calculate a date from an age and store it in another attribute. It could be displayed as a tooltip over the age, and we could edit it by double-clicking the age field, a context menu or perhaps adding an extra button.

I don’t currently use forms, but I am trying to get into the habit of doing as you describe in terms of citing sources first and creating/updating people and events afterwards.

I would add “Translation” as an optional step between “Transcription” and “Interpretation”. For example, when I have a German baptismal record, I create a Note of the type “Transcript” in which I enter the German text (although I use the modern font rather than one that reflects the old orthography). Then I create a separate note of the type “Translation” in which I enter the information in English. One might say that translation is really part of interpretation, and certainly there is an element of interpretation involved in translation (for example, coming up with an equivalent meaning of a person’s occupation), but that is not the same kind of interpretation that you mention, which is independent of language.


Yes, translation is definitely an extra step. It would also be very useful to include when sharing the transcription with others.

If a single person appears in several census years and two or more have a non-standard Age, you have to decide what to put in the person’s single birth event description field.

It would be possible to amend the form to have a male and a female age column for UK census definitions. That would keep the forms closer to a pure transcription.

I’d spotted that the name could be edited. Slightly counter-intuitively, you do the interpretation first (by choosing the Person record) and then amend the name back to be the transcription :wink:

Currently a form event stores attributes in both the event itself and the event reference as shown below (1841 UK census)

It would be easy enough to add additional attributes to store the interpreted data, for example by suffix of the attribute name:

In this example, “Age” is the transcription, “Age_Interpreted” the interpretation (which might be better called “Birth Date”). When it comes to updating the birth event, there are three scenarios

  1. no birth event for the person exists. Create a new event, using the interpreted birth date, and add a citation.
  2. a birth event already exists.
    • if the interpreted birth date overlaps the current birth event date, add a citation and optionally narrow the date range for the birth event.
    • if the interpreted birth date does not overlap the current birth event date, it’s not obvious what should happen. Probably nothing.
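The three scenarios above could be sketched roughly as follows. This is only an illustration: date ranges are modelled as simple (start, stop) tuples of Python dates rather than Gramps Date objects, and the function name `reconcile_birth` and the returned action labels are hypothetical.

```python
from datetime import date
from typing import Optional, Tuple

DateRange = Tuple[date, date]  # inclusive (start, stop)


def reconcile_birth(
    existing: Optional[DateRange], interpreted: DateRange
) -> Tuple[str, Optional[DateRange]]:
    """Decide what to do with a person's birth event for an interpreted range."""
    if existing is None:
        # Scenario 1: no birth event yet, so create one and cite the form.
        return ("create", interpreted)
    start = max(existing[0], interpreted[0])
    stop = min(existing[1], interpreted[1])
    if start <= stop:
        # Scenario 2a: ranges overlap, so cite and offer the narrowed range.
        return ("cite", (start, stop))
    # Scenario 2b: no overlap, so leave the existing event alone.
    return ("skip", existing)
```

Whether the narrowed range is actually applied in the overlap case would remain the user’s choice.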

In terms of how the user enters the interpretation data, double click and context menu in the form editor window are not easily discoverable. A button is more discoverable, but one button per form field is a lot of buttons. I was thinking about a parallel window, dedicated to the interpretation data. Modeless so that the form editor window (transcription) could be visible at the same time. Male age and female age columns on a UK census transcription would be “Birth Date” and Gender columns in the interpretation window.

Or is this the sort of place where “Estimated” would be used?

I would argue that ‘Calculated’ is appropriate here (from a UK census anyway). The source document gave us an Age in years. From that a date range was calculated.
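As a worked illustration of that calculation, here is a small sketch. It assumes the recorded age is exact on the day of the census, and `birth_range` is a hypothetical helper name, not part of Gramps or the Forms addon.

```python
from datetime import date, timedelta
from typing import Tuple


def _years_before(day: date, years: int) -> date:
    """Same calendar day a number of years earlier (29 Feb falls back to 28 Feb)."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        return day.replace(year=day.year - years, day=28)


def birth_range(census_day: date, age_years: int) -> Tuple[date, date]:
    """Earliest and latest birth dates consistent with an age in whole years."""
    # Latest: the person turned age_years on the census day itself.
    latest = _years_before(census_day, age_years)
    # Earliest: the person turns age_years + 1 the day after the census.
    earliest = _years_before(census_day, age_years + 1) + timedelta(days=1)
    return earliest, latest
```

For an age of 30 on a census taken 20 June 1860, this gives a range from 21 June 1829 to 20 June 1830.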

Here’s a starting point (it’s only a bitmap, so ignore rendering inconsistencies).

Red text is explanatory

The Transcription window is the current Form editor window. The only change would be to potentially remove the Person selector button.

The interpretation window can be launched from the form editor window or the forms add-on window.
In the interpretation window:

Name column is read-only. It shows the primary name of the selected person.

Birth Date is the interpretation of the Age column. Perhaps the first three rows could be created by the software, and the user manually added the fourth.

Occupation is the description for an occupation event. The user has interpreted the transcription “Ag Lab”, a common UK census abbreviation, as “Agricultural Labourer”. Perhaps the software could do common mappings like this.

Where Born is a reference to a Place. Here the user has interpreted the transcription “Leicester” by choosing the place Leicester in Leicestershire, England (not that I’m sure how to store the place reference in an attribute).

What’s not covered?

  • creation of families (not relevant for UK 1841, but later UK censuses have Head, Wife, Daughter, Son etc.)
  • use “Name” to add an additional name and / or citation to an existing additional name

and probably more

The problem I often see with Census data is that Census takers (& those interviewed) rounded up and rounded down ages… even when they weren’t fibbing.

Some people figured age as of January 1 or December 31… or actually precisely considered the day of the census and their birthday. So a Census can only reliably give a calculated birth year range.

Calculated implies a level of accuracy that can’t be achieved with Census within the constraints of GEDCOM. And, for better or worse, the GEDCOM specification has influenced design specs for most genealogy software.

The decrepit GEDCOM standard only supports 1 DATE_APPROXIMATED modifier: About, Calculated, Estimated. Or you can also use a DATE_RANGE: Before, After, Between. But it doesn’t permit combining the attributes as an approximated (calculated) range (between).

(If your range fell within a calendar month or year, you COULD imply a range by dropping the day or month. That is fully compatible with the Calculated spec.)

Gramps currently does support combining date ranges with approximations! Another current thread here is discussing eliminating that for the GEDCOM export because it fails compliance testing.

It’s a conundrum.

This was a design decision. I didn’t want the user to enter data and then not link a person.

If I were writing this again, from scratch I would probably store the transcription as a marked-up note rather than in attributes.

Yes, that would work. I was thinking of a prefix such as “%”. e.g. “%Age” or “%DOB”.

Remember that this needs to be translated into different languages.

Not a good idea. Users always complain about the number of windows.

Consider creating an interpretation section alongside or below the existing table. It could be created dynamically and contain fields such as “Date”, “Place”, “Transcription”, “Transliteration” etc.

Store the place as a handle.

Not for all. It seems I have already adapted the forms for “interpretation”, unfortunately in manual mode.
My db has a lot of parish register books. First of all, I made custom forms to connect people to one event (baptism, marriage, census). Then I started adding an image of the book’s page to the forms. This image has a note of type “source_text”, which is my “transcription”. Then I manually add a reference to the image’s note (source_text) to the form’s citation. Finally, I manually add references to the form’s citation in the events (birth, death, military, occupation and others).
This is tiring, so I look forward to discussions like this one.

Maybe “interpretation” forms could be created the way “Transcription” forms are now, with an XML mask? I would like to have the opportunity to make forms with “row=events”.


Yes. The aim will be to define interpretation fields in the XML form definitions.