Invalid dates on import because language not installed

Users have reported that the importer’s date parser can fail to validate date strings written in a language other than English.

If they later activate the appropriate language, forcing a re-parse of the string lets Gramps recognize it as a valid date value.

Perhaps Gramps could have a Date Revalidation tool that helps interactively re-process dates marked invalid. (Or the dates of suspiciously valid Events that carry the automatically created ‘imported on’ tag.)

It could let the user pick one of the languages Gramps knows about (even if not installed) for re-parsing and validation. It would then show ONLY the previously invalid dates that are now valid, with the string before processing and the date value after. Rather than applying the changes sight unseen, have the user approve the re-parsing a screenful at a time: not date by date, but with the option to exclude individual entries.
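To make the review workflow a bit more concrete, here is a minimal sketch of the paging and approval logic. It is not Gramps code: `revalidation_pages`, `approve_page`, and the `parse` callable are hypothetical names, and `parse` is assumed to return `None` for a string it still cannot validate.

```python
from typing import Callable, Iterable, Iterator, List, Optional, Set, Tuple

Entry = Tuple[str, object]  # (original string, newly parsed value)


def revalidation_pages(
    invalid_texts: Iterable[str],
    parse: Callable[[str], Optional[object]],
    page_size: int = 20,
) -> Iterator[List[Entry]]:
    """Yield screenfuls of dates that were invalid before but parse now."""
    page: List[Entry] = []
    for text in invalid_texts:
        value = parse(text)
        if value is None:
            continue  # still invalid, so it is not shown at all
        page.append((text, value))
        if len(page) == page_size:
            yield page
            page = []
    if page:
        yield page  # last, possibly shorter, screenful


def approve_page(page: List[Entry], excluded: Set[str]) -> List[Entry]:
    """Approve a whole screenful except the entries the user excluded."""
    return [(text, value) for text, value in page if text not in excluded]
```

The caller would write back only what `approve_page` returns, so nothing changes without the user having seen both the original string and the new value.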

A more advanced option might be to parse that date through all possible languages and list the languages that return a valid date. This provides a ‘hint’ of which language to select for re-parsing.
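A rough sketch of that ‘hint’ idea follows. The month tables are deliberately tiny and made up for illustration, and `parse_with_language` is a hypothetical stand-in for the real per-language parsers, which handle far more formats. It only understands "day month-name year".

```python
import datetime
import re
from typing import List, Optional

# Illustrative month tables only; a real tool would ask the actual
# per-language date parsers instead of using a lookup like this.
MONTHS = {
    "en": {"january": 1, "march": 3, "may": 5},
    "fr": {"janvier": 1, "mars": 3, "mai": 5},
    "de": {"januar": 1, "märz": 3, "mai": 5},
}

# "12 janvier 1901" or "3. März 1888"
_PATTERN = re.compile(r"^\s*(\d{1,2})\.?\s+(\w+)\s+(\d{4})\s*$")


def parse_with_language(text: str, lang: str) -> Optional[datetime.date]:
    """Return a date if `text` is valid in `lang`, else None."""
    match = _PATTERN.match(text)
    if not match:
        return None
    day, month_name, year = match.groups()
    month = MONTHS[lang].get(month_name.lower())
    if month is None:
        return None
    try:
        return datetime.date(int(year), month, int(day))
    except ValueError:  # e.g. day 31 in a 30-day month
        return None


def languages_that_parse(text: str) -> List[str]:
    """List every language in which `text` yields a valid date."""
    return [lang for lang in MONTHS if parse_with_language(text, lang) is not None]
```

With these tables, "12 janvier 1901" points only to French, while "3 mai 1850" is valid in both French and German, so the hint narrows the choice but the user still has to make it.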

Maybe different calendars could be selected too?

A worrisome thought is that some dates might have been mis-parsed on import and would need to be reverted to their invalid (plain text) form, then run back through the parser with the correct language. (Although it might be safer to compare against the original import file and restore any such dates to the original string.)

It would also be good to be able to tell the date parser which ambiguous formats must DEFINITELY not be accepted in the more stringent parse. (E.g., don’t treat any slashed years as Julian dates, or don’t accept a string in ####-##-## format because in this file those are in yyyy-dd-mm rather than yyyy-mm-dd order.)
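One way such hints could work is a pre-filter that runs before the normal parser and rejects any string matching a format the user has ruled out. This is only a sketch: the format names `numeric-iso` and `slashed-year` and both functions are hypothetical, and only these two formats are modelled.

```python
import re
from typing import Optional, Set

# ####-##-## (could be yyyy-mm-dd or, in a sloppy export, yyyy-dd-mm)
_NUMERIC_ISO = re.compile(r"^\d{4}-\d{2}-\d{2}$")
# A year followed by a slash and more digits, e.g. "11 Feb 1731/2"
_SLASHED_YEAR = re.compile(r"\b\d{4}/\d{1,4}$")


def classify(text: str) -> Optional[str]:
    """Name the ambiguous format `text` uses, or None if it uses neither."""
    text = text.strip()
    if _NUMERIC_ISO.match(text):
        return "numeric-iso"
    if _SLASHED_YEAR.search(text):
        return "slashed-year"
    return None


def rejected_by_hint(text: str, disallowed: Set[str]) -> bool:
    """True if `text` uses a format the user said must not be accepted."""
    return classify(text) in disallowed
```

Strings for which `rejected_by_hint` returns True would stay invalid and never reach the parser, everything else is parsed as usual.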

Is this more of a GEDCOM issue? I was under the impression that the GEDCOM 5.5.1 standard only encodes the English versions of the month names, and that anything else is just a plain text entry which, according to the standard, should not be interpreted or parsed as anything else.

Tamura Jones has mentioned this as well.

The FamilySearch GEDCOM 5.5.1 specification defines MONTH like this:

MONTH:={Size=3}
[ JAN | FEB | MAR | APR | MAY | JUN |
JUL | AUG | SEP | OCT | NOV | DEC ]
Where:

JAN    January
FEB    February
MAR    March
APR    April
MAY    May
JUN    June
JUL    July
AUG    August
SEP    September
OCT    October
NOV    November
DEC    December
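For what it is worth, checking a month token against that list is trivial. A small sketch, where `is_standard_gedcom_month` is a made-up name and exact upper-case matching is my assumption (a lenient reader might fold case first):

```python
# The twelve MONTH values from the definition quoted above.
GEDCOM_MONTHS = {
    "JAN": "January", "FEB": "February", "MAR": "March",
    "APR": "April", "MAY": "May", "JUN": "June",
    "JUL": "July", "AUG": "August", "SEP": "September",
    "OCT": "October", "NOV": "November", "DEC": "December",
}


def is_standard_gedcom_month(token: str) -> bool:
    """True only for the exact three-letter tokens in the MONTH definition."""
    return token in GEDCOM_MONTHS
```

Anything that fails this check, such as a localized abbreviation, is not a MONTH in the sense of that definition.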

Bug reports:
