GEDCOM import - treatment of REFN and TYPE

Before submitting a feature request, or attempting to hack the code myself, I thought it best to ask why something is the way it currently is.

When I import a GEDCOM file from WikiTree (for example), it has this kind of data for each person:

1 REFN 12345678
2 TYPE wikitree.user_id
1 REFN 87654321
2 TYPE wikitree.page_id
1 REFN 40
2 TYPE wikitree.privacy

Gramps stores this as three attributes of the same type, “REFN”, with the values 12345678, 87654321, and 40, respectively. To each attribute, it attaches a note of type “REFN_TYPE”, with text “wikitree.user_id”, “wikitree.page_id”, and “wikitree.privacy” respectively.

Wouldn’t it be more useful to create three attributes having distinct types “wikitree.user_id”, “wikitree.page_id”, and “wikitree.privacy”, and do away with the notes? What are the reasons for the current design?
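To make the comparison concrete, here is a minimal sketch in plain Python of the two layouts being discussed. It does not use the real Gramps classes or importer; `Attribute`, `pair_refn_with_type`, `current_style`, and `proposed_style` are hypothetical names made up for illustration, and the "REFN_TYPE" note type is simply the one described above.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    """Hypothetical stand-in for an attribute; NOT the real Gramps class."""
    type: str
    value: str
    notes: list = field(default_factory=list)  # (note_type, text) pairs


def pair_refn_with_type(lines):
    """Pair each level-1 REFN with the level-2 TYPE nested under it, if any."""
    pairs = []
    in_refn = False
    for line in lines:
        parts = line.strip().split(" ", 2)
        level, tag = int(parts[0]), parts[1]
        value = parts[2] if len(parts) > 2 else ""
        if level <= 1:
            # A new level-0 or level-1 line closes any open REFN structure.
            in_refn = level == 1 and tag == "REFN"
            if in_refn:
                pairs.append([value, None])
        elif level == 2 and tag == "TYPE" and in_refn:
            pairs[-1][1] = value
    return [tuple(p) for p in pairs]


def current_style(pairs):
    """Layout described in this post: type is always REFN, TYPE goes in a note."""
    return [
        Attribute("REFN", value, [("REFN_TYPE", t)] if t else [])
        for value, t in pairs
    ]


def proposed_style(pairs):
    """Layout asked about: the TYPE text becomes the attribute type, no note."""
    return [Attribute(t or "REFN", value) for value, t in pairs]


sample = [
    "1 REFN 12345678",
    "2 TYPE wikitree.user_id",
    "1 REFN 87654321",
    "2 TYPE wikitree.page_id",
    "1 REFN 40",
    "2 TYPE wikitree.privacy",
]

for attr in proposed_style(pair_refn_with_type(sample)):
    print(attr.type, attr.value)
```

A REFN without a TYPE line falls back to the plain "REFN" attribute type in both layouts, so the proposed form only differs when a TYPE is present.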

I’m not very familiar with GEDCOM, so maybe there are other uses of REFN and TYPE where the current design makes more sense. But in that case, should there be an option to do it one way or the other?

When hacking support for non-standard GEDCOM tags, the first thing I usually do is create a sample of the data the way I’d want Gramps to receive it, then export that as a GEDCOM for reference.

As an example with WikiTree, I created a one-person GEDCOM for President George Washington. It has 1 person, 2 sources (a WikiTree page for George, another for the Presidents Project page), and an Internet website for George’s page.

0 HEAD
1 SOUR Gramps
2 VERS GrampsAIO64-5.1.3-2
2 NAME Gramps
1 DATE 23 JAN 2021
2 TIME 12:35:54
1 SUBM @SUBM@
1 FILE C:\Users\<username>\Desktop\Captures\GeorgeWashington.ged
1 COPR Copyright (c) 2021 .
1 GEDC
2 VERS 5.5.1
2 FORM LINEAGE-LINKED
1 CHAR UTF-8
1 LANG English
0 @SUBM@ SUBM
1 NAME
1 ADDR
2 CONT USA
2 CTRY USA
0 @I000000@ INDI
1 NAME George /Washington/
2 GIVN George
2 SURN Washington
1 SEX M
1 SOUR @S000000@
2 PAGE https://www.wikitree.com/wiki/Washington-11
2 DATA
3 DATE 20 DEC 2021
1 SOUR @S000001@
2 PAGE https://www.wikitree.com/wiki/WikiTree-6
1 WWW https://www.wikitree.com/wiki/Washington-11
1 CHAN
2 DATE 23 JAN 2021
3 TIME 18:35:07
0 @S000000@ SOUR
1 TITL WikiTree (Person)
1 CHAN
2 DATE 23 JAN 2021
3 TIME 18:31:57
0 @S000001@ SOUR
1 TITL WikiTree (Page)
1 CHAN
2 DATE 23 JAN 2021
3 TIME 18:34:50
0 TRLR

For a WikiTree converter, I’d probably convert the wikitree.user_id and wikitree.page_id to something usable in Gramps. The Privacy flag seems fairly extraneous.

In hopes of eventually having a reference ID search page feature for Gramps, I have been adding citations to a WikiTree source where the Page is the ID. (“Washington-11” for George Washington on WikiTree.) The search parameters change as websites evolve. But the person IDs on WikiTree have been pretty stable.

So, what you might do is add an intercept to the REFN conversion that looks for the WikiTree-specific TYPEs and converts them in the same fashion as a WWW or a SOURce chunk.
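A rough sketch of what such an intercept could look like, in plain Python. This is not the code in Gramps; the handler names, the dictionary shapes, and the choice to turn the two ID types into citation-like records are all assumptions for illustration. Anything not listed in the intercept table falls through to the generic REFN handling.

```python
def generic_refn(value, refn_type):
    """Fallback: generic storage as a REFN attribute plus an optional note."""
    notes = [("REFN_TYPE", refn_type)] if refn_type else []
    return {"kind": "attribute", "type": "REFN", "value": value, "notes": notes}


def wikitree_as_citation(value, refn_type):
    """Hypothetical handler: keep the ID as the page of a WikiTree citation."""
    return {
        "kind": "citation",
        "source": "WikiTree",
        "page": value,
        "id_type": refn_type,
    }


# Assumption: only these two TYPE values get special treatment.
INTERCEPTS = {
    "wikitree.user_id": wikitree_as_citation,
    "wikitree.page_id": wikitree_as_citation,
}


def convert_refn(value, refn_type=None):
    """Dispatch on the TYPE text; unknown or missing types use the fallback."""
    handler = INTERCEPTS.get(refn_type, generic_refn)
    return handler(value, refn_type)


print(convert_refn("12345678", "wikitree.user_id"))
print(convert_refn("40", "wikitree.privacy"))
```

Keeping the special cases in a lookup table means the generic path stays untouched, which matters if the standard importer should keep behaving the same for every other GEDCOM source.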

If you want to get rid of these Attributes, the delete function of the Set Attribute Tool will do it. The tool only works on Person attributes.

Is it the same set of Notes shared between relevant Attributes or does each Attribute have a unique note?

Thanks for all of the responses so far.

Thanks, but no, I want to keep them.

Yes, besides the REFN and TYPE lines, the GEDCOM file generated by WikiTree also includes lines like this:

1 WWW https://www.WikiTree.com/Washington-11

And Gramps uses that to create a “Web Home” type entry on the person’s “Internet” tab.

That’s good to hear, but I think storing the other values as well might be a good safety precaution. It says here that “When an LNAB [last name at birth] is changed, in the background we need to create a new WikiTree profile and merge the existing profile into it.”

I think there are seven different levels, as described here.

Yes, I think I’ve found the relevant code in gramps/plugins/lib/libgedcom.py, but I’m not very good with Python and don’t want to waste my time if there are important reasons why it was coded the way it is.

I would submit a feature request asking for a change, or for the option of doing it differently, if the current method is just arbitrary rather than purposeful. Can anyone say?

My understanding is that the current method is a generic method of preserving data in non-conforming chunks. And most programs HAVE to twist the chunks of this grossly outdated standard.

If you wanted to clone the current GEDCOM and make a WikiTree flavored GEDCOM… more power to you! (That’s why there is a GED2 add-on.)

But deforming the standard GEDCOM 5.5.1 tool to import or export in a non-standard way would be a bad idea. We’d take a lot of justifiable fire from the people who do standards testing.

No idea, but maybe you should look at making it a GEDCOM extension addon?

Gramps was originally created when GEDCOM was at version 2 or 3, and Gramps features have not always been kept in sync with GEDCOM features. REFN, RIN, AFN etc. were added during later versions, so we tried to support them as best as possible.

So for REFN, in order to make it as transparent as possible for GEDCOM import/export, the attribute was set to REFN (making it clear that that is how to export it again), and the value was stored. That left REFN.TYPE to deal with. The author decided to put that in a note.

I suppose that we could have written special code to recognize the REF_TYPE note and export it the way it originally looked, attached to the REFN record, but I see that we did not do that.
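For illustration only, a sketch of what that export-side special case might look like, written in plain Python rather than against the real exporter. The dictionary layout and the `export_refn` name are hypothetical, and the note type name "REFN_TYPE" is an assumption taken from the first post in this thread.

```python
# Assumption: the note type name as quoted in the first post of this thread.
REFN_NOTE_TYPE = "REFN_TYPE"


def export_refn(attributes):
    """Write REFN attributes back as '1 REFN' lines with a nested '2 TYPE'."""
    lines = []
    for attr in attributes:
        if attr["type"] != "REFN":
            continue  # other attribute types are out of scope for this sketch
        lines.append(f"1 REFN {attr['value']}")
        for note_type, text in attr.get("notes", []):
            if note_type == REFN_NOTE_TYPE:
                lines.append(f"2 TYPE {text}")
                break  # GEDCOM 5.5.1 allows at most one TYPE under a REFN
    return lines


person_attributes = [
    {"type": "REFN", "value": "12345678",
     "notes": [("REFN_TYPE", "wikitree.user_id")]},
    {"type": "REFN", "value": "40", "notes": []},
]

print("\n".join(export_refn(person_attributes)))
```

With something along these lines, a REFN imported with a TYPE would come back out in its original two-line form instead of as a bare REFN with a detached note.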


Just ran across a Facebook posting about another website that does v5.5 GEDCOM compatibility & feature support testing: “GEDCOM Assessment”.

Here’s the Gramps 5 test, the list of applications tested, and their test GEDCOM file download.

Their test result actually creates a workable list for making our importer more flexible. Not only does it give the actual test value with pass/fail results, it also grades the quality of import on a field-by-field basis, noting the level of data preservation on fails.

Even when the results are cryptic, it links back to a description of the data, what the ideal results would be & how specific incompatibilities would be reported.