_WLNK events (weblinks) are transformed when exported to GEDCOM

Gramps 5.1.4 running under Ubuntu 20.04.2

Ancestry.com exports Weblinks as a _WLNK event in their exported GEDCOM.
example: Weblink in GEDCOM from Ancestry.com

1 _WLNK http://dl.antenati.san.beniculturali.it/v/Archivio+di+Stato+di+Bari/Stato+civile+della+restaurazione/Gioia/Matrimoni/1857/005616807_02768.jpg.html
2 TITL 1857 #77 Nicolangelo Ferrara + Maria Luigia Lionetti marriage

Gramps stores this _WLNK as an event with the _WLNK URL put in the Event Description field and the associated TITL ignored.

When I export a subset of this database to a new GEDCOM from Gramps the _WLNK event gets chopped up into this:

1 EVEN http://dl.antenati.san.beniculturali.it/v/Archivio+di+Stato+di+Bari/Stat
2 CONC o+civile+della+restaurazione/Gioia/Matrimoni/1857/005616807_02768.jpg.ht
2 CONC ml
2 TYPE _WLNK

I can see what is happening but unfortunately the Topola Viewer doesn’t deal with this well and shows only the first line in the _WLNK “event”:
http://dl.antenati.san.beniculturali.it/v/Archivio+di+Stato+di+Bari/Stat

This could be a bug in the Topola Viewer, but is there a way around it? The viewer deals just fine with the original _WLNK from Ancestry.com, but I am using Gramps to split the database so I can share parts of it with specific DNA matches.

I am not sure why it is being stored as an event, because information is lost anyway: once the URL is put into the Description field, the associated TITL has nowhere to go and is lost.

You may well say that this is a Topola Viewer issue, but I thought I would raise it here in case there is a workaround (I am thinking of Isotammi).
Thanks,
Cindy


The Gramps data model has a special URL structure for Internet links.

However, it is only accessible from the People, Place & Repository objects in the current release. I’ve seen changes targeted for 5.2 that will extend URLs to other objects.

Since it sounds like the GEDCOM is importing _WLNK into Person object Events (they might also exist for Family objects), that data should be compatible with Gramps URLs.

Generally, a leading underscore in a block identifier indicates a custom data structure. So Ancestry probably has implemented a non-standard dialect of GEDCOM so that they support weblinks. (URIs are defined structures in the new GEDCOM7 specification but not in the original 5.5.1 version.) Maybe the GEDCOM Extensions import can be adapted to put the _WLNK data into a comparable Gramps URL object so that the Title is not discarded?

(Is the TITL data discarded? Please look for a custom attribute in the Event Editor for the TITL data. Normally, the Gramps GEDCOM import is very conscientious about preserving data it doesn’t recognize.)

We’ll have to ask @kku (Kari) if SuperTool can be used to rewrite _WLNK Events into URL objects.

What form Topola Viewer expects to see for URLs is a question for their forum.
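In the meantime, one possible stopgap is to post-process the GEDCOM that Gramps exports before sharing it. Below is a minimal sketch in Python, assuming the export always looks like the sample in the first post (a `1 EVEN <url>` line, optional `2 CONC` continuations, and a `2 TYPE _WLNK` sub-line). The function name `restore_wlnk` is just a hypothetical helper, not part of Gramps. It cannot bring back the TITL, because that is no longer in the exported event.

```python
import re

# level, tag (or xref), optional value
LINE_RE = re.compile(r"^(\d+) (\S+)(?: (.*))?$")


def restore_wlnk(lines):
    """Rewrite EVEN/CONC/TYPE _WLNK groups back into single _WLNK lines.

    `lines` is a list of GEDCOM lines without trailing newlines.
    Hypothetical helper: assumes the layout shown in the first post.
    """
    out = []
    i = 0
    while i < len(lines):
        m = LINE_RE.match(lines[i])
        if m and m.group(2) == "EVEN" and m.group(3):
            level = int(m.group(1))
            value = m.group(3)
            j = i + 1
            # CONC means "concatenate with no separator", so just append.
            while j < len(lines):
                c = LINE_RE.match(lines[j])
                if c and int(c.group(1)) == level + 1 and c.group(2) == "CONC":
                    value += c.group(3) or ""
                    j += 1
                else:
                    break
            # Collect the remaining sub-lines that belong to this event.
            k = j
            subs = []
            while k < len(lines):
                s = LINE_RE.match(lines[k])
                if s and int(s.group(1)) > level:
                    subs.append(lines[k])
                    k += 1
                else:
                    break
            type_line = f"{level + 1} TYPE _WLNK"
            if type_line in subs:
                # Emit the Ancestry-style form and keep any other sub-lines.
                out.append(f"{level} _WLNK {value}")
                out.extend(sub for sub in subs if sub != type_line)
                i = k
                continue
        # Anything that is not a _WLNK event passes through untouched.
        out.append(lines[i])
        i += 1
    return out
```

To use it, read the exported file into a list of lines (stripping the line endings), pass the list to `restore_wlnk`, and write the result back out. Ordinary EVEN records are left exactly as Gramps wrote them.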

Yes it looks like the TITL data is discarded. There are no attributes either in the Reference Information or the Shared information.

This weblink is attached to the person only in Ancestry and not to the Family (this marriage for instance I had to add a weblink to both parties). It is not attached to an event like a source is. It would make more sense for the _WLNK to be imported to a URL structure tied to a person at import time since it isn’t attached to an event.
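To illustrate what "a URL tied to a person" would need from the original file, here is a rough Python sketch that reads the Ancestry GEDCOM directly and collects each person's weblinks together with their titles. It assumes the layout shown in the first post (`1 _WLNK <url>` followed by `2 TITL <title>` inside an INDI record). The name `extract_weblinks` is hypothetical, and the sketch does not touch the Gramps database at all.

```python
def extract_weblinks(lines):
    """Map each INDI xref to a list of (url, title) pairs found under it.

    Hypothetical helper: assumes Ancestry writes `1 _WLNK <url>` with an
    optional `2 TITL <title>` sub-line, as in the sample in the first post.
    """
    links = {}
    current = None    # xref of the INDI record currently being read
    in_wlnk = False   # True while inside a level-1 _WLNK structure
    for raw in lines:
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2 or not parts[0].isdigit():
            continue  # skip blank or malformed lines
        level, tag = int(parts[0]), parts[1]
        value = parts[2] if len(parts) > 2 else ""
        if level == 0:
            # For records the second field is the xref, the third the type.
            current = tag if value == "INDI" else None
            in_wlnk = False
        elif current and level == 1:
            in_wlnk = tag == "_WLNK"
            if in_wlnk:
                links.setdefault(current, []).append((value, ""))
        elif current and level == 2 and in_wlnk and tag == "TITL":
            # Attach the title to the most recent weblink of this person.
            url, _ = links[current][-1]
            links[current][-1] = (url, value)
    return links
```

The resulting dictionary keeps the URL and its title together per person, which is the pairing that currently gets separated at import.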

Yes I think it could be done

But part of the information seems to be lost at GEDCOM import, so it can’t be set as the URL description.

@CProkofiev: are you sure there is no GEDCOM Import note created to store that url title?


I’ve been thinking about the Ancestry dialect of GEDCOM that @CProkofiev (Cindy) and @ennoborg (Enno) have described.

Maybe it would be better to look at forking the “GEDCOM Extensions” add-on into an Ancestry specific dialect reader?

I think that we are all on the same page that importing any foreign format into a blank Gramps Tree is the preferred 1st stage when merging data into an existing Gramps Tree. The import is the most likely place to have a failure, so a blank Tree minimizes the chance of collateral damage. And once you have 2 versions of a GEDCOM from the same external source, you can do a DIFF with the Import Merge Tool add-on.

But also, it would create an opportunity to explore ways of importing the Ancestry IDs and storing them as custom attributes. Then those might be usable for disambiguating records when syncing with external archives… like Ancestry.

We know that GEDCOM7 is going to support sharing ID aliases from the originating systems. We just don’t know when GEDCOM7 samples will start appearing in the wild.

You are right. That is exactly where they went. I was looking before in the attributes and I missed this. Here they all are for one record (there are some entries for the media object too)
Records not imported into INDI (individual) Gramps ID I312274516055:

Line ignored as not understood
Line 5548: 2 TITL 1835 Pt1 #415 Maria Luigia Lionetti birth Bari Gioia Nati
Line ignored as not understood
Line 5550: 2 TITL 1857 #77 Nicolangelo Ferrara + Maria Luigia Lionetti marriage
Line ignored as not understood
Line 5557: 2 _CROP
Line ignored as not understood
Line 5558: 3 _LEFT 53
Line ignored as not understood
Line 5559: 3 _TOP 38
Line ignored as not understood
Line 5560: 3 _WDTH 288
Line ignored as not understood
Line 5561: 3 _HGHT 288
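
Since the titles do survive in that import note, they could in principle be pulled back out of it. Here is a small Python sketch that parses note text of the form shown above. It assumes the wording `Line <n>: <level> <tag> <value>` exactly as in this listing (other Gramps versions or languages may word the note differently), and `ignored_lines` is a hypothetical helper name.

```python
import re

# Matches e.g. "Line 5550: 2 TITL 1857 #77 ... marriage"
NOTE_RE = re.compile(r"^Line (\d+): (\d+) (\S+)(?: (.*))?$")


def ignored_lines(note_text):
    """Return (source_line_no, level, tag, value) for each ignored GEDCOM line.

    Hypothetical helper: assumes the note format shown in the listing above.
    Lines such as "Line ignored as not understood" do not match and are skipped.
    """
    found = []
    for line in note_text.splitlines():
        m = NOTE_RE.match(line.strip())
        if m:
            found.append(
                (int(m.group(1)), int(m.group(2)), m.group(3), m.group(4) or "")
            )
    return found


def ignored_titles(note_text):
    """Return only the TITL values, in the order they appear in the note."""
    return [value for _, _, tag, value in ignored_lines(note_text) if tag == "TITL"]
```

Matching each recovered title back to its weblink would still need the original line numbers, since the note lists them in file order but does not say which _WLNK each TITL belonged to.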

When I look at the IDs in Ancestry GEDCOMs, with 12 digits, I get the impression, which may be wrong, that they are designed to make every person on Ancestry unique, so that all user trees can be stored in a single database, and be identified by their ID.

This means that, when you don’t delete your tree from Ancestry, and then upload another one, the IDs will probably be quite stable, and can be possibly used to merge changes when you download the same tree again.

But at this moment I don’t believe that these IDs are as persistent as the _UID that was probably introduced by PAF, more than 20 years ago, and that is supported by more than a dozen programs, as listed here:

https://www.tamurajones.net/The_UIDTag.xhtml

This _UID is technically equivalent to the handles that we use inside Gramps, although the PAF variant has a checksum that a normal UUID does not have. The point though is that it’s unique, and supposed to be persistent, meaning that a program should never change it. And that’s why it’s an attribute, and not a regular ID, or pointer, in GEDCOM terms.

When you upload a GEDCOM with _UID tags to Ancestry, and download it later, their values will still be there, although the tags will appear as UID in the downloaded file. And I know by experience that, when you replace UID with _UID, PAF will process them, and make it possible to merge persons by unique ID (meaning _UID), and PAF is so smart, that it can automatically merge all persons that have identical data, leaving the persons with changes to the user. This type of merge is very fast and powerful, and it should be possible in Gramps too.
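
The UID to _UID replacement described above can be done in any text editor, but for repeated downloads a tiny script is handy. This is only a sketch in Python: `restore_uid_tags` is a hypothetical helper, and it assumes the downloaded file carries the tag as a plain `UID` at the start of a line, right after the level number.

```python
import re

# A level number, then the bare tag UID, then a space or the end of the line.
UID_RE = re.compile(r"^(\d+) UID( |$)")


def restore_uid_tags(lines):
    """Rename `UID` tags to `_UID`, leaving every other line unchanged.

    Hypothetical helper: only the tag position is matched, so the letters
    "UID" inside a name or note value are not touched.
    """
    return [UID_RE.sub(r"\1 _UID\2", line) for line in lines]
```

Lines that already use `_UID` pass through unchanged, so running it twice on the same file does no harm.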

We do have support for importing _UID tags, and I have code to generate these too, and they don’t exclude support for aliases either, but I want to mention them, because they are generated by dozens of existing programs, like Ancestral Quest, Legacy, PAF, Reunion, and RootsMagic.

And in theory, there are far smarter ways to improve merging, but I won’t describe them right now.

