Merging data from two databases

Gramps 5.1.4, Windows 10
Has anything been developed to help me save some time? I’m not a coder.
I have a copy of my tree on Ancestry with 4400 persons. I have collected a lot of sources with these records. I downloaded a copy of the tree and created a new db in GRAMPS to take a look at how well the information was recorded. It is surprisingly good.
I realized that I have not done a good job of creating “residence” events based on census records.

Here is the question. Is there any non-manual way to extract some of these events from the Ancestry db and create them in my FamilyTree db? There may be 10,000 of these.

Matching the persons might be a small issue, but most should match since I tried to keep the names, DOB and locations the same.

Any ideas? Is it even possible?

The best solution I can give, as a non-coder myself, is to use Power Query or the data import in Excel. Import the different files (GEDCOM and XML) with these tools; that way you can define the column names.
Then create a CSV in the Gramps format, save it, and import it into Gramps. Most of your data should be imported.

Alternatively, you can use one of the Excel-to-GEDCOM add-ons for Excel found on SourceForge or GitHub.

A third option is to use the excellent OpenRefine: import and consolidate the data, export it to CSV, or even XML or SQLite format, and then import it into Gramps.

It will take some technical work and some trial and error, but you will get clean data out when finished.
You will need to set up a template for either the Gramps XML or the export/import SQLite format, unless someone who has already done that wants to share their OpenRefine template with you (I don’t have one, sorry).

In practical terms, no. Gramps has no bulk merge, and other programs that can do such things can do that only when persons have unique IDs that were exported to GEDCOM and came back unchanged. I did such a trick with Ancestry and PAF long ago, but since Gramps does not create such IDs, you can’t do it with the tree that you have on Ancestry.

Issues like these were discussed long ago, and some progress was made in the sense that Gramps files, which have their own internal IDs, can be merged in a smart way, but not GEDCOMs. You can read more about that on our wiki:

https://www.gramps-project.org/wiki/index.php/GEPS_009:_Import_Export_Merge

Can you tell us a little more about this situation? Have you added sources on both sides, meaning on Ancestry, and in your tree in Gramps? And have you made other changes on both sides? Are both trees about the same size? And if you’d be forced to abandon one, which one would that be?

I have a tree on Ancestry myself, to which I also added some pictures, and which can also be changed by cousins in North America, so I understand your problem quite well, but have no easy answer.

Note: If you have pictures on Ancestry, it’s probably better to use RootsMagic with that site, because it can download your tree with pictures, and store them in a special media folder. And that’s something that you miss when you just download a GEDCOM from that site. RootsMagic also has a merge function that runs way faster than Gramps. There’s even a sort of bulk merge in the paid version.

My main official copy of our Family Tree is in GRAMPS. I have uploaded the same tree to Ancestry and FindMyPast to help with the search of source records and other family members.
Ancestry and FMP are fairly good at finding Census and other records, so you click YES and they are added.
However, I was not good at adding these records in GRAMPS. Now I see how they could help when using the Geography feature and tracking where all the family moved to.
I downloaded a GEDCOM from Ancestry and created a new clean DB to see what was downloaded, and was pleasantly surprised how well Ancestry created the “residence” events in GRAMPS.
So, without weeks of work, can I somehow copy these events from one DB to the other?
Of course, the first thing that comes up is a good mapping of the person IDs from one DB to the other.
One DB has a 5-digit ID number and the Ancestry record has 13 digits. In many cases neither the names nor the DOB will be exactly the same.
I think the first job would be to create a mapping table; then you could filter on “residence” events and copy them from one DB to the other.
I have 2034 people with a “residence” event out of 4405. So there may be as many as 6000 events to copy over.
Is there a field in GRAMPS that could be used to store the Ancestry ID, and that could then be used in a merge of the records? Just trying to think outside the box.
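
For anyone who does want to try scripting the mapping table, here is a minimal Python sketch of the idea. It is not a Gramps feature or an existing tool. It assumes you already have each tree as a list of (ID, name, birth date) rows, for example from a CSV export, and it pairs people whose birth years are within a year of each other and whose names are nearly identical. The function names, the thresholds and the sample IDs are all made up for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, drop GEDCOM surname slashes and punctuation, collapse spaces."""
    name = re.sub(r"[^a-z ]", " ", name.lower())
    return " ".join(name.split())


def birth_year(date_text):
    """Pull the first 4-digit year out of a date such as 'ABT 1851'."""
    match = re.search(r"\b(\d{4})\b", date_text or "")
    return int(match.group(1)) if match else None


def build_mapping(gramps_people, ancestry_people, min_score=0.9, max_year_gap=1):
    """Return ({ancestry_id: gramps_id}, [unmatched ancestry ids]).

    Each person is a tuple (id, name, birth_date). The thresholds are
    arbitrary starting points (assumptions), not tested values.
    """
    mapping, unmatched = {}, []
    for a_id, a_name, a_birth in ancestry_people:
        best_id, best_score = None, 0.0
        for g_id, g_name, g_birth in gramps_people:
            ya, yg = birth_year(a_birth), birth_year(g_birth)
            # Skip candidates whose known birth years are too far apart.
            if ya is not None and yg is not None and abs(ya - yg) > max_year_gap:
                continue
            score = SequenceMatcher(None, normalize(a_name), normalize(g_name)).ratio()
            if score > best_score:
                best_id, best_score = g_id, score
        if best_score >= min_score:
            mapping[a_id] = best_id
        else:
            unmatched.append(a_id)  # left for manual review
    return mapping, unmatched


# Hypothetical sample rows; real IDs and names will differ.
gramps = [("I00012", "John Smith", "12 MAR 1850"), ("I00013", "Mary Jones", "1855")]
ancestry = [("3820000000001", "John /Smith/", "ABT 1851"),
            ("3820000000002", "Zebulon /Quux/", "1700")]
mapping, unmatched = build_mapping(gramps, ancestry)
```

This compares every person in one tree with every person in the other, so with about 4,400 people on each side it will take a while, and anything it cannot match confidently still has to be checked by hand.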

After writing the previous reply, I did an export of the “residence” event from my Ancestry DB to a csv file.
It is much more complicated than I thought with references to Place names also. Just too much mapping of the data elements would have to be done.
So, thanks for your help and ideas, but if I want to add more of these events I’m just going to have to take more time and do it manually.

Somehow, I think that we need some sort of AI, because this is such a common problem. I have asked about this on StackExchange, and on Facebook, just to find out about software that could compare GEDCOM files, and most of these failed, because they stumbled over differences in IDs, or GEDCOM dialects. And although part of the ID problem was solved when the _UID was introduced in PAF, that’s of no use, because Gramps does not process that. It would have helped if you used RootsMagic, although you’d have to pay to get the bulk merge in that.

And like you discovered, exporting events (via CSV) is no use either, because of their link to places, and their dependence on the persons (or families) they were registered for. That is at least how they exist in GEDCOM, and also how you get them from Ancestry or FMP.
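
To illustrate that dependence, here is a rough Python sketch that reads GEDCOM lines and collects each residence (RESI) event together with the person it is nested under. It relies only on the standard GEDCOM layout (level number, tag, value) and the INDI, NAME, RESI, DATE and PLAC tags. It is not a complete GEDCOM parser, it assumes the NAME line comes before the events in each record, and the function name and sample data are made up for the example.

```python
def residence_events(gedcom_lines):
    """Collect RESI events with the person they belong to.

    Returns a list of dicts with person_id, name, date and place.
    Only individual (INDI) records are considered; family records are skipped.
    """
    events = []
    person_id = name = None
    current = None  # the RESI event currently being filled in
    for raw in gedcom_lines:
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2 or not parts[0].isdigit():
            continue
        level, tag = int(parts[0]), parts[1]
        value = parts[2] if len(parts) > 2 else ""
        if level == 0:
            current = None
            # Record headers look like "0 @I1@ INDI": the ID sits in the tag slot.
            if value == "INDI":
                person_id, name = tag.strip("@"), ""
            else:
                person_id = None
        elif person_id is None:
            continue
        elif level == 1:
            current = None
            if tag == "NAME" and not name:
                name = value.replace("/", "").strip()
            elif tag == "RESI":
                current = {"person_id": person_id, "name": name,
                           "date": "", "place": ""}
                events.append(current)
        elif level == 2 and current is not None:
            if tag == "DATE":
                current["date"] = value
            elif tag == "PLAC":
                current["place"] = value
    return events


# Made-up sample in standard GEDCOM layout.
sample = """0 HEAD
0 @I1@ INDI
1 NAME John /Smith/
1 BIRT
2 DATE 1850
1 RESI
2 DATE 1900
2 PLAC Chicago, Cook, Illinois, USA
1 RESI Age: 60
2 DATE 1910
0 @F1@ FAM
1 RESI
2 PLAC Nowhere
0 TRLR""".splitlines()

events = residence_events(sample)
```

Note that the place comes out as plain text only. Turning it into a proper Gramps place, and attaching the event to the right person in the other tree, is the part that still needs the mapping discussed above.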

With AI I mean something like RootsMagic’s automatic merges explained here:

Merging Duplicate People - RootsMagic Wiki

This is a feature that you only get when you pay (USD 39.95), and it may actually be worth it, knowing that you already spend some money on Ancestry and FMP, and it can be quite a time saver. It is a bit of a gamble though, because there is no way to test the full version for a few weeks without paying. Its usefulness also depends on the number of persons that have exact duplicate names and other data. If they are the majority, it might very well pay off.

The Duplicate search merge described in the 2nd paragraph is available in the free version, and it is quite fast, both while searching, and while merging. It has a scoring system, just like Gramps, and it’s also faster, because it automatically moves to the next result when you click Merge (or Not a match, which is also a paid feature). So maybe you like clicking that a few thousand times. :slight_smile:

This may be more outside the box thinking than you expected, also because it is not open source, but the program’s integration with Ancestry and FamilySearch (sync and hints), and FindMyPast, and My Heritage (hints only) may be worth it anyway, especially when you haven’t used many of Gramps’ special features like the place hierarchy or shared events.

The alternative would be hoping that some fellow developers decide to add a smart merge in Gramps, where one can also start with persons that have matching names and DOB etc. If they are the majority of your people, that will also be a big time saver.

P.S. We have attributes where one could store the Ancestry ID, but that is not much use, because in that case, you’d still need to figure out the mapping in some way (manually). This is exactly why most other programs use the _UID introduced by PAF.

6 posts were split to a new topic: TreeMerge - porting an experimental GEDCOM matching tool
