How to generate a Gramps xml file outside of Gramps?

OK, got it. I thought that lineage linked meant connections between individuals and families, as used in Gramps and GEDCOM, but now that I read the GedcomX specification, which uses the same term for links between individuals, without families, I see that you’re right.

What other links would you consider then? Properties linked to persons? Group membership, like persons working in an organisation, or as members of a church, or other association?

I believe Evidence-based is another primary form.

But there’s an Event Linked flavor of GEDCOM, although the FORM declared in its header is still LINEAGE-LINKED.

I first started using Gramps as a Source Linked database. There was a 1924 reference booklet that was presented in such a dense & convoluted format that it was necessary to feed the data into a different format to grasp it. I chose Gramps and was hooked.

Years later, I’d like to output that chunk of genealogical data for other readers of that booklet. Unfortunately, extracting just the data supported by that reference is not an option.

One Place studies are place linked. Family Name studies are surname linked.

You can export all of those independently to csv format by using the Export View function.

Maybe a feature request to “Export all data having the <tag>”?

Unfortunately, the Export view doesn’t cover all the data for any Object, nor maintain the structured tree. You have to do too much reconstructive surgery on each export/import cycle.

And the data formatting doesn’t play well with the “comma” delimiter of CSV. (It might work better with TSV, i.e. ‘tab’ delimiters.) We use commas & double-quotes too much. (Particularly painful to exchange are: lat/long values in degrees/minutes/seconds instead of decimal; and the collated name & place fields.) It works better going out in .odt format, but re-assembling the data after manipulating it is still ugly.
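To make the two pain points concrete, here is a minimal Python sketch, not anything Gramps itself does. It assumes coordinates arrive as strings shaped like `51°12'30"N` (the exact format in your export may differ), converts them to decimal degrees, and writes tab-separated output with the standard `csv` module so that commas and double quotes inside a place title survive a round trip. The function name `dms_to_decimal` and the sample row are made up for illustration.

```python
import csv
import io
import re

# Assumed input shape: degrees°minutes'seconds"hemisphere, e.g. 51°12'30"N
DMS_PATTERN = re.compile(r"""\s*(\d+)°\s*(\d+)'\s*(\d+(?:\.\d+)?)"\s*([NSEW])\s*""")

def dms_to_decimal(text):
    """Convert a degrees/minutes/seconds string to signed decimal degrees."""
    match = DMS_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognised coordinate: {text!r}")
    degrees, minutes, seconds, hemisphere = match.groups()
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    # South and West are negative in decimal notation
    return -value if hemisphere in "SW" else value

# Made-up sample row: the title contains both commas and double quotes
rows = [
    ["Place", "Latitude", "Longitude"],
    ['Springfield, "Old Town", Example County',
     dms_to_decimal("51°12'30\"N"), dms_to_decimal("3°13'30\"W")],
]

# Write tab-separated text; the csv module quotes fields only where needed
buffer = io.StringIO()
csv.writer(buffer, dialect="excel-tab").writerows(rows)
tsv_text = buffer.getvalue()

# Read it back to confirm the awkward title field is intact
round_trip = list(csv.reader(io.StringIO(tsv_text), dialect="excel-tab"))
```

With tabs as the delimiter, the commas in the title need no special treatment, and the embedded double quotes are escaped by the writer and restored by the reader.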

Relations: links between any objects in the database that will give logical information about the research you are doing, i.e. no limits.

I thought I had a workaround for a few minutes by hanging objects on a Dummy person & exporting just that 1 person.

But there were a few glitches.

The initial experiment was very messy. It required hanging all the different object types on a Person… which meant each Place would need its own Event. And hanging all those Families on a dummy profile was nearly as bad.

Then I realized I could use a Note. Every linked object there would be carried along in the Export. And I’d only have to clean out the dummy person & dummy note after the import.

For the most part that worked… except that there is a bug where the export process double-filters the People. So every Person (& every Family) but the dummy person was dropped in the process. But all the other types of objects carried along as hoped!

That meant two smaller-scope goals: 1) getting a bug fixed, and 2) having the view export add a 3rd format (an internal alternative to the existing external options of CSV & Open Office): output to a Gramps Note (either new or appended to an existing one)… where each ID has an internal link.

(Outputting a filtered view list to a Note has a LOT of other uses.)

I made a PR for that: CSV: possibility to select the dialect. by SNoiraud · Pull Request #1314 · gramps-project/gramps · GitHub

I got an earlier version of your answer in my mail, and I agree. In research, you want to be able to create relations between all sorts of objects without restrictions, and preferably right from the database, and not from an export.

This means that, in an ideal world, it would be more interesting to have a more open format inside the database, and not just in Gramps XML.

That takes care of outbound and will be very helpful for porting data outside. Thank you!

But neither the Import CSV nor the Import Text gramplet allows bringing in data with anything other than comma delimiters. (A simple copy/paste of cells from Excel comes into the Import Text Gramplet as tab delimited.)
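For anyone pre-processing pasted data outside Gramps, a rough Python sketch of delimiter detection follows. It uses `csv.Sniffer` from the standard library to guess between comma, tab and semicolon, and falls back to plain comma-separated parsing if the guess fails. The helper name `read_delimited` and the column headings are made up for illustration; this is not how the Gramps importer works.

```python
import csv
import io

def read_delimited(text, candidates=",\t;"):
    """Guess the delimiter from the text, then parse it into rows."""
    try:
        dialect = csv.Sniffer().sniff(text, delimiters=candidates)
    except csv.Error:
        dialect = csv.excel  # fall back to ordinary comma-separated
    return list(csv.reader(io.StringIO(text), dialect=dialect))

# Cells copied from a spreadsheet typically arrive tab-delimited
pasted = (
    "Place\tType\tEnclosed by\n"
    "Brugge\tCity\tWest-Vlaanderen\n"
    "Paris\tCity\tFrance\n"
)
rows = read_delimited(pasted)
```

Once the rows are parsed, they can be written back out with commas for an importer that only accepts CSV.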

I added this feature to importcsv. Tested and I see all places are created and you don’t need to recreate the hierarchy.

That’s great!

Sorry to have to ask instead of trying it myself. My Hotspot is offline until the cellular data plan resets in a couple days. I can read but not download!

I just tried something different. When running the export, I used a custom person filter “Nobody” (which returns values that do not match the filter rule “Everyone”). The output seems to include all objects other than people or families (i.e., places, repositories, sources, and citations, but also events and media). Not sure what I would ever use it for, but I found it interesting.

I see that one can change the order of the filtering in the Export Options dialog (by using the “Change order” button) but I haven’t tried that yet. Maybe that would help?

Unfortunately not. It is noted at the end of the bug report (in the Additional Notes section) that the test was repeated with the People rule as first (the default) and last with no difference.

If you want a new database, you can populate the Places and Sources/Repository without having to start from scratch again.

Sometimes sites have information that gets lost when it’s converted to standard GEDCOM, and this may be the case with FamilySearch, which used the richer GedcomX to communicate with client programs like getmyancestors, and also programs like Ancestral Quest and RootsMagic, which I use to download parts from the shared family tree.

I do agree that modifying getmyancestors to write Gramps XML is probably overkill, because there is not much that gets lost in standard GEDCOM. The only thing that I can think of right now is the name type, and for me that wouldn’t be worth it, because the shared tree is too messy for that. But even if you think that part is important, it may be easier to add a custom _TYPE tag to the getmyancestors output, and make sure that Gramps can read that. The amount of code involved in such a construct is way smaller than adding full support for Gramps XML.

Please note that GEDCOM itself is not as inferior as some may think, because basically it’s just a format, like JSON and XML, meaning that it’s a way to store objects with nested contents and relations between them. This means that with a couple of custom tags, we could make a GEDCOM export that includes all the information that we have in the Gramps database, or Gramps XML. In fact, this is exactly what RootsMagic does, in a very nice way. According to Randy Seaver, it’s the only program that can read its own GEDCOM without losing anything, and I know that it has full support for citation templates, in GEDCOM format.
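To illustrate the “just a format” point, here is a minimal Python sketch that reads level-numbered GEDCOM lines into a nested tree, much like you would get from JSON or XML. It is a toy parser under simplifying assumptions (no CONT/CONC handling, no character-set handling), and the sample record, including the custom `_TYPE` line, is made up for illustration rather than taken from any real export.

```python
def parse_gedcom(lines):
    """Turn level-numbered GEDCOM lines into a tree of nested dicts."""
    root = {"tag": "ROOT", "xref": None, "value": "", "children": []}
    stack = [root]  # stack[n + 1] holds the most recent node at level n
    for line in lines:
        line = line.strip()
        if not line:
            continue
        level_text, rest = line.split(" ", 1)
        level = int(level_text)
        xref = None
        if rest.startswith("@"):          # record lines carry an ID first
            xref, rest = rest.split(" ", 1)
        tag, _, value = rest.partition(" ")
        node = {"tag": tag, "xref": xref, "value": value, "children": []}
        del stack[level + 1:]             # discard deeper nodes no longer open
        stack[-1]["children"].append(node)
        stack.append(node)
    return root

# Made-up sample; _TYPE stands in for a hypothetical custom tag
SAMPLE = """\
0 @I1@ INDI
1 NAME Jane /Doe/
2 _TYPE Birth Name
1 BIRT
2 PLAC Example Town
"""

tree = parse_gedcom(SAMPLE.splitlines())
person = tree["children"][0]
name = person["children"][0]
```

A reader that keeps unknown tags in the tree like this does not lose custom data; whether a given program actually does something with those tags is a separate question.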

This does not mean that I’m against creating Gramps XML. I’d love to see a web scraping program that creates Gramps XML extracts from my favorite sites, just like @PLegoux suggested. But that also means that we probably need a smarter import too, to avoid duplicate locations and all that.

Yeah, a BeautifulSoup or Selenium add-on would be nice for Gramps. (See “Use Python to Scrape for Data in Family Search Records” on Stack Overflow, or “Web Scraping using Selenium and Python”.)

But data scraping makes me wonder how (or whether) to integrate the data with your painstakingly curated Tree. In many cases, there is an attraction to doing prospecting trips in your research to collect leads. But while collecting, you’re well aware that most of the leads won’t pan out. You just want to keep them neat and accessible, not fold them into your Tree prematurely.

Note: The name type does exist in GEDCOM 5.5.1, so currently I don’t see much reason to create a Gramps XML writer for getmyancestors.

When I think of web scraping, I don’t necessarily think of Python. That’s partly because I haven’t done any Python in years, and have only programmed in C#, but maybe more importantly because it would be nice to have that in a Chrome/Edge or Firefox add-on, which probably means that it’s better to use JavaScript.

For the integration part, I would suggest a specialized piece of XML, that stores the data of all participants present at an event, and their roles, just like we already store these data when we use the forms gramplet, but in a more generic way, not using attributes, but using a real schema. And you can find a possible example for this, on this source page, of a Dutch site that happens to have a birth record for a Marie Antoinette Evelina Legoux, who was born in Brugge, and who died in Paris:

You can view this page in 4 languages, including French and English, but the interesting part is in the page source, close to the bottom. When you look at that, you can see an XML document, that has all necessary source data in it, meaning the persons, their roles, all about the event, and the source meta data that the site uses to generate a formatted citation. It’s all there, and it’s documented here:

A2A is a local standard, used by all archives in The Netherlands, but this particular record shows that it was also adopted by at least one archive in Belgium, and the daughter’s death record shows that it was also adopted by an archive in France.

Note: The site also has a McCullough family emigrating from Rotterdam.

OK, good point, and something that can be addressed, when someone finds time for that. And that leads to questions like:

  1. Would it be worth our (developers’) effort to create an exporter for places, which includes all attached notes, sources, etc? I don’t need that myself, because I work with a single large tree, but it can be a big time saver for users who work with separate ones.

  2. Or could we just as well rely on an external tool, that reads a Gramps backup file, and does this outside Gramps? For me, that would work just as well, and it would allow me to write such a tool in a language that I’m way more proficient in, like C#, and another person might prefer Java for that. And an advanced Python developer could even write an independent tool that reads the pickled data from our database.

My personal view is that there is no real need to create another exporter, because a Gramps XML backup has the full database in it, and any smart person can figure out what to do with it, even without documentation, simply by reading the XML, and trying to make sense of that. That’s reverse engineering, and it can work quite well. It’s also what we do when we’re confronted with an exotic GEDCOM file, created by a company that has better things to do than to write documentation for competitors.
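As a rough idea of what such an external tool could look like, here is a short Python sketch that lists the places in a Gramps XML file using only the standard library. The element and attribute names (`placeobj`, `pname`, `id`, `type`, `value`) and the namespace in the sample are my assumptions about the format, so check them against one of your own exports. The sketch also assumes the file may or may not be gzip-compressed and handles both.

```python
import gzip
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]

def list_places(raw_bytes):
    """Return (id, type, name) tuples for every place element found."""
    if raw_bytes[:2] == b"\x1f\x8b":      # gzip magic number
        raw_bytes = gzip.decompress(raw_bytes)
    root = ET.fromstring(raw_bytes)
    places = []
    for element in root.iter():
        if local_name(element.tag) != "placeobj":   # assumed element name
            continue
        names = [child.get("value") for child in element
                 if local_name(child.tag) == "pname"]
        places.append((element.get("id"), element.get("type"),
                       names[0] if names else None))
    return places

# Hand-written stand-in for an export; not copied from a real file
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<database xmlns="http://gramps-project.org/xml/1.7.1/">
  <places>
    <placeobj handle="_p1" id="P0001" type="City">
      <pname value="Brugge"/>
    </placeobj>
    <placeobj handle="_p2" id="P0002" type="City">
      <pname value="Paris"/>
    </placeobj>
  </places>
</database>
"""

places = list_places(SAMPLE)
```

For a real file you would pass in `open(path, "rb").read()`. Matching on the local tag name instead of a fixed namespace keeps the sketch indifferent to the XML version declared in the file.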

And in the case of places, one can also think of a selective import instead.
