How to generate a Gramps xml file outside of Gramps?

I made a PR for that: “CSV: possibility to select the dialect” by SNoiraud (Pull Request #1314, gramps-project/gramps on GitHub).

I got an earlier version of your answer in my mail, and I agree. In research, you want to be able to create relations between all sorts of objects without restrictions, and preferably right from the database, and not from an export.

This means that, in an ideal world, it would be more interesting to have a more open format inside the database, and not just in Gramps XML.

That takes care of outbound and will be very helpful for porting data outside. Thank you!

But neither the Import CSV nor the Import Text gramplet allows bringing in data with delimiters other than commas. (A simple copy/paste of cells from Excel comes into the Import Text gramplet as tab-delimited.)
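
In the meantime, there is a workaround outside Gramps: convert the tab-delimited text to comma-delimited CSV before importing. A minimal sketch using Python’s csv module, with placeholder file names:

```python
# Minimal sketch: convert tab-delimited text (e.g. pasted from Excel and saved
# to a file) into comma-delimited CSV that the Gramps CSV importer accepts.
# The file names are placeholders.
import csv

def convert_to_comma_csv(src_path, dst_path):
    with open(src_path, newline="", encoding="utf-8") as src:
        sample = src.read(4096)
        src.seek(0)
        # Let the csv module guess the delimiter (tab, semicolon, comma, ...).
        dialect = csv.Sniffer().sniff(sample, delimiters="\t;,")
        rows = list(csv.reader(src, dialect))
    with open(dst_path, "w", newline="", encoding="utf-8") as dst:
        csv.writer(dst).writerows(rows)

convert_to_comma_csv("pasted_from_excel.txt", "for_gramps_import.csv")
```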

I added this feature to importcsv. I tested it and all places are created; you don’t need to recreate the hierarchy.

That’s great!

Sorry to have to ask instead of trying it myself. My hotspot is offline until the cellular data plan resets in a couple of days. I can read but not download!

I just tried something different. When running the export, I used a custom person filter “Nobody” (which returns values that do not match the filter rule “Everyone”). The output seems to include all objects other than people or families (i.e., places, repositories, sources, and citations, but also events and media). I’m not sure what I would ever use it for, but I found it interesting.

I see that one can change the order of the filtering in the Export Options dialog (by using the “Change order” button) but I haven’t tried that yet. Maybe that would help?

Unfortunately not. It is noted at the end of the bug report (in the Additional Notes section) that the test was repeated with the People rule first (the default) and last, with no difference.

If you want a new database, you can populate the Places and Sources/Repository without having to start from scratch again.

Sometimes sites have information that gets lost when it’s converted to standard GEDCOM, and this may be the case with FamilySearch, which uses the richer GedcomX to communicate with client programs like getmyancestors, and also with programs like Ancestral Quest and RootsMagic, which I use to download parts of the shared family tree.

I do agree that modifying getmyancestors to write Gramps XML is probably overkill, because not much gets lost in standard GEDCOM. The only thing that I can think of right now is the name type, and for me that wouldn’t be worth it, because the shared tree is too messy for that. But even if you think that part is important, it may be easier to add a custom _TYPE tag to the getmyancestors output, and make sure that Gramps can read that. The amount of code involved in such a construct is way smaller than adding full support for Gramps XML.
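
For illustration, a minimal sketch of what such a custom subtag could look like in a GEDCOM writer. The _TYPE tag is the custom tag mentioned above; the helper function and the sample data are mine, not part of getmyancestors, and an importer would still have to be taught to map the tag to a Gramps name type.

```python
# Hedged sketch: emit a GEDCOM personal name with a custom subtag carrying the
# name type. The record follows the GEDCOM level-number convention; the _TYPE
# tag and the sample data are illustrative only, not actual getmyancestors output.
def gedcom_name_lines(given, surname, name_type=None):
    lines = ["1 NAME %s /%s/" % (given, surname)]
    if name_type:
        # Custom (underscore-prefixed) tag that an importer could map to a
        # Gramps name type if it is taught to recognise it.
        lines.append("2 _TYPE %s" % name_type)
    return lines

print("\n".join(["0 @I1@ INDI"] + gedcom_name_lines("Marie", "Legoux", "Birth Name")))
```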

Please note that GEDCOM itself is not as inferior as some may think, because basically it’s just a format, like JSON and XML, meaning that it’s a way to store objects with nested contents and relations between them. This means that with a couple of custom tags, we could make a GEDCOM export that includes all the information that we have in the Gramps database, or in Gramps XML. In fact, this is exactly what RootsMagic does, in a very nice way. According to Randy Seaver, it’s the only program that can read its own GEDCOM without losing anything, and I know that it has full support for citation templates, in GEDCOM format.
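
To make the “GEDCOM is just a nested format” point concrete, here is a toy sketch that turns GEDCOM level numbers into a tree, much like a JSON or XML parser would. The sample record and the node layout are purely illustrative.

```python
# Minimal sketch: GEDCOM's level numbers encode the same kind of tree that
# JSON or XML encode with braces or nested elements. This toy parser turns
# lines into {tag, value, children} nodes; the sample record is illustrative.
def parse_gedcom(lines):
    root = {"tag": "ROOT", "value": "", "children": []}
    stack = [root]  # stack[level] is the current node at that depth
    for line in lines:
        level_str, tag, *rest = line.split(" ", 2)
        level = int(level_str)
        node = {"tag": tag, "value": rest[0] if rest else "", "children": []}
        del stack[level + 1:]                 # drop parents deeper than this line
        stack[level]["children"].append(node) # attach to the right parent
        stack.append(node)
    return root

record = parse_gedcom([
    "0 @I1@ INDI",
    "1 NAME Marie /Legoux/",
    "2 _TYPE Birth Name",
])
print(record["children"][0]["children"][0])   # the NAME node with its _TYPE child
```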

This does not mean that I’m against creating Gramps XML. I’d love to see a web scraping program that creates Gramps XML extracts from my favorite sites, just like @PLegoux suggested. But that also means that we probably need a smarter import too, to avoid duplicate locations and all that.
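
As a rough idea of what “smarter” could mean for places, here is a hedged sketch that normalises incoming place names and reuses an existing place when one matches, instead of creating a duplicate. The existing_places table, the handle values, and the normalisation rule are all assumptions for illustration.

```python
# Hedged sketch: avoid duplicate places on import by normalising names and
# keeping an index of places already in the tree. The existing_places data
# and the normalisation rule are illustrative assumptions.
import unicodedata

def normalize(name):
    # Case-fold and strip accents so "Brugge " and "brugge" collapse together.
    folded = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode()
    return folded.strip().lower()

existing_places = {"Brugge": "P0001", "Paris": "P0002"}       # name -> handle
index = {normalize(n): h for n, h in existing_places.items()}

def place_handle_for(name, new_handle_factory=lambda: "P_NEW"):
    key = normalize(name)
    if key in index:
        return index[key]            # reuse the existing place
    handle = new_handle_factory()    # otherwise create a new one (sketched)
    index[key] = handle
    return handle

print(place_handle_for("brugge"))    # -> P0001, no duplicate created
print(place_handle_for("Gent"))      # -> P_NEW
```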

Yeah, a BeautifulSoup or Selenium add-on for Gramps would be nice. ( html - Use Python to Scrape for Data in Family Search Records - Stack Overflow or Web Scraping using Selenium and Python | ScrapingBee )
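
For what it’s worth, a minimal sketch of the requests + BeautifulSoup approach those links describe. The URL and the CSS selectors are placeholders; a real scraper needs selectors for the specific site and should respect its terms of service.

```python
# Hedged sketch of record scraping with requests + BeautifulSoup. The URL and
# the CSS selectors are placeholders, not taken from any real site.
import requests
from bs4 import BeautifulSoup

def scrape_records(url):
    html = requests.get(url, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for row in soup.select("table.results tr"):   # placeholder selector
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        if cells:
            records.append(cells)
    return records

# for record in scrape_records("https://example.org/search?name=Legoux"):
#     print(record)
```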

But data scraping makes me wonder how (or whether) to integrate the data with your painstakingly curated Tree. In many cases, there is an attraction to doing prospecting trips in your research to collect leads. But while collecting, you’re well aware that most of the leads won’t pan out. You just want to keep them neat and accessible, not fold them into your Tree prematurely.

Note: The name type does exist in GEDCOM 5.5.1, so currently I don’t see much reason to create a Gramps XML writer for getmyancestors.

When I think of web scraping, I don’t necessarily think of Python. That’s partly because I haven’t done any Python in years and have only programmed in C#, but maybe more importantly because it would be nice to have that in a Chrome/Edge or Firefox add-on, which probably means that it’s better to use JavaScript.

For the integration part, I would suggest a specialized piece of XML that stores the data of all participants present at an event, and their roles, just like we already store these data when we use the forms gramplet, but in a more generic way: not using attributes, but a real schema. You can find a possible example of this on this source page, from a Dutch site that happens to have a birth record for a Marie Antoinette Evelina Legoux, who was born in Brugge and died in Paris:

You can view this page in 4 languages, including French and English, but the interesting part is in the page source, close to the bottom. When you look at that, you can see an XML document that has all the necessary source data in it: the persons, their roles, everything about the event, and the source metadata that the site uses to generate a formatted citation. It’s all there, and it’s documented here:
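
As a hedged sketch of what a consumer of that embedded XML could do, the snippet below pulls the A2A block out of a saved page source and lists the persons with their roles. The element and attribute names (Person, pid, RelationEP, PersonKeyRef, RelationType) are my reading of the A2A schema and should be checked against the documentation and the actual page source.

```python
# Hedged sketch: extract an embedded A2A document from a saved page source and
# list persons and roles. Element/attribute names are assumptions to verify
# against the A2A schema; namespaces are stripped for simplicity.
import re
import xml.etree.ElementTree as ET

def local(tag):
    return tag.rsplit("}", 1)[-1]          # drop the namespace prefix

def persons_and_roles(page_source):
    # Grab the first <A2A ...>...</A2A> block embedded verbatim in the HTML.
    match = re.search(r"<A2A[\s>].*?</A2A>", page_source, re.S)
    if not match:
        return []
    root = ET.fromstring(match.group(0))
    names, roles = {}, {}
    for el in root.iter():
        if local(el.tag) == "Person":
            pid = el.get("pid")            # assumed person key attribute
            parts = [t.text.strip() for t in el.iter()
                     if local(t.tag).startswith("PersonName") and t.text and t.text.strip()]
            names[pid] = " ".join(parts)
        elif local(el.tag) == "RelationEP":
            pref = rtype = None
            for child in el:
                if local(child.tag) == "PersonKeyRef":
                    pref = child.text
                elif local(child.tag) == "RelationType":
                    rtype = child.text
            roles[pref] = rtype
    return [(names.get(p, p), r) for p, r in roles.items()]

# with open("record_page.html", encoding="utf-8") as fh:
#     print(persons_and_roles(fh.read()))
```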

A2A is a local standard, used by all archives in The Netherlands, but this particular record shows that it was also adopted by at least one archive in Belgium, and the daughter’s death record shows that it was also adopted by an archive in France.

Note: The site also has a McCullough family emigrating from Rotterdam.

OK, good point, and something that can be addressed, when someone finds time for that. And that leads to questions like:

  1. Would it be worth our (developers’) effort to create an exporter for places that includes all attached notes, sources, etc.? I don’t need that myself, because I work with a single large tree, but it can be a big time saver for users who work with separate ones.

  2. Or could we just as well rely on an external tool, that reads a Gramps backup file, and does this outside Gramps? For me, that would work just as well, and it would allow me to write such a tool in a language that I’m way more proficient in, like C#, and another person might prefer Java for that. And an advanced Python developer could even write an independent tool that reads the pickled data from our database.

My personal view is that there is no real need to create another exporter, because a Gramps XML backup has the full database in it, and any smart person can figure out what to do with it, even without documentation, simply by reading the XML and trying to make sense of it. That’s reverse engineering, and it can work quite well. It’s also what we do when we’re confronted with an exotic GEDCOM file created by a company that has better things to do than to write documentation for competitors.
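
To show how low the barrier is, here is a hedged sketch of an external tool that opens a .gramps backup (gzip-compressed Gramps XML) and lists the place records. The element names (placeobj, pname) are what I recall from the Gramps XML format and should be verified against a real export.

```python
# Hedged sketch: a .gramps backup is usually gzip-compressed Gramps XML, so an
# external tool can read it directly. This lists place names; element names
# and namespace handling should be checked against an actual export.
import gzip
import xml.etree.ElementTree as ET

def local(tag):
    return tag.rsplit("}", 1)[-1]          # ignore the xmlns prefix

def list_places(backup_path):
    with gzip.open(backup_path, "rb") as fh:
        root = ET.parse(fh).getroot()
    places = []
    for el in root.iter():
        if local(el.tag) == "placeobj":
            names = [p.get("value") for p in el if local(p.tag) == "pname"]
            places.append((el.get("id"), names))
    return places

# for place_id, names in list_places("my_tree.gramps"):
#     print(place_id, names)
```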

And in the case of places, one can also think of a selective import instead.
