Format problems in Gramps XML

bruzzel · March 11, 2024, 2:26pm

Gramps version 5.1.6 on Debian Linux 6.1.76-1 (2024-02-01).
I have been using Gramps for 12 years now, and have a fairly large tree (about 20.000 persons and 30.000 events). In order to correct some mistakes I made years ago, to simplify things (e.g. merge identical sources) and eventually to split it up into separate trees, I’m writing a script (in Perl, don’t know Python, sorry) which manipulates the export XML.

Up to now that went very well, but with my latest development Gramps refuses to import the new XML, complaining about missing relations.

Is there any way I can get more information out of it: where in the XML the problem occurs, what items are not correctly linked etc. ? That would enormously facilitate my debugging (the XML is 20 MB!).

ennoborg · March 11, 2024, 2:33pm

You can merge sources with the Isotammi tools, and relations are bidirectional. Can it be that you forgot to change some reference handles?

emyoulation · March 11, 2024, 2:38pm

Have you tried running gramps from Linux’s terminal? There might be additional messages that will be routed there.

There are also command line options for additional debugging features.

bruzzel · March 11, 2024, 2:44pm

The sources are already merged, including the re-referencing to the new single source. The import went well at that stage. What I’m now doing is creating new “Baptism” events where I added a note to Birth (“baptized on the same day”). The new events get new IDs (E…) counting up from the last E number found, a new handle and change time, and I insert an eventref to the new handle in the element where the eventref to the Birth already exists. But somehow, somewhere that apparently went wrong… How to find out where/what? The error message is rather terse.
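For readers following along in Python rather than Perl, the steps described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the sample omits the xmlns namespace that real exports carry, the function name add_baptism is hypothetical, and the handle is built from a UUID simply because it needs to be unique (this is not necessarily how Gramps generates handles).

```python
import copy
import time
import uuid
import xml.etree.ElementTree as ET

def add_baptism(root, person, birth_handle):
    """Add a Baptism event dated like the Birth and link it from `person`."""
    events = root.find("events")
    birth = events.find("event[@handle='%s']" % birth_handle)
    # Locate the existing Birth eventref before changing anything.
    pos = next((i for i, child in enumerate(person)
                if child.tag == "eventref"
                and child.get("hlink") == birth_handle), None)
    if birth is None or pos is None:
        raise ValueError("no Birth event/eventref for %s" % birth_handle)
    # Next free E number, counting up from the highest one in use
    # (assumes IDs of the form E followed by digits).
    used = [int(e.get("id")[1:]) for e in events.findall("event")]
    new_id = "E%04d" % (max(used) + 1)
    new_handle = "_" + uuid.uuid4().hex[:20]   # assumed handle scheme
    event = ET.SubElement(events, "event", handle=new_handle,
                          change=str(int(time.time())), id=new_id)
    ET.SubElement(event, "type").text = "Baptism"
    date = birth.find("dateval")
    if date is not None:                       # "baptized on the same day"
        event.append(copy.deepcopy(date))
    # The new eventref goes directly after the one pointing at the Birth.
    person.insert(pos + 1,
                  ET.Element("eventref", hlink=new_handle, role="Primary"))
    return new_handle, new_id

SAMPLE = """<database>
  <events>
    <event handle="_e1" change="1284030602" id="E0007">
      <type>Birth</type><dateval val="1850-03-02"/>
    </event>
  </events>
  <people>
    <person handle="_p1" change="1284030602" id="I0001">
      <gender>M</gender>
      <eventref hlink="_e1" role="Primary"/>
    </person>
  </people>
</database>"""

if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    print(add_baptism(root, root.find("people/person"), "_e1"))
```

The eventref is located before anything is modified, so a failed lookup leaves the tree untouched instead of producing an event that nothing references.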

emyoulation · March 11, 2024, 2:53pm

You might also explore the import xml module (around where the message occurs) and insert some debugging “print” statements.

(Had to do that for some thumbnailer troubleshooting. Described in another thread)

emyoulation · March 11, 2024, 3:15pm

@Nick-Hall

A common suggestion has been to tweak the XML (with Perl or Grep).

Do the Import modules track the line number of the data file being imported?

If so, can you recommend a standard Gramps debugging line to insert to print the line number of the XML/gpkg file that caused a parsing error?

(Are there any ‘smart’ debugging options? Where the print line is skipped if Gramps isn’t in a debug mode? Or where only the first x instances are reported, to avoid a 30,000-line debug report.)
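On the ‘smart’ debugging idea, here is a minimal sketch that uses only Python's standard logging module and nothing Gramps-specific. The logger name "importer.debug" and the FirstN filter are hypothetical names chosen for illustration. Debug messages are skipped entirely unless the logger is set to DEBUG, and the filter drops everything after the first few records.

```python
import logging

class FirstN(logging.Filter):
    """Let only the first `limit` records through (hypothetical helper)."""

    def __init__(self, limit):
        super().__init__()
        self.limit = limit
        self.seen = 0

    def filter(self, record):
        self.seen += 1
        return self.seen <= self.limit

def make_debug_logger(name="importer.debug", limit=5, debug=False):
    """Return a logger that is silent unless `debug` is set, and capped."""
    log = logging.getLogger(name)
    log.setLevel(logging.DEBUG if debug else logging.WARNING)
    log.addFilter(FirstN(limit))
    return log

if __name__ == "__main__":
    logging.basicConfig(format="%(name)s: %(message)s")
    log = make_debug_logger(debug=True, limit=3)
    for lineno in range(1, 30001):
        # Only the first three of these 30,000 calls produce output.
        log.debug("suspicious element near line %d", lineno)
```

With debug=False the same calls cost almost nothing, because the level check rejects them before the filter is consulted.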

bruzzel · March 11, 2024, 3:25pm

The only additional information I received from the command line import is that the problem occurs at 19% of the import file. No idea if that refers to bytes or lines.
I couldn’t get debug to work (what is a LOGGER_NAME ?).
Tweaking the import script is a bit problematic as I don’t speak Python.
I think I’ll write a perl script just to check if every handle has one or more corresponding hlinks, and vice versa. Can’t be that difficult with what I already have.
Thanks for the suggestions so far.
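A check of this kind might look like the following sketch, written in Python rather than Perl and using only the standard library. The function name check_links and the sample data are illustrative. Note that the "vice versa" direction is only informational: a handle that nothing points to (a person without relatives, say) is often perfectly valid, whereas an hlink without a matching handle is the kind of problem an import is likely to reject.

```python
import xml.etree.ElementTree as ET

def check_links(root):
    """Return (dangling, unreferenced) for a parsed Gramps XML tree.

    dangling:     hlink values that match no handle.
    unreferenced: handles that no hlink points to (often harmless).
    """
    handles = set()
    hlinks = set()
    for elem in root.iter():
        # Attributes carry no namespace prefix, so this works with or
        # without the xmlns declaration of a real export.
        if "handle" in elem.attrib:
            handles.add(elem.attrib["handle"])
        if "hlink" in elem.attrib:
            hlinks.add(elem.attrib["hlink"])
    return sorted(hlinks - handles), sorted(handles - hlinks)

SAMPLE = """<database>
  <events>
    <event handle="_e1" id="E0001"><type>Birth</type></event>
    <event handle="_e2" id="E0002"><type>Baptism</type></event>
  </events>
  <people>
    <person handle="_p1" id="I0001">
      <eventref hlink="_e1" role="Primary"/>
      <eventref hlink="_e3" role="Primary"/>
    </person>
  </people>
</database>"""

if __name__ == "__main__":
    dangling, unreferenced = check_links(ET.fromstring(SAMPLE))
    print("dangling hlinks:", dangling)
    print("unreferenced handles:", unreferenced)
```

In the sample, the eventref to _e3 is dangling because no event carries that handle, which mirrors the mistake of inserting a reference to a handle that was never written.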

emyoulation · March 11, 2024, 3:26pm

Inserting a simple print line in the Python outputs to the Terminal. And, I think, to the default log file.

(Note that the biggest problem I have with Python is how sensitive it is to white space. If white space is SO critical, why isn’t there an easy option to show placeholder characters for spaces & tabs?)

ennoborg · March 11, 2024, 3:32pm

You can use Visual Studio Code for Python debugging in Gramps. That’s probably easiest when you run Gramps from source, from the maintenance/gramps51 branch. You will need to be able to read enough Python to figure out where you can set a breakpoint. You can tell Visual Studio Code to start Gramps.py in the gramps directory.

You can download the program here

and when you open a Python file, it will automatically download the Python extensions, so you can be up and running quite fast.

Visual Studio Code can run git too.

bruzzel · March 11, 2024, 3:40pm

Writing the check script was indeed a piece of cake, and it revealed the errors (all my mistakes, no problem with Gramps!).
Thanks!

romjerome · April 17, 2025, 1:58pm

Continuing the topic Format problems in Gramps XML:

You should still be able to do that via the “old method”: set an ID on your records! This will generate a new handle on import, more or less like the CSV import. Note that I also played with XML data in the past (migration to Gramps 0.x!). For fun, I looked at an old set of gramplets for XML handling. There were some issues. Improvements have been backported to the 5.2.x branch for one gramplet, as an experimental addon using Python’s built-in ElementTree module. It should be easy to hack for your own needs or to add print statements.

There is another gramplet (less flexible), which groups some validation checks and provides quick debug output.

It does not reinvent the wheel. The idea was to use Gramps’ own modules and methods as much as possible, so there is little “external” Python.
Both gramplets aim to try out some approaches and to share usage samples around the data stored in a .gramps file.

Some investigation points to one or two possible improvements, and some fixes may be missing in the latest available versions. Anyway, the code still runs with recent Python versions, and I re-checked many parts of it in the last few weeks. So I should be able to help if you need to check something very specific in the latest Gramps XML format.

(Gramps 5.2.5, Linux)

Still very experimental, but it has been improved in the last few weeks:

(rewording and reformat)

Project Overview

Objective: To enhance data handling and migration capabilities within Gramps, focusing on XML data and leveraging existing Gramps modules and methods.

Key Concepts

Record Identification:

Utilize record IDs to generate new handles upon import, similar to CSV import methods.
Ensure compatibility and ease of integration with existing Gramps functionalities.

XML Data Handling:

Experience with XML data migration, particularly from older versions of Gramps (0.x).
Exploration of gramplets for XML handling, identifying areas for improvement and potential backporting to the 5.2.x branch.

Technical Details

Gramplets for XML Handling:

First Gramplet:
- Utilizes the built-in ElementTree module in Python.
- Allows for easy customization and debugging with print statements.
- Potential for backporting improvements to the 5.2.x branch as an experimental addon.
Second Gramplet:
- Less flexible but provides quick debug returns and validation checking.
- Aims to leverage Gramps’ modules and methods to avoid reinventing the wheel.

Code Compatibility:

Code runs with the latest Python versions.
Recent re-checks confirm the functionality and compatibility of the code.

Potential Improvements

Identified Issues:

Some fixes may be missing in the latest versions available.
Investigations point to one or two possible improvements.

Support and Assistance:

Availability to help with specific checks on the latest Gramps XML format.
Willingness to share samples of usage and test various approaches around data stored in a .gramps file.

Conclusion

The project aims to enhance Gramps’ data handling capabilities by leveraging XML data and existing Gramps modules. The focus is on improving gramplets for XML handling, ensuring compatibility with the latest Python versions, and providing support for specific checks and improvements.

romjerome · April 20, 2025, 2:35pm

A custom (by content) ‘lite’ Gramps XML file has been generated via an addon, but I still needed to create some “fake entries” to pass the complete validation tests (DTD, RNG, XSD). There is also a versioning issue if one does not pay attention to the current Gramps branch: the DTD version set in the XML header must match too, not only the Gramps version.

I did not go too far with these “fake entries”. They are all attributes on primary objects (handle, change, type). Without a valid number for change, you could get an issue on import. Same for the handle.
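As a rough illustration of that point, the sketch below scans the primary objects of a parsed file and reports any that lack a handle or whose change attribute is not a number. The set of tag names is an assumption based on the element paths quoted in this thread, and check_primary_attributes is a hypothetical helper, not part of Gramps.

```python
import xml.etree.ElementTree as ET

# Assumed tag names of primary objects (children of <events>, <people>, ...).
PRIMARY_TAGS = {"event", "person", "family", "citation", "source",
                "placeobj", "object", "repository", "note"}

def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]

def check_primary_attributes(root):
    """List primary objects with a missing handle or non-numeric change."""
    problems = []
    for container in root:              # <events>, <people>, ...
        for elem in container:
            name = local_name(elem.tag)
            if name not in PRIMARY_TAGS:
                continue
            label = "%s %s" % (name, elem.get("id", "?"))
            if not elem.get("handle"):
                problems.append(label + ": missing handle")
            if not elem.get("change", "").isdigit():
                problems.append(label + ": change is not a number")
    return problems

SAMPLE = """<database>
  <events>
    <event handle="_e1" change="1284030602" id="E0001"/>
    <event handle="_e2" change="abc" id="E0002"/>
  </events>
  <people>
    <person change="1284030602" id="I0001"/>
  </people>
</database>"""

if __name__ == "__main__":
    for line in check_primary_attributes(ET.fromstring(SAMPLE)):
        print(line)
```

Only the direct children of the top-level containers are inspected, so nested elements that happen to share a name are not reported by mistake.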

A little bit strange, as I was not able to fully reproduce some issues reported on

Maybe the behaviour around importing Source primary objects needs to change: during testing it was not able to fully support the ‘old style XML’ (i.e. ID value only). Or the import could be limited to People or Places data. Is there a buffer issue somewhere in the XML import with the old ID method? Anyway, maybe no one still keeps old backup files in the ‘old style’. And even with such a file, there have been so many upgrades since Gramps 1.x that a failure might occur at the beginning of the upgrade process (and not on the source mapping with ID only). Maybe a possible “corner case” (import and custom source data set) for CSV import of the Import Gramplet?

This “tiny” Gramps XML generator runs with the lxml library, which may differ slightly from the built-in ElementTree API.

emyoulation · April 20, 2025, 3:14pm

Could this be adapted to show an XML of the selected objects (with secondary objects) in the Gramplet instead of (or alternatively to) an external file?

romjerome · April 20, 2025, 4:06pm

I think so. For secondary objects support, and to experiment further, we may need a third-party library like this one:

>>> from elementpath import select, XPath1Parser
>>> from xml.etree import ElementTree
>>> root = ElementTree.XML('<A><B1/><B2><C1/><C2/><C3/></B2></A>')
>>> select(root, '/A/B2/*', parser=XPath1Parser)
[<Element 'C1' at ...>, <Element 'C2' at ...>, <Element 'C3' at ...>]

An XPath selector (whatever the version) could be great. The elementpath library seems to be designed for that. But we could do it without this library. Maybe (not tested):

>>> from lxml import etree  # could also be: from xml.etree import ElementTree
>>> parser = etree.XMLParser(remove_blank_text=True)
>>> xml = b'<person><name><surname>Garner</surname></name></person>'  # placeholder sample document
>>> root = etree.fromstring(xml, parser)
>>> print(etree.tostring(root.find(".//surname"), method='xml', encoding='utf-8', pretty_print=True).decode('utf-8'))
>>> surnames = root.findall(".//surname")  # list of surname elements
>>> print(surnames)
>>> for s in surnames:
...     print(etree.tostring(s, method='text', pretty_print=False, encoding='utf-8').decode('utf-8'))

Now, since the generated custom Gramps XML has been validated (structure, syntax, content, etc.) and the file was imported into a new Family Tree, I suppose I am at the end of the experimentation around customizing a Gramps XML lite file.

github.com/romjeromealt/addons-source

gramps60/lxml/lxmlGramplet.py

maintenance


      
          #LOG.debug(description)
          
          # relative and absolute paths
          
          src = os.path.join(mediapath, src)
          
          # windows OS ???
          if not src.startswith("/"):
              src = os.path.join(USER_HOME, src)
          
          #LOG.debug(src)
          
          # only images
          
          if mime.startswith("image"):
              thumb = get_thumbnail_path(str(src), mtype=None, rectangle=None)
              #LOG.debug(thumb)
              self.text_page += Html('img', src=str(thumb), mtype=str(mime))
              self.text_page += fullclear
              self.text_page += Html('a', str(description), href=str(src), target='blank', title=str(mime))
              self.text_page += fullclear

There are also experiments with a local wrapper (local HTML file):

github.com/romjeromealt/addons-source

lxml/lxmlGramplet.py

2089ba42c


      
                  end_iter = self.text.get_end_iter()
                  format = self.text.register_serialize_tagset()
                  text = self.text.serialize(self.text,
                                          format,
                                          start_iter,
                                          end_iter)
                  LOG.info(text)
                  info = self.text.get_text(start_iter, end_iter, True)
                  self.text.set_text(custom_jsonl + info)
          
          def post(self, html):
              """
              Try to play with request and parse the HTML content.
              """
              try:
                  # Open the HTML file
                  import urllib
                  with urllib.request.urlopen(f'file://{html}') as response:
                      data = response.read()
          
                  # Parse the HTML content

The idea was also to look at Gramps objects in the Family Tree. I looked at the diff.py module and at addons like the scripts for Gram.py or SuperTools, but does it make sense to call addons from an addon? Within the built-in Gramps ecosystem, SimpleAccess will be limited to primary objects (I guess; maybe that’s why it is a simple access!).

Note that these addons are sets of experiments. To go further, a dedicated addon might be a better solution, at least for debugging. E.g., lxml will raise errors on any minor issue. Most of them will pass in pure Python; I suppose the difference is related to the C layer (CPython, libxml, libxslt?).

On the other hand, an XPath selector as a standalone tool, without providing an additional feature for Gramps, would be another experiment. Maybe we could have fun navigating a Gramps XML file, but like any GEDCOM reader or navigator, it would need to offer something that most notepads (text readers) lack.

If GEDCOM 7.0 were an XML file format, maybe this could be an alternative converter or bridge?

Note, I thought that the interlink counting in the etree gramplet covered only the relations between objects, but it seems that attributes are also included in the counted XML tags. So these attribute records might be displayed, but a real analysis needs something more advanced, like reports and tools (e.g., the Database Differences Report or the Import Gramplet).

About secondary objects: a simple list could make the XML matching easier (pointer), but handles/hlinks are mostly for parent records (top-level objects). So this might take many lines of code just to retrieve a secondary object from the DB. Maybe, without the need to make a comparison, the user does not care about navigating secondary objects in a Gramps XML file? And a comparison without data from the DB could be fun as an experiment, but I am not certain I could easily test it myself with the current ecosystem. Via orjson?

romjerome · April 20, 2025, 4:38pm

Currently, it does some print() or logging via the console. The Gtk.TextBuffer() will only display either a string, self.text.set_text(_('No file loaded...')), or a list.

Do you mean passing the current info log to the Gtk.TextBuffer()?
Something like:

import xmltodict, json
    with open(Path(filename), "rb") as file:
-       self.text.set_text('xmltodict')
        document = xmltodict.parse(file, dict_constructor=dict)
+       self.text.set_text(xmltodict.unparse(document, pretty=True))
-       LOG.info(xmltodict.unparse(document, pretty=True))
+       LOG.info('xmltodict')

I just wanted to avoid resource usage on the user side. Also, this has been tested with the content of example.gramps. We will not be able to use something like that for 1 GB of Gramps XML data! To ‘pretty print’ only via logging was, for me, safer. Note that pretty printing generates a larger output (also in memory). So it is nice for displaying an XML file, but I did not run a robust stress test of the Gtk.TextBuffer() limits. I saw that the Import gramplet can handle a piece of Gramps XML code, but I am not certain that a very large data set will never cause a crash.

This explains why I often called clear() during the experiments. With the first versions (more than 10 years ago), resources were more limited. I got some CPU/memory issues with very large Gramps XML files, then quickly added limits so as not to reproduce these issues.
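For very large files, one way to keep memory use down is to stream the document instead of loading it whole. The following is a minimal sketch with the standard library's ElementTree.iterparse, not the code of the gramplet; count_tags_streaming is a hypothetical name. Each element is cleared once it has been counted, in the spirit of the clear() calls mentioned above.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

def count_tags_streaming(source):
    """Count elements per tag name while parsing `source` incrementally."""
    counts = Counter()
    for _event, elem in ET.iterparse(source, events=("end",)):
        # Drop any '{namespace}' prefix so the keys are plain tag names.
        counts[elem.tag.rsplit("}", 1)[-1]] += 1
        # Release the element's text, attributes and children once counted.
        elem.clear()
    return counts

if __name__ == "__main__":
    data = io.BytesIO(
        b"<database><events><event/><event/></events>"
        b"<people><person/></people></database>"
    )
    print(count_tags_streaming(data))
```

The cleared (now empty) elements stay attached to their parents, so memory still grows slowly with the number of elements, but far less than when text and attributes are kept.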

Do you mean secondary objects like children of the top level Primary object into the hierarchy, or secondary indices into the DB (XML file or SQL tables)?

From such a hierarchical view, it seems to me that references (and backreferences), as interlinks in a flat raw XML database, will have their own tags or Elements in the ElementTree. Wherever there is a location for an attribute object (Event, Person, EventRef, ObjRef, Family, Source via the srcattribute list, Object [media]), or a sequence with any object ref, there is a secondary record (secondary indices or secondary objects).

References and backreferences

/database/events/event[0]/place/@hlink
/database/places/placeobj[0]/@handle
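To make the reference/backreference pairing concrete, here is a small sketch that builds a backreference index from the hlink attributes: for each handle, which top-level records point to it. It uses the standard ElementTree module; the function name backreferences and the sample data are illustrative only.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET

def local(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]

def backreferences(root):
    """Map each referenced handle to the (tag, id) of the records linking to it."""
    index = defaultdict(list)
    for container in root:                 # <events>, <people>, <places>, ...
        for owner in container:            # one primary record
            for elem in owner.iter():      # the record and everything below it
                hlink = elem.get("hlink")
                if hlink:
                    index[hlink].append((local(owner.tag), owner.get("id")))
    return dict(index)

SAMPLE = """<database>
  <events>
    <event handle="_e1" id="E0000"><place hlink="_pl1"/></event>
  </events>
  <people>
    <person handle="_p1" id="I0000">
      <eventref hlink="_e1" role="Primary"/>
    </person>
  </people>
  <places>
    <placeobj handle="_pl1" id="P0000"/>
  </places>
</database>"""

if __name__ == "__main__":
    for handle, owners in backreferences(ET.fromstring(SAMPLE)).items():
        print(handle, "<-", owners)
```

Because the whole subtree of each record is scanned, links held by secondary elements (an eventref, a citationref inside an attribute) are attributed to the primary record that contains them.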

Secondary indices (DB) ?

/database/header
/database/header/researcher
/database/header/mediapath/text()
/database/name-formats/format[0]
...
###Attribute sample###
...
/database/events/event[0]/attribute[0]
/database/events/event[0]/attribute[0]/@priv
/database/events/event[0]/attribute[0]/@type
/database/events/event[0]/attribute[0]/@value
/database/events/event[0]/attribute[0]/citationref[0]
/database/events/event[0]/attribute[0]/citationref[0]/@hlink
/database/events/event[0]/attribute[0]/noteref[0]
/database/events/event[0]/attribute[0]/noteref[0]/@hlink
...
/database/events/event[0]/type/text()
/database/events/event[0]/dateval
/database/events/event[0]/description
/database/people/@default
/database/people/@home
/database/people/person[0]/gender
/database/people/person[0]/name[0]/surname
/database/people/person[0]/address[0]
/database/people/person[0]/url[0]
/database/places/placeobj[0]/coord
/database/objects/object[0]/file
/database/notes/note[0]/text/text()
/database/bookmarks
/database/namemaps/map[0]

I am not certain, maybe a possible typo on the surname(s) sequence?
Having such specific groups might help for checking either secondary indices or objects with references. Once we know the Element, “pretty print” will display the Element itself and its children. Getting a map via references is outside the scope of such a flat raw summary and should maybe be coded in Python somewhere else.

I will check the surname XML hierarchy, as the pseudo-documentation needs a fix in this section. We cannot set the surname tag twice. ‘‘Fixed’’

romjerome · April 20, 2025, 5:01pm

Note, you will have to comment out or disable all following and preceding lines with self.text.set_text(), because it runs so fast that ‘pretty printing’ the XML content should happen at the end of the run…

>>> from pathlib import Path
>>> from xml.etree import ElementTree
>>> tree = ElementTree.parse(Path('filename'))  # sample with example.gramps
>>> root = tree.getroot()
>>> NAMESPACE = '{http://gramps-project.org/xml/1.7.2/}'
>>> print(root.find(NAMESPACE + 'header'))
>>> print(ElementTree.tostring(root.find(NAMESPACE + 'header')))
>>> for nf in root.find(NAMESPACE + 'name-formats'):
...     print(ElementTree.tostring(nf))

will return something like this:

<Element '{http://gramps-project.org/xml/1.7.2/}header' at 0x7f3d9a73a278>
b'<ns0:header xmlns:ns0="http://gramps-project.org/xml/1.7.2/">\n    <ns0:created date="2025-03-18" version="6.0.0" />\n    <ns0:researcher>\n      <ns0:resname>Alex Roitman,,,</ns0:resname>\n    </ns0:researcher>\n    <ns0:mediapath>{GRAMPS_RESOURCES}/example/gramps</ns0:mediapath>\n  </ns0:header>\n  '
b'<ns0:format xmlns:ns0="http://gramps-project.org/xml/1.7.2/" active="1" fmt_str="SURNAME, given (common)" name="SURNAME, Given (Common)" number="-1" />\n  '
...

We have the children hierarchy and content by calling the matching tag Element.
etc.
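The namespace string in the session above is tied to one XML version (1.7.2). A small sketch of how it could be read from the root tag instead, so the same lookups work for whichever version a file declares; namespace_of is a hypothetical helper, and the regular expression assumes the URI pattern shown in the output above.

```python
import re
import xml.etree.ElementTree as ET

def namespace_of(root):
    """Return ('{uri}', version) taken from the root tag, or ('', None)."""
    match = re.match(r"\{(.*?/xml/([0-9.]+)/)\}", root.tag)
    if match is None:
        return "", None
    return "{%s}" % match.group(1), match.group(2)

SAMPLE = ('<database xmlns="http://gramps-project.org/xml/1.7.2/">'
          '<header/></database>')

if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    namespace, version = namespace_of(root)
    print(version)
    print(root.find(namespace + "header"))
```

The returned version can also be compared with the one expected by the running Gramps branch before attempting an import.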

There are just some minor differences in behavior and XPath handling between ElementTree and the lxml API. Same for the counting, depending on their versions.

Also, XML attribute matching does not always make sense, as the parent tag/object keeps them as items (Element in the tree). That’s also why I tried to be as direct as possible. find() matches the first tag in the hierarchy. This works fine for primary objects like Events, People, etc. findall() was used for counting, and could serve for a specific iteration or pseudo-mapping. I did not look at other functions as I am not using the most recent Python versions (3.10 or 3.12).

>>> for one in root:
...     print(one)

<Element '{http://gramps-project.org/xml/1.7.2/}header' at 0x7f6e66cad278>
<Element '{http://gramps-project.org/xml/1.7.2/}name-formats' at 0x7f6e6cb5f188>
<Element '{http://gramps-project.org/xml/1.7.2/}tags' at 0x7f6e6cb5f2c8>
<Element '{http://gramps-project.org/xml/1.7.2/}events' at 0x7f6e6cb5f458>
<Element '{http://gramps-project.org/xml/1.7.2/}people' at 0x7f6e651e9cc8>
<Element '{http://gramps-project.org/xml/1.7.2/}families' at 0x7f6e5fbe3f48>
<Element '{http://gramps-project.org/xml/1.7.2/}citations' at 0x7f6e5f8787c8>
<Element '{http://gramps-project.org/xml/1.7.2/}sources' at 0x7f6e5eb90868>
<Element '{http://gramps-project.org/xml/1.7.2/}places' at 0x7f6e5eb963b8>
<Element '{http://gramps-project.org/xml/1.7.2/}objects' at 0x7f6e5e84f368>
<Element '{http://gramps-project.org/xml/1.7.2/}repositories' at 0x7f6e5e84fb38>
<Element '{http://gramps-project.org/xml/1.7.2/}notes' at 0x7f6e5e8556d8>
<Element '{http://gramps-project.org/xml/1.7.2/}bookmarks' at 0x7f6e5e8705e8>
<Element '{http://gramps-project.org/xml/1.7.2/}namemaps' at 0x7f6e5e870818>

for one in root:
    print(one)
    for two in one:
        print(two)
        for three in two:
            print(three)

<Element '{http://gramps-project.org/xml/1.7.2/}header' at 0x7feaa1100278>
<Element '{http://gramps-project.org/xml/1.7.2/}created' at 0x7feaa1100a98>
<Element '{http://gramps-project.org/xml/1.7.2/}researcher' at 0x7feaa1100408>
<Element '{http://gramps-project.org/xml/1.7.2/}resname' at 0x7feaa2f25e08>
<Element '{http://gramps-project.org/xml/1.7.2/}mediapath' at 0x7feaa2f25f48>
<Element '{http://gramps-project.org/xml/1.7.2/}name-formats' at 0x7feaa2f25ef8>
<Element '{http://gramps-project.org/xml/1.7.2/}format' at 0x7feaa2f25e58>
<Element '{http://gramps-project.org/xml/1.7.2/}tags' at 0x7feaa2f25d18>
<Element '{http://gramps-project.org/xml/1.7.2/}tag' at 0x7feaa2f25b38>
<Element '{http://gramps-project.org/xml/1.7.2/}tag' at 0x7feaa2f25cc8>
<Element '{http://gramps-project.org/xml/1.7.2/}events' at 0x7feaa2f25c28>
<Element '{http://gramps-project.org/xml/1.7.2/}event' at 0x7feaa2f25b88>
<Element '{http://gramps-project.org/xml/1.7.2/}type' at 0x7feaa2f259f8>
<Element '{http://gramps-project.org/xml/1.7.2/}dateval' at 0x7feaa2f258b8>
<Element '{http://gramps-project.org/xml/1.7.2/}place' at 0x7feaa2f256d8>
<Element '{http://gramps-project.org/xml/1.7.2/}description' at 0x7feaa2f25868>
<Element '{http://gramps-project.org/xml/1.7.2/}event' at 0x7feaa2f257c8>
<Element '{http://gramps-project.org/xml/1.7.2/}type' at 0x7feaa2f25778>
<Element '{http://gramps-project.org/xml/1.7.2/}description' at 0x7feaa2f25728>
<Element '{http://gramps-project.org/xml/1.7.2/}event' at 0x7feaa2f255e8>
<Element '{http://gramps-project.org/xml/1.7.2/}type' at 0x7feaa2f25638>
<Element '{http://gramps-project.org/xml/1.7.2/}dateval' at 0x7feaa2f252c8>
<Element '{http://gramps-project.org/xml/1.7.2/}place' at 0x7feaa2f25318>
<Element '{http://gramps-project.org/xml/1.7.2/}description' at 0x7feaa2f25368>
<Element '{http://gramps-project.org/xml/1.7.2/}event' at 0x7feaa2f254a8>
<Element '{http://gramps-project.org/xml/1.7.2/}type' at 0x7feaa2f25458>

etc.
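The three nested loops above can be generalized to any depth with a short recursive helper. This is a sketch only: outline is a hypothetical name, and the namespace prefix is stripped and indentation added purely for readability.

```python
import xml.etree.ElementTree as ET

def outline(elem, max_depth=2, depth=0, lines=None):
    """Return indented tag names for `elem` and its descendants up to max_depth."""
    lines = [] if lines is None else lines
    lines.append("  " * depth + elem.tag.rsplit("}", 1)[-1])
    if depth < max_depth:
        for child in elem:
            outline(child, max_depth, depth + 1, lines)
    return lines

SAMPLE = ("<database><header><created/></header>"
          "<events><event><type/></event></events></database>")

if __name__ == "__main__":
    print("\n".join(outline(ET.fromstring(SAMPLE))))
```

Limiting the depth keeps the listing short on a real tree; raising max_depth by one adds the next level of children.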

Sorry, there was an old remaining issue with whitespace in some common default paths, like Program Files (Windows), Application Support (macOS), Google Drive, etc., and with gzip(p)ed Gramps XML files.
Feel free to improve or modify the etree gramplet. The “playground” used for the above testing is under the def parse_xml() section. Fewer than ten lines in a Python console should provide the expected information too.
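For the gzipped case, a minimal console sketch (not the gramplet's own code) that opens a .gramps file whether or not it is compressed. It checks the two-byte gzip magic number and passes the path straight to open(), so spaces in directory names need no special handling. parse_gramps_file is a hypothetical name.

```python
import gzip
import xml.etree.ElementTree as ET

def parse_gramps_file(path):
    """Parse a Gramps XML file, plain or gzip-compressed, and return its root."""
    with open(path, "rb") as raw:
        compressed = raw.read(2) == b"\x1f\x8b"   # gzip magic number
    opener = gzip.open if compressed else open
    with opener(path, "rb") as stream:
        return ET.parse(stream).getroot()
```

Usage would be along the lines of root = parse_gramps_file("/some/path with spaces/example.gramps"), where the path is only an example.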

for three in two:
    print(three, three.tag, three.tail, three.attrib, three.items())

will return:
…

     {'hlink': '_a5af0ebb9cb14a540b8', 'role': 'Primary'} [('hlink', '_a5af0ebb9cb14a540b8'), ('role', 'Primary')]
<Element '{http://gramps-project.org/xml/1.7.2/}childof' at 0x7f6483f7fd68> {http://gramps-project.org/xml/1.7.2/}childof 
       {'hlink': '_PQLKQCZXJL39KAJ927'} [('hlink', '_PQLKQCZXJL39KAJ927')]
<Element '{http://gramps-project.org/xml/1.7.2/}citationref' at 0x7f6483f7fdb8> {http://gramps-project.org/xml/1.7.2/}citationref 
     {'hlink': '_c140d26615d00343a2a'} [('hlink', '_c140d26615d00343a2a')]
<Element '{http://gramps-project.org/xml/1.7.2/}first' at 0x7f6483f7fc78> {http://gramps-project.org/xml/1.7.2/}first 
         {} []
<Element '{http://gramps-project.org/xml/1.7.2/}surname' at 0x7f6483f7fcc8> {http://gramps-project.org/xml/1.7.2/}surname 
       {} []
<Element '{http://gramps-project.org/xml/1.7.2/}gender' at 0x7f6483f7fe58> {http://gramps-project.org/xml/1.7.2/}gender 
       {} []
<Element '{http://gramps-project.org/xml/1.7.2/}name' at 0x7f6483f7fea8> {http://gramps-project.org/xml/1.7.2/}name 
       {'type': 'Birth Name'} [('type', 'Birth Name')]
<Element '{http://gramps-project.org/xml/1.7.2/}eventref' at 0x7f6483f7ff98> {http://gramps-project.org/xml/1.7.2/}eventref 
...
    {'hlink': '_a5af0ed68655861efbe', 'role': 'Family'} [('hlink', '_a5af0ed68655861efbe'), ('role', 'Family')]
<Element '{http://gramps-project.org/xml/1.7.2/}childref' at 0x7f6483899db8> {http://gramps-project.org/xml/1.7.2/}childref 
       {'hlink': '_ID6KQC0QKF8901H8ZG'} [('hlink', '_ID6KQC0QKF8901H8ZG')]
<Element '{http://gramps-project.org/xml/1.7.2/}childref' at 0x7f6483899e08> {http://gramps-project.org/xml/1.7.2/}childref 
       {'hlink': '_3LWKQCO1STR7E2WKB5'} [('hlink', '_3LWKQCO1STR7E2WKB5')]
<Element '{http://gramps-project.org/xml/1.7.2/}childref' at 0x7f6483899e58> {http://gramps-project.org/xml/1.7.2/}childref 
       {'hlink': '_PLWKQCF4RWXWG1G60A'} [('hlink', '_PLWKQCF4RWXWG1G60A')]
<Element '{http://gramps-project.org/xml/1.7.2/}childref' at 0x7f6483899ea8> {http://gramps-project.org/xml/1.7.2/}childref 
       {'hlink': '_EMWKQC03WYSNOW7OS2'} [('hlink', '_EMWKQC03WYSNOW7OS2')]
<Element '{http://gramps-project.org/xml/1.7.2/}childref' at 0x7f6483899ef8> {http://gramps-project.org/xml/1.7.2/}childref 
       {'hlink': '_PNWKQC1MHXVPWXURT3'} [('hlink', '_PNWKQC1MHXVPWXURT3')]
<Element '{http://gramps-project.org/xml/1.7.2/}childref' at 0x7f6483899f48> {http://gramps-project.org/xml/1.7.2/}childref 
       {'hlink': '_NUWKQCO7TVAOH0CHLV'} [('hlink', '_NUWKQCO7TVAOH0CHLV')]
<Element '{http://gramps-project.org/xml/1.7.2/}childref' at 0x7f6483899f98> {http://gramps-project.org/xml/1.7.2/}childref 
       {'hlink': '_AVWKQCFEVZ1VAPVY8O'} [('hlink', '_AVWKQCFEVZ1VAPVY8O')]
<Element '{http://gramps-project.org/xml/1.7.2/}childref' at 0x7f64838a0048> {http://gramps-project.org/xml/1.7.2/}childref 
       {'hlink': '_3XWKQCDDBNSGVE84ET'} [('hlink', '_3XWKQCDDBNSGVE84ET')]
<Element '{http://gramps-project.org/xml/1.7.2/}childref' at 0x7f64838a0098> {http://gramps-project.org/xml/1.7.2/}childref 
       {'hlink': '_SXWKQCHK1ZFY3K3U27'} [('hlink', '_SXWKQCHK1ZFY3K3U27')]
<Element '{http://gramps-project.org/xml/1.7.2/}citationref' at 0x7f64838a00e8> {http://gramps-project.org/xml/1.7.2/}citationref 
     {'hlink': '_c140d28db3d175718fb'} [('hlink', '_c140d28db3d175718fb')]
<Element '{http://gramps-project.org/xml/1.7.2/}rel' at 0x7f64838a0188> {http://gramps-project.org/xml/1.7.2/}rel 
       {'type': 'Married'} [('type', 'Married')]
<Element '{http://gramps-project.org/xml/1.7.2/}father' at 0x7f64838a01d8> {http://gramps-project.org/xml/1.7.2/}father 
       {'hlink': '_ZDPKQCR0W4EC0JYQ0H'} [('hlink', '_ZDPKQCR0W4EC0JYQ0H')]
<Element '{http://gramps-project.org/xml/1.7.2/}mother' at 0x7f64838a0228> {http://gramps-project.org/xml/1.7.2/}mother 
       {'hlink': '_HDPKQCVUZ1TN61K6DS'} [('hlink', '_HDPKQCVUZ1TN61K6DS')]
<Element '{http://gramps-project.org/xml/1.7.2/}eventref' at 0x7f64838a0278> {http://gramps-project.org/xml/1.7.2/}eventref 
...
{'hlink': '_c9658726f7b5d7a6086246c1242'} [('hlink', '_c9658726f7b5d7a6086246c1242')]
<Element '{http://gramps-project.org/xml/1.7.2/}ptitle' at 0x7f6482158408> {http://gramps-project.org/xml/1.7.2/}ptitle 
       {} []
<Element '{http://gramps-project.org/xml/1.7.2/}pname' at 0x7f6482158458> {http://gramps-project.org/xml/1.7.2/}pname 
       {'value': 'Bennington'} [('value', 'Bennington')]
<Element '{http://gramps-project.org/xml/1.7.2/}placeref' at 0x7f64821584f8> {http://gramps-project.org/xml/1.7.2/}placeref 
     {'hlink': '_c96587264e44365e02812c02bbe'} [('hlink', '_c96587264e44365e02812c02bbe')]
<Element '{http://gramps-project.org/xml/1.7.2/}ptitle' at 0x7f6482158598> {http://gramps-project.org/xml/1.7.2/}ptitle 
       {} []
<Element '{http://gramps-project.org/xml/1.7.2/}pname' at 0x7f64821585e8> {http://gramps-project.org/xml/1.7.2/}pname 
       {'value': 'Shawnee'} [('value', 'Shawnee')]
<Element '{http://gramps-project.org/xml/1.7.2/}placeref' at 0x7f6482158688> {http://gramps-project.org/xml/1.7.2/}placeref 
...
<Element '{http://gramps-project.org/xml/1.7.2/}bookmark' at 0x7f6481fee688> {http://gramps-project.org/xml/1.7.2/}bookmark 
     {'target': 'person', 'hlink': '_AWFKQCJELLUWDY2PD3'} [('target', 'person'), ('hlink', '_AWFKQCJELLUWDY2PD3')]
<Element '{http://gramps-project.org/xml/1.7.2/}bookmark' at 0x7f6481fee6d8> {http://gramps-project.org/xml/1.7.2/}bookmark 
     {'target': 'person', 'hlink': '_35WJQC1B7T7NPV8OLV'} [('target', 'person'), ('hlink', '_35WJQC1B7T7NPV8OLV')]
<Element '{http://gramps-project.org/xml/1.7.2/}bookmark' at 0x7f6481fee728> {http://gramps-project.org/xml/1.7.2/}bookmark 
     {'target': 'person', 'hlink': '_Q8HKQC3VMRM1M6M7ES'} [('target', 'person'), ('hlink', '_Q8HKQC3VMRM1M6M7ES')]
<Element '{http://gramps-project.org/xml/1.7.2/}bookmark' at 0x7f6481fee778> {http://gramps-project.org/xml/1.7.2/}bookmark 
   {'target': 'family', 'hlink': '_9OUJQCBOHW9UEK9CNV'} [('target', 'family'), ('hlink', '_9OUJQCBOHW9UEK9CNV')]
[]
<Element '{http://gramps-project.org/xml/1.7.2/}map' at 0x7f6481fee8b8> {http://gramps-project.org/xml/1.7.2/}map 
   {'type': 'group_as', 'key': 'Fernández', 'value': 'Fernandez'} [('type', 'group_as'), ('key', 'Fernández'), ('value', 'Fernandez')]

etc.

UPDATE

I enabled some options (Boolean and String). This could also be more flexible for debugging or any print statement. One option dumps the file and prints something like this:

{http://gramps-project.org/xml/1.7.2/}database = None [ObjectifiedElement]
    {http://gramps-project.org/xml/1.7.2/}header = None [ObjectifiedElement]
        {http://gramps-project.org/xml/1.7.2/}created = '' [StringElement]
          * date = '2025-03-18'
          * version = '6.0.0'
        {http://gramps-project.org/xml/1.7.2/}researcher = None [ObjectifiedElement]
            {http://gramps-project.org/xml/1.7.2/}resname = 'Alex Roitman,,,' [StringElement]
        {http://gramps-project.org/xml/1.7.2/}mediapath = '{GRAMPS_RESOURCES}/example/gramps' [StringElement]
    {http://gramps-project.org/xml/1.7.2/}name-formats = None [ObjectifiedElement]
        {http://gramps-project.org/xml/1.7.2/}format = '' [StringElement]
          * number = '-1'
          * name = 'SURNAME, Given (Common)'
          * fmt_str = 'SURNAME, given (common)'
          * active = '1'
    {http://gramps-project.org/xml/1.7.2/}tags = None [ObjectifiedElement]
        {http://gramps-project.org/xml/1.7.2/}tag = '' [StringElement]
          * handle = '_bb80c229eef1ee1a3ec'
          * change = '1288512479'
          * name = 'complete'
          * color = '#076780873bf0'
          * priority = '1'
        {http://gramps-project.org/xml/1.7.2/}tag = '' [StringElement]
          * handle = '_bb80c2b235b0a1b3f49'
          * change = '1288512442'
          * name = 'ToDo'
          * color = '#efb60c280c28'
          * priority = '0'
    {http://gramps-project.org/xml/1.7.2/}events = None [ObjectifiedElement]
        {http://gramps-project.org/xml/1.7.2/}event = None [ObjectifiedElement]
          * handle = '_a5af0eb667015e355db'
          * change = '1284030602'
          * id = 'E0000'
            {http://gramps-project.org/xml/1.7.2/}type = 'Birth' [StringElement]
            {http://gramps-project.org/xml/1.7.2/}dateval = '' [StringElement]
              * val = '1987-08-29'
            {http://gramps-project.org/xml/1.7.2/}place = '' [StringElement]
              * hlink = '_08TJQCCFIX31BXPNXN'
            {http://gramps-project.org/xml/1.7.2/}description = 'Birth of Warner, Sarah Suzanne' [StringElement]
        {http://gramps-project.org/xml/1.7.2/}event = None [ObjectifiedElement]
          * handle = '_a5af0eb696917232725'
          * change = '1284030602'
          * id = 'E0001'
            {http://gramps-project.org/xml/1.7.2/}type = 'LVG' [StringElement]
            {http://gramps-project.org/xml/1.7.2/}description = 'Custom FTW5 tag to specify LIVING not specified in GEDCOM 5.5' [StringElement]
        {http://gramps-project.org/xml/1.7.2/}event = None [ObjectifiedElement]
          * handle = '_a5af0eb698f29568502'
          * change = '1284030602'
          * id = 'E0002'
            {http://gramps-project.org/xml/1.7.2/}type = 'Birth' [StringElement]
            {http://gramps-project.org/xml/1.7.2/}dateval = '' [StringElement]
              * val = '1928-07-09'
            {http://gramps-project.org/xml/1.7.2/}place = '' [StringElement]
              * hlink = '_1GTJQCCXZ3YO5QOFS'
            {http://gramps-project.org/xml/1.7.2/}description = 'Birth of Garner, Howard Lane' [StringElement]
        {http://gramps-project.org/xml/1.7.2/}event = None [ObjectifiedElement]
          * handle = '_a5af0eb69b82a6cdc5a'
          * change = '1284030612'
          * id = 'E0003'
            {http://gramps-project.org/xml/1.7.2/}type = 'Birth' [StringElement]
            {http://gramps-project.org/xml/1.7.2/}place = '' [StringElement]
              * hlink = '_Q8VJQCBTTFJ6B54QBI'
            {http://gramps-project.org/xml/1.7.2/}description = 'Birth of Schultz, John' [StringElement]

There is also a recover option which should (in theory) point to a possible parsing issue by stopping the stream. I made a quick test with a malformed file using the .gramps extension and the correct DTD version:

test.xml:291: parser error : Opening and ending tag mismatch: event line 286 and evant
    </evant>

but it is xmllint output, so I am not certain that this will work under Windows (AIO) or the macOS bundle, within an lxml object/element. The standalone xmllint command is OK. Third-party libraries like lxml might have different support depending on the OS.
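A platform-independent way to get at least the position of a well-formedness error is the standard library parser, which needs neither xmllint nor lxml. The sketch below is a different mechanism from the recover option described above: it stops at the first error and reports where it was found. locate_parse_error is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

def locate_parse_error(text):
    """Return None if `text` is well-formed, else (line, column, message)."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        line, column = err.position    # position reported by the parser
        return line, column, str(err)
    return None

BROKEN = """<events>
  <event>
    <type>Birth</type>
  </evant>
</events>"""

if __name__ == "__main__":
    print(locate_parse_error(BROKEN))
```

This only catches XML that is not well-formed, such as the mismatched closing tag in the sample; a file that parses cleanly can still contain broken handle/hlink relations.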

The debug_xml option will also look at the xmltodict third-party library (if present) and some lxml options/functions. I just grouped some command lines and tools for generating print or logging statements. These options could make debugging less confusing.
