Format problems in Gramps XML

Currently, it only calls print() or logs to the console. The Gtk.TextBuffer() will only display either a string, as in self.text.set_text(_('No file loaded...')), or a list.

Do you mean passing the current info log to the Gtk.TextBuffer()?
Something like:

import xmltodict, json
    with open(Path(filename), "rb") as file:
-       self.text.set_text('xmltodict')
        document = xmltodict.parse(file, dict_constructor=dict)
+       self.text.set_text(xmltodict.unparse(document, pretty=True))
-       LOG.info(xmltodict.unparse(document, pretty=True))
+       LOG.info('xmltodict')

I just wanted to avoid resource usage on the user side. Also, this has only been tested with the content of example.gramps. We will not be able to use something like that for 1 GB of Gramps XML data! To ‘pretty print’ only via logging was, for me, safer. Note that pretty printing generates a larger output (also in memory). So it is nice for displaying an XML file, but I did not run a robust stress test of the Gtk.TextBuffer() limits. I saw that the Import gramplet can handle a piece of Gramps XML code, but I am not certain that a very large data set will never cause a crash.

This explains why I often called clear() while experimenting. In the first versions (more than 10 years ago), resources were more limited. I ran into CPU/memory issues with very large Gramps XML files, then quickly added limits so that these issues would not occur again.
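
A minimal sketch of what such limits could look like, using only the standard library instead of xmltodict. The function name pretty_preview and both limits (MAX_BYTES, MAX_CHARS) are hypothetical values chosen for illustration, not what the addon actually uses, and the file is assumed to be plain (not gzipped) XML:

```python
import os
import xml.etree.ElementTree as ET

MAX_BYTES = 5 * 1024 * 1024   # hypothetical limit on the input file size
MAX_CHARS = 200_000           # hypothetical limit on the displayed text


def pretty_preview(filename, max_bytes=MAX_BYTES, max_chars=MAX_CHARS):
    """Return indented XML text for display, or None if the file is too big."""
    if os.path.getsize(filename) > max_bytes:
        return None  # too large: the caller should fall back to logging only
    tree = ET.parse(filename)
    ET.indent(tree, space="  ")  # available since Python 3.9
    text = ET.tostring(tree.getroot(), encoding="unicode")
    if len(text) > max_chars:
        # Cut the string before it ever reaches the text widget.
        text = text[:max_chars] + "\n[... truncated ...]"
    return text
```

The caller could then pass the result to self.text.set_text() only when it is not None, and keep the LOG.info() path for everything else. The size check runs before parsing, so a huge file is never loaded just to be rejected.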


Do you mean secondary objects, like children of the top-level primary object in the hierarchy, or secondary indices in the DB (XML file or SQL tables)?

From such a hierarchical view, it seems to me that references (and backreferences), as interlinks within a flat raw XML database, will have their own tags or Elements in the ElementTree. Everywhere there is a location for an attribute object (Event, Person, EventRef, ObjRef, Family, Source via the srcattribute list, Object [media]), or a sequence with any object ref, there is a secondary record (secondary indices or secondary objects).

  • References and backreferences
/database/events/event[0]/place/@hlink
/database/places/placeobj[0]/@handle
  • Secondary indices (DB) ?
/database/header
/database/header/researcher
/database/header/mediapath/text()
/database/name-formats/format[0]
...
###Attribute sample###
...
/database/events/event[0]/attribute[0]
/database/events/event[0]/attribute[0]/@priv
/database/events/event[0]/attribute[0]/@type
/database/events/event[0]/attribute[0]/@value
/database/events/event[0]/attribute[0]/citationref[0]
/database/events/event[0]/attribute[0]/citationref[0]/@hlink
/database/events/event[0]/attribute[0]/noteref[0]
/database/events/event[0]/attribute[0]/noteref[0]/@hlink
...
/database/events/event[0]/type/text()
/database/events/event[0]/dateval
/database/events/event[0]/description
/database/people/@default
/database/people/@home
/database/people/person[0]/gender
/database/people/person[0]/name[0]/surname
/database/people/person[0]/address[0]
/database/people/person[0]/url[0]
/database/places/placeobj[0]/coord
/database/objects/object[0]/file
/database/notes/note[0]/text/text()
/database/bookmarks
/database/namemaps/map[0]
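
A flat listing like the one above can be produced with a short recursive walk. This is only a sketch with the standard ElementTree module, not the code that generated the listing. The helper names local and flat_paths are made up here, and the rule "add an index only when a tag repeats among its siblings" is my assumption about how the [0] suffixes were chosen:

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local(name):
    """Strip a '{namespace}' prefix from an ElementTree tag or attribute."""
    return name.rsplit("}", 1)[-1]


def flat_paths(elem, path):
    """Yield XPath-like strings for elem, its attributes, text and children."""
    yield path
    for name in elem.attrib:
        yield f"{path}/@{local(name)}"
    if elem.text and elem.text.strip():
        yield f"{path}/text()"
    totals = Counter(local(child.tag) for child in elem)
    seen = Counter()
    for child in elem:
        tag = local(child.tag)
        if totals[tag] > 1:
            # Repeated sibling: number it, starting from 0.
            child_path = f"{path}/{tag}[{seen[tag]}]"
            seen[tag] += 1
        else:
            child_path = f"{path}/{tag}"
        yield from flat_paths(child, child_path)


SAMPLE = """<database>
  <events>
    <event handle="_e1"><type>Birth</type><place hlink="_p1"/></event>
    <event handle="_e2"><type>Death</type></event>
  </events>
  <places><placeobj handle="_p1"/><placeobj handle="_p2"/></places>
</database>"""

root = ET.fromstring(SAMPLE)
paths = list(flat_paths(root, "/" + local(root.tag)))
print("\n".join(paths))
```

The SAMPLE data is invented for the demonstration; a real Gramps XML file also declares a namespace, which is why local() strips it before building the path.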

I am not certain, maybe there is a typo in the surname(s) sequence?
Having such specific groups might help for checking either secondary indices or objects with references. Once we know the Element, “pretty print” will display the Element itself and its children. Getting a map via references is outside the scope of such a flat raw summary and should maybe be coded in Python somewhere else.
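
As a rough idea of what such a separate map could look like, here is a sketch that collects every @handle, attaches each @hlink to the nearest enclosing element that owns a handle, and reports links pointing nowhere. The names reference_map and local are hypothetical, and only @hlink is followed: other reference-like attributes (for example /database/people/@home) are ignored in this sketch.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def local(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def reference_map(root):
    """Return (handles, backrefs, dangling) built from @handle and @hlink."""
    handles = {}   # handle -> tag of the element that owns it
    links = []     # (owner handle or None, target handle)

    def walk(elem, owner):
        handle = elem.get("handle")
        if handle is not None:
            handles[handle] = local(elem.tag)
            owner = handle  # children now belong to this object
        hlink = elem.get("hlink")
        if hlink is not None:
            links.append((owner, hlink))
        for child in elem:
            walk(child, owner)

    walk(root, None)
    backrefs = defaultdict(set)   # target handle -> handles referring to it
    dangling = set()              # hlink values with no matching handle
    for owner, target in links:
        if target in handles:
            backrefs[target].add(owner)
        else:
            dangling.add(target)
    return handles, dict(backrefs), dangling


SAMPLE = """<database>
  <events>
    <event handle="_e1"><place hlink="_p1"/><noteref hlink="_missing"/></event>
  </events>
  <places><placeobj handle="_p1"/></places>
</database>"""

handles, backrefs, dangling = reference_map(ET.fromstring(SAMPLE))
```

With the invented SAMPLE above, the place _p1 is reported as referenced by the event _e1, and _missing ends up in the dangling set. Links outside any primary object (bookmarks, for instance) get None as owner.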

I will check the surname XML hierarchy, as the pseudo-documentation needs a fix in this section. We cannot set the surname tag twice. ‘‘Fixed’’