Example.gramps and 1.7.2 grampsxml.dtd

In the upcoming Gramps 6.0, there seems to be a minor schema update around eventref and citationref.

-<!DOCTYPE database PUBLIC "-//GRAMPS//DTD GRAMPS XML V1.7.1//EN"
                "http://gramps-project.org/xml/1.7.1/grampsxml.dtd"
+<!DOCTYPE database PUBLIC "-//GRAMPS//DTD GRAMPS XML V1.7.2//EN"
                "http://gramps-project.org/xml/1.7.2/grampsxml.dtd"

-<!ELEMENT eventref (attribute*, noteref*)>
+<!ELEMENT eventref (attribute*, noteref*, citationref*)>

Maybe ‘example.gramps’ should also declare ‘1.7.2’ instead of ‘1.7.1’ in its header?

"http://gramps-project.org/xml/1.7.1/grampsxml.dtd">
<database xmlns="http://gramps-project.org/xml/1.7.1/">
...
      <eventref hlink="_a5af0ecb107303354a0" role="Primary">
        <attribute type="Father Age" value="28">
          <citationref hlink="_c140d7c7bdc1fedb030"/>
        </attribute>
 ->     <citationref hlink="_c140d5e5dbf300411bf"/>
...

Is there currently a mix-up between versions 1.7.1 and 1.7.2?
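
A quick way to spot such a mix-up is to compare the three places where the version appears in the header. The following is only a minimal sketch using the standard library; the function names and the EXPECTED_VERSION_TUPLE constant are hypothetical and are not part of Gramps.

```python
import re

# Assumption: mirrors the version tuple discussed in this thread.
EXPECTED_VERSION_TUPLE = (1, 7, 2)

# The three places where the schema version shows up in a .gramps header.
VERSION_PATTERNS = {
    "public_id": re.compile(r"DTD G(?:RAMPS|ramps) XML V?(\d+\.\d+\.\d+)//EN"),
    "system_url": re.compile(
        r'"https?://gramps-project\.org/xml/(\d+\.\d+\.\d+)/grampsxml\.dtd"'
    ),
    "xmlns": re.compile(r'xmlns="https?://gramps-project\.org/xml/(\d+\.\d+\.\d+)/"'),
}


def header_versions(header_text):
    """Return the version found at each location (None if not found)."""
    found = {}
    for name, pattern in VERSION_PATTERNS.items():
        match = pattern.search(header_text)
        found[name] = match.group(1) if match else None
    return found


def mismatches(header_text, expected=EXPECTED_VERSION_TUPLE):
    """Return only the locations whose version differs from the expected one."""
    expected_str = ".".join(str(part) for part in expected)
    return {
        name: version
        for name, version in header_versions(header_text).items()
        if version != expected_str
    }
```

Calling mismatches() on the first few lines of a file returns an empty dict when the DOCTYPE and the namespace agree with the expected version.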

Jérôme

Maybe the version should also be set in ‘libgrampsxml.py’?

For info, I got stuck while trying to update an addon. So, either publish the 1.7.2 version on the Gramps hosting, or remove the addition (one line) from example.gramps for Gramps 6.0.

This is cosmetic, as Gramps does not actually validate files against the DTD, RNG (or XSD) schemas. So, currently, it is only a maintenance issue for an experimental addon that uses a third-party library (lxml). Anyway, the point is just to keep the XML documentation and file format up to date.

I’ll ask Sam to publish the v1.7.2 schema.

Thank you. It was not very important, and I was able to avoid the warning message by running the tests offline.

For information, I was able to run the validation checks (DTD, RNG, XSD).
I just get a cryptic “error : Unknown IO error” on the console. This could be related to some old methods used in the gramplet.

Also, one line that raised an error has been commented out:

File ".gramps/gramps52/plugins/lxml/lxmlGramplet.py", line 360, in ParseXML
    self.text.set_text(_('Parsing file...'))
AttributeError: 'list' object has no attribute 'set_text'

I suppose that I should use something like self.text[0] somewhere in the gramplet, but I do not know why there is now an AttributeError. Anyway, it was only there to display a message that no one would see, as the gramplet quickly replaces it with updated text.
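
The self.text[0] idea could be sketched as a small defensive helper. This is only an illustration: FakeTextView and set_status are hypothetical names, and the real gramplet widget is not reproduced here.

```python
class FakeTextView:
    """Stand-in for the gramplet's text widget (hypothetical, for the demo)."""

    def __init__(self):
        self.content = ""

    def set_text(self, message):
        self.content = message


def set_status(text_attr, message):
    """Call set_text() whether text_attr is a widget or a list of widgets.

    Returns True if the message was set, False if no usable widget was found.
    """
    # If the attribute turned into a non-empty list, use its first element.
    if isinstance(text_attr, list) and text_attr:
        target = text_attr[0]
    else:
        target = text_attr
    if hasattr(target, "set_text"):
        target.set_text(message)
        return True
    return False
```

With this, set_status(self.text, _('Parsing file...')) would not raise when self.text happens to be a list, which is the situation reported in the traceback.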

Jérôme

Maybe it is not new… but I can no longer import my custom .gramps data set!

It is related to this section of ‘importxml.py’:

def inaugurate_id(self, id_, key, prim_obj):
        """
        Equivalent of inaugurate but for old style XML.
        """
        if id_ is None:
            raise GrampsImportError(
                _("The Gramps Xml you are trying to " "import is malformed."),
                _("Attributes that link the data " "together are missing."),
            )

The data set (surnames, places and sources) was stored in a “raw-flat” .gramps file:

<database xmlns="http://gramps-project.org/xml/1.7.2/">
  <people>
    <person>
      <name>
        <surname>Abbott</surname>
      </name>
    </person>
    <person>
      <name>
        <surname>Abot</surname>
      </name>
    </person>
    <person>
      <name>
        <surname>&#37428;&#26408;</surname>
      </name>
    </person>
  </people>
  <places>
    <placeobj>
      <pname value="AK"/>
    </placeobj>
    <placeobj>
      <pname value="AL"/>
    </placeobj>
    <placeobj>
      <pname value="&#931;&#953;&#940;&#964;&#953;&#963;&#964;&#945;"/>
    </placeobj>
  </places>
  <sources>
    <source>
      <stitle>All possible citations</stitle>
    </source>
    <source>
      <stitle>Baptize registry 1850 - 1867 Great Falls Church</stitle>
    </source>
    <source>
      <stitle>Import from test2.ged</stitle>
    </source>
    <source>
      <stitle>World of the Wierd</stitle>
    </source>
  </sources>
</database>
(excerpt, a ‘lite’ section of 'example.gramps')

It seems that we need to assign a random id attribute (id_) to each object in the modified .gramps file. Not really a problem, because I have done a few custom imports this way in the past.
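
Adding the missing id attributes can be scripted. Below is a minimal sketch with the standard library only; the generated id format ("person-0", "source-1", ...) is a hypothetical choice for illustration, not a Gramps convention, and existing ids are left untouched.

```python
import xml.etree.ElementTree as ET

NS = "http://gramps-project.org/xml/1.7.2/"
# Primary objects handled by this sketch (only those used in the sample file).
TAGS = ("person", "placeobj", "source")


def add_missing_ids(xml_text):
    """Return the XML with an id attribute set on every listed object lacking one."""
    # Keep the Gramps namespace as the default namespace on output.
    ET.register_namespace("", NS)
    root = ET.fromstring(xml_text)
    counters = {tag: 0 for tag in TAGS}
    for elem in root.iter():
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name in TAGS and not elem.get("id"):
            # Hypothetical id format; it does not check for collisions.
            elem.set("id", f"{local_name}-{counters[local_name]}")
            counters[local_name] += 1
    return ET.tostring(root, encoding="unicode")
```

The result is a string that can be written back to a .gramps file before importing.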

Jérôme

It still works, but only for a few primary objects,
e.g., for places or people:

<database xmlns="http://gramps-project.org/xml/1.7.2/">
   <people>
    <person id="">
      <name>
        <surname>Abbott</surname>
      </name>
    </person>
  </people>
  <places>
    <placeobj id="">
      <pname value="AK"/>
    </placeobj>
 </places>
 <sources>
   <source id="">
    <stitle>Import from test2.ged</stitle>
   </source>
  </sources>
</database>

So, one could import an external flat database at a glance. In this case it is a place database or a list of surnames, but it could be census records, individuals, events, sources, notes, etc.
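
As an illustration of that idea, a flat file like the one above can be generated from plain lists. This is a minimal sketch using the standard library; build_flat_gramps is a hypothetical name, and the empty id attributes simply copy the sample shown in this post.

```python
import xml.etree.ElementTree as ET

NS = "http://gramps-project.org/xml/1.7.2/"


def _tag(name):
    """Qualify a tag name with the Gramps XML namespace."""
    return f"{{{NS}}}{name}"


def build_flat_gramps(surnames, place_names):
    """Build a flat Gramps XML string from lists of surnames and place names."""
    ET.register_namespace("", NS)
    root = ET.Element(_tag("database"))
    people = ET.SubElement(root, _tag("people"))
    for surname in surnames:
        person = ET.SubElement(people, _tag("person"), {"id": ""})
        name = ET.SubElement(person, _tag("name"))
        ET.SubElement(name, _tag("surname")).text = surname
    places = ET.SubElement(root, _tag("places"))
    for place_name in place_names:
        placeobj = ET.SubElement(places, _tag("placeobj"), {"id": ""})
        ET.SubElement(placeobj, _tag("pname"), {"value": place_name})
    return ET.tostring(root, encoding="unicode")
```

For example, build_flat_gramps(["Abbott", "Abot"], ["AK", "AL"]) gives the people and places sections of the sample; the lists could just as well come from a CSV file.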

As for internal handles, it makes sense to keep them for an update or a comparison. So, for this test, I do not want to assign any handles (or I try to limit it).

Anyway, I wonder if a Gramps XML viewer/reader could be useful? Maybe as a web app or a local one. I saw that Streamlit provides a complete ecosystem for that, with a few lines of code. Why not a transformer? The CSV Import or Text Import addons provide something like this.

With the Gramps XML file format we have some validation tools. So, either rely on the flexibility of Gramps, or leave the control and checking of data to the XML file format (markup, tags, elements and attributes). This experimental list needs a minor update for Gramps 6.0 (maybe 5.2), but it gives a quick overview of a possible hierarchical “template” for looking at the content of our Gramps XML files.

Nick,

I am no longer able to submit a pull request (issues with my GitHub account and the checks).

-GRAMPS_XML_VERSION_TUPLE = (1, 7, 1)  # version for Gramps 4.2
+GRAMPS_XML_VERSION_TUPLE = (1, 7, 2)  # version for Gramps 6.0

This concerns People and Families, because it is related to eventref.

/database/people/person[0]/eventref[0]/noteref[0]
/database/people/person[0]/eventref[0]/noteref[0]/@hlink
+/database/people/person[0]/eventref[0]/citationref[0]
+/database/people/person[0]/eventref[0]/citationref[0]/@hlink
..

As ‘example.gramps’ was used to generate the ‘grampsxml.xsd’ file, any path missing from the sample was also missing from the list.
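
A list of this kind can be rebuilt from any sample file. The sketch below is a simplified, hypothetical version using the standard library: it prints each unique element and attribute path once and omits the [0] indices shown above.

```python
import xml.etree.ElementTree as ET


def element_paths(xml_text):
    """Return the unique element and attribute paths, in document order."""
    root = ET.fromstring(xml_text)
    paths = []
    seen = set()

    def record(path):
        if path not in seen:
            seen.add(path)
            paths.append(path)

    def walk(elem, parent_path):
        # Drop the "{namespace}" prefix that ElementTree puts on tag names.
        local_name = elem.tag.rsplit("}", 1)[-1]
        path = f"{parent_path}/{local_name}"
        record(path)
        for attr_name in elem.attrib:
            record(f"{path}/@{attr_name}")
        for child in elem:
            walk(child, path)

    walk(root, "")
    return paths
```

Running it on a sample where an eventref holds a citationref yields /database/people/person/eventref/citationref and its /@hlink path, so a sample without that element produces a list without it.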

$ sudo io.elementary.code /usr/lib/python3/dist-packages/gramps/plugins/importer/importxml.py
(or any other code editor or IDE)

A minor change:

 def inaugurate_id(self, id_, key, prim_obj):
        """
        Equivalent of inaugurate but for old style XML.
        """
-        if id_ is None:
+        if id_ is None and key != 2:

This should let you import source titles too. I suppose that there is a relation with the citation object (mapping list?). So, it needs more investigation for something “clever” and proper (a large project for an import at a glance).
Jérôme

Do you want to make a PR for that?

I am still checking whether this would be a good improvement… I suppose that it might be a very quick (and safe?) solution for large imports of sources (with or without citations), and also for data sets (e.g., census).

KEY=2 will only match the source object, but I do not understand this custom behavior. I saw the comment about the legacy model (prior to Gramps 5.0), but neither sourceref nor source creation seems to call inaugurate_id() in this case (id_ == None). I need to look at citation and inaugurate(). I suppose that I should launch a Gramps session with a traceback.

Doug,

The improvement might be on:

 def start_source(self, attrs):
        """
        Add a source object to db if it doesn't exist yet and assign
        id, privacy and changetime.
        """
        self.update(self.p.CurrentLineNumber)
        self.source = Source()
        if "handle" in attrs:
            orig_handle = attrs["handle"].replace("_", "")
            is_merge_candidate = (
                self.replace_import_handle and self.db.has_source_handle(orig_handle)
            )
            self.inaugurate(orig_handle, "source", self.source)
            gramps_id = self.legalize_id(
                attrs.get("id"),
                SOURCE_KEY,
                self.sidswap,
                self.db.sid2user_format,
                self.db.find_next_source_gramps_id,
                self.db.has_source_gramps_id,
            )
            self.source.set_gramps_id(gramps_id)
            if is_merge_candidate:
                orig_source = self.db.get_source_from_handle(orig_handle)
                self.info.add("merge-candidate", SOURCE_KEY, orig_source, self.source)
        else:  # old style XML
            self.inaugurate_id(attrs.get("id"), SOURCE_KEY, self.source)
        self.source.private = bool(attrs.get("priv"))
        self.source.change = int(attrs.get("change", self.change))
        self.info.add("new-object", SOURCE_KEY, self.source)
        if self.default_tag:
            self.source.add_tag(self.default_tag.handle)
        return self.source

A simple check (for None) could be a proper solution.

Ah, yes, sorry, maybe a pull request for this one (6.0 branch)?

I still do not fully understand this custom behavior, but I know where it occurs:

if "handle" in attrs:
   ...
else:  # old style XML
    self.inaugurate_id(attrs.get("id"), SOURCE_KEY, self.source)

 File "/usr/lib/python3/dist-packages/gramps/gui/dbloader.py", line 531, in do_import
    dbstate=self.dbstate,
  File "/usr/lib/python3/dist-packages/gramps/plugins/importer/importxml.py", line 204, in importData
    info = parser.parse(xml_file, line_cnt, person_cnt)
  File "/usr/lib/python3/dist-packages/gramps/plugins/importer/importxml.py", line 1063, in parse
    self.p.ParseFile(ifile)
  File "/usr/lib/python3/dist-packages/gramps/plugins/importer/importxml.py", line 3354, in startElement
    f(attrs)
  File "/usr/lib/python3/dist-packages/gramps/plugins/importer/importxml.py", line 2336, in start_source

and maybe why:

2025-03-15 16:56:03.887: ERROR: dbloader.py: line 548: Failed to import database.
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/gramps/gui/dbloader.py", line 531, in do_import
    dbstate=self.dbstate,
  File "/usr/lib/python3/dist-packages/gramps/plugins/importer/importxml.py", line 204, in importData
    info = parser.parse(xml_file, line_cnt, person_cnt)
  File "/usr/lib/python3/dist-packages/gramps/plugins/importer/importxml.py", line 1063, in parse
    self.p.ParseFile(ifile)
  File "../Modules/pyexpat.c", line 473, in EndElement
  File "/usr/lib/python3/dist-packages/gramps/plugins/importer/importxml.py", line 3360, in endElement
    self.func("".join(self.tlist))
  File "/usr/lib/python3/dist-packages/gramps/plugins/importer/importxml.py", line 3092, in stop_source
    self.db.commit_source(self.source, self.trans, self.source.get_change_time())
  File "/usr/lib/python3/dist-packages/gramps/gen/db/generic.py", line 2023, in commit_source
    self._commit_base(source, SOURCE_KEY, trans, change_time)
  File "/usr/lib/python3/dist-packages/gramps/plugins/db/dbapi/dbapi.py", line 661, in _commit_base
    self.dbapi.execute(sql, [obj.handle, pickle.dumps(obj.serialize())])
  File "/usr/lib/python3/dist-packages/gramps/plugins/db/dbapi/sqlite.py", line 136, in execute
    self.__cursor.execute(*args, **kwargs)
sqlite3.IntegrityError: NOT NULL constraint failed: source.handle

But it should not be specific (custom error) to the source object. All primary objects have the same code. So, I should get the same limitation on places and people (Gramps 5.2.x, not tested on Gramps 6.0 beta)?
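
The last line of the traceback can be reproduced outside Gramps with the sqlite3 module alone. This is only a sketch: the table definition is an assumption made for the demo and is not copied from the Gramps DB-API backend.

```python
import sqlite3


def try_commit(handle):
    """Insert one row into a demo 'source' table and report the outcome."""
    con = sqlite3.connect(":memory:")
    # Assumed schema for the demo: the handle column must not be NULL.
    con.execute(
        "CREATE TABLE source (handle VARCHAR(50) PRIMARY KEY NOT NULL, "
        "blob_data BLOB)"
    )
    try:
        con.execute(
            "INSERT INTO source (handle, blob_data) VALUES (?, ?)",
            [handle, b""],
        )
        return "ok"
    except sqlite3.IntegrityError as err:
        # A None handle is sent as NULL and violates the NOT NULL constraint.
        return str(err)
    finally:
        con.close()
```

try_commit(None) returns the same "NOT NULL constraint failed: source.handle" message, which suggests that the source object reached commit_source() without a handle.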

Attached is the result after two imports of the same ‘raw-flat’ test-case database:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE database PUBLIC "-//Gramps//DTD Gramps XML 1.7.1//EN"
"http://gramps-project.org/xml/1.7.1/grampsxml.dtd">
<database xmlns="http://gramps-project.org/xml/1.7.1/">
  <header>
    <created date="2025-03-15" version="5.2.2"/>
    <researcher>
    </researcher>
  </header>
  <people>
    <person handle="_fd8049879a24729d06caa85a051" change="1742045785" id="1">
      <gender>U</gender>
      <name type="Birth Name">
        <surname>Abbott</surname>
      </name>
    </person>
    <person handle="_fd8049879b0aa0316b5e2b75f7" change="1742045785" id="2">
      <gender>U</gender>
      <name type="Birth Name">
        <surname>Abot</surname>
      </name>
    </person>
    <person handle="_fd8049879b85eaad390c549760" change="1742045785" id="3">
      <gender>U</gender>
      <name type="Birth Name">
        <surname>鈴木</surname>
      </name>
    </person>
    <person handle="_fd805ed2f88652aaf79bb820561" change="1742045785" id="I0002">
      <gender>U</gender>
      <name type="Birth Name">
        <surname>Abbott</surname>
      </name>
    </person>
    <person handle="_fd805ed2f9374bdaefeebfb65c" change="1742045785" id="I0003">
      <gender>U</gender>
      <name type="Birth Name">
        <surname>Abot</surname>
      </name>
    </person>
    <person handle="_fd805ed2f9962f2abb70236037a" change="1742045785" id="I0004">
      <gender>U</gender>
      <name type="Birth Name">
        <surname>鈴木</surname>
      </name>
    </person>
  </people>
  <sources>
    <source handle="_fd8049879d47c75ff5f5a6c13ed" change="1742045785" id="S0000">
      <stitle>All possible citations</stitle>
    </source>
    <source handle="_fd8049879da36d14d969fdcecc7" change="1742045785" id="8">
      <stitle>Baptize registry 1850 - 1867 Great Falls Church</stitle>
    </source>
    <source handle="_fd8049879dd1675d72f9a3e571d" change="1742045785" id="9">
      <stitle>Import from test2.ged</stitle>
    </source>
    <source handle="_fd8049879e0ad12acea404229b" change="1742045785" id="10">
      <stitle>World of the Wierd</stitle>
    </source>
    <source handle="_fd805ed2fae736f8a24c0dc5b40" change="1742045785" id="S0001">
      <stitle>All possible citations</stitle>
    </source>
    <source handle="_fd805ed2fb31e3304c6aa7479d" change="1742045785" id="S0002">
      <stitle>Baptize registry 1850 - 1867 Great Falls Church</stitle>
    </source>
    <source handle="_fd805ed2fb67ae9ca701e89466e" change="1742045785" id="S0003">
      <stitle>Import from test2.ged</stitle>
    </source>
    <source handle="_fd805ed2fb91899e8a8b3679a8d" change="1742045785" id="S0004">
      <stitle>World of the Wierd</stitle>
    </source>
  </sources>
  <places>
    <placeobj handle="_fd8049879c067b3f23e914d8c84" change="1742045785" id="4" type="Unknown">
      <pname value="AK"/>
    </placeobj>
    <placeobj handle="_fd8049879c950045ff24e757295" change="1742045785" id="5" type="Unknown">
      <pname value="AL"/>
    </placeobj>
    <placeobj handle="_fd8049879cf4096aca899fe80e3" change="1742045785" id="6" type="Unknown">
      <pname value="Σιάτιστα"/>
    </placeobj>
    <placeobj handle="_fd805ed2fa02eec9d1ba677af23" change="1742045785" id="P0002" type="Unknown">
      <pname value="AK"/>
    </placeobj>
    <placeobj handle="_fd805ed2fa6498b1bf127506bd2" change="1742045785" id="P0003" type="Unknown">
      <pname value="AL"/>
    </placeobj>
    <placeobj handle="_fd805ed2faa3de2f0af7e2f157f" change="1742045785" id="P0004" type="Unknown">
      <pname value="Σιάτιστα"/>
    </placeobj>
  </places>
</database>

So, everything looks good. Without the extra limitation (check) for old-style XML (and id) on the Source object, we could “easily” import 1 000 000 entries. Maybe we just need to set a large id range (Gramps preferences) and check database performance first?
Jérôme.

Doug,

There may be a bug (at least on 5.2.x), because this custom behavior on the Source object only affects 2 sources (out of 4 in total!). It is always the same group:

<stitle>Baptize registry 1850 - 1867 Great Falls Church</stitle>
<stitle>Import from test2.ged</stitle>

A length issue? A buffer issue? String vs. bytes? An id type/format? I do not know. Anyway, as the surnames and place names are only short samples, I suppose I should at least also test them with long names.

Pull Request #2021.

Done

Sorry for the delay, looks like I missed it last time!

Does not look like Gramps 6.0 updates the schema as it still shows 1.7.2.

Thank you, Sam!

No problem, no urgency. It seems that no Gramps XML import will be blocked by a missing URL.

Also, as there is no final Gramps 6.0.0 release yet, maybe I was the only one to request https://gramps-project.org/xml/1.7.2/grampsxml.dtd in the last few days! Just a question: can we know how many Gramps XML imports are done by monitoring this file or these URLs?

Well, maybe I should revert the commit for the ‘example.gramps’ update (#PR2021)? I thought it was only a few lines…

--- example.gramps	2025-03-17 08:51:55.353306928 +0100
+++ example.gramps	2025-03-16 15:51:07.119161072 +0100
@@ -1,9 +1,9 @@
 <?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE database PUBLIC "-//Gramps//DTD Gramps XML 1.7.1//EN"
-"http://gramps-project.org/xml/1.7.1/grampsxml.dtd">
-<database xmlns="http://gramps-project.org/xml/1.7.1/">
+<!DOCTYPE database PUBLIC "-//Gramps//DTD Gramps XML 1.7.2//EN"
+"http://gramps-project.org/xml/1.7.2/grampsxml.dtd">
+<database xmlns="http://gramps-project.org/xml/1.7.2/">
   <header>
-    <created date="2025-01-27" version="5.1.6"/>
+    <created date="2025-03-16" version="6.0.0"/>
     <researcher>
       <resname>Alex Roitman,,,</resname>
     </researcher>

… but it was done via the GitHub web interface… So:
66,083 additions, 66,083 deletions not shown because the diff is too large. Please use a local Git client to view these changes. :worried:

I also noted that ‘example.gramps’ was missing a value in the optional code field for the Place object. It is not crucial data (does Gramps look at attribute order in the file format?), but it raised an error (also on the 52 branch) about attribute sequence (ordering)! I tested the DTD, RNG and XSD schemas, so the validations were strict… Structure, content, and unique tags seem to pass the tests. Maybe there is just a minor pending issue around the pseudo-log (which is what I used for the metadata section of Gramps XML) in the header, which could be related to my testing Gramps version.