Trailing white space in Gramps, Web Sync, and JSON export/import

I just came back to a long-standing issue in Gramps Web Sync that has to do with white space in Gramps objects’ string fields.

Gramps’ XML exporter strips all leading and trailing white space in string fields before writing to a file. This makes the export lossy, strictly speaking. Although usually we wouldn’t really care, it is a problem for Gramps Web Sync, which uses Gramps XML exports of the remote and local databases and then compares them in an in-memory database.
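To make the problem concrete, here is a minimal sketch of how a stripping export stops matching what is stored. It does not use the Gramps API: the dictionary record and the `export_strip` and `changed_fields` helpers are hypothetical stand-ins for a database object, the exporter's stripping behaviour, and the comparison step.

```python
# Hypothetical stand-in for an object stored in the database; not a Gramps class.
local_record = {"handle": "abc123", "description": "Baptism ", "place": " Boston"}


def export_strip(record):
    """Mimic a lossy export: strip leading/trailing white space from string fields."""
    return {
        key: value.strip() if isinstance(value, str) else value
        for key, value in record.items()
    }


def changed_fields(a, b):
    """Return the sorted names of fields whose values differ between two records."""
    return sorted(key for key in a.keys() | b.keys() if a.get(key) != b.get(key))


# Round trip through the lossy export: the exported copy no longer equals the original.
reimported = export_strip(local_record)
print(changed_fields(local_record, reimported))  # ['description', 'place']
```

Any comparison between the stored object and its exported form will flag these fields as changed, even though nobody edited them.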

One solution would be to add stripping of white space to the check & repair script.

But I was wondering whether, with Gramps 6.0, the cleanest solution wouldn’t simply be to use the JSON export/import plugins from addons-source. In contrast to Gramps 5.0, the JSON export should now be as close as possible to the database representation. It would also make it much easier to directly show diffs between JSON objects during the sync.
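As a rough idea of what showing diffs between JSON objects could look like, here is a sketch using only the Python standard library. The `json_diff` helper and the two note dictionaries are hypothetical; real Gramps JSON objects have many more fields, and this is not how the sync addon works today.

```python
import difflib
import json


def json_diff(local_obj, remote_obj):
    """Return a unified diff of two JSON-serializable objects as a list of lines."""
    # Sorted keys and fixed indentation give a stable, line-oriented form to compare.
    local_lines = json.dumps(local_obj, indent=2, sort_keys=True).splitlines()
    remote_lines = json.dumps(remote_obj, indent=2, sort_keys=True).splitlines()
    return list(
        difflib.unified_diff(local_lines, remote_lines, "local", "remote", lineterm="")
    )


# Hypothetical objects that differ only by one trailing space.
local_note = {"handle": "n1", "text": "Born in Boston "}
remote_note = {"handle": "n1", "text": "Born in Boston"}

for line in json_diff(local_note, remote_note):
    print(line)
```

Because JSON string literals are quoted, a trailing space shows up visibly in the diff instead of being silently dropped.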

@dsblank as the original author of the plugins and @prculley as the one who updated them for 6.0 (using object_to_string and string_to_object) – what do you think? Do you agree this would avoid such ambiguities?

One potential downside is that it would require sync addon users to install the JSON addon as well – not sure if depends_on in the plugin file is enough to enforce that?

Here is a very old issue (pre-Gramps 6.0) where this is discussed

Does that pertain to Notes content as well? Because stripping leading white space could mess with the markdown positional offsets.

Good point! Yes, I think that could offset styled text tags - I haven’t tried.
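A small sketch of the concern, untested against Gramps itself: the note is modelled as a plain string plus one (start, end) range, which only mimics the idea of positional style tags and is not the Gramps StyledText class. The `shift_range` helper is hypothetical.

```python
# Hypothetical note: plain text plus one (start, end) range marking bold text.
text = "  Note: see the parish register"
bold_range = (2, 7)  # covers "Note:" in the original text

print(repr(text[bold_range[0]:bold_range[1]]))  # 'Note:'

# After stripping, the same range points at the wrong characters.
stripped = text.strip()
print(repr(stripped[bold_range[0]:bold_range[1]]))  # 'te: s'


def shift_range(tag_range, original):
    """Shift a (start, end) range left by the number of leading white space characters."""
    offset = len(original) - len(original.lstrip())
    start, end = tag_range
    return (max(start - offset, 0), max(end - offset, 0))


# Adjusting the range by the removed prefix length restores the intended span.
fixed = shift_range(bold_range, text)
print(repr(stripped[fixed[0]:fixed[1]]))  # 'Note:'
```

So any stripping of note text would have to adjust the tag ranges at the same time, or leave note text alone.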

I would go even further: it seems that the database APIs should enforce the rule that strings should be stripped as they go into the database, and/or when retrieved from the DB.

We can also move the JSON import/export addon to the builtin plugins. I suspect that it hasn’t changed much (if at all) and doesn’t need to be separate.

But I’m fine with whatever your recommendation would be moving forward.

Perhaps moving from an add-on should wait until there is a decision in the discussion about managing versions and dialects in the import/export plug-ins.

The GEDCOM 5.5 versus 7.0 .ged files (with the multitude of dialects and custom tags) are driving that conversation. But v2.1 versus v3 .vcf importing/exporting will have similar needs.

Yes, that would seem like the best solution.

Note that the fix function in the XML export removes some control characters in addition to both leading and trailing white space.
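For discussion, a cleanup along those lines might look like the sketch below. This is not the actual fix function from the XML exporter: the name `clean_string`, the exact set of control characters removed, and the order of the two steps are all assumptions.

```python
# Assumption: drop ASCII control characters below 32, except tab (9) and newline (10).
# The real exporter may use a different set.
_CONTROL_CHARS = dict.fromkeys(c for c in range(32) if c not in (9, 10))


def clean_string(value):
    """Remove control characters, then strip leading and trailing white space."""
    # Removing control characters first means white space that sat next to a
    # control character at either end is stripped as well.
    return value.translate(_CONTROL_CHARS).strip()


print(repr(clean_string("  John\x00 Smith\r\n")))  # 'John Smith'
print(repr(clean_string("a\tb")))  # 'a\tb'
```

The function is idempotent, so applying it both on write and in a check & repair pass would not change already clean strings.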

I can implement that - would it be a fix for the 6.0 maintenance branch?

No. The change should be made in the master branch.