Problem merging duplicate citations, in 5.2

Gramps 5.2 on Linux Mint

When I convert my 5.1 database to 5.2, everything seems to be OK, but when I try to merge duplicate citations, which I often do after importing new branches, I get an exception:

2024-05-01 17:27:39.425: ERROR: grampsapp.py: line 188: Unhandled exception
Traceback (most recent call last):
  File "/home/enno/gramps/gramps/plugins/tool/mergecitations.py", line 205, in on_merge_ok_clicked
    raise MergeError(
gramps.gen.errors.MergeError: Encountered an object of type Note that has a citation reference.

And this is quite a nasty one, because there is no way out, other than by killing Gramps using the task manager, leaving me with a locked database. There is no corruption, but it is annoying, and the error did not show in 5.1, for that same tree.

A quick glance at the source code, using blame, suggests that the merging code hasn’t changed, so it looks like a problem with the upgrade. That is why I still run 5.2 from source, and won’t migrate until I know what’s going on.

Also, when I run Check & Repair on the converted tree, I see warnings like these:

2024-05-01 17:26:51.701: WARNING: check.py: line 2542: FAIL: Bad Note Link found, Place: handle: c5f484ec35150d286652a2957f6
2024-05-01 17:26:51.783: WARNING: check.py: line 2542: FAIL: Bad Note Link found, Person: handle: f540866195a7bcd15b4eb3650f2
2024-05-01 17:26:51.783: WARNING: check.py: line 2542: FAIL: Bad Note Link found, Person: handle: f54086619615a25a66d5daad4df
2024-05-01 17:26:51.785: WARNING: check.py: line 2542: FAIL: Bad Note Link found, Note: handle: f26022bc6fc57226f60575155f4
2024-05-01 17:26:51.785: WARNING: check.py: line 2542: FAIL: Bad Note Link found, Person: handle: f540865e8ff4c79b9a0ebd1fa9

Do they mean that something went wrong during the conversion, or were these things not checked in 5.1? And can these issues be related?

This is possibly related to PR #1391.

How can that be? When I read that PR, it seems to be about a new feature, which gives me a chance to add a citation to an event reference. And if I don’t use that feature, it should not create new links between objects, right?

And on migration from the 5.1 schema, either by opening a 5.1 database or by importing one into an empty tree in 5.2, these links can’t be there, because it’s a new feature. Am I missing something here?

To me, this feels like a bug that’s blocking my migration to the 5.2 schema, because the merge citation tool doesn’t check the consistency of backlinks without a reason.

The only schema change in v5.2 was the addition of citations to event references. That’s why I suggested looking at PR #1391. It was a simple change though, so I don’t see why it would cause a problem.

OK, thanks. I may give it another look this weekend. But first, here is another thought.

And that’s that the code fragment that looks for mergeable objects hasn’t changed since 5.1:

   for handle in db.iter_source_handles():
       dict = {}
       citation_handle_list = list(db.find_backlink_handles(handle))
       for class_name, citation_handle in citation_handle_list:
           if class_name != Citation.__name__:
               raise MergeError(
                   "Encountered an object of type %s "
                   "that has a citation reference." % class_name
               )

and as far as I can check, backlinks can be of other types, like when I add a note to a source. And that’s the type that I see in the exception. And then it’s quite weird that I didn’t see that before, because that backlink should also have been created in 5.1. Is that right?

In that case, I can just as well replace the raise with a continue, and that works quite well when I test it.

Neither do I. I just ran 5.2 from source with Visual Studio Code, after adding a breakpoint to the raise line, and found that the handle was indeed referring to a note, attached to a source. And since there’s nothing wrong with that, I suggest that the raise is replaced by a continue.
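
A minimal sketch of that change, using a stand-in for the database so that it runs outside Gramps. `FakeDb` and `collect_citation_handles` are hypothetical names made up for illustration; only `iter_source_handles` and `find_backlink_handles` mirror the calls in the fragment above, and the real tool does more than collect handles.

```python
class FakeDb:
    """Hypothetical stand-in for a Gramps database (illustration only)."""

    def __init__(self, backlinks):
        # Maps a source handle to a list of (class_name, handle) backlinks.
        self._backlinks = backlinks

    def iter_source_handles(self):
        return iter(self._backlinks)

    def find_backlink_handles(self, handle):
        return iter(self._backlinks[handle])


def collect_citation_handles(db):
    """Return {source_handle: [citation_handle, ...]}, skipping other backlinks."""
    result = {}
    for handle in db.iter_source_handles():
        citations = []
        for class_name, backlink_handle in db.find_backlink_handles(handle):
            if class_name != "Citation":
                # Was: raise MergeError(...). A note that links to the source
                # is legitimate, so skip it instead of aborting the tool.
                continue
            citations.append(backlink_handle)
        result[handle] = citations
    return result


db = FakeDb({"S1": [("Citation", "C1"), ("Note", "N1"), ("Citation", "C2")]})
print(collect_citation_handles(db))
```

With the note backlink skipped, only the two citation handles are left as merge candidates for the source.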

See also: https://gramps-project.org/bugs/view.php?id=13205

It looks like this was caused by PR #1098. Links in notes now create backlinks. Previously they were just soft links.

We should probably change the if statement to:

if class_name not in [Citation.__name__, Note.__name__]:

Links in Notes and Associations may be why Deep Connections is such a wallowing resource pig.

An interesting thing is that Deep Connections had a massive slowdown between the 4.0.x and 4.2.x releases. Or I might have just hit a limit in handling pedigree collapse.

OK, thanks. Looks like that PR also explains the other messages, about Bad Note Links, which are links that I still have in my 5.1 database that point to something like dark matter …

I’m still having problems with that raise though. It should not happen, but if there is a backlink that is not a citation or a note, which is probably extremely rare, the user is still confronted with a locked UI, and with a locked DB once Gramps is killed.

I think it’s the latter, because when we discussed this earlier, I did not see relevant changes in the source code.

One can still test this on Windows, where it’s quite easy to install different versions, and see whether there are any differences in speed. One might need to rely on GEDCOM though, for a test file, because there is no way back in time with Gramps XML.

I don’t think so, because notes don’t need to be merged. The tool merges citations, not notes.

I also see no need to raise an exception, because even if there’s a backlink from another object type than a citation, or a note, there is no need to annoy the user with that. Link faults, if any, should be treated by Check & Repair.

The code is checking the backlinks of a source. The only objects that reference sources are citations and, more recently, notes (through links).

It doesn’t actually merge notes.

If you have any other object type that directly references a source, then that is a problem.

I recall the CSV Text Import and the Data Entry gramplets having an interface for adding Sources instead of Citations. Hopefully, they create dummy citations. But it might be worth checking to see what they do.

Particularly since the new Forms improvements will hopefully start citing as it adds objects and attributes.

The Forms addon already adds a citation to every object that it creates.

That’s right, but throwing an exception blocks Gramps, and IMO that’s a bigger problem than ignoring a link that can be repaired, or removed, with another tool.

The DBErrorDialog may be appropriate for cases like these.
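
A rough sketch of how that could work: instead of raising inside the loop, the scan gathers unexpected backlinks and hands them to a reporting callback once, after it has finished. Everything here is an assumption for illustration: `scan_source_backlinks`, the plain dict that stands in for the database, and the `report` callback, which stands in for whatever dialog is chosen. The DBErrorDialog signature has not been checked, so it is not called here.

```python
ALLOWED = {"Citation", "Note"}  # backlink types a source may legitimately have


def scan_source_backlinks(backlinks_by_source, report):
    """Split backlinks into citations to merge and unexpected ones to report.

    backlinks_by_source: {source_handle: [(class_name, handle), ...]},
        a plain-dict stand-in for iterating sources and their backlinks.
    report: hypothetical callback that takes a list of message strings.
    """
    citations = {}
    problems = []
    for source, backlinks in backlinks_by_source.items():
        citations[source] = []
        for class_name, handle in backlinks:
            if class_name == "Citation":
                citations[source].append(handle)
            elif class_name not in ALLOWED:
                problems.append(
                    "Source %s is referenced by %s %s" % (source, class_name, handle)
                )
    if problems:
        report(problems)  # one message after the scan; nothing is raised
    return citations


messages = []
found = scan_source_backlinks(
    {"S1": [("Citation", "C1"), ("Note", "N1"), ("Event", "E1")]},
    messages.extend,
)
print(found)
print(messages)
```

Notes are silently accepted, citations are kept for merging, and anything else ends up in a single report that the user can follow up with Check & Repair.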

I think that it’s the latter, and the Gramplet also slows down quite a bit when you increase the number of generations in preferences. That’s 15 normally, and I have it at 25. And it seems that this causes lots of problems, partly because the algorithm seems to have no memory of the nodes that it has already visited. You need such a memory, although you sometimes get it for free when the search is recursive and there is no pedigree collapse.
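
To illustrate what I mean by remembering visited nodes, here is a generic breadth-first search over a relationship graph. This is not the Deep Connections code; the graph dictionary, the function name and the `max_depth` parameter are made up for illustration.

```python
from collections import deque


def shortest_connection(graph, start, goal, max_depth=25):
    """Breadth-first search that never expands the same person twice."""
    visited = {start}
    queue = deque([(start, [start])])
    while queue:
        person, path = queue.popleft()
        if person == goal:
            return path
        if len(path) > max_depth:
            # The path already has max_depth links; do not go deeper.
            continue
        for neighbour in graph.get(person, ()):
            if neighbour not in visited:
                # Mark on enqueue, so a person reached through two
                # lines (pedigree collapse) is queued only once.
                visited.add(neighbour)
                queue.append((neighbour, path + [neighbour]))
    return None


# Pedigree collapse: "gp" is reachable through both parents.
graph = {
    "me": ["father", "mother"],
    "father": ["gp"],
    "mother": ["gp"],
    "gp": ["cousin"],
}
print(shortest_connection(graph, "me", "cousin"))
```

Without the `visited` set, every shared ancestor would be expanded once per line of descent, and that number grows quickly with the number of generations.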

When I try PAF, it can find all connections between me and persons that I share ancestors with in seconds, and a more modern program like the Family Tree Builder desktop program created by My Heritage can find deep connections between me and any person in a flash.

GeneWeb, the program used by Geneanet, can calculate and display multiple connections in about a second too, and that program runs in an interpreter, which suggests we might see similar speeds with Python. It has an advantage though, because its data model is much simpler, and only allows one parent family for a person.
