I looked at the backup files that I received from a fellow user who complained about corrupted forms data, and it is now clear that this data can indeed be corrupted by an operation that looks quite trivial, like merging citations. I tried that here on an old backup that he sent, and the effect is the same as what I found when I analyzed a recent backup of the same tree.
Based on what I see in the data after the merge, I think it’s safe to say that merging citations has disastrous consequences for forms data, and that users need a remedy, or maybe even more than one.
The 1st is probably prevention: when I run the merge duplicate citations tool, there is no warning about the effect on forms data, and the tool doesn’t detect the existence of forms data either, meaning that we now have corruption built into Gramps itself. In this case it doesn’t really help to see the usual warning about the deletion of history and the advice to make a backup, because as a user I just see that as a routine notice that I have learned to ignore, just like I would ignore the boy who cried ‘Wolf!’. Users are not made aware of the consequences, which seem to be quite bad, especially because they’re almost completely hidden. Users only see the corruption when they edit an existing form and find that it contains a lot of unrelated persons instead of the ones that they entered. And they are not warned when they run check & repair either.
The 2nd remedy would probably be repair, although I doubt that this is possible, which is why I put Removing in the subject too. And that is not something that users seem to be able to do themselves either, because the forms data is not visible in our editors, except in the forms Gramplet itself. That Gramplet shows only a subset of the forms data, because it follows only the 1st link when multiple forms are attached to the same citation, which is exactly the situation after a merge.
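To give an idea of how affected citations might at least be found, here is a minimal sketch in Python that lists citations referenced by more than one event in an uncompressed Gramps XML export. It is not an existing Gramps tool. The function name `shared_citations` and the sample data are made up for illustration, and I am assuming that events carry their citations as direct `citationref` child elements with an `hlink` attribute. A citation shared by several events is only a hint that forms may have been merged, not proof of corruption.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Illustrative sample only; a real export contains far more elements,
# and the namespace version depends on the Gramps release.
SAMPLE = """<database xmlns="http://gramps-project.org/xml/1.7.1/">
  <events>
    <event handle="_e1" id="E0001"><citationref hlink="_c1"/></event>
    <event handle="_e2" id="E0002"><citationref hlink="_c1"/></event>
    <event handle="_e3" id="E0003"><citationref hlink="_c2"/></event>
  </events>
</database>"""


def local(tag):
    """Strip the XML namespace, so the sketch does not depend on the schema version."""
    return tag.rsplit("}", 1)[-1]


def shared_citations(xml_text):
    """Hypothetical helper: map citation handle -> event IDs, for citations used by 2+ events."""
    root = ET.fromstring(xml_text)
    events_by_citation = defaultdict(list)
    for elem in root.iter():
        if local(elem.tag) != "event":
            continue
        # Assumption: only citation references that are direct children of the event count.
        for child in elem:
            if local(child.tag) == "citationref":
                events_by_citation[child.get("hlink")].append(elem.get("id"))
    return {c: ids for c, ids in events_by_citation.items() if len(ids) > 1}


if __name__ == "__main__":
    for citation, event_ids in shared_citations(SAMPLE).items():
        print(citation, event_ids)
```

A check like this could only tell a user where to look. It does not say which event belonged to which form before the merge, so it does not make repair possible.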
When I look at the XML, I can see an easy way to get rid of all forms data, so that the user can make a fresh start, but I really dislike the idea that users may depend on such a trick to maintain data integrity.
So here’s the question: What can we do to help?