Repairing Note Cleanup
I had about 1300 notes with duplicated links and did not want to edit them by hand, so I looked into using regular expressions in Notepad++ to find and delete the extra links.
I have not written regular expressions before, so I used the regex tester at https://regex101.com/ to help build my search string.
A Stack Exchange question on Super User also helped me: "How to find and mark all duplicate paragraphs using Notepad++?"
Regex in Notepad++
It seems to me that the following search/replace in the Notepad++ Replace dialog has deleted all the duplicate links that were created by Note Cleanup.
- tick Wrap around
- select Regular expression
- tick ". matches newline"
- The search string is
(<style name="link".*?<\/style>)\s+\1+
- The replace string is
\1
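For anyone who would rather test the pattern outside Notepad++, here is a minimal sketch of the same search/replace in Python. It assumes Python's `re` module treats these constructs the same way Notepad++ does, with `re.DOTALL` standing in for the newline option. The sample note fragment and its handle `abc123` are made up for illustration; they are not from my tree.

```python
import re

# Same pattern as the Notepad++ search string. The "\/" is written as "/"
# because Python does not need the slash escaped.
# re.DOTALL lets "." match newlines, like the newline option in Notepad++.
DUPLICATE_LINK = re.compile(r'(<style name="link".*?</style>)\s+\1+', re.DOTALL)


def collapse_duplicate_links(text: str) -> str:
    """Replace a link style followed by an identical copy with a single copy."""
    return DUPLICATE_LINK.sub(r"\1", text)


# Hypothetical fragment: the same link style element appears twice in a row.
sample = (
    '<style name="link" value="gramps://Person/handle/abc123">\n'
    '  <range start="0" end="8"/>\n'
    '</style>\n'
    '<style name="link" value="gramps://Person/handle/abc123">\n'
    '  <range start="0" end="8"/>\n'
    '</style>\n'
)

cleaned = collapse_duplicate_links(sample)
print(cleaned)
```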
Fixing the tree
- In Gramps, create a backup of your tree without media.
- In File Explorer (I’m on Windows), right-click the .gramps backup file you just created and open it with 7-Zip (I have 7-Zip installed on my PC, but another tool would do the same).
- In the 7-Zip window, right-click the .gramps file and choose Open, then select to open it with Notepad++. It takes a while to open as it is a big file.
- In Notepad++, open the Replace dialog, enter the search and replace strings as above, and click Replace All.
- Save the fixed file under a new name, just in case.
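The steps above could probably also be scripted. The following is only a sketch, not something I ran on my tree: it assumes the .gramps backup is gzip-compressed XML (which would explain why 7-Zip can open it), and the function name `fix_backup` and the file names in the usage comment are made up for illustration.

```python
import gzip
import re
from pathlib import Path

# Same pattern as in the Notepad++ Replace dialog.
DUPLICATE_LINK = re.compile(r'(<style name="link".*?</style>)\s+\1+', re.DOTALL)


def fix_backup(source: Path, target: Path) -> int:
    """Write a cleaned copy of source to target and return the replacement count.

    Assumes source is gzip-compressed XML. The original file is left untouched.
    """
    # newline="" keeps the line endings exactly as they are in the file.
    with gzip.open(source, "rt", encoding="utf-8", newline="") as handle:
        xml_text = handle.read()

    cleaned, count = DUPLICATE_LINK.subn(r"\1", xml_text)

    with gzip.open(target, "wt", encoding="utf-8", newline="") as handle:
        handle.write(cleaned)
    return count


# Hypothetical usage; both file names are placeholders:
# fix_backup(Path("my_tree_backup.gramps"), Path("my_tree_backup_fixed.gramps"))
```

In Python, a single pass only merges a link with the copy that directly follows it, so a link that appears three or more times in a row may still leave a duplicate behind. Running the function again on its own output until it returns 0 would be one way to check.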
Note
I have not checked everything yet, but it seems to have worked. I think it saved time, and by automating it I avoided overlooking duplicates, as I might have if I had stepped through them all by hand.
I have Notepad++ v7.9.5
There are some other link anomalies that Note Cleanup seems to have created, which I mentioned earlier, but I will have to find them again. I guess regex could fix them too.
And thanks again SNoiraud - I had never looked into the .gramps file.