Note cleanup tool

I recently used the Note cleanup tool with a lot of success, but today I visited my citation tree and saw that many citation notes (apparently also called “source text”) were untouched by my previous run of the tool.
Is this an oversight or a limitation?
The note correctly appears in the Notes tab and in the Citation tab, and it is still published to the Navweb report with these additional characters after the cleanup tool has been used.
Here is a sample of the text not touched by the previous pass of the tool:

Brosnahan-Con Brosnahan-Julia P. Brosnahan-Julia P. Brosnahan<br>Siblings: Jeremiah Lenihan-Edward Lenihan-Patrick John Lenihan-David Lenihan-Timothy John Lenihan-John Lenihan-Denis John Lenihan-Michael Linihan-Nellie Lenihan-William Lenihan-Daniel Lenihan-Ellen Lenihan-John Linihan-William Lenihan<br>This person appears to have duplicated relatives. View it on FamilySearch to see the full information.

And here is another one:
Mariam Brosnahan (born Lenihan)<br>Birth name: Mariam Lenihan<br>Gender: Female<br>Birth: 1881 - Caherlavoy Co. Limerick Ireland<br>Marriage: Feb 4 1899 - Tournafulla Parish-County Limerick<br>Burial: Baptismal Sponsors - Catherine Lenihan<br>Parents: John M. Lenihan-Ellen Curtin<br>Husband: Patrick Brosnahan<br>Children: Cornelius

I’m not too clear; are these actually Notes? Is this what you see in the Note editor when started from the Notes view? You mention “source text”, which, depending on where you see it in Gramps, might not be notes. If this is actually part of the Citation (part of the top of the box in the Citation Editor) or Source (part of the top of the box in the Source Editor), the tool would not have fixed it up.

The <br> tags in Notes should have been converted to newlines.
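As a rough illustration of that conversion (a minimal sketch, not the tool's actual code; the name `br_to_newlines` is made up for this example), a regular expression can replace the common spellings of the tag with a newline:

```python
import re

# Matches <br>, <br/> and <br /> in any letter case.
BR_TAG = re.compile(r"<br\s*/?>", re.IGNORECASE)


def br_to_newlines(text: str) -> str:
    """Replace HTML line-break tags with newline characters."""
    return BR_TAG.sub("\n", text)


# Fragment taken from the second sample above.
sample = "Gender: Female<br>Birth: 1881 - Caherlavoy Co. Limerick Ireland"
print(br_to_newlines(sample))
```

Applied to the fragment above, this prints the gender and birth details on two separate lines.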

Let me give a use case I am seeing.
I have a person in my tree. I discover a source which mentions her, and there is a good amount of collateral text and information about this source, so I create/define a new ‘source’. Much of this information speaks to the generalities, dates, publishers and context of this source and can be put into the ‘Source Citation’ tab of the person-detail GUI, if it’s not already there. This dialog is labeled “Person: xxxxxx yyyyyy - Gramps”. The person also has a ‘notes’ field for general information about the person. This notes field is part of the top-level person ‘Notes’. It is a one-to-many relationship between this person and these notes.

So now our person has at least one source (maybe more!) which we can cite for specific events’ instances.

For any of these arbitrary sources Gramps gives us a ‘source text’. It is also a ‘note’ object and is bound to the specific source itself. This ‘source text’ note type is presumably intended to provide general information, context, etc. about this source. There is a many-to-many relationship between a person and relevant and applicable sources. One person can have many sources and one source can have many persons. A citation provides a bridge from source to person.

For a citation there is the possibility of also adding a note to it. This type of note is also listed as a ‘source text’. It might be better to call it a ‘citation text’… but there it is. There is a one-to-many relationship between a source and its citations (there can be many citations to a specific source), and a one-to-many relationship between a person and citations (one person can have many citations to a specific source).

These ontologically nested note types seem to all be Gramps notes nonetheless. Gramps stores them as different note types, where they exist, but they are all Gramps notes, it seems to me.
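To make that picture concrete, here is a simplified model of it (an illustrative sketch only; the `Note` and `Owner` classes, the handles and the labels are all invented for the example and are not the Gramps API). Every note sits in one shared pool, and persons, sources and citations merely refer to notes by handle:

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Note:
    handle: str
    note_type: str  # e.g. "Person Note" or "Source text"
    text: str


@dataclass
class Owner:
    kind: str   # "person", "source" or "citation"
    label: str
    note_handles: List[str] = field(default_factory=list)


# One shared pool of notes, keyed by handle (all values are made up).
notes: Dict[str, Note] = {
    "N1": Note("N1", "Person Note", "General remarks<br>about the person"),
    "N2": Note("N2", "Source text", "Context<br>for the source"),
    "N3": Note("N3", "Source text", "Gender: Female<br>Birth: 1881"),
}

owners = [
    Owner("person", "Example person", ["N1"]),
    Owner("source", "Example source", ["N2"]),
    Owner("citation", "Example citation", ["N3"]),
]


def notes_by_owner_kind(owners: List[Owner]) -> Dict[str, List[str]]:
    """Group note handles by the kind of object that references them."""
    grouped: Dict[str, List[str]] = {}
    for owner in owners:
        grouped.setdefault(owner.kind, []).extend(owner.note_handles)
    return grouped


print(notes_by_owner_kind(owners))
print(sorted(notes))  # walking the pool reaches every note, whoever owns it
```

Under this model, anything that walks the note pool directly reaches person, source and citation notes alike, without needing to know which object refers to them.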

The note cleanup tool does not seem to address all three categories of Gramps notes above. We should consider that there might even be other types of notes equally un-addressed (media notes, place notes, etc).

In my case, the tool only seems to address the person notes (but does so very well!) and somehow ‘some’ of the source notes but none of the citation notes. The debris persists.

All of these note types have the same need for clean-up since they frequently arrive via cut-and-paste of html/xml/MU/MD pages.

My question goes to whether the tool is architected to traverse all the current note types. The tool may have been developed before newer types of notes were added. Just a thought.

A separate question is whether all the possible markup ‘debris’ in these notes is identified for the tool to act upon. Probably a harder task!

The Note cleanup tool doesn’t care what the note type is; it scans all the notes in the Notes view.

The fact that you are mentioning “source text” makes me wonder what version of Gramps you are running. At one time in older Gramps versions, we apparently had some styled text attached directly to sources. During a Gramps upgrade to more recent versions this should have been converted to our current Notes, with a type of “Source Text”.

The Note cleanup tool wasn’t released for those old versions, so that should not apply…

If you want to send me an XML backup of your tree or a subset that contains the items at issue, I can look at it to see what is going on. paulr2787 at gmail dot com

I am running 5.1.2 on Linux Mint Tessa.
In fact all three types of notes are plain old notes with the normal note notation.
Person notes are sanitized well. Some notes associated with Sources are also seemingly sanitized (however, it might be that I didn’t actually have a problem with these, so I cannot be sure they are only ‘partially’ sanitized).
The ‘offending’ ones are all Notes of citations, which do have the type “Source text”. They seem to be invisible to the search function of the clean-up tool.

Attached on a gmail thread is a small sub tree showing the issue. I did load up this tree and try to use notes clean-up but it can’t see the citation notes here either.

I got your test file and discovered the issue. The Notes in question already have styles applied (the '<a>' was yellow).

Since the tool was originally intended to create styles from html, it ignored Notes that have styles already present. It is more work to edit a Note to clean it up and preserve present styles. I will have to give this some thought, but don’t give up hope.
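The effect of that rule can be sketched as follows (a hypothetical illustration, not the tool's real code; `StyledNote`, `select_for_cleanup` and the `(style name, start, end)` tuples are assumptions made for the example). A note containing debris is picked up only if it carries no styles yet:

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class StyledNote:
    handle: str
    text: str
    # Hypothetical simplified style record: (style name, start, end).
    styles: List[Tuple[str, int, int]] = field(default_factory=list)


def select_for_cleanup(notes: List[StyledNote], skip_styled: bool = True) -> List[str]:
    """Return handles of notes that contain a <br> tag and pass the style rule."""
    selected = []
    for note in notes:
        if "<br>" not in note.text.lower():
            continue  # nothing to clean in this note
        if skip_styled and note.styles:
            continue  # already styled, so it is ignored
        selected.append(note.handle)
    return selected


notes = [
    StyledNote("N1", "Plain person note<br>with debris"),
    StyledNote("N3", "Citation note<br>with debris", [("highlight", 0, 8)]),
]

print(select_for_cleanup(notes))                     # ['N1']
print(select_for_cleanup(notes, skip_styled=False))  # ['N1', 'N3']
```

Dropping the rule is not enough on its own, though. If styles are stored as character ranges, as assumed here, then removing or replacing tags shifts the text and the ranges would have to be adjusted to match, which is presumably where the extra work lies.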

What great news.
Can you direct me to some place where these ‘styles’ are explained? I want to increase my resilience (aka knowledge) to the developing trends’ downsides.
Do you want me to raise a Mantis issue or will you do so?
Thank you for your patience and your dedication.

brian

New_Note_editor_dialog describes how to use styles (bold, italic, colors etc.). This posting is sufficient to get me started on a possible fix. But you can add a bugtracker issue to keep me reminded if you want.

To anyone following this: I’ve uploaded an updated version of NoteCleanup that should deal with the issues raised.
