How to remove extra empty lines from Notes

Most of my notes have a number of empty lines at the end. This probably happened when I imported a GEDCOM file from another genealogy program.
Most of the time that is not really a problem, but when a report prints several notes one after another, it doesn’t look very nice.
Is there a way to correct this?

Gramps 5.1.5 Linux Mint 20.3

You could export the data as XML or SQLite (I think XML is the safest for exporting all data fields from Gramps), open the file in a text editor, and search for multiple spaces or repeated line breaks. Replace each run with a single line break, save the XML, and import it into a new database…
It is also possible to do this in OpenRefine, but I think it will be easier to do it in a text editor like VS Code…

It should be easy to write a simple regex search in VS Code or another text editor that does the job. You may need to run a few different patterns, depending on the text and on which characters are used for line endings.
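
As a rough illustration of the kind of pattern meant here, the Python sketch below collapses runs of blank lines into a single line break. It assumes the text uses `\n` or `\r\n` line endings, and the function name `collapse_blank_lines` is made up for this example. The same regular expression should also work in the search box of most editors that support regex.

```python
import re

def collapse_blank_lines(text):
    """Replace each run of empty or whitespace-only lines with one newline."""
    # Normalise Windows line endings first so a single pattern is enough.
    text = text.replace("\r\n", "\n")
    # A newline followed by one or more empty/whitespace-only lines
    # is reduced to a single newline.
    return re.sub(r"\n(?:[ \t]*\n)+", "\n", text)

sample = "First line\n\n\n  \nSecond line\r\n\r\nThird line"
print(repr(collapse_blank_lines(sample)))
```

Note that this removes blank lines everywhere in the text, not only at the end of a note.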

You need to be careful editing Notes from outside the GUI.

If the Note has any links or styling (bold, italic, font size, color, etc.), the markup is hidden at the end of the Note content. Since it is not inline, it is very sensitive to character position changes.
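
To illustrate why position changes matter, here is a simplified Python model. It assumes that styling is stored as start/end character offsets kept separately from the text, as described above. The names `styled_part` and `bold_range` are invented for this sketch and are not Gramps APIs.

```python
def styled_part(text, style_range):
    """Return the slice of text that a (start, end) offset pair points at."""
    start, end = style_range
    return text[start:end]

text = "Born in\n\n\nOslo, Norway"
# Hypothetical style range covering the word "Oslo" in the original text.
start = text.index("Oslo")
bold_range = (start, start + len("Oslo"))
print(repr(styled_part(text, bold_range)))    # the range points at "Oslo"

# Removing the empty lines shifts every later character to the left,
# but the stored offsets stay the same, so the styling lands on other text.
edited = text.replace("\n\n\n", "\n")
print(repr(styled_part(edited, bold_range)))  # no longer "Oslo"
```

Trimming only at the very end of a note is less risky in this model, since nothing after the removed characters needs to keep its position.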

If there isn’t any markup in your tree, you can do external edits.

Yes, but I think it is better to remove any markup altogether than to have 50 empty lines in a note.

I have been through that work myself, with some badly formatted HTML notes from Legacy Family Tree, and it is nearly impossible to fix them in the Gramps note editor…

And that’s why I wrote that multiple runs might be needed. I just didn’t remember the correct wording when I wrote it (it’s 4 AM).

I didn’t have any problems fixing the formatting with VS Code. First I removed anything at the beginning and end of the text, then I removed any HTML tags in the notes, and finally I replaced every \n or other “new line” code with a single space to get one continuous string of text, so that the notes appear as plain text in Gramps. Formatting from other software rarely survives a GEDCOM import well anyway, so a plain-text note is better than a note with really bad formatting and structure.
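
The workflow described above could be sketched in Python roughly as follows. This is only an illustration under simple assumptions: tags are anything between `<` and `>`, and the function name `to_plain_text` is made up for the example. It is not the exact set of search-and-replace steps used in VS Code.

```python
import re

def to_plain_text(note):
    """Strip HTML-like tags and flatten the note into one line of text."""
    # Remove anything that looks like a tag, e.g. <p>, </b>, <br />.
    without_tags = re.sub(r"<[^>]+>", "", note)
    # Replace every run of whitespace (spaces, tabs, newlines) with one
    # space, then trim whatever is left at the beginning and end.
    return re.sub(r"\s+", " ", without_tags).strip()

sample = "<p>John and Mary<br />\nmarried in <b>1850</b>.</p>\n\n\n"
print(to_plain_text(sample))
```

Because tags are removed without inserting a space, two words separated only by a tag would be joined, so a check of the result is still worthwhile.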

Digression:
I actually used the same workflow on old Microsoft Word documents. All the formatting was in the “last character of the document”, so in 80% of cases, if you had a faulty document file (.doc), you could open it in a plain text editor, remove the “last character” and save, and it would open in Word again, but without any text formatting…
I actually saved a few writers from losing their work that way, back in the days when I worked in tech support for Microsoft products.
The biggest project I was able to save this way was a 1500-page book, where the last 300 pages had been written without saving. Luckily the auto-backup had saved a copy, but it couldn’t be opened because the Mac had crashed (very common back in the 90s)…

Not the same as the Gramps XML file, but a similar problem… remove any “code” between the XML tags and you only have plain text.

A tip related to this discussion: when entering Notes, do not put a return character at the end of the text. If the cursor ends up on a new line, an extra empty line will always be printed in any report that contains that note. CSS will take care of spacing between paragraphs.

As you are on Linux, the best solution is:

1 - export your database as XML (not compressed)
2 - run the following command:
sed -i.sav '/^$/d' your-xml-file
3 - you can verify the differences with:
diff -urN your-xml-file.sav your-xml-file

This will remove all empty lines from your XML file.
You can now import the cleaned Gramps XML file into a new database.
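
The sed command deletes every empty line in the file, which includes blank lines used as paragraph breaks inside a note. If only the trailing ones should go, a more targeted variant could look like the Python sketch below. It assumes that the note content sits inside a `<text>` element in the uncompressed XML export. The sample is a simplified fragment rather than a real export, and `trim_note_ends` is a name invented for this example.

```python
import re

def trim_note_ends(xml):
    """Remove whitespace (including newlines) that sits right before </text>."""
    return re.sub(r"\s+</text>", "</text>", xml)

# Simplified stand-in for one note in an exported file (an assumption).
sample = (
    "<note>\n"
    "<text>First paragraph.\n"
    "\n"
    "Second paragraph.\n"
    "\n"
    "\n"
    "</text>\n"
    "</note>\n"
)
print(trim_note_ends(sample))
```

The blank line between the two paragraphs is kept and only the trailing newlines before the closing tag are dropped. As with the sed approach, it is worth keeping a backup and comparing the result with diff before importing.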

Thank you very much Serge. That will probably be the solution. I will try it later.

That was perfect Serge. Thank you!

Maybe this could also be an additional feature for the NoteCleanup addon?

I have successfully used the regex \n{2,}\Z in the Gramps note editor sidebar to filter all notes (~17K in my case) for instances of 2 or more trailing newlines. I run this every few months to clean up notes that have picked up extra newlines at the end. When you first run it, there can be a daunting number to deal with, but it is actually very quick to work through them and edit the extras out using the standard Gramps note editor.

Your method allows one return character at the end of the note. Not ideal.

OK, perhaps a single trailing newline may be an issue for HTML output (though just as easily filtered for), but for ODT output it is fine.

Alter the expression to \n{1,}\Z

Perfect, it only looks at Returns at the end of the Note, ignoring any within the body of the note. Thanks.
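
For anyone who wants to see the difference between the two expressions, here is a small Python sketch. It assumes the filter behaves like Python's `re.search`, which seems likely since Gramps is written in Python, but that is an assumption. The sample notes are made up.

```python
import re

two_or_more = re.compile(r"\n{2,}\Z")  # original expression
one_or_more = re.compile(r"\n{1,}\Z")  # altered expression

notes = {
    "clean": "No trailing newline",
    "one": "One trailing newline\n",
    "many": "Several trailing newlines\n\n\n",
    "inner": "Blank line\n\ninside only",
}

# For each sample note, record whether each expression finds a match.
results = {
    name: (bool(two_or_more.search(text)), bool(one_or_more.search(text)))
    for name, text in notes.items()
}
for name, (two, one) in results.items():
    print(name, two, one)
```

In Python, `\Z` matches only at the very end of the string, which is why blank lines inside the body of a note are ignored by both expressions.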

Gramps already has a strip (or is it a trim?) on name entry. It chops whitespace off both the beginning and the end. If such a feature were added for notes, it would be better if it handled more than just newlines at the end.

But it also would have to be designed to not break the markup.
