Ignored Lines in my GEDCOM file

(Running Gramps 5.2 on Windows 10.)
My GEDCOM file contained over 100,000 “Line ignored” lines.
By doing this:

grep -v ignored oldfile.ged > newfile.ged

Running this in Cygwin, I reduced the file size by an impressive 23% in less than 1 second without compromising the quality of the contents! Is there a convenient way to do this for people who don't use grep?
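
For anyone without grep, a rough Python equivalent of the command above might look like the following sketch. The function name `drop_matching_lines` is made up for illustration, and the code assumes the file uses a byte-oriented encoding such as UTF-8 or ANSEL (not UTF-16). Like the grep command, it drops every line containing the word, whatever else that line holds.

```python
def drop_matching_lines(src, dst, needle=b"ignored"):
    """Copy src to dst, skipping every line that contains needle.

    Returns the number of lines skipped. Lines are handled as raw bytes,
    so nothing is re-encoded on the way through (assumes the file is not
    UTF-16, where a plain byte search would not match).
    """
    removed = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        for line in fin:
            if needle in line:
                removed += 1
            else:
                fout.write(line)
    return removed

# Example call, using the file names from the grep command:
# print(drop_matching_lines("oldfile.ged", "newfile.ged"), "lines removed")
```

The original file is left untouched, so the result can be compared against it before anything is thrown away.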

It may not be true that nothing of value was lost.

Those lines mean the GEDCOM line was unrecognizable to the importer plugin but would probably be recognizable to a human.

So a note was created with that tag, its data, and what fault occurred. Gramps goes to painful lengths to create such notes precisely to avoid blindly discarding potentially useful data.

You may have deleted 100,000 lines of real data … but we cannot tell from your description.

While that makes sense in general, in the case of my file every single line eliminated contained nothing except the statement that the line was ignored as not understood. This is easily confirmed by outputting those lines and inspecting them.

For example:

grep ignored oldfile.ged | sort -u | wc

gives 1 line, so all of the removed lines are identical.
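
The same check can be sketched in Python for anyone without grep, sort and wc. The function name `count_matching_lines` is hypothetical. It counts how often each distinct matching line occurs, so a result with a single entry corresponds to the "1 line" above.

```python
from collections import Counter

def count_matching_lines(path, needle=b"ignored"):
    """Return a Counter mapping each distinct line containing needle
    to the number of times it occurs in the file."""
    counts = Counter()
    with open(path, "rb") as fin:
        for line in fin:
            if needle in line:
                # Strip the line ending so LF and CRLF copies compare equal.
                counts[line.rstrip(b"\r\n")] += 1
    return counts

# Example call, using the file name from the grep command:
# for text, n in count_matching_lines("oldfile.ged").most_common():
#     print(n, text)
```

Printing the distinct lines with their counts also shows exactly what text would be removed.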

Those notes should identify the line number and line content. Sounds like a bug.

Can you mention which genealogy program created the file? It is shown within about the first ten lines of the text file.

When that result was shown in the GEDCOM import report dialog, did you expand the dialog to show the full message, or right-click, select all, and paste the result into a text file to examine? If you only saw “Line was ignored as not understood.”, more of the message would exist to the right of that line (see screenshot).

Also, by switching to the Notes category view, you should be able to filter on the custom type “GEDCOM import”.
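
In a GEDCOM file the creating program is normally recorded on the `1 SOUR` line inside the `0 HEAD` record. A minimal sketch for pulling it out follows. The function name `gedcom_source` is made up for illustration, and the code assumes the file can be read as text.

```python
def gedcom_source(lines):
    """Return the value of the '1 SOUR' line in the HEAD record, or None."""
    in_head = False
    for raw in lines:
        # Drop the line ending and a possible byte-order mark on line one.
        parts = raw.strip().lstrip("\ufeff").split(" ", 2)
        if len(parts) < 2:
            continue
        level, tag = parts[0], parts[1]
        value = parts[2] if len(parts) > 2 else ""
        if level == "0":
            if in_head:
                break  # the HEAD record has ended
            in_head = (tag == "HEAD")
        elif in_head and level == "1" and tag == "SOUR":
            return value
    return None

# Example call (file name is a placeholder):
# with open("oldfile.ged", encoding="utf-8", errors="replace") as f:
#     print(gedcom_source(f))
```

Only the header is examined, so a `SOUR` tag on a later record (a source citation) is not mistaken for the program name.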

Thanks Gioto.
This file is the result of many programs over a long time, so although it is coming from Gramps, there have been exports in its past from Ancestry and others. It is so large (for a human reader) that I am trying to clean up some of the problems, but I am not finding anything very helpful. Oftentimes an identical note is represented 10 or 20 times, and I’m not sure how to make all the references point to one and then delete the others. The notes with no information would be nice to delete as well, along with the pointers to them. I suspect my grep method leaves references pointing at things that no longer exist.
I see your “ignore” lines have additional data. I’m not sure why mine don’t.
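
One way to test the suspicion about leftover references is to compare the record IDs a GEDCOM file defines (level-0 lines such as `0 @N0001@ NOTE`) with the IDs that other lines point to. This is only a sketch: the function name `dangling_pointers` and the two regular expressions are illustrative assumptions, not part of Gramps.

```python
import re

# "0 @ID@ TAG ..." defines a record with that ID.
DEFINITION = re.compile(r"^\s*0\s+@([^@]+)@\s+\w+")
# "n TAG @ID@" points at a record; "@#..." escapes (e.g. in dates) are skipped.
POINTER = re.compile(r"^\s*\d+\s+\w+\s+@([^@#][^@]*)@\s*$")

def dangling_pointers(lines):
    """Return the set of IDs that are pointed to but never defined."""
    defined, referenced = set(), set()
    for line in lines:
        m = DEFINITION.match(line)
        if m:
            defined.add(m.group(1))
            continue
        m = POINTER.match(line)
        if m:
            referenced.add(m.group(1))
    return referenced - defined

# Example call (file name is a placeholder):
# with open("newfile.ged", encoding="utf-8", errors="replace") as f:
#     print(sorted(dangling_pointers(f)))
```

An empty result would suggest that the filtered file has no pointers to missing records.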

Here are some such lines:

Take a look at the MultiMergeGramplet, a Gramps addon from the Isotammi project of Finland.

Places category has an automerge option for duplicate places:

The MultiMerge gramplet has different capabilities in the different categories. For instance, in Notes it can combine (collate) the content of near-duplicate notes being merged. This helps when notes have identical text but different markup. Markup cannot be copied via the clipboard between notes, but it does work within a single note, so long as the ‘active’ focus doesn’t move outside the Note Editor dialog. So a merged note with links and font stylings can be collapsed. (Hint: a Notes gramplet undocked from the Notes category and expanded on a 2nd screen in portrait mode lets you browse through the Note records without having to open and close windows.)

There’s also Tools → Family Tree Processing → Find Possible Duplicate People…. However, it only combines the Person objects, so all the duplicate secondary objects have to be handled separately.

My guess is that one of those many programs and websites at some point truncated the GEDCOM “1 CONT” line itself and lost the data. You might have to experiment to find which one in the chain was responsible for that loss and report it to them, because who knows what else was lost!


Thanks to everyone. I know my files are the result of many imports and exports from at least 4 or 5 different products over many years. I am going to more carefully document things in the future, hopefully. I’ll see if I can get the “gramplet”.
