Very large GEDCOM import - really happy

Operating System: GhostBSD 25.02-R14.3p2 all updates

gramps-6.0.4

10 generations of descendants of one of my great grandfathers.

The GEDCOM was 15157613 lines.

Product Name: HP Pavilion 17-g161us
Processor: 5th generation Intel® Core™ i3-5020U processor (2.2 GHz)
Memory: 6GB DDR3L SDRAM

Machine up 1 day 4 hours 12 minutes [For GEDCOM import]

Python CPU time: 26.2 hours.

Import auto-backup [Gramps XML backup] to another machine: 55 minutes.

Database Summary Report

Individuals
Number of individuals: 144596
Males: 75749
Females: 68717
Individuals with other gender: 0
Individuals with unknown gender: 130
Incomplete names: 5277
Individuals missing birth dates: 5443
Disconnected individuals: 0
Unique surnames: 20423
Individuals with media objects: 0

Family Information
Number of families: 62705

Event Information
Number of events: 966559

Place Information
Number of places: 132166

Source Information
Number of sources: 1232593

Citation Information
Number of citations: 1234871

Repository Information
Number of repositories: 0

Media Objects
Number of unique media objects: 0
Total size of media objects: 0 MB

Note Information
Number of notes: 233240

Welcome to Gramps

Looking at the statistics I’m a bit puzzled over the sources and citations numbers - they are almost the same.
In my own Gramps system I have a 1 : 6 ratio (source : citation)
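As a rough check of how close the two numbers are, here is a small Python sketch using only the counts from the Database Summary Report above. The function name is just for illustration.

```python
# Counts copied from the Database Summary Report in the first post.
sources = 1_232_593
citations = 1_234_871


def citations_per_source(n_citations, n_sources):
    """Average number of citations attached to each source."""
    return n_citations / n_sources


ratio = citations_per_source(citations, sources)
extra = citations - sources

print(f"{ratio:.4f} citations per source")
print(f"{extra} more citations than sources")
```

That comes out at roughly 1.002 citations per source, compared with about 6 in a database where each source is shared by several citations.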

The file was downloaded from FamilySearch using getmyancestors on December 20, 2024.
I have not had time to look it over yet.

That’s not unusual for exported GEDCOMs.

Ancestry isn’t trying to make it easy to get useful and efficient exports, so the redundancies in Sources and Citations are not consolidated. You’ll notice the Place count is also unnaturally large.

I would not be surprised to find pedigree-collapse ancestors duplicated too. (I wonder if the duplicates have identical _APID identifiers? That might allow intelligent merging.)
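One way to test that idea before merging anything is to scan the GEDCOM and list which top-level records share the same identifier value. The following is a minimal sketch, not a Gramps feature: the function names and the sample lines are made up for illustration, and it assumes the identifier sits on its own line as `<level> _APID <value>`. The tag is a parameter because an export from another site may use a different user-defined tag.

```python
from collections import defaultdict


def group_records_by_tag(lines, tag="_APID"):
    """Map each value of `tag` to the level-0 record IDs that contain it."""
    groups = defaultdict(set)
    current = None  # xref of the record being read, e.g. "@I1@"
    for raw in lines:
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2:
            continue
        level, second = parts[0], parts[1]
        if level == "0":
            # Level-0 lines look like "0 @I1@ INDI"; HEAD and TRLR have no xref.
            current = second if second.startswith("@") else None
        elif second == tag and len(parts) == 3 and current:
            groups[parts[2]].add(current)
    return groups


def duplicate_candidates(groups):
    """Keep only the tag values that occur in more than one record."""
    return {value: sorted(ids) for value, ids in groups.items() if len(ids) > 1}


# Invented sample data, only to show the shape of the input.
sample = [
    "0 HEAD",
    "0 @I1@ INDI",
    "1 NAME John /Smith/",
    "1 SOUR @S1@",
    "2 _APID 1,1234::5678",
    "0 @I2@ INDI",
    "1 NAME John /Smith/",
    "1 SOUR @S2@",
    "2 _APID 1,1234::5678",
    "0 @I3@ INDI",
    "1 SOUR @S3@",
    "2 _APID 1,1234::9999",
    "0 TRLR",
]

candidates = duplicate_candidates(group_records_by_tag(sample))
print(candidates)
```

For a real file, pass an open file object instead of the list (for example `open(path, encoding="utf-8")`), so a 15-million-line GEDCOM is read line by line. A shared identifier only marks candidates for review: several different people can legitimately cite the same record, such as a census page listing a whole household.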

From gedinline-4.0.2.jar

Submitted by XXXXX XXXXXX
Encoding UTF-8
GEDCOM version in file 5.5.1
GEDCOM version assumed 5.5.1

Analysis time 268 seconds to analyse the file (excluding upload time)
Speed 6224 records per second

Lines 15157613 Number of lines in the GEDCOM file
Records 1673136 Number of records
Warnings 16127 Total number of warning messages
User-defined 157424 Number of lines with user-defined tags

Individuals 144596 Number of individuals in the GEDCOM file
Males 75749 Number of males
Females 68717 Number of females
Other 130

Families 62705 Number of families
Marriages 5453 Number of marriages
Places 833188 Number of places mentioned (not necessarily unique)
Source records 1232593 Number of source records
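Putting the gedinline figures next to the Gramps summary gives a feel for how much repetition is in the file. This is a short Python sketch using only the numbers quoted in this thread; the variable names are illustrative.

```python
# From the gedinline report above.
gedcom_lines = 15_157_613
gedcom_records = 1_673_136
place_mentions = 833_188   # "not necessarily unique"

# From the Gramps Database Summary Report in the first post.
gramps_places = 132_166

mentions_per_place = place_mentions / gramps_places
lines_per_record = gedcom_lines / gedcom_records

print(f"{mentions_per_place:.2f} place mentions per Gramps place")
print(f"{lines_per_record:.2f} GEDCOM lines per record")
```

On these numbers each place in the Gramps database is mentioned a little over six times in the GEDCOM, and each record averages about nine lines.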
