Somehow, Discourse thinks I’m too pushy (am I?), so I can’t send another reply to the XML subject. That’s why I’m pasting what I wrote into a new topic:
The best solution, I think, is to work with a concept that I learned from fellow developer Thomas Wetmore, way back in 2010, in the days of the Better GEDCOM community. He called it the persona, and although that was the first time I read about it, he most probably did not invent the term. You can find his paper about it on this page, at #72:
https://tech.fhiso.org/cfps/papers
Simply put, you can see a persona as an extracted person, which can be stored like any other person in the database. And when you work in that way, every record results in a group of personae, each of which is linked to a single source, where the person has a role, like parent, or child, or witness, whatever, just like we already have in the attributes generated by the forms gramplet.
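To make that a bit more concrete, here is a minimal Python sketch of one record producing a group of personae. The class and field names (`Persona`, `source_id`, `role`), the helper `extract_personae`, and the sample record are all my own assumptions for illustration; this is not the Gramps data model, and not Wetmore's notation either.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    """A person exactly as one source describes them (hypothetical model)."""
    name: str
    source_id: str  # every persona belongs to a single source
    role: str       # e.g. "child", "parent", "witness"


def extract_personae(source_id, mentions):
    """Turn the (name, role) pairs found in one record into personae."""
    return [Persona(name, source_id, role) for name, role in mentions]


# Made-up baptism record "S1" that mentions three people.
baptism = extract_personae("S1", [
    ("Example Child", "child"),
    ("Example Father", "parent"),
    ("Example Witness", "witness"),
])

for persona in baptism:
    print(persona.source_id, persona.role, persona.name)
```

Each persona carries only what its own source says, which is why a person mentioned in five records shows up as five personae.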
Personae exist in lots of places, like in the left-hand column when you match sources on FamilySearch, and they also appear on Ancestry, whenever you try to match a person in another tree, or a source, with your own.
Model-wise, the easiest way to deal with personae is to import them as normal persons, and tag them as extracted, or something like that. And where software like Clooz or Evidentia breaks the concept by merging personae into a single person, and losing information, the proper way to deal with them is to create a link between personae and who you think is a true individual (or conclusion person). This is what the developer of the Dutch program Centurial calls correlation.
Correlation works like merging, but it’s not destructive. It means that when I tell the software that I think that Johann Herman Borgstette, born in Spenge, and Jan Harmen Borgsteede, married in Amsterdam, are the same person, the software creates a new (conclusion) person that links to these two personae. And because it only creates links, this virtual merging can be undone easily, so you will not end up polluting your tree.
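A rough sketch of how such non-destructive linking could look in Python follows. The names `ConclusionPerson`, `correlate`, and `uncorrelate`, and the IDs `P1`, `P2`, `S1`, `S2`, `C1`, are hypothetical; I don't know how Centurial implements this internally, so read it as an illustration of the idea only.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Persona:
    persona_id: str
    name: str
    source_id: str


@dataclass
class ConclusionPerson:
    """A person I believe existed; it only holds links to personae."""
    person_id: str
    persona_ids: set = field(default_factory=set)


def correlate(person_id, *personae):
    """Create a conclusion person that links to the given personae."""
    return ConclusionPerson(person_id, {p.persona_id for p in personae})


def uncorrelate(person, persona):
    """Undo one link; the persona itself is left untouched."""
    person.persona_ids.discard(persona.persona_id)


# Source and persona IDs are invented for the example.
born = Persona("P1", "Johann Herman Borgstette", "S1")
married = Persona("P2", "Jan Harmen Borgsteede", "S2")

conclusion = correlate("C1", born, married)
linked_before = set(conclusion.persona_ids)

# Changed my mind about the marriage record: just drop the link.
uncorrelate(conclusion, married)
```

Because nothing is copied into the conclusion person, both spellings of the name survive, and undoing the link leaves each persona exactly as it was imported.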
In a way, this is quite close to association, where you can already create a link between persons, meaning that we don’t even need to change the data model much, except that we need a label other than ASSO to mark this link as evidence, or something.
When you visit the openarch site that I mentioned earlier, you can see that you can already download a GEDCOM for each source, so for that site, you don’t even need special extraction software. The only change you need is an import that tags each imported person as extracted, so that you can apply a filter, if you want.
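The tagging and filtering part is simple enough to sketch in a few lines of plain Python. The tag name `extracted`, the dictionary layout, and the function names are assumptions of mine; a real import would of course go through the program's own import and filter machinery, which I'm not showing here.

```python
EXTRACTED = "extracted"  # hypothetical tag name


def tag_imported(people, tag=EXTRACTED):
    """Return copies of the imported people with the tag added."""
    return [{**p, "tags": set(p.get("tags", ())) | {tag}} for p in people]


def without_tag(people, tag=EXTRACTED):
    """Filter: keep only the people that do not carry the tag."""
    return [p for p in people if tag not in p.get("tags", ())]


# One person already in my tree, two coming from a per-source GEDCOM.
tree = [{"id": "I1", "name": "Conclusion Person"}]
imported = [
    {"id": "X1", "name": "Persona One"},
    {"id": "X2", "name": "Persona Two"},
]

database = tree + tag_imported(imported)
conclusions_only = without_tag(database)
print([p["id"] for p in conclusions_only])
```

With the tag in place, the extracted persons can live in the same database as everyone else and still be hidden or shown with a single filter.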
Personally, I would also need some software to deal with the citations, because I don’t want formatted citations in my database. But that’s another issue.