Updated DNA gramplet ready for review

An updated DNA gramplet PR is ready for review

The doc page is already in place: Addon:DNAgramplet

The changes include merging all Associated people graphs together and fixing the graph refresh. Please review and provide suggestions/comments.

Looks interesting.

Can this type of data be added to the Example.gramps? Or better yet, be appended to the file collection as a separate importable file?

It is helpful if users can exactly reproduce the example captures in the wiki.

I updated the doc page with the two datasets and described how to create the Associations to reproduce the illustrations with these two datasets.

Cool! I was able to reproduce the material as shown in the Wiki by creating Associations and pasting in the provided text as the Note content.

I’ve got a couple of questions.

These DNA Associations must always have mirror reciprocal relationships, right? That is, DNA Associations that share the Source & Notes?

I recall reading about an Associations tool that would add missing reciprocal Associations. That tool could probably be enhanced for DNA Associations but would need to understand the rules.

A thought occurred to me regarding privacy. I doubt anyone would want the underlying segment data to print in reports. So I’d expect to set that Note’s privacy flag.

But it would be nice to be able to ignore some of the DNA Associations to simplify what’s shown in the DNA Gramplet. So maybe the Gramplet could omit Associations set to Private. (Without regard to the Note and/or Source Privacy.)

Interesting stuff even if I do not grasp the graph layout yet.

Thanks

It could be helpful to include the number of SNPs as an additional column in the note containing the match data. Whereas the cM value gives some idea of the distance of the relationship, the number of SNPs gives some idea of the quality of the match. A match with fewer SNPs should be viewed with more caution. Also, different testing companies examine different sets of SNPs, which overlap only partially (see here).

As for the reciprocal relationships – yes, except for the maternal/paternal indicator. For example, I match my first cousin, whose mother was my father’s sister. For me, the match is on my paternal side, and for her the match is on her maternal side. Would the reciprocal relationship have its own note containing the same data? Or would it have no note at all?

I also created the SyncAssociations addon that creates the same Association Note if it does not exist for the reciprocal relationship. When I did this I realized that just copying the Note would assume the same Maternal/Paternal flag. Not necessarily the right thing.
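To make the reciprocal idea concrete, here is a minimal sketch of adding missing mirror Associations that share the same Note. It is not the SyncAssociations code: the plain dict of tuples and the name `add_missing_reciprocals` are hypothetical stand-ins for the Gramps objects, and the IDs are made up.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for an Association: (other_person_id, type, note_id)
Association = Tuple[str, str, str]


def add_missing_reciprocals(assocs: Dict[str, List[Association]],
                            assoc_type: str = "DNA") -> int:
    """Add a mirror Association wherever one is missing; return how many were added."""
    added = 0
    # Iterate over snapshots so entries can be added while looping.
    for person, refs in list(assocs.items()):
        for other, kind, note in list(refs):
            if kind != assoc_type:
                continue
            back = assocs.setdefault(other, [])
            if not any(o == person and k == assoc_type for o, k, _ in back):
                # The reciprocal reuses the same Note rather than copying it.
                back.append((person, assoc_type, note))
                added += 1
    return added


people = {"I0001": [("I0002", "DNA", "N0001")]}
add_missing_reciprocals(people)
```

Because the Note is shared, any per-line maternal/paternal flag in it is shared too, which is exactly the concern raised here.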

However, the DNA gramplet uses the flag only as a ‘hint’ since the gramplet uses the closest calculated relationship between the two people if it exists. Only if there is no known relationship between the two people does it use the flag. In this case, since we cannot determine the correct relationship we can either copy it or change all the flags to U when creating the reciprocal.
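The precedence described here can be sketched in a few lines. This is only an illustration under assumptions: the function name `resolve_side` is hypothetical, and the letters are assumed to mean M = maternal, F = paternal, U = unknown.

```python
from typing import Optional

VALID_FLAGS = {"M", "F", "U"}  # assumed meaning: maternal, paternal, unknown


def resolve_side(calculated_side: Optional[str], note_flag: str) -> str:
    """Pick the side to graph: a calculated relationship wins, the flag is a hint."""
    # A side derived from a known relationship always takes precedence.
    if calculated_side in ("M", "F"):
        return calculated_side
    # Otherwise fall back to the flag from the Note, defaulting to unknown.
    flag = note_flag.strip().upper()
    return flag if flag in VALID_FLAGS else "U"
```

With this rule, a wrong flag in a copied reciprocal Note only matters when the two people have no known relationship in the tree.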

When Nick wrote the original version of this gramplet, he included this M/F flag as a per-line input. Rather than break this format, I used it in the graph as a hint. Maybe a better option would be to remove this flag from the input Note and use either the calculated relationship or Unknown. Thoughts?

Privacy - I am not sure whether people would set the Note as private. I did not describe how to do that. I will add that to the doc. I am not sure ignoring any private DNA Association when graphing is the right approach. I don’t think Gramps does that in any other case (except export). But I am open to feedback on this.

Currently the SNP count is not part of the input. I was trying to be consistent with Nick’s original implementation for compatibility, but maybe that was a mistake (there is probably no current userbase for that experimental gramplet).

The SyncAssociation addon uses the same Note when creating the reciprocal Association. If we remove the M/F/U flag from the Note, then this would not be an issue.

If we break compatibility, I would propose to remove the M/F/U flag and add the SNPs as the last field. Then the gramplet could add the SNP to the tooltip. Thoughts?

That’s handy, but it’s not always necessarily correct in terms of where the shared segment came from, if there are multiple common ancestors along different paths. My parents were unrelated, but I have some distant cousins with whom I share common ancestors on both my maternal and paternal sides (because a paternal cousin happened to marry a maternal cousin). If they ever have their DNA analyzed, and if they happen to share any segments with me, then we’ll have to triangulate with other matches to figure out which side each matching segment came from.

(Even when a match is clearly paternal or maternal, a given segment is not necessarily from the most recent common ancestor, if there are other common ancestors along different paths. Again, triangulation with other matches is needed in order to sort it out.)

I’ll try loading up some of my matches and see what other feedback I might have. Thanks for working on this!

They certainly can if they wish, so that they can avoid exporting the data for example, but I don’t see why the graph should suppress it in that case.

All things being equal, the closer the relative, the higher the probability that the matching segment comes from that path.

You are absolutely right that when there are multiple common ancestor paths, it requires triangulation. One way to address this: if there are multiple paths to a common ancestor in your tree, show the segment in both the maternal and paternal graphs instead of only the closest one. That is the only option I can think of.

BUT:
Consider the case of a sibling. The maternal and paternal paths are the same distance. So it would probably be best to put it in both graphs.

Consider someone that has a path thru a grandparent and another path thru a great great grandparent. In this case, the close path is most likely correct. Would you still want to see the segment match in both graphs? I don’t think so.

The current algorithm uses the first path found, so a sibling will be in only one graph. Let me work on fixing the case where there are maternal and paternal paths of the same distance, so that the segment shows in both graphs instead of just one.
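The proposed rule (equal distance goes in both graphs, otherwise the closer path wins) might look roughly like this. It is a sketch, not the gramplet’s code: `graphs_for_match` is a hypothetical name, and the distances are assumed to be generation counts already computed elsewhere, with `None` meaning no path on that side.

```python
from typing import Optional, Set


def graphs_for_match(maternal_dist: Optional[int],
                     paternal_dist: Optional[int]) -> Set[str]:
    """Decide which graph(s) a matching segment should be drawn in."""
    if maternal_dist is None and paternal_dist is None:
        return set()  # no known relationship; caller falls back to the flag
    if maternal_dist is None:
        return {"paternal"}
    if paternal_dist is None:
        return {"maternal"}
    if maternal_dist == paternal_dist:
        return {"maternal", "paternal"}  # e.g. a sibling
    # Otherwise the closer path is the more likely origin of the segment.
    return {"maternal"} if maternal_dist < paternal_dist else {"paternal"}
```

A sibling (equal distances) lands in both graphs, while a grandparent path on one side against a great-great-grandparent path on the other lands only in the closer one.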

I was just thinking that there were no filter options to show only SOME (or SPECIFIC) associations of DNA type. And I don’t know how much effort would go into adding a ‘Configuration’ option for this gramplet. (It would probably need configuration to allow the colors to be changed for Associations, or to add a key that lists them.)

But there were Privacy options and I thought they could be shoehorned into the purpose. Another workaround would be to change the Association type from DNA to DNA2.

The DNA Association’s sole purpose at this point is to drive the DNA gramplet graph. Personally, I have 18 Associations on some people and a config to enable/disable each one individually would be very busy. I cannot think of a reason to add an Association but then not use it in the graph. In this case, if the Association is DNA2, for instance, it will be stored with the person but not used in the graph.
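The DNA2 workaround relies on the graph picking up only Associations whose type is exactly DNA. A minimal sketch of that selection, assuming a hypothetical `Association` record rather than the real Gramps classes:

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Association:
    # Hypothetical stand-in for a person-to-person Association.
    other_person: str
    relation: str
    note_id: str


def associations_to_graph(assocs: Iterable[Association],
                          graph_type: str = "DNA") -> List[Association]:
    """Keep only Associations whose type matches exactly; 'DNA2' is skipped."""
    return [a for a in assocs if a.relation == graph_type]
```

Renaming an Association to DNA2 keeps its Note with the person while hiding it from the graph, with no extra configuration screen needed.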

Adding a color key is on my list of updates once I get all of the first pass feedback.

Could you change the rendering options for the base genes? Currently they are a no-fill rectangle with an outline, and when zoomed in it is very hard to tell what is a bar and what is background.

DNA Painter went with a pale fill and no outline. It works a bit better. They also added a color key to the Association person.


example: www.thefamilyheart.com/wp-content/uploads/2018/09/DNA-Painter-Profile-for-Elizabeth.png

Is there a name for this format? When I looked at a similar CSV list exported from another system (FamilyTreeDNA), it had a header line. (“Name,Match Name,Chromosome,Start Location,End Location,Centimorgans,Matching SNPS”)

If this is NOT a custom Gramps SNP list format, maybe the Association type should be the standard’s name? Or, if DNA is used as the Gramplet flag, maybe the Source could be used to identify the format.

In the case of a sibling, the segment could be a half-match (just one parent) or a full match (both parents). Adjacent half- and full match segments could be reported by the vendor as a single matching segment. The GEDmatch site makes it easy to tell the difference.

I think different providers may have different formats, but the one used in the note can be specific to the gramplet.

However, your comment reminds me that information about DNA matches, just like other information, comes from a source, and that source should be documented and cited somehow. I’ve only just begun to think about this, but if I do store my DNA match data in Gramps, I’ll want to create some appropriate Source(s) and use Citations. Those citations could have Notes and/or Attributes, and could be attached to multiple Persons and/or Associations.

With respect to this gramplet, though, it makes me wonder if the Association is really necessary, if instead the Note containing the segment details were simply attached (not copied) to all of the persons involved. At a minimum, the user could attach it to the parties involved in the match. If they feel confident about which ancestor it came from, they could attach it there as well, and also to all of the people along the relationship path. (I’m assuming in this example that the user creates a separate note for each segment, and that there is no maternal/paternal indicator.)

The format is not standard, but GEDmatch is somewhat similar: Chromosome, Start, End, cM, Matching SNPs. This supports the idea of removing the M/F flag and adding SNPs to the data.

Unless I hear other comments against, I will change the format to align better with those apps.
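For discussion, here is a minimal sketch of parsing Note lines in the proposed five-field layout (Chromosome, Start, End, cM, SNPs). The names `Segment` and `parse_segments` are hypothetical, the comma separator is an assumption, and the sample values are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    chromosome: str  # kept as text so that "X" is allowed
    start: int
    end: int
    cm: float
    snps: int


def parse_segments(note_text: str) -> List[Segment]:
    """Parse one segment per line; skip blank, malformed, or header lines."""
    segments = []
    for line in note_text.splitlines():
        fields = [f.strip() for f in line.split(",")]
        if len(fields) != 5:
            continue  # blank line or wrong number of fields
        chrom, start, end, cm, snps = fields
        try:
            segments.append(
                Segment(chrom, int(start), int(end), float(cm), int(snps)))
        except ValueError:
            continue  # non-numeric fields, e.g. a header line
    return segments


# Made-up sample data for illustration only.
sample = "Chromosome,Start,End,cM,SNPs\n1,1000,2000,12.5,900\nX,5,10,3.0,50"
segments = parse_segments(sample)
```

Skipping lines that fail to parse means a pasted header row does no harm, and the SNP count is then available for a tooltip.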

I can see your point. When I superficially looked at the documentation, I assumed the Association to be opposite what it actually was. I added the woman’s DNA data to her own profile and Associated with the target person.

But that awkward directionality is inherent to the Flag for maternal/paternal side… which I had not seen explicitly defined in any downloaded match data format. Without that Flag, the Association would be simple & symmetrical, and could be used as a marker to statistically validate waypoints in evidence-based genealogy. (Indeed, most of my matches are with cousins where I haven’t determined the connection. Of the 4,019 matches reported by FamilyTreeDNA, I’ve only positively ID’d four: an uncle, my mother’s cousin & her daughter, and a surprise illegitimate paternal 2nd cousin.)

But, if Associations were not used and notes were attached instead, it could get very confusing. Notes can be attached to too many record types, whereas Associations can only be attached at the Person level.

So I suspect that the Association Notes are as good a place as we have for DNA match data.

Good idea. I will change the background to be a pale non-competing color and add a legend.

If you ask the question: ‘who has shared DNA data with person A’, having an Association makes this answer obvious. If each Note were attached to two people instead of thru an Association, I think it would be challenging to ask this question.