Using attribute values in comparisons or calculations

(There’s another current discussion thread about “Gramps and recording and comparing DNA-matches”. In that thread, I ask a question which has broader applicability, and so I’m repeating it here as a new, more general discussion topic, in case people who have experience with this might not be reading the other discussion thread because they’re not interested in DNA.)

I would be interested to learn whether there are precedents among Gramps users for storing multi-part values within a single attribute, and using them in comparisons or calculations. Has anyone attempted to store data that way, and then use it in a report, filter, or whatever?

The example I gave in the other discussion was DNA-specific but the idea is general. Have you ever stored a set of delimited values within the value of a single attribute, and then done some coding to parse the values and use them in comparisons or calculations?

Thanks for any suggestions.

At the moment I’m not using attributes for this, but I would rather have one data pair/triple/quadruple for each value than add a list of them to one attribute field. It will depend a little on the data, though, and on whether additional fields are ever added to attributes…

A list of values can be difficult to read, as it will be a string of delimited values, and if it is, e.g., a triplet, you need two types of delimiters (one between the items and one between the fields of each item). I personally think something JSON-formatted would actually be easier to read…
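To make the trade-off concrete, here is a minimal Python sketch that parses the same made-up list of triplets both ways. The attribute contents, the choice of ";" and "," as delimiters, and the function names are assumptions for illustration only; none of this is an existing Gramps feature.

```python
import json

def parse_delimited(value, item_sep=";", field_sep=","):
    """Split a delimited attribute value into a list of field tuples.

    Two delimiters are needed: one between items, one between fields.
    All fields come back as strings; numbers must be converted by the caller.
    """
    items = []
    for item in value.split(item_sep):
        item = item.strip()
        if item:
            items.append(tuple(field.strip() for field in item.split(field_sep)))
    return items

def parse_json(value):
    """Parse a JSON-formatted attribute value; types and field names are kept."""
    return json.loads(value)

# Hypothetical attribute values holding the same information.
delimited_value = "cattle,5,head; potatoes,12,barrels"
json_value = (
    '[{"item": "cattle", "amount": 5, "unit": "head"},'
    ' {"item": "potatoes", "amount": 12, "unit": "barrels"}]'
)

triplets = parse_delimited(delimited_value)
records = parse_json(json_value)
```

The JSON form is longer to type, but each field is named and numbers stay numbers, which is what makes it easier to read back and to calculate with.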

I do use attributes for some data, mostly in events, where I add additional data. For example, Norwegian censuses often also state how many animals there were and of which types, how many barrels of potatoes and wheat, and the size of the farm…
This will be even more useful when/if Gramps gets support for Events on Places, because then it will be possible to compare these values over time for the place/farm/company/ship (this type of information depends more on the place than on the person)…

I can see a lot of uses for attributes, also for places if they get updated with at least a date/period field. If attributes were extended to quadruples (four-tuples), it would actually be possible to use them for custom relations between entities, but then again, that starts to be a feature that gets extremely complex, and a table in a custom “Relation” Note would maybe be a better option…

There are a lot of values/information that I have for multiple documents that fit better as attributes than in a Note…
E.g. one of my ancestors held more than 25 patents, and a lot of the metadata for each patent would be of better use in an attribute than in a note…
The same goes for the extra values for a farm. In Norway we also had taxes on the farms (the “Silver Tax” and the clerical “tiende” are two of them), and I am sure it was the same in most other countries too, where different taxes were calculated on the income of the land, not on who owned or used the land…

Edit (Adding): … and in a historical perspective it would be interesting to be able to compare that kind of information for e.g. a farm or a company (also a place), to see how the place evolved over time and maybe across different generations of the same family…
In Norway there are farms and farmland that have been in the same family for 800 and maybe as long as a thousand years, and for many of those places it’s possible to find a lot of metadata. But even for a 200-300-year-old place, it could be interesting to see some of that information, to be able to compare and do calculations on it, and to export it in some usable formats for other research software (CSV or JSON, for example)…
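As a rough sketch of what such a comparison and export could look like once the values are out of Gramps, the Python below computes the change between consecutive years for each attribute and writes the result as CSV. The records, attribute names, and function names are invented for illustration; how the values would actually be read from attributes is left open.

```python
import csv
import io

# Hypothetical records for one farm: (year, attribute name, value).
records = [
    (1865, "cattle", 6),
    (1875, "cattle", 9),
    (1865, "potato_barrels", 12),
    (1875, "potato_barrels", 10),
]

def changes_over_time(records):
    """Return rows of (attribute, from_year, to_year, change) per attribute."""
    by_attribute = {}
    for year, name, value in records:
        by_attribute.setdefault(name, []).append((year, value))
    rows = []
    for name, series in sorted(by_attribute.items()):
        series.sort()  # order each attribute's values by year
        for (year0, value0), (year1, value1) in zip(series, series[1:]):
            rows.append((name, year0, year1, value1 - value0))
    return rows

def to_csv(rows):
    """Serialize the comparison rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["attribute", "from_year", "to_year", "change"])
    writer.writerows(rows)
    return buffer.getvalue()

rows = changes_over_time(records)
csv_text = to_csv(rows)
```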

Related Feature Request

I’ve normalized all my citations like this: [sourceIdentifier]/imgNumber (recordingDate, recordType, recordNumber. Persons) - sourceUrl

I’ve also made filters using regex to point out irregular citations that don’t follow my own norm, so I can check and correct what I’ve mistyped.
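For readers who want to try the same idea, here is a hedged Python sketch of a regex for the citation layout above. The exact pattern is an assumption (for instance, that imgNumber is all digits and that the date and record type contain no commas), and the sample citation is invented; the real filters may well differ.

```python
import re

# One named group per part of:
# [sourceIdentifier]/imgNumber (recordingDate, recordType, recordNumber. Persons) - sourceUrl
CITATION_PATTERN = re.compile(
    r"^\[(?P<source>[^\]]+)\]"   # [sourceIdentifier]
    r"/(?P<img>\d+)"             # /imgNumber (assumed numeric)
    r" \((?P<date>[^,]+), "      # (recordingDate,
    r"(?P<type>[^,]+), "         # recordType,
    r"(?P<number>[^.]+)\. "      # recordNumber.
    r"(?P<persons>[^)]+)\)"      # Persons)
    r" - (?P<url>\S+)$"          # - sourceUrl
)

def parse_citation(text):
    """Return the citation parts as a dict, or None if the text is irregular."""
    match = CITATION_PATTERN.match(text.strip())
    return match.groupdict() if match else None

# Invented examples: one that follows the norm, one that does not.
regular = "[AD44]/12 (1850-03-02, baptism, 17. Jean Dupont) - https://example.org/r/17"
irregular = "AD44/12 (1850-03-02, baptism, 17. Jean Dupont)"

parts = parse_citation(regular)
```

Anything for which parse_citation returns None is a candidate for manual checking.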

Normalizing attributes could be done too. You have to decide what information you want to store and what set of delimiters (unique or not) could represent each part of that information.

When you say you want to make comparisons or calculations with these attributes, what kind? Attribute filters can use regex too, so it’s possible to compare attribute content, or even parts of it, against templates of your own. But calculation??

Thank you, Patrice. Here is an example of the kind of comparison I am thinking of. Suppose I use an attribute to store information about a segment of autosomal DNA. The attribute would contain several values: a chromosome number, a start position, an end position, and perhaps some other values. Now suppose I want to compare two of these attributes, belonging to two different persons, to see if they have overlapping segments. First I need to compare the chromosome numbers to see if they match. I could do that with regex as you suggest. But now I also need to compare the start and end positions to see if the segments overlap. For this, I need to compare the start position of one segment to the end position of the other segment, and vice versa. And the comparisons are for inequalities (less than or greater than), and I don’t think regex can do that?
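The comparison described here is a standard interval-overlap test, which takes only a few lines once the attribute value has been parsed into numbers. The sketch below assumes a hypothetical attribute layout of "chromosome,start,end" with comma delimiters; the layout, the names, and the sample positions are illustrative only and are not an existing Gramps feature.

```python
from typing import NamedTuple, Optional, Tuple

class Segment(NamedTuple):
    chromosome: str
    start: int
    end: int

def parse_segment(value: str, sep: str = ",") -> Segment:
    """Parse 'chromosome,start,end[,other values...]' into a Segment.

    Raises ValueError if fewer than three fields are present or if the
    positions are not integers.
    """
    chromosome, start, end = (part.strip() for part in value.split(sep)[:3])
    return Segment(chromosome, int(start), int(end))

def segments_overlap(a: Segment, b: Segment) -> bool:
    """True if both segments are on the same chromosome and their ranges meet."""
    return a.chromosome == b.chromosome and a.start <= b.end and b.start <= a.end

def shared_range(a: Segment, b: Segment) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the shared region, or None if there is none."""
    if not segments_overlap(a, b):
        return None
    return max(a.start, b.start), min(a.end, b.end)

# Made-up attribute values for two persons.
person_a = parse_segment("7, 1000, 5000")
person_b = parse_segment("7,4000,9000")
person_c = parse_segment("8,4000,9000")
```

The two inequalities in segments_overlap are exactly the "start of one against end of the other, and vice versa" comparisons, which is the part a regex cannot express.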

You could use attributes to store DNA data.

How do you want to enter the data? How do you want to display the results of a comparison?

Hi Nick,

As I mentioned at the beginning of this thread, this question arose from a different thread that was specifically about DNA, and I started this thread to discuss attribute parsing more generally. Please refer to the other thread for the DNA-specific discussion.

I asked similar questions in the other thread. I’m happy to write a prototype if someone gives me some sample data and describes how they want the results displayed.

Just spent a fair amount of time looking for the other thread. Since the Discourse site is so new, I expected it would be in the Gramps-Users archive… way too many search hits!

If you can ‘quote’ another thread, it helps when trying to follow the discussion.

Sorry Brian, it’s here.

@Nick-Hall, if you are going to write a prototype for DNA, maybe you could see if you could use some of the Python libraries that are already out there?

Most of them already have the algorithms for the most common views working… Lots of research labs use Python and R in genetic research, and make their code open source…

I think a few of them already are mentioned…

One wonders if this isn’t an opportunity to think about a handshaking interface to other special purpose projects in Python based open source.

There are SO MANY opportunities to shore up Gramps weaknesses with mature projects. Interfacing to document management systems & DNA analysis tools are just 2 examples where it would take YEARS to build the same capabilities into Gramps.

Yes. I’m not planning to write any new analysis code. We should use existing tools.

Firstly, we need to know the format of the results from the DNA testing companies. Do we have any examples to work from? Do all companies provide the same data?

Then we need to know the requirements of our users. What results are interesting? How do they want them displayed?

I have never got answers to these questions, so the DNA functionality in Gramps has never progressed past a very simple experimental gramplet.

I am also curious to know what the outcome would be. I have tens of thousands of matches on a handful of DNA sites. Most of the individuals concerned are poorly documented and often do not identify themselves by name. So I do not want them in GRAMPS until I know more about them.

On the other hand, my GRAMPS data consists of people with whom I already have a documented relationship. I add our DNA match, if known, as an attribute showing cM overlap and source of their data (eg, 23&Me), but it is merely an interesting fact, not functional. And it relates to me only. I don’t bother to document DNA matches between others.

I do not do DNA research at this point Nick… But I can answer the first…
All the companies have slightly different “formats”…
I don’t know if you have seen this page, but I think it can give “some” answers… (this is the “tools” page)
https://isogg.org/wiki/Raw_DNA_data_tools

I think Lineage might be a good solution (https://github.com/apriha/lineage), but it’s under the GNU license, and I don’t know if it can be used with Gramps…

This library also supports most of the consumer DNA testing companies: https://pypi.org/project/snps/

Regarding viewing the data, I think these libraries come with the most common “viewing alternatives”, but it could also be useful to have a graph that shows any linked data in a weighted format based on DNA, and maybe a colored corner or something on the person box in the graphical trees for everyone with a test attached, and a colored edge line between any people up to the “oldest known common relative” based on DNA…
