Exporting Attributes (ICD codes for a Family Medical History)

Just had a discussion on the Gramps Matrix IRC regarding recording Cause of Death for relatives.

The initial proposal was that they would use the Description to record the particular maladies affecting the person. But since their goal was to eventually calculate a genetic predisposition to fatal illnesses, I felt free-text descriptions would be too hard to keep consistent enough for analysis.

I suggested that they use ICD-10 codes (International Classification of Diseases) as a custom attribute instead. It would be less susceptible to typos.

But the question then becomes, how do you export Custom Attributes with their Events (or other objects) to a CSV file for analysis?


Python ICD tools

  • hcuppy is a Python implementation of HCUP Tools and Software (Healthcare Cost and Utilization Project)

  • icd 0.1.2 Tools for working with ICD codes and co-morbidities

  • icd9 Python library for hierarchy of ICD9 Codes

  • nlc-icd10-classifier A simple web app that shows how Watson’s Natural Language Classifier (NLC) can classify ICD-10 code. The app is written in Python using the Flask framework and leverages the Watson Developer Cloud Python SDK

  • ICD API samples in Python

Just curious, are they tracking only diseases which resulted in death, or also any case of such diseases even if the person survived that and then later died of something else (or is still living)? In other words, does this really belong on a Disease event rather than a Death event?

If I knew enough Python, I would add the attribute as a column to the Event category view so that it could be included when exporting the view as CSV. And if I knew more Python than that, I would make the category view configurable so that the user could choose which attributes to add. But I don’t, so instead I would export my database to SQLite and write some SQL.
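To make the SQLite route concrete, here is a minimal sketch of the kind of join it involves. The two-table layout (`event`, `attribute`, and their column names) is hypothetical and only for illustration; the tables actually produced by the Gramps SQLite export may look quite different, so the query would need adapting.

```python
import csv
import io
import sqlite3

# Hypothetical two-table layout for illustration only; the actual
# tables and columns produced by the Gramps SQLite export may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (handle TEXT PRIMARY KEY, gramps_id TEXT, description TEXT);
    CREATE TABLE attribute (owner_handle TEXT, type TEXT, value TEXT);
    INSERT INTO event VALUES ('h1', 'E0001', 'Death of John Doe');
    INSERT INTO event VALUES ('h2', 'E0002', 'Death of Jane Doe');
    INSERT INTO attribute VALUES ('h1', 'ICD-10', 'I21.9');
""")

# Join each event to its attributes and keep only the requested attribute type.
QUERY = """
    SELECT e.gramps_id, e.description, a.value
    FROM event AS e
    JOIN attribute AS a ON a.owner_handle = e.handle
    WHERE a.type = ?
    ORDER BY e.gramps_id
"""
rows = conn.execute(QUERY, ("ICD-10",)).fetchall()

# Write the result as CSV text (to a string here; a file works the same way).
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["event_id", "description", "icd10"])
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text, end="")
```

Events without the attribute (E0002 here) drop out of the inner join; a `LEFT JOIN` would keep them with an empty code column instead.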

If the Attribute is stored as Cause of Death type Event, then yes. Interestingly, the co-morbidity Python tool creates a 3x3 matrix of ICD Codes for analyzing contributing or underlying conditions.

However, the Medical Information type Events could be used for wider purposes: recording medical procedures, inoculations, illnesses. ICD codes are commonly used for Medical Billing purposes and could be used for building a living Medical History. A new report for an Individual’s Medical Information could generate something you take to a new doctor or for a Hospital stay.

Anyway, if you use an Event for the disease, instead of an attribute, then it would be easier to export.

I wonder if there are other use cases for yet another possible new feature: causality between events.

I believe that explicitly naming an Event type or specifying it in the Description is too error-prone. ICD codes are easier to group & validate.
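As a rough illustration of "easier to group & validate", the sketch below checks the general shape of a code and counts codes by their three-character category. The regular expression is a simplified assumption, not the official ICD-10 grammar, and it does not check that a code actually exists in the classification.

```python
import re
from collections import Counter

# Simplified shape check (an assumption, not the official ICD-10 grammar):
# one letter, two digits, then an optional dot and 1-4 alphanumerics.
ICD10_SHAPE = re.compile(r"^[A-Z][0-9]{2}(\.[0-9A-Z]{1,4})?$")

def normalize(code):
    """Trim whitespace and upper-case a code typed by hand."""
    return code.strip().upper()

def is_plausible_icd10(code):
    """Return True if the text has the general shape of an ICD-10 code."""
    return bool(ICD10_SHAPE.match(normalize(code)))

def group_by_category(codes):
    """Count well-formed codes by their three-character category (e.g. I21)."""
    return Counter(normalize(c)[:3] for c in codes if is_plausible_icd10(c))

# Sample values, including free text and a typo (lowercase L instead of 1).
codes = ["I21.9", "i21.0", "C34.1", "heart attack", "I2l.9"]
rejected = [c for c in codes if not is_plausible_icd10(c)]
counts = group_by_category(codes)
print(rejected)
print(dict(counts))
```

A free-text Description offers nothing comparable: "heart attack", "MI" and "myocardial infarct" would all have to be reconciled by hand.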

Although new features are always welcome, I am more interested in what is possible NOW. We have 2 built-in Event types that are viable for the purpose. And a single Attribute would allow adding nearly infinite resolution.

Can a CSV formatted report be written (by an end-user) that includes a user defined Attribute value?

Why do you think I have asked over and over again for a full Database export to CSV?

I suspect that a full export to CSV isn’t a workable objective. A flat tabular form is just too limited of a file format.

But it works fine for narrow snips of data.

And I know we have user creatable reports. So we can design a report that sidesteps some of the data chunks that would be problematic.

Sorry I wasn’t clear. I mean, instead of putting the code in the value of the attribute, you could put it in the description of a “Disease” event.

Regardless of where you store the code, you’ll need a lookup table to decode it into the appropriate description, unless you concatenate the code and description and store that as the value of the attribute (or as the description of the event). That’s one reason why it might be better to export the data from Gramps.
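A lookup table of that kind can be very small on the analysis side. This is only a sketch: the three entries are illustrative category titles, and a real table would be loaded from a published ICD-10 list rather than typed in. The function names are made up for the example.

```python
# Illustrative lookup table keyed by three-character category;
# a real one would be loaded from a published ICD-10 list.
ICD10_TITLES = {
    "I21": "Acute myocardial infarction",
    "C34": "Malignant neoplasm of bronchus and lung",
    "E11": "Type 2 diabetes mellitus",
}

def describe(code, titles=ICD10_TITLES):
    """Decode a code to its category title, or a placeholder if unknown."""
    category = code.strip().upper()[:3]
    return titles.get(category, "(unknown code)")

def label(code):
    """Concatenate code and title, e.g. for a report column."""
    return f"{code} {describe(code)}"

print(label("I21.9"))
print(label("Z99.9"))
```

Keeping only the code in Gramps and decoding it after export avoids storing the same description text in every attribute value.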

Yes, users who know enough Python can create their own add-on reports. Or are you referring to something else?

I know we can do reports. I’m asking how we reference custom Attributes to insert them in a Report

Strange that more or less every single large research project uses CSV as one of its main data formats, if that is the case.

I think you need to take into account what Excel, R, Python and Perl can actually do with tabular data.

Out of tabular data you can create full network graphs, you can use fields for calculation, you can create 3D models if you like.

And the most important thing is that nearly all research tools, including Python, have extensive support for that format.

It is also a lot easier to work with than a .sql file, or an incorrectly formatted JSON file.

You can use VS Code to work with CSV data if you like. So no, a tabular format has no limitations in that way. But yes, it would be easier to have an import/export in a JSON-LD or grampml file if you want to analyze data in an RDS or graph tool.

But analyzing tabular data and extracting data from it is so easy that even I can manage it.

The problem is when people always have to discuss why something is not helpful, when someone just wants to extract 3 fields out of a database with thousands so that they can do that one simple task.

If you want to get some of the data from the Gramps XML, it is easy to use either Excel with Power Query or Python with xmltree and numpy or something similar.
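Here is a minimal sketch of the Python route, using the standard library's `xml.etree.ElementTree` and `csv` modules. The sample document is hand-written and only modeled on a Gramps XML export; the element and attribute names (`event`, `type`, `attribute` with `type`/`value`) are assumptions that should be checked against a real export, which is normally gzip-compressed and would need unpacking first. The "ICD-10" attribute name is just the custom attribute discussed in this thread.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hand-written sample modeled on a Gramps XML export (assumed structure).
SAMPLE = """<database xmlns="http://gramps-project.org/xml/1.7.1/">
  <events>
    <event handle="_e1" id="E0001">
      <type>Death</type>
      <description>Death of John Doe</description>
      <attribute type="ICD-10" value="I21.9"/>
    </event>
    <event handle="_e2" id="E0002">
      <type>Birth</type>
    </event>
  </events>
</database>"""

def local(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]

def event_attribute_rows(xml_text, attribute_type):
    """Return [event_id, event_type, value] for each matching attribute."""
    rows = []
    for event in ET.fromstring(xml_text).iter():
        if local(event.tag) != "event":
            continue
        event_type = next((c.text or "" for c in event if local(c.tag) == "type"), "")
        for child in event:
            if local(child.tag) == "attribute" and child.get("type") == attribute_type:
                rows.append([event.get("id", ""), event_type, child.get("value", "")])
    return rows

def to_csv(rows):
    """Render the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["event_id", "event_type", "icd10"])
    writer.writerows(rows)
    return buffer.getvalue()

rows = event_attribute_rows(SAMPLE, "ICD-10")
print(to_csv(rows), end="")
```

Matching on the local tag name rather than the full namespace keeps the sketch working if the schema version in the namespace URL changes.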

And if you have to create a report for every single thing someone wants, Gramps will be extremely bloated with reports.

Why is there so little will to make Gramps software that is “interoperative” with other research tools through a few export/import additions, so that people can use ANY tool they like to analyze their data…?

This starts to remind me more and more of a locked-down proprietary mindset, not an open source/open data mindset.

With my CSV conversion from Gramps XML, I can import the data into Cytoscape and make a cluster or a node-size data field for just what you talked about in your first post, because that is what most network graph tools are made to do.

Let’s not get overwrought.

Yes, CSV is useful and has its place. And that’s why everybody uses it as a crutch. Heck, Gramps already uses it as such a crutch.

But if any format covered all the bases, it would be the only format anybody used. Since there ARE many formats, it is reasonable to accept that they all have limitations.

So sometimes our crippled situations demand a motorized wheelchair instead of a crutch. But it’s still good to have a decent crutch available too.

I’m just asking how to make a custom crutch that will work in a specific situation.

If we don’t let the ideal blind us & block our way, it could be that discussing how to customize the CSV … even via a Book of Reports… gets us a nudge closer to that ideal.

It isn’t that no one wants to… it’s just that everybody has their own projects and limits to their capacity.

I want to rework large parts of the manual where the information is scattered. But the questions that pop-up in the support forums send me haring off after answers to those questions & nearly always require a tweak to clarify the wiki or a Bug Report for some under-explored interaction. The experience is enlightening but leaves me scattered and frustrated.

In other projects involving database systems, writing reports has proven to be a user’s entry point for expanding file export options.

I often wrote reports that output in the target system’s format. Reports allowed incremental on-screen development of simple structured formats (like tab delimited or comma separated) without the hassle of getting a development project approved. Once it worked on-screen, I’d redirect the output to file.

Once that file actually worked for import, I’d have a validated design document that could easily be converted to a subroutine which eliminated the bottleneck of the redirection. And then it was a trivial task for a competent programmer to adapt my clumsy hack into a clean feature for the program.

I’d like to write similar reports for Gramps. They would be good sample reports (covering all the data structures) that people can adapt into standard Genealogical reports or webpages. And adapt into other formats.

As far as I can see (which, admittedly, is perhaps not far enough) the “simple access API” doesn’t seem to expose attributes. But at least, I think, it gives you a handle to the objects to which the attributes are attached, so maybe you could still access them when writing a report using that API. Otherwise I guess you would have to navigate the data model directly, perhaps looking at the code for the Extended Attributes gramplet for clues.
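For navigating the data model directly, a sketch along these lines might be a starting point. It assumes, as I understand the Gramps data model, that Event objects offer `get_gramps_id()`, `get_type()` and `get_attribute_list()`, and that each Attribute offers `get_type()` and `get_value()`; please verify those against the Gramps API documentation. The function names `attribute_rows` and `write_csv` are made up for this example, and the code imports nothing from Gramps, so it works on any objects with those methods.

```python
import csv

def attribute_rows(events, wanted_type):
    """Yield (gramps_id, event_type, value) for each matching attribute.

    `events` is any iterable of Gramps-style Event objects, for example
    whatever the database's event iterator yields inside a report or
    gramplet (an assumption to verify against the Gramps API).
    """
    for event in events:
        for attr in event.get_attribute_list():
            # str() turns the attribute type object into its display name,
            # which for a custom type is the name the user typed.
            if str(attr.get_type()) == wanted_type:
                yield (event.get_gramps_id(), str(event.get_type()), attr.get_value())

def write_csv(rows, out):
    """Write the rows to an open text file (or any file-like object)."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["event_id", "event_type", "value"])
    writer.writerows(rows)
```

Inside a report, something like `write_csv(attribute_rows(db.iter_events(), "ICD-10"), open_file)` would be the intended use, with `db` and `open_file` supplied by the surrounding code.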

I too would like to try writing reports someday, but since I’m more familiar with SQL than Python, I currently find it much easier to use the SQLite export. (It’s also less risky! :slightly_smiling_face:)

By the way, has anyone ever used the “Query” add-on? I haven’t been able to get it to register, though I don’t get any errors about it.

No, have a read of the support page

Gary has prototyped a report that writes an Attribute table. Already getting some interesting & useful results. (I have pruned nearly 800 unneeded Attributes from my main Tree.)

With any luck, his efforts will result in a nice small & tight template for doing similar reports.