Ethnicity pie charts drill-down visualisation

Inspired by this thread, I wanted to float the idea of a pie chart representation of ethnicity. It would use person attributes set on the ancestors and then drill down into descendants, where a family has children, to build a chart showing ethnic origin, just like in this post: How do you record Enclave births? - #4 by emyoulation

As probably discussed multiple times, it’s not a scientifically accurate way to confirm the origin share, but we could give the user the possibility to say what they want to use it for – nationality/ethnicity/identity. Given that we’re barely 20 years into consumer DNA testing, which is still not widely available in certain places, it’s not possible to have such a test done for many people in the family tree.

My idea is to build a d3 visual using one or more specific attributes set for a person. Then, if there is data for their partner and a child, we can graph the distribution of ethnicity that the descendant has, using the same graph method that draws other visual relationships. A more exotic option would be to populate the descendants’ attribute fields automatically; however, this could interfere with data that someone wants to enter manually for a specific descendant.
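To make the arithmetic concrete, here is a minimal sketch of the blending step, assuming each parent contributes exactly half of a child's distribution. The function name `blend` and the dictionary-of-shares representation are illustrative assumptions, not anything that exists in Gramps or Gramps Web.

```python
from collections import defaultdict


def blend(parent_a, parent_b):
    """Average two parents' shares into one child distribution.

    Each argument maps a label to a share between 0 and 1.
    Assumption: both parents contribute exactly half.
    """
    child = defaultdict(float)
    for parent in (parent_a, parent_b):
        for label, share in parent.items():
            child[label] += share / 2
    return dict(child)


mother = {"Polish": 1.0}
father = {"Italian": 0.5, "Irish": 0.5}
print(blend(mother, father))
# {'Polish': 0.5, 'Italian': 0.25, 'Irish': 0.25}
```

The resulting shares are what a pie chart would draw as slices for the child.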

What do people think of this idea?


I think that would be fantastic, but I would actually think bigger:

  • A generic way to define categories (e.g. by attribute value)
  • Different visualizations of the categories, pie chart being one possibility.

We already have the d3-based fan charts which allow category coloring based on some predefined things, like religious denomination (based on the Religion event). It would be cool to use the same kind of categories both for fan charts and pie charts. We could share much of the same JS logic.

To me, the main question is how to shape the UI to define the categories to be plotted.


Sorry for the silence from my side. That is great, and I would love to see the fan chart logic reused, but I currently do not know a way to represent quote-unquote pie chart data on such a visual, with number/place hints added. I’m not great at data, but I’ll experiment and come back in a while.

If there were a way to consolidate the Fan Chart’s fundamental functions, it would help grow its capabilities.

In Gramps for desktop, Fan Charts exist as Charts view modes, gramplets, and Reports, but only one type of Fan Chart is available in all three, and not every feature is available in every type.

Perhaps most inconvenient of all, you cannot mirror the configuration done in one view mode (or gramplet splitbar, or report) to its siblings. So you cannot interactively configure a Fan Chart in a Chart view and then apply it to a Report or Book.

[And … “mmmmm, pie!” 😋 🍴 ]

I am not sure I could even determine who is French!
Parts of France lie close to the German-speaking countries, the South-West is close to the Spanish area, the South-East to Italy, and the North-West to the United Kingdom or Ireland. Québec is close to American culture, and what about Louisiana (USA), etc.? So, even with all these French “mother or father tongues”, we cannot really trust any rule for setting a strict definition of “a French person”…

Once again, I appreciate the input regarding the Fan Chart vs. pie chart and how it can be used (languages spoken, ethnicity, etc.). It’s still tough for me to comprehend visualising the percentage distribution on a fan chart, so I decided to go with a prototype of a pie chart. The current dilemma is that we can only calculate the distribution for the person in focus and their descendants; we lose that context when we go down and click to focus on a child, in this case Jane Adams.

The first picture illustrates the general idea of a “parent → child → child → …” chart, but the further calculation (in red) is currently broken; please disregard that.

And this is what we get when we switch to Jane: since she doesn’t have any attribute data explicitly defined, there is nothing to calculate. Fetching upstream data from her parents is one unreliable option (what if they also have no data?), but what else? My feeling is that this either needs to stay ad hoc, i.e. “only view from this perspective”, or we need to propagate the fields into the database using existing methods. The latter could be done down to the nearest descendant who has ethnicity data explicitly set, but overall it is intrusive.
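One way the "fetch upstream" option could look is sketched below. It assumes the pedigree is available as plain dictionaries and that a parent with no data anywhere upstream contributes an "Unknown" share instead of being silently dropped. The names `resolve`, `explicit`, and `parents`, and the sample people, are hypothetical. It also assumes the pedigree has no loops.

```python
def resolve(person_id, explicit, parents):
    """Return a label -> share mapping for person_id, or None if nothing is known.

    explicit: person id -> distribution entered by the user
    parents:  person id -> (father id, mother id); either may be None
    """
    if person_id is None:
        return None
    if person_id in explicit:
        return dict(explicit[person_id])  # explicit data always wins
    father, mother = parents.get(person_id, (None, None))
    resolved = [resolve(father, explicit, parents),
                resolve(mother, explicit, parents)]
    if all(dist is None for dist in resolved):
        return None  # nothing recorded anywhere upstream
    result = {}
    for dist in resolved:
        if dist is None:
            dist = {"Unknown": 1.0}  # assumption: keep the gap visible
        for label, share in dist.items():
            result[label] = result.get(label, 0.0) + share / 2
    return result


explicit = {"john": {"English": 1.0}, "maria": {"Spanish": 1.0}}
parents = {"jane": ("john", "maria"), "tom": ("jane", None)}

print(resolve("jane", explicit, parents))
# {'English': 0.5, 'Spanish': 0.5}
print(resolve("tom", explicit, parents))
# {'English': 0.25, 'Spanish': 0.25, 'Unknown': 0.5}
```

Because this is computed at view time, nothing is written to the database, which keeps the "only view from this perspective" behaviour while still giving a focused child something to show.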

Some updates over the weekend. I did some more work and tried a soft database propagation for ethnicities. It works like this: the user sets one or more ethnicity attributes, each either with or without a colon-separated percentage for that ethnicity. We then set calculatedEthnicity on descendants, unless a descendant already has ethnicity set. The calculated ethnicity field is updated whenever something changes further up the tree. The only caveat is that I had to make direct REST API calls to fetch the spouses, as the IsSpouseOfFilterMatch doesn’t support dynamically constructed filters.
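As an illustration of the attribute format, here is a rough sketch of parsing values with and without a colon-separated percentage. How entries without a percentage are weighted is not stated above, so splitting the remaining share equally among them is purely an assumption. The function names and the dictionary layout of a person are hypothetical too; only the calculatedEthnicity field name comes from the description.

```python
def parse_ethnicities(values):
    """Turn values like 'Polish:50' or 'Polish' into label -> share.

    Assumption: entries without a percentage split whatever share
    is left after the explicit percentages are applied.
    """
    shares, open_labels = {}, []
    for value in values:
        label, sep, percent = value.partition(":")
        label = label.strip()
        if sep and percent.strip():
            shares[label] = shares.get(label, 0.0) + float(percent) / 100
        else:
            open_labels.append(label)
    remainder = max(0.0, 1.0 - sum(shares.values()))
    for label in open_labels:
        shares[label] = shares.get(label, 0.0) + remainder / len(open_labels)
    return shares


def effective_ethnicity(person):
    """Explicit ethnicity wins; otherwise fall back to the calculated field."""
    explicit = person.get("ethnicity")
    if explicit:
        return parse_ethnicities(explicit)
    return person.get("calculatedEthnicity")


print(parse_ethnicities(["Polish:50", "Italian"]))
# {'Polish': 0.5, 'Italian': 0.5}
print(effective_ethnicity({"calculatedEthnicity": {"Irish": 1.0}}))
# {'Irish': 1.0}
```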

I find this useful for myself; however, I’m not sure how the wider audience would respond to such a change, given that this is not the conventional fan chart attribute item but an actual section in the visuals page.


While I see significant potential in a tool like this, there are also some fundamental methodological pitfalls that need to be acknowledged. The most serious issue is that we generally do not know the actual ethnicity of our ancestors.
This means there is a high risk of introducing false or imagined ethnicities into a genealogy simply because the user wants a result. The outcome becomes an artificial construction based on data that does not exist in the historical record — even for relatively recent generations.

Ethnicity cannot be inferred from a person’s country of origin, language, region, social group, or similar information alone.
Historical sources often used broad, imprecise, or outdated categories, and modern concepts of ethnicity cannot be projected backwards without creating anachronisms.
A tool that attempts to calculate or visualize ethnicity without solid historical evidence risks presenting an illusion of precision where none exists.
In practice, it may end up visualizing the user’s assumptions rather than the ancestor’s identity.

However, if such a system is designed not as an ethnicity calculator but as a flexible, parameter‑driven analytical tool, it could become extremely valuable for historical, genealogical, and social‑scientific research. The key is that the user must be able to choose which attributes, event types, and relationships to include in the analysis. This shifts the focus away from speculative ethnicity and toward verifiable, source‑based data.

A dynamic configuration model would allow users to select:

  • which attributes to analyze (language, religion, occupation, nationality, social status, etc.)

  • which event types to include (residence, employment, migration, education, military service, etc.)

  • which relationships matter (household membership, employment relations, apprenticeships, maritime service, adoption, etc.)

  • which contextual elements to visualize (places, time periods, social networks, occupational clusters)

This approach enables meaningful analysis of historical patterns without inventing data. For example, one could examine how occupations changed across generations, how migration shaped language use, how religious affiliation shifted over time, or how household structures influenced identity. Because the parameters are user‑defined and based on documented events, the results remain grounded in actual sources rather than assumptions.
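A parameter-driven setup along these lines might be sketched as follows. This is only an illustration under stated assumptions: the `AnalysisConfig` class, the `tally` function, and the flat record layout with `kind`, `value`, and `sourced` keys are all hypothetical and do not correspond to any existing Gramps structure.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AnalysisConfig:
    """User-chosen fields to analyse (hypothetical structure)."""
    attributes: tuple = ()
    event_types: tuple = ()


def tally(records, config):
    """Count documented values per selected field, skipping unsourced records."""
    counts = {}
    for record in records:
        if not record.get("sourced"):
            continue  # only source-based data is counted
        kind = record["kind"]
        if kind in config.attributes or kind in config.event_types:
            counts.setdefault(kind, Counter())[record["value"]] += 1
    return counts


records = [
    {"kind": "occupation", "value": "farmer", "sourced": True},
    {"kind": "occupation", "value": "farmer", "sourced": True},
    {"kind": "occupation", "value": "sailor", "sourced": True},
    {"kind": "religion", "value": "Lutheran", "sourced": True},
    {"kind": "occupation", "value": "smith", "sourced": False},
]
config = AnalysisConfig(attributes=("occupation",))
print(tally(records, config))
# {'occupation': Counter({'farmer': 2, 'sailor': 1})}
```

The counts per category could then feed a pie chart, a fan chart colouring, or any other view, without the tool ever assigning a value that is not in a record.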

In this way, the tool becomes a framework for exploring social history, cultural change, and demographic patterns — not a mechanism for assigning imagined ethnic percentages. It respects the limits of the historical record while still offering powerful ways to visualize and interpret the data that does exist.


Methodological Note: The initial draft of this text was written in Norwegian. Microsoft Copilot was used to assist with translation into English and to refine structure, readability, and argument flow. All factual claims, interpretations, and conclusions were reviewed and validated by the author. AI assistance was used strictly as a linguistic and editorial tool.


You are totally right. It is much more complex. If the ethnicity of your father is Polish, that may make no sense for your grandfather; maybe you can call it Slavic in that generation. If the ethnicity of your mother is Ukrainian, the ethnicity of your grandfather is maybe Russian. There are different ethnicity categories at different times. If you go back a few thousand years, the category is maybe “hunter-gatherer”. You cannot calculate and propagate ethnicity. You can use DNA data to get better estimates than calculating ethnicity from one generation to the next.