Update Sources and citations

Over the last year, it has become apparent that @StoltHD has designed some very intricate workflows. And these workflows let him do some analysis that sounds incredibly powerful. I admire his mastery of the spectrum of research tools.

Yet some import/export deficiencies cost him excessive effort when doing workarounds. Very likely, sometimes workarounds have proven impossible.

So I worry that he might have become fixated on enabling a specific user-powered workaround because it has broad application. But there could be a better solution that wouldn’t have as many code development detours… and that doesn’t leave data crumbs at each exchange of format. Many times, these solutions are counterintuitive to non-programmers.

We don’t know because we cannot see his workflows. And without understanding these workflows & grasping the objective, we would have to follow blindly. Most techies cannot do that; they need to fully grok the final objective.


It’s not advanced at all!!!

  • I use one external tool for Research Notes and am testing others (Foam in VSC, Obsidian and Joplin)
  • I use Network Graph Software for analyzing and visualizing my data and research
  • I use Reference Manager Software for Sources, since there is no real source management in Gramps
  • I use timeline software to create timelines for more than just people.
  • I use Foxit PhantomPDF for PDFs, and LibreOffice and MS Office in addition to Zettlr and Scrivener, each for what it is useful for.
  • I use GIS software for Maps
  • I use Freeplane for Mindmaps.
  • I used Twine for manually creating some graphs, until I started with Obsidian and later Foam.

Both VSC and Obsidian have extensions/plugins for Mermaid, Graphviz, Zotero and a lot of other useful tools.

How difficult can it be to understand that having an interchangeable format shared with CSL-based Reference Managers would benefit Gramps as a research tool?
That alone would open up the possibility of using 20-30 other research tools and publishing platforms that support CSL.

How difficult can it be to understand that using GIS software to create advanced maps is a great feature?
Or that analyzing your database in Gephi, Cytoscape or Constellation will let you find relations and connections you can’t easily find in Gramps or in a table-formatted list?

How difficult can it be to understand that having Research Notes and Logs in a text-based format that supports links between different Notes, supports links to images, PDFs and web pages, and also lets you add citations, would benefit researchers and Gramps users?

My workflow is simple. My recent project includes a few hundred or so crew lists for steamers sailing to America in the period 1922 to 1925/30, plus newspaper lists of arrivals and departures for approx. 100 Norwegian ships. The task is to find all the available crew lists for those ships and link the different occurrences of crew members to a research log for each person, and to do the same for all the ships.

For analyzing all this information, Gramps is useless, because it does not show links/relations between different objects; for objects other than people, only direct relations can be displayed.
Therefore I use a network graph for that, with features to set colors, names, different icons for different types, etc.

Being able to see that two people with different spellings of a name, or 8 people with the same surname, served on the same ship in a period of time can give a clue that a seaman’s name was written down wrong and that it can actually be the same person.
It also gives the chance of finding out whether two records of persons with the same name on different ships can be the same person, based on time and the ports a ship visited.

A timeline will give more detail on these types of findings.

The same can be done for passenger lists, or any other type of list where you have multiple hits on people but do not know if they are actually different people or one and the same.
A network graph can show if two people with the same name were at different places in a period, and a timeline can show if it would be possible for one person to move from one place to the other in the given period.
E.g. two names for seamen, on two different ships, same birth year, one in a record from Oslo, the other in Boston: if the timespan is less than 28 days, it has to be two different people…
And when you have 50, 60 or maybe 300 of these records, doing this manually is near impossible.
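To make the Oslo/Boston check concrete, here is a minimal sketch in Python of how such a test could be automated. The record layout, the field names and the sample records are all invented for illustration, and the 28-day threshold is simply taken from the example above; real routes would need their own values, and spelling variants of a name would need fuzzier matching than the exact comparison used here.

```python
from datetime import date
from itertools import combinations

# Assumption: minimum number of days needed to get from one port to another,
# borrowed from the 28-day example above. Real routes need their own values.
MIN_DAYS_BETWEEN_PORTS = 28


def could_be_same_person(rec_a, rec_b, min_days=MIN_DAYS_BETWEEN_PORTS):
    """Return False when two records cannot plausibly describe one person."""
    if rec_a["birth_year"] != rec_b["birth_year"]:
        return False
    if rec_a["port"] == rec_b["port"]:
        return True  # same port, so no travel time is needed
    gap_days = abs((rec_a["date"] - rec_b["date"]).days)
    return gap_days >= min_days


def candidate_pairs(records):
    """List id pairs that share a name and could still be one person."""
    return [
        (a["id"], b["id"])
        for a, b in combinations(records, 2)
        if a["name"] == b["name"] and could_be_same_person(a, b)
    ]


# Invented sample records, for illustration only.
records = [
    {"id": "R1", "name": "Ole Hansen", "birth_year": 1899,
     "port": "Oslo", "date": date(1923, 3, 1)},
    {"id": "R2", "name": "Ole Hansen", "birth_year": 1899,
     "port": "Boston", "date": date(1923, 3, 10)},
    {"id": "R3", "name": "Ole Hansen", "birth_year": 1899,
     "port": "Boston", "date": date(1923, 5, 2)},
]

print(candidate_pairs(records))
```

With these sample records, R1 and R2 are ruled out as one person (9 days between Oslo and Boston), while R1/R3 and R2/R3 remain candidates to check by hand.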

So since Gramps doesn’t support export to graphml, I have started to use Foam, and Obsidian with the JUGGL plugin. It can save graphs for unstructured and structured data (Notes and Tables) to cytoscape.json, and that format can be imported into Cytoscape if the JUGGL graph lacks some feature.

All these Sources are saved with metadata in Zotero, and with the CSL plugins for those programs I can add Citations and a Bibliography and add those to my graph, so I also see where the information comes from.
I can use a plugin in Zotero to export all my Notes to Markdown, including any annotations in PDF files, with both a link to the Zotero item the Note belongs to and any other type of links and wikilinks.

The Graph will automatically show me any links and nodes that have links to the same objects, and it will show me all the categories (tags, keywords, etc.) that I have configured, with different colors or shapes. If I need to group those together, I can just import the graph into Cytoscape and easily do that there, or I can create clusters for different types of information based on the YAML frontmatter variables I use.

I can easily see where there are links between multiple people based on journeys, documents, places, or other people without spending hours in a table view. In Gramps it is not possible to find those types of connections.

My technical workflow today is to export an XML file from Gramps with the information I have, import it into Excel with Power Query, and create a mapping to tables for the different Gramps objects.
Then I run a VBA script that creates two new tables, one with Nodes (objects) and one with Edges (Links, Citations and Relations between any objects in Gramps), and saves those two tables to CSV. It then creates a Markdown Note for each line in the original tables, where each Column is used as a Variable Name in YAML, in addition to other formats of structured/unstructured Note text. The script also creates the wikilinks from selected columns and saves each Note to a given subfolder in my research log storage.
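For readers who want to picture the node/edge step without Excel, here is a rough Python sketch of the same idea. It is not the VBA script described above; the column names (`id`, `type`, `label`, `ship`, `citation`), the `;` separator for multiple links and the sample IDs are all assumptions made for the example, and the YAML quoting is deliberately naive.

```python
import csv
import io


def build_graph_tables(rows, link_columns):
    """Split flat rows into a node table and an edge table."""
    nodes = [{"id": r["id"], "type": r["type"], "label": r["label"]} for r in rows]
    edges = []
    for r in rows:
        for col in link_columns:
            # Assumed convention: several targets in one cell are separated by ';'
            for target in filter(None, r.get(col, "").split(";")):
                edges.append({"source": r["id"], "target": target, "relation": col})
    return nodes, edges


def to_csv(records):
    """Serialise a list of dicts as CSV text, header taken from the first record."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def to_markdown_note(row, link_columns):
    """Build one Markdown note: columns become YAML variables, links become wikilinks."""
    lines = ["---"]
    lines += [f'{key}: "{value}"' for key, value in row.items()]  # naive quoting
    lines += ["---", "", f"# {row['label']}"]
    for col in link_columns:
        for target in filter(None, row.get(col, "").split(";")):
            lines.append(f"- {col}: [[{target}]]")
    return "\n".join(lines) + "\n"


# Hypothetical rows, standing in for one of the mapped tables.
rows = [
    {"id": "P0001", "type": "person", "label": "Example Seaman",
     "ship": "S0001", "citation": "C0001;C0002"},
    {"id": "S0001", "type": "ship", "label": "Example Ship",
     "ship": "", "citation": "C0001"},
]
nodes, edges = build_graph_tables(rows, ["ship", "citation"])
print(to_csv(edges))
print(to_markdown_note(rows[0], ["ship", "citation"]))
```

Writing each note to its subfolder would then just be a matter of saving the returned text under a file name derived from the `id` column.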

The problem with this, even though I copy a Bibliography string from Zotero to Gramps, is that the sources are not consistent, so I must manually check all the old sources and citations, because I may have made a change in one program and forgotten to make the same change in the other.

This script also creates a link with the full path, which I want to be able to export to Gramps as a media link by updating the XML file and importing it back into Gramps.

The problem is that I need to do this work (not the scripts, those I can just run) every single time I want to include some data.

If Gramps supported interchangeable formats like graphml (or similar), RDF/OWL, JSON-LD and CSL JSON, I could have exported the data from Gramps, combined it in Cytoscape with the other data I have, analyzed it, and imported the updated result back into Gramps.
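As a small illustration of how little a graphml export needs, here is a sketch in Python that writes a list of nodes and edges as GraphML using only the standard library. It is not Gramps code: the node and edge dictionaries, the three attribute keys (`label`, `type`, `relation`) and the sample IDs are assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def q(tag):
    """Qualify a tag name with the GraphML namespace."""
    return f"{{{GRAPHML_NS}}}{tag}"


def to_graphml(nodes, edges):
    """Serialise nodes and edges as a GraphML document (returned as a string)."""
    ET.register_namespace("", GRAPHML_NS)  # emit GraphML as the default namespace
    root = ET.Element(q("graphml"))
    # Declare the data keys used below (assumed attribute set for this sketch).
    for key_id, domain, name in (("d0", "node", "label"),
                                 ("d1", "node", "type"),
                                 ("d2", "edge", "relation")):
        ET.SubElement(root, q("key"), {"id": key_id, "for": domain,
                                       "attr.name": name, "attr.type": "string"})
    graph = ET.SubElement(root, q("graph"), {"id": "G", "edgedefault": "directed"})
    for n in nodes:
        node = ET.SubElement(graph, q("node"), {"id": n["id"]})
        ET.SubElement(node, q("data"), {"key": "d0"}).text = n["label"]
        ET.SubElement(node, q("data"), {"key": "d1"}).text = n["type"]
    for i, e in enumerate(edges):
        edge = ET.SubElement(graph, q("edge"), {"id": f"e{i}",
                                                "source": e["source"],
                                                "target": e["target"]})
        ET.SubElement(edge, q("data"), {"key": "d2"}).text = e["relation"]
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample data.
sample_nodes = [{"id": "P0001", "label": "Example Seaman", "type": "person"},
                {"id": "S0001", "label": "Example Ship", "type": "ship"}]
sample_edges = [{"source": "P0001", "target": "S0001", "relation": "crew"}]
print(to_graphml(sample_nodes, sample_edges))
```

The resulting text can be saved with a `.graphml` extension; tools such as Gephi and Cytoscape list GraphML among their import formats.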
The Research Notes are my choice of workflow. I find the Notes feature in Gramps too limited, and way too difficult to keep updated and sorted, especially because I use a lot of links in them and have no way of displaying those links graphically.
I do not expect Gramps to automatically and magically import those Notes to the correct objects in Gramps.

THE FINAL OBJECTIVE is to make Gramps able to interchange data with other research software, nothing else, and to provide more powerful features without bloating Gramps; the “filter” system is CPU heavy enough…
Multicore, multithread support, including multithreaded read/write to the database, would be another great feature, but one that no one seems to understand. Cutting the run time of an advanced filter by a factor of 6 or 12 for larger databases (100K plus objects) would save a lot of minutes.

And then we have the Main and Sub Events…

The minute you want to do more than “just” lineage-linked research, there are a lot of features that are needed, and Gramps has the potential to be a great multi-use research tool, not just “yet another genealogy software for Linux that ‘supports’ Mac and Windows too”. What it needs is support for interchangeable formats that are commonly used in historical research: a serious interchangeable Reference Format, Events for Places, and Main-/Sub-Events. With those 3 extra features, Gramps is a research tool, not only registration software for genealogy data.


This is a lovely bit of sharing. Thank you.

I think people understand the formats themselves and the portal such formats create into other tools.

What they don’t have a clear vision on is: how vital parts of the formats could be supported in Gramps without breaking the Gramps model or limiting Gramps’ ability in other areas. There are always tradeoffs & limitations when implementing a format. And before a structure is imposed, the potential remains limitless… even though unrealized. So negotiating that is always a frustrating discussion.


As long as those formats are XML or JSON, there are no limitations on what you can store in the file formats I mention; if another software doesn’t support a variable, it just ignores it.
The formats I have mentioned multiple times can hold ANY value that’s used in Gramps…

The only limitation is the willingness to actually do or not do, and whether you see the glass as half full or half empty.

If it was so extremely difficult, and useless, why do so many large Open Source Software Projects support interchangeable formats?
Gramps already has a JSON export, but it is not supported by any other software without manually hacking it, and even then it is problematic to get other software to read it.

And even though the XML format is “readable”, it needs transcoding to be useful in any other software, i.e. you need to be a developer to be able to use your data collected in Gramps if you want to do anything beyond the limitations the GEDCOM format gives you.

When you quantify, you limit.

So, when a format defines something like a date, they may choose to use an epoch date (a count of seconds from 1 January 1970, 00:00:00 UTC, which in a signed 32-bit counter runs out at 03:14:07 UTC on 19 January 2038) or a Gregorian date, which eliminates the time portion, other calendars & seasons.

These are the limits I meant rather than saying that any particular format is inferior.

As long as the willingness to find limitations is higher than the willingness to find solutions, there will always be limitations and impossible tasks.

And as long as developers take shortcuts (like Microsoft and Apple) regarding their calendars, there will always be problems. Stick to ISO standards and Attributes and a lot is solved.

Never overcomplicate and create problems where there are none.

Start before, end after, sometime between: these are all application-specific attributes and should be stored and exported as such. But all of these attributes on dates are actually just mathematical expressions, since dates are numbers, i.e. before a date is “less than” (<), after is “larger than” (>), and ISO actually does support time periods (between). There is a mathematical expression for most attributes on dates… and if the dates were saved in ISO format and the software provided full support for ISO 8601, people would maybe find it useful to use it.
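A minimal Python sketch of that idea, treating the modifiers as plain comparisons on dates and writing a “between” as an ISO 8601 start/end interval. The modifier names and both function names are made up for this example and are not how Gramps stores its date modifiers; partial dates, other calendars and “about” dates are deliberately left out.

```python
from datetime import date


def satisfies(candidate, modifier, start, end=None):
    """Check a date against a modifier expressed as a simple comparison."""
    if modifier == "before":
        return candidate < start           # "less than"
    if modifier == "after":
        return candidate > start           # "larger than"
    if modifier == "between":
        return start <= candidate <= end   # inclusive time period
    raise ValueError(f"unknown modifier: {modifier}")


def iso_interval(start, end):
    """Write a time period in the ISO 8601 start/end notation."""
    return f"{start.isoformat()}/{end.isoformat()}"


period = (date(1922, 1, 1), date(1925, 12, 31))
print(satisfies(date(1923, 6, 1), "between", *period))
print(satisfies(date(1921, 12, 31), "before", period[0]))
print(iso_interval(*period))
```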

It is really strange how people always look for problems instead of solutions!

Exploring the limits is part of finding the best way to leverage an opportunity. It is the limits, the edge cases that inspire extraordinary solutions.

It is NOT looking for problems. Half-full or half-empty; if you discover that your limit is that you need 100% to accomplish the task then you have room to add 50% more or the opportunity to double the efficiency of your system.

Instead of criticizing all the volunteers writing Gramps, if you think something is missing in Gramps, write the specifications for what you think is important and write an extension.

You are not dealing with a commercial product but with a free project. Everybody can participate. If you are not a developer, perhaps you know someone who can write what you wish.


This thread is becoming too overheated to conform with civility guidelines.

Let’s put it on a hiatus for a few weeks.


With the addition of the Citation plugin expansion in the upcoming 5.2, perhaps it is time to re-open this conversation?

Citation related items in the 5.2b1 Announcement’s change log:

Highlights:

  • Add citations to event references. PR#1391

Technical:

  • Add support for CITE plugins. Provide a single default plugin that replicates the existing functionality. PR#1402
  • Add missing get_number_of_citations method. PR#859 and 857 (db/base.py, proxy/proxybase.py)

Gramplets:

  • Citations gramplet: Add date, page, and confidence. Fixes #9224
    • Change columns order and size.
    • Sort correctly by date.

GUI:

  • Move privacy column in editor citation tabs
  • Add Back/Forward labels to citation tree view. Fixes #12510
  • Add Abbreviation column to source and citation selectors. Implements #11710
  • Increase information in database summary text report. Add type counts for events, places, sources, citations, repositories and notes.
  • Add ‘HasAttribute’ filter rule to repositories, sources and citations. Fixes #9845

Any idea about how we can add some functionality that works for everyone, regardless of culture and language? Everything that I have read until now is too big for practical purposes, and doesn’t work with the metadata that is actually used by the archives.

I hate to say it (given the currently active thread about merging citations affecting the data generated by the Form gramplet) but a Citation variation of Form would seem promising.

Create a Citation form that adds a family of custom Attributes to Sources and/or Citations. Then create Cite plug-ins (with a CSL parser as a possible choice) that use those standard attributes to write more expansive citations.
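To sketch what the attribute side of that might look like, here is a small Python example that turns a family of custom attributes into a CSL JSON item. The `csl:` attribute prefix, the function name and the sample values are purely hypothetical and are not part of Gramps or of the 5.2 CITE plugin interface; only the output field names (`id`, `type`, `title`, `archive`, `archive_location`, `page`, `issued` with `date-parts`) come from CSL JSON.

```python
import json


def attributes_to_csl(item_id, attributes):
    """Map hypothetical 'csl:'-prefixed attributes to one CSL JSON item."""
    item = {"id": item_id, "type": attributes.get("csl:type", "manuscript")}
    # Copy the plain text fields that are present.
    for name in ("title", "archive", "archive_location", "page"):
        value = attributes.get(f"csl:{name}")
        if value:
            item[name] = value
    # Assumed input form for dates: YYYY, YYYY-MM or YYYY-MM-DD.
    issued = attributes.get("csl:issued")
    if issued:
        item["issued"] = {"date-parts": [[int(part) for part in issued.split("-")]]}
    return item


# Invented attribute values, for illustration only.
sample_attributes = {
    "csl:title": "Example crew list",
    "csl:archive": "Example Archive",
    "csl:archive_location": "Box 1, folder 2",
    "csl:issued": "1923-03-01",
}
print(json.dumps([attributes_to_csl("C0001", sample_attributes)], indent=2))
```

A Cite plug-in could then hand a list of such items to any CSL processor, and the same JSON could be imported by a reference manager.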

Of course, Reports (or Books) would have to be adapted to choose a Cite plug-in.


And I know that you are right, and I would even love to go a step further, meaning that we have forms that have lines for meta data, as provided by the archives that we use, and for the person data, all in one.

With such forms, it would be very easy to import data from the Dutch open archives site, and they would make imports from FamilySearch sources also a lot easier. Both sites use flat models, no source/citation hierarchy, and for such models, well designed forms are a great tool.

Many of these techniques are already available in programs like Legacy (for Windows), Reunion (macOS), and RootsMagic (macOS and Windows). And there are a few more, like Family Historian. I use RootsMagic here, to interact with Ancestry and FamilySearch, and it’s the only one from this list that has a free version that has full support for user made citation elements.
