0012219: Expand ImageMetadata gramplet to support XMP and IPTC metadata tags

Good to know, thank you

Understood, but would it be needed for updating, or for testing/debugging older or newer versions?

Still, would the MSYS package be required to install a plugin that depends on, say, numpy, such as the PhotoTagging (PT) plugin?

The basis for all my questions is really whether reworking my metadata display plugin to use GExiv2 would still leave me with the same issues if I also wanted the PT plugin.

If you want to make your plugin available to our Windows users, then you will have to use packages that are installed as part of the AIO. There is no requirement to use the AIO or MSYS2 to run Gramps on Windows, but I expect that most people use the AIO installer.

If an addon requires a package that is not in the AIO, then you could always ask a package maintainer to add it. Paul Culley (@prculley) is our Windows maintainer and he quite often posts in this forum.

Failing that, feel free to add a Linux-only plugin to the third-party addons repository. It obviously won’t be included in core Gramps, but our Linux users may find it useful.

Thank you, it looks like I will have to go the Linux-only way, and if anyone finds the plugin useful or of interest, they’ll be welcome to drum up support to have the necessary infrastructure made part of the AIO installer.

Note: @Teimue (Heiko Teichmüller) posted an inquiry to the “Gramps for Genealogists” Facebook group related to image regions defined in Gramps with the Tree data subsequently exported for online publication with WebTrees.

He is using the GED2 (GEDCOM extensions) exporter add-on… but that only supports a single Thumbnail region. He wants to export multiple FaceDetection (or manually assigned) regions for group photos.

It was suggested that he join the conversation here.

And here I am! :wink:
So I use the integrated image detail function in Gramps


And with GEDCOM 5.5 Export and Import in my Webtrees Homepage it looks like this:
[screenshot: screen_20221205_163933]

That’s a shame, because both Gramps and Webtrees can handle clipping data. Unfortunately, the image region data is lost along the way.
Since I didn’t just do this with group pictures but also with documents, it affects several thousand picture sections.
How could we elegantly solve this task?


This sounds like a compatibility issue between the Gramps export and the WebTrees import. Gramps is not providing the detail necessary for WebTrees to import the regions.

I suggest reaching out to the WebTrees developers. Ask how the clippings/regions need to be defined in the GEDCOM export so that WebTrees can successfully import the information. Once you have that information, I suggest opening a feature request on the Gramps Mantis website (https://gramps-project.org/bugs). Maybe a developer there could look at changes to the export process.

After a bit more work, I have cobbled together another plugin which uses the GExiv2 library. This ought to make it possible to install it with the Windows AIO package, which I have just done. As it stands it is still quite plain, but it seems to display all the data I can find.
My current tests include only JPGs & PNGs.

# File: displayExiv2.gpr.py
# using Gramps built-in the GExiv2 interface
register(GRAMPLET,
        id="Display Exiv2 Data", 
        name=_("Display Exiv2 Data"),
        description = _("Gramplet to display Exiv2 image metadata"),
        status = STABLE,
        version="0.0.2",
        fname="displayExiv2.py",
        height = 20,
        gramplet              = 'DisplayExiv2',
        gramplet_title        = _("Display Exiv2 Data"),
        gramps_target_version="5.1",
        )


#!/usr/bin/env python
# -*- coding: utf-8 -*-
# displayExiv2 module
#
# Gramps - a GTK+/GNOME based genealogy program
#
# Copyright (C) 2009-2011 Rob G. Healey <robhealey1@gmail.com>
#               2019      Paul Culley <paulr2787@gmail.com>
#               2022      Arnold Wiegert <nscg111@gmail.com>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

#-------------------------------------------------------------------------
#
# GNOME modules
#
#-------------------------------------------------------------------------
from gi.repository import Gtk

## display image metadata using GExiv2 library instead of Exiftool
# *****************************************************************************
# Python Modules
# *****************************************************************************
import os

"""
Display Exiv2 Gramplet
"""
#-------------------------------------------------------------------------
#
# Gramps modules
#
#-------------------------------------------------------------------------
from gramps.gen.plug import Gramplet
from gramps.gen.utils.file import media_path_full
from gi.repository import Gtk
import gi
gi.require_version('GExiv2', '0.10')
from gi.repository import GExiv2
##from gi.repository.GExiv2 import Metadata

# code from: - only good for JPGs
# https://www.thepythoncode.com/article/extracting-image-metadata-in-python
from PIL import Image
from PIL.ExifTags import TAGS



#-------------------------------------------------------------------------
#
# Gramps modules
#
#-------------------------------------------------------------------------

from gramps.gui.listmodel import ListModel
from gramps.gen.const import GRAMPS_LOCALE as glocale
_ = glocale.translation.gettext
from gramps.gen.utils.place import conv_lat_lon
from fractions import Fraction
from gramps.gen.lib import Date
from gramps.gen.datehandler import displayer
from datetime import datetime

# ----------------------------------------------------------

class DisplayExiv2(Gramplet):
    """
    Displays the metadata of an image.
    """
    
    def init(self):
        self.set_text( "Exiv2 Metadata" )
        gexiv2Version = GExiv2.get_version()
        self.gui.WIDGET = self.build_gui()
        self.gui.get_container_widget().remove(self.gui.textview)
        self.gui.get_container_widget().add(self.gui.WIDGET)
        self.gui.WIDGET.show()
    
    # ----------------------------------------------------------    
    def db_changed(self):
        self.connect_signal('Media', self.update)
    
    # ----------------------------------------------------------
    def build_gui(self):
        """
        Build the GUI interface.
        """
        self.view = MetadataView3()
        return self.view
    # ----------------------------------------------------------
    def main(self):
        active_handle = self.get_active('Media')
        if active_handle:
            media = self.dbstate.db.get_media_from_handle(active_handle)
            if media:
                full_path = media_path_full(self.dbstate.db, media.get_path())
                has_data = self.view.display_metadata(full_path)
                self.set_has_data(has_data)
            else:
                self.set_has_data(False)
        else:
            self.set_has_data(False)

# ----------------------------------------------------------

class MetadataView3(Gtk.TreeView):

    def __init__(self):
        Gtk.TreeView.__init__(self)
        self.sections = {}
        titles = [(_('Key'), 1, 235),
                  (_('Value'), 2, 325)]
        self.model = ListModel(self, titles, list_mode="tree")


    # ----------------------------------------------------------
    def display_metadata(self, full_path):
        """
        Display the metadata
        """
        self.sections = {}
        self.model.clear()

        ver = GExiv2.get_version()
        """ Make sure the file exists"""
        if not os.path.exists(full_path):
            head, tail = os.path.split( full_path )
            label = tail  
            node = self.__add_section('File not found')
            label = tail
            human_value = head
            self.model.add((label, human_value), node=node)  
            return False

        lastLeadin = ""
        retval = False
        #logLeadIn = True 
        logLeadIn = False
        #logData = True 
        logData = False
   
        with open(full_path, 'rb') as fd:
            try:
                buf = fd.read()
                metadata = GExiv2.Metadata()
                metadata.open_buf(buf)
                # # add header to identify GExiv2
                label = 'GExiv2 Version'
                node = self.__add_section("Metadata")
                self.model.add((label, GExiv2._version), node=node)
                self.model.add((label, str(ver) ), node=node)


                get_human = metadata.get_tag_interpreted_string
                # handle Exif tags
                for key in metadata.get_exif_tags():
                    tagType = metadata.get_tag_type(key)
                    label = metadata.get_tag_label(key)
                    human_value = get_human(key)
                    leadin = key.split('.',2)
                    section = leadin[0]
                    if logData:
                        print(f"{key:50}: {human_value}")
                    if lastLeadin != section: 
                        lastLeadin = section
                        if logLeadIn:
                            print( "  section: " + section +" lastLeadin: " + lastLeadin + "\n" ) 
                        
                    node = self.__add_section(section)
                    if human_value is None:
                        human_value = ''
                    self.model.add((label, human_value), node=node)

                # handle IPTC tags
                for key in metadata.get_iptc_tags():
                    label = metadata.get_tag_label(key)
                    human_value = get_human(key)
                    leadin = key.split('.',3)
                    section = leadin[0]
                    if logData:
                        print(f"{key:50}: {human_value}")
                    if lastLeadin != section:
                        lastLeadin = section
                        if logLeadIn:
                            print( "  section: " + section +" lastLeadin: " + lastLeadin + "\n" ) 
                        
                    cleanName = leadin[2]
                    # See the IPTC Spec IIMV4.1.pdf
                    if cleanName == "CharacterSet":
                        if human_value == "\x1b%G":
                            human_value = 'UTF8 - ESC%G'
                        else:
                            temp = human_value
                            human_value = 'Unknown char set: ' + temp

                    node = self.__add_section(section)
                    if human_value is None:
                        human_value = ''
                    self.model.add((label, human_value), node=node)

                # handle XMP tags
                for key in metadata.get_xmp_tags():
                    label = metadata.get_tag_label(key)
                    human_value = get_human(key)
                    leadin = key.split('.',3)
                    section = leadin[0]
                    if logData:
                        print(f"{key:50}: {human_value}")
                    if lastLeadin != section:
                        lastLeadin = section
                        if logLeadIn:
                            print( "  section: " + section +" lastLeadin: " + lastLeadin + "\n" ) 
                        
                    cleanName = leadin[2]
                    # # See the IPTC Spec IIMV4.1.pdf
                    # if cleanName == "CharacterSet":
                    #     if human_value == "\x1b%G":
                    #         human_value = 'UTF8 - \x1b%G'
                    #     else:
                    #         temp = human_value
                    #         human_value = 'Unknown char set: ' + temp

                    node = self.__add_section(section)
                    if human_value is None:
                        human_value = ''
                    self.model.add((label, human_value), node=node)

                
                # The "Metadata" section row plus the two version rows
                # account for the first 3 entries in the model.
                if self.model.count <= 3:
                    head, tail = os.path.split( full_path )
                    node = self.__add_section('No Metadata found in: ')
                    label = tail
                    human_value = ''
                    self.model.add((label, human_value), node=node)  
                else:
                    # at least one real tag was added
                    retval = True
                self.model.tree.expand_all()
            except Exception:
                # unreadable or unsupported file: report no data
                pass

        return retval

    # ----------------------------------------------------------
    def __add_section(self, section):
        """
        Add the section heading node to the model.
        """
        if section not in self.sections:
            node = self.model.add([section, ''])
            self.sections[section] = node
        else:
            node = self.sections[section]
        return node

    # ----------------------------------------------------------
    def get_has_data(self, full_path):
        """
        Return True if the gramplet has data, else return False.
        """
        if not os.path.exists(full_path):
            return False
        with open(full_path, 'rb') as fd:
            retval = False
            try:
                buf = fd.read()
                metadata = GExiv2.Metadata()
                metadata.open_buf(buf)
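                # PIL's TAGS maps numeric tag ids to names, so it can
                # never match GExiv2's string keys; test for any tag.
                retval = bool(metadata.get_exif_tags()
                              or metadata.get_iptc_tags()
                              or metadata.get_xmp_tags())
            except Exception:
                pass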
        return retval

# ---------------------------- eof ------------------------------

OK. Have it installed and running on a Windoze 10 box. The “Yawn” image in the example.gramps tree was the only image with significant metadata.

Now trying to figure how its organization of data compares to the other gramplets. (Why is the GExiv2 Version listed twice in the “Metadata” section with different values?)

Thanks for embedding the name of the .gpr.py file in the comments. The other gramplets use the word “metadata” in the title of their Gramplets. Searching for “data” finds 52 built-in and add-on plugins. Metadata finds only 4. (Yours is one of them by virtue of its Description.)

Thank you for confirming you could install it under Windows.
Since this is my first try at using the built-in GExiv2 facility, there is lots I am not happy with, unsure or suspicious of.
But this version also handles other extensions; the only ones I have tried thus far are JPGs & PNGs.

In the past I had been using the C++ interface to Exiv2, but got away from it for various reasons.
Exiv2 does have some advantages over Exiftool, but not enough to keep me using it.
Still, now that it is so difficult - at least for me - to use Exiftool as a general base for metadata work, I had to give it a second look.
Unfortunately, for my own Python metadata utility I ended up with another Python interface to Exiv2, this one by Jim Easterbrook. And there seem to be differences between the two - aside from the version of Exiv2 they appear to be based on.

The reason there are 2 version strings is simply that I found 2 ways to get that bit of information, and I do not yet know how to make the distinction clear enough, nor what either really means.
In previous discussion in this thread, Nick Hall identified the GExiv2 library in the AIO as

Our Windows AIO includes exiv2 0.27.5 and our Mac bundle includes exiv2 0.27.4. Both use version 0.14.0 of the gexiv2 python bindings.

But since I do my debugging and experimenting under Mint, I have no idea what the second number really means or refers to. It is the value a function GExiv2.get_version() returns.
Whether it is even a legitimate call, I have no idea, but the VSCode debugger and the Python (3.10.6) it runs seem to be happy with it.
In fact, the second value on the Mint machine turns out to be 1400 (-> 0578H)
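A possible reading of the two values, offered as a sketch rather than a definitive answer: `GExiv2._version` looks like the introspection API version requested through `gi.require_version`, while `GExiv2.get_version()` returns the library version packed into a single integer. Assuming the packing is major * 10000 + minor * 100 + micro, 1400 decodes to 0.14.0, which would match the binding version quoted above. The helper names below are hypothetical and not part of GExiv2.

```python
def decode_gexiv2_version(packed):
    """Split a packed version integer into (major, minor, micro).

    Assumes the packing major * 10000 + minor * 100 + micro.
    """
    major, rest = divmod(packed, 10000)
    minor, micro = divmod(rest, 100)
    return major, minor, micro


def format_gexiv2_version(packed):
    """Return the packed version as a dotted string."""
    return ".".join(str(part) for part in decode_gexiv2_version(packed))


print(format_gexiv2_version(1400))  # prints 0.14.0
```

If that assumption holds, the gramplet could show the dotted form instead of the raw integer and label the two rows differently.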

As it stands now, there are many differences between what Exiftool spits out for a given image and what this gramplet shows - assuming data is present. No doubt some of the missing data is not shown because I have not yet figured out how to get to it, and that will be on the agenda over the next little while.

I am experiencing severe latency in the Gramps GUI with this Gramplet active.

What kind of overhead is involved in polling a media object for metadata? Does it have to re-read the entire file every time the active object changes or the view is refreshed? Is embedded data at the beginning, end or throughout the file?

Would adding a delay of a few seconds before it attempts to poll help? Letting a change of active object interrupt the polling of metadata would be good too, so that scrolling the selection focus lets the GUI become responsive again. Please make certain the Gramplet remains idle unless it is the active gramplet in a visible sidebar or visible bottombar of the active view… or is detached/undocked.
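The delay-and-interrupt idea is usually called debouncing. A minimal sketch follows, assuming the scheduling functions are passed in; in Gramps these would presumably be `GLib.timeout_add` and `GLib.source_remove`, but that wiring is an assumption and is not tested here. The class name `Debouncer` is hypothetical.

```python
class Debouncer:
    """Run a callback only after requests have been quiet for delay_ms."""

    def __init__(self, delay_ms, callback, add_timeout, remove_source):
        # add_timeout(delay_ms, func) must return a source id and
        # remove_source(source_id) must cancel it. Assumption: in Gramps
        # these would be GLib.timeout_add and GLib.source_remove.
        self._delay_ms = delay_ms
        self._callback = callback
        self._add_timeout = add_timeout
        self._remove_source = remove_source
        self._source_id = None
        self._pending = None

    def request(self, value):
        """Schedule callback(value), cancelling any earlier request."""
        if self._source_id is not None:
            self._remove_source(self._source_id)
        self._pending = value
        self._source_id = self._add_timeout(self._delay_ms, self._fire)

    def _fire(self):
        """Called by the scheduler once the delay has elapsed."""
        self._source_id = None
        self._callback(self._pending)
        return False  # one-shot: ask the main loop not to repeat
```

With this in place, scrolling quickly through ten media objects would trigger one metadata read for the last object rather than ten.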

The Geography view will cache map tiles. Can this do something similar?

Perhaps there is some kind of session caching that can store a temp alias file in a tiny sidecar or Digital Asset Management file compatible with the GExiv2 functions. So the tool could read the temp file instead of the full media file if the timestamp is newer for the alias file. Writing back updated info would of course have to freshen the original & force freshening of the cached file.
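A lighter variant of the caching idea is to keep the extracted rows in memory for the session and re-read a file only when its modification time or size changes. This is a sketch under stated assumptions: `MetadataCache` is a hypothetical name, and `loader` stands in for whatever function does the actual GExiv2 read.

```python
import os


class MetadataCache:
    """Session cache of metadata rows keyed by path, mtime and size."""

    def __init__(self, loader):
        # loader: callable taking a path and returning a list of
        # (label, value) rows; a hypothetical stand-in for the GExiv2 read.
        self._loader = loader
        self._entries = {}

    def get(self, full_path):
        """Return rows for full_path, re-reading only if the file changed."""
        stat = os.stat(full_path)
        stamp = (stat.st_mtime_ns, stat.st_size)
        cached = self._entries.get(full_path)
        if cached is not None and cached[0] == stamp:
            return cached[1]  # file unchanged: reuse the stored rows
        rows = self._loader(full_path)  # new or modified file: read it
        self._entries[full_path] = (stamp, rows)
        return rows
```

This avoids a sidecar file entirely, at the cost of losing the cache when Gramps exits.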

Note: some developers have experimented with lowering the priority of their Gramplets on multi-core systems. It seems like this would be an excellent way to keep a tool like this from slowing down the GUI.

That kind of problem is several grades above my current experience with Gramps and all things connected to it.
About the only thing I am reasonably confident of is that this plugin (and presumably similar ones) is tied to the Media view and, quite naturally, changing the media object will require reading its data.
During development & debugging of the code, any such latency would not likely be noticeable and up to now, I have not done much work with my data to have noticed any delays.

On the metadata side, I understand that image formats have certain specs/requirements regarding the placement, absolute & relative, of some of the data sections, but I know only too well that not all software respects those specs. It seems that some software, when editing metadata, can actually move the metadata to the very end of the image file. Some apps even pad (bloat) the data when they write it to disk, assuming that further editing will be handled more efficiently if there is space to add data to the current data segment without having to rewrite the whole file. It is my assumption that these apps do a sort of in-place patching when updating the data.

As for potential fixes you have mentioned, that definitely would be well above my level of capabilities.

OTOH, I would expect that the OS (WIN??) would play a major role in lazy writing and/or caching.

They just changed the Facebook group name to “SaveMetadata.org” … but without posting any new content to explain the change. Seems more like churning than progress.

Well, it is a rather dense topic, but still, I am surprised there has not been any serious discussion or feedback.
My conclusion is that it most likely is because few genealogists are familiar enough with the possibilities of metadata.

Aside from yourself, there seems very little interest or action on this topic even here.

IMO, one of the main reasons is that - from my reading - image metadata seems far more actively used in the Win world, yet there is little support for it in the Win version of Gramps, and it is harder to add support for it in that version.
A bit of a chicken & egg problem.

You could create a new topic on here or write a wiki article explaining how you use metadata, so other genealogists can follow your workflow and understand the possibilities too. :smiley:

Personally, I don’t care.
I only scan old photos.
When I use recent media, I give an accurate description.
The coordinates are useless.
If you need to select more than one person from a photo, we don’t need to include them in the media.
It’s a waste of time in development for just a few photos.

That the coordinates are whole-number percentages IS extremely limiting. (Or perhaps you meant the GPS coordinates? Probably not, since those have great potential.)

When trying to tag persons in group photos of a reasonable resolution, the coarseness makes it difficult to accurately set a rectangle. Since one of my objectives is to tag all the family reunion photos included in family newsletters, family websites, and Facebook groups, it is frustrating.
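To put a number on that coarseness, here is a small sketch that stores a pixel rectangle as whole-number percentages and converts it back. The function names and the 6000 x 4000 photo are made up for illustration; the only assumption about Gramps is that regions are kept as integer percentages of the image width and height.

```python
def to_percent_rect(pixel_rect, width, height):
    """Convert (x1, y1, x2, y2) in pixels to whole-number percentages."""
    x1, y1, x2, y2 = pixel_rect
    return (round(100 * x1 / width), round(100 * y1 / height),
            round(100 * x2 / width), round(100 * y2 / height))


def to_pixel_rect(percent_rect, width, height):
    """Convert whole-number percentages back to pixel coordinates."""
    x1, y1, x2, y2 = percent_rect
    return (x1 * width // 100, y1 * height // 100,
            x2 * width // 100, y2 * height // 100)


# A hypothetical 6000 x 4000 group photo with a face at these pixels.
face = (2230, 1510, 2410, 1730)
stored = to_percent_rect(face, 6000, 4000)
restored = to_pixel_rect(stored, 6000, 4000)
print(stored)    # prints (37, 38, 40, 43)
print(restored)  # prints (2220, 1520, 2400, 1720)
```

On an image this size one percent is 60 pixels horizontally and 40 vertically, so every edge snaps to that grid; a face 180 pixels wide can only be framed in steps of a third of its width.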

Actually, I don’t have a ‘workflow’ for my metadata & Gramps - yet :frowning:

As a bit of backstory:
Over the years, I have worked mostly with Windows and got to use and (sort of) like or at least get comfortable with their compiler/debugger IDE. Although I also worked with Solaris and even tried my hand at plain old Linux for a while.

My interest in metadata started years ago when I tried to organize my documents related to genealogy under Windows, long before I knew about Gramps.

Initially, and after reading up on the topic, I expected I would be able to organize ‘everything’ using file names and directory trees with enough data in the names to handle it ALL.
But soon enough I ran into limitations due to the allowed length of path & file names.
That led me to metadata and sidecar files, then to simply embedding the metadata. But I found little information on just what data I could use for genealogy purposes, much less on how to organize and use it in the apps I was using to store that information.
Eventually, I found Gramps, but was intimidated by the learning curve, and since I saw no way to make use of what I saw as very important, I moved to something else for quite some time.
During this time, I learned a lot more about metadata, but found no genealogy app which allowed me to make use of that information. And I still have not found anything of the sort, though by now I am not looking around very much for that.
Be that as it may, I eventually switched to Gramps, in the hope that, as an open source project, it might offer me a way to make the learning curve worthwhile. Years ago, I even got to the point of building the AIO version for Windows.

Right now, I have tried to run Gramps under Mint and as a first step, was able to modify one of the plugins to at least display all of the metadata in a given image - under Linux.
As a ‘Windows guy’ I am having problems trying to develop a plugin under Linux and then port it to the Windows AIO.

A second problem is that there seems to be no consensus on which data should be saved or where. (Currently there seem to be well over 24,000 available metadata tags to choose from. Naturally not all will be ‘good’ candidates, but even so the number of choices is overwhelming.)
At least within the Gramps ‘community’, we ought to be able to come up with a list of suitable data and metadata tags which would be useful.
I have asked, but all has been pretty quiet on that front.
One possible issue might be to ensure the selected tags are available for all image types. Initially we probably need to consider at least JPG, PNG & TIFF, ++??
FWIW, I believe that such metadata can also be added to PDFs, which is important for myself.

A third issue is to find or create a suitable utility to add, modify or delete these fields in an image or other data file.
If anyone wanted to add this facility to Gramps directly, that would be another option. Still, any such development would require some sort of ‘standard’ to work with.

Fourth issue: As it stands now, any changes to the files - images etc - will change the checksum calculated by Gramps (and saved in the DB ?) and hence will require the user to run a verify and update cycle.
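The checksum point can be shown with a few lines. This sketch assumes the checksum is a hash over the whole file contents (MD5 is used here as an assumption about what Gramps does); the byte strings are stand-ins, not real image data.

```python
import hashlib


def checksum(data):
    """Return the MD5 hex digest of a file's bytes."""
    return hashlib.md5(data).hexdigest()


# Stand-in bytes for an image file; not a real JPEG.
original = b"\xff\xd8 image data \xff\xd9"
# Simulate a tool writing one extra metadata field into the file.
edited = original.replace(b"image data", b"image data + caption")

print(checksum(original) == checksum(edited))  # prints False
```

Because the hash covers metadata and pixels alike, editing a single caption is indistinguishable from replacing the picture, which is why a verify and update cycle becomes necessary.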

So, again, as long as Gramps has no easy way to, at least, display and use that data, there is little chance things will change.

And judging from https://gramps.discourse.group/t/adaptive-github-download-installer/2490/7
which says that Windows users represent 2/3 of all Gramps users, …

For now, I have slowed my Gramps work to a trickle as I am working to sort out the PDF possibilities and further develop my Win-only utility to be able to inspect and edit metadata of interest.

At one point, I was working on a Python clone of that Win-only utility, but there are only so many hours in a day, and it is way too far behind the Win version to be anything more than a ‘proof of concept’.

IMNSHO, the big effort should go into a web version which would make it possible for all interested parties to use it and unify the development effort, because then it can be worked on - possibly even using a docker container - no matter which OS the developers run and are comfortable with.

Just my 4 bits :slight_smile:

Can you please elaborate with specific examples? To what extent are these “metadata of interest” things that were at some point typed in by you via various interfaces (other than your camera or scanner)? I’m wondering if they might be things that would make sense as Attributes in Gramps, whether those attributes be attached to the Media object or elsewhere (Source, Citation, etc.). Maybe not for you personally, since you are already heavily invested in metadata, but for Gramps users in general.

If the gramplet then created Attributes for items selected by the user, that could actually be helpful (depending on what these “metadata of interest” are). And rather than being a gramplet, maybe it could be an optional part of the dialog that occurs when users add a new Media object.
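As a rough illustration of that idea, the sketch below picks a few metadata keys and turns them into (attribute name, value) pairs that a gramplet or dialog could then offer to the user. The mapping and the function name are hypothetical, the chosen tags are not an agreed standard, and creating the actual Gramps Attribute objects is deliberately left out.

```python
# Hypothetical mapping from metadata keys to attribute names;
# the choice of tags is illustrative, not an agreed standard.
TAG_TO_ATTRIBUTE = {
    "Xmp.dc.description": "Description",
    "Xmp.dc.creator": "Creator",
    "Iptc.Application2.Caption": "Caption",
    "Exif.Image.Artist": "Artist",
}


def select_attributes(metadata_rows, mapping=TAG_TO_ATTRIBUTE):
    """Return (attribute name, value) pairs for the mapped tags.

    metadata_rows: dict of metadata key -> string value, for example
    built from the tags the gramplet already reads. Empty values and
    unmapped keys are skipped.
    """
    pairs = []
    for key, name in mapping.items():
        value = metadata_rows.get(key, "").strip()
        if value:
            pairs.append((name, value))
    return pairs


rows = {
    "Xmp.dc.description": "Family reunion, group photo",
    "Exif.Image.Artist": "  ",
    "Exif.Photo.FNumber": "F5.6",
}
print(select_attributes(rows))
# prints [('Description', 'Family reunion, group photo')]
```

Keeping the mapping in one table would also give the community a concrete place to argue about which tags are worth keeping.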

Thanks to both of you for bringing new perspectives to the forum!