Hi!
I’m back to working on adding mimetypes to notes.
I think the update script should add a `text/html` mimetype to notes of type “Html code”. What should it do with the others? Add a custom mimetype, such as `text/x-gramps-styledtext`, or leave it empty? Or check whether there is any formatting and add `text/plain` if none is found? Or should we keep this for the GEDCOM export?
Would it make sense to add a `text/csv` mimetype for the DNA Segment Map Gramplet notes?
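To make the options concrete, here is a rough sketch of the decision the update script would have to take for each note. It is only an illustration: the function name, the `has_formatting` flag and the constant are hypothetical and do not come from the Gramps code base, and the fallback to a custom styled-text type is just one of the alternatives asked about above.

```python
# Hypothetical helper; none of these names exist in Gramps.
HTML_NOTE_TYPE = "Html code"


def guess_note_mimetype(note_type: str, has_formatting: bool) -> str:
    """Return a candidate mimetype for one note during an upgrade."""
    if note_type == HTML_NOTE_TYPE:
        # Notes of type "Html code" already contain raw HTML.
        return "text/html"
    if has_formatting:
        # Assumption: styled notes get the custom type proposed above.
        return "text/x-gramps-styledtext"
    # No formatting found, so plain text is a safe description.
    return "text/plain"
```

Leaving the mimetype empty instead would simply mean dropping the last two branches.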
Totally unrelated: `gramps_upgrade_21` keeps some blobs in the metadata table “if someone tries to downgrade the db”. Is there actually any downgrade procedure? Should I also add a downgrade function?
Notes are designed to be output in a variety of formats. Document generators process the notes and determine the mime type of the output. The notes themselves don’t really have a mime type.
In Gedcom 7.0 only two mime types are supported: `text/plain` and `text/html`.
Currently we only export plain text to Gedcom. Formatting information is lost. In the future we could write html.
Special note types such as HTML code or DNA segment maps should probably be exported as `text/plain` so that the structure is preserved.
Gramps has HTML markup notes for Narrative Web page inclusions.
And the native styling of Notes uses out-of-band (positional) markup when writing Notes to XML.
Can we assume that, sooner or later, Notes will include some other formats too? We already use GitHub Markdown for README.md files, MediaWiki markup, Discourse styling (a mix of Markdown, BBCode and HTML), and the Sphinx MyST dialect of Markdown (an alternative to reStructuredText, reST). These seem to be the most likely candidates.
So should note mimetypes anticipate this?
(And there seems to be a SimpleDoc format too? Is that what is used in Gramplets like the “Welcome to Gramps” plugin?)
| Language/System | Type | Syntax Style | Typical Use Case | Example: Bold Text | Notes |
|---|---|---|---|---|---|
| HTML | Markup | Tag-based | Web pages, emails | `<strong>text</strong>` | Very expressive, but verbose |
| MediaWiki Markup | Markup | Wiki-specific | Wikis (e.g., Wikipedia) | `'''text'''` | Some HTML allowed, unique wiki syntax |
| GitHub Markdown (.md) | Markdown | Lightweight, minimal | README files, docs, comments | `**text**` or `__text__` | Converts to HTML, easy to read/write |
| Discourse Styling | Markdown, BBCode, limited HTML | Mixed (Markdown, BBCode, HTML) | Forum posts, discussions | `**text**` (Markdown)<br>`[b]text[/b]` (BBCode)<br>`<strong>text</strong>` (HTML) | Supports Markdown, BBCode, and safe HTML |
| Sphinx MyST Markdown | Markdown | CommonMark + Sphinx extensions | Technical docs, Sphinx projects | `**bold**` | Extends standard Markdown with roles, directives, and cross-references for Sphinx |
Yes. Document generators are plugins, so adding a new format involves writing a new plugin. This could be in the form of a third-party addon.
That is not necessary. Only the document generator needs to know about the mime type, not the note.
I recall that the CSV data for the DNA Segment Map Gramplet is stored using “DNA” type Association Notes. Doesn’t it seem like Media Files are a better place to link CSV data?
There are other addons that can leverage user-customized CSV files (such as the new Historical Context and WebSearch addons). Once they are linked as Media Objects, the OS’s application for viewing CSV files is used to open them instead of the Note Editor. (Although Gramps only supports opening with the default OS application. A context menu to “Open With…” would be very welcome.)
For DNA, there are several “standard” data exchange formats:
| Format | Type | Syntax Style | Typical Use Case | Notes |
|---|---|---|---|---|
| VCF | Markup | Tab-delimited | Variant data exchange, analysis pipelines | Industry standard; supports SNPs and other variants |
| PML (OMG) | Markup | XML-based | Interoperability, database exchange | Standardized by OMG; platform-independent |
| Tab-delimited | Plain text | Tab-separated | Simple data exchange, spreadsheets | Supported by many tools; less standardized than VCF |
| XML | Markup | XML | Interoperability, programmatic exchange | Flexible; used in tools like SNPper |
Not sure I am following the top-level issue here. The data used by the DNA Segment Map gramplet can be either csv or tsv. It is generated via cut-paste from the various separate external apps that provide the segment info (FamilyTreeDNA, GEDmatch, MyHeritage, LivingDNA, …)
The goal was to make the import easy from these apps. Editing, if needed, can be done in the Note itself currently. Since the format provided from these apps sometimes changes and is language-dependent (radix, thousep), some editing may be required. There is no markup in these notes.
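For anyone wondering what “either csv or tsv” means in practice, a reader for such a note can be sketched in a few lines. This is not the gramplet’s actual code: `read_segment_rows` is a hypothetical name, and the rule “a tab in the first line means tab-separated” is an assumption made for the example. It deliberately does not touch the numbers, since the radix and thousands separator depend on the source app and language.

```python
import csv
import io


def read_segment_rows(text: str) -> list[list[str]]:
    """Split pasted segment data into rows, accepting commas or tabs."""
    # Assumption: a tab anywhere in the first line means tab-separated data.
    first_line = text.splitlines()[0] if text.strip() else ""
    delimiter = "\t" if "\t" in first_line else ","
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    # Drop blank lines and trim stray spaces around each cell.
    return [[cell.strip() for cell in row] for row in reader if row]
```

Since nothing in the note is markup, the text can be handed to such a function as is.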
I don’t think changing to use a Media File instead of an AssociationNote has any advantage, and does have disadvantages:
- requires an external editor.
- Person Ref editor would need to add a Gallery tab.
Yes. The DNA Segment Map gramplet uses notes as a convenient place to store data. These notes are not really intended to be read directly by the user.
I believe that this is working toward supporting GEDCOM 7 functionality.
See:
- 0013176: [GEDCOM 7] Support Mime-types for notes
- 0012226: [GEDCOM 7] Support Import & Export of New (June 2021) version
- Pull Request #2047 : Add mimetypes to notes by olivierberten
GEDCOM 7.0 modernizes media handling throughout the specification, including requiring valid MIME types for multimedia objects. However, for notes specifically, GEDCOM 7.0 does not require a MIME type tag for plain text or markdown notes, as the format is implied by context (plain text or markdown). The introduction of markdown as an allowed format is the main change relevant to the “type” of note content, but not as a formal MIME type tag.
As I said in my previous post, Gedcom 7.0 only supports two mime types: `text/plain` and `text/html`. The minimum requirement for `text/html` is that we recognise the `p`, `br`, `b`, `i`, `u`, `s`, `sup` and `sub` tags, together with the `&amp;`, `&lt;`, `&gt;`, `&quot;` and `&apos;` entities. Since superscript and subscript were added to the editor, this is no longer a problem.
All that is needed is a simple HTML parser to read Gedcom containing notes with `text/html` content.
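As a rough idea of how small such a parser could be, here is a sketch built on Python’s standard `html.parser` module. The class name, the helper function and the `(start, end, tag)` span tuples are all hypothetical; they are not the Gramps `StyledText` API, and mapping the spans onto real styled-text tags is left out. Unknown tags are ignored and their text is kept.

```python
from html.parser import HTMLParser

# Tags named above as the minimum for text/html notes.
SUPPORTED_TAGS = {"p", "br", "b", "i", "u", "s", "sup", "sub"}


class NoteHTMLParser(HTMLParser):
    """Collect plain text plus (start, end, tag) spans from simple HTML."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; and &lt;.
        super().__init__(convert_charrefs=True)
        self.text = ""
        self.spans = []   # finished (start, end, tag) tuples
        self._open = []   # stack of (tag, start offset)

    def handle_starttag(self, tag, attrs):
        if tag == "br":
            self.text += "\n"
        elif tag in SUPPORTED_TAGS and tag != "p":
            self._open.append((tag, len(self.text)))

    def handle_endtag(self, tag):
        if tag == "p":
            # A closing paragraph becomes a line break.
            self.text += "\n"
            return
        # Close the most recent open tag with the same name, if any.
        for index in range(len(self._open) - 1, -1, -1):
            if self._open[index][0] == tag:
                _, start = self._open.pop(index)
                self.spans.append((start, len(self.text), tag))
                break

    def handle_data(self, data):
        self.text += data


def parse_note_html(html: str):
    """Return (plain_text, spans) for a text/html note body."""
    parser = NoteHTMLParser()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return parser.text, parser.spans
```

Calling `parse_note_html("<p>A <b>bold</b> word</p>")` gives the text with the markup removed and one span covering “bold”, which is close to the positional form Gramps already uses for styled notes.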