Can you make the "noindex" HTML meta tags optional for search engines?

Apparently, in reaction to the post "Adding Noindex to the Narrated Web Site? - #5", a noindex was added to basepage.py. So in effect, one person’s desire not to be indexed now forcibly applies to everyone. A more proper solution would have been to make this an optional feature.


The author has repeatedly expressed a dislike of Google. So he has no personal interest in doing work to add a preference interface to make noindex optional.

Of course, you can hack the lines pointed out in the posting and skip adding an interface.

But if you have the skills to make an option, he’d probably accept an enhancement PR submitted through GitHub. (Assuming the patch passes the pylint and Black quality tests.)
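
For anyone considering such a PR, here is a minimal sketch of the idea: a boolean decides whether the two robots tags are included. It is plain Python, not Gramps code. The function name `build_meta_attrs` and the `noindex` parameter are hypothetical, and a real patch would also have to add the option to the report's settings.

```python
from typing import List


def build_meta_attrs(author: str, noindex: bool = True) -> List[str]:
    """Return meta-tag attribute strings (hypothetical helper, not Gramps API).

    The robots/googlebot entries are appended only when noindex is True.
    """
    attrs = ['name="author" content="%s"' % author]
    if noindex:
        attrs.append('name="robots" content="noindex"')
        attrs.append('name="googlebot" content="noindex"')
    return attrs
```

With `noindex=False` only the author entry is returned, so pages built from that list would carry no robots directives.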

While I have 44 years of development experience, Python is not part of my skillset. I do now have a shell script / awk to edit the author’s expressed dislikes out of all the generated files, which took about 5 minutes to whip up.


I suggest that you add a placeholder person to your tree with a “Gramps” surname. Then add a Note detailing the manual patch and a link to this discussion. And link the awk script as media or (more rot resistantly) paste the script into a Note. You can use a custom Type for the note… like Patch or Hack. Use the Private switch for the placeholder person.

Anytime the report is updated, your patch will revert. So having a workflow documented is helpful.

Have you noticed this Discourse thread:

I hoped that it would become another place where users could share their customizations for Gramps. (So they could recollect what needed tweaking after the next update.)


In case you misunderstood: I run my script to patch the generated web report pages, not to patch the python source files.


Ahhh. I did misunderstand.

But the suggestion holds true. I may run a report intensely for a few days then not need it again for years. Without a reminder, I have to recall there was a problem, recall the nature of the issue, isolate it, then work out the solution again.

There is a similar issue with Books of Reports. I’ll spend a fair amount of time compiling customized reports into a Book and forget about the objective and that the Book exists. (Then fixing things if the Book used custom filters that have been changed or deleted.)

The best option would be to not create the meta tags in the HTML at all.
The relevant lines are 1675 and 1676, which reference _meta5 and _meta6.

You need to remove these two lines in basepage.py, or comment them out as follows:

    # create additional meta tags
    meta = Html("meta", attr=_meta1) + (
        Html("meta", attr=_meta2, indent=False),
        Html("meta", attr=_meta3, indent=False),
        Html("meta", attr=_meta4, indent=False),
        # Html("meta", attr=_meta5, indent=False),
        # Html("meta", attr=_meta6, indent=False),
    )

I’d have to repeat that every time a new version of Gramps gets installed. I’ll stick to my own sh / awk, i.e. the old-fashioned unix way, as post-processing after generating a fresh web report. I can also stick a meta description tag in a few select pages in a similar way. Bing still refuses to index any page without that specific tag present, and there’s no way to get into duckduckgo without passing the Bing gauntlet first.
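
For readers who prefer Python to sh / awk, the same post-processing idea could look roughly like this. It is a sketch, not the script mentioned above: the function names and the regular expression are assumptions, and the pattern only matches the two tags in the form the report writes them (`<meta name="robots" content="noindex" />` and the `googlebot` equivalent).

```python
import re
from pathlib import Path

# Matches one robots/googlebot noindex meta tag, plus the whitespace and
# line break around it, so no empty line is left behind.
NOINDEX_META = re.compile(
    r'[ \t]*<meta\s+name="(?:robots|googlebot)"\s+content="noindex"\s*/?>'
    r'[ \t]*\r?\n?',
    re.IGNORECASE,
)


def strip_noindex(html: str) -> str:
    """Return the page text with the noindex meta tags removed."""
    return NOINDEX_META.sub("", html)


def strip_noindex_in_tree(site_root: str) -> int:
    """Rewrite every .html file under site_root; return how many files changed."""
    changed = 0
    for path in Path(site_root).rglob("*.html"):
        original = path.read_text(encoding="utf-8")
        cleaned = strip_noindex(original)
        if cleaned != original:
            path.write_text(cleaned, encoding="utf-8")
            changed += 1
    return changed
```

Calling `strip_noindex_in_tree()` on the folder holding a freshly generated report would be the post-processing step. Files without the tags are left untouched, so it is safe to run more than once.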

If you want to have the “noindex” as an option, please make a feature request.


Using Gramps 5.1.5 on Linux

The narrated web site reports are wonderful.
Is there an easy way to add a “Noindex” meta tag in the html header of all the Gramps-generated html pages ?
The goal is to prevent Google or other robots from indexing the pages.

At the moment, I think I will use the “sed” command to insert what I want in all those files, but it’s not my favourite solution.
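
As an alternative to sed, the insertion can be sketched in Python. This is only an illustration under stated assumptions: the function names are made up, the tag is placed just before the closing `</head>`, and pages that already carry a robots meta tag are skipped.

```python
import re
from pathlib import Path

NOINDEX_TAG = '<meta name="robots" content="noindex" />'
HEAD_CLOSE = re.compile(r"</head\s*>", re.IGNORECASE)
HAS_ROBOTS = re.compile(r'<meta\s+name="robots"', re.IGNORECASE)


def add_noindex(html: str) -> str:
    """Insert the noindex tag before </head>, unless a robots tag already exists."""
    if HAS_ROBOTS.search(html):
        return html  # already tagged; leave the page as it is
    match = HEAD_CLOSE.search(html)
    if match is None:
        return html  # no closing </head> found; do nothing
    return html[:match.start()] + NOINDEX_TAG + "\n" + html[match.start():]


def add_noindex_in_tree(site_root: str) -> int:
    """Apply add_noindex to every .html file under site_root; return files changed."""
    changed = 0
    for path in Path(site_root).rglob("*.html"):
        original = path.read_text(encoding="utf-8")
        updated = add_noindex(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed += 1
    return changed
```

Because tagged pages are skipped, running it again after regenerating only part of the site does not duplicate the tag.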

Any idea ?

Thank you,

–yves

and if this is not the most appropriate place to post such a question, please point me to the right forum

Yes, there is an easy way to do that.
You can add the following lines in basepage.py

    --- a/gramps/plugins/webreport/basepage.py
    +++ b/gramps/plugins/webreport/basepage.py
    @@ -1538,12 +1538,16 @@ class BasePage:
             _meta3 = 'name="generator" content="%s %s %s"' % (
                 PROGRAM_NAME, VERSION, URL_HOMEPAGE)
             _meta4 = 'name="author" content="%s"' % self.author
    +        _meta5 = 'name="robots" content="noindex"'
    +        _meta6 = 'name="googlebot" content="noindex"'
     
             # create additional meta tags
             meta = Html("meta", attr=_meta1) + (
                 Html("meta", attr=_meta2, indent=False),
                 Html("meta", attr=_meta3, indent=False),
    -            Html("meta", attr=_meta4, indent=False)
    +            Html("meta", attr=_meta4, indent=False),
    +            Html("meta", attr=_meta5, indent=False),
    +            Html("meta", attr=_meta6, indent=False)
             )
     
             # Link to _NARRATIVESCREEN  stylesheet

Can we add these directives implicitly, without asking the user?
I don’t want to add a new option to the narrative web.


I vote Yes!

I just modified my code.

May I ask why? Why publish a tree if you don’t want to make it easy to find? You could just put your tree on OneDrive, Google Drive, etc, if you just want select people to be able to access it.

Craig

I don’t want just anybody to be able to find my data. Only the people I give the URL to will be able to see it.


Gramps 64 6.0, Windows 11 Home.

I am new to Gramps and have been trying out, and like, the Narrated Web Site report.

I would like my website to be indexed by search engines, but I think these entries in all the pages are preventing this:

    <meta name="robots" content="noindex" />
    <meta name="googlebot" content="noindex" />

Is there a setting somewhere I have missed to exclude the noindex lines?

Apologies if I have missed something obvious and thank you for any suggestions.

I do not see an option.

But you could comment out the lines in the report source. (If you do that, remember to put a note that talks about how and why you customized Gramps. It will have to be repeated each time Gramps is updated.)

https://github.com/gramps-project/gramps/blob/9aa872287dfbac66cf6e6bd38cd0e9607699172a/gramps/plugins/webreport/basepage.py#L1828-L1829

Wrong lines to comment out. See


Thank you for your quick reply :- )

I will try that for a few pages, perhaps the Individual.html lists, and see if and how the site gets indexed.

My site comprises hundreds of pages, too many to edit each time I update. I was sort of hoping there might be a general setting so I could exclude noindex everywhere.

Just a few pages might work well enough, though?

You misunderstand.

If you comment out those lines in the basepage.py file of your Gramps installation, then all the pages generated with the Narrated Web Site report will be without those lines.

However, when you upgrade Gramps (for example, from Gramps 6.0.0 to 6.0.3), you will have to tweak the basepage.py source file again.
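
Since the tweak has to be repeated after every upgrade, it can be scripted. The sketch below comments out the two lines in the source text, assuming they still look like the ones quoted in this thread (`Html("meta", attr=_meta5, ...)` and the `_meta6` equivalent). The function name is hypothetical, and the path to basepage.py depends on your installation, so reading and writing the file is left to the caller.

```python
import re

# A line that starts (after indentation) with Html("meta", attr=_meta5 or _meta6.
META_LINE = re.compile(
    r'^([ \t]*)(Html\("meta",\s*attr=_meta[56]\b)',
    re.MULTILINE,
)


def comment_out_noindex(source: str) -> str:
    """Prefix the _meta5/_meta6 Html(...) lines with '# ', keeping indentation."""
    return META_LINE.sub(r"\1# \2", source)
```

Read basepage.py, pass its text through the function, and write the result back, keeping a backup copy first. Running it twice is harmless because lines that are already commented out no longer match.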

OK, my apologies, I had misunderstood!!

I don’t really have any programming experience, but thought it worth a try. I found the file and opened it in Notepad OK, but I was not allowed to save it.
I guess it’s ‘permissions’; I will look further and have another try.

Many Thanks :- )

Yes, permissions can be aggravating.

On my OS (Fedora), I find it easier to copy the file to a user‐editable folder (like Documents), make/save the edit there, then drag the edited file back to the original folder. That offers a chance to confirm you want to overwrite a protected file.

(This saves the process of changing the folder protection to user-writable, then back to read-only protected after testing is finished.)

If you screw up the file beyond recognition, you can always restore a copy from the (correct version of the) GitHub repository.