AI Chat with OpenAI returns an error on successful call

Running versions:

Gramps 6.0.5
Gramps Web API 3.4.0
Gramps Web Frontend 25.10.1
Gramps QL 0.4.0
Sifts 1.0.0
locale: en
multi-tree: false
task queue: true
OCR: true
chat: true

I’ve configured GRAMPSWEB_LLM_MODEL: gpt-5-nano, set the OPENAI_API_KEY, and added GRAMPSWEB_VECTOR_EMBEDDING_MODEL, though it seems the latter is not necessary when using a hosted LLM?
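For reference, a sketch of the relevant docker-compose entries (the service name `grampsweb` is an assumption about the standard setup; the key is redacted):

```yaml
services:
  grampsweb:
    environment:
      GRAMPSWEB_LLM_MODEL: gpt-5-nano
      OPENAI_API_KEY: "long-api-key"  # redacted
      GRAMPSWEB_VECTOR_EMBEDDING_MODEL: sentence-transformers/distiluse-base-multilingual-cased-v2
```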

I’ve got a paid account and an active, wide-scoped token configured. When I make a request in thew chat, I get “Invalid message format”, apparently from here.

Here is a log entry from the container:

```
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Batches: 100%|██████████| 1/1 [00:00<00:00, 11.93it/s]
```

I also noticed that the first call in a new chat fails almost instantly, while the second one takes noticeably longer, as if it were ‘thinking’.

Any help appreciated!

Yes, it is necessary. Without it, you cannot generate vector embeddings and your tree content will not be accessible to the LLM.
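To illustrate why: semantic search embeds each object in the tree as a vector and retrieves the best matches for the question, which are then passed to the LLM as context. A toy sketch with made-up vectors (real embeddings come from the configured sentence-transformers model):

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Fake 3-dimensional "embeddings" of tree content (illustration only).
docs = {
    "John Doe, born 1890 in Boston": [0.9, 0.1, 0.0],
    "Marriage record, 1912": [0.1, 0.8, 0.2],
}
query_vec = [0.85, 0.15, 0.05]  # pretend embedding of "Who is John Doe?"

# The best-matching object is what gets handed to the LLM as context.
best = max(docs, key=lambda d: cosine(docs[d], query_vec))
```

Without the embedding model, this retrieval step has nothing to search, so the chat has no tree content to work with.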

> I’ve got a paid account and an active, wide-scoped token configured. When I make a request in the chat, I get “Invalid message format”, apparently from here.

Perhaps GPT-5 doesn’t support the (“old”) completions API. Can you try with GPT-4?

Thanks for the clarification. I initially attempted to use gpt-5-nano, which apparently also supports the completions endpoint. Switching to GRAMPSWEB_LLM_MODEL: gpt-4o-mini didn’t help, and the first request still seems to fail outright (without a web call?):

```
INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: sentence-transformers/distiluse-base-multilingual-cased-v2
Batches: 100%|██████████| 1/1 [00:00<00:00, 11.88it/s]
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Batches: 100%|██████████| 1/1 [00:00<00:00, 15.22it/s]
```
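For context, the call that gets the 200 is a standard Chat Completions request, roughly this payload (field names from OpenAI’s public API; the system prompt content is a guess, not the one Gramps Web actually sends):

```python
# Sketch of the JSON body POSTed to /v1/chat/completions.
payload = {
    "model": "gpt-4o-mini",
    "messages": [
        # Hypothetical system prompt for illustration:
        {"role": "system", "content": "You answer questions about a family tree."},
        {"role": "user", "content": "Who is my father?"},
    ],
}
```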

I’m happy to debug this further locally if I get the devcontainers working. As far as I understood, there is no way to switch to debug-level logging at the moment.

You can use `--log-level debug` in gunicorn.
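With the stock Docker image, one way to pass that flag is gunicorn’s standard GUNICORN_CMD_ARGS environment variable (assuming the image doesn’t override it), e.g. in docker-compose.yml:

```yaml
services:
  grampsweb:
    environment:
      GUNICORN_CMD_ARGS: "--log-level debug"
```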

What is this log you are showing? The “Batches” lines are from the celery container, I guess?

No, the batches are also displayed in the main container, at least at gunicorn’s debug log level. I enabled debugging in all containers and also checked in the browser’s devtools: I’m getting the same 422 Unprocessable Content in the frontend when calling /chat.

Here is a log from the grampsweb container:

```
[2025-10-30 12:35:35 +0000] [11] [DEBUG] POST /api/chat/
[2025-10-30 12:35:35 +0000] [11] [DEBUG] Contextualizing prompt 'Who is my mother?' with context '*Human message:* Who is my father?'
DEBUG:gramps_webapi.app:Contextualizing prompt 'Who is my mother?' with context '*Human message:* Who is my father?'
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
```

So something is wrong with the response processing.

Configuration:

```yaml
GRAMPSWEB_LLM_MODEL: gpt-4o-mini
OPENAI_API_KEY: "long-api-key"
GRAMPSWEB_VECTOR_EMBEDDING_MODEL: sentence-transformers/distiluse-base-multilingual-cased-v2
```

OK. The POST to the API returns 200, so that part is fine. Unfortunately, the code processing the response is wrapped in a try/except that swallows the exact error. So it would indeed be great if you could try the dev container.
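The problematic pattern looks roughly like this (a hypothetical handler for illustration, not the actual Gramps Web API code):

```python
import logging

logger = logging.getLogger("chat")

def process_response(response: dict) -> str:
    """Hypothetical response handler with an error-swallowing except."""
    try:
        return response["choices"][0]["message"]["content"]
    except Exception:
        # The root cause is swallowed here; only a generic message
        # reaches the UI. logger.exception() at least records it.
        logger.exception("Failed to process LLM response")
        return "Invalid message format"

ok = process_response({"choices": [{"message": {"content": "Hi"}}]})  # well-formed
bad = process_response({})  # any failure collapses into the generic error
```

This is why the UI shows “Invalid message format” regardless of what actually went wrong downstream.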

But first, can you please go to search in Gramps Web, toggle on semantic search and see if that works?

It could be that the actual error is that your vector search doesn’t work and throws an error that is just swallowed, leading to the misleading message in the UI.

Hello David,

It all worked after rebuilding the semantic search database, as simple as that. Admittedly I wasn’t attentive here; still, I would expect this to be documented a little more clearly as a required step. Maybe I could contribute. I’m now looking forward to extending the hosted LLM setup with web search, for the models that support it. That could be a good way to draw on online data (or model knowledge) to expand on the history of a person or other entity in the family tree.

Thanks again,

Tim


Contributions are of course welcome! This applies to documentation as well as code.

Documentation in particular benefits a lot from somebody who didn’t implement a feature but is trying to set it up from scratch. Doc repo: GitHub - gramps-project/gramps-web-docs: Documentation for Gramps Web

Regarding code, you might be interested in this new draft PR of mine, where I’m trying to lay the groundwork for tool calls: Use Pydantic AI for LLM chat by DavidMStraub · Pull Request #720 · gramps-project/gramps-web-api · GitHub
