Talk to your tree!

Thank you very much for this new tool! I am very interested in experimenting with it in Gramps Web, especially because of the reported issues with the ChatWithTree addon in the Windows AIO version of Gramps desktop, where the litellm Python module cannot be installed (ChatWithTree gramplet addition by MelleKoning · Pull Request #762 · gramps-project/addons-source · GitHub).

I am successfully running Gramps Web on a Raspberry Pi 4 (8 GB) and now want to activate the new AI chat function, but I am having difficulties following the instructions on the related help page (Setting up AI chat - Gramps Web).

From what I understand, I have to change the settings in docker-compose.yml, but I still cannot activate the chat function. The system information shows:

Gramps 6.0.6
Gramps Web API 3.6.0
Gramps Web Frontend 25.12.0
Gramps QL 0.4.0
Sifts 1.1.1
locale: en
multi-tree: false
task queue: true
OCR: true
chat: false

Here is what the docker-compose.yml looks like (my additions are the vector embedding and LLM settings and the ollama service):

services:
  grampsweb: &grampsweb
    image: ghcr.io/gramps-project/grampsweb:latest
    restart: always
    ports:
      - "80:5000"  # host:docker
    environment:
      GRAMPSWEB_TREE: "Gramps Web"  # will create a new tree if not exists
      GRAMPSWEB_CELERY_CONFIG__broker_url: "redis://grampsweb_redis:6379/0"
      GRAMPSWEB_CELERY_CONFIG__result_backend: "redis://grampsweb_redis:6379/0"
      GRAMPSWEB_RATELIMIT_STORAGE_URI: redis://grampsweb_redis:6379/1
      GUNICORN_NUM_WORKERS: 2
      GRAMPSWEB_VECTOR_EMBEDDING_MODEL: sentence-transformers/distiluse-base-multilingual-cased-v2
      GRAMPSWEB_LLM_BASE_URL: http://ollama:11434/v1
      GRAMPSWEB_LLM_MODEL: tinyllama
      OPENAI_API_KEY: ollama
    depends_on:
      - grampsweb_redis
    volumes:
      - gramps_users:/app/users  # persist user database
      - gramps_index:/app/indexdir  # persist search index
      - gramps_thumb_cache:/app/thumbnail_cache  # persist thumbnails
      - gramps_cache:/app/cache  # persist export and report caches
      - gramps_secret:/app/secret  # persist flask secret
      - gramps_db:/root/.gramps/grampsdb  # persist Gramps database
      - gramps_media:/app/media  # persist media files
      - gramps_tmp:/tmp

  grampsweb_celery:
    <<: *grampsweb  # YAML merge key copying the entire grampsweb service config
    ports: []
    container_name: grampsweb_celery
    depends_on:
      - grampsweb_redis
    command: celery -A gramps_webapi.celery worker --loglevel=INFO --concurrency=2

  grampsweb_redis:
    image: redis:alpine
    container_name: grampsweb_redis
    restart: always

  ollama:
    image: ollama/ollama
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama

volumes:
  gramps_users:
  gramps_index:
  gramps_thumb_cache:
  gramps_cache:
  gramps_secret:
  gramps_db:
  gramps_media:
  gramps_tmp:
  ollama_data:

Can anyone please help me figure out how to activate the AI chat? What am I doing wrong?

Thanks a lot!

Hi,

OPENAI_API_KEY: ollama is not needed (& doesn’t make sense & has no effect). Apart from that, your config looks ok.

But the information you shared is not consistent.

You have chat: false, but this will be true if the VECTOR_EMBEDDING_MODEL and LLM_MODEL config parameters are non-empty, regardless of whether they are correct or consistent.

So the fact that you do set these parameters in your compose file but it’s not reflected in the system info means that something is wrong about the information you shared. Perhaps you did not restart your containers after changing the configuration?
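A quick way to verify, as a sketch (the service names below are the ones from your compose file; on older installs the command is docker-compose with a hyphen): check which settings the running container actually sees, then recreate the containers so the new environment is picked up:

docker compose exec grampsweb env | grep GRAMPSWEB_LLM
docker compose up -d --force-recreate grampsweb grampsweb_celery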

Hi David,

thank you very much for your quick reply!

You are right, something seems to be off in my parameters (I am not a Linux user and am just experimenting).

As you suggested, I removed the line OPENAI_API_KEY: ollama, even though the related help page explicitly says to add it:

“When deploying Gramps Web with Docker Compose, you can add an Ollama service […]

and then set the LLM_BASE_URL configuration parameter to http://ollama:11434/v1. Set LLM_MODEL to a model supported by Ollama, and pull it down in your container with ollama pull <model>. Finally, set OPENAI_API_KEY to ollama.”

I did not run ollama pull <model>. Where is this supposed to be run? On the command line?

ollama pull tinyllama
bash: ollama: command not found

However, now the chat seems to be activated in the system info:

Gramps 6.0.6
Gramps Web API 3.6.0
Gramps Web Frontend 25.12.0
Gramps QL 0.4.0
Sifts 1.1.1
locale: en
multi-tree: false
task queue: true
OCR: true
chat: true

I restarted the containers:

docker-compose down
[+] Running 5/5
Container home-grampsweb-1 Removed
Container ollama Removed
Container grampsweb_celery Removed
Container grampsweb_redis Removed
Network home_default Removed

xxx/home $ docker-compose up -d
[+] Running 5/5
Network home_default Created
Container ollama Started
Container grampsweb_redis Started
Container grampsweb_celery Started
Container home-grampsweb-1 Started

However, I cannot see the chat window :face_with_peeking_eye: Where is it supposed to appear?

I started creating the search index for the semantic search yesterday evening and it seems to have been stuck at 99% ever since:

Status: 10027/10030 (the progress indicators for both creation and update are still pending).

The docker-compose.yml now looks like:

services:
  grampsweb: &grampsweb
    image: ghcr.io/gramps-project/grampsweb:latest
    restart: always
    ports:
      - "80:5000"  # host:docker
    environment:
      GRAMPSWEB_TREE: "Gramps Web"  # will create a new tree if not exists
      GRAMPSWEB_CELERY_CONFIG__broker_url: "redis://grampsweb_redis:6379/0"
      GRAMPSWEB_CELERY_CONFIG__result_backend: "redis://grampsweb_redis:6379/0"
      GRAMPSWEB_RATELIMIT_STORAGE_URI: redis://grampsweb_redis:6379/1
      GUNICORN_NUM_WORKERS: 2
      GRAMPSWEB_VECTOR_EMBEDDING_MODEL: sentence-transformers/distiluse-base-multilingual-cased-v2
      GRAMPSWEB_LLM_BASE_URL: http://ollama:11434/v1
      GRAMPSWEB_LLM_MODEL: tinyllama
    depends_on:
      - grampsweb_redis
    volumes:
      - gramps_users:/app/users  # persist user database
      - gramps_index:/app/indexdir  # persist search index
      - gramps_thumb_cache:/app/thumbnail_cache  # persist thumbnails
      - gramps_cache:/app/cache  # persist export and report caches
      - gramps_secret:/app/secret  # persist flask secret
      - gramps_db:/root/.gramps/grampsdb  # persist Gramps database
      - gramps_media:/app/media  # persist media files
      - gramps_tmp:/tmp

  grampsweb_celery:
    <<: *grampsweb  # YAML merge key copying the entire grampsweb service config
    ports: []
    container_name: grampsweb_celery
    depends_on:
      - grampsweb_redis
    command: celery -A gramps_webapi.celery worker --loglevel=INFO --concurrency=2

  grampsweb_redis:
    image: redis:alpine
    container_name: grampsweb_redis
    restart: always

  ollama:
    image: ollama/ollama
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama

volumes:
  gramps_users:
  gramps_index:
  gramps_thumb_cache:
  gramps_cache:
  gramps_secret:
  gramps_db:
  gramps_media:
  gramps_tmp:
  ollama_data:

Oops :see_no_evil_monkey: Then I guess you need it, sorry - I didn’t write those docs.

However, I cannot see the chat window :face_with_peeking_eye: Where is it supposed to appear?

You need to select the user groups able to use AI features in the user management settings first.

I started the creation of the search index for the semantic search yesterday evening and it seems to be pending at 99% since then:

Please check the logs of the celery container.
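For example (assuming the container and service names from your compose file; the same commands work for the grampsweb container by swapping the name):

docker logs --tail=100 grampsweb_celery
docker compose logs -f grampsweb_celery  # follow the logs live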

Many thanks again :slight_smile:

I found the setting in the user management and can see the chat now. I overlooked this step in the documentation - my apologies.

Nevertheless, the chat only returns “unexpected errors”.

Is it because the search index is still pending at Status: 10029/10033?

This is the log of the celery container :face_with_peeking_eye:

(__main__.py:8): Gtk-CRITICAL **: 16:06:35.540: gtk_icon_theme_get_for_screen: assertion 'GDK_IS_SCREEN (screen)' failed

INFO:sentence_transformers.SentenceTransformer:Use pytorch device_name: cpu

INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: sentence-transformers/distiluse-base-multilingual-cased-v2

INFO [alembic.runtime.migration] Context impl SQLiteImpl.

INFO [alembic.runtime.migration] Will assume non-transactional DDL.

(celery:1): Gtk-CRITICAL **: 16:07:08.268: gtk_icon_theme_get_for_screen: assertion ‘GDK_IS_SCREEN (screen)’ failed

INFO:sentence_transformers.SentenceTransformer:Use pytorch device_name: cpu

INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: sentence-transformers/distiluse-base-multilingual-cased-v2

/usr/local/lib/python3.11/dist-packages/celery/platforms.py:841: SecurityWarning: You’re running the worker with superuser privileges: this is

absolutely not recommended!

Please specify a different user using the --uid option.

User information: uid=0 euid=0 gid=0 egid=0

warnings.warn(SecurityWarning(ROOT_DISCOURAGED.format(

 -------------- celery@5d2951f64583 v5.6.0 (recovery)
--- ***** -----
-- ******* ---- Linux-6.12.47+rpt-rpi-v8-aarch64-with-glibc2.36 2026-01-07 16:07:31
- *** --- * ---
- ** ---------- [config]
- ** ---------- .> app:         default:0x7fa822f7d0 (.default.Loader)
- ** ---------- .> transport:   redis://grampsweb_redis:6379/0
- ** ---------- .> results:     redis://grampsweb_redis:6379/0
- *** --- * --- .> concurrency: 2 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** -----
 -------------- [queues]
                .> celery           exchange=celery(direct) key=celery

[tasks]

. gramps_webapi.api.tasks.check_repair_database

. gramps_webapi.api.tasks.delete_objects

. gramps_webapi.api.tasks.export_db

. gramps_webapi.api.tasks.export_media

. gramps_webapi.api.tasks.generate_report

. gramps_webapi.api.tasks.import_file

. gramps_webapi.api.tasks.import_media_archive

. gramps_webapi.api.tasks.media_ocr

. gramps_webapi.api.tasks.process_chat

. gramps_webapi.api.tasks.process_transactions

. gramps_webapi.api.tasks.search_reindex_full

. gramps_webapi.api.tasks.search_reindex_incremental

. gramps_webapi.api.tasks.send_email_confirm_email

. gramps_webapi.api.tasks.send_email_new_user

. gramps_webapi.api.tasks.send_email_reset_password

. gramps_webapi.api.tasks.send_telemetry_task

. gramps_webapi.api.tasks.update_search_indices_from_transaction

. gramps_webapi.api.tasks.upgrade_database_schema

. gramps_webapi.api.tasks.upgrade_undodb_schema

[2026-01-07 16:07:32,893: INFO/MainProcess] Connected to redis://grampsweb_redis:6379/0

[2026-01-07 16:07:32,904: INFO/MainProcess] mingle: searching for neighbors

[2026-01-07 16:07:33,929: INFO/MainProcess] mingle: all alone

[2026-01-07 16:07:33,961: INFO/MainProcess] celery@5d2951f64583 ready.

For the chat errors look at the grampsweb container log.

Thanks.

Here is the grampsweb log:

(__main__.py:8): Gtk-CRITICAL **: 16:06:35.580: gtk_icon_theme_get_for_screen: assertion 'GDK_IS_SCREEN (screen)' failed

INFO:sentence_transformers.SentenceTransformer:Use pytorch device_name: cpu

INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: sentence-transformers/distiluse-base-multilingual-cased-v2

INFO [alembic.runtime.migration] Context impl SQLiteImpl.

INFO [alembic.runtime.migration] Will assume non-transactional DDL.

[2026-01-07 16:07:02 +0000] [10] [INFO] Starting gunicorn 23.0.0

[2026-01-07 16:07:02 +0000] [10] [INFO] Listening at: http://0.0.0.0:5000 (10)

[2026-01-07 16:07:02 +0000] [10] [INFO] Using worker: sync

[2026-01-07 16:07:02 +0000] [11] [INFO] Booting worker with pid: 11

[2026-01-07 16:07:02 +0000] [12] [INFO] Booting worker with pid: 12

(gunicorn:12): Gtk-CRITICAL **: 16:07:07.399: gtk_icon_theme_get_for_screen: assertion ‘GDK_IS_SCREEN (screen)’ failed

(gunicorn:11): Gtk-CRITICAL **: 16:07:07.408: gtk_icon_theme_get_for_screen: assertion ‘GDK_IS_SCREEN (screen)’ failed

INFO:sentence_transformers.SentenceTransformer:Use pytorch device_name: cpu

INFO:sentence_transformers.SentenceTransformer:Use pytorch device_name: cpu

INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: sentence-transformers/distiluse-base-multilingual-cased-v2

INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: sentence-transformers/distiluse-base-multilingual-cased-v2

INFO:httpx:HTTP Request: POST http://ollama:11434/v1/chat/completions "HTTP/1.1 404 Not Found"

[2026-01-07 16:09:25 +0000] [11] [ERROR] Unexpected error in agent: status_code: 404, model_name: tinyllama, body: {'message': "model 'tinyllama' not found", 'type': 'api_error', 'param': None, 'code': None}

ERROR:gramps_webapi.app:Unexpected error in agent: status_code: 404, model_name: tinyllama, body: {'message': "model 'tinyllama' not found", 'type': 'api_error', 'param': None, 'code': None}

INFO:flask-limiter:ratelimit 1 per 1 second (192.168.0.30) exceeded at endpoint: api.token_refresh

Ok, something is obviously wrong with:

httpx:HTTP Request: POST http://ollama:11434/v1/chat/completions "HTTP/1.1 404 Not Found"

The log of the ollama container says:

time=2026-01-07T20:05:37.646Z level=INFO source=routes.go:1554 msg=“server config” env=“map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/root/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false OLLAMA_VULKAN:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]”

time=2026-01-07T20:05:37.647Z level=INFO source=images.go:493 msg=“total blobs: 0”

time=2026-01-07T20:05:37.647Z level=INFO source=images.go:500 msg=“total unused blobs removed: 0”

time=2026-01-07T20:05:37.648Z level=INFO source=routes.go:1607 msg=“Listening on [::]:11434 (version 0.13.5)”

time=2026-01-07T20:05:37.651Z level=INFO source=runner.go:67 msg=“discovering available GPUs…”

time=2026-01-07T20:05:37.659Z level=INFO source=server.go:429 msg=“starting runner” cmd=“/usr/bin/ollama runner --ollama-engine --port 35641”

time=2026-01-07T20:05:37.978Z level=INFO source=server.go:429 msg=“starting runner” cmd=“/usr/bin/ollama runner --ollama-engine --port 33525”

time=2026-01-07T20:05:38.282Z level=INFO source=types.go:60 msg=“inference compute” id=cpu library=cpu compute=“” name=cpu description=cpu libdirs=ollama driver=“” pci_id=“” type=“” total=“7.6 GiB” available=“6.3 GiB”

time=2026-01-07T20:05:38.282Z level=INFO source=routes.go:1648 msg=“entering low vram mode” “total vram”=“0 B” threshold=“20.0 GiB”

[GIN] 2026/01/07 - 20:08:36 | 404 | 3.221055ms | 172.18.0.4 | POST “/v1/chat/completions”

Is the error related to the port 11434?

I don’t know, perhaps an outdated Ollama? I suggest you debug the 404 with curl first, to be completely independent of Gramps Web.
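Something along these lines should do, as a sketch (run on the Raspberry Pi host; port 11434 is published in your compose file, and the model name is whatever you set in GRAMPSWEB_LLM_MODEL):

# list the models the Ollama server actually has
curl http://localhost:11434/api/tags

# send the same kind of request Gramps Web sends
curl http://localhost:11434/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "tinyllama", "messages": [{"role": "user", "content": "Hello"}]}'

If the first call returns an empty model list, the 404 most likely just means the model has not been pulled yet.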

Thanks again, David. Your help is really appreciated, but I guess I’ll have to leave it at this point as this is getting too complicated for me.

You did pick the most complicated option…

Based on this, the question is: have you downloaded the tinyllama model on the Ollama server?

Obviously not, as I do not know how to do this. I am struggling with this part:

Set LLM_MODEL to a model supported by Ollama, and pull it down in your container with ollama pull <model>.

How do I pull it down in my container?

Connect to the Ollama container via a terminal. On Windows this would be something like docker exec -it ollama bash if the Ollama container is named ollama, and in the terminal run ollama pull tinyllama.

Thanks for your help. I managed to pull the model using:

docker exec -it ollama sh

ollama pull tinyllama

Still, something seems not to be correct yet:

This is the log of the ollama container:

time=2026-01-19T11:10:22.029Z level=INFO source=routes.go:1554 msg=“server config” env=“map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/root/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false OLLAMA_VULKAN:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]”

time=2026-01-19T11:10:22.032Z level=INFO source=images.go:493 msg=“total blobs: 5”

time=2026-01-19T11:10:22.033Z level=INFO source=images.go:500 msg=“total unused blobs removed: 0”

time=2026-01-19T11:10:22.036Z level=INFO source=routes.go:1607 msg=“Listening on [::]:11434 (version 0.13.5)”

time=2026-01-19T11:10:22.040Z level=INFO source=runner.go:67 msg=“discovering available GPUs…”

time=2026-01-19T11:10:22.046Z level=INFO source=server.go:429 msg=“starting runner” cmd=“/usr/bin/ollama runner --ollama-engine --port 33985”

time=2026-01-19T11:10:22.157Z level=INFO source=server.go:429 msg=“starting runner” cmd=“/usr/bin/ollama runner --ollama-engine --port 46259”

time=2026-01-19T11:10:22.246Z level=INFO source=types.go:60 msg=“inference compute” id=cpu library=cpu compute=“” name=cpu description=cpu libdirs=ollama driver=“” pci_id=“” type=“” total=“7.6 GiB” available=“6.4 GiB”

time=2026-01-19T11:10:22.246Z level=INFO source=routes.go:1648 msg=“entering low vram mode” “total vram”=“0 B” threshold=“20.0 GiB”

[GIN] 2026/01/19 - 11:12:09 | 400 | 90.007254ms | 172.18.0.5 | POST “/v1/chat/completions”

and the grampsweb container:

(__main__.py:8): Gtk-CRITICAL **: 11:10:28.203: gtk_icon_theme_get_for_screen: assertion 'GDK_IS_SCREEN (screen)' failed

INFO:sentence_transformers.SentenceTransformer:Use pytorch device_name: cpu

INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: sentence-transformers/distiluse-base-multilingual-cased-v2

INFO [alembic.runtime.migration] Context impl SQLiteImpl.

INFO [alembic.runtime.migration] Will assume non-transactional DDL.

[2026-01-19 11:11:09 +0000] [10] [INFO] Starting gunicorn 23.0.0

[2026-01-19 11:11:09 +0000] [10] [INFO] Listening at: http://0.0.0.0:5000 (10)

[2026-01-19 11:11:09 +0000] [10] [INFO] Using worker: sync

[2026-01-19 11:11:09 +0000] [11] [INFO] Booting worker with pid: 11

[2026-01-19 11:11:09 +0000] [12] [INFO] Booting worker with pid: 12

(gunicorn:11): Gtk-CRITICAL **: 11:11:14.176: gtk_icon_theme_get_for_screen: assertion ‘GDK_IS_SCREEN (screen)’ failed

(gunicorn:12): Gtk-CRITICAL **: 11:11:14.219: gtk_icon_theme_get_for_screen: assertion ‘GDK_IS_SCREEN (screen)’ failed

INFO:sentence_transformers.SentenceTransformer:Use pytorch device_name: cpu

INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: sentence-transformers/distiluse-base-multilingual-cased-v2

INFO:sentence_transformers.SentenceTransformer:Use pytorch device_name: cpu

INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: sentence-transformers/distiluse-base-multilingual-cased-v2

INFO:httpx:HTTP Request: POST http://ollama:11434/v1/chat/completions "HTTP/1.1 400 Bad Request"

[2026-01-19 11:12:09 +0000] [12] [ERROR] Unexpected error in agent: status_code: 400, model_name: tinyllama, body: {'message': 'registry.ollama.ai/library/tinyllama:latest does not support tools', 'type': 'api_error', 'param': None, 'code': None}

ERROR:gramps_webapi.app:Unexpected error in agent: status_code: 400, model_name: tinyllama, body: {'message': 'registry.ollama.ai/library/tinyllama:latest does not support tools', 'type': 'api_error', 'param': None, 'code': None}

Any ideas?

Thanks!

We’re getting closer!

“tinyllama:latest does not support tools” means that this model cannot call tools, which Gramps Web AI chat requires. Before Gramps Web API v3.6.0, the chat was quite dumb - it only did semantic search and then used the LLM to summarize the results.

Since v3.6.0, the chat is much smarter thanks to tool calling: the LLM selects the right tool (e.g. search, filter), the tool is executed, and the LLM then summarizes the result. However, this means the tiniest models cannot be used - you need to pull a model that supports tool calling.

This means our documentation is outdated - it currently says “You can try whether tinyllama meets your needs”.
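If you want to stay with Ollama, something like this should work as a sketch (llama3.2:1b is just one example of a small model that is tagged with tool support on Ollama.com - check the model page to be sure):

docker exec -it ollama ollama pull llama3.2:1b

then change the model in docker-compose.yml:

GRAMPSWEB_LLM_MODEL: llama3.2:1b

and recreate the containers with docker compose up -d --force-recreate.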

Thanks again for your help! I guess this means that my intended setup on a Raspberry Pi won't work anyway if the tiniest models don't support tool calling. :frowning:

TinyLlama != the tiniest LLM. :slight_smile: Look through the list of models on Ollama.com to find one that suits.


OK, then I have faith again and I'll try other models. Can you give some general recommendations on what I should be looking for? E.g., Ollama offers categories such as “cloud”, “embeddings”, “vision”, “tools”, “thinking”… Should I go for “tools”? Which size would make sense? tinyllama has 1.1B parameters.

For now, I found:

llama3.2 llama3.2:1b

granite4 granite4:1b

gemma3 gemma3:1b

smollm2 smollm2:1.7b

Any recommendations? Thanks!

You should be looking for “tools”. But a general warning: just because a model allows you to use a tool does not mean that it will do so effectively. In general, the smaller the model, the more “fake” information it will generate. I hate the term “hallucination” because it implies that the model is only sometimes on drugs. But these models are always on drugs.
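One way to check a model before wiring it into Gramps Web, assuming a reasonably recent Ollama version (which prints a Capabilities section for local models):

docker exec -it ollama ollama show llama3.2:1b

Look for “tools” under Capabilities before setting the model in GRAMPSWEB_LLM_MODEL.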


I think you have to try and see! I cannot say more than to naively expect that larger models will be slower and give better results, but who knows :wink:

What might also be relevant is the context window - more is better.

If you find a model that gives decent results on a RPi, that would be very valuable information for the community I think!
