Memory tuning & native caches¶
titiler-openeo uses memory in two very different places, and an OOM kill looks identical for both:
- Python heap — the
ImageData/numpy cubes produced while evaluating a process graph. - Native / off-heap — GDAL and
/vsi*caches. These live outside the Python heap, are not freed by the garbage collector, and add on top of per-request working memory. This page is about keeping them bounded.
The single most common production footgun is shipping a dev
VSI_CACHE_SIZE(512 MB) to Kubernetes. It is a per-file-handle cache and will OOM a pod.
The three native knobs¶
| Variable | Scope | Default | Notes |
|---|---|---|---|
GDAL_CACHEMAX |
process-global block cache | 5% of RAM | Values <100000 are MB, otherwise bytes. One cache per worker process, so it multiplies by WEB_CONCURRENCY. |
VSI_CACHE_SIZE |
per file-handle | 25 MB | In-memory cache of each open file. Total ≈ VSI_CACHE_SIZE × (concurrently open handles). This is the dangerous one. |
CPL_VSIL_CURL_CACHE_SIZE |
process-global LRU for /vsicurl chunk reads |
16 MB | Pin it so it can't drift; raise only if the budget allows. |
VSI_CACHE must be TRUE for VSI_CACHE_SIZE to apply.
Why VSI_CACHE_SIZE is the trap¶
It is allocated once per open file handle, not once per process. A single
process-graph request fans out to roughly items × bands × read_threads open
handles (the ThreadPoolExecutor reads in RasterStack plus rio-tiler's mosaic
reader). At the dev value of 512 MB per handle, even a few dozen concurrent
handles blow past a 2 Gi container; at the chart default of 5 MB the same fan-out
costs a few hundred MB at most.
The native budget¶
Per pod, the native caches cost roughly:
native ≈ WEB_CONCURRENCY × (GDAL_CACHEMAX + CPL_VSIL_CURL_CACHE_SIZE) # process-global
+ (concurrently open file handles) × VSI_CACHE_SIZE # per-handle
This must sit comfortably below resources.limits.memory, leaving the
remainder for the Python heap (the process-graph cubes). Worked example for the
chart defaults at a 2 Gi limit, single worker:
GDAL_CACHEMAX = 200 MB (process-global)
CPL_VSIL_CURL_CACHE_SIZE = 64 MB (process-global)
VSI_CACHE_SIZE = 5 MB × ~100 handles ≈ 500 MB (worst case)
--------------------------------------------------------------
native peak ≈ 0.2 + 0.06 + 0.5 ≈ 0.76 GB
remaining for heap ≈ 2 Gi − 0.76 ≈ 1.2 GB
If you raise WEB_CONCURRENCY (multiple uvicorn workers per pod), remember the
process-global caches multiply by it — GDAL_CACHEMAX × WEB_CONCURRENCY
alone can dominate the limit.
Recommended production values (Helm chart defaults)¶
The chart (deployment/k8s/charts/values.yaml) ships safe defaults:
env:
GDAL_CACHEMAX: "200" # 200 MB, process-global
VSI_CACHE: "TRUE"
VSI_CACHE_SIZE: "5000000" # 5 MB PER HANDLE
CPL_VSIL_CURL_CACHE_SIZE: "67108864" # 64 MB, pinned
Rules of thumb:
- Keep
VSI_CACHE_SIZEin the low single-digit MB. Never copy a dev value here. - Keep
GDAL_CACHEMAX + CPL_VSIL_CURL_CACHE_SIZE(×WEB_CONCURRENCY) to a small fraction of the limit. - Size
WEB_CONCURRENCYto the pod's CPU, then re-check this budget.
Dev profiles are not production¶
.env.cdse / .env.eoapi are local-developer profiles (single user, low
concurrency) and intentionally use a large VSI_CACHE_SIZE (512 MB) and
CPL_VSIL_CURL_CACHE_SIZE (200 MB) for throughput. Do not copy these into a
Helm release or container deployment — they are flagged inline in those files.
Guard¶
A Helm unit test (deployment/k8s/charts/tests/native_cache_test.yaml) asserts
the chart's native-cache env vars stay at their bounded defaults, so an
accidental bump (e.g. pasting a dev VSI_CACHE_SIZE) fails CI and forces a
conscious review.