Caching AWS secrets in memory securely
Calling get_secret_value on every request burns through Secrets Manager rate limits and adds latency; caching it forever means you keep using a credential after it rotates. The fix is a thread-safe, TTL-bound, in-memory cache. This page extends AWS Secrets Manager Integration.
Problem 1: a fetch on every request
# ANTI-PATTERN: hammers the API and gets throttled
import boto3

def handler(event):
    secret = boto3.client("secretsmanager").get_secret_value(SecretId="prod/db")
    return connect(secret["SecretString"])  # one API call per invocation
Under load this hits ThrottlingException and adds a network round-trip to every request.
Problem 2: a cache with no expiry
# ANTI-PATTERN: never refreshes, so rotation breaks the app
_SECRET = boto3.client("secretsmanager").get_secret_value(SecretId="prod/db") # cached forever
Once the secret rotates, this module-level value is stale, and every new connection fails until the process is restarted or redeployed.
Secure implementation
# secrets/cache.py
import json
import threading
import time

import boto3
from pydantic import SecretStr

_client = boto3.client("secretsmanager")
_lock = threading.Lock()
_cache: dict[str, tuple[float, dict[str, SecretStr]]] = {}
TTL = 600  # seconds; keep below rotation interval


def get_secret(secret_id: str) -> dict[str, SecretStr]:
    now = time.monotonic()
    with _lock:  # thread-safe: one fetch under contention
        cached = _cache.get(secret_id)
        if cached and now - cached[0] < TTL:
            return cached[1]
        # cache miss or expired entry: fetch while holding the lock
        raw = _client.get_secret_value(SecretId=secret_id)["SecretString"]
        # str() so non-string JSON values (e.g. a numeric port) are wrapped too
        parsed = {k: SecretStr(str(v)) for k, v in json.loads(raw).items()}
        _cache[secret_id] = (now, parsed)
        return parsed
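One lock means concurrent requests for a missing or expired entry trigger a single fetch, not a thundering herd: the first caller fetches while the others wait, then read the fresh value. The trade-off is that the lock is held during the network call, so fetches for different secrets are serialized. The TTL bounds staleness, so the app re-fetches within TTL seconds of a rotation; SecretStr masks values in str() and repr(), which keeps them out of logs.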
Gotchas & version-specific behaviour
- Use time.monotonic(), not time.time(), so a clock adjustment cannot extend the TTL.
- The cache lives in memory only; never pickle it to disk or write it to a shared cache.
- In multi-process servers (gunicorn), each worker has its own cache and fetches independently, so API calls scale with the worker count; size the TTL accordingly.
- Set the TTL strictly below the Secrets Manager rotation interval so stale values expire fast.
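To make the sizing trade-off concrete, here is a rough back-of-the-envelope sketch. The helper names and the deployment numbers (4 workers, 3 secrets, a 30-day rotation interval) are assumptions for illustration; only the 600-second TTL comes from this page. A worker can keep serving a rotated value for up to TTL seconds, so a shorter TTL means less staleness but more API calls.

```python
def fetches_per_hour(workers: int, secrets: int, ttl_seconds: float) -> float:
    """Approximate steady-state ceiling on GetSecretValue calls per hour when
    every worker process keeps its own cache and refetches once per TTL."""
    return workers * secrets * 3600 / ttl_seconds


def ttl_is_rotation_safe(ttl_seconds: float, rotation_interval_seconds: float) -> bool:
    """True when the TTL is strictly shorter than the rotation interval."""
    return ttl_seconds < rotation_interval_seconds


# Hypothetical deployment: 4 gunicorn workers, 3 secrets, TTL of 600 s.
print(fetches_per_hour(workers=4, secrets=3, ttl_seconds=600))  # 72.0
# Hypothetical 30-day rotation interval, expressed in seconds.
print(ttl_is_rotation_safe(600, 30 * 24 * 3600))  # True
```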
Production parity checklist
- TTL is shorter than the rotation interval.
- Cache access is guarded by a lock; no duplicate concurrent fetches.
- Values wrapped in SecretStr, unwrapped only at the driver call.
- IAM scoped to GetSecretValue on the exact ARN.
- No secret is ever written to disk or a shared store.
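As an illustration of the IAM item, a least-privilege policy might look like the sketch below, built as a Python dict and printed as JSON. The ARN (region, account ID, and random suffix) is a made-up placeholder, not a value from this page.

```python
import json

# Hypothetical ARN for illustration only; substitute the real secret's ARN.
SECRET_ARN = "arn:aws:secretsmanager:us-east-1:123456789012:secret:prod/db-AbCdEf"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "secretsmanager:GetSecretValue",  # the only action the cache needs
            "Resource": SECRET_ARN,  # one exact ARN, no wildcard
        }
    ],
}

print(json.dumps(policy, indent=2))
```

If the secret is encrypted with a customer-managed KMS key, the role will likely also need kms:Decrypt on that key.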
Conclusion
A lock-guarded TTL cache on a monotonic clock keeps the app well under Secrets Manager rate limits while staying rotation-aware. Pair it with the rotation patterns so the app picks up new credentials automatically.