Cache Parameter Store values to reduce API calls

SSM Parameter Store enforces per-account throughput limits, so a process that re-reads its parameters on every request will hit ThrottlingException under load. A short-TTL in-memory cache avoids this. This page adds that cache to the custom settings source built in Load Pydantic Settings from AWS Parameter Store.

Problem 1: re-reading on every settings construction

# ANTI-PATTERN: builds Settings() per request, calling SSM each time
def handler(event):
    return Settings()      # fresh get_parameters_by_path on every invocation

Each Settings() call triggers a fresh SSM round-trip, so the process gets throttled as traffic grows.

Problem 2: caching forever

# ANTI-PATTERN: module-level read, never refreshed
PARAMS = boto3.client("ssm").get_parameters_by_path(Path="/myapp/")  # stale after update

Parameters that change (for example, a rotated SecureString) are never picked up until the process restarts.

Secure implementation

# config/ssm_cache.py
import time
import boto3
from pydantic import SecretStr
from pydantic_settings import PydanticBaseSettingsSource

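class CachedSSMSource(PydanticBaseSettingsSource):
    # Class-level state: shared by every instance created in this process.
    _cache: dict[str, object] = {}
    _loaded_at: float = 0.0
    TTL = 300                                  # seconds

    def __call__(self) -> dict[str, object]:
        now = time.monotonic()
        if self._cache and now - self._loaded_at < self.TTL:
            return self._cache                 # serve from cache within TTL
        ssm = boto3.client("ssm")
        fresh: dict[str, object] = {}
        # Paginate: each get_parameters_by_path response holds only one page.
        for page in ssm.get_paginator("get_parameters_by_path").paginate(
            Path="/myapp/", Recursive=True, WithDecryption=True,
        ):
            for p in page["Parameters"]:
                # Key by the last path segment: "/myapp/DB_HOST" -> "db_host".
                fresh[p["Name"].rsplit("/", 1)[-1].lower()] = p["Value"]
        # Assign on the class so later instances see the refreshed cache.
        type(self)._cache, type(self)._loaded_at = fresh, now
        return fresh

    def get_field_value(self, field, field_name):
        # Required by the base class; unused because __call__ returns the full mapping.
        return None, field_name, False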

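The TTL caps SSM reads at one paginated fetch per interval per process, while rotated values are still picked up within TTL seconds. The source itself returns plaintext strings (WithDecryption=True); a decrypted secret becomes a SecretStr only if the matching model field is declared as SecretStr.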

Gotchas & version-specific behaviour

  • Use time.monotonic() so wall-clock adjustments (NTP steps, manual changes) cannot stretch or shorten the TTL.
  • The class-level cache is per process; each worker in a multi-worker server caches independently, so SSM reads scale with the worker count.
  • Keep the TTL below any rotation interval so rotated SecureString values are picked up promptly.
  • The cache holds plaintext in memory only; never persist it.
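The TTL and clock behaviour described above can be checked without AWS. The following is a minimal sketch, assuming a hypothetical TTLCache helper with an injectable loader and clock; none of these names come from the original code, and the expiry comparison mirrors the one in CachedSSMSource.

```python
import time
from typing import Callable


class TTLCache:
    """Hypothetical helper: keeps one loader result for `ttl` seconds."""

    def __init__(
        self,
        loader: Callable[[], dict],
        ttl: float,
        clock: Callable[[], float] = time.monotonic,  # monotonic by default
    ) -> None:
        self._loader = loader
        self._ttl = ttl
        self._clock = clock
        self._value = None
        self._loaded_at = 0.0

    def get(self) -> dict:
        now = self._clock()
        if self._value is not None and now - self._loaded_at < self._ttl:
            return self._value          # still fresh: no loader call
        self._value = self._loader()    # first call or TTL elapsed: reload
        self._loaded_at = now
        return self._value


# Fake loader and fake clock so the behaviour is deterministic.
calls = []

def fake_loader() -> dict:
    calls.append(1)
    return {"db_host": f"host-v{len(calls)}"}

fake_now = [0.0]
cache = TTLCache(fake_loader, ttl=300, clock=lambda: fake_now[0])

first = cache.get()      # cache miss: loader runs
fake_now[0] = 299.0
second = cache.get()     # inside the TTL: served from memory
fake_now[0] = 300.0
third = cache.get()      # TTL elapsed: loader runs again

print(len(calls), first, third)
```

Injecting the clock keeps the test independent of real time; in production the default time.monotonic is left in place.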

Production parity checklist

  • TTL caps SSM calls and stays below the rotation interval.
  • Cache lives in memory only; nothing written to disk.
  • WithDecryption=True with scoped kms:Decrypt.
  • Secrets wrapped in SecretStr in the model.
  • Throughput stays within SSM limits under peak load.

Conclusion

A TTL cache keyed to the monotonic clock inside the SSM source avoids throttling while bounding staleness to one TTL interval. For the base source it caches, see Load Pydantic Settings from AWS Parameter Store.