Implementing custom validators for AWS ARNs and URLs

A typed str field accepts "arn:aws:lol" and "http://internal" without complaint — both are strings. Only a domain validator catches a malformed ARN or a non-TLS URL at startup instead of at the first AWS call. This page writes those validators, extending Custom Validators & Constraints.

Problem 1: a plain str accepts anything

# ANTI-PATTERN: malformed ARN passes validation, fails at runtime
from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    topic_arn: str        # "arn:aws:lol" is a valid str, and a broken ARN

The error surfaces later as an opaque AWS InvalidParameter, far from the config that caused it.

Problem 2: a greedy regex that backtracks

# ANTI-PATTERN: nested quantifiers cause catastrophic backtracking (ReDoS)
import re

ARN_RE = re.compile(r"^(arn:(.*)+:.*)+$")   # can hang on long malicious input

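Anchoring does not save this pattern. Nested quantifiers such as (.*)+ let the engine split the same text in exponentially many ways, and a backtracking engine like Python's re tries them all before rejecting crafted input that almost matches.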
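The growth is easy to observe on a small scale. The sketch below times the nested pattern and a flat, anchored pattern on inputs that are guaranteed to fail (no second colon). The helper names time_match and compare and the input lengths are illustrative assumptions; absolute timings depend on the machine, so none are quoted.

```python
import re
import time

# The anti-pattern from above and a flat, anchored alternative.
EVIL_RE = re.compile(r"^(arn:(.*)+:.*)+$")
SAFE_RE = re.compile(r"^arn:aws:[a-z0-9-]+:[a-z0-9-]*:\d{12}:[\w\-/:.*]+$")


def time_match(pattern, text):
    """Return (matched, elapsed_seconds) for one match attempt."""
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    return matched, time.perf_counter() - start


def compare(lengths=(12, 14, 16, 18)):
    """Time both patterns on failing inputs of growing length.

    Lengths are kept deliberately small: the nested pattern's cost
    grows very quickly, so larger values can take a long time.
    """
    rows = []
    for n in lengths:
        text = "arn:" + "a" * n  # no further colon, so neither pattern can match
        evil_ok, evil_s = time_match(EVIL_RE, text)
        safe_ok, safe_s = time_match(SAFE_RE, text)
        rows.append((n, evil_ok, evil_s, safe_ok, safe_s))
    return rows


if __name__ == "__main__":
    for n, _, evil_s, _, safe_s in compare():
        print(f"n={n:2d}  nested={evil_s:.6f}s  flat={safe_s:.6f}s")
```

On CPython the nested pattern's time should climb steeply as n grows while the flat pattern stays near constant. Keep n small when trying this.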

Secure implementation

# config/identifiers.py
import re
from pydantic import field_validator
from pydantic_settings import BaseSettings, SettingsConfigDict

# Anchored, no nested quantifiers: matching is linear in the input length.
# \Z (unlike $) rejects a trailing newline; re.ASCII keeps \d and \w to ASCII.
ARN_RE = re.compile(r"^arn:aws:[a-z0-9-]+:[a-z0-9-]*:\d{12}:[\w\-/:.*]+\Z", re.ASCII)

class Settings(BaseSettings):
    model_config = SettingsConfigDict(extra="forbid")
    topic_arn: str
    webhook_url: str

    @field_validator("topic_arn")
    @classmethod
    def check_arn(cls, v: str) -> str:
        if not ARN_RE.match(v):
            raise ValueError(f"topic_arn is not a well-formed ARN: {v!r}")
        return v

    @field_validator("webhook_url")
    @classmethod
    def check_https(cls, v: str) -> str:
        if not v.startswith("https://"):
            raise ValueError("webhook_url must use https")
        return v

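The ARN pattern is anchored and has no nested quantifiers, and none of the segments before the resource can match a colon, so there is only one way to split the input and matching runs in linear time. The URL check requires the https scheme. Each failure raises a ValueError whose message names the field, and pydantic's ValidationError also reports the rejected input.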
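One caveat with a prefix test: startswith("https://") also accepts the bare string "https://", which has no host. A slightly stricter check can parse the value first. The sketch below uses only the standard library; the function name require_https is a hypothetical helper, not part of pydantic, and it could be called from check_https.

```python
from urllib.parse import urlsplit


def require_https(value: str) -> str:
    """Return value unchanged if it is an https URL with a host, else raise.

    Hypothetical helper for illustration; intended to be called from a
    pydantic field validator such as check_https.
    """
    try:
        parts = urlsplit(value)
    except ValueError as exc:  # e.g. a malformed IPv6 literal
        raise ValueError(f"webhook_url is not a parseable URL: {value!r}") from exc
    if parts.scheme != "https":
        raise ValueError(f"webhook_url must use https, got scheme {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError(f"webhook_url has no host: {value!r}")
    return value
```

Note that urlsplit lowercases the scheme, so "HTTPS://..." passes here but would fail the prefix test. Whether that is desirable is a policy choice.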

Gotchas & version-specific behaviour

  • In pydantic v2 use @field_validator with @classmethod; the v1 @validator is deprecated.
  • For URLs you can also use pydantic’s HttpUrl/AnyUrl types, but a custom check gives a clearer message and lets you enforce https only.
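  • Anchor every pattern (^...$) and avoid nested quantifiers to stay ReDoS-safe. In Python, $ also matches just before a trailing newline, so use \Z or re.fullmatch when a trailing "\n" must be rejected.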
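  • The partition (aws, aws-us-gov, aws-cn) and the resource suffix vary. The pattern above hard-codes arn:aws:, so widen the partition segment if you target GovCloud or China regions; the region class already accepts names such as us-gov-west-1 and cn-north-1.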
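As a minimal sketch of the last point, the partition can become an explicit alternation while the other segments stay unchanged. The names MULTI_PARTITION_ARN_RE and is_arn are illustrative, and the partition list is an assumption covering the standard, China, and GovCloud partitions; adjust it to the partitions you actually deploy to.

```python
import re

# Assumption: only these three partitions are in scope.
MULTI_PARTITION_ARN_RE = re.compile(
    r"arn:(?:aws|aws-cn|aws-us-gov)"   # partition, as a fixed alternation
    r":[a-z0-9-]+"                     # service
    r":[a-z0-9-]*"                     # region (may be empty)
    r":\d{12}"                         # account ID
    r":[\w\-/:.*]+",                   # resource
    re.ASCII,                          # keep \d and \w to ASCII
)


def is_arn(value: str) -> bool:
    """True if the whole string is a well-formed ARN in an allowed partition."""
    # fullmatch anchors both ends, so no ^ or $ is needed in the pattern.
    return MULTI_PARTITION_ARN_RE.fullmatch(value) is not None
```

A fixed alternation adds no nested quantifier, so the pattern keeps its linear-time behaviour.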

Production parity checklist

  • Every ARN and URL field has an anchored, linear-time validator.
  • URLs enforce https://; no plaintext endpoints accepted.
  • extra="forbid" rejects stray keys.
  • A CI fixture builds the model with valid and invalid values to guard the regex.
  • Error messages name the field and the offending value.
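The CI item can start as a plain table of cases. The sketch below exercises only the regex with the standard library; a real fixture would build the Settings model instead, and the example ARNs and the failing_cases helper are illustrative assumptions. It uses fullmatch so that the trailing-newline case is rejected, which match combined with $ would let through.

```python
import re

ARN_RE = re.compile(r"^arn:aws:[a-z0-9-]+:[a-z0-9-]*:\d{12}:[\w\-/:.*]+$")

# Illustrative cases; replace with the ARNs your services actually use.
VALID = [
    "arn:aws:sns:us-east-1:123456789012:alerts",
    "arn:aws:sqs:eu-west-1:123456789012:orders.fifo",
    "arn:aws:iam::123456789012:role/service-role/app",  # empty region
]
INVALID = [
    "",
    "arn:aws:lol",
    "ARN:aws:sns:us-east-1:123456789012:alerts",    # wrong case
    "arn:aws:sns:us-east-1:1234:alerts",            # short account ID
    "arn:aws:sns:us-east-1:123456789012:",          # empty resource
    "arn:aws:sns:us-east-1:123456789012:alerts\n",  # trailing newline
]


def failing_cases():
    """Return every case whose result disagrees with the table."""
    wrong = [v for v in VALID if ARN_RE.fullmatch(v) is None]
    wrong += [v for v in INVALID if ARN_RE.fullmatch(v) is not None]
    return wrong


if __name__ == "__main__":
    bad = failing_cases()
    assert not bad, f"regex regression on: {bad!r}"
```

Any change to the pattern that flips one of these outcomes then fails the build rather than a deployment.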

Conclusion

Anchored, ReDoS-safe validators turn malformed ARNs and URLs into a clear startup failure instead of a runtime mystery. For URL-specific database and cache identifiers, see Validate Database and Redis URLs with Pydantic.