Implementing custom validators for AWS ARNs and URLs
A field typed as str accepts "arn:aws:lol" and "http://internal" without complaint, because both are strings. Only a domain-specific validator catches a malformed ARN or a plaintext (non-TLS) URL at startup rather than at the first AWS call. This page writes those validators, extending Custom Validators & Constraints.
Problem 1: a plain str accepts anything
# ANTI-PATTERN: malformed ARN passes validation, fails at runtime
class Settings(BaseSettings):
    topic_arn: str  # "arn:aws:lol" is a valid str, and a broken ARN
The error surfaces later as an opaque AWS InvalidParameter, far from the config that caused it.
Problem 2: a greedy regex that backtracks
# ANTI-PATTERN: nested quantifiers — catastrophic backtracking (ReDoS)
ARN_RE = re.compile(r"^(arn:(.*)+:.*)+$") # can hang on long malicious input
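Anchoring does not help here: the nested quantifiers let the engine split the same input in exponentially many ways, so a crafted string that almost matches can take exponential time to reject.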
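To make the contrast concrete, here is a minimal stdlib-only sketch. `LOOSE_RE` is the anti-pattern above and `STRICT_RE` is the segment-by-segment pattern this page uses in the next section; the sample values and the `hostile` string are made up for illustration. The loose pattern is deliberately run only on a short input, since long non-matching strings are exactly where it can blow up.

```python
import re

# The anti-pattern from above: nested quantifiers, barely any structure.
LOOSE_RE = re.compile(r"^(arn:(.*)+:.*)+$")
# The segment-by-segment pattern: one character class per ARN field.
STRICT_RE = re.compile(r"^arn:aws:[a-z0-9-]+:[a-z0-9-]*:\d{12}:[\w\-/:.*]+$")

malformed = "arn:aws:lol"
# The loose pattern lets the malformed value through; the strict one does not.
loose_accepts = LOOSE_RE.match(malformed) is not None
strict_accepts = STRICT_RE.fullmatch(malformed) is not None

# Hypothetical hostile input: a long, almost-valid ARN with a bad tail.
hostile = "arn:aws:sns:us-east-1:123456789012:" + "a" * 100_000 + "\n!"
# Only the strict pattern is run on it; it can only back off one character
# at a time, so rejection costs time proportional to the input length.
hostile_accepted = STRICT_RE.fullmatch(hostile) is not None

print(loose_accepts, strict_accepts, hostile_accepted)
```

Besides the backtracking risk, the loose pattern does not check any of the ARN's structure, which is a second reason to replace it.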
Secure implementation
# config/identifiers.py
import re

from pydantic import field_validator
from pydantic_settings import BaseSettings, SettingsConfigDict

# Anchored, linear-time: every segment is a single character class, with no nested quantifiers.
ARN_RE = re.compile(r"^arn:aws:[a-z0-9-]+:[a-z0-9-]*:\d{12}:[\w\-/:.*]+$")


class Settings(BaseSettings):
    model_config = SettingsConfigDict(extra="forbid")

    topic_arn: str
    webhook_url: str

    @field_validator("topic_arn")
    @classmethod
    def check_arn(cls, v: str) -> str:
        # fullmatch, because "$" with match() would still accept a trailing newline.
        if not ARN_RE.fullmatch(v):
            raise ValueError(f"topic_arn is not a well-formed ARN: {v!r}")
        return v

    @field_validator("webhook_url")
    @classmethod
    def check_https(cls, v: str) -> str:
        if not v.startswith("https://"):
            raise ValueError("webhook_url must use https")
        return v
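The ARN pattern is anchored, and each segment is a single character class with no nested quantifiers, so matching runs in linear time. The URL check rejects any scheme other than https. Both raise a ValueError, which pydantic reports against the offending field together with the input value.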
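A prefix check still accepts values such as "https://" with no host. If that matters, one possible refinement is to parse the URL before checking it. The sketch below uses the standard library's `urllib.parse.urlsplit`; the helper name `require_https` and the example URLs are hypothetical, and this is an alternative to the `startswith` check above rather than part of the original code.

```python
from urllib.parse import urlsplit


def require_https(value: str) -> str:
    """Return value unchanged if it is an https URL with a host, else raise."""
    parts = urlsplit(value)  # may itself raise ValueError on a malformed netloc
    if parts.scheme != "https":
        raise ValueError("URL must use https")
    if not parts.hostname:
        raise ValueError("URL must include a host")
    return value


# Hypothetical values for illustration.
for candidate in ("https://hooks.example.com/notify", "http://internal", "https://"):
    try:
        require_https(candidate)
        print(candidate, "accepted")
    except ValueError as exc:
        print(candidate, "rejected:", exc)
```

Inside the model, `check_https` could simply `return require_https(v)`. Note that `urlsplit` lowercases the scheme, so this variant also accepts `HTTPS://...`, which the prefix check would reject.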
Gotchas & version-specific behaviour
- In pydantic v2 use @field_validator with @classmethod; the v1 @validator is deprecated.
- For URLs you can also use pydantic’s HttpUrl/AnyUrl types, but a custom check gives a clearer message and lets you enforce https only.
- Anchor every pattern (^...$) and avoid nested quantifiers to stay ReDoS-safe.
- The partition (for example arn:aws-us-gov:) and the resource suffix vary: the pattern above hard-codes arn:aws:, so widen the partition segment if you target GovCloud or China regions.
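As a sketch of the partition point, the pattern can list the accepted partitions explicitly. This assumes you only need the three partitions `aws`, `aws-cn`, and `aws-us-gov`. The names `ARN_ANY_PARTITION_RE` and `is_arn` are illustrative, and `re.ASCII` is an extra hardening choice (it stops `\d` and `\w` from matching non-ASCII digits and letters), not something the original pattern does.

```python
import re

# Assumption: only these three partitions are in scope.
ARN_ANY_PARTITION_RE = re.compile(
    r"arn:(?:aws|aws-cn|aws-us-gov)"  # partition: explicit allow-list
    r":[a-z0-9-]+"                    # service
    r":[a-z0-9-]*"                    # region (may be empty)
    r":\d{12}"                        # 12-digit account ID
    r":[\w\-/:.*]+",                  # resource
    re.ASCII,                         # \d and \w match ASCII only
)


def is_arn(value: str) -> bool:
    """True if the whole string is a well-formed ARN in an accepted partition."""
    return ARN_ANY_PARTITION_RE.fullmatch(value) is not None


# Placeholder account ID and resource names, for illustration only.
print(is_arn("arn:aws-us-gov:sns:us-gov-west-1:123456789012:alerts"))
print(is_arn("arn:aws-lol:sns:us-east-1:123456789012:alerts"))
```

The alternation is a fixed list of literals followed by a colon, so it adds no nested quantifier and the pattern stays linear-time.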
Production parity checklist
- Every ARN and URL field has an anchored, linear-time validator.
- URLs enforce https://; no plaintext endpoints accepted.
- extra="forbid" rejects stray keys.
- A CI fixture builds the model with valid and invalid values to guard the regex.
- Error messages name the field and the offending value.
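A minimal sketch of such a CI guard, using only the standard library so it runs anywhere. It exercises the ARN pattern directly rather than building Settings; in a real suite you would presumably construct the model and assert on pydantic's `ValidationError` instead. The case tables and the class name `ArnPatternTest` are illustrative, and the account ID is a placeholder.

```python
import re
import unittest

ARN_RE = re.compile(r"^arn:aws:[a-z0-9-]+:[a-z0-9-]*:\d{12}:[\w\-/:.*]+$")

# Illustrative case tables; extend them whenever the pattern changes.
VALID = [
    "arn:aws:sns:us-east-1:123456789012:alerts",
    "arn:aws:iam::123456789012:role/app",  # empty region segment
    "arn:aws:lambda:eu-west-1:123456789012:function:resize",
]
INVALID = [
    "arn:aws:lol",                                  # too few segments
    "arn:aws:sns:us-east-1:12345:alerts",           # account ID too short
    "arn:aws:sns:us-east-1:123456789012:alerts\n",  # trailing newline
    "http://internal",                              # not an ARN at all
]


class ArnPatternTest(unittest.TestCase):
    def test_valid_values_match(self):
        for arn in VALID:
            with self.subTest(arn=arn):
                self.assertIsNotNone(ARN_RE.fullmatch(arn))

    def test_invalid_values_are_rejected(self):
        for arn in INVALID:
            with self.subTest(arn=arn):
                self.assertIsNone(ARN_RE.fullmatch(arn))
```

Running `python -m unittest` against the file containing this class is enough to wire it into CI.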
Conclusion
Anchored, ReDoS-safe validators turn malformed ARNs and URLs into a clear startup failure instead of a runtime mystery. For URL-specific database and cache identifiers, see Validate Database and Redis URLs with Pydantic.