Handling nested configuration in YAML safely

Nested YAML is where two bugs hide: yaml.load with an unsafe loader can construct arbitrary Python objects and run code while parsing, and YAML 1.1 implicit typing turns unquoted on, no, and 3.10 into True, False, and the float 3.1. This page parses nested config safely and validates its shape. It extends YAML & JSON Parsing Strategies.

Problem 1: yaml.load executes objects

# ANTI-PATTERN: a crafted tag in the file runs code during parsing
import yaml

with open("config.yaml") as f:
    # Unsafe loader builds arbitrary Python objects: remote code execution risk
    config = yaml.load(f, Loader=yaml.UnsafeLoader)

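A document containing the tag !!python/object/apply:os.system makes an unsafe loader call os.system while the file is still being parsed. PyYAML 6 requires an explicit Loader argument, and releases before 5.1 defaulted to the unsafe loader when none was given. There is never a reason to use an unsafe loader on a file you do not fully control.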
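As a quick check of the difference, the sketch below feeds a tagged document to safe_load instead. The command string is a harmless placeholder chosen for illustration, and the variable names are hypothetical; the expectation is that the safe loader refuses the tag rather than constructing anything.

```python
import yaml

# Illustrative payload: the tag asks the loader to call os.system.
# The echo command is a harmless placeholder, and safe_load never runs it.
malicious = "!!python/object/apply:os.system ['echo pwned']"

try:
    yaml.safe_load(malicious)
    rejected = False
except yaml.YAMLError as exc:
    # The safe loader has no constructor for python/* tags, so parsing stops here.
    rejected = True
    print(f"rejected: {type(exc).__name__}")
```

Catching yaml.YAMLError keeps the handler broad enough to cover syntax errors in the same place.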

Problem 2: implicit typing changes values

# config.yaml — YAML 1.1 implicit typing surprises
feature:
  enabled: on        # parses to True, not the string "on"
  version: 3.10      # parses to float 3.1, dropping the trailing zero
  region: no         # parses to False (Norway country code!)

safe_load still applies YAML’s implicit typing, so version: 3.10 silently becomes 3.1. Quote ambiguous scalars and validate types explicitly.
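The retyping is easy to observe directly. The following sketch parses the sample above with safe_load and prints each value with its Python type; the pinned key is a hypothetical addition that shows what quoting changes.

```python
import yaml

doc = """
feature:
  enabled: on
  version: 3.10
  region: no
  pinned: "3.10"
"""

feature = yaml.safe_load(doc)["feature"]

# Print each parsed value next to its Python type.
for key, value in feature.items():
    print(f"{key}: {value!r} ({type(value).__name__})")
```

Only the quoted scalar survives as a string with its trailing zero intact.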

Secure implementation

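# config/nested.py
from pathlib import Path

import yaml
from pydantic import BaseModel, Field


class StrictModel(BaseModel):
    model_config = {"extra": "forbid"}   # inherited by every model below


class Feature(StrictModel):
    enabled: bool = False
    version: str                      # quote it in YAML; an unquoted 3.10 arrives as float 3.1
    region: str                       # quote it in YAML; an unquoted no arrives as False


class AppConfig(StrictModel):
    feature: Feature
    replicas: int = Field(ge=1, le=100)


def load(path: str = "config.yaml") -> AppConfig:
    text = Path(path).read_text(encoding="utf-8")
    raw = yaml.safe_load(text) or {}               # safe loader only; an empty file parses to None
    return AppConfig.model_validate(raw)           # validate the whole nested tree

safe_load removes the execution risk. The nested pydantic models then check every value against its declared type: a quoted version: "3.10" stays the string "3.10", while an unquoted 3.10 reaches the model as the float 3.1 and, under pydantic v2's default string validation, fails instead of being silently accepted. The model cannot restore the trailing zero, so quoting in the file is still required. extra="forbid" rejects unexpected keys only on the model that declares it, so nested models need the setting too, directly or through a shared base class.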
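To see what validation reports when the scalars are left unquoted, here is a minimal sketch assuming pydantic v2. The raw dict stands in for what safe_load returns for unquoted version: 3.10 and region: no, and the misspelled enabeld key is a hypothetical typo added for illustration.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class Feature(BaseModel):
    model_config = ConfigDict(extra="forbid")
    enabled: bool = False
    version: str
    region: str


# Stand-in for safe_load output: unquoted 3.10 became 3.1, unquoted no became False.
raw = {"enabled": True, "version": 3.1, "region": False, "enabeld": True}

try:
    Feature.model_validate(raw)
    errors = []
except ValidationError as exc:
    # Each entry carries the failing location and a machine-readable error type.
    errors = [(err["loc"], err["type"]) for err in exc.errors()]

for loc, kind in errors:
    print(loc, kind)
```

Both mistyped scalars and the unknown key are reported together, which makes a single failed run enough to fix the file.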

Gotchas & version-specific behaviour

  • Quote version-like scalars (version: "3.10") so YAML does not parse them as floats.
  • on/off/yes/no are booleans in YAML 1.1 — quote them if you mean strings.
  • safe_load returns None for an empty file; default to {} before validating.
  • Deep merges of multiple YAML files need an explicit recursive merge — YAML anchors do not merge across files.
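The last two points can be sketched together. The helper below, with the hypothetical name deep_merge, merges already-parsed dicts key by key; replacing lists and scalars wholesale is an assumption of this sketch, not a rule from YAML or any library. The sample dicts stand in for the results of safe_load on a base file, an override file, and an empty file.

```python
from copy import deepcopy
from typing import Any


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict in which nested mappings are merged key by key."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists are replaced wholesale, not concatenated.
            merged[key] = deepcopy(value)
    return merged


base = {"feature": {"enabled": False, "version": "3.10", "region": "no"}, "replicas": 2}
prod = {"feature": {"enabled": True}, "replicas": 10}
empty = None or {}   # an empty file parses to None, defaulted to {} before use

merged = deep_merge(deep_merge(base, prod), empty)
print(merged)
```

Merging before validation means the model sees one complete tree, so required keys may live in any of the files.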

Production parity checklist

  • Every config file is parsed with safe_load and validated by a nested model.
  • extra="forbid" is set on every model, including nested ones, so typo’d keys are rejected at any depth.
  • Ambiguous scalars are quoted to avoid implicit retyping.
  • A CI test loads and validates each config file so a bad edit fails the build.
  • Secrets are referenced from a secret store, never embedded in the file.
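One way the CI item might be implemented is a small helper that runs a loader over every file in a directory and collects the failures. The name find_invalid_configs, the *.yaml glob, and the stub loader are all assumptions made for this sketch; in a real suite the load argument would be the project's own loading function, such as the load shown earlier.

```python
import tempfile
from pathlib import Path
from typing import Callable


def find_invalid_configs(config_dir: str, load: Callable[[str], object]) -> dict[str, str]:
    """Run load on each *.yaml file and map failing file names to their errors."""
    failures: dict[str, str] = {}
    for path in sorted(Path(config_dir).glob("*.yaml")):
        try:
            load(str(path))
        except Exception as exc:
            failures[path.name] = f"{type(exc).__name__}: {exc}"
    return failures


# Stand-in loader for the demo: it only rejects empty files.
def _stub_load(path: str) -> None:
    if not Path(path).read_text(encoding="utf-8").strip():
        raise ValueError("empty config")


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "good.yaml").write_text("replicas: 1\n", encoding="utf-8")
    Path(tmp, "bad.yaml").write_text("", encoding="utf-8")
    failures = find_invalid_configs(tmp, _stub_load)

print(failures)
```

A test can then assert that the returned dict is empty, so the build log names every broken file at once.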

Conclusion

safe_load plus nested pydantic models with extra="forbid" makes nested YAML both safe to parse and correct once parsed. For format trade-offs, see TOML vs YAML vs JSON for Python Config.