Core Configuration Patterns & File Formats

Every production incident that starts with “it worked on my machine” traces back to configuration that loaded differently in two places. This section is the map for loading configuration in Python deterministically: where values come from, which source wins when they collide, and how to validate the result before a single request is served.

Figure: Python configuration loading pipeline. Configuration sources (OS environment, .env file, YAML/JSON/TOML, and CLI flags) feed a precedence resolver, then pydantic-settings validation, then the application.
Many sources, one deterministic precedence order, validated once before the app starts.

What this section covers

Topic | Why it matters | Go deeper
Environment variables | The 12-factor baseline; how Python reads and types os.environ safely | Environment Variables & os.environ
.env file management | Loading local secrets without mutating the process environment or committing them | .env File Management
Precedence rules | The deterministic order that decides which source wins a collision | Configuration Precedence Rules
YAML / JSON parsing | Parsing structured config safely without arbitrary-object execution | YAML & JSON Parsing Strategies
CI/CD config validation | Gating misconfiguration in the pipeline before it reaches production | CI/CD Config Validation

Environment variables are the baseline

A 12-factor process reads its configuration from the environment. The problem is that os.environ returns strings only, and a missing key raises KeyError at the worst possible moment. Read every value through a single typed accessor so the failure is explicit and early.

# config/env.py
import os

def require(key: str) -> str:
    try:
        return os.environ[key]
    except KeyError as exc:
        raise SystemExit(f"Missing required environment variable: {key}") from exc

DATABASE_URL = require("DATABASE_URL")  # fails fast at import, not mid-request

Key rule: never call os.getenv with a silent default for a value the app cannot run without. A wrong default is more dangerous than a crash. The full typing rules — booleans, ints, lists — live in Environment Variables & os.environ.
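The typing problem mentioned above can be sketched with three small parsers. This is a minimal illustration, not the accessor set from the linked page: the names env_bool, env_int, and env_list, and the accepted boolean spellings, are assumptions made for the example. A missing key still raises KeyError here; in real code it would be routed through an accessor such as require.

```python
import os
from typing import Mapping

# Assumed spellings; pick one set and document it for your project.
_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def env_bool(key: str, environ: Mapping[str, str] = os.environ) -> bool:
    """Parse a required boolean; reject anything outside the accepted spellings."""
    raw = environ[key].strip().lower()
    if raw in _TRUE:
        return True
    if raw in _FALSE:
        return False
    raise ValueError(f"{key}={raw!r} is not a recognised boolean")


def env_int(key: str, environ: Mapping[str, str] = os.environ) -> int:
    """Parse a required integer; int() raises ValueError on malformed input."""
    return int(environ[key].strip())


def env_list(key: str, environ: Mapping[str, str] = os.environ, sep: str = ",") -> list[str]:
    """Split a delimited value into trimmed, non-empty items."""
    return [item.strip() for item in environ[key].split(sep) if item.strip()]
```

Rejecting unknown boolean spellings matters because bool("false") is True in Python: any non-empty string is truthy, so a naive cast silently enables the flag.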

.env files belong in development, never in git

A .env file makes local development ergonomic, but it must never overwrite a value the platform already injected (an IAM role, a Kubernetes secret). Load it without clobbering the real environment.

# config/loader.py
from pathlib import Path
from dotenv import dotenv_values

# Read into an isolated dict; do NOT mutate os.environ blindly.
file_values = dotenv_values(Path(__file__).parent / ".env")

Key rule: platform-injected values must win over a developer’s .env. dotenv_values never touches os.environ; if you call load_dotenv instead, keep its default of override=False. See .env File Management for the gitignore and pre-commit setup.
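One way to combine the isolated dict with the real environment is sketched below. The helper name merge_dotenv is hypothetical; the only assumption about python-dotenv is that dotenv_values returns a dict whose values may be None for keys declared without a value.

```python
import os
from typing import Mapping, Optional


def merge_dotenv(
    file_values: Mapping[str, Optional[str]],
    environ: Mapping[str, str] = os.environ,
) -> dict[str, str]:
    """Return a merged view in which real environment values win over .env values."""
    # Keys declared without a value in the .env file arrive as None; skip them.
    merged = {k: v for k, v in file_values.items() if v is not None}
    # Platform-injected values overwrite anything the file supplied.
    merged.update(environ)
    return merged
```

The result is a plain dict, so os.environ itself is never modified and the merge can be unit-tested with ordinary dictionaries.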

Precedence is a contract, not an accident

When the same key is set in three places, the winner must be defined in advance and identical in every environment. The conventional order, highest priority first: CLI flags → OS environment → .env file → config file → hard-coded defaults.

# config/resolve.py
def resolve(key, cli, env, dotenv, config_file, defaults):
    # Sources are checked highest priority first; the first non-None hit wins.
    for source in (cli, env, dotenv, config_file, defaults):
        if key in source and source[key] is not None:
            return source[key]
    raise KeyError(key)

Key rule: document the order once and enforce it everywhere — drift between local and production precedence is the root cause of “works locally” bugs. Details in Configuration Precedence Rules.
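As an alternative to a hand-written loop, the same first-hit-wins rule can be sketched with the standard library's collections.ChainMap. The helper name layered and the sample keys are illustrative assumptions, not part of the original resolver.

```python
from collections import ChainMap
from typing import Any, Mapping


def layered(*sources: Mapping[str, Any]) -> ChainMap:
    """Build a first-hit-wins view; pass sources highest priority first."""
    # Drop None values so an unset CLI flag (argparse yields None) falls through.
    cleaned = [{k: v for k, v in s.items() if v is not None} for s in sources]
    return ChainMap(*cleaned)


# Hypothetical sample sources, highest priority first.
cli = {"port": None, "debug": True}
env = {"port": "9000"}
defaults = {"port": "8000", "debug": False, "region": "eu-west-1"}

config = layered(cli, env, defaults)
```

Here config["port"] comes from env because the CLI value is None, config["debug"] comes from the CLI, and config["region"] falls through to the defaults.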

Structured files: parse safely

YAML and JSON express nested configuration that flat environment variables cannot. The danger is yaml.load with an unsafe loader (yaml.Loader or yaml.UnsafeLoader), which can instantiate arbitrary Python objects from untrusted input.

# config/files.py
import yaml

# Explicit encoding keeps parsing identical across platforms.
with open("config.yaml", encoding="utf-8") as fh:
    data = yaml.safe_load(fh)   # never yaml.load() on untrusted input

Key rule: safe_load only, always. The nested-config patterns are covered in YAML & JSON Parsing Strategies.
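A loader that handles the three file formats named in this section could look roughly like this. The function name load_config_file is hypothetical, the suffix dispatch is one possible design, and tomllib is only available in the standard library from Python 3.11; the YAML branch assumes PyYAML is installed.

```python
import json
from pathlib import Path
from typing import Any


def load_config_file(path: Path) -> dict[str, Any]:
    """Parse a JSON, TOML, or YAML file and insist on a mapping at the top level."""
    suffix = path.suffix.lower()
    text = path.read_text(encoding="utf-8")
    if suffix == ".json":
        data = json.loads(text)
    elif suffix == ".toml":
        import tomllib  # standard library from Python 3.11

        data = tomllib.loads(text)
    elif suffix in {".yaml", ".yml"}:
        import yaml  # PyYAML, third-party

        data = yaml.safe_load(text)
    else:
        raise ValueError(f"Unsupported config format: {path.name}")
    # An empty YAML file parses to None and a JSON file may hold a list;
    # both are rejected here rather than deep inside the application.
    if not isinstance(data, dict):
        raise TypeError(f"{path.name} must contain a mapping at the top level")
    return data
```

Checking the top-level type at load time turns an empty or malformed file into an immediate, named error instead of an AttributeError later.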

Gate it in CI before it ships

Configuration errors should fail a pipeline, not a pod. A standalone validation step that instantiates the settings object catches missing keys and malformed values before the container is promoted — the full GitHub Actions and GitLab CI recipes are in CI/CD Config Validation.
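The standalone validation step might be sketched as follows. The Settings dataclass, its two fields, and the POOL_SIZE key are hypothetical stand-ins for a project's real settings model (for example one built on pydantic-settings); only DATABASE_URL appears earlier in this section.

```python
import os
import sys
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings object; a real project would use its own model."""

    database_url: str
    pool_size: int

    @classmethod
    def from_env(cls, environ: Mapping[str, str]) -> "Settings":
        return cls(
            database_url=environ["DATABASE_URL"],          # required: KeyError if absent
            pool_size=int(environ.get("POOL_SIZE", "5")),  # optional tuning value
        )


def main(environ: Mapping[str, str] = os.environ) -> int:
    """Return a process exit code: 0 when settings load, 1 otherwise."""
    try:
        Settings.from_env(environ)
    except (KeyError, ValueError) as exc:
        print(f"Invalid configuration: {exc!r}", file=sys.stderr)
        return 1
    return 0

# In the CI entry point: raise SystemExit(main())
```

A non-zero exit code is all a pipeline needs to stop the promotion, so the script stays independent of any particular CI system.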

Anti-patterns & common mistakes

  • Silent defaults for required values: os.getenv("DATABASE_URL", "sqlite:///dev.db") ships a dev database to production when the real var is misspelled.
  • override=True on dotenv loading: overwrites platform-injected secrets with stale local values.
  • yaml.load on untrusted input: a remote-code-execution vector; always safe_load.
  • Branching business logic on ENV strings: if os.environ["ENV"] == "prod" scatters environment knowledge across the codebase instead of into config.
  • Reading os.environ in twenty modules: fragments configuration and makes it untestable. Centralize on one settings object.
  • Committing .env: the single most common secret leak. Gitignore it and scan for it in pre-commit.
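To illustrate the alternative to branching on ENV strings, here is a small sketch in which each behaviour gets its own configuration key. The class FeatureSettings, the keys SEND_REAL_EMAIL and LOG_LEVEL, and the deliver function are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class FeatureSettings:
    """Hypothetical flags that replace scattered environment-name checks."""

    send_real_email: bool
    log_level: str


def load_features(environ: Mapping[str, str]) -> FeatureSettings:
    # Each behaviour has its own key, so no module needs to know the environment name.
    return FeatureSettings(
        send_real_email=environ["SEND_REAL_EMAIL"].strip().lower() == "true",
        log_level=environ["LOG_LEVEL"].upper(),
    )


def deliver(message: str, settings: FeatureSettings) -> str:
    # Business logic reads a flag from the settings object, never os.environ.
    return f"sent: {message}" if settings.send_real_email else f"dry-run: {message}"
```

Because deliver takes the settings object as an argument, a test can exercise both branches without touching the process environment.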

Decision flow: which source for which value?

Is the value a secret (password, token, key)?
├── Yes → inject from the platform's secret store (Vault / AWS Secrets Manager / Doppler);
│         mirror locally via an uncommitted .env file.
└── No → Is it environment-specific (URL, region, pool size)?
        ├── Yes → OS environment variable, typed and validated at startup.
        └── No → Is it structured / nested (feature maps, routing tables)?
                ├── Yes → a checked-in YAML/JSON/TOML file, parsed with a safe loader (yaml.safe_load for YAML).
                └── No → a hard-coded default inside the validated settings model.

CI/CD integration checklist

  1. Add a .env.example with dummy values and keep the real .env gitignored.
  2. Run a secret scanner (gitleaks or detect-secrets) as a pre-commit hook and a CI job.
  3. Add a pipeline stage that imports and instantiates the settings object; fail the build on any error.
  4. Run that stage with PYTHONWARNINGS=error so that deprecation and syntax warnings fail the build instead of scrolling past.
  5. Diff the settings schema between staging and production to catch drift before promotion.
  6. Block deployment if any required key is unset in the target environment.
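Steps 5 and 6 of the checklist reduce to set comparisons over key names. The sketch below assumes the key names for each environment have already been collected (how they are exported is project-specific); the function names missing_keys and key_drift are hypothetical.

```python
from typing import Iterable


def missing_keys(required: Iterable[str], present: Iterable[str]) -> list[str]:
    """Required keys that the target environment does not define, sorted for stable output."""
    return sorted(set(required) - set(present))


def key_drift(staging: Iterable[str], production: Iterable[str]) -> dict[str, list[str]]:
    """Keys defined in one environment but not in the other."""
    s, p = set(staging), set(production)
    return {"only_in_staging": sorted(s - p), "only_in_production": sorted(p - s)}
```

A pipeline job could fail the build whenever missing_keys returns a non-empty list or either drift list is non-empty. Comparing names only, never values, keeps secrets out of the CI logs.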

Bringing it together

Configuration in Python is reliable when it is boring: every value enters through one typed settings object, the precedence order is fixed and identical across environments, .env files stay local and uncommitted, structured files are parsed safely, os.environ is read once and typed, and CI gates reject mistakes before they ship. From here, layer type-safe validation with pydantic-settings and pull secrets from a managed store under enterprise secrets management.