Core Configuration Patterns & File Formats
Every production incident that starts with “it worked on my machine” traces back to configuration that loaded differently in two places. This section is the map for loading configuration in Python deterministically: where values come from, which source wins when they collide, and how to validate the result before a single request is served.
What this section covers
| Topic | Why it matters | Go deeper |
|---|---|---|
| Environment variables | The 12-factor baseline; how Python reads and types os.environ safely | Environment Variables & os.environ |
| .env file management | Loading local secrets without mutating the process environment or committing them | .env File Management |
| Precedence rules | The deterministic order that decides which source wins a collision | Configuration Precedence Rules |
| YAML / JSON parsing | Parsing structured config safely without arbitrary-object execution | YAML & JSON Parsing Strategies |
| CI/CD config validation | Gating misconfiguration in the pipeline before it reaches production | CI/CD Config Validation |
Environment variables are the baseline
A 12-factor process reads its configuration from the environment. The problem is that os.environ returns strings only, and a missing key raises KeyError at the worst possible moment. Read every value through a single typed accessor so the failure is explicit and early.
# config/env.py
import os
def require(key: str) -> str:
    # Return the raw string, or stop the process with a clear message.
    try:
        return os.environ[key]
    except KeyError as exc:
        raise SystemExit(f"Missing required environment variable: {key}") from exc
DATABASE_URL = require("DATABASE_URL")  # fails fast at import, not mid-request
Key rule: never call os.getenv with a silent default for a value the app cannot run without. A wrong default is more dangerous than a crash. The full typing rules — booleans, ints, lists — live in Environment Variables & os.environ.
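The typed accessor idea extends naturally to non-string values. The following is a minimal sketch, not the linked page's implementation: the helper names (`require_str`, `require_int`, `require_bool`, `require_list`) and the accepted boolean spellings are assumptions chosen for illustration, and each helper accepts an optional mapping so it can be tested without touching the real environment.

```python
from __future__ import annotations

import os
from typing import Mapping

# Assumed spellings; adjust to whatever the team standardizes on.
_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def require_str(key: str, environ: Mapping[str, str] = os.environ) -> str:
    """Return the raw string or stop the process with a clear message."""
    try:
        return environ[key]
    except KeyError as exc:
        raise SystemExit(f"Missing required environment variable: {key}") from exc


def require_int(key: str, environ: Mapping[str, str] = os.environ) -> int:
    raw = require_str(key, environ)
    try:
        return int(raw)
    except ValueError as exc:
        raise SystemExit(f"{key} must be an integer, got {raw!r}") from exc


def require_bool(key: str, environ: Mapping[str, str] = os.environ) -> bool:
    raw = require_str(key, environ).strip().lower()
    if raw in _TRUE:
        return True
    if raw in _FALSE:
        return False
    # Anything else is rejected rather than guessed at.
    raise SystemExit(f"{key} must be a boolean flag, got {raw!r}")


def require_list(
    key: str, environ: Mapping[str, str] = os.environ, sep: str = ","
) -> list[str]:
    raw = require_str(key, environ)
    # Strip whitespace and drop empty items such as the one in "a,,b".
    return [item.strip() for item in raw.split(sep) if item.strip()]
```

Rejecting an unrecognized boolean spelling, instead of treating it as false, keeps a typo such as `DEBUG=ture` from silently changing behavior.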
.env files belong in development, never in git
A .env file makes local development ergonomic, but it must never overwrite a value the platform already injected (an IAM role, a Kubernetes secret). Load it without clobbering the real environment.
# config/loader.py
from pathlib import Path
from dotenv import dotenv_values
# Read into an isolated dict; do NOT mutate os.environ blindly.
file_values = dotenv_values(Path(__file__).parent / ".env")
Key rule: if you load the file into os.environ with load_dotenv instead, keep override=False (its default), because platform-injected values must win over a developer’s .env. See .env File Management for the gitignore and pre-commit setup.
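One way to combine the isolated dict with the real environment, so that the environment always wins, is sketched below. The function name `merge_env` is hypothetical, and the sketch takes plain mappings so it runs without python-dotenv installed; in practice `file_values` would be the result of `dotenv_values(...)`.

```python
from __future__ import annotations

from typing import Mapping, Optional


def merge_env(
    file_values: Mapping[str, Optional[str]],
    environ: Mapping[str, str],
) -> dict[str, str]:
    """Combine .env values with the real environment; the environment wins."""
    # dotenv_values() maps a key declared without a value to None; drop those.
    merged = {key: value for key, value in file_values.items() if value is not None}
    # update() overwrites existing keys, so platform-injected values take priority.
    merged.update(environ)
    return merged
```

A caller might then build its settings from `merge_env(file_values, os.environ)`. Neither input is modified, so the process environment stays exactly as the platform injected it.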
Precedence is a contract, not an accident
When the same key is set in three places, the winner must be defined in advance and identical in every environment. The conventional order, highest priority first: CLI flags → OS environment → .env file → config file → hard-coded defaults.
# config/resolve.py
def resolve(key, cli, env, dotenv, config_file, defaults):
    # Sources are checked highest priority first; the first non-None hit wins.
    for source in (cli, env, dotenv, config_file, defaults):
        if key in source and source[key] is not None:
            return source[key]
    raise KeyError(key)
Key rule: document the order once and enforce it everywhere — drift between local and production precedence is the root cause of “works locally” bugs. Details in Configuration Precedence Rules.
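The same first-hit-wins contract can also be expressed with the standard library's `collections.ChainMap`, which is an alternative to the hand-written loop rather than the approach the section prescribes. In this sketch the helper names (`_present`, `layered`) and all sample keys and values are made up for illustration.

```python
from __future__ import annotations

from collections import ChainMap
from typing import Any, Mapping


def _present(source: Mapping[str, Any]) -> dict[str, Any]:
    """Drop None values so an unset CLI flag does not mask lower layers."""
    return {key: value for key, value in source.items() if value is not None}


def layered(*sources: Mapping[str, Any]) -> ChainMap:
    """Build a lookup over sources given highest priority first."""
    return ChainMap(*(_present(source) for source in sources))


# Hypothetical sample layers, highest priority first.
cli = {"port": None, "debug": True}  # e.g. argparse leaves unset flags as None
env = {"port": "9000"}
dotenv = {"port": "8000", "region": "local"}
config_file = {"region": "eu-west-1", "pool_size": 5}
defaults = {"port": "8080", "pool_size": 10, "debug": False}

config = layered(cli, env, dotenv, config_file, defaults)
```

Because the order is fixed in one call, every environment that builds its configuration through `layered` resolves collisions identically; a key absent from every layer still raises `KeyError`.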
Structured files: parse safely
YAML and JSON express nested configuration that flat environment variables cannot. The danger is yaml.load with a full-power loader (yaml.Loader or yaml.UnsafeLoader), which can instantiate arbitrary Python objects from untrusted input.
# config/files.py
import yaml
with open("config.yaml") as fh:
    data = yaml.safe_load(fh)  # never yaml.load() on untrusted input
Key rule: safe_load only, always. The nested-config patterns are covered in YAML & JSON Parsing Strategies.
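Safe parsing only guarantees plain data types; it does not guarantee the shape the application expects. The sketch below checks the shape right after parsing. It uses the standard library's `json` module so it runs without PyYAML, on the assumption that the same checks would apply to the value returned by `yaml.safe_load`. The helper names `load_mapping` and `require_path` are hypothetical.

```python
from __future__ import annotations

import json
from typing import Any


def load_mapping(text: str) -> dict[str, Any]:
    """Parse JSON text and insist that the document root is a mapping."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError(
            f"config root must be a mapping, got {type(data).__name__}"
        )
    return data


def require_path(data: dict[str, Any], *path: str) -> Any:
    """Walk nested keys, reporting the dotted path that was missing."""
    node: Any = data
    for depth, part in enumerate(path, start=1):
        if not isinstance(node, dict) or part not in node:
            raise KeyError(".".join(path[:depth]))
        node = node[part]
    return node
```

For example, `require_path(cfg, "database", "pool", "size")` either returns the nested value or names the first missing segment, which is easier to act on than a bare `KeyError` raised deep inside request handling.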
Gate it in CI before it ships
Configuration errors should fail a pipeline, not a pod. A standalone validation step that instantiates the settings object catches missing keys and malformed values before the container is promoted — the full GitHub Actions and GitLab CI recipes are in CI/CD Config Validation.
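A standalone validation step can be as small as a function that returns an exit code. This is a sketch under stated assumptions, not one of the linked recipes: the key names in `REQUIRED` and `INTEGERS` are placeholders, and a real project would more likely instantiate its own settings object here.

```python
from __future__ import annotations

import sys
from typing import Mapping

# Placeholder key names for illustration only.
REQUIRED = ("DATABASE_URL", "SECRET_KEY")
INTEGERS = ("POOL_SIZE",)  # optional, but must parse as an integer when set


def validate(environ: Mapping[str, str]) -> list[str]:
    """Collect every problem instead of stopping at the first one."""
    errors: list[str] = []
    for key in REQUIRED:
        if not environ.get(key):
            errors.append(f"{key}: missing or empty")
    for key in INTEGERS:
        raw = environ.get(key)
        if raw is None:
            continue
        try:
            int(raw)
        except ValueError:
            errors.append(f"{key}: expected an integer, got {raw!r}")
    return errors


def main(environ: Mapping[str, str]) -> int:
    """Print problems to stderr and return a process exit code."""
    errors = validate(environ)
    for error in errors:
        print(f"config error: {error}", file=sys.stderr)
    return 1 if errors else 0
```

A pipeline job could run this as a script that ends with `sys.exit(main(os.environ))`; any non-zero exit code fails the stage. Reporting all errors in one run avoids the fix-one, rerun, find-the-next cycle.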
Anti-patterns & common mistakes
- Silent defaults for required values: os.getenv("DATABASE_URL", "sqlite:///dev.db") ships a dev database to production when the real variable is misspelled.
- override=True on dotenv loading: overwrites platform-injected secrets with stale local values.
- yaml.load on untrusted input: a remote-code-execution vector; always use safe_load.
- Branching business logic on ENV strings: if os.environ["ENV"] == "prod" scatters environment knowledge across the codebase instead of keeping it in config.
- Reading os.environ in twenty modules: fragments configuration and makes it untestable. Centralize on one settings object.
- Committing .env: one of the most common ways secrets leak. Gitignore it and scan for it in pre-commit.
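Centralizing on one settings object might look like the sketch below. The field names, the `from_env` constructor, and the defaults for the optional values are illustrative assumptions; only the required value refuses to fall back to a default.

```python
from __future__ import annotations

import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """The single place where raw environment strings become typed values."""

    database_url: str
    debug: bool
    pool_size: int

    @classmethod
    def from_env(cls, environ: Mapping[str, str] = os.environ) -> "Settings":
        # Required values get no default: a missing key is an error.
        if "DATABASE_URL" not in environ:
            raise RuntimeError("Missing required environment variable: DATABASE_URL")
        return cls(
            database_url=environ["DATABASE_URL"],
            # Optional values may carry defaults because the app can run without them.
            debug=environ.get("DEBUG", "false").strip().lower() in {"1", "true", "yes"},
            pool_size=int(environ.get("POOL_SIZE", "5")),
        )
```

Other modules would import the instance rather than read os.environ themselves, and tests can pass a plain dict to `from_env` instead of patching the process environment.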
Decision flow: which source for which value?
Is the value a secret (password, token, key)?
├── Yes → inject from the platform's secret store (Vault / AWS Secrets Manager / Doppler);
│         mirror locally via an uncommitted .env file.
└── No → Is it environment-specific (URL, region, pool size)?
    ├── Yes → OS environment variable, typed and validated at startup.
    └── No → Is it structured / nested (feature maps, routing tables)?
        ├── Yes → a checked-in YAML/JSON/TOML file, parsed with a safe loader (yaml.safe_load for YAML).
        └── No → a hard-coded default inside the validated settings model.
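The decision flow is a short chain of ordered checks, which can be written down as a function for use in reviews or linting. The `Source` enum and `choose_source` are hypothetical names introduced only to restate the tree above.

```python
from enum import Enum


class Source(Enum):
    SECRET_STORE = "platform secret store"
    ENVIRONMENT = "OS environment variable"
    CONFIG_FILE = "checked-in config file"
    DEFAULT = "hard-coded default"


def choose_source(
    *, is_secret: bool, environment_specific: bool, structured: bool
) -> Source:
    """Apply the questions in the same order as the decision tree."""
    if is_secret:
        return Source.SECRET_STORE
    if environment_specific:
        return Source.ENVIRONMENT
    if structured:
        return Source.CONFIG_FILE
    return Source.DEFAULT
```

The order of the checks matters: a secret that is also environment-specific still goes to the secret store, because that question is asked first.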
CI/CD integration checklist
- Add a .env.example with dummy values and keep the real .env gitignored.
- Run a secret scanner (gitleaks or detect-secrets) as a pre-commit hook and a CI job.
- Add a pipeline stage that imports and instantiates the settings object; fail the build on any error.
- Run that stage with PYTHONWARNINGS=error to surface deprecation and syntax warnings.
- Diff the settings schema between staging and production to catch drift before promotion.
- Block deployment if any required key is unset in the target environment.
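The schema-diff item can be approximated by comparing key names and value types while ignoring the values themselves, which legitimately differ between environments. This is a rough sketch; `schema_of` and `diff_schemas` are hypothetical helpers, and it only inspects top-level keys.

```python
from __future__ import annotations

from typing import Any, Mapping


def schema_of(settings: Mapping[str, Any]) -> dict[str, str]:
    """Reduce settings to key -> type name, discarding the actual values."""
    return {key: type(value).__name__ for key, value in settings.items()}


def diff_schemas(
    staging: Mapping[str, Any], production: Mapping[str, Any]
) -> list[str]:
    """List keys that exist on one side only or differ in type."""
    left, right = schema_of(staging), schema_of(production)
    problems: list[str] = []
    for key in sorted(left.keys() | right.keys()):
        if key not in right:
            problems.append(f"{key}: missing in production")
        elif key not in left:
            problems.append(f"{key}: missing in staging")
        elif left[key] != right[key]:
            problems.append(f"{key}: type differs ({left[key]} vs {right[key]})")
    return problems
```

A promotion job could fail whenever `diff_schemas` returns a non-empty list, printing each entry as the reason.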
Bringing it together
Configuration in Python is reliable when it is boring: every value enters through one typed settings object, the precedence order is fixed and identical across environments, .env files stay local and uncommitted, structured files are parsed safely, os.environ is read once and typed, and CI gates reject mistakes before they ship. From here, layer on type-safe validation with pydantic-settings and pull secrets from a managed store, as covered in enterprise secrets management.