This primer focuses on Python functionalities that solve DevOps problems: infrastructure automation, CI/CD scripting, API interaction, and log parsing. It assumes you understand basic programming concepts and focuses on practical application.
What you’ll learn: How to replace brittle Bash scripts with maintainable Python, interact with APIs and cloud providers, generate configs dynamically, and build CLI tools your team will actually use.
1. Environment Variables and Secrets
Before anything else: never hardcode credentials. Environment variables are the standard way to pass secrets and configuration into scripts.
```python
import os

# Get a required variable (raises KeyError if missing)
api_key = os.environ["API_KEY"]

# Get with a default fallback
env = os.getenv("ENVIRONMENT", "dev")

# Check if running in CI
if os.getenv("CI"):
    print("Running in CI pipeline")
```

For AWS specifically, use ~/.aws/credentials or set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. The SDK picks these up automatically; no code changes needed.
2. System Interaction: The “Better Bash”
DevOps engineers use Python to replace complex Bash scripts because Python offers better error handling and readability.
The subprocess Module
Stop using os.system(). The subprocess module is the standard for running shell commands.
```python
import subprocess

# Run a command and capture output
try:
    result = subprocess.run(
        ["ls", "-l", "/var/log"],
        capture_output=True,
        text=True,
        check=True,  # Raises CalledProcessError if return code != 0
    )
    print("STDOUT:", result.stdout)
except subprocess.CalledProcessError as e:
    print(f"Command failed with error: {e.stderr}")
```

Avoid `shell=True`

Using `shell=True` opens you to shell injection attacks. If you're tempted to use it because your command "doesn't work otherwise," you probably need to split your command into a proper list: `["ls", "-l"]`, not `"ls -l"`.
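If you're starting from a command string, the standard-library shlex module can do that splitting for you:

```python
import shlex
import subprocess

# shlex.split respects quoting, so arguments containing spaces survive intact
args = shlex.split('grep -r "connection refused" /var/log/myapp')
print(args)  # ['grep', '-r', 'connection refused', '/var/log/myapp']

subprocess.run(args, check=True)
```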
pathlib: Modern File Paths
String manipulation for file paths is error-prone. Use pathlib.
```python
from pathlib import Path

# Create a path object
log_dir = Path("/var/log/myapp")

# Check if exists, create if not
if not log_dir.exists():
    log_dir.mkdir(parents=True, exist_ok=True)

# Iterate over files
for file in log_dir.glob("*.log"):
    print(f"Found log: {file.name}")

# Combine paths safely (works on any OS)
config_file = Path.home() / ".config" / "myapp" / "settings.yaml"
```

Related: shutil provides high-level file operations like copying directory trees and creating archives.
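A short sketch of those shutil operations; the source and destination paths are made up for illustration:

```python
import shutil

# Copy an entire directory tree, including subdirectories
shutil.copytree("/etc/myapp", "/backup/myapp-config", dirs_exist_ok=True)

# Create /tmp/logs-backup.tar.gz from the contents of /var/log/myapp
shutil.make_archive("/tmp/logs-backup", "gztar", root_dir="/var/log/myapp")

# Check free disk space before writing large files
usage = shutil.disk_usage("/")
print(f"Free: {usage.free // (1024**3)} GiB")
```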
3. Data Serialization
Much of DevOps work is moving data between JSON, YAML, and other configuration formats.
JSON (Built-in)
Used for API payloads and cloud policies.
```python
import json

data = {"instance_id": "i-12345", "status": "running"}

# Dict to JSON string
json_str = json.dumps(data, indent=2)

# JSON string to dict
parsed_data = json.loads(json_str)

# Reading from a file
with open("config.json", "r") as f:
    config = json.load(f)
```
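Writing back out is the mirror image; json.dump serializes straight to a file handle:

```python
import json

config = {"instance_id": "i-12345", "status": "stopped"}

# Writing to a file; indent=2 keeps the output diff-friendly
with open("config.json", "w") as f:
    json.dump(config, f, indent=2)
```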
YAML

Used for Kubernetes, Ansible, and CI/CD configs.
```bash
pip install pyyaml
```

```python
import yaml

# Reading YAML
with open("deployment.yaml", "r") as f:
    k8s_config = yaml.safe_load(f)

print(k8s_config["spec"]["replicas"])

# Writing YAML
k8s_config["spec"]["replicas"] = 5
with open("deployment_updated.yaml", "w") as f:
    yaml.dump(k8s_config, f)
```

Always use `safe_load()`

Never use `yaml.load()` without a Loader; it can execute arbitrary code. `safe_load()` is the secure default.
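Kubernetes manifests often pack several documents into one file, separated by ---. PyYAML handles that with safe_load_all; a small sketch (the filename is illustrative):

```python
import yaml

# A multi-document manifest, e.g. a Deployment and a Service in one file
with open("manifests.yaml", "r") as f:
    for doc in yaml.safe_load_all(f):
        if doc:  # skip empty documents
            print(doc["kind"], doc["metadata"]["name"])
```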
4. Building CLIs
If you’re writing a script for your team, don’t make them edit the code to change variables. Use argparse to create flags.
```python
import argparse

def main():
    parser = argparse.ArgumentParser(description="Deploy Service Tool")

    # Positional argument
    parser.add_argument("service", help="Name of the service to deploy")

    # Optional flag
    parser.add_argument(
        "--env",
        choices=["dev", "staging", "prod"],
        default="dev",
        help="Target environment",
    )

    # Boolean flag
    parser.add_argument(
        "--force", action="store_true", help="Skip safety checks"
    )

    args = parser.parse_args()
    print(f"Deploying {args.service} to {args.env}...")
    if args.force:
        print("Force flag detected. Skipping checks.")

if __name__ == "__main__":
    main()
```

Usage:

```bash
python deploy.py my-api --env prod --force
```

Alternative: `click`

For more complex CLIs, consider the `click` library. It uses decorators and is more intuitive for nested commands and interactive prompts.
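A minimal click equivalent of the deploy tool above, for comparison:

```python
import click

@click.command()
@click.argument("service")
@click.option("--env", type=click.Choice(["dev", "staging", "prod"]), default="dev")
@click.option("--force", is_flag=True, help="Skip safety checks")
def deploy(service, env, force):
    """Deploy SERVICE to the target environment."""
    click.echo(f"Deploying {service} to {env}...")
    if force:
        click.echo("Force flag detected. Skipping checks.")

if __name__ == "__main__":
    deploy()
```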
5. Talking to Infrastructure: APIs
You will constantly interact with REST APIs (GitHub, Jenkins, Jira, cloud providers). The requests library is the standard.
```bash
pip install requests
```

```python
import requests
import os

token = os.environ["GITHUB_TOKEN"]
headers = {"Authorization": f"token {token}"}

# GET request
response = requests.get(
    "https://api.github.com/user/repos",
    headers=headers,
    timeout=30,  # Always set a timeout
)

if response.status_code == 200:
    repos = response.json()
    print(f"Found {len(repos)} repositories.")
else:
    print(f"Error: {response.status_code} - {response.text}")

# POST request
payload = {"name": "new-devops-tool", "private": True}
r = requests.post(
    "https://api.github.com/user/repos",
    json=payload,
    headers=headers,
    timeout=30,
)
```
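Many APIs, GitHub included, paginate their results. requests exposes parsed Link headers via response.links, which makes walking the pages straightforward; a sketch:

```python
import os
import requests

headers = {"Authorization": f"token {os.environ['GITHUB_TOKEN']}"}
url = "https://api.github.com/user/repos"
repos = []

# Follow the 'next' link until the API stops providing one
while url:
    response = requests.get(url, headers=headers, timeout=30)
    response.raise_for_status()
    repos.extend(response.json())
    url = response.links.get("next", {}).get("url")

print(f"Found {len(repos)} repositories across all pages.")
```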
Handling Failures

Real-world API calls fail. Handle network errors and implement retries.
```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def create_session_with_retries():
    session = requests.Session()
    retries = Retry(
        total=3,
        backoff_factor=1,  # Wait 1s, 2s, 4s between retries
        status_forcelist=[500, 502, 503, 504],
    )
    session.mount("https://", HTTPAdapter(max_retries=retries))
    return session

session = create_session_with_retries()
try:
    response = session.get("https://api.example.com/data", timeout=30)
    response.raise_for_status()  # Raises HTTPError for 4xx/5xx
except requests.exceptions.RequestException as e:
    print(f"Request failed: {e}")
```
6. Cloud Automation: AWS Boto3

If you use AWS, boto3 is unavoidable. It allows you to control AWS resources programmatically.
```bash
pip install boto3
```

```python
import boto3

# Initialize a client (uses credentials from env or ~/.aws/credentials)
s3 = boto3.client("s3")

# List buckets
response = s3.list_buckets()
for bucket in response["Buckets"]:
    print(f"Bucket Name: {bucket['Name']}")
```

EC2 Example: Stop Dev Instances
```python
import boto3

ec2 = boto3.resource("ec2", region_name="us-east-1")

filters = [{"Name": "tag:Environment", "Values": ["Dev"]}]
instances = ec2.instances.filter(Filters=filters)

for instance in instances:
    print(f"Stopping {instance.id}")
    instance.stop()
```
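list_buckets returns everything in one call, but most boto3 list/describe APIs page their results; paginators handle the loop for you. A sketch using S3's list_objects_v2 (the bucket name is hypothetical):

```python
import boto3

s3 = boto3.client("s3")

# Paginators iterate through every page of results transparently
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="my-log-bucket"):
    for obj in page.get("Contents", []):
        print(obj["Key"], obj["Size"])
```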
7. Templating with Jinja2

Generate config files dynamically (Nginx configs, Terraform files) based on variables. String concatenation is messy; Jinja2 is the solution. It's also what Ansible uses internally.
```bash
pip install jinja2
```

```python
from jinja2 import Template

nginx_template = """
server {
    listen {{ port }};
    server_name {{ domain }};

    location / {
        proxy_pass http://{{ upstream }};
    }
}
"""

t = Template(nginx_template)
conf = t.render(port=8080, domain="api.internal.local", upstream="127.0.0.1:3000")
print(conf)
```

For larger projects, load templates from files:
```python
from jinja2 import Environment, FileSystemLoader

env = Environment(loader=FileSystemLoader("templates/"))
template = env.get_template("nginx.conf.j2")
output = template.render(services=my_services_list)
```
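Rendered output usually needs to land on disk, and pathlib pairs naturally with Jinja2 here. A sketch that writes one config file per service; the template name and service list are made up:

```python
from pathlib import Path
from jinja2 import Environment, FileSystemLoader

env = Environment(loader=FileSystemLoader("templates/"))
template = env.get_template("nginx.conf.j2")

services = [
    {"domain": "api.internal.local", "port": 8080, "upstream": "127.0.0.1:3000"},
]

out_dir = Path("generated")
out_dir.mkdir(exist_ok=True)

for svc in services:
    conf = template.render(**svc)
    # One file per service, e.g. generated/api.internal.local.conf
    (out_dir / f"{svc['domain']}.conf").write_text(conf)
```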
8. Error Handling Patterns

DevOps scripts run unattended. They need to fail gracefully and provide useful output.
```python
import sys
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
)
logger = logging.getLogger(__name__)

def deploy_service(name: str, env: str) -> bool:
    """Deploy a service. Returns True on success."""
    try:
        logger.info(f"Deploying {name} to {env}")
        # ... deployment logic ...
        return True
    except PermissionError:
        logger.error(f"Permission denied deploying {name}")
        return False
    except Exception as e:
        logger.exception(f"Unexpected error deploying {name}: {e}")
        return False

def main():
    success = deploy_service("my-api", "prod")
    sys.exit(0 if success else 1)  # Exit codes matter for CI/CD

if __name__ == "__main__":
    main()
```

Key points:

- Use `logging` instead of `print()` for timestamps and log levels
- Return meaningful exit codes (`0` = success, non-zero = failure)
- Use `logger.exception()` to include stack traces
- Type hints make scripts maintainable as they grow
9. Quick Reference
| Library | Use Case |
|---|---|
| `os` / `sys` | Environment variables, exit codes |
| `pathlib` | File path manipulation |
| `shutil` | Copying directories, creating archives |
| `subprocess` | Running shell commands |
| `re` | Parsing logs with regex |
| `datetime` | Timestamps, uptime calculations |
| `json` | API payloads, configs (built-in) |
| `yaml` | Kubernetes, Ansible configs |
| `requests` | HTTP/REST API calls |
| `boto3` | AWS automation |
| `paramiko` | SSH connections (see the sketch below) |
| `jinja2` | Config file templating |
| `click` | Building CLIs (alternative to argparse) |
| `pytest` | Testing infrastructure code |
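paramiko appears in the table but not above, so here is a minimal sketch of running a remote command over SSH; the hostname, username, and key path are placeholders:

```python
import os
import paramiko

client = paramiko.SSHClient()
# Auto-accepting host keys is fine for a lab; pin known hosts in production
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(
    "server.example.com",
    username="deploy",
    key_filename=os.path.expanduser("~/.ssh/id_ed25519"),
)

# Run a command and read its output
stdin, stdout, stderr = client.exec_command("uptime")
print(stdout.read().decode())

client.close()
```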
10. Best Practices
Always use virtual environments. Never install into system Python.
```bash
python3 -m venv venv
source venv/bin/activate
pip install requests boto3
```

Add a shebang for executable scripts on Linux:
```python
#!/usr/bin/env python3
```

Use type hints to keep growing scripts maintainable:
```python
def restart_service(service_name: str, timeout: int = 30) -> bool:
    ...
```

Pin your dependencies in requirements.txt (for example, generated with pip freeze):
```
requests==2.31.0
boto3==1.28.0
pyyaml==6.0.1
```
Related
- devops-overview — DevOps philosophy and lifecycle
- devops-resources — Learning resources for DevOps
- docker-overview — Container fundamentals
- docker-to-cloud-run — Hands-on lab using Python concepts
- ACE Certification Plan — Apply Python to GCP automation
- GCE Mastery Roadmap — 20 Python SDK projects
- GCP Hierarchy Explorer — Python project example