SRE: SLIs, SLOs, and Automations That Actually Help
2026-02-06 | 15 min read
We will explore how to define SLIs and SLOs as code, deploy them with ArgoCD, and use MCP servers to automate SRE workflows...
SRE: Incident Management, On-Call, and Postmortems as Code
2026-02-23 | 21 min read
We will explore how to build an effective incident management workflow, set up on-call rotations that don't burn people out, write runbooks as code, and run blameless postmortems...
SRE: Observability Deep Dive: Traces, Logs, and Metrics
2026-02-28 | 16 min read
We will explore the three pillars of observability, how to instrument your applications with OpenTelemetry, build useful dashboards in Grafana, and set up log aggregation that actually helps during incidents...