Reliability Dashboard
Observability • Chaos Engineering • SLOs
A transparency log of my portfolio's reliability engineering practices. Resilience is validated continuously with Chaos Mesh experiments and Prometheus monitoring.
Service Level Objectives (SLOs)
| Objective | Target | Status (30d) | Burn Rate |
|---|---|---|---|
| Availability (Uptime) | 99.9% | 99.99% ✅ | 0.1x (Safe) |
| Latency (P95) | < 200ms | 45ms ✅ | 0x |
| Error Rate (5xx) | < 0.1% | 0.00% ✅ | 0x |
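The Burn Rate column can be read as the share of the error budget being consumed. The dashboard does not state its exact formula, so the sketch below assumes the common definition (observed failure fraction divided by the failure fraction the target allows). The helper name `burn_rate` is hypothetical.

```python
def burn_rate(target_pct: float, observed_pct: float) -> float:
    """Return observed failure fraction divided by the allowed failure fraction.

    Assumed definition, not taken from the dashboard's own code:
    a value of 1.0 means the error budget is being spent exactly on schedule.
    """
    budget = 100.0 - target_pct        # allowed unavailability, in percent
    if budget <= 0:
        raise ValueError("target must be below 100%")
    consumed = 100.0 - observed_pct    # observed unavailability, in percent
    return consumed / budget


# Availability row: target 99.9%, observed 99.99% over 30 days.
rate = burn_rate(target_pct=99.9, observed_pct=99.99)
print(f"{rate:.1f}x")  # 0.1x
```

Under this definition the Availability row works out to the 0.1x shown in the table: 0.01% of observed unavailability against a 0.1% budget.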
Chaos Experiments
Pod Kill Experiment
Scenario: Randomly terminate a backend pod every 60s.
Result: The cluster self-healed, with no downtime observed.
Tool: Chaos Mesh
Status: PASSED ✅
[2025-12-22T14:48:00Z] Experiment: pod-kill-backend
[2025-12-22T14:48:01Z] Action: Pod backend-5f4d3a killed
[2025-12-22T14:48:02Z] Alert: PodRestarted (Severity: Info)
[2025-12-22T14:48:04Z] Recovery: New pod Running
[2025-12-22T14:48:05Z] SLO Check: Availability > 99.9% (TRUE)
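The recovery time for this run can be read straight off the log above. As a minimal sketch, the Python below parses those five lines and measures the gap between the `Action` and `Recovery` events. The `[timestamp] Event: detail` layout is inferred from the log itself, and the helper names are hypothetical.

```python
from datetime import datetime, timezone

LOG = """\
[2025-12-22T14:48:00Z] Experiment: pod-kill-backend
[2025-12-22T14:48:01Z] Action: Pod backend-5f4d3a killed
[2025-12-22T14:48:02Z] Alert: PodRestarted (Severity: Info)
[2025-12-22T14:48:04Z] Recovery: New pod Running
[2025-12-22T14:48:05Z] SLO Check: Availability > 99.9% (TRUE)
"""


def parse_line(line: str):
    """Split '[timestamp] Event: detail' into (datetime, event, detail)."""
    stamp, _, message = line.partition("] ")
    ts = datetime.strptime(stamp.lstrip("["), "%Y-%m-%dT%H:%M:%SZ")
    ts = ts.replace(tzinfo=timezone.utc)  # the trailing 'Z' means UTC
    event, _, detail = message.partition(": ")
    return ts, event, detail


def recovery_seconds(log_text: str) -> float:
    """Seconds between the first 'Action' and the first 'Recovery' event."""
    first_seen = {}
    for line in log_text.strip().splitlines():
        ts, event, _ = parse_line(line)
        first_seen.setdefault(event, ts)  # keep the earliest timestamp per event
    return (first_seen["Recovery"] - first_seen["Action"]).total_seconds()


print(recovery_seconds(LOG))  # 3.0
```

For this run the replacement pod was reported Running 3 seconds after the kill.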
Network Latency
Scenario: Inject a 200 ms delay into backend traffic.
Result: Latency SLO alert fired within 30s.
Tool: Chaos Mesh
Status: VERIFIED ✅
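The alert is expected to fire here because a 200 ms injected delay pushes every request past the 200 ms P95 target, whatever the baseline. The Python sketch below illustrates that reasoning with synthetic samples. The baseline values, the nearest-rank percentile method, and the helper names are assumptions for illustration, not the dashboard's actual alert rule.

```python
SLO_THRESHOLD_MS = 200    # P95 latency target from the SLO table
INJECTED_DELAY_MS = 200   # delay injected by the experiment


def p95(samples_ms):
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    ordered = sorted(samples_ms)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer math
    return ordered[rank - 1]


def breaches_slo(samples_ms) -> bool:
    """True when the P95 latency is at or above the SLO threshold."""
    return p95(samples_ms) >= SLO_THRESHOLD_MS


# Hypothetical baseline: 100 requests between 10 ms and 49 ms.
baseline = [10 + (i % 40) for i in range(100)]
# The same requests with the injected delay added to each one.
delayed = [latency + INJECTED_DELAY_MS for latency in baseline]

print("baseline breaches SLO:", breaches_slo(baseline))
print("delayed breaches SLO:", breaches_slo(delayed))
```

How quickly a real alert fires also depends on the scrape interval and the alert rule's evaluation window, which this sketch does not model.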
Hybrid Architecture
graph TD
  subgraph "AWS Production (Always On)"
    User((User)) -->|HTTPS| CF[CloudFront CDN]
    CF -->|Static Assets| S3[S3 Bucket]
    CF -->|API Calls| APIG[API Gateway]
    APIG -->|JSON| Lambda[Lambda Function]
    Lambda -->|Read/Write| DDB[(DynamoDB)]
  end
  subgraph "Reliability Lab (Ephemeral)"
    SRE((SRE/Admin)) -->|Start Lab| LabScript[lab.sh]
    LabScript -->|Provisions| EC2[EC2 Spot Node]
    EC2 -->|Hosts| K3s[K3s Cluster]
    subgraph "K8s Namespace: Default"
      BackendPod[Backend Service]
      VerifyPod[Reliability Service]
      VerifyPod -->|Synthetic Traffic| BackendPod
    end
    subgraph "K8s Namespace: Monitoring"
      Prom[Prometheus]
      Graf[Grafana]
      Prom -->|Scrape| BackendPod
    end
    subgraph "K8s Namespace: Chaos Mesh"
      Chaos[Chaos Daemon]
      Chaos -->|Pod Kill| BackendPod
    end
  end
  classDef aws fill:#ff990033,stroke:#ff9900,color:#fff
  classDef k8s fill:#326ce533,stroke:#326ce5,color:#fff
  classDef chaos fill:#ef444433,stroke:#ef4444,color:#fff
  class S3,Lambda,DDB,CF,APIG aws
  class K3s,BackendPod,VerifyPod,Prom,Graf k8s
  class Chaos chaos