beyond5nines is a field-notes blog for engineers who build and run production systems.

The name is a reminder that real reliability work happens beyond the uptime dashboard — in the failures, the investigations, and the fixes that never show up in an SLA report.


Who writes this

I’m Rahul. I run an SRE and data engineering practice based in New Jersey, working on distributed systems, data infrastructure, and cloud-native architectures. The posts here come from real incidents: things that broke in production, what we found when we looked, and what we changed to stop them from happening again.


What you’ll find here

Every post follows the same structure: the failure, the investigation, the fix. No theory without a production problem behind it.

Current series:

  • Look Ma, No Servers! — AWS Glue, serverless ETL, and the hidden constraints behind the abstraction

Open source

Tools built during these investigations are published at github.com/beyond5nines.

brahmagupta — a CLI for inspecting Spark shuffle files. Built during the investigation in Part 3 of Look Ma, No Servers!


Get in touch