We are looking for a Principal Cloud Architect who operates at the intersection of deep infrastructure engineering, platform reliability, and strategic solution design. This is a high-impact senior individual contributor position: you will be the organisation's foremost expert in diagnosing and resolving complex infrastructure incidents, designing cloud modernisation blueprints, and continuously raising the engineering bar. You will architect across AWS and Azure at an expert level, champion DevOps and SRE culture, lead cloud-native platform decisions, and serve as a technical thought leader on emerging technologies, including AI-driven infrastructure and FinOps practices.
This role is both hands-on and strategic. You are expected to write code, build prototypes, own architectural artefacts, and actively mentor senior engineers, while also influencing technology roadmaps and cross-functional engineering decisions at a principal level.
Responsibilities
1) Troubleshooting & reliability
- Own resolution of critical infra incidents across AWS & Azure
- Lead RCAs and produce actionable post-mortems
- Define and enforce SLOs, SLIs, and error budgets
- Build runbooks, playbooks, and on-call frameworks
2) Cloud architecture
- Design scalable, secure architectures for cloud workloads
- Architect hybrid and multi-cloud connectivity models
- Create reference architectures and golden paths
- Lead architectural reviews and produce ADRs
3) Infra modernisation
- Drive migration from legacy to cloud-native systems
- Champion IaC adoption at scale (Terraform / Bicep)
- Mature Kubernetes platform across EKS and AKS
- Lead FinOps and cloud cost optimisation initiatives
4) DevOps, observability & AI
- Define CI/CD, GitOps, and developer platform standards
- Drive observability using Grafana, Prometheus, OpenTelemetry
- Architect AI/ML-ready infra and integrate AIOps tooling
- Mentor engineers and influence the technology roadmap
Must have
- Expert in AWS and Azure architecture, networking, security
- Deep Kubernetes knowledge (EKS, AKS, RBAC, service mesh)
- Strong cloud networking (VPC/VNet, BGP, Private Link, ZTA)
- IaC at scale: Terraform, Pulumi, or CloudFormation/Bicep
- SRE practices: SLO/SLI, error budgets, chaos engineering
- Observability stack: Grafana, Prometheus, OpenTelemetry
- Scripting in Python and Shell/Bash
- Config management with Ansible (AWX/Tower)
Good to have
- AI/ML infra
- AIOps
- FinOps tools
- Databricks / Kafka
- Go / TypeScript
- Edge computing
Experience
- 11+ years in infra / cloud engineering (8+ in architecture)
- Led modernisation programmes end-to-end
- Owned P0/P1 incident resolution at scale
- Degree in CS/IT or equivalent practical experience
Preferred certifications
- AWS SA - Pro
- AZ-305
- CKA / CKS
- Terraform Associate
- AI-102
- FinOps CP
