Job Description
Job Title
Senior / Lead Site Reliability Engineer (SRE)
Job Summary
We are looking for an experienced Senior/Lead Site Reliability Engineer with 15+ years of experience supporting mission‑critical applications in highly regulated environments such as banking, finance, and healthcare. The role focuses on building and operating scalable, reliable, and observable systems across hybrid and multi‑cloud platforms, with a strong emphasis on Datadog/Dynatrace observability, automation, and incident management.
Key Responsibilities
- Design, implement, and operate end‑to‑end observability solutions using Datadog and/or Dynatrace (APM, Logs, RUM, Synthetics, Dashboards).
- Ensure high availability, reliability, and performance of production systems across AWS, Azure, GCP, and on‑premises environments.
- Develop and maintain automated monitoring, alerting, and incident response workflows, integrating with PagerDuty and ServiceNow.
- Reduce MTTR and alert fatigue through intelligent alerting, SLOs/SLIs, and error‑budget monitoring.
- Automate infrastructure provisioning and monitoring configuration using Terraform, CloudFormation, Ansible, and scripting (Python/Shell/PowerShell).
- Support containerized and serverless platforms including Kubernetes, EKS/ECS, Fargate, and Lambda.
- Lead incident response, war rooms, RCAs, and post‑mortems while collaborating with global SRE, Dev, and NOC teams.
- Ensure compliance with security and regulatory requirements (HIPAA, HITRUST, financial regulations).
- Partner with development teams to align reliability metrics with business KPIs and release processes.
Required Skills & Experience
- 12+ years in SRE, Production Support, or Cloud Reliability Engineering roles.
- Strong hands‑on experience with Datadog and/or Dynatrace observability platforms.
- Deep knowledge of cloud and hybrid infrastructure (AWS, Azure, Kubernetes, VMware).
- Solid experience with CI/CD pipelines, DevOps tools, and Infrastructure‑as‑Code.
- Proficiency in automation and scripting (Python, Shell, PowerShell).
- Experience in highly regulated environments (banking, finance, healthcare).
- Excellent troubleshooting, incident management, and stakeholder communication skills.
Preferred Certifications
- AWS Solutions Architect / SysOps Administrator
- Kubernetes (CKA / CKAD)
- Datadog Fundamentals
- Terraform Associate
- Azure Cloud Foundation
Experience: 5-8 Years.
The expected compensation for this role ranges from $60,000 to $135,000.
Final compensation will depend on various factors, including your geographical location, minimum wage obligations, skills, and relevant experience. Depending on the position, the role is also eligible for Wipro's standard benefits, including a full range of medical and dental benefit options, disability insurance, paid time off (inclusive of sick leave), and other paid and unpaid leave options.
Applicants are advised that employment in some roles may be conditioned on successful completion of a post-offer drug screening, subject to applicable state law.
Wipro provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state, or local laws. Applications from veterans and people with disabilities are explicitly welcome.
Reinvent your world. We are building a modern Wipro. We are an end-to-end digital transformation partner with the boldest ambitions. To realize them, we need people inspired by reinvention. Of yourself, your career, and your skills. We want to see the constant evolution of our business and our industry. It has always been in our DNA - as the world around us changes, so do we. Join a business powered by purpose and a place that empowers you to design your own reinvention.