Job Description
Role Purpose
The purpose of this role is to design, test and maintain software programs for operating systems or applications that need to be deployed at the client end, and to ensure they meet 100% of quality assurance parameters.
Do
Experience
8–10 years in Site Reliability Engineering, DevOps, or Cloud Operations
5+ years hands-on experience with Microsoft Azure
Role Overview
We are seeking a Senior Azure Site Reliability Engineer (SRE) and DevOps Engineer to design, build, and operate highly available, scalable, and resilient systems on Microsoft Azure. The ideal candidate will have strong experience with observability tools, cloud-native services, automation, incident management, and SRE best practices. You will work closely with development, platform, and security teams to ensure reliability, performance, and operational excellence across our services.
Key Responsibilities
Reliability & Operations
- Ensure high availability, scalability, performance, and resilience of production systems running on Azure.
- Define, implement, and monitor SLIs, SLOs, and error budgets.
- Lead and participate in incident response, root cause analysis (RCA), and post-incident reviews.
- Drive continuous improvement by reducing toil through automation and engineering solutions.
Observability & Monitoring
- Implement and manage observability tools (ELK, Dynatrace, Splunk, Grafana, Prometheus, etc.) for end-to-end observability (APM, infrastructure monitoring, logs, metrics, RUM). Hands-on experience with these tools is a must.
- Create actionable dashboards, alerts, and anomaly detection to proactively identify issues.
- Integrate monitoring with incident response and escalation workflows.
Cloud & Platform Engineering (Azure)
- Design and operate solutions using Azure services such as:
  - Azure Kubernetes Service (AKS)
  - Virtual Machines, Scale Sets
  - Azure App Services, Functions
  - Azure Load Balancer / Application Gateway
  - Azure Monitor, Log Analytics
- Optimize system reliability, performance, and cost efficiency.
Automation & Infrastructure as Code
- Build and maintain Infrastructure as Code (IaC) using tools like ARM, Bicep, Terraform, or similar.
- Use configuration management tools such as Puppet, Ansible, etc.
- Automate operational tasks using scripting (PowerShell, Bash, Python).
- Improve CI/CD pipelines to support safe, reliable, and frequent deployments.
2. Perform coding and ensure optimal software/module development
- Determine operational feasibility by evaluating analysis, problem definition, requirements, software development and proposed software
- Develop and automate processes for software validation by setting up and designing test cases/scenarios/usage cases, and executing these cases
- Modify software to fix errors, adapt it to new hardware, improve its performance, or upgrade interfaces
- Analyze information to recommend and plan the installation of new systems or modifications of an existing system
- Ensure that code is free of errors, bugs, and test failures
- Prepare reports on programming project specifications, activities and status
- Ensure all codes are raised as per the norms defined for the project / program / account, with clear descriptions and replication patterns
- Compile timely, comprehensive and accurate documentation and reports as requested
- Coordinate with the team on daily project status and progress, and document it
- Provide feedback on usability and serviceability, trace the results to quality risks, and report them to the concerned stakeholders
3. Status Reporting and Customer Focus on an ongoing basis with respect to the project and its execution
- Capture all requirements and clarifications from the client for better quality work
- Take feedback on a regular basis to ensure smooth and on-time delivery
- Participate in continuing education and training to remain current on best practices, learn new programming languages, and better assist other team members
- Consult with engineering staff to evaluate software-hardware interfaces and develop specifications and performance requirements
- Document and demonstrate solutions by developing documentation, flowcharts, layouts, diagrams, charts, code comments and clear code
- Document all necessary details and reports formally, so that the software is properly understood from client proposal to implementation
- Ensure good quality of interaction with customers with respect to e-mail content, fault report tracking, voice calls, business etiquette, etc.
- Respond to customer requests in a timely manner, with no instances of complaints either internally or externally
Deliver
| No. | Performance Parameter | Measure |
| 1. | Continuous Integration, Deployment & Monitoring of Software | 100% error-free onboarding & implementation, throughput %, adherence to the schedule/release plan |
| 2. | Quality & CSAT | On-Time Delivery, Manage software, Troubleshoot queries, Customer experience, completion of assigned certifications for skill upgradation |
| 3. | MIS & Reporting | 100% on-time MIS & report generation |
Experience: 3-5 Years.
Reinvent your world. We are building a modern Wipro. We are an end-to-end digital transformation partner with the boldest ambitions. To realize them, we need people inspired by reinvention. Of yourself, your career, and your skills. We want to see the constant evolution of our business and our industry. It has always been in our DNA - as the world around us changes, so do we. Join a business powered by purpose and a place that empowers you to design your own reinvention.