Job Description
Role Purpose
OpenShift SRE
Primary job responsibilities
We are seeking a skilled OpenShift Site Reliability Engineer (SRE) to join our team. In this role, you will be responsible for ensuring the reliability, availability, and performance of our OpenShift-based container platforms and services, with a focus on automation. You will collaborate across teams such as Applications, Hardware, and Network; develop secure service architecture using cloud-native technologies; develop systems, primarily in Shell scripting, YAML, Ruby, Python, and Go, to prevent outages through automatic scanning and remediation; establish and enforce SRE best practices through platform constraints and high-fidelity system modeling; and participate in an on-call rotation.
Key Responsibilities:
- Deploy, manage, and maintain OpenShift clusters in production and development environments.
- Automate operational tasks using scripting and configuration management tools.
- Troubleshoot and resolve issues related to OpenShift, Kubernetes, and containerized applications.
- Troubleshoot and resolve issues encountered during OpenShift migrations.
- Document processes, runbooks, and best practices.
Required skills
- Hands-on experience with OpenShift and Kubernetes administration.
- Strong knowledge of Linux systems and networking.
- Experience with monitoring, logging, alerting, and observability tools (e.g., OpenTelemetry, Prometheus, Grafana, Splunk).
- Proficiency in scripting and infrastructure-as-code tools such as Python, Shell, Go, and Terraform.
- Familiarity with CI/CD tools (e.g., Jenkins, GitLab CI).
- Understanding of containerization (Docker) and microservices architecture.
- Experience with Ansible for configuration management and deployment.
- Good problem-solving and communication skills.
Do
- Provide technical solution support to the team to align on continuous integration (CI) and continuous deployment (CD) for applications
- Design and define the overall DevOps architecture/framework for project/module delivery as per the client's requirements
- Decide which DevOps tools and platforms need to be deployed, aligned to the customer’s requirements
- Create a tool deployment model for validating, testing and monitoring performance, and align or provision resources accordingly
- Define & manage the IT infrastructure as per the requirement of the supported software code
- Manage and drive the DevOps pipeline that supports the application life cycle across the DevOps toolchain, from planning, coding and building through testing, staging, release, configuration and monitoring
- Work with the team on the coding and scripting needed to connect the elements of code required to run a software release with operating systems and production infrastructure, with minimal disruption
- Ensure application configuration is onboarded from the planning stage through to release
- Integrate security across the entire DevOps lifecycle to ensure there is no cyber risk and that data privacy is maintained
- Provide customer support/service on the DevOps tools
- Provide timely support for internal and external customer escalations across multiple platforms
- Troubleshoot the various problems that arise in implementation of DevOps tools across the project/ module
- Perform root cause analysis of major incidents/ critical issues which may hamper project timeliness, quality or cost
- Develop alternate plans/ solutions to be implemented as per root cause analysis of critical problems
- Follow the escalation matrix/process as soon as a resolution becomes complicated or an issue remains unresolved
- Provide knowledge transfer, share best practices with the team, and motivate team members
- Team Management
  - Resourcing
    - Forecast talent requirements as per the current and future business needs
    - Hire adequate and right resources for the team
    - Train direct reports to make the right recruitment and selection decisions
  - Talent Management
    - Ensure 100% compliance to Wipro’s standards of adequate onboarding and training for team members to enhance capability & effectiveness
    - Build an internal talent pool of HiPos and ensure their career progression within the organization
    - Promote diversity in leadership positions
  - Performance Management
    - Set goals for direct reports, conduct timely performance reviews and appraisals, and give constructive feedback to direct reports
    - In case of performance issues, take necessary action with zero tolerance for ‘will’ based performance issues
    - Ensure that organizational programs like Performance Nxt are well understood and that the team is taking the opportunities presented by such programs for themselves and the levels below them
  - Employee Satisfaction and Engagement
    - Lead and drive engagement initiatives for the team
    - Track team satisfaction scores and identify initiatives to build engagement within the team
    - Proactively challenge the team with larger and enriching projects/initiatives for the organization or team
    - Exercise employee recognition and appreciation
Deliver
No. | Performance Parameter | Measure |
1. | Continuous Integration, Deployment & Monitoring | 100% error-free onboarding & implementation |
2. | CSAT | Manage service tools, troubleshoot queries, customer experience |
3. | Capability Building & Team Management | % trained on new-age skills, team attrition %, employee satisfaction score |
Experience: 5-8 years.
Reinvent your world. We are building a modern Wipro. We are an end-to-end digital transformation partner with the boldest ambitions. To realize them, we need people inspired by reinvention. Of yourself, your career, and your skills. We want to see the constant evolution of our business and our industry. It has always been in our DNA - as the world around us changes, so do we. Join a business powered by purpose and a place that empowers you to design your own reinvention.