Job Description
Role Purpose
The purpose of this role is to provide significant technical expertise in architecture planning and design of the concerned tower (platform, database, middleware, backup, etc.), as well as to manage its day-to-day operations.
Do
- Provide adequate support in architecture planning, migration and installation for new projects in own tower (platform/database/middleware/backup)
- Lead the structural/architectural design of a platform, middleware, database, backup, etc. according to the relevant system requirements to ensure a highly scalable and extensible solution
- Conduct technology capacity planning by reviewing the current and future requirements
- Utilize and leverage the new features of all underlying technologies to ensure smooth functioning of the installed databases and applications/ platforms, as applicable
- Define and implement disaster recovery plans, and create and implement backup and recovery plans
- Manage the day-to-day operations of the tower
- Manage day-to-day operations by troubleshooting issues, conducting root cause analysis (RCA) and developing fixes to prevent similar issues from recurring
- Plan for and manage upgrades, migration, maintenance, backup, installation and configuration functions for own tower
- Review the technical performance of own tower and deploy ways to improve efficiency, fine tune performance and reduce performance challenges
- Develop shift roster for the team to ensure no disruption in the tower
- Create and update SOPs, Data Responsibility Matrices, operations manuals, daily test plans, data architecture guidance etc.
- Provide weekly status reports to the client leadership team and internal stakeholders on database activities, covering progress, updates, status, and next steps
- Leverage technology to develop Service Improvement Plan (SIP) through automation and other initiatives for higher efficiency and effectiveness
Team Management
- Resourcing
  - Forecast talent requirements as per the current and future business needs
  - Hire adequate and suitable resources for the team
  - Train direct reports to make sound recruitment and selection decisions
- Talent Management
  - Ensure 100% compliance with Wipro’s standards of adequate onboarding and training for team members to enhance capability and effectiveness
  - Build an internal talent pool of HiPos and ensure their career progression within the organization
  - Promote diversity in leadership positions
- Performance Management
  - Set goals for direct reports, conduct timely performance reviews and appraisals, and give constructive feedback
  - Ensure that organizational programs like Performance Nxt are well understood and that team members, and the levels below them, take up the opportunities such programs present
- Employee Satisfaction and Engagement
  - Lead and drive engagement initiatives for the team
  - Track team satisfaction scores and identify initiatives to build engagement within the team
  - Proactively challenge the team with larger and enriching projects/initiatives for the organization or team
  - Recognize and show appreciation for employees' contributions
Deliver
| No | Performance Parameter | Measure |
| 1 | Operations of the tower | SLA adherence; Knowledge management; CSAT/Customer Experience; Identification of risk issues and mitigation plans |
| 2 | New projects | Timely delivery; Avoid unauthorised changes; No formal escalations |
Key Responsibilities
- Design & Develop Observability Solutions: Build and enhance telemetry pipelines for logs, metrics, and traces using industry-standard tools (Kafka, OpenTelemetry, Splunk).
- Instrument Applications: Implement observability best practices in infrastructure, applications and platforms.
- Apply Machine Learning: Design and implement machine learning models to analyze logs, metrics and traces for anomaly detection, predictive failure analysis and root cause analysis.
- Monitor & Analyze System Performance: Build real-time data visualization dashboards and alerts to track system health, detect anomalies, and support real-time troubleshooting.
- Work with Event-Driven Architectures: Integrate observability with messaging systems like Kafka, RabbitMQ, or Pulsar for real-time monitoring.
- Collaborate Across Teams: Work closely with SREs, DevOps, and development teams to improve system reliability and incident response.
- Security & Compliance: Ensure observability data is securely stored and compliant with relevant regulations (GDPR, HIPAA, etc.).
- Optimize Performance: Conduct root cause analysis and improve system observability to reduce downtime and improve response times.
Required Skills & Experience
- Data Science & Machine Learning experience: Hands-on proficiency in Python, TensorFlow, PyTorch, Scikit-learn, Pandas, NumPy.
- ETL: Extensive knowledge of data extraction, transformation, and loading techniques using Apache Airflow, Apache NiFi, Spark or similar tools.
- Observability Stack: Hands-on experience with Prometheus, Grafana, ELK Stack, Loki, OpenTelemetry, Jaeger, or Zipkin.
- Analytics: Experience with time-series analysis, predictive analytics and AI-driven observability.
- Cloud & Infrastructure: Experience with AWS, Azure, or GCP observability services (e.g., CloudWatch, Azure Monitor).
- Distributed Systems & Microservices: Understanding of Kubernetes, Docker, and Service Mesh technologies (Istio, Linkerd).
- Event-Driven Architectures: Experience with Kafka, RabbitMQ, or other message brokers.
- Database & Storage: Familiarity with time-series databases (InfluxDB, VictoriaMetrics) and NoSQL/SQL databases.
Experience: 5-8 years.
Reinvent your world. We are building a modern Wipro. We are an end-to-end digital transformation partner with the boldest ambitions. To realize them, we need people inspired by reinvention. Of yourself, your career, and your skills. We want to see the constant evolution of our business and our industry. It has always been in our DNA - as the world around us changes, so do we. Join a business powered by purpose and a place that empowers you to design your own reinvention.