Job Description
Responsibilities:
- Extract and document data lineage from DB2 and cloud data platforms.
- Analyze SQL queries, views, and stored procedures to derive lineage information.
- Design and develop Python-based lineage extraction and processing solutions.
- Publish lineage events to Apache Kafka for downstream consumption.
- Standardize lineage using the OpenLineage framework.
- Convert lineage events into graph models and store them in Azure Cosmos DB (Graph / Gremlin API).
- Integrate lineage with Microsoft Purview, Collibra, and other metadata management tools.
- Ensure accuracy, completeness, and reliability of lineage across systems.
Required Skills:
- Strong proficiency in Python and SQL.
- Hands-on experience in data lineage and metadata management.
- Expertise in Apache Kafka for event streaming.
- Familiarity with the OpenLineage standard.
- Experience integrating with Microsoft Purview and Collibra.
- Strong knowledge of REST APIs for integration and automation.
- Minimum 8 years of professional experience in data engineering or related fields.
Nice to Have:
- Knowledge of graph databases and graph concepts.
- Experience with Spark / GraphX for large-scale graph processing.
- Exposure to Azure cloud services and ecosystem.
Experience: 5-8 years.
Reinvent your world. We are building a modern Wipro. We are an end-to-end digital transformation partner with the boldest ambitions. To realize them, we need people inspired by reinvention: of yourself, your career, and your skills. We want to see the constant evolution of our business and our industry. It has always been in our DNA: as the world around us changes, so do we. Join a business powered by purpose and a place that empowers you to design your own reinvention.