Lead Data Engineer - Apache Spark
HiringEye
All India, Hyderabad • 1 month ago
Experience: 7 to 11 Yrs
Job Description
As a Data Engineer, your role will involve the following responsibilities:
- Leading the design and development of end-to-end data pipelines using Apache Spark (Batch and Streaming).
- Architecting and implementing real-time data ingestion frameworks using Kafka.
- Building scalable ETL/ELT workflows to support analytics, reporting, and data science initiatives.
- Developing and maintaining data models (conceptual, logical, physical) for enterprise data platforms.
- Optimizing Spark jobs for performance, reliability, and scalability.
- Ensuring data quality, governance, and security across all data flows.
- Driving best practices for coding standards, CI/CD, and cloud-based data architecture.
- Mentoring junior engineers and collaborating with cross-functional teams (Data Science, DevOps, Product).
- Troubleshooting complex data processing issues and providing technical leadership during incidents.
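The batch ETL responsibility above (extract, transform with data-quality checks, load) can be sketched in plain Python. This is only an illustrative sketch: the record fields, the quality rule, and the dict-as-warehouse target are hypothetical stand-ins; the role itself calls for Apache Spark at scale.

```python
# Minimal batch ETL sketch: extract raw rows, apply a data-quality gate,
# and load the clean result into a target store (a dict standing in for a table).
# All field names and the quality rule are illustrative assumptions.

def extract():
    # In a real pipeline this would read from files, Kafka, or a database.
    return [
        {"user_id": 1, "amount": "120.50", "country": "IN"},
        {"user_id": 2, "amount": "invalid", "country": "US"},
        {"user_id": 3, "amount": "80.00", "country": "IN"},
    ]

def transform(rows):
    # Data-quality gate: drop rows whose amount fails to parse as a number.
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # quarantine/skip bad records
        clean.append({**row, "amount": amount})
    return clean

def load(rows, target):
    # Idempotent upsert keyed by user_id, so reruns don't duplicate rows.
    for row in rows:
        target[row["user_id"]] = row
    return target

warehouse = {}
load(transform(extract()), warehouse)
```

The same extract/transform/load shape maps directly onto a Spark job, where `extract` becomes a DataFrame read, `transform` a chain of column expressions and filters, and `load` a write to the warehouse.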
Qualifications required for this role include:
- 7+ years of hands-on experience in Data Engineering.
- Strong working knowledge of Spark, Python, SQL, and API integration frameworks.
- Working experience with modern data architecture and modeling concepts, including cloud data lakes, data warehouses, and data marts.
- Familiarity with dimensional modeling, star schemas, and real-time/batch ETL pipelining.
- In-depth experience with Kafka for real-time data ingestion and streaming.
- Strong proficiency in SQL (analytical, performance tuning).
- Solid understanding of data modeling principles (OLTP, OLAP, dimensional modeling, star/snowflake schemas).
- Experience building large-scale distributed data processing systems.
- Hands-on experience with cloud platforms such as AWS / Azure / GCP (any).
- Knowledge of CI/CD, containerization (Docker), and orchestration tools (Airflow, Jenkins, etc.).
- Strong problem-solving, debugging, and leadership skills.
- Bachelor's or Master's degree in Computer Science, Engineering, or related field.
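One of the dimensional-modeling concepts listed above, the star schema, amounts to splitting denormalized records into a central fact table that references dimension tables by surrogate key. A minimal sketch, with hypothetical column names and data:

```python
# Star-schema sketch: split denormalized sales rows into one fact table
# referencing two dimension tables via surrogate keys.
# All column names and values are illustrative assumptions.

denormalized = [
    {"date": "2024-01-01", "product": "widget", "store": "HYD-1", "qty": 3},
    {"date": "2024-01-01", "product": "gadget", "store": "HYD-1", "qty": 1},
    {"date": "2024-01-02", "product": "widget", "store": "BLR-2", "qty": 5},
]

def build_dim(rows, column):
    """Assign a surrogate key to each distinct value of `column`."""
    values = sorted({r[column] for r in rows})
    return {v: key for key, v in enumerate(values, start=1)}

product_dim = build_dim(denormalized, "product")
store_dim = build_dim(denormalized, "store")

# The fact table keeps the measures plus foreign keys into the dimensions.
fact_sales = [
    {
        "date": r["date"],
        "product_key": product_dim[r["product"]],
        "store_key": store_dim[r["store"]],
        "qty": r["qty"],
    }
    for r in denormalized
]
```

In a snowflake schema, the dimension tables themselves would be further normalized (e.g. a store dimension referencing a separate region dimension); the fact table is unchanged.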
Preferred qualifications for this role may include:
- Experience with Delta Lake, Lakehouse architecture, or cloud-native data platforms.
- Exposure to NoSQL databases (Cassandra, MongoDB, DynamoDB).
- Knowledge of data governance, metadata management, and cataloging tools.
- Prior experience leading a technical team or project.
Skills Required
Apache Spark
Kafka
Python
SQL
Data modeling
ETL
Data governance
Data security
Data quality
Data processing
Data warehousing
Dimensional modeling
OLTP
OLAP
Distributed systems
AWS
Azure
GCP
Docker
Airflow
Jenkins
Metadata management
API Integration frameworks
CI/CD
Cloud-based data architecture
Data ingestion
Data lakes
Star schemas
Snowflake schemas
Real-time data processing
NoSQL databases
Delta Lake
Lakehouse architecture
Cataloging tools
Posted on: March 19, 2026