Lead Data Engineer - Apache Spark
HiringEye
All India, Hyderabad • 1 month ago
Experience: 7 to 11 Yrs
Job Description
As a Data Engineer, your role will involve the following responsibilities:
- Leading the design and development of end-to-end data pipelines using Apache Spark (Batch and Streaming).
- Architecting and implementing real-time data ingestion frameworks using Kafka.
- Building scalable ETL/ELT workflows to support analytics, reporting, and data science initiatives.
- Developing and maintaining data models (conceptual, logical, physical) for enterprise data platforms.
- Optimizing Spark jobs for performance, reliability, and scalability.
- Ensuring data quality, governance, and security across all data flows.
- Driving best practices for coding standards, CI/CD, and cloud-based data architecture.
- Mentoring junior engineers and collaborating with cross-functional teams (Data Science, DevOps, Product).
- Troubleshooting complex data processing issues and providing technical leadership during incidents.
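The batch ETL responsibility above (extract, transform with data-quality checks, load) can be sketched in plain Python. This is only an illustrative sketch: the record fields, the quality rule, and the dict-as-warehouse target are hypothetical stand-ins; the role itself calls for Apache Spark at scale.

```python
# Minimal batch ETL sketch: extract raw rows, apply a data-quality gate,
# and load the clean result into a target store (a dict standing in for a table).
# All field names and the quality rule are illustrative assumptions.

def extract():
    # In a real pipeline this would read from files, Kafka, or a database.
    return [
        {"user_id": 1, "amount": "120.50", "country": "IN"},
        {"user_id": 2, "amount": "invalid", "country": "US"},
        {"user_id": 3, "amount": "80.00", "country": "IN"},
    ]

def transform(rows):
    # Data-quality gate: drop rows whose amount fails to parse as a number.
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # quarantine/skip bad records
        clean.append({**row, "amount": amount})
    return clean

def load(rows, target):
    # Idempotent upsert keyed by user_id, so reruns don't duplicate rows.
    for row in rows:
        target[row["user_id"]] = row
    return target

warehouse = {}
load(transform(extract()), warehouse)
```

The same extract/transform/load shape maps directly onto a Spark job, where `extract` becomes a DataFrame read, `transform` a chain of column expressions and filters, and `load` a write to the warehouse.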
Qualifications required for this role include:
- 7+ years of hands-on experience in Data Engineering.
- Strong working knowledge of Spark, Python, SQL, and API integration frameworks.
- Working experience with modern data architecture and modeling concepts, including cloud data lakes, data warehouses, and data marts.
- Familiarity with dimensional modeling, star schemas, and real-time/batch ETL pipelining.
- In-depth experience with Kafka for real-time data ingestion and streaming.
- Strong proficiency in SQL (analytical, performance tuning).
- Solid understanding of data modeling principles (OLTP, OLAP, dimensional modeling, star/snowflake schemas).
- Experience building large-scale distributed data processing systems.
- Hands-on experience with cloud platforms such as AWS / Azure / GCP (any).
- Knowledge of CI/CD, containerization (Docker), and orchestration tools (Airflow, Jenkins, etc.).
- Strong problem-solving, debugging, and leadership skills.
- Bachelor's or Master's degree in Computer Science, Engineering, or related field.
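One of the dimensional-modeling concepts listed above, the star schema, amounts to splitting denormalized records into a central fact table that references dimension tables by surrogate key. A minimal sketch, with hypothetical column names and data:

```python
# Star-schema sketch: split denormalized sales rows into one fact table
# referencing two dimension tables via surrogate keys.
# All column names and values are illustrative assumptions.

denormalized = [
    {"date": "2024-01-01", "product": "widget", "store": "HYD-1", "qty": 3},
    {"date": "2024-01-01", "product": "gadget", "store": "HYD-1", "qty": 1},
    {"date": "2024-01-02", "product": "widget", "store": "BLR-2", "qty": 5},
]

def build_dim(rows, column):
    """Assign a surrogate key to each distinct value of `column`."""
    values = sorted({r[column] for r in rows})
    return {v: key for key, v in enumerate(values, start=1)}

product_dim = build_dim(denormalized, "product")
store_dim = build_dim(denormalized, "store")

# The fact table keeps the measures plus foreign keys into the dimensions.
fact_sales = [
    {
        "date": r["date"],
        "product_key": product_dim[r["product"]],
        "store_key": store_dim[r["store"]],
        "qty": r["qty"],
    }
    for r in denormalized
]
```

In a snowflake schema, the dimension tables themselves would be further normalized (e.g. a store dimension referencing a separate region dimension); the fact table is unchanged.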
Preferred qualifications for this role may include:
- Experience with Delta Lake, Lakehouse architecture, or cloud-native data platforms.
- Exposure to NoSQL databases (Cassandra, MongoDB, DynamoDB).
- Knowledge of data governance, metadata management, and cataloging tools.
- Prior experience leading a technical team or project.
Skills Required
Apache Spark
Kafka
Python
SQL
Data modeling
ETL
Data governance
Data security
Data quality
Data processing
Data warehousing
Dimensional modeling
OLTP
OLAP
Distributed systems
AWS
Azure
GCP
Docker
Airflow
Jenkins
Metadata management
API Integration frameworks
CI/CD
Cloud-based data architecture
Data ingestion
Data lakes
Star schemas
Snowflake schemas
Real-time data processing
NoSQL databases
Delta Lake
Lakehouse architecture
Cataloging tools
Posted on: March 19, 2026