Data Engineer - Python/PySpark
SOFTPATH TECH SOLUTIONS PVT LTD
All India • 4 weeks ago
Experience: 4 to 8 Yrs
Job Description
As a Data Engineer at the company based in Bangalore, you will be responsible for the following:
- **Data Pipeline Development:**
- Design, develop, and maintain scalable and reliable data pipelines using Python and PySpark for large-scale data processing.
- Build ETL/ELT workflows to ingest, process, and transform data from multiple sources into centralized data platforms.
- Ensure high availability and reliability of data pipelines for analytics and business intelligence applications.
- Automate and schedule data workflows using Apache Airflow or other scheduling tools.
- **Data Governance & Security:**
- Implement data governance policies and data quality checks across pipelines.
- Ensure data security, access controls, and compliance standards in Kubernetes-based environments.
- Maintain data lineage, metadata management, and data integrity across the platform.
- **Collaboration & Stakeholder Engagement:**
- Work closely with data scientists, analysts, and product teams to understand business requirements.
- Translate business requirements into scalable data engineering solutions.
- Support data-driven initiatives and analytics capabilities across the organization.
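The pipeline responsibilities above follow the classic extract-transform-load pattern. A minimal sketch in plain Python of that pattern (illustrative only; a pipeline at the scale this role describes would use PySpark DataFrames and Airflow scheduling, and the record fields and helper names here are hypothetical):

```python
# Minimal ETL sketch: ingest raw records, normalize them, and load them
# into a centralized store (here just an in-memory dict keyed by id).
# The sample data and field names are hypothetical.

def extract(raw_rows):
    """Ingest raw records from a source, dropping empty/malformed ones."""
    return [row for row in raw_rows if row]

def transform(rows):
    """Normalize fields: cast ids and amounts, clean customer names."""
    return [
        {
            "id": int(row["id"]),
            "customer": row["customer"].strip().lower(),
            "amount": float(row["amount"]),
        }
        for row in rows
    ]

def load(rows, store):
    """Upsert transformed rows into the target store, keyed by id."""
    for row in rows:
        store[row["id"]] = row
    return store

raw = [
    {"id": "1", "customer": " Acme ", "amount": "19.99"},
    {},  # malformed record, filtered out at extract time
    {"id": "2", "customer": "Globex", "amount": "5"},
]
warehouse = load(transform(extract(raw)), {})
```

In PySpark the same shape appears as `spark.read` (extract), chained DataFrame transformations (transform), and `df.write` (load), with Airflow invoking the job on a schedule.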
**Mandatory Skills & Qualifications:**
- 3.5 to 5+ years of experience in Data Engineering
- Strong programming experience in Python
- Hands-on expertise in PySpark for large-scale data processing
- Experience working with Apache Airflow or workflow scheduling tools
- Strong experience in Big Data technologies such as Hadoop, Trino, or Druid
- Expertise in query optimization and performance tuning using PySpark or Trino
- Experience designing and maintaining data pipelines and ETL frameworks
- Experience handling large-scale datasets (TB-level)
- Strong knowledge of distributed data processing and data engineering best practices
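The "query optimization and performance tuning" requirement largely comes down to principles such as filtering early (predicate pushdown), which engines like Spark and Trino apply automatically when queries allow it. A plain-Python sketch of the idea (data and column names are hypothetical):

```python
# Illustrative sketch of the "filter early" principle behind much
# PySpark/Trino tuning: pushing a predicate below a join shrinks the
# number of rows the join must process without changing the result.

orders = [
    {"order_id": i, "cust_id": i % 100, "region": "EU" if i % 2 else "US"}
    for i in range(10_000)
]
customers = [{"cust_id": c, "name": f"c{c}"} for c in range(100)]

def join(left, right, key):
    """Hash join: index the right side, then probe with the left side."""
    index = {r[key]: r for r in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]

# Naive plan: join everything, then filter -> the join touches 10,000 rows.
naive = [r for r in join(orders, customers, "cust_id") if r["region"] == "EU"]

# Pushed-down plan: filter first -> the join touches only 5,000 rows.
eu_orders = [o for o in orders if o["region"] == "EU"]
pushed = join(eu_orders, customers, "cust_id")
```

Both plans return identical rows; the pushed-down plan simply does half the join work, which is the effect partition pruning and pushdown deliver at TB scale.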
**Good to Have Skills:**
- Knowledge of SQL for data analysis and querying
- Experience with Kubernetes-based environments
- Exposure to data governance, security frameworks, and data quality tools
- Understanding of AI/ML workflows and data preparation for machine learning pipelines
- Experience with cloud platforms such as AWS, Azure, or GCP
- Familiarity with modern data stack and open-source data tools
Posted on: April 5, 2026