Lead Data Engineer
Role Category: Data Engineer
Job Location: Jaipur/Remote
Job Summary
We are seeking a highly skilled Lead Data Engineer with expertise in PySpark, Spark, Databricks, and Cloud technologies to join our dynamic team. The ideal candidate will be responsible for designing, developing, and optimizing large-scale data pipelines while leading a team of data engineers. This role requires a deep understanding of big data processing, distributed computing, and cloud-based data solutions.
Key Responsibilities
- Lead and mentor a team of data engineers in designing, building, and maintaining scalable data pipelines.
- Develop, optimize, and maintain ETL/ELT processes using PySpark and Apache Spark.
- Architect and implement Databricks-based solutions for big data processing and analytics.
- Work with cloud platforms (AWS, Azure, or GCP) to build robust, scalable, and cost-effective data solutions.
- Collaborate with data scientists, analysts, and business stakeholders to understand data needs and deliver high-quality solutions.
- Ensure data security, governance, and compliance with industry standards.
- Optimize data processing performance and troubleshoot data pipeline issues.
- Drive best practices in data engineering, including CI/CD, automation, and monitoring.
Required Qualifications
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- 6+ years of experience in data engineering with a focus on big data technologies.
- Strong expertise in PySpark, Apache Spark, and Databricks.
- Hands-on experience with cloud platforms such as AWS (Glue, EMR, Redshift), Azure (Data Factory, Synapse, Databricks), or GCP (BigQuery, Dataflow).
- Proficiency in SQL, Python, and Scala for data processing.
- Experience in building scalable ETL/ELT data pipelines.
- Knowledge of CI/CD for data pipelines and automation tools.
- Strong understanding of data governance, security, and compliance.
- Experience in leading and mentoring data engineering teams.
Preferred Qualifications
- Experience with Kafka, Airflow, or other data orchestration tools.
- Knowledge of machine learning model deployment in a big data environment.
- Familiarity with containerization (Docker, Kubernetes).
- Certifications in cloud technologies (AWS, Azure, or GCP).
Why Join Us?
- Opportunity to work with cutting-edge big data technologies.
- Competitive salary and benefits.
- Collaborative and innovative work environment.
- Career growth and professional development opportunities.
- Opportunity for on-site work at client locations.