Lead Data Engineer
Mastercard • 19h ago
Pune, India • Onsite • Full-time • Mid Level • 2+ yrs exp
Top focus
Data Engineer • VP Data • Data Warehouse Engineer • Senior Data Engineer
Our Purpose
Mastercard powers economies and empowers people in 200+ countries and territories worldwide. Together with our customers, we’re helping build a sustainable economy where everyone can prosper. We support a wide range of digital payments choices, making transactions secure, simple, smart and accessible. Our technology and innovation, partnerships and networks combine to deliver a unique set of products and services that help people, businesses and governments realize their greatest potential.
Title and Summary
Lead Data Engineer
Overview
Who is Mastercard?
Mastercard is a global technology company in the payments industry. Our mission is to connect and power an inclusive, digital economy that benefits everyone, everywhere by making transactions safe, simple, smart and accessible. Using secure data and networks, partnerships and passion, our innovations help individuals, financial institutions, governments and businesses realize their greatest potential.
The Mastercard Services organization is a key differentiator, delivering cutting-edge solutions used by some of the world’s largest organizations to make critical business decisions. Focused on innovation and scale, Services provides data-driven capabilities across consulting, analytics, experimentation and risk management.
Role Overview
Data Platform & Orchestration is seeking a Lead Data Engineer to design and build next-generation, cloud-native data platforms supporting Mastercard’s global data ecosystem. In this role, you will lead the development of scalable batch and real-time data pipelines, enabling efficient data processing across Data Lakes and Data Warehouses. You will work at the intersection of data engineering, cloud platforms and distributed systems, contributing to high-impact initiatives and driving engineering excellence.
This role is ideal for someone who thrives in a fast-paced, collaborative environment, enjoys solving complex data challenges and is passionate about building resilient, high-performance systems at scale.
Key Responsibilities
- Design and build scalable batch and real-time data pipelines using Spark, Kafka, and (preferred) Apache Flink
- Develop robust ETL/ELT frameworks for structured and unstructured data
- Build and optimize data ingestion and transformation pipelines for Data Lakes and Data Warehouses
- Implement stream processing solutions for near real-time use cases
- Ensure data quality, lineage, observability, and governance across pipelines
- Optimize data jobs for performance, scalability, and cost efficiency
- Design and operate cloud-native data platforms on AWS, Azure, or GCP
- Leverage managed services such as S3/ADLS/GCS, EMR/Databricks, BigQuery/Redshift/Snowflake
- Implement Infrastructure as Code (Terraform, CloudFormation, or equivalent)
- Ensure high availability, fault tolerance, and disaster recovery
- Drive cost optimization strategies for large-scale data workloads
- Implement secure data access controls aligned with enterprise standards
Platform & Engineering Excellence
- Build reusable data frameworks, libraries, and pipeline templates
- Drive adoption of CI/CD, automated testing, and observability
- Develop and enhance developer tooling and platform capabilities
- Contribute to cloud-agnostic platform architecture and automation
Technical Leadership & Collaboration
- Provide technical leadership, mentorship, and design guidance
- Conduct code reviews, architecture reviews, and best practice enforcement
- Collaborate with architects, product owners, and cross-functional teams
- Act as a Subject Matter Expert (SME) for data platform initiatives
- Promote engineering excellence through documentation, design standards, and innovation
- Work effectively across globally distributed teams
Required Skills & Qualifications
- Strong proficiency in Object-Oriented Programming and Design (OOP/OOAD) with Java (JDK 8+); Python and/or Go is a plus
- Experience building data services and distributed systems
- Strong understanding of multithreading, scalability, and performance tuning
- Strong hands-on experience with AWS, Azure, or GCP
- Experience with cloud-native data services (S3, ADLS, GCS, Databricks, EMR, BigQuery, Redshift)
- Strong experience with Apache Spark (Core, SQL, Structured Streaming)
- Hands-on experience with Kafka or equivalent messaging platforms
- Experience with real-time processing frameworks (Apache Flink preferred, or Spark Streaming)
- Strong understanding of ETL/ELT design patterns and pipeline architectures
- Experience with data formats (Parquet, Avro, ORC)
- Knowledge of data modeling (dimensional modeling, star/snowflake schemas)
- Proficiency in Infrastructure as Code (Terraform, CloudFormation, ARM templates)
- Experience with Docker and Kubernetes
- Solid understanding of cloud networking, IAM, and security best practices
- Experience with workflow orchestration tools (Airflow or equivalent)
- Strong SQL skills and experience with Data Warehouse platforms
- Understanding of data governance, lineage, and observability frameworks
- Experience with CI/CD tools (Jenkins, GitHub Actions, etc.)
- Strong testing practices (JUnit or equivalent frameworks)
- Experience with monitoring & observability (Splunk, Dynatrace, Prometheus, etc.)
- Familiarity with performance testing tools (JMeter, Gatling)
- Understanding of secure development practices (PCI DSS, GDPR, etc.)
- Proven ability to lead and mentor engineering teams
- Strong problem-solving and system design skills
- Passion for innovation, automation, and continuous improvement
- Ability to operate effectively in a fast-paced, global environment
Education
- Bachelor’s degree in Computer Science, Information Technology, Engineering, or a related field
Corporate Security Responsibility
All activities involving access to Mastercard assets, information, and networks come with an inherent risk to the organization and, therefore, it is expected that every person working for, or on behalf of, Mastercard is responsible for information security and must:
- Abide by Mastercard’s security policies and practices
- Ensure the confidentiality and integrity of the information being accessed
- Report any suspected information security violation or breach
- Complete all periodic mandatory security trainings in accordance with Mastercard’s guidelines
Required skills
Java • Python • Go • AWS • Azure • GCP • Apache Spark • Kafka • Apache Flink • ETL • ELT • Terraform • Docker • Kubernetes • SQL