Senior Site Reliability Engineer

Mastercard • 12h ago

Pune, India • Hybrid • Full-time • Senior Level • 5+ yrs exp

Top focus

SRE

Our Purpose

Mastercard powers economies and empowers people in 200+ countries and territories worldwide. Together with our customers, we’re helping build a sustainable economy where everyone can prosper. We support a wide range of digital payments choices, making transactions secure, simple, smart and accessible. Our technology and innovation, partnerships and networks combine to deliver a unique set of products and services that help people, businesses and governments realize their greatest potential.

Title and Summary

Senior Site Reliability Engineer

Overview

The Mastercard Authentication Program owns how consumer authentication works across both in-store and e-commerce use cases. This role contributes to building and operating best-in-class authentication products such as ID Check, Token Authentication Service, and Token Authentication Framework, driving adoption, scale, and revenue growth. This position sits at the intersection of Site Reliability Engineering, platform engineering, and release engineering, ensuring our platforms are scalable, resilient, secure, and production-ready across hybrid environments (Kubernetes, cloud, and on-prem).

Role Summary

As a Senior Software Engineer (SRE), you will own the reliability, performance, and operational excellence of Kubernetes-based workloads and distributed systems. You will drive automation, observability, and continuous delivery practices, ensuring systems are highly available and scalable while enabling fast and reliable product delivery.

Key Responsibilities

Platform Reliability & Engineering
Ensure availability, performance, and resilience of workloads deployed on Kubernetes, on-premise, PCF, or cloud platforms, aligned to SLO/SLI objectives
Design, build, and operate scalable, secure, and highly available containerized platforms and services
Optimize capacity, performance, and cost efficiency across distributed systems

Automation, CI/CD & Release Engineering
Build and enhance CI/CD pipelines to enable fast, reliable, and repeatable deployments
Automate infrastructure and operations using Infrastructure as Code and GitOps practices
Plan and execute software releases across environments (Kubernetes, AWS, Azure, PCF, VMs)
Drive continuous improvement in release processes, deployment strategies, and development workflows

Observability, Performance & Operations
Implement strong monitoring, logging, and tracing to enable proactive issue detection
Conduct performance testing and tuning to ensure scalability and reliability
Lead incident management, root cause analysis (RCA), and postmortems to drive continuous improvement

Security & Governance
Ensure adherence to security, compliance, and enterprise standards across deployments
Implement workload security, secrets management, and runtime protections
Contribute to operational governance and platform standardization

Collaboration & Engineering Excellence
Partner with engineering, product, and operations teams to improve production readiness and system design
Support troubleshooting, debugging, and issue resolution across environments
Maintain clear documentation for release processes, pipelines, and operational procedures
Drive adoption of modern engineering practices to improve reliability, maintainability, and scalability

AI & Innovation
Leverage AI-driven tools and automation to enhance observability, incident response, release efficiency, and capacity planning
Identify and build AI-enabled use cases to improve operational efficiency and reliability outcomes

All About You

Experience & Skills
Strong experience in SRE / Release Engineering / Platform Engineering roles
Proven experience managing and operating workloads deployed on Kubernetes, on-premise (bare metal / VMs), and PCF, with good-to-have exposure to public cloud platforms (AWS, Azure)
Hands-on expertise in containerization (Docker) and Kubernetes orchestration
Strong experience with CI/CD tools, scripting (Shell/Python/Groovy), and automation frameworks
Solid knowledge of monitoring, observability, performance testing, and scaling of distributed systems (e.g., Dynatrace, Splunk, JMeter, LoadRunner)
Strong programming experience (preferably Java / Spring Boot)

Engineering Practices
Strong understanding of SDLC, Git-based workflows (Gitflow), and peer review practices
Expertise in troubleshooting and debugging (thread dumps, heap dumps, performance bottlenecks)
Familiarity with security principles and deployment of secure systems
Exposure to Infrastructure as Code, DevOps, and GitOps practices

Personal Attributes
Strong problem-solving and analytical skills
Excellent communication and collaboration abilities
Ability to work in a fast-paced, high-impact environment and manage multiple priorities
Self-driven, with the ability to work both independently and as part of a team

Our Teams & Values
Work in small, collaborative teams of engineers and product partners
Put customer success at the center of everything we do
Foster a diverse, inclusive, and high-performing culture
Believe in doing well by doing good, supporting ethical and responsible innovation

Corporate Security Responsibility

All activities involving access to Mastercard assets, information, and networks come with an inherent risk to the organization and, therefore, it is expected that every person working for, or on behalf of, Mastercard is responsible for information security and must:
Abide by Mastercard’s security policies and practices
Ensure the confidentiality and integrity of the information being accessed
Report any suspected information security violation or breach
Complete all periodic mandatory security trainings in accordance with Mastercard’s guidelines

Required skills

Kubernetes, AWS, Azure, Docker, CI/CD, Shell, Python, Groovy, Dynatrace, Splunk, JMeter, LoadRunner, Java, Spring Boot