
Site Reliability Champion, Specialist

Vanguard | 15h ago
Hyderabad, India | Hybrid | Full-time | Mid Level | 8+ yrs exp

Top focus

SRE

About Vanguard

Founded in 1975, Vanguard is one of the world's leading investment management companies. The firm offers investments, advice, and retirement services to tens of millions of individual investors around the globe: directly, through workplace plans, and through financial intermediaries.

Vanguard’s India Office

Vanguard’s office in India is a significant milestone in our global expansion. We are committed to establishing an enduring technology center in Hyderabad, Telangana, and are excited to be adding talent who will focus on Artificial Intelligence (AI), mobile, and cloud-based technologies that drive our business outcomes and deliver a world-class experience for our clients.

Role Summary

This role provides subject matter expertise and coordination to site reliability efforts across the subdivision. This role also includes ensuring system reliability by meeting service-level objectives (SLOs), driving automation of operational tasks, defining and tracking key performance indicators (KPIs), designing scalable systems, managing incident responses, and collaborating with development teams to ensure software reliability and scalability.

Responsibilities

Coordinates cross-product chaos experimentation. Maintains the centralized incident response playbook for the subdivision to document standards for managing communication and escalation during an incident. Aggregates quantifiable data about availability to report back to senior leadership.

Contributes to centrally managed (IT-wide) inner source libraries for reliability. Facilitates blameless post-incident reviews for high-severity incidents or incidents involving more than one product family. Regularly attends Reliability Engineering and Resilience communities of practice.

Remains informed about site reliability engineering activities happening within the subdivision. Communicates new standards and newly available tools and frameworks across subdivisions. Enforces reliability standards. Participates in special projects and performs other duties as assigned.

Drives automation of routine operational tasks to improve system efficiency, reduce manual intervention, and enhance deployment and monitoring workflows. Leads incident response efforts by diagnosing root causes rapidly, applying timely fixes, and establishing preventive measures to avoid recurrence.

Defines and tracks key system performance metrics such as availability, latency, and error rates to evaluate and optimize system health and reliability. Collaborates with development teams to align software architecture with reliability and scalability goals, ensuring seamless operations across the deployment lifecycle.

Qualifications and Skills

Minimum 8 years of experience in software engineering or site reliability roles, with at least 2 years in development or operational support functions
Bachelor’s degree (B.E./B.Tech) in Computer Science, IT, Software Engineering, or a related field; Master’s degree preferred
Cloud Platforms: AWS (EC2, ECS, Lambda, S3, CloudWatch)
Programming Languages: Java, Node.js
Front-End Frameworks: Angular
Containerization: Docker (ECS-based container patterns)
CI/CD & DevOps: Git, GitHub, JIRA, Confluence
Observability & Monitoring: Splunk, CloudWatch, Honeycomb
Automation & Testing: Cucumber, JUnit, Playwright, Puppeteer
Strong experience in incident management, system performance optimization, automation of operational tasks, reliability engineering principles, and cross-functional collaboration

Location

This role is based in Hyderabad, Telangana, at Vanguard’s India office.

Only qualified external applicants will be considered.

Our mission

Vanguard adheres to a simple purpose: To take a stand for all investors, to treat them fairly, and to give them the best chance for investment success.

Our commitment to you

Vanguard takes the same long-term view of your success, at work and in life, with Benefits and Rewards packages that reflect what you care about, throughout all the phases and stages of your life.

Our Total Rewards programs provide you and your loved ones with wellness support for key areas in your life:

Financial wellness
We're committed to enabling your financial success and provide competitive offers and programs.

Physical wellness
We're committed to providing benefits that support your physical health and wellness.

Personal wellness
We're committed to providing resources that help support the full scope of your life.

How we work

Vanguard has implemented a hybrid working model for most of our employees (crew members), designed to capture the benefits of enhanced flexibility while enabling in-person learning, collaboration, and connection.

We believe our mission-driven and highly collaborative culture is a critical enabler to support long-term client outcomes and enrich the employee experience.

Required skills

AWS, Java, Node.js, Angular, Docker, Git, GitHub, JIRA, Confluence, Splunk, Honeycomb, Cucumber, JUnit, Playwright, Puppeteer
Posted on JobRush.