Senior Software Development Engineer (Site Reliability)
We’re building a world of health around every individual — shaping a more connected, convenient and compassionate health experience. At CVS Health®, you’ll be surrounded by passionate colleagues who care deeply, innovate with purpose, hold ourselves accountable and prioritize safety and quality in everything we do.
Join us and be part of something bigger – helping to simplify health care one person, one family and one community at a time.

Position Summary

The Site Reliability Engineer (SRE) is responsible for ensuring the reliability, availability, performance, and operational scalability of the myPBM platform. This role applies software engineering practices to operations, with a focus on automation, observability, incident management, and continuous improvement to support the stable, scalable delivery of client-facing services. The SRE partners closely with DevOps, Engineering, Infrastructure, and Security teams to balance system reliability with delivery velocity while maintaining compliance with enterprise standards.

*We prefer that this person work hybrid in Richardson, TX; Northbrook, IL; or Scottsdale, AZ.

Primary Responsibilities

1. Reliability Engineering and Operations
- Ensure high availability, resiliency, and performance of myPBM applications and infrastructure.
- Define and manage SLIs, SLOs, and SLAs for critical services.
- Monitor production systems and proactively identify issues before customer impact.
- Lead incident response, triage, and root cause analysis (RCA).
- Drive continuous improvement to reduce repeat incidents and operational toil.

2. Monitoring, Observability, and Alerting
- Implement and maintain end-to-end observability across UI, APIs, and infrastructure layers.
- Build and manage monitoring solutions using:
  - AppDynamics (APM, RUM, synthetic monitoring)
  - Splunk (logs, dashboards, and error tracking)
- Design actionable alerts and escalation workflows using tools such as xMatters and MIR3.
- Standardize dashboards and ensure data accuracy and visibility.
- Continuously optimize alerting to reduce noise and improve signal quality.

3. DevSecOps and Release Engineering
- Support and enhance CI/CD pipelines, including GitHub Actions and enterprise pipeline solutions.
- Enforce deployment guardrails, release governance, and production readiness checks.
- Support build and deployment failure triage and rollback strategies.
- Partner with development teams to improve deployment reliability and automation.
- Ensure adherence to change management (CAB/SNOW) and release policies.

4. Infrastructure Engineering and Platform Stability
- Manage and support cloud infrastructure, including AKS, compute, storage, and networking.
- Ensure platform health, capacity monitoring, and performance optimization.
- Support infrastructure provisioning and environment setup.
- Drive disaster recovery (DR) readiness and failover validation, including RTO and RPO objectives.
- Enable application onboarding onto standardized enterprise platforms.

5. Security and Compliance
- Implement continuous security monitoring and vulnerability remediation.
- Manage secrets, certificates, and identity integration, including IAM onboarding.
- Ensure compliance with CVS security standards, audit requirements, and production readiness controls.
- Enforce shift-left security practices in CI/CD pipelines.

6. Incident Management and Support Model
- Participate in 24x7 on-call rotation and incident response.
- Partner with Production Support to resolve incidents.
- Ensure monitoring and alerting gaps are identified and closed.
- Maintain incident documentation and improve standard operating procedures.
- Support the full issue detection, triage, resolution, and prevention lifecycle.

7. Automation and Continuous Improvement
- Automate repetitive operational tasks to reduce toil.
- Implement infrastructure as code (IaC) practices.
- Continuously improve deployment pipelines, monitoring, and observability.
- Enable predictive insights and proactive issue prevention.

8. Collaboration and Platform Enablement
- Work closely with engineering, DevOps, infrastructure, and security teams.
- Enable a shared ownership model for reliability and operations.
- Provide guidance on production readiness and operational best practices.

Required Qualifications

5+ years of experience in site reliability engineering, DevOps, or platform engineering, including the following:

Experience with:
- Monitoring and observability tools such as Splunk and AppDynamics
- Cloud platforms, preferably Azure, including AKS and Kubernetes
- CI/CD pipelines such as GitHub Actions, Jenkins, or similar tools

Strong understanding of:
- Incident management and root cause analysis
- Monitoring, alerting, and logging practices
- Infrastructure and networking fundamentals

Scripting experience with Python, Bash, or PowerShell.

Preferred Qualifications

- Experience in healthcare or other regulated environments.
- Knowledge of site reliability engineering principles, including SLIs, SLOs, and error budgets.
- Familiarity with DevSecOps practices and compliance requirements.
- Experience supporting large-scale distributed systems.

Education

Bachelor's degree or equivalent experience.

Anticipated Weekly Hours
40

Time Type
Full time

Pay Range

The typical pay range for this role is: $92,700.00 - $203,940.00

This pay range represents the base hourly rate or base annual full-time salary for all positions in the job grade within which this position falls.
The actual base salary offer will depend on a variety of factors including experience, education, geography and other relevant factors. This position is eligible for a CVS Health bonus, commission or short-term incentive program in addition to the base pay range listed above.

Our people fuel our future. Our teams reflect the customers, patients, members and communities we serve and we are committed to fostering a workplace where every colleague feels valued and that they belong.

Great benefits for great people

We take pride in offering a comprehensive and competitive mix of pay and benefits that reflects our commitment to our colleagues and their families. This full-time position is eligible for a comprehensive benefits package designed to support the physical, emotional, and financial well-being of colleagues and their families. The benefits for this position include medical, dental, and vision coverage, paid time off, retirement savings options, wellness programs, and other resources, based on eligibility. Additional details about available benefits are provided during the application process and on Benefits Moments.

We anticipate the application window for this opening will close on: 07/19/2026

Qualified applicants with arrest or conviction records will be considered for employment in accordance with all federal, state and local laws.