Senior SRE - Government Cloud Operations
Cato Networks • 4h ago
United States • Onsite • Full-time • Senior Level • 7+ yrs exp
Top focus
SRE • Cloud Engineer
- Welcome to the future of cloud networking and security!
- Cato Networks is the first company to converge enterprise networking and security into one centralized, global service delivered from the cloud. It is led by networking and security pioneer Shlomo Kramer (Check Point, Imperva), who was also an early investor in Palo Alto Networks, Exabeam, Trusteer and more. Cato’s unique technology inspired a brand-new product category, later named “SASE” by Gartner, in a market expected to reach $28.5 billion by 2028.
- This is your opportunity to get on the rocket ship and join a company that is building a cutting-edge enterprise network and secure cloud platform and is on a fast track to becoming the worldwide market leader. Don’t miss it!
- Description
- Now we’re seeking a Senior Site Reliability Engineer with hands-on experience building and sustaining regulated cloud platforms through FedRAMP High / IL4 operational lifecycles, including continuous monitoring and post-ATO operational management.
- In this critical role, you will support our growing operations, network, and systems environments. You will play a pivotal role in administering internal platforms while participating in key architectural and operational decisions. This position offers the opportunity to innovate, establish best-practice processes, and continuously improve the reliability, security, and compliance posture of our regulated cloud environments.
- Responsibilities
- Own production operations for mission-critical services, including availability, latency, scalability, and operational health across complex distributed systems.
- Design, build, and operate highly available cloud infrastructure supporting regulated environments, including FedRAMP High / IL4+ deployments.
- Lead major incident response, root cause analysis, and postmortem remediation; drive operational maturity through change governance, disaster recovery testing, and service resiliency programs.
- Operationalize compliance requirements, including NIST 800-53 controls and STIG baselines, across Kubernetes platforms, Linux systems, container runtimes, and cloud infrastructure.
- Support regulated environment readiness through audit preparation, evidence collection, vulnerability management, configuration management, and continuous monitoring activities.
- Develop automation and tooling to continuously assess and maintain platform compliance posture; contribute to immutable, reproducible infrastructure patterns that simplify regulatory sustainment.
- Implement and maintain secure CI/CD pipelines and infrastructure-as-code practices aligned with security and compliance requirements.
- Improve observability across infrastructure and applications through metrics, logging, tracing, and alerting; integrate compliance telemetry and configuration auditing into operational workflows.
- Partner with Security, Compliance, and Engineering teams to improve service reliability, deployment safety, and operational maturity throughout the software lifecycle.
- Requirements
- 7+ years of experience in Site Reliability Engineering, Production Engineering, Cloud Operations, or Infrastructure Engineering.
- Hands-on experience operating cloud infrastructure in regulated environments such as FedRAMP Moderate/High, DoD IL4/IL5, or equivalent, including AWS GovCloud or other isolated government cloud environments.
- Experience supporting cloud authorization efforts (ATO) and sustaining environments post-authorization through continuous monitoring, including FedRAMP monthly reporting, vulnerability tracking, and control assessment activities.
- Strong knowledge of NIST 800-53 controls, vulnerability remediation SLAs, secure configuration management, and audit evidence generation.
- Deep experience with Infrastructure as Code (Terraform preferred), GitOps workflows, and secure CI/CD pipelines, including container hardening and image security practices.
- Proficiency in Python, Go, or Bash for operational automation and tooling development.
- Proficiency with cloud-native technologies including Kubernetes, Prometheus, and Grafana, along with a solid understanding of Linux/Unix operating systems.
- Experience supporting production operations for SaaS, cloud service provider, or multi-tenant platforms at scale.
- Ability to communicate operational risk and compliance posture clearly to both technical and non-technical stakeholders.
- Preferred Qualifications
- Experience working directly with 3PAOs, auditors, or compliance assessors during authorization and continuous monitoring cycles.
- Familiarity with STIG implementation across Kubernetes, Linux systems, and container runtimes.
- Understanding of Zero Trust architectures and secure access platforms.
- Experience with operational resilience exercises and disaster recovery validation.
- Cato provides a competitive salary and comprehensive benefits plan. Benefits for this role include health/vision/dental insurance, 401(k), stock options, Health Savings/Flexible Spending Accounts, flexible time-off, paid parental leave and disability benefits.
- As an EEO/Affirmative Action Employer, all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or veteran status.
Required skills
Python • Go • Bash • Terraform • Kubernetes • Prometheus • Grafana • Linux