Systems Engineer, Region Services

Amazon Support Services Pty Ltd • 4h ago

AU, VIC, Melbourne • Onsite • Full-time

Applicants must be Australian citizens and hold or be eligible to obtain an Australian Government Security Clearance with the ability to successfully complete an Organisational Suitability Assessment. For more information regarding security clearances, please visit https://www.agsva.gov.au/.

The AWS Region Services team combines AWS global cloud leadership with Australian security expertise to deliver highly secure, scalable environments for sensitive workloads. We’re creating innovative ways to use cloud computing, artificial intelligence, and machine learning while maintaining the highest standards of security and operational excellence.

The Engineering organisation within Region Services is structured across core capability pillars: Compute & Machine Learning, Security Identity & Compliance, Storage & Databases, and a growing capability domain. Collectively these pillars encompass a team of varying technical skillsets, including Engineers, Technical Program Managers and Subject Matter Experts (SMEs), organised into focused sub-teams.

This is an opportunity to make a lasting impact on Australia’s digital future. You’ll work with AWS services, implement innovative solutions, and help customers succeed in their most important missions. We’re committed to helping our builders grow through continuous learning, mentoring, and collaboration with industry experts. Are you ready to build the future of secure cloud computing in Australia?

Key job responsibilities

- Define and/or refine hardware requirements, participate in the development and delivery of operability-related features such as system health monitoring, diagnostics, repair, and other self-healing automation
- Develop or further existing application and system management tools and processes that reduce manual efforts and increase overall efficiency
- Adapt and improve operations management systems and processes to accommodate rapid and increasing growth in systems and traffic
- Participate in the design and execution of production acceptance tests and new hardware evaluations
- Monitor the health of the fleet, automating system health, maintenance tasks, and reporting systems as needed
- Participate in “on-call” rotations to resolve incidents occurring out-of-hours
- Must be an Australian Citizen and hold or be able to attain an Australian Government Security Vetting Agency clearance (see https://www.agsva.gov.au)

A day in the life

Your morning begins with a fleet health review: scanning dashboards you've built, checking automated alerts, and confirming that overnight self-healing routines performed as expected. A quick stand-up with your sub-team surfaces a capacity threshold approaching in one of the compute clusters. You pull up the growth projections and begin sketching an automation enhancement that will handle the scaling gracefully.

Mid-morning, you're deep in code. Today it's refining a diagnostic tool that identifies degraded hardware components before they impact workloads. You test against real fleet telemetry, iterate on detection thresholds, and push a change that will save hours of manual investigation across the team.

After lunch, you join a hardware evaluation session. A new server platform is being assessed for production readiness, and your role is to design acceptance test criteria: thermal performance under load, firmware compatibility, and integration with existing monitoring frameworks. Your input directly determines whether this hardware earns its place in the fleet.

Late afternoon brings a collaborative design review. A colleague proposes a new approach to automated repair workflows, and you offer feedback drawn from patterns you've observed in incident data. The discussion is rigorous, respectful, and energising; this is a team that sharpens each other.

Before you wrap up, you check in on a long-running automation project, a self-healing pipeline that's reduced manual intervention by 40% since you deployed it last quarter. You note a few edge cases to address tomorrow and log off knowing that tonight, the fleet will largely take care of itself. Because you built it that way.

About the team

Region Services provides the highest caliber Operational Solutions and Cleared Support for services within our Regions. We provide 'hands on keyboard' support to our service teams by deploying changes into these isolated regions, monitoring the results, and reporting any issues that are observed.

Diverse Experiences

Amazon values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.

Why AWS

Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating. That’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.

Work/Life Balance

We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.

Inclusive Team Culture

AWS values curiosity and connection. Our employee-led and company-sponsored affinity groups promote inclusion and empower our people to take pride in what makes us unique. Our inclusion events foster stronger, more collaborative teams. Our continual innovation is fueled by the bold ideas, fresh perspectives, and passionate voices our teams bring to everything we do.

Mentorship and Career Growth

We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.

Qualifications

- 5+ years of systems engineering experience
- 3+ years of site reliability engineering (SRE), systems engineering, systems administration, DevOps, security administration, or network administration experience
- Experience with Active Directory, or experience deploying, managing, or optimizing Microsoft Windows Server
- Knowledge of TCP/IP and networking protocols such as HTTP and DNS
- Experience designing and developing scripts to automate operational burdens and reviewing scripting changes to ensure they meet the standards for maintainability, scalability and security
- Experience working in 24/7 production environment
- Experience with service-oriented architecture and web services

Acknowledgement of country: In the spirit of reconciliation Amazon acknowledges the Traditional Custodians of country throughout Australia and their connections to land, sea and community. We pay our respect to their elders past and present and extend that respect to all Aboriginal and Torres Strait Islander peoples today.

IDE statement: Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.

Required skills

AWS, cloud computing, machine learning, automation, diagnostics, system management, monitoring, hardware evaluation