Program Specialist, NCC Real-Time Observability (RTO)
ADCI HYD 20 SEZ - H94 • 2h ago
IN, TS, Hyderabad • Onsite • Full-time
Top focus
Program Manager
Amazon's Network Control Center (NCC) owns real-time observability, situational awareness, and coordinated intervention across the network to rapidly detect and eliminate issues before the customer journey is degraded. The Real-Time Observability (RTO) function within NCC acts as the centralized detection engine for Amazon Customer Service, owning end-to-end signal observability across all incident types. The mission is to monitor signals and detect issues before they become incidents. Once an incident breaches the high-severity criteria, the team hands it over to the Real-Time Incident Response (RTIR) team.

Do you want to join a team focused on real-time incident detection and observability and play a role in protecting millions of customers from service disruptions before they escalate? Are you a passionate learner willing to relentlessly Deep Dive into signals, data, and cross-functional coordination to identify and validate emerging issues? Do you thrive in a culture where Customer Obsession and Bias For Action are highly valued?

As a Program Specialist, you will play a key role on the team by monitoring real-time signals across multiple channels, assessing incidents against defined criteria (HSE, Brand Impact, Customer Impact), validating signals with partner teams, and making Go/No-Go escalation decisions. You will also contribute to NCC's broader mission of detecting and resolving incidents before customers are impacted, collaborating across multiple teams within the network. You will be detail oriented and results driven, ensuring flawless detection and handover execution and that the needs of both customers and the business are addressed. Outside of these day-to-day duties, Program Specialists will own and work on process improvements.

Key job responsibilities
1. Perform 24x7 real-time monitoring to detect emerging signals across all Amazon Customer Service incident types globally.
2. Assess each signal through a defined framework: HSE (High Severity) criteria, potential Brand impact, and potential Customer impact.
3. Validate signals by engaging cross-functional teams and using internal tools to confirm scope, severity, and customer impact before handover.
4. Execute structured handovers to the Response teams using standardized templates, ensuring all mandatory fields are completed with accurate incident details.
5. Apply P&C (Privilege & Confidential) criteria for sensitive cases, including product recalls, brand reputation issues, counterfeit reports, and promotion abuse.
6. Identify process improvement opportunities, propose changes, and amend SOPs once approved.

A day in the life
Our team protects Amazon customers by detecting incidents before they escalate. You will begin each shift by reviewing active channels and ongoing monitoring windows. Throughout the shift, you will triage incoming signals and use tools to assess the real-time health of CS. Once you identify an issue, you will hand over the incident to the response team for mitigation. Your day might range from acknowledging CS Tech alarms within 5 minutes to assessing brand-impacting social media incidents with P&C protocols or coordinating with the Global Security team on driver safety issues.

About the team
The Real-Time Observability (RTO) function within NCC acts as the centralized detection engine for Amazon Customer Service, owning end-to-end signal observability across all incident types. The mission is to monitor signals and detect issues before they become incidents. Once an incident breaches the high-severity criteria, the team hands it over to the Real-Time Incident Response (RTIR) team.
- Work flexible shifts including nights, weekends, and holidays in a 24/7/365 environment
- Bachelor's degree or equivalent
- Experience collaborating with cross-functional teams
- Experience with MS Excel, Word, SharePoint & PowerPoint
- Experience working in a fast-paced, rapidly changing operations environment
- Experience in written and verbal communication with the ability to present complex technical information in a clear and concise manner to executives and non-technical leaders
- Experience working with global cross-functional teams
- Experience working with real-time monitoring tools and dashboards to assess operational health
- Experience managing multiple concurrent incidents, prioritizing based on severity and customer impact
- Experience dealing well with ambiguity, prioritizing needs, and delivering measurable results
- Experience with incident management frameworks and escalation protocols

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.
Required skills
MS Excel, MS Word, SharePoint, PowerPoint, real-time monitoring tools, incident management frameworks